# Hypothesis Testing

We run a z-test on proportions and a t-test on continuous variables.

## Completion Rate

Null Hypothesis: The completion rate for the Test group (new design) is equal to the completion rate for the Control group (old design).

p_test = p_control

In [29]:
import numpy as np
import statsmodels.api as sm
import pandas as pd
import scipy.stats as st
from statsmodels.stats.proportion import proportions_ztest

In [69]:
control = pd.read_csv('../data/clean/merged_control.csv')
control = control.drop(columns = 'Unnamed: 0')
display(control.head(5))

test = pd.read_csv('../data/clean/merged_test.csv')
test = test.drop(columns = 'Unnamed: 0')
display(test.head(5))

Unnamed: 0,client_id,visitor_id,visit_id,process_step,date_time
0,1104,194240915_18158000533,543158812_46395476577_767725,start,2017-06-12 07:49:18
1,1104,194240915_18158000533,643221571_99977972121_69283,start,2017-06-20 22:31:33
2,1186,446844663_31615102958,507052512_11309370126_442139,start,2017-04-08 15:59:16
3,1186,446844663_31615102958,795373564_99931517312_810896,start,2017-04-08 18:05:02
4,1186,446844663_31615102958,795373564_99931517312_810896,step_1,2017-04-08 18:05:13


Unnamed: 0,client_id,visitor_id,visit_id,process_step,date_time
0,555,402506806_56087378777,637149525_38041617439_716659,start,2017-04-15 12:57:56
1,555,402506806_56087378777,637149525_38041617439_716659,step_1,2017-04-15 12:58:03
2,555,402506806_56087378777,637149525_38041617439_716659,step_2,2017-04-15 12:58:35
3,555,402506806_56087378777,637149525_38041617439_716659,step_3,2017-04-15 13:00:14
4,555,402506806_56087378777,637149525_38041617439_716659,confirm,2017-04-15 13:00:34


In [4]:
n_control = control['visit_id'].nunique()
n_test = test['visit_id'].nunique()

# Count the visits that reached the final 'confirm' step
completed_visits_control = control[control['process_step'] == 'confirm']['visit_id'].nunique()
completed_visits_test = test[test['process_step'] == 'confirm']['visit_id'].nunique()

print(f"Total number of visits in the control group: {n_control}")
print(f"Visits who completed the process in the control group: {completed_visits_control}\n")
print(f"Total number of visits in the test group: {n_test}")
print(f"Visits who completed the process in the test group: {completed_visits_test}")

Total number of visits in the control group: 28527
Visits who completed the process in the control group: 13176

Total number of visits in the test group: 30799
Visits who completed the process in the test group: 16593


## Check for Validity

A two-proportion z-test treats each visit as an independent trial that either completes the process or does not, so the number of completed visits in a group follows a binomial distribution (x successes in n trials, each with success probability p). By the Central Limit Theorem, the sampling distribution of the sample proportion is approximately normal when n is large, which is what makes the z-test applicable. A common rule of thumb is that both n * p and n * (1 - p) should be at least 10.

The variance of a sample proportion is given by: var = p * (1 - p) / n
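As a quick sanity check of the formula and the rule of thumb, the sketch below plugs in the visit counts printed earlier in this notebook. The helper name `proportion_summary` and the threshold of 10 are illustrative choices, not part of the original analysis.

```python
import math

# (completed visits, total visits) per group, as printed earlier in the notebook
groups = {
    "test": (16593, 30799),
    "control": (13176, 28527),
}

def proportion_summary(successes, n, min_count=10):
    """Return the sample proportion, its variance and standard error,
    and whether the normal approximation rule of thumb holds."""
    p = successes / n
    variance = p * (1 - p) / n
    return {
        "p": p,
        "variance": variance,
        "std_error": math.sqrt(variance),
        # Assumed rule of thumb: n*p and n*(1-p) both at least min_count
        "normal_ok": min(n * p, n * (1 - p)) >= min_count,
    }

summaries = {name: proportion_summary(x, n) for name, (x, n) in groups.items()}
for name, s in summaries.items():
    print(f"{name}: p={s['p']:.3f}, var={s['variance']:.3e}, normal_ok={s['normal_ok']}")
```

With tens of thousands of visits per group and completion rates near 0.5, both groups clear the threshold by a wide margin.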

In [58]:
n = 120
p = 0.95
120 * 0.0139

1.668

In [59]:
# Prepare the data for the Z-test
count = np.array([completed_visits_test, completed_visits_control])  # Success counts
nobs = np.array([n_test, n_control])  # Total observations

# Calculate completion rates for test and control
p = count / nobs

# Calculate variance for test and control
var = p * (1 - p) / nobs

print(f'Completion rate in the test group: {round(p[0],3)}')
print(f'Variance of the completion rate in the test group: {var[0]:.3e}')
print(f'Completion rate in the control group: {round(p[1],3)}')
print(f'Variance of the completion rate in the control group: {var[1]:.3e}')

Completion rate in the test group: 0.539
Variance of the completion rate in the test group: 8.068e-06
Completion rate in the control group: 0.462
Variance of the completion rate in the control group: 8.713e-06


In [20]:
# Perform Two-proportion Z-test
z_stat, p_value = sm.stats.proportions_ztest(count, nobs)

# Print the results
print(f"Z-statistic: {z_stat:.4f}")
print(f"P-value: {p_value}")

# Interpret the results
alpha = 0.05  # Significance level
if p_value < alpha:
    print("Reject the null hypothesis: There is a significant difference in completion rates.")
else:
    print("Fail to reject the null hypothesis: No significant difference in completion rates.")

Z-statistic: 18.7103
P-value: 4.081095438268519e-78
Reject the null hypothesis: There is a significant difference in completion rates.
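To make the reported statistic easier to audit, here is a minimal hand computation of the pooled two-proportion z-test using only the standard library. The function name `two_proportion_ztest` is hypothetical; the result should agree with the z-statistic printed above, since `proportions_ztest` pools the two samples by default when the null value is zero.

```python
import math

def two_proportion_ztest(x1, n1, x2, n2):
    """Two-sided z-test for H0: p1 == p2 using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    std_error = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / std_error
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Counts from the notebook: test group first, control group second
z, p_value = two_proportion_ztest(16593, 30799, 13176, 28527)
print(f"z = {z:.4f}, p-value = {p_value:.3e}")
```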


# Completion Rate with a Cost-Effectiveness Threshold

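Null Hypothesis: The completion rate for the test group is less than or equal to the completion rate for the control group increased by 5% (a relative increase).

p_test <= p_control * 1.05

Alternative Hypothesis: The completion rate for the test group is greater than the completion rate for the control group increased by 5%.

p_test > p_control * 1.05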

In [61]:
count = np.array([completed_visits_test, completed_visits_control])  # Success counts (test, control)
nobs = np.array([n_test, n_control])  # Total observations (test, control)

count[1] = round(count[1] * 1.05)
p = count / nobs

print(f'Completion rate in the test group: {round(p[0],3)}')
print(f'Completion rate in the control group: {round(p[1],3)}')

Completion rate in the test group: 0.539
Completion rate in the control group: 0.485


In [64]:
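# Perform the two-proportion z-test (two-sided by default in statsmodels)
# against the control count that was inflated by 5% in the previous cell
z_stat, p_value = proportions_ztest(count, nobs)

# Print results
print("Z-statistic:", z_stat)
print("P-value:", p_value)

alpha = 0.05  # Significance level
# A positive z-statistic means the test group rate is the higher one
if p_value < alpha and z_stat > 0:
    print("Reject the null hypothesis: The completion rate for the test group is higher than the completion rate for the control group increased by 5%.")
else:
    print("Fail to reject the null hypothesis: The test group does not significantly exceed the control completion rate increased by 5%.")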

Z-statistic: 13.0919825885706
P-value: 3.65929914189837e-39
Reject the null hypothesis: The completion rate for the test group is higher than the completion rate for the control group increased by 5%.
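Inflating the control success count is one way to encode the threshold. As an alternative sketch, not the method used in this notebook, the quantity p_test - 1.05 * p_control can be tested directly with an unpooled (Wald) standard error and a one-sided p-value. The function name `relative_lift_ztest` and the unpooled variance are assumptions made for illustration.

```python
import math

def relative_lift_ztest(x_test, n_test, x_control, n_control, lift=1.05):
    """One-sided z-test for H0: p_test <= lift * p_control."""
    p_t = x_test / n_test
    p_c = x_control / n_control
    diff = p_t - lift * p_c
    # Var(p_t - lift * p_c) = Var(p_t) + lift**2 * Var(p_c) for independent groups
    std_error = math.sqrt(
        p_t * (1 - p_t) / n_test + lift**2 * p_c * (1 - p_c) / n_control
    )
    z = diff / std_error
    # Upper-tail p-value: 1 - Phi(z) == 0.5 * erfc(z / sqrt(2))
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value

# Counts from the notebook
z, p_value = relative_lift_ztest(16593, 30799, 13176, 28527)
print(f"z = {z:.2f}, one-sided p-value = {p_value:.3e}")
```

This version leaves the observed counts untouched and makes the direction of the test explicit, at the cost of relying on the normal approximation for a linear combination of two proportions.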


# Errors

Null Hypothesis: The error rate for the Test group is equal to the error rate for the Control group.

p_err_test = p_err_control

In [73]:
print(f"Total number of observations in the control group: {74848}")
print(f"Errors in the control group: {4053}\n")
print(f"Total number of observations in the test group: {97803}")
print(f"Errors in the test group: {7590}")

Total number of observations in the control group: 74848
Errors in the control group: 4053

Total number of observations in the test group: 97803
Errors in the test group: 7590


In [76]:
count = np.array([7590, 4053])  # Error counts (test, control)
nobs = np.array([97803, 74848])  # Total observations (test, control)

# Error rates in percent (test, control)
p_err = count / nobs * 100
print(p_err)
z_stat, p_value = sm.stats.proportions_ztest(count, nobs)

# Print the results
print(f"Z-statistic: {z_stat:.4f}")
print(f"P-value: {p_value}")

[7.76049814 5.41497435]
Z-statistic: 19.2590
P-value: 1.1857961039234477e-82
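With samples this large, almost any difference is statistically significant, so it helps to look at the size of the gap as well. The sketch below computes an approximate 95% confidence interval for the difference in error rates (test minus control) from the counts above. The helper name `diff_confidence_interval` and the choice of an unpooled Wald interval are assumptions.

```python
import math

def diff_confidence_interval(x1, n1, x2, n2, z_crit=1.96):
    """Wald confidence interval for p1 - p2 using unpooled variances."""
    p1, p2 = x1 / n1, x2 / n2
    std_error = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_crit * std_error, diff + z_crit * std_error

# Error counts and observations from the notebook: test first, control second
low, high = diff_confidence_interval(7590, 97803, 4053, 74848)
print(f"Difference in error rate: {low * 100:.2f} to {high * 100:.2f} percentage points")
```

An interval that lies entirely above zero points the same way as the z-test: the error rate is higher in the test group.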


# Age

Null Hypothesis: The mean age of clients who engaged with the new process (Test group) is equal to the mean age of clients who engaged with the old process (Control group).

mean_age_test = mean_age_control

In [34]:
test = pd.read_csv('../data/clean/client_id_test.csv')
control = pd.read_csv('../data/clean/client_id_control.csv')
info = pd.read_csv('../data/clean/total_client_info.csv')
info

Unnamed: 0,client_id,Variation,client_tenure_year,client_tenure_month,client_age,gender,account_number,balance,calls_6_months,logons_6_months
0,9988021,Test,5.0,64.0,79.0,U,2.0,189023.86,1.0,4.0
1,8320017,Test,22.0,274.0,34.5,M,2.0,36001.90,5.0,8.0
2,4033851,Control,12.0,149.0,63.5,M,2.0,142642.26,5.0,8.0
3,1982004,Test,6.0,80.0,44.5,U,2.0,30231.76,1.0,4.0
4,9294070,Control,5.0,70.0,29.0,U,2.0,34254.54,0.0,3.0
...,...,...,...,...,...,...,...,...,...,...
50482,393005,Control,15.0,191.0,52.5,M,2.0,60344.67,1.0,4.0
50483,2908510,Control,21.0,252.0,34.0,M,3.0,141808.05,6.0,9.0
50484,7230446,Test,6.0,74.0,62.0,M,2.0,58778.11,2.0,5.0
50485,5230357,Test,23.0,278.0,30.5,M,2.0,61349.70,0.0,3.0


## Mean Ages

In [35]:
info_test = round(info[info['Variation'] == 'Test']['client_age'].mean(), 3)
info_control = round(info[info['Variation'] == 'Control']['client_age'].mean(), 3)
print('Mean client age test group:', info_test)
print('Mean client age control group:', info_control)

Mean client age test group: 47.164
Mean client age control group: 47.498


In [66]:
info_test = info[info['Variation'] == 'Test']['client_age']
info_control = info[info['Variation'] == 'Control']['client_age']
t_stat, p_value = st.ttest_ind(info_test, info_control, alternative = 'two-sided', equal_var = False)

print('t_stat, p_value:', round(t_stat, 2), round(p_value, 5))
if p_value < alpha:
    print('We successfully reject the null hypothesis: Clients in the control group are older on average.')
else: 
    print('We fail to reject the null hypothesis.')

1 - p_value

t_stat, p_value: -2.42 0.01569
We successfully reject the null hypothesis: Clients in the control group are older on average.


np.float64(0.9843072805386113)
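Passing `equal_var = False` to `ttest_ind` selects Welch's t-test, which does not assume that the two groups share a variance. The sketch below shows what that statistic and its Welch-Satterthwaite degrees of freedom look like when computed by hand. The function name `welch_t` is hypothetical and the two samples are synthetic values chosen for easy arithmetic, not client data.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t-statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va**2 / (len(a) - 1) + vb**2 / (len(b) - 1))
    return t, df

# Synthetic ages for illustration only; NOT taken from the client data
sample_a = [41, 42, 43, 44, 45]
sample_b = [42, 44, 46, 48, 50]

t, df = welch_t(sample_a, sample_b)
print(f"t = {t:.4f}, df = {df:.4f}")
```

A negative t-statistic means the first sample has the lower mean, which is how the sign of `t_stat` above is read.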

# Tenure

## Mean Tenure

In [37]:
info_test = round(info[info['Variation'] == 'Test']['client_tenure_year'].mean(), 3)
info_control = round(info[info['Variation'] == 'Control']['client_tenure_year'].mean(), 3)
print('Mean client tenure test group:', info_test)
print('Mean client tenure control group:', info_control)

Mean client tenure test group: 11.983
Mean client tenure control group: 12.088


In [67]:
info_test = info[info['Variation'] == 'Test']['client_tenure_year']
info_control = info[info['Variation'] == 'Control']['client_tenure_year']
t_stat, p_value = st.ttest_ind(info_test, info_control, alternative = 'two-sided', equal_var = False)

print('t_stat, p_value:', round(t_stat, 2), round(p_value, 5))
if p_value < alpha:
    print('We successfully reject the null hypothesis: Clients in the control group have been with Vanguard for more years.')
else: 
    print('We fail to reject the null hypothesis.')
1 - p_value

t_stat, p_value: -1.71 0.08647
We fail to reject the null hypothesis.


np.float64(0.9135259081867357)