# Hypothesis Testing

We run a z-test on proportions and a t-test on continuous variables.

## Completion Rate

Null Hypothesis: The completion rate for the Test group (new design) is equal to the completion rate for the Control group (old design).

p_test = p_control

In [1]:
import numpy as np
import statsmodels.api as sm
import pandas as pd
import scipy.stats as st

In [2]:
control = pd.read_csv('../data/clean/merged_control.csv')
control = control.drop(columns = 'Unnamed: 0')
display(control)

test = pd.read_csv('../data/clean/merged_test.csv')
test = test.drop(columns = 'Unnamed: 0')
display(test)

Unnamed: 0,client_id,visitor_id,visit_id,process_step,date_time
0,1104,194240915_18158000533,543158812_46395476577_767725,start,2017-06-12 07:49:18
1,1104,194240915_18158000533,643221571_99977972121_69283,start,2017-06-20 22:31:33
2,1186,446844663_31615102958,507052512_11309370126_442139,start,2017-04-08 15:59:16
3,1186,446844663_31615102958,795373564_99931517312_810896,start,2017-04-08 18:05:02
4,1186,446844663_31615102958,795373564_99931517312_810896,step_1,2017-04-08 18:05:13
...,...,...,...,...,...
103401,9997391,494669706_3354361161,84654768_90613632047_633963,step_2,2017-04-05 15:41:34
103402,9997391,494669706_3354361161,84654768_90613632047_633963,step_3,2017-04-05 15:41:39
103403,9997470,395791369_55562604618,904791598_9725982898_416914,start,2017-04-20 20:04:38
103404,9997470,91394485_75296404278,655572400_94971272893_411965,start,2017-04-07 16:11:03


Unnamed: 0,client_id,visitor_id,visit_id,process_step,date_time
0,555,402506806_56087378777,637149525_38041617439_716659,start,2017-04-15 12:57:56
1,555,402506806_56087378777,637149525_38041617439_716659,step_1,2017-04-15 12:58:03
2,555,402506806_56087378777,637149525_38041617439_716659,step_2,2017-04-15 12:58:35
3,555,402506806_56087378777,637149525_38041617439_716659,step_3,2017-04-15 13:00:14
4,555,402506806_56087378777,637149525_38041617439_716659,confirm,2017-04-15 13:00:34
...,...,...,...,...,...
128686,9999729,843385170_36953471821,493310979_9209676464_421146,start,2017-04-20 14:21:27
128687,9999729,843385170_36953471821,493310979_9209676464_421146,step_1,2017-04-20 14:22:49
128688,9999729,843385170_36953471821,493310979_9209676464_421146,step_2,2017-04-20 14:27:36
128689,9999832,145538019_54444341400,472154369_16714624241_585315,start,2017-05-16 16:46:03


In [3]:
n_control = control['visit_id'].nunique()
n_test = test['visit_id'].nunique()

# Count the visits that reached the final 'confirm' step
completed_visits_control = control[control['process_step'] == 'confirm']['visit_id'].nunique()
completed_visits_test = test[test['process_step'] == 'confirm']['visit_id'].nunique()

print(f"Total number of visits in the control group: {n_control}")
print(f"Visits who completed the process in the control group: {completed_visits_control}\n")
print(f"Total number of visits in the test group: {n_test}")
print(f"Visits who completed the process in the test group: {completed_visits_test}")

Total number of visits in the control group: 28527
Visits who completed the process in the control group: 13176

Total number of visits in the test group: 30799
Visits who completed the process in the test group: 16593


In [4]:
# Prepare the data for the Z-test
count = np.array([completed_visits_control, completed_visits_test])  # Success counts
nobs = np.array([n_control, n_test])  # Total observations

# Perform Two-proportion Z-test
z_stat, p_value = sm.stats.proportions_ztest(count, nobs)

# Print the results
print(f"Z-statistic: {z_stat:.4f}")
print(f"P-value: {p_value:.4f}")

# Interpret the results
alpha = 0.05  # Significance level
if p_value < alpha:
    print("Reject the null hypothesis: There is a significant difference in completion rates.")
else:
    print("Fail to reject the null hypothesis: No significant difference in completion rates.")

Z-statistic: -18.7103
P-value: 0.0000
Reject the null hypothesis: There is a significant difference in completion rates.
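
As a cross-check on the statistic reported above, the pooled two-proportion z-test can be reproduced by hand from the four counts printed earlier. This is a minimal sketch that assumes the pooled-variance form of the test and the same ordering as the cell above (control first); the variable names are illustrative.

```python
import numpy as np
from scipy.stats import norm

# Counts taken from the printed output above
completed_control, n_control = 13176, 28527
completed_test, n_test = 16593, 30799

p_control = completed_control / n_control
p_test = completed_test / n_test

# Pooled proportion under H0: p_test = p_control
p_pooled = (completed_control + completed_test) / (n_control + n_test)
se = np.sqrt(p_pooled * (1 - p_pooled) * (1 / n_control + 1 / n_test))

# Control minus test, matching the order of the count array
z = (p_control - p_test) / se
p_value = 2 * norm.sf(abs(z))  # two-sided p-value

print(f"Control rate: {p_control:.4f}, test rate: {p_test:.4f}")
print(f"Z-statistic: {z:.4f}, p-value: {p_value:.3g}")
```

The negative sign comes from the ordering of the groups: the test group's completion rate (about 53.9%) is higher than the control group's (about 46.2%).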


# Completion Rate with a Cost-Effectiveness Threshold

Null Hypothesis: The completion rate for the test group is equal to or greater than the completion rate for the control group increased by 5%.

p_test >= p_control * 1.05

Alternative Hypothesis: The completion rate for the test group is lower than the completion rate for the control group increased by 5%.

p_test < p_control * 1.05

In [5]:
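completed = np.array([13176, 16593])  # success counts for the control and test groups
total = np.array([28527, 30799])  # total visits for the control and test groups

# Overwrite the test-group count with the control count increased by 5%;
# the second rate below is this threshold count divided by the test-group total
completed[1] = round(completed[0] * 1.05)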
completion_rate = completed / total
completion_rate

array([0.46187822, 0.4492029 ])

In [6]:
from statsmodels.stats.proportion import proportions_ztest
# Perform the one-sided two-proportion z-test
z_stat, p_value = proportions_ztest(completed, total, alternative='smaller')

# Print results
print(f"Z-statistic: {z_stat}")
print(f"P-value: {p_value}")

alpha = 0.05  # Significance level
if p_value < alpha:
    print("Reject the null hypothesis: The completion rate for the test group is lower than the completion rate for the control group increased by 5%.")
else:
    print("Fail to reject the null hypothesis: No evidence that the test completion rate is below the control rate increased by 5%.")

Z-statistic: 3.0974589919506594
P-value: 0.9990240630021752
Fail to reject the null hypothesis: No evidence that the test completion rate is below the control rate increased by 5%.
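
The hypothesis p_test >= p_control * 1.05 can also be tested directly on the observed counts, without overwriting the test group's success count. The sketch below is one possible approach and not the method used in the cell above: it assumes a Wald-type z statistic for the contrast p_test - 1.05 * p_control with unpooled variances, and `uplift` is an illustrative name.

```python
import numpy as np
from scipy.stats import norm

# Observed counts from the completion-rate section
completed_control, n_control = 13176, 28527
completed_test, n_test = 16593, 30799
uplift = 1.05  # assumed relative threshold: control rate increased by 5%

p_control = completed_control / n_control
p_test = completed_test / n_test

# Contrast whose sign the hypotheses are about
diff = p_test - uplift * p_control

# Unpooled (Wald) standard error of the contrast
se = np.sqrt(
    p_test * (1 - p_test) / n_test
    + uplift**2 * p_control * (1 - p_control) / n_control
)

z = diff / se
p_value = norm.cdf(z)  # lower tail: H1 is p_test < uplift * p_control

print(f"Threshold rate: {uplift * p_control:.4f}, test rate: {p_test:.4f}")
print(f"Z-statistic: {z:.4f}, p-value: {p_value:.4f}")
```

A large positive z with a p-value close to 1 means the data give no evidence that the test rate falls below the threshold, which agrees with the conclusion printed above.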


# Age

Null Hypothesis: The average age of clients engaging with the new process (test group) is equal to the average age of clients engaging with the old process (control group).

In [7]:
test = pd.read_csv('../data/clean/client_id_test.csv')
control = pd.read_csv('../data/clean/client_id_control.csv')
info = pd.read_csv('../data/clean/total_client_info.csv')
info

Unnamed: 0,client_id,Variation,client_tenure_year,client_tenure_month,client_age,gender,account_number,balance,calls_6_months,logons_6_months
0,9988021,Test,5.0,64.0,79.0,U,2.0,189023.86,1.0,4.0
1,8320017,Test,22.0,274.0,34.5,M,2.0,36001.90,5.0,8.0
2,4033851,Control,12.0,149.0,63.5,M,2.0,142642.26,5.0,8.0
3,1982004,Test,6.0,80.0,44.5,U,2.0,30231.76,1.0,4.0
4,9294070,Control,5.0,70.0,29.0,U,2.0,34254.54,0.0,3.0
...,...,...,...,...,...,...,...,...,...,...
50482,393005,Control,15.0,191.0,52.5,M,2.0,60344.67,1.0,4.0
50483,2908510,Control,21.0,252.0,34.0,M,3.0,141808.05,6.0,9.0
50484,7230446,Test,6.0,74.0,62.0,M,2.0,58778.11,2.0,5.0
50485,5230357,Test,23.0,278.0,30.5,M,2.0,61349.70,0.0,3.0


## Mean Ages

In [8]:
info_test = round(info[info['Variation'] == 'Test']['client_age'].mean(), 3)
info_control = round(info[info['Variation'] == 'Control']['client_age'].mean(), 3)
print('Mean client age test group:', info_test)
print('Mean client age control group:', info_control)

Mean client age test group: 47.164
Mean client age control group: 47.498


In [9]:
info_test = info[info['Variation'] == 'Test']['client_age']
info_control = info[info['Variation'] == 'Control']['client_age']
t_stat, p_value = st.ttest_ind(info_test, info_control, alternative = 'two-sided', equal_var = False)

print('t_stat, p_value:', round(t_stat, 2), round(p_value, 5))
if p_value < alpha:
    print('We successfully reject the null hypothesis: Clients in the control group are older on average.')
else:
    print('We fail to reject the null hypothesis.')

t_stat, p_value: -2.42 0.01569
We successfully reject the null hypothesis: Clients in the control group are older on average.
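
Calling `ttest_ind` with `equal_var = False` runs Welch's t-test, which does not assume equal variances in the two groups. The sketch below recomputes the statistic by hand to show what that call does. The arrays, seed, and distribution parameters are synthetic and chosen only for illustration; they are not the client data.

```python
import numpy as np
import scipy.stats as st

rng = np.random.default_rng(0)
# Synthetic stand-ins for two age samples (illustrative values only)
a = rng.normal(loc=47.2, scale=15.0, size=500)
b = rng.normal(loc=47.5, scale=16.0, size=450)

# Welch's t statistic: difference in means over the unpooled standard error
va = a.var(ddof=1) / a.size
vb = b.var(ddof=1) / b.size
t_manual = (a.mean() - b.mean()) / np.sqrt(va + vb)

# Welch-Satterthwaite degrees of freedom
df = (va + vb) ** 2 / (va**2 / (a.size - 1) + vb**2 / (b.size - 1))
p_manual = 2 * st.t.sf(abs(t_manual), df)  # two-sided p-value

# Reference values from SciPy for comparison
t_scipy, p_scipy = st.ttest_ind(a, b, equal_var=False)
print(np.isclose(t_manual, t_scipy), np.isclose(p_manual, p_scipy))
```

The sign of the statistic follows the argument order, so a negative value in the cells here means the first sample (test) has the lower mean.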


# Tenure

## Mean Tenure

In [10]:
info_test = round(info[info['Variation'] == 'Test']['client_tenure_year'].mean(), 3)
info_control = round(info[info['Variation'] == 'Control']['client_tenure_year'].mean(), 3)
print('Mean client tenure test group:', info_test)
print('Mean client tenure control group:', info_control)

Mean client tenure test group: 11.983
Mean client tenure control group: 12.088


In [11]:
info_test = info[info['Variation'] == 'Test']['client_tenure_year']
info_control = info[info['Variation'] == 'Control']['client_tenure_year']
t_stat, p_value = st.ttest_ind(info_test, info_control, alternative = 'two-sided', equal_var = False)

print('t_stat, p_value:', round(t_stat, 2), round(p_value, 5))
if p_value < alpha:
    print('We successfully reject the null hypothesis: Clients in the control group have been with Vanguard for more years.')
else: 
    print('We fail to reject the null hypothesis.')

t_stat, p_value: -1.71 0.08647
We fail to reject the null hypothesis.
