# Hypothesis testing: Completion Rate
The goal is to determine whether the difference in completion rates between the Test (new design) and Control (old design) groups is statistically significant. A significant result would support the claim that the redesign improved the completion rate.

## Completion Hypothesis:
#### Null hypothesis (H₀): The new design does not lead to a higher completion rate than the old design.
#### Alternative hypothesis (H₁): The new design leads to a higher completion rate than the old design.
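
Stated formally, this is a one-sided two-proportion z-test. The notation below is introduced here for illustration and does not appear in the original notebook: $p_T$ and $p_C$ are the true completion rates of the Test and Control groups, $x$ is a number of completions, $n$ is a group size, and $\Phi$ is the standard normal CDF. It mirrors the pooled-proportion calculation carried out in the code cells.

```latex
\begin{aligned}
H_0 &: p_T \le p_C, \qquad H_1 : p_T > p_C \\
\hat{p} &= \frac{x_T + x_C}{n_T + n_C} \\
\mathrm{SE} &= \sqrt{\hat{p}\,(1-\hat{p})\left(\frac{1}{n_T} + \frac{1}{n_C}\right)} \\
z &= \frac{\hat{p}_T - \hat{p}_C}{\mathrm{SE}}, \qquad p\text{-value} = 1 - \Phi(z)
\end{aligned}
```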

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import re
import scipy.stats as st
from scipy.stats import norm

In [2]:
control = pd.read_csv('../data/clean/time_spent_control_df.csv')
display(control)

test = pd.read_csv('../data/clean/time_spent_test_df.csv')
display(test)

Unnamed: 0.1,Unnamed: 0,client_id,visitor_id,visit_id,from_step,to_step,time_spent,is_error
0,0,1028,42237450_62128060588,557292053_87239438319_391157,start,step_1,0 days 00:00:49,False
1,1,1028,42237450_62128060588,557292053_87239438319_391157,step_1,step_2,0 days 00:01:12,False
2,2,1028,42237450_62128060588,557292053_87239438319_391157,step_2,step_3,0 days 00:04:35,False
3,3,1028,42237450_62128060588,557292053_87239438319_391157,step_3,step_1,0 days 00:01:51,True
4,4,1028,42237450_62128060588,557292053_87239438319_391157,step_1,step_2,0 days 00:00:22,False
...,...,...,...,...,...,...,...,...
91856,91856,9998346,292425655_16607136645,189177304_69869411700_783154,step_2,step_3,0 days 00:01:13,False
91857,91857,9998346,292425655_16607136645,189177304_69869411700_783154,step_3,step_1,0 days 00:01:55,True
91858,91858,9998346,292425655_16607136645,189177304_69869411700_783154,step_1,step_2,0 days 00:00:16,False
91859,91859,9998346,292425655_16607136645,189177304_69869411700_783154,step_2,step_3,0 days 00:00:14,False


Unnamed: 0.1,Unnamed: 0,client_id,visitor_id,visit_id,from_step,to_step,time_spent,is_error
0,0,555,402506806_56087378777,637149525_38041617439_716659,start,step_1,0 days 00:00:07,False
1,1,555,402506806_56087378777,637149525_38041617439_716659,step_1,step_2,0 days 00:00:32,False
2,2,555,402506806_56087378777,637149525_38041617439_716659,step_2,step_3,0 days 00:01:39,False
3,3,555,402506806_56087378777,637149525_38041617439_716659,step_3,confirm,0 days 00:00:20,False
4,4,647,66758770_53988066587,40369564_40101682850_311847,start,step_1,0 days 00:00:07,False
...,...,...,...,...,...,...,...,...
114524,114524,9999729,834634258_21862004160,870243567_56915814033_814203,step_2,step_3,0 days 00:00:39,False
114525,114525,9999729,834634258_21862004160,870243567_56915814033_814203,step_3,confirm,0 days 00:00:21,False
114526,114526,9999729,843385170_36953471821,493310979_9209676464_421146,start,step_1,0 days 00:01:22,False
114527,114527,9999729,843385170_36953471821,493310979_9209676464_421146,step_1,step_2,0 days 00:04:47,False


## First, let's test the difference in completion rate at the visit level (visit_id)

In [3]:
n_control = control['visit_id'].nunique()
n_test = test['visit_id'].nunique()

# Count visits that reached the final 'confirm' step
completed_visits_control = control[control['to_step'] == 'confirm']['visit_id'].nunique()
completed_visits_test = test[test['to_step'] == 'confirm']['visit_id'].nunique()

print(f"Total number of visits in the control group: {n_control}")
print(f"Visits who completed the process in the control group: {completed_visits_control}\n")
print(f"Total number of visits in the test group: {n_test}")
print(f"Visits who completed the process in the test group: {completed_visits_test}")

Total number of visits in the control group: 23793
Visits who completed the process in the control group: 15205

Total number of visits in the test group: 28570
Visits who completed the process in the test group: 18280


In [4]:
# Calculate proportions
p_control = completed_visits_control / n_control
p_test = completed_visits_test / n_test

display(p_control)
display(p_test)

0.639053503131173

0.6398319915995799

In [5]:
# Pooled proportion
pooled_p = (completed_visits_test + completed_visits_control) / (n_test + n_control)

In [6]:
# Standard error
se = np.sqrt(pooled_p * (1 - pooled_p) * (1 / n_test + 1 / n_control))

# Z-Statistic
z_stat = (p_test - p_control) / se
print(z_stat)

0.1847315286322637


In [7]:
# One-tailed p-value
p_value = 1 - norm.cdf(z_stat)

In [8]:
alpha = 0.05  # Significance level

if p_value < alpha:
    print(f"The result is statistically significant (p = {p_value:.4f}). Reject the null hypothesis.")
else:
    print(f"The result is not statistically significant (p = {p_value:.4f}). Fail to reject the null hypothesis.")

The result is not statistically significant (p = 0.4267). Fail to reject the null hypothesis.
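
The cells above can be condensed into one reusable helper. This is a minimal sketch and not part of the original notebook: the function name `two_proportion_ztest` is hypothetical, and it uses `statistics.NormalDist` from the standard library in place of `scipy.stats.norm` so that it runs without third-party packages. The counts are copied from the visit-level output above.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ztest(x_test, n_test, x_control, n_control):
    """One-sided pooled two-proportion z-test (H1: test rate > control rate).

    Hypothetical helper for illustration; returns (z, p_value).
    """
    p_test = x_test / n_test
    p_control = x_control / n_control
    # Pooled completion rate under H0 (both groups share one rate)
    pooled = (x_test + x_control) / (n_test + n_control)
    se = sqrt(pooled * (1 - pooled) * (1 / n_test + 1 / n_control))
    z = (p_test - p_control) / se
    # Upper-tail probability, equivalent to 1 - norm.cdf(z)
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# Visit-level counts copied from the notebook output
z, p = two_proportion_ztest(18280, 28570, 15205, 23793)
print(f"z = {z:.4f}, p = {p:.4f}")
```

Because it follows the same formula, the helper should reproduce the z-statistic and p-value reported for the visit-level test.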


## Now, let's test the difference in completion rate at the client level (client_id)

In [9]:
nc_control = control['client_id'].nunique()
nc_test = test['client_id'].nunique()

# Identify clients who completed the whole process
completed_clients_control = control[control['to_step'] == 'confirm']['client_id'].nunique()
completed_clients_test = test[test['to_step'] == 'confirm']['client_id'].nunique()

print(f"Total number of clients in the control group: {nc_control}")
print(f"Clients who completed the process in the control group: {completed_clients_control}\n")
print(f"Total number of clients in the test group: {nc_test}")
print(f"Clients who completed the process in the test group: {completed_clients_test}")

Total number of clients in the control group: 20177
Clients who completed the process in the control group: 15225

Total number of clients in the test group: 24287
Clients who completed the process in the test group: 18293


In [10]:
# Calculate proportions
pc_control = completed_clients_control / nc_control
pc_test = completed_clients_test / nc_test

display(pc_control)
display(pc_test)

0.7545720374684046

0.7532013011075884

In [11]:
# Pooled proportion
pooled_pc = (completed_clients_test + completed_clients_control) / (nc_test + nc_control)

In [12]:
# Standard error (client-level pooled proportion in both factors)
se = np.sqrt(pooled_pc * (1 - pooled_pc) * (1 / nc_test + 1 / nc_control))

# Z-Statistic
zc_stat = (pc_test - pc_control) / se
print(round(zc_stat, 3))

-0.334


In [13]:
# One-tailed p-value
pc_value = 1 - norm.cdf(zc_stat)

In [14]:
alpha = 0.05  # Significance level

if pc_value < alpha:
    print(f"The result is statistically significant (p = {pc_value:.4f}). Reject the null hypothesis.")
else:
    print(f"The result is not statistically significant (p = {pc_value:.4f}). Fail to reject the null hypothesis.")

The result is not statistically significant (p = 0.6308). Fail to reject the null hypothesis.
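
A non-significant p-value does not show how large a difference is still compatible with the data. As a complement, the sketch below computes a two-sided 95% Wald confidence interval for the difference in completion rates at both the visit and client level. It is an illustrative addition, not part of the original analysis: the function name `diff_confidence_interval` is hypothetical, the interval uses the unpooled standard error (the usual choice for intervals, unlike the pooled one used for the test), and the counts are copied from the outputs above.

```python
from math import sqrt
from statistics import NormalDist


def diff_confidence_interval(x_test, n_test, x_control, n_control, level=0.95):
    """Wald interval for (test rate - control rate), using the unpooled SE.

    Hypothetical helper for illustration; returns (lower, upper).
    """
    p_test = x_test / n_test
    p_control = x_control / n_control
    se = sqrt(p_test * (1 - p_test) / n_test
              + p_control * (1 - p_control) / n_control)
    # Two-sided critical value, about 1.96 for a 95% level
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    diff = p_test - p_control
    return diff - z_crit * se, diff + z_crit * se


# Counts copied from the notebook output (completions, group size)
visit_ci = diff_confidence_interval(18280, 28570, 15205, 23793)
client_ci = diff_confidence_interval(18293, 24287, 15225, 20177)
print("visit-level:  (%.4f, %.4f)" % visit_ci)
print("client-level: (%.4f, %.4f)" % client_ci)
```

Both intervals should straddle zero, which is consistent with failing to reject the null hypothesis at either level.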
