# Analysing the Impact of Altering the Paywall Headline on Subscription Rate

An A/B test is a randomized experiment in which two or more variants of a variable are deployed to different segments of customers to determine which variant is most effective at boosting KPIs. Here are some scenarios where A/B testing is used:

 - Streaming services: Determining whether changing the movie recommendation algorithm increases user engagement
 - E-commerce: Determining which product page layout results in the highest proportion of checkouts
 - Product & Service Advertising: Determining whether the use of emojis in advertisement headlines results in higher click rates

For this case, we will be analysing the A/B test results of an online learning platform called MindMappers. MindMappers operates on a freemium model: some of its courses are free, but the advanced courses are locked behind a paywall. MindMappers provides a 2-week trial for its advanced courses. At the end of the trial period, users are directed to a paywall. Two paywall variants with differing headlines were compared in this A/B test. Below is an illustration of the control and test variants:

![](https://github.com/Qalif-R/AB_Test_Analysis/blob/main/ABill.png?raw=true)

In [1]:
#Loading the AB test results data
import pandas as pd
url = 'https://raw.githubusercontent.com/Qalif-R/AB_Test_Analysis/main/ab_testing_results.csv'
ab_results = pd.read_csv(url)
ab_results.head(10)

Unnamed: 0,uid,country,gender,group,device,subscribed
0,72629692,FRA,F,control,android,Yes
1,25633647,GBR,F,control,android,No
2,31206551,BRA,M,test,ios,No
3,87162368,USA,M,control,android,Yes
4,88562222,USA,M,test,android,No
5,90074796,USA,F,test,ios,No
6,13863366,FRA,F,test,ios,No
7,26390385,USA,F,control,android,Yes
8,49549715,BRA,M,control,ios,No
9,96945880,BRA,M,test,ios,Yes


Before we can analyse the results, we need to ensure that the A/B test was deployed correctly. Random assignment is key for the test to yield reliable results. Hence, the distributions of the **country, gender and device** features within the control and test groups should be approximately equal. This reduces the possibility of a confounding variable impacting our results.
Also, the sample sizes of both groups should ideally be large (>30) and roughly the same. Violating these criteria is sub-optimal practice and could lead to misleading test results.

In [2]:
# Check the distribution of country for control and test groups
print(pd.crosstab(ab_results.country, ab_results.group, normalize='columns'))

group     control      test
country                    
AUS      0.021991  0.022340
BRA      0.196358  0.196336
CAN      0.030336  0.035630
DEU      0.082142  0.078954
ESP      0.042157  0.042144
FRA      0.062280  0.061598
GBR      0.060237  0.062210
MEX      0.125429  0.115502
TUR      0.078013  0.076156
USA      0.301056  0.309128


In [3]:
# Check the distribution of gender and device for control and test groups
ab_results.groupby('group')[['gender','device']].value_counts(normalize=True)

group    gender  device 
control  M       android    0.255161
                 ios        0.250641
         F       ios        0.250120
                 android    0.244078
test     F       android    0.251814
         M       ios        0.251071
                 android    0.250721
         F       ios        0.246393
dtype: float64
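
The visual comparison of proportions above could be complemented by a formal check. The following is a minimal sketch, assuming a chi-square test of independence between each feature and the group assignment is an acceptable balance check. The helper `balance_pvalue` and the `toy` DataFrame are hypothetical and are not part of the MindMappers data.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency


def balance_pvalue(df: pd.DataFrame, feature: str, group_col: str = "group") -> float:
    """P-value of a chi-square test of independence between a feature and the group."""
    table = pd.crosstab(df[feature], df[group_col])
    chi2, p_value, dof, expected = chi2_contingency(table)
    return p_value


# Hypothetical toy data, generated only to illustrate the call (NOT the real results)
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "group": rng.choice(["control", "test"], size=2000),
    "device": rng.choice(["android", "ios"], size=2000),
    "gender": rng.choice(["F", "M"], size=2000),
})

for feature in ["device", "gender"]:
    print(feature, round(balance_pvalue(toy, feature), 3))
```

A large p-value is consistent with balanced groups. With tens of thousands of users per group, even very small imbalances can produce a small p-value, so the size of the difference in proportions still matters.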

In [4]:
print('No.of customers in total sample: ' + str(ab_results['uid'].nunique()))
# check sample size of control and test groups
ab_results['group'].value_counts()

No.of customers in total sample: 45883


control    23009
test       22874
Name: group, dtype: int64

We can observe that the distributions of country, gender and device in the control and test groups, along with the sample sizes of the two groups, are consistent with a correctly deployed A/B test.
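
The claim that the two group sizes are roughly the same can also be checked with a sample ratio mismatch test. This is a minimal sketch, assuming the experiment was designed as a 50/50 split (the source does not state the intended split). The counts are the ones printed above.

```python
from scipy.stats import chisquare

# Group sizes printed by value_counts() above: control, test
observed = [23009, 22874]

# Assumption: users were meant to be split 50/50 between the groups
intended_share = [0.5, 0.5]
total = sum(observed)
expected = [share * total for share in intended_share]

# Chi-square goodness-of-fit test of observed vs intended group sizes
stat, srm_p_value = chisquare(f_obs=observed, f_exp=expected)
print(f"chi2 = {stat:.3f}, p-value = {srm_p_value:.3f}")
```

A p-value well above the chosen threshold gives no indication of a mismatch between the intended and observed split.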

We will now perform a hypothesis test to draw a conclusion from the A/B test.

## Hypothesis Test - Two-Sample Proportion Test
Management at MindMappers considers the test paywall worth implementing only if its subscription rate is more than 3 percentage points higher than that of the control group. Thus, the null and alternative hypotheses for our test are as follows:

$\Large H_{0}: \quad p_t - p_c = 3\%$\
$\Large H_{1}: \quad p_t - p_c > 3\%$

where $p_t$ and $p_c$ are the subscription rates for the test and control groups respectively. This is a one-sided two-sample proportion test, as we are testing whether $p_t - p_c$ is **greater** than $3\%$. We will conduct this test at the **5% significance level**, and we assume that $H_0$ is true when computing the test statistic.

### Calculating the Test Statistic
The test statistic $z$ is calculated as follows:

$\Large z = \frac{(\hat{p}_t - \hat{p}_c) - 0.03}{SE(\hat{p}_t - \hat{p}_c)}$



$\large SE(\hat{p}_t - \hat{p}_c)$ is the standard error of $(\hat{p}_t - \hat{p}_c)$ where

$\large SE(\hat{p}_t - \hat{p}_c) = \sqrt{\frac{\hat{p}(1-\hat{p})}{n_t} + \frac{\hat{p}(1-\hat{p})}{n_c}}$

 - $\hat{p}_t$ and $\hat{p}_c$ are the sample subscription rates of the test and control groups respectively.
 - $\hat{p}$ is the pooled subscription rate, i.e. the mean of $\hat{p}_t$ and $\hat{p}_c$ weighted by sample size
 - $n_t$ and $n_c$ are the sample sizes of the test and control groups respectively.

In [6]:
#Set significance level
alpha = 0.05

#Get the sample proportions and sample sizes
p_hat_samples = ab_results.groupby('group')['subscribed'].value_counts(normalize = True)
n = ab_results['group'].value_counts()


p_hat_t = p_hat_samples[('test','Yes')]
p_hat_c = p_hat_samples[('control','Yes')]

n_t = n['test']
n_c = n['control']

#Calculate difference of sample proportions
print(p_hat_t - p_hat_c)

0.04007702404356617


So the subscription rate of the test sample is about 4 percentage points higher than that of the control sample. We need to determine whether it exceeds the 3 percentage point threshold by a statistically significant margin.

In [7]:
import numpy as np
#Calculate weighted mean of sample proportions
p_hat = ((n_t * p_hat_t) + (n_c * p_hat_c))/(n_t + n_c)

#Calculate standard error of difference in sample proportions
se = np.sqrt((p_hat * (1-p_hat)) / n_t +
             (p_hat * (1-p_hat)) / n_c)
             
z_score = (p_hat_t - p_hat_c - 0.03)/se
print(z_score)

2.158522748076968
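
The pooled standard error used above is usually motivated by a null hypothesis of equal proportions. When the null specifies a non-zero difference such as 0.03, an unpooled standard error is a common alternative. The following is a minimal sketch comparing the two; the subscription rates `p_t_demo` and `p_c_demo` are hypothetical values chosen for illustration, and only the group sizes are taken from the notebook.

```python
import numpy as np


def se_of_difference(p_t, n_t, p_c, n_c, pooled=True):
    """Standard error of p_t - p_c, using a pooled or an unpooled variance estimate."""
    if pooled:
        # Pooled proportion: mean of the two rates weighted by sample size
        p = (n_t * p_t + n_c * p_c) / (n_t + n_c)
        return np.sqrt(p * (1 - p) / n_t + p * (1 - p) / n_c)
    # Unpooled: each group contributes its own variance
    return np.sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)


# Hypothetical subscription rates, chosen only for illustration
p_t_demo, p_c_demo = 0.52, 0.48
# Group sizes as reported earlier in the notebook
n_t_demo, n_c_demo = 22874, 23009
null_diff = 0.03

se_pooled = se_of_difference(p_t_demo, n_t_demo, p_c_demo, n_c_demo, pooled=True)
se_unpooled = se_of_difference(p_t_demo, n_t_demo, p_c_demo, n_c_demo, pooled=False)

z_pooled = (p_t_demo - p_c_demo - null_diff) / se_pooled
z_unpooled = (p_t_demo - p_c_demo - null_diff) / se_unpooled
print(f"pooled:   se = {se_pooled:.6f}, z = {z_pooled:.3f}")
print(f"unpooled: se = {se_unpooled:.6f}, z = {z_unpooled:.3f}")
```

When both rates are close to 0.5 and the groups are of similar size, as in this illustration, the two estimates are nearly identical. Running the same comparison on the real `p_hat_t` and `p_hat_c` would show whether the choice matters for this dataset.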


Now that we have calculated the test statistic, we can proceed to calculate the p-value. The p-value is defined as the probability of obtaining a result equal to or more extreme than the result observed, assuming the null hypothesis is true.
 - p_value > alpha indicates that there is insufficient evidence to reject $H_0$
 - p_value <= alpha indicates that there is evidence against $H_0$
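
The decision rule above can equivalently be expressed with a critical value. Below is a minimal sketch using the z-score printed earlier in the notebook. It uses `norm.sf`, SciPy's survival function, which returns the upper-tail probability directly and is an alternative to `1 - norm.cdf`.

```python
from scipy.stats import norm

alpha = 0.05
# Test statistic printed earlier in the notebook
z_score = 2.158522748076968

# Upper-tail probability for the one-sided alternative
p_value = norm.sf(z_score)

# Critical value: reject H0 when the z-score exceeds it
z_critical = norm.ppf(1 - alpha)

print(f"p-value = {p_value:.4f}")
print(f"critical value = {z_critical:.4f}")
print("reject H0:", z_score > z_critical)
```

Both formulations lead to the same decision, since the z-score exceeds the critical value exactly when the p-value is below alpha.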

In [8]:
from scipy.stats import norm
#Calculate p-value
p_value = 1 - norm.cdf(z_score)

print('p_value is: ' + str(round(p_value,3)))
print('alpha is: ' + str(alpha))

p_value is: 0.015
alpha is: 0.05


In [9]:
p_value <= alpha

True

Since the p-value is less than the significance level, there is evidence against the null hypothesis. We reject the null hypothesis.
**There is evidence at the 5% significance level that the subscription rate of the test group is more than 3 percentage points higher than that of the control group.**
Hence, MindMappers should go ahead and implement the proposed paywall headline! Moreover, we can calculate a 95% confidence interval for the difference in subscription rates as follows:

$\Large (\hat{p}_t - \hat{p}_c) \pm z_{\alpha/2} \times SE(\hat{p}_t - \hat{p}_c)$

In [30]:
#Calculate 95% confidence interval for the difference in subscription rates
margin = norm.ppf(1 - 0.025) * se
conf_int_95 = [(p_hat_t - p_hat_c) - margin, (p_hat_t - p_hat_c) + margin]
print(conf_int_95)

[0.030926967962698025, 0.04922708012443431]


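Hence we can be 95% confident that this interval contains the true increase in subscription rate when switching to the proposed paywall headline. In other words, intervals constructed this way would capture the true difference in about 95% of repeated experiments.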
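
Because the hypothesis test is one-sided, a one-sided lower confidence bound is the interval that corresponds directly to it. The following is a minimal sketch that recovers the standard error from the two-sided interval printed above and computes a 95% lower bound. It assumes the same standard error is appropriate for both constructions.

```python
from scipy.stats import norm

# Values printed earlier in the notebook
diff = 0.04007702404356617
ci_low, ci_high = 0.030926967962698025, 0.04922708012443431

# Recover the standard error from the half-width of the two-sided 95% interval
se = (ci_high - ci_low) / (2 * norm.ppf(0.975))

# One-sided 95% lower confidence bound for the difference in subscription rates
lower_bound = diff - norm.ppf(0.95) * se

print(f"lower bound = {lower_bound:.4f}")
print("lower bound exceeds 0.03:", lower_bound > 0.03)
```

A lower bound above 0.03 agrees with rejecting $H_0$ at the 5% significance level.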