# A/B Test Challenge



---

#### What is an A/B Test? 

An A/B test is a research methodology that supports decision making by measuring the impact of a change in a product (e.g., a digital product). For this challenge you will analyse the data from an A/B test performed on a digital product in which a new set of sponsored ads was introduced.


#### Measure of success

Metrics are needed to measure the success of your product. They are typically split into the following categories: 

- __Engagement-based metrics:__ number of users, number of downloads, number of active users, user retention, etc.

- __Revenue and monetization metrics:__ ads and affiliate links, subscription-based, in-app purchases, etc.

- __Technical metrics:__ service level indicators (uptime of the app, downtime of the app, latency).



---

## Metrics understanding

In this part you must analyse the metrics involved in the test. We will focus on the following metrics:

- Activity level + Daily active users (DAU).

- Click-through rate (CTR)

### Activity level

In the following part you must perform every calculation you consider necessary in order to answer the following questions:

- How many activity levels can you find in the dataset? (An activity level of zero means no activity.)

- How many users are there at each activity level?

- How many activity levels are there per day, and how many records are there for each activity level?

At the end of this section you must provide your conclusions about the _activity level_ of the users.

__Dataset:__ `activity_pretest.csv`

In [1]:
import pandas as pd
import numpy as np
from statsmodels.stats.weightstats import ztest
from statsmodels.stats.weightstats import ttest_ind

In [2]:
activity = pd.read_csv('./data/activity_pretest.csv')
activity.head()

Unnamed: 0,userid,dt,activity_level
0,a5b70ae7-f07c-4773-9df4-ce112bc9dc48,2021-10-01,0
1,d2646662-269f-49de-aab1-8776afced9a3,2021-10-01,0
2,c4d1cfa8-283d-49ad-a894-90aedc39c798,2021-10-01,0
3,6889f87f-5356-4904-a35a-6ea5020011db,2021-10-01,0
4,dbee604c-474a-4c9d-b013-508e5a0e3059,2021-10-01,0


In [3]:
levels = activity['activity_level'].unique()
print(len(levels))
levels

21


array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20], dtype=int64)

In [4]:
act_levels = activity.groupby('activity_level').count()['userid']
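
Cell 4 stores the per-level counts without displaying them. As a minimal sketch on made-up data (the small `toy` frame below is illustrative and not taken from `activity_pretest.csv`), the number of records and the number of distinct users per activity level could be inspected like this:

```python
import pandas as pd

# Illustrative data only: three users observed over two days.
toy = pd.DataFrame({
    "userid": ["u1", "u2", "u3", "u1", "u2", "u3"],
    "dt": ["2021-10-01"] * 3 + ["2021-10-02"] * 3,
    "activity_level": [0, 2, 2, 0, 0, 5],
})

# Records per activity level (the same quantity cell 4 computes with count()).
records_per_level = toy.groupby("activity_level")["userid"].count()

# Distinct users per activity level (a user is counted once per level).
users_per_level = toy.groupby("activity_level")["userid"].nunique()

print(records_per_level)
print(users_per_level)
```

The two results differ whenever the same user appears at the same level on several days, so it is worth stating which of the two is being reported.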


In [5]:
activity.groupby(['dt','activity_level']).count().reset_index().groupby('dt').count()['activity_level']

dt
2021-10-01    21
2021-10-02    21
2021-10-03    21
2021-10-04    21
2021-10-05    21
2021-10-06    21
2021-10-07    21
2021-10-08    21
2021-10-09    21
2021-10-10    21
2021-10-11    21
2021-10-12    21
2021-10-13    21
2021-10-14    21
2021-10-15    21
2021-10-16    21
2021-10-17    21
2021-10-18    21
2021-10-19    21
2021-10-20    21
2021-10-21    21
2021-10-22    21
2021-10-23    21
2021-10-24    21
2021-10-25    21
2021-10-26    21
2021-10-27    21
2021-10-28    21
2021-10-29    21
2021-10-30    21
2021-10-31    21
Name: activity_level, dtype: int64

In [6]:
activity.groupby(['dt','activity_level']).count().reset_index().groupby('dt').sum()['userid']

dt
2021-10-01    60000
2021-10-02    60000
2021-10-03    60000
2021-10-04    60000
2021-10-05    60000
2021-10-06    60000
2021-10-07    60000
2021-10-08    60000
2021-10-09    60000
2021-10-10    60000
2021-10-11    60000
2021-10-12    60000
2021-10-13    60000
2021-10-14    60000
2021-10-15    60000
2021-10-16    60000
2021-10-17    60000
2021-10-18    60000
2021-10-19    60000
2021-10-20    60000
2021-10-21    60000
2021-10-22    60000
2021-10-23    60000
2021-10-24    60000
2021-10-25    60000
2021-10-26    60000
2021-10-27    60000
2021-10-28    60000
2021-10-29    60000
2021-10-30    60000
2021-10-31    60000
Name: userid, dtype: int64
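
The two chained `groupby` calls above show that every day has 21 activity levels and 60,000 records. A shorter way to obtain the same kind of summary, plus the records per level per day, might look like the sketch below. It runs on made-up data, and the variable names are illustrative.

```python
import pandas as pd

# Illustrative data only: three users observed over two days.
toy = pd.DataFrame({
    "userid": ["u1", "u2", "u3", "u1", "u2", "u3"],
    "dt": ["2021-10-01"] * 3 + ["2021-10-02"] * 3,
    "activity_level": [0, 2, 2, 0, 0, 5],
})

# Number of distinct activity levels observed on each day.
levels_per_day = toy.groupby("dt")["activity_level"].nunique()

# Records per (day, activity level), with one column per level.
records_per_day_level = (
    toy.groupby(["dt", "activity_level"]).size().unstack(fill_value=0)
)

# Total records per day.
records_per_day = toy.groupby("dt").size()

print(levels_per_day)
print(records_per_day_level)
print(records_per_day)
```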

### Daily active users (DAU)

![ab_test](./img/user_activity_ab_testing.JPG)


Daily active users (DAU) is the number of users who are active on a given day (an activity level of zero means no activity). You must calculate this metric and provide your insights about it.

__Dataset:__ `activity_pretest.csv`

In [7]:
dau = activity.loc[activity['activity_level'] != 0,:].groupby('dt').count()['userid']
dau

dt
2021-10-01    30634
2021-10-02    30775
2021-10-03    30785
2021-10-04    30599
2021-10-05    30588
2021-10-06    30639
2021-10-07    30637
2021-10-08    30600
2021-10-09    30902
2021-10-10    30581
2021-10-11    30489
2021-10-12    30715
2021-10-13    30761
2021-10-14    30716
2021-10-15    30637
2021-10-16    30708
2021-10-17    30741
2021-10-18    30694
2021-10-19    30587
2021-10-20    30795
2021-10-21    30705
2021-10-22    30573
2021-10-23    30645
2021-10-24    30815
2021-10-25    30616
2021-10-26    30673
2021-10-27    30661
2021-10-28    30734
2021-10-29    30723
2021-10-30    30628
2021-10-31    30519
Name: userid, dtype: int64
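
In the output above, DAU stays between 30,489 and 30,902, roughly half of the 60,000 daily records. Cell 7 counts rows, which equals the number of users only if each user appears at most once per day. A minimal sketch on made-up data that counts distinct users instead, and so does not rely on that assumption:

```python
import pandas as pd

# Illustrative data only: three users observed over two days.
toy = pd.DataFrame({
    "userid": ["u1", "u2", "u3", "u1", "u2", "u3"],
    "dt": ["2021-10-01"] * 3 + ["2021-10-02"] * 3,
    "activity_level": [0, 2, 2, 0, 0, 5],
})

# Keep only records with some activity, then count distinct users per day.
active = toy[toy["activity_level"] > 0]
dau_toy = active.groupby("dt")["userid"].nunique()

print(dau_toy)
```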

### Click-through rate (CTR)

![ab_test](./img/ad_click_through_rate_ab_testing.JPG)

Click-through rate (CTR) is the percentage of the ads shown to a user on a given day that the user clicked. You must analyse this metric (e.g., average CTR per day) and provide your insights about it.

__Dataset:__ `ctr_pretest.csv`

In [8]:
ctr_pretest = pd.read_csv('./data/ctr_pretest.csv')
ctr_pretest.head()

avg_ctr_per_day = ctr_pretest[['dt','ctr']].groupby('dt').mean()
print(avg_ctr_per_day['ctr'].mean())
print(np.std(avg_ctr_per_day['ctr']))


33.00024304382363
0.009219811514079199
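
`ctr_pretest.csv` already contains a `ctr` column. To make the definition concrete, the sketch below shows how such a value could be derived from raw counts. The `clicks` and `impressions` columns are assumptions made for illustration and are not part of the challenge datasets.

```python
import pandas as pd

# Hypothetical raw counts; `clicks` and `impressions` are assumed column names.
raw = pd.DataFrame({
    "userid": ["u1", "u2", "u1", "u2"],
    "dt": ["2021-10-01", "2021-10-01", "2021-10-02", "2021-10-02"],
    "clicks": [3, 1, 2, 4],
    "impressions": [10, 4, 8, 10],
})

# CTR per user and day, as a percentage of the ads shown.
raw["ctr"] = 100 * raw["clicks"] / raw["impressions"]

# Average CTR per day, mirroring the groupby in cell 8.
avg_ctr_per_day_toy = raw.groupby("dt")["ctr"].mean()

print(avg_ctr_per_day_toy)
```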


---

## Pretest metrics 

In this section you will analyse the metrics using the dataset that includes the results for the test and control groups, but only for the pretest data (i.e., prior to November 1st, 2021). You must provide insights about the metrics (__Activity level__, __DAU__ and __CTR__) and also perform a hypothesis test to determine whether there is any statistically significant difference between the groups prior to the start of the experiment. You must try different approaches (i.e., __z-test__ and __t-test__) and compare the results.


__Datasets:__ `activity_all.csv`, `ctr_all.csv`

In [9]:
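# date_format only applies to columns listed in parse_dates, so it is dropped here.
# dt stays an ISO-formatted string, which still sorts and compares chronologically.
act_all = pd.read_csv('./data/activity_all.csv')
ctr_all = pd.read_csv('./data/ctr_all.csv')
ctr_all.sort_values(by='dt')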



Unnamed: 0,userid,dt,groupid,ctr
831712,d640dc32-7993-48d3-8735-0e684012a122,2021-10-01,1,35.09
829132,3bb6f17f-479c-4bd7-836f-5106e6a2505a,2021-10-01,1,35.74
829131,3da7f626-9e2a-4472-9786-97e5f8bab9bc,2021-10-01,1,32.00
829130,255170b9-4365-4f72-9208-04940b95c5c9,2021-10-01,1,31.32
829129,f29c4ffc-5dd9-4e32-8e66-0c1fcb1711c3,2021-10-01,1,31.66
...,...,...,...,...
778989,8eb1dc42-d4ca-4112-aff7-c478fcf563f9,2021-11-30,0,35.16
778990,47bc72e2-3005-4199-82ca-d245b65d147f,2021-11-30,0,35.66
778991,2b5b1d0f-96ea-4796-9bba-927aac40a9df,2021-11-30,0,32.94
778983,cf8e6005-a3f1-4232-9ced-443412d8ef33,2021-11-30,0,31.29


In [10]:
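# Note: this cutoff leaves 2021-10-31 out of the pretest window (30 days per group,
# hence the 58 degrees of freedom below), although the experiment starts on 2021-11-01.
act_pre = act_all.loc[act_all['dt'] <= '2021-10-30',:]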
act_pre1 = act_pre.loc[act_pre['groupid'] == 1,:]
act_pre0 = act_pre.loc[act_pre['groupid'] == 0,:]
print(act_pre0.info())
print(act_pre1.info())

<class 'pandas.core.frame.DataFrame'>
Index: 898530 entries, 0 to 3624659
Data columns (total 4 columns):
 #   Column          Non-Null Count   Dtype 
---  ------          --------------   ----- 
 0   userid          898530 non-null  object
 1   dt              898530 non-null  object
 2   groupid         898530 non-null  int64 
 3   activity_level  898530 non-null  int64 
dtypes: int64(2), object(2)
memory usage: 34.3+ MB
None
<class 'pandas.core.frame.DataFrame'>
Index: 901470 entries, 2 to 3624660
Data columns (total 4 columns):
 #   Column          Non-Null Count   Dtype 
---  ------          --------------   ----- 
 0   userid          901470 non-null  object
 1   dt              901470 non-null  object
 2   groupid         901470 non-null  int64 
 3   activity_level  901470 non-null  int64 
dtypes: int64(2), object(2)
memory usage: 34.4+ MB
None
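
The instructions place the pretest window before November 1st, 2021, whereas the filter in cell 10 uses `<= '2021-10-30'`, which leaves October 31st out of it. A minimal sketch of a split anchored on the stated start date, using made-up rows and parsed dates (the `start` variable is illustrative):

```python
import pandas as pd

# Illustrative rows around the boundary between pretest and experiment.
toy = pd.DataFrame({
    "dt": ["2021-10-30", "2021-10-31", "2021-11-01", "2021-11-02"],
    "value": [1, 2, 3, 4],
})
toy["dt"] = pd.to_datetime(toy["dt"])

# Experiment start date as stated in the challenge text.
start = pd.Timestamp("2021-11-01")

pre = toy[toy["dt"] < start]       # everything before the start date
during = toy[toy["dt"] >= start]   # start date and later

print(pre["dt"].dt.strftime("%Y-%m-%d").tolist())
print(during["dt"].dt.strftime("%Y-%m-%d").tolist())
```

Using a single `start` value for both filters keeps the two windows complementary, so no day can be dropped or counted twice.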


In [11]:
dau_pre0 = act_pre0.loc[act_pre0['activity_level'] != 0,:].groupby('dt').count()['userid']
dau_pre1 = act_pre1.loc[act_pre1['activity_level'] != 0,:].groupby('dt').count()['userid']
print(ztest(dau_pre1,dau_pre0,alternative = "two-sided"))
print(ttest_ind(dau_pre1,dau_pre0,alternative = "two-sided"))

(1.2838487803542895, 0.1991948717043318)
(1.2838487803542895, 0.20430014922232073, 58.0)
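
Both calls print the same statistic but slightly different p-values. That is what one would expect if the only difference is the reference distribution: the standard normal for the z-test and a Student's t with 58 degrees of freedom for the t-test. A minimal sketch, assuming SciPy is installed (statsmodels depends on it), that recomputes both two-sided p-values from the printed statistic:

```python
from scipy import stats

statistic = 1.2838487803542895  # statistic printed by both calls above
df = 58                         # degrees of freedom reported by ttest_ind

# Two-sided p-value under the standard normal (z-test).
p_z = 2 * stats.norm.sf(abs(statistic))

# Two-sided p-value under Student's t with df degrees of freedom (t-test).
p_t = 2 * stats.t.sf(abs(statistic), df)

print(p_z, p_t)
```

With only about 30 daily observations per group, the t distribution has heavier tails, so its p-value is slightly larger. The gap shrinks as the number of observations grows.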


In [12]:
'''The p-value of both tests is greater than 0.05, so we cannot reject the null hypothesis:
there is no evidence that mean DAU differed between the groups before the experiment'''

'The p-value of both tests is greater than 0.05, so we cannot reject the null hypothesis:\nthere is no evidence that mean DAU differed between the groups before the experiment'

In [13]:
# Same cutoff as for the activity data: 2021-10-31 is not part of this pretest slice.
ctr_pre = ctr_all.loc[ctr_all['dt'] <= '2021-10-30',:]
ctr_pre1 = ctr_pre.loc[ctr_pre['groupid'] == 1,:]
ctr_pre0 = ctr_pre.loc[ctr_pre['groupid'] == 0,:]
print(ctr_pre0.info())
print(ctr_pre1.info())

<class 'pandas.core.frame.DataFrame'>
Index: 459739 entries, 808703 to 1713715
Data columns (total 4 columns):
 #   Column   Non-Null Count   Dtype  
---  ------   --------------   -----  
 0   userid   459739 non-null  object 
 1   dt       459739 non-null  object 
 2   groupid  459739 non-null  int64  
 3   ctr      459739 non-null  float64
dtypes: float64(1), int64(1), object(2)
memory usage: 17.5+ MB
None
<class 'pandas.core.frame.DataFrame'>
Index: 460617 entries, 824040 to 1729058
Data columns (total 4 columns):
 #   Column   Non-Null Count   Dtype  
---  ------   --------------   -----  
 0   userid   460617 non-null  object 
 1   dt       460617 non-null  object 
 2   groupid  460617 non-null  int64  
 3   ctr      460617 non-null  float64
dtypes: float64(1), int64(1), object(2)
memory usage: 17.6+ MB
None


In [14]:
avg_ctr_pre1 = ctr_pre1[['dt','ctr']].groupby('dt').mean()
avg_ctr_pre0 = ctr_pre0[['dt','ctr']].groupby('dt').mean()

ztest(avg_ctr_pre1,avg_ctr_pre0,alternative = "two-sided")

(array([-0.08853166]), array([0.92945412]))

In [15]:
'''The p-value is about 0.93, higher than 0.05, so we cannot reject the null hypothesis 
(the average CTR is the same for both groups) before the test starts'''

'The p-value is about 0.93, higher than 0.05, so we cannot reject the null hypothesis \n(the average CTR is the same for both groups) before the test starts'

In [16]:
ttest_ind(avg_ctr_pre1,avg_ctr_pre0)

(array([-0.08853166]), array([0.92975911]), 58.0)
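
The CTR results are wrapped in arrays, whereas the DAU results in cell 11 are plain numbers. A likely reason is that `avg_ctr_pre1` and `avg_ctr_pre0` are one-column DataFrames, while the DAU inputs were Series. A small pandas-only sketch on made-up data shows the difference between the two shapes:

```python
import pandas as pd

# Illustrative data only.
toy = pd.DataFrame({
    "dt": ["2021-10-01", "2021-10-01", "2021-10-02"],
    "ctr": [30.0, 34.0, 33.0],
})

# Pattern used in cell 14: the result is a DataFrame with a single column.
as_frame = toy[["dt", "ctr"]].groupby("dt").mean()

# Selecting the column after grouping yields a one-dimensional Series.
as_series = toy.groupby("dt")["ctr"].mean()

print(as_frame.shape, as_series.shape)
```

Passing the Series form to the test functions should give scalar results, in line with the DAU cells.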

In [17]:
'''The t-test also fails to reject the null hypothesis (p-value of about 0.93)'''

'The t-test also fails to reject the null hypothesis (p-value of about 0.93)'

---

## Experiment metrics 

In this section you must perform the same analysis as in the previous section, but using the data generated during the experiment (i.e., from November 1st, 2021 onwards). You must provide insights about the metrics (__Activity level__, __DAU__ and __CTR__) and also perform a hypothesis test to determine whether there is any statistically significant difference between the groups during the experiment. You must try different approaches (i.e., __z-test__ and __t-test__) and compare the results.


__Datasets:__ `activity_all.csv`, `ctr_all.csv`

In [18]:
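# Note: this cutoff includes 2021-10-31 in the experiment window (31 days per group,
# hence the 60 degrees of freedom below), although the experiment starts on 2021-11-01.
act_test = act_all.loc[act_all['dt'] > '2021-10-30',:]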
act_test1 = act_test.loc[act_test['groupid'] == 1,:]
act_test0 = act_test.loc[act_test['groupid'] == 0,:]
print(act_test0.info())
print(act_test1.info())

<class 'pandas.core.frame.DataFrame'>
Index: 928481 entries, 879645 to 3659999
Data columns (total 4 columns):
 #   Column          Non-Null Count   Dtype 
---  ------          --------------   ----- 
 0   userid          928481 non-null  object
 1   dt              928481 non-null  object
 2   groupid         928481 non-null  int64 
 3   activity_level  928481 non-null  int64 
dtypes: int64(2), object(2)
memory usage: 35.4+ MB
None
<class 'pandas.core.frame.DataFrame'>
Index: 931519 entries, 879644 to 3659998
Data columns (total 4 columns):
 #   Column          Non-Null Count   Dtype 
---  ------          --------------   ----- 
 0   userid          931519 non-null  object
 1   dt              931519 non-null  object
 2   groupid         931519 non-null  int64 
 3   activity_level  931519 non-null  int64 
dtypes: int64(2), object(2)
memory usage: 35.5+ MB
None


In [19]:
dau_test0 = act_test0.loc[act_test0['activity_level'] != 0,:].groupby('dt').count()['userid']
dau_test1 = act_test1.loc[act_test1['activity_level'] != 0,:].groupby('dt').count()['userid']
print(ztest(dau_test1,dau_test0,alternative = "larger"))
print(ttest_ind(dau_test1,dau_test0,alternative = "larger"))

(28.670986070030523, 4.3891855327736243e-181)
(28.670986070030526, 5.0689403875943095e-37, 60.0)


In [20]:
'''The p-value is very low for both tests, so we can reject the null hypothesis and state that
DAU was higher for the test group than for the control group during the experiment'''

'The p-value is very low for both tests, so we can reject the null hypothesis and state that\nDAU was higher for the test group than for the control group during the experiment'

In [29]:
print(ztest(dau_test1,dau_pre1,alternative = "larger"))
print(ttest_ind(dau_test1,dau_pre1,alternative = "larger"))

(29.391254318243686, 3.551828336000857e-190)
(29.39125431824369, 3.1356653363564878e-37, 59.0)


In [None]:
'''DAU of the test group is larger during the experiment than it was before the test started'''

In [33]:
print(ztest(dau_test0,dau_pre0,alternative = "larger"))
print(ttest_ind(dau_test0,dau_pre0,alternative = "larger"))

(6.176292053931398, 3.281227229478719e-10)
(6.176292053931398, 3.29036628476891e-08, 59.0)


In [None]:
'''DAU is also larger for the control group after the test starts, so part of the activity increase can be time related'''
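
Because DAU rose in both groups, one way to separate the time-related rise from the effect of the change is a difference-in-differences comparison. This is not part of the original analysis, and the numbers below are made up purely to illustrate the arithmetic:

```python
import pandas as pd

# Made-up mean DAU values per group and period, for illustration only.
means = pd.DataFrame(
    {"pre": [100.0, 101.0], "during": [104.0, 115.0]},
    index=pd.Index(["control", "test"], name="group"),
)

# Change within each group between the two periods.
change = means["during"] - means["pre"]

# Difference-in-differences: change in test minus change in control.
did = change["test"] - change["control"]

print(change)
print(did)
```

The control group's change acts as an estimate of what would have happened without the new ads, so subtracting it leaves the part of the increase attributable to the change, under the assumption that both groups would otherwise have moved in parallel.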

In [21]:
# Same cutoff as for the activity data: 2021-10-31 is included in this experiment slice.
ctr_test = ctr_all.loc[ctr_all['dt'] > '2021-10-30',:]
ctr_test1 = ctr_test.loc[ctr_test['groupid'] == 1,:]
ctr_test0 = ctr_test.loc[ctr_test['groupid'] == 0,:]
print(ctr_test0.info())
print(ctr_test1.info())

<class 'pandas.core.frame.DataFrame'>
Index: 488668 entries, 0 to 2274129
Data columns (total 4 columns):
 #   Column   Non-Null Count   Dtype  
---  ------   --------------   -----  
 0   userid   488668 non-null  object 
 1   dt       488668 non-null  object 
 2   groupid  488668 non-null  int64  
 3   ctr      488668 non-null  float64
dtypes: float64(1), int64(1), object(2)
memory usage: 18.6+ MB
None
<class 'pandas.core.frame.DataFrame'>
Index: 894384 entries, 15973 to 2303407
Data columns (total 4 columns):
 #   Column   Non-Null Count   Dtype  
---  ------   --------------   -----  
 0   userid   894384 non-null  object 
 1   dt       894384 non-null  object 
 2   groupid  894384 non-null  int64  
 3   ctr      894384 non-null  float64
dtypes: float64(1), int64(1), object(2)
memory usage: 34.1+ MB
None


In [26]:
avg_ctr_test1 = ctr_test1[['dt','ctr']].groupby('dt').mean()
avg_ctr_test0 = ctr_test0[['dt','ctr']].groupby('dt').mean()

print(ztest(avg_ctr_test1,avg_ctr_test0,alternative = "larger"))
print(ttest_ind(avg_ctr_test1,avg_ctr_test0,alternative='larger'))


(array([29.83571824]), array([6.72526788e-196]))
(array([29.83571824]), array([5.42269504e-38]), 60.0)


In [23]:
'''The p-value is far smaller than any reasonable alpha, so we can reject the null hypothesis 
(the test group's CTR is not larger than the control group's). Therefore, the test group's CTR is larger than the control group's during the experiment'''

"The p-value is far smaller than any reasonable alpha, so we can reject the null hypothesis \n(the test group's CTR is not larger than the control group's). Therefore, the test group's CTR is larger than the control group's during the experiment"

In [27]:
print(ztest(avg_ctr_test1,avg_ctr_pre1,alternative = "larger"))
print(ttest_ind(avg_ctr_test1,avg_ctr_pre1,alternative='larger'))

(array([29.32193033]), array([2.72467546e-189]))
(array([29.32193033]), array([3.57292858e-37]), 59.0)


In [None]:
'''The p-value is far smaller than any reasonable alpha, so we can reject the null hypothesis 
(the test group's CTR during the experiment is not larger than its CTR before the experiment). Therefore, the test group's CTR is larger once the test starts'''

In [30]:
print(ztest(avg_ctr_test0,avg_ctr_pre0,alternative = "two-sided"))
print(ttest_ind(avg_ctr_test0,avg_ctr_pre0,alternative='two-sided'))

(array([-1.0709066]), array([0.28421143]))
(array([-1.0709066]), array([0.28857278]), 59.0)


In [None]:
'''There is no statistically significant difference between the control group's CTR before and after the experiment starts'''

---

## Conclusions

Please provide your conclusions after the analyses and your recommendation on whether we should implement the changes in the digital product.

In [25]:
# your-conclusions
'''The change we are testing results in both an activity (DAU) increase and a CTR increase.
Therefore, according to the statistics, it should be implemented'''



'The change we are testing results in both an activity (DAU) increase and a CTR increase.\nTherefore, according to the statistics, it should be implemented'

---