# Goal
Many sites make money by selling ads. For these sites, the number of pages users visit in each session is one of the most important metrics, if not the most important. Data science plays a huge role here, especially in building models that suggest personalized content. To check whether a model is actually improving engagement, companies run A/B tests. It is often the data scientist's responsibility to analyze the test data and determine whether the model has been successful. The goal of this project is to look at A/B test results and draw conclusions.

# Challenge Description
The company in this exercise is a social network. They decided to add a feature called Recommended Friends, i.e. they suggest people you may know. A data scientist has built a model to suggest 5 people to each user. These potential friends will be shown on the user's newsfeed. At first, the model is tested on just a random subset of users to see how it performs compared to the newsfeed without the new feature. The test has been running for some time and your boss asks you to check the results. You are asked to check, for each user, the number of pages visited during their first session since the test started. If this number increased, the test is a success.

# Question1
Is the test winning? That is, should 100% of the users see the Recommended Friends feature? Is the test performing similarly for all user segments, or are there differences among segments? If you identified segments that responded differently to the test, can you guess the reason? Would this change your conclusion to the first question?


In [38]:
import pandas as pd
import numpy as np
from scipy import stats

In [8]:
test_file='data/Engagement_Test/test_table.csv'
user_file='data/Engagement_Test/user_table.csv'

In [9]:
test=pd.read_csv(test_file)
user=pd.read_csv(user_file)

In [12]:
test.describe()

,user_id,test,pages_visited
count,100000.0,100000.0,100000.0
mean,4511960.0,0.50154,4.60403
std,2596973.0,0.5,2.467845
min,34.0,0.0,0.0
25%,2271007.0,0.0,3.0
50%,4519576.0,1.0,5.0
75%,6764484.0,1.0,6.0
max,8999849.0,1.0,17.0


In [19]:
TestRate=(test["test"]==1).mean()  # share of users assigned to the test group

In [20]:
TestRate

0.50154

## Sanity check for the test assignment rate, assuming the experiment was set up as a 50/50 split


In [26]:
# 95% interval (z = 1.96) for the share of users in test under a 50/50 split
upper=0.5+1.96*np.sqrt(0.5*0.5/100000)
lower=0.5-1.96*np.sqrt(0.5*0.5/100000)
print('test rate should range between {0:.4f}~{1:.4f}'.format(lower,upper))

test rate should range between 0.4969~0.5031
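
The same check can be expressed as a z-score for the observed split under the 50/50 assumption. This is a minimal sketch that uses only the counts printed above (a test rate of 0.50154 over 100,000 users); the variable names are illustrative and not part of the original notebook.

```python
import math

n_users = 100000        # total rows in test_table
n_test = 50154          # users with test == 1 (0.50154 * 100000)
p_expected = 0.5        # assumed 50/50 assignment

p_observed = n_test / n_users
# standard error of a proportion under the assumed split
std_err = math.sqrt(p_expected * (1 - p_expected) / n_users)
z_score = (p_observed - p_expected) / std_err
# two-sided p-value from the normal approximation
p_value = math.erfc(abs(z_score) / math.sqrt(2))

print('z = {0:.3f}, p-value = {1:.3f}'.format(z_score, p_value))
```

A `|z|` below 1.96 means the observed split is consistent with a 50/50 assignment at the 5% level.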


In [13]:
test.head()

,user_id,date,browser,test,pages_visited
0,600597,2015-08-13,IE,0,2
1,4410028,2015-08-26,Chrome,1,5
2,6004777,2015-08-17,Chrome,0,8
3,5990330,2015-08-27,Safari,0,8
4,3622310,2015-08-07,Firefox,0,1


In [28]:
test.user_id.nunique()  # matches the row count, so user_id is unique

100000

In [14]:
user.describe()

,user_id
count,100000.0
mean,4511960.0
std,2596973.0
min,34.0
25%,2271007.0
50%,4519576.0
75%,6764484.0
max,8999849.0


In [30]:
user.shape[0] #100000 records 

100000

In [31]:
user.user_id.nunique()  # confirms user_id is unique

100000

In [15]:
user.head()

,user_id,signup_date
0,34,2015-01-01
1,59,2015-01-01
2,178,2015-01-01
3,285,2015-01-01
4,383,2015-01-01


In [34]:
data=test.merge(user,how='left',on='user_id')
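
Since both tables are expected to hold exactly one row per `user_id`, the merge itself can enforce that. The sketch below uses two toy frames with made-up values as stand-ins for `test` and `user`; `validate` and `indicator` are existing `DataFrame.merge` parameters.

```python
import pandas as pd

# Toy stand-ins for test_table and user_table (hypothetical values)
test_toy = pd.DataFrame({'user_id': [1, 2, 3], 'test': [0, 1, 1]})
user_toy = pd.DataFrame({'user_id': [1, 2, 3],
                         'signup_date': ['2015-01-01', '2015-02-01', '2015-03-01']})

# validate raises MergeError if user_id is duplicated on either side;
# indicator adds a _merge column recording where each row was found
merged = test_toy.merge(user_toy, how='left', on='user_id',
                        validate='one_to_one', indicator=True)

# every test row should have found a matching user row
all_matched = (merged['_merge'] == 'both').all()
print(len(merged), all_matched)
```

With `how='left'`, any test user missing from the user table would show up as `left_only` in `_merge` and carry a missing `signup_date`.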

In [36]:
data.head()

,user_id,date,browser,test,pages_visited,signup_date
0,600597,2015-08-13,IE,0,2,2015-01-19
1,4410028,2015-08-26,Chrome,1,5,2015-05-11
2,6004777,2015-08-17,Chrome,0,8,2015-06-26
3,5990330,2015-08-27,Safari,0,8,2015-06-25
4,3622310,2015-08-07,Firefox,0,1,2015-04-17


In [47]:
test_exp=test.loc[test['test']==1]
test_cont=test.loc[test['test']==0]


In [48]:
test_exp['pages_visited'].mean()


4.5996929457271607

In [49]:
test_cont['pages_visited'].mean()

4.6083938530674473

In [41]:
# ttest_ind assumes equal variances by default; equal_var=False runs Welch's t-test instead
stats.ttest_ind(test_exp.pages_visited,test_cont.pages_visited,equal_var=False)

Ttest_indResult(statistic=-0.55711184355547971, pvalue=0.57745231715591183)
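
To make clear what `equal_var=False` computes, here is a minimal sketch of Welch's t statistic and its degrees of freedom using only the standard library. The two samples are made-up numbers chosen for easy arithmetic, not values from the test data.

```python
import math
import statistics

# Made-up samples for illustration only
sample_a = [5, 6, 7, 8, 9]
sample_b = [1, 4, 4, 6, 10]

mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
# per-group sample variance divided by group size
var_term_a = statistics.variance(sample_a) / len(sample_a)
var_term_b = statistics.variance(sample_b) / len(sample_b)

# Welch's t: difference in means over the unpooled standard error
t_stat = (mean_a - mean_b) / math.sqrt(var_term_a + var_term_b)

# Welch-Satterthwaite approximation for the degrees of freedom
dof = (var_term_a + var_term_b) ** 2 / (
    var_term_a ** 2 / (len(sample_a) - 1) + var_term_b ** 2 / (len(sample_b) - 1)
)

print('t = {0:.4f}, dof = {1:.3f}'.format(t_stat, dof))
```

Unlike the pooled-variance test, each group contributes its own variance, so the result stays valid when the two groups differ in spread or size.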

# Conclusion to question 1
Overall, the change is not significant: the test group averages 4.600 pages per session versus 4.608 for control (p = 0.58).


In [68]:
def run_ttest(df):
    """Compare pages_visited between test and control with Welch's t-test."""
    df_cont=df.loc[df['test']==0,'pages_visited']
    df_exp=df.loc[df['test']==1,'pages_visited']
    cont_mean=df_cont.mean()
    exp_mean=df_exp.mean()
    # equal_var=False: do not assume the two groups share a variance
    result=stats.ttest_ind(df_exp,df_cont,equal_var=False)
    conclusion='significant' if result.pvalue<0.05 else 'not significant'
    return pd.Series({'n_test':df_exp.shape[0],
                      'n_ctrl': df_cont.shape[0],
                      'mean_test': exp_mean,
                      'mean_ctrl': cont_mean,
                      'test-ctrl': exp_mean - cont_mean,
                      'pvalue':result.pvalue,
                      'conclusion':conclusion})


In [69]:
run_ttest(test)

conclusion    not significant
mean_ctrl             4.60839
mean_test             4.59969
n_ctrl                  49846
n_test                  50154
pvalue               0.577452
test-ctrl         -0.00870091
dtype: object

In [79]:
data['signup_date']=pd.to_datetime(data['signup_date'])
data['date']=pd.to_datetime(data['date'])

In [72]:
data.head()

,user_id,date,browser,test,pages_visited,signup_date
0,600597,2015-08-13,IE,0,2,2015-01-19
1,4410028,2015-08-26,Chrome,1,5,2015-05-11
2,6004777,2015-08-17,Chrome,0,8,2015-06-26
3,5990330,2015-08-27,Safari,0,8,2015-06-25
4,3622310,2015-08-07,Firefox,0,1,2015-04-17


In [74]:
data.groupby('browser').apply(run_ttest)

browser,conclusion,mean_ctrl,mean_test,n_ctrl,n_test,pvalue,test-ctrl
Chrome,significant,4.613341,4.69068,21453,21974,0.0009434084,0.077339
Firefox,significant,4.600164,4.714259,10972,10786,0.0005817199,0.114095
IE,significant,4.598478,4.685985,10906,10974,0.007829509,0.087507
Opera,significant,4.546438,0.0,1109,1018,2.253e-321,-4.546438
Safari,not significant,4.63818,4.692336,5406,5402,0.2411738,0.054156


In [77]:
data.loc[(data['browser']=='Opera')&(data['test']==1)].head()

,user_id,date,browser,test,pages_visited,signup_date
16,151456,2015-08-27,Opera,1,0,2015-01-05
52,2757666,2015-08-03,Opera,1,0,2015-03-23
290,2115329,2015-08-04,Opera,1,0,2015-03-03
295,4319847,2015-08-30,Opera,1,0,2015-05-08
297,8251676,2015-08-19,Opera,1,0,2015-08-19


# Conclusion to question 2
The improvement is significant for Chrome, Firefox, and IE.
In Opera, every user in the test group has zero pages visited. This points to a bug, either in how page visits are counted or in the feature itself, that prevents users from clicking any further.
In Safari, the change in pages_visited is positive but not significant (p = 0.24), possibly because the friend recommendation is not set up properly in this browser.
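
One way to confirm the Opera problem and gauge its effect is to flag browsers whose test group never records a page, then compare the groups without them. The sketch below runs on a toy frame with made-up values that mirrors the columns of `data`; on the real data, the filtered frame could then be passed to `run_ttest`.

```python
import pandas as pd

# Toy frame with the same columns as `data` (values are made up)
toy = pd.DataFrame({
    'browser': ['Opera', 'Opera', 'Opera', 'Chrome', 'Chrome', 'Chrome', 'Chrome'],
    'test': [1, 1, 0, 1, 1, 0, 0],
    'pages_visited': [0, 0, 5, 6, 4, 5, 3],
})

# Share of test-group sessions with zero pages, per browser
test_rows = toy[toy['test'] == 1]
zero_share = test_rows['pages_visited'].eq(0).groupby(test_rows['browser']).mean()

# Browsers whose test group never records a page look broken
broken = zero_share[zero_share == 1.0].index.tolist()

# Compare group means again without those browsers
clean = toy[~toy['browser'].isin(broken)]
means_clean = clean.groupby('test')['pages_visited'].mean()
print(broken, means_clean.to_dict())
```

Excluding a browser this way is only a diagnostic: it shows what the comparison looks like without the suspect rows, not what Opera users would have done with a working feature.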

In [82]:
data['month']=data['date'].dt.month

In [87]:
np.min(data.date)

Timestamp('2015-08-01 00:00:00')

In [88]:
np.max(data.date)

Timestamp('2015-08-31 00:00:00')

In [100]:
data['days_latency']=(data['date']-data['signup_date']).dt.days

In [102]:
data['first_time']=data['days_latency']==0  # True when the session falls on the signup date

In [103]:
data.groupby('first_time').apply(run_ttest)

first_time,conclusion,mean_ctrl,mean_test,n_ctrl,n_test,pvalue,test-ctrl
False,not significant,4.603284,4.622379,39890,40109,0.261837,0.019095
True,significant,4.628867,4.509109,9956,10045,0.001742,-0.119758


# Conclusion
The effect is significant for new users (those whose session falls on their signup date), but in the opposite direction from what the experiment expected: the test group visited about 0.12 fewer pages than the control group (p = 0.002).
For existing users, the test group visited slightly more pages than the control group, but the difference is not significant (p = 0.26).


In [104]:
data.groupby(['browser','first_time']).apply(run_ttest)

browser,first_time,conclusion,mean_ctrl,mean_test,n_ctrl,n_test,pvalue,test-ctrl
Chrome,False,significant,4.607945,4.701512,17092,17525,0.0002290889,0.093567
Chrome,True,not significant,4.634488,4.648011,4361,4449,0.8149175,0.013523
Firefox,False,significant,4.59059,4.757306,8842,8657,3.692901e-06,0.166716
Firefox,True,not significant,4.639906,4.53922,2130,2129,0.2210706,-0.100686
IE,False,significant,4.590576,4.721494,8744,8779,0.0002669847,0.130918
IE,True,not significant,4.630435,4.543964,2162,2195,0.2808421,-0.086471
Opera,False,significant,4.594564,0.0,883,833,7.204927000000001e-255,-4.594564
Opera,True,significant,4.358407,0.0,226,185,1.222949e-68,-4.358407
Safari,False,not significant,4.638254,4.720973,4329,4315,0.1000829,0.08272
Safari,True,not significant,4.637883,4.578657,1077,1087,0.6015241,-0.059226
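
The table above runs ten separate tests, so some p-values below 0.05 could arise by chance alone. One conservative way to account for this, offered as an optional check rather than part of the original analysis, is a Bonferroni correction: divide the 0.05 threshold by the number of tests. The p-values below are copied from the table; the variable names are illustrative.

```python
# p-values copied from the browser x first_time table above
pvalues = {
    ('Chrome', False): 0.0002290889,
    ('Chrome', True): 0.8149175,
    ('Firefox', False): 3.692901e-06,
    ('Firefox', True): 0.2210706,
    ('IE', False): 0.0002669847,
    ('IE', True): 0.2808421,
    ('Opera', False): 7.204927e-255,
    ('Opera', True): 1.222949e-68,
    ('Safari', False): 0.1000829,
    ('Safari', True): 0.6015241,
}

alpha = 0.05
threshold = alpha / len(pvalues)   # Bonferroni-adjusted cutoff

# segments that remain significant after the correction
still_significant = sorted(seg for seg, p in pvalues.items() if p < threshold)
for browser, first_time in still_significant:
    print(browser, first_time)
```

The same segments stay significant under the stricter cutoff: existing users on Chrome, Firefox, and IE, plus both Opera segments.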


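Overall recommendation:
Recommending 5 friends significantly improves user engagement on Chrome, Firefox, and IE. The gain comes from existing users; for new users on these browsers the difference is not significant.
Opera needs to be double-checked: the test group records zero pages for every user, which indicates a problem in either data collection or the feature implementation. Because these zeros pull the overall test mean down, the overall comparison should be rerun once the issue is resolved.
Safari also needs a closer look: the change there is not significant for either new or existing users.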
