### The company in this exercise is a social network. It has decided to add a feature called Recommended Friends, which suggests people you may know.
### A data scientist has built a model that suggests 5 people to each user. These potential friends are shown on the user's newsfeed. At first, the model is tested on a random subset of users to see how it performs compared with the newsfeed without the new feature.
### The test has been running for some time, and your boss asks you to check the results. For each user, you are asked to check the number of pages visited during their first session since the test started. If this number increased, the test is a success. Specifically, your boss wants to know:
#### (1) Is the test winning? That is, should 100% of the users see the Recommended Friends feature?
#### (2) Is the test performing similarly for all user segments or are there differences among different segments?
#### (3) If you identified segments that responded differently to the test, can you guess the reason? Would this change your point 1 conclusions?

### Load the packages to be used

In [1]:
import pandas as pd
pd.set_option("display.max_columns", 10)
pd.set_option("display.width", 350)

from scipy import stats

import matplotlib.pyplot as plt
from matplotlib import rcParams
rcParams.update({"figure.autolayout": True})
import seaborn as sns
# A single call is enough: a second sns.set() would override the first style
sns.set(style = "whitegrid", color_codes = True)

In [2]:
user = pd.read_csv("../Datasets/Engagement_Test/user_table.csv")
test = pd.read_csv("../Datasets/Engagement_Test/test_table.csv")

### Look into the data

In [3]:
print(user.shape)
print(test.shape)

# head and info are methods: without parentheses, print() shows the bound method instead of calling it
print(user.head())
print(test.head())

user.info()
test.info()

print(len(user["user_id"]) == len(pd.unique(user["user_id"])))
print(len(test["user_id"]) == len(pd.unique(test["user_id"])))

(100000, 2)
(100000, 5)
   user_id signup_date
0       34  2015-01-01
1       59  2015-01-01
2      178  2015-01-01
3      285  2015-01-01
4      383  2015-01-01
   user_id        date  browser  test  pages_visited
0   600597  2015-08-13       IE     0              2
1  4410028  2015-08-26   Chrome     1              5
2  6004777  2015-08-17   Chrome     0              8
3  5990330  2015-08-27   Safari     0              8
4  3622310  2015-08-07  Firefox     0              1
[output truncated]

### Data processing

In [4]:
dat = user.merge(test, on = "user_id", how = "inner")
dat["signup_date"] = pd.to_datetime(dat["signup_date"])
dat["date"] = pd.to_datetime(dat["date"])
dat.head()

   user_id  signup_date        date  browser  test  pages_visited
0       34   2015-01-01  2015-08-15   Chrome     0              6
1       59   2015-01-01  2015-08-12   Chrome     1              6
2      178   2015-01-01  2015-08-10   Safari     1              3
3      285   2015-01-01  2015-08-03    Opera     0              5
4      383   2015-01-01  2015-08-05  Firefox     1              9
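An inner join silently drops any user that appears in only one of the two tables. The sketch below shows one way to check for that, using the `validate` and `indicator` options of `merge`. The frames `toy_user` and `toy_test` are made-up stand-ins for `user` and `test`, not the real data.

```python
import pandas as pd

# Hypothetical miniature versions of the user and test tables
toy_user = pd.DataFrame({"user_id": [1, 2, 3],
                         "signup_date": ["2015-01-01"] * 3})
toy_test = pd.DataFrame({"user_id": [1, 2, 4],
                         "test": [0, 1, 1]})

# An outer join keeps unmatched rows; indicator=True adds a "_merge" column,
# and validate raises if user_id is duplicated on either side
merged = toy_user.merge(toy_test, on="user_id", how="outer",
                        validate="one_to_one", indicator=True)

# Rows that an inner join would have dropped
unmatched = merged[merged["_merge"] != "both"]
print(unmatched[["user_id", "_merge"]])
```

If `unmatched` is empty on the real tables, the inner join used above loses nothing.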


#### (1) Is the test winning? That is, should 100% of the users see the Recommended Friends feature?

#### Overall

##### Define the function

In [5]:
def overall_ttest_mean(dat, variable, test):
    """Compare the mean of `variable` between test (1) and control (0) groups."""
    test_values = dat.loc[dat[test] == 1, variable]
    control_values = dat.loc[dat[test] == 0, variable]
    # equal_var = False selects Welch's t-test, which does not assume equal variances
    overall = stats.ttest_ind(test_values, control_values, equal_var = False)
    overall_result = {"test_group": test_values.mean(),
                      "control_group": control_values.mean(),
                      "pvalue": overall.pvalue}
    return overall_result

In [6]:
overall_ttest_mean(dat = dat, variable = "pages_visited", test = "test")

{'test_group': 4.599692945727161,
 'control_group': 4.608393853067447,
 'pvalue': 0.5774523171559118}
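The p-value alone does not show how large the difference could plausibly be. As a hedged sketch, the helper below (`welch_diff_ci`, a name introduced here for illustration) computes a confidence interval for the difference in means under the same Welch assumptions. The Poisson samples are synthetic stand-ins for `pages_visited`, not the real data.

```python
import numpy as np
from scipy import stats

def welch_diff_ci(test_values, control_values, alpha=0.05):
    """Return (diff, low, high) for mean(test) - mean(control) using Welch's method."""
    a = np.asarray(test_values, dtype=float)
    b = np.asarray(control_values, dtype=float)
    # Squared standard errors of each group mean
    va = a.var(ddof=1) / a.size
    vb = b.var(ddof=1) / b.size
    diff = a.mean() - b.mean()
    se = np.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (a.size - 1) + vb ** 2 / (b.size - 1))
    half_width = stats.t.ppf(1 - alpha / 2, df) * se
    return diff, diff - half_width, diff + half_width

# Synthetic example data (assumption: not drawn from the real tables)
rng = np.random.default_rng(0)
demo_test = rng.poisson(4.6, size=2000)
demo_control = rng.poisson(4.6, size=2000)

diff, low, high = welch_diff_ci(demo_test, demo_control)
print(f"difference = {diff:.3f}, 95% CI = [{low:.3f}, {high:.3f}]")
```

On the real data this would be called with the `pages_visited` values of the test and control groups. An interval that includes zero matches a p-value above 0.05.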

#### (2) Is the test performing similarly for all user segments or are there differences among different segments?

#### Stratified test

##### Define the function

In [7]:
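def stratified_ttest_mean(dat, stratified, variable, test):
    """Run Welch's t-test on `variable` (test vs. control) within each level of `stratified`."""
    def summarize(group):
        test_values = group.loc[group[test] == 1, variable]
        control_values = group.loc[group[test] == 0, variable]
        return pd.Series({
            "test_group": test_values.mean(),
            "control_group": control_values.mean(),
            "p_value": stats.ttest_ind(test_values, control_values, equal_var = False).pvalue
        })
    # apply() on the selected columns replaces the dict-based agg renaming that pandas deprecated
    stratified_result = dat.groupby(stratified)[[variable, test]].apply(summarize)
    return stratified_result.sort_values(by = "p_value")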

##### Stratified by browser

In [8]:
stratified_ttest_mean(dat = dat, stratified = "browser", variable = "pages_visited", test = "test").reset_index()


   browser  test_group  control_group       p_value
0    Opera         0.0       4.546438    2.253e-321
1  Firefox    4.714259       4.600164  0.0005817199
2   Chrome     4.69068       4.613341  0.0009434084
3       IE    4.685985       4.598478   0.007829509
4   Safari    4.692336        4.63818     0.2411738
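Opera's test-group mean of exactly 0.0 stands out. One way to inspect such a segment is to tabulate the sample size and the range of the metric for each browser and arm. The sketch below does this on a small made-up frame (`toy`), and `segment_summary` is a helper name introduced here, not part of the original notebook.

```python
import pandas as pd

# Hypothetical toy data mimicking the columns of dat
toy = pd.DataFrame({
    "browser": ["Opera", "Opera", "Opera", "Opera",
                "Chrome", "Chrome", "Chrome", "Chrome"],
    "test": [1, 1, 0, 0, 1, 1, 0, 0],
    "pages_visited": [0, 0, 5, 4, 6, 4, 5, 3],
})

def segment_summary(frame, segment, variable, test):
    """Count, mean, min and max of `variable` for every (segment, arm) pair."""
    return (frame.groupby([segment, test])[variable]
                 .agg(n="count", mean="mean", minimum="min", maximum="max")
                 .reset_index())

summary = segment_summary(toy, "browser", "pages_visited", "test")
print(summary)
```

A maximum of 0 in one arm means that no user in that arm recorded a single page view. That pattern looks more like a logging or rendering problem than a behavioural response.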


##### Stratified by date

In [9]:
stratified_ttest_mean(dat = dat, stratified = "date", variable = "pages_visited", test = "test").reset_index().sort_values(by = "date")


          date  test_group  control_group   p_value
19  2015-08-01    4.612378        4.55584  0.571976
12  2015-08-02    4.583475       4.498371  0.394535
13  2015-08-03    4.596356       4.664621  0.411361
6   2015-08-04    4.435338       4.547009  0.164519
30  2015-08-05    4.591736        4.58542  0.937803
23  2015-08-06    4.599778       4.581225  0.823933
26  2015-08-07    4.613122       4.598576  0.860229
21  2015-08-08    4.668805       4.711682   0.67455
20  2015-08-09     4.64934       4.701981  0.601104
11  2015-08-10    4.565242       4.638781  0.368794
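Running one test per segment raises the chance that at least one p-value looks small purely by chance. As a hedged sketch, the helper below (`holm_adjust`, a name introduced here) applies Holm's step-down correction to the browser-level p-values and to the ten date-level p-values listed above. Holm's method is one possible choice; the original analysis does not apply any correction.

```python
import numpy as np

def holm_adjust(pvalues):
    """Holm step-down adjusted p-values, returned in the input order."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    # The k-th smallest p-value (k = 0, 1, ...) is multiplied by (m - k)
    scaled = (m - np.arange(m)) * p[order]
    # Enforce monotonicity, then cap at 1
    adjusted_sorted = np.minimum(np.maximum.accumulate(scaled), 1.0)
    adjusted = np.empty(m)
    adjusted[order] = adjusted_sorted
    return adjusted

# p-values copied from the browser table (Opera, Firefox, Chrome, IE, Safari)
browser_p = [2.253e-321, 0.0005817199, 0.0009434084, 0.007829509, 0.2411738]
# p-values copied from the ten date rows shown above
date_p = [0.571976, 0.394535, 0.411361, 0.164519, 0.937803,
          0.823933, 0.860229, 0.67455, 0.601104, 0.368794]

browser_adjusted = holm_adjust(browser_p)
date_adjusted = holm_adjust(date_p)
print(browser_adjusted)
print(date_adjusted)
```

After adjustment, Opera, Firefox, Chrome and IE stay below 0.05, Safari does not, and none of the listed dates comes close to significance.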


#### (3) If you identified segments that responded differently to the test, can you guess the reason? Would this change your point 1 conclusions?

In [10]:
overall_ttest_mean(dat = dat[dat["browser"] != "Opera"], variable = "pages_visited", test = "test")

{'test_group': 4.694989417127971,
 'control_group': 4.609803639945011,
 'pvalue': 4.403954129457701e-08}

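Opera's test group averaged 0 pages visited, which most likely points to a bug in the feature on that browser and not to a real user response. With Opera excluded, the test group visits significantly more pages (4.69 vs. 4.61, p ≈ 4.4e-08), so the overall conclusion from point 1 changes. Even so, it would be rushed to roll the feature out to all users at this stage: the novelty effect still needs to be considered.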
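One common way to probe a novelty effect is to compare the lift among new users, for whom nothing about the site is familiar, with the lift among existing users. The sketch below assumes that a user counts as new when `signup_date` equals the session `date`. That rule, the helper name `effect_by_segment` and the synthetic data are assumptions made for illustration. The synthetic outcome is built so that only existing users respond to the feature.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in for dat (assumption: not the real data)
rng = np.random.default_rng(1)
n = 4000
toy = pd.DataFrame({
    "signup_date": pd.to_datetime(rng.choice(["2015-07-01", "2015-08-10"], size=n)),
    "date": pd.Timestamp("2015-08-10"),
    "test": rng.integers(0, 2, size=n),
})
# Assumed rule: a user is "new" if the session happens on the signup day
toy["user_type"] = np.where(toy["signup_date"] == toy["date"], "new", "existing")
# Built-in effect: one extra page on average, only for existing users in the test arm
lift = np.where((toy["test"] == 1) & (toy["user_type"] == "existing"), 1.0, 0.0)
toy["pages_visited"] = rng.poisson(4.6 + lift)

def effect_by_segment(frame, segment, variable, test):
    """Lift (test mean - control mean) and Welch p-value per segment level."""
    rows = {}
    for level, group in frame.groupby(segment):
        treated = group.loc[group[test] == 1, variable]
        control = group.loc[group[test] == 0, variable]
        rows[level] = {
            "lift": treated.mean() - control.mean(),
            "p_value": stats.ttest_ind(treated, control, equal_var=False).pvalue,
        }
    return pd.DataFrame.from_dict(rows, orient="index")

result = effect_by_segment(toy, "user_type", "pages_visited", "test")
print(result)
```

On the real data, a lift that appears only among existing users would be consistent with a novelty effect. A similar lift in both groups would be evidence against it.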