# Goal
A/B tests play a huge role in website optimization. Analyzing A/B test data is a very important
data scientist responsibility. In particular, data scientists have to make sure that results are
reliable and trustworthy, and that conclusions can be drawn from them.

Furthermore, companies often run tens, if not hundreds, of A/B tests at the same time. Manually
analyzing all of them would require a lot of time and people. Therefore, it is common practice to
look at the typical A/B test analysis steps and try to automate as much as possible. This frees
up time for the data scientists to work on higher-level topics.

Analyze results from an A/B test. Also, design an algorithm to automate some steps.

# Challenge Description



Company XYZ is a worldwide e-commerce site with localized versions of the site.

A data scientist at XYZ noticed that Spain-based users have a much higher conversion rate than
any other Spanish-speaking country. She therefore went and talked to the international team in
charge of Spain and LatAm to see if they had any ideas about why that was happening.

The Spain and LatAm country manager suggested that one reason could be translation. All Spanish-speaking
countries had the same translation of the site, which was written by a Spaniard. They
agreed to try a test where each country would have its own translation written by a local. That is,
Argentinian users would see a translation written by an Argentinian, Mexican users by a Mexican
and so on. Obviously, nothing would change for users from Spain.

After running the test, however, they are really surprised because the test is negative. That is, it
appears that the non-localized translation was doing better!

You are asked to:

* Confirm that the test is actually negative. That is, it appears that the old version of the
site with just one translation across Spain and LatAm performs better
* Explain why that might be happening. Are the localized translations really worse?
* If you identified what was wrong, design an algorithm that would return FALSE if the
same problem is happening in the future and TRUE if everything is good and the results
can be trusted.
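
One possible shape for such a check, as a minimal sketch rather than the expected solution: in a properly randomized test, the share of test users should be roughly the same within every segment of a user attribute. The function below compares the test share inside each segment with the test share outside it (two-proportion z-test, normal approximation) and returns False when any segment deviates significantly. The name `randomization_ok`, the `alpha` threshold and the Bonferroni-style correction are illustrative assumptions, not part of the challenge.

```python
import math
import pandas as pd

def randomization_ok(df, columns, test_col="test", alpha=0.01):
    """Return True if the test/control split looks balanced across `columns`."""
    p_values = []
    for col in columns:
        levels = df[col].astype(str)  # missing values become the level "nan"
        for level in levels.unique():
            mask = levels == level
            inside, outside = df.loc[mask, test_col], df.loc[~mask, test_col]
            n1, n2 = len(inside), len(outside)
            if n1 == 0 or n2 == 0:
                continue
            # Pooled share of test users, used for the standard error
            pooled = (inside.sum() + outside.sum()) / (n1 + n2)
            se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
            if se == 0:
                continue
            z = (inside.mean() - outside.mean()) / se
            # Two-sided p-value under the normal approximation
            p_values.append(math.erfc(abs(z) / math.sqrt(2)))
    if not p_values:
        return True
    threshold = alpha / len(p_values)  # Bonferroni-style correction (assumption)
    return all(p >= threshold for p in p_values)

# Toy data: country "A" is heavily over-represented in the test group
skewed = pd.DataFrame({
    "country": ["A"] * 1000 + ["B"] * 1000,
    "test": [1] * 800 + [0] * 200 + [1] * 500 + [0] * 500,
})
print(randomization_ok(skewed, ["country"]))
```

Because Spain-based users are always in control by design, they would need to be excluded before running a check like this on the real data.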

# Data

## Columns:
### Test Table
* user_id : the id of the user. Unique by user. Can be joined to user id in the other table.
For each user, we just check whether conversion happens the first time they land on the
site since the test started.
* date : when they came to the site for the first time since the test started
* source : marketing channel: Ads, SEO, Direct. Direct means everything except for ads
and SEO, such as directly typing the site URL in the browser, downloading the app w/o
coming from SEO or Ads, referral from a friend, etc.
* device : device used by the user. It can be mobile or web
* browser_language : in browser or app settings, the language chosen by the user. It can
be EN, ES, Other (Other means any language except for English and Spanish)
* ads_channel : if marketing channel is ads, this is the site where the ad was displayed. It
can be: Google, Facebook, Bing, Yahoo, Other. If the user didn't come via an ad, this
field is NA
* browser : user browser. It can be: IE, Chrome, Android_App, FireFox, Iphone_App,
Safari, Opera
* conversion : whether the user converted (1) or not (0). This is our label. A test is
considered successful if it increases the proportion of users who convert.
* test : users are randomly split into test (1) and control (0). Test users see the new
translation and control the old one. For Spain-based users, this is obviously always 0
since there is no change there.

### User Table

* user_id : the id of the user. It can be joined to user id in the other table
* sex : user sex: Male or Female
* age : user age (self-reported)
* country : user country based on ip address


In [2]:
# read the data into python
import pandas as pd
import numpy as np
test_table = pd.read_csv('test_table.csv')
user_table = pd.read_csv('user_table.csv')

In [5]:
# look at the data types
test_table.dtypes

user_id              int64
date                object
source              object
device              object
browser_language    object
ads_channel         object
browser             object
conversion           int64
test                 int64
dtype: object

In [6]:
user_table.dtypes

user_id     int64
sex        object
age         int64
country    object
dtype: object

In [7]:
# look at the first few rows of the data
test_table.head()

Unnamed: 0,user_id,date,source,device,browser_language,ads_channel,browser,conversion,test
0,315281,12/3/2015,Direct,Web,ES,,IE,1,0
1,497851,12/4/2015,Ads,Web,ES,Google,IE,0,1
2,848402,12/4/2015,Ads,Web,ES,Facebook,Chrome,0,0
3,290051,12/3/2015,Ads,Mobile,Other,Facebook,Android_App,0,1
4,548435,11/30/2015,Ads,Web,ES,Google,FireFox,0,1
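
The `date` column is read in as `object`, i.e. plain strings. A small sketch of converting it to a proper datetime, assuming the month/day/year layout suggested by values such as `11/30/2015`; the `sample` frame is a stand-in for `test_table` built from the rows shown above.

```python
import pandas as pd

# Small stand-in for test_table, using date strings in the format shown by head()
sample = pd.DataFrame({
    "user_id": [315281, 497851, 548435],
    "date": ["12/3/2015", "12/4/2015", "11/30/2015"],
})

# Parse month/day/year strings; errors="raise" surfaces malformed dates immediately
sample["date"] = pd.to_datetime(sample["date"], format="%m/%d/%Y", errors="raise")

print(sample.dtypes)
print(sample["date"].min(), sample["date"].max())
```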


In [8]:
user_table.head()

Unnamed: 0,user_id,sex,age,country
0,765821,M,20,Mexico
1,343561,F,27,Nicaragua
2,118744,M,23,Colombia
3,987753,F,27,Venezuela
4,554597,F,20,Spain


In [13]:
test_table.describe()

Unnamed: 0,user_id,conversion,test
count,453321.0,453321.0,453321.0
mean,499937.514728,0.049579,0.476446
std,288665.193436,0.217073,0.499445
min,1.0,0.0,0.0
25%,249816.0,0.0,0.0
50%,500019.0,0.0,0.0
75%,749522.0,0.0,1.0
max,1000000.0,1.0,1.0


In [12]:
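# note: without parentheses this only displays the bound method (see the output) and removes nothing;
# to drop duplicate rows, call it and keep the result: test_table = test_table.drop_duplicates()
test_table.drop_duplicates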

<bound method DataFrame.drop_duplicates of         user_id        date  source  device browser_language ads_channel  \
0        315281   12/3/2015  Direct     Web               ES         NaN   
1        497851   12/4/2015     Ads     Web               ES      Google   
2        848402   12/4/2015     Ads     Web               ES    Facebook   
3        290051   12/3/2015     Ads  Mobile            Other    Facebook   
4        548435  11/30/2015     Ads     Web               ES      Google   
...         ...         ...     ...     ...              ...         ...   
453316   425010   12/4/2015     SEO     Web               ES         NaN   
453317   826793   12/1/2015     SEO  Mobile               ES         NaN   
453318   514870   12/2/2015     Ads  Mobile               ES        Bing   
453319   785224   12/4/2015     SEO  Mobile               ES         NaN   
453320   241662   12/4/2015     Ads     Web               ES    Facebook   

            browser  conversion  test  
0   

In [16]:
user_table.describe()

Unnamed: 0,user_id,age
count,452867.0,452867.0
mean,499944.805166,27.13074
std,288676.264784,6.776678
min,1.0,18.0
25%,249819.0,22.0
50%,500019.0,26.0
75%,749543.0,31.0
max,1000000.0,70.0


In [23]:
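# note: without parentheses this only displays the bound method (see the output) and removes nothing;
# to drop duplicate rows, call it and keep the result: user_table = user_table.drop_duplicates()
user_table.drop_duplicates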

<bound method DataFrame.drop_duplicates of         user_id sex  age    country
0        765821   M   20     Mexico
1        343561   F   27  Nicaragua
2        118744   M   23   Colombia
3        987753   F   27  Venezuela
4        554597   F   20      Spain
...         ...  ..  ...        ...
452862   756215   F   27  Venezuela
452863    36888   M   18  Argentina
452864   800559   M   28    Bolivia
452865   176584   M   19      Chile
452866   314649   M   24     Mexico

[452867 rows x 4 columns]>
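
A minimal sketch of an explicit duplicate check, on a toy frame with made-up values: count fully duplicated rows, check whether the key column is unique, and call `drop_duplicates()` while keeping its result.

```python
import pandas as pd

# Toy frame with one repeated row (values are illustrative)
toy = pd.DataFrame({
    "user_id": [1, 2, 2, 3],
    "country": ["Spain", "Chile", "Chile", "Mexico"],
})

# Number of fully duplicated rows (every column equal to an earlier row)
n_dup_rows = int(toy.duplicated().sum())

# Whether the key column alone is unique
ids_unique = toy["user_id"].is_unique

# Calling the method returns a new frame; assign it to keep the result
deduped = toy.drop_duplicates().reset_index(drop=True)

print(n_dup_rows, ids_unique, len(deduped))
```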

In [38]:
distinct_user = np.unique(user_table['user_id'])
print('distinct user number:', distinct_user.size )
print('all user number:', user_table['user_id'].size)

distinct user number: 452867
all user number: 452867


In [41]:
# The user table appears to be incomplete: some user ids in the test table are missing from it.
# When joining, we have to be careful not to lose the users that are in the test table but not in the user table.
test_user = np.unique(test_table['user_id'])
print('distinct user number of test table:', test_user.size )
print('all user number for user table:', user_table['user_id'].size)
print('different number of user between test and user table', test_user.size-user_table['user_id'].size)


distinct user number of test table: 453321
all user number for user table: 452867
different number of user between test and user table 454
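
To see exactly which users are affected rather than only how many, one option is a left merge with `indicator=True`. A sketch on toy frames (the values and the names `test_toy` and `user_toy` are illustrative):

```python
import pandas as pd

test_toy = pd.DataFrame({"user_id": [1, 2, 3, 4], "test": [0, 1, 1, 0]})
user_toy = pd.DataFrame({"user_id": [1, 2, 4], "country": ["Spain", "Mexico", "Chile"]})

# how="left" keeps every test row; the _merge column marks rows with no match in user_toy
merged = test_toy.merge(user_toy, on="user_id", how="left", indicator=True)
missing_ids = merged.loc[merged["_merge"] == "left_only", "user_id"].tolist()

print(missing_ids)
```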


In [60]:
# join the two tables, keeping every row of the test table
all_table = test_table.merge(user_table, on = 'user_id', how = 'left')
all_table.describe()

Unnamed: 0,user_id,conversion,test,age
count,453321.0,453321.0,453321.0,452867.0
mean,499937.514728,0.049579,0.476446,27.13074
std,288665.193436,0.217073,0.499445,6.776678
min,1.0,0.0,0.0,18.0
25%,249816.0,0.0,0.0,22.0
50%,500019.0,0.0,0.0,26.0
75%,749522.0,0.0,1.0,31.0
max,1000000.0,1.0,1.0,70.0


In [61]:
all_table.head()


Unnamed: 0,user_id,date,source,device,browser_language,ads_channel,browser,conversion,test,sex,age,country
0,315281,12/3/2015,Direct,Web,ES,,IE,1,0,M,32.0,Spain
1,497851,12/4/2015,Ads,Web,ES,Google,IE,0,1,M,21.0,Mexico
2,848402,12/4/2015,Ads,Web,ES,Facebook,Chrome,0,0,M,34.0,Spain
3,290051,12/3/2015,Ads,Mobile,Other,Facebook,Android_App,0,1,F,22.0,Mexico
4,548435,11/30/2015,Ads,Web,ES,Google,FireFox,0,1,M,19.0,Mexico


In [62]:
# look at the conversion rates to check that Spain really converts much better than the LatAm countries

# numeric_only=True skips the string columns, which recent pandas versions refuse to average
all_table.groupby(['country','test'],dropna=False).mean(numeric_only=True)

country,test,user_id,conversion,age
Argentina,0,496688.319367,0.015071,27.124198
Argentina,1,499662.469888,0.013725,27.132782
Bolivia,0,496550.053333,0.049369,27.196937
Bolivia,1,499866.009508,0.047901,26.997309
Chile,0,505092.208566,0.048107,27.188268
Chile,1,498097.984622,0.051295,27.2328
Colombia,0,499438.290867,0.052089,27.178455
Colombia,1,498809.285926,0.050571,27.106777
Costa Rica,0,496111.016165,0.052256,27.222556
Costa Rica,1,498039.87165,0.054738,27.07399
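
Since the table above is split by `test`, the original observation about Spain is easier to read from the control group alone, ranked by conversion rate. A sketch on a toy frame standing in for `all_table` (the numbers are made up):

```python
import pandas as pd

toy = pd.DataFrame({
    "country": ["Spain"] * 4 + ["Mexico"] * 4 + ["Chile"] * 4,
    "test": [0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1],
    "conversion": [1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0],
})

# Keep control users only, then rank countries by conversion rate
control_rates = (
    toy[toy["test"] == 0]
    .groupby("country")["conversion"]
    .agg(["mean", "size"])  # rate and number of users behind it
    .sort_values("mean", ascending=False)
)

print(control_rates)
```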


In [None]:
# run a t-test comparing test and control conversion for all countries other than Spain
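
A minimal sketch of that comparison, assuming the joined frame has the same columns as `all_table`. It computes Welch's t statistic by hand and approximates the two-sided p-value with the normal distribution, which is only reasonable when each group is large, as it is in the real data; `scipy.stats.ttest_ind(a, b, equal_var=False)` would be the usual library route. The helper name `welch_test` and the toy data are illustrative, and the toy sample is far too small for the approximation, so it only exercises the code.

```python
import math
import pandas as pd

def welch_test(a, b):
    """Welch's t statistic for mean(a) - mean(b), with a normal-approximation p-value."""
    a, b = pd.Series(a, dtype="float64"), pd.Series(b, dtype="float64")
    se = math.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    t = (a.mean() - b.mean()) / se
    p = math.erfc(abs(t) / math.sqrt(2))  # two-sided, large samples only
    return t, p

# Toy stand-in for all_table (values are made up)
toy = pd.DataFrame({
    "country": ["Spain"] * 4 + ["Mexico"] * 8,
    "test": [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1],
    "conversion": [1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0],
})

# Spain has no test users, so exclude it before comparing the two groups
non_spain = toy[toy["country"] != "Spain"]
test_conv = non_spain.loc[non_spain["test"] == 1, "conversion"]
control_conv = non_spain.loc[non_spain["test"] == 0, "conversion"]

t_stat, p_value = welch_test(test_conv, control_conv)
print(f"test mean={test_conv.mean():.3f} control mean={control_conv.mean():.3f} "
      f"t={t_stat:.3f} p={p_value:.3f}")
```

A negative t statistic means the test group converted less than control; whether that difference can be trusted depends on the groups being comparable in the first place.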

