![alt text](https://www.invespcro.com/blog/images/blog-images/ab-test-1-1.jpg)

## Table of Contents
- [Introduction](#intro)
- [Part I - Data Wrangling](#Data_wrangling)
- [Part II - Probability](#prob_test)
- [Part III - A/B Test](#A/B_test)
- [Part IV - A regression approach](#reg)
- [Part V -  Influences associated with time](#time)
- [Conclusion](#conc.)
- [Limitations](#lim)

<a id='intro'></a>
### Introduction
A/B tests are very commonly performed by data analysts and data scientists. For this project, I will be working to understand the results of an A/B test run by an e-commerce website. My goal is to work through this notebook to help the company understand if they should implement the new page, keep the old page, or perhaps run the experiment longer to make their decision.<br>

The dataset used is __ab_data.csv__.<br>

### Information about features in dataset.

- __user_id__ - unique identifier for each user
- __timestamp__ - associated date and time for each visit to the website by a given user
- __group__ - the category a user was grouped into pre-A/B test (control or treatment groups)
- __landing_page__ - the page that was displayed to a user when they visited the company website (new_page or old_page)
- __converted__ - whether a user converted or not (0 or 1)

NB: Users in the control group ought to be shown the old page, while those in the treatment group ought to see the new page.

__Importing important libraries__

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

__Loading Data__

In [None]:
df = pd.read_csv('../input/ab-tests/ab_data.csv')

In [None]:
# Showing Top-5 rows from our data.
df.head()

<a id='Data_wrangling'></a>
## PART I- Data Wrangling

In [None]:
# Size of our data. 1472390 entries.
df.size

In [None]:
# Rows:294478, Columns:5
df.shape

In [None]:
# Number of unique users in our data.
df.user_id.nunique()

290584 unique users in our data.

In [None]:
# Null Values count in our data
df.isnull().sum()

No null values in our data.

In [None]:
# Total number of duplicate users in our data.
df.user_id.duplicated().sum()

This could mean that some users were included in the experiment more than once, or it could be some kind of error.<br>
We will deal with the duplicate data later.

In [None]:
df.info()

Columns datatypes:
- user_id ___int64___
- timestamp ___object___
- group ___object___
- landing_page ___object___
- converted ___int64___

__Calculating the overall proportion of users who converted.__

In [None]:
(df['converted']==1).mean()

Approximately 12 percent of users converted, but at this point we don't know which group had the higher conversion rate. We will find out later.

In [None]:
df.head()

In [None]:
df['group'].unique()

There are two values in our `group` column:
- `control`
- `treatment`

In [None]:
df['landing_page'].unique()

There are two values in our `landing_page` column:
- `old_page`
- `new_page`

__The number of times `treatment` does not line up with `new_page`.__

In [None]:
op = df.query('group == "treatment"')

In [None]:
op['landing_page'].value_counts()

We can see that the `treatment` group was shown the `old_page` __1965__ times.

__The number of times `control` does not line up with `old_page`.__

In [None]:
op2 = df.query('group == "control"')

In [None]:
op2['landing_page'].value_counts()

We can see that the `control` group was shown the `new_page` __1928__ times.

Since we now have evidence that `group` and `landing_page` do not always line up as they should, we can remove the rows that contain this erroneous data so that the rest of the analysis is more reliable.
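Both mismatch counts can also be read from a single cross-tabulation, and the same boolean mask can be reused to select the rows to drop. The sketch below uses a small hypothetical DataFrame named `toy` (not the real `ab_data.csv`) purely to illustrate the idea.

```python
import pandas as pd

# Hypothetical toy data, not taken from ab_data.csv.
toy = pd.DataFrame({
    "group": ["control", "control", "treatment", "treatment", "treatment"],
    "landing_page": ["old_page", "new_page", "new_page", "old_page", "new_page"],
})

# True for rows where the group and the page shown disagree.
mismatch = (toy["group"] == "treatment") != (toy["landing_page"] == "new_page")

# The control/new_page and treatment/old_page cells are the mismatch counts.
counts = pd.crosstab(toy["group"], toy["landing_page"])
print(counts)

# Keep only the consistent rows.
clean = toy[~mismatch]
print(len(clean))
```

Filtering with the negated mask avoids building an intermediate list of index labels.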

In [None]:
# Getting the rows where 'group' does not line up with 'landing_page'.
error_df = df[(df['group']=='treatment') != (df['landing_page']=='new_page')]

In [None]:
# List of indices of rows which contain erroneous data.
indx = error_df.index.to_list()

In [None]:
# Making another dataframe df2 in order to remove erroneous data.
df2 = df.copy()

In [None]:
# Dropping rows where 'group' does not line up with 'landing_page'.
df2.drop(index=indx,axis=0,inplace=True)

In [None]:
# Checking that no mismatched rows remain (should be 0).
df2[(df2['group']=='treatment') != (df2['landing_page']=='new_page')].shape[0]

__Checking unique user_ids in df2__

In [None]:
df2.user_id.nunique()

In [None]:
df2['user_id'].value_counts()

There is one user id, 773192, which appears twice in our data.<br>
I will drop one of the two rows for this user id, as we don't want duplicate data in our analysis.
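As an alternative to dropping a row by its index label, duplicates can be removed by key. This is a minimal sketch on a hypothetical DataFrame named `toy`; it assumes that keeping the first occurrence of each `user_id` is acceptable.

```python
import pandas as pd

# Hypothetical toy data: user 2 appears twice.
toy = pd.DataFrame({
    "user_id": [1, 2, 2, 3],
    "converted": [0, 1, 1, 0],
})

# Keep the first row for each user_id and drop later repeats.
deduped = toy.drop_duplicates(subset="user_id", keep="first")

# Number of repeated user ids remaining after the drop.
print(deduped["user_id"].duplicated().sum())
```

This does not depend on knowing the row's index label in advance, so it keeps working if the data is reloaded or reordered.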

In [None]:
# Row information for the repeated user id (user_id is an integer column, so compare with an integer).
df2.query('user_id == 773192')

In [None]:
# Dropping one of the two duplicate rows by its index label.
df2.drop(index=2893,axis=0,inplace=True)

In [None]:
df2.user_id.duplicated().sum()

<a id='prob_test'></a>
## PART II- Probability test.

__What is the probability of an individual converting regardless of the page they receive?__

In [None]:
df2['converted'].mean()

The probability of an individual converting regardless of the page they receive is approx. __0.1196__.

__Given that an individual was in the `control` group, what is the probability they converted?__

In [None]:
df2[df2['group']=='control']['converted'].mean()

The probability of converting, given that an individual was in the control group, is approx. __0.1203__.

__Given that an individual was in the `treatment` group, what is the probability they converted?__

In [None]:
df2[df2['group']=='treatment']['converted'].mean()

The probability of converting, given that an individual was in the treatment group, is approx. __0.1188__.

__What is the probability that an individual received the new page?__

In [None]:
(df2['landing_page']=='new_page').sum()/df2.shape[0]

The probability that an individual received the new page is approx. __0.5001__.

In [None]:
df2.head(20)

From the above analysis I have drawn several conclusions:<br>
1- We cannot say that the new page leads to more conversions.<br>
2- The probability of conversion in the control group (__0.1203__) was slightly higher than in the treatment group (__0.1188__).<br>
3- The probability that an individual received the new page (__0.5001__) was marginally higher than the probability of receiving the old page (__0.4999__), so the two groups are almost equal in size.
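The per-group conversion rates and group sizes can also be computed in one step with a group-by. The sketch below uses a hypothetical DataFrame named `toy` with made-up values; the column names match the dataset, but the numbers do not.

```python
import pandas as pd

# Hypothetical toy data with made-up conversions.
toy = pd.DataFrame({
    "group": ["control"] * 4 + ["treatment"] * 4,
    "converted": [1, 0, 0, 0, 0, 0, 1, 1],
})

# Conversion rate and number of users for each group.
summary = toy.groupby("group")["converted"].agg(rate="mean", users="count")
print(summary)
```

Seeing the rate next to the group size makes it easier to judge whether a small difference in rates is backed by enough users.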


<a id='A/B_test'></a>
## PART III- A/B Test

__Null Hypothesis__: The conversion rate of individuals who landed on the new page is less than or equal to the conversion rate of individuals who landed on the old page.<br>
__Alternative Hypothesis__: The conversion rate of individuals who landed on the new page is greater than the conversion rate of individuals who landed on the old page.<br>
$H_0$ (Null Hypothesis): $p_{new} \leq p_{old}$.<br>
$H_1$ (Alternative Hypothesis): $p_{new} > p_{old}$.<br>


Assume under the null hypothesis, $p_{new}$ and $p_{old}$ both have "true" success rates equal to the **converted** success rate regardless of page - that is $p_{new}$ and $p_{old}$ are equal. Furthermore, assume they are equal to the **converted** rate in **ab_data.csv** regardless of the page.<br><br>
I am taking the type-1 error rate as 5% (0.05) for the following reasons:
- It is better to have a low type-1 error rate. A type-1 error means choosing the alternative hypothesis when the null hypothesis is true, and I treat it here as the more costly of the two error types.
- If our p-value comes out greater than 0.05 (the type-1 error rate), then we fail to reject the null hypothesis. If it is less than or equal to 0.05, then we can reject the null hypothesis in favor of the alternative hypothesis.

In [None]:
# Null Hypothesis Pnew = Pold
Pnew = df2['converted'].mean()
Pold = df2['converted'].mean()

In [None]:
# Sample size.
n_new= df2[df2['landing_page']=='new_page']['group'].shape[0]
n_old= df2[df2['landing_page']=='old_page']['group'].shape[0]

In [None]:
# Simulating n_new and n_old transactions with a convert rate of Pnew and Pold under the null.
new_page_converted = np.random.binomial(1,p=Pnew,size=n_new)
old_page_converted = np.random.binomial(1,p=Pold,size=n_old)

In [None]:
# Calculating the difference of the mean of new_page conversion rate and old_page conversion rate under the null.
_diff = new_page_converted.mean() - old_page_converted.mean()
print(_diff)

We can see that the difference is approximately equal to 0, which is what we expect when both samples are drawn with the same conversion rate under the null. A single simulated difference tells us little on its own, so I will repeat the same simulation 10,000 times to build the sampling distribution of the __difference__ under the null.

In [None]:
# Simulate 10,000 Pnew - Pold values using the same process as above.
new = np.random.binomial(n_new,Pnew,10000)
old = np.random.binomial(n_old,Pold,10000)
# Differences in simulated conversion rates, as one flat array of 10,000 values.
p_diff = new/n_new - old/n_old

In [None]:
# Plot a histogram of the p_diffs.
plt.hist(p_diff);

This plot is what I expected: the sampling distribution of p_diff is approximately normal and centered near 0.
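The spread of this distribution can be checked against the standard error predicted by the normal approximation under the null. The sketch below is self-contained and uses hypothetical values for the conversion rate and sample sizes (`p_null`, `n_new`, `n_old`), not the values from this dataset.

```python
import numpy as np

# Hypothetical values, chosen only for illustration.
p_null = 0.12
n_new, n_old = 5000, 5000

# Seeded generator so the simulation is reproducible.
rng = np.random.default_rng(0)
new = rng.binomial(n_new, p_null, 10000) / n_new
old = rng.binomial(n_old, p_null, 10000) / n_old
p_diffs = new - old

# Standard error of the difference under the null (normal approximation).
theory_se = np.sqrt(p_null * (1 - p_null) * (1 / n_new + 1 / n_old))

print(p_diffs.std(), theory_se)
```

If the simulated standard deviation is close to the theoretical one, the bell shape of the histogram is consistent with the normal approximation.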

__Calculating p-value.__<br>
p-value is the probability of observing your statistic (or one more extreme in favor of the alternative) if the null hypothesis is true.

In [None]:
# Calculating observed difference by taking conversion rate of new_page and old_page from df2 respectively.
# Also, plotting histogram to show the p_diff under null and observed difference value.
# If most simulated p_diff values lie above the observed difference, the p-value is large and we fail to reject the null hypothesis.
obs_diff = df2[df2['landing_page']=='new_page']['converted'].mean() - df2[df2['landing_page']=='old_page']['converted'].mean()
plt.hist(p_diff)
plt.axvline(obs_diff,c='red');

The red line in the above plot shows the location of observed difference value which is approx. __-0.00157__.

In [None]:
(np.array(p_diff) > obs_diff).mean()

- The p-value is 0.907.<br>
- Since the p-value is greater than the type-1 error rate (0.05), we fail to reject the null hypothesis.
- The data are therefore consistent with our null hypothesis that the conversion rate of the new page is less than or equal to that of the old page. We have no evidence that the new page converts better.

We could also use a built-in function to achieve similar results. Though using the built-in might be easier to code, the above portions are a walkthrough of the ideas that are critical to thinking correctly about statistical significance. We calculate the number of conversions for each page, as well as the number of individuals who received each page. Let `n_old` and `n_new` refer to the number of rows associated with the old and new pages, respectively.

In [None]:
import statsmodels.api as sm

convert_old = df2[df2['landing_page']=='old_page']['converted'].sum()
convert_new = df2[df2['landing_page']=='new_page']['converted'].sum()
n_old = df2[df2['landing_page']=='old_page']['group'].shape[0]
n_new = df2[df2['landing_page']=='new_page']['group'].shape[0]

In [None]:
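# With counts ordered [old, new], alternative='smaller' tests p_old < p_new, i.e. the alternative p_new > p_old.
sm.stats.proportions_ztest([convert_old,convert_new],[n_old,n_new],alternative='smaller')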

Using stats.proportions_ztest, the p-value (0.905) is close to what we got from the simulation above (0.907). Again we fail to reject the null hypothesis that the conversion rate of the new page is less than or equal to that of the old page. This result agrees with my earlier findings.
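To make the test less of a black box, here is a minimal sketch of the textbook pooled two-proportion z-test, computed by hand with the standard library. The function name `one_sided_z_test` and the counts passed to it are hypothetical, and I have not checked this against the internals of statsmodels, so treat it as an illustration of the idea rather than a drop-in replacement.

```python
import math

def one_sided_z_test(conv_new, n_new, conv_old, n_old):
    """Pooled z-test for H1: p_new > p_old. Returns (z, p_value)."""
    p_new = conv_new / n_new
    p_old = conv_old / n_old
    # Pooled conversion rate under the null that both rates are equal.
    p_pool = (conv_new + conv_old) / (n_new + n_old)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_new + 1 / n_old))
    z = (p_new - p_old) / se
    # Upper-tail probability P(Z > z) of the standard normal.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value

# Hypothetical counts: the new page converts slightly worse than the old one.
z, p_value = one_sided_z_test(100, 1000, 130, 1000)
print(z, p_value)
```

When the new page converts worse than the old page, `z` is negative and the one-sided p-value is above 0.5, which is the same pattern seen in the results above.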

<a id='reg'></a>
## PART IV- A regression approach

In [None]:
# Cast to int because recent pandas versions return boolean dummy columns.
df2['ab_page'] = pd.get_dummies(df2.group)['treatment'].astype(int)

In [None]:
df2.head()

In [None]:
# Instantiate the model, and fit the model using the two columns 'intercept' and 'ab_page'.
df2['intercept'] = 1
li = sm.Logit(df2['converted'],df2[['intercept','ab_page']])
m = li.fit()
m.summary2()

The p-value obtained from the above summary is 0.189, which differs from what we obtained in our A/B test. This is because the regression tests a two-sided hypothesis, whereas the A/B test above was one-sided:<br>
- $H_0$ (Null Hypothesis): $p_{new} - p_{old} = 0$.<br>
- $H_1$ (Alternative Hypothesis): $p_{new} - p_{old} \neq 0$.<br>

It is still not statistically significant, since the p-value is greater than our type-1 error rate (0.05).
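The two p-values can be related with a little arithmetic. For a symmetric test statistic, the one-sided p-value in the direction of the alternative is half the two-sided p-value when the estimate points in that direction, and one minus that half otherwise. The helper name `one_sided_from_two_sided` below is hypothetical; the inputs are the two-sided p-value reported above and the negative sign of the observed difference.

```python
def one_sided_from_two_sided(p_two_sided, estimate_sign):
    """One-sided p-value for H1 'effect > 0' from a symmetric two-sided p-value."""
    half = p_two_sided / 2
    # A positive estimate supports H1, so the small tail applies;
    # a negative estimate points away from H1, so the large tail applies.
    return half if estimate_sign > 0 else 1 - half

# Two-sided p-value from the regression, with a negative observed difference.
p_one_sided = one_sided_from_two_sided(0.189, -1)
print(round(p_one_sided, 4))
```

The result is about 0.9055, which lines up with the one-sided p-value of roughly 0.905 from the z-test.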

Now I am considering other factors that might influence whether or not an individual converts. Adding more features is a good idea because they might explain some of the variation in conversion that the page alone does not. But there are some disadvantages associated with adding more features:<br>
1- A new feature may not be independent of the existing predictor variables. If it is not, the coefficient estimates can become unreliable.<br>
2- Adding a feature that is highly correlated with other features can distort our results and sometimes leads to wrong decisions. We therefore have to make sure that any added feature is not strongly correlated with the existing predictors.
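A quick way to screen for the second problem is to look at the pairwise correlations between candidate predictors before fitting. The sketch below uses a hypothetical DataFrame named `toy`, including a deliberately redundant column `ab_page_copy` that does not exist in the real data.

```python
import pandas as pd

# Hypothetical predictors; 'ab_page_copy' is deliberately redundant.
toy = pd.DataFrame({
    "ab_page": [0, 1, 0, 1, 0, 1],
    "UK": [1, 0, 0, 1, 0, 0],
    "ab_page_copy": [0, 1, 0, 1, 0, 1],
})

# Pairwise Pearson correlations between the predictors.
corr = toy.corr()
print(corr.round(2))
```

A correlation close to 1 or -1 between two predictors, as with the redundant column here, is a sign that one of them should be dropped before fitting the model.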

Now, along with testing whether the conversion rate changes for different pages, we also add an effect based on the country in which a user lives. We will need to read in the countries.csv dataset and merge our datasets on the appropriate rows.

In [None]:
# Loading countries data.
countries_df = pd.read_csv('../input/countries-data/countries.csv',index_col='user_id')

In [None]:
countries_df.head()

In [None]:
df3 = df2.set_index('user_id')

In [None]:
# Joining countries data with df3 on the shared 'user_id' index.
df4 = countries_df.join(df3,how='inner')

In [None]:
df4.head()

In [None]:
df4.shape

In [None]:
df4['country'].unique()

In [None]:
# Converting labels 'UK' and 'US' from categorical values to integer dummy variables.
df4[['UK','US']] = pd.get_dummies(df4.country)[['UK','US']].astype(int)
df4.head()

In [None]:
# Instantiate the model, and fit the model using the two columns created above.
li2 = sm.Logit(df4['converted'],df4[['intercept','ab_page','UK','US']])
m2 = li2.fit()
m2.summary2()

Does it appear that country had an impact on conversion?<br>
- No, it does not appear that country had an impact on conversion.
- Although p-value of 'UK' is very close to type-1 error rate 0.05, it is still not statistically significant.
- The p-values of the `ab_page`, `UK`, and `US` coefficients are all greater than 0.05, so there is no evidence that adding the 'country' feature explains a change in conversion rate.
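When reading a logistic regression summary, it can help to remember that each coefficient is on the log-odds scale, so exponentiating it gives the multiplicative change in the odds of converting relative to the omitted baseline category. The coefficient below is a hypothetical value, not one taken from the model output above.

```python
import math

# Hypothetical coefficient, NOT taken from the model summary.
coef = 0.05

# Odds ratio relative to the baseline category.
odds_ratio = math.exp(coef)
print(round(odds_ratio, 3))
```

An odds ratio close to 1 means the odds of converting barely differ from the baseline, which matches the picture of small, non-significant country effects.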

Though we have now looked at the individual effects of country and page on conversion, we would also like to look at the interaction between page and country to see if there are significant effects on conversion. We will create the necessary additional columns and fit the new model.

In [None]:
df4.head()

In [None]:
# Making another feature by interacting page and country feature.
df4['UK_treatment'] = df4['ab_page'] * df4['UK']
df4['US_treatment'] = df4['ab_page'] * df4['US']

In [None]:
df4.head()

In [None]:
# Instantiate the model, and fit the model using the two columns created above.
li3 = sm.Logit(df4['converted'],df4[['intercept','ab_page','UK','US','UK_treatment','US_treatment']])
m3 = li3.fit()
m3.summary2()

<a id='time'> </a>
## PART V- Influences associated with time

In [None]:
df4.head()

<br> 

__Trimming timestamp feature.__

In [None]:
df4['date'] = df4.timestamp.apply(lambda x: x[:10])

<br>

__Converting datatype of date feature from object to datetime.__

In [None]:
df4['date'] = pd.to_datetime(df4['date'])

In [None]:
# Checking datatype.
df4.info()

<br>

__Separating 'year', 'month', and 'day' from the date feature.__

In [None]:
df4['year'] = df4['date'].dt.year
df4['month'] = df4['date'].dt.month
df4['day'] = df4['date'].dt.day

In [None]:
# Adding week feature (ISO week number; Series.dt.week was removed in pandas 2.0).
df4['week'] = df4['date'].dt.isocalendar().week.astype(int)

In [None]:
df4.head()

In [None]:
df4['week'].value_counts()

In [None]:
# Converting week feature into integer dummy variables, taking week 1 as the baseline.
df4[['week_2','week_3','week_4']] = pd.get_dummies(df4.week)[[2,3,4]].astype(int)

In [None]:
li4 = sm.Logit(df4['converted'],df4[['intercept','ab_page','UK','US','UK_treatment','US_treatment','week_2','week_3','week_4']])
m4 = li4.fit()
m4.summary2()

<a id='conc.'></a>
# Conclusion

- From the summary in Part IV, I observed no statistically significant change in conversion rate.
- The interaction between landing page and country does not produce any significant change in conversion rate, since the p-values are greater than 0.05.
- Overall, there is no evidence that the conversion rate of the new page is higher than that of the old page, since the p-values of the predictor variables are greater than alpha, the type-1 error rate (0.05).
- From the summary in Part V, there is still no significant change in conversion rate after adding the weeks in which the A/B test was carried out. However, the p-value of week 4 is the lowest of all the weeks, so a change in conversion rate might become observable if the company ran the A/B test for longer.
- Finally, I conclude that group, landing page, country, and the combination of landing page and country do not have a statistically significant influence on conversion rate.<br><br>
__The company should stick with the old page for now and try adding more features or content to the new page.__

<a id='lim'> </a>
# Limitations

- There are not enough features in our dataset. More features might have helped us understand more about what drives conversion.
- The company did not run the A/B test for long enough. The test was carried out for only 24 days, which I think is not enough to select one page and reject the other.
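One way to reason about how long a test needs to run is to estimate how many users per group are required to detect a given lift. The sketch below uses a common normal-approximation formula for a one-sided two-proportion test. The function name `users_per_group`, the example rates, and the 80% power default are all assumptions made for illustration, not values from this analysis.

```python
import math
from statistics import NormalDist

def users_per_group(p_old, p_new, alpha=0.05, power=0.8):
    """Approximate users needed per group for a one-sided two-proportion z-test."""
    # Critical values for the type-1 error rate and the desired power (assumed 0.8).
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_power = NormalDist().inv_cdf(power)
    variance = p_old * (1 - p_old) + p_new * (1 - p_new)
    return math.ceil((z_alpha + z_power) ** 2 * variance / (p_new - p_old) ** 2)

# Hypothetical rates: detecting a lift from 12% to 13%.
print(users_per_group(0.12, 0.13))
```

Smaller lifts need many more users, so dividing the required number by the daily traffic gives a rough idea of how many days the experiment should run.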