# A/B Testing for ShoeFly.com
The online store ShoeFly.com is running an A/B test.
They have two different versions of an ad, which they have placed in emails, 
as well as in banner ads on Facebook, Twitter and Google. 
They want to know how the ads are performing on each of the different platforms on each day of the week.
*(Codecademy Project)*

In [2]:
# Preparation
import pandas as pd
ad_clicks = pd.read_csv("ABTesting/ad_click.csv")

In [3]:
# Initial data frame
ad_clicks.head()

| Unnamed: 0 | user_id | utm_source | day | ad_click_timestamp | experimental_group |
|---|---|---|---|---|---|
| 0 | 008b7c6c-7272-471e-b90e-930d548bd8d7 | google | 6 - Saturday | 7:18 | A |
| 1 | 009abb94-5e14-4b6c-bb1c-4f4df7aa7557 | facebook | 7 - Sunday | NaN | B |
| 2 | 00f5d532-ed58-4570-b6d2-768df5f41aed | twitter | 2 - Tuesday | NaN | A |
| 3 | 011adc64-0f44-4fd9-a0bb-f1506d2ad439 | google | 2 - Tuesday | NaN | B |
| 4 | 012137e6-7ae7-4649-af68-205b4702169c | facebook | 7 - Sunday | NaN | B |


## Analyzing the Ad Sources

In [5]:
# Add a column to the initial data frame to check if the user visited the site by clicking an ad
ad_clicks["is_ad_click"] = ~ad_clicks.ad_click_timestamp.isnull()

# Create a new data frame grouping the initial data frame by utm_source and is_ad_click
clicks_by_source = ad_clicks.groupby(["utm_source", "is_ad_click"]).user_id.count().reset_index()

# Pivot data frame for a better overview
clicks_by_source_pivot = clicks_by_source.pivot(columns="is_ad_click", index="utm_source", values="user_id").reset_index()

# Add a column to the pivoted data frame to display the percent of ad clicks
clicks_by_source_pivot["percent_clicked"] = clicks_by_source_pivot[True] / (clicks_by_source_pivot[True] + clicks_by_source_pivot[False]) * 100

clicks_by_source_pivot

| is_ad_click | utm_source | False | True | percent_clicked |
|---|---|---|---|---|
| 0 | email | 175 | 80 | 31.372549 |
| 1 | facebook | 324 | 180 | 35.714286 |
| 2 | google | 441 | 239 | 35.147059 |
| 3 | twitter | 149 | 66 | 30.697674 |
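
The group, pivot, and percent pattern used in this cell recurs for every breakdown in this notebook. As a minimal sketch, assuming the boolean `is_ad_click` column already exists, the same numbers can be obtained with a single aggregation. The helper name `click_rate_by` and the toy data frame are hypothetical and only serve as an illustration; they are not part of the project data.

```python
import pandas as pd


def click_rate_by(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Return click count, user count, and percent clicked per value of `column`."""
    summary = (
        df.groupby(column)["is_ad_click"]
        .agg(clicked="sum", total="count")  # sum of booleans = number of clicks
        .reset_index()
    )
    summary["percent_clicked"] = summary["clicked"] / summary["total"] * 100
    return summary


# Toy data, made up for illustration (not the ShoeFly.com data set)
toy = pd.DataFrame({
    "utm_source": ["email", "email", "google", "google", "google", "twitter"],
    "ad_click_timestamp": ["7:18", None, "9:02", None, "11:45", None],
})
toy["is_ad_click"] = toy["ad_click_timestamp"].notnull()

rates = click_rate_by(toy, "utm_source")
print(rates)
```

Calling `click_rate_by(ad_clicks, "utm_source")` or `click_rate_by(ad_clicks, "experimental_group")` should then give the same percentages as the pivoted tables, with `clicked` and `total` columns in place of `True` and `False`.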


## Analyzing an A/B Test

In [7]:
# Create a new data frame grouping the initial data frame by experimental_group and is_ad_click
clicks_by_group = ad_clicks.groupby(["experimental_group", "is_ad_click"]).user_id.count().reset_index()

# Pivot data frame for a better overview
clicks_by_group_pivot = clicks_by_group.pivot(columns="is_ad_click", index="experimental_group", values="user_id").reset_index()

# Add a column to the pivoted data frame to display the percent of ad clicks
clicks_by_group_pivot["percent_clicked"] = clicks_by_group_pivot[True] / (clicks_by_group_pivot[True] + clicks_by_group_pivot[False]) * 100

clicks_by_group_pivot

| is_ad_click | experimental_group | False | True | percent_clicked |
|---|---|---|---|---|
| 0 | A | 517 | 310 | 37.484885 |
| 1 | B | 572 | 255 | 30.834341 |
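
The table shows a gap of roughly 6.7 percentage points between the two groups. Whether a gap of that size could plausibly be chance can be checked with a two-proportion z-test. The notebook itself does not run such a test, so the following is only a sketch under the usual assumptions (independent users, large samples), and the function name `two_proportion_z_test` is hypothetical. It uses only the standard library and the counts from the table above.

```python
import math


def two_proportion_z_test(clicks_a, n_a, clicks_b, n_b):
    """Return (z, two-sided p-value) for the difference between two click rates."""
    rate_a = clicks_a / n_a
    rate_b = clicks_b / n_b
    # Pooled click rate under the null hypothesis that both ads perform equally
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Counts from the table above: A clicked 310 of 827, B clicked 255 of 827
z, p_value = two_proportion_z_test(310, 310 + 517, 255, 255 + 572)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

With these counts the p-value comes out below 0.01, which suggests the overall difference between the two ads is unlikely to be due to chance alone.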


## Analyzing by Day

In [9]:
# Create two separate data frames for the results of experimental groups A and B
a_clicks = ad_clicks[ad_clicks.experimental_group == "A"]
b_clicks = ad_clicks[ad_clicks.experimental_group == "B"]

In [10]:
# Analyze group A by day
a_click_group = a_clicks.groupby(["is_ad_click", "day"]).user_id.count().reset_index()
a_click_pivot = a_click_group.pivot(columns="is_ad_click", index="day", values="user_id").reset_index()
a_click_pivot["percent_clicked"] = a_click_pivot[True] / (a_click_pivot[True] + a_click_pivot[False]) * 100

a_click_pivot

| is_ad_click | day | False | True | percent_clicked |
|---|---|---|---|---|
| 0 | 1 - Monday | 70 | 43 | 38.053097 |
| 1 | 2 - Tuesday | 76 | 43 | 36.134454 |
| 2 | 3 - Wednesday | 86 | 38 | 30.645161 |
| 3 | 4 - Thursday | 69 | 47 | 40.517241 |
| 4 | 5 - Friday | 77 | 51 | 39.84375 |
| 5 | 6 - Saturday | 73 | 45 | 38.135593 |
| 6 | 7 - Sunday | 66 | 43 | 39.449541 |


In [11]:
# Analyze group B by day
b_click_group = b_clicks.groupby(["is_ad_click", "day"]).user_id.count().reset_index()
b_click_pivot = b_click_group.pivot(columns="is_ad_click", index="day", values="user_id").reset_index()
b_click_pivot["percent_clicked"] = b_click_pivot[True] / (b_click_pivot[True] + b_click_pivot[False]) * 100

b_click_pivot

| is_ad_click | day | False | True | percent_clicked |
|---|---|---|---|---|
| 0 | 1 - Monday | 81 | 32 | 28.318584 |
| 1 | 2 - Tuesday | 74 | 45 | 37.815126 |
| 2 | 3 - Wednesday | 89 | 35 | 28.225806 |
| 3 | 4 - Thursday | 87 | 29 | 25.0 |
| 4 | 5 - Friday | 90 | 38 | 29.6875 |
| 5 | 6 - Saturday | 76 | 42 | 35.59322 |
| 6 | 7 - Sunday | 75 | 34 | 31.192661 |
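
Reading the two daily tables side by side is easier with the difference computed per day. The sketch below does that in plain Python, using the click counts copied from the two tables above. The dictionary layout and the helper `percent_clicked` are illustrative choices, not part of the original notebook.

```python
# (clicked, not clicked) per day, copied from the group A and group B tables above
group_a = {
    "1 - Monday": (43, 70), "2 - Tuesday": (43, 76), "3 - Wednesday": (38, 86),
    "4 - Thursday": (47, 69), "5 - Friday": (51, 77), "6 - Saturday": (45, 73),
    "7 - Sunday": (43, 66),
}
group_b = {
    "1 - Monday": (32, 81), "2 - Tuesday": (45, 74), "3 - Wednesday": (35, 89),
    "4 - Thursday": (29, 87), "5 - Friday": (38, 90), "6 - Saturday": (42, 76),
    "7 - Sunday": (34, 75),
}


def percent_clicked(clicked, not_clicked):
    """Share of users who clicked, in percent."""
    return clicked / (clicked + not_clicked) * 100


# Positive values mean Ad A had the higher click rate on that day
differences = {
    day: percent_clicked(*group_a[day]) - percent_clicked(*group_b[day])
    for day in group_a
}
days_b_ahead = [day for day, diff in differences.items() if diff < 0]

for day, diff in differences.items():
    print(f"{day}: A minus B = {diff:+.1f} percentage points")
print("Days where B is ahead:", days_b_ahead)
```

Each day has only a little over a hundred users per group, so the daily differences are noisier than the overall comparison and are best read as a rough indication.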


## Conclusion
After reviewing the results of the A/B test, I would definitely recommend Ad "A" over Ad "B".
Ad "A" outperforms Ad "B" overall, with 37.5% of users clicking compared to 30.8%.
In the day-by-day analysis, Ad "A" performs better on every day of the week except Tuesday, and there it is only slightly behind Ad "B" (36.1% compared to 37.8%).

Placing ads on Facebook and Google seems to make the most sense, since most of the ad clicks happen there and users on those platforms also click more often (about 35 to 36%) than users reached by email or on Twitter (about 31%).
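
The statement about where most clicks happen concerns click volume, which is different from the click rate in the `percent_clicked` column. As a rough check, the share of all ad clicks per source can be computed from the `True` counts in the ad sources table. The variable names below are illustrative.

```python
# Clicks per source, copied from the True column of the ad sources table
clicks = {"email": 80, "facebook": 180, "google": 239, "twitter": 66}

total_clicks = sum(clicks.values())
share = {source: count / total_clicks * 100 for source, count in clicks.items()}

# Print sources from largest to smallest share of clicks
for source, pct in sorted(share.items(), key=lambda item: item[1], reverse=True):
    print(f"{source}: {pct:.1f}% of all ad clicks")
```

Volume also depends on how many users each source reached (680 via Google and 504 via Facebook, compared to 255 via email and 215 via Twitter), so it should be read together with the click rates rather than on its own.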