
## Loading Data


In [None]:
import pandas as pd
import datetime
from datetime import date, timedelta
import plotly.graph_objects as go
import plotly.express as px
import plotly.io as pio
pio.templates.default = 'plotly_white'

control_data = pd.read_csv("/content/drive/MyDrive/Colab Notebooks/AB Testing/control_group.csv",sep = ';')
test_data = pd.read_csv("/content/drive/MyDrive/Colab Notebooks/AB Testing/test_group.csv",sep = ';')

In [None]:
control_data.head()

Unnamed: 0,Campaign Name,Date,Spend [USD],# of Impressions,Reach,# of Website Clicks,# of Searches,# of View Content,# of Add to Cart,# of Purchase
0,Control Campaign,1.08.2019,2280,82702.0,56930.0,7016.0,2290.0,2159.0,1819.0,618.0
1,Control Campaign,2.08.2019,1757,121040.0,102513.0,8110.0,2033.0,1841.0,1219.0,511.0
2,Control Campaign,3.08.2019,2343,131711.0,110862.0,6508.0,1737.0,1549.0,1134.0,372.0
3,Control Campaign,4.08.2019,1940,72878.0,61235.0,3065.0,1042.0,982.0,1183.0,340.0
4,Control Campaign,5.08.2019,1835,,,,,,,


In [None]:
test_data.head()


Unnamed: 0,Campaign Name,Date,Spend [USD],# of Impressions,Reach,# of Website Clicks,# of Searches,# of View Content,# of Add to Cart,# of Purchase
0,Test Campaign,1.08.2019,3008,39550,35820,3038,1946,1069,894,255
1,Test Campaign,2.08.2019,2542,100719,91236,4657,2359,1548,879,677
2,Test Campaign,3.08.2019,2365,70263,45198,7885,2572,2367,1268,578
3,Test Campaign,4.08.2019,2710,78451,25937,4216,2216,1437,566,340
4,Test Campaign,5.08.2019,2297,114295,95138,5863,2106,858,956,768


## Data Preparation

### Fix column names

In [None]:
control_data.columns =["Campaign Name", "Date", "Amount Spent",
                        "Number of Impressions", "Reach", "Website Clicks",
                        "Searches Received", "Content Viewed", "Added to Cart",
                        "Purchases"]


test_data.columns = ["Campaign Name", "Date", "Amount Spent",
                        "Number of Impressions", "Reach", "Website Clicks",
                        "Searches Received", "Content Viewed", "Added to Cart",
                        "Purchases"]

### Check null values

In [None]:
control_data.isnull().sum()

Unnamed: 0,0
Campaign Name,0
Date,0
Amount Spent,0
Number of Impressions,1
Reach,1
Website Clicks,1
Searches Received,1
Content Viewed,1
Added to Cart,1
Purchases,1


In [None]:
test_data.isnull().sum()

Unnamed: 0,0
Campaign Name,0
Date,0
Amount Spent,0
Number of Impressions,0
Reach,0
Website Clicks,0
Searches Received,0
Content Viewed,0
Added to Cart,0
Purchases,0


### Fix the null values in control_data

- Fill each null value with the mean of its column

In [None]:
control_data["Number of Impressions"]=control_data["Number of Impressions"].fillna(value = control_data["Number of Impressions"].mean())
control_data["Reach"] = control_data["Reach"].fillna(value = control_data["Reach"].mean())

control_data["Website Clicks"] = control_data["Website Clicks"].fillna(value = control_data["Website Clicks"].mean())
control_data["Searches Received"] = control_data["Searches Received"].fillna(value = control_data["Searches Received"].mean())

control_data["Content Viewed"] = control_data["Content Viewed"].fillna(value=control_data["Content Viewed"].mean())
control_data["Added to Cart"] = control_data["Added to Cart"].fillna(value=control_data["Added to Cart"].mean())
control_data["Purchases"] = control_data["Purchases"].fillna(value=control_data["Purchases"].mean())
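The seven assignments above all follow the same pattern, so they can be collapsed into a single call. The following is a minimal sketch on a hypothetical toy frame (the `toy` name and its values are illustrative, not taken from the dataset), assuming that filling every numeric column with its own mean is the intended behavior:

```python
import pandas as pd

# Hypothetical toy frame standing in for control_data; values are illustrative only.
toy = pd.DataFrame({
    "Campaign Name": ["Control Campaign"] * 3,
    "Reach": [100.0, None, 300.0],
    "Purchases": [10.0, None, 30.0],
})

# Mean of every numeric column, computed once (NaN values are skipped).
column_means = toy.mean(numeric_only=True)

# Passing a Series to fillna fills each column with the value under the matching label.
toy_filled = toy.fillna(column_means)
print(toy_filled)
```

Non-numeric columns such as `Campaign Name` have no entry in `column_means`, so they are left untouched.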

- Sanity check

In [None]:
control_data.isnull().sum()

Unnamed: 0,0
Campaign Name,0
Date,0
Amount Spent,0
Number of Impressions,0
Reach,0
Website Clicks,0
Searches Received,0
Content Viewed,0
Added to Cart,0
Purchases,0


### Merge both datasets

In [None]:
ab_data = control_data.merge(test_data,how='outer').sort_values(["Date"])
ab_data = ab_data.reset_index(drop = True)

ab_data


You are merging on int and float columns where the float values are not equal to their int representation.



Unnamed: 0,Campaign Name,Date,Amount Spent,Number of Impressions,Reach,Website Clicks,Searches Received,Content Viewed,Added to Cart,Purchases
0,Control Campaign,1.08.2019,2280,82702.0,56930.0,7016.0,2290.0,2159.0,1819.0,618.0
1,Test Campaign,1.08.2019,3008,39550.0,35820.0,3038.0,1946.0,1069.0,894.0,255.0
2,Control Campaign,10.08.2019,2149,117624.0,91257.0,2277.0,2475.0,1984.0,1629.0,734.0
3,Test Campaign,10.08.2019,2790,95054.0,79632.0,8125.0,2312.0,1804.0,424.0,275.0
4,Control Campaign,11.08.2019,2490,115247.0,95843.0,8137.0,2941.0,2486.0,1887.0,475.0
5,Test Campaign,11.08.2019,2420,83633.0,71286.0,3750.0,2893.0,2617.0,1075.0,668.0
6,Control Campaign,12.08.2019,2319,116639.0,100189.0,2993.0,1397.0,1147.0,1439.0,794.0
7,Test Campaign,12.08.2019,2831,124591.0,10598.0,8264.0,2081.0,1992.0,1382.0,709.0
8,Control Campaign,13.08.2019,2697,82847.0,68214.0,6554.0,2390.0,1975.0,1794.0,766.0
9,Test Campaign,13.08.2019,1972,65827.0,49531.0,7568.0,2213.0,2058.0,1391.0,812.0
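Two details in the output above are worth noting. The rows run 1.08, 10.08, 11.08 because `Date` is still a string and is sorted alphabetically, and the warning appears because `merge` with no `on` argument joins on every shared column, including ones that are float in one frame and int in the other. A possible alternative, sketched here on hypothetical two-row stand-ins (`control`, `test`, and `stacked` are illustrative names; the values are copied from rows shown in this notebook), is to stack the frames with `pd.concat` and parse the dates before sorting:

```python
import pandas as pd

# Hypothetical two-row stand-ins for control_data and test_data.
control = pd.DataFrame({
    "Campaign Name": ["Control Campaign"] * 2,
    "Date": ["2.08.2019", "10.08.2019"],
    "Purchases": [511.0, 734.0],   # float, as in control_data
})
test = pd.DataFrame({
    "Campaign Name": ["Test Campaign"] * 2,
    "Date": ["2.08.2019", "10.08.2019"],
    "Purchases": [677, 275],       # int, as in test_data
})

# Stack the rows instead of outer-merging on every shared column.
stacked = pd.concat([control, test], ignore_index=True)

# Parse day.month.year so that sorting is chronological rather than alphabetical.
stacked["Date"] = pd.to_datetime(stacked["Date"], format="%d.%m.%Y")
stacked = stacked.sort_values(["Date", "Campaign Name"]).reset_index(drop=True)
print(stacked)
```

With parsed dates, 2.08.2019 sorts before 10.08.2019, which is what a day-by-day comparison of the two campaigns needs.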


- Sanity check: confirm that both campaigns have the same number of rows

In [18]:
ab_data['Campaign Name'].value_counts()

Campaign Name,count
Control Campaign,30
Test Campaign,30


## A/B Testing

### Number of Impressions

In [21]:
fig = px.scatter(data_frame = ab_data,x="Number of Impressions", y = "Amount Spent", size = "Amount Spent",color = "Campaign Name",trendline = "ols")
fig.show()

- Control campaign has more impressions

### Number of website searches

In [25]:
# Total searches per campaign (the "Searches Received" column, not "Website Clicks")
labels = ["Total searches from Control Campaign", "Total searches from Test Campaign"]
counts = [control_data['Searches Received'].sum(), test_data['Searches Received'].sum()]
colors  = ['gold','lightgreen']

fig = go.Figure(data=[go.Pie(labels=labels,values = counts)])
fig.update_layout(title_text = "Control Vs Test : Searches")
fig.update_traces(hoverinfo='label+percent',textinfo='value',textfont_size=30,marker=dict(colors=colors,line=dict(color='black',width=3)))
fig.show()

- Test campaign resulted in more searches

### Content Viewed

In [26]:
label = ["Content Viewed from Control Campaign",
         "Content Viewed from Test Campaign"]
counts = [sum(control_data["Content Viewed"]),
          sum(test_data["Content Viewed"])]
colors = ['gold','lightgreen']
fig = go.Figure(data=[go.Pie(labels=label, values=counts)])
fig.update_layout(title_text='Control Vs Test: Content Viewed')
fig.update_traces(hoverinfo='label+percent', textinfo='value',
                  textfont_size=30,
                  marker=dict(colors=colors,
                              line=dict(color='black', width=3)))
fig.show()

- The control campaign audience viewed more content
- Although the difference is small

### Number of products added to the cart

In [27]:
label = ["Products Added to Cart from Control Campaign",
         "Products Added to Cart from Test Campaign"]
counts = [sum(control_data["Added to Cart"]),
          sum(test_data["Added to Cart"])]
colors = ['gold','lightgreen']
fig = go.Figure(data=[go.Pie(labels=label, values=counts)])
fig.update_layout(title_text='Control Vs Test: Added to Cart')
fig.update_traces(hoverinfo='label+percent', textinfo='value',
                  textfont_size=30,
                  marker=dict(colors=colors,
                              line=dict(color='black', width=3)))
fig.show()

- More products were added to the cart from the control campaign

### Amount Spent

In [28]:
label = ["Amount Spent in Control Campaign",
         "Amount Spent in Test Campaign"]
counts = [sum(control_data["Amount Spent"]),
          sum(test_data["Amount Spent"])]
colors = ['gold','lightgreen']
fig = go.Figure(data=[go.Pie(labels=label, values=counts)])
fig.update_layout(title_text='Control Vs Test: Amount Spent')
fig.update_traces(hoverinfo='label+percent', textinfo='value',
                  textfont_size=30,
                  marker=dict(colors=colors,
                              line=dict(color='black', width=3)))
fig.show()

- More money was spent on the test campaign
- The control campaign generated more content views and more products added to the cart
- Control > Test
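The per-metric totals that feed the pie charts can also be collected in one table, which makes side-by-side comparisons like the ones above easier to check. This is a minimal sketch using only four rows copied from the merged output earlier in the notebook (1.08.2019 and 10.08.2019), so the sums it prints are not the full-campaign totals; `rows`, `metrics`, and `totals` are illustrative names:

```python
import pandas as pd

# Four rows copied from the merged output shown earlier (two days only),
# so these sums are NOT the totals for the whole campaign.
rows = pd.DataFrame({
    "Campaign Name": ["Control Campaign", "Test Campaign",
                      "Control Campaign", "Test Campaign"],
    "Amount Spent": [2280, 3008, 2149, 2790],
    "Content Viewed": [2159, 1069, 1984, 1804],
    "Added to Cart": [1819, 894, 1629, 424],
    "Purchases": [618, 255, 734, 275],
})

metrics = ["Amount Spent", "Content Viewed", "Added to Cart", "Purchases"]

# One row per campaign, one column per metric.
totals = rows.groupby("Campaign Name")[metrics].sum()
print(totals)
```

Running the same `groupby` on the full `ab_data` frame should give the values shown in the pie charts.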

### Purchases made

In [29]:
label = ["Purchases Made by Control Campaign",
         "Purchases Made by Test Campaign"]
counts = [sum(control_data["Purchases"]),
          sum(test_data["Purchases"])]
colors = ['gold','lightgreen']
fig = go.Figure(data=[go.Pie(labels=label, values=counts)])
fig.update_layout(title_text='Control Vs Test: Purchases')
fig.update_traces(hoverinfo='label+percent', textinfo='value',
                  textfont_size=30,
                  marker=dict(colors=colors,
                              line=dict(color='black', width=3)))
fig.show()

- There is little difference in the number of purchases
- Control > Test here, because the control campaign produced more sales with less money spent on marketing
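The "more sales for less spend" argument can be expressed as a cost per purchase. The sketch below assumes cost per purchase is defined as `Amount Spent / Purchases` (the notebook does not define this metric, and the function name is hypothetical) and applies it to the 1.08.2019 rows shown in the `head()` outputs:

```python
def cost_per_purchase(amount_spent: float, purchases: float) -> float:
    """Assumed definition: USD spent per purchase."""
    if purchases <= 0:
        raise ValueError("purchases must be positive")
    return amount_spent / purchases

# Values for 1.08.2019 taken from the head() outputs earlier in the notebook.
control_cpp = cost_per_purchase(2280, 618)
test_cpp = cost_per_purchase(3008, 255)

print(f"Control: {control_cpp:.2f} USD per purchase")
print(f"Test:    {test_cpp:.2f} USD per purchase")
```

A single day is only an illustration; the same ratio computed from the campaign totals is what supports or contradicts the bullet above.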

### Relation between **Content Viewed** & **Website Clicks**

In [30]:
figure = px.scatter(data_frame = ab_data,
                    x="Content Viewed",
                    y="Website Clicks",
                    size="Website Clicks",
                    color= "Campaign Name",
                    trendline="ols")
figure.show()

- Clicks: Test > Control
- Engagement (content viewed relative to website clicks): Control > Test

Control > Test

### Relation between **Content Viewed** & **Added to Cart**

In [31]:
figure = px.scatter(data_frame = ab_data,
                    x="Added to Cart",
                    y="Content Viewed",
                    size="Added to Cart",
                    color= "Campaign Name",
                    trendline="ols")
figure.show()

Control > Test

### Relation between **Added to Cart** & **Purchases**

In [32]:
figure = px.scatter(data_frame = ab_data,
                    x="Purchases",
                    y="Added to Cart",
                    size="Purchases",
                    color= "Campaign Name",
                    trendline="ols")
figure.show()

- The cart-to-purchase conversion rate is higher for the test campaign
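To put a number on the conversion rate rather than reading it off the trendline, one option is to divide purchases by products added to the cart. The sketch below assumes that definition (the notebook does not state one) and uses the 2.08.2019 rows from the `head()` outputs; `sample` and the `Cart Conversion Rate` column are illustrative names:

```python
import pandas as pd

# Rows for 2.08.2019 taken from the head() outputs earlier in the notebook.
sample = pd.DataFrame({
    "Campaign Name": ["Control Campaign", "Test Campaign"],
    "Date": ["2.08.2019", "2.08.2019"],
    "Added to Cart": [1219, 879],
    "Purchases": [511, 677],
})

# Assumed definition: share of cart additions that ended in a purchase.
sample["Cart Conversion Rate"] = sample["Purchases"] / sample["Added to Cart"]

rates = sample.set_index("Campaign Name")["Cart Conversion Rate"]
print(rates.round(3))
```

On this one day the test campaign converts a larger share of cart additions; whether that holds overall depends on the same ratio over all 30 days.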

## Observations

We found that:
- The control campaign produced more sales and engagement
- The test campaign has a higher conversion rate from products in the cart to purchases
- The test campaign results in more sales relative to products viewed and products added to the cart
- The control campaign results in higher overall sales


## Conclusion

Because the test campaign results in more sales relative to the products viewed and added to the cart, it can be used to market specific products to a specific audience, whereas the control campaign can be used to market products to a wider audience.