<p>A/B testing compares two marketing strategies to determine which one converts more traffic into sales (or into whatever goal you are targeting) more effectively and efficiently.</p>

Source of the dataset: https://statso.io/a-b-testing-case-study/

In [2]:
import pandas as pd
import datetime
from datetime import date, timedelta
import plotly.graph_objects as go
import plotly.express as px
import plotly.io as pio
pio.templates.default = 'plotly_white'

In [3]:
control_data = pd.read_csv("control_group.csv", sep = ";")
test_data = pd.read_csv("test_group.csv", sep = ";")

In [4]:
control_data.head()

Unnamed: 0,Campaign Name,Date,Spend [USD],# of Impressions,Reach,# of Website Clicks,# of Searches,# of View Content,# of Add to Cart,# of Purchase
0,Control Campaign,1.08.2019,2280,82702.0,56930.0,7016.0,2290.0,2159.0,1819.0,618.0
1,Control Campaign,2.08.2019,1757,121040.0,102513.0,8110.0,2033.0,1841.0,1219.0,511.0
2,Control Campaign,3.08.2019,2343,131711.0,110862.0,6508.0,1737.0,1549.0,1134.0,372.0
3,Control Campaign,4.08.2019,1940,72878.0,61235.0,3065.0,1042.0,982.0,1183.0,340.0
4,Control Campaign,5.08.2019,1835,,,,,,,


In [5]:
test_data.head()

Unnamed: 0,Campaign Name,Date,Spend [USD],# of Impressions,Reach,# of Website Clicks,# of Searches,# of View Content,# of Add to Cart,# of Purchase
0,Test Campaign,1.08.2019,3008,39550,35820,3038,1946,1069,894,255
1,Test Campaign,2.08.2019,2542,100719,91236,4657,2359,1548,879,677
2,Test Campaign,3.08.2019,2365,70263,45198,7885,2572,2367,1268,578
3,Test Campaign,4.08.2019,2710,78451,25937,4216,2216,1437,566,340
4,Test Campaign,5.08.2019,2297,114295,95138,5863,2106,858,956,768


In [6]:
control_data.columns = ["Campaign Name", "Date", "Amount Spent", "Number of Impression", 
                        "Reach", "Website Clicks", "Searches Received", "Content Viewed", 
                        "Added to Cart", "Purchases"]

In [7]:
test_data.columns = ["Campaign Name", "Date", "Amount Spent", "Number of Impression", 
                    "Reach", "Website Clicks", "Searches Received", "Content Viewed", 
                    "Added to Cart", "Purchases"]

In [8]:
control_data.isnull().sum()

Campaign Name           0
Date                    0
Amount Spent            0
Number of Impression    1
Reach                   1
Website Clicks          1
Searches Received       1
Content Viewed          1
Added to Cart           1
Purchases               1
dtype: int64

In [9]:
test_data.isnull().sum()

Campaign Name           0
Date                    0
Amount Spent            0
Number of Impression    0
Reach                   0
Website Clicks          0
Searches Received       0
Content Viewed          0
Added to Cart           0
Purchases               0
dtype: int64

Fill the missing values in control_data with the mean of each column.

In [10]:
# Replace each missing value with its column mean. Assigning the result back
# avoids chained in-place fillna, which newer pandas versions warn about.
metric_cols = ["Number of Impression", "Reach", "Website Clicks",
               "Searches Received", "Content Viewed", "Added to Cart", "Purchases"]
for col in metric_cols:
    control_data[col] = control_data[col].fillna(control_data[col].mean())
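To make the effect of mean imputation concrete, here is a minimal sketch on two columns rebuilt from the five control rows printed by `control_data.head()`. The `sample` frame and the choice of columns are illustrative assumptions, not part of the original notebook, and the means here come from four rows only, so they differ from the means of the full dataset.

```python
import pandas as pd

# Two columns from the first five control rows shown earlier; row 4 is missing.
sample = pd.DataFrame({
    "Reach": [56930.0, 102513.0, 110862.0, 61235.0, None],
    "Purchases": [618.0, 511.0, 372.0, 340.0, None],
})

# mean() skips NaN by default, so each mean is taken over the four known rows.
column_means = sample.mean()

# Passing a Series to fillna fills each column with the value under its own label.
filled = sample.fillna(column_means)

print(filled)
```

Mean imputation keeps the column totals plausible but slightly understates the variance, which is usually acceptable when only one row out of thirty is missing.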

In [11]:
ab_data = control_data.merge(test_data, how = "outer").sort_values(["Date"])
ab_data = ab_data.reset_index(drop=True)
ab_data.head()



Unnamed: 0,Campaign Name,Date,Amount Spent,Number of Impression,Reach,Website Clicks,Searches Received,Content Viewed,Added to Cart,Purchases
0,Control Campaign,1.08.2019,2280,82702.0,56930.0,7016.0,2290.0,2159.0,1819.0,618.0
1,Test Campaign,1.08.2019,3008,39550.0,35820.0,3038.0,1946.0,1069.0,894.0,255.0
2,Test Campaign,10.08.2019,2790,95054.0,79632.0,8125.0,2312.0,1804.0,424.0,275.0
3,Control Campaign,10.08.2019,2149,117624.0,91257.0,2277.0,2475.0,1984.0,1629.0,734.0
4,Test Campaign,11.08.2019,2420,83633.0,71286.0,3750.0,2893.0,2617.0,1075.0,668.0
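Note that the output above jumps from 1.08.2019 to 10.08.2019: the `Date` column is still text, so `sort_values` orders it lexicographically rather than chronologically. One possible fix, sketched here under the assumption that every date follows the day.month.year pattern seen in the printed rows, is to parse the column before sorting. The small `frame` below is illustrative only.

```python
import pandas as pd

raw_dates = ["1.08.2019", "10.08.2019", "11.08.2019", "2.08.2019"]

# Sorting the raw strings compares them character by character.
lexicographic = sorted(raw_dates)

# Parsing with an explicit day.month.year format gives real timestamps.
frame = pd.DataFrame({"Date": raw_dates})
frame["Date"] = pd.to_datetime(frame["Date"], format="%d.%m.%Y")
frame = frame.sort_values("Date").reset_index(drop=True)

print(lexicographic)
print(frame)
```

Applied to the notebook, the same `pd.to_datetime` call on `ab_data["Date"]` before `sort_values` would put the rows in calendar order.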


In [12]:
ab_data["Campaign Name"].value_counts()

Campaign Name
Control Campaign    30
Test Campaign       30
Name: count, dtype: int64

Analyzing the relationship between the number of impressions and the amount spent in the control and test campaigns

In [47]:

figure = px.scatter(data_frame = ab_data,
                    x = "Number of Impression",
                    y = "Amount Spent",
                    color = "Campaign Name",
                    trendline = "ols")

figure.show()

In [19]:
label = ["Total Searches from Control Campaign", "Total Searches from Test Campaign"]
counts = [sum(control_data["Searches Received"]), sum(test_data["Searches Received"])]
colors = ['lightblue', 'pink']  # 'baby blue' is not a valid CSS color name

In [32]:
fig = go.Figure(data=[go.Pie(labels=label, 
                             values=counts)])
fig.update_layout(title_text = 'Control vs Test: Searches')
fig.update_traces(hoverinfo = 'label+percent', 
                  textinfo = 'value',
                  textfont_size = 15,
                  marker = dict(colors = colors,
                                line = dict(color = 'black', width = 0.5)))

In [21]:
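label = ["Website Clicks from Control Campaign",
         "Website Clicks from Test Campaign"]
# The second value must come from test_data, not control_data
counts = [sum(control_data["Website Clicks"]),
          sum(test_data["Website Clicks"])]
colors = ['lightblue', 'pink']  # 'baby blue' is not a valid CSS color name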

In [33]:
fig = go.Figure(data = [go.Pie(labels = label,
                               values= counts)])
fig.update_layout(title_text = 'Control vs Test Data: Website Clicks')
fig.update_traces(hoverinfo = 'label+percent+value',
                  textinfo = 'value',
                  textfont_size = 20,
                  marker = dict(colors = colors,
                                line = dict(color = 'black', width = 1)))
fig.show()

In [28]:
label = ["Content Viewed from Control Campaign",
         "Content Viewed from Test Campaign"]
counts = [sum(control_data["Content Viewed"]),
          sum(test_data["Content Viewed"])]
colors = ['lightblue', 'pink']  # 'baby blue' is not a valid CSS color name

In [34]:
fig = go.Figure(data = [go.Pie(labels = label, values = counts)])
fig.update_layout(title_text = 'Control vs Test: Content Viewed')
fig.update_traces(hoverinfo = 'label+percent',
                  textinfo = 'value',
                  textfont_size = 15,
                  marker = dict(colors = colors,
                                line = dict(
                                    color = 'black',
                                    width = 1
                                )))

In [35]:
label = ["Product Added to Cart from Control Campaign",
         "Product Added to Cart from Test Campaign"]
counts = [sum(control_data["Added to Cart"]),
          sum(test_data["Added to Cart"])]

In [36]:
fig = go.Figure(data=[go.Pie(labels=label,
                             values= counts)])
fig.update_layout(title_text = "Control vs Test: Added to Cart")
fig.update_traces(hoverinfo = 'label+percent',
                  textinfo = 'value',
                  textfont_size = 20,
                  marker = dict(colors = colors,
                                line = dict(color = 'black',
                                            width = 1)))

In [37]:
label = ["Amount Spent in Control Campaign",
         "Amount Spent in Test Campaign"]
counts = [sum(control_data["Amount Spent"]), 
          sum(test_data["Amount Spent"])]

In [40]:
fig = go.Figure(data = [go.Pie(labels = label,
                               values = counts)])
fig.update_layout(title_text = 'Control vs Test: Amount Spent')
fig.update_traces(hoverinfo = 'label+percent',
                  textinfo = 'value',
                  textfont_size = 20,
                  marker = dict(colors = colors,
                                line = dict(color = 'black',
                                            width = 1)))
fig.show()

In [41]:
label = ["Purchases Made by Control Campaign", 
         "Purchases Made by Test Campaign"]
counts = [sum(control_data['Purchases']),
          sum(test_data["Purchases"])]

In [42]:
fig = go.Figure(data = [go.Pie(labels = label,
                               values = counts)])
fig.update_layout(title_text = 'Control vs Test: Number of Purchases')
fig.update_traces(hoverinfo = 'label+percent',
                  textinfo = 'value',
                  marker = dict(colors = colors,
                                line = dict(color = 'black',
                                            width = 1)))
fig.show()
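The totals plotted in the pie charts above can also be tabulated in a single step with a group-by. The sketch below is not part of the original notebook: the `sample` frame is rebuilt from the first four rows of each campaign as printed by `head()`, so its numbers are only a slice of the full data.

```python
import pandas as pd

# First four rows of each campaign, copied from the head() output earlier.
sample = pd.DataFrame({
    "Campaign Name": ["Control Campaign"] * 4 + ["Test Campaign"] * 4,
    "Amount Spent": [2280, 1757, 2343, 1940, 3008, 2542, 2365, 2710],
    "Website Clicks": [7016, 8110, 6508, 3065, 3038, 4657, 7885, 4216],
    "Searches Received": [2290, 2033, 1737, 1042, 1946, 2359, 2572, 2216],
})

# One row per campaign, one column per metric.
totals = sample.groupby("Campaign Name").sum(numeric_only=True)

# Each campaign's share of the combined total, i.e. the pie-slice fractions.
shares = totals / totals.sum()

print(totals)
print(shares.round(3))
```

Running the same two lines on `ab_data` would give every metric's control and test totals side by side, which makes slips in an individual chart cell easier to spot.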

In [43]:
figure = px.scatter(data_frame=ab_data,
                    x="Content Viewed",
                    y="Website Clicks",
                    size="Website Clicks",
                    color="Campaign Name",
                    trendline="ols")
figure.show()

In [49]:
figure = px.scatter(data_frame=ab_data,
                    x="Added to Cart",
                    y = "Content Viewed",
                    size="Added to Cart",
                    color = "Campaign Name",
                    trendline="ols")
figure.show()

In [50]:
figure = px.scatter(data_frame=ab_data,
                    x = "Purchases",
                    y = "Added to Cart",
                    size = "Purchases",
                    color = "Campaign Name",
                    trendline = "ols")
figure.show()
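The `trendline = "ols"` option in the scatter plots above draws an ordinary least squares line for each campaign (Plotly Express relies on the statsmodels package for this). As a rough sketch of what such a line represents, the example below fits a straight line with NumPy instead, using the four complete control rows printed by `head()`. The arrays and variable names are illustrative assumptions, and four points are far too few for a meaningful fit.

```python
import numpy as np

# Control campaign values from the first four printed rows.
purchases = np.array([618, 511, 372, 340], dtype=float)
added_to_cart = np.array([1819, 1219, 1134, 1183], dtype=float)

# Same axes as the last scatter plot: x = Purchases, y = Added to Cart.
slope, intercept = np.polyfit(purchases, added_to_cart, deg=1)

# Fitted values and residuals; least squares residuals sum to zero
# when the model includes an intercept.
predicted = slope * purchases + intercept
residuals = added_to_cart - predicted

print(f"slope={slope:.3f}, intercept={intercept:.1f}")
```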

The comparisons above show that the control campaign drove more overall sales and more visitor engagement: more products were viewed and added to the cart, and total purchases were higher. The test campaign, however, converted a larger share of cart additions into purchases. In other words, the test campaign produced more sales relative to the products viewed and added to the cart, while the control campaign still led in absolute sales. The test campaign is therefore better suited to promoting a specific product to a targeted audience, and the control campaign to marketing multiple products to a broader audience.
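The conversion-rate comparison in this conclusion can be checked numerically. The following is a minimal sketch, not the notebook's own method: `funnel_rates` is a hypothetical helper, and `sample` contains only the first four rows of each campaign from the `head()` output, so the printed rates illustrate the calculation rather than the full-dataset result.

```python
import pandas as pd

def funnel_rates(df):
    """Per-campaign conversion rates between consecutive funnel stages."""
    stages = ["Content Viewed", "Added to Cart", "Purchases"]
    totals = df.groupby("Campaign Name")[stages].sum()
    return pd.DataFrame({
        "cart_per_view": totals["Added to Cart"] / totals["Content Viewed"],
        "purchase_per_cart": totals["Purchases"] / totals["Added to Cart"],
    })

# First four rows of each campaign, copied from the head() output earlier.
sample = pd.DataFrame({
    "Campaign Name": ["Control Campaign"] * 4 + ["Test Campaign"] * 4,
    "Content Viewed": [2159, 1841, 1549, 982, 1069, 1548, 2367, 1437],
    "Added to Cart": [1819, 1219, 1134, 1183, 894, 879, 1268, 566],
    "Purchases": [618, 511, 372, 340, 255, 677, 578, 340],
})

rates = funnel_rates(sample)
print(rates.round(3))
```

Calling `funnel_rates(ab_data)` would apply the same calculation to all 60 rows.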