# A/B Test Setup: Banner Impact on Desktop Conversions (May 2019)

Customer event data (banner impressions, banner clicks, and orders), split by the type of banner ad customers were shown.

### Question  
Was there a difference in **sales conversions** between **desktop customers** who saw the **sneakers banner** vs. those who saw the **accessories banner** during **May 2019**?

### Considerations

- **Who are we including in the test?**  
  Only *desktop users* who were exposed to either banner type **in May 2019**.

- **How big of a difference matters?**  
  What size of effect would make the difference worth noticing - statistically and business-wise? That defines how sensitive our test needs to be.
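
One way to make "how big of a difference matters" concrete is a rough sample-size calculation for comparing two proportions. The sketch below uses the standard normal-approximation formula; the 4% baseline rate and the 1-percentage-point minimum lift are placeholder assumptions for illustration, not values taken from this analysis, and `sample_size_per_group` is a hypothetical helper name.

```python
from math import ceil, sqrt
from scipy.stats import norm

def sample_size_per_group(p_base, min_lift, alpha=0.05, power=0.80):
    """Approximate users needed per group for a two-sided two-proportion z-test."""
    p_alt = p_base + min_lift
    z_alpha = norm.ppf(1 - alpha / 2)   # critical value for the false-positive rate
    z_beta = norm.ppf(power)            # critical value for the desired power
    p_bar = (p_base + p_alt) / 2        # pooled rate under H_0
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_base * (1 - p_base) + p_alt * (1 - p_alt))
    ) ** 2
    return ceil(numerator / min_lift ** 2)

# Assumed inputs: 4% baseline conversion, smallest lift worth acting on = 1 point
n_needed = sample_size_per_group(p_base=0.04, min_lift=0.01)
print("Approximate users needed per group:", n_needed)
```

The smaller the lift we care about, the more users each group needs, which is why the business threshold should be agreed on before looking at the results.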

In [None]:
# Mathematical computation and data manipulation libraries
import pandas as pd
import numpy as np

# Visualisations
import plotly.graph_objects as go

# Statistics library
from scipy import stats
from scipy.stats import chi2_contingency

import warnings as wr
wr.filterwarnings('ignore')

In [None]:
# Load data
customer_df = pd.read_csv('https://github.com/flatiron-school/ds-ab_testing/releases/download/v1.2/products_small.csv')

# Preview the data
customer_df.head()

Unnamed: 0.1,Unnamed: 0,order_id,user_id,page_id,product,site_version,time,title,target
0,4122928,3e6c5e89fdddcaee0eed210ec2c9cadf,90d58d967eb72656e86059ec6f208092,2fdc16a09e0016555dd4da4a3fe84414,accessories,desktop,2019-03-06 08:42:47,banner_show,0
1,564306,feed6203517d3abf6aab13761633174b,08703dab1f004eabba25aacb7f0e5484,6b0a902b9b73d5a158d0119d6feb38ac,sneakers,mobile,2019-04-19 18:30:45,banner_show,0
2,1872289,e33d5d7941edc281646aa37763729771,bdf1d25697e21419901c94fabdafad15,9ddb7315c4357929931b48f2b3d11c62,company,mobile,2019-01-20 17:20:10,banner_show,0
3,3616779,7c4caa8d508fa7c3bbc25f35cdd9168a,8d2f23a732c9527d95678088a3bac122,1f86cd0bea31d54a5b511b42fd19401a,sneakers,mobile,2019-02-20 09:38:32,banner_show,0
4,5871482,12874b29bde8bbd43fb2b95735caf9e6,5a22604f8f31ae98ee1211ece3a02004,b533f7e2003418c63fd71471264c559a,sneakers,mobile,2019-04-24 09:19:02,banner_show,0


In [None]:
# Get metadata
customer_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000000 entries, 0 to 999999
Data columns (total 9 columns):
 #   Column        Non-Null Count    Dtype 
---  ------        --------------    ----- 
 0   Unnamed: 0    1000000 non-null  int64 
 1   order_id      1000000 non-null  object
 2   user_id       1000000 non-null  object
 3   page_id       1000000 non-null  object
 4   product       1000000 non-null  object
 5   site_version  1000000 non-null  object
 6   time          1000000 non-null  object
 7   title         1000000 non-null  object
 8   target        1000000 non-null  int64 
dtypes: int64(2), object(7)
memory usage: 68.7+ MB


In [None]:
# What products is the company selling
print(customer_df['product'].value_counts())

product
clothes             210996
company             203020
sneakers            201298
sports_nutrition    193200
accessories         191486
Name: count, dtype: int64


## About the target Column- Cuz what even is going on here!!

The 'target' column captures **sales conversion**:

- 0 - Customer saw the ad but **did not** complete a transaction  
- 1 - Customer saw the ad and **did** complete a transaction

So basically, it's our indicator for whether a customer **converted** or not.

### Conversion Rates by Banner Type

Here’s how conversion rates stack up for the different product banners:


In [None]:
# Groupby the product and examine the distribution of the target
print(customer_df.groupby('product')['target'].value_counts())

product           target
accessories       0         186127
                  1           5359
clothes           0         197720
                  1          13276
company           0         203020
sneakers          0         193375
                  1           7923
sports_nutrition  0         190363
                  1           2837
Name: count, dtype: int64


### Conversion Breakdown by Product Banner

| Product            | Converted (1) | Not Converted (0) | Total     | Conversion Rate (%) |
|--------------------|---------------|--------------------|-----------|----------------------|
| **Accessories**     | 5,359         | 186,127            | 191,486   | 2.80%               |
| **Clothes**         | 13,276        | 197,720            | 210,996   | 6.29%               |
| **Sneakers**        | 7,923         | 193,375            | 201,298   | 3.94%               |
| **Sports Nutrition**| 2,837         | 190,363            | 193,200   | 1.47%               |
| **Company**         | 0             | 203,020            | 203,020   | 0.00%               |

> ### Interpretation

> - **Clothes** have the highest conversion rate at **6.29%**, standing out clearly from the rest.
> - **Sneakers** convert better (**3.94%**) than **Accessories** (**2.80%**) — this difference might be meaningful and is worth testing (our A/B test).
> - **Sports Nutrition** has the lowest non-zero rate (**1.47%**), while **Company** had **zero conversions** - marketing leadership needs to look into this.

Some banners clearly pull more weight than others in getting users to buy.
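
For anyone who wants to double-check the table, here is a minimal sketch that rebuilds the conversion rates from the counts printed above. The counts are copied from the groupby output; the column names `converted`, `not_converted`, `total`, and `rate_pct` are just labels chosen for this illustration.

```python
import pandas as pd

# Counts copied from the groupby output above
counts = pd.DataFrame(
    {
        "converted": [5359, 13276, 0, 7923, 2837],
        "not_converted": [186127, 197720, 203020, 193375, 190363],
    },
    index=["accessories", "clothes", "company", "sneakers", "sports_nutrition"],
)

# Conversion rate = converted rows / all rows for that banner
counts["total"] = counts["converted"] + counts["not_converted"]
counts["rate_pct"] = (100 * counts["converted"] / counts["total"]).round(2)

print(counts.sort_values("rate_pct", ascending=False))
```

On the full dataset, `customer_df.groupby('product')['target'].mean()` gives the same rates as fractions, because `target` is a 0/1 flag.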

In [None]:
# Check timestamp range: Time the banner was up

print("Start date:", customer_df['time'].min())
print("\nEnd date:", customer_df['time'].max())

Start date: 2019-01-01 00:00:25

End date: 2019-05-31 23:59:21


The most recent event in the data was recorded on May 31st, 2019, so the data covers all of May. The month of May is our focus point.

In [None]:
# Check counts of the different site version
print(customer_df['site_version'].value_counts())

site_version
mobile     718521
desktop    281479
Name: count, dtype: int64


We are focusing on the desktop version to compare how the sneakers and accessories banners performed.

An initial observation: desktop accounts for only about 28% of recorded events (281,479 of 1,000,000), so most customers interact with the mobile site version.

In [None]:
# Check counts of banner titles
print(customer_df['title'].value_counts())

title
banner_show     872275
banner_click     98330
order            29395
Name: count, dtype: int64


> ### Interpretation

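> - Nearly **900K banner impressions** were logged, but clicks amount to only about **11%** of that (98,330 / 872,275) → classic drop-off at the awareness stage.
> - Relative to clicks, orders run at roughly **30%** (29,395 / 98,330), which is not bad!
> - Overall, orders amount to only about **3.4% of banner impressions** (29,395 / 872,275).
> - Each row is an event rather than a unique user, so these ratios are event-level approximations of the funnel.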

This is a typical funnel in digital marketing:
> **Impressions → Clicks → Conversions**  
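
The funnel ratios quoted above can be reproduced straight from the event counts. This is a small sketch using the numbers printed by `value_counts()`; the variable names are illustrative only.

```python
# Event counts copied from the value_counts() output above
events = {"banner_show": 872275, "banner_click": 98330, "order": 29395}

# Step-to-step ratios of the funnel (event-level, not unique users)
click_through = events["banner_click"] / events["banner_show"]
click_to_order = events["order"] / events["banner_click"]
show_to_order = events["order"] / events["banner_show"]

print(f"Impressions -> Clicks: {click_through:.1%}")
print(f"Clicks -> Orders:      {click_to_order:.1%}")
print(f"Impressions -> Orders: {show_to_order:.1%}")
```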

In [None]:
# What title had the most positive impact
print(customer_df.groupby('title').agg({'target': 'mean'}))

              target
title               
banner_click     0.0
banner_show      0.0
order            1.0


> ### Interpretation

> - Only 'order' rows have target = 1, which makes sense because **target tracks actual sales conversions**, and 'order' is the moment of purchase.
> - Both 'banner_show' and 'banner_click' have a mean of '0.0', meaning a view or a click on its own is never recorded as a conversion.

### Experimental Setup: A/B Test Cohort Selection

We filter the dataset to create a focused A/B test cohort based on the following criteria:

- **Device**: Only users on the 'desktop' site version  
- **Timeframe**: Interactions that happened during **May 2019**  
- **Products Tested**: Users who were shown either the **sneakers** or **accessories** banners

In [None]:
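# Desktop events from May 2019 for the two banners under test.
# ISO-formatted timestamp strings sort chronologically, so string comparison works here.
df_AB = customer_df[(customer_df['site_version'] == 'desktop') &
            (customer_df['time'] >= '2019-05-01') &
            (customer_df['time'] < '2019-06-01') &
            (customer_df['product'].isin(['accessories', 'sneakers']))].reset_index(drop = True)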

df_AB.tail()

Unnamed: 0.1,Unnamed: 0,order_id,user_id,page_id,product,site_version,time,title,target
25967,5738290,ecfe9da2dbea255f05b2f526a4c663a7,cd7598ba5798ad6bd3f5a667ae4c217a,644ad50dd900d6735d8ceb15b71d50f8,accessories,desktop,2019-05-01 23:01:05,banner_show,0
25968,7885931,8c28b45ff7fbb5d5df0b7670183689db,2f28106a4db2babcfde1c4095c805632,a3d2de7675556553a5f08e4c88d2c228,sneakers,desktop,2019-05-26 11:19:09,order,1
25969,8099155,c613db2f19c7285f101196062c178490,97473368941b89c4716c4f389e9b8bd7,a3d2de7675556553a5f08e4c88d2c228,sneakers,desktop,2019-05-11 09:04:46,order,1
25970,6690689,6f7f32ecf6790e4eb06384b479f43162,73f403b4843c70c41c0db294ee479582,13f395d7c86ef276f7ce52a557fd3491,accessories,desktop,2019-05-09 09:17:24,banner_show,0
25971,7650805,af3a19fcceb477d2992f1771af3a9b34,515cb969971fb5e08ae0316f58e3f434,ef38d906f15db1efde867c68f1336946,sneakers,desktop,2019-05-20 17:58:16,banner_show,0
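
The cohort filter above compares timestamps as strings, which works because the `time` column is in ISO format. A sturdier variant parses the column into real datetimes first. Here is a minimal sketch on a made-up four-row frame (the `toy` data is invented for illustration and is not part of the dataset):

```python
import pandas as pd

# Tiny invented sample with the same column names as customer_df
toy = pd.DataFrame({
    "site_version": ["desktop", "desktop", "mobile", "desktop"],
    "product": ["sneakers", "accessories", "sneakers", "clothes"],
    "time": ["2019-05-03 05:34:46", "2019-04-30 23:59:59",
             "2019-05-19 05:04:32", "2019-05-29 17:51:46"],
})
toy["time"] = pd.to_datetime(toy["time"])

# One boolean mask per cohort criterion
in_may = (toy["time"] >= pd.Timestamp("2019-05-01")) & (toy["time"] < pd.Timestamp("2019-06-01"))
on_desktop = toy["site_version"].eq("desktop")
in_test = toy["product"].isin(["sneakers", "accessories"])

cohort = toy[in_may & on_desktop & in_test].reset_index(drop=True)
print(cohort)
```

Naming each mask keeps the three criteria (timeframe, device, product) readable and easy to change one at a time.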


## LET'S SET THE STAGE  
### What Kind of Test Are We Running?

We’re trying to answer: Did **banner type** (*sneakers* vs. *accessories*) affect whether desktop users made a **purchase** in May 2019?

### Why a **Chi-Square Test**?

Because we’re comparing **two categorical variables**:

- **Banner type** ('product': sneakers or accessories)  
- **Conversion outcome** ('target': 1 = bought, 0 = didn’t buy)

We’re not measuring averages. We’re not predicting anything. We just want to know if **conversion rates differ between two independent groups**.

That’s exactly what a **Chi-Square Test of Independence** is built for:
- No assumptions about normality  
- Works great on proportions/frequencies  
- Tells us whether **product type and purchase behavior are associated**
  
No means, no medians - just categories and counts.  
So the **Chi-Square Test** isn’t just appropriate - it’s the statistical equivalent of *"You called?"*

In [None]:
hypotheses = """
  H_0  = The sneakers banner is just another pretty face- no real effect on sales
  H_1  = The sneakers banner *did* change behaviour
  """

print(hypotheses)

alpha = 0.05
print("False Positive(alpha):", alpha)
print("Translation: We’re okay with a 5% chance of falsely accusing the sneakers banner of being special when it’s actually just average.")


  H_0  = The sneakers banner is just another pretty face- no real effect on sales
  H_1  = The sneakers banner *did* change behaviour
  
False Positive(alpha): 0.05
Translation: We’re okay with a 5% chance of falsely accusing the sneakers banner of being special when it’s actually just average.


## Splitting The Crowd

In [None]:
# df_A = Desktop users shown the accessories banner
df_A = df_AB[df_AB['product'] == 'accessories']

# Confirm it worked
print("Product in df_A:", df_A['product'].unique())

# Preview filtration
df_A.head()

Product in df_A: ['accessories']


Unnamed: 0.1,Unnamed: 0,order_id,user_id,page_id,product,site_version,time,title,target
0,8064199,f909eb9bbf2795337dd38323b7951705,5a96585c3db94eb2ae40e559cbd2e164,94b6ea2316b773afa20925d9b79d5bf8,accessories,desktop,2019-05-19 09:26:29,banner_show,0
5,6465757,d737056421a6d45ac53f65b0db9b4deb,cbc6feb75b2a19b256952a2428120c95,6eb917afede2b54810d348b941115862,accessories,desktop,2019-05-15 14:58:07,banner_show,0
6,8197222,6cf466aa5b446b410199c62e4850fea5,9b3391ea31e499df16d2a6aea1fce63c,ab8f97ddc4408f657ba05f0f9bd99e58,accessories,desktop,2019-05-13 06:45:36,banner_show,0
7,8254764,93476f224f144c30f86ad5863ed59150,ada08956317e42121981ee1fc46e510b,900af58da0d6a3a7098c55095145f929,accessories,desktop,2019-05-29 15:30:24,banner_show,0
10,163631,07fa341b076f99613b029a3c923f60be,7795154c23b1ae9465fe1d6986ec9af9,a3d2de7675556553a5f08e4c88d2c228,accessories,desktop,2019-05-20 23:51:33,order,1


In [None]:
# df_B = Desktop users shown the sneakers banner
df_B = df_AB[df_AB['product'] == 'sneakers']

# Confirm it worked
print("Product in df_B:", df_B['product'].unique())

# Preview filtration
df_B.head()

Product in df_B: ['sneakers']


Unnamed: 0.1,Unnamed: 0,order_id,user_id,page_id,product,site_version,time,title,target
1,7560076,97808cfd3d44fa7e8cfeb71bb5f02590,1e2db5d01051fc65fffc6c27f34400a3,48c8e3f287d814493cff887ff445f535,sneakers,desktop,2019-05-03 05:34:46,banner_show,0
2,5923209,71c39cc2e2ede95a7ca296d4e3e56f85,46a6f13750d5032f366349e3e3171678,30559d0b23980e7a778c93c52ba5deca,sneakers,desktop,2019-05-19 05:04:32,banner_show,0
3,8161379,e969a575bcf5b10d6676c2fab9cbf784,aa77bca62e9a84f5cd8d734c70afd060,70ed6e66458fda7a3164e2633d81ea07,sneakers,desktop,2019-05-29 17:51:46,banner_show,0
4,4564531,3068b616fac2838de6a2d33a6fad60ad,aec8ee5450418b5bc97127c2269058ca,14cbe8cf92c259f5d18310347e3cb5cb,sneakers,desktop,2019-05-04 09:33:44,banner_show,0
8,4947111,59bca76ab31800aa2e77f7b764443fcb,ada06245102c2da6119f4cbdccb0fdef,ac69d939939799d30a1384f289d90e9f,sneakers,desktop,2019-05-05 20:29:45,banner_show,0


### Total Orders Placed by Banner Type:
We sum up the number of successful conversions (target = 1) for each group to get the raw win count for each group.

These counts feed the test that tells us whether the difference is just a coincidence or statistically significant.

In [None]:
# Orders placed per banner (target = 1 marks a completed order)
accessories_orders = df_A['target'].sum()
sneakers_orders = df_B['target'].sum()

print("Total accessories orders placed:", accessories_orders)
print("Total sneakers orders placed:", sneakers_orders)

Total accessories orders placed: 496
Total sneakers orders placed: 799


### Total Views vs. Orders Placed

We now calculate how many desktop users saw each banner and how many of them **did not convert** to get the full breakdown of who saw what and what they did (or didn’t do) next.

In [None]:
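# To get the number of people who didn't place orders

# Banner views: count 'banner_show' events per group
# (assumption: every order was preceded by a banner view in the same group)
accessories_total = (df_A['title'] == 'banner_show').sum()
sneakers_total = (df_B['title'] == 'banner_show').sum()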

# Non-conversions
accessories_no_orders = accessories_total - accessories_orders
sneakers_no_orders = sneakers_total - sneakers_orders

print("Total number of people who saw the accessories banner:", accessories_total)
print("Total number of people who saw the sneakers banner:", sneakers_total)
print("\nTotal customers who did not place accessories orders:", accessories_no_orders)
print("Total customers who did not place sneakers orders:", sneakers_no_orders)

Total number of people who saw the accessories banner: 11715
Total number of people who saw the sneakers banner: 11854

Total customers who did not place accessories orders: 11219
Total customers who did not place sneakers orders: 11055


## Chi-Square Test: Setting Up the Contingency Table

We’re now ready to test whether there's a **statistically significant difference** in conversion behavior between users who saw the **accessories** banner vs. the **sneakers** banner.

In [None]:
# Set up contingency table
# Number of people who did or did not place orders for both the sneakers and accessories banners

contingency_table = np.array([
    (accessories_orders, accessories_no_orders),
    (sneakers_orders, sneakers_no_orders)
])

contingency_table

array([[  496, 11219],
       [  799, 11055]])

### Array Interpretation

|                      | **Converted (1)**     | **Did Not Convert (0)**   |
|----------------------|-----------------------|---------------------------|
| **Accessories Group**| accessories_orders    | accessories_no_orders     |
| **Sneakers Group**   | sneakers_orders       | sneakers_no_orders        |

This matrix will be used to determine whether **conversion rate is independent of banner type** - or if something’s actually going on behind the scenes.
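
To make the array easier to read, the same counts can be wrapped in a labeled table with row and column totals. This is only a presentation sketch; the labels `converted`, `did_not_convert`, `row_total`, and `column_total` are names chosen here, and the counts are the ones printed in the array above.

```python
import pandas as pd

# Same counts as the contingency array above, with labels attached
table = pd.DataFrame(
    [[496, 11219], [799, 11055]],
    index=["accessories", "sneakers"],
    columns=["converted", "did_not_convert"],
)

# Margins: banner views per group and overall outcome counts
table["row_total"] = table.sum(axis=1)
table.loc["column_total"] = table.sum(axis=0)

print(table)
```

The row totals are the banner views per group, and the column totals are what the test uses to work out the counts expected if banner type made no difference.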

### Next step:
Run the test and **let the p-value do the talking**.

In [None]:
result = stats.chi2_contingency(contingency_table)

# Unpack
chi2, p, dof, expected = result

# Print one row at a time
print("Chi-Squared Results:")
print(f"\nChi-squared statistic: {chi2}")
print(f"\np-value: {p}")
print(f"\nDegrees of freedom: {dof}")
print("\nExpected frequencies:")
for row in expected:
    print(row)


Chi-Squared Results:

Chi-squared statistic: 70.80332433558804

p-value: 3.946714706061366e-17

Degrees of freedom: 1

Expected frequencies:
[  643.68131868 11071.31868132]
[  651.31868132 11202.68131868]


In [None]:
from scipy.stats import chi2_contingency

# Run the chi-square test
chi2, p, dof, expected = chi2_contingency(contingency_table)

# Print chi-square test results
print("Chi-squared statistic:", chi2)
print("\np-value:", p)
print("\nDegrees of freedom:", dof)

# Print expected frequencies
print("\nExpected Frequencies (under H_0):")
print(expected)

# Extract expected orders for accessories and sneakers
expected_accessories_orders = expected[0][0]
expected_sneakers_orders = expected[1][0]

print("\nExpected Accessories Orders:", expected_accessories_orders)
print("Expected Sneakers Orders:", expected_sneakers_orders)

Chi-squared statistic: 70.80332433558804

p-value: 3.946714706061366e-17

Degrees of freedom: 1

Expected Frequencies (under H_0):
[[  643.68131868 11071.31868132]
 [  651.31868132 11202.68131868]]

Expected Accessories Orders: 643.6813186813187
Expected Sneakers Orders: 651.3186813186813
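
Where do those expected frequencies come from? Under H_0 each cell is simply (row total × column total) / grand total. The sketch below recomputes them by hand and rebuilds the statistic with the Yates continuity correction, which, to my understanding, SciPy applies by default when there is one degree of freedom. The `*_manual` names are illustrative.

```python
import numpy as np
from scipy import stats

observed = np.array([[496, 11219], [799, 11055]])

# Expected count per cell if conversion were independent of banner type
row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
grand_total = observed.sum()
expected_manual = row_totals * col_totals / grand_total

# Chi-square statistic with Yates' continuity correction (shrink each gap by 0.5)
stat_manual = (((np.abs(observed - expected_manual) - 0.5) ** 2) / expected_manual).sum()
p_manual = stats.chi2.sf(stat_manual, df=1)

# Compare against SciPy's result
stat_scipy, p_scipy, dof_scipy, expected_scipy = stats.chi2_contingency(observed)
print("Expected frequencies match:", np.allclose(expected_manual, expected_scipy))
print("Statistics match:", np.isclose(stat_manual, stat_scipy))
```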


### Interpretation

- The **chi-squared statistic (70.80)** is large - way beyond what we'd expect by random chance if the banners performed the same.
- The **p-value (≈ 3.9 × 10⁻¹⁷)** is far below our α threshold of 0.05, meaning this result is **statistically significant**.

**There is a strong association between banner type and conversion behavior.**

So we can confidently **reject the null hypothesis (H₀)**.  
The banner type *did* impact user decisions - and it's not just noise.

Next question:
Was it sneakers or accessories that converted better? Let's dig into the conversion rates.

In [None]:
# Now we get the difference in conversion rate
# Conversion rate = orders / banner views, where views = orders + non-orders (the row total)

accessory_CR, sneakers_CR = contingency_table[:, 0] / contingency_table.sum(axis=1)

print(f"Conversion Rate for Accessory Banner: {100 * accessory_CR:.4f}%")
print(f"Conversion Rate for Sneakers Banner: {100 * sneakers_CR:.4f}%")
print(f"\nAbsolute Difference of Conversion Rate: {100 * (sneakers_CR - accessory_CR):.4f} percentage points")

Conversion Rate for Accessory Banner: 4.2339%
Conversion Rate for Sneakers Banner: 6.7403%

Absolute Difference of Conversion Rate: 2.5065 percentage points
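
A point estimate is easier to act on with an interval around it. The sketch below divides orders by banner views (the same denominators used for the bar chart) and puts a 95% Wald confidence interval around the difference. The Wald interval is a swapped-in normal approximation, not something the chi-square test itself produces, and the variable names are illustrative.

```python
from math import sqrt
from scipy.stats import norm

# Orders and banner views per group, taken from the counts above
orders_a, views_a = 496, 11715   # accessories
orders_b, views_b = 799, 11854   # sneakers

p_a = orders_a / views_a
p_b = orders_b / views_b
diff = p_b - p_a

# Standard error of the difference between two independent proportions
se = sqrt(p_a * (1 - p_a) / views_a + p_b * (1 - p_b) / views_b)
z = norm.ppf(0.975)              # two-sided 95% interval
ci_low, ci_high = diff - z * se, diff + z * se

relative_lift = p_b / p_a - 1

print(f"Difference: {100 * diff:.2f} percentage points")
print(f"95% CI: [{100 * ci_low:.2f}, {100 * ci_high:.2f}] percentage points")
print(f"Relative lift of sneakers over accessories: {relative_lift:.1%}")
```

If the whole interval sits above the smallest lift the business cares about, the result is practically meaningful as well as statistically significant.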


## Now We Visualize- For the Non-Data People

In [None]:
conversion_rates = [
    round(accessories_orders / accessories_total * 100, 4),
    round(sneakers_orders / sneakers_total * 100, 4)
]

fig = go.Figure(data=[
    go.Bar(
        x = ['Accessories', 'Sneakers'],
        y = conversion_rates,
        text = [f'{rate:.2f}%' for rate in conversion_rates],
        textposition = 'auto',
        marker_color = ['gold', 'royalblue'],
        hovertemplate = '<b>%{x}</b><br>Conversion Rate: <b>%{y:.2f}%</b><extra></extra>'
    )
])

fig.update_layout(
    title = 'Conversion Rate by Banner Type',
    yaxis_title = 'Conversion Rate (%)',
    xaxis_title = 'Banner Type',
    template = 'plotly_white',
    width = 700,
    height = 500,
)

fig.show()

In [None]:
fig = go.Figure()

fig.add_trace(go.Bar(
    name = 'Converted',
    x = ['Accessories', 'Sneakers'],
    y = [accessories_orders, sneakers_orders],
    marker_color = 'seagreen',
    hovertemplate = '<b>%{x}</b><br>Converted: <b>%{y}</b><extra></extra>'
))

fig.add_trace(go.Bar(
    name = 'Did Not Convert',
    x = ['Accessories', 'Sneakers'],
    y = [accessories_no_orders, sneakers_no_orders],
    marker_color = 'lightgray',
    hovertemplate = '<b>%{x}</b><br>Did Not Convert: <b>%{y}</b><extra></extra>'
))

fig.update_layout(
    barmode = 'stack',
    title = 'Conversion Breakdown by Banner Type',
    yaxis_title = 'Number of Users',
    xaxis_title = 'Banner Type',
    legend_title = 'Conversion Status',
    template = 'plotly_white',
    hovermode = 'x unified',
    width = 700,
    height = 500
)

fig.show()

In [None]:
expected_orders = expected[:, 0].round(2)  # Expected orders under H_0, taken from the chi-square output
observed_orders = [accessories_orders, sneakers_orders]

fig = go.Figure()

fig.add_trace(go.Bar(
    x = ['Accessories', 'Sneakers'],
    y = observed_orders,
    name = 'Observed Orders',
    marker_color = 'mediumslateblue',
    hovertemplate = '<b>%{x}</b><br>Observed Orders: <b>%{y}</b><extra></extra>'
))

fig.add_trace(go.Bar(
    x = ['Accessories', 'Sneakers'],
    y = expected_orders,
    name = 'Expected (Under H₀)',
    marker_color = 'lightsalmon',
    hovertemplate = '<b>%{x}</b><br>Expected Orders (H_0): <b>%{y}</b><extra></extra>'
))

fig.update_layout(
    barmode = 'group',
    title = 'Observed vs. Expected Orders',
    yaxis_title = 'Number of Orders',
    xaxis_title = 'Banner Type',
    template = 'plotly_white',
    legend_title = 'Order Type',
    hovermode = 'x unified',
    width = 800,
    height = 500
)

fig.show()

# Interpretation Summary

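- There is a **statistically significant difference** in conversion rates at our chosen significance level (α = 0.05).  
- The difference is about **2.5 percentage points in favor of the sneakers banner** (799 / 11,854 ≈ 6.74% vs. 496 / 11,715 ≈ 4.23%) - meaning desktop users were more likely to buy after seeing the sneakers ad compared to the accessories ad.

With a relative lift of roughly 59%, this result is **statistically significant and likely practically meaningful**, although the threshold for what counts as practically meaningful is a business call.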
