# A/B Test Analysis Report: Website Landing Page

## 1. Objective and hypothesis

The goal of this A/B test is to determine which homepage version provides a better user experience:

- **Version A** = current homepage  
- **Version B** = new homepage design  

We measure user experience using **bounce rate**, the share of sessions that bounce:
- **Bounce** = a session that ends after viewing only one page  
- **Lower bounce rate = better**

### Hypotheses
- **H₀ (Null):** No difference in bounce rate between A and B  
- **H₁ (Alternative):** Bounce rates are different  

Testing H₀ tells us whether the observed difference is larger than random variation alone would plausibly produce.
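
In symbols, writing $p_A$ and $p_B$ for the true bounce probabilities of the two versions (notation introduced here for convenience, not taken from the dataset), the two-sided hypotheses can be stated as:

```latex
H_0 : p_A = p_B \qquad \text{vs.} \qquad H_1 : p_A \neq p_B
```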

## 2. Import libraries and load data

In this section, we:
- Import the required libraries  
- Load the dataset  
- Check how many sessions were in each group  
- Count how many sessions bounced  

This is a quick sanity check that the totals look plausible before analysis.

In [12]:
# Import libraries
import pandas as pd
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

In [3]:
# Load your data
bounce_df = pd.read_csv('/Users/harrydo/Downloads/Maven+Fuzzy+Factory/Cleaned data/D4_Calculate bounce rate.csv')
sig_df = pd.read_csv('/Users/harrydo/Downloads/Maven+Fuzzy+Factory/Cleaned data/D4_Prepare for statistical significance.csv')

In [14]:
# Total sessions per variant
sessions_summary = sig_df.groupby('variant')['sessions'].sum()
print("Total Sessions by Variant:")
print(sessions_summary)

# Total bounces per variant
bounces_summary = bounce_df.groupby('variant')['bounces'].sum()
print("\nTotal Bounces by Variant:")
print(bounces_summary)


Total Sessions by Variant:
variant
A    137576
B     47574
Name: sessions, dtype: int64

Total Bounces by Variant:
variant
A    57346
B    25330
Name: bounces, dtype: int64
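
The totals above can also be checked programmatically before moving on. The following is a minimal sketch, assuming per-variant totals indexed by variant label as printed above; the helper name `check_ab_inputs` is hypothetical and the two Series are rebuilt from the printed numbers rather than read from the CSV files.

```python
import pandas as pd

def check_ab_inputs(sessions: pd.Series, bounces: pd.Series) -> None:
    """Raise ValueError if the per-variant totals are inconsistent."""
    # Both tables must describe the same set of variants.
    if set(sessions.index) != set(bounces.index):
        raise ValueError("variant labels differ between the two tables")
    # A rate cannot be computed for a variant with no sessions.
    if (sessions <= 0).any():
        raise ValueError("every variant needs at least one session")
    # Align bounces to the sessions index before comparing element-wise.
    aligned = bounces.reindex(sessions.index)
    if ((aligned < 0) | (aligned > sessions)).any():
        raise ValueError("bounces must lie between 0 and sessions")

# Totals copied from the output above.
sessions = pd.Series({"A": 137576, "B": 47574}, name="sessions")
bounces = pd.Series({"A": 57346, "B": 25330}, name="bounces")
check_ab_inputs(sessions, bounces)
print("Inputs look consistent")
```

In the notebook itself, `sessions_summary` and `bounces_summary` could be passed in place of the hand-built Series.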


## 3. Metric calculation

Here we calculate the **bounce rate** for both versions. The data is recorded per session, so both rates are session-based:

- **Bounce Rate A** = (Bounced Sessions in A) ÷ (Total Sessions in A)  
- **Bounce Rate B** = (Bounced Sessions in B) ÷ (Total Sessions in B)

In [8]:
# Extract per-variant totals.
# Summing (rather than taking the first row) stays correct even if a variant spans several rows.
sessions_A = int(sig_df.loc[sig_df['variant'] == 'A', 'sessions'].sum())
sessions_B = int(sig_df.loc[sig_df['variant'] == 'B', 'sessions'].sum())

bounces_A = int(bounce_df.loc[bounce_df['variant'] == 'A', 'bounces'].sum())
bounces_B = int(bounce_df.loc[bounce_df['variant'] == 'B', 'bounces'].sum())

# Calculate bounce rates
rate_A = bounces_A / sessions_A
rate_B = bounces_B / sessions_B

print(f"Bounce Rate A: {rate_A:.4f} ({rate_A*100:.2f}%)")
print(f"Bounce Rate B: {rate_B:.4f} ({rate_B*100:.2f}%)")


Bounce Rate A: 0.4168 (41.68%)
Bounce Rate B: 0.5324 (53.24%)


The bounce rate is 11.6 percentage points higher in Version B (53.24% vs 41.68%), a large drop in engagement.

**What this means:**
- More sessions ended after a single page when users were shown Version B.
- Those users did not continue browsing or start shopping.

## 4. Statistical Testing (Z-Test for Two Proportions)

We use a two-proportion z-test to check whether this difference is larger than chance variation would explain.
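
For reference, the statistic behind this test can be reproduced by hand. The sketch below assumes the standard pooled two-proportion z-test and plugs in the session and bounce totals reported in Section 2; the function name `two_proportion_ztest` is illustrative and is not part of statsmodels.

```python
import math

def two_proportion_ztest(count_a, n_a, count_b, n_b):
    """Pooled two-proportion z-test (two-sided). Returns (z, p_value)."""
    p_a = count_a / n_a
    p_b = count_b / n_b
    # Under H0 both groups share one bounce rate, so pool the counts.
    p_pool = (count_a + count_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal tails.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Totals reported in Section 2 (A first, then B).
z, p = two_proportion_ztest(57346, 137576, 25330, 47574)
print(f"z = {z:.4f}, p = {p:.3e}")
```

The sign of z is negative because the difference is taken as A minus B. For a statistic this extreme, the tail probability is smaller than double precision can represent, so the p-value may print as zero.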

In [11]:
# Z-test for two proportions
count = np.array([bounces_A, bounces_B])
nobs = np.array([sessions_A, sessions_B])

z_stat, p_value = proportions_ztest(count, nobs, alternative='two-sided')

print("\n=== Statistical Test Results ===")
print(f"Z-statistic: {z_stat:.4f}")
print(f"P-value: {p_value:.6f}")

# Evaluate significance
alpha = 0.05
if p_value < alpha:
    print("✔️ Statistically significant difference (reject H0)")
else:
    print("❌ Not statistically significant (fail to reject H0)")



=== Statistical Test Results ===
Z-statistic: -43.7208
P-value: 0.000000
✔️ Statistically significant difference (reject H0)


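The two-proportion z-test checks whether a difference this large could plausibly have arisen by chance.
The test showed:
- p-value < 0.0001 (it prints as 0.000000 because it is far smaller than six decimal places can show)
- z-statistic of -43.72, negative because the test computes the rate of A minus the rate of B

This means that if the two versions truly had the same bounce rate, a gap this large would appear in fewer than 0.01% of experiments. We therefore reject H₀: the worse performance of Version B is very unlikely to be due to chance. The evidence indicates that the new design made the landing page less engaging.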


## 5. Effect Size (How big is the impact?)

In [10]:
# Effect size
absolute_change = rate_B - rate_A
relative_change = (rate_B / rate_A - 1) * 100

print("\n=== Effect Size ===")
print(f"Absolute Change: {absolute_change:.4f} ({absolute_change*100:.2f} percentage points)")
print(f"Relative Change: {relative_change:.2f}%")


=== Effect Size ===
Absolute Change: 0.1156 (11.56 percentage points)
Relative Change: 27.73%


Results:
- Absolute change: +11.56 percentage points
- Relative change: +27.7% (Version B's bounce rate relative to Version A's)

**What this means:**
- A session is about 28% more likely to bounce on Version B than on Version A.
- In absolute terms, that is roughly 12 additional bounces per 100 sessions.
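
A confidence interval gives a sense of how precisely the absolute change is estimated. The sketch below assumes a simple Wald interval with an unpooled standard error and a 95% level (z = 1.96); the function name `diff_confidence_interval` is hypothetical, and the counts are the totals reported in Section 2.

```python
import math

def diff_confidence_interval(count_a, n_a, count_b, n_b, z_crit=1.96):
    """Wald interval for (rate_B - rate_A) using an unpooled standard error."""
    p_a = count_a / n_a
    p_b = count_b / n_b
    # Each group contributes its own binomial variance.
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    diff = p_b - p_a
    return diff - z_crit * se, diff + z_crit * se

low, high = diff_confidence_interval(57346, 137576, 25330, 47574)
print(f"95% CI for B - A: {low * 100:.2f} to {high * 100:.2f} percentage points")
```

With these totals the interval is roughly 11.0 to 12.1 percentage points, so even the lower end is a substantial increase in bounce rate.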

## 6. Recommendations


Landing Page B should not be released. Keep using Version A.

This is because:
- Version B caused significantly more users to leave immediately.
- This reduces opportunities for product discovery and purchasing.
- The difference is statistically significant (p < 0.0001), so it is very unlikely to be a chance fluctuation.
- Rolling out B would likely reduce revenue and customer engagement.

**Next steps:**
1. Reject Version B.
2. Review what changed: headline, layout, CTA, image, spacing, load time.
3. Create a smaller, more focused Variant C.
4. Test individual changes instead of a full redesign.
5. Consider segmentation (mobile vs desktop, new vs returning users).



## 7. Prepare data for dashboard

In [17]:
dashboard_metrics_df = pd.DataFrame({
    "metric": [
        "sessions_A", "sessions_B",
        "bounces_A", "bounces_B",
        "bounce_rate_A", "bounce_rate_B",
        "absolute_change", "relative_change_percent",
        "p_value", "z_statistic"
    ],
    "value": [
        sessions_A, sessions_B,
        bounces_A, bounces_B,
        round(rate_A, 4), round(rate_B, 4),
        round(absolute_change, 4), round(relative_change, 2),
        p_value, round(z_stat, 4)
    ]
})

In [18]:
# Export
dashboard_metrics_df.to_csv("/Users/harrydo/Downloads/Maven+Fuzzy+Factory/Cleaned data/D4_AB_test_dashboard_metrics.csv", index=False)