# A/B Testing Analysis – Detailed Report

## 1. Analysis Background
The dataset comes from an e-commerce website that conducted an A/B test to compare an old landing page with a new version.
The objective is to determine whether the new page leads to a statistically significant improvement in user conversion rate.

---

## 2. Data Overview
- Dataset size: **294,478 rows**
- Columns:
  - `user_id`
  - `timestamp`
  - `group`
  - `landing_page`
  - `converted`

---

## 3. Data Cleaning

### 3.1 Missing Values
- Number of rows containing any null values: **0**

No missing value handling was required.
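A check like this could be reproduced roughly as follows. This is a minimal sketch assuming the data is loaded into a pandas DataFrame; the three sample rows and their timestamps are hypothetical stand-ins, since the report does not show the raw file.

```python
import pandas as pd

# Hypothetical miniature of the dataset; the real rows are not shown in this report.
df = pd.DataFrame({
    "user_id": [1, 2, 3],
    "timestamp": ["2017-01-01 10:00:00", "2017-01-02 11:30:00", "2017-01-03 09:15:00"],
    "group": ["control", "treatment", "control"],
    "landing_page": ["old_page", "new_page", "old_page"],
    "converted": [0, 1, 0],
})

# Count rows that contain at least one null in any column.
rows_with_nulls = int(df.isnull().any(axis=1).sum())
print(rows_with_nulls)
```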

---

### 3.2 Logical Consistency Check
Valid combinations:
- `control` → `old_page`
- `treatment` → `new_page`

Rows violating this logic were removed.

| Description | Count |
|------------|------|
| Rows before cleaning | 294,478 |
| Rows after cleaning | 290,585 |
| Invalid rows removed | 3,893 |
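The filter described above might look like the following sketch. It assumes a pandas DataFrame with the `group` and `landing_page` columns listed in Section 2; the four sample rows are hypothetical and chosen so that two of them violate the pairing.

```python
import pandas as pd

# Hypothetical sample rows; users 3 and 4 violate the expected pairing.
df = pd.DataFrame({
    "user_id": [1, 2, 3, 4],
    "group": ["control", "treatment", "control", "treatment"],
    "landing_page": ["old_page", "new_page", "new_page", "old_page"],
    "converted": [0, 1, 0, 1],
})

# A row is valid when the assigned group and the page actually shown agree.
is_valid = (
    ((df["group"] == "control") & (df["landing_page"] == "old_page"))
    | ((df["group"] == "treatment") & (df["landing_page"] == "new_page"))
)

clean = df[is_valid].reset_index(drop=True)
removed = len(df) - len(clean)
print(len(df), len(clean), removed)
```

Dropping mismatched rows rather than relabeling them avoids guessing which of the two columns was recorded incorrectly.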

---

### 3.3 Duplicate Users
To ensure independence of observations, each user was counted only once.

| Description | Count |
|------------|------|
| Rows before deduplication | 290,585 |
| Rows after deduplication | 290,584 |
| Duplicate users removed | 1 |
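One way to perform this deduplication is sketched below. The report does not state which of a user's rows was retained, so keeping the first occurrence (`keep="first"`) is an assumption, and the sample rows are hypothetical.

```python
import pandas as pd

# Hypothetical sample in which user 2 appears twice.
df = pd.DataFrame({
    "user_id": [1, 2, 2, 3],
    "group": ["control", "treatment", "treatment", "control"],
    "landing_page": ["old_page", "new_page", "new_page", "old_page"],
    "converted": [0, 0, 1, 1],
})

# Assumption: keep the first row seen for each user_id and drop later ones.
deduped = df.drop_duplicates(subset="user_id", keep="first").reset_index(drop=True)
duplicates_removed = len(df) - len(deduped)
print(len(df), len(deduped), duplicates_removed)
```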

---

### 3.4 Traffic Distribution
- New page traffic share: **50.01%**

Traffic was evenly split between the control and treatment groups, indicating a well-balanced experiment.
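The traffic share can be computed as the fraction of rows served the new page. The snippet below is a sketch on hypothetical rows, not the report's actual data.

```python
import pandas as pd

# Hypothetical cleaned data with an even split.
df = pd.DataFrame({
    "landing_page": ["old_page", "new_page", "new_page", "old_page"],
})

# The mean of a boolean Series is the proportion of True values.
new_page_share = (df["landing_page"] == "new_page").mean()
print(f"{new_page_share:.2%}")
```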

---

## 4. Hypothesis Testing

### 4.1 Problem Definition
Let:
- \( p_1 \): conversion rate of the old page
- \( p_2 \): conversion rate of the new page

**Null hypothesis (H₀):**
\( p_1 \geq p_2 \)

**Alternative hypothesis (H₁):**
\( p_1 < p_2 \)

---

### 4.2 Sampling Distribution
- Binary outcome (converted / not converted)
- Binomial distribution
- Two independent samples
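- Large sample size (roughly 145,000 users per group; the normal approximation to the binomial is reasonable when each group has at least about 10 conversions and 10 non-conversions)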

A **two-sample Z-test for proportions** is appropriate.
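For reference, a pooled two-proportion Z-test can be sketched with the Python standard library alone. The function name `two_proportion_z_test` and the counts passed to it are hypothetical, and the report does not say whether a pooled or unpooled standard error was used, so the pooled form here is an assumption.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_old, n_old, conv_new, n_new):
    """Right-tailed pooled Z-test of H1: p_new > p_old. Returns (z, p_value)."""
    p_old = conv_old / n_old
    p_new = conv_new / n_new
    # Pooled conversion rate, estimated under the boundary case p_old == p_new.
    p_pool = (conv_old + conv_new) / (n_old + n_new)
    # Standard error of the difference in sample proportions (assumed pooled form).
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_old + 1 / n_new))
    z = (p_new - p_old) / se
    # Right-tail probability under the standard normal distribution.
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# Hypothetical counts, chosen only to exercise the function.
z, p_value = two_proportion_z_test(conv_old=100, n_old=1000, conv_new=130, n_new=1000)
print(round(z, 3), round(p_value, 4))
```

Defining the statistic as the new-page rate minus the old-page rate means large positive values of `z` count as evidence for the new page.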

---

### 4.3 Test Direction
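Since the goal is to test whether the new page performs better, a **one-sided (right-tailed) test** is used, with the statistic computed as the new-page rate minus the old-page rate so that positive Z values favor the new page.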

---

### 4.4 Significance Level
- \( \alpha = 0.05 \)

---

### 4.5 Test Statistics

| Metric | Value |
|------|------|
| Z-score | 2.148 |
| P-value | 0.0158 |
| Z-critical | 1.6449 |
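The critical value and p-value in the table can be cross-checked from the reported Z-score and significance level. This sketch uses only the standard library; it takes the Z-score as given rather than recomputing it, because the underlying conversion counts are not listed in this report.

```python
from statistics import NormalDist

alpha = 0.05
z_score = 2.148  # value reported in the table above

std_normal = NormalDist()  # mean 0, standard deviation 1

# Right-tailed critical value: the point with alpha probability to its right.
z_critical = std_normal.inv_cdf(1 - alpha)

# Right-tailed p-value for the reported Z-score.
p_value = 1 - std_normal.cdf(z_score)

reject_h0 = z_score > z_critical
print(round(z_critical, 4), round(p_value, 4), reject_h0)
```

The recomputed p-value agrees with the table to within the rounding of the Z-score.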

---

### 4.6 Conversion Rate Comparison

The following chart compares the conversion rates between the old landing page (control group) and the new landing page (treatment group).

![Conversion Rate Comparison](figures/conversion_rate_comparison.png)

---

## 5. Conclusion
Since the p-value (0.0158) is smaller than the significance level (0.05), or equivalently the Z-score (2.148) exceeds the critical value (1.6449), the null hypothesis is rejected.

**There is statistically significant evidence that the new landing page has a higher conversion rate than the old page.**
