# Evaluating Website Redesign Impact with A/B Testing


## Business Context and Problem Statement


This project evaluates a controlled A/B experiment to determine whether a new website landing page meaningfully improves user conversion compared to the existing design. Users are randomly assigned to either a control group (old landing page) or a treatment group (new landing page), and each user's conversion outcome is recorded.

The objective of this analysis is to assess whether the observed difference in conversion performance between the two groups is statistically significant and practically meaningful, and to provide a data-driven recommendation on whether the redesigned page should be adopted.
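As a rough illustration of what "statistically significant" could mean for two conversion rates, the sketch below implements a pooled two-proportion z-test using only the standard library. The choice of this particular test, the function name `two_proportion_ztest`, and the counts passed to it are assumptions made for illustration; they are not results from the dataset.

```python
import math

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both pages convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts, for illustration only (not taken from the dataset).
z, p_value = two_proportion_ztest(conv_a=120, n_a=1000, conv_b=150, n_b=1000)
print(f"z = {z:.3f}, p = {p_value:.4f}")
```

A small p-value would only address statistical significance; whether the size of the difference is practically meaningful is a separate judgment that depends on the business context.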

## Dataset Overview

The analysis uses user-level data from the experiment described above. Each observation represents an individual user interaction recorded during the experiment.

The dataset is structured to allow a comparison between a control group and a treatment group, with conversion outcome as the primary metric of interest. A detailed inspection of the dataset structure and variables is conducted after loading the data.

## Dataset Source

The dataset used in this project was obtained from Kaggle:

- **Source**: Kaggle, A/B Testing Dataset
- **Link**: https://www.kaggle.com/datasets/zhangluyuan/ab-testing

### Import Required Libraries

In [2]:
import pandas as pd
import numpy as np

pd.set_option("display.max_columns", 50)
pd.set_option("display.width", 120)

### Load Dataset

In [4]:
df = pd.read_csv("../data/ab_data.csv")
df.head()

|   | user_id | timestamp | group | landing_page | converted |
|---|---------|-----------|-------|--------------|-----------|
| 0 | 851104 | 2017-01-21 22:11:48.556739 | control | old_page | 0 |
| 1 | 804228 | 2017-01-12 08:01:45.159739 | control | old_page | 0 |
| 2 | 661590 | 2017-01-11 16:55:06.154213 | treatment | new_page | 0 |
| 3 | 853541 | 2017-01-08 18:28:03.143765 | treatment | new_page | 0 |
| 4 | 864975 | 2017-01-21 01:52:26.210827 | control | old_page | 1 |


### Dataset Shape

In [6]:
df.shape

(294478, 5)

### Column Names

In [8]:
df.columns.tolist()

['user_id', 'timestamp', 'group', 'landing_page', 'converted']

### Data Types

In [10]:
df.dtypes

user_id          int64
timestamp       object
group           object
landing_page    object
converted        int64
dtype: object

### Missing Values Check

In [11]:
df.isna().sum()

user_id         0
timestamp       0
group           0
landing_page    0
converted       0
dtype: int64

### Duplicate Rows Check

In [13]:
df.duplicated().sum()

0
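A row-level check only flags rows that match on every column, so it cannot reveal a user who appears more than once with different values. The sketch below shows the difference on a small made-up frame that borrows the dataset's column names; the values are hypothetical and say nothing about what the real data contains.

```python
import pandas as pd

# Toy frame with the same column names as the dataset; values are made up.
toy = pd.DataFrame({
    "user_id": [1, 2, 2, 3],
    "group": ["control", "treatment", "treatment", "control"],
    "landing_page": ["old_page", "new_page", "old_page", "old_page"],
    "converted": [0, 1, 0, 0],
})

# No row is an exact copy of another, so the row-level check reports 0.
full_row_dupes = int(toy.duplicated().sum())

# user_id 2 appears twice, which the user-level check does catch.
repeat_users = int(toy["user_id"].duplicated().sum())

print(full_row_dupes, repeat_users)
```

Running the same `user_id`-level check on `df` would show whether any user was recorded more than once in the experiment.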

### Observations from Initial Data Inspection

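The dataset contains 294,478 user-level observations and five variables: user_id, timestamp, group, landing_page, and converted. No missing values and no fully duplicated rows are observed. The user_id and converted columns are stored as integers, while timestamp is stored as a string (object dtype) and would need to be parsed as a datetime before any time-based analysis. Because the duplicate check compares entire rows, it does not rule out repeated user_id values.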
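Two further checks follow naturally from the inspection: parsing the timestamp column, and confirming that each group saw the page it was assigned (control with the old page, treatment with the new page). The sketch below runs both on a small made-up frame; the rows and the `expected_page` and `mismatched` names are illustrative assumptions, not findings from the dataset.

```python
import pandas as pd

# Toy rows using the dataset's column names; values are made up.
toy = pd.DataFrame({
    "timestamp": [
        "2017-01-02 10:00:00.000001",
        "2017-01-03 11:30:00.000002",
        "2017-01-04 12:45:00.000003",
    ],
    "group": ["control", "control", "treatment"],
    "landing_page": ["old_page", "new_page", "new_page"],
})

# Convert the string timestamps to a proper datetime dtype.
toy["timestamp"] = pd.to_datetime(toy["timestamp"])

# Page each group is expected to see under the experiment design.
expected_page = toy["group"].map({"control": "old_page", "treatment": "new_page"})
mismatched = toy[toy["landing_page"] != expected_page]

# Counts of each group/page combination, then the number of mismatched rows.
print(pd.crosstab(toy["group"], toy["landing_page"]))
print(len(mismatched))
```

If the same check on `df` returned any mismatched rows, how to handle them (for example, excluding them) would be a design decision to document before comparing conversion rates.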