# Sales Funnel Analysis using Python

This project analyzes user drop-offs across different stages of an
e-commerce sales funnel using event-based data from Kaggle.


## Data Loading

In [39]:
import pandas as pd
home = pd.read_csv("home_page_table.csv")
search = pd.read_csv("search_page_table.csv")
payment = pd.read_csv("payment_page_table.csv")
confirmation = pd.read_csv("payment_confirmation_table.csv")

In [40]:
home.head()

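| | user_id | page |
|---|---|---|
| 0 | 313593 | home_page |
| 1 | 468315 | home_page |
| 2 | 264005 | home_page |
| 3 | 290784 | home_page |
| 4 | 639104 | home_page |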

In [41]:
home.shape

(90400, 2)

In [42]:
home.columns

Index(['user_id', 'page'], dtype='object')

## Funnel Stage Mapping

In [27]:
home['stage'] = 'Visit'
search['stage'] = 'Product_View'
payment['stage'] = 'Checkout'
confirmation['stage'] = 'Purchase'

In [43]:
home.head()

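| | user_id | page | stage |
|---|---|---|---|
| 0 | 313593 | home_page | Visit |
| 1 | 468315 | home_page | Visit |
| 2 | 264005 | home_page | Visit |
| 3 | 290784 | home_page | Visit |
| 4 | 639104 | home_page | Visit |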


In [28]:
home = home[['user_id', 'stage']]
search = search[['user_id', 'stage']]
payment = payment[['user_id', 'stage']]
confirmation = confirmation[['user_id', 'stage']]
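
The tagging and column selection above repeat the same two steps for each table, so they can also be written as one loop over a mapping from stage label to table. The following is a minimal sketch, assuming small in-memory stand-ins for the CSV tables; the `tables` and `tagged` names, the user IDs, and the `page` values other than `home_page` are hypothetical.

```python
import pandas as pd

# Tiny stand-ins for the CSV tables; the values are illustrative only.
tables = {
    'Visit': pd.DataFrame({'user_id': [1, 2, 3, 4], 'page': 'home_page'}),
    'Product_View': pd.DataFrame({'user_id': [1, 2], 'page': 'search_page'}),
    'Checkout': pd.DataFrame({'user_id': [1], 'page': 'payment_page'}),
}

# Tag each table with its funnel stage and keep only the columns the funnel needs.
tagged = {
    stage: df.assign(stage=stage)[['user_id', 'stage']]
    for stage, df in tables.items()
}

print(tagged['Product_View'])
```

Because `assign` returns a new DataFrame, the original tables are left untouched, which makes it easier to re-run the cell without reloading the data.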

## Funnel Construction

In [29]:
funnel_df = pd.concat([home, search, payment, confirmation])

In [44]:
funnel_df.head()

| | user_id | stage |
|---|---|---|
| 0 | 313593 | Visit |
| 1 | 468315 | Visit |
| 2 | 264005 | Visit |
| 3 | 290784 | Visit |
| 4 | 639104 | Visit |


In [30]:
funnel_counts = (
    funnel_df.groupby('stage')['user_id']
    .nunique()
    .reset_index())

In [31]:
stage_order = ['Visit', 'Product_View', 'Checkout', 'Purchase']
funnel_counts['stage'] = pd.Categorical(
    funnel_counts['stage'],
    categories=stage_order,
    ordered=True)
# Sort into funnel order and renumber the rows so that row 0 is 'Visit'
funnel_counts = funnel_counts.sort_values('stage').reset_index(drop=True)

## Conversion Rate Analysis

In [32]:
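# Step conversion: users at each stage as a percentage of the previous stage
funnel_counts['conversion_rate'] = (
    funnel_counts['user_id'] /
    funnel_counts['user_id'].shift(1)) * 100
# The first stage (Visit) has no previous stage; fill its NaN with 100.
# Unlike .loc[0, ...], this does not depend on the index labels left by sorting.
funnel_counts['conversion_rate'] = funnel_counts['conversion_rate'].fillna(100)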
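
As a worked example of the calculation, the sketch below applies the same `shift(1)` approach to the unique-user counts reported in this notebook (90400, 45200, 6030, 452). The `counts` frame is built by hand for illustration, and the `overall_rate` column is an addition that does not appear in the original analysis.

```python
import pandas as pd

stage_order = ['Visit', 'Product_View', 'Checkout', 'Purchase']

# Unique users per stage, as reported by the groupby output in this notebook.
counts = pd.DataFrame({
    'stage': stage_order,
    'user_id': [90400, 45200, 6030, 452],
})

# Step conversion: share of the previous stage's users who reach this stage.
counts['conversion_rate'] = (
    counts['user_id'] / counts['user_id'].shift(1) * 100
).fillna(100)  # the first stage has no predecessor

# Overall conversion: share of all visitors who reach this stage.
counts['overall_rate'] = counts['user_id'] / counts['user_id'].iloc[0] * 100

print(counts.round(2))
```

Step conversion shows where users leave relative to the previous stage, while overall conversion shows how much of the original traffic survives to each stage.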

In [45]:
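# Unique users per stage as a Series in funnel order; a separate name keeps
# funnel_counts (with its conversion_rate column) from being overwritten
stage_users = funnel_df.groupby('stage')['user_id'].nunique().reindex(stage_order)

In [46]:
stage_users

stage
Visit           90400
Product_View    45200
Checkout         6030
Purchase          452
Name: user_id, dtype: int64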

## Key Insights
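- A significant drop-off is observed between **Product View and Checkout**: of the 45,200 users who viewed products, only 6,030 (about 13.3%) reached checkout
- Only 452 of the 90,400 users who visited the site (0.5%) completed a purchase
- The checkout stage represents the biggest opportunity for conversion improvement: only about 7.5% of users who reach checkout (452 of 6,030) complete a purchase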
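
The drop-off behind these insights can be checked with a few lines of arithmetic. The sketch below uses the unique-user counts from the notebook output above; the `users_per_stage` and `drop_offs` names are introduced here for illustration only.

```python
# Unique users per stage, taken from the groupby output above.
users_per_stage = {
    'Visit': 90400,
    'Product_View': 45200,
    'Checkout': 6030,
    'Purchase': 452,
}

stages = list(users_per_stage)
drop_offs = []

# Compare each stage with the one that follows it.
for prev, curr in zip(stages, stages[1:]):
    lost = users_per_stage[prev] - users_per_stage[curr]
    lost_pct = lost / users_per_stage[prev] * 100
    drop_offs.append((prev, curr, lost, round(lost_pct, 1)))
    print(f"{prev} -> {curr}: lost {lost} users ({lost_pct:.1f}%)")
```

Looking at both the absolute number of users lost and the percentage lost helps distinguish the step that loses the most people from the step that loses the largest share.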

## Business Recommendations
- Simplify checkout flow to reduce friction
- Retarget users who viewed products but did not check out
- Improve payment experience to increase completed purchases
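
For the retargeting recommendation, the audience is the set of users who appear in the search page table but not in the payment page table. The following is a minimal sketch of how that list could be built; the user IDs are hypothetical stand-ins, and the `retarget` name is introduced here for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for the search and payment page tables.
search = pd.DataFrame({'user_id': [101, 102, 103, 104]})
payment = pd.DataFrame({'user_id': [102, 104]})

# Users who viewed products but never reached the payment page.
retarget = search.loc[~search['user_id'].isin(payment['user_id']), 'user_id']

print(retarget.tolist())
```

The same pattern, applied to the payment and confirmation tables, would list users who started checkout but did not complete a purchase.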