
Unlike the mock data, this dataset uses binary flags (0 or 1). We need to see how many users performed each action.

In [1]:
import pandas as pd
import plotly.graph_objects as go
import plotly.express as px

# 1. Load the dataset
df = pd.read_csv('/content/testing_sample.csv')

# 2. Check the "Health" of the data
print(f"Total Users in Dataset: {len(df)}")
print(df[['saw_homepage', 'saw_checkout', 'ordered']].sum())

Total Users in Dataset: 151655
saw_homepage    43313
saw_checkout     6464
ordered             0
dtype: int64
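Before relying on column sums, it can help to confirm that the flags really are 0/1 and to spot flags that are never set (as `ordered` is in the output above). The following is a minimal sketch on a tiny made-up frame; the helper name `check_binary_flags` and the `bad_flag` column are hypothetical and not part of the dataset.

```python
import pandas as pd

def check_binary_flags(frame: pd.DataFrame, columns: list[str]) -> dict[str, bool]:
    """Return, per column, whether every non-missing value is 0 or 1."""
    return {col: bool(frame[col].dropna().isin([0, 1]).all()) for col in columns}

# Tiny made-up frame standing in for the real CSV
sample = pd.DataFrame({
    "saw_homepage": [1, 0, 1, 1],
    "saw_checkout": [0, 0, 1, 0],
    "ordered": [0, 0, 0, 0],
    "bad_flag": [0, 2, 1, 0],  # hypothetical column with a non-binary value
})

flag_report = check_binary_flags(sample, list(sample.columns))
print(flag_report)

# A flag that is never 1 cannot be used to measure conversion
never_set = [col for col in sample.columns if sample[col].sum() == 0]
print("Flags that are never set:", never_set)
```

On the real data, the same helper could be called as `check_binary_flags(df, ['saw_homepage', 'saw_checkout', 'ordered'])`.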


Because each row represents one user and each flag is 0 or 1, summing a flag column counts the users who reached that milestone. We build the funnel from those counts.

In [2]:
# Aggregate counts for the funnel
funnel_stages = {
    'Homepage': df['saw_homepage'].sum(),
    'Added to Basket': (df['basket_add_list'] | df['basket_add_detail']).sum(),
    'Reached Checkout': df['saw_checkout'].sum(),
    'Purchased': df['ordered'].sum()
}

funnel_df = pd.DataFrame(list(funnel_stages.items()), columns=['Stage', 'Count'])

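The funnel chart makes the drop-off between stages visible, which is typically steep on e-commerce sites. In this sample `ordered` sums to 0, so the "Purchased" stage will be empty.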

In [3]:
fig = go.Figure(go.Funnel(
    y = funnel_df['Stage'],
    x = funnel_df['Count'],
    textinfo = "value+percent initial"
))

fig.update_layout(title_text="Amazon User Conversion Funnel")
fig.show()
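The drop-off can also be expressed as numbers alongside the chart. The sketch below adds two rate columns to a funnel table: the share of the first stage and the share of the previous stage. The counts are made up for illustration, and the helper name `add_funnel_rates` is hypothetical.

```python
import pandas as pd

# Hypothetical stage counts, made up for illustration only
example_funnel = pd.DataFrame({
    "Stage": ["Homepage", "Added to Basket", "Reached Checkout", "Purchased"],
    "Count": [1000, 200, 50, 10],
})

def add_funnel_rates(funnel: pd.DataFrame) -> pd.DataFrame:
    """Add percent-of-first-stage and percent-of-previous-stage columns."""
    out = funnel.copy()
    first = out["Count"].iloc[0]
    previous = out["Count"].shift(1)  # NaN for the first stage
    out["Pct_of_First"] = out["Count"] * 100 / first
    out["Pct_of_Previous"] = out["Count"] * 100 / previous
    return out

funnel_rates = add_funnel_rates(example_funnel)
print(funnel_rates)
```

Because each stage count is summed independently, the stages are not guaranteed to be nested (a user can reach checkout without a homepage view being recorded), so a step rate above 100% is possible on real data and should be read with that in mind.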

"Do mobile users buy more than computer users?"

In [4]:
# Calculate conversion rate per device
devices = ['device_mobile', 'device_computer', 'device_tablet']
device_conversion = {}

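for device in devices:
    # Users flagged as using this device type
    subset = df[df[device] == 1]
    # The mean of a 0/1 flag is the share of users who ordered (NaN if the subset is empty)
    conv_rate = subset['ordered'].mean() * 100
    device_conversion[device] = conv_rate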

device_df = pd.DataFrame(list(device_conversion.items()), columns=['Device', 'Conversion_Rate'])

# Plotting the comparison
fig_device = px.bar(device_df, x='Device', y='Conversion_Rate',
                    title="Conversion Rate by Device Type",
                    color='Conversion_Rate',
                    labels={'Conversion_Rate': 'Conversion %'})
fig_device.show()
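A conversion rate is easier to interpret next to the number of users behind it, since a rate computed from a handful of users is noisy. The sketch below reports users, orders, and rate per device on a small made-up frame; the `device_summary` table and its `Users` and `Orders` columns are illustrative additions, not part of the original analysis.

```python
import pandas as pd

# Made-up rows standing in for the real data
sample = pd.DataFrame({
    "device_mobile":   [1, 1, 0, 0, 0, 1],
    "device_computer": [0, 0, 1, 1, 0, 0],
    "device_tablet":   [0, 0, 0, 0, 1, 0],
    "ordered":         [1, 0, 1, 1, 0, 0],
})
devices = ["device_mobile", "device_computer", "device_tablet"]

rows = []
for device in devices:
    mask = sample[device] == 1
    users = int(mask.sum())
    orders = int(sample.loc[mask, "ordered"].sum())
    # Guard against an empty device group instead of dividing by zero
    rate = orders / users * 100 if users else float("nan")
    rows.append({"Device": device, "Users": users,
                 "Orders": orders, "Conversion_Rate": rate})

device_summary = pd.DataFrame(rows)
print(device_summary)
```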

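Which website features are most strongly associated with sales? One way to check is to rank the click flags by their correlation with `ordered`. Correlation indicates association rather than cause, and it is undefined (NaN) when `ordered` takes a single value across all rows, as it does in this sample.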

In [5]:
# Correlation of specific actions with ordering
correlations = df.corr(numeric_only=True)['ordered'].sort_values(ascending=False)

# Filter for relevant feature interactions (removing 'ordered' and ID columns)
top_features = correlations.drop(['ordered', 'UserID'], errors='ignore').head(5)
print("Top 5 Actions that lead to a Purchase:")
print(top_features)

Top 5 Actions that lead to a Purchase:
basket_icon_click   NaN
basket_add_list     NaN
basket_add_detail   NaN
sort_by             NaN
image_picker        NaN
Name: ordered, dtype: float64
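The NaN values above are what Pearson correlation returns when one of the two columns has zero variance. One possible guard, sketched here on made-up data, is to fail early when the target is constant and to drop other constant columns before ranking. The helper name `correlations_with_target` is hypothetical.

```python
import pandas as pd

def correlations_with_target(frame: pd.DataFrame, target: str) -> pd.Series:
    """Rank numeric columns by correlation with `target`, skipping constant columns."""
    numeric = frame.select_dtypes(include="number")
    if numeric[target].nunique(dropna=True) < 2:
        raise ValueError(
            f"'{target}' is constant, so its correlation with other columns is undefined"
        )
    # Keep only columns that actually vary; constant ones would yield NaN
    varying = numeric.loc[:, numeric.nunique(dropna=True) > 1]
    return varying.corr()[target].drop(target).sort_values(ascending=False)

# Made-up rows for illustration
sample = pd.DataFrame({
    "basket_add_detail": [1, 1, 0, 0, 1, 0],
    "sort_by":           [0, 1, 0, 1, 0, 1],
    "ordered":           [1, 1, 0, 0, 1, 0],
})
ranked = correlations_with_target(sample, "ordered")
print(ranked)

# With an all-zero target, the helper raises instead of returning NaN
constant_target = sample.assign(ordered=0)
try:
    correlations_with_target(constant_target, "ordered")
    target_was_constant = False
except ValueError as err:
    target_was_constant = True
    print(err)
```

Applied to a file where some users did order, the same call would be `correlations_with_target(df, 'ordered').head(5)`.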
