In [3]:
import pandas as pd

# 1. Loading and cleaning (Basic Setup)
df = pd.read_excel("customer.xlsx")
df = df.rename(columns={
    'InvoiceNo': 'order_id', 'CustomerID': 'customer_id',
    'InvoiceDate': 'date', 'Description': 'product',
    'Quantity': 'quantity', 'UnitPrice': 'price', 'Country': 'country'
})

# 2. Revenue Calculation
df['revenue'] = df['quantity'] * df['price']
total_revenue = df['revenue'].sum()
total_orders = df['order_id'].nunique()
order_revenue = df.groupby('order_id')['revenue'].sum()
aov_order_level = order_revenue.mean()

# 3. Retention Metrics (Day 4 Core)
orders_per_customer = df.groupby('customer_id')['order_id'].nunique()
repeat_customers = orders_per_customer[orders_per_customer > 1]
one_time_customers = orders_per_customer[orders_per_customer == 1]

total_customers = orders_per_customer.shape[0]
repeat_rate = (repeat_customers.shape[0] / total_customers) * 100

# Revenue Segmentation
revenue_per_customer = df.groupby('customer_id')['revenue'].sum()
repeat_revenue = revenue_per_customer.loc[repeat_customers.index].sum()
one_time_revenue = revenue_per_customer.loc[one_time_customers.index].sum()

avg_revenue_repeat = repeat_revenue / repeat_customers.shape[0]
avg_revenue_one_time = one_time_revenue / one_time_customers.shape[0]

# 4. Priority segment for retention
# Repeat customers whose total spend exceeds the average spend per customer
repeat_spend = revenue_per_customer.loc[repeat_customers.index]
priority_customers = repeat_spend[repeat_spend > revenue_per_customer.mean()]

# --- OUTPUT ---
print(f"Repeat Purchase Rate: {repeat_rate:.2f}%")
print(f"Revenue from Repeat Customers: ${repeat_revenue:,.2f}")
print(f"Revenue from One-time Customers: ${one_time_revenue:,.2f}")
print(f"Avg Revenue per Repeat Customer: ${avg_revenue_repeat:,.2f}")
print(f"Avg Revenue per One-time Customer: ${avg_revenue_one_time:,.2f}")

Repeat Purchase Rate: 69.97%
Revenue from Repeat Customers: $7,866,281.14
Revenue from One-time Customers: $433,784.67
Avg Revenue per Repeat Customer: $2,571.52
Avg Revenue per One-time Customer: $330.38
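
One caveat when reading these figures: `groupby` excludes rows whose key is missing by default, so any line items without a `customer_id` count toward `total_revenue` but not toward either customer segment. The sketch below uses made-up toy rows (not the real dataset) to show how the unattributed amount could be measured.

```python
import pandas as pd

# Hypothetical toy rows; the last one has no customer ID.
df = pd.DataFrame({
    "customer_id": [1.0, 1.0, 2.0, None],
    "order_id": ["A1", "A2", "B1", "X1"],
    "revenue": [10.0, 20.0, 5.0, 100.0],
})

total_revenue = df["revenue"].sum()

# groupby drops rows with a missing key by default,
# so this sum covers identified customers only.
attributed_revenue = df.groupby("customer_id")["revenue"].sum().sum()

unattributed_revenue = total_revenue - attributed_revenue
print(f"Total: {total_revenue:.2f}")
print(f"Attributed to customers: {attributed_revenue:.2f}")
print(f"Unattributed (missing customer_id): {unattributed_revenue:.2f}")
```

If the unattributed share turns out to be large on the real data, the segment totals above describe identified customers only.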


📈 Business analysis
1. Situation (What happened?)

High Retention: About 70% of customers (69.97%) placed more than one order, a strong Repeat Purchase Rate.

Revenue Distribution: Loyal customers (repeat buyers) generated $7.87M, while one-time buyers contributed only $434k.

Value Gap: A returning customer is worth **$2,571** on average, roughly 7.8 times the revenue of a one-time shopper ($330).
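
The three figures above can be cross-checked with plain arithmetic. This is a small sketch that hard-codes the numbers from the printed output; the variable names `value_ratio`, `n_repeat`, `n_one_time`, and `implied_repeat_rate` are introduced here for illustration only.

```python
# Figures copied from the printed output of the analysis cell.
repeat_revenue = 7_866_281.14
one_time_revenue = 433_784.67
avg_revenue_repeat = 2_571.52
avg_revenue_one_time = 330.38

# Value gap: average revenue of a repeat customer relative to a one-time customer.
value_ratio = avg_revenue_repeat / avg_revenue_one_time

# Implied customer counts, recovered as segment revenue / segment average.
n_repeat = round(repeat_revenue / avg_revenue_repeat)
n_one_time = round(one_time_revenue / avg_revenue_one_time)

# The repeat rate implied by those counts should match the printed 69.97%.
implied_repeat_rate = n_repeat / (n_repeat + n_one_time) * 100

print(f"Value ratio: {value_ratio:.2f}x")
print(f"Implied customers: {n_repeat} repeat, {n_one_time} one-time")
print(f"Implied repeat rate: {implied_repeat_rate:.2f}%")
```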

2. Why does it matter? (Hypothesis)

The business is almost entirely sustained by a loyal core: repeat customers account for about 95% of the revenue attributed to identified customers.

High-value repeat customers are the primary engine of revenue. On average, losing a repeat customer forfeits roughly 8 times the revenue lost with a one-time buyer.

The current strategy should shift from "finding new people" to "keeping existing ones."
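
To put a number on how concentrated revenue is, the share contributed by each segment can be computed from the two printed totals. A minimal sketch, assuming those totals are the complete revenue attributed to identified customers; `repeat_share` and `one_time_share` are names introduced here.

```python
# Segment totals copied from the printed output of the analysis cell.
repeat_revenue = 7_866_281.14
one_time_revenue = 433_784.67
repeat_rate = 69.97  # percent of customers with more than one order

attributed_revenue = repeat_revenue + one_time_revenue
repeat_share = repeat_revenue / attributed_revenue * 100
one_time_share = one_time_revenue / attributed_revenue * 100

print(f"Repeat customers: {repeat_rate:.2f}% of customers, "
      f"{repeat_share:.2f}% of attributed revenue")
print(f"One-time customers: {100 - repeat_rate:.2f}% of customers, "
      f"{one_time_share:.2f}% of attributed revenue")
```

Comparing customer share with revenue share side by side is what supports the "loyal core" reading: the repeat segment contributes a larger share of revenue than its share of the customer base.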

3. Action Plan (Recommendations)

Direct Retargeting: Use the priority_customers list to launch a dedicated loyalty campaign.

Conversion Funnel: Create "Second Order" incentives (coupons, discounts) for one-time buyers to move them into the high-value repeat segment.

LTV Growth: Since repeat customers already trust the brand, focus on upselling premium items (like the Picnic Baskets found on Day 2) to this specific group.
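
The first two recommendations both need a concrete list of customer IDs. The sketch below shows one way such lists could be built from a per-customer summary. It runs on a small made-up frame rather than the real data, and the names `summary`, `second_order_targets`, and `loyalty_targets` are hypothetical, as is the choice of "above the average customer's total spend" as the loyalty threshold.

```python
import pandas as pd

# Toy data standing in for the cleaned order lines (hypothetical values).
df = pd.DataFrame({
    "order_id":    ["A1", "A2", "A3", "B1", "C1", "C2"],
    "customer_id": [1, 1, 1, 2, 3, 3],
    "revenue":     [50.0, 70.0, 30.0, 20.0, 5.0, 5.0],
})

# One row per customer: number of distinct orders and total spend.
summary = df.groupby("customer_id").agg(
    orders=("order_id", "nunique"),
    revenue=("revenue", "sum"),
)

# One-time buyers: candidates for a "Second Order" incentive.
second_order_targets = summary[summary["orders"] == 1]

# Repeat buyers whose total spend exceeds the average customer's total spend:
# candidates for the loyalty campaign, highest spenders first.
is_repeat = summary["orders"] > 1
above_average = summary["revenue"] > summary["revenue"].mean()
loyalty_targets = summary[is_repeat & above_average].sort_values(
    "revenue", ascending=False
)

print(second_order_targets)
print(loyalty_targets)
```

Keeping both lists at the customer level (one row per ID) means they can be handed to a campaign tool directly, without deduplicating order lines first.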
