# Module 10: Final Project - E-Commerce Analytics

**Estimated Time:** 90 minutes

## Project Overview

Congratulations on making it to the final project! You'll now apply everything you've learned to analyze an e-commerce database and provide actionable business insights.

## Learning Objectives

By completing this project, you will:
- Apply all SQL concepts learned in previous modules
- Perform comprehensive data analysis
- Create business intelligence queries
- Generate insights for decision-making
- Build a complete analytical report

## Project Scenario

You're a data analyst at an online retail company. Management has asked you to analyze sales data and provide insights on:
1. Customer behavior and segmentation
2. Product performance
3. Sales trends
4. Revenue optimization opportunities
5. Inventory management

## Setup

In [None]:
import sqlite3
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from pathlib import Path

%load_ext sql

# Set plotting style
plt.style.use("seaborn-v0_8-darkgrid")
sns.set_palette("husl")

# Connect to database
DB_PATH = Path.cwd().parent / "data" / "databases" / "ecommerce.db"
conn = sqlite3.connect(DB_PATH)
%sql sqlite:///$DB_PATH

print("✓ Connected to ecommerce.db")
print("✓ Ready to start the final project!")

## Part 1: Customer Analysis (25 minutes)

### Task 1.1: Customer Lifetime Value (CLV)

Calculate the total revenue generated by each customer. Identify VIP customers (top 10% by spending).

In [None]:
%%sql
-- Your code here: Calculate CLV for each customer
-- Hint: JOIN customers and orders, GROUP BY customer, SUM total_amount

### Task 1.2: Customer Segmentation

Segment customers into tiers based on their total spending:
- Platinum: more than $1000
- Gold: $500 to $1000 (inclusive)
- Silver: $200 up to, but not including, $500
- Bronze: less than $200

Show count of customers in each tier.
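The tiering logic can be tried on toy data before you write it against the real database. The sketch below uses an in-memory SQLite table whose name and columns (`orders`, `customer_id`, `total_amount`) are assumptions, not the confirmed `ecommerce.db` schema. It also assumes a customer sitting exactly on a boundary goes to the higher tier.

```python
import sqlite3

# Toy data only: table and column names are assumptions about the schema.
demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total_amount REAL);
INSERT INTO orders (customer_id, total_amount) VALUES
    (1, 800), (1, 400),   -- customer 1: 1200 -> Platinum
    (2, 500),             -- customer 2: 500  -> Gold (boundary goes to the higher tier)
    (3, 150), (3, 100),   -- customer 3: 250  -> Silver
    (4, 50);              -- customer 4: 50   -> Bronze
""")

# Step 1 (CTE): total spending per customer. Step 2: map totals to tiers and count.
tier_counts = demo.execute("""
WITH spend AS (
    SELECT customer_id, SUM(total_amount) AS total_spent
    FROM orders
    GROUP BY customer_id
)
SELECT CASE
           WHEN total_spent > 1000 THEN 'Platinum'
           WHEN total_spent >= 500 THEN 'Gold'
           WHEN total_spent >= 200 THEN 'Silver'
           ELSE 'Bronze'
       END AS tier,
       COUNT(*) AS customers
FROM spend
GROUP BY tier
ORDER BY tier
""").fetchall()
demo.close()

print(tier_counts)
```

Because `CASE` stops at the first matching branch, ordering the branches from highest to lowest threshold avoids having to write both ends of each range.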

In [None]:
%%sql
-- Your code here: Customer segmentation with CASE

### Task 1.3: Customer Retention Analysis

Find customers who made repeat purchases (more than one order). What percentage of customers are repeat buyers?
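One possible shape for the repeat-buyer percentage, sketched on a made-up `orders` table (the column names are assumptions). Note that the denominator here is customers with at least one order. If you want the share of all registered customers instead, you would need to bring in the customers table.

```python
import sqlite3

# Toy data only: customers 1 and 3 ordered more than once, 2 and 4 ordered once.
demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
INSERT INTO orders (customer_id) VALUES (1), (1), (2), (3), (3), (3), (4);
""")

# Inner query: orders per customer. Outer query: share of customers with more than one.
# In SQLite a comparison evaluates to 0 or 1, so SUM(order_count > 1) counts repeat buyers.
repeat_rate = demo.execute("""
SELECT ROUND(100.0 * SUM(order_count > 1) / COUNT(*), 1)
FROM (
    SELECT customer_id, COUNT(*) AS order_count
    FROM orders
    GROUP BY customer_id
)
""").fetchone()[0]
demo.close()

print(f"Repeat customer rate: {repeat_rate}%")
```

Multiplying by `100.0` rather than `100` forces floating-point division. With integer operands SQLite would truncate the result.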

In [None]:
%%sql
-- Your code here: Identify repeat customers

### Task 1.4: Customer Geography Analysis

Analyze customer distribution and spending by country. Which countries generate the most revenue?

In [None]:
%%sql
-- Your code here: Revenue by country

## Part 2: Product Performance (20 minutes)

### Task 2.1: Best-Selling Products

Identify the top 10 products by:
1. Total quantity sold
2. Total revenue generated

In [None]:
%%sql
-- Your code here: Top products by quantity

In [None]:
%%sql
-- Your code here: Top products by revenue

### Task 2.2: Category Performance

Compare product categories by:
- Number of products
- Total sales
- Average product price
- Revenue contribution (%)

In [None]:
%%sql
-- Your code here: Category analysis

### Task 2.3: Slow-Moving Inventory

Identify products that:
- Have never been ordered, OR
- Have been ordered fewer than 3 times

These are candidates for promotions or discontinuation.
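The key detail in this task is that a product with no orders has no rows in the order-items table, so an inner join would silently drop it. A minimal sketch with a `LEFT JOIN`, using hypothetical `products` and `order_items` tables:

```python
import sqlite3

# Toy data only: table and column names are assumptions about the schema.
demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE order_items (order_id INTEGER, product_id INTEGER, quantity INTEGER);
INSERT INTO products VALUES (1, 'Mouse'), (2, 'Keyboard'), (3, 'Webcam');
INSERT INTO order_items VALUES
    (10, 1, 1), (11, 1, 2), (12, 1, 1),   -- Mouse: ordered 3 times
    (10, 2, 1);                           -- Keyboard: ordered once; Webcam: never
""")

# LEFT JOIN keeps never-ordered products; COUNT(oi.order_id) ignores the NULLs
# produced for them, so they get a count of 0 rather than 1.
slow_movers = demo.execute("""
SELECT p.name, COUNT(oi.order_id) AS times_ordered
FROM products AS p
LEFT JOIN order_items AS oi ON oi.product_id = p.product_id
GROUP BY p.product_id, p.name
HAVING COUNT(oi.order_id) < 3
ORDER BY times_ordered, p.name
""").fetchall()
demo.close()

print(slow_movers)
```

Counting `oi.order_id` instead of `*` is what makes the "never ordered" case work. `COUNT(*)` would report 1 for a product that only has the NULL-padded row.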

In [None]:
%%sql
-- Your code here: Slow-moving products

## Part 3: Sales Trends Analysis (20 minutes)

### Task 3.1: Monthly Sales Trends

Analyze sales trends by month:
- Total orders
- Total revenue
- Average order value
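In SQLite, one common way to bucket dates by month is `strftime('%Y-%m', ...)`. The sketch below assumes order dates are stored as ISO-8601 text in a column called `order_date`. Check the real schema before reusing it. The aliases `month` and `total_revenue` are chosen to match the column names the plotting cell in this task reads.

```python
import sqlite3

# Toy data only: column names and the ISO date format are assumptions.
demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, order_date TEXT, total_amount REAL);
INSERT INTO orders (order_date, total_amount) VALUES
    ('2024-01-05', 100), ('2024-01-20', 300), ('2024-02-11', 50);
""")

# Group by the 'YYYY-MM' prefix of each date; text in this format also sorts chronologically.
monthly = demo.execute("""
SELECT strftime('%Y-%m', order_date) AS month,
       COUNT(*)                      AS total_orders,
       SUM(total_amount)             AS total_revenue,
       ROUND(AVG(total_amount), 2)   AS avg_order_value
FROM orders
GROUP BY month
ORDER BY month
""").fetchall()
demo.close()

for row in monthly:
    print(row)
```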

In [None]:
# Your code here: Monthly sales analysis
query = """
-- Your SQL here
"""
df_monthly = pd.read_sql_query(query, conn)
display(df_monthly)

In [None]:
# Visualization: Monthly revenue trend
if len(df_monthly) > 0:
    plt.figure(figsize=(12, 5))
    plt.plot(df_monthly["month"], df_monthly["total_revenue"], marker="o")
    plt.title("Monthly Revenue Trend")
    plt.xlabel("Month")
    plt.ylabel("Revenue ($)")
    plt.xticks(rotation=45)
    plt.tight_layout()
    plt.show()

### Task 3.2: Order Status Analysis

Analyze orders by status. What percentage of orders are:
- Completed
- Pending
- Shipped
- Other

In [None]:
%%sql
-- Your code here: Order status distribution

### Task 3.3: Average Order Value (AOV) by Customer Segment

Calculate AOV for different customer segments (by country or spending tier).

In [None]:
%%sql
-- Your code here: AOV analysis

## Part 4: Revenue Optimization (15 minutes)

### Task 4.1: Product Bundling Opportunities

Find products that are frequently bought together (appear in the same order).
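A self-join is the usual way to find co-purchased products. The sketch below runs on a made-up `order_items` table (the column names are assumptions). It compares product IDs with `<` rather than `!=` so that each pair is reported once instead of twice, once as (A, B) and once as (B, A).

```python
import sqlite3

# Toy data only: three orders containing products 101, 102 and 103.
demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE order_items (order_id INTEGER, product_id INTEGER);
INSERT INTO order_items VALUES
    (1, 101), (1, 102),
    (2, 101), (2, 102), (2, 103),
    (3, 102), (3, 103);
""")

# Pair every item with every other item in the same order, then count the pairs.
pairs = demo.execute("""
SELECT a.product_id AS product_a,
       b.product_id AS product_b,
       COUNT(*)     AS times_together
FROM order_items AS a
JOIN order_items AS b
  ON a.order_id = b.order_id
 AND a.product_id < b.product_id
GROUP BY a.product_id, b.product_id
ORDER BY times_together DESC, product_a, product_b
""").fetchall()
demo.close()

print(pairs)
```

On the real data you would probably join back to the products table to show names instead of IDs.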

In [None]:
%%sql
-- Your code here: Products frequently bought together
-- Hint: Self-join order_items (aliases a and b) on order_id and keep rows where
-- a.product_id < b.product_id, so each pair is counted once

### Task 4.2: Customer Win-Back Campaign

Identify customers who:
- Placed at least one order in the past
- Haven't ordered in the last 6 months
- Had a lifetime value > $100

These are targets for a win-back campaign.
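The three criteria map naturally onto a single `GROUP BY` with a `HAVING` clause: grouping guarantees at least one order, `MAX(order_date)` gives the most recent purchase, and `SUM(total_amount)` gives lifetime value. A minimal sketch on toy data, with hypothetical column names and ISO-formatted text dates:

```python
import sqlite3

# Toy data only: dates are generated relative to today so the example stays valid over time.
demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_date TEXT, total_amount REAL
);
INSERT INTO orders (customer_id, order_date, total_amount) VALUES
    (1, date('now', '-2 years'), 150),
    (1, date('now', '-1 year'),   60),   -- customer 1: lapsed, 210 lifetime -> target
    (2, date('now', '-1 year'),  500),
    (2, date('now', '-10 days'),  20),   -- customer 2: ordered recently -> excluded
    (3, date('now', '-1 year'),   40);   -- customer 3: lapsed, but only 40 lifetime -> excluded
""")

win_back = demo.execute("""
SELECT customer_id,
       SUM(total_amount) AS lifetime_value,
       MAX(order_date)   AS last_order
FROM orders
GROUP BY customer_id
HAVING MAX(order_date) < date('now', '-6 months')
   AND SUM(total_amount) > 100
""").fetchall()
demo.close()

print(win_back)
```

If the course database is a historical snapshot, every customer may look inactive relative to today. In that case it can make more sense to measure the six months back from the latest order date in the table, for example `(SELECT MAX(order_date) FROM orders)`, instead of `'now'`.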

In [None]:
%%sql
-- Your code here: Inactive high-value customers

### Task 4.3: Pricing Analysis

Analyze the relationship between product price and sales volume. Do lower-priced products sell better?

In [None]:
# Your code here: Price vs Sales analysis
query = """
-- Your SQL here
"""
df_price_sales = pd.read_sql_query(query, conn)
display(df_price_sales.head(20))

In [None]:
# Visualization: Price vs Sales scatter plot
if len(df_price_sales) > 0:
    plt.figure(figsize=(10, 6))
    plt.scatter(df_price_sales["price"], df_price_sales["units_sold"], alpha=0.6)
    plt.title("Product Price vs Units Sold")
    plt.xlabel("Price ($)")
    plt.ylabel("Units Sold")
    plt.tight_layout()
    plt.show()

## Part 5: Inventory Management (10 minutes)

### Task 5.1: Stock Reorder Alert

Identify products that need restocking based on:
- Current stock < 30 units
- Has been ordered at least 5 times

Prioritize by sales velocity.
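A sketch of the reorder alert on toy data. The table and column names (`products.stock_quantity`, `order_items.quantity`) are assumptions, and "sales velocity" is approximated here by total units sold. A stricter definition would divide by the length of the sales period.

```python
import sqlite3

# Toy data only: schema and figures are made up for illustration.
demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT, stock_quantity INTEGER);
CREATE TABLE order_items (order_id INTEGER, product_id INTEGER, quantity INTEGER);
INSERT INTO products VALUES
    (1, 'Mouse', 12), (2, 'Keyboard', 25), (3, 'Webcam', 80), (4, 'Cable', 5);
""")
order_lines = (
    [(order_id, 1, 2) for order_id in range(1, 7)]    # Mouse: 6 orders, 12 units
    + [(order_id, 2, 1) for order_id in range(1, 6)]  # Keyboard: 5 orders, 5 units
    + [(order_id, 3, 1) for order_id in range(1, 9)]  # Webcam: 8 orders, but plenty of stock
    + [(1, 4, 1)]                                     # Cable: low stock, but only 1 order
)
demo.executemany("INSERT INTO order_items VALUES (?, ?, ?)", order_lines)

# WHERE filters on stock before grouping; HAVING filters on the order count after grouping.
reorder = demo.execute("""
SELECT p.name,
       p.stock_quantity,
       COUNT(DISTINCT oi.order_id) AS times_ordered,
       SUM(oi.quantity)            AS units_sold
FROM products AS p
JOIN order_items AS oi ON oi.product_id = p.product_id
WHERE p.stock_quantity < 30
GROUP BY p.product_id, p.name, p.stock_quantity
HAVING COUNT(DISTINCT oi.order_id) >= 5
ORDER BY units_sold DESC
""").fetchall()
demo.close()

print(reorder)
```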

In [None]:
%%sql
-- Your code here: Reorder recommendations

### Task 5.2: Inventory Value Analysis

Calculate total inventory value (stock_quantity × price) by category.

In [None]:
%%sql
-- Your code here: Inventory value by category

## Part 6: Executive Dashboard Query

Create a comprehensive dashboard query that shows:

```
Key Metrics:
- Total Customers
- Total Orders
- Total Revenue
- Average Order Value
- Total Products
- Products Out of Stock
- Repeat Customer Rate (%)
```
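Since the metrics come from different tables at different granularities, one straightforward layout is a single `SELECT` made of independent scalar subqueries. The sketch below uses toy tables with assumed column names. The repeat rate is computed over customers who have placed at least one order, which is one of several reasonable definitions.

```python
import sqlite3

# Toy data only: 4 customers (one never ordered), 3 products (one out of stock), 4 orders.
demo = sqlite3.connect(":memory:")
demo.row_factory = sqlite3.Row  # lets us read result columns by name
demo.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY);
CREATE TABLE products (product_id INTEGER PRIMARY KEY, stock_quantity INTEGER);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total_amount REAL);
INSERT INTO customers VALUES (1), (2), (3), (4);
INSERT INTO products VALUES (1, 10), (2, 0), (3, 7);
INSERT INTO orders (customer_id, total_amount) VALUES (1, 100), (1, 50), (2, 30), (3, 20);
""")

# Each metric is its own scalar subquery, so no join can inflate the counts.
row = demo.execute("""
SELECT
    (SELECT COUNT(*) FROM customers)                         AS total_customers,
    (SELECT COUNT(*) FROM orders)                            AS total_orders,
    (SELECT SUM(total_amount) FROM orders)                   AS total_revenue,
    (SELECT ROUND(AVG(total_amount), 2) FROM orders)         AS avg_order_value,
    (SELECT COUNT(*) FROM products)                          AS total_products,
    (SELECT COUNT(*) FROM products WHERE stock_quantity = 0) AS products_out_of_stock,
    (SELECT ROUND(100.0 * SUM(n > 1) / COUNT(*), 1)
       FROM (SELECT COUNT(*) AS n FROM orders GROUP BY customer_id)
    )                                                        AS repeat_customer_rate
""").fetchone()
dashboard = dict(row)
demo.close()

for metric, value in dashboard.items():
    print(f"{metric}: {value}")
```

Joining customers, orders, and products in one query would multiply rows and distort the totals, which is why the subquery layout is used here.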

In [None]:
%%sql
-- Your code here: Executive dashboard

## Part 7: Insights and Recommendations

Based on your analysis, provide answers to these questions:

1. **Customer Insights**: 
   - What percentage of customers are repeat buyers?
   - Which customer segment generates the most revenue?
   - What geographic markets should we focus on?

2. **Product Insights**:
   - Which product categories perform best?
   - Are there products we should discontinue?
   - What products frequently sell together?

3. **Sales Insights**:
   - What are the sales trends over time?
   - What is the average order value?
   - How many orders are pending/shipped vs completed?

4. **Recommendations**:
   - What inventory needs immediate restocking?
   - Which customers should we target for win-back campaigns?
   - What pricing strategies could improve sales?

**Write your insights below:**

### Your Analysis and Recommendations:

#### Customer Insights:
*(Write your findings here based on the queries above)*

#### Product Insights:
*(Write your findings here)*

#### Sales Insights:
*(Write your findings here)*

#### Key Recommendations:
*(List 3-5 actionable recommendations)*

## Bonus Challenges (Optional)

If you want to go further:

### Challenge 1: RFM Analysis
Perform RFM (Recency, Frequency, Monetary) analysis to segment customers.

In [None]:
%%sql
-- Your code here

### Challenge 2: Cohort Analysis
Analyze customer cohorts based on signup month.

In [None]:
%%sql
-- Your code here

### Challenge 3: Sales Forecasting
Use historical data to predict next month's sales (simple trend analysis).
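One very simple reading of "trend analysis" is to fit a straight line through the monthly revenue figures and extend it by one month. The sketch below does that with ordinary least squares in plain Python. The helper name `forecast_next` and the revenue figures are made up for illustration. In the notebook you would pass in the revenue column from your monthly sales query instead.

```python
def forecast_next(values):
    """Fit y = intercept + slope * x over x = 0..n-1 and return the fitted value at x = n."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points to fit a trend")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    # Ordinary least squares: slope = covariance(x, y) / variance(x)
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return intercept + slope * n


# Made-up monthly revenue figures, oldest first.
monthly_revenue = [1000.0, 1200.0, 1400.0, 1600.0]
next_month = forecast_next(monthly_revenue)
print(f"Forecast for next month: {next_month:.2f}")
```

A straight line ignores seasonality and one-off spikes, so treat the result as a rough baseline rather than a real forecast.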

In [None]:
# Your code here

## Project Summary

Congratulations! You've completed the SQL Fundamentals course.

### Skills Demonstrated:
- ✓ Complex multi-table JOINs
- ✓ Aggregations and GROUP BY
- ✓ Subqueries and CTEs
- ✓ Window functions
- ✓ CASE statements for segmentation
- ✓ Data analysis and business intelligence
- ✓ Query optimization
- ✓ Real-world problem solving

### Next Steps:
1. Add this project to your portfolio
2. Practice with real datasets (Kaggle, public databases)
3. Learn more advanced SQL (advanced window-function patterns, recursive CTEs, stored procedures)
4. Explore BI tools (Tableau, Power BI, Looker)
5. Study database administration and optimization
6. Practice SQL interview questions (LeetCode, HackerRank)

### Resources:
- SQL Cheat Sheet: `docs/SQL_CHEAT_SHEET.md`
- SQL Glossary: `docs/SQL_GLOSSARY.md`
- FAQ: `docs/FAQ.md`

**Excellent work!** You now have the intermediate SQL skills needed for real-world data analysis.

In [None]:
# Cleanup
conn.close()
print("Project complete! Great job!")