<div align="right" style=" font-size: 80%; text-align: center; margin: 0 auto">
<img
 src="https://raw.githubusercontent.com/Explore-AI/Pictures/master/alx-courses/aice/assets/Content_page_banner_blue_dots.png"
 alt="ALX Content Header"
 class="full-width-image"
/>
</div>

# **Regional Showdown: 2014 Business Performance Battle**

## **Background**
The executive board at **Superstore HQ** is gearing up for the **end-of-year awards**. Four regional managers are vying for the title of **Top Performer**, and you're the data analyst trusted to deliver the truth.





![Regional Managers Planning](images/manager_meeting.jpg)

## **Problem**
There's no clear way to compare performance across regions using just spreadsheets — and rumors are flying between regional offices about who's doing better. The board needs a **data-backed report** to settle the debate and guide next quarter's investment. 



You, the data analyst, have been asked to gather key insights across all regions. Your SQL queries will feed into the final showdown metrics.

## **Learning objectives**

- Write production-ready SQL to answer real business questions.

- Query and join tables effectively in notebooks.

- Use aggregations and numeric functions to compute KPIs.

- Apply window functions to analyze trends and rank data.

- Think critically to support data-driven decisions.

### 1. Install the required libraries (if not yet installed)
In a Jupyter notebook cell, run:

In [None]:
# !pip install pandas sqlalchemy ipython-sql


### 2. Load the CSV and set up a SQLite database for SQL queries in Jupyter

In [None]:
import pandas as pd
from sqlalchemy import create_engine

# Step 1: Load CSV
df = pd.read_csv("superstore.csv")  # Assumes superstore.csv is in the notebook's working directory

# Step 2: Save to file-based SQLite DB
engine = create_engine("sqlite:///superstore.db")
df.to_sql("superstore", con=engine, index=False, if_exists="replace")

# Step 3: Use SQL magic
%load_ext sql
%sql sqlite:///superstore.db

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


### 3. Review the first 5 rows

In [None]:
%%sql
SELECT * FROM superstore LIMIT 5;

 * sqlite:///superstore.db
Done.


Category,City,Country,Customer.ID,Customer.Name,Discount,Market,记录数,Order.Date,Order.ID,Order.Priority,Product.ID,Product.Name,Profit,Quantity,Region,Row.ID,Sales,Segment,Ship.Date,Ship.Mode,Shipping.Cost,State,Sub.Category,Year,Market2,weeknum
Office Supplies,Los Angeles,United States,LS-172304,Lycoris Saunders,0.0,US,1,2011-01-07 00:00:00.000,CA-2011-130813,High,OFF-PA-10002005,Xerox 225,9.3312,3,West,36624,19,Consumer,2011-01-09 00:00:00.000,Second Class,4.37,California,Paper,2011,North America,2
Office Supplies,Los Angeles,United States,MV-174854,Mark Van Huff,0.0,US,1,2011-01-21 00:00:00.000,CA-2011-148614,Medium,OFF-PA-10002893,"Wirebound Service Call Books, 5 1/2"" x 4""",9.2928,2,West,37033,19,Consumer,2011-01-26 00:00:00.000,Standard Class,0.94,California,Paper,2011,North America,4
Office Supplies,Los Angeles,United States,CS-121304,Chad Sievert,0.0,US,1,2011-08-05 00:00:00.000,CA-2011-118962,Medium,OFF-PA-10000659,"Adams Phone Message Book, Professional, 400 Message Capacity, 5 3/6” x 11”",9.8418,3,West,31468,21,Consumer,2011-08-09 00:00:00.000,Standard Class,1.81,California,Paper,2011,North America,32
Office Supplies,Los Angeles,United States,CS-121304,Chad Sievert,0.0,US,1,2011-08-05 00:00:00.000,CA-2011-118962,Medium,OFF-PA-10001144,Xerox 1913,53.2608,2,West,31469,111,Consumer,2011-08-09 00:00:00.000,Standard Class,4.59,California,Paper,2011,North America,32
Office Supplies,Los Angeles,United States,AP-109154,Arthur Prichep,0.0,US,1,2011-09-29 00:00:00.000,CA-2011-146969,High,OFF-PA-10002105,Xerox 223,3.1104,1,West,32440,6,Consumer,2011-10-03 00:00:00.000,Standard Class,1.32,California,Paper,2011,North America,40
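
The preview shows that most column names contain a dot (`Order.Date`, `Customer.ID`, `Shipping.Cost`). In SQLite such names have to be wrapped in double quotes, otherwise an unquoted `Shipping.Cost` is read as column `Cost` of a table named `Shipping`. The following is a minimal sketch of that quoting rule using Python's built-in `sqlite3` module; the `demo` table and its single row are hypothetical stand-ins, not the real `superstore` data.

```python
import sqlite3

# Hypothetical in-memory table whose columns mimic the dotted names in superstore.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE demo ("Order.ID" TEXT, "Shipping.Cost" REAL)')
conn.execute("INSERT INTO demo VALUES ('CA-0001', 4.37)")

# PRAGMA table_info returns one row per column; index 1 holds the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(demo)")]
print(columns)

# Double quotes mark an identifier, so the dot is treated as part of the name.
cost = conn.execute('SELECT "Shipping.Cost" FROM demo').fetchone()[0]
print(cost)

conn.close()
```

The same `PRAGMA table_info(superstore)` call can be run in a `%%sql` cell to list the real column names before writing any of the task queries.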


# Integrated project notebook

## 1. SQL in Production (Basic Concepts)

### **Task 1.** 
Which region generated the highest total sales in 2014?

In [None]:
%%sql

-- Add code here


 * sqlite:///superstore.db
Done.


Region,Total_Sales
Central,939062
South,545272
North,429178
Oceania,362440
Southeast Asia,323068
EMEA,301702
Africa,283034
North Asia,264560
Central Asia,259142
West,250664


### **Task 2.**

What is the average order quantity by region for the past quarter (Q4 2014)?

In [None]:
%%sql
 
-- Add code here


 * sqlite:///superstore.db
Done.


Region,Avg_Order_Quantity
West,4.0
Southeast Asia,4.0
South,4.0
Oceania,4.0
North Asia,4.0
North,4.0
East,4.0
Central Asia,4.0
Central,4.0
Caribbean,4.0
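
Tasks 1 and 2 both restrict the data to a period before aggregating. One possible way to do this in SQLite is to extract the year and month from `"Order.Date"` with `strftime`. The sketch below runs that pattern on a hypothetical `toy` table with made-up rows, so it shows the shape of the filter rather than the expected answer for the real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical miniature of the superstore table (made-up rows).
conn.execute('CREATE TABLE toy ("Order.Date" TEXT, Region TEXT, Quantity INTEGER)')
conn.executemany(
    "INSERT INTO toy VALUES (?, ?, ?)",
    [
        ("2014-11-03 00:00:00.000", "West", 2),
        ("2014-12-20 00:00:00.000", "West", 6),
        ("2014-05-14 00:00:00.000", "West", 9),  # Q2 2014: filtered out
        ("2013-10-01 00:00:00.000", "East", 5),  # Q4 2013: filtered out
        ("2014-10-09 00:00:00.000", "East", 3),
    ],
)

# Keep only Q4 2014 (months 10-12), then average the quantity per region.
query = """
SELECT Region, ROUND(AVG(Quantity), 2) AS Avg_Order_Quantity
FROM toy
WHERE strftime('%Y', "Order.Date") = '2014'
  AND strftime('%m', "Order.Date") IN ('10', '11', '12')
GROUP BY Region
ORDER BY Region;
"""
rows = conn.execute(query).fetchall()
print(rows)

conn.close()
```

Dropping the month condition leaves a whole-year filter, which is the shape Task 1 needs.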


<!-- Note -->
<div style="background-color:#f5f5f5; border-left:5px solid rgb(180, 180, 180); padding:10px; border-radius:4px;">
  <strong>Hold on...</strong> Sales alone don’t tell the full story. What if a region makes fewer sales but has better margins?
</div>

### **Task 3.**
If one region consistently has lower sales but higher profit margins, how might that affect decisions about future investments in that region?

## 2. Querying in Notebooks

### **Task 4.** 
List the top 3 states by number of orders placed.


In [None]:
%%sql

-- Add code here

 * sqlite:///superstore.db
Done.


State,Number_Of_Orders
California,2001
England,1499
New York,1128
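
The preview in section 3 shows the same `Order.ID` on more than one row, which suggests each row is a line item rather than a whole order. Under that assumption, `COUNT(*)` counts line items while `COUNT(DISTINCT "Order.ID")` counts orders. The sketch below illustrates the difference on a hypothetical `toy` table; which of the two produced the expected output above is not stated in the notebook.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows: order CA-1 has two line items.
conn.execute('CREATE TABLE toy (State TEXT, "Order.ID" TEXT)')
conn.executemany(
    "INSERT INTO toy VALUES (?, ?)",
    [
        ("California", "CA-1"),
        ("California", "CA-1"),
        ("California", "CA-2"),
        ("New York", "NY-1"),
    ],
)

# Compare the row count with the distinct-order count per state.
query = """
SELECT State,
       COUNT(*) AS Line_Items,
       COUNT(DISTINCT "Order.ID") AS Number_Of_Orders
FROM toy
GROUP BY State
ORDER BY Number_Of_Orders DESC
LIMIT 3;
"""
rows = conn.execute(query).fetchall()
print(rows)

conn.close()
```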



### **Task 5.**
Show the product categories with negative profit margins.

In [None]:
%%sql

-- Add code here

 * sqlite:///superstore.db
Done.


Category,Total_Profit
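
A condition on an aggregate, such as "total profit below zero", belongs in `HAVING` rather than `WHERE`, because it is evaluated after grouping. The sketch below shows this on a hypothetical `toy` table. It also assumes profit margin means total profit divided by total sales, which the notebook does not define explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows: Furniture loses money overall, Technology does not.
conn.execute("CREATE TABLE toy (Category TEXT, Sales REAL, Profit REAL)")
conn.executemany(
    "INSERT INTO toy VALUES (?, ?, ?)",
    [
        ("Furniture", 100.0, -30.0),
        ("Furniture", 100.0, 10.0),
        ("Technology", 200.0, 50.0),
    ],
)

# HAVING filters the grouped rows; margin is assumed to be profit / sales.
query = """
SELECT Category,
       SUM(Profit) AS Total_Profit,
       ROUND(SUM(Profit) * 100.0 / SUM(Sales), 2) AS Profit_Margin_Pct
FROM toy
GROUP BY Category
HAVING SUM(Profit) < 0;
"""
rows = conn.execute(query).fetchall()
print(rows)

conn.close()
```

An empty result, as in the expected output above, means that no category has a negative total.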


<!-- Note -->
<div style="background-color:#f5f5f5; border-left:5px solid rgb(180, 180, 180); padding:10px; border-radius:4px;">
  <strong>Hmm... </strong> High sales are great, but what if customers are returning products left and right?
It's time to investigate what customers think and whether the product mix is working.
</div>

### **Task 6.** 
If you observe a high number of returns in a specific product category, what would you recommend as a regional manager?

![Loyal Customers](images/loyal_customers.jpg)

## 3. Numeric Functions & Aggregations

### **Task 7.**
Calculate the average delivery delay per region (Order Date vs Ship Date).

In [None]:
%%sql

-- Add code here


 * sqlite:///superstore.db
Done.


Region,Avg_Delivery_Delay_Days
Africa,3.91
Canada,3.68
Caribbean,3.97
Central,4.03
Central Asia,4.01
EMEA,3.93
East,3.91
North,4.03
North Asia,3.91
Oceania,3.93
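
SQLite stores these dates as text, so subtracting them directly does not give a number of days. One common approach, also used later in this notebook, is to convert both dates with `julianday()` and subtract the results. A minimal sketch on a hypothetical `toy` table with made-up dates:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE toy (Region TEXT, "Order.Date" TEXT, "Ship.Date" TEXT)')
conn.executemany(
    "INSERT INTO toy VALUES (?, ?, ?)",
    [
        ("West", "2014-01-07 00:00:00.000", "2014-01-09 00:00:00.000"),  # 2 days
        ("West", "2014-01-21 00:00:00.000", "2014-01-26 00:00:00.000"),  # 5 days
        ("East", "2014-12-30 00:00:00.000", "2015-01-02 00:00:00.000"),  # 3 days
    ],
)

# julianday() converts a date string to a day number, so the difference is in days.
query = """
SELECT Region,
       ROUND(AVG(julianday("Ship.Date") - julianday("Order.Date")), 2)
           AS Avg_Delivery_Delay_Days
FROM toy
GROUP BY Region
ORDER BY Region;
"""
rows = conn.execute(query).fetchall()
print(rows)

conn.close()
```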


### **Task 8.** 
Which customer segment yields the highest average order value?

In [None]:
%%sql

-- Add code here


 * sqlite:///superstore.db
Done.


Segment,Avg_Order_Value
Corporate,248.0


<!-- Note -->
<div style="background-color:#f5f5f5; border-left:5px solid rgb(180, 180, 180); padding:10px; border-radius:4px;">
  <strong>Efficiency matters. </strong> A region might be pulling in good revenue, but are they doing it efficiently?
Long delivery times and high shipping costs can eat away at performance.
</div>

### **Task 9.**
Two regions have similar total sales, but one has significantly higher shipping costs. What factors could be causing this?

![Shipping Costs](images/shipping_costs.jpg)

## 4. Window Functions

### **Task 10.**
Rank customers within each region by total sales and find the top customer per region.

In [None]:
%%sql
SELECT Region, "Customer.Name", MAX(Total_Sales) AS Max_Sales 
-- Add code here

 * sqlite:///superstore.db
Done.


Region,Customer.Name,Max_Sales
Africa,Barry Weirich,8957
Canada,Stuart Van,4009
Caribbean,Frank Merwin,4656
Central,Tamara Chand,27345
Central Asia,Cynthia Arntzen,10462
EMEA,Sally Hughsby,7537
East,Tom Ashbrook,13724
North,Fred Hopkins,11122
North Asia,Carol Adams,9055
Oceania,Dave Poirier,11865
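
The task wording asks for a ranking within each region, which maps naturally onto `RANK() OVER (PARTITION BY ...)`. The sketch below shows that window-function variant on a hypothetical `toy` table. It is an alternative to the `MAX(Total_Sales)` starter line, not the notebook's reference solution, and it assumes SQLite 3.25 or later, where window functions are available.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE toy (Region TEXT, "Customer.Name" TEXT, Sales REAL)')
conn.executemany(
    "INSERT INTO toy VALUES (?, ?, ?)",
    [
        ("West", "Ann", 100.0),
        ("West", "Ann", 50.0),
        ("West", "Bob", 120.0),
        ("East", "Cy", 70.0),
        ("East", "Di", 90.0),
    ],
)

# Step 1: total sales per customer and region.
# Step 2: rank customers inside each region, highest total first.
# Step 3: keep only rank 1.
query = """
WITH CustomerSales AS (
    SELECT Region, "Customer.Name" AS Customer_Name, SUM(Sales) AS Total_Sales
    FROM toy
    GROUP BY Region, "Customer.Name"
),
Ranked AS (
    SELECT *,
           RANK() OVER (PARTITION BY Region ORDER BY Total_Sales DESC) AS Sales_Rank
    FROM CustomerSales
)
SELECT Region, Customer_Name, Total_Sales
FROM Ranked
WHERE Sales_Rank = 1
ORDER BY Region;
"""
rows = conn.execute(query).fetchall()
print(rows)

conn.close()
```

Because `RANK()` gives tied customers the same rank, a region with a tie for first place returns more than one row.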


### **Task 11.**
Show the year-over-year sales growth per region.

In [67]:
%%sql
WITH YearlySales AS (
  SELECT 
    Region,
    strftime('%Y', "Order.Date") AS Year,
    SUM(Sales) AS Total_Sales
  FROM superstore
  GROUP BY Region, Year
)

SELECT 
  cur.Region,
  cur.Year AS Current_Year,
  cur.Total_Sales AS Current_Year_Sales,
  prev.Total_Sales AS Previous_Year_Sales,
  ROUND(
    CASE 
      WHEN prev.Total_Sales IS NULL OR prev.Total_Sales = 0 THEN NULL
      ELSE ((cur.Total_Sales - prev.Total_Sales) * 100.0 / prev.Total_Sales)
    END, 2
  ) AS YoY_Growth_Percent
FROM YearlySales cur
LEFT JOIN YearlySales prev
  ON cur.Region = prev.Region
  AND CAST(cur.Year AS INTEGER) = CAST(prev.Year AS INTEGER) + 1
ORDER BY cur.Region, cur.Year;



 * sqlite:///superstore.db
Done.


Region,Current_Year,Current_Year_Sales,Previous_Year_Sales,YoY_Growth_Percent
Africa,2011,127186,,
Africa,2012,144487,127186.0,13.6
Africa,2013,229069,144487.0,58.54
Africa,2014,283034,229069.0,23.56
Canada,2011,8507,,
Canada,2012,16099,8507.0,89.24
Canada,2013,19162,16099.0,19.03
Canada,2014,23164,19162.0,20.89
Caribbean,2011,57043,,
Caribbean,2012,64149,57043.0,12.46
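
The query above pairs each year with the previous one through a self-join. Since this section is about window functions, the same comparison can also be sketched with `LAG()`. The example uses a hypothetical `toy` table and assumes SQLite 3.25 or later. Note one behavioral difference: `LAG()` looks at the previous row in the partition, so if a region has a gap year it compares against the last available year, whereas the self-join requires exactly `Year + 1`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE toy (Region TEXT, "Order.Date" TEXT, Sales REAL)')
conn.executemany(
    "INSERT INTO toy VALUES (?, ?, ?)",
    [
        ("West", "2013-03-01 00:00:00.000", 100.0),
        ("West", "2014-06-01 00:00:00.000", 150.0),
        ("East", "2014-02-01 00:00:00.000", 80.0),
    ],
)

# LAG() fetches the previous year's total within the same region.
query = """
WITH YearlySales AS (
    SELECT Region,
           strftime('%Y', "Order.Date") AS Sales_Year,
           SUM(Sales) AS Total_Sales
    FROM toy
    GROUP BY Region, Sales_Year
),
WithPrev AS (
    SELECT *,
           LAG(Total_Sales) OVER (PARTITION BY Region ORDER BY Sales_Year) AS Prev_Sales
    FROM YearlySales
)
SELECT Region, Sales_Year, Total_Sales,
       ROUND((Total_Sales - Prev_Sales) * 100.0 / NULLIF(Prev_Sales, 0), 2)
           AS YoY_Growth_Percent
FROM WithPrev
ORDER BY Region, Sales_Year;
"""
rows = conn.execute(query).fetchall()
print(rows)

conn.close()
```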


<!-- Note -->
<div style="background-color:#f5f5f5; border-left:5px solid rgb(180, 180, 180); padding:10px; border-radius:4px;">
  <strong>Yikes! </strong> One region saw huge growth in Q3, but something’s not right in Q4…
Time to analyze trends and see whether that dip is a red flag.
</div>

### **Task 12.**
If a region had strong sales in Q3 but declining sales in Q4, how should that influence their standing in the final leaderboard?

<!-- Note -->
<div style="background-color:#f5f5f5; border-left:5px solid rgb(180, 180, 180); padding:10px; border-radius:4px;">
  <strong>Loyalty check! </strong> Let's find out which region's customers keep coming back for more.
Repeat business might just be the secret weapon of the winning region.
</div>

### **Task 13.** 
Which region has the highest number of repeat customers (customers who ordered more than once)?

In [71]:
%%sql
SELECT Region, COUNT(DISTINCT "Customer.ID") AS Repeat_Customers
FROM superstore
WHERE "Customer.ID" IN (
    SELECT "Customer.ID"
    FROM superstore
    GROUP BY "Customer.ID"
    HAVING COUNT(DISTINCT "Order.ID") > 1
)
GROUP BY Region
ORDER BY Repeat_Customers DESC
LIMIT 1;


 * sqlite:///superstore.db
Done.


Region,Repeat_Customers
Central,2067


<!-- Note -->
<div style="background-color:#f5f5f5; border-left:5px solid rgb(180, 180, 180); padding:10px; border-radius:4px;">
  <strong>And now… the moment of truth. </strong> You’ve done the digging, cleaned the data, and run the queries. Let’s put it all together to score each region across Sales, Profit, Delivery, and Loyalty.
Time to crown the Ultimate Regional Manager of the year!
</div>

### **Task 14.**

Which regions rank highest overall in 2014 based on total sales, profit, delivery delay, and shipping cost? Explain how these metrics contribute to the overall ranking.

In [102]:
%%sql
WITH SalesProfit AS (
    SELECT 
        Region,
        ROUND(SUM(Sales), 2) AS Total_Sales,
        ROUND(SUM(Profit), 2) AS Total_Profit,
        ROUND(AVG(Quantity), 2) AS Avg_Quantity,
        ROUND(AVG(julianday("Ship.Date") - julianday("Order.Date")), 2) AS Avg_Delivery_Delay,
        ROUND(SUM("Shipping.Cost"), 2) AS Total_Shipping_Cost
    FROM superstore
    WHERE strftime('%Y', "Order.Date") = '2014'
    GROUP BY Region
),
RankedRegions AS (
    SELECT *,
        RANK() OVER (ORDER BY Total_Sales DESC) AS Sales_Rank,
        RANK() OVER (ORDER BY Total_Profit DESC) AS Profit_Rank,
        RANK() OVER (ORDER BY Avg_Delivery_Delay ASC) AS Delivery_Rank,
        RANK() OVER (ORDER BY Total_Shipping_Cost ASC) AS Shipping_Rank
    FROM SalesProfit
),
Final AS (
    SELECT *,
        (Sales_Rank + Profit_Rank + Delivery_Rank + Shipping_Rank) AS Overall_Rank_Score
    FROM RankedRegions
)
SELECT 
    Region, 
    Total_Sales, 
    Total_Profit, 
    Avg_Quantity, 
    Avg_Delivery_Delay, 
    Total_Shipping_Cost, 
    Overall_Rank_Score
FROM Final
ORDER BY Overall_Rank_Score
;


 * sqlite:///superstore.db
Done.


Region,Total_Sales,Total_Profit,Avg_Quantity,Avg_Delivery_Delay,Total_Shipping_Cost,Overall_Rank_Score
North Asia,264560.0,52770.44,3.8,3.94,28013.83,21
West,250664.0,43900.63,3.9,3.82,26320.47,22
North,429178.0,56658.35,3.8,3.96,46422.47,23
Central Asia,259142.0,47547.48,3.75,3.89,28147.1,23
South,545272.0,51776.16,3.74,3.95,56937.41,24
Africa,283034.0,39331.47,2.31,3.93,30082.93,25
Central,939062.0,97723.57,3.71,3.98,101607.15,27
Canada,23164.0,5993.01,2.17,3.75,2498.4,28
Oceania,362440.0,31431.58,3.63,3.96,38488.33,30
EMEA,301702.0,22600.32,2.29,3.96,33366.04,31


### **Task 15.**

Using your previous answers and the score weights in the image, which region wins overall — and why? (Bonus Question)

![Regional Managers Score Matrix](images/Regional_Manager_Score_Matrix.png)

In [99]:
%%sql

WITH Base AS (
    SELECT 
        Region,
        "Customer.ID" AS Customer_ID,
        "Order.ID" AS Order_ID,
        Sales,
        Profit,
        Quantity,
        "Shipping.Cost" AS Shipping_Cost,
        julianday("Ship.Date") - julianday("Order.Date") AS Delivery_Delay,
        strftime('%m', "Order.Date") AS Month
    FROM superstore
    WHERE strftime('%Y', "Order.Date") = '2014'
),

SalesProfit AS (
    SELECT 
        Region,
        ROUND(SUM(Sales), 2) AS Total_Sales,
        ROUND(SUM(Profit), 2) AS Total_Profit,
        ROUND(AVG(Quantity), 2) AS Avg_Quantity,
        ROUND(AVG(Delivery_Delay), 2) AS Avg_Delivery_Delay,
        ROUND(SUM(Shipping_Cost), 2) AS Total_Shipping_Cost
    FROM Base
    GROUP BY Region
),

RepeatCustomers AS (
    SELECT Customer_ID
    FROM Base
    GROUP BY Customer_ID
    HAVING COUNT(DISTINCT Order_ID) > 1
),

RegionRepeat AS (
    SELECT 
        b.Region, 
        COUNT(DISTINCT b.Customer_ID) AS Repeat_Customers
    FROM Base b
    JOIN RepeatCustomers r ON b.Customer_ID = r.Customer_ID
    GROUP BY b.Region
),

QuarterSales AS (
    SELECT 
        Region,
        SUM(CASE WHEN Month IN ('07','08','09') THEN Sales ELSE 0 END) AS Q3_Sales,
        SUM(CASE WHEN Month IN ('10','11','12') THEN Sales ELSE 0 END) AS Q4_Sales
    FROM Base
    WHERE Month IN ('07','08','09', '10','11','12')
    GROUP BY Region
),

QoQGrowth AS (
    SELECT 
        Region,
        ROUND((Q4_Sales - Q3_Sales) * 100.0 / NULLIF(Q3_Sales, 0), 2) AS Sales_Growth
    FROM QuarterSales
),

Combined AS (
    SELECT 
        sp.Region,
        Total_Sales,
        Total_Profit,
        Avg_Quantity,
        Avg_Delivery_Delay,
        Total_Shipping_Cost,
        COALESCE(rr.Repeat_Customers, 0) AS Repeat_Customers,
        COALESCE(g.Sales_Growth, 0) AS Sales_Growth
    FROM SalesProfit sp
    LEFT JOIN RegionRepeat rr ON sp.Region = rr.Region
    LEFT JOIN QoQGrowth g ON sp.Region = g.Region
),

Ranked AS (
    SELECT *,
        RANK() OVER (ORDER BY Total_Sales DESC) AS Sales_Rank,
        RANK() OVER (ORDER BY Total_Profit DESC) AS Profit_Rank,
        RANK() OVER (ORDER BY Avg_Quantity DESC) AS Quantity_Rank,
        RANK() OVER (ORDER BY Avg_Delivery_Delay ASC) AS Delivery_Rank,
        RANK() OVER (ORDER BY Repeat_Customers DESC) AS Loyalty_Rank,
        RANK() OVER (ORDER BY Sales_Growth DESC) AS Growth_Rank
    FROM Combined
),

Final AS (
    SELECT *,
        ROUND(0.3 * Sales_Rank + 
              0.25 * Profit_Rank + 
              0.15 * Quantity_Rank + 
              0.10 * Delivery_Rank + 
              0.10 * Loyalty_Rank + 
              0.10 * Growth_Rank, 2) AS Overall_Score
    FROM Ranked
)

SELECT 
    Region,
    Total_Sales,
    Total_Profit,
    Avg_Quantity,
    Avg_Delivery_Delay,
    Repeat_Customers,
    Sales_Growth,
    Overall_Score
FROM Final
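-- Overall_Score is a weighted sum of ranks (1 = best), so a lower score is better.
-- DESC therefore lists the weakest regions first; use ASC to list the leaders.
ORDER BY Overall_Score DESC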
LIMIT 4;


 * sqlite:///superstore.db
Done.


Region,Total_Sales,Total_Profit,Avg_Quantity,Avg_Delivery_Delay,Repeat_Customers,Sales_Growth,Overall_Score
Canada,23164.0,5993.01,2.17,3.75,2,118.44,10.6
Caribbean,105496.0,12532.62,3.78,4.16,187,67.26,9.9
EMEA,301702.0,22600.32,2.29,3.96,255,-19.91,9.0
East,213259.0,33195.35,3.7,3.97,299,45.01,8.85
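
Because `Overall_Score` in the query above is built from `RANK()` values where 1 is the best position, a smaller score indicates a stronger region. The sketch below illustrates this on a hypothetical three-region `toy` table with two metrics. The weights 0.6 and 0.4 are made up for the example and are not the weights from the score matrix image. It assumes SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE toy (Region TEXT, Total_Sales REAL, Total_Profit REAL)")
conn.executemany(
    "INSERT INTO toy VALUES (?, ?, ?)",
    [("A", 300.0, 30.0), ("B", 200.0, 50.0), ("C", 100.0, 10.0)],
)

# Rank 1 is best for each metric, so the weighted sum is sorted ascending.
query = """
WITH Ranked AS (
    SELECT Region,
           RANK() OVER (ORDER BY Total_Sales DESC) AS Sales_Rank,
           RANK() OVER (ORDER BY Total_Profit DESC) AS Profit_Rank
    FROM toy
)
SELECT Region,
       ROUND(0.6 * Sales_Rank + 0.4 * Profit_Rank, 2) AS Overall_Score
FROM Ranked
ORDER BY Overall_Score ASC;
"""
rows = conn.execute(query).fetchall()
winner = rows[0][0]
print(rows, winner)

conn.close()
```

Region C is last on both metrics and receives the largest score, which is why sorting the real query in descending order surfaces regions such as Canada, the region with the lowest total sales in the Task 14 output.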



<div align="center" style=" font-size: 80%; text-align: center; margin: 0 auto">
<img src="https://raw.githubusercontent.com/Explore-AI/Pictures/refs/heads/master/ALX_banners/ALX_Navy.png" style="width:100px" />
</div>