This project analyzes a retail sales dataset to identify key revenue drivers, product performance trends, and customer behavior patterns using SQL and Excel.
- Identify top-performing product categories by revenue
- Compare sales volume versus revenue contribution
- Evaluate customer purchasing behavior
- SQL (MySQL) – data validation, aggregation
- Microsoft Excel – PivotTables, dashboard
- Imported dataset into MySQL
- Resolved date format issue (
MM/DD/YYYY→YYYY-MM-DD) - Validated totals and data consistency
Revenue by category:
SELECT Product_Category, SUM(Total_Amount) AS revenue
FROM retail_sales
GROUP BY Product_Category
ORDER BY revenue DESC;Sales volume by category:
SELECT Product_Category, SUM(Quantity) AS total_quantity
FROM retail_sales
GROUP BY Product_Category
ORDER BY total_quantity DESC;Age-based revenue segmentation:
SELECT
CASE
WHEN Age < 25 THEN 'Under 25'
WHEN Age BETWEEN 25 AND 40 THEN '25-40'
WHEN Age BETWEEN 41 AND 60 THEN '41-60'
ELSE '60+'
END AS age_group,
SUM(Total_Amount) AS revenue
FROM retail_sales
GROUP BY age_group
ORDER BY revenue DESC;- Electronics generates the highest revenue despite lower sales volume, indicating higher unit pricing
- Clothing has the highest quantity sold but lower revenue per unit
- Customer purchases are evenly distributed across gender, indicating no strong demographic bias
- Younger customers (18–27) generate the highest revenue, while older customers (48–57) show higher transaction frequency, indicating differences in spending behavior
- Focus on high-value categories such as Electronics to drive revenue growth
- Optimize pricing or bundling strategies in Clothing to improve margins
- Target younger customers for high-value sales opportunities
- Leverage frequent buyers in older segments through retention strategies
Developed a structured analysis using SQL and Excel to transform raw sales data into actionable insights and a clear, executive-level dashboard.
