An end-to-end retail sales analytics project that takes a messy, multi-sheet Excel workbook → cleans & models the data in Python → performs exploratory data analysis → and surfaces business insights in a 3-page interactive Power BI dashboard.
This project is designed to reflect real-world retail & commercial analytics workflows used by sales, finance, and operations teams.
Retail leadership asks practical questions every week:
- Which products and channels drive revenue and margin?
- Which regions are over- or under-performing?
- Which customers and states deserve commercial focus?
This project demonstrates the full analytics lifecycle required to answer those questions: data modeling, feature engineering, EDA, and executive dashboarding.
It is built to show both:
- Technical ability (Python, data modeling, Power BI)
- Commercial thinking (sales performance, profitability, channel mix)
The project was built from a multi-sheet Excel workbook containing:
- Sales Orders: order details (date, product, customer, quantity, price, channel)
- Customers: customer master data (location, demographics)
- Products: product & category master
- Regions / State Regions: geographic hierarchy
- 2017 Budget: product-level sales budget for variance analysis
- Identified PK/FK relationships between sales, customers, products, and regions
- Designed a star-style reporting model
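
One way to confirm candidate PK/FK relationships before building the star-style model is to check key uniqueness and look for orphaned foreign keys. The sketch below is illustrative only: the tables, the column names (`customer_id`, `order_id`), and the helper functions are assumptions, not taken from the project workbook.

```python
import pandas as pd

# Hypothetical miniature tables; column names are assumptions for illustration.
customers = pd.DataFrame({"customer_id": [1, 2, 3], "customer_name": ["A", "B", "C"]})
sales = pd.DataFrame({"order_id": [100, 101, 102], "customer_id": [1, 2, 4]})


def is_primary_key(df: pd.DataFrame, column: str) -> bool:
    """A candidate primary key has no nulls and no repeated values."""
    return bool(df[column].notna().all() and df[column].is_unique)


def orphan_keys(fact: pd.DataFrame, dim: pd.DataFrame, key: str) -> set:
    """Foreign-key values in the fact table with no match in the dimension."""
    return set(fact[key].dropna()) - set(dim[key])


pk_ok = is_primary_key(customers, "customer_id")
orphans = orphan_keys(sales, customers, "customer_id")
print(pk_ok, orphans)
```

Here the sales row with `customer_id` 4 has no matching customer, so it is reported as an orphan and would need attention before any join.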
- Fixed malformed headers
- Normalized column names
- Enforced correct data types
- Removed duplicates and irrelevant fields
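
The cleaning steps above could look roughly like the following in pandas. This is a minimal sketch under stated assumptions: the sample columns (`Order Number`, `Order Date`, `Order Quantity`, `Unit Price`) and the `clean` helper are hypothetical, and it covers header-name normalization only, not header rows that sit in the wrong position.

```python
import pandas as pd

# Hypothetical raw extract: messy headers, text-typed numbers, one duplicate row.
raw = pd.DataFrame({
    " Order Number ": ["SO-1", "SO-2", "SO-2"],
    "Order Date": ["2017-01-05", "2017-02-10", "2017-02-10"],
    "Order Quantity": ["3", "5", "5"],
    "Unit Price": ["10.5", "20", "20"],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    # Normalize headers to snake_case: trim, lowercase, collapse non-alphanumerics.
    out.columns = (
        out.columns.str.strip()
        .str.lower()
        .str.replace(r"[^a-z0-9]+", "_", regex=True)
        .str.strip("_")
    )
    # Enforce types; unparseable values become NaT/NaN instead of raising.
    out["order_date"] = pd.to_datetime(out["order_date"], errors="coerce")
    out["order_quantity"] = pd.to_numeric(out["order_quantity"], errors="coerce")
    out["unit_price"] = pd.to_numeric(out["unit_price"], errors="coerce")
    # Drop exact duplicate rows and rebuild a clean index.
    return out.drop_duplicates().reset_index(drop=True)


cleaned = clean(raw)
print(cleaned.dtypes)
```

Using `errors="coerce"` is a design choice: bad values surface as missing data that can be counted and reviewed, rather than stopping the pipeline at the first malformed cell.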
- Sales joined to customers, products, regions, and budgets
- Removed redundant columns after joins
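
A sketch of how such joins might be written so that a bad key cannot silently duplicate sales rows. The tables and column names are assumptions for illustration; `validate="many_to_one"` is a standard `DataFrame.merge` option that raises if the lookup table has repeated keys.

```python
import pandas as pd

# Hypothetical fact and dimension tables.
sales = pd.DataFrame({
    "order_id": [1, 2, 3],
    "customer_id": [10, 11, 10],
    "product_id": [100, 101, 100],
})
customers = pd.DataFrame({"customer_id": [10, 11], "customer_name": ["Acme", "Birch"]})
products = pd.DataFrame({"product_id": [100, 101], "product_name": ["Widget", "Gadget"]})

# Left joins keep every sales row; validate guards against duplicated dimension keys.
model = (
    sales
    .merge(customers, on="customer_id", how="left", validate="many_to_one")
    .merge(products, on="product_id", how="left", validate="many_to_one")
)
print(model)
```

Comparing `len(model)` with `len(sales)` after the joins is a cheap sanity check that no rows were added or lost.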
Created the following metrics:
- Revenue
- Total Cost
- Gross Profit
- Profit Margin %
- Average Order Value (AOV)
- Monthly & yearly time buckets
- Channel classification
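
The project does not spell out its formulas, so the sketch below assumes the common definitions: revenue = quantity × unit price, total cost = quantity × unit cost, gross profit = revenue − total cost, margin % = gross profit / revenue × 100, and AOV = total revenue / number of distinct orders. The column names and sample rows are hypothetical, and channel classification is omitted.

```python
import pandas as pd

# Hypothetical order lines; column names are assumptions for illustration.
orders = pd.DataFrame({
    "order_id": [1, 2, 3],
    "order_date": pd.to_datetime(["2017-01-15", "2017-01-20", "2017-02-03"]),
    "quantity": [2, 1, 4],
    "unit_price": [50.0, 100.0, 25.0],
    "unit_cost": [30.0, 80.0, 10.0],
})

# Row-level metrics under the assumed definitions.
orders["revenue"] = orders["quantity"] * orders["unit_price"]
orders["total_cost"] = orders["quantity"] * orders["unit_cost"]
orders["gross_profit"] = orders["revenue"] - orders["total_cost"]
orders["profit_margin_pct"] = orders["gross_profit"] / orders["revenue"] * 100

# Time buckets for monthly and yearly reporting.
orders["year"] = orders["order_date"].dt.year
orders["month"] = orders["order_date"].dt.to_period("M").astype(str)

# AOV across the whole table, plus a monthly rollup.
aov = orders["revenue"].sum() / orders["order_id"].nunique()
monthly = orders.groupby("month", as_index=False)[["revenue", "gross_profit"]].sum()
print(aov)
print(monthly)
```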
- Monthly revenue & profit trends
- Channel mix distribution
- Product-level profitability analysis
- AOV distribution and outlier detection
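
The outlier method used in the project is not stated. As one common option, the sketch below flags order values outside the 1.5 × IQR fences and reports skewness; the sample values are made up for illustration.

```python
import pandas as pd

# Hypothetical order values with one unusually large order.
order_values = pd.Series([80, 90, 100, 110, 120, 130, 1000], name="order_value")

# Interquartile-range fences (Tukey's rule), an assumed choice of method.
q1, q3 = order_values.quantile([0.25, 0.75])
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = order_values[(order_values < lower_fence) | (order_values > upper_fence)]
skewness = order_values.skew()  # positive values indicate a right-skewed distribution
print(outliers.tolist(), skewness)
```

A positive skewness together with outliers only on the high side is the pattern a right-skewed AOV distribution would show.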
3-page report:
- Executive Overview
  - Revenue, Profit, Margin, AOV trends
- Product & Channel Performance
  - Best & worst products
  - Channel contribution analysis
- Customer & Region Analysis
  - Sales by state and region
  - Top & bottom customers
Includes slicers for:
- Year
- Channel
- Region
- Product
- Clear seasonality pattern in revenue
- Wholesale dominates revenue, but some smaller channels deliver higher margins
- A small number of SKUs contribute to a large portion of revenue
- AOV is right-skewed → opportunity for basket-size optimization via promotions
What This Project Demonstrates
- Real-world retail data modeling
- Python-based data preparation
- Profit & margin analytics
- Channel and customer performance analysis
- Executive-level Power BI dashboard design
- Commercial decision support reporting
Suggested Production Enhancements
- Automate ingestion with Azure Data Factory (ADF) or Python scheduling
- Add forecasting & anomaly detection
- Implement data validation tests
- Centralize KPIs in a semantic layer
- Parameterize Power BI data sources
Limitations:
- Budget table applies only to 2017 data
- File-based data source used (not a live database)
- Dataset is for demonstration purposes