A complete end-to-end sales data analysis project using Python — covering data cleaning, exploratory data analysis (EDA), and multi-dimensional visualizations across sales reps, regions, product categories, customer types, and payment methods.
This project analyzes a sales dataset using Python in Google Colab, following a structured workflow from raw data ingestion to actionable business insights — producing 6 charts covering every key business dimension.
| Detail | Info |
|---|---|
| Tool | Python (Google Colab) |
| Libraries | Pandas · Matplotlib · Seaborn |
| Dataset | Sales.csv — multi-rep, multi-region sales data |
| Dimensions | Month · Region · Sales Rep · Product · Customer Type · Payment Method |
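
The cleaning stage is not spelled out here, so the following is only a minimal sketch of what it might look like. The column names `Date`, `Sales`, and `Quantity` and the helper `clean_sales` are assumptions for illustration; the actual `Sales.csv` headers and the notebook's own steps may differ.

```python
import io

import pandas as pd

# Assumed column names; the real Sales.csv headers may differ.
DATE_COL = "Date"
NUMERIC_COLS = ["Sales", "Quantity"]


def clean_sales(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy: tidy headers, typed columns, no duplicates."""
    out = df.copy()
    out.columns = out.columns.str.strip()
    out = out.drop_duplicates()
    # Unparseable dates and numbers become NaT/NaN instead of raising.
    out[DATE_COL] = pd.to_datetime(out[DATE_COL], errors="coerce")
    for col in NUMERIC_COLS:
        out[col] = pd.to_numeric(out[col], errors="coerce")
    # Drop rows that are missing any field the charts depend on.
    return out.dropna(subset=[DATE_COL, *NUMERIC_COLS]).reset_index(drop=True)


# Tiny in-memory stand-in for Sales.csv (made-up rows, not project data).
raw = pd.read_csv(io.StringIO(
    "Date, Sales,Quantity\n"
    "2023-01-05,100.0,2\n"
    "2023-01-05,100.0,2\n"
    "2023-02-10,abc,1\n"
    "not a date,50.0,1\n"
    "2023-03-01,75.5,3\n"
))
cleaned = clean_sales(raw)
```

In Colab, the same helper could be applied to the uploaded file with `clean_sales(pd.read_csv("Sales.csv"))`, provided the headers match the assumed names.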
Monthly sales rise from ~$250K in January to a peak of ~$450K in October–November, then dip slightly in December. The upward trend is clearest through Q3–Q4.
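
A monthly trend like this one can be produced by grouping on the calendar month. The sketch below assumes a `Date` column and a `Sales` column (hypothetical names) and uses made-up rows rather than the project data.

```python
import pandas as pd


def monthly_sales(df: pd.DataFrame, date_col: str = "Date",
                  value_col: str = "Sales") -> pd.Series:
    """Sum sales per calendar month, ordered chronologically."""
    dates = pd.to_datetime(df[date_col])
    # to_period("M") maps every date to its year-month bucket.
    return df[value_col].groupby(dates.dt.to_period("M")).sum()


# Made-up rows for illustration only.
demo = pd.DataFrame({
    "Date": ["2023-01-03", "2023-01-20", "2023-02-11", "2023-03-07"],
    "Sales": [100.0, 150.0, 300.0, 50.0],
})
trend = monthly_sales(demo)
peak_month = trend.idxmax()  # Period of the highest-grossing month
```

In a notebook, `trend.plot(marker="o")` would then draw the line chart through Matplotlib.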
Regional distribution is nearly equal across all 4 regions — North leads at 27.3%, followed by East (25.1%), West (24.6%), and South (23.0%). No dominant region — balanced market penetration.
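
Percentage shares such as these come from dividing each group's total by the grand total. The helper `share_by` and the column names `Region` and `Sales` below are assumptions for illustration, and the demo rows are invented.

```python
import pandas as pd


def share_by(df: pd.DataFrame, group_col: str,
             value_col: str = "Sales") -> pd.Series:
    """Percentage of total value contributed by each group, largest first."""
    totals = df.groupby(group_col)[value_col].sum()
    return (totals / totals.sum() * 100).round(1).sort_values(ascending=False)


# Made-up rows for illustration only.
demo = pd.DataFrame({
    "Region": ["North", "East", "West", "South", "North"],
    "Sales": [200.0, 260.0, 240.0, 200.0, 100.0],
})
region_share = share_by(demo, "Region")
```

The same helper could be pointed at a customer-type or payment-method column to get the other share breakdowns.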
David is the top-performing sales rep with ~$1.1M+ in total sales. Bob follows closely. Charlie is the lowest performer. All reps fall in the $850K–$1.1M range — a fairly competitive team.
David also leads in quantity sold (~6,000 units), followed by Alice and Bob (~5,000). Charlie has the lowest quantity at ~4,200 units — consistent with his lower revenue performance.
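
Revenue and quantity per rep can be computed in one pass with named aggregation. This is a sketch under assumed column names (`Sales Rep`, `Sales`, `Quantity`); the rep labels and figures are placeholders, not the project's results.

```python
import pandas as pd


def rep_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Total revenue and units per sales rep, highest revenue first."""
    return (
        df.groupby("Sales Rep")
          .agg(total_sales=("Sales", "sum"), total_quantity=("Quantity", "sum"))
          .sort_values("total_sales", ascending=False)
    )


# Placeholder reps and numbers for illustration only.
demo = pd.DataFrame({
    "Sales Rep": ["Rep A", "Rep B", "Rep A", "Rep C"],
    "Sales": [500.0, 400.0, 300.0, 350.0],
    "Quantity": [5, 6, 4, 3],
})
summary = rep_summary(demo)
top_rep = summary.index[0]  # rep with the highest total revenue
```

Keeping both totals in one frame makes it easy to check whether the revenue ranking and the quantity ranking agree.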
Clothing leads product categories with ~7,000 units, followed closely by Furniture (~6,700). Food has the lowest quantity at ~5,500 units. All categories show strong demand.
The split between Returning (50.1%) and New (49.9%) customers is almost even, suggesting that repeat business and new-customer acquisition are both healthy.
Credit Card leads at $1.76M, followed by Bank Transfer ($1.72M) and Cash (~$1.54M). The two non-cash methods take the larger share, though the gap between methods is modest.
- 📈 Sales grow steadily from Jan to Nov — strong Q3/Q4 performance with a Dec dip
- 🏆 David is the top rep in both revenue (~$1.1M) and quantity (~6,000 units)
- 🌍 All 4 regions are nearly equal — no single dominant market
- 👕 Clothing is the best-selling category by volume at ~7,000 units
- 🔄 50/50 split between new and returning customers, so repeat business and new acquisition are balanced
- 💳 Credit Card is the preferred payment method, but all 3 methods are well-distributed
```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
```

| Library | Usage |
|---|---|
| Pandas | Data loading, cleaning, grouping, aggregation |
| Matplotlib | Line chart, Donut charts, layout control |
| Seaborn | Bar charts with viridis/magma color palettes |
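
One common way to draw a donut chart in Matplotlib is to narrow the wedges of a pie chart into a ring. The sketch below takes that approach; whether the notebook does the same is an assumption, and `donut_chart` is a hypothetical helper name.

```python
import matplotlib.pyplot as plt


def donut_chart(labels, values, title=""):
    """Draw a donut chart: a pie whose wedges are narrowed to a ring."""
    fig, ax = plt.subplots(figsize=(5, 5))
    wedges, _, _ = ax.pie(
        values,
        labels=labels,
        autopct="%1.1f%%",          # print each share on its wedge
        startangle=90,
        wedgeprops={"width": 0.4},  # width < 1 leaves the hole in the middle
    )
    ax.set_title(title)
    return fig, wedges


# Regional shares quoted in this README.
fig, wedges = donut_chart(
    ["North", "East", "West", "South"],
    [27.3, 25.1, 24.6, 23.0],
    title="Sales by Region",
)
```

In Colab the figure renders inline at the end of the cell; elsewhere, `plt.show()` or `fig.savefig(...)` would be needed.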
| File | Description |
|---|---|
| `Sales_Data.ipynb` | Full Jupyter Notebook with all code and outputs |
| `Sales.csv` | Raw sales dataset |
| `1.png` – `6.png` | Output chart screenshots |
- Open Google Colab
- Upload `Sales_Data.ipynb` and `Sales.csv`
- Run all cells (`Runtime` → `Run all`)
- All 7 charts will render inline
Belal Farrag — Data Analyst