This project performs an end-to-end business analysis of an e-commerce dataset using SQL and Python.
The objective is to extract actionable business insights by analyzing customer behavior, sales performance, product trends, seller performance, and retention metrics.
This project simulates a real-world data analyst workflow:
- Data extraction using SQL
- Data processing using Python (Pandas)
- Data visualization using Matplotlib
- Business insight generation
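
The first two workflow steps can be sketched as follows. This is a minimal illustration, not the project's actual code: it uses Python's built-in `sqlite3` module in place of MySQL so that it runs anywhere, and the `orders` table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical, simplified schema; the real dataset's tables and columns may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id TEXT, customer_id TEXT, purchased_at TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [  # made-up sample rows
        ("o1", "c1", "2018-01-05 10:00:00"),
        ("o2", "c2", "2018-01-20 12:30:00"),
        ("o3", "c1", "2018-02-11 09:15:00"),
        ("o4", "c3", "2017-12-30 18:45:00"),
    ],
)


def monthly_orders(conn, year):
    """Return a 12-element list of order counts for January..December of `year`."""
    # Step 1: extraction with SQL.
    # strftime() is SQLite-specific; in MySQL, YEAR() and MONTH() play the same role.
    query = """
        SELECT strftime('%m', purchased_at) AS month_no, COUNT(*) AS n_orders
        FROM orders
        WHERE strftime('%Y', purchased_at) = ?
        GROUP BY month_no
        ORDER BY month_no
    """
    rows = conn.execute(query, (str(year),)).fetchall()
    # Step 2: processing in Python. Months without orders are filled with zero
    # so the series is ready for plotting.
    counts = {int(month_no): n for month_no, n in rows}
    return [counts.get(m, 0) for m in range(1, 13)]


orders_2018 = monthly_orders(conn, 2018)
print(orders_2018)
```

The resulting list can be passed directly to a plotting call or loaded into a Pandas Series for further processing.
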
- Analyze customer distribution across states and cities
- Evaluate monthly and yearly sales trends
- Identify top-performing product categories
- Rank sellers based on revenue
- Measure customer retention rate
- Calculate Year-over-Year growth
- Analyze installment payment behavior
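
Two of these objectives reduce to simple formulas, sketched below in plain Python. The function names and sample figures are illustrative only. Treating a payment as an "installment payment" when it has more than one installment is an assumption; the project may define it differently.

```python
def yoy_growth(revenue_by_year):
    """Return {year: percent growth versus the previous year}.

    Years without a previous-year figure (or with a previous-year figure of
    zero) are skipped, since growth is undefined for them.
    """
    growth = {}
    for year in sorted(revenue_by_year):
        previous = revenue_by_year.get(year - 1)
        if previous:
            growth[year] = (revenue_by_year[year] - previous) / previous * 100
    return growth


def installment_share(payments):
    """Return the percentage of payments split into more than one installment.

    Assumption: each payment is a dict with an integer 'installments' field.
    """
    if not payments:
        return 0.0
    split = sum(1 for p in payments if p["installments"] > 1)
    return split / len(payments) * 100


# Made-up figures for illustration only.
revenue = {2016: 50_000.0, 2017: 100_000.0, 2018: 125_000.0}
payments = [
    {"installments": 1},
    {"installments": 3},
    {"installments": 1},
    {"installments": 10},
]

growth = yoy_growth(revenue)
share = installment_share(payments)
print(growth)  # {2017: 100.0, 2018: 25.0}
print(share)   # 50.0
```
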
- MySQL Workbench – SQL Querying & Data Extraction
- Python (Pandas, NumPy) – Data Manipulation
- Matplotlib – Data Visualization
- Git & GitHub – Version Control
- Google Colab – Python Execution Environment
- Unique customer cities
- Orders placed in 2017
- Total sales per category
- Installment payment percentage
- Customers per state
- Monthly orders in 2018
- Average products per order by city
- Revenue contribution by category
- Seller revenue ranking
- Product price vs purchase frequency
- Moving average of customer order value
- Cumulative monthly revenue
- Year-over-Year growth rate
- Customer retention rate (6-month logic)
- Top 3 customers per year
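
The moving average, cumulative revenue, and top-3 ranking above are natural fits for SQL window functions. The sketch below shows one possible shape for each query. It runs on Python's built-in `sqlite3` (SQLite 3.25 or later is needed for window functions) rather than MySQL; the `payments` table, its columns, the three-order window, and the sample rows are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per paid order.
conn.execute("CREATE TABLE payments (customer_id TEXT, paid_at TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO payments VALUES (?, ?, ?)",
    [  # made-up sample rows
        ("c1", "2017-01-10", 100.0),
        ("c1", "2017-03-05", 200.0),
        ("c1", "2018-02-01", 300.0),
        ("c2", "2017-01-20", 50.0),
        ("c2", "2018-02-15", 150.0),
        ("c3", "2018-03-01", 400.0),
        ("c4", "2018-03-09", 20.0),
        ("c5", "2018-01-02", 10.0),
    ],
)

# Moving average of each customer's order value over the current order
# and the two orders before it.
moving_avg = conn.execute("""
    SELECT customer_id, paid_at, amount,
           AVG(amount) OVER (
               PARTITION BY customer_id
               ORDER BY paid_at
               ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
           ) AS moving_avg
    FROM payments
    ORDER BY customer_id, paid_at
""").fetchall()

# Cumulative monthly revenue: aggregate per month, then take a running sum.
cumulative = conn.execute("""
    SELECT ym, revenue, SUM(revenue) OVER (ORDER BY ym) AS cumulative_revenue
    FROM (
        SELECT strftime('%Y-%m', paid_at) AS ym, SUM(amount) AS revenue
        FROM payments
        GROUP BY ym
    ) AS monthly
    ORDER BY ym
""").fetchall()

# Top 3 customers per year by total spend. DENSE_RANK keeps ties together,
# so a year with tied totals can return more than three rows.
top3 = conn.execute("""
    SELECT yr, customer_id, total, rnk
    FROM (
        SELECT yr, customer_id, total,
               DENSE_RANK() OVER (PARTITION BY yr ORDER BY total DESC) AS rnk
        FROM (
            SELECT strftime('%Y', paid_at) AS yr, customer_id, SUM(amount) AS total
            FROM payments
            GROUP BY yr, customer_id
        ) AS yearly
    ) AS ranked
    WHERE rnk <= 3
    ORDER BY yr, rnk
""").fetchall()

for row in top3:
    print(row)
```

MySQL 8.0 supports the same `OVER (...)` syntax; only the date functions differ (for example `DATE_FORMAT` or `YEAR` instead of `strftime`).
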
- Sales show strong seasonal growth trends.
- A small number of categories contribute a significant portion of total revenue.
- Certain states dominate customer concentration.
- Installment payments indicate affordability-driven purchasing behavior.
- Year-over-Year growth reflects business expansion.
- Retention analysis highlights an opportunity to improve customer loyalty.
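
One way the retention figure behind the last insight could be computed is sketched below. The exact "6-month logic" is not spelled out here, so this sketch assumes one definition: a customer counts as retained if they place another order within 180 days after their first order. The function name and sample data are hypothetical.

```python
from datetime import date, timedelta


def retention_rate(orders, window_days=180):
    """Return the percentage of customers who reorder within the window.

    orders: iterable of (customer_id, order_date) pairs.
    Assumption: "6 months" is approximated as 180 days, and a second order on
    the same day as the first does not count as a repeat purchase.
    """
    dates_by_customer = {}
    for customer_id, order_date in orders:
        dates_by_customer.setdefault(customer_id, []).append(order_date)
    if not dates_by_customer:
        return 0.0

    retained = 0
    for dates in dates_by_customer.values():
        first = min(dates)
        cutoff = first + timedelta(days=window_days)
        if any(first < d <= cutoff for d in dates):
            retained += 1
    return retained / len(dates_by_customer) * 100


# Made-up orders for illustration only.
sample_orders = [
    ("c1", date(2017, 1, 10)),
    ("c1", date(2017, 4, 1)),    # repeat within 180 days: retained
    ("c2", date(2017, 2, 1)),
    ("c2", date(2018, 1, 15)),   # repeat after the window: not retained
    ("c3", date(2017, 3, 3)),    # single order: not retained
    ("c4", date(2017, 5, 1)),
    ("c4", date(2017, 10, 20)),  # 172 days later: retained
]

rate = retention_rate(sample_orders)
print(rate)  # 50.0
```
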
- Monthly Orders Trend
- Top Categories by Revenue
- Yearly Revenue Growth
- State-wise Customer Distribution
- Installment Usage Analysis
- Focus marketing efforts on high-performing categories.
- Strengthen the retention strategy, for example through loyalty programs, to increase repeat purchases.
- Expand logistics operations in high-demand states.
- Optimize pricing for frequently purchased products.
This project demonstrates a practical implementation of SQL for structured querying and Python for analysis and visualization.
It reflects a real-world data analysis workflow and business-driven insight generation, making it relevant to Data Analyst and Business Analyst roles.