This project analyzes an e-commerce dataset using SQL queries and Python to produce insights and visualizations.
It explores customer behavior, sales patterns, and business performance by answering both basic and advanced analytical questions.
The dataset includes information about:
- Customers: Demographic details (city, state, ID mapping)
- Orders: Order IDs, timestamps, status
- Order Items: Product ID, seller ID, quantity, price, freight value
- Products: Category, dimensions, weight
- Payments: Payment type, installments, value
- Sellers: Seller details and performance
- Geolocation: City, state, latitude, longitude
SQL was used to extract insights from these tables, and Python (Pandas, NumPy, Matplotlib, Seaborn) was used for further analysis. The questions answered range from basic to advanced:
- Unique customer cities
- Orders placed in 2017
- Sales per product category
- % of orders with installments
- Customers count per state
- Orders per month in 2018
- Avg. products per order (by city)
- % revenue by category
- Correlation: product price vs. purchase frequency
- Revenue per seller (ranked)
- Moving average of order values per customer
- Cumulative sales per month (by year)
- Year-over-year sales growth rate
- Customer retention (within 6 months)
- Top 3 spenders per year
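As an illustration of how two of the questions above might be expressed, here is a minimal sketch that runs SQL through Python's built-in `sqlite3` module on a tiny in-memory sample. The table names, column names (`order_purchase_timestamp`, `payment_installments`, `payment_value`), the sample rows, and the rule that "with installments" means more than one installment are all assumptions for illustration, not the project's actual schema or queries.

```python
import sqlite3

# Hypothetical, simplified schema and made-up rows; the real tables have
# more columns and these names are assumptions, not the project's own.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id TEXT PRIMARY KEY, order_purchase_timestamp TEXT);
    CREATE TABLE payments (order_id TEXT, payment_installments INTEGER, payment_value REAL);
    INSERT INTO orders VALUES
        ('o1', '2017-03-01 10:00:00'),
        ('o2', '2017-11-15 09:30:00'),
        ('o3', '2018-02-20 14:45:00'),
        ('o4', '2018-07-04 18:10:00');
    INSERT INTO payments VALUES
        ('o1', 1, 100.0),
        ('o2', 3, 100.0),
        ('o3', 2, 150.0),
        ('o4', 1, 150.0);
""")

# Basic question: % of orders with installments.
# Assumption: an order counts when any of its payments has > 1 installment.
installment_pct = conn.execute("""
    SELECT 100.0
           * COUNT(DISTINCT CASE WHEN payment_installments > 1 THEN order_id END)
           / COUNT(DISTINCT order_id)
    FROM payments
""").fetchone()[0]

# Advanced question: year-over-year sales growth rate.
# Yearly totals are joined to the previous year's totals with a self-join.
yoy_growth = conn.execute("""
    WITH yearly AS (
        SELECT CAST(strftime('%Y', o.order_purchase_timestamp) AS INTEGER) AS yr,
               SUM(p.payment_value) AS sales
        FROM orders AS o
        JOIN payments AS p ON p.order_id = o.order_id
        GROUP BY yr
    )
    SELECT cur.yr, 100.0 * (cur.sales - prev.sales) / prev.sales AS growth_pct
    FROM yearly AS cur
    JOIN yearly AS prev ON cur.yr = prev.yr + 1
    ORDER BY cur.yr
""").fetchall()

print(f"Orders with installments: {installment_pct:.1f}%")
for yr, growth in yoy_growth:
    print(f"{yr}: {growth:+.1f}% vs previous year")
```

The self-join is used here only to keep the sketch portable; on MySQL 8+ or PostgreSQL the same comparison could be written with a `LAG` window function instead.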
- SQL: Data extraction, transformations, aggregation
- Python: Data cleaning, analysis & visualization
- Pandas, NumPy, Matplotlib, Seaborn
- Clone this repository:
  git clone https://github.com/your-username/ecommerce-data-analysis.git
  cd ecommerce-data-analysis
- Import dataset files into your SQL database (MySQL/PostgreSQL/SQLite).
- Run queries from Questions.txt or queries.sql.
- Open Ecommerce_python_sql.ipynb in Jupyter Notebook to view the Python analysis.
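If SQLite is the chosen database, the import step could look roughly like the sketch below, which loads a CSV into a table and runs two of the basic queries from Python. The `load_csv` helper, the column names (`customer_id`, `customer_city`, `customer_state`), and the sample rows are hypothetical; an in-memory string stands in for a real dataset file, so swap in `open(...)` on your own CSV path.

```python
import csv
import io
import sqlite3

# Stand-in for a customers CSV file; the layout and column names are
# assumptions for illustration, not the dataset's confirmed schema.
sample_csv = io.StringIO(
    "customer_id,customer_city,customer_state\n"
    "c1,austin,TX\n"
    "c2,boston,MA\n"
    "c3,austin,TX\n"
)

def load_csv(conn, table, handle):
    """Create `table` from the CSV header row and insert every row as text."""
    reader = csv.reader(handle)
    header = next(reader)
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    conn.commit()

conn = sqlite3.connect(":memory:")  # use a file path to persist the database
load_csv(conn, "customers", sample_csv)

# Unique customer cities
unique_cities = [row[0] for row in conn.execute(
    "SELECT DISTINCT customer_city FROM customers ORDER BY customer_city"
)]

# Customer count per state
customers_per_state = conn.execute(
    "SELECT customer_state, COUNT(*) FROM customers "
    "GROUP BY customer_state ORDER BY customer_state"
).fetchall()

print(unique_cities)
print(customers_per_state)
```

Every column is stored as text here for simplicity; numeric columns such as prices would need explicit types or a `CAST` in queries.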