This project analyzes a retail sales dataset using SQL.
The goal was to clean the data and answer common business questions such as:
- How many sales and customers are there?
- Which categories perform best?
- Who are the top customers?
- What are the peak sales periods?
This project helped me practice SQL concepts like:
- Data cleaning
- Aggregate functions
- Grouping and filtering
- Window functions
Tools and skills used:
- SQL (PostgreSQL syntax)
- Data Cleaning
- Exploratory Data Analysis (EDA)
- Business Query Writing
The dataset contains retail sales information with the following columns:
- `transactions_id`: Unique ID for each transaction
- `sale_date`, `sale_time`: Date and time of the sale
- `customer_id`, `gender`, `age`: Customer details
- `category`: Product category (Clothing, Beauty, etc.)
- `quantity`, `price_per_unit`, `cogs`, `total_sale`: Sales details
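
For reference, the columns above could be declared roughly as follows. This is only a sketch: the table name `retail_sales` and every column type are assumptions inferred from the column descriptions, not the project's actual schema.

```sql
-- Hypothetical table definition (PostgreSQL).
-- The table name and all column types are assumptions.
CREATE TABLE retail_sales (
    transactions_id INT PRIMARY KEY,   -- unique ID for each transaction
    sale_date       DATE,              -- date of the sale
    sale_time       TIME,              -- time of the sale
    customer_id     INT,
    gender          VARCHAR(15),
    age             INT,
    category        VARCHAR(15),       -- e.g. Clothing, Beauty
    quantity        INT,
    price_per_unit  NUMERIC(10, 2),
    cogs            NUMERIC(10, 2),    -- cost of goods sold
    total_sale      NUMERIC(10, 2)
);

-- Quick sanity check after loading the data.
SELECT COUNT(*) AS total_rows FROM retail_sales;
```
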
The analysis covered the following tasks:
- Checked for null values and removed the affected rows.
- Counted total sales, customers, and product categories.
- Retrieved sales made on a specific date.
- Filtered transactions based on category and quantity.
- Calculated total and average sales per category.
- Found the average age of customers buying Beauty products.
- Identified transactions with sales above 1000.
- Counted sales by gender across categories.
- Found the best-selling month each year.
- Listed top 5 customers by total purchase value.
- Counted unique customers per category.
- Divided sales into shifts (Morning, Afternoon, Evening).
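
A few of the tasks above might look like this in PostgreSQL. The queries assume a table named `retail_sales` with the columns described earlier; the table name and the shift boundaries (before 12:00, 12:00 to 17:59, 18:00 onward) are assumptions chosen for illustration.

```sql
-- Assumes a table named retail_sales (hypothetical name).

-- Data cleaning: remove rows where any sales field is null.
DELETE FROM retail_sales
WHERE transactions_id IS NULL
   OR sale_date IS NULL
   OR sale_time IS NULL
   OR customer_id IS NULL
   OR category IS NULL
   OR quantity IS NULL
   OR price_per_unit IS NULL
   OR cogs IS NULL
   OR total_sale IS NULL;

-- Total and average sales per category.
SELECT category,
       SUM(total_sale)                     AS total_sales,
       ROUND(AVG(total_sale)::numeric, 2)  AS avg_sale
FROM retail_sales
GROUP BY category
ORDER BY total_sales DESC;

-- Number of transactions per shift.
-- The hour boundaries below are assumed, not taken from the project.
SELECT
    CASE
        WHEN EXTRACT(HOUR FROM sale_time) < 12 THEN 'Morning'
        WHEN EXTRACT(HOUR FROM sale_time) BETWEEN 12 AND 17 THEN 'Afternoon'
        ELSE 'Evening'
    END      AS shift,
    COUNT(*) AS total_orders
FROM retail_sales
GROUP BY shift
ORDER BY total_orders DESC;
```

The `DELETE` is destructive, so it may be worth running the same `WHERE` clause in a `SELECT` first to see which rows would be removed.
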
The analysis produced the following results:
- Found the top-selling categories by total revenue.
- Identified the best month for sales each year.
- Highlighted the top 5 customers contributing the most to revenue.
- Observed sales patterns by shift of the day.
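
The best-month and top-customer results above could be produced with queries along these lines. This is a sketch under assumptions: the table name `retail_sales` is hypothetical, and "best month" is taken here to mean the month with the highest total sales, which may differ from the metric the project used.

```sql
-- Assumes a table named retail_sales (hypothetical name).

-- Best month for sales in each year, using a window function.
-- RANK() orders the months within each year by total sales.
SELECT sale_year, sale_month, monthly_total
FROM (
    SELECT
        EXTRACT(YEAR FROM sale_date)  AS sale_year,
        EXTRACT(MONTH FROM sale_date) AS sale_month,
        SUM(total_sale)               AS monthly_total,
        RANK() OVER (
            PARTITION BY EXTRACT(YEAR FROM sale_date)
            ORDER BY SUM(total_sale) DESC
        ) AS month_rank
    FROM retail_sales
    GROUP BY 1, 2
) AS ranked
WHERE month_rank = 1
ORDER BY sale_year;

-- Top 5 customers by total purchase value.
SELECT customer_id,
       SUM(total_sale) AS total_spent
FROM retail_sales
GROUP BY customer_id
ORDER BY total_spent DESC
LIMIT 5;
```

Because `RANK()` assigns the same rank to ties, a year with two equally strong months would return both rows; `ROW_NUMBER()` would force a single row per year instead.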