Maheshbirajadar/SQL-Sales-Analysis-project

Retail Sales Analysis SQL Project

Project Overview

Project Title: Retail Sales Analysis
Level: Beginner
Database: Sales_db

This project is designed to demonstrate SQL skills and techniques typically used by data analysts to explore, clean, and analyze retail sales data. The project involves setting up a retail sales database, performing exploratory data analysis (EDA), and answering specific business questions through SQL queries.

Objectives

  1. Set up a retail sales database: Create and populate a retail sales database with the provided sales data.
  2. Data Cleaning: Identify and remove any records with missing or null values.
  3. Exploratory Data Analysis (EDA): Perform basic exploratory data analysis to understand the dataset.
  4. Business Analysis: Use SQL to answer specific business questions and derive insights from the sales data.

Project Structure

1. Database Setup

  • Database Creation: The project starts by creating a database named Sales_db.
  • Table Creation: A table named sales_table is created to store the sales data. The table structure includes columns for transaction ID, sale date, sale time, customer ID, gender, age, product category, quantity sold, price per unit, cost of goods sold (COGS), and total sale amount.
CREATE DATABASE Sales_db;

CREATE TABLE sales_table
 (
    transaction_id INT PRIMARY KEY,
    sale_date DATE,
    sale_time TIME,
    customer_id INT,
    gender VARCHAR(10),
    age INT,
    category VARCHAR(100),
    quantity INT,
    price_per_unit DECIMAL(10,2),
    cogs DECIMAL(10,2),
    total_sale DECIMAL(10,2)
);
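
The load step itself is not shown here. As a minimal sketch, assuming an in-memory SQLite database in place of the project's database server and two made-up rows standing in for the provided sales data, the table could be created and populated from Python roughly like this (SQLite type names differ slightly from the DDL above):

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the project's database server.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sales_table (
        transaction_id INTEGER PRIMARY KEY,
        sale_date      TEXT,
        sale_time      TEXT,
        customer_id    INTEGER,
        gender         TEXT,
        age            INTEGER,
        category       TEXT,
        quantity       INTEGER,
        price_per_unit REAL,
        cogs           REAL,
        total_sale     REAL
    )
""")

# Hypothetical sample rows, invented for illustration only.
sample_rows = [
    (1, "2022-11-07", "10:15:00", 101, "Female", 34, "Clothing", 4, 300.0, 120.0, 1200.0),
    (2, "2022-11-07", "18:40:00", 102, "Male", 41, "Beauty", 1, 50.0, 20.0, 50.0),
]

# Parameterized insert: one placeholder per column, in table order.
conn.executemany(
    "INSERT INTO sales_table VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
    sample_rows,
)
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM sales_table").fetchone()[0]
print(row_count)
```

With real data, the same `executemany` call would take rows read from the source file instead of the hard-coded list.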

2. Data Exploration & Cleaning

  • Record Count: Determine the total number of records in the dataset.
  • Customer Count: Find out how many unique customers are in the dataset.
  • Category Count: Identify all unique product categories in the dataset.
  • Null Value Check: Check for any null values in the dataset and delete records with missing data.
SELECT COUNT(*) FROM sales_table;
SELECT COUNT(DISTINCT customer_id) FROM sales_table;
SELECT DISTINCT category FROM sales_table;

SELECT * FROM sales_table
WHERE transaction_id IS NULL 
   OR sale_date IS NULL 
   OR sale_time IS NULL 
   OR customer_id IS NULL 
   OR gender IS NULL 
   OR age IS NULL 
   OR category IS NULL 
   OR quantity IS NULL 
   OR price_per_unit IS NULL 
   OR cogs IS NULL 
   OR total_sale IS NULL;

DELETE FROM sales_table
WHERE 
    transaction_id IS NULL OR sale_date IS NULL OR sale_time IS NULL OR 
    customer_id IS NULL OR gender IS NULL OR age IS NULL OR 
    category IS NULL OR quantity IS NULL OR price_per_unit IS NULL OR 
    cogs IS NULL OR total_sale IS NULL;
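
The null check and the delete are meant to cover the same columns. One way to keep them from drifting apart is to build both statements from a single filter. The sketch below assumes SQLite via Python's `sqlite3` module, a cut-down version of the table, and invented rows; it is an illustration, not the project's own script.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: a reduced column set is enough to show the idea.
conn.execute("""
    CREATE TABLE sales_table (
        transaction_id INTEGER PRIMARY KEY,
        category       TEXT,
        quantity       INTEGER,
        total_sale     REAL
    )
""")

# Hypothetical rows: two of the three contain missing values.
conn.executemany(
    "INSERT INTO sales_table VALUES (?, ?, ?, ?)",
    [
        (1, "Clothing", 2, 200.0),
        (2, "Beauty", None, None),
        (3, None, 1, 50.0),
    ],
)

# Build the filter once from the column list so SELECT and DELETE always agree.
columns = ["category", "quantity", "total_sale"]
null_filter = " OR ".join(f"{col} IS NULL" for col in columns)

flagged = conn.execute(
    f"SELECT COUNT(*) FROM sales_table WHERE {null_filter}"
).fetchone()[0]

deleted = conn.execute(
    f"DELETE FROM sales_table WHERE {null_filter}"
).rowcount
conn.commit()

remaining = conn.execute("SELECT COUNT(*) FROM sales_table").fetchone()[0]
print(flagged, deleted, remaining)
```

Comparing `flagged` with `deleted` gives a quick check that the cleanup removed exactly the rows the inspection query reported.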

3. Data Analysis & Findings

The following SQL queries were developed to answer specific business questions:

  1. Write a SQL query to retrieve all columns for sales made on '2022-11-07':
Select *
From sales_table
Where sale_date = '2022-11-07';
  2. Write a SQL query to retrieve all transactions where the category is 'Clothing' and the quantity sold is greater than or equal to 4 in November 2022:
SELECT *  
FROM sales_table  
WHERE category = 'Clothing'  
AND quantity >= 4  
AND MONTH(sale_date) = 11  
AND YEAR(sale_date) = 2022;
  3. Write a SQL query to calculate the total sales (total_sale) for each category:
Select category, SUM(TOTAL_SALE) as Net_sales
From sales_table
Group by category;
  4. Write a SQL query to find the average age of customers who purchased items from the 'Beauty' category:
SELECT category, AVG(age) AS Avg_age
FROM sales_table
WHERE category = 'Beauty'
GROUP BY category;
  5. Write a SQL query to find all transactions where the total_sale is greater than 1000:
Select * from sales_table
Where total_sale > 1000;
  6. Write a SQL query to find the total number of transactions (transaction_id) made by each gender in each category:
Select 
  gender, 
  category, 
  count(transaction_id) as total_transactions
  From sales_table
Group by 
  gender, category
Order by category DESC;
  7. Write a SQL query to calculate the average sale for each month and find the best-selling month in each year:
SELECT 
    year,
    month,
    avg_sale
FROM (    
    SELECT 
        EXTRACT(YEAR FROM sale_date) AS year,
        EXTRACT(MONTH FROM sale_date) AS month,
        AVG(total_sale) AS avg_sale,
        SUM(total_sale) AS total_sales,
        DENSE_RANK() OVER (PARTITION BY EXTRACT(YEAR FROM sale_date) ORDER BY SUM(total_sale) DESC) AS ranking
    FROM sales_table
    GROUP BY EXTRACT(YEAR FROM sale_date), EXTRACT(MONTH FROM sale_date)
) AS t1
WHERE ranking = 1;
  8. Write a SQL query to find the top 5 customers based on the highest total sales:
SELECT customer_id, SUM(total_sale) AS total_sales  
FROM sales_table  
GROUP BY customer_id  
ORDER BY total_sales DESC  
LIMIT 5;
  9. Write a SQL query to find the number of unique customers who purchased items from each category:
Select 
    count(distinct customer_id) as Unique_customers_purchase, category
From 
   sales_table
 Group by category;
  10. Write a SQL query to assign each order to a shift and count the orders per shift (for example: Morning, hour < 12; Afternoon, hour between 12 and 17; Evening, hour > 17):
SELECT 
    CASE 
        WHEN HOUR(sale_time) < 12 THEN 'Morning'
        WHEN HOUR(sale_time) BETWEEN 12 AND 17 THEN 'Afternoon'
        ELSE 'Evening'
    END AS shift, 
    COUNT(customer_id) AS num_orders
FROM sales_table
GROUP BY shift;
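
The best-selling-month query above reports each month's average sale but ranks months by their total sales, and the two measures can disagree. The following sketch makes that visible on a few invented rows. It assumes SQLite (3.25 or later, for window functions) through Python's `sqlite3` module, uses `strftime` in place of `EXTRACT`, and splits the aggregation and ranking into two CTEs for readability; the names `monthly` and `ranked` are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_table (sale_date TEXT, total_sale REAL)")

# Hypothetical rows chosen so that total and average disagree in 2023.
conn.executemany(
    "INSERT INTO sales_table VALUES (?, ?)",
    [
        ("2022-01-05", 100.0),
        ("2022-01-20", 300.0),
        ("2022-02-11", 500.0),
        ("2023-03-02", 200.0),
        ("2023-03-15", 200.0),
        ("2023-04-09", 350.0),
    ],
)

query = """
WITH monthly AS (
    -- One row per (year, month) with both the average and the total.
    SELECT strftime('%Y', sale_date) AS sale_year,
           strftime('%m', sale_date) AS sale_month,
           AVG(total_sale)           AS avg_sale,
           SUM(total_sale)           AS total_sales
    FROM sales_table
    GROUP BY sale_year, sale_month
),
ranked AS (
    -- Rank months within each year by total sales, highest first.
    SELECT monthly.*,
           DENSE_RANK() OVER (
               PARTITION BY sale_year
               ORDER BY total_sales DESC
           ) AS ranking
    FROM monthly
)
SELECT sale_year, sale_month, avg_sale
FROM ranked
WHERE ranking = 1
ORDER BY sale_year;
"""

best_months = conn.execute(query).fetchall()
for row in best_months:
    print(row)
```

In this sample data, March 2023 wins on total sales (400) even though April 2023 has the higher average (350). Changing the `ORDER BY` inside the window to `avg_sale DESC` would rank by average instead; which one counts as "best selling" is a choice the analysis should state.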

Findings

  • Customer Demographics: The dataset includes customers from various age groups, with sales distributed across different categories such as Clothing and Beauty.
  • High-Value Transactions: Several transactions had a total sale amount greater than 1000, indicating premium purchases.
  • Sales Trends: Monthly analysis shows variations in sales, helping identify peak seasons.
  • Customer Insights: The analysis identifies the top-spending customers and the most popular product categories.

Reports

  • Sales Summary: A detailed report summarizing total sales, customer demographics, and category performance.
  • Trend Analysis: Insights into sales trends across different months and shifts.
  • Customer Insights: Reports on top customers and unique customer counts per category.

Conclusion

This project serves as a comprehensive introduction to SQL for data analysts, covering database setup, data cleaning, exploratory data analysis, and business-driven SQL queries. The findings from this project can help drive business decisions by understanding sales patterns, customer behavior, and product performance.

Thank you, and I look forward to connecting with you!
