sriyanshujaiswal/SQL-Python-Ecommerce_Project

SQL-Python E-commerce Project

Overview

This project analyzes an e-commerce dataset using SQL for data extraction and Python (Pandas, NumPy, Matplotlib, and Seaborn) for exploratory data analysis (EDA). The goal is to uncover insights and trends in the data through queries that range from basic to advanced.

Table of Contents

  1. Project Description
  2. Dataset
  3. Setup and Installation
  4. EDA and SQL Queries
  5. Results and Visualizations
  6. Conclusion
  7. Acknowledgements

Project Description

This project uses SQL to query the data and Python libraries (Pandas, NumPy, Matplotlib, and Seaborn) for EDA. The analysis answers business questions that range from basic statistics to advanced insights, supporting informed decisions in the e-commerce domain.

Dataset

The dataset used in this project includes information on customers, orders, products, and sellers. It is assumed to be stored in a relational database and can be accessed using SQL queries.

Setup and Installation

  1. Clone the Repository:

    git clone https://github.com/sriyanshujaiswal/SQL-Python-Ecommerce_Project.git
    cd SQL-Python-Ecommerce_Project
  2. Install Dependencies:

    pip install pandas numpy matplotlib seaborn sqlalchemy
  3. Database Connection: Configure your database connection in the config.py file.
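
The contents of config.py are not shown in this README, so the following is only a minimal sketch of what such a file might contain. The setting names, the `mysql+pymysql` dialect, and the `build_database_url` helper are all assumptions made for illustration; the helper simply assembles a URL in the `dialect://user:password@host:port/database` form that SQLAlchemy's `create_engine` accepts.

```python
# config.py (hypothetical layout; adapt names and values to your own database)
from urllib.parse import quote_plus

# Placeholder settings, not real credentials.
DB_SETTINGS = {
    "dialect": "mysql+pymysql",  # assumed dialect+driver; could be e.g. "postgresql+psycopg2"
    "user": "ecommerce_user",
    "password": "change-me",
    "host": "localhost",
    "port": 3306,
    "database": "ecommerce",
}


def build_database_url(settings: dict = DB_SETTINGS) -> str:
    """Build a SQLAlchemy-style URL: dialect://user:password@host:port/database."""
    # URL-encode the credentials so characters such as '@' do not break the URL.
    user = quote_plus(settings["user"])
    password = quote_plus(settings["password"])
    return (
        f"{settings['dialect']}://{user}:{password}"
        f"@{settings['host']}:{settings['port']}/{settings['database']}"
    )
```

With SQLAlchemy installed, the URL could then be used as `engine = sqlalchemy.create_engine(build_database_url())`. The chosen driver (here `pymysql`) would need to be installed separately.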

EDA and SQL Queries

Basic Queries

  1. List all unique cities where customers are located.
  2. Count the number of orders placed in 2017.
  3. Find the total sales per category.
  4. Calculate the percentage of orders that were paid in installments.
  5. Count the number of customers from each state.
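
As a rough illustration of the basic queries above, the sketch below runs four of them against an in-memory SQLite database that stands in for the real one. The table and column names (`customers`, `orders`, `payments`, `order_purchase_timestamp`, `payment_installments`) and the sample rows are assumptions, not the project's actual schema or data. "Paid in installments" is taken here to mean more than one installment, and each order is assumed to have a single payment row.

```python
import sqlite3

# In-memory SQLite database standing in for the real one (assumed schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id TEXT, customer_city TEXT, customer_state TEXT);
    CREATE TABLE orders (order_id TEXT, customer_id TEXT, order_purchase_timestamp TEXT);
    CREATE TABLE payments (order_id TEXT, payment_installments INTEGER, payment_value REAL);

    -- Placeholder rows for illustration only
    INSERT INTO customers VALUES ('c1', 'sao paulo', 'SP'), ('c2', 'campinas', 'SP'),
                                 ('c3', 'rio de janeiro', 'RJ');
    INSERT INTO orders VALUES ('o1', 'c1', '2017-01-05 10:00:00'),
                              ('o2', 'c2', '2017-06-10 12:30:00'),
                              ('o3', 'c3', '2018-02-01 09:15:00'),
                              ('o4', 'c1', '2018-03-20 18:45:00');
    INSERT INTO payments VALUES ('o1', 1, 50.0), ('o2', 3, 120.0),
                                ('o3', 2, 80.0), ('o4', 1, 30.0);
""")

# 1. Unique customer cities
unique_cities = [row[0] for row in conn.execute(
    "SELECT DISTINCT customer_city FROM customers ORDER BY customer_city")]

# 2. Number of orders placed in 2017 (strftime is SQLite syntax)
orders_2017 = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE strftime('%Y', order_purchase_timestamp) = '2017'"
).fetchone()[0]

# 4. Percentage of payments made in more than one installment
installment_pct = conn.execute("""
    SELECT 100.0 * SUM(CASE WHEN payment_installments > 1 THEN 1 ELSE 0 END) / COUNT(*)
    FROM payments
""").fetchone()[0]

# 5. Number of customers per state
customers_per_state = dict(conn.execute(
    "SELECT customer_state, COUNT(*) FROM customers GROUP BY customer_state"))
```

Date functions differ between databases: `strftime` is specific to SQLite, so on MySQL the year filter would typically use `YEAR(order_purchase_timestamp) = 2017` instead.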

Intermediate Queries

  1. Calculate the number of orders per month in 2018.
  2. Find the average number of products per order, grouped by customer city.
  3. Calculate the percentage of total revenue contributed by each product category.
  4. Identify the correlation between product price and the number of times a product has been purchased.
  5. Calculate the total revenue generated by each seller, and rank them by revenue.
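
Two of the intermediate queries can be sketched as follows: ranking sellers by revenue with a window function, and measuring the correlation between product price and purchase count. The `order_items` table, its columns, and the sample rows are assumptions for illustration, and the small `pearson` helper is a hypothetical stand-in for what Pandas or NumPy would normally compute. An in-memory SQLite database is used so the example runs without a database server.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed schema with placeholder rows
    CREATE TABLE order_items (order_id TEXT, product_id TEXT, seller_id TEXT, price REAL);
    INSERT INTO order_items VALUES
        ('o1', 'p1', 's1', 10.0), ('o2', 'p1', 's1', 10.0), ('o3', 'p1', 's1', 10.0),
        ('o1', 'p2', 's2', 20.0), ('o2', 'p2', 's2', 20.0),
        ('o3', 'p3', 's2', 30.0);
""")

# Total revenue per seller, ranked from highest to lowest.
seller_ranking = conn.execute("""
    WITH seller_revenue AS (
        SELECT seller_id, SUM(price) AS revenue
        FROM order_items
        GROUP BY seller_id
    )
    SELECT seller_id, revenue, RANK() OVER (ORDER BY revenue DESC) AS revenue_rank
    FROM seller_revenue
    ORDER BY revenue_rank
""").fetchall()

# Average price and number of purchases per product.
product_stats = conn.execute("""
    SELECT AVG(price) AS avg_price, COUNT(*) AS times_purchased
    FROM order_items
    GROUP BY product_id
""").fetchall()


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


prices = [row[0] for row in product_stats]
counts = [row[1] for row in product_stats]
price_count_corr = pearson(prices, counts)
```

Computing the aggregate in a common table expression first keeps the ranking step readable. With Pandas, the same correlation could be obtained from the query result through `DataFrame.corr`.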

Advanced Queries

  1. Calculate the moving average of order values for each customer over their order history.
  2. Calculate the cumulative sales per month for each year.
  3. Calculate the year-over-year growth rate of total sales.
  4. Calculate the retention rate of customers, defined as the percentage of customers who make another purchase within 6 months of their first purchase.
  5. Identify the top 3 customers who spent the most money in each year.
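
The advanced queries lean on window functions. The sketch below shows three of them under stated assumptions: a simplified `orders` table with one `order_value` per order, a moving-average window of three orders (the README does not specify a window size), and retention counted only for a later order dated strictly after the first purchase and within six months of it. Table names, columns, and rows are placeholders, and SQLite date functions are used so the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed, simplified schema with placeholder rows
    CREATE TABLE orders (order_id TEXT, customer_id TEXT, order_date TEXT, order_value REAL);
    INSERT INTO orders VALUES
        ('o1', 'c1', '2017-01-10', 100.0),
        ('o2', 'c1', '2017-03-15', 200.0),
        ('o3', 'c1', '2018-01-20', 300.0),
        ('o4', 'c2', '2017-02-01', 50.0),
        ('o5', 'c2', '2018-02-01', 150.0);
""")

# Moving average of order value per customer over the current and two previous orders.
moving_avg = conn.execute("""
    SELECT customer_id, order_date,
           AVG(order_value) OVER (
               PARTITION BY customer_id
               ORDER BY order_date
               ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
           ) AS moving_avg
    FROM orders
    ORDER BY customer_id, order_date
""").fetchall()

# Cumulative sales per month, restarting at the beginning of each year.
cumulative_sales = conn.execute("""
    WITH monthly AS (
        SELECT strftime('%Y', order_date) AS order_year,
               strftime('%m', order_date) AS order_month,
               SUM(order_value) AS sales
        FROM orders
        GROUP BY order_year, order_month
    )
    SELECT order_year, order_month, sales,
           SUM(sales) OVER (PARTITION BY order_year ORDER BY order_month) AS cumulative_sales
    FROM monthly
    ORDER BY order_year, order_month
""").fetchall()

# Retention rate: share of customers with another order within 6 months of their first.
retention_pct = conn.execute("""
    WITH first_orders AS (
        SELECT customer_id, MIN(order_date) AS first_date
        FROM orders
        GROUP BY customer_id
    )
    SELECT 100.0 * COUNT(DISTINCT o.customer_id) / (SELECT COUNT(*) FROM first_orders)
    FROM first_orders AS f
    JOIN orders AS o
      ON o.customer_id = f.customer_id
     AND o.order_date > f.first_date
     AND o.order_date <= date(f.first_date, '+6 months')
""").fetchone()[0]
```

Year-over-year growth and the top 3 customers per year follow the same pattern, using `LAG(...) OVER (ORDER BY order_year)` and `RANK() OVER (PARTITION BY order_year ORDER BY ... DESC)` respectively.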

Results and Visualizations

Query results and visualizations are generated and displayed with Pandas, Matplotlib, and Seaborn. Example visualizations include bar charts for sales per category, line plots for monthly sales trends, and scatter plots for correlation analysis.
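
A bar chart of sales per category might be produced along these lines. This is a minimal Matplotlib sketch: the `plot_sales_per_category` function is a hypothetical helper, and the category names and totals are placeholder values, not results from the dataset. In practice the input would come from the "total sales per category" query.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the script also runs without a display
import matplotlib.pyplot as plt


def plot_sales_per_category(sales_by_category: dict):
    """Draw a bar chart of total sales per product category and return the figure."""
    categories = list(sales_by_category)
    totals = [sales_by_category[c] for c in categories]

    fig, ax = plt.subplots(figsize=(8, 4))
    ax.bar(categories, totals)
    ax.set_xlabel("Product category")
    ax.set_ylabel("Total sales")
    ax.set_title("Sales per category")
    ax.tick_params(axis="x", rotation=45)  # rotate labels so long names stay readable
    fig.tight_layout()
    return fig


# Placeholder values for illustration only.
placeholder_sales = {"category_a": 1200.0, "category_b": 800.0, "category_c": 450.0}
fig = plot_sales_per_category(placeholder_sales)
```

The figure can be written to disk with `fig.savefig("sales_per_category.png")`. In a notebook, drop the `matplotlib.use("Agg")` line to display it inline.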

Conclusion

This project demonstrates how to combine SQL and Python for comprehensive data analysis in the e-commerce domain. By answering various business questions, you can gain valuable insights to make informed decisions.
