Hello everyone,
Here is my EDA project on Super Market Sales Analysis, in which I analyzed the data using Matplotlib and Seaborn.
I used the Super Market Sales Dataset from Kaggle, which records sales in 3 different cities over 3 months.
Link to the Dataset : Super Market Sales Dataset
- To analyze sales data and gain insights into customer purchasing behavior, product performance and overall trends of the supermarket business.
- For my Super Market Sales Analysis project, I created a Streamlit web app for exploring the data in a more interactive and user-friendly way.
- This web app lets you dig deep into the sales data and helps you answer critical questions in just a few clicks.
Link to the Web App : Super Market Sales App
- Setting up the Environment
- Libraries required for the Project
- Getting started with Repository
- Steps involved in the Project
- Conclusion
- Link to the Notebook
Jupyter Notebook is required for this project. You can install and run it from the terminal.
- Install the Notebook:
pip install notebook
- Run the Notebook:
jupyter notebook
NumPy
- Go to the terminal and run:
pip install numpy
- Or, from a Jupyter Notebook cell, run:
!pip install numpy
Pandas
- Go to the terminal and run:
pip install pandas
- Or, from a Jupyter Notebook cell, run:
!pip install pandas
Matplotlib
- Go to the terminal and run:
pip install matplotlib
- Or, from a Jupyter Notebook cell, run:
!pip install matplotlib
Seaborn
- Go to the terminal and run:
pip install seaborn
- Or, from a Jupyter Notebook cell, run:
!pip install seaborn
Scikit-learn
- Go to the terminal and run this command (the package is published on PyPI as scikit-learn and imported in code as sklearn):
pip install scikit-learn
- Or, from a Jupyter Notebook cell, run:
!pip install scikit-learn
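As a quick check after installing, the following sketch reports which of the libraries above cannot be found in the current environment. The helper name `missing_modules` is hypothetical and only used for illustration; the list contains import names, which is why scikit-learn appears as `sklearn`.

```python
from importlib.util import find_spec

# Import names of the libraries listed above.
# scikit-learn is imported as "sklearn", which differs from its PyPI name.
REQUIRED = ["numpy", "pandas", "matplotlib", "seaborn", "sklearn"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing libraries:", ", ".join(missing))
else:
    print("All required libraries are installed.")
```

Run it from a terminal or a notebook cell; any name it prints still needs to be installed with pip.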
- Clone the repository to your local machine using the following command :
git clone https://github.com/TheMrityunjayPathak/SuperMarketSalesAnalysis.git
Reading the Data
- First, I installed all the libraries required for this project.
- Then I imported the data by reading the CSV file with the pandas read_csv() function.
- Then I dropped the Invoice ID column because it is not needed for the analysis.
- After that, I listed all the columns in the dataset with the df.columns attribute.
- Then I used the df.shape attribute to check the number of rows and columns in the data.
- Then I looked at the dataset summary using the df.info() method.
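The reading steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's exact code: the two-row inline sample, its values, and every column name other than Invoice ID and Date are assumptions standing in for the real Kaggle CSV file.

```python
import io

import pandas as pd

# Hypothetical two-row sample standing in for the Kaggle CSV file.
# With the real file, pass its path to pd.read_csv() instead.
SAMPLE_CSV = """Invoice ID,Branch,City,Gender,Product line,Unit price,Quantity,Date
000-00-0001,A,City 1,Female,Health and beauty,10.0,2,1/5/2019
000-00-0002,B,City 2,Male,Sports and travel,20.0,1,2/14/2019
"""

# Read the CSV data into a DataFrame.
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Drop the identifier column, which is not needed for the analysis.
df = df.drop(columns=["Invoice ID"])

# List the remaining columns, then check the number of rows and columns.
print(df.columns.tolist())
print(df.shape)

# Print column data types and non-null counts.
df.info()
```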
Cleaning the Data
- First, I described the data using the df.describe() method.
- Then I converted the Date column to the pandas datetime data type.
- After that, I extracted the year, month, and day from the date.
- Then I listed all the unique values of the categorical columns.
- Finally, I checked for null values in the dataset using df.isna().sum().
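A rough sketch of these cleaning steps is shown below. The small DataFrame, the month/day/year date format, and the column names City, Payment, and Total are assumptions made for illustration; adjust them to match the actual dataset.

```python
import pandas as pd

# Hypothetical sample rows; the real data comes from the Kaggle CSV file.
df = pd.DataFrame({
    "Date": ["1/5/2019", "2/14/2019", "3/25/2019"],
    "City": ["City 1", "City 2", "City 1"],
    "Payment": ["Cash", "Ewallet", "Cash"],
    "Total": [21.0, 21.0, 31.5],
})

# Summary statistics for the numeric columns.
print(df.describe())

# Convert the Date column to datetime (assumed month/day/year format).
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%Y")

# Extract year, month, and day into separate columns.
df["Year"] = df["Date"].dt.year
df["Month"] = df["Date"].dt.month
df["Day"] = df["Date"].dt.day

# List the unique values of each categorical column (assumed names).
categorical_columns = ["City", "Payment"]
for column in categorical_columns:
    print(column, df[column].unique())

# Count the null values in each column.
null_counts = df.isna().sum()
print(null_counts)
```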
Visualizing the Data
- Subplots of Distribution of Unit Price, Ratings and Gross Income
- Unit Price of Each Product Line
- Count of Different Types of Customers from Different Cities
- Count of Different Types of Products in Super Market
- Count of Different Gender Visitors at Different Branches
- Count of Different Types of Payment Methods used by Different Genders
- Count of Different Gender Visitors from Different Cities
- Quantity of Products Sold from Each Product Line
- Different Payment Methods Used by Different Cities
- Total Amount Spent on Different Product Lines by Different Genders
- Rating of Different Product Lines by Different Genders
- Gross Income from Different Product Lines in Different Cities
- Total Sale on Each Day for All Months
- Taxes on Different Product Lines
- Number of Products bought by Different Genders from Different Product Lines
- Total Gross Income from Different Branches by Different Genders
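As an illustration of one chart from the list above (quantity of products sold from each product line), here is a minimal sketch using pandas and Matplotlib. The sample rows and the column names Product line and Quantity are assumptions; the project itself also uses Seaborn, and this sketch does not claim to reproduce the notebook's plots.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; remove this line in a notebook

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample rows; the real data comes from the Kaggle CSV file.
df = pd.DataFrame({
    "Product line": [
        "Health and beauty",
        "Sports and travel",
        "Health and beauty",
        "Sports and travel",
    ],
    "Quantity": [2, 1, 3, 6],
})

# Total quantity sold per product line, largest first.
quantity_by_line = (
    df.groupby("Product line")["Quantity"].sum().sort_values(ascending=False)
)

# Draw one bar per product line.
fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(quantity_by_line.index, quantity_by_line.values)
ax.set_title("Quantity of Products Sold from Each Product Line")
ax.set_xlabel("Product line")
ax.set_ylabel("Quantity sold")
fig.tight_layout()
```

In a Jupyter Notebook the figure is displayed inline once the backend line is removed; in a script, `fig.savefig(...)` can write it to a file.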
- In conclusion, the Super Market Sales project revealed valuable insights into customer purchasing behavior and product performance.
- It provides opportunities for data-driven strategies to enhance profitability and customer satisfaction.