# **Project Name**    - *E-Commerce Project*



##### **Project Type**    - EDA
##### **Contribution**    - Individual
##### Name :- Akshay Sahebrao Chattar

# **Project Summary -**

This project involved analyzing a retail store's sales data to uncover key business insights. The analysis focused on identifying trends in monthly sales, product category performance, profitability, and customer segmentation. Various statistical and visualization techniques were used to derive actionable recommendations that can help improve business operations and revenue generation.

Key highlights include:

*   Identifying peak and low-performing months in terms of sales and profit.
*   Determining the best and worst-performing product categories and sub-categories.
*   Understanding the impact of customer segmentation on sales and profitability.
*   Evaluating the sales-to-profit ratio to improve pricing and discount strategies.

# **GitHub Link -**

 https://github.com/AkshayChattar/E-Commerce-EDA-Project

# **Problem Statement**



1. You need to calculate the monthly sales of the store and identify which month had the highest sales and which month had the lowest sales.

2. You need to analyze sales based on product categories and determine which category has the lowest sales and which category has the highest sales.

3. The sales analysis also needs to be done by sub-category.

4. You need to analyze the monthly profit from sales and determine which month had the highest profit.

5. Analyze the profit by category and sub-category.

6. Analyze the sales and profit by customer segment.

7. Analyze the sales-to-profit ratio.

# **General Guidelines** : -  

1.   Well-structured, formatted, and commented code is required.
2.   Exception Handling, Production Grade Code & Deployment Ready Code will be a plus. Those students will be awarded some additional credits.
     
     The additional credits will have advantages over other students during Star Student selection.
       
             [ Note: - Deployment Ready Code is defined as, the whole .ipynb notebook should be executable in one go
                       without a single error logged. ]

3.   Each and every logic should have proper comments.
4. You may add as many charts as you want. Make sure the following format is answered for each chart.
        

```
# Chart visualization code
```
            

*   Why did you pick the specific chart?
*   What is/are the insight(s) found from the chart?
* Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

5. You have to create at least 20 logical & meaningful charts having important insights.


[ Hints : - Do the Visualization in a structured way while following "UBM" Rule.

U - Univariate Analysis,

B - Bivariate Analysis (Numerical - Categorical, Numerical - Numerical, Categorical - Categorical)

M - Multivariate Analysis
 ]





# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [None]:
# Import Libraries
import pandas as pd
import seaborn as sns
from datetime import datetime
import plotly.express as px
import plotly.graph_objects as go
import plotly.io as pio
import plotly.colors as colors
pio.templates.default = "plotly_white"
import warnings
warnings.filterwarnings('ignore')

### Dataset Loading

In [None]:
#import data from drive
from google.colab import drive
drive.mount('/content/drive')

In [None]:
# Load Dataset
data = pd.read_csv("/content/drive/MyDrive/learning with projects/projects/db/Superstore.csv",encoding = 'latin-1')
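
The guidelines ask for exception handling so the notebook runs in one go. A minimal sketch of a more defensive load is shown here; the helper name `load_superstore` is hypothetical and not part of the original notebook, and the checks are only one possible choice.

```python
from pathlib import Path

import pandas as pd


def load_superstore(csv_path, encoding="latin-1"):
    """Read the CSV at csv_path, raising a clear error if the file is absent.

    Hypothetical helper: the name and the checks are illustrative assumptions.
    """
    path = Path(csv_path)
    if not path.is_file():
        # Fail early with the path so a missing Drive mount is easy to spot.
        raise FileNotFoundError(f"Dataset not found at: {path}")
    frame = pd.read_csv(path, encoding=encoding)
    if frame.empty:
        # A header-only file would otherwise pass silently into the analysis.
        raise ValueError(f"Dataset at {path} contains no rows")
    return frame
```

It could be called with the same Drive path used in the cell above, for example `data = load_superstore("/content/drive/MyDrive/learning with projects/projects/db/Superstore.csv")`.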

### Dataset First View

In [None]:
# Dataset First Look
data.head()

### Dataset Rows & Columns count

In [None]:
# Dataset Rows & Columns count
data.shape

### Dataset Information

In [None]:
# Dataset Info
data.info()

#### Duplicate Values

In [None]:
# Dataset Duplicate Value Count
data.duplicated().sum()

#### Missing Values/Null Values

In [None]:
# Missing Values/Null Values Count
data.isnull().sum()

In [None]:
# Visualizing the missing values
sns.heatmap(data.isnull(), cbar=False)

### What did you know about your dataset?

The dataset contains 9,994 rows and 21 columns. Here’s an overview of the columns:

Order Details: Row ID, Order ID, Order Date, Ship Date, Ship Mode

Customer Details: Customer ID, Customer Name, Segment

Location Details: Country, City, State, Postal Code, Region

Product Details: Product ID, Category, Sub-Category, Product Name

Sales & Profit Metrics: Sales, Quantity, Discount, Profit.

### Key Insights from the Analysis:

Missing Values: There are no missing values in any column.

Sales & Profit Insights:

Sales range from 0.44 dollars to 22,638.48 dollars, with an average of 229.86 dollars.
Profit varies significantly, from -6,599.98 dollars (a major loss) to 8,399.98 dollars (a high profit), with an average of 28.66 dollars.

Discounts:

The discount ranges from 0% to 80%, with most values concentrated around 0% and 20%.

Quantity:

Orders contain between 1 and 14 units, with an average order size of ~3.79 units.
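
Ranges and averages like the ones above can be read from a compact summary table. The sketch below uses a small made-up frame (the values are hypothetical, not the Superstore figures) to show one way of collecting the minimum, maximum and mean of the numeric columns, plus how often each discount level occurs.

```python
import pandas as pd

# Hypothetical rows standing in for the real dataset.
sample = pd.DataFrame({
    "Sales": [10.0, 20.0, 60.0],
    "Profit": [-5.0, 1.0, 10.0],
    "Discount": [0.0, 0.2, 0.2],
    "Quantity": [1, 2, 3],
})

# One row per statistic, one column per metric.
summary = sample[["Sales", "Profit", "Discount", "Quantity"]].agg(["min", "max", "mean"])

# How many rows fall on each discount level, ordered by discount.
discount_counts = sample["Discount"].value_counts().sort_index()

print(summary)
print(discount_counts)
```

On the real data the same two calls would be applied to `data` instead of `sample`.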


## ***2. Understanding Your Variables***

In [None]:
# Dataset Columns
data.columns

In [None]:
# Dataset Describe
data.describe(include='all')

### Variables Description

Answer Here

### Check Unique Values for each variable.

In [None]:
# Check Unique Values for each variable.
for i in data.columns:
  print("No. of unique values in ",i,"is",data[i].nunique())

## 3. ***Data Wrangling***

### Data Wrangling Code

In [None]:
# Change date datatype to datetime
data['Order Date']= pd.to_datetime(data['Order Date'])
data['Ship Date']= pd.to_datetime(data['Ship Date'])
data.info()

In [None]:
# Write your code to make your dataset analysis ready.
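# Extract month, year and day-of-week columns from the order date
data['Order Month']= data['Order Date'].dt.month
data['Order Year']= data['Order Date'].dt.year
# Despite its name, 'Order week' holds the day of the week (0 = Monday, 6 = Sunday)
data['Order week'] = data['Order Date'].dt.dayofweek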
data.info()


### What all manipulations have you done and insights you found?

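The Order Date and Ship Date columns were converted to datetime, and three columns were derived from Order Date: Order Month, Order Year and Order week (the day of the week).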
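
The date conversion in the wrangling step can be made more defensive. The sketch below assumes a month/day/year layout for the date strings, which is an assumption about the CSV and should be checked against the actual file; the sample rows and the two name columns are hypothetical.

```python
import pandas as pd

# Hypothetical order dates; the month/day/year layout is an assumption.
orders = pd.DataFrame({"Order Date": ["11/8/2016", "1/15/2017", "not a date"]})

# errors="coerce" turns unparseable values into NaT instead of raising.
orders["Order Date"] = pd.to_datetime(
    orders["Order Date"], format="%m/%d/%Y", errors="coerce"
)

# Count the rows that failed to parse before deriving calendar fields.
unparsed = orders["Order Date"].isna().sum()

# Readable labels are often easier to plot than month or weekday numbers.
orders["Order Month Name"] = orders["Order Date"].dt.month_name()
orders["Order Day Name"] = orders["Order Date"].dt.day_name()

print(unparsed)
print(orders)
```

Checking `unparsed` first makes it visible when a different date layout is in use, rather than silently producing missing months.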

## ***4. Data Visualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

#### Chart - 1

In [None]:
# Monthly sales analysis
sales_by_month = data.groupby('Order Month')['Sales'].sum().reset_index()
fig = px.line(sales_by_month, x = 'Order Month', y = 'Sales', title= 'Monthly Sales Analysis')
fig.show()

##### 1. Why did you pick the specific chart?

A line chart was chosen because it shows the monthly trend of sales.

##### 2. What is/are the insight(s) found from the chart?

Sales are highest in November and lowest in January.

##### 3. Will the gained insights help creating a positive business impact?


Yes. Offers can be run during non-festive periods such as January to improve sales.
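
The highest and lowest months can also be read off programmatically rather than from the chart. A minimal sketch, using made-up monthly totals in place of the real `sales_by_month` frame:

```python
import pandas as pd

# Hypothetical monthly totals, not the real Superstore figures.
sales_by_month = pd.DataFrame({
    "Order Month": [1, 2, 11, 12],
    "Sales": [100.0, 150.0, 400.0, 350.0],
})

# Index by month so idxmax/idxmin return the month number directly.
monthly_sales = sales_by_month.set_index("Order Month")["Sales"]
best_month = monthly_sales.idxmax()
worst_month = monthly_sales.idxmin()

print("Highest sales in month:", best_month)
print("Lowest sales in month:", worst_month)
```

The same pattern applies to the monthly profit frame by swapping the column name.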

#### Chart - 2

In [None]:
# Chart - 2 visualization code
# Sales by Category
sales_by_category = data.groupby('Category')['Sales'].sum().reset_index()
fig = px.pie(sales_by_category,
             values = 'Sales',
             names = 'Category',
             hole= 0.4,
             color_discrete_sequence = px.colors.qualitative.Pastel )


fig.update_traces(textposition = 'inside', textinfo = 'percent+label')
fig.update_layout(title_text = 'Sales by Category')
fig.show()

##### 1. Why did you pick the specific chart?

A pie chart was chosen because it shows each category's percentage share of sales.

##### 2. What is/are the insight(s) found from the chart?

Technology has the highest sales, followed by Furniture, with Office Supplies at the bottom.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

The quality of Office Supplies products should be improved, and the quantity of Technology products increased.
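
The percentages drawn by the pie chart can be computed explicitly, which makes the ranking easy to quote. This is a sketch on hypothetical totals; the `Share (%)` column name is an illustrative choice.

```python
import pandas as pd

# Hypothetical category totals, not the real Superstore figures.
sales_by_category = pd.DataFrame({
    "Category": ["Furniture", "Office Supplies", "Technology"],
    "Sales": [30.0, 20.0, 50.0],
})

# Each category's share of total sales, as a percentage.
total_sales = sales_by_category["Sales"].sum()
sales_by_category["Share (%)"] = (sales_by_category["Sales"] / total_sales * 100).round(1)

# Rank categories from highest to lowest sales.
ranked = sales_by_category.sort_values("Sales", ascending=False).reset_index(drop=True)
print(ranked)
```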

#### Chart - 3

In [None]:
# Chart - 3 visualization code
#Sales analysis by subcategory
sales_by_subcategory = data.groupby('Sub-Category')['Sales'].sum().reset_index()
fig = px.bar(sales_by_subcategory, x = 'Sub-Category', y = 'Sales', title = 'Sales by Sub-Category')
fig.show()

##### 1. Why did you pick the specific chart?

A bar chart was chosen to visualize which sub-categories have the highest and lowest sales.

##### 2. What is/are the insight(s) found from the chart?

Phones, Chairs and Tables are among the highest-selling sub-categories, while Fasteners, Labels and Envelopes are the lowest-selling sub-categories.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Quality should be improved and offers applied to raise sales of the lowest-selling sub-categories.
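
With many sub-categories on one bar chart, the extremes are easier to confirm from a sorted table. A minimal sketch with made-up totals, using pandas `nlargest` and `nsmallest`:

```python
import pandas as pd

# Hypothetical sub-category totals, not the real Superstore figures.
sales_by_subcategory = pd.DataFrame({
    "Sub-Category": ["Phones", "Chairs", "Tables", "Labels", "Fasteners", "Envelopes"],
    "Sales": [300.0, 280.0, 200.0, 12.0, 3.0, 16.0],
})

# Three best and three worst sellers by total sales.
top_three = sales_by_subcategory.nlargest(3, "Sales")
bottom_three = sales_by_subcategory.nsmallest(3, "Sales")

print(top_three)
print(bottom_three)
```

Passing a pre-sorted frame to `px.bar` would also order the bars, which tends to make the chart easier to read.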

#### Chart - 4

In [None]:
# Chart - 4 visualization code
# Monthly profit analysis
profit_by_month = data.groupby('Order Month')['Profit'].sum().reset_index()
fig = px.line(profit_by_month, x = 'Order Month', y = 'Profit', title = 'Monthly profit analysis')
fig.show()

##### 1. Why did you pick the specific chart?

A line chart was chosen to show the trend of monthly profit.

##### 2. What is/are the insight(s) found from the chart?

Profit is highest in December and lowest in January.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Launching offers in January is important to increase that month's sales and thereby improve profit.
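
Because `Order Month` pools the same calendar month across all years, it can be worth checking whether the pattern holds in each year separately. The sketch below builds a month-by-year table from hypothetical rows; the dates and profit values are made up for illustration.

```python
import pandas as pd

# Hypothetical order lines, not the real Superstore figures.
orders = pd.DataFrame({
    "Order Date": pd.to_datetime(
        ["2016-01-05", "2016-12-20", "2017-01-10", "2017-12-02", "2017-12-15"]
    ),
    "Profit": [10.0, 50.0, -5.0, 30.0, 20.0],
})
orders["Order Year"] = orders["Order Date"].dt.year
orders["Order Month"] = orders["Order Date"].dt.month

# Rows are calendar months, columns are years, cells are total profit.
profit_by_year_month = orders.pivot_table(
    index="Order Month", columns="Order Year", values="Profit", aggfunc="sum"
)
print(profit_by_year_month)
```

If one month is weakest in every column, the seasonal conclusion is on firmer ground than when it rests on the pooled total alone.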

#### Chart - 5

In [None]:
# Chart - 5 visualization code
# Profit by Category
profit_by_category= data.groupby('Category')['Profit'].sum().reset_index()
fig = px.pie(profit_by_category,
             values = 'Profit',
             names = 'Category',
             hole= 0.4,
             color_discrete_sequence = px.colors.qualitative.Pastel )
fig.update_traces(textposition = 'inside', textinfo = 'percent+label')
fig.update_layout(title_text = 'Profit by Category')
fig.show()

##### 1. Why did you pick the specific chart?

A pie chart was chosen to show each category's share of profit.

##### 2. What is/are the insight(s) found from the chart?

The Technology category yields the highest profit, while the Furniture category yields the lowest.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Technology yields a higher profit than the other categories, while Furniture yields the lowest, making it the category most likely to hold back growth.

#### Profit by Sub-Category

In [None]:
profit_by_subcategory = data.groupby('Sub-Category')['Profit'].sum().reset_index()
fig = px.bar(profit_by_subcategory, x = 'Sub-Category', y = 'Profit', title = 'Profit by Sub-Category')
fig.show()

##### 1. Why did you pick the specific chart?

A bar chart was chosen to show the profit or loss of each sub-category.

##### 2. What is/are the insight(s) found from the chart?

Copiers give the maximum profit, while Tables make a loss.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Sub-categories such as Copiers and Phones generate a profit, while sub-categories such as Tables and Bookcases run at a loss.
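
The loss-making sub-categories can be isolated with a simple filter on the aggregated profit. A sketch on hypothetical totals:

```python
import pandas as pd

# Hypothetical sub-category profit totals, not the real Superstore figures.
profit_by_subcategory = pd.DataFrame({
    "Sub-Category": ["Copiers", "Phones", "Bookcases", "Tables"],
    "Profit": [55.0, 44.0, -3.0, -17.0],
})

# Keep only sub-categories with negative total profit, worst first.
loss_makers = profit_by_subcategory[profit_by_subcategory["Profit"] < 0].sort_values("Profit")
total_loss = loss_makers["Profit"].sum()

print(loss_makers)
print("Combined loss:", total_loss)
```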

#### Chart - 6 : Profit by customer segment

In [None]:
# Chart - 6 visualization code
profit_by_segment = data.groupby('Segment').agg({'Sales':'sum', 'Profit':'sum'}).reset_index()
color_palette = colors.qualitative.Pastel
fig = go.Figure()
fig.add_trace(go.Bar(
    x = profit_by_segment['Segment'],
    y = profit_by_segment['Sales'],
    name = 'Sales',
    marker_color = color_palette[0]
))
fig.add_trace(go.Bar(
    x = profit_by_segment['Segment'],
    y = profit_by_segment['Profit'],
    name = 'Profit',
    marker_color = color_palette[1]
))
fig.update_layout(barmode = 'group',title = 'Sales and Profit by Customer Segment', xaxis_title = 'Customer Segment', yaxis_title = 'Amount')
fig.show()

##### 1. Why did you pick the specific chart?

A grouped bar chart was chosen to compare sales and profit across customer segments.

##### 2. What is/are the insight(s) found from the chart?

Sales and profit are highest for the Consumer segment and lowest for the Home Office segment.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

The Consumer segment generates the highest sales and profit, while Home Office generates the lowest.
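
Total profit and profit margin are different measures, so a segment with the largest profit does not necessarily have the best margin. The sketch below computes margin as profit divided by sales on hypothetical segment totals; the `Profit Margin (%)` column name is an illustrative choice.

```python
import pandas as pd

# Hypothetical segment totals, not the real Superstore figures.
profit_by_segment = pd.DataFrame({
    "Segment": ["Consumer", "Corporate", "Home Office"],
    "Sales": [1000.0, 600.0, 400.0],
    "Profit": [100.0, 90.0, 80.0],
})

# Margin: the share of each sales dollar that is kept as profit.
profit_by_segment["Profit Margin (%)"] = (
    profit_by_segment["Profit"] / profit_by_segment["Sales"] * 100
).round(1)

print(profit_by_segment)
```

In this made-up example the segment with the largest profit has the smallest margin, which is why the two are worth reporting separately.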

#### Chart - 7 : Sales to profit ratio

In [None]:
# Chart -7 visualization code

sales_profit_by_segment = data.groupby('Segment').agg({'Sales':'sum', 'Profit':'sum'}).reset_index()
sales_profit_by_segment['Sales to Profit Ratio'] = sales_profit_by_segment['Sales']/sales_profit_by_segment['Profit']
fig = px.bar(sales_profit_by_segment, x = 'Segment', y = 'Sales to Profit Ratio', title = 'Sales to Profit Ratio by Customer Segment')
fig.show()

##### 1. Why did you pick the specific chart?

Choose bar chart to show the sales to profit ratio by Customer Segment.

##### 2. What is/are the insight(s) found from the chart?

The Consumer segment has the highest sales-to-profit ratio, while the Home Office segment has the lowest.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

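Because this ratio is sales divided by profit, a higher value means more sales are needed for each dollar of profit. The Consumer segment's highest ratio therefore marks it as the least efficient at converting sales into profit, while Home Office, with the lowest ratio, is the most efficient.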
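
The ratio divides by profit, so a segment with zero total profit would produce an infinite value. A minimal sketch of one way to guard against that, on hypothetical totals (the segment labels and numbers are made up):

```python
import numpy as np
import pandas as pd

# Hypothetical segment totals; segment "B" has zero profit on purpose.
totals = pd.DataFrame({
    "Segment": ["A", "B", "C"],
    "Sales": [100.0, 80.0, 50.0],
    "Profit": [20.0, 0.0, 10.0],
})

# Replace zero profit with NaN so the division yields NaN rather than inf.
safe_profit = totals["Profit"].replace(0, np.nan)
totals["Sales to Profit Ratio"] = totals["Sales"] / safe_profit

print(totals)
```

A negative total profit would still give a negative ratio, which is best read separately from the positive values.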

## **5. Solution to Business Objective**

#### What do you suggest the client to achieve Business Objective ?


Based on the charts above, the following actions are suggested:

*   Run offers in non-festive months such as January, when sales and profit are lowest.
*   Keep the focus on the Technology category, which leads in both sales and profit, and review Furniture, where profit is low.
*   Improve quality and apply offers for the lowest-selling sub-categories (Fasteners, Labels and Envelopes).
*   Review pricing and discounts for loss-making sub-categories such as Tables and Bookcases.
*   Use the sales-to-profit ratio by customer segment to guide pricing and discount strategies.

# **Conclusion**

The data-driven insights from this analysis provide a roadmap for enhancing sales, maximizing profits, and optimizing business operations. By implementing the recommended strategies, the store can improve its revenue streams, reduce losses, and make more informed business decisions. Regular monitoring and periodic analysis should be conducted to adapt to changing market trends and consumer preferences.

### ***Hurrah! I have successfully completed EDA Capstone Project !!!***