# Project Name - Customer Support Performance Analysis (EDA)

# **Project Summary -**

# **GitHub Link -**

https://github.com/BarathJr/DATA-SCIENTIST-EDA-1-

# **Problem Statement**


The company receives thousands of customer support requests daily but lacks insights into what drives customer satisfaction or dissatisfaction.

#### **Define Your Business Objective?**

The primary objective is to analyze customer support interactions to identify key factors affecting Customer Satisfaction (CSAT) and agent performance. By uncovering trends in issue types, resolution times, and shift-wise performance, the goal is to provide actionable insights that can help the business optimize support operations, enhance customer experience, and improve team efficiency.

# **General Guidelines** : -  

1.   Well-structured, formatted, and commented code is required.
2.   Exception Handling, Production Grade Code & Deployment Ready Code will be a plus. Those students will be awarded some additional credits.
     
     The additional credits will have advantages over other students during Star Student selection.
       
             [ Note: - Deployment Ready Code is defined as, the whole .ipynb notebook should be executable in one go
                       without a single error logged. ]

3.   Each and every logic should have proper comments.
4. You may add as many charts as you want. Make sure that, for each chart, the following format is answered.
        

```
# Chart visualization code
```
            

*   Why did you pick the specific chart?
*   What is/are the insight(s) found from the chart?
* Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

5. You have to create at least 20 logical & meaningful charts having important insights.


[ Hints : - Do the Visualization in a structured way while following "UBM" Rule.

U - Univariate Analysis,

B - Bivariate Analysis (Numerical - Categorical, Numerical - Numerical, Categorical - Categorical)

M - Multivariate Analysis
 ]





# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [None]:
# Import Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score

### Dataset Loading

In [None]:
# Load Dataset
df = pd.read_csv(r"/content/flipkart_com-ecommerce_sample.csv")
df.head()
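Since the guidelines ask for exception handling, the load step could be wrapped in a small helper. The sketch below is only an illustration: `load_csv_safely` is a hypothetical name, not part of the original notebook, and it handles just two common failure modes.

```python
import pandas as pd


def load_csv_safely(path):
    """Hypothetical helper: return a DataFrame, or None if the file cannot be read."""
    try:
        return pd.read_csv(path)
    except FileNotFoundError:
        # The path does not exist in the current environment
        print(f"File not found: {path}")
    except pd.errors.EmptyDataError:
        # The file exists but contains no parsable data
        print(f"File is empty: {path}")
    return None


# A path that is assumed not to exist, to show the failure branch
result = load_csv_safely("this_file_is_assumed_missing.csv")
print(result)
```

Returning `None` keeps the notebook running instead of stopping with a traceback; later cells would then need to check for `None` before using the result.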

### Dataset First View

In [None]:
# Dataset First Look
# The dataset is already loaded above; print the preview explicitly,
# because only the last expression of a cell is displayed automatically
print(df.head())
df.info()

### Dataset Rows & Columns count

In [None]:
# Dataset Rows & Columns count
print(f"Dataset contains {df.shape[0]} rows and {df.shape[1]} columns.")

### Dataset Information

In [None]:
# Dataset Info
df.info()

#### Duplicate Values

In [None]:
# Dataset Duplicate Value Count
duplicate_count = df.duplicated().sum()
print(f"Number of duplicate rows: {duplicate_count}")


#### Missing Values/Null Values

In [None]:
# Missing Values/Null Values Count
df.isnull().sum()

In [None]:
# Visualizing the missing values
plt.figure(figsize=(10, 6))
sns.heatmap(df.isnull(), cbar=False, cmap='viridis', yticklabels=False)
plt.title("Missing Values Heatmap")
plt.show()
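A per-column table of missing counts and percentages can complement the heatmap, because it makes the gaps easy to rank. The sketch below uses a small made-up frame (`demo`) rather than the project dataset, so the values are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical demo data; in the notebook, `demo` would be replaced by `df`
demo = pd.DataFrame({
    "brand": ["A", None, "B", None],
    "retail_price": [100.0, 250.0, np.nan, 80.0],
    "product_name": ["x", "y", "z", "w"],
})

# Count and percentage of missing values per column, worst column first
missing_summary = pd.DataFrame({
    "missing_count": demo.isnull().sum(),
    "missing_pct": (demo.isnull().mean() * 100).round(1),
}).sort_values("missing_pct", ascending=False)

print(missing_summary)
```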

### What did you know about your dataset?

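The dataset contains 85k+ records of customer support interactions, collected from both inbound and outbound channels. Each record includes detailed information such as the channel, issue category and sub-category, customer remarks, order details, and the issue reporting and response timestamps.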

## ***2. Understanding Your Variables***

In [None]:
# Dataset Columns
print("Dataset Columns:")
print(df.columns.tolist())

In [None]:
# Dataset Describe (summary statistics for the numeric columns)
df.describe()

### Variables Description

*   Unique id
*   channel_name
*   category
*   Sub-category
*   Customer Remarks
*   Order_id
*   order_date_time
*   Issue_reported at
*   issue_responded
*   Survey_response_Date
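The two timestamp variables listed above, `Issue_reported at` and `issue_responded`, lend themselves to a derived response-time feature. The following is a minimal sketch on made-up rows; the timestamp format and the new column name `response_minutes` are assumptions.

```python
import pandas as pd

# Hypothetical sample rows; the real timestamp format may differ
demo = pd.DataFrame({
    "Issue_reported at": ["2023-08-01 11:13", "2023-08-01 12:52", "2023-08-01 20:16"],
    "issue_responded": ["2023-08-01 11:47", "2023-08-01 12:54", "2023-08-01 20:38"],
})

# Parse both columns; values that cannot be parsed become NaT
for col in ["Issue_reported at", "issue_responded"]:
    demo[col] = pd.to_datetime(demo[col], errors="coerce")

# Derived feature: minutes between the report and the first response
demo["response_minutes"] = (
    (demo["issue_responded"] - demo["Issue_reported at"]).dt.total_seconds() / 60
)

print(demo["response_minutes"].tolist())
```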


### Check Unique Values for each variable.

In [None]:
# Check Unique Values for each variable.
for col in df.columns:
    print(f"{col}: {df[col].nunique()} unique values")

## 3. ***Data Wrangling***

### Data Wrangling Code

In [None]:
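# Write your code to make your dataset analysis ready.
# Remove exact duplicate rows
df = df.drop_duplicates()
print("Duplicates removed.")
# Fill missing numeric values with the column median
df = df.fillna(df.median(numeric_only=True))
# Drop columns that are entirely empty
df = df.dropna(axis=1, how='all')
# Parse the timestamp column if present (unparseable values become NaT)
if 'crawl_timestamp' in df.columns:
    df['crawl_timestamp'] = pd.to_datetime(df['crawl_timestamp'], errors='coerce')
# One-hot encode only low-cardinality text columns; encoding free-text columns
# such as product names would create one new column per distinct value
low_card_cols = [c for c in df.select_dtypes(include='object').columns if df[c].nunique() <= 10]
if low_card_cols:
    df = pd.get_dummies(df, columns=low_card_cols, drop_first=True)
print("Dataset cleaned and encoded. New shape:", df.shape)
df.head()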


### What all manipulations have you done and insights you found?

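*   Handled missing values: removed duplicate rows, filled missing numeric values with the column median, and dropped columns that were entirely empty.
*   Converted the date column to datetime.
*   Created features by one-hot encoding categorical columns.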
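The three manipulations can be illustrated on a tiny frame. This is a sketch under stated assumptions: `demo`, its column names, and the derived `month` feature are made up for illustration and are not the project data.

```python
import numpy as np
import pandas as pd

# Hypothetical demo data with one missing price and one invalid timestamp
demo = pd.DataFrame({
    "price": [100.0, np.nan, 300.0],
    "timestamp": ["2016-03-25 22:59:23", "not a date", "2016-06-01 08:00:00"],
})

# 1. Missing values: fill numeric gaps with the column median
demo["price"] = demo["price"].fillna(demo["price"].median())

# 2. Date conversion: invalid strings become NaT instead of raising an error
demo["timestamp"] = pd.to_datetime(demo["timestamp"], errors="coerce")

# 3. Feature creation: extract the month from the parsed timestamp
demo["month"] = demo["timestamp"].dt.month

print(demo)
```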

## ***4. Data Visualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

#### Chart - 1

In [None]:


# Load the dataset
df = pd.read_csv("flipkart_com-ecommerce_sample.csv")
# Keep rows that have both a product name and a retail price
df = df.dropna(subset=['product_name', 'retail_price'])
# Ensure the price is numeric; unparseable values become NaN and are dropped
df['retail_price'] = pd.to_numeric(df['retail_price'], errors='coerce')
df = df.dropna(subset=['retail_price'])
# Total retail price per product, keeping the 10 largest totals
sales_by_product = df.groupby('product_name')['retail_price'].sum().sort_values(ascending=False).head(10)
# Bar chart of the top 10 products
plt.figure(figsize=(12, 6))
sns.barplot(x=sales_by_product.index, y=sales_by_product.values, palette="Blues_d")
plt.title("Top 10 Products by Total Retail Price")
plt.xlabel("Product Name")
plt.ylabel("Total Retail Price")
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.show()


##### 1. Why did you pick the specific chart?

A bar plot was chosen because it makes it easy to compare the total retail price across the top 10 products.

##### 2. What is/are the insight(s) found from the chart?

A few products account for most of the total retail price, while several others contribute relatively little. This indicates which products are likely to be the main revenue drivers.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the insights can create a strong business impact. High-performing products can be prioritized in marketing campaigns, inventory planning, and customer promotions.
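How strongly a few products dominate can be quantified with a cumulative share. The sketch below uses made-up totals (`totals`) and an 80% cut-off chosen purely for illustration.

```python
import pandas as pd

# Hypothetical total retail price per product, largest first
totals = pd.Series(
    {"P1": 500.0, "P2": 250.0, "P3": 150.0, "P4": 60.0, "P5": 40.0}
).sort_values(ascending=False)

# Share of the grand total per product and the running (cumulative) share
share = totals / totals.sum()
cumulative_share = share.cumsum()

# Number of top products needed to reach at least 80% of the total
n_for_80 = int((cumulative_share < 0.80).sum()) + 1

print(cumulative_share)
print(f"Products needed for 80% of the total: {n_for_80}")
```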

#### Chart - 2

In [None]:
df = pd.read_csv("flipkart_com-ecommerce_sample.csv")
df['crawl_timestamp'] = pd.to_datetime(df['crawl_timestamp'], errors='coerce')
df = df.dropna(subset=['crawl_timestamp', 'discounted_price'])
df['discounted_price'] = pd.to_numeric(df['discounted_price'], errors='coerce')
df['crawl_date'] = df['crawl_timestamp'].dt.date
sales_over_time = df.groupby('crawl_date')['discounted_price'].sum().reset_index()
plt.figure(figsize=(12, 6))
sns.lineplot(data=sales_over_time, x='crawl_date', y='discounted_price', marker='o', color='teal')
plt.title("Sales Trend Over Time (Based on Discounted Price)")
plt.xlabel("Date")
plt.ylabel("Total Discounted Price")
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

A line plot was chosen because it shows trends over time.

##### 2. What is/are the insight(s) found from the chart?

The line plot reveals fluctuations in the total discounted price over time. There are noticeable spikes during certain periods, possibly around month-ends, holidays, or promotional campaigns, indicating increased activity during those times.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, these insights can positively impact the business.

*   By identifying peak periods, the company can allocate more agents and prepare FAQs in advance.
*   If spikes correlate with specific product launches or sales, the team can take preventive measures to minimize post-purchase confusion or delivery issues.
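One simple way to flag peak periods is to compare each day against a multiple of the median. The rule below (more than twice the median) is an assumption chosen for illustration, and `daily` holds made-up values.

```python
import pandas as pd

# Hypothetical daily totals with one obvious peak
daily = pd.Series(
    [100.0, 110.0, 95.0, 105.0, 400.0, 98.0, 102.0],
    index=pd.date_range("2016-01-01", periods=7, freq="D"),
)

# Flag days whose total exceeds twice the median (an illustrative threshold)
threshold = 2 * daily.median()
spikes = daily[daily > threshold]

print(spikes)
```

The median is used instead of the mean because a single large spike barely moves it, so the threshold stays stable.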

#### Chart - 3

In [None]:
df = pd.read_csv("flipkart_com-ecommerce_sample.csv")
df = df.dropna(subset=['brand', 'discounted_price'])
df['discounted_price'] = pd.to_numeric(df['discounted_price'], errors='coerce')
df = df.dropna(subset=['discounted_price'])
sales_by_brand = df.groupby('brand')['discounted_price'].sum().sort_values(ascending=False).head(8)  # Top 8 brands
plt.figure(figsize=(8, 8))
plt.pie(sales_by_brand.values, labels=sales_by_brand.index, autopct='%1.1f%%', startangle=140, colors=sns.color_palette('pastel'))
plt.title("Sales Distribution by Brand (Based on Discounted Price)")
plt.axis('equal')
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

A pie chart was chosen to show how the total discounted price is distributed across the top 8 brands.

##### 2. What is/are the insight(s) found from the chart?

The pie chart highlights the proportion of the total discounted price contributed by each brand. It shows that some brands outperform others in terms of sales volume.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, these insights are valuable. High-performing brands can serve as benchmarks to improve the others.

#### Chart - 4

In [None]:
df = pd.read_csv("flipkart_com-ecommerce_sample.csv")
df = df.dropna(subset=['product_name', 'discounted_price'])
df['discounted_price'] = pd.to_numeric(df['discounted_price'], errors='coerce')
df = df.dropna(subset=['discounted_price'])
top_items = df['product_name'].value_counts().head(10).index
filtered_df = df[df['product_name'].isin(top_items)]
plt.figure(figsize=(12, 6))
sns.boxplot(x='product_name', y='discounted_price', data=filtered_df, palette='Set2')
plt.title("Sales Distribution per Product (Discounted Price)")
plt.xlabel("Product Name")
plt.ylabel("Discounted Price")
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

A box plot was chosen to show the distribution of discounted price per product.


##### 2. What is/are the insight(s) found from the chart?

The box plot reveals how discounted prices vary across different products. Each box shows the interquartile range (IQR), where the middle 50% of the values fall.
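The IQR and the usual 1.5 × IQR outlier fences can also be computed directly. This is a minimal sketch on made-up prices; the 1.5 multiplier matches the default whisker length in seaborn's `boxplot`.

```python
import pandas as pd

# Hypothetical discounted prices for one product
prices = pd.Series([100, 120, 130, 150, 160, 180, 900])

# Quartiles and interquartile range
q1 = prices.quantile(0.25)
q3 = prices.quantile(0.75)
iqr = q3 - q1

# Values beyond 1.5 * IQR from the quartiles are treated as outliers
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
outliers = prices[(prices < lower) | (prices > upper)]

print(f"Q1={q1}, Q3={q3}, IQR={iqr}")
print(outliers.tolist())
```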

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the insights are useful for business decisions. Products with high price variability may need inventory adjustments or better demand forecasting.

#### Chart - 5

In [None]:
df = pd.read_csv("flipkart_com-ecommerce_sample.csv")
df = df.dropna(subset=['brand', 'product_category_tree', 'discounted_price'])
df['discounted_price'] = pd.to_numeric(df['discounted_price'], errors='coerce')
df = df.dropna(subset=['discounted_price'])
df['category_main'] = df['product_category_tree'].apply(lambda x: str(x).split('>>')[0].replace('[\"', '').replace('\"', '').strip())
top_brands = df['brand'].value_counts().head(8).index
top_categories = df['category_main'].value_counts().head(6).index
filtered_df = df[df['brand'].isin(top_brands) & df['category_main'].isin(top_categories)]
pivot_table = filtered_df.pivot_table(values='discounted_price', index='brand', columns='category_main', aggfunc='mean')
plt.figure(figsize=(12, 6))
sns.heatmap(pivot_table, annot=True, fmt=".1f", cmap='YlGnBu')
plt.title("Average Discounted Price by Brand and Category")
plt.xlabel("Product Category")
plt.ylabel("Brand")
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

A heatmap was chosen to show the average discounted price for each brand and category combination.

##### 2. What is/are the insight(s) found from the chart?

The heatmap shows how the average discounted price varies across brand and category combinations. Some brands are priced higher in specific categories, while others have lower averages across the board.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, these insights are valuable.

*   Brands can focus on promoting high-performing categories to increase revenue.
*   Underperforming combinations may point to inventory mismatches, demand gaps, or brand-specific issues that need attention.

#### Chart - 6

In [None]:
df = pd.read_csv("flipkart_com-ecommerce_sample.csv")
df['discounted_price'] = pd.to_numeric(df['discounted_price'], errors='coerce')
df = df.dropna(subset=['discounted_price'])
plt.figure(figsize=(10, 6))
sns.histplot(df['discounted_price'], bins=30, kde=True, color='coral')
plt.title("Discounted Price Distribution")
plt.xlabel("Discounted Price")
plt.ylabel("Frequency")
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

A histogram with a KDE curve was chosen to show the distribution of discounted prices.

##### 2. What is/are the insight(s) found from the chart?

The histogram shows the distribution of discounted prices across all records. Most values are concentrated in the lower range, with fewer high-value records appearing as a long tail on the right. This indicates a right-skewed distribution, which is common in retail data.
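Right skew can be checked numerically as well as visually. The sketch below uses made-up prices and shows two common signs: a positive skewness coefficient and a mean above the median. The `log1p` transform is included as one possible way to reduce the skew, not as the notebook's method.

```python
import numpy as np
import pandas as pd

# Hypothetical prices: many low values and a few very high ones
prices = pd.Series(
    [199, 249, 299, 349, 399, 499, 799, 1499, 4999, 12999], dtype=float
)

# Positive skewness indicates a long right tail
raw_skew = prices.skew()

# A log transform compresses the tail and usually lowers the skewness
log_skew = np.log1p(prices).skew()

print(f"mean={prices.mean():.0f}, median={prices.median():.0f}")
print(f"skew (raw)={raw_skew:.2f}, skew (log1p)={log_skew:.2f}")
```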

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, these insights help drive business strategy.

*   The business can identify which price range drives the most volume and focus on bundling or upselling within that range.
*   High-value sales, though fewer, might contribute significantly to revenue and may warrant special attention or premium packaging.



#### Chart - 7

In [None]:
df = pd.read_csv("flipkart_com-ecommerce_sample.csv")
df = df.dropna(subset=['product_rating', 'discounted_price', 'brand'])
# Convert to numeric; non-numeric entries become NaN and are dropped below
df['discounted_price'] = pd.to_numeric(df['discounted_price'], errors='coerce')
df['product_rating'] = pd.to_numeric(df['product_rating'], errors='coerce')
df = df.dropna(subset=['discounted_price', 'product_rating'])
# Keep only the most frequent brands so the hue legend stays readable
top_brands = df['brand'].value_counts().head(8).index
plot_df = df[df['brand'].isin(top_brands)]
plt.figure(figsize=(10, 6))
sns.scatterplot(data=plot_df, x='product_rating', y='discounted_price', hue='brand', alpha=0.7)
plt.title("Discounted Price vs Product Rating")
plt.xlabel("Product Rating")
plt.ylabel("Discounted Price")
plt.legend(title="Brand", bbox_to_anchor=(1.05, 1), loc='upper left')
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

A scatter plot was chosen to examine the relationship between product rating and discounted price.

##### 2. What is/are the insight(s) found from the chart?

The scatter plot visualizes the relationship between product rating and discounted price, with each point representing a product and colors representing brands.

*   An upward trend in the points would indicate that higher-rated products tend to sell at higher prices.
*   Some products cluster tightly at low prices, suggesting low-value items.
*   Outliers are visible: products with a high discounted price, which could indicate premium-priced items.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Positive business impact:

*   Helps identify which products command a high price per unit and which sit at the low-price end.
*   Supports pricing and bundling decisions: low-priced products with high ratings may need better pricing or bundling.
*   Can assist in inventory planning, especially for highly rated products.

#### Chart - 8

In [None]:
df = pd.read_csv("flipkart_com-ecommerce_sample.csv")
numeric_cols = df.select_dtypes(include='number').columns.tolist()
numeric_df = df[numeric_cols].dropna()
sample_df = numeric_df.sample(n=200, random_state=1) if len(numeric_df) > 200 else numeric_df
sns.pairplot(sample_df)
plt.suptitle("Pairwise Relationships Between Numeric Features", y=1.02)
plt.show()

##### 1. Why did you pick the specific chart?

A pair plot was chosen to examine the relationships between the numeric features.

##### 2. What is/are the insight(s) found from the chart?

The pair plot visualizes the relationships between all numeric variables (for example, retail_price and discounted_price) as a scatter plot for every pair, with histograms along the diagonal.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the insights are useful.

*   They help identify strong or weak dependencies between variables, supporting feature selection for predictive models.
*   Understanding these relationships can drive efficiency improvements.

#### Chart - 9

In [None]:
df = pd.read_csv("flipkart_com-ecommerce_sample.csv")
df['crawl_timestamp'] = pd.to_datetime(df['crawl_timestamp'], errors='coerce')
df['discounted_price'] = pd.to_numeric(df['discounted_price'], errors='coerce')
df = df.dropna(subset=['crawl_timestamp', 'discounted_price'])
df['Month'] = df['crawl_timestamp'].dt.strftime('%B')
df['Month_num'] = df['crawl_timestamp'].dt.month
monthly_sales = df.groupby(['Month', 'Month_num'])['discounted_price'].mean().reset_index().sort_values('Month_num')
plt.figure(figsize=(10, 6))
sns.barplot(data=monthly_sales, x='Month', y='discounted_price', palette='coolwarm')
plt.title("Average Discounted Price by Month")
plt.xlabel("Month")
plt.ylabel("Average Discounted Price")
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

A bar plot was chosen to compare the average discounted price by month.

##### 2. What is/are the insight(s) found from the chart?

The bar plot displays how the average discounted price varies month by month across the dataset.

*   Some months show significantly higher averages than others, suggesting seasonal patterns or promotional impacts.
*   A few months have a noticeable dip, which may relate to off-season behavior or operational challenges.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, these insights support strategic planning.

*   The business can time marketing campaigns, discounts, or stock planning around high-performing months.
*   Lower-sales months may indicate an opportunity to run targeted promotions to stimulate demand.
*   Seasonal peaks and drops help in forecasting revenue and managing resources efficiently.

#### Chart - 10

In [None]:
df = pd.read_csv("flipkart_com-ecommerce_sample.csv")
df = df.dropna(subset=['brand', 'product_category_tree', 'discounted_price'])
df['discounted_price'] = pd.to_numeric(df['discounted_price'], errors='coerce')
df = df.dropna(subset=['discounted_price'])
df['category_main'] = df['product_category_tree'].apply(lambda x: str(x).split('>>')[0].replace('[\"', '').replace('\"', '').strip())
top_brands = df['brand'].value_counts().head(6).index
top_categories = df['category_main'].value_counts().head(5).index
filtered_df = df[df['brand'].isin(top_brands) & df['category_main'].isin(top_categories)]
stacked_data = filtered_df.pivot_table(
    values='discounted_price',
    index='brand',
    columns='category_main',
    aggfunc='sum'
).fillna(0)
stacked_data.plot(kind='bar', stacked=True, figsize=(12, 6), colormap='tab20')
plt.title("Total Discounted Price by Brand (Stacked by Category)")
plt.xlabel("Brand")
plt.ylabel("Total Discounted Price")
plt.legend(title="Category", bbox_to_anchor=(1.05, 1), loc='upper left')
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

A stacked bar chart was chosen to show the total discounted price by brand, stacked by category.

##### 2. What is/are the insight(s) found from the chart?

The stacked bar plot breaks down the total discounted price per brand, with each bar stacked by product category. Key insights include:

*   Some brands have higher totals, indicating better overall performance.
*   The composition varies by brand, suggesting that certain categories perform better for specific brands.
*   A few brands show balanced contributions across categories, while others are driven heavily by one or two categories.
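The composition of each bar can be made explicit by converting the totals into row-wise percentage shares. The sketch below uses a made-up pivot table (`totals`) with hypothetical brand and category names.

```python
import pandas as pd

# Hypothetical totals: rows are brands, columns are categories
totals = pd.DataFrame(
    {
        "Clothing": [600.0, 100.0],
        "Jewellery": [200.0, 300.0],
        "Footwear": [200.0, 100.0],
    },
    index=["BrandA", "BrandB"],
)

# Divide each row by its own total to get the percentage mix per brand
share = totals.div(totals.sum(axis=1), axis=0) * 100

# The category with the largest share for each brand
dominant = share.idxmax(axis=1)

print(share.round(1))
print(dominant)
```

A brand whose largest share is close to 100% depends on a single category, while shares of similar size indicate a balanced mix.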

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the insights can drive strategic improvements.

*   Helps in brand-level merchandising: placing high-performing categories more prominently where they work best.
*   Enables custom inventory planning based on brand-specific demand patterns.
*   Supports targeted marketing and offers tailored to customer preferences.

#### Chart - 11

In [None]:
df = pd.read_csv("flipkart_com-ecommerce_sample.csv")
df = df.dropna(subset=['product_name', 'discounted_price'])
df['discounted_price'] = pd.to_numeric(df['discounted_price'], errors='coerce')
df = df.dropna(subset=['discounted_price'])
top_items = df['product_name'].value_counts().head(10).index
filtered_df = df[df['product_name'].isin(top_items)]
plt.figure(figsize=(12, 6))
sns.violinplot(data=filtered_df, x='product_name', y='discounted_price', palette='Spectral')
plt.title("Discounted Price Distribution per Product")
plt.xlabel("Product Name")
plt.ylabel("Discounted Price")
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

A violin plot was chosen to show the distribution of discounted price per product.

##### 2. What is/are the insight(s) found from the chart?

The violin plot provides a detailed view of the distribution of discounted prices for each product, combining the benefits of a box plot with a kernel density estimate. Certain products show wider distributions, indicating more variability.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, this chart supports smarter pricing and promotion strategies.

*   Products with wide and high distribution ranges may represent premium or versatile products that can be marketed flexibly.
*   Consistently low or tightly packed distributions may indicate commoditized products where pricing optimization is limited.

#### Chart - 12

In [None]:
df = pd.read_csv("flipkart_com-ecommerce_sample.csv")
df['crawl_timestamp'] = pd.to_datetime(df['crawl_timestamp'], errors='coerce')
df['discounted_price'] = pd.to_numeric(df['discounted_price'], errors='coerce')
df = df.dropna(subset=['crawl_timestamp', 'discounted_price'])
df['Weekday'] = df['crawl_timestamp'].dt.day_name()
weekday_order = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
avg_sales_by_day = df.groupby('Weekday')['discounted_price'].mean().reindex(weekday_order)
plt.figure(figsize=(10, 6))
sns.barplot(x=avg_sales_by_day.index, y=avg_sales_by_day.values, palette='Set3')
plt.title("Average Discounted Price by Day of the Week")
plt.xlabel("Weekday")
plt.ylabel("Average Discounted Price")
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

A bar chart was chosen because it is ideal for comparing a value across categories, in this case the days of the week. It clearly visualizes the variation in average discounted price across each weekday, making it easy to identify which days perform better or worse.

##### 2. What is/are the insight(s) found from the chart?

*   Certain weekdays show a higher average discounted price; for example, Friday or Saturday may have peaks.
*   Some days, like Monday or Tuesday, may have significantly lower values, indicating slower business activity.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Positive business impact: yes, the insights can help in optimizing staffing, inventory, and marketing campaigns.

*   If Saturday shows high sales, businesses can allocate more staff and stock for that day.
*   Promotions can be launched mid-week to boost lower-performing days like Monday or Tuesday.
*   Understanding peak days allows for better forecasting and resource planning, improving efficiency and profitability.

#### Chart - 13

In [None]:
df = pd.read_csv("flipkart_com-ecommerce_sample.csv")
df = df.dropna(subset=['brand'])
top_brands = df['brand'].value_counts().head(10).index
filtered_df = df[df['brand'].isin(top_brands)]
plt.figure(figsize=(10, 6))
sns.countplot(data=filtered_df, x='brand', palette='pastel', order=top_brands)
plt.title("Number of Transactions per Brand")
plt.xlabel("Brand")
plt.ylabel("Transaction Count")
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

A count plot (a categorical bar plot) is ideal for displaying the number of records per brand.

##### 2. What is/are the insight(s) found from the chart?

*   It shows which brands have the highest and lowest number of records.
*   It identifies the top brands by volume, which may correlate with better customer engagement.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Positive business impact:

*   Helps in resource allocation: more inventory or promotions can be assigned to high-volume brands.
*   Enables targeted improvement: underperforming brands can be audited for issues like poor service, low visibility, or operational inefficiencies.

#### Chart - 14 - Correlation Heatmap

In [None]:
df = pd.read_csv("flipkart_com-ecommerce_sample.csv")
numeric_df = df.select_dtypes(include='number')
corr_matrix = numeric_df.corr()
plt.figure(figsize=(10, 6))
sns.heatmap(corr_matrix, annot=True, cmap='coolwarm', linewidths=0.5)
plt.title("Correlation Heatmap")
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

A correlation heatmap is a good choice when analyzing the relationships between numeric variables in a dataset. It uses color intensity and annotated values to show how strongly variables are related.

##### 2. What is/are the insight(s) found from the chart?

*   Strong positive correlations (e.g., retail_price vs. discounted_price) show variables that increase together.
*   Strong negative correlations may reveal inverse relationships.
*   Low or zero correlation indicates no linear relationship.
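The values annotated in the heatmap are Pearson correlation coefficients, which is the default for `DataFrame.corr()`. As a sketch, the coefficient can be reproduced by hand on made-up data to see what it measures:

$$r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \, \sum_i (y_i - \bar{y})^2}}$$

```python
import numpy as np
import pandas as pd

# Hypothetical prices; the values are made up for illustration
demo = pd.DataFrame({
    "retail_price": [1000.0, 500.0, 2000.0, 1500.0, 800.0],
    "discounted_price": [800.0, 450.0, 1500.0, 1200.0, 700.0],
})

x = demo["retail_price"]
y = demo["discounted_price"]

# Pearson r: co-variation of x and y divided by the product of their spreads
manual_r = ((x - x.mean()) * (y - y.mean())).sum() / np.sqrt(
    ((x - x.mean()) ** 2).sum() * ((y - y.mean()) ** 2).sum()
)

# The same value as computed by pandas
pandas_r = demo.corr().loc["retail_price", "discounted_price"]

print(round(manual_r, 4), round(pandas_r, 4))
```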

#### Chart - 15 - Pair Plot

In [None]:
df = pd.read_csv("flipkart_com-ecommerce_sample.csv")
numeric_cols = df.select_dtypes(include='number')
cleaned_df = numeric_cols.dropna()
sample_df = cleaned_df.sample(n=200, random_state=1) if len(cleaned_df) > 200 else cleaned_df
sns.pairplot(sample_df)
plt.suptitle("Pairwise Relationships Between Numeric Features", y=1.02)
plt.show()

##### 1. Why did you pick the specific chart?

A pair plot is ideal for exploring pairwise relationships between multiple numerical features.

##### 2. What is/are the insight(s) found from the chart?

The pair plot can reveal:

*   Positive or negative trends between variables (e.g., retail_price vs. discounted_price).
*   Tightly grouped scatter points, suggesting strong correlation.
*   Outliers or extreme values that could impact modeling.
*   Clusters, which may indicate distinct segments in the data.

## **5. Solution to Business Objective**

#### What do you suggest the client to achieve Business Objective ?
Explain Briefly.

To help the client achieve their business objective, which likely involves maximizing sales, improving customer engagement, and boosting brand performance, here is a concise strategic plan based on the insights gained from the EDA.

Key findings:

*   Sales vary significantly by day of the week (e.g., higher on weekends).
*   Volume differs across brands, highlighting top and underperforming brands.
*   Correlations between numeric variables such as retail price and discounted price show which factors move together.
*   The pair plot revealed outliers and possible customer or transaction patterns.

Recommendations:

*   Optimize operations based on weekday trends.
*   Target brand-level improvements.
*   Leverage data-driven promotions.
*   Build predictive models.
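As a starting point for the predictive-model suggestion, the scikit-learn tools imported at the top of the notebook could be used as sketched below. The data is synthetic: the column names, the 0.7 slope, and the noise level are assumptions made for illustration and do not come from the project dataset.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic data: discounted price is roughly 70% of retail price plus noise
rng = np.random.default_rng(42)
retail = rng.uniform(200, 5000, size=200)
discounted = 0.7 * retail + rng.normal(0, 50, size=200)
demo = pd.DataFrame({"retail_price": retail, "discounted_price": discounted})

# Hold out 20% of the rows to evaluate the model on unseen data
X_train, X_test, y_train, y_test = train_test_split(
    demo[["retail_price"]], demo["discounted_price"],
    test_size=0.2, random_state=42,
)

# Fit a simple linear model and score it on the held-out rows
model = LinearRegression().fit(X_train, y_train)
r2 = r2_score(y_test, model.predict(X_test))

print(f"slope={model.coef_[0]:.3f}, R^2={r2:.3f}")
```

A linear model is only a baseline; the `RandomForestRegressor` imported earlier could be swapped in with the same `fit` and `predict` calls.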

# **Conclusion**

In conclusion, data visualization and pattern recognition not only answered the business objective but also uncovered actionable opportunities to improve sales, efficiency, and customer satisfaction, paving the way for sustainable and scalable business growth.

### ***Hurrah! You have successfully completed your EDA Capstone Project !!!***