# **Project Name**    - Flipkart Customer Satisfaction Analysis and Improvement Project


##### **Project Type**    - EDA/Regression/Classification/Unsupervised
##### **Contribution**    - Individual/Team
##### **Team Member 1 -** - Anamika Pandey

# **Project Summary -**

##### **Business Context**
In the highly competitive e-commerce space, delivering excellent customer service is crucial for sustaining growth and customer loyalty. Flipkart, as one of the largest e-commerce platforms, focuses on enhancing customer satisfaction to differentiate itself from competitors. The dataset in this project captures customer interactions, feedback, and satisfaction scores across various support channels at Flipkart. By analyzing these interactions, the goal is to identify key drivers of customer satisfaction, understand performance across different customer service teams, and develop strategies to improve the overall service experience.
Understanding factors that influence customer satisfaction will allow Flipkart to not only resolve customer issues faster but also tailor its support strategies to meet diverse customer expectations. This will help in optimizing the performance of service agents and improving satisfaction metrics like the CSAT score, ultimately leading to increased brand loyalty and customer retention.


# **GitHub Link -**

#### https://github.com/Anamika-1905/

# **Problem Statement**


#### **Problem Statement**

Customer satisfaction is slipping due to delayed deliveries, product quality variance, and inconsistent seller behavior. We need a data-driven system that:

* Measures CSAT & NPS reliably across orders, categories, and sellers
* Explains why scores change (themes, root causes)
* Predicts at‑risk orders/customers and triggers corrective actions
* Tracks the impact of interventions in near real time

Primary Question: What are the top factors driving low CSAT and how can we reduce detractors by 20–30% in 12 weeks?

#### **Define Your Business Objective?**

#### **Objectives & Success Criteria**

#### **Objectives**

* Build a unified satisfaction dataset combining orders, delivery, returns, support tickets, and review text
* Deploy NLP to quantify sentiment & themes in reviews/tickets
* Design a KPI dashboard (Power BI / Tableau) for Ops, CX, and Category teams
* Run weekly QoQ experiments (e.g., seller nudges, packaging changes) and measure uplift

#### **Success Criteria (12 weeks)**

* -15% detractor rate QoQ; +10 p.p. in 5‑star share on targeted categories
* 95% on-time delivery for top 10 categories
* Return rate reduced by 8% for at‑risk SKUs
* SLA breach alerts auto‑triggered with <15 min latency

# **General Guidelines** : -  

1.   Well-structured, formatted, and commented code is required.
2.   Exception Handling, Production Grade Code & Deployment Ready Code will be a plus. Those students will be awarded some additional credits.
     
     The additional credits will have advantages over other students during Star Student selection.
       
             [ Note: - Deployment Ready Code is defined as, the whole .ipynb notebook should be executable in one go
                       without a single error logged. ]

3.   Each and every logic should have proper comments.
4. You may add as many charts as you want. Make sure the following format is answered for each and every chart.
        

```
# Chart visualization code
```
            

*   Why did you pick the specific chart?
*   What is/are the insight(s) found from the chart?
* Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

5. You have to create at least 20 logical & meaningful charts having important insights.


[ Hints : - Do the Visualization in a structured way while following the "UBM" Rule.

U - Univariate Analysis,

B - Bivariate Analysis (Numerical - Categorical, Numerical - Numerical, Categorical - Categorical)

M - Multivariate Analysis
 ]





# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [None]:
# Import Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

print("✅ All libraries are working!")



### Dataset Loading

In [None]:
# Load Dataset
df = pd.read_csv(r"D:\Alma Better\Alma Better Projects\Python Project By Alma\Flipcart Project\Customer_support_data.csv")
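
The guidelines above ask for exception handling, so the load step could be wrapped in a small helper that fails with a clear message. This is only a minimal sketch: `load_support_data` is a hypothetical helper name and is not part of the original notebook.

```python
from pathlib import Path

import pandas as pd


def load_support_data(csv_path):
    """Read the support-ticket CSV, raising a clear error if it is unusable.

    Hypothetical helper for illustration; the name is an assumption.
    """
    path = Path(csv_path)
    # Fail early with a readable message instead of a long traceback.
    if not path.is_file():
        raise FileNotFoundError(f"Dataset not found at: {path}")
    data = pd.read_csv(path)
    # A header-only file parses without error but is useless for analysis.
    if data.empty:
        raise ValueError(f"Dataset at {path} contains no rows")
    return data
```

In the notebook, calling `load_support_data` with the dataset's path would take the place of the direct `pd.read_csv` call.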

### Dataset First View

In [None]:
# Dataset First Look
df.head()

### Dataset Rows & Columns count

In [None]:
# Dataset Rows & Columns count
print("Shape of dataset:", df.shape)

### Dataset Information

In [None]:
# Dataset Info
df.info()

#### Duplicate Values

In [None]:
# Dataset Duplicate Value Count
print("Duplicate rows:", df.duplicated().sum())

#### Missing Values/Null Values

In [None]:
# Missing Values/Null Values Count
print("Missing values per column:\n", df.isnull().sum())


In [None]:
# Visualizing the missing values
plt.figure(figsize=(12, 6))
sns.heatmap(df.isnull(), cbar=False, cmap='viridis')
plt.title("Missing Values Heatmap")
plt.show()

### What did you know about your dataset?

##### The heatmap revealed the presence and distribution of missing values across various features in the dataset. This visualization helped identify which columns require data cleaning or imputation before modeling.
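
The heatmap shows where the gaps are. A per-column percentage makes it easier to decide which columns to impute or drop. The sketch below uses a small hypothetical sample in place of the real dataset; only the column names come from this notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for the support dataset.
sample = pd.DataFrame({
    "Customer Remarks": ["Good", np.nan, np.nan, "Late delivery"],
    "Item_price": [499.0, np.nan, 1299.0, 250.0],
    "CSAT Score": [5, 4, 1, 3],
})

# Share of missing values per column, as a percentage, largest first.
missing_pct = (sample.isnull().mean() * 100).sort_values(ascending=False)

# Keep only the columns that actually have gaps.
missing_pct = missing_pct[missing_pct > 0].round(1)
print(missing_pct)
```

Running the same two statements on `df` would give the equivalent ranking for the full dataset.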

## ***2. Understanding Your Variables***

In [None]:
# Dataset Columns
print("Column names:\n", df.columns.tolist())

In [None]:
# Dataset Describe
df.describe(include='all')

In [None]:
print(df.columns)

### Variables Description

| Variable Name             | Description                                                                |
| ------------------------- | -------------------------------------------------------------------------- |
| `Unique id`               | Unique identifier for each customer support ticket                         |
| `channel_name`            | Channel used by the customer to raise the issue (e.g., chat, email, phone) |
| `issue_category`          | Broad category of the customer issue (e.g., delivery, payment)             |
| `Sub-category`            | More specific classification under the main issue category                 |
| `Customer Remarks`        | Comments or feedback shared by the customer                                |
| `Order_id`                | Unique ID of the order associated with the ticket                          |
| `order_date_time`         | Timestamp of when the order was placed                                     |
| `Issue_reported at`       | Timestamp of when the issue was reported                                   |
| `issue_responded`         | Timestamp of when the issue received a response                            |
| `Survey_response_Date`    | Date when the customer filled out the satisfaction survey                  |
| `Customer_City`           | City from which the customer placed the order or raised the issue          |
| `Product_category`        | Category of the product involved in the issue                              |
| `Item_price`              | Price of the item involved in the issue                                    |
| `connected_handling_time` | Time (likely in seconds or minutes) taken to handle the issue              |
| `Agent_name`              | Name of the customer service agent handling the issue                      |
| `Supervisor`              | Supervisor overseeing the agent                                            |
| `Manager`                 | Manager of the customer support team                                       |
| `Tenure Bucket`           | Experience level or time range the agent has worked in the organization    |
| `Agent Shift`             | Work shift timing of the agent (e.g., morning, evening)                    |
| `CSAT Score`              | Customer satisfaction score (typically 1 to 5)                             |
| `CSAT_Binary`             | Derived binary label: 1 = satisfied (`CSAT Score` >= 4), 0 = not satisfied |

### Check Unique Values for each variable.

In [None]:
# Check Unique Values for each variable.
for col in df.columns:
    print(f"{col}: {df[col].nunique()} unique values")


## 3. ***Data Wrangling***

### Data Wrangling Code

In [None]:
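# Make the dataset analysis-ready.
# Keep only rows that have at least 50% of their columns populated.
df = df.dropna(thresh=len(df.columns) * 0.5).copy()
# Fill gaps in text columns only, so numeric columns keep a numeric dtype.
text_cols = df.select_dtypes(include='object').columns
df[text_cols] = df[text_cols].fillna("Missing")
# Binary target: 1 = satisfied (CSAT Score of 4 or 5), 0 = not satisfied.
df['CSAT_Binary'] = (df['CSAT Score'] >= 4).astype(int)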


### What all manipulations have you done and insights you found?

During the exploratory data analysis of the Flipkart customer satisfaction dataset, we first visualized missing values using a heatmap, which revealed gaps in fields like feedback timestamps and customer remarks. We cleaned the data by handling null values, converted date-time columns to proper formats, and created new features like response time. We also analyzed categorical distributions (e.g., issue categories, agent shifts) and examined correlations with the CSAT Binary variable. Key insights included longer response times and specific issue categories correlating with lower satisfaction. These findings guided feature selection and informed the modeling approach for predicting customer satisfaction outcomes accurately.
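
The date-time conversion and response-time feature mentioned above could look roughly like the sketch below. The sample rows, the day-first timestamp format, and the `response_time_min` column name are assumptions for illustration; the two timestamp column names come from the variables table.

```python
import pandas as pd

# Hypothetical sample rows; the day-first timestamp format is an assumption.
tickets = pd.DataFrame({
    "Issue_reported at": ["01/08/2023 11:13", "01/08/2023 12:52", "not recorded"],
    "issue_responded": ["01/08/2023 11:47", "01/08/2023 12:54", "01/08/2023 20:20"],
})

# Parse both columns; unparseable values become NaT instead of raising.
for col in ["Issue_reported at", "issue_responded"]:
    tickets[col] = pd.to_datetime(tickets[col], format="%d/%m/%Y %H:%M", errors="coerce")

# Response time in minutes between the report and the first response.
tickets["response_time_min"] = (
    tickets["issue_responded"] - tickets["Issue_reported at"]
).dt.total_seconds() / 60

print(tickets["response_time_min"])
```

Rows with an unparseable timestamp end up with a missing response time rather than stopping the notebook, which fits the run-in-one-go requirement.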

## ***4. Data Visualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

#### Chart - 1

In [None]:
# Chart - 1 visualization code
sns.countplot(x='CSAT_Binary', data=df)
plt.title("Chart 1: CSAT Binary Class Distribution")
plt.xlabel("Satisfied (1) vs Not (0)")
plt.ylabel("Count")
plt.show()

##### 1. Why did you pick the specific chart?

We chose this count plot as Chart 1 to visualize the distribution of the target variable CSAT_Binary, which represents whether a customer was satisfied (1) or not (0). **This chart helps us**:
1. Understand class balance – It's crucial to check if the dataset is balanced or imbalanced before modeling, as class imbalance can bias the model.
2. Assess data sufficiency – It visually confirms whether we have enough examples of both satisfied and unsatisfied customers for effective training.
In short, it provides an essential foundation for building and evaluating a classification model.

##### 2. What is/are the insight(s) found from the chart?

The chart reveals that the dataset is imbalanced, with a significantly higher number of customers marked as satisfied (CSAT_Binary = 1) compared to unsatisfied (CSAT_Binary = 0). This imbalance indicates that most customers had a positive experience, but it also highlights a potential challenge for model training—since the model may become biased toward predicting the majority class (satisfied), special handling like resampling or class weighting may be needed to ensure fair performance.
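
To put numbers on the imbalance and on the class weighting mentioned above, the class shares and a set of "balanced" weights can be computed directly. This is a sketch on hypothetical labels; the weight formula `n_samples / (n_classes * class_count)` is one common convention and is an assumption here, not something taken from this notebook.

```python
import pandas as pd

# Hypothetical labels standing in for df['CSAT_Binary'].
labels = pd.Series([1, 1, 1, 1, 1, 1, 1, 1, 0, 0], name="CSAT_Binary")

# Share of each class in the data.
class_share = labels.value_counts(normalize=True)

# "Balanced" weights: n_samples / (n_classes * class_count),
# so the minority class receives the larger weight.
class_counts = labels.value_counts()
class_weights = len(labels) / (class_counts.size * class_counts)

print(class_share)
print(class_weights)
```

Weights of this kind can be passed to classifiers that accept per-class weights, as an alternative to resampling.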

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Business Impact of the Insights**: the insights from the class distribution chart can help create a positive business impact by highlighting the overall customer satisfaction trend. Knowing that most customers are satisfied provides a baseline of success, while identifying the smaller group of unsatisfied customers allows the business to focus efforts on improving those specific cases. This targeted approach can enhance service quality, reduce churn, and increase customer loyalty.

**Potential Negative Growth Insight**:
The class imbalance itself could lead to overlooking critical issues faced by unsatisfied customers if not addressed properly in modeling. Ignoring this imbalance might cause the model to underperform in identifying dissatisfied customers, potentially missing opportunities to fix pain points and leading to stagnant or negative growth in customer satisfaction.

#### Chart - 2

In [None]:
# Chart - 2 visualization code
sns.barplot(x='channel_name', y='CSAT_Binary', data=df)
plt.title("Chart 2: Average Satisfaction by Channel")
plt.xticks(rotation=45)
plt.show()

##### 1. Why did you pick the specific chart?

We picked this **bar plot** as **Chart 2** to visualize the **average customer satisfaction (`CSAT_Binary`) across different support channels** (like email, chat, phone).

This chart helps us:
1. **Compare performance across channels** — It shows which channels tend to have higher or lower average satisfaction scores.
2. **Identify strengths and weaknesses** — Understanding which communication channels deliver better customer experiences can guide resource allocation and process improvements.
3. **Drive targeted improvements** — If certain channels consistently show lower satisfaction, the business can investigate and improve those specific touch points to boost overall customer satisfaction.

In summary, this chart directly links channel choice to customer satisfaction outcomes.

##### 2. What is/are the insight(s) found from the chart?

The chart shows the average customer satisfaction (CSAT_Binary) for each support channel. It helps identify which channels—such as chat, email, or phone—are delivering better or worse customer experiences. Channels with higher average satisfaction scores indicate more effective or smoother customer interactions, while those with lower scores may point to areas needing improvement. This insight allows Flipkart to prioritize resources and training for channels that underperform, ultimately improving overall customer satisfaction and service quality.
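
A bar height alone hides how many tickets sit behind each channel, so a table with both the satisfaction rate and the ticket count can help when comparing channels. A minimal sketch with hypothetical sample rows (the channel labels are illustrative):

```python
import pandas as pd

# Hypothetical sample; channel labels are illustrative.
sample = pd.DataFrame({
    "channel_name": ["Inbound", "Inbound", "Inbound", "Outcall", "Outcall", "Email"],
    "CSAT_Binary": [1, 1, 0, 1, 0, 0],
})

# Satisfaction rate and ticket volume per channel, lowest rate first.
channel_summary = (
    sample.groupby("channel_name")["CSAT_Binary"]
    .agg(satisfaction_rate="mean", tickets="count")
    .sort_values("satisfaction_rate")
)
print(channel_summary)
```

A channel with a low rate but very few tickets deserves less weight than one with a slightly better rate and far more volume.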

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Business Impact of the Insights from Chart 2**:
these insights can drive positive business impact by revealing which customer support channels are most effective in delivering satisfaction. Flipkart can invest more in high-performing channels and identify pain points in lower-performing ones to improve processes, training, or technology. Optimizing customer interaction channels directly enhances customer experience, reduces resolution times, and boosts overall satisfaction.

**Potential Negative Growth Insight**:
If certain channels consistently show low satisfaction but are heavily relied upon without improvement, it could lead to frustrated customers and increased churn. Ignoring these underperforming channels might worsen dissatisfaction, harming Flipkart’s reputation and leading to negative growth. Identifying and addressing these gaps is crucial to prevent such risks.

#### Chart - 3

In [None]:
# Chart - 3 visualization code
plt.figure(figsize=(10, 5))
sns.barplot(x='issue_category', y='CSAT_Binary', data=df)
plt.title("Chart 3: Average Satisfaction by Issue Category")
plt.xticks(rotation=90)
plt.show()


##### 1. Why did you pick the specific chart?

I picked this chart to visualize the average customer satisfaction across different issue categories. It helps us understand which types of issues (like delivery problems, payment issues, or product quality concerns) tend to have higher or lower satisfaction rates. This insight is valuable because it reveals which problem areas impact customer satisfaction the most, allowing Flipkart to prioritize improvements and tailor support efforts to the most critical issue categories. Essentially, this chart connects specific customer pain points to their satisfaction outcomes, guiding better resource allocation and service optimization.

##### 2. What is/are the insight(s) found from the chart?

The chart shows that certain issue categories have noticeably higher average satisfaction scores, indicating that customers facing these issues are generally more satisfied with the support received. Conversely, some categories have lower average satisfaction, highlighting areas where customers experience more frustration or unresolved problems. This variation suggests that Flipkart’s support effectiveness differs by issue type, pinpointing specific categories where process improvements or additional training could significantly boost customer satisfaction.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Business Impact of the Insights from Chart 3:
these insights can drive positive business impact by helping Flipkart focus improvement efforts on issue categories with lower satisfaction. By identifying problem areas, the company can optimize support workflows, provide targeted training to agents, and improve processes, leading to faster resolutions and happier customers. This targeted approach can reduce churn, increase loyalty, and enhance overall brand reputation.

Potential Negative Growth Insight:
If Flipkart ignores categories with low satisfaction or fails to address the root causes, it risks increasing customer frustration and complaints in those areas. Persistent dissatisfaction in specific issue categories can harm customer trust and lead to negative word-of-mouth, ultimately hurting growth and revenue.

#### Chart - 4

In [None]:
# Chart - 4 visualization code
top_subcats = df['Sub-category'].value_counts().nlargest(10).index
sns.barplot(x='Sub-category', y='CSAT_Binary', data=df[df['Sub-category'].isin(top_subcats)])
plt.title("Chart 4: CSAT by Top 10 Sub-Categories")
plt.xticks(rotation=90)
plt.show()

##### 1. Why did you pick the specific chart?

This chart was chosen to focus on the top 10 most frequent sub-categories of issues and analyze their average customer satisfaction (CSAT_Binary). By narrowing down to the most common sub-issues, we can better understand which specific problem types within broader categories most impact customer satisfaction. This targeted view helps prioritize detailed improvements where they will affect the largest number of customers.

##### 2. What is/are the insight(s) found from the chart?

The chart reveals that among the top sub-categories, some have higher average satisfaction scores, indicating effective resolution or easier issues, while others show lower satisfaction, suggesting persistent challenges. This variance highlights specific sub-categories where customers are frequently dissatisfied, providing clear targets for operational enhancements or policy changes.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, there is a positive business impact: these insights enable Flipkart to prioritize resources and process improvements on the sub-categories causing the most dissatisfaction, improving customer experiences in high-impact areas. This focused effort can boost overall satisfaction, reduce complaints, and strengthen customer loyalty.

If Flipkart neglects the low-performing sub-categories or does not address their root causes, dissatisfaction in these frequent issues could escalate, leading to increased customer churn, negative reviews, and loss of trust. Ignoring these problem areas risks long-term damage to the brand and hampers growth.

#### Chart - 5

In [None]:
# Chart - 5 visualization code
sns.countplot(x='Agent Shift', hue='CSAT_Binary', data=df)
plt.title("Chart 5: Agent Shift vs CSAT")
plt.show()

##### 1. Why did you pick the specific chart?

I chose this countplot with hue to compare customer satisfaction (CSAT_Binary) across different Agent Shifts. This visualization helps us understand if the time of the support agent’s shift (e.g., morning, evening, night) affects customer satisfaction levels. It’s useful to detect patterns or performance differences related to shifts, which can guide staffing and training decisions.

##### 2. What is/are the insight(s) found from the chart?

The chart reveals whether certain shifts have a higher proportion of satisfied or unsatisfied customers. For example, a shift with a larger number of unsatisfied customers may indicate understaffing, agent fatigue, or operational challenges during that time. Conversely, shifts with consistently high satisfaction suggest effective support during those hours.
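
A count plot shows raw counts, so shifts with more tickets dominate visually. The proportion of satisfied tickets within each shift can be read from a row-normalised cross-tabulation. A minimal sketch with hypothetical sample rows:

```python
import pandas as pd

# Hypothetical sample; shift labels are illustrative.
sample = pd.DataFrame({
    "Agent Shift": ["Morning", "Morning", "Morning", "Morning", "Evening", "Evening"],
    "CSAT_Binary": [1, 1, 1, 0, 1, 0],
})

# Row-normalised crosstab: each shift's row sums to 1,
# so shifts with different ticket volumes become comparable.
shift_share = pd.crosstab(
    sample["Agent Shift"], sample["CSAT_Binary"], normalize="index"
)
print(shift_share)
```

Column `1` of the result is the satisfied share per shift and column `0` the unsatisfied share.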



##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Positive Impact:
Understanding shift-wise satisfaction allows Flipkart to optimize resource allocation and improve agent performance during low-satisfaction shifts. Adjustments such as adding more agents, enhancing training, or improving tools during those shifts can lead to better customer experience and higher overall satisfaction.

Potential Negative Growth:
If underperforming shifts are ignored, customer frustration during those times could rise, leading to increased complaints and churn. This can negatively impact Flipkart’s reputation and growth, especially if peak demand periods coincide with these shifts. Therefore, addressing shift-based disparities is critical for sustained positive growth.

#### Chart - 6

In [None]:
# Chart - 6 visualization code
sns.barplot(x='Tenure Bucket', y='CSAT_Binary', data=df)
plt.title("Chart 6: CSAT by Tenure Bucket")
plt.show()

##### 1. Why did you pick the specific chart?

This bar plot was chosen to analyze the relationship between agent tenure (experience level) and average customer satisfaction (CSAT_Binary). By visualizing satisfaction by Tenure Bucket, we can evaluate if more experienced agents lead to higher customer satisfaction, or if newer agents are performing just as well.

##### 2. What is/are the insight(s) found from the chart?

The chart reveals how customer satisfaction varies with agent experience. For example, if longer-tenured agents show higher CSAT scores, it suggests that experience contributes positively to customer service quality. Alternatively, if newer agents perform equally or better, it might indicate effective training programs or fresh energy in new hires.
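
Reading a trend across tenure only works if the buckets appear in experience order rather than alphabetical order. One way to get that is an ordered categorical. The bucket labels below are hypothetical and would need to match the values actually present in `Tenure Bucket`.

```python
import pandas as pd

# Hypothetical bucket labels, listed from least to most experienced.
tenure_order = ["On Job Training", "0-30", "31-60", "61-90", ">90"]

sample = pd.DataFrame({
    "Tenure Bucket": [">90", "0-30", "On Job Training", ">90", "31-60", "0-30"],
    "CSAT_Binary": [1, 0, 1, 1, 1, 1],
})

# An ordered categorical makes group-bys and plots follow tenure_order.
sample["Tenure Bucket"] = pd.Categorical(
    sample["Tenure Bucket"], categories=tenure_order, ordered=True
)

# Satisfaction rate per bucket, in experience order (unseen buckets are skipped).
tenure_rate = sample.groupby("Tenure Bucket", observed=True)["CSAT_Binary"].mean()
print(tenure_rate)
```

The same list can also be passed to the `order` parameter of `sns.barplot` to fix the bar order in the chart.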

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Positive Impact:
These insights can guide hiring, training, and retention strategies. If experienced agents deliver better CSAT, investing in retention and mentoring programs could boost satisfaction. If newer agents perform well, Flipkart can confidently expand its workforce with fresh talent.

Potential Negative Growth:
If the chart shows declining satisfaction with higher tenure, it might indicate agent burnout or lack of motivation, which could harm long-term service quality. Ignoring this trend may lead to declining CSAT and customer trust, resulting in negative business growth.

#### Chart - 7

In [None]:
# Chart - 7 visualization code
top_managers = df['Manager'].value_counts().nlargest(5).index
sns.boxplot(x='Manager', y='CSAT_Binary', data=df[df['Manager'].isin(top_managers)])
plt.title("Chart 7: CSAT Distribution by Top 5 Managers")
plt.show()

##### 1. Why did you pick the specific chart?

This boxplot was chosen to examine the distribution of customer satisfaction scores (CSAT_Binary) for the top 5 managers with the highest number of handled cases. It helps compare manager-wise performance and detect differences in how teams under each manager are contributing to customer satisfaction.

##### 2. What is/are the insight(s) found from the chart?

The chart shows the spread, median, and variability of CSAT scores for each manager. You may find:

Some managers consistently deliver higher CSAT scores, indicating strong team performance or leadership.

Others show lower medians or wider spreads, suggesting inconsistency or issues within the team.
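
Since `CSAT_Binary` only takes the values 0 and 1, its median and quartiles can only fall on 0 or 1. A per-manager table that combines the 1-to-5 `CSAT Score` with the satisfaction rate can therefore make the comparison easier to read. A minimal sketch with hypothetical managers and scores:

```python
import pandas as pd

# Hypothetical sample; manager names and scores are illustrative.
sample = pd.DataFrame({
    "Manager": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "CSAT Score": [5, 5, 4, 1, 3, 2, 5, 1],
})
# Same rule as the wrangling step: a score of 4 or 5 counts as satisfied.
sample["CSAT_Binary"] = (sample["CSAT Score"] >= 4).astype(int)

# Ticket volume, median raw score, and satisfaction rate per manager.
manager_summary = sample.groupby("Manager").agg(
    tickets=("CSAT Score", "count"),
    median_score=("CSAT Score", "median"),
    satisfaction_rate=("CSAT_Binary", "mean"),
)
print(manager_summary)
```

The same pattern applies to supervisors or agents by changing the grouping column.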

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Positive Impact:
This insight helps Flipkart recognize high-performing managers and use their practices as a model to uplift overall team performance. It can also guide training, accountability, and support for underperforming teams, improving CSAT across the board.

Potential Negative Growth:
If low-performing managers are not identified or addressed, their teams may continue delivering poor customer experiences. Over time, this can lead to customer dissatisfaction, complaints, and brand damage, resulting in negative business impact.

#### Chart - 8

In [None]:
# Chart - 8 visualization code
top_sup = df['Supervisor'].value_counts().nlargest(5).index
sns.boxplot(x='Supervisor', y='CSAT_Binary', data=df[df['Supervisor'].isin(top_sup)])
plt.title("Chart 8: CSAT Distribution by Top 5 Supervisors")
plt.show()


##### 1. Why did you pick the specific chart?

This boxplot was selected to analyze the variation in customer satisfaction (CSAT_Binary) across the top 5 supervisors based on case volume. It allows us to compare how well different supervisors are managing their teams in terms of customer satisfaction outcomes.

##### 2. What is/are the insight(s) found from the chart?

The chart shows:

Median CSAT scores under each supervisor.

The consistency or spread of scores (tight vs. wide boxes).

Potential outliers indicating unusual customer experiences.

We may observe that certain supervisors consistently lead teams with higher CSAT, while others have lower medians or more variability, suggesting inconsistencies in performance or supervision style.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Positive Impact:
Yes, these insights can help Flipkart reward effective supervisors and use their methods to train or coach others. It can also identify teams needing support, improving overall service quality and customer satisfaction.

Potential Negative Growth:
If underperforming supervisors are not addressed, their teams may continue to deliver inconsistent or poor customer service, leading to negative experiences and lower CSAT. Over time, this can hurt customer loyalty, damage the brand, and impact growth negatively.

#### Chart - 9

In [None]:
# Chart - 9 visualization code
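# Coerce prices to numeric so placeholder strings such as "Missing" become NaN and are dropped.
price_df = df.assign(Item_price=pd.to_numeric(df['Item_price'], errors='coerce')).dropna(subset=['Item_price'])
sns.histplot(data=price_df, x='Item_price', hue='CSAT_Binary', bins=30)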
plt.title("Chart 9: Item Price Distribution by CSAT")
plt.show()

##### 1. Why did you pick the specific chart?

This histogram with hue was chosen to explore how item price relates to customer satisfaction (CSAT_Binary). By visualizing the price distribution of items purchased by satisfied vs. unsatisfied customers, we can identify whether pricing influences satisfaction—e.g., whether low or high-priced items are more prone to dissatisfaction.

##### 2. What is/are the insight(s) found from the chart?

The chart reveals:

If unsatisfied customers are concentrated around specific price ranges (e.g., low-cost or high-cost items).

Whether satisfied customers are spread across a wider price range, suggesting overall consistency in satisfaction.

This can suggest pricing tiers that are more sensitive to service quality, product issues, or delivery expectations.
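
To compare price tiers directly, prices can be grouped into bands and the satisfaction rate computed per band. The sketch below uses hypothetical prices, and the bin edges are an assumption chosen for illustration, not taken from the dataset.

```python
import pandas as pd

# Hypothetical sample, including a placeholder string among the prices.
sample = pd.DataFrame({
    "Item_price": [199, 450, 900, 1500, 2500, 8000, "Missing", 12000],
    "CSAT_Binary": [1, 1, 0, 1, 0, 0, 1, 1],
})

# Coerce to numeric so placeholder strings become NaN and drop out of the bands.
price = pd.to_numeric(sample["Item_price"], errors="coerce")

# Hypothetical price bands (the bin edges are an assumption).
bands = pd.cut(
    price,
    bins=[0, 500, 2000, 10000, float("inf")],
    labels=["<=500", "501-2000", "2001-10000", ">10000"],
)

# Satisfaction rate and ticket count per price band.
price_summary = sample.groupby(bands, observed=True)["CSAT_Binary"].agg(["mean", "count"])
print(price_summary)
```

Bands with a low mean and a reasonable count are the price segments worth a closer look.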

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Positive Impact:
Yes, identifying price segments with high dissatisfaction allows Flipkart to improve service quality, communication, or product handling in those price brackets. This leads to targeted improvements, better customer experience, and increased loyalty.

Potential Negative Growth:
If item price-related dissatisfaction (e.g., frequent complaints for high-priced items) is ignored, premium customers may lose trust in the brand, resulting in revenue loss and churn from a valuable customer segment.

#### Chart - 10

In [None]:
# Chart - 10 visualization code
top_cities = df['Customer_City'].value_counts().nlargest(10).index
sns.barplot(x='Customer_City', y='CSAT_Binary', data=df[df['Customer_City'].isin(top_cities)])
plt.title("Chart 10: CSAT by City (Top 10)")
plt.xticks(rotation=90)
plt.show()

##### 1. Why did you pick the specific chart?

This bar plot was selected to evaluate average customer satisfaction (CSAT_Binary) across the top 10 cities by interaction volume. It helps identify geographic trends in customer satisfaction, uncovering whether certain cities experience consistently better or worse support.

##### 2. What is/are the insight(s) found from the chart?

The chart shows variation in average CSAT scores among the top 10 cities.
You may observe:

Some cities have higher satisfaction, possibly due to better logistics, agent availability, or regional support efficiency.

Others may show lower satisfaction, indicating regional challenges like delivery delays or communication gaps.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Positive Impact:
Yes, these insights enable Flipkart to target city-specific improvements—like enhancing logistics, support quality, or agent training in underperforming locations. This leads to localized service improvements, better customer retention, and stronger regional brand trust.

Potential Negative Growth:
If cities with low satisfaction are overlooked, customer frustration can build in those regions, leading to negative reviews, customer loss, and market share erosion—especially in high-potential urban markets.

#### Chart - 11

In [None]:
# Chart - 11 visualization code
sns.barplot(x='Product_category', y='CSAT_Binary', data=df)
plt.title("Chart 11: Product Category vs CSAT")
plt.xticks(rotation=45)
plt.show()

##### 1. Why did you pick the specific chart?

This bar plot was chosen to examine how customer satisfaction (CSAT_Binary) varies across different product categories. It helps assess whether certain product types are more likely to lead to dissatisfaction—perhaps due to quality issues, return complexity, or service expectations.

##### 2. What is/are the insight(s) found from the chart?

The chart reveals clear differences in average CSAT scores among product categories:

Some categories consistently achieve higher satisfaction, indicating fewer customer service issues.

Others show lower CSAT, suggesting recurring problems, such as delivery delays, faulty items, or difficult return processes.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Positive Impact:
Yes, by identifying low-performing product categories, Flipkart can collaborate with vendors, improve quality control, streamline returns, or enhance category-specific support. This can boost CSAT and reduce complaints.

Potential Negative Growth:
If poor-performing categories are not addressed, customers may lose trust in the platform for those product types, leading to increased churn, reputational damage, and lost revenue in those segments.

#### Chart - 12

In [None]:
# Chart - 12 visualization code
top_agents = df['Agent_name'].value_counts().nlargest(5).index
sns.boxplot(x='Agent_name', y='CSAT_Binary', data=df[df['Agent_name'].isin(top_agents)])
plt.title("Chart 12: CSAT by Top 5 Agents")
plt.xticks(rotation=45)
plt.show()


##### 1. Why did you pick the specific chart?

This boxplot was selected to visualize the distribution of customer satisfaction scores (CSAT_Binary) for the top 5 agents who handled the most cases. It helps in comparing individual agent performance based on how satisfied their customers were.

##### 2. What is/are the insight(s) found from the chart?

The chart provides:

Median CSAT scores for each agent.

Variability or consistency in satisfaction (via interquartile range).

Identification of outliers, which could indicate isolated cases of poor performance or exceptional service.

This helps recognize top-performing agents and detect those needing further support or coaching.
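Because CSAT_Binary only takes the values 0 and 1, the median and quartiles of a box plot tend to collapse onto those two values, so a per-agent satisfaction rate is often easier to compare. The following is a rough sketch on invented rows (the agent labels are hypothetical), adding a simple standard error so that small samples are visible.

```python
import numpy as np
import pandas as pd

# Made-up rows standing in for the notebook's `df`; agent names are hypothetical.
toy = pd.DataFrame({
    "Agent_name": ["A"] * 4 + ["B"] * 4 + ["C"] * 2,
    "CSAT_Binary": [1, 1, 1, 0, 1, 0, 0, 0, 1, 1],
})

# Keep the agents with the most cases, as the chart does (top 2 here, top 5 in the notebook).
top_agents = toy["Agent_name"].value_counts().nlargest(2).index
subset = toy[toy["Agent_name"].isin(top_agents)]

# Satisfaction rate and number of cases per agent.
agent_stats = subset.groupby("Agent_name")["CSAT_Binary"].agg(rate="mean", n="count")
# Standard error of a proportion: sqrt(p * (1 - p) / n).
agent_stats["std_err"] = np.sqrt(
    agent_stats["rate"] * (1 - agent_stats["rate"]) / agent_stats["n"]
)
print(agent_stats.sort_values("rate", ascending=False))
```

A bar plot of `rate` with `std_err` as error bars would be one way to chart this; that choice is a suggestion, not what the notebook does.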

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with a specific reason.

Positive Impact:
Yes. Insights from this chart can help Flipkart:
1. Reward high performers to encourage consistency.
2. Train or coach lower performers to improve service quality.
3. Build best practices from top agents into company-wide training.

Potential Negative Growth:
If low-performing agents are not addressed, it can lead to consistent customer dissatisfaction, especially if these agents are handling a high volume of interactions—directly impacting brand perception and repeat customer rates.

#### Chart - 13

In [None]:
# Chart - 13 visualization code
sns.histplot(pd.to_datetime(df['Survey_response_Date'], errors='coerce'), bins=30)
plt.title("Chart 13: Survey Responses Over Time")
plt.xticks(rotation=45)
plt.show()


##### 1. Why did you pick the specific chart?

This histogram was chosen to visualize the distribution of survey response dates over time. It helps identify trends in when customers are providing feedback, highlighting peaks or gaps in CSAT collection that may correspond to promotional periods, system issues, or operational changes.

##### 2. What is/are the insight(s) found from the chart?

The chart shows:

Time periods with high volumes of survey responses—possibly due to campaigns, high order volumes, or recent events.

Gaps or low-activity periods, which may indicate issues in feedback collection or customer engagement.

This helps understand customer feedback flow and potential external influences.
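Peaks and gaps can also be listed explicitly rather than read off the histogram. The sketch below counts responses per day on a few hypothetical dates and reports the days with none; in the notebook the dates would come from `df['Survey_response_Date']`.

```python
import pandas as pd

# Hypothetical survey dates, including one unparseable entry.
raw = pd.Series(["2023-08-01", "2023-08-01", "2023-08-02", "2023-08-05", "not a date"])
dates = pd.to_datetime(raw, errors="coerce").dropna()

# One row per response, indexed by date, then summed per calendar day.
# Days with no responses inside the observed range come out as 0.
daily = pd.Series(1, index=pd.DatetimeIndex(dates)).resample("D").sum()

gaps = daily[daily == 0].index
print(daily)
print("Days without responses:", list(gaps.strftime("%Y-%m-%d")))
```

Swapping `"D"` for `"W"` would give weekly counts, which may be more readable over a longer collection period.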

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with a specific reason.

Positive Impact:
Yes. By analyzing this data, Flipkart can:
1. Align support quality improvements with feedback peaks.
2. Ensure consistent and timely feedback collection, improving data reliability.
3. Understand customer behavior across time and tailor services accordingly.

Potential Negative Growth:
If survey responses are clustered in limited timeframes, Flipkart may miss out on continuous customer feedback, leading to blind spots in service issues and delayed improvements—ultimately risking customer satisfaction.

#### Chart - 14 - Correlation Heatmap

In [None]:
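# Correlation Heatmap visualization code
# 'number' selects every numeric dtype, not only int64/float64
corr = df.select_dtypes(include='number').corr()
plt.figure(figsize=(12, 6))
# Fix the color scale to [-1, 1] so colors are comparable across cells
sns.heatmap(corr, annot=True, fmt='.2f', cmap='coolwarm', vmin=-1, vmax=1)
plt.title("Correlation Heatmap")
plt.show()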


##### 1. Why did you pick the specific chart?

The correlation heatmap was selected to understand the relationships between numerical features in the dataset, especially to identify which features may have a strong influence on the target variable (CSAT_Binary) or are highly interrelated. This is a critical step before modeling.

##### 2. What is/are the insight(s) found from the chart?

From the heatmap:

We can observe how strongly CSAT_Binary correlates with other numeric features, such as Item_price or connected_handling_time.

It helps detect multicollinearity (i.e., features highly correlated with each other), which can harm model performance if not addressed.

Weak or no correlations with CSAT_Binary indicate those features may be less useful for prediction.
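The same two readings of the heatmap can be pulled out of the correlation matrix directly. This is a minimal sketch on synthetic numbers that do not reproduce the notebook's results; `price_with_tax` is a hypothetical column added only to create a redundant pair, and the 0.8 threshold is an assumption.

```python
import numpy as np
import pandas as pd

# Synthetic numeric data for illustration only.
rng = np.random.default_rng(0)
n = 200
item_price = rng.normal(1000, 300, n)
toy = pd.DataFrame({
    "Item_price": item_price,
    "price_with_tax": item_price * 1.18,  # hypothetical, deliberately redundant
    "connected_handling_time": rng.normal(300, 60, n),
    "CSAT_Binary": rng.integers(0, 2, n),
})

corr = toy.corr()

# 1) Correlation of every feature with the target, strongest (by absolute value) first.
target_corr = (
    corr["CSAT_Binary"].drop("CSAT_Binary").sort_values(key=np.abs, ascending=False)
)

# 2) Feature pairs whose absolute correlation exceeds the chosen threshold.
features = corr.drop(index="CSAT_Binary", columns="CSAT_Binary")
# Keep only the upper triangle so each pair appears once.
upper = features.where(np.triu(np.ones(features.shape, dtype=bool), k=1))
high_pairs = upper.stack()
high_pairs = high_pairs[high_pairs.abs() > 0.8]

print(target_corr)
print(high_pairs)
```

Pairs listed in `high_pairs` are candidates for dropping or combining before modeling; which one to keep is a judgment call.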



#### Chart - 15 - Pair Plot

In [None]:
# Pair Plot visualization code
cols = ['CSAT_Binary', 'Item_price', 'channel_name', 'Tenure Bucket']
sample_df = df[cols].replace("Missing", np.nan).dropna()
# pairplot only draws numeric columns, so encode the categorical ones as integer codes
for col in ['channel_name', 'Tenure Bucket']:
    sample_df[col] = sample_df[col].astype('category').cat.codes
sns.pairplot(sample_df, hue='CSAT_Binary')
plt.show()

##### 1. Why did you pick the specific chart?

The pair plot was selected to visualize pairwise relationships among multiple features (Item_price, channel_name, Tenure Bucket, and CSAT_Binary). It helps detect patterns, clusters, and separability in the data based on the CSAT labels—especially useful before applying classification models.

##### 2. What is/are the insight(s) found from the chart?

From the pair plot:

You may observe whether satisfied (1) and unsatisfied (0) customers cluster differently based on numeric values like Item_price.

It also reveals non-linear relationships or overlaps between satisfaction and feature combinations (e.g., certain tenure ranges showing higher CSAT).

It helps to spot outliers and feature pairs that provide better class separation.
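For categorical features such as channel_name, a normalized cross-tabulation can complement the pair plot, and numeric features such as Item_price can be compared in bands. The example below is a small sketch on invented rows; the channel labels and the two-band split are assumptions chosen for illustration.

```python
import pandas as pd

# Invented rows for illustration; channel labels are hypothetical.
toy = pd.DataFrame({
    "Item_price": [100, 200, 300, 400, 500, 600, 700, 800],
    "channel_name": ["Inbound", "Email", "Inbound", "Outcall",
                     "Email", "Inbound", "Outcall", "Email"],
    "CSAT_Binary": [1, 1, 1, 0, 1, 0, 0, 0],
})

# Share of unsatisfied (0) and satisfied (1) responses within each channel; rows sum to 1.
channel_table = pd.crosstab(toy["channel_name"], toy["CSAT_Binary"], normalize="index")

# Split prices into two equal-sized bands and compare the satisfaction rate per band.
toy["price_band"] = pd.qcut(toy["Item_price"], q=2, labels=["lower half", "upper half"])
band_rate = toy.groupby("price_band", observed=True)["CSAT_Binary"].mean()

print(channel_table)
print(band_rate)
```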

## **5. Solution to Business Objective**

#### What do you suggest the client do to achieve the business objective?
Explain briefly.

To help Flipkart achieve the business objective of improving customer satisfaction (CSAT), I suggest the following key actions based on our EDA insights:

Focus on High-Impact Channels and Categories:
Prioritize resources and quality improvements on channels and product categories where customer satisfaction is low. For example, if certain issue categories or channels show poor CSAT, targeted training or process improvements should be implemented there.

Enhance Agent and Supervisor Performance:
Use CSAT insights per agent and supervisor to reward high performers and provide coaching for those with lower satisfaction scores. This will help standardize excellent customer support practices across the team.

Optimize Customer Support Based on Timing and Geography:
Address regional disparities in satisfaction by tailoring logistics, support availability, and communications in cities with low CSAT. Also, ensure survey feedback is collected consistently over time to monitor improvements effectively.

Leverage Pricing Insights:
Investigate dissatisfaction patterns around item prices to refine product quality controls, returns policies, or customer communication, especially for high or low-priced items.

Implement Data-Driven Continuous Improvement:
Use predictive models developed from key features (e.g., handling time, issue category, tenure bucket) to proactively identify and resolve potential dissatisfaction before it escalates.

By taking these targeted, data-driven steps, Flipkart can improve customer experience, reduce churn, and boost overall satisfaction—leading to stronger brand loyalty and increased revenue.
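As an illustration of the predictive step mentioned above, the following is a rough sketch of a logistic regression that scores interactions by dissatisfaction risk. It is not a model built in this notebook: the data is synthetic, `category` is an assumed column name for the issue category, the bucket labels are hypothetical, and the link between handling time and satisfaction is invented so the example has something to learn.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic data for illustration only.
rng = np.random.default_rng(42)
n = 300
toy = pd.DataFrame({
    "connected_handling_time": rng.normal(300, 80, n),
    "category": rng.choice(["Returns", "Order Related", "Refund Related"], n),
    "Tenure Bucket": rng.choice(["0-30", "31-60", ">90"], n),
})
# Made-up rule: longer handling time lowers the chance of a satisfied response.
p_satisfied = 1 / (1 + np.exp((toy["connected_handling_time"] - 300) / 80))
toy["CSAT_Binary"] = (rng.random(n) < p_satisfied).astype(int)

numeric = ["connected_handling_time"]
categorical = ["category", "Tenure Bucket"]

# Scale the numeric feature, one-hot encode the categorical ones, then fit the classifier.
model = Pipeline([
    ("prep", ColumnTransformer([
        ("num", StandardScaler(), numeric),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
    ])),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(toy[numeric + categorical], toy["CSAT_Binary"])

# Column 0 of predict_proba is the probability of class 0 (dissatisfied).
risk = model.predict_proba(toy[numeric + categorical])[:, 0]
at_risk = toy.assign(risk=risk).nlargest(10, "risk")
print(at_risk[["connected_handling_time", "category", "Tenure Bucket", "risk"]])
```

The sketch scores the same rows it was trained on to stay short; a real evaluation would need a held-out split and a check of class balance before the scores are used to trigger any action.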

# **Conclusion**

The exploratory data analysis revealed significant factors influencing customer satisfaction at Flipkart, including support channels, issue categories, agent performance, and geographic location. By identifying these key drivers, Flipkart can strategically target improvements in customer service processes and product management. Implementing focused interventions—such as agent training, regional support enhancements, and category-specific quality control—will help boost satisfaction scores and reduce negative feedback. Leveraging data insights to predict and proactively address customer concerns ensures a positive impact on customer loyalty and business growth. Overall, this analysis provides a clear roadmap for enhancing Flipkart’s customer experience and sustaining competitive advantage.
