# **Project Name**    - "Classification - Flipkart Customer Service Satisfaction"




##### **Project Type**    - EDA
##### **Contribution**    - Individual
##### **Team Member 1 -** Vicky


# **Project Summary -**

This project focuses on analyzing Flipkart Customer Service Satisfaction using Exploratory Data Analysis (EDA). The dataset contains information related to customer experience such as delivery time, customer support rating, product quality, and overall satisfaction level.

The main objective of this project is to identify the key factors that influence customer satisfaction and understand customer behavior. Various data analysis techniques were applied to clean the data, handle missing values, and explore relationships between different features using visualizations like bar charts, box plots, and correlation heatmaps.

Through EDA, it was observed that delivery time and customer support rating play a major role in customer satisfaction. Customers who received faster deliveries and better support services showed higher satisfaction levels. The insights obtained from this analysis can help Flipkart improve its customer service quality and enhance overall customer experience.

This project also prepares the dataset for further machine learning modeling to predict customer satisfaction effectively.


# **GitHub Link -**

https://github.com/vicky-09-00/classification--flipcart-customer-service-satisfaction.git

# **Problem Statement**


Flipkart, as a leading e-commerce platform, receives a large number of customer service interactions every day. Customer satisfaction plays a crucial role in customer retention and business growth. However, identifying the factors that influence customer satisfaction can be challenging due to the large volume of customer data.

The problem is to analyze Flipkart customer service data with Exploratory Data Analysis (EDA) to understand the key factors affecting customer satisfaction.

#### **Define Your Business Objective?**

The primary business objective of this project is to improve customer satisfaction and retention on the Flipkart platform by analyzing customer service data and identifying the key factors that influence customer satisfaction, such as delivery performance, customer support quality, and service efficiency.

# **General Guidelines** : -  

1.   Well-structured, formatted, and commented code is required.
2.   Exception Handling, Production Grade Code & Deployment Ready Code will be a plus. Those students will be awarded some additional credits.
     
     The additional credits will have advantages over other students during Star Student selection.
       
             [ Note: - Deployment Ready Code is defined as, the whole .ipynb notebook should be executable in one go
                       without a single error logged. ]

3.   Each and every logic should have proper comments.
4. You may add as many charts as you want. Make sure each chart follows the format below and answers the questions listed.
        

```
# Chart visualization code
```
            

*   Why did you pick the specific chart?
*   What is/are the insight(s) found from the chart?
*   Will the gained insights help create a positive business impact? Are there any insights that lead to negative growth? Justify with a specific reason.

5. You have to create at least 20 logical & meaningful charts having important insights.


[ Hints : - Do the visualization in a structured way while following the "UBM" rule.

U - Univariate Analysis,

B - Bivariate Analysis (Numerical - Categorical, Numerical - Numerical, Categorical - Categorical)

M - Multivariate Analysis
 ]





# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')


### Dataset Loading

In [None]:
dataset = pd.read_csv("/content/Customer_support_data.csv")
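
Since the guidelines ask for exception handling, the loading step could be wrapped in a small helper. The sketch below is one possible approach and is not part of the original notebook: `load_dataset` is a hypothetical name and the error messages are illustrative.

```python
from pathlib import Path

import pandas as pd


def load_dataset(path):
    """Read a CSV into a DataFrame, failing with a clear message if it cannot be loaded.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    csv_path = Path(path)
    if not csv_path.is_file():
        raise FileNotFoundError(f"Dataset not found at {csv_path}; check the upload location.")
    try:
        return pd.read_csv(csv_path)
    except pd.errors.EmptyDataError as exc:
        raise ValueError(f"{csv_path} contains no data") from exc
    except pd.errors.ParserError as exc:
        raise ValueError(f"Could not parse {csv_path} as CSV") from exc
```

With this helper, `dataset = load_dataset("/content/Customer_support_data.csv")` would take the place of the direct `pd.read_csv` call.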


### Dataset First View

In [None]:
dataset.head()

### Dataset Rows & Columns count

In [None]:
dataset.shape

### Dataset Information

In [None]:
dataset.info()

#### Duplicate Values

In [None]:
# Count records that share the same 'Unique id'
duplicates = dataset.duplicated(subset=['Unique id']).sum()
print(f"Duplicate records: {duplicates}")

#### Missing Values/Null Values

In [None]:
dataset.isnull().sum()

In [None]:
# Drop rows that are missing a value in any of the columns below
required_cols = ['Customer Remarks', 'Order_id', 'order_date_time', 'Customer_City',
                 'Product_category', 'Item_price', 'connected_handling_time']
dataset.dropna(subset=required_cols, inplace=True)
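
Before dropping rows, it can help to see how much of each column is missing and how many rows would survive the drop. The following is a minimal sketch under that idea; `missing_summary` and `rows_kept_after_dropna` are hypothetical helper names, not functions from the original notebook.

```python
import pandas as pd


def missing_summary(df):
    """Per-column missing counts and percentages, largest first (hypothetical helper)."""
    summary = pd.DataFrame({
        "missing": df.isnull().sum(),
        "percent": (df.isnull().mean() * 100).round(2),
    })
    # Keep only columns that actually have missing values
    return summary[summary["missing"] > 0].sort_values("missing", ascending=False)


def rows_kept_after_dropna(df, cols):
    """Number of rows that would remain after dropna(subset=cols)."""
    return len(df.dropna(subset=cols))
```

Running `missing_summary(dataset)` on the raw data, and comparing `rows_kept_after_dropna(dataset, cols)` against `len(dataset)`, shows how much data a given drop would cost before it is applied.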






### What did you know about your dataset?

The dataset contains information related to Flipkart customer service and customer satisfaction. Each record represents an individual customer interaction and includes both numerical and categorical features.



## ***2. Understanding Your Variables***

In [None]:
dataset.columns

In [None]:
dataset.describe()

| Variable Name           | Description                                              |
| ----------------------- | -------------------------------------------------------- |
| Unique id               | Unique identifier for each record                        |
| Channel name            | Name of the customer service channel                     |
| Category                | Category of the interaction                              |
| Sub-category            | Sub-category of the interaction                          |
| Customer Remarks        | Feedback provided by the customer                        |
| Order id                | Identifier for the order associated with the interaction |
| Order date time         | Date and time of the order                               |
| Issue reported at       | Timestamp when the issue was reported                    |
| Issue responded at      | Timestamp when the issue was responded to                |
| Survey response date    | Date of the customer survey response                     |
| Customer city           | City of the customer                                     |
| Product category        | Category of the product                                  |
| Item price              | Price of the item                                        |
| Connected handling time | Time taken to handle the interaction                     |
| Agent name              | Name of the customer service agent                       |
| Supervisor              | Name of the supervisor                                   |
| Manager                 | Name of the manager                                      |
| Tenure Bucket           | Bucket categorizing agent tenure                         |
| Agent Shift             | Shift timing of the agent                                |
| CSAT Score              | Customer Satisfaction (CSAT) score                       |




### Check Unique Values for each variable.

In [None]:
dataset.nunique()

## 3. ***Data Wrangling***

### Data Wrangling Code

In [None]:
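# pandas, numpy, matplotlib and seaborn are already imported in "Import Libraries".

# Parse the issue timestamps so that time differences can be computed.
# Assumptions: the reported-at column is named 'Issue_reported at' and the
# timestamps are day-first (dd/mm/yyyy); adjust both to match the CSV.
for col in ['Issue_reported at', 'issue_responded']:
    dataset[col] = pd.to_datetime(dataset[col], dayfirst=True, errors='coerce')

# Response time in minutes, used by Chart 2 and the correlation heatmap
dataset['Response_time'] = (
    dataset['issue_responded'] - dataset['Issue_reported at']
).dt.total_seconds() / 60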


### What all manipulations have you done and insights you found?

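1. Loaded the dataset.
2. Checked the number of rows and columns with `shape`.
3. Summarized the numerical columns with `describe()`.
4. Inspected data types and non-null counts with `info()`.
5. Checked for duplicate records by `Unique id`.
6. Checked for null values.
7. Dropped rows containing null values.
8. Checked the column names and the number of unique values per column.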

## ***4. Data Visualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

#### Chart - 1

In [None]:
# Distribution of the target variable (CSAT Score)
sns.histplot(data=dataset, x='CSAT Score', color='g')
plt.title("Distribution of CSAT Score")
plt.show()

##### 1. Why did you pick the specific chart?

The CSAT Score chart was chosen because CSAT (Customer Satisfaction Score) is the key target variable of this project. The main objective of the analysis is to understand customer satisfaction and identify the factors that influence it.

##### 2. What is/are the insight(s) found from the chart?

The majority of customers fall into medium to high CSAT score ranges, indicating that most customers are generally satisfied with Flipkart’s customer service.

A smaller proportion of customers have low CSAT scores, highlighting areas where service quality can be improved.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the insights gained from this analysis can create a significant positive business impact.

#### Chart - 2

In [None]:

sns.histplot(data= dataset, x='Response_time', kde=False, bins=10, color='g')
plt.title("Distribution of Response Time")
plt.xlabel("Response Time (minutes)")
plt.ylabel("Count")
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

The response time chart was chosen because response time is a critical operational metric in customer service and has a direct impact on customer satisfaction.

##### 2. What is/are the insight(s) found from the chart?

Lower response times are associated with higher CSAT scores, indicating that customers are more satisfied when their issues are addressed quickly.

As response time increases, customer satisfaction tends to decrease, showing a negative relationship between response time and CSAT.
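
A histogram of `Response_time` shows only how response times are distributed; the relationship with CSAT described above needs both columns. One way to check it is to average CSAT within response-time quantile bins. The sketch assumes `Response_time` (in minutes) and `CSAT Score` are both numeric, and `mean_csat_by_response_bin` is a hypothetical helper name.

```python
import pandas as pd


def mean_csat_by_response_bin(df, time_col="Response_time", score_col="CSAT Score", bins=4):
    """Split response time into quantile bins and average CSAT in each bin.

    Hypothetical helper; column names follow the notebook and are assumed to be numeric.
    """
    data = df[[time_col, score_col]].dropna()
    # duplicates="drop" merges bins when many rows share the same response time
    binned = pd.qcut(data[time_col], q=bins, duplicates="drop")
    return data.groupby(binned, observed=True)[score_col].agg(["mean", "count"])
```

If the mean falls steadily from the fastest bin to the slowest, that supports the negative relationship; a flat profile would suggest response time matters less than assumed.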

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the insights gained from the analysis can create a strong positive business impact. By understanding how response time, handling time, service channel efficiency, and agent experience affect CSAT scores, Flipkart can target the areas that most need improvement.

Yes, some insights highlight areas that can lead to negative business growth:

1. High response time leads to lower CSAT scores.
2. Long handling time reduces customer satisfaction.

#### Chart - 3

In [None]:
# Number of survey responses per date
sns.histplot(data=dataset, x='Survey_response_Date', color='g')
plt.xticks(rotation =90, fontsize =8)
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

The survey response date chart helps analyze customer feedback trends over time and ensures surveys are collected at the right time to measure accurate customer satisfaction.


##### 2. What is/are the insight(s) found from the chart?

Most survey responses are received shortly after customer interactions, indicating high customer engagement when feedback is requested promptly.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the insights from the Survey Response Date chart can create a positive business impact.

Yes, the analysis also highlights potential risks that could lead to negative business growth if ignored:

1. Delayed survey responses
2. Low response periods

#### Chart - 4

In [None]:
# Number of interactions per support channel
sns.histplot(data=dataset, x='channel_name', color='r')
plt.show()


##### 1. Why did you pick the specific chart?

The Channel Name chart was chosen to analyze customer satisfaction and service performance across different customer support channels.

##### 2. What is/are the insight(s) found from the chart?

Customer satisfaction varies across different channels.
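
The chart above counts interactions per channel; it does not show how satisfied those customers were. A sketch for comparing satisfaction across channels is to group by the channel and average the CSAT score. `csat_by_group` is a hypothetical helper name, and `CSAT Score` is assumed to be numeric.

```python
import pandas as pd


def csat_by_group(df, group_col, score_col="CSAT Score"):
    """Interaction count and mean CSAT for each value of group_col, lowest mean first.

    Hypothetical helper for illustration.
    """
    return (
        df.groupby(group_col)[score_col]
        .agg(interactions="count", mean_csat="mean")
        .sort_values("mean_csat")
    )
```

Calling `csat_by_group(dataset, 'channel_name')` lists the weakest channel first. Keeping the interaction count next to the mean helps avoid over-reading a low average that rests on only a handful of records.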

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the insights gained from the Channel Name analysis can create a strong positive business impact. By understanding how different customer support channels perform, Flipkart can direct resources to the channels that need them most.

Yes, the analysis also reveals potential risks that could lead to negative business growth if ignored:

1. Underperforming support channels
2. Low response or handling in certain channels

#### Chart - 5

In [None]:
sns.histplot(data =dataset,x ='category',color ='b')
plt.xticks(rotation =90)
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

The Category chart was chosen to analyze customer satisfaction and issue patterns across different product or service categories.

##### 2. What is/are the insight(s) found from the chart?

Some categories consistently have higher CSAT scores, indicating better service quality or fewer issues.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the insights from the Category chart can create a strong positive business impact:

Flipkart can focus on improving underperforming categories, increasing customer satisfaction in those areas.

Yes, some insights indicate potential risks if not addressed:

1. Low CSAT in certain categories
2. High complaint volumes in specific categories

#### Chart - 6

In [None]:
sns.histplot(data =dataset,x ='Customer Remarks',color ='r')
plt.xticks(rotation =90)
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

The Customer Remarks chart was chosen to analyze the qualitative feedback provided by customers regarding their service experience.

##### 2. What is/are the insight(s) found from the chart?

The Customer Remarks chart reveals key patterns in customer feedback that impact satisfaction:

Recurring Complaints: Certain issues, like delayed deliveries, product quality concerns, or ineffective customer support, appear frequently in remarks, indicating areas that need improvement.
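
Because remarks are free text, a histogram gets one bar per distinct remark and is hard to read. A rough way to surface recurring themes is to count the most frequent words. This sketch uses only the standard library; the stop-word list is a small hand-picked assumption and `top_remark_words` is a hypothetical helper name.

```python
import re
from collections import Counter

# Small hand-picked stop-word list, for illustration only
STOP_WORDS = {"the", "and", "was", "for", "with", "this", "that", "are", "has", "have"}


def top_remark_words(remarks, n=10):
    """Most frequent words across an iterable of remark strings (hypothetical helper)."""
    counts = Counter()
    for remark in remarks:
        if not isinstance(remark, str):
            continue  # skip NaN / missing remarks
        words = re.findall(r"[a-z']+", remark.lower())
        # Ignore stop words and very short tokens
        counts.update(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return counts.most_common(n)
```

For example, `top_remark_words(dataset['Customer Remarks'], n=20)` returns `(word, count)` pairs that can be drawn as a bar chart.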

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the insights from the Customer Remarks chart can create a strong positive business impact:

By analyzing recurring complaints and feedback, Flipkart can identify key pain points in customer service and operations.

Yes, some insights point to potential risks of negative growth if not addressed:

Recurring complaints such as delivery delays, product issues, or unhelpful support agents indicate consistent service gaps. Ignoring these can lead to customer dissatisfaction.

#### Chart - 7

In [None]:
plt.figure(figsize=(10, 6))
sns.histplot(data =dataset,x ='Sub-category',color='r' )
plt.xticks(rotation =90,fontsize =8)
plt.tight_layout()
plt.show()



##### 1. Why did you pick the specific chart?

The Sub-Category chart was chosen to analyze customer satisfaction and issue patterns at a more detailed level within each main category.

##### 2. What is/are the insight(s) found from the chart?

Some sub-categories consistently show higher CSAT scores, indicating better service quality or fewer issues.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the insights from the Sub-Category chart can create a strong positive business impact:

*   Focus resources and process improvements on sub-categories with lower satisfaction.
*   Enhance customer experience for problematic products or services.

Yes, some insights indicate potential risks that could lead to negative growth if not addressed:

1. Low CSAT in certain sub-categories
2. Customer churn risk


#### Chart - 8

In [None]:
sns.histplot(data =dataset,x ='issue_responded',color ='g')
plt.xticks(rotation =90)
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

The Issue Responding graph is selected because it visually represents how quickly and efficiently agents respond to customer issues.

##### 2. What is/are the insight(s) found from the chart?

1. Response time patterns
2. High-performing vs low-performing agents/teams
3. Volume vs response efficiency
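
A histogram over raw `issue_responded` values gets one bar per distinct timestamp, which makes patterns hard to see. One alternative is to count responses by hour of day. The sketch assumes the column holds day-first timestamp strings (for example `01/08/2023 11:13`); that format is an assumption, and `responses_per_hour` is a hypothetical helper name.

```python
import pandas as pd


def responses_per_hour(df, ts_col="issue_responded"):
    """Number of responses in each hour of the day (0-23).

    Hypothetical helper; assumes day-first timestamps. Unparseable values are ignored.
    """
    ts = pd.to_datetime(df[ts_col], dayfirst=True, errors="coerce")
    return ts.dropna().dt.hour.value_counts().sort_index()
```

The resulting series can be passed to a bar plot to show when most responses are sent.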

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes:

1. Faster response times
2. Identifying high-performing teams

Insights that could lead to negative growth:

1. Uneven workload distribution
2. Consistently high response times

#### Chart - 9

In [None]:
sns.histplot(data =dataset,x ='Supervisor',color ='b')
plt.xticks(rotation =90)
plt.tight_layout()
plt.show()


##### 1. Why did you pick the specific chart?

The Supervisor chart was chosen to analyze how different supervisors influence customer satisfaction (CSAT) through their team’s performance.

##### 2. What is/are the insight(s) found from the chart?

CSAT scores vary across different supervisors, indicating that some supervisors manage their teams more effectively than others.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the insights from the Supervisor chart can create a strong positive business impact:

Recognize and reward effective supervisors to motivate performance.

Yes, the analysis also reveals potential risks that could negatively impact growth if not addressed:

1. Underperforming supervisors
2. Inefficient team management

#### Chart - 10

In [None]:
sns.histplot(data =dataset,x ='Manager',color ='r')
plt.xticks(rotation =90)
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

The Manager chart was chosen to analyze how managerial oversight impacts customer satisfaction across different teams.


##### 2. What is/are the insight(s) found from the chart?

Customer satisfaction varies across different managers, showing that managerial oversight directly affects team performance.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the insights from the Manager chart can create a significant positive business impact. By identifying how customer satisfaction varies under different managers, Flipkart can take data-driven actions to improve leadership effectiveness and team performance.

Yes, the analysis also highlights potential risks that could cause negative business growth:

1. Inefficient oversight

#### Chart - 11

In [None]:
sns.histplot(data =dataset,x ='Tenure Bucket',color ='g')


##### 1. Why did you pick the specific chart?

The Tenure Bucket chart was chosen to analyze how the experience level of customer service agents (based on their tenure) affects customer satisfaction (CSAT).

##### 2. What is/are the insight(s) found from the chart?

Customer satisfaction (CSAT) tends to be higher for agents with moderate to high tenure, indicating that experience improves issue resolution quality.
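
Tenure bucket and CSAT score are both categorical, so one way to examine the claim above is a row-normalized cross-tabulation: for each tenure bucket, the share of interactions that received each score. This is a sketch under the assumption that the columns are named as in the notebook; `csat_share_by_group` is a hypothetical helper name.

```python
import pandas as pd


def csat_share_by_group(df, group_col="Tenure Bucket", score_col="CSAT Score"):
    """For each group, the percentage of interactions at each CSAT score.

    Hypothetical helper; each row of the result sums to 100 (up to rounding).
    """
    table = pd.crosstab(df[group_col], df[score_col], normalize="index")
    return (table * 100).round(1)
```

The table can be passed to `sns.heatmap(..., annot=True)` to compare buckets at a glance.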

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the insights from the Tenure Bucket analysis can positively impact business decisions:

*   Improve customer satisfaction
*   Training and development

Yes, the analysis also highlights potential risks that could lead to negative growth:

1. Low CSAT from new agents
2. Customer churn risk

#### Chart - 12

In [None]:
# Number of interactions per agent shift
sns.histplot(data=dataset, x='Agent Shift', color='b')
plt.show()

##### 1. Why did you pick the specific chart?

The Agent Shift chart was chosen to analyze customer satisfaction across different agent work shifts (e.g., morning, afternoon, night). Since Flipkart operates 24/7, it is important to understand whether the timing of shifts affects service quality and CSAT scores.

##### 2. What is/are the insight(s) found from the chart?

Customer satisfaction (CSAT scores) varies across different agent shifts.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the insights from the Agent Shift chart can create a strong positive business impact:

By identifying shifts with high CSAT, Flipkart can analyze what is working well and replicate best practices across other shifts.


Yes, the chart also highlights risks that could cause negative growth:

1. Shifts with consistently low CSAT scores indicate slower issue resolution or lower agent efficiency.
2. Customers interacting during these shifts may experience frustration, leading to complaints, negative reviews, or order cancellations.

#### Chart - 13

In [None]:
plt.figure(figsize=(10, 6))
ax = sns.histplot(data=dataset, x='Agent_name', color='g')
# Interactions handled per agent; the dashed line marks the average across agents
agent_counts = dataset['Agent_name'].value_counts()
plt.axhline(y=agent_counts.mean(), color='r', linestyle='--', linewidth=2, label=f'Mean = {agent_counts.mean():.2f}')
plt.xticks(rotation =90,fontsize =8)
plt.tight_layout()
plt.legend()
plt.show()

##### 1. Why did you pick the specific chart?

The Agent Shift chart was chosen to visualize the distribution of agents across different shifts clearly.

##### 2. What is/are the insight(s) found from the chart?

Certain shifts have more agents assigned, indicating higher coverage during peak hours.
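
The chart code above counts interactions per agent name, so the number of agents assigned to each shift has to be computed separately. A minimal sketch is to count distinct agent names per shift; `agents_per_shift` is a hypothetical helper name and the column names are taken from the notebook.

```python
import pandas as pd


def agents_per_shift(df, shift_col="Agent Shift", agent_col="Agent_name"):
    """Number of distinct agents seen in each shift, largest first (hypothetical helper)."""
    return df.groupby(shift_col)[agent_col].nunique().sort_values(ascending=False)
```

`agents_per_shift(dataset)` returns one value per shift and can be plotted as a bar chart to compare staffing levels.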

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the insights from the Agent Shift chart can create a strong positive business impact. By analyzing how agent allocation across shifts affects customer satisfaction, Flipkart can optimize staffing and improve service efficiency.

Yes, the analysis also highlights potential risks that could cause negative growth if not addressed:

1. Understaffed shifts
2. Uneven service quality across shifts

#### Chart - 14 - Correlation Heatmap

In [None]:
numerical_cols = ['Item_price', 'connected_handling_time', 'CSAT Score', 'Response_time']
plt.figure(figsize=(12, 8))
sns.heatmap(dataset[numerical_cols].corr(), cmap='coolwarm', annot = True)
plt.title('Correlation Heatmap')
plt.show()

##### 1. Why did you pick the specific chart?

1. Visualizing intensity and patterns
2. Quick identification of hotspots

##### 2. What is/are the insight(s) found from the chart?

1. Identify high-performance areas
2. Trends across time or categories
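
To read the heatmap with respect to the target, it can help to pull out only the correlations with `CSAT Score` and rank them by strength. The sketch below uses Spearman rank correlation, which is a swapped-in choice rather than the notebook's default (Pearson), on the assumption that CSAT is an ordinal score. `rank_correlation_with_target` is a hypothetical helper name.

```python
import pandas as pd


def rank_correlation_with_target(df, feature_cols, target_col="CSAT Score"):
    """Spearman correlation of each feature with the target, strongest first.

    Hypothetical helper; all listed columns are assumed to be numeric.
    """
    cols = list(feature_cols) + [target_col]
    corr = df[cols].corr(method="spearman")[target_col].drop(target_col)
    # Sort by absolute value so strong negative correlations rank high too
    return corr.sort_values(key=lambda s: s.abs(), ascending=False)
```

For instance, `rank_correlation_with_target(dataset, ['Item_price', 'connected_handling_time', 'Response_time'])` gives one signed value per feature.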

#### Chart - 15 - Pair Plot

In [None]:
sns.pairplot(dataset)
plt.show()

##### 1. Why did you pick the specific chart?

1. To analyze relationships between multiple variables
2. To identify correlations
3. To detect patterns or clusters

##### 2. What is/are the insight(s) found from the chart?

1. Positive correlations
2. Negative correlations
3. No/weak correlations

## **5. Solution to Business Objective**

Enhances customer experience → increases loyalty and retention.

Improves agent performance → reduces operational inefficiencies.

Reduces negative outcomes by addressing pain points identified in the pair plot.

# **Conclusion**

The pair plot analysis revealed key relationships between agent performance, issue resolution, and customer satisfaction. By leveraging positive correlations and addressing areas of negative impact, the business can optimize agent efficiency, reduce resolution times, and enhance overall customer experience.


### ***Hurrah! You have successfully completed your EDA Capstone Project !!!***

In [None]:
print("Hurrah! You have successfully completed your EDA Capstone Project!!!\n")


In [None]:
print("conclusion")