# Flipkart Customer Service Satisfaction — Classification Project

**Dataset:** flipkart_customer_service.csv

This notebook builds a binary classification model that predicts customer satisfaction (CSAT) from the Flipkart customer service dataset.


## 1. Setup & Imports

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
import warnings
warnings.filterwarnings('ignore')

print('Libraries ready')

## 2. Load Dataset

In [None]:
df = pd.read_csv('flipkart_customer_service.csv')
print('Shape:', df.shape)
df.head()

## 3. Quick Checks

In [None]:
print('Missing values per column:\n', df.isnull().sum())
print('\nCSAT Score distribution:\n', df['CSAT Score'].value_counts())

## 4. Target Variable Transformation

Convert CSAT Score (1–5) to binary classification:
- **Satisfied (1)** = 4 or 5
- **Dissatisfied (0)** = 1–3

In [None]:
# Scores of 4 or 5 map to 1 (satisfied); everything else, including missing scores, maps to 0
df['satisfaction'] = (df['CSAT Score'] >= 4).astype(int)
print(df['satisfaction'].value_counts(normalize=True))

## 5. Feature Selection & Preprocessing

In [None]:
# Drop columns with very high missing values
cols_to_drop = ['Item_price','connected_handling_time','Customer_City','Product_category','order_date_time']
df = df.drop(columns=cols_to_drop)

# Text column
text_col = 'Customer Remarks'

# Categorical columns
cat_cols = ['channel_name','category','Sub-category','Agent_name','Supervisor','Manager','Tenure Bucket','Agent Shift']

# Numeric columns: none are kept after the drop above, so this list stays empty for now
numeric_cols = []

print('Remaining columns:', df.columns.tolist())

In [None]:
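# Encode categorical features as integer codes.
# Caveat: LabelEncoder is designed for target labels; the codes impose an arbitrary
# order that tree models tolerate but a linear model may treat as meaningful.
cat_encoded = pd.DataFrame()
for col in cat_cols:
    le = LabelEncoder()
    cat_encoded[col] = le.fit_transform(df[col].astype(str))

# Text feature: TF-IDF (missing remarks become empty strings).
# Caveat: the vectorizer is fitted on all rows before the train/test split,
# so vocabulary and IDF weights also reflect the test rows.
vectorizer = TfidfVectorizer(max_features=1000, stop_words='english')
X_text = vectorizer.fit_transform(df[text_col].fillna(''))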

# Combine features
from scipy.sparse import hstack, csr_matrix
X = hstack([csr_matrix(cat_encoded.values), X_text]).tocsr()
y = df['satisfaction'].values

print('Feature matrix shape:', X.shape)
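
The cell above fits the encoders on the full dataset before splitting. One possible alternative, shown here only as a sketch, wraps the preprocessing in a scikit-learn `Pipeline` so that it is fitted on the training rows alone. It also swaps `LabelEncoder` for `OneHotEncoder`, which is a different technique from the one used in this notebook. The `toy` DataFrame and its values are invented for illustration; only the column names mirror the dataset.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Toy stand-in for the real data (values are invented, not from the dataset)
toy = pd.DataFrame({
    'channel_name': ['Inbound', 'Outcall', 'Email', 'Inbound'] * 5,
    'Agent Shift': ['Morning', 'Evening', 'Night', 'Morning'] * 5,
    'Customer Remarks': ['very good service', 'bad experience', None, 'quick resolution'] * 5,
    'satisfaction': [1, 0, 0, 1] * 5,
})
toy['Customer Remarks'] = toy['Customer Remarks'].fillna('')

# One-hot encode the categoricals and TF-IDF the text column in a single step
preprocess = ColumnTransformer([
    ('cat', OneHotEncoder(handle_unknown='ignore'), ['channel_name', 'Agent Shift']),
    ('text', TfidfVectorizer(max_features=1000, stop_words='english'), 'Customer Remarks'),
])
model = Pipeline([('prep', preprocess), ('clf', LogisticRegression(max_iter=1000))])

# Split first, then fit: the encoders only ever see the training rows
X_tr, X_te, y_tr, y_te = train_test_split(
    toy.drop(columns='satisfaction'), toy['satisfaction'],
    test_size=0.2, random_state=42, stratify=toy['satisfaction'])
model.fit(X_tr, y_tr)
toy_preds = model.predict(X_te)
print('Predictions on held-out toy rows:', toy_preds)
```

With `handle_unknown='ignore'`, a category that appears only in the test rows is encoded as all zeros instead of raising an error.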

## 6. Train/Test Split

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)
print('Train:', X_train.shape, 'Test:', X_test.shape)

## 7. Train Models

In [None]:
lr = LogisticRegression(max_iter=1000)
lr.fit(X_train, y_train)
y_pred_lr = lr.predict(X_test)
print('Logistic Regression Results:\n', classification_report(y_test, y_pred_lr))

In [None]:
rf = RandomForestClassifier(n_estimators=200, random_state=42, n_jobs=-1)
rf.fit(X_train, y_train)
y_pred_rf = rf.predict(X_test)
print('Random Forest Results:\n', classification_report(y_test, y_pred_rf))

## 8. Evaluation: ROC-AUC & Confusion Matrix

In [None]:
# Both estimators expose predict_proba; column 1 holds the probability of class 1 (satisfied)
lr_probs = lr.predict_proba(X_test)[:, 1]
print('Logistic Regression ROC-AUC:', roc_auc_score(y_test, lr_probs))
rf_probs = rf.predict_proba(X_test)[:, 1]
print('Random Forest ROC-AUC:', roc_auc_score(y_test, rf_probs))

cm = confusion_matrix(y_test, y_pred_rf)
plt.figure(figsize=(4,3))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues')
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.title('Confusion Matrix - Random Forest')
plt.show()
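
To make the ROC-AUC numbers easier to interpret, the sketch below computes the same quantity by hand on four invented tickets: the fraction of (satisfied, dissatisfied) pairs in which the satisfied ticket receives the higher score, with ties counted as half. The helper `roc_auc_pairwise` is a hypothetical name used for illustration and is not part of scikit-learn.

```python
def roc_auc_pairwise(y_true, scores):
    """ROC-AUC as the share of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    # A correctly ordered pair counts 1, a tie counts 0.5, a wrong order counts 0
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented labels and predicted probabilities
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc_pairwise(labels, scores)
print(auc)  # 3 of the 4 pairs are ordered correctly -> 0.75
```

The pairwise loop is quadratic in the number of tickets, so it is only suitable for small illustrations; `roc_auc_score` remains the practical choice on the full test set.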

## 9. Feature Importance

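This section ranks TF-IDF words by the difference in their mean weight between satisfied and dissatisfied tickets. It is a descriptive comparison rather than a model-based importance score, and it does not cover the categorical features.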

In [None]:
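# Difference in mean TF-IDF weight between satisfied and dissatisfied tickets
# (a descriptive signal, not a correlation or a model importance).
# Means are taken on the sparse matrix to avoid building a dense copy.
mean_pos = np.asarray(X_text[y == 1].mean(axis=0)).ravel()
mean_neg = np.asarray(X_text[y == 0].mean(axis=0)).ravel()
diff = mean_pos - mean_neg
feature_names = vectorizer.get_feature_names_out()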

print('Top positive words:')
for i in np.argsort(diff)[-10:][::-1]:
    print(feature_names[i], diff[i])

print('\nTop negative words:')
for i in np.argsort(diff)[:10]:
    print(feature_names[i], diff[i])

## 10. Recommendations & Next Steps

- Prioritize the channels and categories with the lowest satisfaction rates.
- Coach agents with low satisfaction rates; a per-agent breakdown by `Agent_name` would identify them.
- Monitor common negative feedback words to flag at-risk tickets.
- Deploy the model so that support teams can act on likely dissatisfaction proactively.
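
The first two recommendations rest on per-group satisfaction rates. A minimal sketch of that breakdown follows, using an invented `toy` DataFrame in place of `df`. The same `groupby` pattern should apply to `category` or `Agent_name`, assuming the `satisfaction` column from section 4 is present.

```python
import pandas as pd

# Invented tickets; in the notebook this would be df with its 'satisfaction' column
toy = pd.DataFrame({
    'channel_name': ['Inbound', 'Inbound', 'Email', 'Email', 'Outcall', 'Outcall'],
    'satisfaction': [1, 1, 0, 1, 0, 0],
})

# Satisfaction rate and ticket count per channel, lowest rate first
rates = (toy.groupby('channel_name')['satisfaction']
            .agg(rate='mean', tickets='count')
            .sort_values('rate'))
print(rates)
```

Keeping the ticket count next to the rate helps avoid over-reading groups with very few tickets.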
