### Modeling Objectives

- Train and evaluate classification models to predict satisfaction.
- Use SHAP or LIME to explain key satisfaction drivers.
- Monitor performance using recall, F1 and AUC scores.

In [1]:
# Import libraries

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score, f1_score, recall_score
from sklearn.preprocessing import StandardScaler, LabelEncoder
import matplotlib.pyplot as plt
import seaborn as sns

import warnings
warnings.filterwarnings("ignore")

# Import explainability libraries 
import shap
import yellowbrick



In [2]:
# Load and prepare data
df = pd.read_csv('eda_incl.csv')  
df.head()

Unnamed: 0,Agency Name,Complaint Type,Descriptor,Borough,Resolution Description,Survey Year,Survey Month,Satisfaction Response,Dissatisfaction Reason,Justified Dissatisfaction,Cluster,Combined_Feedback,Sentiment Score,Sentiment Label
0,Department of Buildings,Adult Establishment,Zoning Violation,MANHATTAN,The Department of Buildings investigated this ...,2022,10,Strongly Agree,Not Applicable,Delays in inspections or provision of construc...,0,Delays in inspections or provision of construc...,0.0,neutral
1,Department of Buildings,Adult Establishment,Zoning Violation,BROOKLYN,The Department of Buildings investigated this ...,2024,11,Strongly Disagree,The Agency did not correct the issue.,Delays in inspections or provision of construc...,1,Delays in inspections or provision of construc...,0.0,neutral
2,Department for the Aging,Legal Services Provider Complaint,Not Provided,MANHATTAN,The Department for the Aging contacted you and...,2024,3,Strongly Agree,Not Applicable,Lack of timely support or services for senior ...,0,Lack of timely support or services for senior ...,0.1027,positive
3,Department of Buildings,Advertising Sign,Poster,MANHATTAN,The Department of Buildings reviewed this comp...,2024,2,Neutral,Not Applicable,Delays in inspections or provision of construc...,0,Delays in inspections or provision of construc...,0.0,neutral
4,Department of Buildings,Advertising Sign,Billboard,MANHATTAN,The Department of Buildings investigated this ...,2023,10,Strongly Disagree,"Status updates were unhelpful, inaccurate, inc...",Delays in inspections or provision of construc...,3,Delays in inspections or provision of construc...,0.0,neutral


In [3]:
# Importing the pipeline class
from satisfaction_pipeline import SatisfactionPipeline

In [4]:
# Initialize and train pipeline
satisfaction_model = SatisfactionPipeline()
# Fit the model
satisfaction_model.fit(df)

Recall: 0.983
F1 Score: 0.926
AUC Score: 0.971

Top 5 Important Features:
Cluster            0.907347
Complaint Type     0.035440
Survey Month       0.017973
Sentiment Score    0.015016
Borough            0.009906
dtype: float64


<satisfaction_pipeline.SatisfactionPipeline at 0x21165c21910>

In [None]:
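# Make predictions on the full dataset (this includes the rows the model was fit on,
# so these outputs are not an out-of-sample evaluation)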
data = df
predictions = satisfaction_model.predict(data)
probabilities = satisfaction_model.predict_proba(data)

In [6]:
print(f"\nPredictions: {predictions}")
print(f"Satisfaction Probabilities: {probabilities[:, 1]}")


Predictions: [1 0 1 ... 0 1 0]
Satisfaction Probabilities: [0.85833333 0.         0.93940476 ... 0.38114286 0.9065613  0.        ]
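The relationship between the two printed arrays can be checked by hand. The sketch below assumes that `predict` applies a 0.5 cutoff to the positive-class probability; the source does not confirm how `SatisfactionPipeline.predict` derives its labels, so treat the threshold as an assumption. The six probabilities are the ones visible in the output above (the elided middle is left out).

```python
# Probabilities copied from the printed output; the "..." portion is not reproduced.
probabilities = [0.85833333, 0.0, 0.93940476, 0.38114286, 0.9065613, 0.0]

# Assumed decision threshold (not confirmed by the pipeline's source).
threshold = 0.5

# A row is labelled 1 (satisfied) when its probability reaches the threshold.
predictions = [int(p >= threshold) for p in probabilities]
print(predictions)
```

Under that assumption the labels match the visible entries of the `Predictions` array (`1 0 1` at the start and `0 1 0` at the end).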


In [16]:
# Create target
df['Satisfied'] = df['Satisfaction Response'].apply(
    lambda x: 1 if x in ['Strongly Agree', 'Agree'] else 0
)

# Select features
features = ['Agency Name', 'Complaint Type', 'Borough', 'Survey Year', 'Survey Month', 'Cluster', 'Sentiment Score']
X = df[features].copy()
y = df['Satisfied']

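# Encode categorical variables as integer codes (LabelEncoder is imported in the first cell).
# Note: the encoders are fit on the full dataset, before the train/test split below.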
le_dict = {}
for col in X.select_dtypes(include=['object']).columns:
    le = LabelEncoder()
    X[col] = le.fit_transform(X[col].astype(str))
    le_dict[col] = le

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)
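The cell above fits each encoder on all rows before splitting. One alternative, sketched here in plain Python rather than with `LabelEncoder`, is to learn the category codes from the training rows only and send unseen categories to a sentinel value. The helper names (`fit_category_codes`, `encode`), the `-1` sentinel, and the sample borough lists are all hypothetical and chosen for illustration; this is not the notebook's method.

```python
def fit_category_codes(values):
    """Map each category seen in `values` to an integer code, in sorted order."""
    return {cat: code for code, cat in enumerate(sorted(set(values)))}


def encode(values, codes, unknown=-1):
    """Encode values with a fitted mapping; categories not seen at fit time get `unknown`."""
    return [codes.get(v, unknown) for v in values]


# Illustrative rows only, not taken from the dataset.
train_boroughs = ["MANHATTAN", "BROOKLYN", "MANHATTAN", "QUEENS"]
test_boroughs = ["BROOKLYN", "BRONX"]  # BRONX never appears in the training rows

codes = fit_category_codes(train_boroughs)   # learned from training rows only
train_encoded = encode(train_boroughs, codes)
test_encoded = encode(test_boroughs, codes)

print(codes)
print(train_encoded, test_encoded)
```

Fitting on the training rows keeps the test set out of every preprocessing step. The cost is that unseen categories need explicit handling.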

In [None]:
import xgboost as xgb

# Quick XGBoost comparison using default hyperparameters (no grid search is run here)
xgb_model = xgb.XGBClassifier(random_state=42)
xgb_model.fit(X_train, y_train)
xgb_pred = xgb_model.predict(X_test)
print(f"XGBoost F1: {f1_score(y_test, xgb_pred):.3f}")
print(f"XGBoost Recall: {recall_score(y_test, xgb_pred):.3f}")
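# Note: this AUC is computed from hard class labels; passing
# xgb_model.predict_proba(X_test)[:, 1] instead would give the threshold-independent AUC.
print(f"XGBoost AUC Score: {roc_auc_score(y_test, xgb_pred):.3f}")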

XGBoost F1: 0.932
XGBoost Recall: 0.999
XGBoost AUC Score: 0.963
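The AUC printed by this cell is computed from `xgb_pred`, which holds hard 0/1 labels rather than probabilities. The sketch below uses a small rank-based AUC written in plain Python and toy data invented for illustration to show that the two inputs can give different values. It is not the scikit-learn implementation, and the numbers say nothing about the actual model.

```python
def auc(y_true, scores):
    """Rank-based AUC: the share of (positive, negative) pairs in which the
    positive example has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and scores, invented for illustration only.
y_true = [0, 0, 0, 1, 1, 1]
proba = [0.10, 0.40, 0.60, 0.55, 0.80, 0.90]
hard = [1 if p >= 0.5 else 0 for p in proba]  # what a predict()-style call would return

auc_proba = auc(y_true, proba)  # uses the full ranking
auc_hard = auc(y_true, hard)    # only two distinct score values remain

print(f"AUC from probabilities: {auc_proba:.3f}")
print(f"AUC from hard labels:   {auc_hard:.3f}")
```

Thresholding discards the ordering within each predicted class. An AUC computed from labels therefore describes a single operating point, not the model's overall ranking quality.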


The Random Forest pipeline and XGBoost perform almost identically. XGBoost scores slightly higher on F1 (0.932 vs. 0.926) and recall (0.999 vs. 0.983), while Random Forest has the higher AUC (0.971 vs. 0.963). Two caveats apply to the AUC figures: the XGBoost value is computed from hard class predictions rather than predicted probabilities, and the two values are directly comparable only if the pipeline computes its AUC the same way on a comparable held-out split. XGBoost's near-perfect recall is measured on the test set, so it does not by itself demonstrate overfitting. Combined with an F1 of 0.932, it implies precision of roughly 0.87, meaning the model captures almost every satisfied respondent partly by predicting "satisfied" for some who are not. Because XGBoost's gains are marginal, Random Forest remains the recommended choice for production: it is easier to interpret and tends to be less sensitive to hyperparameter settings, and it offers a good balance of accuracy, reliability, and business interpretability.
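Precision is not reported for either model, but it can be recovered from the F1 and recall figures by rearranging the F1 definition, $F_1 = 2PR/(P+R)$, into $P = F_1 R / (2R - F_1)$. The sketch below applies that formula to the rounded metrics printed earlier, so the results are approximate. The function name `implied_precision` is introduced here for illustration.

```python
def implied_precision(f1, recall):
    """Solve F1 = 2PR / (P + R) for P, given F1 and recall R."""
    return f1 * recall / (2 * recall - f1)


# Rounded metrics as printed in the notebook outputs.
rf_precision = implied_precision(f1=0.926, recall=0.983)
xgb_precision = implied_precision(f1=0.932, recall=0.999)

print(f"Random Forest implied precision: {rf_precision:.3f}")
print(f"XGBoost implied precision:       {xgb_precision:.3f}")
```

Both models come out at an implied precision of about 0.87, which suggests that XGBoost's recall advantage is not accompanied by a precision gain.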

### Comparison of Random Forest and XGBoost

| **Metric** | **Random Forest** | **XGBoost** | **Difference (XGBoost - Random Forest)** |
| ---------- | ----------------- | ----------- | ---------------------------------------- |
| F1 Score   | 0.926             | 0.932       | +0.006                                   |
| Recall     | 0.983             | 0.999       | +0.016                                   |
| AUC Score  | 0.971             | 0.963       | -0.008                                   |

In [22]:
# Save model
satisfaction_model.save_model('satisfaction_model.pkl')

Model saved to satisfaction_model.pkl
