# EMI Default Risk Prediction

## Problem Statement
Predict loan default risk using structured customer and loan data.

## Approach
Data cleaning, exploratory analysis, feature engineering, and classification models (logistic regression, random forest, XGBoost) trained on imbalanced data.

## Evaluation
Precision, recall, and ROC-AUC.

## Business Impact
Improves credit decisions by identifying high-risk applicants early.

# **EMIPredict AI - Intelligent Financial Risk Assessment Platform**

## 01. Imports

### ➤ *Import Libraries*

## 02. Uploading Files

### ➤ *Mount Google Drive*

### ➤ *Create a project folder in Drive*

### ➤ *File(s) check*

## 03. Loading Dataset

### ➤ *Auto detect .csv File*
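
One way to auto-detect the dataset is to scan the project folder for `.csv` files and take the most recently modified one. The sketch below is only an illustration: the helper name `find_latest_csv` and the "newest file wins" rule are assumptions, since the notebook's own detection logic is not shown.

```python
from pathlib import Path


def find_latest_csv(folder):
    """Return the most recently modified .csv file in `folder`, or None.

    Hypothetical helper: the selection rule (newest file wins) is an assumption.
    """
    candidates = [
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".csv"  # case-insensitive match
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

Returning `None` instead of raising lets the calling cell print a clear "no CSV found" message before attempting to load anything.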

### ➤ *Safe read .csv File*
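
A "safe" read usually means guarding against encoding problems. The following is a minimal sketch, assuming a fallback from UTF-8 to Latin-1; the function name `safe_read_csv` and the encoding order are illustrative choices, not the notebook's confirmed behavior.

```python
import pandas as pd


def safe_read_csv(path, encodings=("utf-8", "latin-1")):
    """Try each encoding in turn and return the first DataFrame that loads.

    Hypothetical helper: the encoding list is an assumption.
    """
    last_error = None
    for enc in encodings:
        try:
            return pd.read_csv(path, encoding=enc)
        except UnicodeDecodeError as exc:
            last_error = exc  # remember the failure and try the next encoding
    if last_error is None:
        raise ValueError("no encodings were supplied")
    raise last_error
```

Latin-1 can decode any byte sequence, so placing it last means the read succeeds even when the text is not valid UTF-8, at the cost of possibly mis-rendering some characters.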

## 04. Data Assessment

### ➤ *Dataset Rows & Columns Count*

### ➤ *Dataset Type*

### ➤ *Dataset First few Rows*

In [None]:
# Preview the first rows

display(df.head())

### ➤ *Dataset Last few Rows*

In [None]:
# Preview the last rows

display(df.tail())

### ➤ *Basic Statistics*

In [None]:
# Statistical summary

display(df.describe().T)

### ➤ *Dataset Info*

In [None]:
# Dataset info

df.info()

### ➤ *Unique Values*

### ➤ *Missing Values*
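
A per-column count and percentage is a common way to assess missing values. This sketch assumes a pandas DataFrame such as the notebook's `df`; the helper name `missing_summary` is hypothetical.

```python
import pandas as pd


def missing_summary(frame):
    """Missing-value count and percentage per column, worst columns first.

    Hypothetical helper; columns without missing values are omitted.
    """
    counts = frame.isna().sum()
    summary = pd.DataFrame({
        "missing": counts,
        # max(..., 1) avoids division by zero on an empty frame
        "percent": (counts / max(len(frame), 1) * 100).round(2),
    })
    return summary[summary["missing"] > 0].sort_values("missing", ascending=False)
```

In the notebook this could be shown with `display(missing_summary(df))`.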

### ➤ *Duplicate Values*

## 05. Data Cleaning

### ➤ *Missing values handling*
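
The imputation strategy used in the notebook is not shown. As one plausible approach, the sketch below fills numeric columns with the median and all other columns with the most frequent value; both choices and the name `fill_missing` are assumptions.

```python
import pandas as pd


def fill_missing(frame):
    """Return a copy with numeric gaps filled by the median, others by the mode.

    Hypothetical helper: the imputation strategy is an assumption.
    """
    out = frame.copy()  # leave the caller's DataFrame untouched
    for col in out.columns:
        if not out[col].isna().any():
            continue
        if pd.api.types.is_numeric_dtype(out[col]):
            out[col] = out[col].fillna(out[col].median())
        else:
            modes = out[col].mode(dropna=True)
            if not modes.empty:  # an all-missing column has no mode
                out[col] = out[col].fillna(modes.iloc[0])
    return out
```

The median is less sensitive to extreme incomes or loan amounts than the mean, which is why it is often preferred for skewed financial data.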

### ➤ *Missing values after handling*

### ➤ *Duplicate values handling*

### ➤ *Duplicate values after handling*

### ➤ *Standardize column names (lowercase + underscores) and inspect important names*
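
Lowercasing and replacing separators with underscores can be done with a single regular expression. This is a minimal sketch; the helper name `standardize_name` is hypothetical, and the example column names in the usage note are invented for illustration.

```python
import re


def standardize_name(name):
    """Lowercase a column name and collapse non-alphanumeric runs into '_'."""
    cleaned = re.sub(r"[^0-9a-zA-Z]+", "_", str(name).strip().lower())
    return cleaned.strip("_")  # drop leading/trailing underscores
```

Applied to a DataFrame: `df.columns = [standardize_name(c) for c in df.columns]`. A header like `" Monthly Salary "` becomes `monthly_salary`.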

### ➤ *Final Cleaning Summary*

## 06. Feature Engineering

## 07. Machine Learning (Data Modeling & Deployment)

### ➤ *Feature & Target Preparation*

### ➤ *Train / Test Split*
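
With imbalanced classes, a stratified split keeps the class ratio the same in both partitions. The sketch below assumes an 80/20 split and a fixed seed; the wrapper name `split_stratified` and both defaults are illustrative.

```python
from sklearn.model_selection import train_test_split


def split_stratified(X, y, test_size=0.2, seed=42):
    """Split X and y so that each partition keeps the class proportions of y.

    Hypothetical wrapper: test_size and seed are assumed defaults.
    """
    return train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )
```

Usage: `X_train, X_test, y_train, y_test = split_stratified(X, y)`. Stratification applies to classification targets only; for a continuous target such as a maximum EMI amount, `stratify` would be left out.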

### ➤ *Preprocessing Pipelines*
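
A typical preprocessing setup in scikit-learn routes numeric and categorical columns through separate sub-pipelines via `ColumnTransformer`. The following sketch assumes median imputation with scaling for numeric columns and most-frequent imputation with one-hot encoding for categorical ones; the function name `build_preprocessor` and these step choices are assumptions.

```python
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def build_preprocessor(numeric_cols, categorical_cols):
    """Build a ColumnTransformer for the given column lists (hypothetical helper)."""
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        # ignore categories at prediction time that were unseen during fit
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    return ColumnTransformer([
        ("num", numeric, numeric_cols),
        ("cat", categorical, categorical_cols),
    ])
```

Keeping imputation and scaling inside the transformer means they are fitted on the training split only, which avoids leaking test-set statistics into the model.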

### ➤ *Build full Pipeline with Estimator*
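
Chaining preprocessing and the estimator into one `Pipeline` lets a single `fit` call handle both. This sketch uses synthetic data and a scaler-only preprocessing step as stand-ins for the notebook's real dataset and preprocessor; `class_weight="balanced"` is one assumed way of handling the class imbalance mentioned in the approach.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in data (about 90% class 0, 10% class 1).
X_demo, y_demo = make_classification(
    n_samples=300, n_features=6, weights=[0.9, 0.1], random_state=0
)

baseline = Pipeline([
    ("scale", StandardScaler()),
    # "balanced" reweights classes inversely to their frequency
    ("model", LogisticRegression(class_weight="balanced", max_iter=1000)),
])
baseline.fit(X_demo, y_demo)

# Probability of the positive class for each row
proba = baseline.predict_proba(X_demo)[:, 1]
```

Because the whole chain is one object, it can be persisted and served as a unit without re-implementing the preprocessing steps.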

### ➤ *Build baseline model Pipeline*

### ➤ *Train baseline pipeline*

### ➤ *Logistic Regression Model (Classification) [EMI Eligibility]*

### ➤ *Logistic Regression Model (Classification) [EMI Eligibility] Evaluation*

### ➤ *Random Forest Classifier (Classification) [EMI Eligibility]*

### ➤ *Random Forest Classifier (Classification) [EMI Eligibility] Evaluation*

### ➤ *XGBoost Classifier (Classification) [EMI Eligibility]*

### ➤ *XGBoost Classifier (Classification) [EMI Eligibility] Evaluation*

### ➤ *Linear Regression Model [Maximum EMI Prediction]*

### ➤ *Linear Regression Model [Maximum EMI Prediction] Evaluation*

### ➤ *Random Forest Regressor Model [Maximum EMI Prediction]*

### ➤ *Random Forest Regressor Model [Maximum EMI Prediction] Evaluation*

### ➤ *XGBoost Regressor Model [Maximum EMI Prediction]*

### ➤ *XGBoost Regressor Model [Maximum EMI Prediction] Evaluation*

### ➤ *Comparative Model Summary*

### ➤ *Hyperparameter tuning with RandomizedSearchCV*
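
`RandomizedSearchCV` samples a fixed number of parameter combinations rather than trying every one. The sketch below runs on synthetic data; the estimator, parameter ranges, `n_iter`, and scoring metric are all illustrative assumptions rather than the notebook's actual search space.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Synthetic, imbalanced stand-in data for demonstration only.
X_demo, y_demo = make_classification(
    n_samples=200, n_features=8, weights=[0.8, 0.2], random_state=0
)

search = RandomizedSearchCV(
    estimator=RandomForestClassifier(class_weight="balanced", random_state=0),
    param_distributions={
        "n_estimators": randint(20, 60),    # integers in [20, 60)
        "max_depth": randint(2, 8),         # integers in [2, 8)
        "min_samples_leaf": randint(1, 5),  # integers in [1, 5)
    },
    n_iter=5,           # number of sampled combinations
    scoring="roc_auc",
    cv=3,
    random_state=0,
)
search.fit(X_demo, y_demo)
best_params = search.best_params_
```

After fitting, `search.best_estimator_` holds the model refitted on all the data with the best sampled parameters.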

### ➤ *Evaluate each trained model*
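
For the classification models, the three metrics named in the evaluation summary can be collected in one place. This sketch assumes a binary target; the helper name `binary_metrics` is hypothetical, and a multiclass eligibility label would need an `average=` argument for precision and recall.

```python
from sklearn.metrics import precision_score, recall_score, roc_auc_score


def binary_metrics(y_true, y_pred, y_score):
    """Precision, recall, and ROC-AUC for a binary classifier.

    y_pred holds hard labels; y_score holds positive-class probabilities.
    Hypothetical helper that assumes a binary target.
    """
    return {
        "precision": precision_score(y_true, y_pred, zero_division=0),
        "recall": recall_score(y_true, y_pred, zero_division=0),
        "roc_auc": roc_auc_score(y_true, y_score),
    }
```

ROC-AUC is computed from scores rather than hard labels, so it reflects ranking quality independently of the decision threshold.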

### ➤ *SHAP Explainability*

In [None]:
# Trained pipeline registry (for SHAP).
# Each label maps to the (pipeline, predictions) variable names it needs;
# a model is registered only if both variables exist in the notebook namespace.
_candidates = {
    "Random Forest Regressor": ("rf_reg", "preds_rf_reg"),
    "Random Forest Classifier": ("rf_clf", "preds_rf_clf"),
    "Linear Regression": ("lin_reg", "preds_lin"),
    "Logistic Regression": ("log_clf", "preds_log"),
}

trained_pipelines = {
    label: {"pipeline": globals()[pipe_name], "preds": globals()[preds_name]}
    for label, (pipe_name, preds_name) in _candidates.items()
    if pipe_name in globals() and preds_name in globals()
}

### ➤ *Persist artifacts separately*
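
Saving each artifact to its own file makes it possible to reload or replace one without touching the others. This is a minimal sketch using `joblib`; the helper name `save_artifacts` and the `<name>.joblib` naming scheme are assumptions.

```python
from pathlib import Path

import joblib


def save_artifacts(artifacts, out_dir):
    """Dump each named object to its own .joblib file and return the paths.

    Hypothetical helper: the file naming scheme is an assumption.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = {}
    for name, obj in artifacts.items():
        target = out / f"{name}.joblib"
        joblib.dump(obj, target)
        paths[name] = target
    return paths
```

An artifact is read back with `joblib.load(path)`.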

### ➤ *MLflow logging*

In [None]:
%pip install --quiet mlflow

### ➤ *Serving alignment check*
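
A serving alignment check typically confirms that the columns sent at prediction time match those seen during training. What the notebook actually checks is not shown, so the sketch below is an assumption; `check_alignment` is a hypothetical helper.

```python
def check_alignment(expected_cols, incoming_cols):
    """Compare training-time columns with serving-time columns.

    Hypothetical helper: reports missing, unexpected, and order mismatches.
    """
    expected, incoming = list(expected_cols), list(incoming_cols)
    return {
        "missing": [c for c in expected if c not in incoming],
        "unexpected": [c for c in incoming if c not in expected],
        "order_matches": expected == incoming,
    }
```

An empty `missing` list together with an empty `unexpected` list means the serving payload carries exactly the training features, though possibly in a different order.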

### ➤ *Generate production-safe Streamlit app file*

### ➤ *Create pinned requirements.txt*
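
Pinned versions can be read from the running environment with the standard library's `importlib.metadata`. This sketch is one possible approach; the helper name `pinned_requirements` is hypothetical, and the package list in the usage note is inferred from the libraries named in this notebook's headings.

```python
from importlib.metadata import PackageNotFoundError, version


def pinned_requirements(packages):
    """Return requirements.txt content pinning each installed package.

    Hypothetical helper: packages that are not installed are left unpinned.
    """
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={version(name)}")
        except PackageNotFoundError:
            lines.append(name)  # not installed here, so no version to pin
    return "\n".join(lines) + "\n"
```

For example, `pinned_requirements(["pandas", "scikit-learn", "xgboost", "shap", "mlflow", "streamlit"])` produces text that can be written to `requirements.txt`.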

### ➤ *Final artifact list & notes*