### Customer Credit Risk Analysis with ML & Deep Learning
Accurate credit risk assessment is crucial in modern finance: lenders must estimate how likely a customer is to repay their debt obligations. In this notebook, I preprocess the data, fit several models, and evaluate their performance with and without SMOTE (Synthetic Minority Over-sampling Technique) to address class imbalance.


[**Data Source:** Kaggle - Credit Risk Customers](https://www.kaggle.com/datasets/ppb00x/credit-risk-customers/data)

**Dataset Overview:** The dataset contains information on customers, including demographic details, financial status, and credit history. The target variable, `class`, labels each customer as low risk (`good`) or high risk (`bad`). After ordinal encoding, `good` becomes 1 and `bad` becomes 0. The classes are imbalanced (700 `good` vs. 300 `bad`), so I use SMOTE to oversample the minority (`bad`) class.
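
To make the idea behind SMOTE concrete, here is a minimal NumPy sketch of its core step: each synthetic row is an interpolation between a minority sample and one of its nearest minority neighbours. The function name `smote_like_oversample` and the toy array are hypothetical and only for illustration; the notebook itself relies on `imblearn.over_sampling.SMOTE`, which handles details this sketch ignores.

```python
import numpy as np

def smote_like_oversample(X_min, n_new, k=3, seed=0):
    """Create n_new synthetic rows by interpolating between minority neighbours."""
    rng = np.random.default_rng(seed)
    # Pairwise Euclidean distances between all minority samples
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)                 # a point is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]   # indices of the k nearest rows

    base = rng.integers(0, len(X_min), size=n_new)               # random base points
    picked = neighbours[base, rng.integers(0, k, size=n_new)]    # one neighbour per base
    gap = rng.random((n_new, 1))                                 # interpolation factor in [0, 1)
    return X_min[base] + gap * (X_min[picked] - X_min[base])

# Hypothetical minority-class rows with two features
X_minority = np.array([[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3], [0.9, 0.7]])
synthetic = smote_like_oversample(X_minority, n_new=4)
print(synthetic.shape)  # (4, 2)
```

Because every synthetic row lies on a segment between two real minority rows, the new points stay inside the region the minority class already occupies.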


#### 1. Load Libraries and Data
**Steps:**
1. Load dependencies and the dataset.
2. Inspect the DataFrame and check for missing values (none found in this dataset).

In [2]:
# Load dependencies
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler, OrdinalEncoder
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from imblearn.pipeline import Pipeline as ImbPipeline
from imblearn.over_sampling import SMOTE
from sklearn.pipeline import Pipeline
from sklearn.base import BaseEstimator, ClassifierMixin

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import Adam, RMSprop
from tensorflow.keras.callbacks import EarlyStopping

import warnings
warnings.filterwarnings("ignore")

In [3]:
# Load data
df = pd.read_csv('data/credit_customers.csv')
print("The df shape is:", df.shape)
df.head()

The df shape is: (1000, 21)


|   | checking_status | duration | credit_history | purpose | credit_amount | savings_status | employment | installment_commitment | personal_status | other_parties | ... | property_magnitude | age | other_payment_plans | housing | existing_credits | job | num_dependents | own_telephone | foreign_worker | class |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | <0 | 6.0 | critical/other existing credit | radio/tv | 1169.0 | no known savings | >=7 | 4.0 | male single | none | ... | real estate | 67.0 | none | own | 2.0 | skilled | 1.0 | yes | yes | good |
| 1 | 0<=X<200 | 48.0 | existing paid | radio/tv | 5951.0 | <100 | 1<=X<4 | 2.0 | female div/dep/mar | none | ... | real estate | 22.0 | none | own | 1.0 | skilled | 1.0 | none | yes | bad |
| 2 | no checking | 12.0 | critical/other existing credit | education | 2096.0 | <100 | 4<=X<7 | 2.0 | male single | none | ... | real estate | 49.0 | none | own | 1.0 | unskilled resident | 2.0 | none | yes | good |
| 3 | <0 | 42.0 | existing paid | furniture/equipment | 7882.0 | <100 | 4<=X<7 | 2.0 | male single | guarantor | ... | life insurance | 45.0 | none | for free | 1.0 | skilled | 2.0 | none | yes | good |
| 4 | <0 | 24.0 | delayed previously | new car | 4870.0 | <100 | 1<=X<4 | 3.0 | male single | none | ... | no known property | 53.0 | none | for free | 2.0 | skilled | 2.0 | none | yes | bad |


In [4]:
df.info()
# df.describe()  
# df.isnull().sum().plot(kind="barh")

# No missing found

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 21 columns):
 #   Column                  Non-Null Count  Dtype  
---  ------                  --------------  -----  
 0   checking_status         1000 non-null   object 
 1   duration                1000 non-null   float64
 2   credit_history          1000 non-null   object 
 3   purpose                 1000 non-null   object 
 4   credit_amount           1000 non-null   float64
 5   savings_status          1000 non-null   object 
 6   employment              1000 non-null   object 
 7   installment_commitment  1000 non-null   float64
 8   personal_status         1000 non-null   object 
 9   other_parties           1000 non-null   object 
 10  residence_since         1000 non-null   float64
 11  property_magnitude      1000 non-null   object 
 12  age                     1000 non-null   float64
 13  other_payment_plans     1000 non-null   object 
 14  housing                 1000 non-null   o

#### 2. Data Preprocessing
**Steps:**
1. Split the `personal_status` column into `sex` and `marital_status`.
2. Apply ordinal encoding to categorical variables.
3. Scale numerical features using `StandardScaler`.

In [5]:
# Divide "personal_status" column into two different columns
df[["sex", "marital_status"]]=df.personal_status.str.split(expand=True)
df.drop(columns=["personal_status"], inplace=True)
print(df.marital_status.value_counts())

marital_status
single         548
div/dep/mar    310
mar/wid         92
div/sep         50
Name: count, dtype: int64
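
The split works because every `personal_status` value consists of exactly two whitespace-separated tokens (sex, then marital status). A small sketch on a hypothetical three-row frame shows the mechanics of `str.split(expand=True)`:

```python
import pandas as pd

# Hypothetical rows mimicking the format of the personal_status column
toy = pd.DataFrame(
    {"personal_status": ["male single", "female div/dep/mar", "male mar/wid"]}
)

# expand=True returns one column per token instead of a list per row
toy[["sex", "marital_status"]] = toy["personal_status"].str.split(expand=True)
toy = toy.drop(columns=["personal_status"])
print(toy)
```

If a value ever contained more than two tokens, the assignment to two columns would fail, so this approach assumes the format holds for the whole column.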


**Encode Categorical Columns**

In [6]:
ord_enc = OrdinalEncoder()
cols_to_encode = [
    'checking_status', 'credit_history', 'purpose', 'savings_status', 'employment',
    'other_parties', 'property_magnitude', 'other_payment_plans',
    'housing', 'job', 'own_telephone', 'foreign_worker', 'class',
    'sex', 'marital_status'
]

df[cols_to_encode] = ord_enc.fit_transform(df[cols_to_encode])

In [7]:
print(df["class"].value_counts())
# Class imbalance: after encoding, 1 = good (majority) and 0 = bad (minority).

class
1.0    700
0.0    300
Name: count, dtype: int64
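
By default, `OrdinalEncoder` assigns integer codes in sorted order of the category values, which is why `bad` maps to 0 and `good` to 1 here. The sketch below checks this on a hypothetical three-row label column and shows how passing `categories` explicitly would reverse the mapping, should a different positive class be preferred:

```python
import numpy as np
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical label column with the same two values as `class`
labels = np.array([["good"], ["bad"], ["good"]])

# Default behaviour: categories are sorted, so bad -> 0 and good -> 1
enc = OrdinalEncoder()
codes = enc.fit_transform(labels)
print(enc.categories_[0], codes.ravel())

# Explicit ordering: good -> 0 and bad -> 1
enc_explicit = OrdinalEncoder(categories=[["good", "bad"]])
codes_explicit = enc_explicit.fit_transform(labels)
print(codes_explicit.ravel())
```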


**Split Data and Scale**

In [8]:
X = df.drop(columns=["class"])
y = df["class"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
print("X_train shape:", X_train.shape)
print("y_train shape:", y_train.shape)

X_train shape: (800, 21)
y_train shape: (800,)
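
The split above is random rather than stratified, so the class proportions in the train and test sets can drift from the overall 70/30 ratio. As an optional variation (not what this notebook ran), passing `stratify=y` keeps the ratio the same in both parts. The sketch uses hypothetical toy arrays with the same class counts as the dataset:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy target with the same 700/300 class counts as the dataset
y_toy = np.array([1] * 700 + [0] * 300)
X_toy = np.arange(1000).reshape(-1, 1)  # placeholder feature column

X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=42, stratify=y_toy
)

# Share of class 1 in each part; both should be close to 0.70
print(y_tr.mean(), y_te.mean())
```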


In [9]:
# Apply scaling
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

#### 3. Modeling and Evaluation

##### Custom TensorFlow Deep Learning Model
I first define a custom TensorFlow wrapper that follows the scikit-learn estimator interface, so a deep learning model can be tuned and compared with the traditional ML models in the same workflow.

**Architecture:**
- Input Layer: Dense layer with ReLU activation.
- Hidden Layers: Dense layers with Dropout for regularization.
- Output Layer: Dense layer with sigmoid activation for binary classification.


In [10]:
# A Custom TensorFlow Model
class TensorFlowClassifier(BaseEstimator, ClassifierMixin):
    def __init__(self, optimizer='adam', dropout_rate=0.2, epochs=50, batch_size=32, learning_rate=0.001, random_state=42):
        self.optimizer = optimizer
        self.dropout_rate = dropout_rate
        self.epochs = epochs
        self.batch_size = batch_size
        self.learning_rate = learning_rate
        self.random_state = random_state
        self.model_ = None

    def _build_model(self, input_dim):
        tf.random.set_seed(self.random_state)  # Set seed for TensorFlow
        model = Sequential()
        model.add(Dense(128, activation='relu', input_dim=input_dim))
        model.add(Dropout(self.dropout_rate))
        model.add(Dense(64, activation='relu'))
        model.add(Dropout(self.dropout_rate))
        model.add(Dense(1, activation='sigmoid'))
        
        if self.optimizer == 'adam':
            optimizer = Adam(learning_rate=self.learning_rate)
        else:
            optimizer = RMSprop(learning_rate=self.learning_rate)
            
        model.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy'])
        return model

    def fit(self, X, y):
        input_dim = X.shape[1]
        self.model_ = self._build_model(input_dim)
        early_stopping = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True, verbose=0)
        self.model_.fit(X, y, epochs=self.epochs, batch_size=self.batch_size, verbose=0, 
                        validation_split=0.2, callbacks=[early_stopping])
        return self

    def predict(self, X):
        return (self.model_.predict(X) > 0.5).astype("int32").flatten()

    def predict_proba(self, X):
        # scikit-learn expects shape (n_samples, 2): [P(class 0), P(class 1)]
        p1 = self.model_.predict(X)
        return np.hstack([1 - p1, p1])

#### 4. Model Definitions

In [11]:
# Model definitions and parameter grids
models_and_params = {
    "Random Forest": {
        "model": RandomForestClassifier(random_state=42),
        "params": {
            "n_estimators": [100, 200],
            "max_depth": [5, 10, None]
        }
    },
    "Logistic Reg.": {
        "model": LogisticRegression(max_iter=1000),
        "params": {
            "C": [0.1, 1.0, 10]
        }
    },
    "Neural Network": {
        "model": MLPClassifier(max_iter=500, random_state=42),
        "params": {
            "hidden_layer_sizes": [(64, 32), (100,)],
            "alpha": [0.0001, 0.001]
        }
    },
    "TF Deep Learning": {
        "model": TensorFlowClassifier(random_state=42),
        "params": {
            "optimizer": ['adam', 'rmsprop'],
            "dropout_rate": [0.2, 0.3],
            "epochs": [50, 100],
            "batch_size": [32, 64],
            "learning_rate": [0.001, 0.01]
        }
    }
}
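
The size of each grid determines the tuning cost: `GridSearchCV` fits every parameter combination once per fold. A quick sketch with `ParameterGrid` counts the fits implied by the grids above under 5-fold cross-validation (the dictionary is copied from the cell above; `cv_folds` and `fit_counts` are names chosen for this illustration):

```python
from sklearn.model_selection import ParameterGrid

# Parameter grids copied from the model definitions above
grids = {
    "Random Forest": {"n_estimators": [100, 200], "max_depth": [5, 10, None]},
    "Logistic Reg.": {"C": [0.1, 1.0, 10]},
    "Neural Network": {
        "hidden_layer_sizes": [(64, 32), (100,)],
        "alpha": [0.0001, 0.001],
    },
    "TF Deep Learning": {
        "optimizer": ["adam", "rmsprop"],
        "dropout_rate": [0.2, 0.3],
        "epochs": [50, 100],
        "batch_size": [32, 64],
        "learning_rate": [0.001, 0.01],
    },
}

cv_folds = 5  # matches cv=5 used in the grid search
fit_counts = {name: len(ParameterGrid(g)) * cv_folds for name, g in grids.items()}
for name, n_fits in fit_counts.items():
    print(f"{name}: {n_fits} fits")
```

The TensorFlow grid has 32 combinations, so it accounts for 160 of the 225 cross-validation fits per run, before the final refit of each best model.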

#### 5. Model Evaluation: With and Without SMOTE

In [12]:
# A function to train and evaluate models
def train_evaluate(X_train, y_train, X_test, y_test, model_name, model, params):
    print(f"Tuning {model_name}...")
    
    pipeline = Pipeline([('classifier', model)])
    grid = GridSearchCV(pipeline, params, scoring='f1', cv=5, n_jobs=-1, verbose=0)
    grid.fit(X_train, y_train)
    
    best_model = grid.best_estimator_
    y_pred = best_model.predict(X_test)
    
    return {
        "Model": model_name,
        "Accuracy": accuracy_score(y_test, y_pred),
        "Precision": precision_score(y_test, y_pred),
        "Recall": recall_score(y_test, y_pred),
        "F1 Score": f1_score(y_test, y_pred),
        "Best Parameters": grid.best_params_
    }

In [13]:
# Evaluate WITHOUT SMOTE
results_no_smote = []
for name, mp in models_and_params.items():
    results_no_smote.append(train_evaluate(X_train, y_train, X_test, y_test, name + " (No SMOTE)", mp["model"], {'classifier__' + k: v for k, v in mp["params"].items()}))
results_df_no_smote = pd.DataFrame(results_no_smote).sort_values(by="F1 Score", ascending=False)

Tuning Random Forest (No SMOTE)...
Tuning Logistic Reg. (No SMOTE)...
Tuning Neural Network (No SMOTE)...
Tuning TF Deep Learning (No SMOTE)...


In [14]:
# Evaluate WITH SMOTE
smote = SMOTE(random_state=42)
X_train_smote, y_train_smote = smote.fit_resample(X_train, y_train)
results_smote = []
for name, mp in models_and_params.items():
    results_smote.append(train_evaluate(X_train_smote, y_train_smote, X_test, y_test, name + " (SMOTE)", mp["model"], {'classifier__' + k: v for k, v in mp["params"].items()}))
results_df_smote = pd.DataFrame(results_smote).sort_values(by="F1 Score", ascending=False)

Tuning Random Forest (SMOTE)...
Tuning Logistic Reg. (SMOTE)...
Tuning Neural Network (SMOTE)...
Tuning TF Deep Learning (SMOTE)...


#### 6. Combine and Compare Results

In [15]:
# Combine and compare results
results_comparison = pd.concat([results_df_no_smote, results_df_smote])

# Format relevant columns as percentages
percentage_cols = ['Accuracy', 'Precision', 'Recall', 'F1 Score']
results_comparison[percentage_cols] = results_comparison[percentage_cols].applymap(lambda x: f"{x * 100:.2f}%")
results_df_no_smote[percentage_cols] = results_df_no_smote[percentage_cols].applymap(lambda x: f"{x * 100:.2f}%")
results_df_smote[percentage_cols] = results_df_smote[percentage_cols].applymap(lambda x: f"{x * 100:.2f}%")

# Display results
print("\nModel Performance Comparison:")
print(results_comparison[['Model'] + percentage_cols])
print("\nBest models without SMOTE:\n", results_df_no_smote.head(1))
print("\nBest models with SMOTE:\n", results_df_smote.head(1))


Model Performance Comparison:
                         Model Accuracy Precision  Recall F1 Score
0     Random Forest (No SMOTE)   76.50%    77.33%  94.33%   84.98%
3  TF Deep Learning (No SMOTE)   74.00%    74.59%  95.74%   83.85%
2    Neural Network (No SMOTE)   74.50%    81.69%  82.27%   81.98%
1     Logistic Reg. (No SMOTE)   72.00%    75.45%  89.36%   81.82%
0        Random Forest (SMOTE)   76.00%    80.39%  87.23%   83.67%
3     TF Deep Learning (SMOTE)   76.00%    82.52%  83.69%   83.10%
2       Neural Network (SMOTE)   74.50%    83.58%  79.43%   81.45%
1        Logistic Reg. (SMOTE)   67.00%    82.05%  68.09%   74.42%

Best models without SMOTE:
                       Model Accuracy Precision  Recall F1 Score  \
0  Random Forest (No SMOTE)   76.50%    77.33%  94.33%   84.98%   

                                     Best Parameters  
0  {'classifier__max_depth': 10, 'classifier__n_e...  

Best models with SMOTE:
                    Model Accuracy Precision  Recall F1 Score  \
0 

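#### 7. Results and Conclusions
##### Observations
- **Metric convention:** `OrdinalEncoder` sorts categories alphabetically, so the positive class (1) is `good`. Precision, recall, and F1 in the table therefore describe how well each model identifies good customers.
- **Best Performing Model (No SMOTE):** Random Forest had the highest F1 (84.98%), followed by the TF model (83.85%).
- **Best Performing Model (SMOTE):** Random Forest again led (F1 83.67%). Compared with the no-SMOTE run, its precision rose (77.33% to 80.39%) while its recall fell (94.33% to 87.23%).
- **Logistic Regression** had the lowest F1 in both setups, and SMOTE made it worse: F1 dropped from 81.82% to 74.42% and accuracy from 72.00% to 67.00%.
- **Deep Learning (TF)** performed solidly across both setups, ranking second in F1 each time and slightly behind Random Forest.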
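
Since the scikit-learn metric functions score the class labelled 1 by default, the figures above describe the `good` class. To score the `bad` class directly, `pos_label=0` can be passed instead. The sketch below uses small hypothetical label vectors, not the notebook's predictions:

```python
from sklearn.metrics import precision_score, recall_score

# Hypothetical labels: 0 = bad, 1 = good
y_true = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]
y_pred = [0, 1, 1, 1, 1, 1, 1, 1, 1, 0]

# Default pos_label=1: how many good customers were found
recall_good = recall_score(y_true, y_pred)

# pos_label=0: how many bad customers were found, and how reliable a "bad" call is
recall_bad = recall_score(y_true, y_pred, pos_label=0)
precision_bad = precision_score(y_true, y_pred, pos_label=0)

print(f"recall (good): {recall_good:.2f}")
print(f"recall (bad): {recall_bad:.2f}, precision (bad): {precision_bad:.2f}")
```

In this toy case recall for `good` is 6/7 while recall for `bad` is only 1/3, which shows how a strong positive-class score can hide weak detection of the minority class.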

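##### Suggestions
- Random Forest is a robust choice here: it had the best F1 on both the original and the SMOTE-balanced training data.
- In this run, SMOTE traded recall for precision on the `good` class. With fewer good customers found but a higher share of correct `good` predictions, fewer `bad` customers were misclassified as `good`. That matters in credit risk, where approving a risky customer is typically the costlier error. Reporting metrics for class 0 (`bad`) would measure this effect directly.
- Custom TF deep nets are effective but require more tuning and more compute.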