Random Forest Scenario: Loan Approval Prediction 

A commercial bank wants to decide whether to approve a loan application. They have historical data about customers, including: 

Age (years) 

Income (annual income in local currency) 

Credit Score (300–850 scale)

Loan Amount Requested (currency units) 

Loan Approved (Yes/No), the target to predict

The bank applies Random Forest, an ensemble method that trains many decision trees, each on a bootstrap sample of the training data and with a random subset of features considered at each split, and aggregates their predictions by majority vote. Averaging over many decorrelated trees reduces overfitting compared with a single tree and usually improves accuracy when predicting loan approvals for new applicants.
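To make the "many trees on different subsets, then aggregate" idea concrete, here is a minimal standard-library sketch. It is not scikit-learn's implementation: each "tree" is a one-split decision stump, only bootstrap sampling of rows is shown (real random forests also sample features at each split), and the toy data and all function names are hypothetical.

```python
import random
from collections import Counter


def fit_stump(rows, labels):
    """Find the single (feature, threshold) split with the fewest training errors."""
    overall = Counter(labels).most_common(1)[0][0]
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            # Each side predicts its majority label; an empty side uses the overall majority.
            l_lab = Counter(left).most_common(1)[0][0] if left else overall
            r_lab = Counter(right).most_common(1)[0][0] if right else overall
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    return best[1:]


def predict_stump(stump, row):
    f, t, l_lab, r_lab = stump
    return l_lab if row[f] <= t else r_lab


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return forest


def predict_forest(forest, row):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (credit score, annual income in thousands)
rows = [(580, 30), (600, 45), (620, 40), (640, 35),
        (700, 60), (720, 80), (750, 75), (780, 90)]
labels = ["No", "No", "No", "No", "Yes", "Yes", "Yes", "Yes"]

forest = fit_forest(rows, labels)
print(predict_forest(forest, (790, 95)))  # applicant with a high score and income
print(predict_forest(forest, (560, 25)))  # applicant with a low score and income
```

Because each stump sees a slightly different resample, individual stumps can disagree; the vote smooths out those differences, which is the variance-reduction effect described above.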

In [4]:
import pandas as pd

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
 

df = pd.read_csv('loan_approvals_100.csv')
 


In [5]:
df.head(2)

Unnamed: 0,Age,Income,Credit Score,Loan Amount Requested,Loan Approved
0,25,200000,668,118724,No
1,46,200000,729,93682,Yes


In [6]:

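# Separate the features from the target.
# Note: "Unnamed: 0" (the row index saved in the CSV) stays in X as a feature here;
# drop it as well if the row index should not influence predictions.
X = df.drop(columns=["Loan Approved"])
y = df["Loan Approved"]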
 

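# Hold out 30% of the rows for testing; stratify=y keeps the Yes/No ratio
# the same in the training and test splits.
Xtrain, Xtest, ytrain, ytest = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)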
 

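clf = RandomForestClassifier(
    n_estimators=500,         # number of trees in the forest
    max_depth=20,             # cap on the depth of each tree
    class_weight='balanced',  # weight classes inversely to their frequency
    random_state=42,          # reproducible bootstrap samples and splits
    n_jobs=-1                 # use all available CPU cores
)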
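The `class_weight='balanced'` setting weights each class by `n_samples / (n_classes * class_count)`, so the rarer class counts for more during training. The sketch below reproduces that formula in plain Python. The helper name `balanced_class_weights` and the label counts (28 "No", 42 "Yes") are assumptions for illustration; they are consistent with a stratified 70-row training split but are not stated in the dataset output.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Compute n_samples / (n_classes * class_count) for each class."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical training labels (assumed counts, not read from the CSV)
ytrain_example = ["No"] * 28 + ["Yes"] * 42
weights = balanced_class_weights(ytrain_example)
for cls, w in weights.items():
    print(f"{cls}: {w:.3f}")
```

With these assumed counts the minority class "No" gets a weight above 1 and the majority class "Yes" a weight below 1, so errors on rejected applications are penalized somewhat more heavily.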


In [7]:
# Train the forest on the training split
clf.fit(Xtrain, ytrain)

# Evaluate on the held-out test split
y_pred = clf.predict(Xtest)
print("Accuracy:", accuracy_score(ytest, y_pred))
print("\nClassification Report:\n", classification_report(ytest, y_pred))
print("Confusion Matrix:\n", confusion_matrix(ytest, y_pred))
 

Accuracy: 0.8666666666666667

Classification Report:
               precision    recall  f1-score   support

          No       0.83      0.83      0.83        12
         Yes       0.89      0.89      0.89        18

    accuracy                           0.87        30
   macro avg       0.86      0.86      0.86        30
weighted avg       0.87      0.87      0.87        30

Confusion Matrix:
 [[10  2]
 [ 2 16]]
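
The figures in the classification report follow directly from the confusion matrix. As a check, this short sketch recomputes them by hand, assuming scikit-learn's convention that rows are true labels and columns are predicted labels, in the sorted order "No", "Yes".

```python
# Confusion matrix as printed above: rows = true labels, columns = predicted labels
cm = [[10, 2],
      [2, 16]]
labels = ["No", "Yes"]

total = sum(sum(row) for row in cm)
accuracy = sum(cm[i][i] for i in range(len(cm))) / total

metrics = {}
for i, label in enumerate(labels):
    tp = cm[i][i]                                      # correctly predicted as this class
    predicted = sum(cm[r][i] for r in range(len(cm)))  # column sum: all predicted as this class
    actual = sum(cm[i])                                # row sum: all truly in this class (support)
    precision = tp / predicted
    recall = tp / actual
    f1 = 2 * precision * recall / (precision + recall)
    metrics[label] = (precision, recall, f1, actual)

print(f"accuracy = {accuracy:.2f}")
for label, (p, r, f1, support) in metrics.items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f} support={support}")
```

In loan terms, the off-diagonal entries are 2 applications that were actually rejected but predicted as approved, and 2 that were actually approved but predicted as rejected.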
