Random Forest Classifier Scenario: Loan Approval Prediction 
A commercial bank wants to predict whether a loan application should be approved. It has historical customer data with the following fields:
Age (years) 
Income (annual income in local currency) 
Credit Score (300–850 scale)
Loan Amount Requested (currency units) 
Loan Approved (Yes/No) 
The bank applies Random Forest, an ensemble method that trains many decision trees, each on a bootstrap sample of the training rows and with a random subset of features considered at each split, and then aggregates their predictions. Averaging over many decorrelated trees reduces overfitting compared to a single tree and typically improves accuracy when predicting loan approvals for new applicants.
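The bootstrap-and-aggregate idea can be sketched with the standard library alone. This is a toy illustration, not scikit-learn's algorithm: each "tree" is a one-split stump, one feature is drawn per tree rather than per split, and aggregation is a plain majority vote. The rows in toy_rows and all function names are hypothetical and are not taken from the bank's dataset.

```python
import random
from collections import Counter

# Hypothetical toy data (not from loan_approvals_100.csv)
toy_rows = [
    {"credit_score": 520, "income": 20000, "approved": 0},
    {"credit_score": 560, "income": 25000, "approved": 0},
    {"credit_score": 600, "income": 30000, "approved": 0},
    {"credit_score": 640, "income": 35000, "approved": 0},
    {"credit_score": 700, "income": 50000, "approved": 1},
    {"credit_score": 740, "income": 60000, "approved": 1},
    {"credit_score": 780, "income": 70000, "approved": 1},
    {"credit_score": 820, "income": 90000, "approved": 1},
]

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement."""
    return [rng.choice(rows) for _ in rows]

def fit_stump(rows, feature):
    """Pick the threshold on `feature` with the fewest training errors.
    The stump predicts 1 (approve) when the value is >= threshold."""
    best_threshold, best_errors = None, None
    for candidate in sorted({r[feature] for r in rows}):
        errors = sum((r[feature] >= candidate) != bool(r["approved"]) for r in rows)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = candidate, errors
    return feature, best_threshold

def predict_stump(stump, row):
    feature, threshold = stump
    return int(row[feature] >= threshold)

def fit_forest(rows, features, n_trees, seed=0):
    """Train n_trees stumps, each on its own bootstrap sample and random feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = bootstrap_sample(rows, rng)
        feature = rng.choice(features)
        forest.append(fit_stump(sample, feature))
    return forest

def predict_forest(forest, row):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]

# An odd number of trees avoids tied votes
forest = fit_forest(toy_rows, ["credit_score", "income"], n_trees=25)
print(predict_forest(forest, {"credit_score": 830, "income": 95000}))
print(predict_forest(forest, {"credit_score": 500, "income": 15000}))
```

Each stump sees a slightly different sample, so the learned thresholds vary from tree to tree; the vote smooths out that variation, which is the variance-reduction effect described above.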

In [2]:
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
 
# Load the synthetic data
df = pd.read_csv('loan_approvals_100.csv')
 
# Prepare features/target
X = df[['Age', 'Income', 'Credit Score', 'Loan Amount Requested']]
y = (df['Loan Approved'].str.lower() == 'yes').astype(int)
 
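# Stratified split keeps the approved/rejected ratio the same in train and test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# Optional: standardize features. Tree splits are threshold-based, so scaling
# should not materially change a Random Forest's predictions; it matters only
# if the same pipeline is reused with scale-sensitive models.
scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)  # fit on training data only
X_test_s = scaler.transform(X_test)        # reuse the training statistics

# Simple, strong Random Forest (no hyperparameter search)
clf = RandomForestClassifier(
    n_estimators=500,         # number of trees
    max_depth=21,             # cap on tree depth
    class_weight='balanced',  # weight classes inversely to their frequency
    random_state=42,          # reproducible results
    n_jobs=-1                 # use all CPU cores
)
clf.fit(X_train_s, y_train)

# Evaluate on the held-out test set
y_pred = clf.predict(X_test_s)
print("Accuracy:", accuracy_score(y_test, y_pred))
print("\nClassification Report:\n", classification_report(y_test, y_pred))
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))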

Accuracy: 0.88

Classification Report:
               precision    recall  f1-score   support

           0       0.82      0.90      0.86        10
           1       0.93      0.87      0.90        15

    accuracy                           0.88        25
   macro avg       0.87      0.88      0.88        25
weighted avg       0.88      0.88      0.88        25

Confusion Matrix:
 [[ 9  1]
 [ 2 13]]
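
The figures in the classification report follow directly from the confusion matrix. As a quick check, the sketch below recomputes them by hand, assuming scikit-learn's layout (rows are actual classes, columns are predicted classes) and treating class 1 (approved) as the positive class. The variable names are illustrative only.

```python
# Confusion matrix from the output above: rows = actual, columns = predicted
tn, fp = 9, 1    # actual 0: predicted 0, predicted 1
fn, tp = 2, 13   # actual 1: predicted 0, predicted 1

total = tn + fp + fn + tp
accuracy = (tp + tn) / total

# Class 1 (approved)
precision_1 = tp / (tp + fp)
recall_1 = tp / (tp + fn)
f1_1 = 2 * precision_1 * recall_1 / (precision_1 + recall_1)

# Class 0 (rejected): the same formulas with the roles of the classes swapped
precision_0 = tn / (tn + fn)
recall_0 = tn / (tn + fp)
f1_0 = 2 * precision_0 * recall_0 / (precision_0 + recall_0)

print(f"accuracy: {accuracy:.2f}")
print(f"class 0: precision={precision_0:.2f} recall={recall_0:.2f} f1={f1_0:.2f}")
print(f"class 1: precision={precision_1:.2f} recall={recall_1:.2f} f1={f1_1:.2f}")
```

Read this way, the model wrongly approved 1 of 10 rejected applications and wrongly rejected 2 of 15 approved ones. With only 25 test rows, each single misclassification moves accuracy by 4 percentage points, so these scores are rough estimates.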
