 

Scenario 3: Banking – Predicting Loan Default (AdaBoost)

A bank wants to predict whether a customer will default on a loan (Yes/No) based on: 

Monthly_Income (in local currency) 

Loan_Amount (requested loan size) 

Credit_Score (rating between 300–850) 

Because a single shallow decision tree (often a one-split "stump") is only a weak learner, the bank applies AdaBoost, which combines many such trees into a stronger classifier.

AdaBoost trains trees sequentially. After each round it increases the weights of the customers that were misclassified (e.g., those who defaulted but were predicted safe).

Each new tree focuses more on these “hard cases.” 

The final prediction is a weighted vote across all trees, with more accurate trees receiving a larger say.
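The reweighting and weighted vote described above can be sketched by hand. This is a minimal illustration of the classic discrete AdaBoost update, not necessarily the exact procedure scikit-learn runs internally. The eight-customer toy dataset, the function name `adaboost_sketch`, and the choice of three rounds are all assumptions made for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data; columns: Monthly_Income, Loan_Amount, Credit_Score
X_toy = np.array([
    [3000, 20000, 580],
    [8000, 15000, 720],
    [4500, 30000, 610],
    [9500, 10000, 790],
    [2500, 25000, 540],
    [7000, 40000, 650],
    [6000, 12000, 700],
    [3500, 28000, 600],
], dtype=float)
# +1 = default, -1 = no default (the +/-1 coding keeps the update formula short)
y_toy = np.array([1, -1, 1, -1, 1, 1, -1, -1])


def adaboost_sketch(X, y, n_rounds=3):
    """Hand-rolled discrete AdaBoost with depth-1 stumps (illustrative only)."""
    weights = np.full(len(y), 1.0 / len(y))   # start with equal weights
    weight_history = [weights.copy()]
    stumps, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1, random_state=0)
        stump.fit(X, y, sample_weight=weights)
        pred = stump.predict(X)
        # Weighted error rate, clipped to avoid division by zero or log(0)
        err = np.clip(weights[pred != y].sum(), 1e-10, 1 - 1e-10)
        # The stump's say in the final vote: a lower error gives a larger alpha
        alpha = 0.5 * np.log((1 - err) / err)
        # Misclassified customers (y * pred == -1) are scaled up,
        # correctly classified ones are scaled down, then we renormalize
        weights = weights * np.exp(-alpha * y * pred)
        weights = weights / weights.sum()
        stumps.append(stump)
        alphas.append(alpha)
        weight_history.append(weights.copy())
    return stumps, alphas, weight_history


stumps, alphas, weight_history = adaboost_sketch(X_toy, y_toy)

# Final prediction: sign of the alpha-weighted sum of the stumps' votes
votes = sum(a * s.predict(X_toy) for a, s in zip(alphas, stumps))
final_pred = np.sign(votes)

print("Weights before round 1:", np.round(weight_history[0], 3))
print("Weights after round 1: ", np.round(weight_history[1], 3))
print("Stump weights (alpha): ", np.round(alphas, 3))
```

Comparing the two printed weight vectors shows which customers the first stump got wrong: their weights rise, so the next stump is pushed to fit them.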

In [8]:

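import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.preprocessing import LabelEncoder


# Load the loan dataset (reading .xlsx files requires the openpyxl package)
df = pd.read_excel("adaboost_loan.xlsx", engine="openpyxl")


# Features and target
X = df[["Monthly_Income", "Loan_Amount", "Credit_Score"]]
y = df["Default"]


# Encode the target as integers; classes are sorted alphabetically,
# so Yes/No labels become No -> 0, Yes -> 1
le = LabelEncoder()
y = le.fit_transform(y)


# Hold out 20% for testing; stratify to keep the default rate equal in both splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)


# Baseline: a single decision tree limited to depth 6
dt = DecisionTreeClassifier(random_state=42, max_depth=6)
dt.fit(X_train, y_train)
print("Decision Tree accuracy:", dt.score(X_test, y_test))


# AdaBoost with scikit-learn's default base learner (a depth-1 decision stump)
ada = AdaBoostClassifier(
    n_estimators=200,     # maximum number of boosting rounds (trees)
    learning_rate=0.8,    # shrinks each tree's contribution to the vote
    random_state=42
)
ada.fit(X_train, y_train)
print("AdaBoost accuracy:", ada.score(X_test, y_test))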


Decision Tree accuracy: 0.6
AdaBoost accuracy: 0.6
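
When the ensemble and the single tree score the same, it can help to look at how test accuracy changes as trees are added rather than only at the final number. The sketch below uses `AdaBoostClassifier.staged_score` for that. Because `adaboost_loan.xlsx` is not reproduced here, it runs on synthetic stand-in data from `make_classification`; the data and all `*_demo` and `Xd_`/`yd_` names are assumptions for illustration, and the printed accuracies say nothing about the bank's dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 3 numeric features, binary target
X_demo, y_demo = make_classification(
    n_samples=500, n_features=3, n_informative=3, n_redundant=0,
    random_state=42,
)
Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

# Same hyperparameters as the scenario above
ada_demo = AdaBoostClassifier(n_estimators=200, learning_rate=0.8, random_state=42)
ada_demo.fit(Xd_train, yd_train)

# staged_score yields the test accuracy after each boosting round
staged_acc = list(ada_demo.staged_score(Xd_test, yd_test))

# Boosting can stop early, so only report checkpoints that were reached
checkpoints = [n for n in (1, 10, 50, 200) if n <= len(staged_acc)]
for n_trees in checkpoints:
    print(f"{n_trees:>3} trees: test accuracy = {staged_acc[n_trees - 1]:.3f}")
```

On the real data, the same loop with `ada`, `X_test`, and `y_test` would show whether accuracy is flat from the first stump onward or peaks at some smaller number of trees.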
