## Implementing AdaBoost Classifier from scratch
#### Álvaro Corrales Cano
#### April 2021 - work in progress

In [45]:
# Imports
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from AdaBoost import AdaBoost  # Our custom implementation (local AdaBoost module)
from sklearn.ensemble import AdaBoostClassifier


### Our model

In [46]:
# Generate classification dataset
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
y = y * 2 - 1       # Map labels from {0, 1} to {-1, 1}, as in the original AdaBoost

# Train/test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [51]:
# Fit model
ab = AdaBoost()
ab.fit(X_train, y_train, M=400)  # M is the number of boosting rounds

# Predict on test set
y_pred = ab.predict(X_test)
print('The ROC-AUC score of the model is:', round(roc_auc_score(y_test, y_pred), 4))

The ROC-AUC score of the model is: 0.8593
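
The `AdaBoost` class used above lives in a separate module that is not reproduced in this notebook. For readers who want to see the general shape of the algorithm, here is a minimal sketch of discrete AdaBoost with decision stumps and {-1, 1} labels. The class name `AdaBoostSketch`, its `decision_function` method, and the choice of Scikit-Learn stumps as weak learners are assumptions made for illustration; the actual module may differ in all of these.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier


class AdaBoostSketch:
    """Hypothetical stand-in for the custom AdaBoost class (not the real module)."""

    def fit(self, X, y, M=100):
        n = len(y)
        w = np.full(n, 1 / n)  # Start from uniform sample weights
        self.stumps, self.alphas = [], []
        for _ in range(M):
            # Weak learner: a depth-1 tree fitted on the weighted sample
            stump = DecisionTreeClassifier(max_depth=1)
            stump.fit(X, y, sample_weight=w)
            miss = stump.predict(X) != y
            # Weighted error, clipped to keep the logarithm finite
            err = np.clip(w[miss].sum() / w.sum(), 1e-10, 1 - 1e-10)
            alpha = np.log((1 - err) / err)
            # Up-weight misclassified observations, then renormalise
            w = w * np.exp(alpha * miss)
            w /= w.sum()
            self.stumps.append(stump)
            self.alphas.append(alpha)
        return self

    def decision_function(self, X):
        # Alpha-weighted sum of the stumps' {-1, 1} votes
        votes = np.array([s.predict(X) for s in self.stumps])
        return np.dot(self.alphas, votes)

    def predict(self, X):
        # Final label is the sign of the weighted vote
        return np.where(self.decision_function(X) >= 0, 1, -1)
```

Under these assumptions, `AdaBoostSketch().fit(X_train, y_train, M=400).predict(X_test)` would mirror the calls in the cell above, although its score need not match the one reported there.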


### Scikit-Learn implementation of AdaBoost

In [52]:
ab_sk = AdaBoostClassifier(n_estimators=400)  # Same number of boosting rounds (M) as in our model
ab_sk.fit(X_train, y_train)
y_pred_sk = ab_sk.predict(X_test)
print('The ROC-AUC score of the model is:', round(roc_auc_score(y_test, y_pred_sk), 4))

The ROC-AUC score of the model is: 0.8427


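Our custom model reaches a ROC-AUC score of 0.8593 on the test set, which is comparable to the 0.8427 of the Scikit-Learn implementation of AdaBoost. The two scores are not expected to match exactly: as of April 2021, Scikit-Learn's `AdaBoostClassifier` defaults to the SAMME.R variant, which works with the weak learners' class probability estimates rather than their hard predictions, whereas our model follows the original AdaBoost formulation. Note also that both scores are computed from the output of `predict`. For `ab_sk` this is a vector of hard class labels, in which case ROC-AUC reduces to balanced accuracy, so the comparison says nothing about how well either model ranks observations.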
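
The scores above are computed from the output of `predict`. As a hedged illustration of what that implies, the sketch below fits a Scikit-Learn `AdaBoostClassifier` on the same synthetic data and computes ROC-AUC twice: once from hard labels and once from the continuous output of `decision_function`. The smaller `n_estimators=50` and the fixed `random_state` are choices made here to keep the example quick and repeatable; they are not the settings used above.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import balanced_accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Same synthetic data and split as in the notebook
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
y = y * 2 - 1
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Illustrative settings (assumption): fewer rounds than the 400 used above
clf = AdaBoostClassifier(n_estimators=50, random_state=42).fit(X_train, y_train)

labels = clf.predict(X_test)             # Hard class labels in {-1, 1}
scores = clf.decision_function(X_test)   # Continuous scores, higher means class 1

auc_from_labels = roc_auc_score(y_test, labels)
auc_from_scores = roc_auc_score(y_test, scores)
bal_acc = balanced_accuracy_score(y_test, labels)

print('ROC-AUC from hard labels:', round(auc_from_labels, 4))
print('Balanced accuracy:       ', round(bal_acc, 4))
print('ROC-AUC from scores:     ', round(auc_from_scores, 4))
```

With hard labels the ROC curve has a single interior point, so the area under it equals the mean of the true positive and true negative rates, that is, the balanced accuracy. The score-based figure uses the full ranking of the test observations and is usually the more informative one when comparing classifiers.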