# AdaBoost 

Adaptive Boosting (AdaBoost) is an ensemble method that trains a sequence of weak learners, typically shallow decision trees. It starts with a base decision tree model, and each subsequent model learns from the mistakes of its predecessor by increasing the weights of the training instances that were misclassified.


# Training Process

The algorithm starts with an initial base model trained with equal weights for all training instances. The weights of the instances misclassified by this base model are increased (boosted), so the subsequent model concentrates on the cases its predecessor got wrong. This sequence continues until the requested number of predictors has been trained. The ensemble then predicts the class that receives the majority of the weighted votes across all models.
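The training loop described above can be sketched in plain Python. This is a minimal illustration of discrete AdaBoost for binary labels in {-1, +1} with one-dimensional threshold stumps, not the scikit-learn implementation; the function names (`fit_stump`, `adaboost_fit`, `adaboost_predict`) and the toy data are assumptions made for the example.

```python
import math

def fit_stump(xs, ys, weights):
    """Return (threshold, polarity, weighted_error) of the best 1-D stump.

    The stump predicts `polarity` when x < threshold and `-polarity` otherwise.
    """
    best = None
    for threshold in sorted(set(xs)) + [max(xs) + 1]:
        for polarity in (1, -1):
            error = sum(
                w for x, y, w in zip(xs, ys, weights)
                if (polarity if x < threshold else -polarity) != y
            )
            if best is None or error < best[2]:
                best = (threshold, polarity, error)
    return best

def stump_predict(stump, x):
    threshold, polarity, _ = stump
    return polarity if x < threshold else -polarity

def adaboost_fit(xs, ys, n_rounds):
    n = len(xs)
    weights = [1.0 / n] * n              # every instance starts with equal weight
    ensemble = []
    for _ in range(n_rounds):
        stump = fit_stump(xs, ys, weights)
        error = max(stump[2], 1e-10)     # avoid division by zero for a perfect stump
        if error >= 0.5:                 # no better than chance: stop boosting
            break
        alpha = 0.5 * math.log((1 - error) / error)   # vote weight of this stump
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified instances, decrease the rest
        weights = [
            w * math.exp(-alpha * y * stump_predict(stump, x))
            for x, y, w in zip(xs, ys, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def adaboost_predict(ensemble, x):
    # Weighted vote: the sign of the alpha-weighted sum of stump predictions
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Toy data that no single stump can separate
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

ensemble = adaboost_fit(xs, ys, n_rounds=3)
predictions = [adaboost_predict(ensemble, x) for x in xs]
training_errors = sum(p != y for p, y in zip(predictions, ys))
print("Stumps trained:", len(ensemble), "| training errors:", training_errors)
```

On this toy set the first stump misclassifies three points, and the reweighting steers the next two stumps toward exactly those points.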

There are notable differences between Random Forests and AdaBoost. When making predictions, Random Forests give every tree in the ensemble an equal vote, while AdaBoost gives greater weight to trees with a lower error rate. Moreover, AdaBoost trains its models sequentially, fitting each tree on a reweighted version of the original dataset, while Random Forests use bagging (bootstrap resampling) and can train their trees in parallel.
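The difference between an equal vote and a weighted vote is easiest to see on a tiny example. The sketch below uses made-up predictions and made-up model weights (both are assumptions for illustration, not values from this notebook) to show how the same three trees can produce different ensemble answers.

```python
from collections import Counter

def majority_vote(predictions):
    """Equal vote per model, as in a Random Forest classifier."""
    return Counter(predictions).most_common(1)[0][0]

def weighted_vote(predictions, weights):
    """Each model's vote counts in proportion to its weight, as in AdaBoost."""
    totals = {}
    for label, weight in zip(predictions, weights):
        totals[label] = totals.get(label, 0.0) + weight
    return max(totals, key=totals.get)

# Hypothetical predictions of three trees for one sample
tree_predictions = ["yes", "no", "no"]
# Hypothetical weights: the first tree had the lowest training error
tree_weights = [1.2, 0.4, 0.5]

equal_result = majority_vote(tree_predictions)                   # two of three say "no"
weighted_result = weighted_vote(tree_predictions, tree_weights)  # 1.2 for "yes" vs 0.9 for "no"
print(equal_result, weighted_result)
```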

# AdaBoost Hyperparameters


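- n_estimators: The maximum number of weak learners (trees) to train.

- learning_rate: Weight applied to each model's contribution at each boosting iteration. A smaller value shrinks each contribution and usually calls for more estimators.

- algorithm (AdaBoostClassifier only): Use 'SAMME.R' if the base model supports probabilistic output, or 'SAMME' if it produces discrete class labels (1/0). Note that newer scikit-learn releases deprecate 'SAMME.R'.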
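To make the role of `learning_rate` concrete, the sketch below computes the vote weight that a discrete SAMME-style booster would give a model with a given weighted error. The formula reflects my understanding of the SAMME weighting and the helper name `samme_estimator_weight` is hypothetical, so treat it as an illustration rather than the library's code.

```python
import math

def samme_estimator_weight(error, learning_rate=1.0, n_classes=2):
    """Vote weight for a weak learner with the given weighted error rate.

    Assumed SAMME-style formula:
    learning_rate * (log((1 - error) / error) + log(n_classes - 1))
    """
    return learning_rate * (
        math.log((1.0 - error) / error) + math.log(n_classes - 1.0)
    )

# The same weak learner (20% weighted error) under different learning rates
for lr in (0.25, 0.5, 1.0):
    weight = samme_estimator_weight(error=0.2, learning_rate=lr)
    print("learning_rate={:<4} -> estimator weight {:.3f}".format(lr, weight))

# A binary learner at chance level (50% error) gets no say in the vote
chance_weight = samme_estimator_weight(error=0.5)
```

Halving the learning rate halves every model's vote, which is why lower values are usually paired with a larger `n_estimators`.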



# AdaBoost Pros and Cons


**Pros**

- Very few hyperparameters to tune.

- With decision-tree base learners the model is non-parametric and does not require feature scaling (scikit-learn still needs categorical features encoded numerically, for example with one-hot encoding).

- Good for non-linear datasets


**Cons**


- Cannot be parallelized as each model is trained based on the results of the previous model.

- Typically slower to train than optimized boosting implementations such as XGBoost.




# 1. Libraries

In [1]:
# Importing Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.ensemble import AdaBoostRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import make_column_transformer
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import RandomizedSearchCV

In [2]:
# Import Data
df = pd.read_csv('LungCapData.csv')
df.head()

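| | LungCap | Age | Height | Smoke | Gender | Caesarean |
|---|---|---|---|---|---|---|
| 0 | 6.475 | 6 | 62.1 | no | male | no |
| 1 | 10.125 | 18 | 74.7 | yes | female | no |
| 2 | 9.55 | 16 | 69.7 | no | female | yes |
| 3 | 11.125 | 14 | 71.0 | no | male | no |
| 4 | 4.8 | 5 | 56.9 | no | male | no |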


# 2. Preprocessing

In [3]:
# Predictors and Target
X = df.drop(columns = ['LungCap'])
y = df['LungCap']

# Instantiate one-hot encoder
ohe = OneHotEncoder()

# columns to be one hot encoded
ct = make_column_transformer(

    (ohe, ['Smoke', 'Gender', 'Caesarean']),
    remainder = 'passthrough')

# predictors and target variable
X = np.array(ct.fit_transform(X))
y = np.array(y)

# Check input and target variable shape
X.shape, y.shape

((725, 8), (725,))

In [4]:
# Training and Testing subsets 
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 911)

# Feature Scaling
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
print('Standardized feature Mean:',  X_train.mean().round())
print('Standardized feature SD :',   X_train.std().round())

Standardized feature Mean: 0.0
Standardized feature SD : 1.0


# 3. Training

In [5]:
# Initialize AdaBoost Regressor with decision stumps (max_depth=1) as weak learners
ada = AdaBoostRegressor(
 DecisionTreeRegressor(max_depth=1), n_estimators=10, learning_rate=0.5)

# Fit the model
ada.fit(X_train, y_train)

AdaBoostRegressor(base_estimator=DecisionTreeRegressor(max_depth=1),
                  learning_rate=0.5, n_estimators=10)

# 3. Testing

In [6]:
# Predicting the Test set results
y_pred = ada.predict(X_test)

# Mean squared error
print('Mean Squared Error :', mean_squared_error(y_test, y_pred))

Mean Squared Error : 2.836973246539746
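The mean squared error is expressed in squared units of the target, so its square root (RMSE) is often easier to read. The sketch below recomputes both by hand on a few made-up values (the demo lists are assumptions, not rows from the dataset) and then takes the square root of the MSE printed above.

```python
import math

def mean_squared_error_manual(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical targets and predictions, for illustration only
y_true_demo = [6.0, 10.0, 9.5, 11.0]
y_pred_demo = [7.0, 9.0, 9.5, 13.0]

mse_demo = mean_squared_error_manual(y_true_demo, y_pred_demo)  # (1 + 1 + 0 + 4) / 4
rmse_demo = math.sqrt(mse_demo)
print("Demo MSE:", mse_demo, "| Demo RMSE:", round(rmse_demo, 3))

# RMSE for the test-set MSE reported above, in the units of LungCap
reported_rmse = math.sqrt(2.836973246539746)
print("Reported RMSE:", round(reported_rmse, 2))
```

An RMSE of roughly 1.68 means the predictions are off by about 1.7 LungCap units in a root-mean-square sense.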


# 4. K-Fold Cross Validation

In [7]:
# 10-fold cross-validation
R2 = cross_val_score(estimator = ada,
                     X = X,
                     y = y,
                     cv = 10,
                     scoring = 'r2')

# Cross-validated R2 scores, their mean and standard deviation
print(R2)
print("R2: {:.3f} %".format(R2.mean()*100))
print("Standard Deviation: {:.3f} %".format(R2.std()*100))

[0.55289921 0.57933502 0.68654874 0.49889516 0.67031272 0.67832696
 0.59131214 0.59631827 0.58271821 0.56166797]
R2: 59.983 %
Standard Deviation: 5.770 %
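For readers who want to see what happens behind the cross-validation call, the sketch below shows the two ingredients in plain Python: how 725 samples can be split into 10 contiguous, unshuffled folds, and how R2 is computed for one fold. The helper names are hypothetical and the splitting logic is a simplified stand-in for the library's behaviour.

```python
def r2_score_manual(y_true, y_pred):
    """R2 = 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def k_fold_indices(n_samples, n_splits):
    """Yield (train_idx, test_idx) pairs for contiguous, unshuffled folds."""
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for i in range(n_splits):
        size = base + (1 if i < extra else 0)   # the first `extra` folds get one more sample
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

folds = list(k_fold_indices(n_samples=725, n_splits=10))
fold_test_sizes = [len(test_idx) for _, test_idx in folds]
print("Test-fold sizes:", fold_test_sizes)

# R2 is 1 for perfect predictions and 0 when always predicting the mean
perfect_r2 = r2_score_manual([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mean_only_r2 = r2_score_manual([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

Each of the ten scores printed above is the R2 of a model trained on nine folds and evaluated on the remaining one.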


# 5. Hyperparameter Tuning

In [8]:
# Grid Search CV
ada = AdaBoostRegressor()
param_grid = [{
      'learning_rate': [0.25, 0.5, 1],
      'n_estimators': [10, 100, 1000]}]


# Configure GridSearchCV
grid_search = GridSearchCV(ada, param_grid, cv=5,
                                  scoring="r2",
                                  n_jobs=-1)

# Initiate Search
grid_search.fit(X_train, y_train)

# Extract tuned parameters and best cross-validated R2
tuned_params = grid_search.best_params_
tuned_score = grid_search.best_score_
best_estimator = grid_search.best_estimator_

# Print Results
print("Best R2: {:.2f} %".format(tuned_score*100))
print("Best Parameters:", tuned_params)

Best R2: 81.95 %
Best Parameters: {'learning_rate': 0.5, 'n_estimators': 100}
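A grid search simply evaluates every combination in the grid with cross-validation. The sketch below enumerates the combinations for the grid used above and counts the resulting model fits; it only illustrates the bookkeeping and does not train anything.

```python
from itertools import product

# Same grid as in the search above
grid = {
    "learning_rate": [0.25, 0.5, 1],
    "n_estimators": [10, 100, 1000],
}
cv_folds = 5

# Cartesian product of all parameter values
names = list(grid)
candidates = [
    dict(zip(names, values))
    for values in product(*(grid[name] for name in names))
]

n_candidates = len(candidates)          # 3 learning rates x 3 estimator counts
n_cv_fits = n_candidates * cv_folds     # one fit per candidate per fold

for candidate in candidates:
    print(candidate)
print("Candidates:", n_candidates, "| cross-validation fits:", n_cv_fits)
```

With the default refit behaviour, the best candidate is then fitted once more on the full training set, which is why the cost grows quickly as values are added to the grid.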


In [9]:
# Randomized Search

ada = AdaBoostRegressor()
param_grid = [{
      'learning_rate': [0.25, 0.5, 1, 2],
      'n_estimators': [10, 100, 1000],}]


# Configure Randomized Search
random_search = RandomizedSearchCV(ada, param_grid,
                                        scoring="r2", cv=5, n_iter = 10,
                                        n_jobs=-1, random_state=911)
#Initiate Search
random_search.fit(X_train, y_train)


# Extract tuned parameters and best cross-validated R2
tuned_params = random_search.best_params_
tuned_score = random_search.best_score_
best_estimator = random_search.best_estimator_

# Print best R2 and best parameters
print("Best R2: {:.2f} %".format(tuned_score*100))
print("Best Parameters:", tuned_params)

Best R2: 81.72 %
Best Parameters: {'n_estimators': 100, 'learning_rate': 2}
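Unlike the grid search, the randomized search evaluates only `n_iter` of the possible combinations. When every parameter is given as a list, the candidates are drawn without replacement, which the sketch below imitates with the standard library. It will not reproduce the exact candidates scikit-learn drew, since the random number generators differ.

```python
import random
from itertools import product

# Same search space as in the randomized search above
search_space = {
    "learning_rate": [0.25, 0.5, 1, 2],
    "n_estimators": [10, 100, 1000],
}
n_iter = 10

names = list(search_space)
all_candidates = [
    dict(zip(names, values))
    for values in product(*(search_space[name] for name in names))
]

# Draw n_iter distinct candidates (sampling without replacement)
rng = random.Random(911)
sampled = rng.sample(all_candidates, k=n_iter)

print("Possible combinations:", len(all_candidates))
print("Evaluated combinations:", len(sampled))
```

Here 10 of the 12 combinations are evaluated, so two are never tried; the saving becomes significant only when the search space is much larger than `n_iter`.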
