# Boosting

Boosting refers to an ensemble method in which several models are trained sequentially with each model learning from the errors of its predecessors. 

## AdaBoost

![adaboost.png](attachment:adaboost.png)
![learning_rate.png](attachment:learning_rate.png)

Boosting combines several weak learners to form a strong learner. A weak learner is a model that does only slightly better than random guessing. For example, a decision tree with a maximum depth of one, known as a decision stump, is a weak learner.

The predictors in the ensemble are trained sequentially, and each predictor tries to correct the errors made by its predecessor.

AdaBoost: each predictor pays more attention to the instances wrongly predicted by its predecessor. This is achieved by constantly changing the weights of the training instances.
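The reweighting step can be made concrete with a small sketch. It assumes the classic discrete AdaBoost update for binary labels; the function name `adaboost_reweight` and the toy arrays are hypothetical, and scikit-learn's internal implementation may differ in detail.

```python
import numpy as np

def adaboost_reweight(weights, y_true, y_pred, learning_rate=1.0):
    """One round of discrete AdaBoost reweighting (assumes 0 < error < 1)."""
    weights = np.asarray(weights, dtype=float)
    # Boolean mask: True where the current predictor was wrong.
    missed = np.asarray(y_true) != np.asarray(y_pred)
    # Weighted error rate of the current predictor.
    err = weights[missed].sum() / weights.sum()
    # Predictor weight (alpha): the smaller the error, the larger alpha.
    alpha = learning_rate * np.log((1.0 - err) / err)
    # Increase the weights of the misclassified instances, then renormalize.
    new_weights = weights * np.exp(alpha * missed)
    return new_weights / new_weights.sum(), alpha

y_true = np.array([1, 1, 0, 0, 1])
y_pred = np.array([1, 0, 0, 0, 1])   # only the second instance is misclassified
w0 = np.full(5, 0.2)                 # uniform starting weights
w1, alpha = adaboost_reweight(w0, y_true, y_pred)
print(w1, alpha)
```

In this toy case the misclassified instance goes from a weight of 0.2 to 0.5, so the next predictor is pushed to get it right.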

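- `AdaBoostClassifier`: the final prediction is obtained by weighted majority voting.
- `AdaBoostRegressor`: the final prediction is obtained by weighted averaging (scikit-learn's implementation uses the weighted median of the predictors' outputs).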
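As an illustration of weighted majority voting for the classification case, the following sketch combines binary predictions from three predictors. The helper `weighted_majority_vote` and the predictor weights are made up for the example and are not part of scikit-learn.

```python
import numpy as np

def weighted_majority_vote(predictions, alphas):
    """Combine 0/1 predictions of shape (n_estimators, n_samples)."""
    predictions = np.asarray(predictions)
    alphas = np.asarray(alphas, dtype=float)
    # Total predictor weight voting for class 1 on each sample.
    votes_for_one = (alphas[:, None] * (predictions == 1)).sum(axis=0)
    # Class 1 wins when it holds more than half of the total weight.
    return (votes_for_one > alphas.sum() / 2).astype(int)

# Rows are predictors, columns are samples.
predictions = [[1, 0],
               [0, 1],
               [0, 1]]
alphas = [2.0, 0.5, 0.5]   # the first predictor is trusted the most
combined = weighted_majority_vote(predictions, alphas)
print(combined)
```

Here the first predictor outweighs the other two together, so the combined prediction follows it on both samples even though a simple head count would go the other way.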

In [12]:
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.ensemble import AdaBoostClassifier

# Load the data; drop the identifier, the unnamed column and the target from the features
cancer = pd.read_csv("cancer.csv")
X = cancer.drop(["id", "Unnamed: 32", "diagnosis"], axis=1)
y = cancer["diagnosis"]
# Encode the diagnosis labels as integers
le = LabelEncoder()
y = le.fit_transform(y)
y = pd.Series(y)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                   stratify=y, random_state=1)
# Decision stump as the weak learner
dt = DecisionTreeClassifier(max_depth=1, random_state=1)
# `estimator` replaces the `base_estimator` argument used by scikit-learn < 1.2
adb_clf = AdaBoostClassifier(estimator=dt, n_estimators=100)
adb_clf.fit(X_train, y_train)
# Probability of the positive class, needed for ROC AUC
y_pred_proba = adb_clf.predict_proba(X_test)[:, 1]
adb_clf_roc_auc_score = roc_auc_score(y_test, y_pred_proba)
print("ROC AUC score: {:.2f}".format(adb_clf_roc_auc_score))
y_pred = adb_clf.predict(X_test)
accuracy_score(y_test, y_pred)

ROC AUC score: 0.99


0.935672514619883

### Define the AdaBoost classifier

In [14]:
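# Load the data and drop rows with missing values
liver = pd.read_csv("indian_liver_patient.csv")
liver = liver.dropna()
X = liver.drop(["Dataset"], axis=1)
# One-hot encode the categorical features
X = pd.get_dummies(X, drop_first=True)
# Recode the target so that label 2 becomes 0
y = liver["Dataset"].replace(2, 0)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                   stratify=y, random_state=1)

dt = DecisionTreeClassifier(max_depth=2, random_state=1)
# `estimator` replaces the `base_estimator` argument used by scikit-learn < 1.2
ada = AdaBoostClassifier(estimator=dt, n_estimators=180, random_state=1)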


### Train the AdaBoost classifier

In [15]:
ada.fit(X_train, y_train)
y_pred_proba = ada.predict_proba(X_test)[:,1]

### Evaluate the AdaBoost classifier

In [16]:
ada_roc_auc = roc_auc_score(y_test, y_pred_proba)
print('ROC AUC score: {:.2f}'.format(ada_roc_auc))

ROC AUC score: 0.70


## Gradient Boosting (GB)


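![gradient_boosted.png](attachment:gradient_boosted.png)
![shrinkage.png](attachment:shrinkage.png)

In gradient boosting, each predictor also corrects its predecessor's errors, but the weights of the training instances are not changed. Instead, each predictor is trained using its predecessor's residual errors as labels.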
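The idea of fitting each predictor to the residuals can be sketched from scratch. This is a minimal illustration, assuming squared-error loss, a single feature, hand-written decision stumps and a shrinkage factor `learning_rate`; all function names and the synthetic data are hypothetical, and it is not how `GradientBoostingRegressor` is implemented internally.

```python
import numpy as np

def fit_stump(x, residuals):
    """Least-squares regression stump on a 1-D feature."""
    best = None
    for threshold in np.unique(x)[:-1]:       # keep both sides non-empty
        left = x <= threshold
        left_mean = residuals[left].mean()
        right_mean = residuals[~left].mean()
        sse = (((residuals[left] - left_mean) ** 2).sum()
               + ((residuals[~left] - right_mean) ** 2).sum())
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]

def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return np.where(x <= threshold, left_mean, right_mean)

def gradient_boost(x, y, n_estimators=50, learning_rate=0.1):
    base = y.mean()                            # initial constant prediction
    prediction = np.full_like(y, base, dtype=float)
    stumps = []
    for _ in range(n_estimators):
        residuals = y - prediction             # errors of the ensemble so far
        stump = fit_stump(x, residuals)        # residuals are the new labels
        prediction += learning_rate * predict_stump(stump, x)
        stumps.append(stump)
    return base, stumps, prediction

rng = np.random.default_rng(0)
x = np.linspace(0, 6, 80)
y = np.sin(x) + rng.normal(scale=0.1, size=x.size)

base, stumps, fitted = gradient_boost(x, y)
mse_start = np.mean((y - base) ** 2)           # error of the constant model
mse_end = np.mean((y - fitted) ** 2)           # error after boosting
print(mse_start, mse_end)
```

Each round shrinks the training residuals a little, which is why the training error after boosting is lower than that of the constant starting prediction.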

In [64]:
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error as MSE

cars = pd.read_csv("mpg.csv")
df_origin = pd.get_dummies(cars)
X = df_origin.drop("mpg", axis=1)
y = df_origin["mpg"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)

gbt = GradientBoostingRegressor(n_estimators=300, max_depth=1, random_state=1)
gbt.fit(X_train, y_train)
y_pred = gbt.predict(X_test)
rmse_test = MSE(y_test, y_pred) ** 0.5
print("Test set RMSE: {:.2f}".format(rmse_test))

Test set RMSE: 4.01


### Define the GB regressor

The regressor defined here is used to predict bike rental demand.

In [65]:
gb = GradientBoostingRegressor(max_depth=4, n_estimators=200, random_state=2)

### Train the GB regressor

In [66]:
bike = pd.read_csv("bike.csv")
X = bike.drop("cnt", axis=1)
y = bike["cnt"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=6)

gb.fit(X_train, y_train)
y_pred = gb.predict(X_test)

### Evaluate the GB regressor

In [68]:
mse_test = MSE(y_test, y_pred)
rmse_test = mse_test ** 0.5
print('Test set RMSE of gb: {:.3f}'.format(rmse_test))
X_train

Test set RMSE of gb: 53.460


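| | hr | holiday | workingday | temp | hum | windspeed | instant | mnth | yr | Clear to partly cloudy | Light Precipitation | Misty |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 241 | 1 | 0 | 1 | 0.64 | 0.89 | 0.0000 | 13245 | 7 | 1 | 1 | 0 | 0 |
| 187 | 19 | 0 | 0 | 0.78 | 0.62 | 0.1642 | 13191 | 7 | 1 | 1 | 0 | 0 |
| 691 | 19 | 0 | 0 | 0.76 | 0.55 | 0.1642 | 13695 | 7 | 1 | 1 | 0 | 0 |
| 604 | 4 | 0 | 1 | 0.66 | 0.74 | 0.1940 | 13608 | 7 | 1 | 1 | 0 | 0 |
| 386 | 2 | 0 | 1 | 0.72 | 0.70 | 0.0896 | 13390 | 7 | 1 | 1 | 0 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1389 | 21 | 0 | 1 | 0.72 | 0.74 | 0.1940 | 14393 | 8 | 1 | 1 | 0 | 0 |
| 618 | 18 | 0 | 1 | 0.92 | 0.40 | 0.3582 | 13622 | 7 | 1 | 1 | 0 | 0 |
| 227 | 11 | 0 | 1 | 0.80 | 0.46 | 0.1940 | 13231 | 7 | 1 | 1 | 0 | 0 |
| 713 | 17 | 0 | 1 | 0.76 | 0.55 | 0.3284 | 13717 | 7 | 1 | 1 | 0 | 0 |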


## Stochastic Gradient Boosting (SGB)

![stochasticg.png](attachment:stochasticg.png)

GB involves an exhaustive search procedure: each CART is trained to find the best split points and features. This may lead to CARTs that use the same split points and possibly the same features. Stochastic gradient boosting can be used to mitigate this effect.

In SGB, each CART is trained on a random subset of the training data. The subset is sampled without replacement. Features are also sampled without replacement when choosing split points, which adds further diversity to the ensemble.
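The two sampling steps can be sketched with NumPy. The helpers `sample_rows` and `sample_features` are hypothetical names used only for illustration; their `subsample` and `max_features` arguments are assumed to play the same role as the fractions passed to `GradientBoostingRegressor`.

```python
import numpy as np

def sample_rows(n_samples, subsample, rng):
    """Row indices used to train one tree, drawn without replacement."""
    n_draw = max(1, int(subsample * n_samples))
    return rng.choice(n_samples, size=n_draw, replace=False)

def sample_features(n_features, max_features, rng):
    """Feature indices considered at one split, drawn without replacement."""
    n_draw = max(1, int(max_features * n_features))
    return rng.choice(n_features, size=n_draw, replace=False)

rng = np.random.default_rng(1)
rows = sample_rows(n_samples=10, subsample=0.8, rng=rng)
features = sample_features(n_features=5, max_features=0.4, rng=rng)
print(sorted(rows.tolist()), sorted(features.tolist()))
```

Because sampling is done without replacement, no row or feature index appears twice in a draw, and different trees see different subsets of the data.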

In [45]:
cars = pd.read_csv("mpg.csv")
df_origin = pd.get_dummies(cars)
X = df_origin.drop("mpg", axis=1)
y = df_origin["mpg"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)
sgbt = GradientBoostingRegressor(max_depth=1, subsample=0.8, max_features=0.2,
                                n_estimators=300, random_state=1)
sgbt.fit(X_train, y_train)
y_pred = sgbt.predict(X_test)
rmse_test = MSE(y_test, y_pred) ** 0.5
print("Test set RMSE: {:.2f}".format(rmse_test))

Test set RMSE: 3.95


### Regression with SGB 

In [61]:
sgbr = GradientBoostingRegressor(max_depth=4, subsample=0.9, max_features=0.75,
                                n_estimators=200, random_state=2)

### Train the SGB regressor

In [69]:
bike = pd.read_csv("bike.csv")
X = bike.drop("cnt", axis=1)
y = bike["cnt"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=6)

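# Fit on the training split only; fitting on (X_test, y_test) would leak the test data
sgbr.fit(X_train, y_train)
y_pred = sgbr.predict(X_test)

### Evaluate the SGB regressor

In [70]:
mse_test = MSE(y_test, y_pred)
rmse_test = mse_test ** 0.5
print('Test set RMSE of sgbr: {:.3f}'.format(rmse_test))

The output originally recorded for this cell, `Test set RMSE of sgbr: 10.708`, came from a run in which the model was fit on the test split (`sgbr.fit(X_test, y_test)`). That figure measures error on data the model had already seen, so it does not show that the stochastic gradient boosting regressor generalizes better than the gradient boosting regressor (test set RMSE 53.460). The two models can only be compared once both are fit on the training split, as in the cell above.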