# Boosting

Boosting is an ensemble technique where models are trained sequentially, each focusing on correcting the errors of the previous model.

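Boosting mainly reduces bias, because each new model concentrates on the examples the earlier models got wrong. Bagging mainly reduces variance, because it averages models that were trained independently.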

| Feature      | Bagging         | Boosting     |
| ------------ | --------------- | ------------ |
| Training     | Parallel        | Sequential   |
| Focus        | Reduce variance | Reduce bias  |
| Data weights | Equal           | Weighted     |
| Overfitting  | Reduced         | Can increase |
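
To make the "Weighted" entry in the table concrete, here is a minimal sketch of a single reweighting round in the style of binary AdaBoost. The helper name `adaboost_reweight` and the toy data are illustrative assumptions, not part of scikit-learn, and the sketch assumes the weak learner's weighted error lies strictly between 0 and 1.

```python
import math


def adaboost_reweight(weights, correct):
    """One AdaBoost-style round (hypothetical helper): return (alpha, new_weights)."""
    # Weighted error: share of the total weight on misclassified samples.
    err = sum(w for w, ok in zip(weights, correct) if not ok) / sum(weights)
    # Vote strength of this weak learner; larger when the error is smaller.
    alpha = 0.5 * math.log((1 - err) / err)
    # Shrink the weight of correct samples, grow the weight of mistakes.
    raw = [w * math.exp(-alpha if ok else alpha) for w, ok in zip(weights, correct)]
    total = sum(raw)
    # Normalize so the weights form a distribution again.
    return alpha, [w / total for w in raw]


# Toy round: five equally weighted samples, the last one misclassified.
weights = [0.2] * 5
correct = [True, True, True, True, False]
alpha, new_weights = adaboost_reweight(weights, correct)
print(round(alpha, 3), [round(w, 3) for w in new_weights])
```

In this toy round the single misclassified sample ends up carrying half of the total weight, so the next weak learner is pushed toward getting it right.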


## Key Difference from Bagging

Bagging → models trained independently

Boosting → models trained sequentially
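
The sequential dependence can be sketched with a toy regression example in which every round fits a one-split "stump" to the residuals left by the rounds before it. This is a gradient-boosting-style illustration with squared error, not the AdaBoost algorithm, and the names `fit_stump`, `boost_residuals`, and `mse` as well as the data are hypothetical.

```python
def fit_stump(xs, residuals):
    """Least-squares one-split regressor on 1-D inputs (hypothetical helper)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost_residuals(xs, ys, n_rounds, learning_rate=0.5):
    """Return training predictions after n_rounds of sequential residual fitting."""
    base = sum(ys) / len(ys)          # round 0: predict the mean
    preds = [base] * len(ys)
    for _ in range(n_rounds):
        # Each round sees only what the previous rounds failed to explain.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
    return preds


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1, 6.0]

mse_baseline = mse(ys, boost_residuals(xs, ys, n_rounds=0))
mse_1 = mse(ys, boost_residuals(xs, ys, n_rounds=1))
mse_20 = mse(ys, boost_residuals(xs, ys, n_rounds=20))
print(mse_baseline, mse_1, mse_20)
```

Because round `k` needs the predictions from rounds `1..k-1`, the loop cannot be parallelized across rounds, which is the contrast with bagging drawn above.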

In [9]:
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from xgboost import XGBClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

In [10]:
# Load data
X, y = load_iris(return_X_y=True)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

## AdaBoost

In [11]:
# Train the AdaBoost model (the default weak learner is a depth-1 decision tree)
ada_model = AdaBoostClassifier(n_estimators=100, random_state=42)
ada_model.fit(X_train, y_train)

AdaBoostClassifier(estimator=None, n_estimators=100, learning_rate=1.0,
                   algorithm='deprecated', random_state=42)


In [13]:
y_pred = ada_model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
accuracy

0.9333333333333333
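
To see how the ensemble's test accuracy develops as weak learners are added one after another, scikit-learn's `staged_predict` yields the prediction after each boosting round. The following is a minimal sketch that repeats the setup above so it runs on its own; the checkpoint values (1, 10, 50) are arbitrary choices, and the printed numbers depend on the installed scikit-learn version.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

ada_model = AdaBoostClassifier(n_estimators=100, random_state=42)
ada_model.fit(X_train, y_train)

# One accuracy value per boosting round actually fitted
# (AdaBoost may stop before n_estimators is reached).
staged_accuracy = [
    accuracy_score(y_test, y_stage)
    for y_stage in ada_model.staged_predict(X_test)
]

# Print a few checkpoints, skipping any beyond the number of fitted rounds.
for n in sorted({1, 10, 50, len(staged_accuracy)}):
    if n <= len(staged_accuracy):
        print(f"{n:3d} estimators: {staged_accuracy[n - 1]:.4f}")
```

The last entry of `staged_accuracy` corresponds to the full ensemble, so it should match the score returned by `ada_model.predict` above.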

## GradientBoostingClassifier

In [14]:
gb_model = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, random_state=42)

In [15]:
gb_model.fit(X_train, y_train)

GradientBoostingClassifier(loss='log_loss', learning_rate=0.1, n_estimators=100,
                           subsample=1.0, criterion='friedman_mse',
                           min_samples_split=2, min_samples_leaf=1,
                           min_weight_fraction_leaf=0.0, max_depth=3,
                           min_impurity_decrease=0.0, ...)


In [16]:
y_pred_gb = gb_model.predict(X_test)    

In [18]:
accuracy_gb = accuracy_score(y_test, y_pred_gb)
accuracy_gb

1.0
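
With `test_size=0.2` the test set holds only 30 of the 150 iris samples, so a single misclassification moves the accuracy by about 3.3 percentage points. A perfect score on such a small split is therefore worth cross-checking. One possible sketch uses 5-fold cross-validation via `cross_val_score`; the choice of `cv=5` is an assumption made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

gb_model = GradientBoostingClassifier(
    n_estimators=100, learning_rate=0.1, random_state=42
)

# Fit and score the model on 5 different train/validation splits.
cv_scores = cross_val_score(gb_model, X, y, cv=5, scoring="accuracy")
print(f"mean accuracy: {cv_scores.mean():.4f} (std: {cv_scores.std():.4f})")
```

The spread across folds gives a rough sense of how much a single-split accuracy can vary on a dataset of this size.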

## XGBoost

In [19]:
xgb_model = XGBClassifier(
    n_estimators=100,
    learning_rate=0.1,
    max_depth=3,
    eval_metric='mlogloss',
    random_state=42,
)

In [20]:
xgb_model.fit(X_train, y_train)


XGBClassifier(objective='multi:softprob', base_score=None, booster=None,
              callbacks=None, colsample_bylevel=None, colsample_bynode=None,
              colsample_bytree=None, device=None, early_stopping_rounds=None,
              enable_categorical=False, ...)


In [21]:
y_pred_xgb = xgb_model.predict(X_test)

In [23]:
accuracy_xgb = accuracy_score(y_test, y_pred_xgb)
accuracy_xgb

1.0