**Bagging:** Combines predictions from models trained on bootstrapped subsets to reduce variance.

**Random Forest:** A bagging ensemble of decision trees in which each split also considers only a random subset of features, which decorrelates the trees and makes predictions more robust.

**Boosting:** Builds models sequentially, each one focusing on the errors of its predecessors, to reduce bias.

**AdaBoost:** Increases the weights of misclassified samples so that the next weak learner concentrates on them.

**Gradient Boosting:** Fits each new model to the gradient of the loss of the current ensemble (the residuals, in the squared-error case), so the loss is minimized iteratively.

* **XGBoost, LightGBM, CatBoost:** Efficient, regularized implementations of Gradient Boosting designed for large datasets.

Bagging and boosting both aim to improve predictive accuracy, but they address different problems: bagging reduces variance (overfitting), while boosting reduces bias.
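
The AdaBoost reweighting described above can be made concrete with a small sketch. It assumes the textbook discrete AdaBoost update for labels in {-1, +1}; the function name `adaboost_round` and the toy labels are illustrative only, and scikit-learn's `AdaBoostClassifier` uses its own variant of this update.

```python
import math

def adaboost_round(weights, y_true, y_pred):
    """One reweighting round of textbook discrete AdaBoost (labels in {-1, +1})."""
    # Weighted error of the current weak learner.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    # Clip so that the logarithm below stays finite.
    err = min(max(err, 1e-10), 1 - 1e-10)
    # Learner weight: larger when the weak learner is more accurate.
    alpha = 0.5 * math.log((1 - err) / err)
    # t * p is +1 for a correct prediction and -1 for a mistake, so
    # mistakes are scaled by exp(alpha) and correct samples by exp(-alpha).
    new = [w * math.exp(-alpha * t * p) for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(new)
    return alpha, [w / total for w in new]

# Toy example (hypothetical data): five samples, the last one misclassified.
y_true = [1, 1, -1, -1, 1]
y_pred = [1, 1, -1, -1, -1]
weights = [0.2] * 5

alpha, new_weights = adaboost_round(weights, y_true, y_pred)
print(round(alpha, 3), [round(w, 3) for w in new_weights])
```

In this toy case the single misclassified sample ends up with half of the total weight, so the next weak learner is pushed to get it right.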

In [1]:
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

In [2]:
data=load_breast_cancer()

In [3]:
X,y=data.data,data.target

In [4]:
X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.2,random_state=42)
scaler=StandardScaler()
X_train_scaled=scaler.fit_transform(X_train)
X_test_scaled=scaler.transform(X_test)

In [5]:
# Train Bagging, AdaBoost, Gradient Boosting, and XGBoost models
from sklearn.ensemble import BaggingClassifier,AdaBoostClassifier,GradientBoostingClassifier
from xgboost import XGBClassifier
from sklearn.metrics import accuracy_score


In [6]:
# bagging: 10 decision trees (the default base estimator), each fit on a bootstrap sample
bagging=BaggingClassifier(n_estimators=10,random_state=42)
bagging.fit(X_train_scaled,y_train)
bagging_acc=accuracy_score(y_test,bagging.predict(X_test_scaled))

#adaboost
ada=AdaBoostClassifier(n_estimators=10,random_state=42)
ada.fit(X_train_scaled,y_train)
ada_acc=accuracy_score(y_test,ada.predict(X_test_scaled))

#gradient boosting
gb=GradientBoostingClassifier(n_estimators=10,learning_rate=0.1,random_state=42)
gb.fit(X_train_scaled,y_train)
gb_acc=accuracy_score(y_test,gb.predict(X_test_scaled))

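#xgboost
# use_label_encoder is omitted: recent XGBoost releases ignore it and only warn that it is unused.
# Note: n_estimators is left at the XGBoost default here, unlike the 10 estimators used above.
xgb=XGBClassifier(eval_metric='logloss',random_state=42)
xgb.fit(X_train_scaled,y_train)
xgb_acc=accuracy_score(y_test,xgb.predict(X_test_scaled))
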
In [7]:
#show accuracy results
import pandas as pd
results_df=pd.DataFrame({'Model':['Bagging','AdaBoost','Gradient Boosting','XGBoost'],
                         'Accuracy':[bagging_acc,ada_acc,gb_acc,xgb_acc]})
results_df

| | Model | Accuracy |
|---|---|---|
| 0 | Bagging | 0.95614 |
| 1 | AdaBoost | 0.964912 |
| 2 | Gradient Boosting | 0.95614 |
| 3 | XGBoost | 0.95614 |
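
The accuracies above are easier to interpret as counts of test predictions. The sketch below is plain arithmetic under two assumptions: that `load_breast_cancer()` provides 569 samples, and that `train_test_split` rounds a fractional test size up. The names `accuracies` and `correct_counts` are illustrative.

```python
import math

# Assumption: load_breast_cancer() provides 569 samples, and
# train_test_split rounds a fractional test size up.
n_samples = 569
n_test = math.ceil(0.2 * n_samples)

# Accuracies copied from the results table above.
accuracies = {
    "Bagging": 0.95614,
    "AdaBoost": 0.964912,
    "Gradient Boosting": 0.95614,
    "XGBoost": 0.95614,
}

# Convert each accuracy back into a count of correct test predictions.
correct_counts = {name: round(acc * n_test) for name, acc in accuracies.items()}

for name, correct in correct_counts.items():
    print(f"{name}: {correct}/{n_test} correct, {n_test - correct} misclassified")
```

Under these assumptions, AdaBoost's lead amounts to a single test sample (110 versus 109 correct), so this one split does not establish a ranking of the models.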




