Bagging (Bootstrap Aggregating) is an ensemble learning technique that improves the accuracy and stability of machine learning models by reducing variance. It trains multiple base models (often high-variance models, like Decision Trees) independently on bootstrap samples of the training data, each drawn at random with replacement, and combines their predictions (usually through averaging for regression or majority voting for classification).
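
The two ingredients of bagging, bootstrap sampling and vote aggregation, can be sketched in a few lines of plain Python. This is only an illustration: the helper names `bootstrap_sample` and `majority_vote` are made up for this sketch and are not part of scikit-learn, and the integers in `data` stand in for training rows.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) items from rows at random, with replacement."""
    return [rng.choice(rows) for _ in rows]


def majority_vote(predictions):
    """Return the label predicted by the largest number of base models."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)
data = list(range(10))  # stand-in for ten training rows

# Each base model would be trained on its own bootstrap sample.
samples = [bootstrap_sample(data, rng) for _ in range(3)]
for sample in samples:
    # Some rows appear more than once, others are left out entirely.
    print(sorted(sample))

# Combine the (hypothetical) predictions of three base models for one input.
print(majority_vote(["setosa", "virginica", "setosa"]))  # setosa
```

Because each sample omits some rows and repeats others, the base models see slightly different data, which is what makes averaging their predictions reduce variance.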



Boosting is another ensemble learning technique. It improves model performance mainly by reducing bias, and often reduces variance as well. It builds models sequentially, with each model correcting the errors of its predecessor. In AdaBoost, the boosting algorithm used in this notebook, data points misclassified by earlier models receive higher weights, forcing subsequent models to focus on these hard-to-classify instances.
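
To make the reweighting idea concrete, here is a minimal sketch of a single round of the classic binary AdaBoost weight update. The function name `adaboost_reweight` is hypothetical, the sketch assumes the weighted error lies strictly between 0 and 1, and scikit-learn's multiclass implementation uses a related but not identical formula.

```python
import math


def adaboost_reweight(weights, misclassified):
    """One round of the binary AdaBoost weight update.

    weights: current sample weights, summing to 1.
    misclassified: booleans, True where the current model was wrong.
    Assumes the weighted error is strictly between 0 and 1.
    """
    # Weighted error of the current model.
    err = sum(w for w, wrong in zip(weights, misclassified) if wrong)
    # Model weight: larger when the model makes fewer weighted errors.
    alpha = 0.5 * math.log((1 - err) / err)
    # Increase the weights of mistakes, decrease the weights of correct points.
    updated = [
        w * math.exp(alpha if wrong else -alpha)
        for w, wrong in zip(weights, misclassified)
    ]
    total = sum(updated)
    return [w / total for w in updated], alpha


weights = [0.2] * 5
misclassified = [False, False, True, False, False]
new_weights, alpha = adaboost_reweight(weights, misclassified)
print(new_weights)  # the misclassified point now carries half the total weight
print(alpha)
```

With five equally weighted points and one mistake, the misclassified point's weight rises from 0.2 to 0.5 while each correct point drops to 0.125, so the next model is pushed toward the example the previous one got wrong.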



Why Use Bagging and Boosting?

Improved Accuracy: Both techniques enhance model performance by combining multiple weak learners.

Robustness: Bagging reduces overfitting by averaging out the variance of individual models, while Boosting helps correct underfitting by reducing bias.

Versatility: They work with various machine learning algorithms as base models.

Real-World Applications: Widely used in tasks like fraud detection, image recognition, and financial forecasting.

In [9]:
import numpy as np

# Bagging (Bootstrap Aggregating)

In [1]:
# Import necessary libraries
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import accuracy_score

In [2]:
# Load Iris dataset
iris = load_iris()
X = iris.data
y = iris.target

In [3]:
# Split dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

In [4]:
# Initialize a Decision Tree Classifier
dt_classifier = DecisionTreeClassifier(random_state=42)

In [5]:

# Initialize Bagging Classifier with Decision Trees as base learners
bagging_classifier = BaggingClassifier(estimator=dt_classifier, n_estimators=50, random_state=42)


In [6]:
# Train the model
bagging_classifier.fit(X_train, y_train)


In [7]:
# Predict on test data
y_pred_bagging = bagging_classifier.predict(X_test)

In [8]:
# Calculate accuracy
accuracy_bagging = accuracy_score(y_test, y_pred_bagging)
print(f"Bagging Model Accuracy: {accuracy_bagging:.4f}")


Bagging Model Accuracy: 1.0000


# Boosting

In [10]:
# Import AdaBoost classifier
from sklearn.ensemble import AdaBoostClassifier

In [11]:
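# Initialize AdaBoost with Decision Trees as base learners
# Note: dt_classifier is a fully grown tree. AdaBoost is usually paired with
# shallow trees (e.g. max_depth=1); a tree that already fits the training data
# perfectly leaves no errors to reweight, so boosting may stop after one round.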
adaboost_classifier = AdaBoostClassifier(estimator=dt_classifier, n_estimators=50, random_state=42)

In [12]:
# Train the model
adaboost_classifier.fit(X_train, y_train)

In [13]:
# Predict on test data
y_pred_adaboost = adaboost_classifier.predict(X_test)


In [14]:
# Calculate accuracy
accuracy_adaboost = accuracy_score(y_test, y_pred_adaboost)
print(f"AdaBoost Model Accuracy: {accuracy_adaboost:.4f}")


AdaBoost Model Accuracy: 1.0000
