# Experiment - 5
### Aim: To study and implement bagging using Random Forests.
### Abstract:
This experiment implements bagging through random forests and compares its performance with that of individual decision trees. Bagging, short for bootstrap aggregating, is an ensemble learning method that trains multiple models on bootstrap samples of the dataset (random samples drawn with replacement) and combines their predictions, by majority vote for classification, to reduce variance and overfitting. Random forests apply bagging to decision trees and add a second source of randomness by considering only a random subset of features at each split, which decorrelates the trees. The technique is evaluated on the Iris dataset using accuracy as the performance metric.

### Procedure:

In [1]:
# Import necessary libraries
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

### Step 1: Dataset Selection

In [2]:
# Loading the Iris dataset
data = load_iris()
X = data.data
y = data.target
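
Before modelling, it can help to confirm what was loaded. The following is a minimal sketch that inspects the dataset using attributes of the object returned by `load_iris`; it is not part of the original procedure.

```python
import numpy as np
from sklearn.datasets import load_iris

data = load_iris()
X, y = data.data, data.target

# Basic facts about the dataset: size, feature names, class names, class balance
print("Feature matrix shape:", X.shape)
print("Feature names:", data.feature_names)
print("Class names:", list(data.target_names))
print("Samples per class:", np.bincount(y))
```

The classes are balanced, which is one reason plain accuracy is a reasonable metric for this dataset.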

### Step 2: Data Preprocessing (Not required for the Iris dataset)


### Step 3: Implementation of Bagging using Random Forests

In [3]:
# Initializing the random forest classifier
rf_classifier = RandomForestClassifier(n_estimators=10)
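
The classifier above relies on scikit-learn defaults for the bagging behaviour. As an illustration, the sketch below spells out the parameters that control it and uses the out-of-bag (OOB) estimate, which scores each sample only with trees that did not see it during training. The values `n_estimators=100` and `random_state=42` are assumptions chosen for this sketch, not settings from the experiment.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

# bootstrap=True: each tree is fit on a bootstrap sample of the rows (bagging).
# max_features="sqrt": each split considers a random subset of the features.
# oob_score=True: estimate accuracy from samples left out of each bootstrap sample.
forest = RandomForestClassifier(
    n_estimators=100,
    bootstrap=True,
    max_features="sqrt",
    oob_score=True,
    random_state=42,
)
forest.fit(X, y)

print("Number of trees:", len(forest.estimators_))
print("Out-of-bag accuracy estimate:", forest.oob_score_)
```

The OOB estimate comes for free from bagging, since each bootstrap sample leaves out roughly a third of the rows, and it gives a second view of generalisation without a separate test set.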

### Step 4: Experimental Setup

In [4]:
# Splitting the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
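
With `test_size=0.3`, the 150 samples are divided into 105 for training and 45 for testing. The sketch below checks those sizes and, as an optional variation that the experiment does not use, shows the `stratify` argument of `train_test_split`, which keeps class proportions the same in both parts.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Same split settings as in the experiment
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
print("Train/test sizes:", len(X_train), len(X_test))
print("Test class counts (plain split):", np.bincount(y_test))

# Optional variation: stratify=y preserves the class proportions in each part
X_train_s, X_test_s, y_train_s, y_test_s = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)
print("Test class counts (stratified):", np.bincount(y_test_s))
```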

### Step 5: Performance Evaluation Metrics (Accuracy in this case)
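
Accuracy is the fraction of test samples whose predicted label matches the true label. The sketch below computes it by hand and with `accuracy_score` on a small set of hypothetical labels (`y_true_demo` and `y_pred_demo` are made up for illustration and are not results from this experiment).

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Hypothetical labels, used only to illustrate the metric
y_true_demo = np.array([0, 1, 2, 2, 1])
y_pred_demo = np.array([0, 1, 2, 1, 1])

# Accuracy = number of correct predictions / total number of predictions
manual_accuracy = np.mean(y_true_demo == y_pred_demo)

print("Manual accuracy:", manual_accuracy)
print("accuracy_score:", accuracy_score(y_true_demo, y_pred_demo))
```

Four of the five hypothetical predictions are correct, so both computations give 0.8.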


### Step 6: Comparison and Analysis


In [5]:
# Training the random forest classifier
rf_classifier.fit(X_train, y_train)

# Predicting the labels for the testing set
y_pred = rf_classifier.predict(X_test)

# Calculating the accuracy of the random forest classifier
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy of Random Forest:", accuracy)

Accuracy of Random Forest: 1.0
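
A single train/test split yields one accuracy value, which can hide differences between models on a small dataset. As a possible extension, and not part of the original procedure, the sketch below uses `cross_val_score` with 5 folds to compare a random forest with a single decision tree. The `random_state` values and the fold count are assumptions made for this sketch.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

models = {
    "Random forest": RandomForestClassifier(n_estimators=10, random_state=42),
    "Single decision tree": DecisionTreeClassifier(random_state=42),
}

cv_means = {}
for name, model in models.items():
    # 5-fold cross-validation; folds are stratified by default for classifiers
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    cv_means[name] = scores.mean()
    print(f"{name}: mean accuracy {scores.mean():.3f} (std {scores.std():.3f})")
```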


In [7]:
from sklearn.tree import DecisionTreeClassifier

# Training 10 decision trees, each on the full training set
# (no bootstrap sampling, so this is a baseline rather than true bagging)
decision_trees = []
for i in range(10):
    decision_tree = DecisionTreeClassifier()
    decision_tree.fit(X_train, y_train)
    decision_trees.append(decision_tree)

# Predicting the labels with each decision tree; shape (n_trees, n_test)
y_preds_individual = np.array([tree.predict(X_test) for tree in decision_trees])

# Combining the predictions by majority vote. Averaging and rounding class
# labels is not valid for nominal classes (votes of 0 and 2 would give 1).
y_pred_voted = np.array([np.bincount(col).argmax() for col in y_preds_individual.T])

# Calculating the accuracy of the combined decision-tree predictions
accuracy_individual = accuracy_score(y_test, y_pred_voted)
print("Accuracy of Individual Decision Trees:", accuracy_individual)

Accuracy of Individual Decision Trees: 1.0
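
For contrast with the cell above, which fits every tree on the full training set, the following is a minimal sketch of bagging done by hand: each tree is trained on a bootstrap sample and the predictions are combined by majority vote. The seeds, the tree count, and the names `bagged_trees` and `y_pred_bagged_demo` are assumptions introduced for this illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

rng = np.random.default_rng(42)  # seed chosen arbitrarily for repeatability
n_trees = 10
n_train = len(X_train)

bagged_trees = []
for _ in range(n_trees):
    # Bootstrap sample: draw n_train row indices with replacement
    idx = rng.integers(0, n_train, size=n_train)
    tree = DecisionTreeClassifier(random_state=0)
    tree.fit(X_train[idx], y_train[idx])
    bagged_trees.append(tree)

# Shape (n_trees, n_test): one row of predicted labels per tree
all_preds = np.array([tree.predict(X_test) for tree in bagged_trees])

# Majority vote per test sample; ties resolve to the smallest class label
y_pred_bagged_demo = np.array(
    [np.bincount(col, minlength=3).argmax() for col in all_preds.T]
)
bagged_accuracy = accuracy_score(y_test, y_pred_bagged_demo)

# Single tree on the full training set, for reference
single_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
single_accuracy = accuracy_score(y_test, single_tree.predict(X_test))

print("Accuracy of hand-rolled bagging:", bagged_accuracy)
print("Accuracy of a single decision tree:", single_accuracy)
```

A random forest adds one more ingredient to this scheme: at each split, only a random subset of the features is considered. scikit-learn also provides `BaggingClassifier`, which implements the bootstrap-and-vote loop for any base estimator.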


### Conclusion: 
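Both the random forest and the decision-tree comparison reached an accuracy of 1.0 on the 30% Iris test split, so this experiment does not show a measurable difference between the two. The Iris dataset is small and its classes are easy to separate, which leaves little room for bagging to improve on a single tree. The variance-reduction benefit of random forests would be expected to appear on noisier or larger datasets, or under repeated evaluation such as cross-validation.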