##### Bagging (Bootstrap Aggregating):

Bagging is like asking a group of friends for advice, each with their own opinion, and then averaging their answers (or taking a vote) to reach a more reliable decision.
In machine learning, it works similarly. We train multiple models, each on a different random sample of the training data drawn with replacement, and combine their predictions by voting or averaging. Because the individual models make different mistakes, the combined prediction is usually more stable than any single model.

##### Bootstrap Sampling:

Explanation:

Imagine you have a big jar of candies. Instead of grabbing one handful, you pick candies one at a time and put each candy back in the jar before picking the next. Repeating this builds many different groups of candies, and in each group some candies appear more than once while others are missing.

Workflow:

-   Take a random candy from the jar.
-   Put the candy back into the jar.
-   Repeat this process several times to create different groups of candies.
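
The steps above can be sketched with NumPy. This is a minimal illustration, not part of the original notebook: the ten-candy jar, the seed, and the variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(42)   # seed chosen only for reproducibility
candies = np.arange(10)           # a hypothetical jar of ten candies, labelled 0..9

# Build three groups: each is as large as the jar and drawn with replacement
for i in range(3):
    sample = rng.choice(candies, size=candies.size, replace=True)
    missing = np.setdiff1d(candies, sample)
    print(f"group {i}: picked {np.sort(sample)}, never picked {missing}")

# With a large jar, roughly 63% of the candies show up in any one group
n = 10_000
big_sample = rng.choice(n, size=n, replace=True)
unique_fraction = np.unique(big_sample).size / n
print(f"fraction of distinct candies drawn: {unique_fraction:.3f}")
```

The candies that a group never picked are the ones that later serve as out-of-bag candies for the model trained on that group.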



In [20]:
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

iris = load_iris()
X, y = iris.data, iris.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

base_classifier = DecisionTreeClassifier(random_state=42)
bagging_classifier = BaggingClassifier(base_classifier, n_estimators=5, random_state=42)

bagging_classifier.fit(X_train, y_train)
predictions = bagging_classifier.predict(X_test)

accuracy = accuracy_score(y_test, predictions)
print("Bootstrap Sampling Accuracy:", accuracy)


Bootstrap Sampling Accuracy: 1.0


##### Random Forest:

Explanation:

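Now, imagine each friend has seen a slightly different set of candies and pays attention to different details: one notices colors, another shapes, and so on. In a random forest, each friend is a decision tree trained on its own bootstrap sample, and at every split the tree may only consider a random subset of the features. When you have a question, you ask all your friends, and they vote on the answer. Because the trees make different mistakes, the vote is more reliable than any single tree.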

Workflow:

-   Each friend (tree) studies a bootstrap sample of candies and, at each decision, looks at only a random subset of the candy's details (features) to guess its color.
-   When you have a new candy, ask each friend for their guess.
-   Count the votes and choose the color that most friends agreed on.
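
To make the voting step concrete, the sketch below asks every tree in a small forest for its guess and counts the votes by hand. It is only an illustration under a few assumptions: the variable names are made up for the example, and the iris labels are already 0, 1, 2, so the class indices returned by the individual trees can be used directly. Note that scikit-learn's own RandomForestClassifier averages the trees' predicted probabilities rather than counting hard votes, so the two can occasionally disagree.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

forest = RandomForestClassifier(n_estimators=5, random_state=42).fit(X_train, y_train)

# One row of guesses per tree; the trees inside a forest return class indices as floats
tree_votes = np.array([tree.predict(X_test) for tree in forest.estimators_]).astype(int)

# For each test candy, count the votes per class and keep the most popular one
majority = np.array([np.bincount(col, minlength=3).argmax() for col in tree_votes.T])

agreement = np.mean(majority == forest.predict(X_test))
print("share of test samples where the hand count matches forest.predict:", agreement)
```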

In [21]:
from sklearn.ensemble import RandomForestClassifier

random_forest_classifier = RandomForestClassifier(n_estimators=5, random_state=42)

random_forest_classifier.fit(X_train, y_train)
predictions = random_forest_classifier.predict(X_test)

accuracy = accuracy_score(y_test, predictions)
print("Random Forest Accuracy:", accuracy)

Random Forest Accuracy: 0.9666666666666667


##### Bagging with Different Models:

Explanation:

Now, your friends have different skills. One friend is good at guessing, and another is good at drawing. Bagging is not limited to decision trees: any model can serve as the base model. In scikit-learn, a single BaggingClassifier trains copies of one model type, so mixing skills means building one bagged group per model type and then combining the groups' opinions.

Workflow:

-   Ask one friend (model) to focus on guessing colors.
-   Ask another friend (model) to focus on drawing shapes.
-   When you have a question, each friend gives their opinion based on their skill.
-   Combine all the opinions to make a decision.
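
One way to follow this workflow in scikit-learn is sketched below: one bagged group per model type, combined with a VotingClassifier. This is an assumption about how to realize the idea, not the only option. The names bagged_logreg, bagged_svc, and mixed are made up for the example, and max_iter=1000 is set only to help the logistic regression converge.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# One bagging ensemble per model type. The base model is passed positionally so
# the call does not depend on whether the keyword is base_estimator or estimator.
bagged_logreg = BaggingClassifier(LogisticRegression(max_iter=1000), n_estimators=5, random_state=42)
bagged_svc = BaggingClassifier(SVC(), n_estimators=5, random_state=42)

# Hard voting: each bagged group casts one vote per sample
mixed = VotingClassifier(
    estimators=[("logreg", bagged_logreg), ("svc", bagged_svc)],
    voting="hard",
)
mixed.fit(X_train, y_train)

mixed_accuracy = accuracy_score(y_test, mixed.predict(X_test))
print("Mixed bagging accuracy:", mixed_accuracy)
```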

In [22]:
from sklearn.ensemble import BaggingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

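# Candidate base models. BaggingClassifier accepts a single model type,
# so this list is not used by the ensemble below.
base_classifiers = [LogisticRegression(), SVC()]

# With no base model given, BaggingClassifier falls back to a decision tree.
# (The base_estimator keyword was renamed to estimator in scikit-learn 1.2.)
bagging_classifier = BaggingClassifier(n_estimators=5,
                                       random_state=42)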

# Train the BaggingClassifier
bagging_classifier.fit(X_train, y_train)

# Make predictions
predictions = bagging_classifier.predict(X_test)

# Calculate accuracy
accuracy = accuracy_score(y_test, predictions)
print("Bagging with Different Models Accuracy:", accuracy)

Bagging with Different Models Accuracy: 1.0


##### Out-of-Bag (OOB) Score:

Explanation:

Imagine you have some candies that your friends have never seen before because they were busy with other candies. The out-of-bag score is like asking your friends to guess the color of these candies they have never seen. It helps you check how good your friends are at guessing new things.

Workflow:

-   Some candies are left out of each group when the groups are made (out-of-bag candies).
-   Ask each friend (model) to guess the color only of the candies that were left out of their own group.
-   Compare these guesses with the true colors to see how well your friends handle candies they have not seen.
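
The workflow above can be written out by hand, which shows what an out-of-bag score measures. The sketch below is an illustration under stated assumptions: it uses the full iris data rather than the train split, 25 trees, and made-up names such as votes and manual_oob_score. It is not scikit-learn's internal implementation. With only a handful of models, some candies may never be left out of any group, which is why the sketch uses more trees and checks for votes before scoring.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
n_samples, n_classes = X.shape[0], 3
rng = np.random.default_rng(42)

# votes[i, c] counts how many trees that never saw sample i guessed class c
votes = np.zeros((n_samples, n_classes), dtype=int)

for _ in range(25):
    in_bag = rng.choice(n_samples, size=n_samples, replace=True)  # bootstrap group
    oob_mask = np.ones(n_samples, dtype=bool)
    oob_mask[in_bag] = False                                      # left-out candies
    oob_idx = np.flatnonzero(oob_mask)

    tree = DecisionTreeClassifier(random_state=0).fit(X[in_bag], y[in_bag])
    votes[oob_idx, tree.predict(X[oob_idx])] += 1

# Score only the samples that were out-of-bag for at least one tree
has_vote = votes.sum(axis=1) > 0
manual_oob_score = np.mean(votes[has_vote].argmax(axis=1) == y[has_vote])
print("Manual out-of-bag score:", manual_oob_score)
```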

In [23]:
bagging_classifier = BaggingClassifier(base_classifier, n_estimators=5, oob_score=True, random_state=42)

bagging_classifier.fit(X_train, y_train)

# OOB score
oob_score = bagging_classifier.oob_score_
print("Out-of-Bag Score:", oob_score)

Out-of-Bag Score: 0.9083333333333333




##### Feature Bagging:

Explanation:

Now each candy has several details, such as pattern, flavor, and size. Feature bagging is like letting each friend see only some of these details when guessing the candy's color. Each friend looks at a different subset of the details (features).

Workflow:

-   Each friend (model) looks at only some of the details (features) of each candy.
-   They guess the color without considering other details.
-   Combine all the color guesses from different friends.
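
A minimal sketch of this workflow with BaggingClassifier follows, assuming decision trees as the friends and half of the features per model. Setting bootstrap=False keeps every training sample for each model, so only the features vary. The name subspace_classifier is made up for the example.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

subspace_classifier = BaggingClassifier(
    DecisionTreeClassifier(random_state=42),
    n_estimators=5,
    bootstrap=False,    # every model keeps all training samples
    max_features=0.5,   # each model gets a random 2 of the 4 iris features
    random_state=42,
)
subspace_classifier.fit(X_train, y_train)

# Show which feature columns each model was given
for i, cols in enumerate(subspace_classifier.estimators_features_):
    print(f"model {i} uses feature columns {sorted(cols.tolist())}")

subspace_accuracy = accuracy_score(y_test, subspace_classifier.predict(X_test))
print("Random feature subsets accuracy:", subspace_accuracy)
```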

In [24]:
from sklearn.ensemble import BaggingClassifier
from sklearn.ensemble import RandomForestClassifier

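# The cell below keeps only the first two of the four iris features. This is one
# fixed subset shared by every model; to give each model its own random subset,
# set max_features on BaggingClassifier instead.
# Note: base_classifier now refers to a random forest, and later cells reuse it.
base_classifier = RandomForestClassifier(random_state=42)
feature_bagging_classifier = BaggingClassifier(base_classifier, n_estimators=5, random_state=42)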

feature_bagging_classifier.fit(X_train[:, :2], y_train)
predictions = feature_bagging_classifier.predict(X_test[:, :2])

accuracy = accuracy_score(y_test, predictions)
print("Feature Bagging Accuracy:", accuracy)

Feature Bagging Accuracy: 0.8


##### Pasting:

Explanation:

Pasting is like asking your friends to pick candies without putting them back in the jar. Each friend can only pick each candy once. This way, you create groups of candies without repeats.

Workflow:

-   Take a random candy from the jar.
-   Don't put the candy back into the jar.
-   Repeat until the group is full, then refill the jar before building the next group, so that no single group contains repeats.
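
The difference from bootstrap sampling is a single flag, as the sketch below illustrates. The ten-candy jar, the group size of 8, and max_samples=0.8 are assumptions chosen for the example. The group size matters: if every group were as large as the jar, drawing without replacement would hand every friend the whole jar.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Picking candies without putting them back: no repeats inside a group
rng = np.random.default_rng(42)
candies = np.arange(10)
group = rng.choice(candies, size=8, replace=False)
print("group without repeats:", np.sort(group))

# The same idea with BaggingClassifier: bootstrap=False plus max_samples < 1.0
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

pasting = BaggingClassifier(
    DecisionTreeClassifier(random_state=42),
    n_estimators=5,
    bootstrap=False,   # sample without replacement
    max_samples=0.8,   # each model gets a different 80% of the training set
    random_state=42,
).fit(X_train, y_train)

# Each model's sample indices contain no duplicates
no_repeats = [len(np.unique(idx)) == len(idx) for idx in pasting.estimators_samples_]
print("every group free of repeats:", all(no_repeats))
```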

In [25]:
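# bootstrap=False samples without replacement. With the default max_samples=1.0
# every model receives the full training set, so pass max_samples < 1.0 to give
# each model a different group.
pasting_classifier = BaggingClassifier(base_classifier, n_estimators=5, bootstrap=False, random_state=42)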

pasting_classifier.fit(X_train, y_train)
predictions = pasting_classifier.predict(X_test)

accuracy = accuracy_score(y_test, predictions)
print("Pasting Accuracy:", accuracy)

Pasting Accuracy: 1.0


##### Random Patches:

Explanation:

Imagine candies with different colors, patterns, and flavors. Random patches are like asking your friends to focus on only a part of the candies and a part of the features. Each friend looks at a subset of candies and a subset of details.

Workflow:

-   Each friend (model) looks at only some candies and some features.
-   They guess the color without considering the whole picture.
-   Combine all the color guesses from different friends.
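
What one friend actually sees can be sketched by cutting a single patch out of the data by hand. This is only an illustration: the patch sizes and variable names are assumptions, and rows are drawn with replacement while columns are drawn without, mirroring BaggingClassifier's defaults for bootstrap and bootstrap_features.

```python
import numpy as np
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
n_samples, n_features = X.shape
rng = np.random.default_rng(42)

# Some candies: 80% as many rows as the data set, drawn with replacement
rows = rng.choice(n_samples, size=(n_samples * 4) // 5, replace=True)
# Some features: 3 of the 4 columns, drawn without replacement
cols = rng.choice(n_features, size=3, replace=False)

# np.ix_ selects the chosen rows and columns together, giving one patch
patch = X[np.ix_(rows, cols)]
patch_labels = y[rows]
print("patch shape:", patch.shape, "feature columns:", sorted(cols.tolist()))
```

A model trained on this patch must be given the same feature columns at prediction time, which is the bookkeeping BaggingClassifier handles internally.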

In [26]:
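random_patches_classifier = BaggingClassifier(
    base_classifier,
    n_estimators=5,
    max_samples=0.8,   # each model draws a sample 80% the size of the training set
    max_features=0.8,  # and sees only a random subset of the features
    random_state=42,
)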

random_patches_classifier.fit(X_train, y_train)
predictions = random_patches_classifier.predict(X_test)

accuracy = accuracy_score(y_test, predictions)
print("Random Patches Accuracy:", accuracy)

Random Patches Accuracy: 1.0
