
In [None]:
# BAGGING

# Importing necessary libraries
from sklearn.ensemble import BaggingClassifier
# BaggingClassifier: This is an ensemble learning method that combines the predictions of multiple classifiers (usually the same type, like decision trees) trained on different subsets of the data. Each classifier is trained on a randomly sampled subset (with replacement) of the original dataset.
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load the Iris dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create a Bagging Classifier with Decision Trees as base learners
bagging_model = BaggingClassifier(estimator=DecisionTreeClassifier(),
                                  n_estimators=10,  # Number of trees
                                  random_state=42)
# estimator=DecisionTreeClassifier(): The base learner. A separate decision tree is trained on each subset of the data.
# n_estimators=10: Specifies that 10 decision trees will be trained. Each tree is trained on a different bootstrap sample of the training set.

# Train the model
bagging_model.fit(X_train, y_train)

# Make predictions on the test set
y_pred = bagging_model.predict(X_test)

# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
# accuracy_score returns a fraction between 0 and 1, so it is printed without a percent sign
print(f"Bagging with Decision Trees Accuracy: {accuracy:.2f}")

Bagging with Decision Trees Accuracy: 1.00


In [None]:
from sklearn.ensemble import AdaBoostClassifier
# AdaBoost (Adaptive Boosting) is a technique that combines multiple weak learners (usually decision trees) to form a strong learner.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Create a synthetic dataset
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
# X: The feature matrix with 1000 rows and 20 columns, representing the input data.
# y: The target array containing the class labels (0 or 1).

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize the AdaBoost classifier
model = AdaBoostClassifier(n_estimators=50, random_state=42)
# n_estimators=50: The maximum number of weak learners (base models) that will be combined. The default weak learner in scikit-learn's AdaBoostClassifier is a decision stump, i.e. a decision tree with max_depth=1.

# Train the model
model.fit(X_train, y_train)

# Make predictions
y_pred = model.predict(X_test)

# Evaluate the accuracy
accuracy = accuracy_score(y_test, y_pred)
# Accuracy = number of correct predictions / total number of predictions
print(f"AdaBoost Accuracy: {accuracy:.2f}")



AdaBoost Accuracy: 0.87


In [None]:
from sklearn.ensemble import GradientBoostingClassifier

# Initialize the Gradient Boosting classifier
model = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, random_state=42)
# n_estimators=100: This specifies the number of decision trees (or estimators) to be trained sequentially. In this case, 100 trees will be built one after the other, with each tree correcting the errors of the previous ones.
# learning_rate=0.1: The learning rate controls how much each new tree contributes to the overall model. A lower learning rate makes the model more conservative, requiring more trees to build a strong model, but it also reduces the risk of overfitting.

# Train the model
model.fit(X_train, y_train)

# Make predictions
y_pred = model.predict(X_test)

# Evaluate the accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Gradient Boosting Accuracy: {accuracy:.2f}")

Gradient Boosting Accuracy: 0.92


What does the learning rate do?

Gradient Boosting works by sequentially adding decision trees to correct the errors made by the previous ones.

The learning rate is a scaling factor that adjusts the contribution of each new tree to the overall model.

If the learning rate is high (e.g., 0.5 or 1.0), each new tree makes a larger correction to the model, quickly adapting to the errors made by the previous trees.
If the learning rate is low (e.g., 0.1 or 0.01), each new tree makes a smaller correction, requiring more trees to achieve a strong model but also making the training process more cautious and controlled.
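
The effect described above can be checked empirically. The sketch below is an illustration under stated assumptions rather than part of the original notebook: it trains three models that differ only in `learning_rate` (the values 1.0, 0.1, and 0.01 are picked for illustration) and uses `staged_predict` to read off test accuracy after 10, 50, and 100 trees. The names `results` and `gb` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Same synthetic data and split as in the AdaBoost cell
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

results = {}
for rate in (1.0, 0.1, 0.01):  # illustrative values, not tuned
    gb = GradientBoostingClassifier(
        n_estimators=100, learning_rate=rate, random_state=42
    )
    gb.fit(X_train, y_train)

    # staged_predict yields the ensemble's predictions after each added tree
    results[rate] = [
        accuracy_score(y_test, y_stage) for y_stage in gb.staged_predict(X_test)
    ]

for rate, staged in results.items():
    print(
        f"learning_rate={rate:<4}: "
        f"10 trees={staged[9]:.2f}, 50 trees={staged[49]:.2f}, 100 trees={staged[99]:.2f}"
    )
```

A high learning rate typically reaches its best accuracy within a few trees, while a low one keeps improving slowly as trees are added. The exact numbers depend on the dataset.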

Why is the learning rate set to 0.1?

A learning rate of 0.1 is a common starting point, and it is also the default value in scikit-learn's GradientBoostingClassifier. It balances how quickly the model fits the training data against how well it generalizes:

Balanced Updates: With a learning rate of 0.1, each tree contributes meaningfully to the overall model, but no single tree dominates, which limits the risk of overfitting.
More Trees, Finer Refinement: A lower learning rate needs more trees to reach the same level of accuracy, so the model is refined in smaller steps over more iterations (trees), at the cost of longer training.

A learning rate of 0.1 is therefore a reasonable default rather than a guaranteed optimum. It should be tuned together with the number of trees (n_estimators) on the specific dataset, because the two parameters trade off against each other.
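
One way to tune the learning rate and the number of trees together is a small cross-validated grid search. This is a minimal sketch, not the notebook's own method: the grid values in `param_grid` and the choice of 3 folds are assumptions made to keep the run short, and `search` is a hypothetical name.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Same synthetic data and split as in the cells above
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Assumption: a deliberately small, illustrative grid
param_grid = {
    "learning_rate": [0.01, 0.1, 0.5],
    "n_estimators": [50, 100],
}

search = GridSearchCV(
    GradientBoostingClassifier(random_state=42),
    param_grid,
    cv=3,                # 3-fold cross-validation on the training set only
    scoring="accuracy",
)
search.fit(X_train, y_train)

print(f"Best parameters: {search.best_params_}")
print(f"Best cross-validated accuracy: {search.best_score_:.2f}")
# The held-out test set is used once, after the parameters are chosen
print(f"Test accuracy of the refitted best model: {search.score(X_test, y_test):.2f}")
```

Searching both parameters at once matters because the best number of trees usually shifts as the learning rate changes.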