In [1]:
# Ensemble learning is a machine learning technique that involves combining multiple individual models, known as base learners, 
# to make predictions or decisions. The main idea behind ensemble learning is that by combining the predictions of multiple models, 
# the overall performance can be improved compared to using a single model.
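# A rough illustration of why combining predictions can help. This is a sketch under an idealized assumption
# that is not part of the description above: the base learners err independently, and each one is correct with
# the same probability p. The helper name majority_vote_accuracy is hypothetical.

```python
from math import comb


def majority_vote_accuracy(p, n):
    """Probability that a strict majority of n independent voters is correct.

    Assumes every voter is correct with probability p and that n is odd,
    so ties cannot occur.
    """
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))


# Under the independence assumption, accuracy grows with the number of voters
for n in (1, 5, 11, 21):
    print(n, round(majority_vote_accuracy(0.7, n), 4))
```

# Real base learners are trained on overlapping data and are correlated, so the gain in practice is smaller
# than this idealized calculation suggests.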
    
# Ensemble learning can be used for both classification and regression problems, and it is particularly effective when the base learners are diverse and
# make different types of errors. Popular ensemble methods include bagging, boosting, and stacking; bagging and boosting are described in more
# detail below:

# (1). Bagging: Bagging stands for Bootstrap Aggregating. It creates multiple subsets of the training data through
#     bootstrapping (sampling with replacement), trains a separate base learner on each subset, and then combines their predictions,
#     typically by voting (classification) or averaging (regression). The best-known example of bagging is the Random Forest algorithm,
#     which bags decision trees and also randomizes the features considered at each split.

# (2). Boosting: Boosting is an iterative ensemble method that trains weak learners sequentially, with each new learner
#     concentrating on the mistakes of the previous ones, for example by giving more weight to misclassified instances (AdaBoost)
#     or by fitting the remaining errors of the current ensemble (Gradient Boosting). Examples of boosting algorithms include
#     AdaBoost, Gradient Boosting, and XGBoost.
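# The sequential behaviour of boosting described above can be sketched with scikit-learn's AdaBoostClassifier.
# This is a minimal illustration, not a tuned model: the dataset settings mirror the ones used later in this
# notebook, and n_estimators=50 is an arbitrary choice made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Same synthetic data and split as the bagging example in this notebook
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# By default each weak learner is a depth-1 decision tree (a "stump")
booster = AdaBoostClassifier(n_estimators=50, random_state=42)
booster.fit(X_train, y_train)

# staged_predict yields the ensemble's predictions after each boosting round
staged_accuracy = [accuracy_score(y_test, y_pred)
                   for y_pred in booster.staged_predict(X_test)]
print("Accuracy after 1 round:", staged_accuracy[0])
print("Accuracy after all rounds:", staged_accuracy[-1])
```

# Comparing the first and last entries of staged_accuracy shows how much the later rounds add on this dataset.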

In [5]:
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

In [7]:
# Generate a synthetic dataset

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

In [9]:
# Split the dataset into training (80%) and testing (20%) sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [15]:
# Initialize a list to store the base learners
base_learners = []

# Number of base learners (each one is a small random forest)
num_base_learners = 10

# Train the base learners
for _ in range(num_base_learners):
    # Create a bootstrap sample of the training data (sampling with replacement).
    # Note: np.random is not seeded here, so the samples differ between runs.
    bootstrap_indices = np.random.choice(len(X_train),
                                         size=len(X_train), replace=True)
    X_bootstrap = X_train[bootstrap_indices]
    y_bootstrap = y_train[bootstrap_indices]

    # Create and train a base learner (a Random Forest of 10 trees).
    # Every learner shares random_state=42, so the diversity between
    # learners comes only from the bootstrap samples.
    base_learner = RandomForestClassifier(n_estimators=10, random_state=42)
    base_learner.fit(X_bootstrap, y_bootstrap)

    # Add the trained base learner to the list
    base_learners.append(base_learner)

# Make predictions with each base learner
base_predictions = []
for base_learner in base_learners:
    y_pred = base_learner.predict(X_test)
    base_predictions.append(y_pred)

# Combine the predictions using majority voting. The labels are 0/1, so the
# mean is the fraction of votes for class 1. np.round rounds half to even,
# so an exact 5-5 tie (mean 0.5) is resolved as class 0.
ensemble_predictions = np.round(np.mean(base_predictions, axis=0))
accuracy = accuracy_score(y_test, ensemble_predictions)
print("Ensemble Accuracy:", accuracy)

Ensemble Accuracy: 0.87
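
# For comparison, the manual bootstrap-and-vote loop above can be approximated with scikit-learn's
# BaggingClassifier. This is a sketch under assumptions, not a reproduction of the cell above: it uses plain
# decision trees as base learners instead of random forests, and BaggingClassifier averages predicted class
# probabilities when the base learner provides them rather than counting hard votes. Its accuracy will
# therefore not necessarily match the figure printed above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Same synthetic data and split as above
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# The base learner is passed positionally; each of the 10 trees is fitted
# on its own bootstrap sample of the training data
bagging = BaggingClassifier(DecisionTreeClassifier(), n_estimators=10,
                            random_state=42)
bagging.fit(X_train, y_train)

bagging_accuracy = accuracy_score(y_test, bagging.predict(X_test))
print("BaggingClassifier Accuracy:", bagging_accuracy)
```

# Setting random_state on the BaggingClassifier makes the bootstrap samples reproducible, which the manual
# loop above does not do.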
