## Ensemble learning

![ense,ble.webp](attachment:ense,ble.webp)

## Lecture Housekeeping:

- Disrespectful language is prohibited in questions. This is a supportive learning environment for all, so please engage accordingly.
    - Please review the Code of Conduct (in the Student Undertaking Agreement) if unsure.
- No question is daft or silly - ask them!
- There are Q&A sessions midway and at the end of the session, should you wish to ask any follow-up questions.
- Should you have any questions after the lecture, please schedule a mentor session.
- For all non-academic questions, please submit a query: [www.hyperiondev.com/support](www.hyperiondev.com/support)

## What is Ensemble Learning?

Ensemble learning is a machine learning technique that combines the predictions of multiple models to produce a prediction that is often more accurate than any single model's. It works by drawing on the strengths of the individual models while letting the combination offset each model's weaknesses.

## Key Components

### 1. Base Models

Ensemble learning starts with a set of base models. These are individual, often simple, machine learning models that serve as building blocks for a more complex model. Ensemble methods such as bagging and boosting combine them to improve the accuracy and performance of the final model.

### 2. Diversity

To be effective, an ensemble needs diversity among its base models: the individual models should make different kinds of errors, so that one model's mistakes can be outvoted or averaged out by the others. Diversity is achieved by training the base models on different subsets of the data, using different parameter settings, or using different algorithms.

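In [6]:
# Illustrative placeholders: four base models with different parameter settings.
# Each returns a label standing in for its prediction.
def base1(parameter1, parameter2):
    return "prediction 1"

def base2(parameter3, parameter4):
    return "prediction 1"

def base3(parameter1, parameter2):
    return "prediction 1"

def base4(parameter3, parameter4):
    return "prediction 2"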
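
To make the idea of diversity concrete, here is a minimal sketch that trains three base models with different algorithms and measures how often they disagree on unseen data. The choice of models, their settings, and the 75/25 split are assumptions made for illustration, not part of the lecture material.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, split so that disagreement is measured on unseen samples
X, y = make_classification(n_samples=1000, n_features=10, n_classes=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# Three base models built with different algorithms and settings (illustrative choice)
base_models = {
    "tree": DecisionTreeClassifier(max_depth=3, random_state=42),
    "knn": KNeighborsClassifier(n_neighbors=5),
    "logreg": LogisticRegression(max_iter=1000),
}

predictions = {}
for name, model in base_models.items():
    model.fit(X_train, y_train)
    predictions[name] = model.predict(X_test)

# A sample counts as a disagreement if the three models do not all predict the same class
disagree = (predictions["tree"] != predictions["knn"]) | (predictions["knn"] != predictions["logreg"])
disagreement_rate = disagree.mean()
print("Share of test samples with at least one dissenting model:", disagreement_rate)
```

A rate of zero would mean the models always agree, in which case combining them adds nothing over using one of them alone.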

### 3. Combining Predictions

The predictions of the base models are combined to make a final prediction. The combination can be done through various techniques, such as:

**1. Majority Voting (Classification)**:
   - In a classification task, suppose you have three base models that predict the class of an image: Model A predicts "Cat," Model B predicts "Dog," and Model C predicts "Cat." Majority voting would result in the final prediction being "Cat" because it's the most commonly predicted class among the base models.

**2. Averaging (Regression)**:
   - In a regression task, suppose you have three base models that predict the price of a house: Model X predicts \$300,000, Model Y predicts \$310,000, and Model Z predicts \$305,000. Averaging the predictions would result in a final prediction of (\$300,000 + \$310,000 + \$305,000) / 3 = \$305,000.
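
The two examples above can be sketched in a few lines of plain Python. The helper names `majority_vote` and `average_prediction` are hypothetical, chosen for this illustration; they are not part of any library.

```python
from collections import Counter
from statistics import mean


def majority_vote(predictions):
    """Return the most common class label among the base-model predictions."""
    # most_common(1) returns a list holding the single (label, count) pair
    label, _count = Counter(predictions).most_common(1)[0]
    return label


def average_prediction(predictions):
    """Return the arithmetic mean of the base-model predictions."""
    return mean(predictions)


# Classification: Models A, B and C from the example above
final_class = majority_vote(["Cat", "Dog", "Cat"])
print(final_class)  # Cat

# Regression: Models X, Y and Z from the example above
final_price = average_prediction([300_000, 310_000, 305_000])
print(final_price)  # 305000
```

With an even number of classifiers a vote can tie. This sketch does not handle ties specially, so a real ensemble would need an explicit tie-breaking rule.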





In [None]:
# Training data: used to train the machine learning model
# (features: size, number of bedrooms, number of bathrooms)

## Types of Ensemble Learning

- **Bagging (Bootstrap Aggregating)**: Combines the predictions of base models trained on different bootstrapped subsets of the data.
- **Boosting**: Sequentially builds an ensemble of models, giving more weight to examples that were misclassified by previous models.
- **Stacking**: Combines base models by training a meta-learner on their predictions.

## Bagging

In [7]:
from sklearn.ensemble import BaggingClassifier  # Bagging ensemble; uses decision trees by default
from sklearn.datasets import make_classification

# Generate a synthetic binary classification dataset
X, y = make_classification(n_samples=1000, n_features=10, n_classes=2, random_state=42)

# Create a bagging classifier with three base estimators
bagging_clf = BaggingClassifier(n_estimators=3, random_state=42)

# Train the bagging classifier
bagging_clf.fit(X, y)

# Make predictions (note: on the same data used for training, not a held-out test set)
y_pred = bagging_clf.predict(X)

# Evaluate accuracy on the training data, which gives an optimistic estimate
accuracy = bagging_clf.score(X, y)

print("Accuracy:", accuracy)


Accuracy: 0.979
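
The cell above scores the model on the same data it was trained on. As a rough sketch of what bagging does under the hood, the example below draws bootstrap samples by hand, trains one decision tree per sample, and combines the trees by majority vote on a held-out test set. The 75/25 split and the hard-voting rule are assumptions made for illustration; scikit-learn's `BaggingClassifier` may differ in detail.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=10, n_classes=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

rng = np.random.default_rng(42)
n_estimators = 3  # same ensemble size as the BaggingClassifier cell above
trees = []
for _ in range(n_estimators):
    # Bootstrap sample: draw len(X_train) row indices with replacement
    idx = rng.integers(0, len(X_train), size=len(X_train))
    tree = DecisionTreeClassifier(random_state=42)
    tree.fit(X_train[idx], y_train[idx])
    trees.append(tree)

# Aggregate: majority vote over the trees' 0/1 predictions
votes = np.stack([tree.predict(X_test) for tree in trees])  # shape (n_estimators, n_test)
y_pred = (votes.mean(axis=0) > 0.5).astype(int)

test_accuracy = (y_pred == y_test).mean()
print("Held-out accuracy:", test_accuracy)
```

An odd number of estimators is used here so that a two-class vote can never tie.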


## Boosting

In [11]:
from sklearn.ensemble import AdaBoostClassifier
from sklearn.datasets import make_classification

# Generate a synthetic binary classification dataset
X, y = make_classification(n_samples=1000, n_features=10, n_classes=2, random_state=42)

# Create an AdaBoost classifier with 15 sequentially trained estimators
adaboost_clf = AdaBoostClassifier(n_estimators=15, random_state=4)

# Train the AdaBoost classifier
adaboost_clf.fit(X, y)

# Make predictions (note: on the same data used for training, not a held-out test set)
y_pred = adaboost_clf.predict(X)

# Evaluate accuracy on the training data, which gives an optimistic estimate
accuracy = adaboost_clf.score(X, y)

print("Accuracy:", accuracy)


Accuracy: 0.886
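
To show what "giving more weight to misclassified examples" can look like, here is a minimal sketch of one reweighting round in the classic two-class AdaBoost formulation. The toy labels and weak-learner predictions are made-up values, and the variant implemented inside scikit-learn's `AdaBoostClassifier` may differ in detail.

```python
import numpy as np

# Toy labels in {-1, +1} and one weak learner's predictions (hypothetical values)
y_true = np.array([1, 1, -1, -1, 1])
y_weak = np.array([1, -1, -1, -1, 1])  # the second example is misclassified

# Start with uniform example weights
weights = np.full(len(y_true), 1 / len(y_true))

# Weighted error of the weak learner: total weight of the misclassified examples
missed = y_weak != y_true
error = weights[missed].sum()

# Weight of this learner in the final vote (larger when its error is smaller)
alpha = 0.5 * np.log((1 - error) / error)

# Raise the weights of misclassified examples, lower the rest, then renormalise
weights = weights * np.exp(-alpha * y_true * y_weak)
weights = weights / weights.sum()

print("error:", error)
print("alpha:", alpha)
print("updated weights:", weights)
```

After the update, the misclassified example carries more weight than any correctly classified one, so the next weak learner is pushed to get it right.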


## Stacking

In [2]:

from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import StackingClassifier

# Generate a synthetic binary classification dataset
X, y = make_classification(n_samples=1000, n_features=10, n_classes=2, random_state=42)

# Create a base classifier
base_clf = DecisionTreeClassifier(random_state=42)

# Create a meta-model
meta_clf = LogisticRegression(random_state=42)

# Create a stacking classifier that trains the meta-model on the base classifier's predictions
stacking_clf = StackingClassifier(estimators=[('base_clf', base_clf)], final_estimator=meta_clf)

# Train the stacking classifier
stacking_clf.fit(X, y)

# Make predictions (note: on the same data used for training, not a held-out test set)
y_pred = stacking_clf.predict(X)

# Evaluate accuracy on the training data, which gives an optimistic estimate
accuracy = stacking_clf.score(X, y)

print("Accuracy:", accuracy)


Accuracy: 1.0
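
A perfect score on the training data says little about how the model handles unseen data, and a stack with a single base model has little diversity to exploit. The sketch below stacks two different base models and scores the result on a held-out test set. The k-nearest-neighbours model, the 75/25 split, and `cv=5` are assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=10, n_classes=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# Two base models built with different algorithms (an illustrative choice)
estimators = [
    ("tree", DecisionTreeClassifier(random_state=42)),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
]

# The meta-learner is trained on cross-validated predictions of the base models
stacking_clf = StackingClassifier(
    estimators=estimators,
    final_estimator=LogisticRegression(),
    cv=5,
)
stacking_clf.fit(X_train, y_train)

# Score on data that none of the models saw during training
test_accuracy = stacking_clf.score(X_test, y_test)
print("Held-out accuracy:", test_accuracy)
```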
