### Ensemble Learning:
- It is a method in which we combine many models instead of relying on a single one.
- Each of these models may not be very strong on its own, but when we put their results together, we usually get a better and more accurate result.
### Types of Ensemble Learning in Machine Learning:
- There are three main types of ensemble methods:
#### 1. Bagging (Bootstrap Aggregating):
- Models are trained independently on different random subsets of the training data.
- Their results are then combined, usually by averaging (for regression) or voting (for classification).
- This helps reduce variance and prevents overfitting.
#### 2. Boosting:
- Models are trained one after another.
- Each new model focuses on fixing the errors made by the previous ones.
- The final prediction is a weighted combination of all models, which helps reduce bias and improve accuracy.
#### 3. Stacking (Stacked Generalization):
- Multiple different models are trained, and their predictions are used as inputs to a final model, called a meta-model.
- The meta-model learns how to best combine the predictions of the base models, aiming for better performance than any individual model.
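
Stacking can be sketched with scikit-learn's `StackingClassifier`. The example below is only a minimal illustration: the two base models (a shallow decision tree and k-nearest neighbours) and the logistic-regression meta-model are assumptions chosen for brevity, not choices made in these notes.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Two different base models (illustrative choices); their cross-validated
# predictions become the input features of the meta-model.
base_models = [
    ("tree", DecisionTreeClassifier(max_depth=3, random_state=42)),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
]

# Logistic regression acts as the meta-model that combines the base models.
stacking_classifier = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stacking_classifier.fit(X_train, y_train)
stacking_accuracy = stacking_classifier.score(X_test, y_test)
print("Stacking accuracy:", stacking_accuracy)
```

The `cv=5` setting means the meta-model is trained on predictions the base models made for rows they had not seen, which helps keep it from simply trusting an overfitted base model.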

### Bagging Algorithm and How it Works:
- It can be used for both regression and classification tasks.
- Here is an overview of the Bagging classifier algorithm:
#### 1. Bootstrap Sampling:
- Draws 'N' new training sets from the original dataset by randomly sampling rows with replacement, so a row can appear several times in one sample and not at all in another.
- This step ensures that the base models are trained on diverse versions of the data, which is what makes combining them effective.
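
A single bootstrap sample can be sketched with NumPy as follows. The dataset size of 120 rows is a hypothetical value used only for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n_rows = 120  # hypothetical training-set size
row_indices = np.arange(n_rows)

# One bootstrap sample: draw n_rows indices with replacement.
bootstrap_indices = rng.choice(row_indices, size=n_rows, replace=True)

# Rows that were never drawn are left out of this particular sample.
left_out_indices = np.setdiff1d(row_indices, bootstrap_indices)

unique_fraction = np.unique(bootstrap_indices).size / n_rows
print("Sample size:", bootstrap_indices.size)
print("Fraction of distinct rows:", round(unique_fraction, 2))
print("Rows left out:", left_out_indices.size)
```

For large datasets, a bootstrap sample contains roughly 63% of the distinct original rows on average (the limit is 1 - 1/e); the rest are duplicates of rows already drawn.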
#### 2. Base Model Training:
- For each bootstrapped sample we train a base model independently on that subset of data.
- These base models can be trained in parallel, which improves computational efficiency and reduces training time.
- We can also use different base learners, i.e. different model types, to bring variety and robustness.
#### 3. Prediction Aggregation:
- To make a prediction on test data, we combine the predictions of all base models.
- For classification tasks, this can be majority voting or weighted majority voting.
- For regression tasks, it involves averaging the predictions.
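
The two aggregation rules can be sketched in plain Python. The prediction values below are made up for illustration, and the helper names `majority_vote` and `average` are hypothetical, not part of any library.

```python
from collections import Counter
from statistics import mean

# Hypothetical predictions from three base models for four test rows.
class_predictions = [
    [0, 1, 2, 1],  # model A
    [0, 1, 1, 1],  # model B
    [0, 2, 2, 1],  # model C
]
# Hypothetical regression predictions for two test rows.
regression_predictions = [
    [2.0, 3.5],  # model A
    [2.4, 3.1],  # model B
    [2.2, 3.3],  # model C
]


def majority_vote(predictions):
    """Return the most common label per test row (ties go to the label seen first)."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*predictions)]


def average(predictions):
    """Return the mean prediction per test row."""
    return [mean(column) for column in zip(*predictions)]


voted_labels = majority_vote(class_predictions)
averaged_values = average(regression_predictions)
print("Voted labels:", voted_labels)
print("Averaged values:", averaged_values)
```

Here `zip(*predictions)` regroups the per-model lists into per-row columns, so each test row is decided by all models together.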
#### 4. Out-of-Bag (OOB) Evaluation:
- Some samples are left out of the training subset of a particular base model during bootstrap sampling.
- These "out-of-bag" samples can be used to estimate the model's performance without the need for cross-validation.
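
In scikit-learn, an out-of-bag estimate can be requested with the `oob_score=True` option of `BaggingClassifier`. The sketch below assumes the Iris dataset and 50 trees; a larger number of estimators than usual is used so that every training row is very likely to be out-of-bag for at least one tree.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# oob_score=True scores each training row using only the trees whose
# bootstrap sample did not contain that row.
oob_bagging = BaggingClassifier(
    DecisionTreeClassifier(),
    n_estimators=50,
    oob_score=True,
    random_state=42,
)
oob_bagging.fit(X, y)
print("OOB accuracy estimate:", oob_bagging.oob_score_)
```

No separate validation split is needed here, because the model is fitted on all rows and still evaluated on data each tree has not seen.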
#### 5. Final Prediction:
- After aggregating the predictions from all the base models, Bagging produces a final prediction for each instance.

### Implementing Bagging Estimator in Python:
#### 1. Import Libraries:

In [1]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import accuracy_score


#### 2. Load and split the Dataset:
- df = load_iris() - Loads the Iris dataset, which includes the target and features.
- X = df.data - Extracts the feature matrix [input variables].
- y = df.target - Extracts the target vector [class labels].
- train_test_split() - Splits the data into training (80%) and testing (20%) sets, with random_state=42 to ensure reproducibility.

In [3]:
df = load_iris()
X = df.data
y = df.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

#### 3. Creating a Base Classifier:
- Decision Tree is chosen as the base model.
- Decision trees are prone to overfitting, especially when trained on small datasets, making them good candidates for bagging.
- base_classifier = DecisionTreeClassifier() - Initializes a Decision Tree Classifier which will serve as the base estimator in the Bagging ensemble.

In [4]:
base_classifier = DecisionTreeClassifier()

#### 4. Creating and Training the Bagging Classifier:
- A BaggingClassifier is created using the decision tree as the base classifier.
- n_estimators=10 - Specifies that 10 decision trees will be trained on different bootstrapped subsets of the training data.

In [7]:
bagging_classifier = BaggingClassifier(base_classifier, n_estimators=10, random_state=42)
bagging_classifier.fit(X_train, y_train)

#### 5. Making Predictions and Evaluating Accuracy:
- The trained bagging model predicts labels for test data.
- The accuracy of the predictions is calculated by comparing the predicted labels (y_pred) to the actual labels (y_test).

In [9]:
y_pred = bagging_classifier.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

Accuracy: 1.0


### Boosting Algorithm and How it Works:
- It is an ensemble technique that combines multiple weak learners to create a strong learner.
- Weak models are trained in series so that each new model focuses on correcting the errors of the previous ones, until the training data is predicted correctly or a maximum number of models is reached.
- One of the most well-known boosting algorithms is AdaBoost (Adaptive Boosting).
- Here is an overview of the Boosting algorithm:
#### 1. Initialize Model Weights:
- Begin with a single weak learner and assign equal weights to all training examples.
#### 2. Train Weak Learner:
- Train the weak learner on this weighted dataset.
#### 3. Sequential Learning:
- Boosting works by training models sequentially, where each model focuses on correcting the errors of its predecessor.
- Boosting typically uses a single type of weak learner like decision trees.
#### 4. Weight Adjustment:
- Boosting assigns weights to training datapoints.
- Misclassified examples receive higher weights in the next iteration so that the next models pay more attention to them.
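
One round of the weight update can be sketched with NumPy. This assumes the classic binary (discrete) AdaBoost formulation with labels -1 and +1; scikit-learn's multiclass implementation differs in detail. The labels and predictions are hypothetical.

```python
import numpy as np

# Hypothetical binary labels and one weak learner's predictions.
y_true = np.array([1, 1, -1, -1, 1])
y_weak = np.array([1, -1, -1, -1, 1])  # the second example is misclassified

# Start with equal weights for all training examples.
weights = np.full(y_true.size, 1 / y_true.size)

# Weighted error of the weak learner and its vote strength (alpha).
misclassified = y_true != y_weak
error = weights[misclassified].sum()
alpha = 0.5 * np.log((1 - error) / error)

# Raise the weights of misclassified examples, lower the rest, renormalize.
new_weights = weights * np.exp(np.where(misclassified, alpha, -alpha))
new_weights /= new_weights.sum()
print("Weighted error:", error)
print("Updated weights:", np.round(new_weights, 3))
```

After this round the single misclassified example carries half of the total weight, so the next weak learner is strongly pushed to classify it correctly.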

### Implementing Boosting Estimator in Python:
#### 1. Importing Libraries:
- AdaBoostClassifier from sklearn.ensemble - For building the AdaBoost ensemble model.
- DecisionTreeClassifier from sklearn.tree - As base weak learner for AdaBoost.
- accuracy_score from sklearn.metrics - To evaluate the model's accuracy.

In [10]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score

#### 2. Load and split Dataset:

In [11]:
df = load_iris()
X = df.data
y = df.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

#### 3. Defining the Weak Learner:
- We create the base classifier as a decision tree with maximum depth 1 (a decision stump).

In [12]:
base_classifier = DecisionTreeClassifier(max_depth=1)

#### 4. Creating and Training the AdaBoost Classifier:
- base_classifier - The weak learner used in boosting.
- n_estimators=50 - Number of weak learners to train sequentially.
- learning_rate=1.0 - Controls the contribution of each weak learner to the final model.
- random_state=42 - Ensures reproducibility.

In [14]:
adaboost_classifier = AdaBoostClassifier(
    base_classifier, n_estimators=50, learning_rate=1.0, random_state=42)
adaboost_classifier.fit(X_train, y_train)

#### 5. Making Predictions and Calculating Accuracy:


In [15]:
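# Predict with the AdaBoost model; without this line y_pred would still
# hold the predictions of the earlier bagging model.
y_pred = adaboost_classifier.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)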

Accuracy: 1.0
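
To see how sequential learning affects accuracy, `AdaBoostClassifier.staged_score` reports the score after each additional weak learner. The sketch below reuses the same dataset split and settings as above; the variable names `staged_model` and `staged_accuracies` are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

staged_model = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1), n_estimators=50, random_state=42
)
staged_model.fit(X_train, y_train)

# staged_score yields the test accuracy after 1, 2, 3, ... weak learners.
staged_accuracies = list(staged_model.staged_score(X_test, y_test))
print("After 1 learner:", staged_accuracies[0])
print("After", len(staged_accuracies), "learners:", staged_accuracies[-1])
```

The exact numbers depend on the scikit-learn version, and boosting may stop before reaching `n_estimators`, which is why the code uses the length of the list rather than assuming 50 stages.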
