# Ensemble models

Ensemble models are a popular technique because they can make your models more robust to outliers and more likely to generalize to future data. They also gained popularity after several ensembles helped people win prediction competitions. More recently, stochastic gradient boosting has become a go-to candidate model for many data scientists. This module walks you through the theory behind ensemble models and popular tree-based ensembles.

## Learning Objectives

Identify, use, and interpret common ensemble models for classification, including bagging, boosting, stacking, and random forest.

Build ensemble models with sklearn, including bagging, boosting, stacking, and random forest.

Identify common supervised machine learning algorithms.

## Learning Goals

In this section, we will cover:

- Combining models (ensemble-based methods)
- Bootstrap aggregation (Bagging)
- Bagging, Random Forest, and Extra Trees Classifiers

## Ensemble Based Methods and Bagging

![](./images/52_CombineModelPredictions.png)

![](./images/53_AggregateResults.png)

### Bagging = Bootstrap Aggregating

![](./images/55_HowManyTRees.png)

![](./images/56_BaggingErrorCalculations.png)

### Bagging Classifier: The Syntax

```python
# Import the class containing the classification method
from sklearn.ensemble import BaggingClassifier

# Create an instance of the class
BC = BaggingClassifier(n_estimators=50)

# Fit the instance on the data and then predict the expected value
BC = BC.fit(X_train, y_train)
y_predict = BC.predict(X_test)
```
Tune parameters with cross-validation. Use BaggingRegressor for regression.
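
One way the cross-validated tuning mentioned above might look is sketched here. The synthetic data set, the candidate values for `n_estimators`, and the choice of 3 folds are all assumptions made for illustration, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data stands in for a real data set (assumption for illustration)
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Candidate numbers of bootstrapped trees; the values are arbitrary
param_grid = {"n_estimators": [10, 25, 50]}

# 3-fold cross-validation over the grid, using the training data only
search = GridSearchCV(BaggingClassifier(random_state=0), param_grid, cv=3)
search.fit(X_train, y_train)

best_n = search.best_params_["n_estimators"]
test_accuracy = search.score(X_test, y_test)  # accuracy of the refit best model
print(best_n, round(test_accuracy, 3))
```

The test split is held back from the search so that `test_accuracy` is not influenced by the tuning itself.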

## Random Forest

![](./images/57_ReductionInVarianceDueToBagging.png)

![](./images/58_RandomForest.png)

![](./images/59_HowManyTreesInForest.png)

### RandomForest: The Syntax
```python
# Import the class containing the classification method
from sklearn.ensemble import RandomForestClassifier

# Create an instance of the class
RC = RandomForestClassifier(n_estimators=50)

# Fit the instance on the data and then predict the expected value
RC = RC.fit(X_train, y_train)
y_predict = RC.predict(X_test)
```

Tune parameters with cross-validation. Use RandomForestRegressor for regression.
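
For the regression case, a minimal sketch follows. It assumes a synthetic data set from `make_regression`, and the name `RR` is hypothetical, chosen to mirror `RC` above.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Synthetic regression data (assumption for illustration)
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)

# Same constructor pattern as the classifier
RR = RandomForestRegressor(n_estimators=50, random_state=0)

# For regressors, cross_val_score reports R^2 by default, one value per fold
scores = cross_val_score(RR, X, y, cv=3)
mean_r2 = scores.mean()
print(round(mean_r2, 3))
```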


### Introducing Even More Randomness

Sometimes additional randomness is desired beyond Random Forest.

Solution: select features randomly and create splits randomly, rather than choosing the best split greedily.

These are called "Extra Random Trees".

### Extra Trees Classifier: The Syntax
```python
# Import the class containing the classification method
from sklearn.ensemble import ExtraTreesClassifier

# Create an instance of the class
EC = ExtraTreesClassifier(n_estimators=50)

# Fit the instance on the data and then predict the expected value
EC = EC.fit(X_train, y_train)
y_predict = EC.predict(X_test)
```

Tune parameters with cross-validation. Use ExtraTreesRegressor for regression.
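
Whether the extra randomness helps depends on the data, so it can be worth cross-validating both models side by side. The sketch below assumes a synthetic data set, and the dictionary keys are hypothetical labels.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic data (assumption for illustration)
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

models = {
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "extra_trees": ExtraTreesClassifier(n_estimators=50, random_state=0),
}

# Mean 3-fold accuracy for each model on the same data
mean_accuracy = {
    name: cross_val_score(model, X, y, cv=3).mean()
    for name, model in models.items()
}
print(mean_accuracy)
```

No ranking is implied: on a different data set either model may come out ahead.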

## Bagging labs

## Boosting and stacking

### Learning Goals

In this section, we will cover:
- The Boosting approach to combining models
- Types of Boosting models: Gradient Boosting, AdaBoost
- Boosting loss functions
- Combining heterogeneous classifiers

![](./images/60_BaggingReview.png)

![](./images/61_Boosting.png)

![](./images/62_Boosting.png)


### AdaBoost and Gradient Boosting Overview

![](./images/63_BoostingSpecifics.png)

![](./images/64_01LossFunction.png)

![](./images/65_AdaBoostLossFunction.png)

![](./images/66_GradientBoostingLossFunction.png)
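
For reference, the three losses are commonly written in terms of the margin $m = y\,f(x)$. This is the standard textbook form and may differ in notation from the figures; the labels $y \in \{-1, +1\}$ and the real-valued model score $f(x)$ are assumptions of this sketch.

$$
L_{0\text{-}1}(m) = \mathbb{1}[m < 0]
$$

$$
L_{\text{AdaBoost}}(m) = e^{-m}
$$

$$
L_{\text{log}}(m) = \log\left(1 + e^{-m}\right)
$$

The 0-1 loss counts misclassifications but is not differentiable. The exponential loss used by AdaBoost grows quickly for badly misclassified points (large negative $m$), which makes it sensitive to outliers. The log-likelihood loss, a common choice for gradient boosting, grows roughly linearly in the same region and is therefore more forgiving.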

### Bagging vs Boosting

|       Bagging                    |  Boosting                          |
|----------------------------------|------------------------------------|
| Bootstrapped samples             | Fit entire data set                |
| Base trees created independently | Base trees created successively    |
| Only data points considered      | Use residuals from previous models |
| No weighting used                | Up-weight misclassified points     |
| Excess trees will not overfit    | Beware of overfitting              |
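
The "created successively" and "use residuals" rows can be illustrated with a hand-rolled boosting loop for regression with squared error. This is a teaching sketch under assumed settings (synthetic sine data, depth-2 trees, a learning rate of 0.1), not how scikit-learn implements gradient boosting internally.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic 1-D regression problem (assumption for illustration)
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

learning_rate = 0.1
n_trees = 50

prediction = np.full_like(y, y.mean())  # start from a constant model
trees = []
train_mse = []
for _ in range(n_trees):
    residuals = y - prediction  # what the current ensemble still gets wrong
    tree = DecisionTreeRegressor(max_depth=2, random_state=0)
    tree.fit(X, residuals)      # fit the entire data set, no bootstrapping
    prediction += learning_rate * tree.predict(X)  # small corrective step
    trees.append(tree)
    train_mse.append(np.mean((y - prediction) ** 2))

print(round(train_mse[0], 4), round(train_mse[-1], 4))
```

Each tree depends on the predictions of all trees before it, so the loop cannot be run in parallel the way bagging can. Training error keeps falling as trees are added, which is why the table warns about overfitting.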

![](./images/67_TurningGradientBoostedModel.png)

![](./images/67_TurningGradientBoostedModel2.png)

### GradientBoosting Classifier: The Syntax

```python

# Import the class containing the classification method
from sklearn.ensemble import GradientBoostingClassifier

# Create an instance of the class
GBC = GradientBoostingClassifier(learning_rate=0.1, max_features=1,
                                 subsample=0.5, n_estimators=200)

# Fit the instance on the data and then predict the expected value
GBC = GBC.fit(X_train, y_train)
y_predict = GBC.predict(X_test)
```

Tune with cross-validation. Use GradientBoostingRegressor for regression.
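
Because boosting can overfit as trees are added, the number of trees is one of the parameters worth tuning. The sketch below uses `staged_predict` to score a fitted model after each successive tree. It assumes synthetic data and, for brevity, a single validation split; full cross-validation would repeat this over several folds.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic data and a single validation split (assumptions for illustration)
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

GBC = GradientBoostingClassifier(learning_rate=0.1, subsample=0.5,
                                 n_estimators=100, random_state=0)
GBC.fit(X_train, y_train)

# staged_predict yields the predictions after 1, 2, ..., n_estimators trees
val_accuracy = [accuracy_score(y_val, y_stage)
                for y_stage in GBC.staged_predict(X_val)]

# First stage that reaches the best validation accuracy (1-based tree count)
best_n_trees = val_accuracy.index(max(val_accuracy)) + 1
print(best_n_trees)
```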


### AdaBoostClassifier: The Syntax

```python
# Import the class containing the classification method
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# Create an instance of the class
# The base learner is passed positionally: the keyword is `estimator` in
# scikit-learn >= 1.2 and `base_estimator` in older releases.
# max_depth can also be set on the DecisionTreeClassifier here.
ABC = AdaBoostClassifier(DecisionTreeClassifier(),
                         learning_rate=0.1, n_estimators=200)

# Fit the instance on the data and then predict the expected value
ABC = ABC.fit(X_train, y_train)
y_predict = ABC.predict(X_test)
```
Tune with cross-validation. Use AdaBoostRegressor for regression.

## Stacking

![](./images/69_Stacking_CombiningClassifiers.png)

- Output of base learners can be combined via majority vote or weighted.
- Additional hold-out data needed if meta learner parameters are used.
- Be aware of increasing model complexity.
- The final prediction can be done by voting or with another model.
### Voting Classifier: The Syntax

```python
# Import the class containing the classification method
from sklearn.ensemble import StackingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression

# Create an instance of the class
# estimator_list is a list of (name, estimator) tuples; fit() trains a clone of each
VC = VotingClassifier(estimator_list)

# Fit the instance on the data and then predict the expected value
VC = VC.fit(X_train, y_train)
y_predict = VC.predict(X_test)

# Use VotingRegressor for regression.
# The StackingClassifier (or StackingRegressor) works similarly:
SC = StackingClassifier(estimator_list, final_estimator=LogisticRegression())
```
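
A self-contained version of the same pattern is sketched below. The three base learners, their names, and the synthetic data are assumptions picked only to show heterogeneous models being combined.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (RandomForestClassifier, StackingClassifier,
                              VotingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data (assumption for illustration)
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Heterogeneous base learners as (name, estimator) tuples; names are arbitrary
estimator_list = [
    ("logreg", LogisticRegression(max_iter=1000)),
    ("tree", DecisionTreeClassifier(max_depth=3, random_state=0)),
    ("forest", RandomForestClassifier(n_estimators=25, random_state=0)),
]

# Hard voting: each base learner casts one vote for a class label
VC = VotingClassifier(estimator_list, voting="hard").fit(X_train, y_train)

# Stacking: a logistic regression meta learner is trained on
# cross-validated predictions of the base learners
SC = StackingClassifier(estimator_list, final_estimator=LogisticRegression(),
                        cv=3).fit(X_train, y_train)

vote_accuracy = VC.score(X_test, y_test)
stack_accuracy = SC.score(X_test, y_test)
print(round(vote_accuracy, 3), round(stack_accuracy, 3))
```

`StackingClassifier` handles the hold-out requirement internally by fitting the meta learner on out-of-fold predictions, here with `cv=3`.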

### Learning Recap

In this section, we discussed:
- The Boosting approach to combining models
- Types of Boosting models: Gradient Boosting, AdaBoost
- Boosting loss functions
- Combining heterogeneous classifiers

Further reading:
XGBoost is another popular boosting algorithm (not in Scikit-Learn).
