# Ensemble models

Ensemble models are popular because they can make your models more robust to outliers and more likely to generalize to future data. They also gained popularity after several ensembles helped people win prediction competitions. More recently, stochastic gradient boosting has become a go-to candidate model for many data scientists. This module walks you through the theory behind ensemble models and popular tree-based ensembles.

## Learning Objectives

Identify, use, and interpret common ensemble models for classification, including bagging, boosting, stacking, and random forest.

Build ensemble models with sklearn, including bagging, boosting, stacking, and random forest.

Identify common supervised machine learning algorithms.

## Learning Goals

In this section, we will cover:

- Combining models (ensemble-based methods)
- Bootstrap aggregation (Bagging)
- Bagging, Random Forest, and Extra Trees Classifiers

## Ensemble Based Methods and Bagging

![](./images/52_CombineModelPredictions.png)

![](./images/53_AggregateResults.png)

### Bagging = Bootstrap Aggregating

![](./images/55_HowManyTRees.png)

![](./images/56_BaggingErrorCalculations.png)
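
To make the two steps of bagging concrete, here is a minimal hand-rolled sketch: draw bootstrap samples (rows sampled with replacement), fit one decision tree per sample, and aggregate the predictions by majority vote. The synthetic dataset, the number of trees, and all variable names are assumptions chosen for illustration; in practice `BaggingClassifier` handles this for you.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary dataset (labels 0/1), used only for illustration
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
rng = np.random.default_rng(0)

n_trees = 25  # odd, so a majority vote on binary labels cannot tie
trees = []
for _ in range(n_trees):
    # Bootstrap: sample row indices with replacement, same size as the data
    idx = rng.integers(0, len(X), size=len(X))
    trees.append(DecisionTreeClassifier(random_state=0).fit(X[idx], y[idx]))

# Aggregate: each tree votes, and the majority class wins
votes = np.stack([tree.predict(X) for tree in trees])  # shape (n_trees, n_samples)
y_vote = (votes.mean(axis=0) > 0.5).astype(int)
```

Each tree sees a slightly different version of the training data, which is what lets averaging their votes reduce variance.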

### Bagging Classifier: The Syntax

```python
# Import the class containing the classification method
from sklearn.ensemble import BaggingClassifier

# Create an instance of the class
BC = BaggingClassifier(n_estimators=50)

# Fit the instance on the data and then predict the expected value
# (X_train, y_train, and X_test are assumed to be defined already)
BC = BC.fit(X_train, y_train)
y_predict = BC.predict(X_test)
```
Tune parameters with cross-validation. Use BaggingRegressor for regression.
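
As a rough sketch of what tuning with cross-validation might look like, the example below runs a small grid search over two `BaggingClassifier` parameters. The synthetic data and the candidate values in `param_grid` are assumptions picked only to keep the example short, not recommended settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data stands in for a real dataset (illustrative assumption)
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Candidate values are arbitrary and chosen only for the example
param_grid = {"n_estimators": [10, 50], "max_samples": [0.5, 1.0]}

# 5-fold cross-validation over every combination in the grid
search = GridSearchCV(BaggingClassifier(random_state=0), param_grid, cv=5)
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)  # accuracy of the refit best model
print(best_params, test_accuracy)
```

The held-out test set is touched only once, after the grid search has picked its parameters on the training folds.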

## Random Forest

![](./images/57_ReductionInVarianceDueToBagging.png)

![](./images/58_RandomForest.png)

![](./images/59_HowManyTreesInForest.png)

### RandomForest: The Syntax
```python
# Import the class containing the classification method
from sklearn.ensemble import RandomForestClassifier

# Create an instance of the class
RC = RandomForestClassifier(n_estimators=50)

# Fit the instance on the data and then predict the expected value
# (X_train, y_train, and X_test are assumed to be defined already)
RC = RC.fit(X_train, y_train)
y_predict = RC.predict(X_test)
```

Tune parameters with cross-validation. Use RandomForestRegressor for regression.
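
Because each tree in a random forest is trained on a bootstrap sample, the rows left out of that sample can serve as a built-in validation set. The sketch below, on synthetic data assumed for illustration, asks scikit-learn for this out-of-bag (OOB) estimate with `oob_score=True` and also reads the fitted feature importances.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data stands in for a real dataset (illustrative assumption)
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# oob_score=True scores each row using only the trees that did not see it
rf = RandomForestClassifier(n_estimators=50, oob_score=True, random_state=0)
rf.fit(X, y)

oob_accuracy = rf.oob_score_          # accuracy estimated from out-of-bag rows
importances = rf.feature_importances_  # one value per feature, summing to 1
print(oob_accuracy, importances.round(3))
```

The OOB score is a convenient quick check, though cross-validation remains the more general tool when comparing different model types.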


### Introducing Even More Randomness

Sometimes additional randomness is desired beyond Random Forest.

Solution: select features randomly and create splits randomly, rather than choosing the best split greedily.

This approach is called "Extra Random Trees" (extremely randomized trees).

### Extra Trees Classifier: The Syntax
```python
# Import the class containing the classification method
from sklearn.ensemble import ExtraTreesClassifier

# Create an instance of the class
EC = ExtraTreesClassifier(n_estimators=50)

# Fit the instance on the data and then predict the expected value
# (X_train, y_train, and X_test are assumed to be defined already)
EC = EC.fit(X_train, y_train)
y_predict = EC.predict(X_test)
```

Tune parameters with cross-validation. Use ExtraTreesRegressor for regression.
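
To see whether the extra randomness helps on a given problem, one option is to compare the two ensembles under the same cross-validation folds. The sketch below does this on synthetic data, which is an assumption for illustration; which model scores higher depends on the dataset, so no result is implied here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic data stands in for a real dataset (illustrative assumption)
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

rf = RandomForestClassifier(n_estimators=50, random_state=0)
et = ExtraTreesClassifier(n_estimators=50, random_state=0)

# Same 5-fold split for both models, so the scores are comparable
rf_scores = cross_val_score(rf, X, y, cv=5)
et_scores = cross_val_score(et, X, y, cv=5)

print("Random Forest mean accuracy:", rf_scores.mean())
print("Extra Trees mean accuracy:  ", et_scores.mean())
```

Note that in scikit-learn `ExtraTreesClassifier` defaults to `bootstrap=False`, so by default each tree is fit on the full training set and the randomness comes from the split selection.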

## Bagging labs

## Boosting and stacking
