<!--BOOK_INFORMATION-->
<a href="https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-opencv" target="_blank"><img align="left" src="data/cover.jpg" style="width: 76px; height: 100px; background: white; padding: 1px; border: 1px solid black; margin-right:10px;"></a>
*This notebook contains an excerpt from the upcoming book [Machine Learning for OpenCV](https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-opencv) by Michael Beyeler (expected Aug 2017).
The code is released under the [MIT license](https://opensource.org/licenses/MIT),
and is available on [GitHub](https://github.com/mbeyeler/opencv-machine-learning).*

*Note that this excerpt contains only the raw code - the book is rich with additional explanations and illustrations.
If you find this content useful, please consider supporting the work by
[buying the book](https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-opencv)!*


# Combining Different Algorithms Into an Ensemble

## Understanding average ensembles

### Implementing a bagging classifier

In [1]:
from sklearn.ensemble import BaggingClassifier
from sklearn.neighbors import KNeighborsClassifier
bagging = BaggingClassifier(KNeighborsClassifier(), n_estimators=10)

In [2]:
bag_knn = BaggingClassifier(KNeighborsClassifier(n_neighbors=5),
                            n_estimators=10, max_samples=0.5,
                            bootstrap=True, random_state=3)

In [3]:
from sklearn.datasets import load_breast_cancer
dataset = load_breast_cancer()
X = dataset.data
y = dataset.target

In [4]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=3
)

In [5]:
bag_knn.fit(X_train, y_train)
bag_knn.score(X_test, y_test)

0.93706293706293708

In [6]:
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
knn.score(X_test, y_test)

0.91608391608391604

To turn the code above into the random patches method, add `max_features=xxx` to the `BaggingClassifier` call in `In [2]`, where `xxx` is the number (an integer) or the fraction (a float) of features that each base estimator should train on. Each base estimator is then fit on a random subset of both the samples and the features.
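As a minimal sketch of that change, the cell below repeats the breast cancer setup and passes `max_features` alongside `max_samples`. The value `0.5` and the name `patch_knn` are illustrative assumptions, not taken from the book; any valid number or fraction of features would do.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Same data and split as in the cells above
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=3)

# Random patches: every base estimator is trained on a random subset of
# the samples (max_samples) and a random subset of the features
# (max_features). The value 0.5 for max_features is an assumption chosen
# for illustration only.
patch_knn = BaggingClassifier(KNeighborsClassifier(n_neighbors=5),
                              n_estimators=10, max_samples=0.5,
                              max_features=0.5, bootstrap=True,
                              random_state=3)
patch_knn.fit(X_train, y_train)
print(patch_knn.score(X_test, y_test))

# estimators_features_ holds the feature indices drawn for each estimator
print([len(features) for features in patch_knn.estimators_features_])
```

After fitting, `estimators_features_` shows which columns each of the ten base estimators was given, which is a quick way to confirm that feature subsampling is actually in effect.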

### Implementing a bagging regressor

In [7]:
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor
bag_tree = BaggingRegressor(DecisionTreeRegressor(),
                            max_features=0.5, n_estimators=10,
                            random_state=3)

In [8]:
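# Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this cell requires an older scikit-learn release
from sklearn.datasets import load_boston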
dataset = load_boston()
X = dataset.data
y = dataset.target

In [9]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=3
)

In [10]:
bag_tree.fit(X_train, y_train)
bag_tree.score(X_test, y_test)

0.82704756225081688

In [11]:
tree = DecisionTreeRegressor()
tree.fit(X_train, y_train)
tree.score(X_test, y_test)

0.77309893269359309
