In this lesson, you will discover stacked generalization, also known as the stacking ensemble.

Stacking involves combining the predictions of multiple different types of base-models, much like voting.


The important difference from voting is that another machine learning model, called the meta-model, is used to learn how best to combine the predictions of the base-models. The meta-model is often a linear model, such as linear regression for regression problems or logistic regression for classification, but it can be any machine learning model you like.


The meta-model is trained on the predictions made by base-models on out-of-sample data.

This involves running k-fold cross-validation for each base-model and storing all of the out-of-fold predictions. The base-models are then refit on the entire training dataset, while the meta-model is trained on the out-of-fold predictions. In this way the meta-model learns which base-models to trust, how much to trust them, and under which circumstances.
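The procedure described above can be sketched by hand. This is a minimal illustration rather than how any particular library implements stacking: it assumes scikit-learn's cross_val_predict for the out-of-fold predictions, uses 3 folds, takes the predicted probability of class 1 as the only meta-feature per base-model, and introduces a hypothetical helper named stacked_predict.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# synthetic binary classification dataset (100 rows, 20 features by default)
X, y = make_classification(random_state=1)

# base-models chosen for illustration only
base_models = [KNeighborsClassifier(), DecisionTreeClassifier(random_state=1)]

# out-of-fold class-1 probabilities, one column per base-model
oof = np.column_stack([
    cross_val_predict(m, X, y, cv=3, method='predict_proba')[:, 1]
    for m in base_models
])

# refit every base-model on the entire training dataset
for m in base_models:
    m.fit(X, y)

# the meta-model only ever sees out-of-fold predictions during training
meta_model = LogisticRegression()
meta_model.fit(oof, y)

def stacked_predict(X_new):
    """Hypothetical helper: base-model predictions become meta-model inputs."""
    meta_X = np.column_stack([m.predict_proba(X_new)[:, 1] for m in base_models])
    return meta_model.predict(meta_X)

yhat = stacked_predict(X[:5])
print(yhat)
```

Because each row's meta-feature comes from a model that never saw that row during fitting, the meta-model is trained on the kind of predictions the base-models will produce on new data.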


Although stacking uses k-fold cross-validation internally to train the meta-model, you can evaluate a stacking model any way you like, such as with a train-test split or k-fold cross-validation. The evaluation of the model is separate from this internal resampling-for-training process.


Stacking ensembles are available in scikit-learn via the StackingClassifier and StackingRegressor classes. The base-models are provided as a list via the estimators argument, where each item in the list must be a tuple of a name and a model, e.g. ('lr', LogisticRegression()). The meta-model is specified via the final_estimator argument. The resampling strategy is specified via the cv argument, which can simply be set to an integer indicating the number of cross-validation folds.
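For regression, the same arguments apply to StackingRegressor. The following is a small sketch under assumed settings: the synthetic dataset, the choice of base-models, and the 3 folds are all illustrative rather than recommended values.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# synthetic regression dataset (illustrative settings)
X, y = make_regression(n_samples=100, n_features=20, noise=0.1, random_state=1)

# each base-model is a (name, estimator) tuple
models = [('knn', KNeighborsRegressor()), ('tree', DecisionTreeRegressor(random_state=1))]

# linear regression meta-model, trained on 3-fold out-of-fold predictions
model = StackingRegressor(estimators=models, final_estimator=LinearRegression(), cv=3)
model.fit(X, y)

# predict for the first three rows
yhat = model.predict(X[:3])
print(yhat)
```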

The complete example of evaluating a stacking ensemble for classification is listed below.

In [5]:
from numpy import mean
from numpy import std

from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold

from sklearn.ensemble import StackingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

from sklearn.linear_model import LogisticRegression


In [15]:
# create the synthetic classification dataset
X, y = make_classification(random_state=1)

In [16]:
# define the base-models as (name, model) tuples
models = [('knn', KNeighborsClassifier()), ('tree', DecisionTreeClassifier())]

In [17]:
# configure the stacking ensemble: logistic regression meta-model, 3 internal folds
model = StackingClassifier(estimators=models, final_estimator=LogisticRegression(), cv=3)

In [18]:
# configure the resampling method
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
# evaluate the ensemble on the dataset using the resampling method
n_scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1)
# report ensemble performance
print('Mean Accuracy: %.3f (%.3f)' % (mean(n_scores), std(n_scores)))


Mean Accuracy: 0.923 (0.088)
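As noted earlier, the evaluation scheme is independent of the internal cross-validation, so the same ensemble can also be evaluated with a single train-test split. The sketch below assumes a 70/30 stratified split and a fixed random_state for the decision tree. These are illustrative choices, and the score will differ from the repeated k-fold estimate above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# same synthetic dataset as the example above
X, y = make_classification(random_state=1)

# hold out 30 percent of the rows for evaluation (assumed split size)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

# same ensemble configuration; cv=3 is applied inside the training set only
models = [('knn', KNeighborsClassifier()), ('tree', DecisionTreeClassifier(random_state=1))]
model = StackingClassifier(estimators=models, final_estimator=LogisticRegression(), cv=3)
model.fit(X_train, y_train)

# score on rows the ensemble never saw during fitting
yhat = model.predict(X_test)
acc = accuracy_score(y_test, yhat)
print('Accuracy: %.3f' % acc)
```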
