# StackingCVRegressor

An ensemble-learning meta-regressor for stacking that uses out-of-fold predictions to prepare the inputs for the level-2 regressor, which helps prevent overfitting.


# Algorithm

Stacking is an ensemble learning technique that combines multiple regression models via a meta-regressor. The StackingCVRegressor extends the standard stacking algorithm (implemented as StackingRegressor) by using out-of-fold predictions to prepare the input data for the level-2 regressor.

In the standard stacking procedure, the first-level regressors are fit to the same training set that is used to prepare the inputs for the second-level regressor, which may lead to overfitting. The StackingCVRegressor, however, uses the concept of out-of-fold predictions: the dataset is split into k folds, and in k successive rounds, k-1 folds are used to fit the first-level regressors. In each round, the first-level regressors are then applied to the one remaining fold that was not used for model fitting in that round. The resulting predictions are then stacked and provided, as input data, to the second-level regressor. After the training of the StackingCVRegressor, the first-level regressors are fit to the entire dataset for optimal training.
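The out-of-fold procedure described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the mlxtend implementation: the helper names (`fit_ridge`, `predict_linear`, `out_of_fold_predictions`, `stack_predict`), the synthetic data, and the choice of closed-form ridge regressions as first- and second-level models are all hypothetical stand-ins.

```python
import numpy as np


def fit_ridge(X, y, alpha):
    """Closed-form ridge regression with an unpenalized intercept."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend intercept column
    penalty = alpha * np.eye(Xb.shape[1])
    penalty[0, 0] = 0.0                            # do not shrink the intercept
    return np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)


def predict_linear(weights, X):
    """Apply a weight vector returned by fit_ridge."""
    return weights[0] + X @ weights[1:]


def out_of_fold_predictions(X, y, alphas, k=5, seed=0):
    """Return an (n_samples, n_models) matrix of out-of-fold predictions."""
    n = X.shape[0]
    folds = np.array_split(np.random.default_rng(seed).permutation(n), k)
    oof = np.empty((n, len(alphas)))
    for held_out in folds:
        train = np.setdiff1d(np.arange(n), held_out)
        for j, alpha in enumerate(alphas):
            w = fit_ridge(X[train], y[train], alpha)           # fit on k-1 folds
            oof[held_out, j] = predict_linear(w, X[held_out])  # predict the held-out fold
    return oof


# Synthetic data, for illustration only
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

alphas = [0.1, 100.0]  # two first-level regressors with different regularization
meta_features = out_of_fold_predictions(X, y, alphas, k=5)

# The second-level regressor is fit on the stacked out-of-fold predictions
meta_weights = fit_ridge(meta_features, y, alpha=1e-3)

# Finally, the first-level regressors are refit on the entire dataset
base_weights = [fit_ridge(X, y, a) for a in alphas]


def stack_predict(X_new):
    """Predict with the refit first-level models, then the second-level model."""
    level1 = np.column_stack([predict_linear(w, X_new) for w in base_weights])
    return predict_linear(meta_weights, level1)


print(meta_features.shape)
```

Every row of `meta_features` comes from models that never saw that row during fitting, which is the property that distinguishes this scheme from standard stacking.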

# Example 1: Boston housing data predictions

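In this example, we evaluate some basic prediction models on the Boston housing dataset and see how the score is affected by combining the models with StackingCVRegressor. The reported scores are R² values, the default metric that cross_val_score uses for regressors. The script output below shows that the stacked model performs best, slightly better than the best single model. Note that load_boston was removed in scikit-learn 1.2, so running these examples as written requires an older release.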

from mlxtend.regressor import StackingCVRegressor
from sklearn.datasets import load_boston
from sklearn.svm import SVR
from sklearn.linear_model import Lasso
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

X, y = load_boston(return_X_y=True)

svr = SVR(kernel='linear')
lasso = Lasso()
rf = RandomForestRegressor(n_estimators=5)

stack = StackingCVRegressor(regressors=(svr, lasso, rf), meta_regressor=lasso)

print('5-fold cross validation scores:\n')

for reg, label in zip([svr, lasso, rf, stack], ['SVM', 'Lasso', 'Random Forest', 'StackingCVRegressor']):
    scores = cross_val_score(reg, X, y, cv=5)
    print("Score: %0.2f (+/- %0.2f) [%s]" % (scores.mean(), scores.std(), label))

5-fold cross validation scores:

Score: 0.45 (+/- 0.29) [SVM]
Score: 0.43 (+/- 0.14) [Lasso]
Score: 0.57 (+/- 0.21) [Random Forest]
Score: 0.58 (+/- 0.25) [StackingCVRegressor]


# Example 2: GridSearchCV with stacking

In this second example, we demonstrate how StackingCVRegressor works in combination with GridSearchCV. The stack still allows tuning the hyperparameters of the base and meta models.

from mlxtend.regressor import StackingCVRegressor
from sklearn.datasets import load_boston
from sklearn.linear_model import Lasso, Ridge
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

X, y = load_boston(return_X_y=True)

ridge = Ridge()
lasso = Lasso()
rf = RandomForestRegressor()

stack = StackingCVRegressor(regressors=(lasso, ridge), meta_regressor=rf, use_features_in_secondary=True)

params = {'lasso__alpha': [x/5.0 for x in range(1, 10)],
          'ridge__alpha': [x/20.0 for x in range(1, 10)]}

grid = GridSearchCV(
    estimator=stack,
    param_grid=params,
    cv=5,
    refit=True
)

grid.fit(X, y)

print("Best: %f using %s" % (grid.best_score_, grid.best_params_))

Best: 0.674269 using {'ridge__alpha': 0.35, 'lasso__alpha': 1.2}
