# Blending Ensemble Machine Learning With Python
by Jason Brownlee on November 30, 2020, in [Ensemble Learning](https://machinelearningmastery.com/category/ensemble-learning/). Available [here](https://machinelearningmastery.com/blending-ensemble-machine-learning-with-python/).

__Blending is an ensemble machine learning algorithm.__

It is a colloquial name for `stacked generalization` or `stacking ensemble` where instead of fitting the meta-model on __out-of-fold predictions__ made by the base model, it is fit on predictions made on a __holdout dataset__.

After completing this tutorial, you will know:

- Blending ensembles are a type of stacking where the meta-model is fit using predictions on a holdout validation dataset instead of out-of-fold predictions.
- How to develop a blending ensemble, including functions for training the model and making predictions on new data.
- How to evaluate blending ensembles for classification and regression predictive modeling problems.

## Tutorial Overview
This tutorial is divided into four parts; they are:

1. Blending Ensemble
2. Develop a Blending Ensemble
3. Blending Ensemble for Classification
    - 3.1 Blending Ensemble for Classification - soft voting
    - 3.2 Blending Ensemble for Classification - Prediction
4. Blending Ensemble for Regression
    - 4.1 Blending Ensemble for Regression - Prediction

## 1. Blending Ensemble
Blending is an ensemble machine learning technique that uses a machine learning model to learn how to best combine the predictions from multiple contributing ensemble member models.

As such, blending is, broadly conceived, the same as `stacked generalization`, known as stacking. Often, blending and stacking are used interchangeably in the same paper or model description.

The __architecture of a stacking model__ involves two or more base models, often `referred to as level-0 models`, and a meta-model that combines the predictions of the base models, `referred to as a level-1 model`. The meta-model is trained on the predictions made by base models on out-of-sample data.

- __Level-0 Models__ (Base-Models): Models fit on the training data and whose predictions are compiled.
- __Level-1 Model__ (Meta-Model): Model that learns how to best combine the predictions of the base models.

Blending may suggest developing a stacking ensemble where the base-models are machine learning models of any type, and the meta-model is a linear model that “blends” the predictions of the base-models.

For example, a linear regression model when predicting a numerical value or a logistic regression model when predicting a class label would calculate a weighted sum of the predictions made by base models and would be considered a blending of predictions.
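The weighted-sum view can be checked directly. The following is a minimal sketch, assuming a small hand-made array of base-model predictions (the numbers are hypothetical and chosen only for illustration); it shows that a fitted linear regression blender predicts an intercept plus a weighted sum of its inputs.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# hypothetical predictions from two base models on six samples (one column per model)
meta_X = np.array([[1.0, 1.2], [2.0, 1.9], [3.0, 3.3],
                   [4.0, 3.8], [5.0, 5.1], [6.0, 6.2]])
# hypothetical true target values for the same six samples
y = np.array([1.1, 2.0, 3.1, 3.9, 5.0, 6.1])

# fit the linear meta-model on the base-model predictions
blender = LinearRegression().fit(meta_X, y)

# reproduce the blender's output by hand: intercept + weighted sum of the columns
manual = blender.intercept_ + meta_X @ blender.coef_
print(np.allclose(manual, blender.predict(meta_X)))
print('weights:', blender.coef_, 'intercept:', blender.intercept_)
```

The learned weights are one coefficient per base model, which is the sense in which a linear meta-model "blends" the predictions.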

- __Blending__: Stacking-type ensemble where the meta-model is trained on predictions made on a holdout dataset.
- __Stacking__: Stacking-type ensemble where the meta-model is trained on out-of-fold predictions made during k-fold cross-validation.
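The difference between the two definitions comes down to which rows the base models predict. The following is a minimal sketch, assuming a small synthetic dataset and a single decision tree as the only base model (both chosen here for illustration); it uses scikit-learn's `train_test_split()` for the holdout case and `cross_val_predict()` for the out-of-fold case.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, cross_val_predict
from sklearn.tree import DecisionTreeClassifier

# small synthetic dataset, used only for illustration
X, y = make_classification(n_samples=200, n_features=10, random_state=1)

# blending: fit on one part of the data, predict on a single holdout part
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.5, random_state=1)
holdout_pred = DecisionTreeClassifier(random_state=1).fit(X_train, y_train).predict(X_val)

# stacking: each row is predicted by a model fit on the other folds
oof_pred = cross_val_predict(DecisionTreeClassifier(random_state=1), X, y, cv=5)

# holdout predictions cover only the holdout rows; out-of-fold predictions cover every row
print(holdout_pred.shape, oof_pred.shape)  # (100,) (200,)
```

In the blending case the meta-model would be trained on 100 rows of predictions; in the stacking case it would be trained on all 200.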

## 2. Develop a Blending Ensemble 
__The scikit-learn library does not natively support blending at the time of writing.__

__First__, we need to create a number of base models. These can be any models we like for a regression or classification problem. We can define a function get_models() that returns a list of models where each model is defined as a tuple with a name and the configured classifier or regression object.

```
# get a list of base models
def get_models():
    models = list()
    
    models.append(('lr', LogisticRegression()))
    models.append(('knn', KNeighborsClassifier()))
    models.append(('cart', DecisionTreeClassifier()))
    models.append(('svm', SVC(probability=True)))
    models.append(('bayes', GaussianNB()))
    
    return models
```

__Second__, we need to fit the blending model.

Recall that the base models are fit on a training dataset. The meta-model is fit on the predictions made by each base model on a holdout dataset.

```
...
# fit all models on the training set and predict on hold out set
meta_X = list()
for name, model in models:

    # fit in training set
    model.fit(X_train, y_train)

    # predict on hold out set
    yhat = model.predict(X_val)

    # reshape predictions into a matrix with one column
    yhat = yhat.reshape(len(yhat), 1)

    # store predictions as input for blending
    meta_X.append(yhat)
```

We now have “meta_X”, a list of single-column prediction arrays, one per base model, that can be used to train the meta-model.

__Third__, we combine these arrays into a single dataset in which `each column or feature represents the output of one base model` and each row represents one sample from the holdout dataset. We can use the hstack() function to ensure this dataset is a 2D numpy array as expected by a machine learning model.

```
...
# create 2d array from predictions, each set is an input feature
meta_X = hstack(meta_X)
```
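To make the shape of the result concrete, here is a minimal sketch assuming three hypothetical base models that each predict a label for four holdout rows. It also shows why the reshape to a single column matters: without it, hstack() joins the 1D arrays end to end instead of side by side.

```python
from numpy import array, hstack

# hypothetical label predictions from three base models on four holdout rows
preds = [array([0, 1, 1, 0]), array([0, 1, 0, 0]), array([1, 1, 1, 0])]

# reshape each set of predictions into a single column, then stack the columns
columns = [p.reshape(len(p), 1) for p in preds]
meta_X = hstack(columns)
print(meta_X.shape)  # (4, 3): one row per sample, one column per base model
print(meta_X)

# without the reshape, the 1D arrays are concatenated into one long vector
flat = hstack(preds)
print(flat.shape)  # (12,)
```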

We can now train our meta-model. This can be any machine learning model we like, such as logistic regression for classification.

```
...
# define blending model
blender = LogisticRegression()

# fit on predictions from base models
blender.fit(meta_X, y_val)
```

We can tie all of this together into a function named fit_ensemble() that trains the blending model using a training dataset and holdout validation dataset.

```
# fit the blending ensemble
def fit_ensemble(models, X_train, X_val, y_train, y_val):

    # fit all models on the training set and predict on hold out set
    meta_X = list()
    for name, model in models:

        # fit in training set
        model.fit(X_train, y_train)

        # predict on hold out set
        yhat = model.predict(X_val)

        # reshape predictions into a matrix with one column
        yhat = yhat.reshape(len(yhat), 1)

        # store predictions as input for blending
        meta_X.append(yhat)

    # create 2d array from predictions, each set is an input feature
    meta_X = hstack(meta_X)

    # define blending model
    blender = LogisticRegression()

    # fit on predictions from base models
    blender.fit(meta_X, y_val)

    return blender
```

__Fourth__, the next step is to use the blending ensemble to make predictions on new data.

This is a two-step process. In the `first step`, each base model is used to make a prediction. In the `second step`, these predictions are gathered together and used as input to the blending model to make the final prediction.

We can use the same looping structure as we did when training the model. That is, we can collect the predictions from each base model into a meta-level dataset, stack the predictions together, and call predict() on the blender model with this meta-level dataset.

The predict_ensemble() function below implements this. Given the list of fit base models, the fit blender ensemble, and a dataset (such as a test dataset or new data), it will return a set of predictions for the dataset.

```
# make a prediction with the blending ensemble
def predict_ensemble(models, blender, X_test):

    # make predictions with base models
    meta_X = list()
    for name, model in models:

        # predict with base model
        yhat = model.predict(X_test)

        # reshape predictions into a matrix with one column
        yhat = yhat.reshape(len(yhat), 1)

        # store prediction
        meta_X.append(yhat)

    # create 2d array from predictions, each set is an input feature
    meta_X = hstack(meta_X)

    # predict
    return blender.predict(meta_X)
```
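Because the fitted base models and the blender must always travel together, it can be convenient to keep them in one object. The following is a minimal sketch of that idea, assuming a hypothetical `SimpleBlender` class (not part of scikit-learn and not from the tutorial), a small synthetic dataset, and two base models chosen only for illustration.

```python
from numpy import hstack
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


class SimpleBlender:
    """Hypothetical helper that bundles the base models with the meta-model."""

    def __init__(self, models, blender):
        self.models = models    # list of (name, estimator) tuples
        self.blender = blender  # meta-model fit on holdout predictions

    def _meta_features(self, X):
        # one column of predictions per base model
        columns = [model.predict(X).reshape(-1, 1) for _, model in self.models]
        return hstack(columns)

    def fit(self, X_train, y_train, X_val, y_val):
        # base models see only the training set
        for _, model in self.models:
            model.fit(X_train, y_train)
        # the meta-model sees only predictions made on the holdout set
        self.blender.fit(self._meta_features(X_val), y_val)
        return self

    def predict(self, X):
        return self.blender.predict(self._meta_features(X))


# small synthetic dataset, used only for illustration
X, y = make_classification(n_samples=500, n_features=10, random_state=1)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.33, random_state=1)

ensemble = SimpleBlender(
    [('knn', KNeighborsClassifier()), ('cart', DecisionTreeClassifier(random_state=1))],
    LogisticRegression(),
)
ensemble.fit(X_train, y_train, X_val, y_val)
yhat = ensemble.predict(X_val[:5])
print(yhat.shape)
```

The logic is the same as in fit_ensemble() and predict_ensemble(); only the packaging differs, so the function-based version used in the rest of this tutorial is equally valid.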

## 3. Blending Ensemble for Classification

```
# blending ensemble for classification using hard voting
from numpy import hstack
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
```

```
# get the dataset
def get_dataset():
    X, y = make_classification(n_samples=10000, n_features=20, n_informative=15, n_redundant=5, random_state=7)
    return X, y
```

```
# get a list of base models
def get_models():
    models = list()
    models.append(('lr', LogisticRegression()))
    models.append(('knn', KNeighborsClassifier()))
    models.append(('cart', DecisionTreeClassifier()))
    models.append(('svm', SVC()))
    models.append(('bayes', GaussianNB()))
    return models
```

```
# fit the blending ensemble
def fit_ensemble(models, X_train, X_val, y_train, y_val):

    # fit all models on the training set and predict on hold out set
    meta_X = list()
    for name, model in models:

        # fit in training set
        model.fit(X_train, y_train)

        # predict on hold out set
        yhat = model.predict(X_val)

        # reshape predictions into a matrix with one column
        yhat = yhat.reshape(len(yhat), 1)

        # store predictions as input for blending
        meta_X.append(yhat)

    # create 2d array from predictions, each set is an input feature
    meta_X = hstack(meta_X)

    # define blending model
    blender = LogisticRegression()

    # fit on predictions from base models
    blender.fit(meta_X, y_val)

    return blender
```

```
# make a prediction with the blending ensemble
def predict_ensemble(models, blender, X_test):

    # make predictions with base models
    meta_X = list()
    for name, model in models:

        # predict with base model
        yhat = model.predict(X_test)

        # reshape predictions into a matrix with one column
        yhat = yhat.reshape(len(yhat), 1)

        # store prediction
        meta_X.append(yhat)

    # create 2d array from predictions, each set is an input feature
    meta_X = hstack(meta_X)

    # predict
    return blender.predict(meta_X)
```

```
# define dataset
X, y = get_dataset()

# summarize the dataset
print(X.shape, y.shape)
```

```
(10000, 20) (10000,)
```


Next, we need to split the dataset, `first` into train and test sets, and `then` the training set into a subset used to train the base models and a subset used to train the meta-model.

In this case, we will use a 50-50 split for the train and test sets, then use a 67-33 split for train and validation sets.

```
# split dataset into train and test sets
X_train_full, X_test, y_train_full, y_test = train_test_split(X, y, test_size=0.5, random_state=1)

# split training set into train and validation sets
X_train, X_val, y_train, y_val = train_test_split(X_train_full, y_train_full, test_size=0.33, random_state=1)

# summarize data split
print('Train: %s, Val: %s, Test: %s' % (X_train.shape, X_val.shape, X_test.shape))
```

```
Train: (3350, 20), Val: (1650, 20), Test: (5000, 20)
```


We can then use the get_models() function from the previous section to create the classification models used in the ensemble.

```
# create the base models
models = get_models()
```

The fit_ensemble() function can then be called to fit the blending ensemble on the train and validation datasets, and the predict_ensemble() function can be used to make predictions on the test dataset.

```
# train the blending ensemble
blender = fit_ensemble(models, X_train, X_val, y_train, y_val)

# make predictions on test set
yhat = predict_ensemble(models, blender, X_test)
```

Finally, we can evaluate the performance of the blending model by reporting the classification accuracy on the test dataset.

```
# evaluate predictions
score = accuracy_score(y_test, yhat)
print('Blending Accuracy: %.3f' % (score*100))
```

```
Blending Accuracy: 97.700
```


### 3.1 Blending Ensemble for Classification - soft voting
In the previous example, crisp class label predictions were combined using the blending model. This is a type of [hard voting](https://machinelearningmastery.com/voting-ensembles-with-python/).

An alternative is to have each model predict class probabilities and use the meta-model to blend the probabilities. This is a type of `soft voting` and can result in better performance in some cases.

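Moving from hard to soft voting requires three changes, marked (1) to (3) in the complete example:

1. Configure the models to return probabilities, such as the `SVM model`.
2. Change the base models to predict probabilities instead of crisp class labels when fitting the ensemble.
3. Change the base models to predict probabilities in the same way when the blending model is used to make predictions on new data.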
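The effect of these changes on the meta-model's input can be seen by comparing shapes. The following is a minimal sketch, assuming a small synthetic binary dataset and two base models chosen only for illustration: predict() gives one column per model, while predict_proba() gives one column per class per model and is already 2D, so no reshape is needed.

```python
from numpy import hstack
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

# small synthetic binary classification dataset, used only for illustration
X, y = make_classification(n_samples=100, n_features=10, random_state=1)
models = [LogisticRegression().fit(X, y), GaussianNB().fit(X, y)]

# hard voting input: one column of class labels per base model
hard = hstack([m.predict(X).reshape(-1, 1) for m in models])

# soft voting input: one column per class per base model
soft = hstack([m.predict_proba(X) for m in models])

print(hard.shape, soft.shape)  # (100, 2) (100, 4)
```

With two classes, each base model contributes two probability columns that sum to one for every row.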

```
# blending ensemble for classification using soft voting
from numpy import hstack
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB

# get the dataset
def get_dataset():
    X, y = make_classification(n_samples=10000, n_features=20, n_informative=15, n_redundant=5, random_state=7)
    return X, y

# get a list of base models
def get_models():
    models = list()
    models.append(('lr', LogisticRegression()))
    models.append(('knn', KNeighborsClassifier()))
    models.append(('cart', DecisionTreeClassifier()))
    models.append(('svm', SVC(probability=True))) ## (1)
    models.append(('bayes', GaussianNB()))
    return models

# fit the blending ensemble
def fit_ensemble(models, X_train, X_val, y_train, y_val):

    # fit all models on the training set and predict on hold out set
    meta_X = list()
    for name, model in models:

        # fit in training set
        model.fit(X_train, y_train)

        # predict on hold out set
        yhat = model.predict_proba(X_val)

        ## reshape predictions into a matrix with one column
        ##yhat = yhat.reshape(len(yhat), 1) ## (2)

        # store predictions as input for blending
        meta_X.append(yhat)

    # create 2d array from predictions, each set is an input feature
    meta_X = hstack(meta_X)

    # define blending model
    blender = LogisticRegression()

    # fit on predictions from base models
    blender.fit(meta_X, y_val)

    return blender

# make a prediction with the blending ensemble
def predict_ensemble(models, blender, X_test):

    # make predictions with base models
    meta_X = list()
    for name, model in models:

        # predict with base model
        yhat = model.predict_proba(X_test)

        ## reshape predictions into a matrix with one column
        ##yhat = yhat.reshape(len(yhat), 1) ## (3)

        # store prediction
        meta_X.append(yhat)

    # create 2d array from predictions, each set is an input feature
    meta_X = hstack(meta_X)

    # predict
    return blender.predict(meta_X)

# define dataset
X, y = get_dataset()

# split dataset into train and test sets
X_train_full, X_test, y_train_full, y_test = train_test_split(X, y, test_size=0.5, random_state=1)

# split training set into train and validation sets
X_train, X_val, y_train, y_val = train_test_split(X_train_full, y_train_full, test_size=0.33, random_state=1)

# summarize data split
print('Train: %s, Val: %s, Test: %s' % (X_train.shape, X_val.shape, X_test.shape))

# create the base models
models = get_models()

# train the blending ensemble
blender = fit_ensemble(models, X_train, X_val, y_train, y_val)

# make predictions on test set
yhat = predict_ensemble(models, blender, X_test)

# evaluate predictions
score = accuracy_score(y_test, yhat)
print('Blending Accuracy: %.3f' % (score*100))
```

```
Train: (3350, 20), Val: (1650, 20), Test: (5000, 20)
Blending Accuracy: 98.300
```


A blending ensemble is only effective if it is able to out-perform any single contributing model.

We can confirm this by evaluating each of the base models in isolation. Each base model can be fit on the entire training dataset (unlike the blending ensemble) and evaluated on the test dataset (just like the blending ensemble).

The example below demonstrates this, evaluating each base model in isolation.

```
# evaluate standalone model
for name, model in models:
    # fit the model on the training dataset
    model.fit(X_train_full, y_train_full)

    # make a prediction on the test dataset
    yhat = model.predict(X_test)

    # evaluate the predictions
    score = accuracy_score(y_test, yhat)

    # report the score
    print('>%s Accuracy: %.3f' % (name, score*100))
```

```
>lr Accuracy: 87.800
>knn Accuracy: 97.380
>cart Accuracy: 88.060
>svm Accuracy: 98.200
>bayes Accuracy: 87.300
```


In this case, we can see that all models perform worse than the blended ensemble.

Interestingly, we can see that the SVM comes very close, achieving an accuracy of 98.200 percent compared to 98.300 percent achieved with the blending ensemble.

We may choose to use a blending ensemble as our final model.

### 3.2 Blending Ensemble for Classification - Prediction

Using the blending ensemble as a final model involves fitting it on the entire training dataset and making predictions on new examples. Specifically, the entire training dataset is split into train and validation sets to train the base and meta-models respectively, then the ensemble can be used to make a prediction.

The example below makes a prediction on new data with a blending ensemble for classification, reusing the functions defined in the soft voting example above.

```
# define dataset
X, y = get_dataset()

# split dataset into train and validation sets
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.33, random_state=1)

# summarize data split
print('Train: %s, Val: %s' % (X_train.shape, X_val.shape))

# create the base models
models = get_models()

# train the blending ensemble
blender = fit_ensemble(models, X_train, X_val, y_train, y_val)

# make a prediction on a new row of data
row = [-0.30335011, 2.68066314, 2.07794281, 1.15253537, -2.0583897, -2.51936601, 0.67513028, -3.20651939, -1.60345385, 3.68820714, 0.05370913, 1.35804433, 0.42011397, 1.4732839, 2.89997622, 1.61119399, 7.72630965, -2.84089477, -1.83977415, 1.34381989]
yhat = predict_ensemble(models, blender, [row])

# summarize prediction (yhat is an array with one element)
print('Predicted Class: %d' % (yhat[0]))
```

```
Train: (6700, 20), Val: (3300, 20)
Predicted Class: 1
```


## 4. Blending Ensemble for Regression

```
# evaluate blending ensemble for regression
from numpy import hstack
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.svm import SVR
```

```
# get the dataset
def get_dataset():
    X, y = make_regression(n_samples=10000, n_features=20, n_informative=10, noise=0.3, random_state=7)

    return X, y
```

```
# get a list of base models
def get_models():
    models = list()

    models.append(('lr', LinearRegression()))
    models.append(('knn', KNeighborsRegressor()))
    models.append(('cart', DecisionTreeRegressor()))
    models.append(('svm', SVR()))

    return models
```

```
# fit the blending ensemble
def fit_ensemble(models, X_train, X_val, y_train, y_val):

    # fit all models on the training set and predict on hold out set
    meta_X = list()
    for name, model in models:

        # fit in training set
        model.fit(X_train, y_train)

        # predict on hold out set
        yhat = model.predict(X_val)

        # reshape predictions into a matrix with one column
        yhat = yhat.reshape(len(yhat), 1)

        # store predictions as input for blending
        meta_X.append(yhat)

    # create 2d array from predictions, each set is an input feature
    meta_X = hstack(meta_X)

    # define blending model
    blender = LinearRegression()

    # fit on predictions from base models
    blender.fit(meta_X, y_val)

    return blender
```

```
# make a prediction with the blending ensemble
def predict_ensemble(models, blender, X_test):

    # make predictions with base models
    meta_X = list()
    for name, model in models:

        # predict with base model
        yhat = model.predict(X_test)

        # reshape predictions into a matrix with one column
        yhat = yhat.reshape(len(yhat), 1)

        # store prediction
        meta_X.append(yhat)

    # create 2d array from predictions, each set is an input feature
    meta_X = hstack(meta_X)

    # predict
    return blender.predict(meta_X)
```

```
# define dataset
X, y = get_dataset()

# split dataset into train and test sets
X_train_full, X_test, y_train_full, y_test = train_test_split(X, y, test_size=0.5, random_state=1)

# split training set into train and validation sets
X_train, X_val, y_train, y_val = train_test_split(X_train_full, y_train_full, test_size=0.33, random_state=1)

# summarize data split
print('Train: %s, Val: %s, Test: %s' % (X_train.shape, X_val.shape, X_test.shape))
```

```
Train: (3350, 20), Val: (1650, 20), Test: (5000, 20)
```


```
# create the base models
models = get_models()

# train the blending ensemble
blender = fit_ensemble(models, X_train, X_val, y_train, y_val)

# make predictions on test set
yhat = predict_ensemble(models, blender, X_test)
```

```
# evaluate predictions
score = mean_absolute_error(y_test, yhat)
print('Blending MAE: %.3f' % score)
```

```
Blending MAE: 0.237
```


As with classification, the blending ensemble is only useful if it performs better than any of the base models that contribute to the ensemble.

We can check this by evaluating each base model in isolation by first fitting it on the entire training dataset (unlike the blending ensemble) and making predictions on the test dataset (like the blending ensemble).

The example below evaluates each of the base models in isolation on the synthetic regression predictive modeling dataset.

```
# define dataset
X, y = get_dataset()

# split dataset into train and test sets
X_train_full, X_test, y_train_full, y_test = train_test_split(X, y, test_size=0.5, random_state=1)

# summarize data split
print('Train: %s, Test: %s' % (X_train_full.shape, X_test.shape))

# create the base models
models = get_models()

# evaluate standalone model
for name, model in models:
    # fit the model on the training dataset
    model.fit(X_train_full, y_train_full)

    # make a prediction on the test dataset
    yhat = model.predict(X_test)

    # evaluate the predictions
    score = mean_absolute_error(y_test, yhat)

    # report the score
    print('>%s MAE: %.3f' % (name, score))
```

```
Train: (5000, 20), Test: (5000, 20)
>lr MAE: 0.236
>knn MAE: 100.169
>cart MAE: 132.670
>svm MAE: 138.195
```


In this case, we can see that the linear regression model has performed slightly better than the blending ensemble, achieving an MAE of 0.236 as compared to 0.237 with the ensemble. This may be because of the way that the synthetic dataset was constructed.

Nevertheless, in this case, we would choose to use the linear regression model directly on this problem. This highlights the importance of checking the performance of the contributing models before adopting an ensemble model as the final model.
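This check can be written down as a simple comparison. The following is a minimal sketch that uses the MAE values reported in the runs above; the `scores` dictionary and the variable names are hypothetical, and the numbers will differ on other datasets or runs.

```python
# MAE values as reported in the runs above (lower is better)
scores = {'blending': 0.237, 'lr': 0.236, 'knn': 100.169, 'cart': 132.670, 'svm': 138.195}

# pick the candidate with the lowest error
best_name = min(scores, key=scores.get)
print(best_name, scores[best_name])  # lr 0.236

# the ensemble is only worth adopting if it beats every base model
best_base = min(value for name, value in scores.items() if name != 'blending')
use_ensemble = scores['blending'] < best_base
print(use_ensemble)  # False
```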

### 4.1 Blending Ensemble for Regression - Prediction
Again, we may choose to use a blending ensemble as our final model for regression.

This involves splitting the entire dataset into train and validation sets to fit the base and meta-models respectively; the ensemble can then be used to make a prediction for a new row of data.

```
# define dataset
X, y = get_dataset()

# split dataset into train and validation sets
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.33, random_state=1)

# summarize data split
print('Train: %s, Val: %s' % (X_train.shape, X_val.shape))

# create the base models
models = get_models()

# train the blending ensemble
blender = fit_ensemble(models, X_train, X_val, y_train, y_val)

# make a prediction on a new row of data
row = [-0.24038754, 0.55423865, -0.48979221, 1.56074459, -1.16007611, 1.10049103, 1.18385406, -1.57344162, 0.97862519, -0.03166643, 1.77099821, 1.98645499, 0.86780193, 2.01534177, 2.51509494, -1.04609004, -0.19428148, -0.05967386, -2.67168985, 1.07182911]
yhat = predict_ensemble(models, blender, [row])

# summarize prediction
print('Predicted: %.3f' % (yhat[0]))
```

```
Train: (6700, 20), Val: (3300, 20)
Predicted: 359.985
```
