# Multioutput Regression: Sklearn

Multioutput regression refers to regression problems that involve predicting two or more numerical values for a given input example. One example is predicting a coordinate, e.g. x and y values, from an input. Another is multi-step time series forecasting, which involves predicting multiple future time steps of a given variable. Many machine learning algorithms are designed to predict a single numeric value, which is referred to simply as regression. Some algorithms support multioutput regression inherently, such as linear regression and decision trees. There are also wrapper models that add multioutput support to algorithms that do not natively predict multiple outputs.

In multioutput regression, the outputs typically depend on the input and often on each other. Because the outputs are often not independent, the problem may call for a model that predicts all outputs together, or one that predicts each output conditional on the other outputs. Multi-step time series forecasting may be considered a type of multioutput regression in which a sequence of future values is predicted and each predicted value depends on the prior values in the sequence.

We will use the make_regression() function to create a test dataset for multioutput regression. We will generate 1,000 examples with 10 input features, five of which are informative and five of which do not contribute to the outputs. The problem requires the prediction of two numeric values.

- Problem Input: 10 numeric variables.
- Problem Output: 2 numeric variables.

## Import libraries

In [1]:
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedKFold
from sklearn.svm import LinearSVR
from sklearn.multioutput import MultiOutputRegressor
from sklearn.multioutput import RegressorChain

## Load data + declare features and targets

In [2]:
# create datasets
X, y = make_regression(n_samples=1000, n_features=10, n_informative=5, n_targets=2, random_state=1, noise=0.5)
print(X.shape)
print(y.shape)

(1000, 10)
(1000, 2)
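Since the introduction notes that outputs are often related to each other, a quick check of the correlation between the two target columns can be informative before choosing a strategy. The snippet below is a minimal sketch of such a check; the variable names `X_chk`, `y_chk`, and `target_corr` are illustrative and not part of the original notebook.

```python
import numpy as np
from sklearn.datasets import make_regression

# same generator settings as the dataset above; names are illustrative
X_chk, y_chk = make_regression(n_samples=1000, n_features=10, n_informative=5,
                               n_targets=2, random_state=1, noise=0.5)

# rowvar=False treats each column of y as one variable
target_corr = np.corrcoef(y_chk, rowvar=False)

# 2x2 matrix; the off-diagonal entry is the correlation between the two targets
print(target_corr.shape)
print(target_corr[0, 1])
```

A correlation near zero would suggest that treating the outputs independently loses little, while a strong correlation hints that a model aware of both outputs may help.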


Some regression machine learning algorithms support multiple outputs directly. This includes several of the popular machine learning algorithms implemented in the scikit-learn library, such as:

- LinearRegression (and related)
- KNeighborsRegressor
- DecisionTreeRegressor
- RandomForestRegressor (and related)
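RandomForestRegressor appears in the list above but is not demonstrated in this notebook. The following is a minimal sketch of fitting it directly on a two-column target; the smaller sample size, `n_estimators=20`, and the variable names are choices made only to keep the illustration quick.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# small illustrative dataset with two targets
X_rf, y_rf = make_regression(n_samples=200, n_features=10, n_informative=5,
                             n_targets=2, random_state=1, noise=0.5)

# the forest accepts a 2D target array without any wrapper
forest = RandomForestRegressor(n_estimators=20, random_state=1)
forest.fit(X_rf, y_rf)

# one row of predictions per input row, one column per target
forest_pred = forest.predict(X_rf[:3])
print(forest_pred.shape)
```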

## Inherently Multioutput Regression Algorithms

### Linear Regression for Multioutput Regression

In [3]:
# define model
model = LinearRegression()

In [4]:
# fit model
model.fit(X, y)

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)

In [5]:
# make a prediction
row = [0.21947749, 0.32948997, 0.81560036, 0.440956, -0.0606303, -0.29257894, -0.2820059, -0.00290545, 0.96402263, 0.04992249]
yhat = model.predict([row])

In [6]:
# summarize prediction
print(yhat[0])

[50.06781717 64.564973  ]


Running the example fits the linear regression model and then makes a prediction for one input row, confirming that the model predicts the two required values.

### k-Nearest Neighbors for Multioutput Regression

In [7]:
# define model
model = KNeighborsRegressor()

In [8]:
# fit model
model.fit(X, y)

KNeighborsRegressor(algorithm='auto', leaf_size=30, metric='minkowski',
                    metric_params=None, n_jobs=None, n_neighbors=5, p=2,
                    weights='uniform')

In [9]:
# make a prediction
row = [0.21947749, 0.32948997, 0.81560036, 0.440956, -0.0606303, -0.29257894, -0.2820059, -0.00290545, 0.96402263, 0.04992249]
yhat = model.predict([row])

In [10]:
# summarize prediction
print(yhat[0])

[-11.73511093  52.78406297]


Running the example fits the k-nearest neighbors model and then makes a prediction for one input row, confirming that the model predicts the two required values.

### Decision Tree for Multioutput Regression

In [11]:
# define model
model = DecisionTreeRegressor()

In [12]:
# fit model
model.fit(X, y)

DecisionTreeRegressor(ccp_alpha=0.0, criterion='mse', max_depth=None,
                      max_features=None, max_leaf_nodes=None,
                      min_impurity_decrease=0.0, min_impurity_split=None,
                      min_samples_leaf=1, min_samples_split=2,
                      min_weight_fraction_leaf=0.0, presort='deprecated',
                      random_state=None, splitter='best')

In [13]:
# make a prediction
row = [0.21947749, 0.32948997, 0.81560036, 0.440956, -0.0606303, -0.29257894, -0.2820059, -0.00290545, 0.96402263, 0.04992249]
yhat = model.predict([row])

In [14]:
# summarize prediction
print(yhat[0])

[49.93137149 64.08484989]


Running the example fits the decision tree model and then makes a prediction for one input row, confirming that the model predicts the two required values.

### Evaluate Multioutput Regression with Cross-Validation

We may want to evaluate a multioutput regression model using k-fold cross-validation. This can be achieved in the same way as evaluating any other machine learning model. We will fit and evaluate a DecisionTreeRegressor model on the test problem using 10-fold cross-validation with three repeats. We will use the mean absolute error (MAE) performance metric as the score.

In [15]:
# define model
model = DecisionTreeRegressor()

In [16]:
# define the evaluation procedure
cv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=1)

In [17]:
# evaluate the model and collect the scores
n_scores = cross_val_score(model, X, y, scoring='neg_mean_absolute_error', cv=cv, n_jobs=-1)

In [18]:
# force the scores to be positive
n_scores = np.abs(n_scores)

In [19]:
# summarize performance
print('MAE: %.3f (%.3f)' % (n_scores.mean(), n_scores.std()))

MAE: 52.010 (2.977)


Running the example evaluates the performance of the decision tree model for multioutput regression on the test problem. The mean and standard deviation of the MAE are reported, calculated across all folds and all repeats.
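The single MAE reported by the scorer is, by default, averaged uniformly across the outputs, so it hides how well each target is predicted. The sketch below shows one way to get a per-output MAE, using cross_val_predict with the `multioutput='raw_values'` option of mean_absolute_error. It uses a plain KFold rather than RepeatedKFold because cross_val_predict needs each sample to appear in exactly one test fold; the variable names are illustrative.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.tree import DecisionTreeRegressor

X_cv, y_cv = make_regression(n_samples=1000, n_features=10, n_informative=5,
                             n_targets=2, random_state=1, noise=0.5)

# single (non-repeated) 10-fold split so every row gets exactly one prediction
cv_single = KFold(n_splits=10, shuffle=True, random_state=1)
oof_pred = cross_val_predict(DecisionTreeRegressor(random_state=1),
                             X_cv, y_cv, cv=cv_single)

# one MAE value per target column instead of a single averaged score
per_output_mae = mean_absolute_error(y_cv, oof_pred, multioutput='raw_values')
print(per_output_mae)
```

The mean of `per_output_mae` equals the default averaged MAE computed on the same predictions.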

## Wrapper Multioutput Regression Algorithms

### Direct Multioutput Regression

The direct approach to multioutput regression involves dividing the regression problem into a separate problem for each target variable to be predicted. This assumes that the outputs are independent of each other, which might not be a correct assumption. Nevertheless, this approach can provide surprisingly effective predictions on a range of problems and may be worth trying, at least as a performance baseline. For example, the outputs for your problem may, in fact, be mostly independent, if not completely independent, and this strategy can help you find out. This approach is supported by the MultiOutputRegressor class that takes a regression model as an argument. It will then create one instance of the provided model for each output in the problem.
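To make the idea concrete, the direct strategy can be sketched by hand: fit one clone of the base model per target column and stack the predictions. The sketch below compares this with MultiOutputRegressor. It uses LinearRegression as the base model (a substitution made here so the comparison is deterministic), and all variable names are illustrative.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.multioutput import MultiOutputRegressor

X_d, y_d = make_regression(n_samples=300, n_features=10, n_informative=5,
                           n_targets=2, random_state=1, noise=0.5)

# by hand: one independent model per target column
base = LinearRegression()
manual_models = [clone(base).fit(X_d, y_d[:, j]) for j in range(y_d.shape[1])]
manual_pred = np.column_stack([m.predict(X_d[:5]) for m in manual_models])

# the wrapper does the same bookkeeping internally
wrapped = MultiOutputRegressor(LinearRegression()).fit(X_d, y_d)
wrapped_pred = wrapped.predict(X_d[:5])

# True when both approaches produce the same predictions
print(np.allclose(manual_pred, wrapped_pred))
```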

The example below demonstrates how we can first create a single-output regression model and then use the MultiOutputRegressor class to wrap it and add support for multioutput regression.

In [20]:
# define base model
model = LinearSVR()

In [21]:
# define the direct multioutput wrapper model
wrapper = MultiOutputRegressor(model)

In [22]:
# define the evaluation procedure
cv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=1)

In [23]:
# evaluate the model and collect the scores
n_scores = cross_val_score(wrapper, X, y, scoring='neg_mean_absolute_error', cv=cv, n_jobs=-1)

In [24]:
# force the scores to be positive
n_scores = np.abs(n_scores)

In [25]:
# summarize performance
print('MAE: %.3f (%.3f)' % (np.mean(n_scores), np.std(n_scores)))

MAE: 0.419 (0.024)


We can also use the direct multioutput regression wrapper as a final model and make predictions on new data. First, the model is fit on all available data, then the predict() function can be called to make predictions on new data. The example below demonstrates this on our synthetic multioutput regression dataset.

In [26]:
# fit the model on the whole dataset
wrapper.fit(X, y)

MultiOutputRegressor(estimator=LinearSVR(C=1.0, dual=True, epsilon=0.0,
                                         fit_intercept=True,
                                         intercept_scaling=1.0,
                                         loss='epsilon_insensitive',
                                         max_iter=1000, random_state=None,
                                         tol=0.0001, verbose=0),
                     n_jobs=None)

In [27]:
# make a single prediction
row = [0.21947749, 0.32948997, 0.81560036, 0.440956, -0.0606303, -0.29257894, -0.2820059, -0.00290545, 0.96402263, 0.04992249]
yhat = wrapper.predict([row])

In [28]:
# summarize the prediction
print('Predicted: %s' % yhat[0])

Predicted: [50.04689821 64.49918644]


Running the example fits the direct wrapper model on the entire dataset and then uses it to make a prediction on a new row of data, as we might when using the model in an application.

### Chained Multioutput Regression

Another approach to using single-output regression models for multioutput regression is to create a linear sequence of models. The first model in the sequence uses the input and predicts one output; the second model uses the input and the output from the first model to make a prediction; the third model uses the input and output from the first two models to make a prediction, and so on. For example, if a multioutput regression problem required the prediction of three values y1, y2 and y3 given an input X, then this could be partitioned into three dependent single-output regression problems as follows:

- Problem 1: Given X, predict y1.
- Problem 2: Given X and yhat1, predict y2.
- Problem 3: Given X, yhat1, and yhat2, predict y3.
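The three problems above can be sketched by hand with a simple loop. This is a minimal illustration rather than a reference implementation, and it makes one assumption worth stating: each model is trained on the observed values of the earlier targets, while at prediction time the earlier models' predictions (yhat1, yhat2) are fed forward. LinearRegression and all variable names are illustrative choices.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

# three targets to mirror y1, y2, y3 in the list above
X_c, y_c = make_regression(n_samples=300, n_features=10, n_informative=5,
                           n_targets=3, random_state=1, noise=0.5)

# training: model j sees X plus the observed values of targets 0..j-1
chain_models = []
for j in range(y_c.shape[1]):
    X_train_aug = np.hstack([X_c, y_c[:, :j]])
    chain_models.append(LinearRegression().fit(X_train_aug, y_c[:, j]))

# prediction: feed each model's prediction forward to the next model
X_new = X_c[:4]
chain_pred = np.empty((X_new.shape[0], 0))
for m in chain_models:
    X_new_aug = np.hstack([X_new, chain_pred])
    next_pred = m.predict(X_new_aug)
    chain_pred = np.hstack([chain_pred, next_pred.reshape(-1, 1)])

# one column of predictions per target
print(chain_pred.shape)
```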

This can be achieved using the RegressorChain class in the scikit-learn library. The order of the models may follow the order of the outputs in the dataset (the default) or be specified via the "order" argument. For example, order=[0,1] would first predict the 0th output and then the 1st output, whereas order=[1,0] would first predict the last output variable and then the first output variable in our test problem.
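As a small illustration of the order argument, the sketch below fits one chain with the default order and one with order=[1, 0], then inspects the fitted `order_` attribute. LinearRegression is used as the base model only to keep the example fast and deterministic, and the variable names are illustrative.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.multioutput import RegressorChain

X_o, y_o = make_regression(n_samples=300, n_features=10, n_informative=5,
                           n_targets=2, random_state=1, noise=0.5)

# default: chain follows the column order of y
chain_default = RegressorChain(LinearRegression()).fit(X_o, y_o)

# reversed: predict column 1 first, then column 0 using that prediction
chain_reversed = RegressorChain(LinearRegression(), order=[1, 0]).fit(X_o, y_o)

# order_ records the chain order actually used during fitting
print(list(chain_default.order_))
print(list(chain_reversed.order_))

# predictions are still returned in the original column order of y
reversed_pred = chain_reversed.predict(X_o[:1])
print(reversed_pred.shape)
```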

The example below demonstrates how we can first create a single-output regression model and then use the RegressorChain class to wrap it and add support for multioutput regression.

In [29]:
# define base model
model = LinearSVR()

In [30]:
# define the chained multioutput wrapper model
wrapper = RegressorChain(model)

In [31]:
# define the evaluation procedure
cv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=1)

In [32]:
# evaluate the model and collect the scores
n_scores = cross_val_score(wrapper, X, y, scoring='neg_mean_absolute_error', cv=cv, n_jobs=-1)

In [33]:
# force the scores to be positive
n_scores = np.abs(n_scores)

In [34]:
# summarize performance
print('MAE: %.3f (%.3f)' % (np.mean(n_scores), np.std(n_scores)))

MAE: 0.594 (0.322)


Running the example reports the mean and standard deviation of the MAE for the chained wrapper model. Note that you may see a ConvergenceWarning when running the example, which can be safely ignored for the purposes of this demonstration.
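If you would rather address the ConvergenceWarning than ignore it, one common response is to give LinearSVR more iterations through its `max_iter` parameter. The sketch below records any warnings raised while fitting with a larger iteration budget. The value 10000 is an arbitrary illustrative choice, and whether it removes the warning depends on the data and the scikit-learn version.

```python
import warnings

from sklearn.datasets import make_regression
from sklearn.exceptions import ConvergenceWarning
from sklearn.multioutput import RegressorChain
from sklearn.svm import LinearSVR

X_w, y_w = make_regression(n_samples=1000, n_features=10, n_informative=5,
                           n_targets=2, random_state=1, noise=0.5)

# illustrative setting: allow more solver iterations than the default of 1000
svr_chain = RegressorChain(LinearSVR(max_iter=10000, random_state=1))

# record warnings instead of printing them, so they can be counted
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    svr_chain.fit(X_w, y_w)

# number of convergence warnings raised during fitting
n_conv = sum(issubclass(w.category, ConvergenceWarning) for w in caught)
print(n_conv)

svr_pred = svr_chain.predict(X_w[:2])
print(svr_pred.shape)
```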

We can also use the chained multioutput regression wrapper as a final model and make predictions on new data. First, the model is fit on all available data, then the predict() function can be called to make predictions on new data. The example below demonstrates this on our synthetic multioutput regression dataset.

In [35]:
# fit the model on the whole dataset
wrapper.fit(X, y)



RegressorChain(base_estimator=LinearSVR(C=1.0, dual=True, epsilon=0.0,
                                        fit_intercept=True,
                                        intercept_scaling=1.0,
                                        loss='epsilon_insensitive',
                                        max_iter=1000, random_state=None,
                                        tol=0.0001, verbose=0),
               cv=None, order=None, random_state=None)

In [36]:
# make a single prediction
row = [0.21947749, 0.32948997, 0.81560036, 0.440956, -0.0606303, -0.29257894, -0.2820059, -0.00290545, 0.96402263, 0.04992249]
yhat = wrapper.predict([row])

In [37]:
# summarize the prediction
print('Predicted: %s' % yhat[0])

Predicted: [50.04191782 64.44116998]


Running the example fits the chained wrapper model on the entire dataset and then uses it to make a prediction on a new row of data, as we might when using the model in an application.