An example to compare multi-output regression with random forest and the [multioutput.MultiOutputRegressor](http://scikit-learn.org/stable/modules/multiclass.html#multiclass) meta-estimator.

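This example illustrates the use of the [multioutput.MultiOutputRegressor](http://scikit-learn.org/stable/modules/generated/sklearn.multioutput.MultiOutputRegressor.html#sklearn.multioutput.MultiOutputRegressor) meta-estimator to perform multi-output regression. A random forest regressor is used as the base estimator; because it also supports multi-output regression natively, the results of the two approaches can be compared.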
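
To make the difference between the two approaches concrete, here is a minimal sketch on a small synthetic dataset. The names `X_demo`, `y_demo`, `meta` and `native` are illustrative only, and `n_estimators=10` is an arbitrary choice, not a setting from this example. The meta-estimator fits one independent forest per target column, while the plain forest fits both targets in a single model.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.multioutput import MultiOutputRegressor

# Small synthetic problem (illustrative data, not the dataset used below)
rng = np.random.RandomState(0)
X_demo = rng.rand(50, 1)
y_demo = np.column_stack([np.sin(X_demo).ravel(), np.cos(X_demo).ravel()])

# The meta-estimator clones the base regressor once per target column
meta = MultiOutputRegressor(RandomForestRegressor(n_estimators=10, random_state=0))
meta.fit(X_demo, y_demo)

# A single forest handles both targets natively
native = RandomForestRegressor(n_estimators=10, random_state=0)
native.fit(X_demo, y_demo)

n_fitted = len(meta.estimators_)         # number of per-target forests
meta_shape = meta.predict(X_demo).shape  # one prediction column per target
native_shape = native.predict(X_demo).shape
print(n_fitted, meta_shape, native_shape)
```

Both models return predictions with one column per target, so they can be scored and plotted in the same way.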

The random forest regressor will only ever predict values within the range of observations or closer to zero for each of the targets. As a result, the predictions are biased towards the centre of the circle traced out by the two targets.
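
The range limitation can be checked directly. The sketch below is an illustration under assumed settings (noise-free targets, 20 trees, and hypothetical names such as `X_demo` and `X_new`): it trains a forest on inputs in [-100, 100] and then predicts far outside that interval. Because each tree predicts an average of training targets, every prediction should stay between the smallest and largest training value of the corresponding target.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Noise-free version of the circle data (illustrative, smaller than the example's dataset)
rng = np.random.RandomState(1)
X_demo = 200 * rng.rand(300, 1) - 100
y_demo = np.column_stack([np.pi * np.sin(X_demo).ravel(),
                          np.pi * np.cos(X_demo).ravel()])

forest = RandomForestRegressor(n_estimators=20, random_state=0)
forest.fit(X_demo, y_demo)

# Query points well outside the training interval [-100, 100]
X_new = np.linspace(-500, 500, 1000).reshape(-1, 1)
pred = forest.predict(X_new)

# Per-target bounds of the training observations
lo, hi = y_demo.min(axis=0), y_demo.max(axis=0)

# Small tolerance guards against floating-point rounding in the averages
within_range = bool(np.all((pred >= lo - 1e-9) & (pred <= hi + 1e-9)))
print(within_range)
```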

Using a single underlying feature, the model learns both the x and y coordinates as outputs.

### Version

In [1]:
import sklearn
sklearn.__version__

'0.18.1'

### Imports

This tutorial imports [RandomForestRegressor](http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html#sklearn.ensemble.RandomForestRegressor), [train_test_split](http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html#sklearn.model_selection.train_test_split) and [MultiOutputRegressor](http://scikit-learn.org/stable/modules/generated/sklearn.multioutput.MultiOutputRegressor.html#sklearn.multioutput.MultiOutputRegressor).

In [2]:
import plotly.plotly as py
import plotly.graph_objs as go

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputRegressor

### Calculations

In [3]:
# Create a random dataset
rng = np.random.RandomState(1)
X = np.sort(200 * rng.rand(600, 1) - 100, axis=0)
y = np.array([np.pi * np.sin(X).ravel(), np.pi * np.cos(X).ravel()]).T
y += (0.5 - rng.rand(*y.shape))

X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    train_size=400,
                                                    random_state=4)

max_depth = 30
regr_multirf = MultiOutputRegressor(RandomForestRegressor(max_depth=max_depth,
                                                          random_state=0))
regr_multirf.fit(X_train, y_train)

regr_rf = RandomForestRegressor(max_depth=max_depth, random_state=2)
regr_rf.fit(X_train, y_train)

# Predict on new data
y_multirf = regr_multirf.predict(X_test)
y_rf = regr_rf.predict(X_test)
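
The `score` method used in the plot legend below returns a single R² value aggregated over both targets, and the exact averaging may depend on the scikit-learn version. As a complementary check, the following sketch computes one R² per target with `r2_score(..., multioutput="raw_values")`. It rebuilds a smaller dataset so that it runs on its own; the sample size, `test_size=0.25`, `n_estimators=20` and all variable names here are assumptions for illustration, not the settings of the cell above.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputRegressor

# Rebuild a smaller noisy circle dataset (illustrative)
rng = np.random.RandomState(1)
X_demo = np.sort(200 * rng.rand(300, 1) - 100, axis=0)
y_demo = np.column_stack([np.pi * np.sin(X_demo).ravel(),
                          np.pi * np.cos(X_demo).ravel()])
y_demo += 0.5 - rng.rand(*y_demo.shape)

X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo,
                                          test_size=0.25, random_state=4)

models = {
    "multi_rf": MultiOutputRegressor(
        RandomForestRegressor(n_estimators=20, max_depth=30, random_state=0)),
    "rf": RandomForestRegressor(n_estimators=20, max_depth=30, random_state=2),
}

# One R^2 value per target column for each model
per_target = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    per_target[name] = r2_score(y_te, model.predict(X_te),
                                multioutput="raw_values")
    print(name, per_target[name])
```

Looking at the targets separately shows whether one coordinate is harder to predict than the other, which a single aggregated score can hide.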


### Plot Results

In [4]:
s = 50
a = 0.4

data = go.Scatter(x=y_test[:, 0], y=y_test[:, 1],
                  mode='markers',
                  marker=dict(color="navy",
                             line=dict(width=1, color='black')), 
                  name="Data",
                  opacity=0.8
                 )

multi_rf_score = go.Scatter(x=y_multirf[:, 0], y=y_multirf[:, 1],
                            mode='markers',
                            marker=dict(color="cornflowerblue", 
                                        line=dict(width=1, color='black')),
                            name="Multi RF score=%.2f" % regr_multirf.score(X_test, y_test),
                            opacity=0.8
                           )

rf_score = go.Scatter(x=y_rf[:, 0], y=y_rf[:, 1],
                      mode='markers',
                      marker=dict(color='cyan',
                                  line=dict(width=1, color='black')),
                      name="RF score=%.2f" % regr_rf.score(X_test, y_test),
                      opacity=0.8
                     )
data_ = [data, multi_rf_score, rf_score]
layout = go.Layout(title="Comparing random forests and the multi-output meta estimator",
                   xaxis=dict(title='target1', showgrid=False,
                              zeroline=False),
                   yaxis=dict(title='target2', showgrid=False,
                              zeroline=False),
                   hovermode='closest'
                  )

fig = go.Figure(data=data_, layout=layout)

In [5]:
py.iplot(fig)

### License

Author: Tim Head <betatim@gmail.com>

License: BSD 3 clause

