A 1D regression with a decision tree.

A [decision tree](http://scikit-learn.org/stable/modules/tree.html#tree) is used to fit a sine curve with additional noisy observations. As a result, it learns a piecewise-constant approximation of the sine curve: each leaf of the tree predicts a single local value.
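To make the step-like shape of the fit concrete, here is a minimal sketch of a single regression split in plain Python. It is an illustration only, not scikit-learn's implementation: the helper names `best_split` and `predict` are hypothetical, and the sketch assumes the split is chosen by minimising the summed squared error around each side's mean.

```python
def best_split(xs, ys):
    """Return (threshold, left_mean, right_mean) minimising squared error.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        # Squared error when each side predicts its own mean
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        # Candidate threshold: midpoint between neighbouring x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1:]


def predict(x, split):
    """Predict with a one-split tree: one constant per side of the threshold."""
    threshold, left_mean, right_mean = split
    return left_mean if x <= threshold else right_mean


split = best_split([0.0, 1.0, 2.0, 3.0], [0.0, 0.0, 10.0, 10.0])
print(split)  # (1.5, 0.0, 10.0)
```

A deeper tree repeats this step inside each side, which is why the number of distinct prediction levels grows with depth.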

If the maximum depth of the tree (controlled by the `max_depth` parameter) is set too high, the tree learns overly fine details of the training data, including the noise; that is, it overfits.
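One way to put numbers on this is to compare the error on the noisy training points with the error against the noise-free sine curve. The sketch below rebuilds the same dataset as the Calculations section. The extra setting `max_depth=None` (an unrestricted tree), the fixed `random_state=0`, and the variable names `train_mse` and `clean_mse` are assumptions added for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Same noisy sine data as constructed in the Calculations section
rng = np.random.RandomState(1)
X = np.sort(5 * rng.rand(80, 1), axis=0)
y = np.sin(X).ravel()
y[::5] += 3 * (0.5 - rng.rand(16))

# Noise-free grid used as a reference for the underlying curve
X_grid = np.arange(0.0, 5.0, 0.01)[:, np.newaxis]
y_true = np.sin(X_grid).ravel()

train_mse = {}
clean_mse = {}
for depth in (2, 5, None):  # None lets the tree grow without a depth limit
    regr = DecisionTreeRegressor(max_depth=depth, random_state=0).fit(X, y)
    # Error on the noisy points the tree was trained on
    train_mse[depth] = float(np.mean((regr.predict(X) - y) ** 2))
    # Error against the true sine curve, which the tree never saw directly
    clean_mse[depth] = float(np.mean((regr.predict(X_grid) - y_true) ** 2))
    print(depth, train_mse[depth], clean_mse[depth])
```

The training error can only shrink as the depth grows, and the unrestricted tree reproduces every noisy point. The error against the clean curve is the quantity to watch: if it stops improving or gets worse while the training error keeps falling, the extra depth is being spent on noise.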

#### New to Plotly?
Plotly's Python library is free and open source! [Get started](https://plot.ly/python/getting-started/) by downloading the client and [reading the primer](https://plot.ly/python/getting-started/).
<br>You can set up Plotly to work in [online](https://plot.ly/python/getting-started/#initialization-for-online-plotting) or [offline](https://plot.ly/python/getting-started/#initialization-for-offline-plotting) mode, or in [jupyter notebooks](https://plot.ly/python/getting-started/#start-plotting-online).
<br>We also have a quick-reference [cheatsheet](https://images.plot.ly/plotly-documentation/images/python_cheat_sheet.pdf) (new!) to help you get started!

### Version

In [1]:
import sklearn
sklearn.__version__

'0.18.1'

### Imports

In [2]:
print(__doc__)
import plotly.plotly as py
import plotly.graph_objs as go

import numpy as np
from sklearn.tree import DecisionTreeRegressor

Automatically created module for IPython interactive environment


### Calculations

In [3]:
# Create a random dataset
rng = np.random.RandomState(1)
X = np.sort(5 * rng.rand(80, 1), axis=0)
y = np.sin(X).ravel()
y[::5] += 3 * (0.5 - rng.rand(16))

# Fit regression model
regr_1 = DecisionTreeRegressor(max_depth=2)
regr_2 = DecisionTreeRegressor(max_depth=5)
regr_1.fit(X, y)
regr_2.fit(X, y)

# Predict
X_test = np.arange(0.0, 5.0, 0.01)[:, np.newaxis]
y_1 = regr_1.predict(X_test)
y_2 = regr_2.predict(X_test)

### Plot Results

In [4]:
def data_to_plotly(x):
    # Flatten an (n, 1) column array into a plain list of its first-column values
    return [row[0] for row in x]

In [5]:

p1 = go.Scatter(x=data_to_plotly(X), y=y, 
                mode='markers',
                marker=dict(color="darkorange"),
                name="data")

p2 = go.Scatter(x=data_to_plotly(X_test), y=y_1, 
                mode='lines',
                line=dict(color="cornflowerblue"),
                name="max_depth=2")

p3 = go.Scatter(x=data_to_plotly(X_test), y=y_2, 
                mode='lines',
                line=dict(color="yellowgreen"),
                name="max_depth=5")

layout = go.Layout(xaxis=dict(title="data"),
                   yaxis=dict(title="target"),
                   title="Decision Tree Regression"
                  )
fig = go.Figure(data=[p1, p2, p3], layout=layout)

In [6]:
py.iplot(fig)

