This example illustrates and compares the bias-variance decomposition of the expected mean squared error of a single estimator against a bagging ensemble.

In regression, the expected mean squared error of an estimator can be decomposed in terms of bias, variance and noise. On average over datasets of the regression problem, the bias term measures the average amount by which the predictions of the estimator differ from the predictions of the best possible estimator for the problem (i.e., the Bayes model). The variance term measures the variability of the predictions of the estimator when fit over different instances LS of the problem. Finally, the noise measures the irreducible part of the error, which is due to the variability in the data.
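For reference, the decomposition described above can be written pointwise as follows. The notation is an assumption introduced here for illustration: $\hat{y}_{LS}(x)$ denotes the prediction at $x$ of the estimator fit on the learning set LS, $f(x) = E[y \mid x]$ denotes the Bayes model, and $\sigma^2(x)$ denotes the variance of the targets at $x$.

```latex
\begin{aligned}
\mathrm{error}(x)
  &= E_{LS}\, E_{y \mid x}\!\left[\left(y - \hat{y}_{LS}(x)\right)^2\right] \\
  &= \underbrace{\sigma^2(x)}_{\text{noise}}
   + \underbrace{\left(f(x) - E_{LS}\!\left[\hat{y}_{LS}(x)\right]\right)^2}_{\text{bias}^2}
   + \underbrace{E_{LS}\!\left[\left(\hat{y}_{LS}(x) - E_{LS}\!\left[\hat{y}_{LS}(x)\right]\right)^2\right]}_{\text{variance}}
\end{aligned}
```

In this example the targets are generated as $f(x)$ plus Gaussian noise with standard deviation 0.1, so the noise term should be close to $0.1^2 = 0.01$ everywhere.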

The upper left figure illustrates the predictions (in dark red) of a single decision tree trained over a random dataset LS (the blue dots) of a toy 1d regression problem. It also illustrates the predictions (in light red) of other single decision trees trained over other (and different) randomly drawn instances LS of the problem. 

Intuitively, the variance term here corresponds to the width of the beam of predictions (in light red) of the individual estimators. The larger the variance, the more sensitive are the predictions for x to small changes in the training set. The bias term corresponds to the difference between the average prediction of the estimator (in cyan) and the best possible model (in dark blue). On this problem, we can thus observe that the bias is quite low (both the cyan and the blue curves are close to each other) while the variance is large (the red beam is rather wide).

The lower left figure plots the pointwise decomposition of the expected mean squared error of a single decision tree. It confirms that the bias term (in blue) is low while the variance is large (in green). It also illustrates the noise part of the error which, as expected, appears to be constant and around 0.01.
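The pointwise decomposition can also be checked numerically. The following is a minimal sketch on synthetic arrays, not the computation used for the figures: the values in `f_x`, the offset of 0.05 and the two spreads are hypothetical. It also assumes the bias is measured against the empirical mean of the targets, which makes the identity exact; the script in this example measures bias against `f(X_test)`, so there the three terms add up to the error only approximately.

```python
import numpy as np

rng = np.random.RandomState(0)
n_test, n_repeat = 5, 200

# Hypothetical values of the true function at 5 test points
f_x = np.linspace(0.0, 1.0, n_test)

# Noisy targets and noisy, slightly offset predictions (illustrative only)
y_test = f_x[:, None] + rng.normal(0.0, 0.1, (n_test, n_repeat))
y_pred = f_x[:, None] + 0.05 + rng.normal(0.0, 0.2, (n_test, n_repeat))

# Squared error averaged over every (target draw, prediction draw) pair
diff = y_test[:, :, None] - y_pred[:, None, :]
y_error = np.mean(diff ** 2, axis=(1, 2))

# The three terms, computed independently for each test point
y_noise = np.var(y_test, axis=1)
y_var = np.var(y_pred, axis=1)
y_bias_emp = (np.mean(y_test, axis=1) - np.mean(y_pred, axis=1)) ** 2

# Largest pointwise difference between the error and the sum of the terms
decomposition_gap = np.max(np.abs(y_error - (y_bias_emp + y_var + y_noise)))
print(decomposition_gap)
```

The gap should be at the level of floating-point round-off, because `np.var` uses the population variance (`ddof=0`) by default.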

The right figures correspond to the same plots but using instead a bagging ensemble of decision trees. In both figures, we can observe that the bias term is larger than in the previous case. In the upper right figure, the difference between the average prediction (in cyan) and the best possible model is larger (e.g., notice the offset around x=2). In the lower right figure, the bias curve is also slightly higher than in the lower left figure. In terms of variance, however, the beam of predictions is narrower, which suggests that the variance is lower. Indeed, as the lower right figure confirms, the variance term (in green) is lower than for single decision trees. Overall, the bias-variance decomposition is therefore no longer the same. The tradeoff is better for bagging: averaging several decision trees fit on bootstrap copies of the dataset slightly increases the bias term but allows for a larger reduction of the variance, which results in a lower overall mean squared error (compare the red curves in the lower figures). The script output also confirms this intuition. The total error of the bagging ensemble is lower than the total error of a single decision tree, and this difference indeed mainly stems from a reduced variance.
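The variance reduction obtained by averaging over bootstrap copies can be sketched without scikit-learn. The code below is an illustration under stated assumptions, not the method used in this example: it swaps the decision tree for a hand-written 1-nearest-neighbour regressor (another high-variance estimator), and the helper names `one_nn_predict` and `bagged_predict` are hypothetical. The target function and the noise level are the ones used in this example.

```python
import numpy as np

def f(x):
    # Same target function as in this example
    return np.exp(-x ** 2) + 1.5 * np.exp(-(x - 2) ** 2)

def one_nn_predict(X_train, y_train, X_query):
    # Hypothetical helper: predict with the target of the nearest training point
    idx = np.abs(X_query[:, None] - X_train[None, :]).argmin(axis=1)
    return y_train[idx]

def bagged_predict(X_train, y_train, X_query, n_bootstrap, rng):
    # Hypothetical helper: average 1-NN predictions over bootstrap copies
    n = X_train.shape[0]
    preds = np.zeros((n_bootstrap, X_query.shape[0]))
    for b in range(n_bootstrap):
        sample = rng.randint(0, n, n)  # draw n indices with replacement
        preds[b] = one_nn_predict(X_train[sample], y_train[sample], X_query)
    return preds.mean(axis=0)

rng = np.random.RandomState(0)
n_repeat, n_train, noise = 50, 50, 0.1
X_query = np.linspace(-5, 5, 200)

single = np.zeros((n_repeat, X_query.shape[0]))
bagged = np.zeros((n_repeat, X_query.shape[0]))

for i in range(n_repeat):
    # A fresh learning set for every repetition
    X = rng.rand(n_train) * 10 - 5
    y = f(X) + rng.normal(0.0, noise, n_train)
    single[i] = one_nn_predict(X, y, X_query)
    bagged[i] = bagged_predict(X, y, X_query, 10, rng)

# Variance over learning sets, averaged over the query points
var_single = single.var(axis=0).mean()
var_bagged = bagged.var(axis=0).mean()
print(var_single, var_bagged)
```

With these settings the bagged variance should come out lower than the single-estimator variance, in line with the narrower beam of predictions in the right figures.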

#### New to Plotly?
Plotly's Python library is free and open source! [Get started](https://plot.ly/python/getting-started/) by downloading the client and [reading the primer](https://plot.ly/python/getting-started/).
<br>You can set up Plotly to work in [online](https://plot.ly/python/getting-started/#initialization-for-online-plotting) or [offline](https://plot.ly/python/getting-started/#initialization-for-offline-plotting) mode, or in [jupyter notebooks](https://plot.ly/python/getting-started/#start-plotting-online).
<br>We also have a quick-reference [cheatsheet](https://images.plot.ly/plotly-documentation/images/python_cheat_sheet.pdf) (new!) to help you get started!

### Version

In [1]:
import sklearn
sklearn.__version__

'0.18.1'

### Imports

This tutorial imports [BaggingRegressor](http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.BaggingRegressor.html#sklearn.ensemble.BaggingRegressor) and [DecisionTreeRegressor](http://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeRegressor.html#sklearn.tree.DecisionTreeRegressor).

In [2]:
print(__doc__)

import plotly.plotly as py
import plotly.graph_objs as go
from plotly import tools

import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor

Automatically created module for IPython interactive environment


### Calculations

In [3]:
n_repeat = 50       # Number of iterations for computing expectations
n_train = 50        # Size of the training set
n_test = 1000       # Size of the test set
noise = 0.1         # Standard deviation of the noise
np.random.seed(0)

# Change this for exploring the bias-variance decomposition of other
# estimators. This should work well for estimators with high variance (e.g.,
# decision trees or KNN), but poorly for estimators with low variance (e.g.,
# linear models).
estimators = [("Tree", DecisionTreeRegressor()),
              ("Bagging(Tree)", BaggingRegressor(DecisionTreeRegressor()))]

n_estimators = len(estimators)

# Generate data
def f(x):
    x = x.ravel()

    return np.exp(-x ** 2) + 1.5 * np.exp(-(x - 2) ** 2)

def generate(n_samples, noise, n_repeat=1):
    X = np.random.rand(n_samples) * 10 - 5
    X = np.sort(X)

    if n_repeat == 1:
        y = f(X) + np.random.normal(0.0, noise, n_samples)
    else:
        y = np.zeros((n_samples, n_repeat))

        for i in range(n_repeat):
            y[:, i] = f(X) + np.random.normal(0.0, noise, n_samples)

    X = X.reshape((n_samples, 1))

    return X, y

X_train = []
y_train = []

for i in range(n_repeat):
    X, y = generate(n_samples=n_train, noise=noise)
    X_train.append(X)
    y_train.append(y)

X_test, y_test = generate(n_samples=n_test, noise=noise, n_repeat=n_repeat)


### Plot Results

In [4]:
def data_to_plotly(x):
    # Flatten a column vector of shape (n_samples, 1) into a plain list
    return [row[0] for row in x]

In [5]:
fig = tools.make_subplots(rows=2, cols=2, 
                          subplot_titles=('Tree', 'Bagging(tree)'))

This is the format of your plot grid:
[ (1,1) x1,y1 ]  [ (1,2) x2,y2 ]
[ (2,1) x3,y3 ]  [ (2,2) x4,y4 ]



In [6]:
col = 1
row = 1

for n, (name, estimator) in enumerate(estimators):
    # Compute predictions
    y_predict = np.zeros((n_test, n_repeat))

    for i in range(n_repeat):
        estimator.fit(X_train[i], y_train[i])
        y_predict[:, i] = estimator.predict(X_test)

    # Bias^2 + Variance + Noise decomposition of the mean squared error
    y_error = np.zeros(n_test)

    for i in range(n_repeat):
        for j in range(n_repeat):
            y_error += (y_test[:, j] - y_predict[:, i]) ** 2

    y_error /= (n_repeat * n_repeat)

    y_noise = np.var(y_test, axis=1)
    y_bias = (f(X_test) - np.mean(y_predict, axis=1)) ** 2
    y_var = np.var(y_predict, axis=1)

    print("{0}: {1:.4f} (error) = {2:.4f} (bias^2) "
          " + {3:.4f} (var) + {4:.4f} (noise)".format(name,
                                                      np.mean(y_error),
                                                      np.mean(y_bias),
                                                      np.mean(y_var),
                                                      np.mean(y_noise)))

    # Plot figures
    legend = (col == 1)  # show legend entries only for the first column
    
    for i in range(n_repeat):
        if i == 0:
            p3 = go.Scatter(x=data_to_plotly(X_test), 
                            y=y_predict[:, i],
                            showlegend=legend,
                            mode='lines',
                            line=dict(color='red'),
                            name="<i>y(x)</i>")
            fig.append_trace(p3, row, col)
        else:
            p3 = go.Scatter(x=data_to_plotly(X_test),
                            y=y_predict[:, i], 
                            showlegend=False,                       
                            mode='lines',
                            line=dict(color='red', width=0.3))
            fig.append_trace(p3, row, col)
            
    p1 = go.Scatter(x=data_to_plotly(X_test),
                    y=f(X_test), 
                    showlegend=legend,
                    mode='lines',
                    line=dict(color='blue'),
                    name="<i>f(x)</i>")
    fig.append_trace(p1, row, col)
    
    # Training set LS, drawn as individual points (the blue dots)
    p2 = go.Scatter(x=data_to_plotly(X_train[0]), 
                    y=y_train[0], 
                    showlegend=legend,
                    mode='markers',
                    marker=dict(color='blue'),
                    name="LS ~ <i>y = f(x)+noise</i>")
    fig.append_trace(p2, row, col)

    p4 = go.Scatter(x=data_to_plotly(X_test),
                    y=np.mean(y_predict, axis=1),
                    showlegend=legend,
                    mode='lines',
                    line=dict(color='cyan'),
                    name="<i>E<sub>LS</sub>y(x)</i>")
    fig.append_trace(p4, row, col)
    
    
    p5 = go.Scatter(x=data_to_plotly(X_test),
                    y=y_error, 
                    showlegend=legend,
                    mode='lines',
                    line=dict(color='red'), 
                    name="<i>error(x)</i>")
    fig.append_trace(p5, row+1, col)
    
    p6 = go.Scatter(x=data_to_plotly(X_test), 
                    y=y_bias, 
                    showlegend=legend,
                    mode='lines',
                    line=dict(color='blue'), 
                    name="<i>bias<sup>2</sup>(x)</i>")
    fig.append_trace(p6, row+1, col)
    
    p7 = go.Scatter(x=data_to_plotly(X_test),
                    y=y_var,
                    showlegend=legend,
                    mode='lines',
                    line=dict(color='green'),  
                    name="<i>variance(x)</i>")
    fig.append_trace(p7, row+1, col)
    
    p8 = go.Scatter(x=data_to_plotly(X_test), 
                    y=y_noise, 
                    showlegend=legend,
                    mode='lines',
                    line=dict(color='cyan'),
                    name="<i>noise(x)</i>")
    fig.append_trace(p8, row+1, col)
    
    row = 1
    col += 1

Tree: 0.0255 (error) = 0.0003 (bias^2)  + 0.0152 (var) + 0.0098 (noise)
Bagging(Tree): 0.0196 (error) = 0.0004 (bias^2)  + 0.0092 (var) + 0.0098 (noise)


In [7]:
fig['layout'].update(height=700, hovermode='closest')

for i in map(str, range(1, 5)):
    x = 'xaxis' + i
    y = 'yaxis' + i
    fig['layout'][x].update(showgrid=False, zeroline=False)
    fig['layout'][y].update(showgrid=False, zeroline=False)
                     
py.iplot(fig)

The draw time for this plot will be slow for all clients.






### References

T. Hastie, R. Tibshirani and J. Friedman, “The Elements of Statistical Learning”, Springer, 2009.

### License

Author: Gilles Louppe <g.louppe@gmail.com>

License: BSD 3 clause

