The usual maximum likelihood estimate of the covariance matrix can be regularized using shrinkage. Ledoit and Wolf proposed a closed-form formula for the asymptotically optimal shrinkage parameter (the one minimizing a mean squared error (MSE) criterion), which yields the Ledoit-Wolf covariance estimate.
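
As a rough illustration of what shrinkage does, the sketch below blends an empirical covariance with a scaled identity target, the convex-combination form used by shrinkage estimators such as Ledoit-Wolf. The helper name `shrink_covariance` and the value `alpha=0.3` are hypothetical and chosen only for illustration; the estimators discussed here pick the coefficient from the data.

```python
import numpy as np

def shrink_covariance(emp_cov, alpha):
    """Blend emp_cov with mu * I, where mu is the mean of its diagonal (hypothetical helper)."""
    n_features = emp_cov.shape[0]
    mu = np.trace(emp_cov) / n_features
    return (1.0 - alpha) * emp_cov + alpha * mu * np.eye(n_features)

rng = np.random.RandomState(0)
X = rng.normal(size=(10, 20))             # fewer samples than features
emp_cov = np.dot(X.T, X) / X.shape[0]     # maximum likelihood estimate for centered data
shrunk = shrink_covariance(emp_cov, alpha=0.3)  # alpha picked by hand, an assumption

# The empirical estimate is rank deficient here; the shrunk one is full rank.
print(np.linalg.matrix_rank(emp_cov), np.linalg.matrix_rank(shrunk))
```

Because the target shares the trace of the empirical estimate, the blend leaves the total variance unchanged while lifting the smallest eigenvalues away from zero.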

Chen et al. proposed an improvement on the Ledoit-Wolf shrinkage parameter, the Oracle Approximating Shrinkage (OAS) coefficient, which converges significantly better when the data are Gaussian.
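
For reference, the OAS coefficient has a closed form. The expression below follows Eq. (23) of [1] and should be treated as a sketch to check against the paper. The symbols are chosen here rather than taken from the text: $\hat{S}$ is the sample covariance, $n$ the number of samples and $p$ the number of features. The scikit-learn implementation may differ slightly from this expression.

$$
\hat{\rho}_{\mathrm{OAS}} = \min\left( \frac{(1 - 2/p)\,\mathrm{tr}(\hat{S}^2) + \mathrm{tr}^2(\hat{S})}{(n + 1 - 2/p)\left[\mathrm{tr}(\hat{S}^2) - \mathrm{tr}^2(\hat{S})/p\right]},\ 1 \right),
\qquad
\hat{\Sigma}_{\mathrm{OAS}} = (1 - \hat{\rho}_{\mathrm{OAS}})\,\hat{S} + \hat{\rho}_{\mathrm{OAS}}\,\frac{\mathrm{tr}(\hat{S})}{p}\, I
$$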

This example, inspired by Chen's publication [1], compares the estimated MSE of the Ledoit-Wolf (LW) and OAS methods on Gaussian-distributed data.

[1] “Shrinkage Algorithms for MMSE Covariance Estimation” Chen et al., IEEE Trans. on Sign. Proc., Volume 58, Issue 10, October 2010.
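
The estimated MSE mentioned above is, in this example, the squared Frobenius norm of the difference between an estimate and the true covariance, averaged over repeated draws. A minimal NumPy sketch of that quantity follows. The helper `squared_error` is hypothetical, and the small identity covariance is an assumption made only to keep the snippet short.

```python
import numpy as np

def squared_error(est_cov, true_cov):
    """Squared Frobenius norm of the estimation error (hypothetical helper)."""
    diff = est_cov - true_cov
    return np.sum(diff ** 2)

rng = np.random.RandomState(0)
n_features, n_samples, repeat = 5, 50, 200
true_cov = np.eye(n_features)  # assumed ground truth for this toy check

errors = np.zeros(repeat)
for j in range(repeat):
    X = rng.normal(size=(n_samples, n_features))
    emp_cov = np.dot(X.T, X) / n_samples  # estimate for centered data
    errors[j] = squared_error(emp_cov, true_cov)

# Monte Carlo estimate of the MSE and its spread over the repeats
print(errors.mean(), errors.std())
```

The mean over the repeats plays the role of the curve plotted later, and the standard deviation plays the role of its error bars.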

### Version

In [1]:
import sklearn
sklearn.__version__

'0.18'

### Imports

This tutorial imports [toeplitz](http://docs.scipy.org/doc/scipy-0.11.0/reference/generated/scipy.linalg.toeplitz.html#scipy.linalg.toeplitz), [cholesky](http://docs.scipy.org/doc/scipy-0.11.0/reference/generated/scipy.linalg.cholesky.html#scipy.linalg.cholesky), [LedoitWolf](http://scikit-learn.org/stable/modules/generated/sklearn.covariance.LedoitWolf.html#sklearn.covariance.LedoitWolf) and [OAS](http://scikit-learn.org/stable/modules/generated/sklearn.covariance.OAS.html#sklearn.covariance.OAS).

In [2]:
print(__doc__)

import plotly.plotly as py
import plotly.graph_objs as go

import numpy as np
from scipy.linalg import toeplitz, cholesky

from sklearn.covariance import LedoitWolf, OAS

Automatically created module for IPython interactive environment


### Calculations

In [3]:
np.random.seed(0)
n_features = 100
# simulation covariance matrix (AR(1) process)
r = 0.1
real_cov = toeplitz(r ** np.arange(n_features))
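# lower=True returns L with real_cov = L L^T, so the rows of Z L^T drawn below
# have covariance real_cov (the default upper factor U would give U U^T instead)
coloring_matrix = cholesky(real_cov, lower=True)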

n_samples_range = np.arange(6, 31, 1)
repeat = 100
lw_mse = np.zeros((n_samples_range.size, repeat))
oa_mse = np.zeros((n_samples_range.size, repeat))
lw_shrinkage = np.zeros((n_samples_range.size, repeat))
oa_shrinkage = np.zeros((n_samples_range.size, repeat))
for i, n_samples in enumerate(n_samples_range):
    for j in range(repeat):
        X = np.dot(
            np.random.normal(size=(n_samples, n_features)), coloring_matrix.T)

        lw = LedoitWolf(store_precision=False, assume_centered=True)
        lw.fit(X)
        lw_mse[i, j] = lw.error_norm(real_cov, scaling=False)
        lw_shrinkage[i, j] = lw.shrinkage_

        oa = OAS(store_precision=False, assume_centered=True)
        oa.fit(X)
        oa_mse[i, j] = oa.error_norm(real_cov, scaling=False)
        oa_shrinkage[i, j] = oa.shrinkage_
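
To see why multiplying white noise by a Cholesky factor gives data with the desired covariance, the following sketch repeats the construction on a small problem and compares the sample covariance with the target. It uses `np.linalg.cholesky`, which returns the lower-triangular factor, together with a small dimension and many samples. Those choices are assumptions for illustration, not the settings of the experiment above.

```python
import numpy as np

rng = np.random.RandomState(0)
n_features, n_samples, r = 5, 200000, 0.1

# AR(1) covariance: entry (i, j) equals r ** |i - j|
idx = np.arange(n_features)
real_cov = r ** np.abs(idx[:, None] - idx[None, :])

# Lower-triangular factor L with real_cov = L L^T
L = np.linalg.cholesky(real_cov)

# Rows of Z are white noise; rows of Z L^T have covariance L L^T
Z = rng.normal(size=(n_samples, n_features))
X = np.dot(Z, L.T)

emp_cov = np.dot(X.T, X) / n_samples
print(np.abs(emp_cov - real_cov).max())  # small when n_samples is large
```

With only 6 to 30 samples and 100 features, as in the experiment, the empirical covariance is far from the target, which is the regime where shrinkage helps.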

### Plot MSE

In [4]:
# Trace names avoid `OAS`, which would shadow the estimator class imported above.
lw_trace = go.Scatter(x=n_samples_range,
                      y=lw_mse.mean(1),
                      # symmetric error bars: one standard deviation over the repeats
                      error_y=dict(type='data', array=lw_mse.std(1), visible=True),
                      name='Ledoit-Wolf',
                      mode='lines',
                      line=dict(color='navy', width=2)
                     )
oas_trace = go.Scatter(x=n_samples_range,
                       y=oa_mse.mean(1),
                       error_y=dict(type='data', array=oa_mse.std(1), visible=True),
                       name='OAS',
                       mode='lines',
                       line=dict(color='#FF8C00', width=2)
                      )

data = [lw_trace, oas_trace]
layout = go.Layout(title="Comparison of covariance estimators",
                   yaxis=dict(title="Squared error"),
                   xaxis=dict(title="n_samples")
                  )

fig = go.Figure(data=data, layout=layout)

In [5]:
py.iplot(fig)

### Plot shrinkage coefficient

In [6]:

# Trace names avoid `OAS`, which would shadow the estimator class imported above.
lw_trace = go.Scatter(x=n_samples_range,
                      y=lw_shrinkage.mean(1),
                      # error bars use the spread of the shrinkage values, not of the MSE
                      error_y=dict(type='data', array=lw_shrinkage.std(1), visible=True),
                      name='Ledoit-Wolf',
                      mode='lines',
                      line=dict(color='navy', width=2)
                     )

oas_trace = go.Scatter(x=n_samples_range,
                       y=oa_shrinkage.mean(1),
                       error_y=dict(type='data', array=oa_shrinkage.std(1), visible=True),
                       name='OAS',
                       mode='lines',
                       line=dict(color='#FF8C00', width=2)
                      )

data = [lw_trace, oas_trace]
layout = go.Layout(title="Comparison of covariance estimators",
                   yaxis=dict(title="Shrinkage"),
                   xaxis=dict(title="n_samples")
                  )

fig = go.Figure(data=data, layout=layout)

In [7]:
py.iplot(fig)
