[MRG] Grammar fixes for documentation of covariance package. (#11025)
Daniel Gomez authored and TomDLT committed Apr 25, 2018
1 parent 481dac7 commit 65489ca
Showing 1 changed file with 26 additions and 27 deletions.
53 changes: 26 additions & 27 deletions doc/modules/covariance.rst
@@ -7,14 +7,13 @@ Covariance estimation
.. currentmodule:: sklearn.covariance


-Many statistical problems require at some point the estimation of a
+Many statistical problems require the estimation of a
population's covariance matrix, which can be seen as an estimation of
data set scatter plot shape. Most of the time, such an estimation has
to be done on a sample whose properties (size, structure, homogeneity)
-has a large influence on the estimation's quality. The
-`sklearn.covariance` package aims at providing tools affording
-an accurate estimation of a population's covariance matrix under
-various settings.
+have a large influence on the estimation's quality. The
+`sklearn.covariance` package provides tools for accurately estimating
+a population's covariance matrix under various settings.

We assume that the observations are independent and identically
distributed (i.i.d.).
@@ -24,22 +23,22 @@ Empirical covariance
====================

The covariance matrix of a data set is known to be well approximated
-with the classical *maximum likelihood estimator* (or "empirical
+by the classical *maximum likelihood estimator* (or "empirical
covariance"), provided the number of observations is large enough
compared to the number of features (the variables describing the
observations). More precisely, the Maximum Likelihood Estimator of a
-sample is an unbiased estimator of the corresponding population
+sample is an unbiased estimator of the corresponding population's
covariance matrix.

The empirical covariance matrix of a sample can be computed using the
:func:`empirical_covariance` function of the package, or by fitting an
:class:`EmpiricalCovariance` object to the data sample with the
-:meth:`EmpiricalCovariance.fit` method. Be careful that depending
-whether the data are centered or not, the result will be different, so
-one may want to use the ``assume_centered`` parameter accurately. More precisely
-if one uses ``assume_centered=False``, then the test set is supposed to have the
-same mean vector as the training set. If not so, both should be centered by the
-user, and ``assume_centered=True`` should be used.
+:meth:`EmpiricalCovariance.fit` method. Be careful that results depend
+on whether the data are centered, so one may want to use the
+``assume_centered`` parameter accurately. More precisely, if
+``assume_centered=False``, then the test set is supposed to have the
+same mean vector as the training set. If not, both should be centered
+by the user, and ``assume_centered=True`` should be used.
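
As an illustration of the ``assume_centered`` behaviour described in this hunk, the sketch below fits the estimator on synthetic data. The data, the random seed, and the variable names are assumptions made for the example; only :func:`empirical_covariance` and :class:`EmpiricalCovariance` come from the package.

```python
import numpy as np
from sklearn.covariance import EmpiricalCovariance, empirical_covariance

# Synthetic, deliberately non-centered data (illustrative assumption).
rng = np.random.RandomState(0)
true_cov = np.array([[2.0, 0.5], [0.5, 1.0]])
X = rng.multivariate_normal(mean=[3.0, -1.0], cov=true_cov, size=500)

# Default behaviour: the mean is estimated from the data and removed.
est = EmpiricalCovariance(assume_centered=False).fit(X)
cov_func = empirical_covariance(X, assume_centered=False)

# Centering by hand and declaring the data centered gives the same matrix.
X_centered = X - X.mean(axis=0)
est_centered = EmpiricalCovariance(assume_centered=True).fit(X_centered)

# Declaring non-centered data as centered folds the mean into the estimate.
est_wrong = EmpiricalCovariance(assume_centered=True).fit(X)

print(est.covariance_)
print(est_wrong.covariance_)
```

With ``assume_centered=True`` on data that are not actually centered, the estimate mixes the mean into the second moments, which is why the two printed matrices differ.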

.. topic:: Examples:

@@ -64,17 +63,17 @@ empirical covariance matrix cannot be inverted for numerical
reasons. To avoid such an inversion problem, a transformation of the
empirical covariance matrix has been introduced: the ``shrinkage``.

-In the scikit-learn, this transformation (with a user-defined shrinkage
+In scikit-learn, this transformation (with a user-defined shrinkage
coefficient) can be directly applied to a pre-computed covariance with
the :func:`shrunk_covariance` method. Also, a shrunk estimator of the
covariance can be fitted to data with a :class:`ShrunkCovariance` object
-and its :meth:`ShrunkCovariance.fit` method. Again, depending whether
-the data are centered or not, the result will be different, so one may
-want to use the ``assume_centered`` parameter accurately.
+and its :meth:`ShrunkCovariance.fit` method. Again, results depend on
+whether the data are centered, so one may want to use the
+``assume_centered`` parameter accurately.
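
A minimal sketch of the two routes mentioned above, applying :func:`shrunk_covariance` to a pre-computed matrix and fitting a :class:`ShrunkCovariance` object. The synthetic data and the shrinkage value of 0.1 are assumptions chosen for illustration; the manual formula is the convex combination of the empirical covariance with a scaled identity.

```python
import numpy as np
from sklearn.covariance import (
    ShrunkCovariance,
    empirical_covariance,
    shrunk_covariance,
)

# Few samples relative to features (illustrative assumption).
rng = np.random.RandomState(0)
X = rng.normal(size=(20, 5))
shrinkage = 0.1  # user-defined coefficient, arbitrary for this sketch

# Route 1: shrink a pre-computed empirical covariance.
emp_cov = empirical_covariance(X)
shrunk = shrunk_covariance(emp_cov, shrinkage=shrinkage)

# Route 2: fit the estimator object directly on the data.
est = ShrunkCovariance(shrinkage=shrinkage).fit(X)

# Manual check: (1 - a) * S + a * mu * I, with mu = trace(S) / n_features.
n_features = emp_cov.shape[0]
mu = np.trace(emp_cov) / n_features
manual = (1 - shrinkage) * emp_cov + shrinkage * mu * np.eye(n_features)

# Shrinking pulls the eigenvalues towards their mean.
print(np.linalg.cond(emp_cov), np.linalg.cond(shrunk))
```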


Mathematically, this shrinkage consists in reducing the ratio between the
-smallest and the largest eigenvalue of the empirical covariance matrix.
+smallest and the largest eigenvalues of the empirical covariance matrix.
It can be done by simply shifting every eigenvalue according to a given
offset, which is equivalent of finding the l2-penalized Maximum
Likelihood Estimator of the covariance matrix. In practice, shrinkage
@@ -95,7 +94,7 @@ bias/variance trade-off, and is discussed below.
Ledoit-Wolf shrinkage
---------------------

-In their 2004 paper [1]_, O. Ledoit and M. Wolf propose a formula so as
+In their 2004 paper [1]_, O. Ledoit and M. Wolf propose a formula
to compute the optimal shrinkage coefficient :math:`\alpha` that
minimizes the Mean Squared Error between the estimated and the real
covariance matrix.
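
For illustration, the shrinkage coefficient chosen by this formula can be inspected as sketched below. The synthetic data and variable names are assumptions; :class:`LedoitWolf` and :func:`ledoit_wolf` are the package's estimator and function for this method.

```python
import numpy as np
from sklearn.covariance import LedoitWolf, ledoit_wolf

# Synthetic data with few samples per feature (illustrative assumption).
rng = np.random.RandomState(0)
X = rng.normal(size=(30, 10))

# Estimator object: the coefficient is stored in ``shrinkage_`` after fitting.
lw = LedoitWolf().fit(X)

# Function form: returns the shrunk covariance and the coefficient.
lw_cov, lw_shrinkage = ledoit_wolf(X)

print(lw.shrinkage_)
```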
@@ -190,10 +189,10 @@ The matrix inverse of the covariance matrix, often called the precision
matrix, is proportional to the partial correlation matrix. It gives the
partial independence relationship. In other words, if two features are
independent conditionally on the others, the corresponding coefficient in
-the precision matrix will be zero. This is why it makes sense to estimate
-a sparse precision matrix: by learning independence relations from the
-data, the estimation of the covariance matrix is better conditioned. This
-is known as *covariance selection*.
+the precision matrix will be zero. This is why it makes sense to
+estimate a sparse precision matrix: the estimation of the covariance
+matrix is better conditioned by learning independence relations from
+the data. This is known as *covariance selection*.
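
A rough sketch of covariance selection on synthetic data follows. It assumes a recent scikit-learn release in which the estimator is named ``GraphicalLasso`` (older releases used ``GraphLasso``); the chain-structured ground truth and the value ``alpha=0.1`` are arbitrary choices for the example, not recommendations.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

# Ground-truth precision matrix with a chain structure: each feature is
# conditionally dependent only on its neighbours (illustrative assumption).
n_features = 4
true_precision = np.eye(n_features)
for i in range(n_features - 1):
    true_precision[i, i + 1] = true_precision[i + 1, i] = 0.4
true_cov = np.linalg.inv(true_precision)

rng = np.random.RandomState(0)
X = rng.multivariate_normal(np.zeros(n_features), true_cov, size=200)

# The l1 penalty ``alpha`` pushes small entries of the precision towards zero.
model = GraphicalLasso(alpha=0.1).fit(X)

# Entries close to zero suggest conditional independence between features.
print(np.round(model.precision_, 2))
```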

In the small-samples situation, in which ``n_samples`` is on the order
of ``n_features`` or smaller, sparse inverse covariance estimators tend to work
@@ -273,13 +272,13 @@ paper. It is the same algorithm as in the R ``glasso`` package.
Robust Covariance Estimation
============================

-Real data set are often subjects to measurement or recording
+Real data sets are often subject to measurement or recording
errors. Regular but uncommon observations may also appear for a variety
-of reason. Every observation which is very uncommon is called an
-outlier.
+of reasons. Observations which are very uncommon are called
+outliers.
The empirical covariance estimator and the shrunk covariance
estimators presented above are very sensitive to the presence of
-outlying observations in the data. Therefore, one should use robust
+outliers in the data. Therefore, one should use robust
covariance estimators to estimate the covariance of its real data
sets. Alternatively, robust covariance estimators can be used to
perform outlier detection and discard/downweight some observations