### Sparse Covariance Estimation for Gaussian Random Vectors

Reference: Section 7.1.1, Boyd & Vandenberghe "Convex Optimization"

The problem is to find a sparse estimate $\Theta$ of the inverse covariance matrix (also known as the information or concentration matrix) of a zero-mean Gaussian random vector $X$ with covariance matrix $R$. We are given $m$ independent samples $x_1,\ldots,x_m$, from which we form the empirical covariance matrix $S = \frac{1}{m}\sum_{k=1}^m x_kx_k^T$. The maximum likelihood estimate of the covariance matrix is obtained by maximizing the log-likelihood of the multivariate Gaussian pdf:

\begin{equation}
\log p(X) = (-mn/2) \log 2\pi - (m/2) \log \det R - (1/2) \sum_{k = 1}^m x_k^TR^{-1}x_k 
\end{equation}

\begin{equation}
= (-mn/2) \log 2\pi - (m/2) \log \det R - (m/2) \text{tr}(R^{-1}S)
\end{equation}

The log-likelihood is not concave in $R$, but it is concave in $\Theta = R^{-1}$. Dropping the constant term and the positive factor $m/2$, the ML estimation problem becomes the convex problem:

\begin{equation*}
  \begin{aligned}
    &\text{minimize} && - \log \det \Theta + \text{tr}(S\Theta)  
  \end{aligned}
\end{equation*}
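The trace form of the log-likelihood rests on the identity $\sum_{k=1}^m x_k^T\Theta x_k = m\,\text{tr}(\Theta S)$. The following is a minimal numerical sanity check of that identity, not part of the original notebook; the sizes and the randomly generated `X` and `Theta` are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 50, 4  # illustrative sizes, not the ones used in the notebook

# Hypothetical data: m samples of an n-dimensional zero-mean vector (rows of X).
X = rng.standard_normal((m, n))
S = X.T @ X / m  # empirical covariance, S = (1/m) sum_k x_k x_k^T

# Any symmetric positive definite matrix can play the role of Theta = R^{-1}.
G = rng.standard_normal((n, n))
Theta = G @ G.T + n * np.eye(n)

# Left-hand side: sum of quadratic forms over the samples.
quad_sum = sum(x @ Theta @ x for x in X)
# Right-hand side: trace form used in the log-likelihood.
trace_form = m * np.trace(Theta @ S)

print(np.isclose(quad_sum, trace_form))
```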

To encourage sparse solutions, we add an $\ell_1$ regularization term on the off-diagonal entries of the information matrix, weighted by $\lambda > 0$, to get the final problem:

\begin{equation*}
  \begin{aligned}
    &\text{minimize} && \lambda\sum_{i \neq j} \left| \Theta_{ij} \right| - \log \det \Theta + \text{tr}(S\Theta)  
  \end{aligned}
\end{equation*}

We also require that $\Theta$ be symmetric positive semidefinite; the $\log \det$ term further restricts it to be positive definite.
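To make the regularized objective concrete, here is a small NumPy sketch that evaluates it for a given symmetric $\Theta$. The function name `sparse_inv_cov_objective` and the tiny $2 \times 2$ inputs are hypothetical, chosen only for illustration, and the function assumes its matrix arguments are symmetric.

```python
import numpy as np

def sparse_inv_cov_objective(Theta, S, lam):
    """Evaluate lam * sum_{i != j} |Theta_ij| - log det(Theta) + tr(S Theta).

    Assumes Theta and S are symmetric. Returns +inf when Theta is not
    positive definite, i.e. when it lies outside the domain of log det.
    """
    eigvals = np.linalg.eigvalsh(Theta)
    if eigvals.min() <= 0:
        return np.inf
    log_det = np.log(eigvals).sum()
    # l1 norm of the off-diagonal entries only.
    off_diag_l1 = np.abs(Theta).sum() - np.abs(np.diag(Theta)).sum()
    return lam * off_diag_l1 - log_det + np.trace(S @ Theta)

S = np.eye(2)
Theta = np.array([[2.0, 1.0], [1.0, 2.0]])
value = sparse_inv_cov_objective(Theta, S, lam=0.5)
print(value)  # equals 0.5 * 2 - log(3) + 4
```

Returning `np.inf` outside the positive definite cone mirrors the usual extended-value convention for convex functions.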



In [80]:
import cvxpy as cp
import numpy as np
import scipy.sparse as sps

# Problem data

np.random.seed(0)

m = 10      # number of samples
n = 20      # dimension of the random vector
lam = 0.1   # l1 regularization weight

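A = sps.rand(n,n, 0.01) # Random sparse n-by-n matrix with density 0.01
A = np.asarray(A.T.dot(A).todense() + 0.1*np.eye(n)) # True inverse covariance: sparse and positive definite
L = np.linalg.cholesky(np.linalg.inv(A)) # Cholesky factor of the true covariance A^-1 (dense in general)
X = np.random.randn(m,n).dot(L.T) # Draw m samples with covariance A^-1
S = X.T.dot(X)/m # Empirical covariance matrix
W = np.ones((n,n)) - np.eye(n) # Mask selecting the off-diagonal entries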

Theta = cp.Variable(n, n)

Why don't we need to write $\texttt{Theta = cp.Semidef(n, n)}$? The objective is unchanged when $\Theta$ is replaced by $\Theta^T$ (recall that $S$ is symmetric). So for any asymmetric candidate $B$, the transpose $B^T$ attains the same objective value, and by convexity of the objective the symmetric matrix $C = (B+B^T)/2$ attains a value no larger than the average of the two, that is, no larger than that of $B$. We can therefore expect a symmetric solution.

The solution must also be positive definite, because the domain of the $\log \det$ function is the set of positive definite matrices.

In [81]:
# Problem construction
prob = None
opt_val = None

prob = cp.Problem(cp.Minimize(
        lam*cp.norm1(cp.mul_elemwise(W,Theta)) +
        cp.sum_entries(cp.mul_elemwise(S,Theta)) -
        cp.log_det(Theta)))


# For debugging individual problems:
if __name__ == "__main__":
    prob.solve()
    print("status:", prob.status)
    print("optimal value:", prob.value)
    print("true optimal value:", opt_val)

('status:', 'optimal')
('optimal value:', 32.21413536302113)
('true optimal value:', None)
