# Gaussian Graphical Model on a 4-Cycle
In this notebook we study a 4-dimensional Gaussian graphical model whose 
precision matrix corresponds to the **cycle graph**:

$$
1 \;-\; 2 \;-\; 3 \;-\; 4 \;-\; 1.
$$

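For a Gaussian vector, zeros in the **precision matrix** encode conditional independences: $\Theta_{ij} = 0$ exactly when coordinates $i$ and $j$ are conditionally independent given the remaining coordinates.  
We will: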

1. Define the precision matrix $ \Theta $
2. Compute the covariance matrix $ \Sigma = \Theta^{-1} $
3. Simulate a sample $X_1,\dots,X_N \sim \mathcal{N}(0, \Sigma)$
4. Standardize the margins to variance 1
5. Estimate covariance and precision matrices from data
6. Compute the maximum-likelihood estimate (MLE) of the precision matrix


## 1. Define the graph and precision matrix
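Consider the following precision matrix, whose zero entries $\Theta_{13} = \Theta_{24} = 0$ correspond to the two missing edges of the cycle: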

$$
\Theta =
\begin{pmatrix}
10 & 5 & 0 & 3 \\
5 & 10 & 5 & 0 \\
0 & 5 & 10 & 5 \\
3 & 0 & 5 & 10
\end{pmatrix}.
$$

We define the matrix $\Sigma$ as the inverse of $\Theta$, that is, $\Sigma = \Theta^{-1}$.

In [1]:
import numpy as np
import pandas as pd

d = 4
Theta = np.array([
    [10,  5,  0,  3],
    [ 5, 10,  5,  0],
    [ 0,  5, 10,  5],
    [ 3,  0,  5, 10]
])
Sigma = np.linalg.inv(Theta)
Sigma

array([[ 0.29411765, -0.26470588,  0.23529412, -0.20588235],
       [-0.26470588,  0.38823529, -0.31176471,  0.23529412],
       [ 0.23529412, -0.31176471,  0.38823529, -0.26470588],
       [-0.20588235,  0.23529412, -0.26470588,  0.29411765]])
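
As a quick sanity check, the sketch below verifies two properties that the construction relies on: that $\Theta$ is positive definite (so it is a valid precision matrix) and that its off-diagonal support matches the edges of the 4-cycle. The names `support` and `cycle_edges` are illustrative and not part of the original notebook.

```python
import numpy as np

Theta = np.array([
    [10,  5,  0,  3],
    [ 5, 10,  5,  0],
    [ 0,  5, 10,  5],
    [ 3,  0,  5, 10],
], dtype=float)
d = Theta.shape[0]

# A symmetric matrix is positive definite iff all its eigenvalues are positive.
eigenvalues = np.linalg.eigvalsh(Theta)
is_positive_definite = bool(np.all(eigenvalues > 0))

# Off-diagonal support of Theta, as a set of 1-based pairs (i, j) with i < j.
support = {(i + 1, j + 1)
           for i in range(d) for j in range(i + 1, d)
           if Theta[i, j] != 0}

# Edges of the cycle 1 - 2 - 3 - 4 - 1 (illustrative name).
cycle_edges = {(1, 2), (2, 3), (3, 4), (1, 4)}

print("Positive definite:", is_positive_definite)
print("Support matches the 4-cycle:", support == cycle_edges)
```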

## 2. Simulate Gaussian sample
We simulate a large sample from a centered multivariate normal distribution with covariance matrix $\Sigma$.


In [2]:
N = 10**6
mean = np.zeros(d)
X = np.random.multivariate_normal(mean, Sigma, size=N)
X[:5]

array([[ 0.62154872, -0.71801739,  0.17602042, -0.16783284],
       [ 0.54163047, -0.4116966 ,  0.96618277, -0.69574101],
       [ 0.00518286, -0.14638079,  0.30370709, -0.29053133],
       [-0.11780136,  0.03063679,  0.60636705, -0.36270261],
       [ 0.09513002, -0.29269288,  0.06902821,  0.4848796 ]])
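
For reproducibility, one option is to draw the sample from a seeded generator. The sketch below is an alternative to the cell above, not the code that produced the output shown: the seed `0` and the smaller sample size are arbitrary assumptions. It samples through the Cholesky factor $L$ of $\Sigma$, so that $X = Z L^\top$ has covariance $L L^\top = \Sigma$ when $Z$ has independent standard normal entries.

```python
import numpy as np

Theta = np.array([
    [10,  5,  0,  3],
    [ 5, 10,  5,  0],
    [ 0,  5, 10,  5],
    [ 3,  0,  5, 10],
], dtype=float)
Sigma = np.linalg.inv(Theta)
d = Sigma.shape[0]

rng = np.random.default_rng(0)   # assumed seed, chosen arbitrarily
N = 200_000                      # assumed sample size, smaller than in the notebook

# Lower-triangular factor with L @ L.T == Sigma.
L = np.linalg.cholesky(Sigma)

# Each row of Z is a standard normal vector; each row of X is N(0, Sigma).
Z = rng.standard_normal((N, d))
X = Z @ L.T

# Uncentered sample covariance, valid because the mean is known to be zero.
S = X.T @ X / N
print("Largest entrywise deviation from Sigma:", np.max(np.abs(S - Sigma)))
```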

## 3. Standardize margins
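We standardize the margins to variance 1 by dividing each coordinate by its true standard deviation $\sqrt{\Sigma_{ii}}$. Writing $D = \operatorname{diag}(\sqrt{\Sigma_{11}}, \dots, \sqrt{\Sigma_{44}})$, the rescaled vector has covariance $D^{-1} \Sigma D^{-1}$ and precision $D \Theta D$, so the zero pattern of the precision matrix is unchanged.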

In [3]:
std = np.sqrt(np.diag(Sigma))
X = X / std
np.var(X, axis=0)

array([0.9981154 , 0.99722568, 0.99894653, 0.99900666])
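
The estimates in the next section should be compared with the precision matrix of the standardized vector rather than with $\Theta$ itself. The following sketch computes that population target; the names `Sigma_std` and `Theta_std` are introduced here for illustration only.

```python
import numpy as np

Theta = np.array([
    [10,  5,  0,  3],
    [ 5, 10,  5,  0],
    [ 0,  5, 10,  5],
    [ 3,  0,  5, 10],
], dtype=float)
Sigma = np.linalg.inv(Theta)

# True standard deviations of the four margins.
std = np.sqrt(np.diag(Sigma))
D = np.diag(std)

# Covariance of X / std: the correlation matrix of Sigma.
Sigma_std = Sigma / np.outer(std, std)

# Precision of X / std: each entry Theta_ij is multiplied by std_i * std_j.
Theta_std = D @ Theta @ D

np.set_printoptions(precision=4, suppress=True)
print("Standardized covariance:\n", Sigma_std)
print("Standardized precision:\n", Theta_std)
```

Because each entry of $\Theta$ is only multiplied by a positive factor, the positions $(1,3)$ and $(2,4)$ remain zero after standardization.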

## 4. Estimate covariance and precision
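We compute the MLE of the covariance matrix

$$
S = \frac{1}{N} \sum_{i=1}^N X_i^\top X_i
$$

from our sample, treating each observation $X_i$ as a row vector, and then invert it to estimate the precision matrix. Since `np.cov` divides by $N-1$ by default, we rescale its output by $(N-1)/N$. `np.cov` also subtracts the sample mean, which makes a negligible difference here because the true mean is zero and $N$ is large.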

In [4]:
EstimSigma = np.cov(X, rowvar=False)
MLE_EstimSigma = ((N-1)/N) * EstimSigma
EstimTheta = np.linalg.inv(MLE_EstimSigma)
print("This is the MLE for the covariance matrix:\n"  , MLE_EstimSigma)
print("This is the estimate of the precision matrix computed as the inverse of the MLE covariance matrix:\n"  , EstimTheta)

This is the MLE for the covariance matrix:
 [[ 0.9981154  -0.78086859  0.69396025 -0.69857465]
 [-0.78086859  0.99722568 -0.80036266  0.69399612]
 [ 0.69396025 -0.80036266  0.99894653 -0.78150351]
 [-0.69857465  0.69399612 -0.78150351  0.99900666]]
This is the estimate of the precision matrix computed as the inverse of the MLE covariance matrix:
 [[ 2.94102317e+00  1.68903859e+00  2.90553067e-03  8.85488298e-01]
 [ 1.68903859e+00  3.87621230e+00  1.93214679e+00 -1.77406274e-04]
 [ 2.90553067e-03  1.93214679e+00  3.86238541e+00  1.68126521e+00]
 [ 8.85488298e-01 -1.77406274e-04  1.68126521e+00  2.93553345e+00]]
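
Because the data were rescaled, the entries of this estimate are not directly comparable to those of $\Theta$. Partial correlations, $\rho_{ij \mid \text{rest}} = -\Theta_{ij} / \sqrt{\Theta_{ii}\Theta_{jj}}$, are invariant to rescaling the margins, so they offer one way to compare the two. The sketch below applies this to the true $\Theta$ and to the estimate printed above (copied here, rounded to six decimals); the function name `partial_correlations` is illustrative.

```python
import numpy as np

def partial_correlations(precision):
    """Return the matrix with entries -P_ij / sqrt(P_ii * P_jj) and unit diagonal."""
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor

Theta = np.array([
    [10,  5,  0,  3],
    [ 5, 10,  5,  0],
    [ 0,  5, 10,  5],
    [ 3,  0,  5, 10],
], dtype=float)

# Estimated precision matrix from the output above, rounded to six decimals.
EstimTheta = np.array([
    [2.941023,  1.689039,  0.002906,  0.885488],
    [1.689039,  3.876212,  1.932147, -0.000177],
    [0.002906,  1.932147,  3.862385,  1.681265],
    [0.885488, -0.000177,  1.681265,  2.935533],
])

pcor_true = partial_correlations(Theta)
pcor_est = partial_correlations(EstimTheta)

np.set_printoptions(precision=4, suppress=True)
print("True partial correlations:\n", pcor_true)
print("Estimated partial correlations:\n", pcor_est)
print("Largest entrywise gap:", np.max(np.abs(pcor_est - pcor_true)))
```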


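## 5. Maximum-likelihood estimation
Up to additive and multiplicative constants, the Gaussian log-likelihood as a function of the precision matrix is $\ell(\Theta) = \log\det\Theta - \operatorname{tr}(S\Theta)$. We maximize it numerically over symmetric matrices, parametrized by their upper-triangular entries, starting from a random positive definite matrix.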

In [6]:
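from scipy.optimize import minimize

def construct_symmetric_matrix(theta_vec):
    """Rebuild a symmetric d x d matrix from its upper-triangular entries."""
    Theta_mat = np.zeros((d, d))
    idx = np.triu_indices(d)
    Theta_mat[idx] = theta_vec
    # Mirror the upper triangle; subtract the diagonal so it is not counted twice.
    Theta_mat = Theta_mat + Theta_mat.T - np.diag(np.diag(Theta_mat))
    return Theta_mat

def log_likelihood(theta_vec, S):
    """Gaussian log-likelihood up to constants: log det(Theta) - tr(S Theta)."""
    Theta_mat = construct_symmetric_matrix(theta_vec)
    # slogdet returns the sign and log|det| without forming the determinant itself.
    # Note: a positive determinant is necessary, but not sufficient, for
    # positive definiteness.
    sign, logdet = np.linalg.slogdet(Theta_mat)
    if sign <= 0:
        return -np.inf
    return logdet - np.trace(S @ Theta_mat)

# Random positive definite starting point.
A = np.random.randn(d, d)
Theta0 = A @ A.T + d * np.eye(d)
theta0_vec = Theta0[np.triu_indices(d)]

# Minimize the negative log-likelihood, with the MLE covariance matrix as S
# so that the result is comparable to EstimTheta.
res = minimize(lambda t: -log_likelihood(t, MLE_EstimSigma),
               theta0_vec)

Theta_MLE = construct_symmetric_matrix(res.x)
print("This is the MLE for the precision matrix :\n"  , Theta_MLE)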

This is the MLE for the precision matrix :
 [[ 2.94102548e+00  1.68903111e+00  2.92474787e-03  8.85516879e-01]
 [ 1.68903111e+00  3.87617846e+00  1.93216640e+00 -1.41448029e-04]
 [ 2.92474787e-03  1.93216640e+00  3.86237952e+00  1.68124247e+00]
 [ 8.85516879e-01 -1.41448029e-04  1.68124247e+00  2.93549596e+00]]
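
One way to double-check an optimizer result of this kind is through the first-order condition. The gradient of $\log\det\Theta - \operatorname{tr}(S\Theta)$ with respect to $\Theta$ is $\Theta^{-1} - S$, which vanishes at $\Theta = S^{-1}$, and the objective is concave on positive definite matrices. The sketch below checks both facts numerically. As a stand-in for the sample covariance it uses the population correlation matrix, and the seed and perturbation scale are arbitrary assumptions.

```python
import numpy as np

def log_likelihood(Theta_mat, S):
    """log det(Theta) - tr(S Theta), or -inf if Theta is not positive definite."""
    try:
        chol = np.linalg.cholesky(Theta_mat)  # fails unless Theta is positive definite
    except np.linalg.LinAlgError:
        return -np.inf
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return log_det - np.trace(S @ Theta_mat)

Theta = np.array([
    [10,  5,  0,  3],
    [ 5, 10,  5,  0],
    [ 0,  5, 10,  5],
    [ 3,  0,  5, 10],
], dtype=float)
Sigma = np.linalg.inv(Theta)
std = np.sqrt(np.diag(Sigma))

# Stand-in for the sample covariance: the population correlation matrix.
S = Sigma / np.outer(std, std)
S = (S + S.T) / 2  # remove round-off asymmetry

# Closed-form maximizer and the gradient evaluated there.
Theta_hat = np.linalg.inv(S)
Theta_hat = (Theta_hat + Theta_hat.T) / 2
gradient = np.linalg.inv(Theta_hat) - S
best = log_likelihood(Theta_hat, S)

# Small random symmetric perturbations should never increase the objective.
rng = np.random.default_rng(0)  # assumed seed
perturbed_values = []
for _ in range(100):
    B = rng.standard_normal((4, 4))
    E = 0.01 * (B + B.T) / 2
    perturbed_values.append(log_likelihood(Theta_hat + E, S))

print("max |gradient| at S^{-1}:", np.max(np.abs(gradient)))
print("S^{-1} beats all perturbations:", all(v <= best for v in perturbed_values))
```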



## Conclusion

- We simulated a Gaussian vector whose precision matrix is consistent with the 4-cycle graph.

- The precision matrix was first estimated by inverting the MLE of the covariance matrix.

- We then computed the MLE of the precision matrix by maximizing the log-likelihood (equivalently, minimizing the negative log-likelihood).

- Both methods yielded the same result up to numerical tolerance (the printed estimates agree entrywise to within about $4 \times 10^{-5}$), confirming that the MLE of the precision matrix coincides with the inverse of the MLE covariance matrix.

- However, neither method produced exact zeros for $\Theta_{13}$ and $\Theta_{24}$: the estimates are small (of order $10^{-3}$ and $10^{-4}$, respectively) but nonzero. Adding a lasso ($\ell_1$) penalty with a suitable regularization parameter could address this limitation.
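
The lasso remark can be made concrete in the notation of this notebook. The display below is a sketch: $\lambda \ge 0$ denotes a regularization parameter, a symbol not used elsewhere here, and penalizing only the off-diagonal entries is one common convention rather than the only possible choice.

$$
\hat\Theta_\lambda = \arg\max_{\Theta \succ 0}
\Big\{ \log\det\Theta - \operatorname{tr}(S\Theta) - \lambda \sum_{i \neq j} |\Theta_{ij}| \Big\}.
$$

For $\lambda = 0$ this reduces to the unpenalized MLE $S^{-1}$. For sufficiently large $\lambda$, the $\ell_1$ penalty can set small off-diagonal entries, such as those at positions $(1,3)$ and $(2,4)$, to exactly zero.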