In [1]:
from naginterfaces.library import opt, lapackeig, correg
from naginterfaces.base import utils
import numpy as np

# Nearest correlation matrix using Semi-Definite Programming (SDP)
## Correct Rendering of this notebook

This notebook makes use of the `latex_envs` Jupyter extension for equations and references. If the LaTeX does not render properly in your local installation of Jupyter, you may not have installed this extension. Details are at https://jupyter-contrib-nbextensions.readthedocs.io/en/latest/nbextensions/latex_envs/README.html

## Introduction

We start with a matrix $G$ that is not quite a correlation matrix:
\begin{equation*}
G =
\begin{bmatrix}
1 & -1 & 0 & 0\\
-1 & 1 & -1 & 0\\
0 & -1 & 1 & -1\\
0 & 0 & -1 & 1
\end{bmatrix}
\end{equation*}
It is symmetric with ones on the diagonal, but look at its eigenvalues:

In [2]:
G = np.array([[1, -1,  0, 0],
              [-1, 1, -1, 0],
              [0, -1, 1, -1],
              [0, 0, -1, 1]],
             dtype=float)
n = G.shape[0]

# Compute the eigenvalues of G
itype = 1
jobz = 'N'
uplo = 'U'
A = G.copy()
B = np.identity(n, dtype=float)
_, _, sv = lapackeig.dsygv(itype, jobz, uplo, A, B)
print('Minimum eigenvalue:', min(sv))

Minimum eigenvalue: -0.6180339887498952


We have a negative eigenvalue! $G$ is therefore not positive semidefinite, so it is not a correlation matrix.
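
As a cross-check that does not depend on the NAG Library, the same eigenvalue test can be sketched with NumPy. This is only an illustrative alternative to the `lapackeig.dsygv` call above; the names `eigenvalues` and `min_eig` are introduced here for the example.

```python
import numpy as np

# Same matrix G as in the cell above.
G = np.array([[1, -1, 0, 0],
              [-1, 1, -1, 0],
              [0, -1, 1, -1],
              [0, 0, -1, 1]], dtype=float)

# eigvalsh assumes a symmetric matrix and returns the eigenvalues
# in ascending order, so the first entry is the smallest one.
eigenvalues = np.linalg.eigvalsh(G)
min_eig = eigenvalues[0]
print('Minimum eigenvalue (NumPy):', min_eig)
```

A negative `min_eig` confirms that $G$ is not positive semidefinite.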

### Perturbation of $G$
We can try adding a perturbation to $G$ to obtain $G^*$, the closest correlation matrix to $G$ in the Frobenius-norm sense:
\begin{equation*}
\begin{aligned}
\min_{x} \quad & \|G^*-G\|_F\\
\text{s.t.} \quad & G^* \succeq 0
\end{aligned}
\end{equation*}
where
\begin{equation*}
G^* = 
\begin{bmatrix}
1 & -1 + x_1 & 0 & 0\\
-1 + x_1 & 1 & -1 + x_2 & 0\\
0 & -1 + x_2 & 1 & -1 + x_3\\
0 & 0 & -1 + x_3 & 1
\end{bmatrix}
\end{equation*}
This fits the definition of an SDP exactly. To define it, we first compute the number of variables:

In [3]:
# Number of variables: number of nonzeros above the diagonal
nvar = 0
for i in range(0, n):
    for j in range(i+1, n):
        if G[i, j] != 0.0:
            nvar += 1

# initialize the problem handle
handle = opt.handle_init(nvar)

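Now set up the objective function. Each correction $x_i$ appears twice in the symmetric matrix $G^*-G$, so the squared Frobenius norm of $G^*-G$ is, up to a constant factor of 2, simply: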
\begin{equation*}
\sum_i x_i^2
\end{equation*}

In [4]:
# The variables are stored as a vector, so we just minimize the
# sum of squares of the corrections: H is the identity matrix, c = 0.
h = np.ones(nvar, dtype=float)
irowh = np.arange(nvar, dtype=int) + 1
icolh = np.arange(nvar, dtype=int) + 1
opt.handle_set_quadobj(handle, idxc=None, c=None, irowh=irowh, icolh=icolh, h=h)
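
To see why an identity $H$ is enough, the sketch below compares the squared Frobenius norm of a symmetric perturbation with the quadratic form. It assumes the solver's quadratic objective has the form $\frac{1}{2}x^T H x + c^T x$, and the trial vector `x` and the list `positions` are arbitrary values chosen for illustration.

```python
import numpy as np

n = 4
# Positions (0-based) of the off-diagonal entries that are allowed to move.
positions = [(0, 1), (1, 2), (2, 3)]
x = np.array([0.3, 0.5, 0.2])  # arbitrary corrections, for illustration only

# Symmetric perturbation D = G* - G: each x_k appears at (i, j) and (j, i).
D = np.zeros((n, n))
for (i, j), xk in zip(positions, x):
    D[i, j] = xk
    D[j, i] = xk

frob_sq = np.linalg.norm(D, 'fro')**2
# Assumed objective convention: (1/2) x^T H x with H = I and c = 0.
half_quadratic = 0.5 * x @ np.eye(len(x)) @ x
print(frob_sq, 4.0 * half_quadratic)
```

The two printed values agree, so the quadratic objective and the squared Frobenius norm differ only by a constant factor and share the same minimizer.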

Now we need to define the constraint $G^* \succeq 0$. This can be done with a linear matrix inequality:
\begin{equation*}
x_1 \begin{bmatrix}
&1&&\\
1&&&\\
&&&\\
&&&
\end{bmatrix}
+ x_2 \begin{bmatrix}
&&&\\
&&1&\\
&1&&\\
&&&
\end{bmatrix}
+ x_3 \begin{bmatrix}
&&&\\
&&&\\
&&&1\\
&&1&
\end{bmatrix}
+ G \succeq 0
\end{equation*}
As we only deal with symmetric matrices, only the upper triangular part needs to be defined.

In [5]:
# Positive semidefinite constraint: linear matrix inequality
# Number of nonzeros in the sum: the full upper triangle for G
# plus 1 for each E_ij
nnzasum = n*(n+1)//2 + nvar

# Number of nonzeros in each matrix of the linear combination G + x_ij E_ij
nnza = np.empty(nvar+1, dtype=int)
nnza[0] = n*(n+1)//2
nnza[1:nvar+1] = 1

# Store the matrix inequality in the 3 sparse arrays
irowa = np.empty(nnzasum, dtype=int)
icola = np.empty(nnzasum, dtype=int)
a =  np.empty(nnzasum, dtype=float)
# Copy the upper triangular part of -G into A_0
idx = 0
for i in range(n):
    for j in range(i, n):
        irowa[idx] = i + 1
        icola[idx] = j + 1
        a[idx] = -G[i, j]
        idx += 1
# E_ij has one nonzero
for i in range(n):
    for j in range(i+1, n):
        if (G[i,j] != 0):
            irowa[idx] = i + 1
            icola[idx] = j + 1
            a[idx] = 1.0
            idx += 1

dima = n
idblk = opt.handle_set_linmatineq(handle, dima, nnza, irowa, icola, a, blksizea=None, idblk=0)
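
The coordinate storage above can be checked by expanding it back into dense matrices. The sketch below is NumPy only and assumes the constraint convention $\sum_k x_k A_k - A_0 \succeq 0$ with $A_0 = -G$, which is how the sign flip in the cell above reads; the helper `dense_from_triplets` and the trial point `x` are hypothetical names introduced for this example.

```python
import numpy as np

def dense_from_triplets(n, rows, cols, vals):
    """Build a dense symmetric matrix from 1-based upper-triangle triplets."""
    M = np.zeros((n, n))
    for r, c, v in zip(rows, cols, vals):
        M[r - 1, c - 1] = v
        M[c - 1, r - 1] = v
    return M

G = np.array([[1, -1, 0, 0],
              [-1, 1, -1, 0],
              [0, -1, 1, -1],
              [0, 0, -1, 1]], dtype=float)
n = G.shape[0]

# A_0 = -G, stored through its full upper triangle (1-based indices).
rows0, cols0, vals0 = [], [], []
for i in range(n):
    for j in range(i, n):
        rows0.append(i + 1)
        cols0.append(j + 1)
        vals0.append(-G[i, j])
A0 = dense_from_triplets(n, rows0, cols0, vals0)

# One matrix A_k = E_ij per nonzero above the diagonal of G.
A = [dense_from_triplets(n, [i + 1], [j + 1], [1.0])
     for i in range(n) for j in range(i + 1, n) if G[i, j] != 0.0]

x = np.array([0.1, 0.2, 0.3])  # arbitrary trial point, for illustration only
lmi = sum(xk * Ak for xk, Ak in zip(x, A)) - A0
print(lmi)
```

The printed matrix is $G$ with the three corrections added at the nonzero off-diagonal positions, which is the matrix the solver requires to be positive semidefinite.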

The problem is now fully defined, so we can pass it to the solver.

In [6]:
# I/O
iom = utils.FileObjManager(locus_in_output=False)

# Set optional argument
for option in ['Print Options = No',
               'Initial X = Automatic']:
    opt.handle_opt_set(handle, option)
    
x = np.empty(nvar)
inform = 0
x, _, _, _, rinfo, stats, _ = opt.handle_solve_pennon(handle, x, inform, u=None, uc=None, ua=None, io_manager=iom)

 E04SV, NLP-SDP Solver (Pennon)
 ------------------------------
 Number of variables             3                 [eliminated            0]
                            simple  linear  nonlin
 (Standard) inequalities         0       0       0
 (Standard) equalities                   0       0
 Matrix inequalities                     1       0 [dense    1, sparse    0]
                                                   [max dimension         4]

 --------------------------------------------------------------
  it|  objective |  optim  |   feas  |  compl  | pen min |inner
 --------------------------------------------------------------
   0  0.00000E+00  0.00E+00  6.19E-01  6.63E+00  1.00E+00   0
   1  4.12017E-01  6.38E-04  0.00E+00  1.44E+00  1.00E+00   5
   2  3.29642E-01  7.76E-04  0.00E+00  4.96E-01  4.65E-01   2
   3  2.65315E-01  1.02E-04  0.00E+00  1.55E-01  2.16E-01   3
   4  2.33229E-01  1.03E-03  0.00E+00  4.71E-02  1.01E-01   3
   5  2.19082E-01  2.22E-03  0.00E+00  1.46E-02  

In [7]:
# Form the nearest correlation matrix
G_star = np.zeros((n,n))
idx = 0
for i in range(n):
    for j in range(i+1, n):
        if G[i,j] != 0:
            G_star[i,j] = G[i,j] + x[idx]
            G_star[j,i] = G_star[i,j]
            idx += 1
    G_star[i,i] = 1.0
print('The Nearest correlation matrix to G is')
print(G_star)

The Nearest correlation matrix to G is
[[ 1.         -0.68232776  0.          0.        ]
 [-0.68232776  1.         -0.53442871  0.        ]
 [ 0.         -0.53442871  1.         -0.68232776]
 [ 0.          0.         -0.68232776  1.        ]]


Let's check its eigenvalues now:

In [8]:
# Compute the eigenvalues of G_star
itype = 1
jobz = 'N'
uplo = 'U'
A = G_star.copy()
B = np.identity(n, dtype=float)
_, _, sv = lapackeig.dsygv(itype, jobz, uplo, A, B)
print('Minimum eigenvalue:', min(sv))

Minimum eigenvalue: 7.946753527371534e-08
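
The three properties of a correlation matrix (symmetry, unit diagonal, positive semidefiniteness) can be bundled into one check. The sketch below is a hypothetical helper, not part of the NAG Library; the tolerance `tol=1e-6` is an assumption chosen to absorb the rounding in the printed values, and `G_star` is copied from the output above.

```python
import numpy as np

def is_correlation_matrix(M, tol=1e-6):
    """Check symmetry, unit diagonal and positive semidefiniteness up to tol."""
    M = np.asarray(M, dtype=float)
    symmetric = np.allclose(M, M.T, atol=tol)
    unit_diagonal = np.allclose(np.diag(M), 1.0, atol=tol)
    psd = np.linalg.eigvalsh(M).min() >= -tol
    return bool(symmetric and unit_diagonal and psd)

G = np.array([[1, -1, 0, 0],
              [-1, 1, -1, 0],
              [0, -1, 1, -1],
              [0, 0, -1, 1]], dtype=float)

# Values copied from the printed SDP result (rounded to 8 decimals).
G_star = np.array([[1.0, -0.68232776, 0.0, 0.0],
                   [-0.68232776, 1.0, -0.53442871, 0.0],
                   [0.0, -0.53442871, 1.0, -0.68232776],
                   [0.0, 0.0, -0.68232776, 1.0]])

print('G is a correlation matrix:     ', is_correlation_matrix(G))
print('G_star is a correlation matrix:', is_correlation_matrix(G_star))
```

A small tolerance matters here because the minimum eigenvalue of the solution sits on the boundary of the feasible set, so it is zero only up to solver accuracy.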


## There exists a dedicated solver!

The problem is common enough that a dedicated solver was developed in the NAG Library. Let's try it!

In [9]:
G_star, _, _, _ = correg.corrmat_nearest(G)
print(G_star)

[[ 1.         -0.8084125   0.1915875   0.10677505]
 [-0.8084125   1.         -0.65623269  0.1915875 ]
 [ 0.1915875  -0.65623269  1.         -0.8084125 ]
 [ 0.10677505  0.1915875  -0.8084125   1.        ]]
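
One way to compare this result with the SDP solution is through the Frobenius distance to $G$. The sketch below uses only NumPy and the rounded values printed in this notebook; the names `G_sdp` and `G_nearest` are introduced here for illustration.

```python
import numpy as np

G = np.array([[1, -1, 0, 0],
              [-1, 1, -1, 0],
              [0, -1, 1, -1],
              [0, 0, -1, 1]], dtype=float)

# Printed result of the SDP formulation (rounded).
G_sdp = np.array([[1.0, -0.68232776, 0.0, 0.0],
                  [-0.68232776, 1.0, -0.53442871, 0.0],
                  [0.0, -0.53442871, 1.0, -0.68232776],
                  [0.0, 0.0, -0.68232776, 1.0]])

# Printed result of correg.corrmat_nearest (rounded).
G_nearest = np.array([[1.0, -0.8084125, 0.1915875, 0.10677505],
                      [-0.8084125, 1.0, -0.65623269, 0.1915875],
                      [0.1915875, -0.65623269, 1.0, -0.8084125],
                      [0.10677505, 0.1915875, -0.8084125, 1.0]])

dist_sdp = np.linalg.norm(G_sdp - G, 'fro')
dist_nearest = np.linalg.norm(G_nearest - G, 'fro')
print('Distance of the SDP result:      ', dist_sdp)
print('Distance of the dedicated solver:', dist_nearest)
```

The dedicated solver ends up closer to $G$ because it is free to change every off-diagonal entry, whereas the SDP formulation only adjusts three of them.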


The sparsity pattern, however, is not preserved. Another solver allows you to fix some values:

In [10]:
# G_star[i, j] is fixed to G[i,j] if H[i,j]==1
H = np.array([[0, 0, 1, 1],
              [0, 0, 0, 1],
              [1, 0, 0, 0],
              [1, 1, 0, 0]])
alpha = 0.0
G_star, _, _ = correg.corrmat_fixed(G, alpha, H, 4)
print(G_star)

[[ 1.         -0.6823278   0.          0.        ]
 [-0.6823278   1.         -0.53442877  0.        ]
 [ 0.         -0.53442877  1.         -0.6823278 ]
 [ 0.          0.         -0.6823278   1.        ]]
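
For larger matrices the mask `H` need not be typed by hand. As a minimal sketch, assuming the goal is to fix exactly the off-diagonal zeros of $G$, it can be derived from the sparsity pattern with NumPy; `H_auto` is a hypothetical name used only in this example.

```python
import numpy as np

G = np.array([[1, -1, 0, 0],
              [-1, 1, -1, 0],
              [0, -1, 1, -1],
              [0, 0, -1, 1]], dtype=float)

# Mark every off-diagonal entry that is zero in G as fixed (1);
# the diagonal and the nonzero entries stay free (0).
off_diagonal = ~np.eye(G.shape[0], dtype=bool)
H_auto = ((G == 0.0) & off_diagonal).astype(int)
print(H_auto)
```

The resulting mask matches the hand-written `H` above and is symmetric by construction, because $G$ itself is symmetric.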


In [2]:
help(opt.handle_solve_pennon)

Help on function handle_solve_pennon in module naginterfaces.library.opt:

handle_solve_pennon(handle, x, inform, u=None, uc=None, ua=None, io_manager=None)
    Run the Pennon solver on a compatible problem initialized by
    ``handle_init`` and defined by other functions from the suite, such
    as, semidefinite programming (SDP) and SDP with bilinear matrix
    inequalities (BMI).
    
    Note: this function uses optional algorithmic parameters, see also:
    ``handle_opt_set``, ``handle_opt_get``.
    
    For full information please refer to the NAG Library document for
    e04sv
    
    https://www.nag.com/numeric/nl/nagdoc_27/flhtml/e04/e04svf.html
    
    Parameters
    ----------
    handle : Handle
        The handle to the problem. It needs to be initialized by
        ``handle_init`` and must not be changed before the call to
        ``handle_solve_pennon``.
    
    x : float, array-like, shape (nvar)
        If 'Initial X' = 'USER' (the default), x^0, the initial estima