In [1]:
import warnings; warnings.filterwarnings('ignore')
import numpy as np
import oxyba as ox
from importlib import reload; reload(ox);

### Create Example Matrix
First, create an ill-conditioned correlation matrix.
Such a matrix might have been set manually by experts.

In [2]:
R = ox.illcond_corrmat(4, random_state=2)
print("Ill-conditioned Correlation Matrix")
print(R.round(3))

Ill-conditioned Correlation Matrix
[[ 1.    -0.948  0.099 -0.129]
 [-0.948  1.    -0.591  0.239]
 [ 0.099 -0.591  1.     0.058]
 [-0.129  0.239  0.058  1.   ]]


In [3]:
# Cholesky factorization only succeeds for positive definite matrices
try:
    np.linalg.cholesky(R)
except np.linalg.LinAlgError:
    print("Matrix is not positive definite")

Matrix is not positive definite


In [4]:
print("Det(R) is negative: ", np.linalg.det(R))

Det(R) is negative:  -0.1660653430570136


In [5]:
print("One or more eigenvalues are negative")
print(np.linalg.eigvals(R))

One or more eigenvalues are negative
[ 2.20386329 -0.08662667  0.81350395  1.06925943]
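
The three checks above (Cholesky, determinant, eigenvalues) can be folded into one helper. The following is a minimal sketch, not part of `oxyba`: the function name `is_valid_corrmat` and the tolerance `tol` are assumptions made for illustration, and the matrix is copied from the rounded printout above.

```python
import numpy as np

def is_valid_corrmat(M, tol=1e-8):
    """Return True if M is symmetric, has a unit diagonal, and is PSD within tol."""
    M = np.asarray(M, dtype=float)
    symmetric = np.allclose(M, M.T, atol=tol)
    unit_diag = np.allclose(np.diag(M), 1.0, atol=tol)
    # eigvalsh assumes a symmetric matrix and returns real eigenvalues, ascending
    min_eig = np.linalg.eigvalsh(M)[0]
    return bool(symmetric and unit_diag and min_eig >= -tol)

# 4x4 example matrix, taken from the rounded output printed above
R_example = np.array([
    [ 1.000, -0.948,  0.099, -0.129],
    [-0.948,  1.000, -0.591,  0.239],
    [ 0.099, -0.591,  1.000,  0.058],
    [-0.129,  0.239,  0.058,  1.000],
])

print(is_valid_corrmat(R_example))   # has a negative eigenvalue
print(is_valid_corrmat(np.eye(4)))   # identity is a trivial correlation matrix
```

Checking the smallest eigenvalue against a small negative tolerance is usually more informative than the sign of the determinant, because a matrix with an even number of negative eigenvalues still has a positive determinant.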


### Adjust the Matrix 
Solve the k-factor nearest correlation matrix problem to find an adjusted matrix that is positive semidefinite.

In [6]:
k = 3
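
To see why a k-factor matrix is positive semidefinite by construction, here is a small sketch. It assumes the low-rank form `C = X @ X.T` with unit-length rows in the `n x k` factor matrix `X`. That form is consistent with the near-zero determinants printed in this notebook, but it is an assumption; the exact parametrization and constraints used inside `subjcorr_kfactor` are not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 4, 3

# Hypothetical factor matrix: n rows, k factors
X = rng.standard_normal((n, k))
# Scale each row to unit length so that diag(X @ X.T) equals 1
X /= np.linalg.norm(X, axis=1, keepdims=True)

C = X @ X.T  # Gram matrix: symmetric and positive semidefinite

print(np.diag(C).round(12))            # unit diagonal
print(np.linalg.eigvalsh(C).round(6))  # no eigenvalue below zero (up to rounding)
print(np.linalg.matrix_rank(C))        # at most k
```

Under this assumption the optimizer searches over `X` rather than over `C` directly, and for `k < n` the result is rank deficient, so its determinant is zero up to floating-point error.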

The first test uses SciPy's SLSQP solver.

In [7]:
C, X, f, g, results = ox.subjcorr_kfactor(R,k, 'SLSQP')
print("Fitted Correlation Matrix")
print(C.round(3))
print("Is Det(C)>=0 ? ", np.linalg.det(C))
print("\nDiff")
print(np.abs(R-C).round(4))

Fitted Correlation Matrix
[[ 1.011 -0.903  0.113 -0.141]
 [-0.903  1.017 -0.538  0.221]
 [ 0.113 -0.538  1.003  0.071]
 [-0.141  0.221  0.071  1.   ]]
Is Det(C)>=0 ?  -1.0498841996441135e-16

Diff
[[0.0115 0.0454 0.0138 0.0116]
 [0.0454 0.017  0.0524 0.0177]
 [0.0138 0.0524 0.0035 0.0125]
 [0.0116 0.0177 0.0125 0.0003]]


The SLSQP solver could **not** produce a valid correlation matrix: the diagonal elements are not equal to 1 and `det` is negative.
The SLSQP algorithm also converges very slowly for this type of problem.

`subjcorr_kfactor` uses COBYLA as its default algorithm.
It is faster and more likely to return a usable correlation matrix.

In [8]:
C, X, f, g, results = ox.subjcorr_kfactor(R,k,'COBYLA')
print("Fitted Correlation Matrix")
print(C.round(3))
print("Is Det(C)>=0 ? ", np.linalg.det(C))
print("\nDiff")
print(np.abs(R-C).round(4))

Fitted Correlation Matrix
[[ 1.    -0.874  0.131 -0.131]
 [-0.874  1.    -0.563  0.255]
 [ 0.131 -0.563  1.     0.06 ]
 [-0.131  0.255  0.06   1.   ]]
Is Det(C)>=0 ?  6.039076265258523e-17

Diff
[[0.     0.0744 0.0312 0.0017]
 [0.0744 0.     0.0278 0.0163]
 [0.0312 0.0278 0.     0.002 ]
 [0.0017 0.0163 0.002  0.    ]]
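
The element-wise `Diff` matrices can be condensed into a single distance, for example the Frobenius norm of `R - C`. The sketch below recomputes it from the rounded `Diff` printouts above, so the values are only approximate. The variable names are illustrative and not part of `oxyba`.

```python
import numpy as np

# |R - C| for the SLSQP fit, copied from the rounded printout above
diff_slsqp = np.array([
    [0.0115, 0.0454, 0.0138, 0.0116],
    [0.0454, 0.0170, 0.0524, 0.0177],
    [0.0138, 0.0524, 0.0035, 0.0125],
    [0.0116, 0.0177, 0.0125, 0.0003],
])

# |R - C| for the COBYLA fit, copied from the rounded printout above
diff_cobyla = np.array([
    [0.0000, 0.0744, 0.0312, 0.0017],
    [0.0744, 0.0000, 0.0278, 0.0163],
    [0.0312, 0.0278, 0.0000, 0.0020],
    [0.0017, 0.0163, 0.0020, 0.0000],
])

# Frobenius norm: square root of the sum of squared entries
fro_slsqp = np.linalg.norm(diff_slsqp, "fro")
fro_cobyla = np.linalg.norm(diff_cobyla, "fro")

print("SLSQP  ||R - C||_F:", round(fro_slsqp, 4))
print("COBYLA ||R - C||_F:", round(fro_cobyla, 4))
```

A distance on its own does not say whether a fit is acceptable: a result that is closer to `R` but has a non-unit diagonal is still not a correlation matrix.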


### Slightly bigger matrices
The larger the matrix, the more trouble the k-factor approach runs into. In the 8×8 example below, with `k = 3`, individual entries move by as much as 1.14.

In [9]:
R = ox.illcond_corrmat(8, random_state=23)
print("Ill-conditioned Correlation Matrix")
print(R.round(2))

Ill-conditioned Correlation Matrix
[[ 1.    0.89  0.53 -0.44 -0.56  0.37 -0.67 -0.22]
 [ 0.89  1.   -1.    0.77  0.77 -0.4   0.18  0.96]
 [ 0.53 -1.    1.   -0.42  0.64  0.25 -0.78 -1.  ]
 [-0.44  0.77 -0.42  1.    0.74 -0.14  0.66  0.44]
 [-0.56  0.77  0.64  0.74  1.   -0.07 -0.68  0.1 ]
 [ 0.37 -0.4   0.25 -0.14 -0.07  1.    0.01 -0.21]
 [-0.67  0.18 -0.78  0.66 -0.68  0.01  1.   -0.84]
 [-0.22  0.96 -1.    0.44  0.1  -0.21 -0.84  1.  ]]


In [10]:
k = 3
C, X, f, g, results = ox.subjcorr_kfactor(R,k,'COBYLA')
print("Fitted Correlation Matrix")
print(C.round(2))
print("Is Det(C)>=0 ? ", np.linalg.det(C))
print("\nDiff")
print(np.abs(R-C).round(2))

Fitted Correlation Matrix
[[ 1.   -0.06  0.67 -0.44  0.58  0.65 -0.87  0.09]
 [-0.06  1.   -0.65  0.86  0.73  0.11  0.13  0.77]
 [ 0.67 -0.65  1.   -0.66  0.04  0.66 -0.43 -0.68]
 [-0.44  0.86 -0.66  1.    0.48  0.08  0.61  0.4 ]
 [ 0.58  0.73  0.04  0.48  1.    0.69 -0.31  0.47]
 [ 0.65  0.11  0.66  0.08  0.69  1.   -0.2  -0.3 ]
 [-0.87  0.13 -0.43  0.61 -0.31 -0.2   1.   -0.31]
 [ 0.09  0.77 -0.68  0.4   0.47 -0.3  -0.31  1.  ]]
Is Det(C)>=0 ?  4.621626757262551e-79

Diff
[[0.   0.95 0.14 0.01 1.14 0.28 0.21 0.3 ]
 [0.95 0.   0.35 0.09 0.04 0.5  0.05 0.18]
 [0.14 0.35 0.   0.24 0.61 0.41 0.35 0.32]
 [0.01 0.09 0.24 0.   0.26 0.23 0.05 0.03]
 [1.14 0.04 0.61 0.26 0.   0.76 0.37 0.38]
 [0.28 0.5  0.41 0.23 0.76 0.   0.21 0.09]
 [0.21 0.05 0.35 0.05 0.37 0.21 0.   0.53]
 [0.3  0.18 0.32 0.03 0.38 0.09 0.53 0.  ]]


### Links
* Higham, N.J., 2002. Computing the nearest correlation matrix -- a problem from finance. IMA Journal of Numerical Analysis 22, 329–343. [DOI](https://doi.org/10.1093/imanum/22.3.329), [PDF](http://www.maths.manchester.ac.uk/~higham/narep/narep369.pdf)
* Higham, Nick, 2009, [presentation](https://www.nag.com/market/nagquantday2009_ComputingaNearestCorrelationMatrixNickHigham.pdf)
