In [1]:
import warnings; warnings.filterwarnings('ignore')
import numpy as np
import oxyba as ox
from importlib import reload; reload(ox);

### Multiple Correlated Random Numbers
Let $X \in \mathbb{R}^{N \times M}$ be a matrix whose $M$ columns are uncorrelated standard normal random numbers,
let $C \in \{\rho \in \mathbb{R} \mid -1 \leq \rho \leq +1 \}^{M \times M}$ be the desired correlation matrix,
and let $L$ be the lower triangular matrix of the Cholesky decomposition $C = L L^T$.
Then

$$
Y = X \, L^T
$$

is a matrix of correlated random numbers whose columns have the correlation matrix $C$.
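
The reason this works: if the columns of $X$ are uncorrelated with unit variance, then $\mathrm{Cov}(Y) = L \, \mathrm{Cov}(X) \, L^T = L L^T = C$. The following is a minimal NumPy sketch of the transform, using `np.linalg.cholesky` instead of `oxyba`. The 3x3 matrix `C_demo` and all other names are illustrative assumptions and do not appear in the example below.

```python
import numpy as np

# Hypothetical 3x3 positive definite correlation matrix, for illustration only
C_demo = np.array([[1.0, 0.6, 0.3],
                   [0.6, 1.0, 0.2],
                   [0.3, 0.2, 1.0]])

rng = np.random.default_rng(0)
X_demo = rng.standard_normal((100_000, 3))  # uncorrelated standard normal columns

L_demo = np.linalg.cholesky(C_demo)  # lower triangular, C_demo = L_demo @ L_demo.T
Y_demo = X_demo @ L_demo.T           # Y = X L^T

# Sample correlation between the columns of Y
r_demo = np.corrcoef(Y_demo, rowvar=False)
print(np.abs(r_demo - C_demo).max())  # largest deviation from the target
```

Each row of `X_demo` is one observation, which is why the factor is applied from the right as `L_demo.T`.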

### Example

In our example we will use 5 random variables with 10000 observations each.

In [2]:
M = 5
N = 10000

First, create a random example of a correlation matrix $C$: (a) generate an ill-conditioned correlation matrix `R`, then (b) fit a proper positive semidefinite correlation matrix `C` to it.
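
What "proper" means here can be checked directly: a correlation matrix is symmetric, has a unit diagonal, has entries in $[-1, +1]$, and has no negative eigenvalues. The sketch below is one possible check in plain NumPy. The helper `is_valid_corrmat` and both test matrices are hypothetical and not part of `oxyba`.

```python
import numpy as np

def is_valid_corrmat(A, tol=1e-8):
    """Hypothetical helper: True if A looks like a valid correlation matrix."""
    A = np.asarray(A, dtype=float)
    if A.ndim != 2 or A.shape[0] != A.shape[1]:
        return False
    symmetric = np.allclose(A, A.T, atol=tol)
    unit_diag = np.allclose(np.diag(A), 1.0, atol=tol)
    bounded = bool(np.all(np.abs(A) <= 1.0 + tol))
    # eigvalsh expects a symmetric matrix, so symmetrize before the check
    min_eig = np.linalg.eigvalsh((A + A.T) / 2).min()
    return bool(symmetric and unit_diag and bounded and min_eig >= -tol)

# A valid 2x2 correlation matrix
good = np.array([[1.0, 0.5],
                 [0.5, 1.0]])

# Pairwise plausible entries, but jointly inconsistent (negative eigenvalue)
bad = np.array([[ 1.0, 0.9, -0.9],
                [ 0.9, 1.0,  0.9],
                [-0.9, 0.9,  1.0]])

print(is_valid_corrmat(good), is_valid_corrmat(bad))
```

Note that `np.linalg.cholesky` is stricter than this check: it needs a positive definite matrix and raises `LinAlgError` otherwise.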

In [3]:
R = ox.illcond_corrmat(M, random_state=42)
C,_,_ = ox.subjcorr_luriegold(R)
print(C.round(3))

[[ 1.     0.726  0.6    0.286 -0.524]
 [ 0.726  1.     0.508  0.056  0.146]
 [ 0.6    0.508  1.    -0.462 -0.426]
 [ 0.286  0.056 -0.462  1.    -0.281]
 [-0.524  0.146 -0.426 -0.281  1.   ]]


Second, generate an $N \times M$ matrix $X$ of standard normal random numbers.

In [4]:
np.random.seed(23)
X = np.random.normal(0,1, (N,M))
print(X[:7,:].round(3))

[[ 0.667  0.026 -0.778  0.949  0.702]
 [-1.051 -0.368 -1.137 -1.322  1.772]
 [-0.347  0.67   0.322  0.06  -1.043]
 [-1.01   0.442  1.129 -1.838 -0.939]
 [-0.202  1.045  0.538  0.812  0.241]
 [-0.953 -0.136  1.267  0.174 -1.223]
 [ 1.415  0.458  0.729  1.968 -0.548]]


As a quick check, the columns of $X$ are **not** correlated: all off-diagonal sample correlations are close to zero.

In [5]:
r,_ = ox.corr(X)
print(r.round(3))

[[ 1.    -0.001  0.002  0.008 -0.011]
 [-0.001  1.    -0.008 -0.009  0.002]
 [ 0.002 -0.008  1.     0.011 -0.004]
 [ 0.008 -0.009  0.011  1.    -0.015]
 [-0.011  0.002 -0.004 -0.015  1.   ]]
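
The same check can be cross-checked without `oxyba`. The sketch below assumes that `ox.corr` computes Pearson correlations between columns, which the notebook does not state, and uses `np.corrcoef` with `rowvar=False` for that purpose. The names `X_chk`, `r_chk` and `off_diag` are illustrative.

```python
import numpy as np

np.random.seed(23)                          # same seed as in the notebook
X_chk = np.random.normal(0, 1, (10000, 5))  # N = 10000 rows, M = 5 columns

# rowvar=False treats each column as one variable
r_chk = np.corrcoef(X_chk, rowvar=False)

# Keep only the off-diagonal entries and report the largest magnitude
off_diag = r_chk[~np.eye(5, dtype=bool)]
print(np.abs(off_diag).max())
```

For independent columns, the sample correlations scatter around zero with a standard deviation of roughly $1/\sqrt{N} = 0.01$, so values of this size are expected.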


Third, transform uncorrelated $X$ with `rand_chol` to get correlated $Y$

In [6]:
Y = ox.rand_chol(X, C)
print(Y[:7,:].round(3))

[[ 0.667  0.502 -0.214  1.283 -0.41 ]
 [-1.051 -1.016 -1.571 -0.04   0.921]
 [-0.347  0.208  0.118 -0.463  0.6  ]
 [-1.01  -0.43   0.336 -2.223  1.115]
 [-0.202  0.572  0.416 -0.274  0.545]
 [-0.953 -0.785  0.419 -1.126  0.038]
 [ 1.415  1.343  1.475  0.78  -1.127]]


Now check the correlation matrix of these $Y$ variables

In [7]:
r,_ = ox.corr(Y)
print(r.round(3))

[[ 1.     0.726  0.603  0.292 -0.525]
 [ 0.726  1.     0.506  0.061  0.145]
 [ 0.603  0.506  1.    -0.452 -0.433]
 [ 0.292  0.061 -0.452  1.    -0.284]
 [-0.525  0.145 -0.433 -0.284  1.   ]]


How different are these from the desired correlation matrix $C$?

In [8]:
print( (r - C).round(3) )

[[ 0.    -0.     0.003  0.006 -0.001]
 [-0.    -0.    -0.002  0.004 -0.   ]
 [ 0.003 -0.002 -0.     0.01  -0.007]
 [ 0.006  0.004  0.01  -0.    -0.003]
 [-0.001 -0.    -0.007 -0.003 -0.   ]]


That looks close :) The largest absolute deviation from $C$ is 0.01.
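
The remaining deviation is sampling error, which shrinks roughly like $1/\sqrt{N}$. The sketch below illustrates this with plain NumPy for three sample sizes. It is an illustration under assumptions: the 3x3 matrix `C_demo`, the seed, and the sample sizes are made up and unrelated to the example above.

```python
import numpy as np

# Hypothetical 3x3 positive definite correlation matrix, for illustration only
C_demo = np.array([[1.0, 0.6, 0.3],
                   [0.6, 1.0, 0.2],
                   [0.3, 0.2, 1.0]])
L_demo = np.linalg.cholesky(C_demo)
rng = np.random.default_rng(1)

max_err = {}
for n in (100, 10_000, 1_000_000):
    # Correlate n fresh standard normal observations via Y = X L^T
    Y_n = rng.standard_normal((n, 3)) @ L_demo.T
    r_n = np.corrcoef(Y_n, rowvar=False)
    # Largest absolute deviation from the target correlation matrix
    max_err[n] = np.abs(r_n - C_demo).max()
    print(n, round(max_err[n], 4), "vs 1/sqrt(n) =", round(1 / np.sqrt(n), 4))
```

With $N = 10000$ as in the example, $1/\sqrt{N} = 0.01$, which matches the size of the deviations seen above.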