# Testing Covariance

A more precise test of our covariance unit is to generate data with a known covariance matrix and then check whether the covariance matrix we estimate matches the one we specified.

There are two approaches: one using a Cholesky decomposition, the other using an eigenvalue decomposition. See https://stats.stackexchange.com/questions/32169/how-can-i-generate-data-with-a-prespecified-correlation-matrix and https://stats.stackexchange.com/questions/120179/generating-data-with-a-given-sample-covariance-matrix.
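As a rough sketch of the two approaches, using only NumPy: both build a matrix A with A Aᵀ equal to the target covariance, so that A z has the target covariance when z has identity covariance. The example matrix, the seed and the names `target_cov`, `A_chol` and `A_eig` are illustrative assumptions, not part of this notebook.

```python
import numpy as np

# Illustrative target covariance (symmetric, positive definite).
target_cov = np.array([[2.0, 0.6, 0.2],
                       [0.6, 1.0, 0.3],
                       [0.2, 0.3, 0.5]])

# Approach 1: Cholesky factor, target_cov = A_chol @ A_chol.T
A_chol = np.linalg.cholesky(target_cov)

# Approach 2: eigendecomposition, target_cov = V @ diag(w) @ V.T
w, V = np.linalg.eigh(target_cov)
A_eig = V * np.sqrt(w)  # scale each eigenvector column by sqrt(eigenvalue)

# Either factor maps identity-covariance noise to the target covariance.
rng = np.random.default_rng(0)
z = rng.standard_normal((3, 50_000))  # columns are samples
samples_chol = A_chol @ z
samples_eig = A_eig @ z
```

The two factors differ (the Cholesky factor is lower triangular, the eigen factor generally is not), but the samples they produce have the same covariance.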

In [2]:
from src.var_processor.covariance import CovarianceUnit
import numpy as np

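We need a positive definite covariance matrix. One way to get one:
* generate a random square matrix;
* multiply it by its own transpose.

The product is always symmetric positive semi-definite, and for a random Gaussian matrix it is positive definite with probability 1, which is what the Cholesky decomposition requires.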
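One way to confirm the construction worked, sketched here with plain NumPy: a symmetric matrix is positive definite exactly when all of its eigenvalues are strictly positive. The names `A` and `candidate` and the seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)
A = rng.standard_normal((3, 3))
candidate = A @ A.T  # symmetric and positive semi-definite by construction

# eigvalsh is meant for symmetric matrices and returns eigenvalues in ascending order.
eigenvalues = np.linalg.eigvalsh(candidate)
is_symmetric = np.allclose(candidate, candidate.T)
is_positive_definite = bool(eigenvalues[0] > 0)  # smallest eigenvalue must be > 0
print(is_symmetric, is_positive_definite)
```

If the smallest eigenvalue were zero (or negative through rounding), `np.linalg.cholesky` would raise a `LinAlgError`.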

In [8]:
cov = np.random.randn(3,3)
cov = np.dot(cov,cov.T)

In [9]:
cov

array([[ 1.12848561, -0.79697223,  1.6590042 ],
       [-0.79697223,  0.82794797, -1.04018027],
       [ 1.6590042 , -1.04018027,  4.65298715]])

In [10]:
mean = 0.5

In [11]:
L = np.linalg.cholesky(cov)

In [12]:
L

array([[ 1.06230203,  0.        ,  0.        ],
       [-0.7502313 ,  0.51487956,  0.        ],
       [ 1.5617067 ,  0.25532373,  1.46590216]])

Now we loop 100 times, generating a sample on each pass and updating the covariance unit, then print the estimated mean and covariance.

In [13]:
cov_unit = CovarianceUnit(3)

In [29]:
for _ in range(0, 100):
    sample = np.dot(L, np.random.randn(3, 1)) + mean
    cov_unit.update(sample)
print(cov_unit.mean, cov_unit.covariance, sep="\n")

[[0.54730931]
 [0.47355449]
 [0.5773568 ]]
[[ 1.14993377 -0.78521168  1.68587113]
 [-0.78521168  0.7942798  -1.05060454]
 [ 1.68587113 -1.05060454  4.59816267]]


In [31]:
for _ in range(0, 10000):
    sample = np.dot(L, np.random.randn(3, 1)) + mean
    cov_unit.update(sample)
print(cov_unit.mean, cov_unit.covariance, sep="\n")

[[0.51125101]
 [0.49221394]
 [0.50523693]]
[[ 1.1386086  -0.80263796  1.70602671]
 [-0.80263796  0.82986436 -1.07135361]
 [ 1.70602671 -1.07135361  4.74586408]]


In [32]:
mean = -0.5
cov_unit2 = CovarianceUnit(3)
for _ in range(0, 10000):
    sample = np.dot(L, np.random.randn(3, 1)) + mean
    cov_unit2.update(sample)
print(cov_unit2.mean, cov_unit2.covariance, sep="\n")

[[-0.48874286]
 [-0.51869883]
 [-0.49200419]]
[[ 1.14307288 -0.81324798  1.66545525]
 [-0.81324798  0.84053001 -1.04764234]
 [ 1.66545525 -1.04764234  4.6189823 ]]


We can use NumPy's allclose/isclose to check whether the estimates are close to the specified values. https://docs.scipy.org/doc/numpy/reference/generated/numpy.allclose.html#numpy.allclose
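For reference, `np.isclose(a, b, rtol, atol)` tests `|a - b| <= atol + rtol * |b|` element-wise, and `np.allclose` requires this for every element, so the relative tolerance scales with the second argument. The short sketch below (values chosen purely for illustration) shows why a relative tolerance on its own becomes very strict when the expected value is near zero, and how `atol` loosens it.

```python
import numpy as np

expected = np.array([0.5, 0.01])
estimate = np.array([0.51, 0.02])

# 0.51 vs 0.5 is within 5%, but 0.02 vs 0.01 is off by a factor of two,
# even though both absolute errors are the same size (0.01).
rel_only = np.isclose(estimate, expected, rtol=0.05)

# Adding an absolute tolerance accepts small absolute errors near zero.
with_atol = np.isclose(estimate, expected, rtol=0.05, atol=0.02)

print(rel_only, with_atol)
```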

In [39]:
mean_array = np.ones(shape=(3,1))*0.5
# Within 5%
np.allclose(mean_array, cov_unit.mean, rtol=0.05)

True

In [41]:
np.allclose(cov, cov_unit.covariance, rtol=0.05)

True

In [None]:
def test_covariance_computation():
    """Statistical test that cov unit is determining the covariance."""
    # Generate random positive definite matrix
    cov = np.random.randn(3,3)
    cov = np.dot(cov,cov.T)
    # Generate desired mean
    mean = np.random.randn(3,1)
    # Use Cholesky decomposition to get L
    L = np.linalg.cholesky(cov)
    cov_unit = CovarianceUnit(3)
    for _ in range(0, 1000):
        sample = np.dot(L, np.random.randn(3, 1)) + mean
        cov_unit.update(sample)
    # Check within 5%
    assert np.allclose(mean, cov_unit.mean, rtol=0.05)
    assert np.allclose(cov, cov_unit.covariance, rtol=0.05)
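Because a test like this draws fresh random numbers on every run, it can fail occasionally, particularly when an entry of the random mean or covariance falls near zero and a 5% relative tolerance becomes tiny. A hedged sketch of one way to make such a check repeatable follows: it seeds a generator and uses absolute tolerances scaled to the largest variance. Here `np.cov` and a plain sample mean stand in for `CovarianceUnit` only so the example runs on its own; the helper `sample_statistics`, the seed, the sample count and the tolerance factors are all assumptions to tune.

```python
import numpy as np

def sample_statistics(seed=0, n_samples=20_000, size=3):
    """Draw correlated samples; return (mean, cov, est_mean, est_cov)."""
    rng = np.random.default_rng(seed)
    # Random positive definite covariance and random mean.
    cov = rng.standard_normal((size, size))
    cov = cov @ cov.T
    mean = rng.standard_normal((size, 1))
    L = np.linalg.cholesky(cov)
    # Columns are samples: x = L z + mean.
    samples = L @ rng.standard_normal((size, n_samples)) + mean
    est_mean = samples.mean(axis=1, keepdims=True)
    est_cov = np.cov(samples)  # stand-in estimator, not CovarianceUnit
    return mean, cov, est_mean, est_cov

mean, cov, est_mean, est_cov = sample_statistics()

# Scale the absolute tolerances to the largest variance in the target matrix.
scale = cov.diagonal().max()
mean_ok = np.allclose(est_mean, mean, atol=0.05 * np.sqrt(scale))
cov_ok = np.allclose(est_cov, cov, atol=0.05 * scale)
print(mean_ok, cov_ok)
```

With a fixed seed the result is the same on every run, which makes a failure easier to reproduce and investigate.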

In [46]:
np.random.uniform(size=(3,3))

array([[0.45475868, 0.0813685 , 0.64410836],
       [0.48388041, 0.81378173, 0.6283129 ],
       [0.06086568, 0.22582442, 0.82828134]])

In [53]:
cov = np.random.randn(3, 3)
cov = np.dot(cov, cov.T) 
cov = cov / cov.max()
cov

array([[ 0.65029906, -0.24070865,  0.63803126],
       [-0.24070865,  0.09457014, -0.23644969],
       [ 0.63803126, -0.23644969,  1.        ]])

In [54]:
cov.max()

1.0

In [55]:
cov / cov.max()

array([[ 0.65029906, -0.24070865,  0.63803126],
       [-0.24070865,  0.09457014, -0.23644969],
       [ 0.63803126, -0.23644969,  1.        ]])

# Testing Eigenvalue Estimation

We can use a specified covariance matrix - determine the eigenvectors using numpy and check agai

In [None]:
from src.var_processor.power_iterator import PowerIterator
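
The interface of `PowerIterator` is not shown in this notebook, so as a sketch under that assumption, the kind of comparison described here can be illustrated with a hand-written power iteration checked against `np.linalg.eigh`. The function `power_iteration`, the example matrix and the iteration count are illustrative and are not the project's implementation.

```python
import numpy as np

def power_iteration(matrix, n_iter=500, seed=0):
    """Estimate the dominant eigenvector and eigenvalue of a symmetric matrix."""
    rng = np.random.default_rng(seed)
    vec = rng.standard_normal(matrix.shape[0])
    vec /= np.linalg.norm(vec)
    for _ in range(n_iter):
        vec = matrix @ vec
        vec /= np.linalg.norm(vec)
    # The Rayleigh quotient gives the matching eigenvalue for a unit vector.
    return vec, vec @ matrix @ vec

# Illustrative covariance matrix (symmetric, positive definite).
cov = np.array([[2.0, 0.6, 0.2],
                [0.6, 1.0, 0.3],
                [0.2, 0.3, 0.5]])
est_vec, est_val = power_iteration(cov)

# Reference: eigh returns eigenvalues in ascending order, so the last is the largest.
ref_vals, ref_vecs = np.linalg.eigh(cov)
ref_vec = ref_vecs[:, -1]

# Eigenvectors are only defined up to sign, so compare the absolute cosine similarity.
alignment = abs(est_vec @ ref_vec)
print(np.isclose(est_val, ref_vals[-1]), np.isclose(alignment, 1.0))
```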

