# Mass matrix selection criterion

In [1]:
from sklearn import covariance
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import linalg, stats

In [2]:
N = 10_000
M = 200

# Generate training and test samples.
# The true covariance has N - 1 eigenvalues equal to 1 and one eigenvalue
# equal to scale**2 (i.e. standard deviation `scale` along `direction`).

def transform(samples, direction, scale):
    samples_ = samples.copy()
    samples_[...] += (samples @ direction)[:, None] * (direction * (scale - 1))
    return samples_

direction = np.random.randn(N)
direction[...] /= linalg.norm(direction)

scale = 2

train, test = np.random.randn(2, M, N)
train = transform(train, direction, scale)
test = transform(test, direction, scale)
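
For reference, the covariance implied by `transform` can be written out explicitly. This is a short derivation, writing $d$ for `direction` (unit norm) and $s$ for `scale`; the symbols $A$ and $\Sigma$ are introduced here only for illustration and do not appear in the code.

$$
A = I + (s - 1)\, d d^\top, \qquad
\Sigma = A A^\top = I + 2 (s - 1)\, d d^\top + (s - 1)^2\, d d^\top = I + (s^2 - 1)\, d d^\top
$$

So the samples have variance $s^2$ (standard deviation $s$) along $d$ and unit variance in every orthogonal direction. The inverse map has the same form, $A^{-1} = I + (1/s - 1)\, d d^\top$, which corresponds to calling `transform` with `1 / scale`.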

Assume now that we were able to recover `direction` and `scale` exactly from `train`.

We can then undo the transformation on the test samples:

In [3]:
test_trafo = transform(test, direction, 1 / scale)
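
As a quick sanity check, applying `transform` with `scale` and then with `1 / scale` should return the original samples. The sketch below re-defines `transform` so it runs on its own and uses small, arbitrary sizes and a fixed seed; those values are assumptions for illustration, not the notebook's settings.

```python
import numpy as np

def transform(samples, direction, scale):
    # Stretch the component along the unit vector `direction` by `scale`.
    return samples + (samples @ direction)[:, None] * (direction * (scale - 1))

rng = np.random.default_rng(0)  # seed chosen only for this illustration
n_samples, n_dim, scale = 50, 300, 2.0

direction = rng.standard_normal(n_dim)
direction /= np.linalg.norm(direction)

original = rng.standard_normal((n_samples, n_dim))
round_trip = transform(transform(original, direction, scale), direction, 1 / scale)

# Largest absolute deviation between the round trip and the input.
max_error = np.abs(round_trip - original).max()
print(max_error)
```

The deviation should be at the level of floating-point rounding, which confirms that `1 / scale` inverts the map when `direction` has unit norm.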

We now test three different measures for how good our transformation is.

### Ledoit-Wolf based max_min eigenvalue selection criterion

In [4]:
test_cov_lw, _ = covariance.ledoit_wolf(test)
test_trafo_cov_lw, _ = covariance.ledoit_wolf(test_trafo)

In [5]:
linalg.eigvalsh(test_cov_lw)[-1]

1.6620650442246963

In [6]:
linalg.eigvalsh(test_trafo_cov_lw)[-1]

1.6608396279899977

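It reports a very small improvement (1.6621 to 1.6608), nowhere near the actual one.

The absolute estimates are wrong as well: the true largest covariance eigenvalues are `scale**2 = 4` before and 1 after the transformation (standard deviations 2 and 1).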
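
One way to see why the Ledoit-Wolf eigenvalues are so compressed: the estimator is a convex combination of the empirical covariance and a scaled identity, so every eigenvalue is pulled toward the mean eigenvalue. The sketch below checks this relation on white noise with small, arbitrary sizes (an assumption to keep it fast); the names `mu` and `rebuilt` are introduced here for illustration.

```python
import numpy as np
from sklearn import covariance

rng = np.random.default_rng(0)  # seed chosen only for this illustration
n_samples, n_dim = 50, 500      # far fewer samples than dimensions
samples = rng.standard_normal((n_samples, n_dim))

lw_cov, shrinkage = covariance.ledoit_wolf(samples)
emp_cov = covariance.empirical_covariance(samples)
mu = np.trace(emp_cov) / n_dim  # mean empirical eigenvalue

# Shrinkage estimate rebuilt by hand: (1 - shrinkage) * emp_cov + shrinkage * mu * I
rebuilt = (1 - shrinkage) * emp_cov + shrinkage * mu * np.eye(n_dim)

emp_top = np.linalg.eigvalsh(emp_cov)[-1]
lw_top = np.linalg.eigvalsh(lw_cov)[-1]
print(shrinkage, emp_top, lw_top)
```

Because the identity term does not change eigenvectors, each Ledoit-Wolf eigenvalue is `(1 - shrinkage) * lam + shrinkage * mu` for the corresponding empirical eigenvalue `lam`. When the shrinkage is close to 1, the largest eigenvalue therefore says little about any single direction.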

### Empirical covariance based max eigenvalue criterion

This is essentially the criterion from the paper, assuming that the minimum eigenvalue is the same for both covariance matrices.

In [11]:
test_cov_eigs = linalg.svdvals(test) ** 2 / M
test_trafo_eigs = linalg.svdvals(test_trafo) ** 2 / M

In [12]:
test_cov_eigs[0]

65.07197896790458

In [13]:
test_trafo_eigs[0]

65.03648659216542

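This also reports only a very small improvement (65.07 to 65.04), again far from the true change.

The absolute values are *way* off: with `M = 200` samples in `N = 10_000` dimensions, the largest empirical eigenvalue is dominated by sampling noise.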
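
The size of these numbers is roughly what random matrix theory predicts for pure noise. For unit-variance white noise, the largest eigenvalue of `X.T @ X / M` concentrates near the Marchenko-Pastur upper edge `(1 + sqrt(N / M)) ** 2`, which is about 65.14 for `N = 10_000` and `M = 200`. The sketch below evaluates that formula and compares it with a small simulation at the same ratio `N / M = 50`; the helper name `mp_upper_edge` and the simulation sizes are assumptions made for illustration.

```python
import numpy as np

def mp_upper_edge(n_dim, n_samples):
    # Asymptotic largest eigenvalue of X.T @ X / n_samples for white noise
    # with unit variance (Marchenko-Pastur upper edge).
    return (1 + np.sqrt(n_dim / n_samples)) ** 2

edge = mp_upper_edge(10_000, 200)
print(edge)

# Small simulation with the same aspect ratio N / M = 50.
rng = np.random.default_rng(0)  # seed chosen only for this illustration
n_samples, n_dim = 40, 2000
noise = rng.standard_normal((n_samples, n_dim))
top_eig = np.linalg.svd(noise, compute_uv=False)[0] ** 2 / n_samples
print(top_eig, mp_upper_edge(n_dim, n_samples))
```

A single direction with variance 4 sits well below this noise edge, so it barely moves the largest empirical eigenvalue.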

### Variance estimate in proposed direction

In [14]:
np.sqrt(direction @ test.T @ test @ direction / M)

1.962265810044899

This correctly tells us that the standard deviation along the proposed direction is ~2 (a variance of about 3.85, close to the true `scale**2 = 4`), as expected.
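
The same estimate can be wrapped in a small helper and evaluated before and after the reverse transformation. This is a minimal sketch under stated assumptions: `std_along` is a hypothetical helper name, the samples are assumed to be zero-mean, `direction` is assumed to have unit norm, and the sizes and seed are chosen only to keep the example fast.

```python
import numpy as np

def transform(samples, direction, scale):
    # Stretch the component along the unit vector `direction` by `scale`.
    return samples + (samples @ direction)[:, None] * (direction * (scale - 1))

def std_along(samples, direction):
    # Root mean square of the projections onto `direction`.
    # Equivalent to sqrt(direction @ samples.T @ samples @ direction / M).
    projections = samples @ direction
    return np.sqrt(np.mean(projections ** 2))

rng = np.random.default_rng(0)  # seed chosen only for this illustration
n_samples, n_dim, scale = 200, 1000, 2.0

direction = rng.standard_normal(n_dim)
direction /= np.linalg.norm(direction)

test = transform(rng.standard_normal((n_samples, n_dim)), direction, scale)
test_trafo = transform(test, direction, 1 / scale)

std_before = std_along(test, direction)
std_after = std_along(test_trafo, direction)
print(std_before, std_after)
```

Unlike the two eigenvalue-based criteria, this only needs a one-dimensional projection, so its accuracy depends on the number of samples and not on the dimension. The ratio `std_before / std_after` equals `scale` up to rounding, because the reverse transformation divides every projection by `scale`.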