In [1]:
import numpy as np
from scipy.optimize import minimize

*This notebook presents and implements the initial ideas for estimation in maximum likelihood (ML) factor analysis. The functions were later added to `utils.py`.*

___

# Estimating $\hat\Lambda$ and $\hat\Psi$ in the maximum likelihood factor analysis model

In [2]:
def sim_factor_model(loadings, specific_variance, mu, nsim=1, verbose=True):
    """
    Parameters
    ---
        loadings:           (p, k) matrix
        specific_variance:  (p,) vector of specific variances (the diagonal of Psi)
        mu:                 (p,) vector of means
        nsim:               How many observations should be simulated
        verbose:            If True, print k and p

    Returns
    ---
        (nsim, p) matrix of observations from the specified factor model

    """
    k = loadings.shape[1]
    p = specific_variance.shape[0]
    if verbose:
        print(f"{k=} {p=}")

    X = []
    for _ in range(nsim):
        factor_vector = np.random.multivariate_normal(np.zeros(k), np.eye(k))
        u = np.random.multivariate_normal(np.zeros(p), np.diag(specific_variance))

        X.append(loadings @ factor_vector + u + mu)

    return np.array(X)
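
The loop above draws one observation at a time. As a rough sketch of the same model $x = \Lambda f + u + \mu$, the draws can also be generated in one batch. The function name `sim_factor_model_vectorized`, the `seed` argument and the demo values are assumptions made for illustration; they are not part of the notebook or of `utils.py`.

```python
import numpy as np

def sim_factor_model_vectorized(loadings, specific_variance, mu, nsim=1, seed=None):
    """Batch version of the simulation: returns an (nsim, p) array."""
    rng = np.random.default_rng(seed)
    p, k = loadings.shape
    factors = rng.standard_normal((nsim, k))                              # f ~ N(0, I_k)
    noise = rng.standard_normal((nsim, p)) * np.sqrt(specific_variance)   # u ~ N(0, diag(psi))
    return factors @ loadings.T + noise + mu

# Hypothetical demo values (not the ones used later in the notebook)
demo_loadings = np.array([[2.0, 2.0, 2.0]]).T
demo_psi = np.array([2.0, 10.0, 1.0])
demo_mu = np.array([10.0, 20.0, 30.0])

X_demo = sim_factor_model_vectorized(demo_loadings, demo_psi, demo_mu, nsim=200_000, seed=0)

# The sample covariance should be close to Lambda Lambda' + Psi
implied_cov = demo_loadings @ demo_loadings.T + np.diag(demo_psi)
max_gap = np.abs(np.cov(X_demo.T) - implied_cov).max()
```

Comparing `np.cov(X_demo.T)` with `implied_cov` is a quick sanity check that the simulated data follow the intended covariance structure.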

The goal is to minimize
$$
\text{tr}\left((\hat\Lambda\hat\Lambda' + \Psi)^{-1} S\right) - \log\left|(\hat\Lambda\hat\Lambda' + \Psi)^{-1} S\right| \quad \text{w.r.t. } \Psi
$$
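
To get a feel for this objective, here is a small sketch that evaluates it for a given covariance matrix $\Sigma$ in place of $\hat\Lambda\hat\Lambda' + \Psi$. The helper name `ml_discrepancy` and the 2×2 matrix are made up for illustration. It uses `np.linalg.solve` and `np.linalg.slogdet` instead of an explicit inverse and `log(det(.))`, which is a numerical choice of this sketch rather than what the notebook does.

```python
import numpy as np

def ml_discrepancy(Sigma, S):
    """tr(Sigma^{-1} S) - log|Sigma^{-1} S| for positive-definite Sigma and S."""
    M = np.linalg.solve(Sigma, S)            # Sigma^{-1} S without forming the inverse
    _, logdet = np.linalg.slogdet(M)         # log of the absolute determinant
    return np.trace(M) - logdet

S_demo = np.array([[2.0, 0.5],
                   [0.5, 1.0]])

# When Sigma equals S the matrix M is the identity: tr(I) - log|I| = p
value_at_S = ml_discrepancy(S_demo, S_demo)

# Any other Sigma gives a larger value, e.g. the identity matrix
value_off = ml_discrepancy(np.eye(2), S_demo)
```

So a fitted model that reproduces $S$ well should give a value close to $p$, the number of variables.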


Step by step:
1. Calculate $S^* = \Psi^{-1/2} S \Psi^{-1/2}$ and its eigendecomposition $S^* = \Gamma \Theta \Gamma'$
2. Get the eigenvectors $\gamma_{(i)}$ and eigenvalues $\theta_i$, ordered so that $\theta_1 \geq \theta_2 \geq \dots \geq \theta_p$
3. Now $\Lambda^*$ has columns $c_i \gamma_{(i)}$ for $i = 1, \dots, k$, with $c_i = \sqrt{\max(\theta_i - 1, 0)}$
4. $\hat\Lambda = \Psi^{1/2} \Lambda^*$
5. Calculate the objective function $\text{tr}((\hat\Lambda \hat\Lambda' + \Psi)^{-1} S) - \log(|(\hat\Lambda \hat\Lambda' + \Psi)^{-1}S|)$

In [3]:
# Generate synthetic data
loadings = np.array([[2, 2, 2, 2, 2]]).T
specific_variance = np.array([2, 2, 10, 1, 1])
mu = np.array([10, 20, 30, 40, 50])

X = sim_factor_model(loadings, specific_variance, mu, nsim=10**4)
X

k=1 p=5


array([[ 9.9200258 , 19.28934412, 28.78413474, 42.03826513, 52.36085071],
       [ 6.86836933, 16.2209485 , 26.31668071, 34.44290849, 45.62596067],
       [ 6.2821622 , 15.82438986, 25.47848701, 36.32129902, 45.9620475 ],
       ...,
       [12.09127338, 21.98839508, 35.35280837, 40.87600116, 52.40441897],
       [ 9.26148643, 21.74310034, 27.03230995, 39.87741989, 50.7011847 ],
       [ 9.80686983, 18.65725894, 21.95390689, 38.12839992, 48.90860892]])

1. Calculate $S^*$

In [4]:
S = np.cov(X.T)
Psi = np.diag(specific_variance)
Psi_sq_inv = np.linalg.inv(Psi ** 0.5)
S_star = Psi_sq_inv @ S @ Psi_sq_inv
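
Because $\Psi$ is diagonal, $\Psi^{-1/2}$ is just the element-wise $1/\sqrt{\psi_{ii}}$, so the scaling in step 1 can be done without a matrix inverse. The sketch below compares both routes on made-up values (`S_demo` and `psi_demo` are illustrative, not the notebook's data).

```python
import numpy as np

S_demo = np.array([[6.0, 4.0, 4.0],
                   [4.0, 14.0, 4.0],
                   [4.0, 4.0, 5.0]])
psi_demo = np.array([2.0, 10.0, 1.0])

# Route used in the cell above: invert the element-wise square root of diag(psi)
Psi_demo = np.diag(psi_demo)
Psi_sq_inv_demo = np.linalg.inv(Psi_demo ** 0.5)
S_star_inv = Psi_sq_inv_demo @ S_demo @ Psi_sq_inv_demo

# Equivalent scaling: entry (i, j) of S is multiplied by d_i * d_j
d = 1.0 / np.sqrt(psi_demo)
S_star_scaled = S_demo * np.outer(d, d)
```

Note that `Psi ** 0.5` is an element-wise power; it equals the matrix square root here only because `Psi` is diagonal.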

2. Get eigenvectors and eigenvalues

In [5]:
eigval, eigvec = np.linalg.eig(S_star)
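
The following steps pick `eigval[0]` as the largest eigenvalue. `np.linalg.eig` does not promise any particular ordering, so that relies on how the values happen to come back. Since $S^*$ is symmetric, one alternative is `np.linalg.eigh`, which returns the eigenvalues in ascending order, followed by a reversal. The helper name `sorted_eigh` and the 2×2 example are assumptions for illustration.

```python
import numpy as np

def sorted_eigh(A):
    """Eigendecomposition of a symmetric matrix, eigenvalues in descending order."""
    eigval, eigvec = np.linalg.eigh(A)       # ascending eigenvalues, orthonormal eigenvectors
    order = np.argsort(eigval)[::-1]         # indices from largest to smallest
    return eigval[order], eigvec[:, order]

A_demo = np.array([[2.0, 1.0],
                   [1.0, 2.0]])              # eigenvalues are 3 and 1

theta, gamma = sorted_eigh(A_demo)
reconstructed = gamma @ np.diag(theta) @ gamma.T   # Gamma Theta Gamma' recovers A_demo
```

The sign of each eigenvector is arbitrary, so loadings built from them may come out with flipped signs, as in the outputs of this notebook.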

3. Construct $\Lambda^*$.

Here we must choose k. Let's choose k = 1, like the underlying model that generated the data (the `loadings` array used in the simulation has shape (5, 1) = (p, k)).

In [6]:
lambda_star = max(eigval[0] - 1, 0) ** 0.5 * eigvec[:,0]
lambda_star

array([-1.45884803, -1.44679443, -0.66171187, -2.04331882, -2.04910301])

4. Construct $\hat\Lambda = \Psi^{1/2} \Lambda^*$

In [7]:
lambda_hat = Psi ** 0.5 @ lambda_star
lambda_hat

array([-2.06312268, -2.04607631, -2.09251666, -2.04331882, -2.04910301])

5. Calculate $\text{tr}((\hat\Lambda \hat\Lambda' + \Psi)^{-1} S) - \log(|(\hat\Lambda \hat\Lambda' + \Psi)^{-1}S|)$

In [8]:
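# Note: lambda_hat is 1-D here, so `lambda_hat @ lambda_hat` is the scalar inner product
# (added to every entry of Psi), not the (p, p) matrix np.outer(lambda_hat, lambda_hat).
internal = np.linalg.inv(lambda_hat @ lambda_hat + Psi) @ S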
result = np.trace(internal) - np.log(np.linalg.det(internal))
result

5.770666173787942
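
In step 5, $\hat\Lambda\hat\Lambda'$ is meant to be a $(p, p)$ matrix. With a 1-D NumPy array, both `v @ v` and `v @ v.T` return the scalar inner product, because `.T` does nothing on a 1-D array. The sketch below shows the difference on a made-up vector `v`; in the one-factor case, `np.outer` or an explicit `(p, 1)` column gives the matrix form.

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])        # made-up 1-D "loading" vector

inner = v @ v                         # scalar: 1 + 4 + 9
inner_T = v @ v.T                     # identical, since .T is a no-op on 1-D arrays

outer = np.outer(v, v)                # (3, 3) matrix with entries v_i * v_j

col = v.reshape(-1, 1)                # explicit (p, 1) column
outer_from_column = col @ col.T       # same (3, 3) matrix via matrix multiplication
```

For a value near the one printed above, compare it with $p = 5$: the objective is close to $p$ when $\hat\Lambda\hat\Lambda' + \Psi$ reproduces $S$.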

Having done this, let's construct a function that takes an arbitrary length-$p$ array of specific variances and calculates the objective function.

In [9]:
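def calculate_objective(specific_variance, X_data):
    # Step 1: scale the sample covariance by Psi^{-1/2} on both sides
    S = np.cov(X_data.T)
    Psi = np.diag(specific_variance)
    Psi_sq_inv = np.linalg.inv(Psi ** 0.5)
    S_star = Psi_sq_inv @ S @ Psi_sq_inv

    # Step 2: np.linalg.eig does not guarantee an eigenvalue order, so sort descending
    eigval, eigvec = np.linalg.eig(S_star)
    order = np.argsort(eigval)[::-1]
    eigval, eigvec = eigval[order], eigvec[:, order]

    # Step 3: one factor, built from the largest eigenvalue
    lambda_star = max(eigval[0] - 1, 0) ** 0.5 * eigvec[:,0]

    # Step 4
    lambda_hat = Psi ** 0.5 @ lambda_star

    # Step 5: lambda_hat is 1-D, so this product is the scalar inner product
    # (np.outer(lambda_hat, lambda_hat) would give the (p, p) matrix instead)
    internal = np.linalg.inv(lambda_hat @ lambda_hat.T + Psi) @ S
    result = np.trace(internal) - np.log(np.linalg.det(internal))

    return result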

In [10]:
specific_variance = np.array([2, 2, 10, 1, 1])
calculate_objective(specific_variance, X)

5.770666173787942

This matches the manual step-by-step calculation!

Now let's minimize the objective function. The minimization algorithm works best with unconstrained variables in $\mathbb{R}$, whereas the specific variances must satisfy $\psi_{ii} > 0$. We circumvent this by optimizing over $\alpha_i \in \mathbb{R}$ and setting $\psi_{ii} = \exp(\alpha_i) \in (0, \infty)$.
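
As a minimal sketch of this reparameterization, the toy problem below recovers a vector of positive values by optimizing on the log scale. The objective `toy_objective` and the `target` values are invented for illustration and stand in for the factor analysis objective. The start values are mapped with `np.log`, so the optimizer starts exactly at `psi_start` on the original scale.

```python
import numpy as np
from scipy.optimize import minimize

target = np.array([0.5, 2.0, 10.0])          # hypothetical positive values to recover

def toy_objective(psi):
    """Stand-in objective, minimized at psi == target."""
    return np.sum((psi - target) ** 2)

psi_start = np.array([1.0, 1.0, 1.0])         # initial guess on the psi scale
alpha_start = np.log(psi_start)               # same guess on the unconstrained alpha scale

# Optimize over alpha; exp(alpha) is positive for every real alpha
res = minimize(lambda alpha: toy_objective(np.exp(alpha)), x0=alpha_start)
psi_recovered = np.exp(res.x)
```

Passing a start vector without the `np.log` step still works, but the optimizer then begins at `exp(x0)` on the variance scale rather than at `x0` itself.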

In [11]:
# x0 is on the log scale (alpha), so this start corresponds to psi = exp([2, 2, 10, 1, 1])
x_0_guess = np.array([2, 2, 10, 1, 1])
problem = minimize(fun=lambda x: calculate_objective(np.exp(x), X_data=X),
                   x0=x_0_guess)

In [12]:
psi_hat = np.exp(problem.x)
psi_hat

array([ 2.07843272,  2.05732084, 10.04731862,  1.00318627,  1.00896644])

A simple change in step 3 lets us specify the number of factors, k.

In [13]:
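def calculate_objective(specific_variance, X_data, k):
    # Step 1: scale the sample covariance by Psi^{-1/2} on both sides
    S = np.cov(X_data.T)
    Psi = np.diag(specific_variance)
    Psi_sq_inv = np.linalg.inv(Psi ** 0.5)
    S_star = Psi_sq_inv @ S @ Psi_sq_inv

    # Step 2: np.linalg.eig does not guarantee an eigenvalue order, so sort descending
    eigval, eigvec = np.linalg.eig(S_star)
    order = np.argsort(eigval)[::-1]
    eigval, eigvec = eigval[order], eigvec[:, order]

    # Step 3: one column per factor, using the k largest eigenvalues
    lambda_star = []
    for i in range(k):
        lambda_star.append(max(eigval[i] - 1, 0) ** 0.5 * eigvec[:,i])
    lambda_star = np.array(lambda_star).T  # shape (p, k)

    # Step 4
    lambda_hat = Psi ** 0.5 @ lambda_star

    # Step 5: lambda_hat is (p, k), so lambda_hat @ lambda_hat.T is the (p, p) matrix
    internal = np.linalg.inv(lambda_hat @ lambda_hat.T + Psi) @ S
    result = np.trace(internal) - np.log(np.linalg.det(internal))

    return result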
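
The loop in step 3 can also be written with broadcasting. The sketch below checks that both versions agree on made-up eigenvalues and eigenvectors; the helper names `lambda_star_loop` and `lambda_star_vectorized` are illustrative and assume the eigenvalues are already sorted in descending order.

```python
import numpy as np

def lambda_star_loop(eigval, eigvec, k):
    """Step 3 as written in the notebook: one scaled eigenvector per factor."""
    columns = []
    for i in range(k):
        columns.append(max(eigval[i] - 1, 0) ** 0.5 * eigvec[:, i])
    return np.array(columns).T

def lambda_star_vectorized(eigval, eigvec, k):
    """Same result via broadcasting: scale the first k columns by c_i."""
    c = np.sqrt(np.clip(eigval[:k] - 1, 0, None))   # c_i = sqrt(max(theta_i - 1, 0))
    return eigvec[:, :k] * c

eigval_demo = np.array([5.0, 2.0, 0.5])     # made-up, descending
eigvec_demo = np.eye(3)                     # made-up orthonormal eigenvectors

two_factor_loop = lambda_star_loop(eigval_demo, eigvec_demo, k=2)
two_factor_vec = lambda_star_vectorized(eigval_demo, eigvec_demo, k=2)

# With k = 3 the third eigenvalue is below 1, so its column is clipped to zero
three_factor_vec = lambda_star_vectorized(eigval_demo, eigvec_demo, k=3)
```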

Let's generate some data that actually has k = 2 factors.

In [14]:
loadings = np.array([[2, 2, 2, 2, 2],
                     [1, 1, 0, -1, -1]]).T
specific_variance = np.array([2, 2, 10, 1, 1])
mu = np.array([10, 20, 30, 40, 50])

X = sim_factor_model(loadings, specific_variance, mu, nsim=10**5)

k=2 p=5


In [15]:
x_0_guess = np.array([2, 2, 10, 1, 1])
problem = minimize(fun=lambda x: calculate_objective(np.exp(x), X_data=X, k=2),
         x0=x_0_guess)
problem

  message: Optimization terminated successfully.
  success: True
   status: 0
      fun: 5.000031822293619
        x: [ 6.326e-01  7.364e-01  2.302e+00 -5.109e-02  3.294e-02]
      nit: 25
      jac: [ 1.609e-06 -1.073e-06 -2.980e-07 -1.192e-06  1.073e-06]
 hess_inv: [[ 2.390e+01 -1.881e+01 ...  4.766e+00 -4.207e+00]
            [-1.881e+01  1.774e+01 ... -4.197e+00  3.801e+00]
            ...
            [ 4.766e+00 -4.197e+00 ...  4.191e+01 -3.562e+01]
            [-4.207e+00  3.801e+00 ... -3.562e+01  3.378e+01]]
     nfev: 174
     njev: 29

The estimates are then:

In [16]:
psi_hat = np.diag(np.exp(problem.x))

k = 2
S = np.cov(X.T)
Psi_sq_inv = np.linalg.inv(psi_hat ** 0.5)
S_star = Psi_sq_inv @ S @ Psi_sq_inv
eigval, eigvec = np.linalg.eig(S_star)
lambda_star = []
for i in range(k):
    lambda_star.append(max(eigval[i] - 1, 0) ** 0.5 * eigvec[:,i])
lambda_star = np.array(lambda_star).T
lambda_hat = psi_hat ** 0.5 @ lambda_star

print("Estimated psi: ", psi_hat, 
      "",
      "Estimated lambda:", lambda_hat,
      sep="\n")

Estimated psi: 
[[1.88248212 0.         0.         0.         0.        ]
 [0.         2.08845499 0.         0.         0.        ]
 [0.         0.         9.98945812 0.         0.        ]
 [0.         0.         0.         0.95019627 0.        ]
 [0.         0.         0.         0.         1.0334897 ]]

Estimated lambda:
[[-1.77254165 -1.40832441]
 [-1.77151525 -1.35083897]
 [-1.95054833 -0.39189105]
 [-2.17857421  0.59117534]
 [-2.16727003  0.56890184]]
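
The printed $\hat\Lambda$ does not look identical to the `loadings` used in the simulation. That is expected: loadings are only determined up to an orthogonal rotation (including sign flips), since $(\Lambda G)(\Lambda G)' = \Lambda\Lambda'$ for any orthogonal $G$. A more meaningful comparison is therefore between $\hat\Lambda\hat\Lambda'$ and $\Lambda\Lambda'$. The sketch below does this using the values printed above; the variable names and the rotation `G` are chosen for illustration.

```python
import numpy as np

# Loadings used to simulate the two-factor data
true_loadings = np.array([[2, 2, 2, 2, 2],
                          [1, 1, 0, -1, -1]], dtype=float).T

# Estimated loadings, copied from the printed output above
lambda_hat_printed = np.array([[-1.77254165, -1.40832441],
                               [-1.77151525, -1.35083897],
                               [-1.95054833, -0.39189105],
                               [-2.17857421,  0.59117534],
                               [-2.16727003,  0.56890184]])

# Compare the implied common-variance matrices instead of the loadings themselves
implied_true = true_loadings @ true_loadings.T
implied_hat = lambda_hat_printed @ lambda_hat_printed.T
max_gap = np.abs(implied_hat - implied_true).max()

# Any orthogonal G (here a 90-degree rotation) leaves Lambda Lambda' unchanged
G = np.array([[0.0, -1.0],
              [1.0,  0.0]])
rotated = lambda_hat_printed @ G
same_after_rotation = np.allclose(rotated @ rotated.T, implied_hat)
```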
