# Factor Analysis

The Factor Analysis framework assumes that for each individual we observe $m$ realizations of $y$ and that there are $p$ unobserved realizations of $z$. These variables are related as follows:

- $ Y|Z \sim N\left(\sideset{^{CZ}}{}\mu + \sideset{^{CZ}}{}{L} Z, \sideset{^{CZ}}{}\Sigma \right) $
- $Z \sim N\left(0_p, I_p \right)$

where:
- $Cov(z_i, z_j) =\delta_{i,j}$. 

We will also assume that $\Sigma$ is a diagonal matrix.
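
To make the generative model concrete, here is a minimal sketch that draws samples from it with NumPy. The function name `sample_factor_model`, the argument `sigma2` (the diagonal of $\Sigma$) and the example values are illustrative assumptions, not part of the original text.

```python
import numpy as np

def sample_factor_model(mu, L, sigma2, n, seed=0):
    """Draw n samples (Z, Y) with Z ~ N(0, I_p) and Y | Z ~ N(mu + L Z, diag(sigma2))."""
    rng = np.random.default_rng(seed)
    m, p = L.shape
    Z = rng.standard_normal((n, p))                          # latent factors, one row per individual
    noise = rng.standard_normal((n, m)) * np.sqrt(sigma2)    # independent noise, variance sigma2[i]
    Y = mu + Z @ L.T + noise                                 # observed variables
    return Z, Y

# Hypothetical parameters with m = 3 observed and p = 2 latent variables.
mu = np.array([1.0, -2.0, 0.5])
L = np.array([[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 2.0]])
sigma2 = np.array([0.1, 0.2, 0.3])

Z, Y = sample_factor_model(mu, L, sigma2, n=1000)
print(Z.shape, Y.shape)
```

Only `Y` would be available in practice; `Z` is returned here so that estimates can be compared against the true latent values.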

The goal is, given $n$ i.i.d. observations of $Y$, to calculate $\Theta = \left \{\sideset{^{CZ}}{}\mu, \sideset{^{CZ}}{}L, \sideset{^{CZ}}{}\Sigma \right \}$ such that:

$$ \Theta = \underset{\Theta}{\operatorname{argmax}} \left \{ P \left(Y_1, \dots , Y_n | \Theta \right) \right \} $$
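
The objective can be evaluated directly for a candidate $\Theta$: integrating out $Z$ gives $Y \sim N(\mu, LL' + \Sigma)$, whose covariance entries are the $Cov(y_i, y_j)$ terms derived in the joint-distribution section. The sketch below is one possible way to compute the log-likelihood; the function name `log_likelihood` and the argument `sigma2` (the diagonal of $\Sigma$) are assumptions made for illustration.

```python
import numpy as np

def log_likelihood(Y, mu, L, sigma2):
    """Sum over the rows y of Y of log N(y; mu, L L' + diag(sigma2))."""
    n, m = Y.shape
    S = L @ L.T + np.diag(sigma2)                    # marginal covariance of Y
    D = Y - mu                                       # centred observations, shape (n, m)
    _, logdet = np.linalg.slogdet(S)                 # log|S| without overflow
    quad = np.sum(D * np.linalg.solve(S, D.T).T)     # sum_i d_i' S^{-1} d_i
    return -0.5 * (n * m * np.log(2 * np.pi) + n * logdet + quad)

# Hypothetical example: two observations, m = 2 observed and p = 1 latent variable.
Y_obs = np.array([[0.5, 1.0],
                  [-0.2, 0.3]])
mu = np.array([0.0, 0.5])
L = np.array([[1.0],
              [0.5]])
sigma2 = np.array([0.2, 0.4])
print(log_likelihood(Y_obs, mu, L, sigma2))
```

Tracking this value across EM iterations is a common sanity check, since each iteration should not decrease it.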

The calculation strategy is as follows:

- For a given candidate $\Theta = \left \{\sideset{^{CZ}}{}\mu, \sideset{^{CZ}}{}L, \sideset{^{CZ}}{}\Sigma \right \}$ we calculate the joint distribution of $(Z, Y)\sim N\left (\sideset{^{J}}{}\mu, \sideset{^{J}}{}\Sigma \right )$

- Next, we calculate the posterior probability distribution of $Z|Y$, i.e. $Z|Y \sim N \left(\sideset{^{CY}}{}\mu, \sideset{^{CY}}{}\Sigma \right )$

- At this point we can apply the EM algorithm, i.e. calculate:

$$ Q \left( \Theta_{new}, \Theta_{old} \right) = E_{Z|Y,\Theta_{old}} \left[ \log P\left( Z, Y | \Theta_{new} \right)  \right]$$

The reader will have noticed that we use a top-left superscript to indicate which stage of the calculation a quantity belongs to: __CZ__ for conditioned on Z, __J__ for the joint distribution and __CY__ for conditioned on Y.

The following sections derive $\sideset{^{J}}{}P$ and $\sideset{^{CY}}{}P$. We then present the implementation of the calculation framework and finally show the results on randomly generated data.

## Calculation of the joint distribution

Since $Y|Z$ is multivariate normal, $Z$ is also multivariate normal, and the relation between $Y$ and $Z$ is linear, the joint distribution is also multivariate normal. So to calculate the joint distribution, all we need to do is calculate $\sideset{^{J}}{} \mu$ and $\sideset{^{J}}{} \Sigma$.

We will define:

$$ X = \begin{pmatrix}
Z \\ Y
\end{pmatrix}
$$ 

### Calculation of $\sideset{^{J}}{} \mu$
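Since $Z \sim N\left(0_p, I_p \right)$ and therefore $E[Y] = \sideset{^{CZ}}{}\mu + \sideset{^{CZ}}{}L \, E[Z] = \sideset{^{CZ}}{}\mu$, it is easy to see that: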

$$ \sideset{^{J}}{} \mu = \begin{pmatrix}
0_p \\ 
\sideset{^{CZ}}{} \mu
\end{pmatrix} $$ 

where $0_p$ is a vector of dimension $p$ with all its elements equal to 0.

### Calculation of $\sideset{^{J}}{} \Sigma $

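Given that $y_i = \mu_i + \sum_k l_{i, k}z_k + \sigma_i \epsilon_i$, where $\mu_i$ is the $i$-th entry of $\sideset{^{CZ}}{}\mu$, $\sigma_i^2$ is the $i$-th diagonal entry of $\sideset{^{CZ}}{}\Sigma$, and the $\epsilon_i \sim N(0, 1)$ are independent of $Z$ and of each other, all we need to do is calculate $Cov(x_i, x_j)$.

- $ Cov(z_i, z_j) = \delta_{i, j} $
- $ Cov(y_i, z_j) = l_{i,j} $
- $ Cov(y_i, y_i) = \sum_k l_{i, k}^2 + \sigma_i^2$
- $ Cov(y_i, y_j) = \sum_k l_{i, k} l_{j, k} $ for $i \neq j$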
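
In matrix form, the entries above say that the joint covariance has blocks $I_p$, $L'$, $L$ and $LL' + \Sigma$. The following sketch assembles the joint mean and covariance with NumPy; the function name `joint_moments`, the argument `sigma2` (the diagonal of $\Sigma$) and the example values are assumptions chosen for illustration.

```python
import numpy as np

def joint_moments(mu, L, sigma2):
    """Return the mean and covariance of X = (Z, Y) for the factor model.

    mu     : (m,) mean of Y
    L      : (m, p) loading matrix
    sigma2 : (m,) diagonal of the noise covariance
    """
    m, p = L.shape
    mu_joint = np.concatenate([np.zeros(p), mu])    # the Z block of the mean is zero
    sigma_joint = np.block([
        [np.eye(p), L.T],                            # Cov(Z, Z), Cov(Z, Y)
        [L, L @ L.T + np.diag(sigma2)],              # Cov(Y, Z), Cov(Y, Y)
    ])
    return mu_joint, sigma_joint

# Hypothetical parameters with m = 3 and p = 2.
mu = np.array([1.0, -2.0, 0.5])
L = np.array([[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 2.0]])
sigma2 = np.array([0.1, 0.2, 0.3])

mu_joint, sigma_joint = joint_moments(mu, L, sigma2)
print(sigma_joint)
```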

## Calculation of the probability distribution of Z conditioned on Y

Let  
$$\sideset{^{J}}{} \Sigma = \begin{pmatrix} 
                    \sideset{^{J}}{} \Sigma_{1, 1} & \sideset{^{J}}{} \Sigma_{1, 2} \\
                    \sideset{^{J}}{} \Sigma_{2, 1} & \sideset{^{J}}{} \Sigma_{2, 2}
                    \end{pmatrix} \text{ with sizes }
                    \begin{pmatrix} 
                    p \times p & p \times m \\
                    m \times p & m \times m
                    \end{pmatrix}
                    $$
                    
Then:

<center>
    $\sideset{^{CY}}{} \Sigma =\sideset{^{J}}{} \Sigma_{1, 1} -\sideset{^{J}}{}\Sigma_{1, 2} 
\sideset{^{J}}{} \Sigma_{2, 2}^{-1} \sideset{^{J}}{} \Sigma_{2, 1}$ 
</center>


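The conditional mean follows from the same partition: $\sideset{^{CY}}{} \mu = \sideset{^{J}}{}\Sigma_{1, 2} \sideset{^{J}}{} \Sigma_{2, 2}^{-1} \left(y - \sideset{^{CZ}}{}\mu\right)$, since the $Z$ block of $\sideset{^{J}}{}\mu$ is zero. It enters the expectations below through $E_{Z|Y,\Theta_{old}}[z_i]$.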
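
The conditional moments of $Z$ given an observed $y$ can be computed directly from the blocks of the joint covariance. The sketch below returns the conditional covariance from the formula above together with the conditional mean $\Sigma_{1,2}\Sigma_{2,2}^{-1}(y - \mu)$, which is the standard Gaussian conditioning result and is what $E_{Z|Y,\Theta_{old}}[z_i]$ evaluates to in the $y_i$ vs $z_j$ term. The function name `posterior_z_given_y` and the argument `sigma2` are illustrative assumptions.

```python
import numpy as np

def posterior_z_given_y(y, mu, L, sigma2):
    """Return the mean and covariance of Z | Y = y for the factor model."""
    m, p = L.shape
    S11 = np.eye(p)                            # Cov(Z, Z)
    S12 = L.T                                  # Cov(Z, Y), shape (p, m)
    S22 = L @ L.T + np.diag(sigma2)            # Cov(Y, Y), shape (m, m)
    # K = S12 @ inv(S22), computed with a linear solve (S22 is symmetric).
    K = np.linalg.solve(S22, S12.T).T
    sigma_cy = S11 - K @ S12.T                 # conditional covariance
    mu_cy = K @ (y - mu)                       # conditional mean (Z has prior mean 0)
    return mu_cy, sigma_cy

# Hypothetical parameters with m = 3 and p = 2, and one observation y.
mu = np.array([1.0, -2.0, 0.5])
L = np.array([[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 2.0]])
sigma2 = np.array([0.1, 0.2, 0.3])
y = np.array([2.0, -1.0, 1.5])

mu_cy, sigma_cy = posterior_z_given_y(y, mu, L, sigma2)
print(mu_cy)
print(sigma_cy)
```

Note that the conditional covariance does not depend on `y`, so in an EM loop it can be computed once per iteration and reused for every observation.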

## Calculation of $ Q \left( \Theta_{new}, \Theta_{old} \right) $ 

In order to compute $ Q \left( \Theta_{new}, \Theta_{old} \right) = E_{Z|Y,\Theta_{old}} \left[ \log P\left( Z, Y | \Theta_{new} \right)  \right]$, we write the expectation as an integral over $Z$, making the density of $Z|Y$ explicit.

That is: 
<center>

$ Q \left( \Theta_{new}, \Theta_{old} \right)\propto 
\int_{Z} \left (
-\frac{1}{2}(x - \sideset{^{J}}{}\mu_{new})'
\sideset{^{J}}{} \Sigma_{new}^{-1}
(x - \sideset{^{J}}{}\mu_{new}) 
- \frac{1}{2} \log |\sideset{^{J}}{} \Sigma_{new}| 
\right) 
\frac{\exp
\left (
-\frac{1}{2}(z - \sideset{^{CY}}{}\mu_{old})'
\sideset{^{CY}}{} \Sigma_{old}^{-1}
(z - \sideset{^{CY}}{}\mu_{old}) 
\right) 
}
{\sqrt{|\sideset{^{CY}}{} \Sigma_{old}|}} dz$
    </center>

This integral is fairly simple: we are integrating over $Z$ a quadratic form in $X = (Z, Y)$ minus the log-determinant term, which does not depend on $Z$ and is therefore a constant under the integral.

The quadratic form contains four types of terms: $z_i$ vs $z_i$, $z_i$ vs $z_j$, $y_i$ vs $z_j$ and $y_i$ vs $y_j$.

### $z_i$ vs $z_i$:


$ \begin{align}
E_{Z|Y,\Theta_{old}}[\sideset{^{J}}{} \Sigma_{new, i, i}^{-1}(z_i - \sideset{^{J}}{} \mu_{new, i})^2] &= \sideset{^{J}}{} \Sigma_{new, i, i}^{-1}E_{Z|Y,\Theta_{old}}[(z_i - \sideset{^{J}}{} \mu_{new, i})^2] \\
 &=  \sideset{^{J}}{} \Sigma_{new, i, i}^{-1}E_{Z|Y,\Theta_{old}}[(z_i^2 - 2\sideset{^{J}}{} \mu_{new, i} z_i + \sideset{^{J}}{} \mu_{new, i}^2)]\\
 &= \sideset{^{J}}{} \Sigma_{new, i, i}^{-1}E_{Z|Y,\Theta_{old}}[z_i^2] \\
 &= \sideset{^{J}}{} \Sigma_{new, i, i}^{-1} \left( \sideset{^{CY}}{} \Sigma_{old, i, i} + \left(\sideset{^{CY}}{} \mu_{old, i}\right)^2 \right)
\end{align}
$

The third line uses $ \sideset{^{J}}{} \mu_{new, i} = 0$ (the $Z$ block of the joint mean is zero), and the last line uses $E[z_i^2] = Var(z_i) + E[z_i]^2$ under $Z|Y, \Theta_{old}$.


###  $z_i$ vs $z_j$:

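$ \begin{align}
E_{Z|Y,\Theta_{old}}[\sideset{^{J}}{} \Sigma_{new, i, j}^{-1}(z_i - \sideset{^{J}}{} \mu_{new, i})(z_j - \sideset{^{J}}{} \mu_{new, j})] &= \sideset{^{J}}{} \Sigma_{new, i, j}^{-1} E_{Z|Y,\Theta_{old}}[z_i z_j] \\
&= \sideset{^{J}}{} \Sigma_{new, i, j}^{-1} \left( \sideset{^{CY}}{} \Sigma_{old, i, j} + \sideset{^{CY}}{} \mu_{old, i} \sideset{^{CY}}{} \mu_{old, j} \right)
\end{align}$

Again since $\sideset{^{J}}{} \mu_{new, i} = \sideset{^{J}}{} \mu_{new, j} = 0$, and using $E[z_i z_j] = Cov(z_i, z_j) + E[z_i]E[z_j]$ under $Z|Y, \Theta_{old}$.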

### $y_i$ vs $z_j$:

$\begin{align}
E_{Z|Y,\Theta_{old}}[\sideset{^{J}}{}  \Sigma_{new, i, j}^{-1}(z_i - \sideset{^{J}}{} \mu_{new, i})(y_j - \sideset{^{J}}{} \mu_{new, j})] &= 
\sideset{^{J}}{} \Sigma_{new, i, j}^{-1}E_{Z|Y,\Theta_{old}}[(z_i - \sideset{^{J}}{} \mu_{new, i})(y_j - \sideset{^{J}}{} \mu_{new, j})] \\ & = 
\sideset{^{J}}{} \Sigma_{new, i, j}^{-1}E_{Z|Y,\Theta_{old}}[(z_i y_j - \sideset{^{J}}{} \mu_{new, i} y_j - \sideset{^{J}}{} \mu_{new, j}z_i + \sideset{^{J}}{} \mu_{new, i} \sideset{^{J}}{} \mu_{new, j})] \\
& = 
\sideset{^{J}}{} \Sigma_{new, i, j}^{-1}E_{Z|Y,\Theta_{old}}[(z_i y_j - \sideset{^{J}}{} \mu_{new, j}z_i)]
\\ & = 
\sideset{^{J}}{} \Sigma_{new, i, j}^{-1}(y_j - \sideset{^{J}}{} \mu_{new, j})E_{Z|Y,\Theta_{old}}[z_i] \\ 
&= \sideset{^{J}}{} \Sigma_{new, i, j}^{-1}(y_j - \sideset{^{J}}{} \mu_{new, j})\sideset{^{CY}}{} \mu_{old, i}
\end{align}
$


### $y_i$ vs $y_j$:

$E_{Z|Y,\Theta_{old}}[\sideset{^{J}}{}  \Sigma_{new, i, j}^{-1}(y_i - \sideset{^{J}}{} \mu_{new, i})(y_j - \sideset{^{J}}{} \mu_{new, j})] = \sideset{^{J}}{}  \Sigma_{new, i, j}^{-1}(y_i - \sideset{^{J}}{} \mu_{new, i})(y_j - \sideset{^{J}}{} \mu_{new, j})$
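
Summing the four types of terms gives a compact closed form for the expected quadratic form. Writing $e = (\sideset{^{CY}}{}\mu_{old}, y)$ for the conditional mean of $X$ and $A$ for $\sideset{^{J}}{}\Sigma_{new}^{-1}$, the expectation equals $(e - \sideset{^{J}}{}\mu_{new})' A (e - \sideset{^{J}}{}\mu_{new}) + tr(A_{zz} \sideset{^{CY}}{}\Sigma_{old})$, where $A_{zz}$ is the top-left $p \times p$ block of $A$. This relies on the standard Gaussian second moment $E[zz'] = \sideset{^{CY}}{}\Sigma_{old} + \sideset{^{CY}}{}\mu_{old}\sideset{^{CY}}{}\mu_{old}'$. The sketch below is a hypothetical helper (the name `expected_quadratic` and its arguments are assumptions) that evaluates this expression.

```python
import numpy as np

def expected_quadratic(A, mean_new, mu_cy, sigma_cy, y):
    """E[(x - mean_new)' A (x - mean_new)] for x = (z, y), with z ~ N(mu_cy, sigma_cy) and y fixed.

    A        : (p + m, p + m) symmetric matrix, e.g. the inverse joint covariance
    mean_new : (p + m,) joint mean under the new parameters
    """
    p = mu_cy.shape[0]
    e = np.concatenate([mu_cy, y])                   # E[x | y]: z replaced by its conditional mean
    d = e - mean_new
    # Only the z block of x is random, so only A[:p, :p] meets the covariance.
    return d @ A @ d + np.trace(A[:p, :p] @ sigma_cy)

# Hypothetical example with p = 2 latent and m = 1 observed variable.
A = np.eye(3)
mean_new = np.zeros(3)
mu_cy = np.array([1.0, 2.0])
sigma_cy = np.diag([0.5, 0.25])
y = np.array([3.0])

value = expected_quadratic(A, mean_new, mu_cy, sigma_cy, y)
print(value)
```

With the identity matrix and a zero mean, the result reduces to $\sum_i E[z_i^2] + y^2$, which makes the contribution of each case easy to check by hand.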