# Principal Component Analysis #

Principal Component Analysis (PCA) is based on the spectral decomposition of a symmetric matrix. For a symmetric matrix $A$, we seek the decomposition $A = W \Lambda W^\dag$, where $\Lambda$ is a diagonal matrix of eigenvalues and the columns of $W$ are the corresponding orthonormal eigenvectors. This is useful because the columns of $W$ form an orthonormal basis in which $A$ is diagonal. PCA is typically applied to a covariance or correlation matrix.

While PCA is the most popular and straightforward orthogonalization technique, it is by no means the only method.

Let $X$ denote the $T \times n$ matrix of data ($T$ data points, $n$ variables). We denote the columns of $X$ by $\{\vec{x}_1,\dots,\vec{x}_n\}$, where each vector $\vec{x}_i$ holds the $T$ observations of one explanatory variable. We assume each $\vec{x}_i$ has zero mean.

The sample variances and covariances of this data are summarized by the matrix

$$
V = T^{-1} X^\dag X.
$$

If we also normalize the data so that each $\vec{x}_i$ has zero mean and unit variance, $V$ is the correlation matrix of the data.
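As a minimal sketch of the two matrices just described, the following NumPy snippet builds $V$ from centred data and then the correlation matrix from standardized data. The random data, the mixing matrix, and the names `Z` and `C` are illustrative assumptions, not part of the text.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n = 500, 3

# Hypothetical data: T observations of n correlated variables.
mixing = np.array([[1.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 1.0]])
X = rng.normal(size=(T, n)) @ mixing

# Centre each column so every x_i has zero mean.
X = X - X.mean(axis=0)

# Covariance matrix V = T^{-1} X^T X (divides by T, not T - 1).
V = X.T @ X / T

# Standardize to unit variance; np.std uses ddof=0 by default,
# which matches the T^{-1} normalization above.
Z = X / X.std(axis=0)

# With standardized columns the same formula gives the correlation matrix.
C = Z.T @ Z / T
```

Note that `np.cov` divides by $T - 1$ unless `bias=True` is passed, so its default output differs from $V$ by a factor of $T/(T-1)$.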

Sometimes we do not have enough data ($T < n$). Then $V$ has rank at most $T$, so it is singular and at least $n - T$ of its eigenvalues are zero. In such a scenario, a full set of $n$ principal components cannot be determined. This is usually not a serious problem, since we typically look only at the first few, most important principal components. When $T > n$ and the columns of $X$ are linearly independent, $V$ is positive definite; in general it is only positive semi-definite.
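The rank deficiency for $T < n$ can be checked numerically. The sketch below uses hypothetical random data with $T = 5$ and $n = 8$; the tolerance `1e-10` for treating an eigenvalue as zero is an arbitrary assumption.

```python
import numpy as np

rng = np.random.default_rng(1)
T, n = 5, 8  # fewer observations than variables

# Hypothetical data, centred column by column.
X = rng.normal(size=(T, n))
X = X - X.mean(axis=0)

V = X.T @ X / T

# eigvalsh is meant for symmetric matrices; eigenvalues come back ascending.
eigvals = np.linalg.eigvalsh(V)

rank = np.linalg.matrix_rank(V)
# Count eigenvalues that are zero up to floating-point error.
n_zero = int(np.sum(eigvals < 1e-10))

# Centring removes one more degree of freedom, so rank <= T - 1 here.
print(f"rank = {rank}, numerically zero eigenvalues = {n_zero}")
```

Only as many principal components as the rank of $V$ carry any variance; the remaining directions have zero eigenvalue.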

We have not yet defined what a principal component is. A principal component is a linear combination of the columns of $X$, where the weights are chosen such that:

1) The principal components are uncorrelated with each other

2) The first principal component explains the most variation, the second explains the greatest amount of the remaining variation, etc.

We now describe how to construct them.

Let $\Lambda$ denote the diagonal matrix of the eigenvalues of $V$, and $W$ the orthogonal matrix whose columns are the corresponding eigenvectors of $V$. The eigenvalues (and corresponding eigenvectors) are ordered from largest to smallest, $\lambda_1 \geq \lambda_2 \geq \dots \geq \lambda_n$. We define the matrix of principal components $P$ as:

$$
P = XW.
$$

Then the $i$th principal component of $V$ is the $i$th column of $P$.
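This construction satisfies both defining properties: since $T^{-1} P^\dag P = W^\dag V W = \Lambda$ is diagonal, the components are uncorrelated, and the variance of the $i$th component is $\lambda_i$. The following NumPy sketch carries out the steps and checks this identity. The function name `principal_components` and the random test data are illustrative assumptions, not part of the text.

```python
import numpy as np

def principal_components(X):
    """Return (P, W, eigvals) for a T x n data matrix X.

    Eigenvalues are sorted from largest to smallest, and the columns
    of W and P are ordered to match.
    """
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)           # enforce the zero-mean assumption
    T = X.shape[0]
    V = X.T @ X / T                  # covariance matrix V = T^{-1} X^T X

    # eigh handles symmetric matrices and returns ascending eigenvalues.
    eigvals, W = np.linalg.eigh(V)
    order = np.argsort(eigvals)[::-1]
    eigvals, W = eigvals[order], W[:, order]

    P = X @ W                        # principal components, P = X W
    return P, W, eigvals

# Hypothetical data: 200 observations of 4 correlated variables.
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 4)) @ rng.normal(size=(4, 4))

P, W, lam = principal_components(X)

# Covariance of the components; should equal diag(lam) up to rounding.
cov_P = P.T @ P / P.shape[0]

# Fraction of total variance explained by each component.
explained = lam / lam.sum()
```

The signs of the eigenvectors returned by `np.linalg.eigh` are arbitrary, so individual columns of `P` may be flipped between runs or libraries without affecting the variances.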