# Fundamental Factor Models

Fundamental factor models start with asset returns $r_t$ and factor loadings $B_t$, and estimate the factor returns $f_t$ and the idiosyncratic (idio) returns $\epsilon_t$.

Process:
- Data ingestion: correctness, outliers, consistency across vendors, missing data
- Estimation universe selection
- Winsorization: identify outliers and winsorize
- Loading generation: generate $B_t$
- Cross-sectional regression: estimate $f_t$ and $\epsilon_t$
- Time-series estimation:
  - factor cov matrix
  - idio cov matrix
  - risk-adjusted performance of factor returns
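
As an illustration of the winsorization step, here is a minimal sketch that clips values outside chosen percentiles. The function name `winsorize` and the 1st/99th percentile thresholds are assumptions made for this example, not something prescribed by these notes.

```python
import numpy as np

def winsorize(x, lower_pct=1.0, upper_pct=99.0):
    """Clip values outside the given percentiles (thresholds are illustrative)."""
    x = np.asarray(x, dtype=float)
    lo, hi = np.nanpercentile(x, [lower_pct, upper_pct])
    return np.clip(x, lo, hi)

# Simulated cross-section of returns with one obviously bad data point
rng = np.random.default_rng(0)
returns = rng.normal(0.0, 0.02, size=1000)
returns[0] = 5.0
clean = winsorize(returns)
```

Other choices (for example thresholds based on a robust z-score) fit the same step; percentile clipping is just the simplest to show.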

## Cross-sectional regression

Starting with a single period model:

$$ r_t = Bf_t + \epsilon_t $$

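where $r_t \in R^{n}$, $f_t \in R^{m}$ and $\epsilon_t \in R^{n}$ are column vectors and $B \in R^{n \times m}$, for $n$ assets and $m$ factors. To estimate $f_t$, we solve the following optimization problem:

$$ \min ||r_t - Bf_t||^2 $$
$$ \text{s.t. } f_t \in R^{m}$$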

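However, the idio returns $\epsilon_t$ are usually heteroskedastic (their variances differ across assets). Given the $n \times n$ idio covariance matrix $\Omega_{\epsilon}$, where $n$ is the number of assets, we whiten the model by pre-multiplying with $\Omega_{\epsilon}^{-1/2}$ so that the transformed errors are homoskedastic:

$$ \Omega_{\epsilon}^{-1/2}r_t = \Omega_{\epsilon}^{-1/2}Bf_t + \Omega_{\epsilon}^{-1/2}\epsilon_t $$
$$ var(\Omega_{\epsilon}^{-1/2}\epsilon_t) = 
    \Omega_{\epsilon}^{-1/2}\Omega_{\epsilon}\Omega_{\epsilon}^{-1/2} = I_n
$$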

And the optimization problem becomes

$$ \min ||\Omega_{\epsilon}^{-1/2}(r_t - Bf_t)||^2 $$
$$ \text{s.t. } f_t \in R^{m}$$

The solution follows from the normal equations and is the generalized least squares (GLS) estimator:

$$ \hat{f}_t = (B^T\Omega_{\epsilon}^{-1}B)^{-1}B^T\Omega_{\epsilon}^{-1}r_t $$
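
The estimator above can be sketched in NumPy as follows. This is a minimal sketch that assumes $\Omega_{\epsilon}$ is diagonal, so it is passed as a vector of idio variances; the function name `estimate_factor_returns` and the simulated data are hypothetical.

```python
import numpy as np

def estimate_factor_returns(r, B, idio_var):
    """Weighted least squares with Omega_eps = diag(idio_var) (assumed diagonal).

    r: (n,) asset returns, B: (n, m) loadings, idio_var: (n,) idio variances.
    Returns the factor return estimate (m,) and the idio returns (n,).
    """
    BtW = B.T / idio_var                       # B^T Omega^{-1}, shape (m, n)
    f_hat = np.linalg.solve(BtW @ B, BtW @ r)  # solve instead of explicit inverse
    eps_hat = r - B @ f_hat
    return f_hat, eps_hat

# Simulated single period: 500 assets, 3 factors
rng = np.random.default_rng(0)
n, m = 500, 3
B = rng.normal(size=(n, m))
idio_var = rng.uniform(0.5, 2.0, size=n) * 1e-4
f_true = np.array([0.01, -0.02, 0.005])
r = B @ f_true + rng.normal(size=n) * np.sqrt(idio_var)
f_hat, eps_hat = estimate_factor_returns(r, B, idio_var)
```

Using `np.linalg.solve` avoids forming $(B^T\Omega_{\epsilon}^{-1}B)^{-1}$ explicitly, which is usually the more stable choice.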

In the multi-period case, where the factor returns form a matrix $F \in R^{m \times T}$, the method stays the same: each period's factor returns are solved independently and stacked as the columns of $F$.
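
When the loadings and the idio covariance are held fixed across periods (an assumption made here for brevity), all periods can be solved in one call by passing the return matrix as the right-hand side. The name `estimate_factor_returns_panel` is hypothetical, and $\Omega_{\epsilon}$ is again assumed diagonal.

```python
import numpy as np

def estimate_factor_returns_panel(R, B, idio_var):
    """Solve every period at once. R: (n, T) returns -> F_hat: (m, T)."""
    BtW = B.T / idio_var                      # B^T Omega^{-1} for diagonal Omega
    return np.linalg.solve(BtW @ B, BtW @ R)  # one column of F_hat per period

# Simulated panel: 300 assets, 4 factors, 20 periods
rng = np.random.default_rng(1)
n, m, T = 300, 4, 20
B = rng.normal(size=(n, m))
idio_var = rng.uniform(0.5, 2.0, size=n) * 1e-4
F_true = rng.normal(0.0, 0.01, size=(m, T))
R = B @ F_true + rng.normal(size=(n, T)) * np.sqrt(idio_var)[:, None]
F_hat = estimate_factor_returns_panel(R, B, idio_var)
```

With time-varying loadings $B_t$, the same solve would instead run inside a loop over $t$.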

### Idio covariance matrix

We don't know the idio covariance matrix before solving for the factor returns, yet we need it to solve for them: a chicken-and-egg problem. Borrowing from the Expectation-Maximization procedure, we can iterate:
- start with an identity idio covariance matrix
- solve for the factor returns, then compute the idio returns and a new idio covariance matrix from them
- solve for the factor returns again using the new idio covariance matrix, update the idio covariance matrix, and repeat until the estimates stabilize.
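
The iteration above can be sketched as follows. This is a sketch under stated assumptions: only the diagonal of the idio covariance matrix is estimated, the number of iterations is fixed rather than checked for convergence, and the names `iterate_idio_variances`, `n_iter` and `var_floor` are hypothetical.

```python
import numpy as np

def iterate_idio_variances(R, B, n_iter=5, var_floor=1e-10):
    """Alternate between factor returns and (diagonal) idio variances.

    R: (n, T) returns, B: (n, m) loadings.
    """
    n, T = R.shape
    idio_var = np.ones(n)                       # start from the identity
    for _ in range(n_iter):
        BtW = B.T / idio_var
        F = np.linalg.solve(BtW @ B, BtW @ R)   # factor returns given the weights
        E = R - B @ F                           # idio returns
        # Re-estimate the idio variances; the floor guards against zero weights
        idio_var = np.maximum((E ** 2).mean(axis=1), var_floor)
    return F, idio_var

# Simulated heteroskedastic panel
rng = np.random.default_rng(2)
n, m, T = 200, 3, 250
B = rng.normal(size=(n, m))
true_var = rng.uniform(0.5, 4.0, size=n) * 1e-4
F_true = rng.normal(0.0, 0.01, size=(m, T))
R = B @ F_true + rng.normal(size=(n, T)) * np.sqrt(true_var)[:, None]
F_hat, idio_var_hat = iterate_idio_variances(R, B)
```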

If we have $T$ time stamps, we can use the first half of the sample to get an initial estimate of the idio covariance matrix, then use and update it in a walk-forward manner:
- Use the idio covariance matrix from time $t-1$ to estimate the factor returns and idio returns at time $t$
- Update the idio covariance matrix with the new idio returns from time $t$
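
A walk-forward version might look like the sketch below. The notes do not say how the update is done, so the exponentially weighted update (and its `halflife` parameter) is an assumption of this example, as are the diagonal idio covariance, the OLS initialization, and the name `walk_forward`.

```python
import numpy as np

def walk_forward(R, B, init_frac=0.5, halflife=60):
    """R: (n, T) returns, B: (n, m) loadings. Returns out-of-sample F and final variances."""
    n, T = R.shape
    T0 = int(T * init_frac)
    # Initial idio variances from OLS residuals on the first part of the sample
    F0 = np.linalg.lstsq(B, R[:, :T0], rcond=None)[0]
    idio_var = ((R[:, :T0] - B @ F0) ** 2).mean(axis=1)
    lam = 0.5 ** (1.0 / halflife)               # EWMA decay (assumed update rule)
    F = np.empty((B.shape[1], T - T0))
    for k, t in enumerate(range(T0, T)):
        BtW = B.T / idio_var                    # uses information up to t-1 only
        F[:, k] = np.linalg.solve(BtW @ B, BtW @ R[:, t])
        eps = R[:, t] - B @ F[:, k]
        idio_var = lam * idio_var + (1.0 - lam) * eps ** 2   # update with time-t idio returns
    return F, idio_var

# Simulated panel
rng = np.random.default_rng(4)
n, m, T = 200, 3, 200
B = rng.normal(size=(n, m))
true_var = rng.uniform(0.5, 4.0, size=n) * 1e-4
F_true = rng.normal(0.0, 0.01, size=(m, T))
R = B @ F_true + rng.normal(size=(n, T)) * np.sqrt(true_var)[:, None]
F_oos, idio_var_last = walk_forward(R, B)
```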

### Rank-deficient Loading Matrices

An example: the model has country factors and industry factors, and each asset belongs to exactly one country and exactly one industry. The country columns of $B$ then sum to a vector of ones, and so do the industry columns. The vector $v$ that assigns 1 to every country factor, -1 to every industry factor and 0 everywhere else therefore satisfies $Bv = 0$, so $B$ has a non-trivial null space and is rank-deficient. If $B$ is rank-deficient, so is $B^TB$ (and likewise $B^T\Omega_{\epsilon}^{-1}B$), so we can't invert it and the factor returns estimate $\hat{f}_t$ is not uniquely determined.
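
A small numerical check of this argument, using made-up country and industry assignments (all names and sizes here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_countries, n_industries = 12, 2, 3
country = rng.integers(0, n_countries, size=n)     # one country per asset
industry = rng.integers(0, n_industries, size=n)   # one industry per asset

# One-hot loadings: country dummies followed by industry dummies
B = np.hstack([np.eye(n_countries)[country], np.eye(n_industries)[industry]])

# +1 on every country factor, -1 on every industry factor
v = np.concatenate([np.ones(n_countries), -np.ones(n_industries)])

rank = np.linalg.matrix_rank(B)
in_null_space = np.allclose(B @ v, 0.0)
print(rank, B.shape[1], in_null_space)
```

The printed rank is strictly smaller than the number of columns, and `B @ v` is the zero vector.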

Ways to deal with this:
- Add ridge regularization: minimize $||r_t-Bf_t||^2 + \delta ||f_t||^2$. As $\delta \to 0$ (the ridgeless limit), the solution converges to the minimum-norm least-squares solution, which we can compute with the pseudo-inverse: $\hat{f}_t = (B^TB)^{+}B^Tr_t$.
- Or add a constraint, for example $w^TB_{ind}f_{ind}=0$, where $w$ is the vector of the firms' market-cap weights. This forces the cap-weighted industry factor returns to sum to 0.
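
Both remedies can be sketched on a toy country/industry model. This sketch uses unweighted least squares for brevity, hypothetical data, and solves the constrained problem through its KKT system, which is one of several ways to impose a linear constraint.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_c, n_i = 12, 2, 3
idx = np.arange(n)
B_cty = np.eye(n_c)[idx % n_c]          # country dummies
B_ind = np.eye(n_i)[idx % n_i]          # industry dummies
B = np.hstack([B_cty, B_ind])           # rank 4 with 5 columns
r = rng.normal(0.0, 0.02, size=n)
m = B.shape[1]

# Option 1: ridge with a small delta approaches the pseudo-inverse solution
delta = 1e-6
f_ridge = np.linalg.solve(B.T @ B + delta * np.eye(m), B.T @ r)
f_pinv = np.linalg.pinv(B) @ r

# Option 2: impose w^T B_ind f_ind = 0 via a Lagrange multiplier (KKT system)
w = rng.uniform(1.0, 10.0, size=n)
w /= w.sum()                            # market-cap weights (made up)
a = np.concatenate([np.zeros(n_c), B_ind.T @ w])   # constraint vector: a^T f = 0
kkt = np.block([[B.T @ B, a[:, None]],
                [a[None, :], np.zeros((1, 1))]])
rhs = np.concatenate([B.T @ r, [0.0]])
f_con = np.linalg.solve(kkt, rhs)[:m]   # drop the multiplier
```

The two options give different factor returns but the same fitted values `B @ f`, since they differ only by a multiple of the null-space vector.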

## Estimating Factor Covariance Matrix

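The empirical estimator built from the estimated factor returns is

$$\Omega_T^{emp} = T^{-1}\sum^T_{t=1}\hat{f}_t\hat{f}_t^T$$

For Gaussian factor returns, the relative standard error of each variance estimate is approximately $\sqrt{2/T}$, so the sampling error vanishes as $T \to \infty$ and, ignoring the estimation error in $\hat{f}_t$, $\Omega_T^{emp} \to \Omega_f$.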
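
The $\sqrt{2/T}$ rate can be checked with a small Monte Carlo experiment. The sketch below assumes zero-mean Gaussian factor returns observed without error; `empirical_factor_cov` and the simulation sizes are hypothetical.

```python
import numpy as np

def empirical_factor_cov(F_hat):
    """Uncentered empirical covariance. F_hat: (m, T) -> (m, m)."""
    m, T = F_hat.shape
    return F_hat @ F_hat.T / T

# Repeat the variance estimate many times for a single factor
rng = np.random.default_rng(0)
T, n_trials, sigma = 100, 2000, 0.01
est = np.array([
    empirical_factor_cov(rng.normal(0.0, sigma, size=(1, T)))[0, 0]
    for _ in range(n_trials)
])
rel_se = est.std() / sigma ** 2      # relative standard error across trials
print(rel_se, np.sqrt(2.0 / T))      # the two numbers should be close
```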

The issues are:
- The number of factors is often not much smaller than the number of observations, which makes the estimate noisy
- The factor covariance estimate is inflated by the estimation error in $\hat{f}_t$
- Factor returns are non-stationary, especially during crises
- Factor returns are mildly autocorrelated

### Factor covariance matrix shrinkage

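The variance of the estimated factor returns, $var(\hat{f}_t)$, is an upward-biased estimate of $\Omega_f$:

$$var(\hat{f}_t) = \Omega_f + (B^T\Omega_{\epsilon}^{-1}B)^{-1}$$

To see why, substitute $r_t = Bf_t + \epsilon_t$ into the estimator:

$$\hat{f}_t = (B^T\Omega_{\epsilon}^{-1}B)^{-1}B^T\Omega_{\epsilon}^{-1}(Bf_t + \epsilon_t) = f_t + (B^T\Omega_{\epsilon}^{-1}B)^{-1}B^T\Omega_{\epsilon}^{-1}\epsilon_t$$

Assuming $f_t$ and $\epsilon_t$ are uncorrelated, the variances of the two terms add. Writing $A = (B^T\Omega_{\epsilon}^{-1}B)^{-1}$, the second term has variance $AB^T\Omega_{\epsilon}^{-1}\Omega_{\epsilon}\Omega_{\epsilon}^{-1}BA = A$, which is the extra term above.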

To correct this, we set
$$ \hat{\Omega}_f = var(\hat{f}_t) - (B^T\Omega_{\epsilon}^{-1}B)^{-1}$$

This can be combined with Ledoit-Wolf shrinkage toward a scaled identity matrix, with shrinkage intensity $\rho \in [0, 1]$:

$$ \Omega_{f, shrink}(\rho) = (1-\rho)\hat{\Omega}_f + \rho\frac{trace(\hat{\Omega}_f)}{m}I_m$$
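
The correction and the shrinkage step can be sketched together as follows. Assumptions of this sketch: $\Omega_{\epsilon}$ is diagonal, `rho` is supplied by the caller (the data-driven Ledoit-Wolf intensity is not computed here), and the name `corrected_shrunk_cov` is hypothetical. Note that in finite samples the subtraction can produce a matrix that is not positive semi-definite, which this sketch does not handle.

```python
import numpy as np

def corrected_shrunk_cov(F_hat, B, idio_var, rho):
    """Bias-corrected factor covariance, shrunk toward a scaled identity.

    F_hat: (m, T) estimated factor returns, B: (n, m), idio_var: (n,), rho in [0, 1].
    """
    m, T = F_hat.shape
    emp = F_hat @ F_hat.T / T                        # var(f_hat), uncentered
    est_noise = np.linalg.inv((B.T / idio_var) @ B)  # (B^T Omega^{-1} B)^{-1}
    omega_hat = emp - est_noise                      # bias correction
    target = np.trace(omega_hat) / m * np.eye(m)     # scaled identity target
    return (1.0 - rho) * omega_hat + rho * target

# Simulated example with noisy idio returns so the bias is visible
rng = np.random.default_rng(3)
n, m, T = 50, 2, 2000
B = rng.normal(size=(n, m))
idio_var = np.full(n, 1e-2)
omega_f = 1e-4 * np.eye(m)                           # true factor covariance
F_true = rng.normal(0.0, 0.01, size=(m, T))
R = B @ F_true + rng.normal(size=(n, T)) * np.sqrt(idio_var)[:, None]
BtW = B.T / idio_var
F_hat = np.linalg.solve(BtW @ B, BtW @ R)

naive = F_hat @ F_hat.T / T
corrected = corrected_shrunk_cov(F_hat, B, idio_var, rho=0.0)
shrunk = corrected_shrunk_cov(F_hat, B, idio_var, rho=0.2)
```

In this simulation the uncorrected estimate `naive` is much further from `omega_f` than `corrected`, which is the inflation described above.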