# Penalized regressions and sparse hedging for minimum variance portfolios

Possible applications of "regularization" for linear models:

- Improve the *robustness* of factor-based predictive regressions
- Fuel an allocation scheme (Han et al., 2019; Rapach and Zhou, 2019)
- Improve the quality of mean-variance driven portfolio weights (Stevens, 1998)
- General idea: remove noise (at the cost of a possible bias)

## Penalized Regressions

### Simple Regressions

The classical linear model: $\boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}+\boldsymbol{\varepsilon}$.

The best choice of $\boldsymbol{\beta}$ is naturally the one that *minimizes the error*. A common approach is to minimize the *squared errors*: $L=\boldsymbol{\varepsilon}'\boldsymbol{\varepsilon}=\sum_i \varepsilon_i^2$. The loss $L$ is called the sum of squared residuals (*SSR*). Differentiating with respect to $\boldsymbol{\beta}$ gives
\begin{align*}
\nabla_{\boldsymbol{\beta}} L&=\frac{\partial}{\partial \boldsymbol{\beta}}(\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta})'(\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta})=\frac{\partial}{\partial \boldsymbol{\beta}}[\boldsymbol{\beta}'\boldsymbol{X}'\boldsymbol{X}\boldsymbol{\beta}-2\boldsymbol{y}'\boldsymbol{X}\boldsymbol{\beta}] \\
&=2\boldsymbol{X}'\boldsymbol{X}\boldsymbol{\beta}  -2\boldsymbol{X}'\boldsymbol{y}
\end{align*}
so that the first-order condition $\nabla_{\boldsymbol{\beta}} L=\mathbf{0}$ is satisfied if $$\boldsymbol{\beta}^*=(\boldsymbol{X}'\boldsymbol{X})^{-1}\boldsymbol{X}'\boldsymbol{y}$$
which is known as the **standard ordinary least squares (OLS)** solution of the linear model. Two issues:

- The matrix $\boldsymbol{X}$ has dimensions $I\times K$. The $K\times K$ matrix $\boldsymbol{X}'\boldsymbol{X}$ can only be inverted if $\boldsymbol{X}$ has full column rank, which requires $I$ (*number of rows*) to be at least as large as $K$ (*number of columns*). If there are more predictors than instances, then there is no unique value of $\boldsymbol{\beta}$ that minimizes the loss.
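
The closed-form solution and the invertibility issue can be checked numerically. The following is a minimal sketch, assuming NumPy and simulated Gaussian data; the helper `ols_closed_form`, the sample sizes, and the coefficient values are illustrative choices, not part of the original material. It first verifies that the OLS estimate sets the gradient $2\boldsymbol{X}'\boldsymbol{X}\boldsymbol{\beta}-2\boldsymbol{X}'\boldsymbol{y}$ to zero when $I>K$, then shows that with $K>I$ two different coefficient vectors reach the same SSR.

```python
import numpy as np


def ols_closed_form(X, y):
    """Solve the normal equations X'X beta = X'y (needs full column rank)."""
    return np.linalg.solve(X.T @ X, X.T @ y)


def ssr(X, y, beta):
    """Sum of squared residuals for a given coefficient vector."""
    resid = y - X @ beta
    return float(resid @ resid)


rng = np.random.default_rng(0)

# Case 1: more instances than predictors (I = 50, K = 3), simulated data.
I, K = 50, 3
X = rng.normal(size=(I, K))
beta_true = np.array([1.0, -2.0, 0.5])  # hypothetical coefficients
y = X @ beta_true + 0.1 * rng.normal(size=I)

beta_hat = ols_closed_form(X, y)
# Gradient of the SSR at the solution; should vanish up to rounding error.
gradient = 2 * X.T @ X @ beta_hat - 2 * X.T @ y

# Case 2: more predictors than instances (I = 3, K = 5).
X_wide = rng.normal(size=(3, 5))
y_wide = rng.normal(size=3)
# X'X is 5 x 5 but its rank cannot exceed 3, so it is singular.
rank_wide = np.linalg.matrix_rank(X_wide.T @ X_wide)

# One minimizer: the minimum-norm solution from the pseudo-inverse.
beta_min_norm = np.linalg.pinv(X_wide) @ y_wide
# Adding any vector from the null space of X_wide leaves the fit unchanged.
_, _, Vt = np.linalg.svd(X_wide)
null_vec = Vt[-1]  # right singular vector with X_wide @ null_vec close to 0
beta_other = beta_min_norm + null_vec

print("OLS estimate:", beta_hat)
print("max |gradient|:", np.abs(gradient).max())
print("rank of X'X in the wide case:", rank_wide)
print("SSR (min-norm):", ssr(X_wide, y_wide, beta_min_norm))
print("SSR (shifted): ", ssr(X_wide, y_wide, beta_other))
```

In the wide case both vectors fit the data equally well, which is the non-uniqueness described above. The pseudo-inverse merely selects one solution among infinitely many.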