# Factor Pricing Models

### Introduction

A Linear Pricing Model tries to explain the *long-run expected excess return (expected return above the risk-free rate)* of an asset as a linear function of its exposures (betas) to certain risk factors, each of which has an associated expected excess return.

A *Factor Pricing Model*:
- Explains how expected excess returns vary across assets (not how they vary across time).
- Is all about **expected** excess returns (not realized returns).
- Models the long-term expected returns of an asset.

The **"risk premium"** is the expected excess return of a factor due to the risk it represents.
- *The risk premium of an asset is proportional to the risk premium of the factors it is exposed to.*

*This is very different from a Linear Factor Decomposition*, which is about realized returns and how they vary across time.


### Risk Premium (Rational vs. Behavioral)

- Rational Risk Premium: The expected excess return of a factor due to the risk it represents.
- Behavioral Risk Premium: The expected excess return of a factor due to the mispricing of the factor.

- There is still a lot of debate about whether a risk premium is compensation for risk or is just due to markets being irrational (e.g., bubbles and trends).



### Beta

The **beta** of an asset with respect to a factor is the sensitivity of the asset's expected excess return to the factor.

*We got those betas from the Linear Factor Decomposition (time-series regression) and we are using them to predict the expected excess return of the asset.*

- A negative market beta means that the asset tends to lose value when the market goes up, so the model implies a negative mean excess return. Ex.: put options.
- The betas are not playing the role of typical time-series regression betas (as in a Linear Factor Decomposition): if they were, the equation $\mathbb{E}[\tilde{r}^i] = \beta^{i,m} \, \mathbb{E}[\tilde{r}^m]$ would instead be the realized-return regression $\tilde{r}_t^i = \alpha^i + \beta^{i,m} \tilde{r}_t^m + \epsilon_t$.
- Whatever happens this month (ex. stock goes down while the market goes up), it doesn't matter. Factor Pricing Models are a theory about expected long-term average excess returns.
- This differs from a Linear Factor Decomposition, where the left- and right-hand sides are about realized returns. Here we are talking about expected excess returns.
    - In a Linear Factor Decomposition, the betas mean that, on average, when the market goes up by 1%, the asset goes up by beta_i,m %.

---
---

## CAPM

The Capital Asset Pricing Model (CAPM) is a linear factor model that relates the expected excess return of an asset to its beta with the market portfolio and the market risk premium.

The CAPM identifies the **market portfolio** as the tangency portfolio.

- The market portfolio is the value-weighted portfolio of all available assets.
- It should include every type of asset, including non-traded assets.
- *In practice, we use a broad equity index, like SPY or the Russell 1000.*

The CAPM is about **expected** returns:

- The expected return of any asset is given as a function of two market statistics: the risk-free rate and the market risk premium.
- The coefficient (the asset's market beta) is determined by a regression.
- The theory does not say anything about how the risk-free rate or market risk premium are given: it is a **relative pricing formula**.

*Market beta is the only risk associated with higher average returns:*
- No other characteristic of the asset commands a higher expected excess return from investors.
- **Beyond how it affects market beta, CAPM says volatility, skewness, and other covariances do not matter** for determining risk premia.
- Idiosyncratic risk has such a negligible effect on a diversified portfolio that in the limit it becomes meaningless (what matters is the asset's covariance with the rest of the portfolio).
- The only thing that matters is the beta.


---

### Derivation of the CAPM


First method: If returns have a joint normal distribution...

1. The mean and variance of returns are sufficient statistics for the return distribution.
2. Thus, every investor holds a portfolio on the MV frontier.
3. Everyone holds a combination of the tangency portfolio and the risk-free rate.
4. Then aggregating across investors, the market portfolio of all investments is equal to the tangency portfolio.

However:
1. Returns are neither jointly normal nor iid.
2. Investors are not each holding a portfolio on the MV frontier.
3. Then, in reality, the market portfolio is not the tangency portfolio.

Second method: Don’t assume the returns are jointly normal.

- This is another way of assuming all investors choose MV portfolios (only care about mean and variance of return).
- But now it is not because mean and variance are sufficient statistics of the return distribution, but rather that they are sufficient statistics of investor objectives.
- So one derivation of the CAPM is about return distribution, while the other is about investor behavior.

However:
- Investors do not all hold MV portfolios, some could be holding inefficient portfolios based on their individual preferences or what makes them feel good.


---


### Return Variance Decomposition

The CAPM implies a clear relation between volatility of returns and risk premia.

- Consider the linear factor decomposition:
$$\tilde{r}_t^i = \beta^{i,m} \tilde{r}_t^m + \epsilon_t$$

- Take the variance of both sides of the equation to get:
$$\sigma_i^2 = (\beta^{i,m})^2 (\sigma^m)^2 + \sigma_\epsilon^2$$
$$\sigma_i^2 = \text{systematic} + \text{idiosyncratic}$$

So CAPM implies...

- The variance of an asset’s return is made up of a systematic (or market) portion and an idiosyncratic portion --> *Only the former risk is priced.*
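The decomposition above can be checked numerically. The following is a minimal sketch on simulated data; the true beta, volatilities, and sample size are hypothetical values chosen only for illustration. The regression includes an intercept, so the in-sample identity holds exactly for the fitted beta and residuals.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 5000
beta_true = 1.3                      # hypothetical market beta

r_m = rng.normal(0.005, 0.04, T)     # simulated excess market returns
eps = rng.normal(0.0, 0.03, T)       # simulated idiosyncratic shocks
r_i = beta_true * r_m + eps          # asset excess returns

# Time-series OLS of the asset on the market (with intercept)
X = np.column_stack([np.ones(T), r_m])
coef, *_ = np.linalg.lstsq(X, r_i, rcond=None)
alpha_hat, beta_hat = coef
resid = r_i - X @ coef

total_var = r_i.var()                      # sigma_i^2
systematic_var = beta_hat**2 * r_m.var()   # beta^2 * sigma_m^2
idio_var = resid.var()                     # sigma_eps^2

print(f"total      = {total_var:.6f}")
print(f"systematic = {systematic_var:.6f}")
print(f"idiosync.  = {idio_var:.6f}")
print(f"share of variance that is systematic = {systematic_var / total_var:.2%}")
```

The systematic share printed at the end is the time-series $R^2$ of the regression; under the CAPM only that portion of the risk earns a premium.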


---


### Proportional Risk Premium

The CAPM implies that the risk premium of an asset is proportional to the risk premium of the market.

$$\mathbb{E} \left[ \tilde{r}^i \right] = \beta^{i,m} \, \mathbb{E} \left[ \tilde{r}^m \right]$$

- Using the definition of $\beta^{i,m}$:

$$\frac{\mathbb{E} \left[ \tilde{r}^i \right]}{\sigma^i} = \left( \rho^{i,m} \right) \frac{\mathbb{E} \left[ \tilde{r}^m \right]}{\sigma^m}$$
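One way to fill in the step between the two equations above, assuming the usual regression definition of beta (covariance with the market divided by the market variance), is:

$$\beta^{i,m} = \frac{\text{Cov} \left( \tilde{r}^i, \tilde{r}^m \right)}{(\sigma^m)^2} = \rho^{i,m} \, \frac{\sigma^i}{\sigma^m}$$

$$\mathbb{E} \left[ \tilde{r}^i \right] = \rho^{i,m} \, \frac{\sigma^i}{\sigma^m} \, \mathbb{E} \left[ \tilde{r}^m \right]$$

Dividing both sides by $\sigma^i$ gives the relation above.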


**The CAPM and Sharpe Ratio:**

Using the definition of the Sharpe ratio in (3), we have

$$\text{SR}^i = \left( \rho^{i,m} \right) \text{SR}^m$$

- For a given market Sharpe ratio, the Sharpe ratio earned on an asset depends only on the correlation between the asset return and the market.

- A security with large idiosyncratic risk, $\sigma_\epsilon^2$, will have lower $\rho^{i,m}$, which implies a lower Sharpe ratio.

- *If there is a factor (or combination of factors) that has the highest Sharpe ratio, then this factor (or this combination of factors) is the tangency portfolio.*

- The math shows that all assets have a Sharpe ratio smaller than or equal to the market Sharpe ratio (because correlation is between -1 and 1).

- Thus, risk premia are determined only by systematic risk.

*All securities must have a Sharpe ratio no larger than the market Sharpe ratio (which returns to the idea that the market portfolio is the tangency portfolio).*
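A small numerical illustration of the Sharpe ratio relation, assuming the CAPM holds exactly and using population moments rather than data. The function name `capm_sharpe` and all input numbers are hypothetical.

```python
import numpy as np

def capm_sharpe(beta, mkt_premium, mkt_vol, idio_vol):
    """Sharpe ratio and market correlation of an asset priced exactly by the CAPM."""
    asset_premium = beta * mkt_premium                       # E[r_i] = beta * E[r_m]
    asset_vol = np.sqrt(beta**2 * mkt_vol**2 + idio_vol**2)  # variance decomposition
    corr = beta * mkt_vol / asset_vol                        # rho = beta * sigma_m / sigma_i
    return asset_premium / asset_vol, corr

mkt_premium, mkt_vol = 0.06, 0.16   # hypothetical annual market premium and volatility
sr_m = mkt_premium / mkt_vol

results = []
for idio_vol in (0.0, 0.10, 0.30):  # increasing idiosyncratic risk, same beta
    sr_i, corr = capm_sharpe(1.2, mkt_premium, mkt_vol, idio_vol)
    results.append((idio_vol, corr, sr_i))
    print(f"idio vol={idio_vol:.2f}  corr={corr:.3f}  SR_i={sr_i:.3f}  SR_m={sr_m:.3f}")
```

Holding beta fixed, adding idiosyncratic volatility leaves the risk premium unchanged but lowers the correlation with the market, and the Sharpe ratio falls in the same proportion.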


**The CAPM and Treynor Ratio:**

$$\text{Treynor Ratio} = \frac{\mathbb{E} \left[ \tilde{r}^i \right]}{\beta^{i,m}}$$

- If CAPM does not hold, then Treynor’s Measure is not capturing all priced risk.
- If the CAPM does hold, then *the Treynor Ratio should be the same for all securities and equal to the market risk premium.*
- If we computed the tangency portfolio by MV optimization, using each asset's sample mean return as its expected return, and used that portfolio as the market portfolio, then the Treynor Ratio would be the same for all assets.
    - But we cannot do that because we don't know the expected returns of the assets - and that's exactly what we are trying to find out.



---


## Testing of Linear Pricing Models

**Outline**

- If you had a 5-factor model, then you could still combine those factors into a single factor and return to the same model (but it would not be a "market factor"; it would be a "combination factor").
- You might use a 5-factor model if those factors are clearly identifiable.
  - Then, you are saying that the market portfolio is not the tangency portfolio, but the combination of those factors is a tangency portfolio.

**Testing**

- We need to find the factor returns without using the expected returns of the assets, i.e., without calculating the tangency portfolio explicitly (circularity).
  - Find ways/assumptions to calculate the factor.

- The difficulty is *knowing* you have the correct model.
- Calculating the model, once you have it, is easy.
- We will need to test it (involving regression).

---

### Time-Series Test: CAPM and Realized Returns

The CAPM implies that expected returns for any security are

$$\mathbb{E} \left[ \tilde{r}^i \right] = \beta^{i,m} \, \mathbb{E} \left[ \tilde{r}^m \right]$$

This implies that realized returns can be written as

$$\tilde{r}_t^i = \beta^{i,m} \tilde{r}_t^m + \epsilon_t$$

where $\epsilon_t$ is *not* assumed to be normal, but: $\mathbb{E} \left[ \epsilon_t \right] = 0$


**Testing the CAPM on an Asset**

- Run a time-series regression of the excess returns of asset $i$ on the excess market return.
- Regression for asset $i$, across multiple data points $t$:

  $$\tilde{r}_t^i = \alpha^i + \beta^{i,m} \tilde{r}_t^m + \epsilon_t^i$$

  Estimate $\alpha$ and $\beta$.
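A minimal sketch of this time-series regression for a single asset, on simulated data. The function name `time_series_test` and the simulation parameters are hypothetical, and the t-statistic uses classical OLS standard errors (assuming homoskedastic, uncorrelated errors), which is a simplification.

```python
import numpy as np

def time_series_test(r_i, r_m):
    """Regress asset excess returns on market excess returns.

    Returns the estimated alpha, beta, and the t-statistic of alpha
    under classical (homoskedastic) OLS standard errors.
    """
    T = len(r_i)
    X = np.column_stack([np.ones(T), r_m])
    coef, *_ = np.linalg.lstsq(X, r_i, rcond=None)
    resid = r_i - X @ coef
    sigma2 = resid @ resid / (T - 2)              # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)         # covariance of (alpha, beta)
    t_alpha = coef[0] / np.sqrt(cov[0, 0])
    return coef[0], coef[1], t_alpha

# Simulated example in which the CAPM holds (true alpha = 0, true beta = 0.8)
rng = np.random.default_rng(42)
T = 2000
r_m = rng.normal(0.006, 0.045, T)
r_i = 0.8 * r_m + rng.normal(0.0, 0.03, T)

alpha_hat, beta_hat, t_alpha = time_series_test(r_i, r_m)
print(f"alpha={alpha_hat:.5f}  beta={beta_hat:.3f}  t(alpha)={t_alpha:.2f}")
```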

**Alpha must be zero if CAPM holds:** $\alpha^i = 0$.
- Even if the true population $\alpha$ is zero, the sample $\alpha$ might not be zero.
- Check if $\alpha$ is "close enough" to zero:
    - A t-test (and its p-value) checks whether the true population $\alpha$ could be zero.
- However, testing expectations is hard because sample means are not very precise.

*We know that $\alpha$ should be zero for all assets, so we perform a joint test on all $\alpha^i$:* chi-squared test.
- Interpretation of the **joint test**:
    - Form the best (mean-variance) portfolio from the alphas and epsilons, i.e., from the asset returns with the market factor hedged out.
    - Then calculate its Sharpe ratio. This Sharpe ratio should be zero because once the market factor is hedged out, no premium remains.
    - Performing multiple t-tests simultaneously increases the likelihood of finding at least one significant result purely by chance.
    - The joint test (H-test), on the other hand, combines all the t-tests into a single test that accounts for the joint distribution of the alphas, reducing the probability of finding a deviation from the CAPM purely by chance.

- Zero alpha implies that the risk factor completely captures everything (implying the risk factor *is* the tangency portfolio).
- Non-zero alpha implies that the risk factor is not 100% responsible for the expected excess return of the portfolios.
- In other words, we are assessing whether the risk factors replicate (or span) our tangency portfolio.
    
- If the model explains premium well, there should be no alpha.
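The joint test can be sketched as follows. This uses the standard asymptotic statistic $T \, [1 + (\text{SR}^m)^2]^{-1} \, \hat{\alpha}' \hat{\Sigma}^{-1} \hat{\alpha}$, which is chi-squared with $N$ degrees of freedom under iid errors; that specific form is an assumption here, since the notes only say "chi-squared test". The function name `joint_alpha_test` and the simulated data are hypothetical.

```python
import numpy as np

def joint_alpha_test(R, r_m):
    """Joint test that all time-series alphas are zero.

    R   : T x N array of asset excess returns
    r_m : length-T array of market excess returns
    Returns the chi-squared statistic (N degrees of freedom) and the alphas.
    """
    T, N = R.shape
    X = np.column_stack([np.ones(T), r_m])
    coef, *_ = np.linalg.lstsq(X, R, rcond=None)   # 2 x N: alphas, betas
    alphas = coef[0]
    resid = R - X @ coef
    Sigma = resid.T @ resid / T                    # residual covariance (N x N)
    sr_m2 = (r_m.mean() / r_m.std()) ** 2          # squared market Sharpe ratio
    # alphas' Sigma^{-1} alphas is the squared Sharpe ratio of the best
    # portfolio of market-hedged positions
    stat = T * (alphas @ np.linalg.solve(Sigma, alphas)) / (1.0 + sr_m2)
    return stat, alphas

rng = np.random.default_rng(1)
T, N = 600, 5
r_m = rng.normal(0.006, 0.045, T)
betas = np.linspace(0.6, 1.4, N)
R_null = np.outer(r_m, betas) + rng.normal(0.0, 0.03, (T, N))  # CAPM holds
R_alt = R_null + 0.01                                          # every alpha shifted by 1%

stat_null, _ = joint_alpha_test(R_null, r_m)
stat_alt, _ = joint_alpha_test(R_alt, r_m)
print(f"statistic when CAPM holds:   {stat_null:.2f}")
print(f"statistic with added alphas: {stat_alt:.2f}")
```

The statistic would then be compared with a chi-squared critical value with $N$ degrees of freedom (for instance via `scipy.stats.chi2.sf(stat, df=N)` if SciPy is available).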


**Pricing error ($\alpha$):** 

- The alphas are the errors in the model (the difference between the asset's mean realized excess return and the expected excess return implied by its beta and the market risk premium).

- They are the mean returns that the factors *cannot explain*.

- We get one alpha per asset.

- To compare across assets, we can take the *mean absolute error (MAE)* of the alphas.
    - We expect the MAE to be zero if the model is correct.

*(Note: in a hedging regression, the error is the residual, not the alpha)*


**$R^2$ of the CAPM does not matter**

- I'm using the betas from the Linear Factor Decomposition to predict the expected returns of the asset.
- I'm using the alphas from the Linear Factor Decomposition to test the CAPM
- I am doing nothing with the $R^2$ of the regression.
- The $R^2$ of the regression is saying:
    - Is this a good hedge?
    - Is this a good replication/ decomposition?
    - It does not say anything about the quality of the Linear Pricing Model (or the expected long-term excess returns).
    - An $R^2$ only shows how good a factor decomposition is → used to find models statistically (when you don’t know where to look).
    - You might have a good factor model (good pricing prediction) and low $R^2$, and vice versa.
    - Even if the CAPM were exactly true, it would not imply anything about the $R^2$ of the above regression.

- CAPM only cares about risk premia of the asset compared to the factor risk premia.
- CAPM does not say "stock A is down this month because the market is down" (that's a realized return).
- CAPM explains variation in $\mathbb{E} \left[ \tilde{r}^i \right]$ across assets—not variation in $\tilde{r}^i$ across time!

$$\tilde{r}_t^i = \alpha^i + \beta^{i,m} \tilde{r}_t^m + \epsilon_t$$


---


### Cross-Section OLS Regression test: Industry Portfolios

A famous test for the CAPM uses a collection of industry portfolios.
Consider a graph of the mean excess return of each industry portfolio against its market beta.

- Stocks are sorted into portfolios such as manufacturing, telecom, healthcare, etc.
- Variation in mean returns should be accompanied by variation in market beta.
    - Betas were obtained from a regression (linear factor decomposition).
- All portfolios should lie on the line if CAPM holds for $r^m$.

**Cross-Section Regression:**

The objective now is to see how closely the portfolios fit the line passing through the origin and the market portfolio in a graph of market beta (x-axis) against historical mean excess return (y-axis) --> **Security Market Line** (SML).

$$\mathbb{E} \left[ \tilde{r}^i \right] = \underbrace{\eta}_{\alpha} + \underbrace{\beta^{i,m}}_{x^i} \underbrace{\lambda_m}_{\beta^j} + \underbrace{v^i}_{\epsilon^i}$$

- Each asset is a data point in the regression.
- The data on the left side is a list of mean returns on assets, $\mathbb{E} \left[ \tilde{r}^i \right]$.
- The data on the right side is a list of asset betas: $\beta^{i,m}$ for each asset $i$.
- The regression parameters are $\eta$ and $\lambda_m$.
- The regression errors are $v^i$.
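The cross-sectional regression described above can be sketched as a single OLS across assets. The function name `cross_sectional_regression` and the input numbers are hypothetical; the example constructs mean returns that lie exactly on the SML, so the fit is perfect by design.

```python
import numpy as np

def cross_sectional_regression(mean_excess, betas):
    """Regress mean excess returns on market betas across assets.

    Returns the intercept (eta), slope (lambda_m), residuals (v_i), and R^2.
    """
    N = len(betas)
    X = np.column_stack([np.ones(N), betas])
    coef, *_ = np.linalg.lstsq(X, mean_excess, rcond=None)
    eta, lam = coef
    resid = mean_excess - X @ coef
    r2 = 1.0 - resid.var() / mean_excess.var()
    return eta, lam, resid, r2

# Hypothetical betas (e.g. from time-series regressions on industry portfolios)
betas = np.array([0.6, 0.8, 1.0, 1.2, 1.4])
mkt_premium = 0.006                    # hypothetical monthly market premium
mean_excess = betas * mkt_premium      # mean returns exactly on the SML

eta, lam, v, r2 = cross_sectional_regression(mean_excess, betas)
print(f"eta={eta:.6f}  lambda_m={lam:.6f}  R^2={r2:.4f}  MAE={np.abs(v).mean():.2e}")
```

With real data the left-hand side would be sample mean excess returns, so the intercept, slope, and residuals would all be estimated with noise.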


---


**The Risk-Return Tradeoff**

To check that the *slope of the SML is the market risk premium*, note that the CAPM can be separated into two statements:

- Risk premia are proportional to market beta:

   $$\mathbb{E} \left[ \tilde{r}^i \right] = \beta^{i,m} \lambda_m$$

- The proportionality is equal to the market risk premium:

   $$\lambda_m = \mathbb{E} \left[ \tilde{r}^m \right]$$
   
The parameter $\lambda_m$ is the slope of the line.

- It represents the amount of risk premium an asset gets per unit of market beta.
- Thus, we can split the risk premium into a quantity of risk, $\beta^{i,m}$, multiplied by a *price of risk*, $\lambda_m$.


---


**Cross-Section Test of the CAPM:**

$$\mathbb{E} \left[ \tilde{r}^i \right] = \eta + \beta^{i,m} \lambda_m + v^i$$

If the CAPM holds, then:

- The intercept of the regression (model alpha) should be zero: $\eta = 0$
    - That is, the SML goes through zero and the market return.
- The slope of the regression should be the market risk premium (mean excess return): $\lambda_m = \mathbb{E} \left[ \tilde{r}^m \right]$

This means that:

- The Treynor Ratio, which is the slope of the SML, should be the same for all assets because they all fit on the SML.
- $R^2$ of the cross-sectional regression should be 100% (*Now you care about the $R^2$ of the regression*)
- The error term should be zero for all assets:  $v^i = 0$, $\forall i$ --> and, thus, MAE should be zero.

Note that:

- The time-series alpha is the difference between each asset's mean realized excess return and the expected excess return implied by the CAPM (the error in the model).

In Summary:
*We want a high $R^2$, slope close to market risk premium, and zero alpha.*

That does not hold in practice:
- The SML doesn’t start at zero: $\eta > 0$.
- The slope, $\lambda_m$, is too small relative to the market risk premium (high-beta assets are not rewarded with additional premium).
    - *Maybe market beta is insufficient to explain all asset risk premia.*

A slope that is off is somewhat forgivable: under the CAPM it should equal the market risk premium, which we only observe as the sample average of the factor excess returns, and that sample average is not a very precise estimate.

However, a non-zero intercept is more troubling: either it is not statistically significant (check its t-stat) or the model does not hold.

---


### CAPM as Practical Model

For many years, the CAPM was the primary model in finance.

- In many early tests, it performed quite well.
- Some statistical error could be attributed to difficulties in testing.
- For instance, the market return in the CAPM refers to the return on all assets, not just an equity index (*Roll critique*): you can't disprove the CAPM by using an equity index because you never used the true market portfolio.
- Further, working with short series of volatile returns leads to considerable statistical uncertainty.

---
---

## Mean Absolute Error

In this factor-model approach, we have two different regressions (time-series and cross-sectional), and each produces its own notion of "error" when testing the model.

1. **Time-Series MAE:**  
   We run a separate regression for each asset $i$:

   $$R_{i,t} = \alpha_i + \beta_{i,1}\,F_{1,t} + \beta_{i,2}\,F_{2,t} + \dots + \varepsilon_{i,t}$$

   - The $\alpha_i$ is a single intercept for each asset.  
   - The code’s "TS MAE" is the average of $\lvert \alpha_i \rvert$ across all assets:

     $$\text{TS MAE} = \frac{1}{N} \sum_{i=1}^N \bigl|\alpha_i\bigr|$$

   - This measures how much, on average, each asset’s mean return deviates from what the time-series factor exposures predict.

2. **Cross-Sectional MAE:**  
   We then take each asset’s **mean** return $\bar{R}_i$ and its estimated $\beta_{i,k}$ from the time-series step, and run one cross-sectional regression:

   $$\bar{R}_i = \gamma_0 + \gamma_1 \,\beta_{i,1} + \gamma_2 \,\beta_{i,2} + \dots + u_i$$

   - The residual $u_i$ is how far each asset’s average return is from the fitted "factor-pricing" line.  
   - The code’s "CS MAE" is the average of $\lvert \hat{u}_i \rvert$:

     $$\text{CS MAE} = \frac{1}{N} \sum_{i=1}^N \bigl|\hat{u}_i\bigr|$$

   - This measures how well the estimated factor premia $\gamma_k$ explain the *cross-asset* variation in average returns.

So "TS MAE" is about **time-series mispricing** (i.e., $\alpha_i$) for each asset, while "CS MAE" is about **cross-sectional mispricing** (i.e., the residuals $u_i$) across all assets.
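The two error measures can be computed together. This is a minimal sketch, not the code these notes refer to; the function name `ts_and_cs_mae`, the two simulated factors, and all parameter values are hypothetical.

```python
import numpy as np

def ts_and_cs_mae(R, F):
    """Time-series and cross-sectional MAE for a linear factor model.

    R : T x N array of asset excess returns
    F : T x K array of factor excess returns
    """
    T, N = R.shape
    # Step 1: one time-series regression per asset (all assets at once)
    X = np.column_stack([np.ones(T), F])
    coef, *_ = np.linalg.lstsq(X, R, rcond=None)   # (1+K) x N
    alphas, betas = coef[0], coef[1:].T            # betas is N x K
    ts_mae = np.abs(alphas).mean()

    # Step 2: one cross-sectional regression of mean returns on betas
    mean_R = R.mean(axis=0)
    Z = np.column_stack([np.ones(N), betas])
    gamma, *_ = np.linalg.lstsq(Z, mean_R, rcond=None)
    u = mean_R - Z @ gamma                         # cross-sectional residuals
    cs_mae = np.abs(u).mean()
    return ts_mae, cs_mae

# Simulated data in which the factor model holds (true alphas are zero)
rng = np.random.default_rng(7)
T, N, K = 1200, 10, 2
F = rng.normal([0.005, 0.003], [0.04, 0.03], (T, K))
B = rng.uniform(0.2, 1.5, (N, K))                  # true factor loadings
R = F @ B.T + rng.normal(0.0, 0.02, (T, N))

ts_mae, cs_mae = ts_and_cs_mae(R, F)
print(f"TS MAE = {ts_mae:.5f}   CS MAE = {cs_mae:.5f}")
```

Both numbers are small but not exactly zero here, which reflects sampling noise in the estimated alphas and betas rather than a failure of the simulated model.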


## Time-Varying Beta

*The time-series regression is used to scrutinize $\alpha$ over time, not $R^2$.*

We want to allow for **beta to vary over time**.

$$\tilde{r}_t^i = \alpha^i + \beta_t^{i, z} z_t + \epsilon_t^i$$

So far, we have been estimating unconditional $\beta$:
$$\tilde{r}_t^i = \alpha^i + \beta^{i, z} z_t + \epsilon_t^i$$

We must choose a model for how $\beta$ changes over time.

- Consider stochastic vol models above.
- We often see estimates of $\beta_t$ using a rolling window of data (5 years?).
- Can use GARCH or other models to capture nonlinear impact.
- *Or a simpler approach:*


### Fama-MacBeth Beta Estimation

The **Fama-MacBeth procedure** is widely used to deal with time-varying betas.

- Imposes little on the cross-sectional returns.
- Assumes no correlation across time in returns.
- Equivalent to certain GMM (generalized method of moments) specifications under these assumptions.


1. **Estimate $\beta_t$**  
   For each security $i$, estimate the time series of $\beta_t^i$. This could be done for each $t$ **using a rolling window** (1 or 0.5 years for daily data; not much used for monthly data) or by other methods.  
   *(If using a constant $\beta$, just run the usual time-series regression for each security.)*

   $$\tilde{r}_t^i = \alpha^i + \beta_t^{i, z} z_t + \epsilon_t^i$$

2. **Estimate $\lambda_t, v_t^i$**  
   For each $t$, estimate a cross-sectional regression to obtain $\lambda_t$ and estimates of the $N$ pricing errors, $v_t^i$.

   $$\tilde{r}_t^i = \beta_t^{i, z} \underbrace{\lambda_t}_{\text{month $t$ factor premium}} + \underbrace{v_t^i}_{\text{month $t$ pricing error on asset $i$}}$$

   - *Use industry or style portfolios to test it to avoid using single names (idiosyncratic)*
   - *Last week's recording for whether to include $\alpha$.*

3. Take the average over all months of the factor premium estimates, giving $\hat{\lambda}$, and of the pricing errors, giving $\hat{v}^i$.

---

### Illustration of Time and Cross Regressions

Use sample means of the estimates:
$$\hat{\lambda} = \frac{1}{T} \sum_{t=1}^{T} \lambda_t, \quad \hat{v}^i = \frac{1}{T} \sum_{t=1}^{T} v_t^i$$

- *Average factor premium*
- *Standard error on factor premium*

- This allows a flexible model for $\beta_t^{i, z}$.
- Running $T$ cross-sectional regressions gives $T$ (unrelated) estimates of $\lambda_t$ and $v_t$.
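The three steps can be sketched for a single factor as follows. This is a simplified illustration under stated assumptions: rolling-window betas estimated from data strictly before each date, no intercept in the cross-sectional regression (matching the equation in step 2), and simulated data. The function name `fama_macbeth` and the 60-period window are hypothetical choices.

```python
import numpy as np

def fama_macbeth(R, z, window=60):
    """Single-factor Fama-MacBeth with rolling-window betas.

    R : T x N array of asset excess returns
    z : length-T array of factor excess returns
    Returns (lambda_hat, v_hat, lambda_t) where lambda_t is the series
    of period-by-period factor premium estimates.
    """
    T, N = R.shape
    lambdas, errors = [], []
    for t in range(window, T):
        # Step 1: betas from a time-series regression on the window before t
        zw, Rw = z[t - window:t], R[t - window:t]
        zc = zw - zw.mean()
        betas = zc @ (Rw - Rw.mean(axis=0)) / (zc @ zc)   # length N
        # Step 2: cross-sectional regression at date t (no intercept)
        lam_t = betas @ R[t] / (betas @ betas)
        lambdas.append(lam_t)
        errors.append(R[t] - betas * lam_t)               # v_t^i
    lambdas = np.array(lambdas)
    errors = np.array(errors)
    # Step 3: average the period-by-period estimates
    return lambdas.mean(), errors.mean(axis=0), lambdas

rng = np.random.default_rng(3)
T, N, window = 600, 10, 60
z = rng.normal(0.006, 0.045, T)
true_betas = np.linspace(0.5, 1.5, N)
R = np.outer(z, true_betas) + rng.normal(0.0, 0.02, (T, N))

lam_hat, v_hat, lam_t = fama_macbeth(R, z, window)
print(f"lambda_hat = {lam_hat:.5f}   sample mean of factor = {z[window:].mean():.5f}")
print(f"mean absolute pricing error = {np.abs(v_hat).mean():.5f}")
```

In this simulation the model holds, so the average premium estimate lands close to the factor's own sample mean over the same dates.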

---

### Fama-MacBeth Standard Errors

Get standard errors of the estimates by treating the $T$ period-by-period estimates as a sample and using the standard error of their sample means, $\hat{\lambda}$ and $\hat{v}$.

$$\text{s.e.}(\hat{\lambda}) = \frac{1}{\sqrt{T}} \sigma_\lambda$$
$$= \frac{1}{T} \sqrt{\sum_{t=1}^{T} (\lambda_t - \hat{\lambda})^2}$$

- These standard errors correct for cross-sectional correlation.
- If there is no time-series correlation in the OLS errors, then the Fama-MacBeth standard errors will equal the GMM errors.
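The standard error formula above translates directly into code. A minimal sketch, assuming the $\lambda_t$ are uncorrelated over time; the function name `fama_macbeth_se` and the five input values are made up for illustration.

```python
import numpy as np

def fama_macbeth_se(lambda_t):
    """Mean, standard error, and t-statistic of period-by-period premium estimates."""
    lambda_t = np.asarray(lambda_t, dtype=float)
    T = lambda_t.size
    lam_hat = lambda_t.mean()
    # s.e. = (1/T) * sqrt(sum_t (lambda_t - lambda_hat)^2)
    se = np.sqrt(np.sum((lambda_t - lam_hat) ** 2)) / T
    return lam_hat, se, lam_hat / se

lambda_t = np.array([0.01, -0.02, 0.03, 0.00, 0.02])   # hypothetical estimates
lam_hat, se, t_stat = fama_macbeth_se(lambda_t)
print(f"lambda_hat={lam_hat:.4f}  s.e.={se:.4f}  t-stat={t_stat:.2f}")
```

The same function can be applied asset by asset to the $v_t^i$ series to get a standard error for each average pricing error.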

---

### Beyond Fama-MacBeth

The **Fama-MacBeth two-pass regression approach** is very popular to incorporate dynamic betas.

- It is easy to implement.
- It is (relatively!) easy to understand.
- It gives reasonable estimates of the standard errors.

If we want to calculate more precise standard errors, we could easily use the **Generalized Method of Moments (GMM)**.

- GMM would account for any serial correlation.
- GMM would account for the imprecision of the first-stage (time-series) estimates.


*Note:* If using full-sample time-series betas, there would be no point in using Fama-MacBeth, as this would give us the usual cross-sectional estimates.

