**Fin 585R**  
**Diether**  
**Course Notes**   
**Final Thoughts on Factor Models**  

<br>

**I. The Fama-French Three Factor Model**

+ In response to the empirical failures of the CAPM, Fama and French proposed a three factor model.<br><br>

+ See Fama and French (1993), "Common Risk Factors in the Returns on Stocks and Bonds". <br><br>

**A. The model**

$$
E(r_{i}) - r_f = b_i\bigl[ E(r_M)-r_f \bigr] + s_iE(SMB) + h_iE(HML)
$$

+ SMB is small minus big.<br><br>

+ It's the size factor.<br><br>

+ It's the return for a portfolio of small-cap stocks minus the return for a portfolio of large-cap stocks.  <br><br>
$$
SMB = r_{small} - r_{big}
$$<br>

+ HML is high book to market minus low book to market.<br><br>

+ It's the value/growth factor. <br><br>

+ It's the return for a portfolio of high book to market stocks minus the return for a portfolio of low book to market stocks.  <br><br>
$$
HML = r_{high\ B/M} - r_{low\ B/M}
$$<br>
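
As a minimal sketch of the two definitions above, the snippet below builds each factor as a long-short return: long one portfolio and short the other. The return series are made-up numbers (percent per month), not data from Ken French's library, and the names `r_small`, `r_big`, `r_high_bm`, and `r_low_bm` are hypothetical.

```python
import numpy as np

# Hypothetical monthly portfolio returns in percent (illustrative only)
r_small = np.array([2.0, -1.0, 0.5])
r_big = np.array([1.5, -0.5, 1.0])
r_high_bm = np.array([1.0, 0.0, 2.0])
r_low_bm = np.array([0.5, 1.0, 0.5])

# Each factor is a zero-investment return: long one leg, short the other
smb = r_small - r_big
hml = r_high_bm - r_low_bm

print(smb)
print(hml)
```

Because both legs are returns on traded portfolios, each factor is itself the return on a zero-cost position, which is why it can enter the model without subtracting $r_f$.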


**B. Rational motivation for value/growth effect (Zhang, 2005)**

+ Value stocks have more assets in place than growth stocks.<br><br>

+ Growth stocks: lots of options for growth and fewer assets in place. <br><br>

+ In bad times, firms have to cut their capital stock to stay productive. This is expensive.<br><br>

+ Value stocks have to adjust capital more than growth stocks in bad times. <br><br>

+ Thus, having lots of assets in place may be riskier than having growth options.<br><br> 


**C. Smell Test**

+ Does this rational story pass the smell test?<br><br>

+ Specifically, there are three necessary conditions (together they should be sufficient):
  
  1. Affects marginal utility.<br><br>
  
  2. Affects expected return of relatively large cross-sections of assets.<br><br>
  
  3. Doesn't show up in beta with the market.<br><br>

+ Why doesn't this "riskiness" show up in beta with the market?<br><br>

+ One possibility is that covariance with the market doesn't track or pick up this kind of riskiness well.<br><br>

  + This could be a particular problem given how we estimate covariance: with historical data. <br><br>

  + In other words, true beta with the market may be hard to measure well because of economic shocks like recessions, and the estimated market loading and value/growth loading in a multifactor model may jointly estimate the true beta better.<br><br>

+ Also, recessions (bad times) could be of special hedging concern via the effect on marginal utility of unemployment unrelated to wealth. 
<br>


**D. Failures of the three factor model**

+ The three factor model, of course, explains the value/growth and size anomalies reasonably well. <br><br>

+ But there are some empirical patterns it has trouble explaining. <br><br>

+ For example, the three factor model cannot explain the *momentum effect* (see, for example, Fama and French (1996), "Multifactor Explanations of Asset Pricing Anomalies").<br><br>

**II. Testing the three factor model**

Let's test the Fama-French three factor model using momentum portfolios as test assets.

**A. Get and merge the data**

In [1]:
import pandas as pd
import numpy as np
import statsmodels.formula.api as smf
from finance_byu.summarize import summary
from finance_byu.regtables import Regtable
import warnings
warnings.filterwarnings("ignore")

In [2]:
url = 'http://diether.org/prephd/10-port_mom_ew.csv'
port = pd.read_csv(url,parse_dates=['caldt'])
port

Unnamed: 0,caldt,p0,p1,p2,p3,p4
0,1927-01-31,-2.708397,4.290747,0.776512,1.210773,0.508464
1,1927-02-28,5.418420,8.356758,4.758203,4.905527,5.422874
2,1927-03-31,-3.829973,-2.910900,-1.801945,0.259834,0.020529
3,1927-04-30,-0.747539,-0.438827,-0.216951,-0.172500,3.216826
4,1927-05-31,3.314240,5.759958,7.950440,8.014930,8.587076
...,...,...,...,...,...,...
1144,2022-05-31,-5.392614,-2.553349,0.502323,-0.183222,1.199527
1145,2022-06-30,-5.344633,-8.743887,-8.291228,-4.778448,-9.131728
1146,2022-07-29,11.776372,11.033183,9.284861,6.875124,8.193744
1147,2022-08-31,0.317134,-2.100848,-2.140245,-2.222316,0.123339


In [3]:
fac = pd.read_csv('http://diether.org/prephd/10-factors.csv',parse_dates=['caldt'])
port = port.merge(fac,on='caldt',how='inner')
port

Unnamed: 0,caldt,p0,p1,p2,p3,p4,exmkt,smb,hml,umd,rf
0,1927-01-31,-2.708397,4.290747,0.776512,1.210773,0.508464,-0.06,-0.37,4.54,0.36,0.25
1,1927-02-28,5.418420,8.356758,4.758203,4.905527,5.422874,4.18,0.04,2.94,-2.14,0.26
2,1927-03-31,-3.829973,-2.910900,-1.801945,0.259834,0.020529,0.13,-1.65,-2.61,3.61,0.30
3,1927-04-30,-0.747539,-0.438827,-0.216951,-0.172500,3.216826,0.46,0.30,0.81,4.30,0.25
4,1927-05-31,3.314240,5.759958,7.950440,8.014930,8.587076,5.44,1.53,4.73,3.00,0.30
...,...,...,...,...,...,...,...,...,...,...,...
1144,2022-05-31,-5.392614,-2.553349,0.502323,-0.183222,1.199527,-0.34,-1.85,8.41,2.48,0.03
1145,2022-06-30,-5.344633,-8.743887,-8.291228,-4.778448,-9.131728,-8.43,2.09,-5.97,0.79,0.06
1146,2022-07-29,11.776372,11.033183,9.284861,6.875124,8.193744,9.57,2.81,-4.10,-3.96,0.08
1147,2022-08-31,0.317134,-2.100848,-2.140245,-2.222316,0.123339,-3.77,1.39,0.31,2.10,0.19
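
The inner merge above keeps only dates that appear in both files, so it is worth checking how many rows it drops and that `caldt` is unique on each side. The sketch below shows one way to do that on tiny made-up frames; `port_demo` and `fac_demo` are hypothetical stand-ins, not the course data.

```python
import pandas as pd

# Tiny hypothetical frames standing in for the portfolio and factor files
port_demo = pd.DataFrame({
    "caldt": pd.to_datetime(["2022-06-30", "2022-07-29", "2022-08-31"]),
    "p0": [1.0, 2.0, 3.0],
})
fac_demo = pd.DataFrame({
    "caldt": pd.to_datetime(["2022-07-29", "2022-08-31", "2022-09-30"]),
    "exmkt": [1.0, -2.0, 3.0],
    "rf": [0.1, 0.1, 0.1],
})

# validate raises MergeError if caldt is duplicated in either frame
merged = port_demo.merge(fac_demo, on="caldt", how="inner", validate="one_to_one")

# Rows lost to the inner join (dates missing from the factor file)
dropped = len(port_demo) - len(merged)
print(merged)
print(dropped)
```

Note that the merge matches exact dates, so it only works if both files use the same month-end convention (for example, the last trading day `2022-07-29` rather than the calendar month end).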


**B. CAPM regressions**

In [4]:
names = ['p0','p1','p2','p3','p4']
port[names] = port[names].sub(port.rf,axis=0)

reg = [smf.ols(r + ' ~ 1 + exmkt',data=port).fit() for r in names]
Regtable(reg,sig='coeff').render()

Unnamed: 0,p0,p1,p2,p3,p4
,,,,,
Intercept,-0.857***,-0.216**,0.107,0.327***,0.595***
,(-6.49),(-2.57),(1.63),(5.36),(6.33)
exmkt,1.488***,1.215***,1.096***,1.030***,1.092***
,(60.87),(78.18),(89.76),(91.05),(62.70)
Obs,1149,1149,1149,1149,1149
Rsq,0.76,0.84,0.88,0.88,0.77
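
To make the table concrete: the intercept in each column is the CAPM alpha, and the numbers in parentheses are t-statistics. The sketch below reproduces that calculation by hand with NumPy on simulated data. The true alpha and beta, the sample size, and the volatilities are assumptions chosen for illustration, and the standard errors are the plain (non-robust) OLS ones.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
true_alpha, true_beta = 0.3, 1.1   # assumed values for the simulation

exmkt = rng.normal(0.6, 5.0, n)    # simulated market excess return (%)
y = true_alpha + true_beta * exmkt + rng.normal(0.0, 2.0, n)  # simulated portfolio excess return

X = np.column_stack([np.ones(n), exmkt])      # intercept + market factor
coef, *_ = np.linalg.lstsq(X, y, rcond=None)  # OLS estimates [alpha, beta]
resid = y - X @ coef

dof = n - X.shape[1]                          # residual degrees of freedom
sigma2 = resid @ resid / dof                  # residual variance estimate
se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
tstat = coef / se

alpha_hat, beta_hat = coef
print(alpha_hat, beta_hat, tstat)
```

A t-statistic on the intercept that is large in absolute value is evidence against the model for that test asset, which is how the `p0` and `p4` columns above should be read.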


**C. Three factor model regressions**

In [5]:
reg = [smf.ols(r + ' ~ exmkt + smb + hml',data=port).fit() for r in names]

Regtable(reg,sig='coeff').render()

Unnamed: 0,p0,p1,p2,p3,p4
,,,,,
Intercept,-1.010***,-0.336***,0.001,0.253***,0.572***
,(-9.77),(-5.85),(0.02),(6.23),(8.36)
exmkt,1.281***,1.061***,0.964***,0.918***,0.976***
,(62.32),(92.67),(120.55),(113.38),(71.66)
smb,0.797***,0.566***,0.469***,0.470***,0.715***
,(23.45),(29.89),(35.49),(35.12),(31.78)
hml,0.373***,0.312***,0.285***,0.154***,-0.120***
,(12.62),(18.99),(24.79),(13.23),(-6.14)
Obs,1149,1149,1149,1149,1149


<br><br>

**III. The Four Factor Model**

+ Inspired by the empirical failure of the three factor model with respect to the momentum effect, a four factor model that adds a momentum factor became common. <br><br>

+ It's very common to use a four factor model in empirical testing. <br><br>

+ Almost all papers these days perform factor model tests using the three factor, four factor, and five factor models.<br><br>


**A. The model**

$$
E(r_{i}) - r_f = b_i\bigl[E(r_M)-r_f\bigr] + s_iE(SMB) + h_iE(HML) + u_iE(UMD)
$$

+ SMB is small minus big. It's the size factor. <br><br>

+ HML is high book to market minus low book to market. It's the
  value/growth factor. <br><br>

+ UMD is up minus down (past winners minus past losers). It is the momentum factor.<br><br>
$$
UMD = r_{up} - r_{down} = r_{past\ winners} - r_{past\ losers}
$$<br><br>

+ You can download $UMD$ from Ken French's data library.<br><br>
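
To illustrate the winners-minus-losers idea, here is a simplified sketch on simulated returns. It is not the construction behind the downloadable $UMD$ series: it assumes equal weighting, a formation window of months $t-12$ through $t-2$, 30th/70th percentile breakpoints, and no size sort, and every number in it is made up.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N = 60, 50
ret = rng.normal(1.0, 5.0, size=(T, N))   # hypothetical monthly stock returns (%)

umd = np.full(T, np.nan)                   # undefined until 12 months of history exist
for t in range(12, T):
    # Cumulative return over months t-12 .. t-2 (the most recent month is skipped)
    past = np.prod(1 + ret[t - 12:t - 1] / 100, axis=0) - 1
    lo, hi = np.quantile(past, [0.3, 0.7])
    winners = ret[t, past >= hi]           # month-t returns of past winners
    losers = ret[t, past <= lo]            # month-t returns of past losers
    umd[t] = winners.mean() - losers.mean()

print(umd[12:17])
```

With independent simulated returns there is no momentum to find, so the series should average close to zero; the point is only the mechanics of the sort.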


In [6]:
x_var = ' ~ 1 + exmkt + smb + hml + umd'
reg = [smf.ols(r + x_var,data=port).fit() for r in names]

Regtable(reg,sig='coeff').render()

Unnamed: 0,p0,p1,p2,p3,p4
,,,,,
Intercept,-0.319***,0.004,0.106***,0.103***,0.134***
,(-5.64),(0.11),(2.72),(2.79),(3.25)
exmkt,1.123***,0.983***,0.940***,0.952***,1.076***
,(98.88),(125.19),(120.22),(128.46),(130.28)
smb,0.764***,0.550***,0.464***,0.477***,0.736***
,(42.14),(43.84),(37.18),(40.29),(55.79)
hml,0.037**,0.147***,0.234***,0.227***,0.093***
,(2.19),(12.51),(20.03),(20.51),(7.53)
umd,-0.716***,-0.353***,-0.109***,0.156***,0.454***



**B. Momentum as a factor**

+ Very few people believe that momentum represents a risk factor. <br><br>

+ If it's not a risk factor, why do we include it in a factor model?<br><br>


**IV. Other Factor Models**


+ A five factor (or six factor) model is becoming pretty common.<br><br>

+ The five factor adds profitability and investment factors to the three factor model.<br><br>


**V. Mapping multifactor models into a single factor**

+ All multifactor models have a single factor representation.<br><br> 

+ If a multifactor model is true $\rightarrow$ some portfolio of the factor portfolios is the **tangency portfolio**.<br><br>

**A. An Example**

Consider the following time-series regression using the excess returns on the `p3` momentum portfolio as the dependent variable:

$$
r_{3t} - r_{ft} = a_3 + b_{3}(r_{Mt}-r_{ft}) + s_{3}SMB_t 
                    + h_{3}HML_t + e_{3t}
$$

In [7]:
x_var = ' ~ 1 + exmkt + smb + hml'
reg = [smf.ols(r + x_var,data=port).fit() for r in names]

reg3 = smf.ols('p3 ~ 1 + exmkt + smb + hml',data=port).fit()
coef = reg3.params
coef

Intercept    0.253296
exmkt        0.917732
smb          0.469806
hml          0.153709
dtype: float64

**Loadings are Unscaled Portfolio Weights**

Here is a portfolio based on the loadings from the regressions. This portfolio is actually pretty interesting. 

In [8]:
(b,s,h) = reg3.params[1:]
port['x'] = port.eval('(@b*exmkt + @s*smb + @h*hml)/(@b + @s + @h)')
# x is a portfolio of the three factor portfolios
summary(port['x']).loc[['mean','std','tstat'],]

mean     0.489320
std      3.717909
tstat    4.461225
Name: x, dtype: float64

Notice the **Estimated Sharpe Ratio** of portfolio X compared to the other factor portfolios:
$$
SR_x = \frac{\bar{r}_x - \bar{r}_f}{\hat{\sigma}_x}
$$

In [9]:
port[['exmkt','smb','hml','x']].mean() / port[['exmkt','smb','hml','x']].std()

exmkt    0.123247
smb      0.062567
hml      0.098914
x        0.131612
dtype: float64

The regression implicitly finds the hypothesized tangency portfolio.
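
Since `exmkt`, `smb`, `hml`, and `x` are already excess or long-short returns, the risk-free rate has been netted out, so the Sharpe ratio is just the mean divided by the standard deviation. The helper below is a small sketch of that calculation; the function name `sharpe_ratio` and the `periods_per_year` argument are hypothetical, and scaling by the square root of the number of periods assumes returns are independent across months.

```python
import numpy as np

def sharpe_ratio(excess_ret, periods_per_year=1):
    """Mean over sample standard deviation of an excess-return series."""
    excess_ret = np.asarray(excess_ret, dtype=float)
    sr = excess_ret.mean() / excess_ret.std(ddof=1)  # ddof=1 matches pandas .std()
    return sr * np.sqrt(periods_per_year)            # assumes i.i.d. returns

monthly = sharpe_ratio([1.0, 2.0, 3.0])
annual = sharpe_ratio([1.0, 2.0, 3.0], periods_per_year=12)
print(monthly, annual)
```

Applied to a monthly series, `periods_per_year=12` gives an annualized figure that is easier to compare with commonly quoted Sharpe ratios.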

<br><br>
**The Alpha and Beta of Portfolio X**



In [10]:
reg_x = smf.ols('p3 ~ 1 + x',data=port).fit()

Regtable(reg[3:4] + [reg_x],sig='coeff',bfmt='.4f',sfmt='.4f').render()
# Key point is alpha is the same (intercept)

Unnamed: 0,p3,p3.1
,,
Intercept,0.2533***,0.2533***
,(6.2253),(6.2418)
exmkt,0.9177***,
,(113.3805),
smb,0.4698***,
,(35.1153),
hml,0.1537***,
,(13.2293),
x,,1.5412***


<br><br>
**X is the single factor representation of the three factor model for that LHS test asset (portfolio 3)**

+ The alpha is exactly the same.<br><br>

+ Notice, $\hat{b} + \hat{s} + \hat{h} = \hat{\beta}_{3X}$:<br>
$$
0.9177 + 0.4698 + 0.1537 = 1.5412
$$<br><br>

    + But why is the standard error on the alpha slightly different if these two methods are the same?<br><br>

    + The degrees of freedom for the 2nd regression are technically wrong.<br><br>

    + Regression 2 uses $N-2$ (an intercept and one slope); it should really be the same as regression 1: $N-4$ (an intercept and three slopes).<br><br>

    + The weights used to create the portfolio were themselves estimated parameters. Therefore, the degrees of freedom must be $N-4$.<br><br>


<br>

**B. The tangency portfolio and multifactor models**  

+ Testing a multifactor model is equivalent to testing whether some portfolio of the factor portfolios is the *tangency portfolio*. <br><br>

+ In our example above it's the following: <br><br>
$$
r_x = \left(\frac{b}{b + s + h}\right) \bigl(r_{M}-r_{f} \bigr) 
      + \left(\frac{s}{b + s + h}\right)SMB 
      + \left(\frac{h}{b + s + h}\right)HML
$$<br>

**C. General Derivation**

Consider a multifactor model with $K$ factors, all of the form $r_k - r_f$ (this seems restrictive, but we can always rewrite a factor in this form):

$$
E(r_i) - r_f = \sum_{k=1}^{K} b_{ik} \bigl[ E(r_k) - r_f \bigr]
$$

Multiply both sides of the equation by one:

$$
E(r_i) - r_f = \left( \frac{\sum_{k=1}^{K} b_{ik}}
                           {\sum_{k=1}^{K} b_{ik}} \right)
                \left( \sum_{k=1}^{K} b_{ik}
                \bigl[ E(r_k) - r_f \bigr] \right)
$$

Rearrange a little:

$$
E(r_i) - r_f = \left( \sum_{k=1}^{K} b_{ik}  \right)
                \left( 
                \sum_{k=1}^{K} \left[
                \frac{b_{ik}}{\sum_{j=1}^{K} b_{ij}}
                \right]
                \bigl[ E(r_k) - r_f \bigr] \right)
$$

Let $w_{k}$ be the weight on factor $k$ for portfolio $X$:

$$
w_k = \frac{b_{ik}}{\sum_{j=1}^{K} b_{ij}}
$$

And let $\beta_{ix}$ be the following:

$$
\beta_{ix} =  \sum_{k=1}^{K} b_{ik} 
$$

Now we can simplify:

$$
E(r_i) - r_f = \beta_{ix} \sum_{k=1}^{K} 
                w_{k} \bigl[ E(r_k) - r_f \bigr]
$$

$$
E(r_i) - r_f = \beta_{ix} \bigl[ E(r_x) - r_f \bigr]
$$
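
The algebra above can be checked numerically. The sketch below simulates $K = 3$ factors, runs the multifactor regression, builds portfolio $X$ from the scaled loadings, and then runs the single-factor regression on $X$. All data are simulated, the loadings and intercept are assumed values, and the construction requires that the loadings do not sum to zero.

```python
import numpy as np

rng = np.random.default_rng(1)
n, K = 500, 3
F = rng.normal(0.5, 3.0, size=(n, K))            # simulated factor excess returns
b_true = np.array([0.9, 0.5, 0.2])               # assumed loadings
y = 0.25 + F @ b_true + rng.normal(0.0, 1.0, n)  # simulated test-asset excess return

def ols(X, y):
    """OLS coefficients, with the intercept in the first position."""
    X1 = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(X1, y, rcond=None)[0]

a, *b = ols(F, y)          # multifactor regression: intercept and K loadings
b = np.array(b)
w = b / b.sum()            # scaled loadings are the portfolio weights
x = F @ w                  # return on the single-factor portfolio X

a_x, beta_x = ols(x, y)    # single-factor regression on X
print(a, a_x)
print(b.sum(), beta_x)
```

The two intercepts agree, and the slope on $X$ equals the sum of the estimated loadings, because the residuals from the multifactor regression are orthogonal to every factor and therefore to any portfolio of them.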

**D. Implications**  

+ The final equation is the tangency portfolio condition.<br><br>

+ Therefore, testing the $\alpha = 0$ condition for a multifactor model is equivalent to testing whether some portfolio of the factor portfolios is the tangency portfolio. <br><br>

+ The factor loadings from the regressions are the *unscaled* weights for the hypothesized tangency portfolio.<br><br>

+ More generally: if $\alpha \ne 0$, then the loadings are the unscaled weights for the portfolio that makes the $\alpha$ in the regression as close to zero as possible.<br><br>
