# Homework 5

## FINM 36700 - 2024

### UChicago Financial Mathematics

* Mark Hendricks
* hendricks@uchicago.edu

***

# Section 1: Harvard Case

*This section will not be graded, but it will be discussed in class.*

**Smart Beta Exchange-Traded-Funds and Factor Investing**.

* The case is a good introduction to important pricing factors.
* It also gives useful introduction and context to ETFs, passive vs active investing, and so-called “smart beta” funds.

1. Describe how each of the factors (other than MKT) is measured. That is, each factor is a portfolio of stocks: which stocks are included in the factor portfolio?

2. Is the factor portfolio...
* long-only
* long-short
* value-weighted
* equally-weighted

3. What steps are taken in the factor construction to try to reduce the correlation between the factors?
4. What is the point of figures 1-6?
5. How is a “smart beta” ETF different from a traditional ETF?
6. Is it possible for all investors to have exposure to the “value” factor?
7. How does factor investing differ from traditional diversification?


If you need more information on how these factor portfolios are created, see Ken French’s website and the following details:

https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/Data_Library/f-f_5_factors_2x3.html

https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/Data_Library/det_mom_factor.html

***

# 2. The Factors

Use the data found in `factor_pricing_data.xlsx`.

* FACTORS: Monthly excess return data for the overall equity market, $\tilde{r}^{\text{MKT}}$.
* The column header to the market factor is `MKT` rather than `MKT-RF`, but it is indeed already in excess return form.
* The sheet also contains data on five additional factors.
* All factor data is already provided as excess returns.

In [17]:
import pandas as pd
import numpy as np
from portfolio import *

In [18]:
df = pd.read_excel('factor_pricing_data.xlsx', sheet_name='factors (excess returns)')
df.set_index('Date', inplace=True)
df.head()

Date,MKT,SMB,HML,RMW,CMA,UMD
1980-01-31,0.0551,0.0183,0.0175,-0.017,0.0164,0.0755
1980-02-29,-0.0122,-0.0157,0.0061,0.0004,0.0268,0.0788
1980-03-31,-0.129,-0.0693,-0.0101,0.0146,-0.0119,-0.0955
1980-04-30,0.0397,0.0105,0.0106,-0.021,0.0029,-0.0043
1980-05-31,0.0526,0.0211,0.0038,0.0034,-0.0031,-0.0112


1. Analyze the factors, similar to how you analyzed the three Fama-French factors in Homework 4.
You now have three additional factors, so let’s compare their univariate statistics.
* mean
* volatility
* Sharpe

2. Based on the factor statistics above, answer the following.
* (a) Does each factor have a positive risk premium (positive expected excess return)?
* (b) How have the factors performed since the time of the case (2015-present)?

3. Report the correlation matrix across the six factors.
* Does the construction method succeed in keeping correlations small?
* Fama and French say that HML is somewhat redundant in their 5-factor model. Does this seem to be the case?

4. Report the tangency weights for a portfolio of these 6 factors.
* Which factors seem most important? And which seem least important?
* Are the factors with low mean returns still useful?
* Re-do the tangency portfolio, but this time only include MKT, SMB, HML, and UMD. Which factors get high/low tangency weights now?

What do you conclude about the importance or unimportance of these styles?

### 1.

In [19]:
stats = performanceMetrics(df, 12)
stats

Factor,Mean,Vol,Sharpe,Min,Max
MKT,0.086277,0.156904,0.549872,-0.2324,0.1365
SMB,0.008319,0.101873,0.081665,-0.1532,0.1828
HML,0.025809,0.109999,0.234629,-0.1388,0.128
RMW,0.047096,0.083213,0.565962,-0.1865,0.1307
CMA,0.029537,0.073084,0.404148,-0.072,0.0907
UMD,0.062709,0.154564,0.405714,-0.343,0.182
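
`performanceMetrics` comes from the local `portfolio` module, whose source is not shown here. As a rough sketch of what such a helper might compute, assuming the second argument is an annualization factor (12 for monthly data) and that Sharpe is simply annualized mean divided by annualized volatility (which matches the table above, e.g. 0.086277 / 0.156904 ≈ 0.5499 for MKT), one could write the following. The function name `annualized_stats` and the synthetic data are hypothetical.

```python
import numpy as np
import pandas as pd


def annualized_stats(returns: pd.DataFrame, periods_per_year: int = 12) -> pd.DataFrame:
    """Annualized mean, volatility, and Sharpe ratio of excess returns.

    Hypothetical stand-in for `performanceMetrics`; the real helper may differ.
    """
    mean = returns.mean() * periods_per_year
    vol = returns.std() * np.sqrt(periods_per_year)
    return pd.DataFrame({'Mean': mean, 'Vol': vol, 'Sharpe': mean / vol})


# Synthetic monthly excess returns, for illustration only (not the homework data).
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.normal(0.005, 0.04, size=(240, 2)), columns=['A', 'B'])
demo_stats = annualized_stats(demo)
print(demo_stats)
```

Because the inputs are already excess returns, no risk-free rate is subtracted in the Sharpe ratio.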


### 2.

In [20]:
# Part (a): does each factor have a positive risk premium?
# A positive premium means a positive mean excess return, so compare each
# mean against zero (not against the market's mean) and include MKT itself.
for factor in stats.index:
    mean = stats.loc[factor, 'Mean']
    if mean > 0:
        print(f'{factor} has a positive risk premium')
    else:
        print(f'{factor} does not have a positive risk premium')

MKT has a positive risk premium
SMB has a positive risk premium
HML has a positive risk premium
RMW has a positive risk premium
CMA has a positive risk premium
UMD has a positive risk premium


In [21]:
df.index = df.index.strftime('%Y-%m-%d')

In [22]:
stats_2015_present = df.loc['2015-12-31':]
performanceMetrics(stats_2015_present, 12)

Factor,Mean,Vol,Sharpe,Min,Max
MKT,0.125291,0.162743,0.769872,-0.1339,0.1365
SMB,-0.0184,0.106055,-0.173494,-0.0824,0.0828
HML,-0.011006,0.137657,-0.07995,-0.1388,0.128
RMW,0.056,0.075508,0.741641,-0.0479,0.0727
CMA,0.001234,0.087131,0.014166,-0.072,0.0774
UMD,0.003371,0.137163,0.02458,-0.1602,0.0796


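### Since the time of the case, SMB, HML, CMA, and UMD have performed worse than over the full sample: SMB and HML have negative mean excess returns, and CMA and UMD are close to zero. MKT and RMW are the exceptions, with the market earning the highest mean return (about 12.5% annualized).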

### 3.

In [23]:
corr = df.corr()
corr

Factor,MKT,SMB,HML,RMW,CMA,UMD
MKT,1.0,0.227756,-0.204356,-0.246768,-0.357823,-0.175585
SMB,0.227756,1.0,-0.029072,-0.414055,-0.049575,-0.055304
HML,-0.204356,-0.029072,1.0,0.219651,0.67845,-0.216986
RMW,-0.246768,-0.414055,0.219651,1.0,0.127209,0.079525
CMA,-0.357823,-0.049575,0.67845,0.127209,1.0,0.008398
UMD,-0.175585,-0.055304,-0.216986,0.079525,0.008398,1.0


### The correlations are fairly small except for a few pairs, notably HML and CMA (0.68) and SMB and RMW (-0.41).

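### HML is highly correlated with CMA (0.68), which is consistent with Fama and French's view that HML is somewhat redundant in the 5-factor model. Its correlations with the other factors are modest (at most about 0.22 in absolute value).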

### 4.

In [24]:
tangency_weights(df)

Factor,tangency weights
MKT,0.20976
SMB,0.077337
HML,-0.042142
RMW,0.313263
CMA,0.338982
UMD,0.102798


## CMA and RMW receive the largest weights in the tangency portfolio, as both have high Sharpe ratios. RMW has a higher Sharpe ratio than the market, and although CMA has a lower Sharpe ratio, it has less than half the market's volatility.
## SMB and HML, which have the lowest mean returns, still enter the tangency portfolio but matter less than the other factors: each has a weight below 10% in absolute value, and HML's is slightly negative.
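
`tangency_weights` is also imported from the local `portfolio` module, so its implementation is not visible here. A minimal sketch of the standard calculation, assuming the weights are proportional to $\Sigma^{-1}\mu$ and scaled to sum to one, might look like this. The function name `tangency_weights_sketch` and the synthetic returns are hypothetical.

```python
import numpy as np


def tangency_weights_sketch(excess_returns: np.ndarray) -> np.ndarray:
    """Tangency weights proportional to inv(Sigma) @ mu, scaled to sum to 1.

    Hypothetical stand-in for `tangency_weights`; the real helper may differ.
    """
    mu = excess_returns.mean(axis=0)              # sample mean excess returns
    sigma = np.cov(excess_returns, rowvar=False)  # sample covariance matrix
    raw = np.linalg.solve(sigma, mu)              # solves Sigma @ raw = mu
    return raw / raw.sum()


# Synthetic monthly excess returns, for illustration only (not the homework data).
rng = np.random.default_rng(1)
demo = rng.normal(loc=[0.006, 0.003, 0.004], scale=[0.045, 0.03, 0.02], size=(600, 3))
w = tangency_weights_sketch(demo)
print(w)
```

The weights depend on the full covariance matrix, not only on each factor's own Sharpe ratio, which is why a factor with a low mean can still receive a meaningful weight if it diversifies the others.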

In [25]:
tangency_weights(df[['MKT', 'SMB', 'HML', 'UMD']])

Factor,tangency weights
MKT,0.365529
SMB,-0.032422
HML,0.356199
UMD,0.310694


## This portfolio also puts very little weight on SMB (about -3%), suggesting that this factor adds little to the portfolio's performance. UMD, however, gets about three times the weight it had in the 6-factor portfolio, and HML goes from a slightly negative weight to about 36%. This suggests that when the profitability and investment factors are unavailable, value and momentum become the next most useful styles.

***

# 3. Testing Modern LPMs

Consider the following factor models:
* CAPM: MKT
* Fama-French 3F: MKT, SMB, HML
* Fama-French 5F: MKT, SMB, HML, RMW, CMA
* AQR: MKT, HML, RMW, UMD

We are not saying this is “the” AQR model, but it is a good illustration of their most publicized factors: value, momentum, and more recently, profitability.

For instance, the AQR model is...

![](../refs/LFP-4-factors.png)

We will test these models with the time-series regressions. Namely, for each asset i, estimate the following regression to test the AQR model:

![](../refs/LFD-4-factors.png)
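
The time-series test can be sketched as follows: regress each asset's excess returns on the factors with an intercept, collect the intercepts (alphas) and r-squared values, and summarize the alphas by their mean absolute value. This is only an illustration on synthetic data; the function name `time_series_test` and all simulated inputs are hypothetical, and the alphas are left at monthly frequency rather than annualized.

```python
import numpy as np


def time_series_test(assets: np.ndarray, factors: np.ndarray):
    """OLS of each asset (columns of `assets`) on the factors plus an intercept.

    Returns per-asset alphas (n,), betas (n, k), and r-squared (n,).
    """
    T = factors.shape[0]
    X = np.column_stack([np.ones(T), factors])        # intercept + factors
    coefs, *_ = np.linalg.lstsq(X, assets, rcond=None)
    alphas = coefs[0]
    betas = coefs[1:].T
    resid = assets - X @ coefs
    # With an intercept, residuals have zero mean, so this equals 1 - SSR/SST.
    r2 = 1 - resid.var(axis=0) / assets.var(axis=0)
    return alphas, betas, r2


# Synthetic data, for illustration only: 5 assets driven by 4 factors, true alpha = 0.
rng = np.random.default_rng(2)
T, k, n = 480, 4, 5
factors = rng.normal(0.005, 0.03, size=(T, k))
true_betas = rng.uniform(0.2, 1.2, size=(n, k))
assets = factors @ true_betas.T + rng.normal(0, 0.02, size=(T, n))

alphas, betas, r2 = time_series_test(assets, factors)
mae = np.abs(alphas).mean()
print('MAE of alphas:', mae)
print('Average r-squared:', r2.mean())
```

If the pricing model held exactly, the true alphas would all be zero, so the estimated alphas (and hence the MAE) should be small up to sampling error.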

Data
* PORTFOLIOS: Monthly excess return data on 49 equity portfolios sorted by their industry. Denote these as $\tilde{r}^i$, for $i = 1, \ldots, 49$.

* You do NOT need the risk-free rate data. It is provided only for completeness. The other two tabs are already in terms of excess returns.

1. Test the AQR 4-Factor Model using the time-series test. (We are not doing the cross-sectional regression tests.)
* For each regression, report the estimated α and r-squared.
* Calculate the mean-absolute-error of the estimated alphas.
* If the pricing model worked, should these alpha estimates be large or small? Why?
* Based on your MAE stat, does this seem to support the pricing model or not?

2. Test the CAPM, FF 3-Factor Model and the FF 5-Factor Model.
   * Report the MAE statistic for each of these models and compare it with the AQR Model MAE.
   * Which model fits best?
   
3. Does any particular factor seem especially important or unimportant for pricing? Do you think Fama and French should use the Momentum Factor?

4. This does not matter for pricing, but report the average (across $n$ estimations) of the time-series regression r-squared statistics.
   * Do this for each of the four models you tested.
   * Do these models lead to high time-series r-squared stats? That is, would these factors be good in a Linear Factor Decomposition of the assets?

5. We tested four models using the time-series tests (focusing on the time-series alphas). Re-test these models, but this time use the cross-sectional test.
   * Report the time-series premia of the factors (just their sample averages), and compare to the cross-sectionally estimated premia of the factors. Do they differ substantially?
   * Report the MAE of the cross-sectional regression residuals for each of the four models. How do they compare to the MAE of the time-series alphas?
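
The cross-sectional test in question 5 can be sketched in two passes: first estimate each asset's betas with a time-series regression, then regress the assets' average excess returns on those betas. The slope coefficients are the cross-sectionally estimated factor premia, and the residuals are the pricing errors. The sketch below assumes the second-pass regression has no intercept, which is one common choice and not necessarily the one intended here. The function name `cross_sectional_test` and the synthetic data are hypothetical.

```python
import numpy as np


def cross_sectional_test(assets: np.ndarray, factors: np.ndarray):
    """Two-pass cross-sectional test (second pass without an intercept, by assumption).

    Returns estimated premia (k,), pricing-error residuals (n,), and betas (n, k).
    """
    T = factors.shape[0]
    # First pass: time-series betas for every asset.
    X = np.column_stack([np.ones(T), factors])
    coefs, *_ = np.linalg.lstsq(X, assets, rcond=None)
    betas = coefs[1:].T
    # Second pass: mean excess returns regressed on the betas.
    mean_returns = assets.mean(axis=0)
    premia, *_ = np.linalg.lstsq(betas, mean_returns, rcond=None)
    residuals = mean_returns - betas @ premia
    return premia, residuals, betas


# Synthetic data, for illustration only: 10 assets driven by 2 factors.
rng = np.random.default_rng(3)
T, k, n = 480, 2, 10
factors = rng.normal([0.006, 0.003], 0.03, size=(T, k))
true_betas = rng.uniform(0.2, 1.2, size=(n, k))
assets = factors @ true_betas.T + rng.normal(0, 0.02, size=(T, n))

premia, residuals, betas = cross_sectional_test(assets, factors)
ts_premia = factors.mean(axis=0)       # time-series premia: sample averages
cs_mae = np.abs(residuals).mean()      # MAE of cross-sectional residuals
print('Time-series premia:    ', ts_premia)
print('Cross-sectional premia:', premia)
print('Cross-sectional MAE:   ', cs_mae)
```

Comparing `ts_premia` with `premia` corresponds to the first bullet of question 5, and `cs_mae` plays the role that the MAE of the alphas plays in the time-series test.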

***