# TA Review Session 4

## FINM 36700 - 2023

### UChicago Financial Mathematics
* Mani Sawhney
* msawhn2@uchicago.edu

## DFA Case



### Notation
(Hidden LaTeX commands)







$$\newcommand{\mux}{\tilde{\boldsymbol{\mu}}}$$
$$\newcommand{\wtan}{\boldsymbol{\text{w}}^{\text{tan}}}$$
$$\newcommand{\wtarg}{\boldsymbol{\text{w}}^{\text{port}}}$$
$$\newcommand{\mutarg}{\tilde{\boldsymbol{\mu}}^{\text{port}}}$$
$$\newcommand{\wEW}{\boldsymbol{\text{w}}^{\text{EW}}}$$
$$\newcommand{\wRP}{\boldsymbol{\text{w}}^{\text{RP}}}$$
$$\newcommand{\wREG}{\boldsymbol{\text{w}}^{\text{REG}}}$$

## Agenda
 - CAPM 
 - Fama-French 3 Factor Model
 - DFA Case
 - HW 4 Highlights

## CAPM

 - The CAPM states that the expected excess return of any asset or portfolio equals the expected excess return of the market benchmark scaled by the asset's beta, where beta is the slope from a time-series regression of the asset's excess returns on the market's excess returns.
 
 
- The CAPM identifies the market portfolio as the tangency portfolio (from MV Optimization - using excess returns). 


- However, there can be skepticism about relying on this tangency portfolio. When we talk about expected returns (as opposed to realized returns), we are referring to the anticipated return of an asset. This expected return is determined based on a combination of the risk-free rate and the market risk premium. The CAPM theory doesn't specify the exact method for determining these components, making it a relative pricing model. In this context, market beta represents systematic risk.

     i.e., CAPM is described as a relative pricing model. This means that it doesn't provide specific, absolute values for expected returns or risk-free rates. Instead, it offers a framework for comparing and pricing assets relative to each other. It does this by focusing on the relationship between an asset's expected return and its sensitivity to systematic market risk, as captured by the market beta.


- One of the strengths of the CAPM is its assertion that systematic risk is the sole source of higher average returns in the market. In other words, it claims that only risks related to the broader market, as represented by beta, are compensated with higher returns.
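
As a minimal sketch of the points above, the snippet below estimates a market beta and the CAPM-implied expected excess return. The data are synthetic and the parameter values (`true_beta`, the market mean and volatility) are illustrative assumptions, not course data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic monthly excess returns (illustrative numbers, not course data)
T = 600
mkt = rng.normal(0.006, 0.045, T)                    # market excess return
true_beta = 1.2                                      # assumed for the simulation
asset = true_beta * mkt + rng.normal(0.0, 0.02, T)   # CAPM world: zero alpha

# Beta is the time-series regression slope: cov(asset, mkt) / var(mkt)
beta_hat = np.cov(asset, mkt)[0, 1] / np.var(mkt, ddof=1)

# The same slope from an OLS fit with an intercept
slope, intercept = np.polyfit(mkt, asset, 1)

# CAPM-implied expected excess return: beta times the market premium
capm_expected = beta_hat * mkt.mean()

print(f"beta (cov/var): {beta_hat:.4f}, beta (OLS): {slope:.4f}")
print(f"CAPM-implied monthly excess return: {capm_expected:.5f}")
```

Note that the implied return is only as good as the market premium estimate fed into it, which is the sense in which the CAPM prices assets relative to the market.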

## Fama-French 3 Factor Models

The Fama-French 3-Factor Model extends the Capital Asset Pricing Model (CAPM) to explain the returns of a portfolio or asset. It adds two factors, size and value, to the market factor of the CAPM. The three factors are:

- Market Risk Premium (Market Factor): This is the same factor as in the CAPM, the return of the overall market (or a market index such as the S&P 500) in excess of the risk-free rate. It's a measure of systematic risk.

- Size Factor (SMB - Small Minus Big): The size factor accounts for the historical tendency of smaller companies to outperform larger companies. In the Fama-French model, this factor compares the returns of small-cap stocks to large-cap stocks. Small-cap stocks are considered riskier and are expected to offer higher returns to compensate for the additional risk. The SMB factor captures this size effect.

- Value Factor (HML - High Minus Low): The value factor accounts for the historical tendency of value stocks to outperform growth stocks. 

    Value stocks are those that are considered undervalued, often with low price-to-earnings (P/E) ratios and other value-oriented metrics. 
    
    Growth stocks, on the other hand, are expected to have higher future growth potential but often come with higher valuations. The HML factor captures this value effect.
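
Written as a time-series regression in the same notation as the CAPM regression used in these notes, the three-factor model is roughly the following. The superscripts $s$ and $v$ on the size and value betas are labels chosen here for illustration.

$$\tilde{r}_{t}^{i} = \alpha^{i} + \beta^{i,m}\,\tilde{r}^{m}_{t} + \beta^{i,s}\,\text{SMB}_{t} + \beta^{i,v}\,\text{HML}_{t} + \epsilon_{t}^{i}$$

SMB and HML are long-short return spreads, so they are already excess returns and no risk-free rate is subtracted from them.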

## DFA Case

- DFA leverages the Fama-French 3-factor model to earn a long-term premium through exposure to the Market, Value, and Size factors.


- The edge it offers investors is diversification: because small-cap and value stocks are imperfectly correlated with the overall market, these holdings can reduce risk at the margin for an investor who is otherwise fully invested in funds tracking the broader equity index.


- DFA's strategies rely largely on market efficiency and linear factor models; they do not take directional single-stock or macroeconomic bets.
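
The diversification argument can be checked with a two-asset example. This is a sketch under assumed inputs: the volatilities and correlation below are hypothetical round numbers, not figures from the case.

```python
import numpy as np

# Illustrative annualized inputs (assumptions, not taken from the case)
vol_mkt = 0.15        # volatility of the broad equity index
vol_sv = 0.20         # volatility of a small-cap value fund
corr = 0.70           # imperfect correlation between the two

cov = np.array([
    [vol_mkt**2, corr * vol_mkt * vol_sv],
    [corr * vol_mkt * vol_sv, vol_sv**2],
])

def portfolio_vol(w_sv):
    """Volatility of a portfolio holding w_sv in small value and the rest in the index."""
    w = np.array([1.0 - w_sv, w_sv])
    return float(np.sqrt(w @ cov @ w))

vol_index_only = portfolio_vol(0.0)
vol_with_tilt = portfolio_vol(0.05)

print(f"index only: {vol_index_only:.4%}, with 5% tilt: {vol_with_tilt:.4%}")
```

The small tilt lowers volatility here because `corr * vol_sv` (0.14) is below `vol_mkt` (0.15). With a higher correlation or a more volatile fund, the marginal effect would be to raise risk, so the claim depends on the inputs.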


## Imports

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import norm
import statsmodels.api as sm
import scipy.stats as stats
import arch
import warnings
warnings.filterwarnings("ignore")
from arch import arch_model
from arch.univariate import GARCH, EWMAVariance
sns.set_theme()

## Reading the Data

In [2]:
risk_free_rate = pd.read_excel('dfa_analysis_data.xlsx', sheet_name = 'factors')[['Date','RF']].set_index('Date')
factors = pd.read_excel('dfa_analysis_data.xlsx', sheet_name = 'factors').drop(columns = ['RF']).set_index('Date')
portfolio_total_returns = pd.read_excel('dfa_analysis_data.xlsx', sheet_name = 'portfolios (total returns)').set_index('Date')

### Converting total returns to excess returns in the portfolio

In [3]:
portfolio_excess_returns = portfolio_total_returns.sub(risk_free_rate.values)
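
The cell above subtracts the raw `RF` values and so relies on both sheets having their rows in the same date order. A variant that matches rows by date instead is sketched below on toy data; the column names `A` and `B` and all numbers are made up for illustration.

```python
import pandas as pd

dates = pd.to_datetime(["2020-01-31", "2020-02-29", "2020-03-31"])

# Toy total returns and risk-free rate (made-up numbers)
total = pd.DataFrame(
    {"A": [0.020, -0.010, 0.030], "B": [0.010, 0.000, -0.020]}, index=dates
)
rf = pd.DataFrame({"RF": [0.001, 0.001, 0.002]}, index=dates)

# Subtract the RF column from every portfolio column, matching rows by date
excess = total.sub(rf["RF"], axis=0)

print(excess)
```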

# Helper Functions

## 3.2 
### 3.2.a) For each of the n = 25 test portfolios, run the CAPM time-series regression: <br> 
###   <center> $\tilde{r}_{t}^{i} = \alpha^{i} + \beta^{i,m}\tilde{r}^{m}_{t} + \epsilon_{t}^{i}$ </center>
### So you are running 25 separate regressions, each using the T-sized sample of time-series data.

### Time-series Regression

The logic behind the following code is that it iterates over each column (representing an asset) in the portfolio DataFrame to perform a time-series regression. For each asset:

- It sets up the dependent variable (lhs) as the asset's returns.
- It fits a regression model using Ordinary Least Squares (OLS) from the Statsmodels library (sm.OLS).
- Various statistics from the regression results are extracted and stored in the ff_report DataFrame.
- If the FF3F flag is True, the Size and Value betas from the Fama-French regression are also stored in ff_report (used for the extensions).
- If the resid flag is True, the regression residuals are stored in the bm_residuals DataFrame and returned in place of ff_report (used in the extensions to calculate the MAE).
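
For reference, the summary statistics that the function stores can be written as follows. The hats, the bar for the sample mean, and $\hat{\sigma}_{\epsilon}^{i}$ for the sample standard deviation of the residuals are notation introduced here; the factors of 12 and $\sqrt{12}$ come from the code and presume monthly data.

$$\hat{\alpha}^{i}_{\text{ann}} = 12\,\hat{\alpha}^{i}, \qquad \text{IR}^{i} = \sqrt{12}\,\frac{\hat{\alpha}^{i}}{\hat{\sigma}_{\epsilon}^{i}}, \qquad \text{Treynor}^{i} = \frac{12\,\bar{\tilde{r}}^{i}}{\hat{\beta}^{i,m}}, \qquad \text{TE}^{i} = \sqrt{12}\,\hat{\sigma}_{\epsilon}^{i}$$

The betas and the R-squared are reported as estimated, without annualization.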

In [5]:
def time_series_regression(portfolio, factors, FF3F = False, resid = False):
    # Regress each column of `portfolio` on `factors` (plus an intercept).
    # Returns the summary report, or the residuals instead when resid=True.
    
    ff_report = pd.DataFrame(index=portfolio.columns)
    bm_residuals = pd.DataFrame(columns=portfolio.columns)

    rhs = sm.add_constant(factors)

    for portf in portfolio.columns:
        lhs = portfolio[portf]
        res = sm.OLS(lhs, rhs, missing='drop').fit()
        # Annualize the intercept (the factor of 12 presumes monthly data)
        ff_report.loc[portf, 'alpha_hat'] = res.params['const'] * 12
        # Positional access via .iloc: 0 is the intercept, 1 is the first factor
        ff_report.loc[portf, 'beta_mkt'] = res.params.iloc[1]
        if FF3F:
            ff_report.loc[portf, 'Size beta'] = res.params.iloc[2]
            ff_report.loc[portf, 'Value beta'] = res.params.iloc[3]
            
        ff_report.loc[portf, 'info_ratio'] = np.sqrt(12) * res.params['const'] / res.resid.std()
        ff_report.loc[portf, 'treynor_ratio'] = 12 * portfolio[portf].mean() / res.params.iloc[1]
        ff_report.loc[portf, 'R-squared'] = res.rsquared
        ff_report.loc[portf, 'Tracking Error'] = (res.resid.std()*np.sqrt(12))

        if resid:
            bm_residuals[portf] = res.resid
        
    if resid:
        return bm_residuals
        
    return ff_report

In [7]:
ts_CAPM = time_series_regression(portfolio_excess_returns.loc['1981':], factors.loc['1981':,'Mkt-RF'])
ts_CAPM

| Portfolio | alpha_hat | beta_mkt | info_ratio | treynor_ratio | R-squared | Tracking Error |
|---|---|---|---|---|---|---|
| SMALL LoBM | -0.10181 | 1.35012 | -0.589319 | 0.006375 | 0.5984 | 0.172759 |
| ME1 BM2 | -0.003112 | 1.160335 | -0.020651 | 0.079101 | 0.59127 | 0.150688 |
| ME1 BM3 | 0.008102 | 1.034045 | 0.070829 | 0.089618 | 0.665975 | 0.114385 |
| ME1 BM4 | 0.03828 | 0.967278 | 0.321317 | 0.121358 | 0.616611 | 0.119134 |
| SMALL HiBM | 0.048608 | 0.988304 | 0.347576 | 0.130966 | 0.549235 | 0.139848 |
| ME2 BM1 | -0.049163 | 1.333429 | -0.38222 | 0.044914 | 0.723908 | 0.128624 |
| ME2 BM2 | 0.009541 | 1.128858 | 0.092637 | 0.090235 | 0.745617 | 0.10299 |
| ME2 BM3 | 0.023275 | 1.024793 | 0.252046 | 0.104495 | 0.750294 | 0.092343 |
| ME2 BM4 | 0.029536 | 0.970403 | 0.300229 | 0.11222 | 0.703593 | 0.098379 |
| ME2 BM5 | 0.02554 | 1.109383 | 0.203817 | 0.104805 | 0.656627 | 0.125306 |


### Cross-sectional Estimation


- #### The dependent variable, (y): mean excess returns from each of the n = 25 portfolios.
- #### The regressor, (x): the market beta from each of the n = 25 time-series regressions.

- #### Then we can estimate the following equation:

<center> $  \underbrace{\mathop{\mathbb{E}}[\tilde{r}^{i}]}_\text{n x 1 data} = 
    \underbrace{\eta}_\text{regression intercept} +
    \underbrace{\beta^{i,m}}_\text{n x 1 data} *\underbrace{\lambda_{m}}_\text{regression estimate} + \underbrace{\upsilon}_\text{n x 1 residuals}
 $ </center>
 
- #### Note that we use sample means as estimates of $\mathop{\mathbb{E}}[\tilde{r}^{i}]$.
- #### This is a weird regression! The regressors are the betas from the time-series regressions we already ran!
- #### This is a single regression, where we are combining evidence across all n = 25 series. Thus, it is a cross-sectional regression!
- #### The notation is trying to emphasize that the intercept is different than the time-series $\alpha$ and that the regressor coefficient is different than the time-series betas.

### We are essentially using the betas estimated in the time-series regressions as the regressor here, and the dependent variable is now the mean excess return of each portfolio

In [8]:
portfolio = portfolio_excess_returns.loc['1981':].mean().to_frame('Mean Portfolio excess returns')

In [9]:
time_series_regression(portfolio, ts_CAPM['beta_mkt'])

| Portfolio | alpha_hat | beta_mkt | info_ratio | treynor_ratio | R-squared | Tracking Error |
|---|---|---|---|---|---|---|
| Mean Portfolio excess returns | 0.203832 | -0.008656 | 33.878194 | -10.86324 | 0.272833 | 0.006017 |


###  What would these three statistics be if (CAPM) were true?

- Zero intercept ($\eta$): In the CAPM framework, alpha is the part of the return not explained by systematic risk (market beta). If the CAPM were true, market beta would fully explain expected excess returns, so the time-series alphas and the cross-sectional intercept $\eta$ would both be zero (in sample, statistically indistinguishable from zero). The estimate above, reported in the alpha_hat column, is about 0.20 annualized, which is far from zero.


- Slope ($\lambda_{m}$) equal to the market premium: Beta measures an asset's sensitivity to systematic market risk. The CAPM does not require betas to be close to 1; betas can differ widely across assets. What it requires is that the price of beta risk, $\lambda_{m}$, equals the expected excess return of the market, so that higher-beta portfolios earn proportionally higher mean excess returns. The estimate above, reported in the beta_mkt column, is negative (about -0.0087, not annualized), meaning higher-beta portfolios earned lower average returns in this sample.


- R-squared equal to 1: In the cross-sectional regression, R-squared measures the share of the variation in mean excess returns across portfolios that is explained by market beta. If the CAPM were true, beta would explain all of that variation and the R-squared would be 1. The estimate above is about 0.27. Note that the CAPM makes no claim about the time-series R-squared, which can be low for an asset even when the CAPM holds.
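
To see what the three statistics look like when the CAPM does hold, the sketch below simulates returns with zero alpha by construction and runs both passes with NumPy only. All parameter values (the beta range, means, volatilities, and the very long sample length) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic world where the CAPM holds by construction (illustrative numbers)
T, n = 20000, 25
true_betas = np.linspace(0.5, 1.5, n)
mkt = rng.normal(0.006, 0.045, T)            # market excess return
eps = rng.normal(0.0, 0.02, (T, n))          # idiosyncratic noise
returns = mkt[:, None] * true_betas + eps    # zero alpha for every asset

# Pass 1: time-series regressions (lstsq fits each column of `returns` separately)
X = np.column_stack([np.ones(T), mkt])
coef, *_ = np.linalg.lstsq(X, returns, rcond=None)
betas_hat = coef[1]

# Pass 2: cross-sectional regression of mean excess returns on the estimated betas
y = returns.mean(axis=0)
Z = np.column_stack([np.ones(n), betas_hat])
(eta, lam), *_ = np.linalg.lstsq(Z, y, rcond=None)
resid = y - Z @ np.array([eta, lam])
r_squared = 1.0 - resid.var() / y.var()

print(f"eta: {eta:.6f}, lambda: {lam:.6f}, market mean: {mkt.mean():.6f}")
print(f"cross-sectional R-squared: {r_squared:.4f}")
```

In a finite sample the intercept is only approximately zero and the R-squared only approximately 1, because sample means carry estimation noise.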

## 4.1 Extensions

### Re-running the time series regression with Mkt, SMB and HML factors:

In [10]:
ts_FF3F = time_series_regression(portfolio_excess_returns.loc['1981':], factors.loc['1981':],True)

In [11]:
ts_FF3F

| Portfolio | alpha_hat | beta_mkt | Size beta | Value beta | info_ratio | treynor_ratio | R-squared | Tracking Error |
|---|---|---|---|---|---|---|---|---|
| SMALL LoBM | -0.08594 | 1.109864 | 1.382822 | -0.257933 | -0.946733 | 0.007755 | 0.889121 | 0.090776 |
| ME1 BM2 | 0.002293 | 0.964265 | 1.316603 | -0.015634 | 0.032343 | 0.095185 | 0.909553 | 0.070885 |
| ME1 BM3 | 0.000546 | 0.917788 | 1.048864 | 0.26857 | 0.011601 | 0.10097 | 0.943533 | 0.04703 |
| ME1 BM4 | 0.022147 | 0.878718 | 1.057622 | 0.472523 | 0.465496 | 0.133589 | 0.938856 | 0.047576 |
| SMALL HiBM | 0.02326 | 0.930187 | 1.061953 | 0.69121 | 0.312548 | 0.139149 | 0.87235 | 0.07442 |
| ME2 BM1 | -0.032166 | 1.138805 | 1.016961 | -0.315877 | -0.535355 | 0.05259 | 0.939757 | 0.060083 |
| ME2 BM2 | 0.008048 | 1.011494 | 0.905911 | 0.112696 | 0.159761 | 0.100705 | 0.939132 | 0.050378 |
| ME2 BM3 | 0.009439 | 0.974947 | 0.713799 | 0.388736 | 0.180882 | 0.109837 | 0.920256 | 0.052184 |
| ME2 BM4 | 0.008081 | 0.942648 | 0.740196 | 0.571512 | 0.178753 | 0.115525 | 0.937406 | 0.045209 |
| ME2 BM5 | -0.006114 | 1.091105 | 0.924225 | 0.828847 | -0.131075 | 0.10656 | 0.952419 | 0.046645 |


### Cross-sectional analysis with Fama-French factors

In [12]:
portfolio = portfolio_excess_returns.loc['1981':].mean().to_frame('Mean Portfolio excess returns')

In [13]:
time_series_regression(portfolio, ts_FF3F.loc[:,['beta_mkt','Size beta','Value beta']], True)

| Portfolio | alpha_hat | beta_mkt | Size beta | Value beta | info_ratio | treynor_ratio | R-squared | Tracking Error |
|---|---|---|---|---|---|---|---|---|
| Mean Portfolio excess returns | 0.188234 | -0.008553 | 4.1e-05 | 0.002887 | 35.878661 | -10.993686 | 0.447089 | 0.005246 |


### Repeating the time-series regression and cross-sectional estimate with the tangency portfolio return as the factor

In [14]:
def tangency_weights(returns, cov_mat = 1):
    # returns: excess returns indexed by date, one column per asset
    # cov_mat: weight on the full sample covariance; values below 1 shrink it toward its diagonal
    
    cov = returns.cov()
    if cov_mat != 1:
        covmat_diag = np.diag(np.diag(cov))
        cov = cov_mat * cov + (1-cov_mat) * covmat_diag
    cov_inv = np.linalg.inv(cov*12)
        
    ones = np.ones(len(returns.columns))
    mu = returns.mean().to_numpy()*12
    # Tangency weights: inverse covariance times mean, rescaled to sum to one
    scaling = 1/(ones @ cov_inv @ mu)
    tangency_wts = pd.DataFrame(index = returns.columns, data = scaling*(cov_inv @ mu), columns = ['Tangent Weights'])
        
    return tangency_wts

In [24]:
w_t = tangency_weights(portfolio_excess_returns.loc['1981':])
# Tangency portfolio return: each month's excess returns weighted by w_t
TangencyRets = (portfolio_excess_returns @ w_t).rename(columns={'Tangent Weights':'Tangency Returns'})
tangency_ts = time_series_regression(portfolio_excess_returns.loc['1981':], TangencyRets.loc['1981':],False)
tangency_ts

| Portfolio | alpha_hat | beta_mkt | info_ratio | treynor_ratio | R-squared | Tracking Error |
|---|---|---|---|---|---|---|
| SMALL LoBM | 2.888315e-15 | 0.021492 | 1.059646e-14 | 0.400459 | 0.000282 | 0.272573 |
| ME1 BM2 | 2.524023e-15 | 0.229197 | 1.094613e-14 | 0.400459 | 0.042922 | 0.230586 |
| ME1 BM3 | 1.824062e-15 | 0.231408 | 9.516385e-15 | 0.400459 | 0.062056 | 0.191676 |
| ME1 BM4 | 2.367898e-15 | 0.293131 | 1.301141e-14 | 0.400459 | 0.10536 | 0.181986 |
| SMALL HiBM | 2.838875e-15 | 0.323215 | 1.444109e-14 | 0.400459 | 0.109297 | 0.196583 |
| ME2 BM1 | 2.599483e-15 | 0.149552 | 1.071028e-14 | 0.400459 | 0.016942 | 0.242709 |
| ME2 BM2 | 1.475382e-15 | 0.254364 | 7.494028e-15 | 0.400459 | 0.070436 | 0.196874 |
| ME2 BM3 | 2.59558e-15 | 0.267407 | 1.476499e-14 | 0.400459 | 0.09505 | 0.175793 |
| ME2 BM4 | 2.863595e-15 | 0.271936 | 1.673043e-14 | 0.400459 | 0.102801 | 0.171161 |
| ME2 BM5 | 2.875955e-15 | 0.290338 | 1.404972e-14 | 0.400459 | 0.083678 | 0.204698 |


In [25]:
portfolio = portfolio_excess_returns.loc['1981':].mean().to_frame('Mean Portfolio excess returns')

In [26]:
time_series_regression(portfolio, tangency_ts.loc[:,['beta_mkt']], False)

| Portfolio | alpha_hat | beta_mkt | info_ratio | treynor_ratio | R-squared | Tracking Error |
|---|---|---|---|---|---|---|
| Mean Portfolio excess returns | 2.129807e-15 | 0.033372 | 13.859837 | 2.817789 | 1.0 | 1.536675e-16 |
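
The zero alphas, the identical Treynor ratios, and the cross-sectional R-squared of 1 in the outputs above are what the algebra predicts when the factor is the in-sample tangency portfolio of the same assets. The sketch below reproduces that pattern on synthetic data with NumPy only; the sample size, number of assets, and return parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic excess returns: idiosyncratic noise plus a common shock (illustrative only)
T, n = 500, 5
returns = rng.normal(0.01, 0.04, (T, n)) + rng.normal(0.0, 0.03, (T, 1))

mu = returns.mean(axis=0)
sigma = np.cov(returns, rowvar=False)

# Tangency weights: inverse covariance times mean, rescaled to sum to one
raw = np.linalg.solve(sigma, mu)
w_tan = raw / raw.sum()
tan_ret = returns @ w_tan

# Time-series regression of every asset on the tangency return
X = np.column_stack([np.ones(T), tan_ret])
coef, *_ = np.linalg.lstsq(X, returns, rcond=None)
alphas, betas = coef

# Treynor ratio (mean over beta) is the same for every asset
treynor = mu / betas

# Cross-sectional regression of mean returns on the tangency betas
Z = np.column_stack([np.ones(n), betas])
(eta, lam), *_ = np.linalg.lstsq(Z, mu, rcond=None)

print("max |alpha|:", np.abs(alphas).max())
print("eta:", eta, "lambda:", lam, "tangency mean:", tan_ret.mean())
```

Because each asset's covariance with the tangency return is proportional to its mean, every beta equals the asset's mean divided by the tangency mean, so the alphas vanish up to floating-point error. This is an in-sample identity rather than evidence that the factor prices assets out of sample.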
