### CAPM

- In this exercise we fit the CAPM to real financial data (stock portfolio returns) and test whether its predictions hold.

In [2]:
import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt 
import seaborn as sns; sns.set()
import statsmodels.api as sm
from statsmodels.formula.api import ols 

### Empirical test of CAPM

- We conduct a simple test of whether the predictions of CAPM hold.
    - An asset's expected return depends on how much market risk it is exposed to, as measured by its beta.
    - Thus, average asset returns should be increasing in beta.
- The test proceeds in the following steps:
    - Setting up the sample data
    - Estimating betas
    - Estimating the SML
    - In short: estimate the betas first, then analyze the SML using the betas as the independent variable.
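
The steps above can be sketched end to end on simulated data. This is a minimal illustration, not the course data set: the sample sizes, the "true" betas, and all variable names below are assumptions, and the returns are generated so that CAPM holds by construction.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Hypothetical synthetic data (NOT beta_data.xls): T months, N portfolios
T, N = 600, 8
true_betas = np.linspace(0.6, 1.4, N)        # assumed "true" betas
market = rng.normal(0.5, 4.0, size=T)        # simulated market excess return, % per month
noise = rng.normal(0.0, 2.0, size=(T, N))    # idiosyncratic shocks
returns = market[:, None] * true_betas + noise  # alpha = 0 for every portfolio

# Step 1: time-series regression of each portfolio on the market (one beta per portfolio)
X = np.column_stack([np.ones(T), market])
coef, *_ = np.linalg.lstsq(X, returns, rcond=None)  # coef has shape (2, N)
alphas_hat, betas_hat = coef[0], coef[1]

# Step 2: cross-sectional regression of mean returns on the estimated betas (the SML)
Z = np.column_stack([np.ones(N), betas_hat])
gammas_hat, *_ = np.linalg.lstsq(Z, returns.mean(axis=0), rcond=None)

print("estimated betas:", np.round(betas_hat, 2))
print("gamma0, gamma1:", np.round(gammas_hat, 2))
```

Because the first-stage regression includes a constant, each portfolio's mean return equals its estimated alpha plus its estimated beta times the mean market return, which is why the second step can be read as a test of whether the alphas are jointly negligible.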

In [3]:
df = pd.read_excel('beta_data.xls',  index_col=0, parse_dates = True, header=3)

In [4]:
df

Unnamed: 0_level_0,Low Beta,2,3,High Beta,Low Beta.1,2.1,3.1,High Beta.1,Low Beta.2,2.2,...,3.5,High Beta.5,Low Beta.6,2.6,3.6,High Beta.6,Low Beta.7,2.7,3.7,High Beta.7
date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
196307,-0.233,-0.471,3.787,0.002,0.863,0.151,-0.597,-2.090,0.353,0.408,...,0.6966,0.5497,0.4947,0.4503,0.5452,0.8183,0.5834,0.5002,0.4438,0.3788
196308,1.124,-0.661,2.007,9.214,2.145,2.878,6.162,6.429,4.328,3.907,...,0.4689,0.6331,0.4935,0.4637,0.5328,0.8075,0.5463,0.5124,0.4356,0.3817
196309,-0.729,5.764,-0.323,-2.947,-1.337,0.236,-3.178,-4.137,-1.121,-2.219,...,0.4521,0.5482,0.4785,0.4424,0.5239,0.7232,0.5181,0.5073,0.4236,0.3678
196310,-0.939,1.462,0.544,2.286,-0.881,-1.154,-1.740,3.197,-0.895,-1.630,...,0.4559,0.5551,0.4871,0.4518,0.5503,0.7749,0.5119,0.5564,0.4403,0.3616
196311,-0.187,-0.706,-0.987,-0.994,-1.203,-0.654,-2.242,1.711,-1.314,-0.799,...,0.4660,0.6216,0.4192,0.4600,0.5125,0.6030,0.5436,0.5544,0.4434,0.3579
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
199808,-17.179,-16.508,-18.697,-23.085,-15.007,-16.385,-21.545,-26.331,-12.578,-16.167,...,2.0344,0.7662,0.6048,1.0952,1.5151,0.5604,0.5786,0.4059,0.2918,0.3000
199809,-1.690,-0.286,1.150,8.348,-0.335,2.835,4.701,9.024,3.717,4.291,...,3.3527,1.0314,0.6974,1.3192,1.1749,0.9185,0.5625,0.3886,0.3556,0.3756
199810,1.790,1.363,-0.061,5.254,-1.174,1.410,2.798,6.922,1.130,4.353,...,3.5288,1.1082,0.6792,1.2727,0.8448,0.8498,0.5799,0.3654,0.3276,0.3308
199811,7.513,6.595,13.932,15.281,5.503,6.963,9.170,9.413,4.003,4.272,...,2.5156,0.9713,0.6856,1.2767,0.8107,0.9725,0.6123,0.3210,0.2971,0.3072


In [5]:
# np.asmatrix replaces np.mat, which was removed in NumPy 2.0
pfs = np.asmatrix(df.iloc[:,:16]) # 16 portfolios formed on size and past beta
factors = np.asmatrix(df.iloc[:,16]) # market portfolio
BMs = np.asmatrix(df.iloc[:,17:]) # 16 portfolios formed on B/M ratio and past beta

T, N = pfs.shape

In [9]:
df.iloc[:,16]

date
196307    -0.17
196308     5.27
196309    -1.19
196310     2.77
196311    -0.56
          ...  
199808   -15.67
199809     6.40
199810     7.41
199811     6.13
199812     6.32
Name: Market, Length: 426, dtype: float64

In [10]:
pfs

matrix([[-0.233, -0.471,  3.787, ...,  1.631,  0.949, -1.159],
        [ 1.124, -0.661,  2.007, ...,  4.33 ,  3.806,  5.708],
        [-0.729,  5.764, -0.323, ..., -1.338, -3.588,  0.386],
        ...,
        [ 1.79 ,  1.363, -0.061, ...,  6.754,  8.443,  9.849],
        [ 7.513,  6.595, 13.932, ...,  4.389,  4.114,  6.473],
        [-0.337,  1.355, -0.995, ...,  4.236,  4.109,  7.993]])

In [11]:
factors

matrix([[-1.700e-01,  5.270e+00, -1.190e+00,  2.770e+00, -5.600e-01,
          2.170e+00,  2.580e+00,  1.720e+00,  1.760e+00,  4.600e-01,
          1.740e+00,  1.510e+00,  2.010e+00, -1.130e+00,  3.050e+00,
          8.900e-01,  3.100e-01,  3.700e-01,  3.870e+00,  7.000e-01,
         -9.700e-01,  3.360e+00, -4.400e-01, -5.190e+00,  1.670e+00,
          3.090e+00,  3.190e+00,  2.930e+00,  3.000e-01,  1.350e+00,
          1.210e+00, -8.600e-01, -2.090e+00,  2.480e+00, -5.250e+00,
         -1.030e+00, -1.290e+00, -7.540e+00, -7.000e-01,  4.230e+00,
          1.750e+00,  6.200e-01,  8.550e+00,  1.100e+00,  4.340e+00,
          4.160e+00, -3.930e+00,  2.690e+00,  4.920e+00, -6.200e-01,
          3.430e+00, -2.740e+00,  7.900e-01,  3.370e+00, -3.630e+00,
         -3.360e+00,  5.100e-01,  9.410e+00,  2.700e+00,  1.150e+00,
         -2.200e+00,  1.800e+00,  4.450e+00,  9.000e-01,  5.850e+00,
         -3.390e+00, -6.700e-01, -5.360e+00,  3.050e+00,  2.050e+00,
          5.000e-01, -6.740e+00, -

In [6]:
pfs.shape

(426, 16)

In [7]:
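# First-stage time-series regression (estimating beta for each portfolio)

X = sm.add_constant(factors.T)  # design matrix: constant and market return

ts_model = sm.OLS(pfs, X).fit()  # regresses every portfolio column on X
alphas = ts_model.params[0]      # intercepts
betas = ts_model.params[1:]      # slopes on the market return
print("beta estimates from 1-stage time-series regression:")
print("rows: small to big, columns: low to high")
np.asmatrix(betas).reshape(4,4)  # np.mat was removed in NumPy 2.0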

beta estimates from 1-stage time-series regression:
rows: small to big, columns: low to high


matrix([[0.77842172, 0.98245255, 1.13454692, 1.33811698],
        [0.74968325, 1.02779653, 1.21785727, 1.44684187],
        [0.71854073, 1.01907206, 1.20557186, 1.50285895],
        [0.71755303, 0.97620692, 1.18288207, 1.41142603]])

#### In other words, fit

$r_{i,t} - r_{f,t} = \alpha_i + \beta_i \left( r_{M,t} - r_{f,t} \right) + u_{i,t}$

and get $\beta_i$
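
The slope of this regression equals the sample covariance between the asset's and the market's excess returns divided by the sample variance of the market's excess return. The sketch below checks that equivalence on a handful of made-up numbers; the two arrays are hypothetical and are not taken from `beta_data.xls`.

```python
import numpy as np

# Hypothetical excess returns (% per month), for illustration only
market_excess = np.array([1.2, -0.8, 2.5, 0.3, -1.9, 3.1, 0.7, -0.4])
asset_excess = np.array([1.0, -1.5, 3.2, 0.1, -2.8, 4.0, 1.1, -0.2])

# OLS with an intercept: asset_excess = alpha + beta * market_excess + u
X = np.column_stack([np.ones_like(market_excess), market_excess])
coef = np.linalg.lstsq(X, asset_excess, rcond=None)[0]
alpha_hat, beta_ols = coef

# Same slope from the covariance / variance formula
beta_cov = np.cov(asset_excess, market_excess, ddof=1)[0, 1] / np.var(market_excess, ddof=1)

print("beta from OLS:      ", beta_ols)
print("beta from cov / var:", beta_cov)
print("alpha:              ", alpha_hat)
```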

In [8]:
# Second-step cross-sectional regression (estimating the SML)
# mean_ret = sm.add_constant(np.mat([np.mean(pfs[i,:]) for i in range(16)]))
mean_ret = np.asmatrix([np.mean(pfs[:,i]) for i in range(16)]) # revised (5/12): time-series mean of each portfolio (column); np.mat was removed in NumPy 2.0
betas1 = sm.add_constant(betas.T)
cs_model1 = sm.OLS(mean_ret.T, betas1).fit()

gammas = cs_model1.params.T
stds = cs_model1.bse

print("estimates of gamma0 and gamma1:")
print(gammas)
print("standard errors:")
print(stds)
print("t-values:")
print(gammas/stds)

estimates of gamma0 and gamma1:
[ 1.41661596 -0.08718031]
standard errors:
[0.36592632 0.32754168]
t-values:
[ 3.87131478 -0.26616556]


#### Estimating the SML
$\overline{r_i - r_f} = \gamma_0 + \gamma_1\beta_i+\epsilon_i$
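- If CAPM holds, $\hat{\gamma}_0$ should be statistically indistinguishable from 0 and $\hat{\gamma}_1$ should be significantly positive.
- A positive $\gamma_1$ means that mean returns are positively related to beta.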

The estimated model

$\hat{\overline{r_i - r_f}} = 1.42^{***} - 0.09\beta_i$

- Does CAPM hold?
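- Given that $\hat{\gamma}_1$ is statistically insignificant (t = -0.27) while $\hat{\gamma}_0$ is significantly positive (t = 3.87), the data do not support CAPM.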
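
The t-values reported above are each coefficient divided by its standard error. As a rough sketch of where those numbers come from, the following reproduces the calculation with plain NumPy under the classical OLS assumptions (homoskedastic, uncorrelated errors). The `betas` and `mean_ret` values are hypothetical placeholders, not the estimates from this notebook.

```python
import numpy as np

# Hypothetical inputs, for illustration only
betas = np.array([0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4])
mean_ret = np.array([1.30, 1.42, 1.25, 1.38, 1.20, 1.35, 1.28, 1.31])

Z = np.column_stack([np.ones_like(betas), betas])  # regressors: constant and beta
n, k = Z.shape

gammas = np.linalg.solve(Z.T @ Z, Z.T @ mean_ret)  # OLS estimates (gamma0, gamma1)
resid = mean_ret - Z @ gammas
sigma2 = resid @ resid / (n - k)                   # unbiased residual variance
cov_gammas = sigma2 * np.linalg.inv(Z.T @ Z)       # classical OLS covariance matrix
std_errs = np.sqrt(np.diag(cov_gammas))
t_values = gammas / std_errs

print("gammas:         ", gammas)
print("standard errors:", std_errs)
print("t-values:       ", t_values)
```

These standard errors treat the betas as known. Since the betas are themselves estimated in the first step, the reported t-values should be read with that caveat in mind.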

## Homework (submit by 5/17, 16:30)
- Let $r_f=0$.
- Estimate the model $\overline{r_i - r_f} = \gamma_0 + \gamma_1\beta_i+ \gamma_2 s_i^2+\epsilon_i$, where $s_i^2 = \frac{1}{n-1} \sum_{t=1}^{n} (r_{i,t} - \bar{r}_i)^2$ is the sample variance of portfolio $i$'s returns.
- What is the meaning of $\gamma_2$?
- Interpret the result.
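
As a starting point for the variance term, $s_i^2$ is the sample variance of each portfolio's return series, which NumPy computes with `ddof=1`. The sketch below shows one way the regressors might be assembled. It uses a small simulated return matrix and made-up betas (both hypothetical) in place of `pfs` and the first-stage estimates, and it leaves the interpretation to you.

```python
import numpy as np

rng = np.random.default_rng(1)
returns = rng.normal(1.0, 5.0, size=(120, 4))  # hypothetical T x N return matrix
betas = np.array([0.8, 1.0, 1.2, 1.4])         # hypothetical first-stage betas

# Sample variance of each portfolio (column), dividing by n - 1
s2 = returns.var(axis=0, ddof=1)

# The same quantity written out from the formula
n = returns.shape[0]
s2_manual = ((returns - returns.mean(axis=0)) ** 2).sum(axis=0) / (n - 1)

# Regressors for: mean return = gamma0 + gamma1 * beta + gamma2 * s2 + error
Z = np.column_stack([np.ones_like(betas), betas, s2])
gammas, *_ = np.linalg.lstsq(Z, returns.mean(axis=0), rcond=None)

print("sample variances:", s2)
print("gamma0, gamma1, gamma2:", gammas)
```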