### CAPM

- In this exercise we fit the CAPM to real financial data (stock prices) and test whether its predictions hold.

In [None]:
import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt 
import seaborn as sns; sns.set()
import pandas_datareader.data as web
import statsmodels.api as sm
from statsmodels.formula.api import ols 



### Empirical test of CAPM

- We will conduct a simple test on whether the predictions of CAPM hold.
    - Asset return is a function of how much risk it is exposed to.
    - Thus, asset returns should be increasing in betas.
- It is done by the following steps.
    - Setting up the sample data
    - Estimating betas
    - Estimating the SML

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [2]:
pwd

'/content'

In [3]:
cd /content/drive/MyDrive/Colab Notebooks/금융시장의빅데이터분석/7. Econometrics

/content/drive/MyDrive/Colab Notebooks/금융시장의빅데이터분석/7. Econometrics


In [4]:
pwd

'/content/drive/MyDrive/Colab Notebooks/금융시장의빅데이터분석/7. Econometrics'

In [None]:
pip install --upgrade xlrd

Collecting xlrd
  Downloading xlrd-2.0.1-py2.py3-none-any.whl (96 kB)
Installing collected packages: xlrd
  Attempting uninstall: xlrd
    Found existing installation: xlrd 1.1.0
    Uninstalling xlrd-1.1.0:
      Successfully uninstalled xlrd-1.1.0
Successfully installed xlrd-2.0.1


In [None]:
df = pd.read_excel('beta_data.xls',  index_col=0, parse_dates = True, header=3)

In [None]:
df.columns

Index([   'Low Beta',             2,             3,   'High Beta',
        'Low Beta.1',         '2.1',         '3.1', 'High Beta.1',
        'Low Beta.2',         '2.2',         '3.2', 'High Beta.2',
        'Low Beta.3',         '2.3',         '3.3', 'High Beta.3',
            'Market',  'Low Beta.4',         '2.4',         '3.4',
       'High Beta.4',  'Low Beta.5',         '2.5',         '3.5',
       'High Beta.5',  'Low Beta.6',         '2.6',         '3.6',
       'High Beta.6',  'Low Beta.7',         '2.7',         '3.7',
       'High Beta.7'],
      dtype='object')

In [None]:
pfs = np.mat(df.iloc[:,:16]) # 16 portfolios formed on size and past beta
factors = np.mat(df.iloc[:,16]) # market portfolio
BMs = np.mat(df.iloc[:,17:]) # 16 portfolios formed on B/M ratio and past beta

T, N = pfs.shape

In [None]:
pfs.shape

(426, 16)

In [None]:
factors.shape # market index returns (rate of change of the stock index)

(1, 426)

In [None]:
# 1-stage time-series regression (estimating beta for each portfolio)

X = sm.add_constant(factors.T)

ts_model = sm.OLS(pfs, X).fit()
alphas = ts_model.params[0]
betas = ts_model.params[1:]
print("beta estimates from 1-stage time-series regression:")
print("rows: small to big, columns: low to high")
np.mat(betas).reshape(4,4)
# (0, 0): smallest firms with the lowest beta
# (0, 3): smallest firms with the highest beta
# left -> right: beta increases
# top -> bottom: firm size increases

beta estimates from 1-stage time-series regression:
rows: small to big, columns: low to high


matrix([[0.77842172, 0.98245255, 1.13454692, 1.33811698],
        [0.74968325, 1.02779653, 1.21785727, 1.44684187],
        [0.71854073, 1.01907206, 1.20557186, 1.50285895],
        [0.71755303, 0.97620692, 1.18288207, 1.41142603]])

#### In other words, fit

$r_{i,t} - r_{f,t} = \alpha_i + \beta_i \left( r_{M,t} - r_{f,t} \right) + u_{i,t}$ 

and get $\beta_i$
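As a minimal sketch of this first-pass regression, the snippet below estimates $\alpha_i$ and $\beta_i$ for several portfolios in one least-squares solve. The data are synthetic and noise-free, and the names `mkt`, `true_betas`, and `excess` are illustrative assumptions rather than part of the notebook's dataset.

```python
import numpy as np

# Synthetic market excess returns (assumed values, T = 6 periods).
mkt = np.array([1.0, -2.0, 0.5, 3.0, -1.5, 2.0])

# Hypothetical true betas for N = 3 portfolios; alphas are set to zero.
true_betas = np.array([0.8, 1.0, 1.4])

# Noise-free portfolio excess returns, shape (T, N).
excess = np.outer(mkt, true_betas)

# Design matrix with a constant column:
# r_i - r_f = alpha_i + beta_i * (r_M - r_f) + u_i
X = np.column_stack([np.ones_like(mkt), mkt])

# One least-squares solve fits all N regressions; coef has shape (2, N).
coef, *_ = np.linalg.lstsq(X, excess, rcond=None)
alphas, betas = coef[0], coef[1]

# The OLS slope also equals cov(r_i, r_M) / var(r_M).
betas_cov = np.array(
    [np.cov(excess[:, i], mkt, ddof=1)[0, 1] for i in range(excess.shape[1])]
) / np.var(mkt, ddof=1)

print(betas)
print(betas_cov)
```

Both routes give the same slopes here; the regression form is convenient because it also returns the intercepts $\alpha_i$.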

In [None]:
# 2-nd step cross-sectional regression (estimating SML)
mean_ret = np.mat([np.mean(pfs[:,i]) for i in range(16)])
betas1 = sm.add_constant(betas.T)
cs_model1 = sm.OLS(mean_ret.T, betas1).fit()

gammas = cs_model1.params.T
stds = cs_model1.bse

print("estimates of gamma0 and gamma1:")
print(gammas)
print("standard errors:")
print(stds)
print("t-values:")
print(gammas/stds)

estimates of gamma0 and gamma1:
[ 1.41661596 -0.08718031]
standard errors:
[0.36592632 0.32754168]
t-values:
[ 3.87131478 -0.26616556]


#### Estimating the SML
$\overline{r_i - r_f} = \gamma_0 + \gamma_1\beta_i+\epsilon_i$
- If CAPM holds, $\hat{\gamma}_0 = 0$ and $\hat{\gamma}_1 > 0$ should hold.
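The second-pass regression and its t-values can be sketched with plain NumPy as follows. The `betas` and `mean_excess` values are hypothetical numbers chosen for illustration, not the notebook's data, and the standard errors use the usual homoskedastic OLS formula.

```python
import numpy as np

# Hypothetical first-pass betas and average excess returns (illustrative only).
betas = np.array([0.5, 1.0, 1.5, 2.0])
mean_excess = np.array([0.6, 0.9, 1.6, 1.9])

# Cross-sectional regression: mean_excess_i = gamma0 + gamma1 * beta_i + eps_i
X = np.column_stack([np.ones_like(betas), betas])
gammas, *_ = np.linalg.lstsq(X, mean_excess, rcond=None)

# Classical OLS standard errors: sqrt(diag(sigma^2 * (X'X)^-1)).
resid = mean_excess - X @ gammas
dof = len(betas) - X.shape[1]
sigma2 = resid @ resid / dof
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
t_values = gammas / se

print("gammas:", gammas)
print("t-values:", t_values)
```

In this made-up sample the intercept is close to zero and the slope is positive with a large t-value, which is the pattern CAPM predicts.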

The estimated model

$\hat{\overline{r_i - r_f}} = 1.42^{***} - 0.09\beta_i$

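- Does CAPM hold?
- No. $\hat{\gamma}_0$ is significantly positive ($t \approx 3.87$) and $\hat{\gamma}_1$ is negative and insignificant ($t \approx -0.27$), so neither prediction of CAPM is supported in this sample.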
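The significance judgment can be reproduced from the printed estimates and standard errors. This is a small worked check: the critical value 2.145 (two-sided 5% level, Student's t with 16 - 2 = 14 degrees of freedom) is hard-coded as an assumption rather than computed.

```python
import numpy as np

# Estimates and standard errors copied from the regression output above.
gammas = np.array([1.41661596, -0.08718031])
stds = np.array([0.36592632, 0.32754168])

t_values = gammas / stds

# Assumed two-sided 5% critical value for t with 14 degrees of freedom.
t_crit = 2.145

significant = np.abs(t_values) > t_crit
print(t_values)
print(significant)
```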

## Homework

- Estimate the model $\overline{r_i - r_f} = \gamma_0 + \gamma_1\beta_i+ \gamma_2 s_i^2+\epsilon_i$, where $s_i^2 = {1 \over n-1} \sum_{t=1}^{n} (r_{i,t} - \bar{r}_i)^2$ is the sample variance of portfolio $i$'s returns.
- What is the meaning of $\gamma_2$?
- Interpret the result.
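As a minimal sketch, the sample variance regressor can be computed column by column from its definition and checked against NumPy's built-in. The `returns` array is synthetic and only illustrates the calculation.

```python
import numpy as np

# Synthetic returns for N = 2 portfolios over n = 4 periods (illustrative values).
returns = np.array([[1.0, 2.0],
                    [2.0, 4.0],
                    [3.0, 6.0],
                    [4.0, 8.0]])

n = returns.shape[0]

# Definition: sum of squared deviations from each column mean, divided by n - 1.
s2_manual = ((returns - returns.mean(axis=0)) ** 2).sum(axis=0) / (n - 1)

# Equivalent built-in: ddof=1 selects the n - 1 denominator.
s2_numpy = returns.var(axis=0, ddof=1)

print(s2_manual)
print(s2_numpy)
```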

In [None]:
betas1

array([[1.        , 0.77842172],
       [1.        , 0.98245255],
       [1.        , 1.13454692],
       [1.        , 1.33811698],
       [1.        , 0.74968325],
       [1.        , 1.02779653],
       [1.        , 1.21785727],
       [1.        , 1.44684187],
       [1.        , 0.71854073],
       [1.        , 1.01907206],
       [1.        , 1.20557186],
       [1.        , 1.50285895],
       [1.        , 0.71755303],
       [1.        , 0.97620692],
       [1.        , 1.18288207],
       [1.        , 1.41142603]])

In [None]:
np.mat([np.var(pfs[:,i], ddof = 1) for i in range(16)]) # sample variances of the 16 portfolio returns

matrix([[31.56010198, 43.61689865, 56.2405243 , 79.09251205, 18.62992185,
         30.65467764, 41.61809513, 60.17721057, 14.05338086, 24.83189253,
         34.01625193, 54.49688267, 12.77370495, 19.44765272, 28.50750858,
         44.11414501]])

In [None]:
mean_ret = np.mat([np.mean(pfs[:,i]) for i in range(16)])


### My answer

In [None]:
vol_BMs = df.iloc[:,17:].std().values
vol_BMs

array([4.12965568, 4.62699954, 1.11774179, 0.71141938, 1.98606184,
       1.42168726, 0.69902732, 0.39118679, 1.0033503 , 0.58824704,
       0.28072682, 0.25853732, 0.26154474, 0.22991445, 0.2246163 ,
       0.2544771 ])

In [None]:
vol_pfs = df.iloc[:,:16].std().values
vol_pfs

array([5.61783784, 6.6043091 , 7.49936826, 8.8933971 , 4.31623932,
       5.53666665, 6.45120881, 7.75739715, 3.74878392, 4.9831609 ,
       5.83234532, 7.38220039, 3.57403203, 4.40994929, 5.33924232,
       6.64184801])
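Note that `.std()` returns the sample standard deviation $s_i$, while the homework specification is written in terms of the variance $s_i^2$. The quick check below, using the first four printed values of each, suggests the two outputs are related by squaring, so `vol_pfs ** 2` would give the variance regressor if that is what is intended.

```python
import numpy as np

# First four entries of vol_pfs (standard deviations) from the output above.
std_first4 = np.array([5.61783784, 6.6043091, 7.49936826, 8.8933971])

# First four sample variances printed earlier by np.var(..., ddof=1).
var_first4 = np.array([31.56010198, 43.61689865, 56.2405243, 79.09251205])

# Squared standard deviations should match the variances up to print rounding.
matches = np.allclose(std_first4 ** 2, var_first4, rtol=1e-6)
print(matches)
```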

In [None]:
betas[0]

array([0.77842172, 0.98245255, 1.13454692, 1.33811698, 0.74968325,
       1.02779653, 1.21785727, 1.44684187, 0.71854073, 1.01907206,
       1.20557186, 1.50285895, 0.71755303, 0.97620692, 1.18288207,
       1.41142603])

In [None]:
X_pfs = pd.DataFrame({'beta': betas[0], 'volatility': vol_pfs})
X_pfs = sm.add_constant(X_pfs)

# Named cs_model2 so the fit does not shadow the `ols` function imported above.
cs_model2 = sm.OLS(mean_ret.T, X_pfs).fit()
gammas = cs_model2.params.T
stds = cs_model2.bse

print("estimates of gamma0, gamma1 and gamma2:")
print(list(gammas))
print("standard errors:")
print(list(stds))
print("t-values:")
print(list(gammas/stds))

estimates of gamma0, gamma1 and gamma2:
[1.1550745336235, -1.6047375849506778, 0.3235619679201762]
standard errors:
[0.16422936200801094, 0.24358663200784553, 0.041913701518503786]
t-values:
[7.0333009852839625, -6.587954239208791, 7.719718282989924]





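Model estimated on the portfolios sorted by firm market capitalization (size):
$\hat{\overline{r_i - r_f}} = 1.155^{***} - 1.605^{***} \beta_i + 0.324^{***} s_i$

Here $s_i$ is the sample standard deviation of portfolio $i$'s returns, because the regressor is built with `.std()`; the homework specification uses the variance $s_i^2$ instead. $\gamma_2$ is the coefficient that measures how a portfolio's individual risk affects its expected return.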

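### Solution

Under CAPM, only the market beta should have a significant effect on expected returns, so $\gamma_2$ should be indistinguishable from zero. The significant estimate of $\gamma_2$ above ($t \approx 7.72$) is therefore further evidence against CAPM in this sample.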
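To illustrate what the test of $\gamma_2 = 0$ looks like when CAPM does hold, here is a minimal sketch on synthetic data. All values are hypothetical: mean excess returns are generated from beta alone, and the noise is deliberately constructed to be uncorrelated with both regressors, so the regression recovers the true coefficients.

```python
import numpy as np

# Hypothetical cross-section of N = 6 portfolios (illustrative values only).
betas = np.array([0.6, 0.8, 1.0, 1.2, 1.4, 1.6])
s2 = np.array([10.0, 30.0, 20.0, 40.0, 30.0, 50.0])

# Noise chosen to sum to zero and be orthogonal to betas and s2.
noise = np.array([0.01, -0.02, 0.01, 0.01, -0.02, 0.01])

# Under CAPM only beta is priced: gamma0 = 0, gamma1 = 0.5, gamma2 = 0.
mean_excess = 0.5 * betas + noise

# Augmented regression: mean_excess_i = gamma0 + gamma1*beta_i + gamma2*s2_i + eps_i
X = np.column_stack([np.ones_like(betas), betas, s2])
gammas, *_ = np.linalg.lstsq(X, mean_excess, rcond=None)

# Classical OLS standard errors and t-values.
resid = mean_excess - X @ gammas
dof = X.shape[0] - X.shape[1]
cov = (resid @ resid / dof) * np.linalg.inv(X.T @ X)
t_values = gammas / np.sqrt(np.diag(cov))

print("gammas:", gammas)
print("t-values:", t_values)
```

In this constructed example the t-value on beta is large while the one on the variance term is essentially zero, the opposite of what the notebook's real data show.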