# Cheat sheet Exam

## Fetch Data

## Single-Index Models

See Bodie, Kane, p. 256

Returns:
$r_i = E(r_i) + \beta_i m + e_i$

Variance
$\sigma_i^2 = \beta_i^2 \sigma_m^2 + \sigma^2(e_i)$

Covariance
$Cov(r_i, r_j) = Cov(\beta_i m + e_i, \beta_j m + e_j)= \beta_i \beta_j \sigma^2(m)$
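
A quick numerical check of the variance and covariance formulas above, as a sketch on simulated data. All numbers (betas, volatilities, sample size, seed) are made-up assumptions, not values from the course.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs = 200_000

# Assumed parameters (illustrative only)
beta_i, beta_j = 1.2, 0.7
sigma_m, sigma_ei, sigma_ej = 0.04, 0.03, 0.05

m = rng.normal(0.0, sigma_m, n_obs)      # common market surprise
e_i = rng.normal(0.0, sigma_ei, n_obs)   # firm-specific surprise of asset i
e_j = rng.normal(0.0, sigma_ej, n_obs)   # firm-specific surprise of asset j

r_i = 0.010 + beta_i * m + e_i
r_j = 0.005 + beta_j * m + e_j

cov_sample = np.cov(r_i, r_j)[0, 1]
cov_model = beta_i * beta_j * sigma_m**2           # beta_i * beta_j * sigma_m^2

var_sample_i = np.var(r_i, ddof=1)
var_model_i = beta_i**2 * sigma_m**2 + sigma_ei**2  # systematic + firm-specific

print(cov_sample, cov_model)
print(var_sample_i, var_model_i)
```

The sample moments should be close to the model values because $m$, $e_i$ and $e_j$ are drawn independently.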

## Volatility

Realized variance estimate: $RV = \sum_{i=1}^{n} r_i^2$

Realized annual vol: $\sqrt{RV \times 12}$ (with $RV$ computed from the returns within one month)
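
A minimal sketch of both formulas. The function names and the five daily returns are hypothetical; the factor 12 assumes that RV covers one month of returns.

```python
import numpy as np


def realized_variance(returns):
    """Sum of squared returns within the measurement window."""
    r = np.asarray(returns, dtype=float)
    return float(np.sum(r**2))


def annualized_realized_vol(rv, periods_per_year=12):
    """Scale a per-period RV to one year and take the square root."""
    return float(np.sqrt(rv * periods_per_year))


# Made-up daily returns standing in for one month of data
daily_returns = np.array([0.01, -0.02, 0.005, 0.015, -0.01])

rv = realized_variance(daily_returns)
vol = annualized_realized_vol(rv)
print(rv, vol)
```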

## CAPM

### Basic formulae

#### Return of an asset i

$r_i = r_f + \beta_i (r_m - r_f)$

Return of asset i = Risk-free rate + beta i (market return rate - risk free rate)

##### Variance of a portfolio's return

The variance of a portfolio $p$ is $\sigma^2_p := w'_p \Sigma w_p$, with
$w_p$ the portfolio weights and $\Sigma$ the covariance matrix of the assets.


###### For an equal weight portfolio:

The average variance of the assets is $\bar\sigma^2 := \frac{1}{N} \sum_{i=1}^{N} \sigma^2_i$

The average covariance is $\bar{\textit{Cov}} := \frac{1}{N(N-1)} \sum_{i=1}^{N} \sum_{j=1, j\neq i}^{N} \textit{Cov}(r_i, r_j)$

Hence: $\sigma^2_p = \frac{1}{N} \times \bar\sigma^2 + \frac{N-1}{N} \times \bar{\textit{Cov}}$

What we can read from this:
- No correlation ($\rho = 0$)? Only idiosyncratic risk, which diversifies away: $\sigma^2_p \to 0$ for $N \to \infty$ because $\bar{\textit{Cov}} = 0$ and $1/N \to 0$
- The systematic risk captures the co-movement: $\sigma^2_p \to \bar{\textit{Cov}}$ for $N \to \infty$ and $\rho > 0$
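
The decomposition can be checked numerically. A sketch assuming a constant-correlation covariance matrix; `sigma = 0.2` and `rho = 0.3` are illustrative values and both helper names are hypothetical.

```python
import numpy as np


def equal_weight_variance(cov):
    """Return (w' Sigma w, decomposition formula) for equal weights."""
    cov = np.asarray(cov, dtype=float)
    n = cov.shape[0]
    w = np.full(n, 1.0 / n)
    direct = float(w @ cov @ w)
    avg_var = np.trace(cov) / n
    avg_cov = (cov.sum() - np.trace(cov)) / (n * (n - 1))
    formula = avg_var / n + (n - 1) / n * avg_cov
    return direct, float(formula)


def constant_corr_cov(n, sigma, rho):
    """Covariance matrix with equal volatilities and one common correlation."""
    return sigma**2 * ((1 - rho) * np.eye(n) + rho * np.ones((n, n)))


for n in (2, 10, 1000):
    direct, formula = equal_weight_variance(constant_corr_cov(n, 0.2, 0.3))
    print(n, direct, formula)
```

With $\rho = 0.3$ the portfolio variance approaches the average covariance $0.3 \times 0.2^2 = 0.012$ as $N$ grows; with $\rho = 0$ it shrinks towards zero.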


### Insights
- The risk premium of asset $a$ is proportional to the market risk premium: $ERP_a = \beta_a \times ERP_m$
- The proportionality coefficient is $\beta_a = \frac{Cov(r_a,r_m)}{Var(r_m)}$
    - It quantifies the systematic risk
    - Beta is negative -> risk premium is negative
    - The expected risk premium of a stock is unrelated to the stock's idiosyncratic risk
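
A short sketch of the beta formula and the implied risk premium. The return series and the 5% market risk premium are made-up; `capm_beta` and `capm_risk_premium` are hypothetical helper names.

```python
import numpy as np


def capm_beta(r_a, r_m):
    """beta_a = Cov(r_a, r_m) / Var(r_m), estimated from samples."""
    cov = np.cov(np.asarray(r_a, float), np.asarray(r_m, float), ddof=1)
    return float(cov[0, 1] / cov[1, 1])


def capm_risk_premium(beta, market_risk_premium):
    """ERP_a = beta_a * ERP_m."""
    return beta * market_risk_premium


r_m = np.array([0.02, -0.01, 0.03, 0.00, 0.01])  # made-up market returns
r_a = 2.0 * r_m + 0.001                          # asset that moves twice as much
r_hedge = -0.5 * r_m                             # asset that moves against the market

beta_a = capm_beta(r_a, r_m)
beta_hedge = capm_beta(r_hedge, r_m)
print(beta_a, capm_risk_premium(beta_a, 0.05))
print(beta_hedge, capm_risk_premium(beta_hedge, 0.05))
```

The second asset has a negative beta, so its CAPM risk premium is negative as well.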


### Assumptions
- Markowitz type investors
    - All investors are price takers (they can't affect the prices)
    - All investors have the same single holding period (myopic, short-sighted behavior)
    - Investors can only invest in publicly traded assets
    - Investors do not pay taxes or transaction costs
- Investors share the same expectations about the risk-free rate, the expected holding period return of all assets, and the covariance matrix
    - All investors come up with the same tangency portfolio (TP)
    - The TP coincides with the aggregate market and can thus be called the 'market portfolio'
    - The proportion that each positive net supply asset has in the market portfolio equals the price of that asset multiplied by the number of shares outstanding, divided by the total market value of all positive net supply assets
      - aggregate demand for an asset equals its fixed supply

Hence:
$E[r_M] - r_f = \gamma_M \times \textit{Var}(r_M)$
   - $\gamma_M$ is the risk aversion

$\implies E[r_a] - r_f = \beta_a \times (E[r_M]-r_f), \forall$ assets $a$

### Security market line

The line of fairly priced assets.

- Mu-beta diagram
- Assets below the SML are overvalued, assets above undervalued
- SML passes through the risk-free rate at $\beta = 0$
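
The SML reading can be sketched as a small helper. The function names, the 2% risk-free rate and the 8% expected market return are illustrative assumptions.

```python
def sml_expected_return(beta, r_f, market_return):
    """Fair expected return on the SML: r_f + beta * (E[r_m] - r_f)."""
    return r_f + beta * (market_return - r_f)


def sml_position(forecast_return, beta, r_f, market_return, tol=1e-12):
    """Compare a return forecast with the SML value for the same beta."""
    alpha = forecast_return - sml_expected_return(beta, r_f, market_return)
    if alpha > tol:
        return "above SML (undervalued)"
    if alpha < -tol:
        return "below SML (overvalued)"
    return "on SML (fairly priced)"


r_f, market_return = 0.02, 0.08
print(sml_expected_return(0.0, r_f, market_return))   # beta = 0 gives the risk-free rate
print(sml_expected_return(1.0, r_f, market_return))   # beta = 1 gives the market return
print(sml_position(0.12, 1.2, r_f, market_return))
print(sml_position(0.05, 1.2, r_f, market_return))
```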


### Risk analysis

Decompose an asset's return variance into a diversifiable and a systematic component

$$ \sigma^2_i = \beta^2_i \times \sigma^2_M + \sigma^2_{\epsilon,i}$$

- $\beta^2_i \times \sigma^2_M$ the systematic risk
- $\sigma^2_{\epsilon,i}$ the asset-specific (idiosyncratic) risk <- No risk premium for that risk, but diversifiable
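
A sketch of the decomposition on simulated returns. The true beta of 1.5, the volatilities and the seed are assumptions; `decompose_variance` is a hypothetical helper. With an OLS fit, the two components add up to the sample variance exactly.

```python
import numpy as np


def decompose_variance(r_i, r_m):
    """Split Var(r_i) into beta^2 * Var(r_m) and the residual variance."""
    r_i = np.asarray(r_i, dtype=float)
    r_m = np.asarray(r_m, dtype=float)
    var_m = np.var(r_m, ddof=1)
    beta = np.cov(r_i, r_m, ddof=1)[0, 1] / var_m
    alpha = r_i.mean() - beta * r_m.mean()
    resid = r_i - alpha - beta * r_m
    systematic = beta**2 * var_m
    idiosyncratic = np.var(resid, ddof=1)
    return float(beta), float(systematic), float(idiosyncratic)


rng = np.random.default_rng(5)
r_m = rng.normal(0.005, 0.04, 5000)                 # simulated market returns
r_i = 1.5 * r_m + rng.normal(0.0, 0.02, 5000)       # asset with assumed beta 1.5

beta, systematic, idiosyncratic = decompose_variance(r_i, r_m)
total = np.var(r_i, ddof=1)
print(beta, systematic, idiosyncratic, total)
```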


## Data-reader pandas

Docs: https://buildmedia.readthedocs.org/media/pdf/pandas-datareader/stable/pandas-datareader.pdf

Example

    import pandas_datareader.data as web
    df = web.DataReader("ADS.DE", 'quandl', '2015-01-01', '2015-01-05')

## Fama-French Model

### 3 Factor Model

Like CAPM, plus a size and a book-to-market (btm) factor:

- Size factor: Small-cap stocks tend to yield higher returns than large-cap stocks
- BTM factor: Value stocks (high btm ratio) tend to yield better results than growth stocks (low btm ratio)

$r_i = r_f + \beta_i (r_m - r_f) + \beta_{i,SMB} \times SMB + \beta_{i,HML} \times HML$

SMB = Small minus big. **Long small** cap stocks & **Short big** cap stocks

HML = High minus low. **Long high** btm stocks (Value) & **Short low** btm stocks (Growth)
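
The factor loadings are estimated with a time-series regression of excess returns on the three factors. A sketch on simulated factor returns using plain NumPy least squares; the loadings (1.1, 0.4, -0.2), volatilities and seed are made-up assumptions, and real factor data would replace the simulated columns.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 2000

# Simulated factor returns (stand-ins for real MKT-RF, SMB, HML data)
mkt_rf = rng.normal(0.005, 0.04, T)
smb = rng.normal(0.002, 0.03, T)
hml = rng.normal(0.003, 0.03, T)
factors = np.column_stack([mkt_rf, smb, hml])

# Asset excess return built from assumed loadings plus noise
true_loadings = np.array([1.1, 0.4, -0.2])
excess_ret = factors @ true_loadings + rng.normal(0.0, 0.02, T)

# OLS with intercept: excess_ret = alpha + loadings . factors + error
X = np.column_stack([np.ones(T), factors])
coef, *_ = np.linalg.lstsq(X, excess_ret, rcond=None)
alpha_hat, loadings_hat = coef[0], coef[1:]

print(alpha_hat)
print(loadings_hat)
```

The estimated loadings should be close to the assumed ones, and the intercept close to zero because the simulated asset has no alpha.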

### 5 Factor Model

Critical article: https://www.robeco.com/en/insights/2015/10/fama-french-5-factor-model-why-more-is-not-always-better.html

Add two factors:
- profitability (stocks with a **high operating profitability** perform **better**)
- investment factor (stocks of companies with **high total asset growth** have **below average returns**)



$R_{it} - R_{Ft} = a_i + b_i (R_{Mt} - R_{Ft}) + s_i SMB_t + h_i HML_t + r_i RMW_t + c_i CMA_t + e_{it}$

- $R_{it}$ is the return on security or portfolio $i$ for period $t$
- $R_{Ft}$ is the risk-free return
- $R_{Mt}$ is the return on the value-weight (VW) market portfolio
- $SMB_{t}$ is the return on a diversified portfolio of small stocks minus the return on a diversified portfolio of big stocks
- $HML_{t}$ is the difference between the returns on diversified portfolios of high and low B/M stocks
- $RMW_t$ is the difference between the returns on diversified portfolios of stocks with robust and weak profitability
- $CMA_t$ is the difference between the returns on diversified portfolios of the stocks of low and high investment firms
- $a_i$ is the intercept; $b_i, s_i, h_i, r_i, c_i$ are the factor exposures
- $e_{it}$ is a zero-mean residual

- If the factor exposures in the 3 or 5 factor model capture all variation in expected returns, the intercept $a_i$ is zero for all securities and portfolios $i$

**FROM** http://www.sciencedirect.com/science/article/pii/S0304405X14002323

### Factor matrices

|Model/Factor| RF_MKT | SMB | HML | RMW | CMA | REV | MOM |
|------------|--------|-----|-----|-----|-----|-----|-----|
| 3 Factor   | x      | x   | x   | _   | _   |  _  |  _  |
| 5 Factor   | x      | x   | x   | x   | x   |  _  | _   |
| 7 Factor   | x      | x   | x   | x   | x   | x   | x   |

### Fama-Macbeth regression

In [2]:
import pandas as pd
import statsmodels.api as sm
import numpy as np


class FMacBeth:
    """Two-pass Fama-MacBeth regression."""

    def __init__(self, r: pd.DataFrame, x: pd.DataFrame, rf: pd.DataFrame):

        self.r = r    # R_T: The return panel. T x I for I assets and T periods
        self.rf = rf  # R_f: Risk free rate. T x 1
        self.x = x    # Factor returns, T x K. E.g. FF7 Matrix

        # y = R_T_i - r_f, for all i in I (T x I)
        self.excess_ret = self.r.sub(self.rf.iloc[:, 0], axis=0)

        self.summary_table = None
        self.B = None
        self.lambda_T = None
        self.lambda_MB = None
        self.R_squared = None
        self.t_stat = None
        self.lambda_r2 = None
        self.beta_r2 = None

    def fit(self):
        # First pass: one time-series regression per asset -> betas (I x K)
        self.B, self.beta_r2 = self._get_betas_for_assets()
        # Second pass: one cross-sectional regression per period -> lambdas
        self.lambda_T, self.lambda_r2 = self._get_lambda_per_t()
        # Fama-MacBeth estimate: time-series mean of the period lambdas
        self.lambda_MB = self.lambda_T.mean()

        T = len(self.lambda_T)
        self.t_stat = self.lambda_MB / (self.lambda_T.std() / np.sqrt(T))

    def _get_betas_for_assets(self):
        X = sm.add_constant(self.x)  # intercept alpha_i
        betas, r_sq = {}, {}
        for asset in self.excess_ret:
            res = sm.OLS(self.excess_ret[asset], X).fit()
            betas[asset] = res.params.drop("const")  # keep the slopes only
            r_sq[asset] = res.rsquared
        # Rows: assets, columns: factors
        return pd.DataFrame(betas).T, pd.Series(r_sq)

    def _get_lambda_per_t(self):
        B = sm.add_constant(self.B)  # cross-sectional intercept
        lambdas, r_sq = {}, {}
        for t, r_t in self.excess_ret.iterrows():
            res = sm.OLS(r_t, B).fit()
            lambdas[t] = res.params
            r_sq[t] = res.rsquared
        # Rows: periods, columns: const + factors
        return pd.DataFrame(lambdas).T, pd.Series(r_sq)



## Fama-MacBeth Procedure Explained. Application: CAPM, 377 Stocks and T time points

We use Fama-MacBeth regressions to identify $\lambda$ and $\beta$ using a two-step approach, here applied to the CAPM.


**Step 1:** time-series regression for each individual stock $i$ to recover the stock's $\beta_i$
$$
r_{i,t} - r_{f,t} = \alpha_i + \beta_i (r_{MKT,t}-r_{f,t}) + \epsilon_{i,t}, \qquad t = 1,...,T
$$

**Step 2:** cross-sectional regression for each time point $t$ to recover $\lambda_t$
$$
r_{i,t} - r_{f,t} = \alpha_t + \lambda_t \times \beta_i + \epsilon_{i,t}, \qquad i = 1,...,377
$$

**Step 3:** average the $\lambda_t$ over time and compute the t-statistic
$$
\lambda_{FMacB} = \frac{1}{T} \sum_t \lambda_t, \qquad \text{and} \qquad t(\lambda_{FMacB}) = \frac{\lambda_{FMacB}}{std(\lambda_t)/ \sqrt{T}}
$$
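
The three steps can be sketched compactly with NumPy on simulated CAPM data. The panel size (600 periods, 50 stocks), the assumed market premium of 0.6% per period and all volatilities are illustrative assumptions, and returns are already in excess of the risk-free rate.

```python
import numpy as np

rng = np.random.default_rng(42)
T, N = 600, 50

true_lambda = 0.006                        # assumed market premium per period
beta_true = rng.uniform(0.5, 1.5, N)
mkt = rng.normal(true_lambda, 0.045, T)    # market excess return
eps = rng.normal(0.0, 0.02, (T, N))
excess = mkt[:, None] * beta_true + eps    # T x N excess returns

# Step 1: time-series regression per stock -> beta_i
X = np.column_stack([np.ones(T), mkt])
coef, *_ = np.linalg.lstsq(X, excess, rcond=None)      # shape 2 x N
beta_hat = coef[1]

# Step 2: cross-sectional regression per period -> lambda_t
Z = np.column_stack([np.ones(N), beta_hat])
gamma, *_ = np.linalg.lstsq(Z, excess.T, rcond=None)   # shape 2 x T
lambda_t = gamma[1]

# Step 3: time-series average and t-statistic
lambda_fm = lambda_t.mean()
t_stat = lambda_fm / (lambda_t.std(ddof=1) / np.sqrt(T))

print(lambda_fm, t_stat)
```

In this setup each $\lambda_t$ tracks the realized market excess return of period $t$, so the Fama-MacBeth estimate ends up close to the sample mean of `mkt`.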

## GLS Regression

#### Generalized Least Squares
##### Why?
Ordinary least squares (OLS) assumes iid errors. GLS drops this assumption and works with a general error covariance matrix $\Sigma$.
GLS can deal with heteroscedasticity and autocorrelation.

1. Homoscedastic: $\Sigma$ is a diagonal matrix with $\sigma_i = \sigma_j$ for all $i,j$
2. Heteroscedastic: $\Sigma$ is a diagonal matrix with $\sigma_i \neq \sigma_j$ for **at least one** pair $i \neq j$
3. Autocorrelated: $\epsilon$ is autocorrelated if $\textit{cov}(\epsilon_i,\epsilon_j) \neq 0$ for at least one pair $i \neq j$. The resulting matrix $\Sigma$ is non-diagonal
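
A minimal sketch of the GLS estimator $\hat\beta = (X'\Sigma^{-1}X)^{-1} X'\Sigma^{-1} y$ for the heteroscedastic case 2. The data-generating process (error standard deviation growing with $x$, true coefficients 1 and 2) is a made-up assumption and $\Sigma$ is treated as known, which it rarely is in practice.

```python
import numpy as np


def gls(X, y, sigma):
    """GLS estimator for a known error covariance matrix sigma."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    sigma_inv = np.linalg.inv(np.asarray(sigma, dtype=float))
    xt_si = X.T @ sigma_inv
    return np.linalg.solve(xt_si @ X, xt_si @ y)


rng = np.random.default_rng(7)
n = 500
x = rng.uniform(1.0, 10.0, n)
X = np.column_stack([np.ones(n), x])

sd = 0.5 * x                                   # error std grows with x
y = 1.0 + 2.0 * x + rng.normal(0.0, sd)

beta_ols = gls(X, y, np.eye(n))                # Sigma = I reproduces OLS
beta_gls = gls(X, y, np.diag(sd**2))           # diagonal, unequal variances

print(beta_ols)
print(beta_gls)
```

Passing the identity matrix gives the OLS estimate, which shows that OLS is the special case of GLS with homoscedastic, uncorrelated errors.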

## KPIs

#### Sharpe ratio

Measures the performance of an investment compared to a risk-free asset, after adjusting for its risk. It is the difference between the return of the investment and the risk-free return, divided by the standard deviation of the investment (its volatility). A risk-free investment is e.g. a U.S. Treasury security.


$\text{Ex ante}: \frac{E\left[R_{investment} - R_{f}\right]}{\sigma_{investment}} = \frac{E\left[R_{investment} - R_{f}\right]}{\sqrt{var(R_{investment} - R_{f})}}$

$\text{Ex post}: \frac{R_{investment} - R_{f}}{\sigma_{investment}} = \frac{R_{investment} - R_{f}}{\sqrt{var(R_{investment} - R_{f})}}$, using realized average returns

Information ratio is similar, but instead of the risk-free investment it uses a risky benchmark index (e.g. the S&P 500).
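
Both ratios as an ex post sketch on made-up return series. The function names are hypothetical, and `periods_per_year` scales the ratio by its square root for annualization.

```python
import numpy as np


def sharpe_ratio(r, r_f, periods_per_year=1):
    """Mean excess return over the risk-free rate divided by its std."""
    excess = np.asarray(r, dtype=float) - np.asarray(r_f, dtype=float)
    return float(excess.mean() / excess.std(ddof=1) * np.sqrt(periods_per_year))


def information_ratio(r, r_benchmark, periods_per_year=1):
    """Same idea, but measured against a risky benchmark."""
    active = np.asarray(r, dtype=float) - np.asarray(r_benchmark, dtype=float)
    return float(active.mean() / active.std(ddof=1) * np.sqrt(periods_per_year))


r = np.array([0.02, 0.01, -0.01, 0.03])            # made-up investment returns
r_index = np.array([0.015, 0.012, -0.02, 0.02])    # made-up benchmark returns

sr = sharpe_ratio(r, 0.001)
ir = information_ratio(r, r_index)
print(sr, ir)
```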


In [2]:
# InvOpp["SR_raw"] = InvOpp.iloc[:,0] / InvOpp.iloc[:,1] # calc Sharpe ratio with r_f = 0

## Statistical Tests

In [2]:
# Some imports

import pandas as pd
import numpy as np

#### Augmented Dickey-Fuller

- Test for stationarity
- Available in statsmodels


In [3]:
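# Example; Fast test for one column
from statsmodels.tsa.stattools import adfuller

# P_d: DataFrame of daily prices, one column per asset
adf_test = adfuller(P_d.iloc[:,1])
p_vle = adf_test[1]  # the second element of the result is the p-value
print(p_vle)

=> If the p-value is far above 0.05, the null hypothesis of a unit root cannot be rejected: the PRICE data of AI.PA is treated as NON-STATIONARY.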

Implications:
- Typical time-series models require stationary data
- First-difference the price data to make it stationary. Simple, geometric or log returns work.

In [4]:
# Example: Test for returns - applying the test to stationary data

# ADF test

def ADF(r: pd.DataFrame) -> pd.DataFrame:
    # Input:  DataFrame with one return series per column
    # Output: DataFrame of ADF p-values (one row per column); prints a verdict

    # The p-value is element 1 of the tuple returned by adfuller
    p_values = [adfuller(r.iloc[:, i])[1] for i in range(r.shape[1])]
    pvle = pd.DataFrame(p_values, index=r.columns, columns=["p_value"])

    # A p-value above 5% means the unit-root null cannot be rejected
    if (pvle["p_value"] > 0.05).any():
        print("At least one column of input data is NON-STATIONARY")
    else:
        print("Data is stationary")

    return pvle

#### Jarque-Bera Test

//TODO

## Making Data stationary / To returns

#### Continuously Compounded Daily Returns

- Makes data stationary

$$
r_t := \ln (P_t / P_{t-1})
$$
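
A small sketch with made-up prices. It also shows why log returns are convenient: they add up over time to the log return of the whole period.

```python
import numpy as np
import pandas as pd

prices = pd.Series([100.0, 102.0, 99.0, 105.0], name="price")  # made-up prices

log_ret = np.log(prices / prices.shift(1)).dropna()   # r_t = ln(P_t / P_{t-1})
simple_ret = (prices / prices.shift(1) - 1).dropna()  # P_t / P_{t-1} - 1

total_log = log_ret.sum()                             # equals ln(P_T / P_0)
print(log_ret.values)
print(total_log, np.log(prices.iloc[-1] / prices.iloc[0]))
```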


In [7]:
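# Example
P_d = pd.DataFrame()  # placeholder: fill with daily prices, one column per asset

r_d = np.log(P_d/P_d.shift())  # log return: ln(P_t / P_{t-1})

# The first row is NaN (no previous price), so drop it
r_d = r_d.dropna()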

In [9]:
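# Annualize daily statistics (252 trading days per year).
# InvOpp must already hold the daily "mean", "std" and "SR_raw" columns,
# e.g. built from r_d as in the 'Realized' Investment Opportunity Set section.

InvOpp["mean"] = InvOpp["mean"]*252
InvOpp["std"]  = InvOpp["std"]*np.sqrt(252)
InvOpp["SR_raw"] = InvOpp["SR_raw"]*np.sqrt(252)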

## 'Realized' Investment Opportunity Set

In [None]:
InvOpp = r_d.agg(['mean', 'std']).T  #get mean and vol as columns

InvOpp["SR_raw"] = InvOpp.iloc[:,0] / InvOpp.iloc[:,1] #calc Sharpe ratio with r_f = 0

# Annualize
InvOpp["mean"] = InvOpp["mean"]*252 # Remember 252 trading days
InvOpp["std"]  = InvOpp["std"]*np.sqrt(252)
InvOpp["SR_raw"]=InvOpp["SR_raw"]*np.sqrt(252)