<a href="https://colab.research.google.com/github/boyerb/Investments/blob/master/Ex10-Estimating_Alpha.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

***Investment Analysis***, Bates, Boyer, and Fletcher  

# Example Chapter 10: Estimating Alpha in Multi-factor Models

### Imports and Setup

In [1]:
#import packages
import pandas as pd
import statsmodels.api as sm

### Load in Data and Convert to Dataframe  

In [2]:
# Load in the data by first specifying the URL where the data can be found
url='https://github.com/boyerb/Investments/raw/master/Examples_3.41.xlsx'
columns_to_read = ["Date","Market","rz1","rz2","Portfolio_A","Tangent_Proposed"]
# read the data into a DataFrame
df = pd.read_excel(url, sheet_name='10-Zero Cost', header=1, usecols=columns_to_read, engine='openpyxl')
print(df.head())

        Date  Market     rz1     rz2  Portfolio_A  Tangent_Proposed
0 2020-01-31  0.0002 -0.0313 -0.0625    -0.003932          0.026149
1 2020-02-29 -0.0801  0.0107 -0.0380    -0.093185         -0.089211
2 2020-03-31 -0.1326 -0.0479 -0.1388    -0.311914         -0.093063
3 2020-04-30  0.1365  0.0245 -0.0134     0.222114          0.115936
4 2020-05-31  0.0559  0.0249 -0.0485     0.085738          0.034859


### Estimate the Single Factor Model Based on Proposed Tangent Portfolio
In the block of code below we run the regression

$ r_A-r_f=\alpha_{Ap}+\beta_{Ap}(r_p-r_f)$  

The key parameter here is the intercept $\alpha_{Ap}$.  If we have correctly built the tangent portfolio using the appropriate factors, then $\alpha_{Ap}=0$.  On the other hand, if $\alpha_{Ap}>0$ then the Sharpe ratio of portfolio $p$ can be increased by tilting the portfolio towards asset A. If $\alpha_{Ap}<0$ then the Sharpe ratio of portfolio $p$ can be increased by tilting the portfolio away from asset A. Since the intercept of any simple regression of $y$ on $x$ with slope $b$ is

$a=E[y]-bE[x]$,  

alpha in our regression will be measured as  

$\alpha_{Ap}=E[r_A]-r_f-\beta_{Ap}(E[r_p]-r_f)$.  
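
The intercept identity above can be checked numerically. The sketch below is only an illustration: it uses synthetic excess returns (not the textbook data), and the names `x`, `y`, `slope`, `intercept`, and `implied` are hypothetical. It fits a one-variable OLS regression with `numpy` and compares the fitted intercept with the sample analogue of $E[y]-bE[x]$.

```python
import numpy as np

# Synthetic excess returns, for illustration only (not the textbook data)
rng = np.random.default_rng(0)
x = rng.normal(0.01, 0.05, size=60)                 # hypothetical excess return on p
y = 0.002 + 1.1 * x + rng.normal(0, 0.02, size=60)  # hypothetical excess return on A

# OLS slope and intercept from a degree-1 polynomial fit
slope, intercept = np.polyfit(x, y, 1)

# Intercept implied by a = E[y] - b E[x], using sample means
implied = y.mean() - slope * x.mean()

# The two numbers should agree up to floating-point error
print(intercept, implied)
```

In a sample, the expectations are replaced by sample averages, which is why the estimated alpha inherits sampling error from both the average returns and the estimated beta.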

The regression is estimated with sampling error. Can we reject the hypothesis that the true $\alpha_{Ap}$ is zero with 95% confidence?  

In [3]:
rf = 0.003  # define the risk-free rate
df["ex_A"] = df["Portfolio_A"]-rf # excess return on Portfolio_A
X=df["Tangent_Proposed"]-rf # excess return on p
Y=df["ex_A"]
X=sm.add_constant(X) # we specify that we want to add a constant to the regression equation
model=sm.OLS(Y,X).fit() # run the regression
params = model.params  # extract the parameters
conf_int = model.conf_int(alpha=0.05)  # 95% CIs
# create a table of output
results_df = pd.DataFrame({
    'Parameter': params.index, # 1st column: parameter name
    'Estimate': params.values, # 2nd column: parameter value
    'CI Lower': conf_int[0].values, # 3rd column: lower bound on 95% CI
    'CI Upper': conf_int[1].values  # 4th column: upper bound on 95% CI
})

print(results_df)


          Parameter  Estimate  CI Lower  CI Upper
0             const  0.017075 -0.008089  0.042239
1  Tangent_Proposed  1.084012  0.613166  1.554858
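
One way to read this table: a two-sided test at the 5% level rejects a zero intercept exactly when the 95% confidence interval excludes zero. The short sketch below applies that rule to the `const` bounds printed above. The helper `excludes_zero` is a hypothetical name introduced here for illustration, not part of `statsmodels`.

```python
def excludes_zero(lower, upper):
    """Return True when the interval [lower, upper] does not contain zero."""
    return lower > 0 or upper < 0

# 95% confidence bounds for the intercept (const), copied from the output above
alpha_lower, alpha_upper = -0.008089, 0.042239

reject = excludes_zero(alpha_lower, alpha_upper)
print("Reject alpha = 0 at the 5% level:", reject)
```

Here the interval straddles zero, so this regression alone does not let us reject $\alpha_{Ap}=0$.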


### Estimate the Alpha of the Multi-factor Model  
In the block of code below we run the regression  
$ r_A-r_f=\alpha_A+\beta_{Am}(r_m-r_f)+\beta_{A1}r_{z1}+\beta_{A2}r_{z2}+z_A$  
The key parameter here is the intercept $\alpha_A$.  Note that the intercept of this regression is numerically identical to the intercept of the regression above. (The 95% confidence intervals will differ, because the two regressions have different residuals.) If we have correctly built the tangent portfolio using the appropriate factors, then $\alpha_i=0$ for any asset $i$ and the multi-factor model holds  

$ E[r_i]=r_f+\beta_{im}(E[r_m]-r_f)+\beta_{i1}E[r_{z1}]+\beta_{i2}E[r_{z2}]$.

Can we reject that $\alpha_A$ is zero with 95% confidence?

In [4]:
df["exmkt"] =df["Market"]-rf  # excess return on the market
X=df[["exmkt","rz1","rz2"]] # explanatory variables
Y=df["ex_A"] # dependent variable
X=sm.add_constant(X) # we specify that we want to add a constant to the regression equation
model=sm.OLS(Y,X).fit() # run regression
params = model.params # pull out parameters
conf_int = model.conf_int(alpha=0.05)  # 95% CIs
r_squared = model.rsquared
# create a table of output
results_df = pd.DataFrame({
    'Parameter': params.index, # 1st column: parameter name
    'Estimate': params.values, # 2nd column: parameter value
    'CI Lower': conf_int[0].values, # 3rd column: lower bound on 95% CI
    'CI Upper': conf_int[1].values  # 4th column: upper bound on 95% CI
})

print(results_df)

  Parameter  Estimate  CI Lower  CI Upper
0     const  0.017075  0.005050  0.029100
1     exmkt  1.306133  1.077008  1.535257
2       rz1  0.709247  0.314966  1.103529
3       rz2  1.193415  0.938124  1.448706
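
The `const` estimate here matches the single-factor estimate, as noted earlier. The sketch below illustrates why this can happen, under the assumption that the proposed tangent portfolio is the in-sample tangency portfolio of the factors, with weights proportional to $\Sigma^{-1}\mu$. That assumption is not confirmed by the spreadsheet. The data are synthetic, and the names `F`, `y`, `ols`, `w`, and `p` are hypothetical.

```python
import numpy as np

# Synthetic data, for illustration only (not the textbook data)
rng = np.random.default_rng(1)
T = 120
# Hypothetical factor excess returns: one market factor and two zero-cost factors
F = rng.normal([0.008, 0.003, 0.004], [0.05, 0.03, 0.03], size=(T, 3))
# Hypothetical asset excess return with a nonzero true alpha
y = 0.005 + F @ np.array([1.2, 0.6, 0.9]) + rng.normal(0, 0.02, size=T)

def ols(y, X):
    """Return OLS coefficients of y on X, with the constant in position 0."""
    Z = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return coef

# In-sample tangency weights are proportional to inv(Sigma) @ mu;
# the overall scale is arbitrary and does not affect the intercept
mu = F.mean(axis=0)
Sigma = np.cov(F, rowvar=False)
w = np.linalg.solve(Sigma, mu)
p = F @ w  # excess return on the in-sample tangency portfolio

alpha_single = ols(y, p)[0]  # intercept from the single-factor regression on p
alpha_multi = ols(y, F)[0]   # intercept from the multi-factor regression
print(alpha_single, alpha_multi)
```

The two intercepts agree up to floating-point error, while the residuals of the two regressions differ, which is consistent with the different confidence intervals reported in the two tables.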
