# Part I: Multiple Choice Questions

#### 1. (5 points) If our regression equation is $y = X\beta + u$, where we have $T$ observations and $K$ regressors, what will be the dimension of $\beta$ using the standard matrix notation?

> (a) $T \times K$
>
> (b) $T \times 1$
>
> (c) $K \times 1$
>
> (d) $K \times K.$


**Your answer:**

C
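
As a quick check of the dimensions, here is a minimal NumPy sketch. The sizes `T = 6` and `K = 3` and the random data are made up purely for illustration.

```python
import numpy as np

T, K = 6, 3                      # hypothetical sizes: 6 observations, 3 regressors
rng = np.random.default_rng(0)

X = rng.normal(size=(T, K))      # regressor matrix, T x K
beta = np.ones((K, 1))           # coefficient vector, K x 1
u = rng.normal(size=(T, 1))      # disturbance vector, T x 1

y = X @ beta + u                 # (T x K)(K x 1) + (T x 1) gives T x 1
print(X.shape, beta.shape, y.shape)
```

The product $X\beta$ is only defined when $\beta$ has $K$ rows, which is why $\beta$ must be $K \times 1$.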



#### 2. (5 points) Suppose that the value of $R^2$ for an estimated regression model is exactly one. Which of the following are true?

>> i. All of the data points must lie exactly on the line
>>
>> ii. All of the residuals must be zero
>>
>> iii. All of the variability of $y$ about its mean has been explained by the model
>>
>> iv. The fitted line will be horizontal with respect to all of the explanatory variables.
>
> (a) (ii) and (iv) only
>
> (b) (i) and (iii) only
>
> (c) (i), (ii), and (iii) only
>
> (d) (i), (ii), (iii), and (iv).


**Your answer:**

C

#### 3. (5 points) Which of these is a mathematical expression of the residual sum of squares?

>> i. $\hat{u}^\prime \hat{u}$
>>
>> ii. $\left[\hat{u}_1, \hat{u}_2, ..., \hat{u}_T \right]$
>>
>> iii. $\hat{u}_1 + \hat{u}_2 + ... + \hat{u}_T$
>
> (a) (i) only
>
> (b) (i) and (ii) only
>
> (c) (i) and (iii) only
>
> (d) (i), (ii), and (iii).


**Your answer:**

A


#### 4. (5 points) Why is $R^2$ a commonly used and perhaps better measure of how well a regression model fits the data than the residual sum of squares (RSS)?

> (a) The RSS is often too large
>
> (b) The RSS does not depend on the scale of the dependent variable whereas the $R^2$ does
>
> (c) The RSS depends on the scale of the dependent variable whereas the $R^2$  does not
>
> (d) The RSS depends on the scale of the independent variable whereas the $R^2$ does not.


**Your answer:**

C
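
The scale argument can be illustrated with a small simulation. The sketch below uses made-up data and a hypothetical helper `rss_and_r2`; it fits the same regression with $y$ measured in decimals and in percent.

```python
import numpy as np

def rss_and_r2(X, y):
    """Return (RSS, R^2) from an OLS fit of y on X (X includes a constant)."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_hat
    rss = float(resid @ resid)
    tss = float(((y - y.mean()) ** 2).sum())
    return rss, 1.0 - rss / tss

rng = np.random.default_rng(42)
T = 200
x = rng.normal(size=T)
X = np.column_stack([np.ones(T), x])
y = 0.5 + 2.0 * x + rng.normal(size=T)     # simulated dependent variable

rss_dec, r2_dec = rss_and_r2(X, y)          # y in its original units
rss_pct, r2_pct = rss_and_r2(X, 100 * y)    # same y multiplied by 100

print("RSS ratio:", rss_pct / rss_dec)      # RSS scales with the square of the units
print("R^2:", r2_dec, r2_pct)               # R^2 is unchanged
```

Multiplying $y$ by 100 multiplies the RSS by $100^2$ but leaves $R^2$ untouched, because $R^2$ is a ratio of two sums of squares measured in the same units.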

#### 5. (5 points) In the following regression estimated on 64 observations
        
$$y_t = \beta_1 + \beta_2 X_{2,t} + \beta_3 X_{3,t} + \beta_4 X_{4,t} + u_t,$$

**Which of the following null hypotheses could we test using an $F$-test?**

>> i. $\beta_2 = 0$
>>
>> ii. $\beta_2 = 1$ and $\beta_3 + \beta_4 = 1$
>>
>> iii. $\beta_3 \beta_4 = 1$
>>
>> iv. $\beta_2 - \beta_3 - \beta_4 = 1.$
>
> (a) (i) and (ii) only
>
> (b) (ii) and (iv) only
>
> (c) (i) and (iii) only
>
> (d) (i), (ii), (iii), and (iv)
>
> (e) (i), (ii), (iv) only.
>


**Your answer:**

E

# Part II: The Fama and French Model

## The Framework
We briefly describe key asset pricing models (the CAPM and the Fama-French 3-factor and 5-factor models), both mathematically and intuitively, and show how to estimate them in Python.

### Capital Asset Pricing Model (CAPM)
We have already introduced and estimated the Capital Asset Pricing Model (CAPM). The CAPM describes the relationship between systematic risk and the expected return on assets (e.g., stocks). It is used to price risky assets, by generating the expected return of an asset given its riskiness, and to calculate the cost of capital. Intuitively, the CAPM tells us that the return of a risky asset is explained by the market factor (i.e., $r_m$):

$$\bar{r}_a = r_f + \beta_a (\bar{r}_m - r_f)$$


$\bar{r}_a$: The return of the asset. This can be the return of any stock (e.g., Apple, Google, Tesla) or investment portfolio (e.g., any mutual or hedge fund portfolio).

$r_f$: The return of the risk-free asset. The risk-free asset is usually proxied by the US 3-month Treasury bill. It is assumed that the US government will not default on a short-term government security, so the US 3-month Treasury bill is widely treated in finance as risk-free.

$\bar{r}_m$: The return of the market. This is usually proxied by the S&P 500 return, as it is one of the most widely followed market indices.

The CAPM can also be expressed mathematically in the following notation:
$$r-r_f = \alpha + \beta_M (MKT-r_f)$$


We will use the above notation going forward.

Note: If you are running an asset pricing model on stocks in countries outside the US, it may be more appropriate to replace $r_m$ with that country's market index and $r_f$ with its short-term government security.
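
To make the first CAPM equation concrete, here is a small worked example. The function name `capm_expected_return` and all input values are hypothetical and chosen only for illustration.

```python
def capm_expected_return(r_f, beta, r_m):
    """Expected return implied by the CAPM: r_f + beta * (r_m - r_f)."""
    return r_f + beta * (r_m - r_f)

# Hypothetical annual inputs
r_f = 0.02    # risk-free rate
r_m = 0.08    # expected market return
beta = 1.5    # market beta of the asset

expected = capm_expected_return(r_f, beta, r_m)
print(f"{expected:.4f}")   # 0.0200 + 1.5 * 0.0600 = 0.1100
```

An asset with a beta of 1 earns the market return in expectation, and an asset with a beta of 0 earns the risk-free rate.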

### Fama-French 3-factor (FF3)
Another very popular asset pricing model in the empirical finance literature is the Fama-French 3-factor (FF3) model, published in 1993. Nobel Laureate Eugene Fama and researcher Kenneth French found that value stocks tend to outperform growth stocks (i.e., value), and that small-cap stocks tend to outperform large-cap stocks (i.e., size). Thus, the FF3 model adds size and value as risk factors, as shown below:

$$r-r_f = \alpha + \beta_M (MKT-r_f) + \beta_S SMB + \beta_v HML$$

### Fama-French 5-factor (FF5)
In 2015, Fama and French added two more risk factors to their popular 3-factor asset pricing model to form the Fama-French 5-factor (FF5) model. The two additions are 'quality' factors: profitability (stocks with high operating profitability perform better) and investment (stocks of companies with high total asset growth have below-average returns).

$$r-r_f = \alpha + \beta_M (MKT-r_f) + \beta_S SMB + \beta_v HML + \beta_r RMW + \beta_c CMA$$

SMB: The return spread of small minus large stocks (size).

HML: The return spread of cheap (high book-to-market) minus expensive (low book-to-market) stocks (value).

RMW: The return spread of the most profitable firms minus the least profitable (profit).

CMA: The return spread of firms that invest conservatively minus aggressively (investment).
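
To show what estimating these equations amounts to, here is a minimal sketch of an FF3 regression on simulated data. All factor returns, loadings, and sample sizes below are made up; the notebook itself uses real factor data and `statsmodels`, whereas this sketch uses plain NumPy least squares.

```python
import numpy as np

rng = np.random.default_rng(7)
T = 600  # hypothetical number of monthly observations

# Simulated factor returns (made-up means and volatilities)
mkt = rng.normal(0.006, 0.045, T)   # market excess return, MKT - r_f
smb = rng.normal(0.002, 0.030, T)   # size factor
hml = rng.normal(0.003, 0.030, T)   # value factor

# Simulated excess returns with known loadings and zero alpha
true_params = np.array([0.0, 1.0, 0.8, 0.4])   # alpha, beta_M, beta_S, beta_v
X = np.column_stack([np.ones(T), mkt, smb, hml])
excess_ret = X @ true_params + rng.normal(0.0, 0.01, T)

# OLS estimates of r - r_f = alpha + beta_M MKT + beta_S SMB + beta_v HML
params_hat, *_ = np.linalg.lstsq(X, excess_ret, rcond=None)
for name, est in zip(["alpha", "beta_M", "beta_S", "beta_v"], params_hat):
    print(f"{name:>7}: {est: .4f}")
```

With enough observations the estimates should land close to the loadings used in the simulation, and the intercept close to zero.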



## Solution

### Importing the Python modules
The statement `import module as short_form` lets you refer to the module by its short form. For example, after `import numpy as np`, `numpy.log` is written as `np.log`.

In [None]:
import pandas as pd
import numpy as np

import statsmodels.formula.api as sm # module for stats models
from statsmodels.iolib.summary2 import summary_col # module for presenting stats models outputs nicely

### Defining the Python regression function
The function downloads the risk factor returns from Ken French's website and runs CAPM and FF3 regressions (an FF5 regression is not estimated in this version). Input: a pandas Series of stock or portfolio returns.

In [None]:
def assetPriceReg(df_stk):
  import pandas_datareader.data as web  # module for reading datasets directly from the web

  # Reading in factor data
  df_factors = web.DataReader('F-F_Research_Data_5_Factors_2x3', 'famafrench', start=1900)[0]
  df_factors.rename(columns={'Mkt-RF': 'MKT'}, inplace=True)
  df_factors['MKT'] = df_factors['MKT']/100
  df_factors['SMB'] = df_factors['SMB']/100
  df_factors['HML'] = df_factors['HML']/100
  df_factors['RF'] = df_factors['RF']/100

  df_stk_name = df_stk.name
  df_stock_factor = pd.merge(df_stk,df_factors,left_index=True,right_index=True) # Merging the stock and factor returns dataframes together
  df_stock_factor['XsRet'] = df_stock_factor[df_stk_name] - df_stock_factor['RF'] # Calculating excess returns

  # Running CAPM and FF3 models with HAC (Newey-West) standard errors, 1 lag.
  CAPM = sm.ols(formula = 'XsRet ~ MKT', data=df_stock_factor).fit(cov_type='HAC',cov_kwds={'maxlags':1})
  FF3 = sm.ols( formula = 'XsRet ~ MKT + SMB + HML', data=df_stock_factor).fit(cov_type='HAC',cov_kwds={'maxlags':1})

  CAPMtstat = CAPM.tvalues
  FF3tstat = FF3.tvalues

  CAPMcoeff = CAPM.params
  FF3coeff = FF3.params

  # DataFrame with coefficients and t-stats
  results_df = pd.DataFrame({'CAPMcoeff':CAPMcoeff,'CAPMtstat':CAPMtstat,
                              'FF3coeff':FF3coeff, 'FF3tstat':FF3tstat},
  index = ['Intercept', 'MKT', 'SMB', 'HML'])


  dfoutput = summary_col([CAPM,FF3],stars=True,float_format='%0.4f',
                model_names=['CAPM','FF3'],
                info_dict={'N':lambda x: "{0:d}".format(int(x.nobs)),
                            'Adjusted R2':lambda x: "{:.4f}".format(x.rsquared_adj)},
                            regressor_order = ['Intercept', 'MKT', 'SMB', 'HML', 'RMW', 'CMA'])

  print(dfoutput)

  return {'results': results_df, 'CAPM': CAPM, 'FF3': FF3}
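
The regressions above request `cov_type='HAC'` with `maxlags=1`. As a rough illustration of what such standard errors involve, here is a NumPy sketch of the Newey-West estimator with Bartlett weights and no small-sample correction. The function `ols_hac` and the simulated data are hypothetical, and the sketch is not guaranteed to match the `statsmodels` output digit for digit.

```python
import numpy as np

def ols_hac(X, y, maxlags=1):
    """OLS coefficients with Newey-West (Bartlett kernel) standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta

    scores = X * resid[:, None]               # row t is x_t * u_t
    S = scores.T @ scores                     # lag-0 term (White / HC0 "meat")
    for lag in range(1, maxlags + 1):
        weight = 1.0 - lag / (maxlags + 1.0)  # Bartlett weight
        gamma = scores[lag:].T @ scores[:-lag]
        S += weight * (gamma + gamma.T)       # add both lag directions

    cov = XtX_inv @ S @ XtX_inv               # sandwich estimator
    return beta, np.sqrt(np.diag(cov))

# Simulated regression with AR(1) errors (made-up parameters)
rng = np.random.default_rng(1)
T = 500
x = rng.normal(size=T)
e = np.zeros(T)
for t in range(1, T):
    e[t] = 0.5 * e[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + e
X = np.column_stack([np.ones(T), x])

beta_hat, se_hac = ols_hac(X, y, maxlags=1)
_, se_white = ols_hac(X, y, maxlags=0)        # no lags: heteroskedasticity-robust only
print("coefficients:", beta_hat)
print("HAC s.e.:    ", se_hac)
print("White s.e.:  ", se_white)
```

Setting `maxlags=0` drops the autocovariance terms, so the estimator reduces to White's heteroskedasticity-robust covariance.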

### Loading portfolios from Ken French's website
First, let's explore available datasets on Ken French's website

In [None]:
from pandas_datareader.famafrench import get_available_datasets
datasets = get_available_datasets()
print(datasets)

['F-F_Research_Data_Factors', 'F-F_Research_Data_Factors_weekly', 'F-F_Research_Data_Factors_daily', 'F-F_Research_Data_5_Factors_2x3', 'F-F_Research_Data_5_Factors_2x3_daily', 'Portfolios_Formed_on_ME', 'Portfolios_Formed_on_ME_Wout_Div', 'Portfolios_Formed_on_ME_Daily', 'Portfolios_Formed_on_BE-ME', 'Portfolios_Formed_on_BE-ME_Wout_Div', 'Portfolios_Formed_on_BE-ME_Daily', 'Portfolios_Formed_on_OP', 'Portfolios_Formed_on_OP_Wout_Div', 'Portfolios_Formed_on_OP_Daily', 'Portfolios_Formed_on_INV', 'Portfolios_Formed_on_INV_Wout_Div', 'Portfolios_Formed_on_INV_Daily', '6_Portfolios_2x3', '6_Portfolios_2x3_Wout_Div', '6_Portfolios_2x3_weekly', '6_Portfolios_2x3_daily', '25_Portfolios_5x5', '25_Portfolios_5x5_Wout_Div', '25_Portfolios_5x5_Daily', '100_Portfolios_10x10', '100_Portfolios_10x10_Wout_Div', '100_Portfolios_10x10_Daily', '6_Portfolios_ME_OP_2x3', '6_Portfolios_ME_OP_2x3_Wout_Div', '6_Portfolios_ME_OP_2x3_daily', '25_Portfolios_ME_OP_5x5', '25_Portfolios_ME_OP_5x5_Wout_Div', '25_

Now, let's import the dataset containing 5x5 portfolios sorted on market equity (ME) and book-to-market ratio (B/M). It's a large file, so be patient.

The file is split into multiple segments; the first one contains monthly returns on the 25 portfolios.

In [None]:
import pandas_datareader.data as web  # module for reading datasets directly from the web
r = web.DataReader('25_Portfolios_5x5', 'famafrench', start=1900)[0]/100

In [None]:
r.head()

Date,SMALL LoBM,ME1 BM2,ME1 BM3,ME1 BM4,SMALL HiBM,ME2 BM1,ME2 BM2,ME2 BM3,ME2 BM4,ME2 BM5,...,ME4 BM1,ME4 BM2,ME4 BM3,ME4 BM4,ME4 BM5,BIG LoBM,ME5 BM2,ME5 BM3,ME5 BM4,BIG HiBM
1926-07,0.058248,-0.017006,0.004875,-0.01458,0.020534,0.012077,0.024192,0.004926,-0.026049,-0.003344,...,0.015893,0.015278,0.012978,0.002727,0.024678,0.034539,0.060902,0.020266,0.031111,0.005623
1926-08,-0.020206,-0.080282,0.013796,0.014606,0.083968,0.023618,-0.011849,0.040084,0.005038,0.061675,...,0.013336,0.03873,0.020021,0.021706,0.053422,0.010124,0.041903,0.020131,0.054849,0.077576
1926-09,-0.048291,-0.026154,-0.043417,-0.032729,0.008649,-0.02654,-0.012618,0.010829,-0.03548,-0.009401,...,0.010923,-0.00525,-0.017636,0.014646,0.00873,-0.012906,0.036538,0.00095,-0.007487,-0.024284
1926-10,-0.093729,-0.035519,-0.034948,0.034413,-0.025476,-0.028069,-0.032663,-0.050745,-0.080191,-0.013213,...,-0.033361,-0.026559,-0.02107,-0.031051,-0.053525,-0.027413,-0.030071,-0.022437,-0.046719,-0.058129
1926-11,0.055888,0.041877,0.024623,-0.044494,0.005362,0.031033,-0.02369,0.030078,0.051546,0.027292,...,0.034448,0.023887,0.037335,0.04932,0.018213,0.042946,0.025326,0.015204,0.036619,0.025636


### Run a regression for each stock
Try to collect all alphas into a list. We did this in the *CAPM* notebook. Do the same for the betas on SMB and HML. This makes it easier to see any patterns. If you are unsure how to do this, that's okay: you can simply look at the regression outputs one by one and try to find a pattern.

In [None]:
reg = assetPriceReg(r[r.columns[0]])["results"]


                  CAPM      FF3    
-----------------------------------
Intercept      -0.0049** -0.0047***
               (0.0019)  (0.0009)  
MKT            1.4203*** 1.0867*** 
               (0.0420)  (0.0244)  
SMB                      1.3906*** 
                         (0.0430)  
HML                      -0.4870***
                         (0.0427)  
R-squared      0.6308    0.9098    
R-squared Adj. 0.6303    0.9094    
N              710       710       
Adjusted R2    0.6303    0.9094    
Standard errors in parentheses.
* p<.1, ** p<.05, ***p<.01


Answer and explain the rest of the questions.

In [None]:
r.columns.values.reshape(5, 5)

array([['SMALL LoBM', 'ME1 BM2', 'ME1 BM3', 'ME1 BM4', 'SMALL HiBM'],
       ['ME2 BM1', 'ME2 BM2', 'ME2 BM3', 'ME2 BM4', 'ME2 BM5'],
       ['ME3 BM1', 'ME3 BM2', 'ME3 BM3', 'ME3 BM4', 'ME3 BM5'],
       ['ME4 BM1', 'ME4 BM2', 'ME4 BM3', 'ME4 BM4', 'ME4 BM5'],
       ['BIG LoBM', 'ME5 BM2', 'ME5 BM3', 'ME5 BM4', 'BIG HiBM']],
      dtype=object)

In [None]:
np.set_printoptions(precision=3, suppress=True)
r.mean().values.reshape(5, 5)

array([[0.009, 0.01 , 0.013, 0.014, 0.016],
       [0.009, 0.012, 0.012, 0.013, 0.015],
       [0.01 , 0.012, 0.012, 0.013, 0.014],
       [0.01 , 0.011, 0.011, 0.012, 0.013],
       [0.009, 0.009, 0.01 , 0.009, 0.012]])

In [None]:
# Standard error of the FF3 intercept (alpha) for the SMALL HiBM portfolio
reg = assetPriceReg(r['SMALL HiBM'])["FF3"]
print(reg.bse['Intercept'])


                  CAPM      FF3   
----------------------------------
Intercept      0.0053*** 0.0020***
               (0.0016)  (0.0006) 
MKT            1.0703*** 0.9405***
               (0.0450)  (0.0191) 
SMB                      1.0876***
                         (0.0431) 
HML                      0.5281***
                         (0.0308) 
R-squared      0.5799    0.8996   
R-squared Adj. 0.5793    0.8991   
N              710       710      
Adjusted R2    0.5793    0.8991   
Standard errors in parentheses.
* p<.1, ** p<.05, ***p<.01
0.0006234405924756384


In [None]:
alphas_FF3 = []
betas_FF3_mkt = []
betas_FF3_smb = []
betas_FF3_hml = []
r_squared_FF3 = []
std_error_FF3 = []
tstat_FF3 = []
p_values_FF3 = []


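# iterate over all 25 portfolios and store the FF3 estimates
for c in r.columns:
  reg = assetPriceReg(r[c])["FF3"]
  # select coefficients by label rather than by position
  alphas_FF3.append(reg.params['Intercept'])
  betas_FF3_mkt.append(reg.params['MKT'])
  betas_FF3_smb.append(reg.params['SMB'])
  betas_FF3_hml.append(reg.params['HML'])
  r_squared_FF3.append(reg.rsquared)
  # standard error, t-statistic, and p-value of the intercept (alpha)
  std_error_FF3.append(reg.bse['Intercept'])
  tstat_FF3.append(reg.tvalues['Intercept'])
  p_values_FF3.append(reg.pvalues['Intercept'])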



alphas_FF3 = np.array(alphas_FF3)
alphas_FF3 = pd.DataFrame(alphas_FF3, index = r.columns)
alphas_FF3 = alphas_FF3.rename(columns={0: 'alphas_FF3'})
alphas_FF3

betas_FF3_mkt = np.array(betas_FF3_mkt)
betas_FF3_mkt = pd.DataFrame(betas_FF3_mkt, index = r.columns)
betas_FF3_mkt = betas_FF3_mkt.rename(columns={0: 'betas_FF3_mkt'})


betas_FF3_smb = np.array(betas_FF3_smb)
betas_FF3_smb = pd.DataFrame(betas_FF3_smb, index = r.columns)
betas_FF3_smb = betas_FF3_smb.rename(columns={0: 'betas_FF3_smb'})


betas_FF3_hml = np.array(betas_FF3_hml)
betas_FF3_hml = pd.DataFrame(betas_FF3_hml, index = r.columns)
betas_FF3_hml = betas_FF3_hml.rename(columns={0: 'betas_FF3_hml'})


r_squared_FF3 = np.array(r_squared_FF3)
r_squared_FF3 = pd.DataFrame(r_squared_FF3, index = r.columns)
r_squared_FF3 = r_squared_FF3.rename(columns={0: 'r_squared_FF3'})

std_error_FF3 = np.array(std_error_FF3)
std_error_FF3 = pd.DataFrame(std_error_FF3, index = r.columns)
std_error_FF3 = std_error_FF3.rename(columns={0: 'std_error_FF3'})

tstat_FF3 = np.array(tstat_FF3)
tstat_FF3 = pd.DataFrame(tstat_FF3, index = r.columns)
tstat_FF3 = tstat_FF3.rename(columns={0: 'tstat_FF3'})

p_values_FF3 = np.array(p_values_FF3)
p_values_FF3 = pd.DataFrame(p_values_FF3, index = r.columns)
p_values_FF3 = p_values_FF3.rename(columns={0: 'p_values_FF3'})



                  CAPM      FF3    
-----------------------------------
Intercept      -0.0049** -0.0047***
               (0.0019)  (0.0009)  
MKT            1.4203*** 1.0867*** 
               (0.0420)  (0.0244)  
SMB                      1.3906*** 
                         (0.0430)  
HML                      -0.4870***
                         (0.0427)  
R-squared      0.6308    0.9098    
R-squared Adj. 0.6303    0.9094    
N              710       710       
Adjusted R2    0.6303    0.9094    
Standard errors in parentheses.
* p<.1, ** p<.05, ***p<.01

                  CAPM      FF3    
-----------------------------------
Intercept      0.0012    0.0003    
               (0.0016)  (0.0007)  
MKT            1.2300*** 0.9545*** 
               (0.0404)  (0.0214)  
SMB                      1.3093*** 
                         (0.0578)  
HML                      -0.1914***
                         (0.0453)  
R-squared      0.6250    0.9260    
R-squared Adj. 0.6245    0.9257    
N  

In [None]:
df_FF3 = pd.concat([alphas_FF3, betas_FF3_mkt, betas_FF3_smb, betas_FF3_hml, r_squared_FF3, std_error_FF3, tstat_FF3, p_values_FF3, ], axis = 1)
df_FF3

Unnamed: 0,alphas_FF3,betas_FF3_mkt,betas_FF3_smb,betas_FF3_hml,r_squared_FF3,std_error_FF3,tstat_FF3,p_values_FF3
SMALL LoBM,-0.004704,1.0867,1.390633,-0.487034,0.909792,0.000903,-5.209919,1.889232e-07
ME1 BM2,0.000324,0.954457,1.309284,-0.191418,0.926001,0.000729,0.44385,0.6571513
ME1 BM3,-0.00017,0.919855,1.09056,0.124242,0.954379,0.000489,-0.347692,0.7280716
ME1 BM4,0.001609,0.877676,1.074757,0.310933,0.950279,0.000516,3.116592,0.001829543
SMALL HiBM,0.001995,0.940474,1.087624,0.528105,0.899551,0.000623,3.200044,0.001374067
ME2 BM1,-0.001726,1.116447,1.034919,-0.513227,0.950539,0.00063,-2.737954,0.006182275
ME2 BM2,0.000244,1.002637,0.922244,-0.02906,0.955136,0.000482,0.506436,0.6125507
ME2 BM3,0.000723,0.965165,0.766746,0.259917,0.937241,0.000533,1.357207,0.1747153
ME2 BM4,0.000606,0.945648,0.72475,0.453861,0.946903,0.000466,1.301431,0.1931111
ME2 BM5,0.00027,1.07576,0.879592,0.655156,0.951475,0.00051,0.528445,0.5971902


In [None]:
def mean_of_betas(df):
  # columns 1 to 3 of df_FF3 hold the betas on MKT, SMB, and HML
  for col in df.columns[1:4]:
    print(df[col].mean())


In [None]:
mean_of_betas(df_FF3)

1.0162309926811424
0.5421405314214436
0.2145237305349336


In [None]:
alphas_CAPM = []
betas_CAPM = []
r_squared_CAPM = []

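# iterate over all 25 portfolios and store the CAPM estimates
for c in r.columns:
  reg = assetPriceReg(r[c])["CAPM"]
  # select coefficients by label rather than by position
  alphas_CAPM.append(reg.params['Intercept'])
  betas_CAPM.append(reg.params['MKT'])
  r_squared_CAPM.append(reg.rsquared)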


alphas_CAPM = np.array(alphas_CAPM)
alphas_CAPM = pd.DataFrame(alphas_CAPM, index = r.columns)
alphas_CAPM = alphas_CAPM.rename(columns={0: 'alphas_CAPM'})

betas_CAPM = np.array(betas_CAPM)
betas_CAPM = pd.DataFrame(betas_CAPM, index = r.columns)
betas_CAPM = betas_CAPM.rename(columns={0: 'betas_CAPM'})


r_squared_CAPM  = np.array(r_squared_CAPM )
r_squared_CAPM = pd.DataFrame(r_squared_CAPM, index = r.columns)
r_squared_CAPM = r_squared_CAPM.rename(columns={0: 'r_squared_CAPM'})





In [None]:
df_CAPM = pd.concat([alphas_CAPM, betas_CAPM, r_squared_CAPM], axis = 1)
df_CAPM

Unnamed: 0,alphas_CAPM,betas_CAPM,r_squared_CAPM
SMALL LoBM,-0.004858,1.420329,0.630823
ME1 BM2,0.00119,1.230027,0.624986
ME1 BM3,0.001625,1.108459,0.67418
ME1 BM4,0.004092,1.03635,0.633845
SMALL HiBM,0.005316,1.070251,0.579943
ME2 BM1,-0.002411,1.386489,0.740694
ME2 BM2,0.001255,1.181484,0.760513
ME2 BM3,0.002639,1.07287,0.753101
ME2 BM4,0.003205,1.017417,0.724984
ME2 BM5,0.003818,1.147809,0.681445


In [None]:
df_all = pd.concat([df_CAPM, df_FF3], axis = 1)
df_all

Unnamed: 0,alphas_CAPM,betas_CAPM,r_squared_CAPM,alphas_FF3,betas_FF3_mkt,betas_FF3_smb,betas_FF3_hml,r_squared_FF3,std_error_FF3,tstat_FF3,p_values_FF3
SMALL LoBM,-0.004858,1.420329,0.630823,-0.004704,1.0867,1.390633,-0.487034,0.909792,0.000903,-5.209919,1.889232e-07
ME1 BM2,0.00119,1.230027,0.624986,0.000324,0.954457,1.309284,-0.191418,0.926001,0.000729,0.44385,0.6571513
ME1 BM3,0.001625,1.108459,0.67418,-0.00017,0.919855,1.09056,0.124242,0.954379,0.000489,-0.347692,0.7280716
ME1 BM4,0.004092,1.03635,0.633845,0.001609,0.877676,1.074757,0.310933,0.950279,0.000516,3.116592,0.001829543
SMALL HiBM,0.005316,1.070251,0.579943,0.001995,0.940474,1.087624,0.528105,0.899551,0.000623,3.200044,0.001374067
ME2 BM1,-0.002411,1.386489,0.740694,-0.001726,1.116447,1.034919,-0.513227,0.950539,0.00063,-2.737954,0.006182275
ME2 BM2,0.001255,1.181484,0.760513,0.000244,1.002637,0.922244,-0.02906,0.955136,0.000482,0.506436,0.6125507
ME2 BM3,0.002639,1.07287,0.753101,0.000723,0.965165,0.766746,0.259917,0.937241,0.000533,1.357207,0.1747153
ME2 BM4,0.003205,1.017417,0.724984,0.000606,0.945648,0.72475,0.453861,0.946903,0.000466,1.301431,0.1931111
ME2 BM5,0.003818,1.147809,0.681445,0.00027,1.07576,0.879592,0.655156,0.951475,0.00051,0.528445,0.5971902


In [None]:
df_alpha_annual = df_all[["alphas_CAPM", "alphas_FF3"]]*12
df_alpha_annual

Unnamed: 0,alphas_CAPM,alphas_FF3
SMALL LoBM,-0.058292,-0.056451
ME1 BM2,0.014285,0.003885
ME1 BM3,0.019506,-0.002041
ME1 BM4,0.049105,0.01931
SMALL HiBM,0.063787,0.02394
ME2 BM1,-0.028931,-0.020713
ME2 BM2,0.015058,0.002929
ME2 BM3,0.031665,0.008679
ME2 BM4,0.03846,0.007278
ME2 BM5,0.04582,0.003237
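
The annualization above simply multiplies monthly alphas by 12. For comparison, the sketch below contrasts that with compounding, using the monthly CAPM alpha of SMALL HiBM reported in the results table. The helper names are hypothetical.

```python
def annualize_simple(monthly_alpha):
    """Simple annualization: 12 times the monthly figure."""
    return 12 * monthly_alpha

def annualize_compound(monthly_alpha):
    """Compounded annualization: (1 + a)^12 - 1."""
    return (1 + monthly_alpha) ** 12 - 1

monthly_alpha = 0.005316   # monthly CAPM alpha of SMALL HiBM from df_all
simple = annualize_simple(monthly_alpha)
compound = annualize_compound(monthly_alpha)

print(f"simple:   {simple:.4f}")
print(f"compound: {compound:.4f}")
```

For alphas of this size the two conventions differ only slightly, so the choice does not change the economic interpretation.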


In [None]:
reg1 = assetPriceReg(r['SMALL HiBM'])


                  CAPM      FF3   
----------------------------------
Intercept      0.0053*** 0.0020***
               (0.0016)  (0.0006) 
MKT            1.0703*** 0.9405***
               (0.0450)  (0.0191) 
SMB                      1.0876***
                         (0.0431) 
HML                      0.5281***
                         (0.0308) 
R-squared      0.5799    0.8996   
R-squared Adj. 0.5793    0.8991   
N              710       710      
Adjusted R2    0.5793    0.8991   
Standard errors in parentheses.
* p<.1, ** p<.05, ***p<.01


In [None]:
import pandas_datareader.data as web  # module for reading datasets directly from the web
df_stk = r['SMALL HiBM']
# Reading in factor data
df_factors = web.DataReader('F-F_Research_Data_5_Factors_2x3', 'famafrench', start=1900)[0]
df_factors.rename(columns={'Mkt-RF': 'MKT'}, inplace=True)
df_factors['MKT'] = df_factors['MKT']/100
df_factors['SMB'] = df_factors['SMB']/100
df_factors['HML'] = df_factors['HML']/100
df_factors['RF'] = df_factors['RF']/100

df_stk_name = df_stk.name
df_stock_factor = pd.merge(df_stk,df_factors,left_index=True,right_index=True) # Merging the stock and factor returns dataframes together
df_stock_factor['XsRet'] = df_stock_factor[df_stk_name] - df_stock_factor['RF'] # Calculating excess returns

# Running CAPM and FF3 models with HAC (Newey-West) standard errors, 1 lag.
CAPM = sm.ols(formula = 'XsRet ~ MKT', data=df_stock_factor).fit(cov_type='HAC',cov_kwds={'maxlags':1})
FF3 = sm.ols( formula = 'XsRet ~ MKT + SMB + HML', data=df_stock_factor).fit(cov_type='HAC',cov_kwds={'maxlags':1})


In [None]:
print(FF3.summary())

                            OLS Regression Results                            
Dep. Variable:                  XsRet   R-squared:                       0.900
Model:                            OLS   Adj. R-squared:                  0.899
Method:                 Least Squares   F-statistic:                     2193.
Date:                Sun, 09 Oct 2022   Prob (F-statistic):               0.00
Time:                        22:48:40   Log-Likelihood:                 1772.7
No. Observations:                 710   AIC:                            -3537.
Df Residuals:                     706   BIC:                            -3519.
Df Model:                           3                                         
Covariance Type:                  HAC                                         
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.0020      0.001      3.200      0.0

In [None]:
print(FF3.f_test('SMB = HML'))

<F test: F=array([[137.792]]), p=3.4263493284710105e-29, df_denom=706, df_num=1>


In [None]:
import scipy.stats as stats
two_sided_tstat = stats.norm.ppf(.975)
two_sided_tstat

1.959963984540054

In [None]:
negative_side = -two_sided_tstat
negative_side

-1.959963984540054

In [None]:
df_all['Hypothesis at 5% FF3 T-test'] = df_all['tstat_FF3'].apply(lambda x: 'Reject Null' if x >two_sided_tstat  or x<negative_side else 'Cannot Reject Null')
df_all

Unnamed: 0,alphas_CAPM,betas_CAPM,r_squared_CAPM,alphas_FF3,betas_FF3_mkt,betas_FF3_smb,betas_FF3_hml,r_squared_FF3,std_error_FF3,tstat_FF3,p_values_FF3,Hypothesis at 5% FF3 T-test
SMALL LoBM,-0.004858,1.420329,0.630823,-0.004704,1.0867,1.390633,-0.487034,0.909792,0.000903,-5.209919,1.889232e-07,Reject Null
ME1 BM2,0.00119,1.230027,0.624986,0.000324,0.954457,1.309284,-0.191418,0.926001,0.000729,0.44385,0.6571513,Cannot Reject Null
ME1 BM3,0.001625,1.108459,0.67418,-0.00017,0.919855,1.09056,0.124242,0.954379,0.000489,-0.347692,0.7280716,Cannot Reject Null
ME1 BM4,0.004092,1.03635,0.633845,0.001609,0.877676,1.074757,0.310933,0.950279,0.000516,3.116592,0.001829543,Reject Null
SMALL HiBM,0.005316,1.070251,0.579943,0.001995,0.940474,1.087624,0.528105,0.899551,0.000623,3.200044,0.001374067,Reject Null
ME2 BM1,-0.002411,1.386489,0.740694,-0.001726,1.116447,1.034919,-0.513227,0.950539,0.00063,-2.737954,0.006182275,Reject Null
ME2 BM2,0.001255,1.181484,0.760513,0.000244,1.002637,0.922244,-0.02906,0.955136,0.000482,0.506436,0.6125507,Cannot Reject Null
ME2 BM3,0.002639,1.07287,0.753101,0.000723,0.965165,0.766746,0.259917,0.937241,0.000533,1.357207,0.1747153,Cannot Reject Null
ME2 BM4,0.003205,1.017417,0.724984,0.000606,0.945648,0.72475,0.453861,0.946903,0.000466,1.301431,0.1931111,Cannot Reject Null
ME2 BM5,0.003818,1.147809,0.681445,0.00027,1.07576,0.879592,0.655156,0.951475,0.00051,0.528445,0.5971902,Cannot Reject Null


In [None]:
df_all['Hypothesis at 5% FF3 P-val test'] = df_all['p_values_FF3'].apply(lambda x: 'Reject Null' if x <= 0.05 else 'Cannot Reject Null')
df_all

Unnamed: 0,alphas_CAPM,betas_CAPM,r_squared_CAPM,alphas_FF3,betas_FF3_mkt,betas_FF3_smb,betas_FF3_hml,r_squared_FF3,std_error_FF3,tstat_FF3,p_values_FF3,Hypothesis at 5% FF3 T-test,Hypothesis at 5% FF3 P-val test
SMALL LoBM,-0.004858,1.420329,0.630823,-0.004704,1.0867,1.390633,-0.487034,0.909792,0.000903,-5.209919,1.889232e-07,Reject Null,Reject Null
ME1 BM2,0.00119,1.230027,0.624986,0.000324,0.954457,1.309284,-0.191418,0.926001,0.000729,0.44385,0.6571513,Cannot Reject Null,Cannot Reject Null
ME1 BM3,0.001625,1.108459,0.67418,-0.00017,0.919855,1.09056,0.124242,0.954379,0.000489,-0.347692,0.7280716,Cannot Reject Null,Cannot Reject Null
ME1 BM4,0.004092,1.03635,0.633845,0.001609,0.877676,1.074757,0.310933,0.950279,0.000516,3.116592,0.001829543,Reject Null,Reject Null
SMALL HiBM,0.005316,1.070251,0.579943,0.001995,0.940474,1.087624,0.528105,0.899551,0.000623,3.200044,0.001374067,Reject Null,Reject Null
ME2 BM1,-0.002411,1.386489,0.740694,-0.001726,1.116447,1.034919,-0.513227,0.950539,0.00063,-2.737954,0.006182275,Reject Null,Reject Null
ME2 BM2,0.001255,1.181484,0.760513,0.000244,1.002637,0.922244,-0.02906,0.955136,0.000482,0.506436,0.6125507,Cannot Reject Null,Cannot Reject Null
ME2 BM3,0.002639,1.07287,0.753101,0.000723,0.965165,0.766746,0.259917,0.937241,0.000533,1.357207,0.1747153,Cannot Reject Null,Cannot Reject Null
ME2 BM4,0.003205,1.017417,0.724984,0.000606,0.945648,0.72475,0.453861,0.946903,0.000466,1.301431,0.1931111,Cannot Reject Null,Cannot Reject Null
ME2 BM5,0.003818,1.147809,0.681445,0.00027,1.07576,0.879592,0.655156,0.951475,0.00051,0.528445,0.5971902,Cannot Reject Null,Cannot Reject Null


# **Answer to Questions Part II: The Fama and French Model (40 points):**

### **Question 1)**

Here we can see all the data for the 25 portfolios in a single dataframe for the CAPM and FF3:

In [None]:
df_all

Unnamed: 0,alphas_CAPM,betas_CAPM,r_squared_CAPM,alphas_FF3,betas_FF3_mkt,betas_FF3_smb,betas_FF3_hml,r_squared_FF3,std_error_FF3,tstat_FF3,p_values_FF3,Hypothesis at 5% FF3 T-test,Hypothesis at 5% FF3 P-val test
SMALL LoBM,-0.004858,1.420329,0.630823,-0.004704,1.0867,1.390633,-0.487034,0.909792,0.000903,-5.209919,1.889232e-07,Reject Null,Reject Null
ME1 BM2,0.00119,1.230027,0.624986,0.000324,0.954457,1.309284,-0.191418,0.926001,0.000729,0.44385,0.6571513,Cannot Reject Null,Cannot Reject Null
ME1 BM3,0.001625,1.108459,0.67418,-0.00017,0.919855,1.09056,0.124242,0.954379,0.000489,-0.347692,0.7280716,Cannot Reject Null,Cannot Reject Null
ME1 BM4,0.004092,1.03635,0.633845,0.001609,0.877676,1.074757,0.310933,0.950279,0.000516,3.116592,0.001829543,Reject Null,Reject Null
SMALL HiBM,0.005316,1.070251,0.579943,0.001995,0.940474,1.087624,0.528105,0.899551,0.000623,3.200044,0.001374067,Reject Null,Reject Null
ME2 BM1,-0.002411,1.386489,0.740694,-0.001726,1.116447,1.034919,-0.513227,0.950539,0.00063,-2.737954,0.006182275,Reject Null,Reject Null
ME2 BM2,0.001255,1.181484,0.760513,0.000244,1.002637,0.922244,-0.02906,0.955136,0.000482,0.506436,0.6125507,Cannot Reject Null,Cannot Reject Null
ME2 BM3,0.002639,1.07287,0.753101,0.000723,0.965165,0.766746,0.259917,0.937241,0.000533,1.357207,0.1747153,Cannot Reject Null,Cannot Reject Null
ME2 BM4,0.003205,1.017417,0.724984,0.000606,0.945648,0.72475,0.453861,0.946903,0.000466,1.301431,0.1931111,Cannot Reject Null,Cannot Reject Null
ME2 BM5,0.003818,1.147809,0.681445,0.00027,1.07576,0.879592,0.655156,0.951475,0.00051,0.528445,0.5971902,Cannot Reject Null,Cannot Reject Null


The following dataframe shows annualized alphas (monthly alphas multiplied by 12) for both the CAPM and FF3. Some portfolios have economically large alphas, while others have alphas close to zero. For example, ME1 BM4 has an annualized CAPM alpha of approximately 4.9%, which is quite large, whereas ME4 BM2 has an alpha of 0.054%, which is economically small.
Comparing the two models, some portfolios have alphas of similar magnitude under the CAPM and FF3 (e.g., SMALL LoBM: about -5.8% versus -5.6%), while for others the two alphas differ substantially (e.g., ME2 BM5: about 4.6% versus 0.3%).

In [None]:
df_alpha_annual

Unnamed: 0,alphas_CAPM,alphas_FF3
SMALL LoBM,-0.058292,-0.056451
ME1 BM2,0.014285,0.003885
ME1 BM3,0.019506,-0.002041
ME1 BM4,0.049105,0.01931
SMALL HiBM,0.063787,0.02394
ME2 BM1,-0.028931,-0.020713
ME2 BM2,0.015058,0.002929
ME2 BM3,0.031665,0.008679
ME2 BM4,0.03846,0.007278
ME2 BM5,0.04582,0.003237


The $R^2$ values of the CAPM and FF3 differ considerably. For each of the 25 portfolios, the $R^2$ of FF3 is higher than that of the CAPM. Because FF3 nests the CAPM, its $R^2$ can never be lower, but the size of the increase (e.g., from 0.63 to 0.91 for SMALL LoBM) suggests that FF3 explains the variation of the dependent variables (the 25 portfolio returns) much better than the CAPM does.

In [None]:
df_all[["r_squared_CAPM", "r_squared_FF3"]]

Unnamed: 0,r_squared_CAPM,r_squared_FF3
SMALL LoBM,0.630823,0.909792
ME1 BM2,0.624986,0.926001
ME1 BM3,0.67418,0.954379
ME1 BM4,0.633845,0.950279
SMALL HiBM,0.579943,0.899551
ME2 BM1,0.740694,0.950539
ME2 BM2,0.760513,0.955136
ME2 BM3,0.753101,0.937241
ME2 BM4,0.724984,0.946903
ME2 BM5,0.681445,0.951475
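
Since $R^2$ cannot fall when regressors are added, it can help to look at the adjusted $R^2$ as well (reported as "R-squared Adj." in the regression tables). The sketch below uses simulated data and hypothetical helper names; the extra regressor is pure noise by construction.

```python
import numpy as np

def r_squared(X, y):
    """R^2 from an OLS fit of y on X (X includes a constant)."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_hat
    demeaned = y - y.mean()
    return 1.0 - (resid @ resid) / (demeaned @ demeaned)

def adj_r_squared(r2, T, k):
    """Adjusted R^2 for T observations and k regressors (including the constant)."""
    return 1.0 - (1.0 - r2) * (T - 1) / (T - k)

rng = np.random.default_rng(3)
T = 300
mkt = rng.normal(size=T)
noise_factor = rng.normal(size=T)            # unrelated to y by construction
y = 0.1 + 1.2 * mkt + rng.normal(size=T)

X_small = np.column_stack([np.ones(T), mkt])
X_big = np.column_stack([X_small, noise_factor])

r2_small, r2_big = r_squared(X_small, y), r_squared(X_big, y)
print("R^2:     ", r2_small, r2_big)
print("adj. R^2:", adj_r_squared(r2_small, T, 2), adj_r_squared(r2_big, T, 3))
```

The plain $R^2$ of the larger model is at least as high even though the added regressor is noise, whereas the adjusted $R^2$ penalizes the extra parameter. In the portfolio regressions above, the adjusted $R^2$ also rises sharply from the CAPM to FF3, so the improvement is not a mechanical artifact.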


### Question 2)

We run an F-test for the SMALL HiBM portfolio with the null hypothesis $H_0: \beta_S = \beta_v$, i.e., that the loadings on SMB and HML are equal.
The test below gives $F = 137.79$ with a p-value of about $3.43 \times 10^{-29}$, far below any conventional significance level, so we reject the null.

In [None]:
print(FF3.f_test('SMB = HML'))

<F test: F=array([[137.792]]), p=3.4263493284710105e-29, df_denom=706, df_num=1>
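
To show what a test of a single linear restriction computes, here is a sketch on simulated data. It assumes classical (homoskedastic) OLS standard errors rather than the HAC covariance used in the notebook, and the function `wald_f_test` and all data are hypothetical.

```python
import numpy as np

def wald_f_test(X, y, R, q):
    """F-statistic for H0: R @ beta = q under classical OLS assumptions."""
    T, K = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (T - K)          # residual variance estimate
    cov = sigma2 * XtX_inv                    # covariance matrix of beta
    diff = R @ beta - q
    J = R.shape[0]                            # number of restrictions
    F = float(diff @ np.linalg.solve(R @ cov @ R.T, diff) / J)
    return F, beta, cov

rng = np.random.default_rng(11)
T = 400
mkt, smb, hml = rng.normal(size=(3, T))       # made-up factor series
X = np.column_stack([np.ones(T), mkt, smb, hml])
y = X @ np.array([0.0, 1.0, 1.1, 0.5]) + rng.normal(size=T)

R = np.array([[0.0, 0.0, 1.0, -1.0]])         # picks out beta_S - beta_v
q = np.array([0.0])
F_stat, beta_hat, cov = wald_f_test(X, y, R, q)

# Equivalent t-statistic for the difference beta_S - beta_v
d = beta_hat[2] - beta_hat[3]
se_d = np.sqrt(cov[2, 2] + cov[3, 3] - 2 * cov[2, 3])
t_stat = d / se_d
print(F_stat, t_stat ** 2)
```

With one restriction the F-statistic equals the square of the t-statistic on the difference of the two coefficients, which is why the hypothesis could equally be tested with a t-test.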


### **Question 3)**

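At the 5% significance level, the null hypothesis that the FF3 alpha is zero ($H_0: \alpha = 0$) is rejected for some portfolios and not for others, as the following dataframe shows. Among the portfolios displayed, it is rejected for SMALL LoBM, ME1 BM4, SMALL HiBM, and ME2 BM1. The t-test and the p-value test lead to the same decision for every portfolio.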


In [None]:
df_all[["std_error_FF3", "tstat_FF3", "p_values_FF3", "Hypothesis at 5% FF3 P-val test", "Hypothesis at 5% FF3 T-test"]]

Portfolio,std_error_FF3,tstat_FF3,p_values_FF3,Hypothesis at 5% FF3 P-val test,Hypothesis at 5% FF3 T-test
SMALL LoBM,0.000903,-5.209919,1.889232e-07,Reject Null,Reject Null
ME1 BM2,0.000729,0.44385,0.6571513,Cannot Reject Null,Cannot Reject Null
ME1 BM3,0.000489,-0.347692,0.7280716,Cannot Reject Null,Cannot Reject Null
ME1 BM4,0.000516,3.116592,0.001829543,Reject Null,Reject Null
SMALL HiBM,0.000623,3.200044,0.001374067,Reject Null,Reject Null
ME2 BM1,0.00063,-2.737954,0.006182275,Reject Null,Reject Null
ME2 BM2,0.000482,0.506436,0.6125507,Cannot Reject Null,Cannot Reject Null
ME2 BM3,0.000533,1.357207,0.1747153,Cannot Reject Null,Cannot Reject Null
ME2 BM4,0.000466,1.301431,0.1931111,Cannot Reject Null,Cannot Reject Null
ME2 BM5,0.00051,0.528445,0.5971902,Cannot Reject Null,Cannot Reject Null
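The two decision columns in the table can be reproduced with a couple of simple rules. The sketch below uses four rows copied from the table above; the critical value of 1.96 is an assumption (the large-sample two-sided 5% value), and the function names are hypothetical.

```python
# (t-statistic, p-value) pairs copied from the table above
results = {
    "SMALL LoBM": (-5.209919, 1.889232e-07),
    "ME1 BM2": (0.44385, 0.6571513),
    "ME1 BM4": (3.116592, 0.001829543),
    "ME2 BM3": (1.357207, 0.1747153),
}

SIGNIFICANCE_LEVEL = 0.05
T_CRITICAL = 1.96  # assumption: large-sample two-sided 5% critical value

def decide_by_pvalue(p_value):
    """Reject H0: alpha = 0 when the p-value is below the significance level."""
    return "Reject Null" if p_value < SIGNIFICANCE_LEVEL else "Cannot Reject Null"

def decide_by_tstat(t_stat):
    """Reject H0: alpha = 0 when |t| exceeds the critical value."""
    return "Reject Null" if abs(t_stat) > T_CRITICAL else "Cannot Reject Null"

decisions = {
    name: (decide_by_pvalue(p), decide_by_tstat(t))
    for name, (t, p) in results.items()
}

for name, (by_p, by_t) in decisions.items():
    print(f"{name:12s} p-value rule: {by_p:20s} t-stat rule: {by_t}")
```

Both rules give the same answer for every row, as they should for a two-sided test.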


### Question 4)

The following output shows the mean of each FF3 beta across the 25 portfolios, in this order:
- betas_FF3_mkt
- betas_FF3_smb
- betas_FF3_hml

Each of the 25 portfolios has its own betas, so the averages summarize the typical factor exposure.
The average market beta is about 1.02, the average SMB beta is about 0.54, and the average HML beta is about 0.21, so on average the portfolios load far more heavily on size than on value.

In [None]:
mean_of_betas(df_FF3)

1.0162309926811424
0.5421405314214436
0.2145237305349336


### Question 5

The CAPM suggests that the best performing portfolios are:
- SMALL HiBM with an alpha of 0.0053
- ME1 BM4 with an alpha of 0.0041

CAPM worst:
- SMALL LoBM with an alpha of -0.0049

FF3 best:
- SMALL HiBM with an alpha of 0.002
- ME1 BM4 with an alpha of 0.0016

FF3 worst:
- SMALL LoBM with an alpha of -0.0047

In both models, the best performing portfolios are small-size, high book-to-market portfolios, and the worst performing is the small-size, low book-to-market portfolio.



### Question 6)

If I were an asset manager, I would invest in SMALL HiBM and ME1 BM4, as both models suggest. I would also short the worst performing portfolio, SMALL LoBM, which has a negative alpha under both models. The proceeds from the short position could then fund larger positions in the better performing portfolios.


# Part III: An Optimal Portfolio of Industries

## Modern Portfolio Theory

Modern Portfolio Theory (MPT) is an investment theory developed by Harry Markowitz and published under the title "Portfolio Selection" in the Journal of Finance in 1952.

There are a few underlying concepts that can help anyone to understand MPT. If you are familiar with finance, you might know what the acronym "TANSTAAFL" stands for. It is a famous acronym for "There Ain't No Such Thing As A Free Lunch". This concept is also closely related to 'risk-return trade-off'.

Higher risk is associated with greater probability of higher return and lower risk with a greater probability of smaller return. MPT assumes that investors are risk-averse, meaning that given two portfolios that offer the same expected return, investors will prefer the less risky one. Thus, an investor will take on increased risk only if compensated by higher expected returns.

Another factor that comes into play in MPT is "diversification". Modern portfolio theory says that it is not enough to look at the expected risk and return of one particular stock. By investing in more than one stock, an investor can reap the benefits of diversification, chief among them a reduction in the riskiness of the portfolio.

What you need to understand is that the risk of a portfolio is not equal to the weighted average of the risks of the individual stocks in it. The portfolio return is indeed the weighted average of the individual stock returns, but that is not the case for risk. Risk is about how volatile an asset is, so if you hold more than one stock, you have to take into account how the stocks' movements correlate with each other. The beauty of diversification is that, by optimising the allocation, you can even get lower risk than that of the least risky stock in your portfolio.
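A small two-asset example makes this point concrete. The sketch below uses made-up volatilities (10% and 20%) and zero correlation, all of which are assumptions for illustration; the function names are hypothetical. It relies on the standard two-asset variance formula $\sigma_p^2 = w_1^2\sigma_1^2 + w_2^2\sigma_2^2 + 2 w_1 w_2 \rho \sigma_1 \sigma_2$.

```python
import math

def portfolio_vol(w1, sigma1, sigma2, rho):
    """Volatility of a two-asset portfolio with weights w1 and 1 - w1."""
    w2 = 1.0 - w1
    variance = (w1 * sigma1) ** 2 + (w2 * sigma2) ** 2 + 2 * w1 * w2 * rho * sigma1 * sigma2
    return math.sqrt(variance)

def min_variance_weight(sigma1, sigma2, rho):
    """Weight on asset 1 that minimizes the two-asset portfolio variance."""
    cov = rho * sigma1 * sigma2
    return (sigma2 ** 2 - cov) / (sigma1 ** 2 + sigma2 ** 2 - 2 * cov)

# Hypothetical inputs: two uncorrelated assets
sigma1, sigma2, rho = 0.10, 0.20, 0.0

w1 = min_variance_weight(sigma1, sigma2, rho)
vol = portfolio_vol(w1, sigma1, sigma2, rho)
weighted_avg_vol = w1 * sigma1 + (1 - w1) * sigma2

print(f"weight on asset 1:        {w1:.2f}")
print(f"portfolio volatility:     {vol:.4f}")
print(f"weighted-average vol:     {weighted_avg_vol:.4f}")
print(f"lowest single-asset vol:  {min(sigma1, sigma2):.4f}")
```

With these inputs the portfolio volatility comes out below both the weighted average and the volatility of the less risky asset, which is the diversification effect described above.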

I will try to explain as I go along with the actual code. First, let's start by importing the libraries we need. "Quandl" is a financial data platform that also offers a Python library. If you haven't installed it before, you first need to install the package with "!pip -q install quandl", and before you can use it, you also need to get an API key on Quandl's website. Signing up and getting an API key is free but comes with some limits. As a logged-in free user, you can make at most 2,000 calls per 10 minutes (speed limit) and 50,000 calls per day (volume limit). https://www.quandl.com/

### Construct excess returns on all industries
#### Load Ken French's industry data
https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html

In [None]:
import pandas as pd
import numpy as np

import pandas_datareader.data as web  # module for reading datasets directly from the web
p = web.DataReader('12_Industry_Portfolios', 'famafrench', start=1900)[0]/100



In [None]:
p.head()

Date,NoDur,Durbl,Manuf,Enrgy,Chems,BusEq,Telcm,Utils,Shops,Hlth,Money,Other
1926-07,0.0145,0.1555,0.0367,-0.0118,0.0801,0.0316,0.0083,0.0704,0.0011,0.0177,-0.0002,0.0222
1926-08,0.0397,0.0368,0.0242,0.0347,0.0514,0.0197,0.0217,-0.0169,-0.0071,0.0425,0.0447,0.0434
1926-09,0.0114,0.048,-0.0007,-0.0339,0.053,-0.0034,0.0241,0.0204,0.0021,0.0069,-0.0161,0.0037
1926-10,-0.0124,-0.0823,-0.0316,-0.0078,-0.0455,-0.0538,-0.0011,-0.0263,-0.0229,-0.0057,-0.0551,-0.0273
1926-11,0.052,-0.0019,0.0382,0.0001,0.0511,0.0479,0.0163,0.0371,0.0643,0.0542,0.0234,0.021


#### Load the risk-free rate from Ken French's Factors data

In [None]:
p_factors = web.DataReader('F-F_Research_Data_Factors', 'famafrench', start=1900)[0]
rf = p_factors[['RF']]/100
rf.head()

Date,RF
1926-07,0.0022
1926-08,0.0025
1926-09,0.0023
1926-10,0.0032
1926-11,0.0031


#### Compute excess returns

In [None]:
# Subtract the risk-free rate from every industry return, aligning on the date index
rx = p.sub(rf['RF'], axis=0)
rx.head()

Date,NoDur,Durbl,Manuf,Enrgy,Chems,BusEq,Telcm,Utils,Shops,Hlth,Money,Other
1926-07,0.0123,0.1533,0.0345,-0.014,0.0779,0.0294,0.0061,0.0682,-0.0011,0.0155,-0.0024,0.02
1926-08,0.0372,0.0343,0.0217,0.0322,0.0489,0.0172,0.0192,-0.0194,-0.0096,0.04,0.0422,0.0409
1926-09,0.0091,0.0457,-0.003,-0.0362,0.0507,-0.0057,0.0218,0.0181,-0.0002,0.0046,-0.0184,0.0014
1926-10,-0.0156,-0.0855,-0.0348,-0.011,-0.0487,-0.057,-0.0043,-0.0295,-0.0261,-0.0089,-0.0583,-0.0305
1926-11,0.0489,-0.005,0.0351,-0.003,0.048,0.0448,0.0132,0.034,0.0612,0.0511,0.0203,0.0179


## Compute portfolio return statistics

Compute mean returns on all industries and their covariance matrix, as we did in class. This should require only calling two simple functions. Consult the *Efficient Frontier and Modern Portfolio Theory* notebook if unsure.

In [None]:
mean_returns = rx.mean()
cov_matrix = rx.cov()

Print mean excess returns:

In [None]:
mean_returns

NoDur    0.006942
Durbl    0.008983
Manuf    0.007642
Enrgy    0.007698
Chems    0.007516
BusEq    0.008734
Telcm    0.005664
Utils    0.006184
Shops    0.007550
Hlth     0.008079
Money    0.007423
Other    0.005740
dtype: float64

## Minimum variance portfolio
Compute the minimum variance portfolio using the formula in class. The easiest way to obtain this is to compute
$$w = V^{-1} \mathbf{1}_N,$$
where $\mathbf{1}_N$ is the column vector of ones and $V^{-1}$ is the inverse of the variance-covariance matrix of industry returns. Then, simply rescale all weights so that they sum to one (by dividing by their sum). This is equivalent to the formula I showed in class.


In [None]:
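# Unnormalized minimum variance weights: V^{-1} times a vector of ones
y = np.linalg.inv(cov_matrix) @ np.ones(len(cov_matrix))
# Rescale so the weights sum to one and label them by industry
y = pd.Series(y / y.sum(), index=mean_returns.index)
print(y)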

NoDur    0.659427
Durbl   -0.036664
Manuf   -0.216841
Enrgy    0.168347
Chems    0.100097
BusEq   -0.087873
Telcm    0.506926
Utils    0.169408
Shops    0.024945
Hlth     0.107469
Money   -0.411688
Other    0.016447
dtype: float64
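One way to convince yourself that this recipe really gives the minimum variance portfolio is to try it on a small example and compare it with many other fully invested portfolios. The sketch below does this with a hypothetical 3-asset covariance matrix (the numbers are made up, not the industry data) and uses `np.linalg.solve` rather than an explicit inverse, which is a stylistic choice rather than the approach used in the cell above.

```python
import numpy as np

# Hypothetical 3-asset covariance matrix (illustrative numbers only)
V = np.array([
    [0.0400, 0.0060, 0.0020],
    [0.0060, 0.0900, 0.0150],
    [0.0020, 0.0150, 0.1600],
])
ones = np.ones(V.shape[0])

# Solve V x = 1, then rescale so that the weights sum to one
raw = np.linalg.solve(V, ones)
w_mv = raw / raw.sum()
var_mv = w_mv @ V @ w_mv

# Draw random portfolios whose weights also sum to one (shorting allowed)
rng = np.random.default_rng(0)
candidates = rng.normal(size=(1000, 3))
candidates = candidates / candidates.sum(axis=1, keepdims=True)
# Variance w' V w for every candidate row
cand_vars = np.einsum("ij,jk,ik->i", candidates, V, candidates)

print("minimum variance weights:", w_mv.round(4))
print("variance of that portfolio:", var_mv)
print("lowest variance among random portfolios:", cand_vars.min())
```

None of the random portfolios should have a variance below `var_mv`, which also equals $1 / (\mathbf{1}_N^\prime V^{-1} \mathbf{1}_N)$.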


## Maximum Sharpe Ratio portfolio
Compute the maximum Sharpe ratio portfolio using the formula in class. The easiest way to obtain this is to compute
$$w = V^{-1} \left( E[r] - r_f \right).$$
Then, simply rescale all weights so that they sum to one (by dividing by their sum). This is equivalent to the formula I showed in class.
*Hint:* I have shown you how to compute this in the *Efficient Frontier and Modern Portfolio Theory* notebook.

In [None]:
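# Unnormalized maximum Sharpe ratio weights: V^{-1} times mean excess returns
w = np.linalg.inv(cov_matrix) @ mean_returns
# Rescale so the weights sum to one and label them by industry
w = pd.Series(w / w.sum(), index=mean_returns.index)
print(w)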

NoDur    0.750613
Durbl    0.118130
Manuf   -0.057490
Enrgy    0.288203
Chems    0.024734
BusEq    0.130730
Telcm    0.250072
Utils    0.055885
Shops    0.044834
Hlth     0.313418
Money   -0.278927
Other   -0.640201
dtype: float64
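The same kind of sanity check works for the maximum Sharpe ratio portfolio. The sketch below is only an illustration under stated assumptions: the mean excess returns and covariance matrix are hypothetical 3-asset numbers, and the helper `sharpe` is a made-up name. It compares the Sharpe ratio of the $V^{-1}(E[r] - r_f)$ portfolio with that of the minimum variance portfolio and of random fully invested portfolios.

```python
import numpy as np

# Hypothetical mean monthly excess returns and covariance matrix (illustrative only)
mu = np.array([0.006, 0.008, 0.010])
V = np.array([
    [0.0400, 0.0060, 0.0020],
    [0.0060, 0.0900, 0.0150],
    [0.0020, 0.0150, 0.1600],
])

def sharpe(w):
    """Sharpe ratio of weights w, given excess returns mu and covariance V."""
    return (w @ mu) / np.sqrt(w @ V @ w)

# Maximum Sharpe ratio (tangency) weights: V^{-1} mu, rescaled to sum to one
raw_tan = np.linalg.solve(V, mu)
w_tan = raw_tan / raw_tan.sum()

# Minimum variance weights for comparison: V^{-1} 1, rescaled to sum to one
raw_mv = np.linalg.solve(V, np.ones(len(mu)))
w_mv = raw_mv / raw_mv.sum()

# Random fully invested portfolios (shorting allowed)
rng = np.random.default_rng(0)
candidates = rng.normal(size=(1000, 3))
candidates = candidates / candidates.sum(axis=1, keepdims=True)
best_random = max(sharpe(c) for c in candidates)

print("Sharpe, max Sharpe portfolio:  ", sharpe(w_tan))
print("Sharpe, min variance portfolio:", sharpe(w_mv))
print("Sharpe, best random portfolio: ", best_random)
```

Note that the rescaling step assumes the unnormalized weights sum to a positive number, as they do here; if the sum were negative, dividing by it would flip the sign of every position.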


# Answers Part III: An Optimal Portfolio of Industries (35 points)

### Question 1

From the following output we can see that all twelve industries earned a positive mean monthly excess return, ranging from about 0.57% (Telcm) to 0.90% (Durbl). The ones that stand out are the following:

- Durbl    0.008983

- BusEq    0.008734

- Hlth     0.008079

In [None]:
mean_returns

NoDur    0.006942
Durbl    0.008983
Manuf    0.007642
Enrgy    0.007698
Chems    0.007516
BusEq    0.008734
Telcm    0.005664
Utils    0.006184
Shops    0.007550
Hlth     0.008079
Money    0.007423
Other    0.005740
dtype: float64

### Question 2


The following output shows the investment weights for the different industries.
The highest weight is given to:
- NoDur    0.659427
And the lowest weight is given to:
- Money   -0.411688

Our minimum variance portfolio shorts BusEq with a weight of -0.087873, even though it is one of the industries with the highest average excess returns (0.008734). This is because the minimum variance portfolio ignores expected returns altogether: its weights depend only on the variances and covariances of the industries. Given the volatility of BusEq and its co-movement with the other industries, shorting it lowers the overall portfolio variance.

In [None]:
y = np.linalg.inv(cov_matrix)@(np.ones((12,), dtype=int))
y = pd.Series(y / y.sum(), index=mean_returns.index)
print(y)

NoDur    0.659427
Durbl   -0.036664
Manuf   -0.216841
Enrgy    0.168347
Chems    0.100097
BusEq   -0.087873
Telcm    0.506926
Utils    0.169408
Shops    0.024945
Hlth     0.107469
Money   -0.411688
Other    0.016447
dtype: float64


### Question 3

In the following output we can see that:
Highest weight:
- NoDur    0.750613
Lowest weight:
- Other   -0.640201

The weights of the minimum variance portfolio and the maximum Sharpe ratio portfolio differ for most industries; BusEq, for example, goes from a short position of -0.087873 to a long position of 0.130730.

The maximum Sharpe ratio portfolio maximizes expected excess return per unit of risk, so it uses expected returns as well as the covariance matrix. It may therefore short some industries so that it can increase the investment in others with a better risk-return trade-off.

We can see that Other and Money have been heavily shorted so that Hlth and NoDur can receive larger weights.



In [None]:
w = np.linalg.inv(cov_matrix) @ (mean_returns)
w = pd.Series(w / w.sum(), index=mean_returns.index)
print(w)

NoDur    0.750613
Durbl    0.118130
Manuf   -0.057490
Enrgy    0.288203
Chems    0.024734
BusEq    0.130730
Telcm    0.250072
Utils    0.055885
Shops    0.044834
Hlth     0.313418
Money   -0.278927
Other   -0.640201
dtype: float64
