# M1 Portfolio Theory I

## JTCerda

## 1 True / False

1. Depends. If all the assets in the optimization are subject to the same inflation, there is no difference: returns might differ, but the composition of the portfolio should not. However, if we are optimizing over assets denominated in different currencies, the answer might differ.

2. False. For many investors, the premium is enough compensation for taking that risk.

3. Depends. It depends on how we define better performance. Usually we will get higher returns, but also higher volatility, which an investor might not want. It is not clear that the Sharpe ratio will always be higher, so performance can be better, but that is not always the case.

4. False. In the data we studied in one of our HWs, the long-only portfolio actually performed better over the period and assets we compared. However, to compare performance we may also want to look at other metrics, such as the correlation with the market. In that sense, the long-only portfolio has a very high market correlation, unlike the long-short portfolio, and that may take away all of its appeal. One of the advantages of momentum is that it has a lower market correlation.

5. False. DFA bases its strategy on broader portfolios and asset classes rather than on individual stocks. It improves its performance through smart rebalancing (less frequent and cheaper) and relies on other strategies, such as stock lending, to keep trading costs low. It also takes advantage of tax regulation when offering its products.

## 2 Short Answer

1. Harvard Management Co. makes sure its weights are realistic by doing a two-stage optimization. First, it optimizes over very broad asset classes; then, as a second stage, it optimizes within each class to select the assets. One drawback is that, by doing this, it might pick two highly correlated assets from different classes.

2. Fama and French construct the factors by decomposing each one along the other factors, trying the different combinations. For example, when examining momentum (not one of their 5 factors), they test it by constructing momentum separately for small and for big firms, thus controlling for the size factor right away. We did this in one of our homeworks.

3. This is related to the pepperoni pizza analogy. DFA is confident that it is the best at providing the beta. It does so in a smart way, by lowering trading costs and using tax-efficient strategies.

4. As in the CAPM, with the time-series regression we assess the statistical significance of the parameters for each individual asset. With the cross-sectional regression, on the other hand, we test the model across the assets, so it is a broader test of the model; there we study the $R^2$.

5. With a top-down replication strategy like the one proposed by ProShares to get hedge-fund exposure, it is possible to gain diversification, liquidity and transparency. However, it may not be easy to market to investors, as there is a high chance that it trails its benchmark due to restrictions that hedge funds don't face, such as strict withdrawal rules and reporting requirements.
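The contrast in answer 4 between the time-series and the cross-sectional test can be sketched on synthetic data. This is only an illustration under assumed values: the single factor, the true betas and the noise levels are invented, and none of the names below come from the course data.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N = 600, 8                      # months and test assets (synthetic, assumed)

# Synthetic single-factor world: r_i = beta_i * f + noise, so true alpha is 0
factor = rng.normal(0.006, 0.045, size=T)
true_betas = np.linspace(0.5, 1.5, N)
returns = factor[:, None] * true_betas + rng.normal(0.0, 0.02, size=(T, N))

# Time series: one regression per asset gives alpha_i and beta_i
X = np.column_stack([np.ones(T), factor])
coefs, *_ = np.linalg.lstsq(X, returns, rcond=None)   # shape (2, N)
alphas, betas = coefs

# Cross section: regress mean returns on the estimated betas, across assets
mean_returns = returns.mean(axis=0)
Z = np.column_stack([np.ones(N), betas])
(eta, lam), *_ = np.linalg.lstsq(Z, mean_returns, rcond=None)

# Cross-sectional R-squared: how well the betas line up with mean returns
fitted = eta + lam * betas
r2_cs = 1.0 - np.sum((mean_returns - fitted) ** 2) / np.sum(
    (mean_returns - mean_returns.mean()) ** 2
)
print("time-series alphas:", np.round(alphas, 4))
print("cross-sectional premium:", lam, " R-squared:", r2_cs)
```

The time-series step yields one alpha and one beta per asset, while the cross-sectional step yields a single fit (premium and $R^2$) for the whole set of assets.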

## 3 Allocation 

In [1]:
#Import some libraries
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
import statsmodels.api as sm
import matplotlib.pyplot as plt
import seaborn as sns
from dataclasses import dataclass
import warnings
sns.set()

pd.set_option('display.float_format', lambda x: '%.4f' % x)

In [2]:
#Paths
path = 'data/momentum_data.xlsx'
ff_data_path = 'data/ff_data.xlsx'

#Momentum
df_factor_mom = pd.read_excel(path, sheet_name='Momentum Factor').set_index('date')
ff_factors_df_temp = pd.read_excel(ff_data_path, sheet_name='FACTORS').set_index('date') #MKT, SMB, HML, RF
ff_factors_df_temp.columns = ff_factors_df_temp.columns.str.strip() # remove stray whitespace from the column names
ff_factors_names = ff_factors_df_temp.columns # keep the factor names for reference
df_factor = df_factor_mom.join(ff_factors_df_temp).rename(columns={"MOM": "rx_mom_FF", "MKT": "rx_mkt", "SMB": "rx_size", "HML": "rx_value", "RF": "r_f"})

#FF data (ff_data_path is already defined above)
temp = pd.read_excel(ff_data_path, sheet_name='PORTFOLIOS').set_index('date')
temp.columns = temp.columns.str.strip()
ind_portfolios_names = temp.columns
df_industries = df_factor.join(temp)

temp = pd.read_excel(path, sheet_name='Momentum Deciles').set_index('date')
temp.columns = temp.columns.str.strip()
momentum_portfolio_names = temp.columns
df_momentum = df_factor.join(temp)

temp = pd.read_excel(path, sheet_name='Momentum by Size').set_index('date')
temp.columns = temp.columns.str.strip()
mom_size_portfolio_names = temp.columns
df_mom_size = df_factor.join(temp)

del temp

### 3.1

In [3]:
df_4_factors = df_factor[['rx_mom_FF','rx_mkt', 'rx_size', 'rx_value']]

In [4]:
def table_row(df, portfolio, annualize_factor=12):
    """Return the annualized [mean, vol, Sharpe ratio] of one column of monthly excess returns."""
    mean = df[portfolio].mean() * annualize_factor
    vol = df[portfolio].std() * np.sqrt(annualize_factor)
    sharpe_ratio = mean / vol
    return [mean, vol, sharpe_ratio]

In [5]:
table_3_1 = pd.DataFrame(index=['rx_mom_FF', 'rx_mkt', 'rx_size', 'rx_value'],
    columns=['mean', 'vol', 'sharpe_ratio'])
table_3_1.loc['rx_mom_FF', :] = table_row(df_4_factors, portfolio='rx_mom_FF')
table_3_1.loc['rx_mkt', :] = table_row(df_4_factors, portfolio='rx_mkt')
table_3_1.loc['rx_size', :] = table_row(df_4_factors, portfolio='rx_size')
table_3_1.loc['rx_value', :] = table_row(df_4_factors, portfolio='rx_value')

In [6]:
print('Moments stats')
table_3_1

Moments stats


| | mean | vol | sharpe_ratio |
|---|---|---|---|
| rx_mom_FF | 0.0796 | 0.1626 | 0.4893 |
| rx_mkt | 0.0804 | 0.1857 | 0.4332 |
| rx_size | 0.0233 | 0.1102 | 0.2118 |
| rx_value | 0.039 | 0.1217 | 0.3204 |


In [7]:
print('Moments correlation matrix')
df_4_factors.corr()

Moments correlation matrix


| | rx_mom_FF | rx_mkt | rx_size | rx_value |
|---|---|---|---|---|
| rx_mom_FF | 1.0 | -0.3405 | -0.1548 | -0.4114 |
| rx_mkt | -0.3405 | 1.0 | 0.3235 | 0.2356 |
| rx_size | -0.1548 | 0.3235 | 1.0 | 0.1264 |
| rx_value | -0.4114 | 0.2356 | 0.1264 | 1.0 |


### 3.2

In [8]:
df_4_factors_in = df_4_factors.loc[:'2010']   # in-sample: through Dec 2010
#df_4_factors_in.tail()
df_4_factors_out = df_4_factors.loc['2011':]  # out-of-sample: Jan 2011 onward
#df_4_factors_out.tail()

In [9]:
#Compute the stats and annualize
mu_in = df_4_factors_in.mean()
vol_in = df_4_factors_in.std()
sharpe_in = mu_in / vol_in
summary_in = pd.DataFrame({'Mean':mu_in * 12, 'Vol':vol_in * np.sqrt(12), 'Sharpe': sharpe_in * np.sqrt(12)})
print('In sample data summary')
summary_in

In sample data summary


| | Mean | Vol | Sharpe |
|---|---|---|---|
| rx_mom_FF | 0.0833 | 0.1672 | 0.4984 |
| rx_mkt | 0.0746 | 0.1903 | 0.3922 |
| rx_size | 0.0281 | 0.1133 | 0.248 |
| rx_value | 0.0503 | 0.1243 | 0.405 |


In [10]:
Sigma_in = df_4_factors_in.cov()
Sigma_inv_in = np.linalg.inv(Sigma_in)

# from the formula for the tangency weights
N_in = mu_in.shape[0]
weights_in = Sigma_inv_in @ mu_in / (np.ones(N_in) @ Sigma_inv_in @ mu_in)      

#Make a series
wts_tan_in = pd.Series(weights_in, index=summary_in.index)

print('Weights of the in sample tangency portfolio')
wts_tan_in

Weights of the in sample tangency portfolio


rx_mom_FF   0.3827
rx_mkt      0.1788
rx_size     0.0874
rx_value    0.3511
dtype: float64
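The weights above follow the usual tangency formula, $w^{tan} = \Sigma^{-1}\mu / (\mathbf{1}'\Sigma^{-1}\mu)$. As a minimal sketch, the same computation can be written with a linear solve instead of an explicit inverse. The helper name `tangency_weights` and the two-asset toy inputs are assumptions for illustration only, not part of the notebook.

```python
import numpy as np

def tangency_weights(mu, sigma):
    """Tangency weights w = Sigma^{-1} mu / (1' Sigma^{-1} mu), via a linear solve."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    raw = np.linalg.solve(sigma, mu)   # Sigma^{-1} mu without forming the inverse
    return raw / raw.sum()             # rescale so the weights sum to one

# Toy check (assumed numbers): with a diagonal covariance the weights are
# proportional to mean / variance
mu_demo = np.array([0.02, 0.01])
sigma_demo = np.diag([0.04, 0.01])
w_demo = tangency_weights(mu_demo, sigma_demo)
print(w_demo)
```

In the notebook's own names, `tangency_weights(mu_in, Sigma_in)` should give the same vector as `weights_in`, up to floating-point error.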

### 3.3

In [11]:
#Compute the tangency in sample series and stats
df_4_factors_tan_in = df_4_factors_in @ wts_tan_in
mu_tan_in = df_4_factors_tan_in.mean()
vol_tan_in = df_4_factors_tan_in.std()
sharpe_tan_in = mu_tan_in / vol_tan_in
print('Tangency in-sample mean: ', mu_tan_in * 12)
print('Tangency in-sample volatility: ', vol_tan_in * np.sqrt(12))
print('Tangency in-sample sharpe ratio: ', sharpe_tan_in * np.sqrt(12))

Tangency in-sample mean:  0.0653631836744814
Tangency in-sample volatility:  0.06581818532903927
Tangency in-sample sharpe ratio:  0.9930869918050275


### 3.4

In [12]:
#Compute the tangency out sample series and stats
df_4_factors_tan_out = df_4_factors_out @ wts_tan_in
mu_tan_out = df_4_factors_tan_out.mean()
vol_tan_out = df_4_factors_tan_out.std()
sharpe_tan_out = mu_tan_out / vol_tan_out
print('Tangency out-sample mean: ', mu_tan_out * 12)
print('Tangency out-sample volatility: ', vol_tan_out * np.sqrt(12))
print('Tangency out-sample sharpe ratio: ', sharpe_tan_out * np.sqrt(12))

Tangency out-sample mean:  0.01890503873877531
Tangency out-sample volatility:  0.04326549539867129
Tangency out-sample sharpe ratio:  0.4369541724779578


### 3.5

Here we optimize on one dataset, so the weights are fitted to that sample. If we use those weights out of sample, we can expect a poorer fit because the data in that period have different characteristics. For example, the in-sample period includes the 2008-09 financial crisis and the out-of-sample period does not.
Also, the MV optimization formula penalizes securities for marginal risk (covariance), not total risk (volatility), whereas we would like to optimize overall risk.
In this case it does matter: the annualized mean return is much lower out of sample (0.0189 versus 0.0654), and as a consequence so is the Sharpe ratio (0.44 versus 0.99). We obtain poorer performance out of sample.
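The gap between the results in 3.3 and 3.4 can be illustrated on synthetic data. This is a sketch under assumed values: the i.i.d. normal returns, the means, the volatility and the 120/120 split are all invented. It shows only that weights fitted in sample cannot do worse in sample than any single asset, while nothing guarantees the same out of sample.

```python
import numpy as np

def annualized_sharpe(x, periods=12):
    """Annualized Sharpe ratio of a series of monthly excess returns."""
    return np.sqrt(periods) * x.mean() / x.std(ddof=1)

rng = np.random.default_rng(1)
true_mu = np.array([0.007, 0.006, 0.002, 0.004])   # assumed monthly means
rets = rng.normal(true_mu, 0.04, size=(240, 4))    # synthetic monthly returns
r_in, r_out = rets[:120], rets[120:]               # in-sample / out-of-sample

# Fit on the in-sample window only
mu_hat = r_in.mean(axis=0)
sigma_hat = np.cov(r_in, rowvar=False)
w = np.linalg.solve(sigma_hat, mu_hat)   # proportional to the tangency weights

# The Sharpe ratio is unchanged by a positive rescaling of the weights,
# so normalizing w is not needed for this comparison
sharpe_in = annualized_sharpe(r_in @ w)
sharpe_out = annualized_sharpe(r_out @ w)
best_single_in = max(annualized_sharpe(r_in[:, j]) for j in range(4))

print("in-sample Sharpe: ", sharpe_in)
print("out-of-sample Sharpe:", sharpe_out)
print("best single asset, in sample:", best_single_in)
```

Because the weights maximize the Sharpe ratio for the in-sample moments, part of the in-sample figure reflects fitting to noise in those estimates, which is one reason the out-of-sample figure is typically lower.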

## 4 Hedging & Replication 

In [13]:
#df_4_factors.head()
#df_4_factors.tail()
# smb = df_4_factors['rx_size']
# mkt = df_4_factors['rx_mkt']

### 4.1.a

In [14]:
rhs_4_1 = sm.add_constant(df_4_factors[['rx_mkt']])
lhs_4_1 = df_4_factors['rx_size']
res_4_1 = sm.OLS(lhs_4_1, rhs_4_1, missing='drop').fit()
betas_4_1 = res_4_1.params
r_sq_4_1 = res_4_1.rsquared
res_4_1.summary()



| | | | |
|---|---|---|---|
| Dep. Variable: | rx_size | R-squared: | 0.105 |
| Model: | OLS | Adj. R-squared: | 0.104 |
| Method: | Least Squares | F-statistic: | 131.1 |
| Date: | Wed, 18 Nov 2020 | Prob (F-statistic): | 8.64e-29 |
| Time: | 20:18:37 | Log-Likelihood: | 2342.7 |
| No. Observations: | 1124 | AIC: | -4681.0 |
| Df Residuals: | 1122 | BIC: | -4671.0 |
| Df Model: | 1 | | |
| Covariance Type: | nonrobust | | |

| | coef | std err | t | P>\|t\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| const | 0.0007 | 0.001 | 0.727 | 0.468 | -0.001 | 0.002 |
| rx_mkt | 0.1920 | 0.017 | 11.451 | 0.000 | 0.159 | 0.225 |

| | | | |
|---|---|---|---|
| Omnibus: | 492.085 | Durbin-Watson: | 2.037 |
| Prob(Omnibus): | 0.0 | Jarque-Bera (JB): | 12031.768 |
| Skew: | 1.457 | Prob(JB): | 0.0 |
| Kurtosis: | 18.761 | Cond. No. | 18.7 |


In [15]:
print('Regression parameters')
print('Alpha: ', betas_4_1['const'])
print('Beta: ', betas_4_1['rx_mkt'])
print('Regression R-squared: ', r_sq_4_1)

Regression parameters
Alpha:  0.0006582302962350529
Beta:  0.19204930753013147
Regression R-squared:  0.10463264171320041


### 4.1.b

In [16]:
smb_star = df_4_factors['rx_size'] - df_4_factors['rx_mkt']*betas_4_1['rx_mkt']
mu_smb_star = smb_star.mean()
vol_smb_star = smb_star.std()
sharpe_smb_star = mu_smb_star / vol_smb_star
print('Mean SMB* (monthly): ', mu_smb_star)
print('Volatility SMB* (monthly): ', vol_smb_star)
print('Sharpe ratio SMB* (monthly): ', sharpe_smb_star)

Mean SMB* (monthly):  0.0006582302962350531
Volatility SMB* (monthly):  0.030115250894454954
Sharpe ratio SMB* (monthly):  0.021857041754091824


In [17]:
smb_star_df = df_4_factors.copy() # work on a copy so df_4_factors is left unchanged
smb_star_df['smb*'] = smb_star
#smb_star_df

In [18]:
table_4_1 = pd.DataFrame(index=['rx_mom_FF', 'rx_mkt', 'rx_size', 'rx_value', 'smb*'],
    columns=['mean', 'vol', 'sharpe_ratio'])
table_4_1.loc['rx_mom_FF', :] = table_row(smb_star_df, portfolio='rx_mom_FF')
table_4_1.loc['rx_mkt', :] = table_row(smb_star_df, portfolio='rx_mkt')
table_4_1.loc['rx_size', :] = table_row(smb_star_df, portfolio='rx_size')
table_4_1.loc['rx_value', :] = table_row(smb_star_df, portfolio='rx_value')
table_4_1.loc['smb*', :] = table_row(smb_star_df, portfolio='smb*')
table_4_1

| | mean | vol | sharpe_ratio |
|---|---|---|---|
| rx_mom_FF | 0.0796 | 0.1626 | 0.4893 |
| rx_mkt | 0.0804 | 0.1857 | 0.4332 |
| rx_size | 0.0233 | 0.1102 | 0.2118 |
| rx_value | 0.039 | 0.1217 | 0.3204 |
| smb* | 0.0079 | 0.1043 | 0.0757 |


### 4.1.c

In [19]:
print('Moments correlation matrix')
smb_star_df.corr()

Moments correlation matrix


| | rx_mom_FF | rx_mkt | rx_size | rx_value | smb* |
|---|---|---|---|---|---|
| rx_mom_FF | 1.0 | -0.3405 | -0.1548 | -0.4114 | -0.0472 |
| rx_mkt | -0.3405 | 1.0 | 0.3235 | 0.2356 | -0.0 |
| rx_size | -0.1548 | 0.3235 | 1.0 | 0.1264 | 0.9462 |
| rx_value | -0.4114 | 0.2356 | 0.1264 | 1.0 | 0.053 |
| smb* | -0.0472 | -0.0 | 0.9462 | 0.053 | 1.0 |


Here we can see that the hedged position is long SMB and short $\beta^{smb,mkt} \tilde{r}^m$. It optimally hedges out the market: by construction, the hedged return has zero correlation with the market, as seen in the table above.
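The zero correlation is a mechanical property of the regression beta, which a small synthetic example makes visible. The series below are simulated with assumed parameters (they are not the Fama-French data); only the construction of the hedge mirrors 4.1.b.

```python
import numpy as np

rng = np.random.default_rng(2)
mkt = rng.normal(0.006, 0.05, size=500)                      # synthetic market
smb = 0.0007 + 0.2 * mkt + rng.normal(0.0, 0.03, size=500)   # synthetic SMB

# OLS slope of smb on mkt (with intercept) is cov(smb, mkt) / var(mkt)
beta = np.cov(smb, mkt, ddof=1)[0, 1] / np.var(mkt, ddof=1)

# Hedged position: long smb, short beta units of the market
hedged = smb - beta * mkt

corr_before = np.corrcoef(smb, mkt)[0, 1]
corr_after = np.corrcoef(hedged, mkt)[0, 1]
print("correlation with market before hedge:", corr_before)
print("correlation with market after hedge: ", corr_after)
```

The correlation after the hedge is zero up to floating-point error, which matches the `-0.0` entry for `smb*` against `rx_mkt` in the table.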

### 4.2.a

In [20]:
rhs_4_2 = sm.add_constant(df_4_factors[['rx_mkt','rx_size', 'rx_value']])
lhs_4_2 = df_4_factors['rx_mom_FF']
res_4_2 = sm.OLS(lhs_4_2, rhs_4_2, missing='drop').fit()
betas_4_2 = res_4_2.params
r_sq_4_2 = res_4_2.rsquared
res_4_2.summary()

| | | | |
|---|---|---|---|
| Dep. Variable: | rx_mom_FF | R-squared: | 0.233 |
| Model: | OLS | Adj. R-squared: | 0.231 |
| Method: | Least Squares | F-statistic: | 113.4 |
| Date: | Wed, 18 Nov 2020 | Prob (F-statistic): | 4.16e-64 |
| Time: | 20:18:37 | Log-Likelihood: | 1992.6 |
| No. Observations: | 1124 | AIC: | -3977.0 |
| Df Residuals: | 1120 | BIC: | -3957.0 |
| Df Model: | 3 | | |
| Covariance Type: | nonrobust | | |

| | coef | std err | t | P>\|t\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| const | 0.0097 | 0.001 | 7.814 | 0.000 | 0.007 | 0.012 |
| rx_mkt | -0.2176 | 0.025 | -8.789 | 0.000 | -0.266 | -0.169 |
| rx_size | -0.0448 | 0.041 | -1.095 | 0.274 | -0.125 | 0.035 |
| rx_value | -0.4665 | 0.036 | -12.944 | 0.000 | -0.537 | -0.396 |

| | | | |
|---|---|---|---|
| Omnibus: | 394.835 | Durbin-Watson: | 1.955 |
| Prob(Omnibus): | 0.0 | Jarque-Bera (JB): | 3552.244 |
| Skew: | -1.357 | Prob(JB): | 0.0 |
| Kurtosis: | 11.276 | Cond. No. | 34.1 |


In [21]:
print('Regression parameters')
print('Alpha: ', betas_4_2['const'])
print('Beta MKT: ', betas_4_2['rx_mkt'])
print('Beta SMB: ', betas_4_2['rx_size'])
print('Beta HML: ', betas_4_2['rx_value'])
print('Regression R-squared: ', r_sq_4_2)

Regression parameters
Alpha:  0.009692763635650699
Beta MKT:  -0.21761919990881878
Beta SMB:  -0.04475815912017368
Beta HML:  -0.4664898769069509
Regression R-squared:  0.23293080073820593


### 4.2.b

In [22]:
mom_2star = res_4_2.predict() # in-sample fitted values (these include the intercept)

mu_mom_2star = mom_2star.mean()
vol_mom_2star = mom_2star.std()
sharpe_mom_2star = mu_mom_2star / vol_mom_2star
print('Mean MOM** (monthly): ', mu_mom_2star)
print('Volatility MOM** (monthly): ', vol_mom_2star)
print('Sharpe ratio MOM** (monthly): ', sharpe_mom_2star)

Mean MOM** (monthly):  0.006630871886121001
Volatility MOM** (monthly):  0.022648679417412176
Sharpe ratio MOM** (monthly):  0.2927707953260721


In [23]:
smb_star_df['mom**'] = mom_2star
print('Moments correlation matrix')
smb_star_df.corr()

Moments correlation matrix


| | rx_mom_FF | rx_mkt | rx_size | rx_value | smb* | mom** |
|---|---|---|---|---|---|---|
| rx_mom_FF | 1.0 | -0.3405 | -0.1548 | -0.4114 | -0.0472 | 0.4826 |
| rx_mkt | -0.3405 | 1.0 | 0.3235 | 0.2356 | -0.0 | -0.7056 |
| rx_size | -0.1548 | 0.3235 | 1.0 | 0.1264 | 0.9462 | -0.3208 |
| rx_value | -0.4114 | 0.2356 | 0.1264 | 1.0 | 0.053 | -0.8525 |
| smb* | -0.0472 | -0.0 | 0.9462 | 0.053 | 1.0 | -0.0979 |
| mom** | 0.4826 | -0.7056 | -0.3208 | -0.8525 | -0.0979 | 1.0 |


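The correlation between MOM and MOM**, shown in the table above, is 0.4826, less than 1. It equals the square root of the regression $R^2$ (0.2329), as expected for OLS fitted values.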
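The link between this correlation and the regression fit can be checked directly: for an OLS regression with an intercept, the correlation between the target and its fitted values is the square root of $R^2$. Below is a minimal sketch on synthetic data; the regressors, coefficients and noise level are assumed for illustration and are not the factor data.

```python
import numpy as np

rng = np.random.default_rng(3)
T = 400
X = rng.normal(size=(T, 3))                       # three synthetic regressors
y = 0.01 + X @ np.array([-0.2, -0.05, -0.45]) + rng.normal(0.0, 1.0, size=T)

# OLS with an intercept
A = np.column_stack([np.ones(T), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
fitted = A @ coef                                 # the replication of y
resid = y - fitted

r_squared = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
corr = np.corrcoef(y, fitted)[0, 1]

print("R-squared:           ", r_squared)
print("corr(y, fitted):     ", corr)
print("sqrt(R-squared):     ", np.sqrt(r_squared))
```

A replication therefore only reaches a correlation of 1 with its target when the regression $R^2$ is 1.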

### 4.3

In the case of SMB*, it would not be possible to hedge out the market, as the hedged position actually holds $\tilde{r}^{smb}_t - \beta^{smb,mkt} \tilde{r}^m_t = \alpha + \epsilon_t$ from the regression. In the case of the MOM** replication, we would have an extra restriction (no investing in the risk-free asset); however, the replication should be very similar.