### Section 2: Analyzing GMO
#### 2.1 Examine GMO's performance. Calculate the mean, volatility, and Sharpe ratio for GMWAX. Do this for inception through 2011, 2012 to present, and the full sample.

In [35]:
import pandas as pd
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.api as sm
returns_data = pd.read_excel('../data/gmo_analysis_data.xlsx', sheet_name='returns (total)')
returns_data.rename(columns={'Unnamed: 0':'Date'},inplace=True)
returns_data = returns_data.set_index('Date')

risk_free_rate = pd.read_excel('../data/gmo_analysis_data.xlsx', sheet_name='risk-free rate')
risk_free_rate.rename(columns={'Unnamed: 0':'Date'},inplace=True)
risk_free_rate = risk_free_rate.set_index('Date')
returns_data['RF'] = risk_free_rate
returns_data = returns_data.dropna()
excess_returns = pd.DataFrame(index=returns_data.index)
excess_returns['SPY'] = returns_data['SPY'] - returns_data['RF']
excess_returns['GMWAX'] = returns_data['GMWAX'] - returns_data['RF']

def portfolio_stats_2(data):
    # Calculate the mean and annualize
    mean = data.mean() * 12

    # Volatility = standard deviation
    # Annualize the result with sqrt(12)
    vol = data.std() * np.sqrt(12)

    # Sharpe Ratio is mean / vol
    sharpe_ratio = mean / vol

    # Format for easy reading
    return round(pd.DataFrame(data = [mean, vol, sharpe_ratio], 
        index = ['Mean', 'Volatility', 'Sharpe']), 4)
    
print("Inception - 2011")
display(portfolio_stats_2(excess_returns.loc[:'2011']))

print("2012-Present")
display(portfolio_stats_2(excess_returns.loc['2012':]))

print("Inception - Present")
display(portfolio_stats_2(excess_returns))

Inception - 2011


Unnamed: 0,SPY,GMWAX
Mean,0.04,0.0158
Volatility,0.165,0.125
Sharpe,0.2424,0.1266


2012-Present


Unnamed: 0,SPY,GMWAX
Mean,0.1265,0.0366
Volatility,0.1431,0.092
Sharpe,0.8843,0.3982


Inception - Present


Unnamed: 0,SPY,GMWAX
Mean,0.076,0.0245
Volatility,0.1565,0.1123
Sharpe,0.486,0.2181


##### Has the mean, vol, and Sharpe changed much since the case?
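Yes. GMWAX performs noticeably better after 2012: its annualized mean excess return rises from 1.58% to 3.66% (about 2 percentage points), volatility falls from 12.5% to 9.2%, and the Sharpe ratio roughly triples, from 0.13 to 0.40. SPY improves even more over the same period, with its Sharpe ratio rising from 0.24 to 0.88.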
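
The annualization used in `portfolio_stats_2` can be checked by hand on a toy series. The sketch below uses only the standard library and made-up monthly excess returns (`monthly_excess` is hypothetical, not the GMWAX data). It scales the mean by 12 and the sample standard deviation by the square root of 12, so the annualized Sharpe ratio is the monthly one times the square root of 12.

```python
import math
import statistics

# Hypothetical monthly excess returns, for illustration only.
monthly_excess = [0.012, -0.008, 0.021, 0.004, -0.015, 0.009]

PERIODS_PER_YEAR = 12

mean_monthly = statistics.mean(monthly_excess)
# Sample standard deviation (ddof=1), the same convention as pandas' default .std().
vol_monthly = statistics.stdev(monthly_excess)

# Means scale linearly with the horizon, volatilities with its square root.
mean_annual = mean_monthly * PERIODS_PER_YEAR
vol_annual = vol_monthly * math.sqrt(PERIODS_PER_YEAR)
sharpe_annual = mean_annual / vol_annual

print(f"mean={mean_annual:.4f} vol={vol_annual:.4f} sharpe={sharpe_annual:.4f}")
```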

#### 2.2 GMO believes a risk premium is compensation for a security's tendency to lose money at "bad times". For these three samples, analyze extreme scenarios by looking at min return, 5th percentile VaR, and Maximum drawdown.

In [60]:
def tail_risk(df, total_returns):
    tr_df = pd.DataFrame(data = None)
    tr_df['Min return'] = df.min()
    tr_df['VaR-5th'] = df.quantile(.05)
    cum_ret = (1 + total_returns).cumprod()
    rolling_max = cum_ret.cummax()
    drawdown = (cum_ret - rolling_max) / rolling_max
    tr_df['Max Drawdown'] = drawdown.min()
    
    return tr_df

print("Inception - 2011")
display(tail_risk(excess_returns.loc[:'2011'], returns_data.loc[:'2011']))

print("2012 - Present")
display(tail_risk(excess_returns.loc['2012':], returns_data.loc['2012':]))

print("Inception - Present")
display(tail_risk(excess_returns, returns_data))

Inception - 2011


Unnamed: 0,Min return,VaR-5th,Max Drawdown
SPY,-0.16557,-0.080224,-0.50798
GMWAX,-0.149179,-0.059806,-0.355219


2012 - Present


Unnamed: 0,Min return,VaR-5th,Max Drawdown
SPY,-0.124734,-0.068658,-0.239281
GMWAX,-0.11865,-0.039686,-0.216773


Inception - Present


Unnamed: 0,Min return,VaR-5th,Max Drawdown
SPY,-0.16557,-0.080006,-0.50798
GMWAX,-0.149179,-0.048293,-0.355219


#### 2.2 a) Does GMWAX have high or low tail-risk as seen by these stats?
GMWAX has lower tail risk than SPY across all three measurements and all three samples.
#### 2.2 b) Does that vary much across the two subsamples?
Yes, especially for max drawdown: -35.5% in the earlier subsample versus -21.7% since 2012. The minimum return (-14.9% vs. -11.9%) and the 5% VaR (-6.0% vs. -4.0%) were also worse in the first subsample.
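
The drawdown logic in `tail_risk` can be traced on a small example. This is a minimal sketch in plain Python with made-up returns (`returns` is hypothetical). Like `tail_risk`, it measures each drop against the running peak of cumulative wealth, so a loss in the very first period is not counted against the initial investment.

```python
from itertools import accumulate
import operator

# Hypothetical monthly total returns, for illustration only.
returns = [0.10, -0.20, 0.05, -0.10, 0.30]

# Cumulative wealth of 1 unit invested: running product of (1 + r).
wealth = list(accumulate((1 + r for r in returns), operator.mul))

# Highest wealth level reached so far at each date.
running_peak = list(accumulate(wealth, max))

# Drawdown: fractional drop from the running peak (0 at a new peak).
drawdowns = [w / p - 1 for w, p in zip(wealth, running_peak)]

max_drawdown = min(drawdowns)
print(round(max_drawdown, 4))  # -0.244
```

Here wealth peaks at 1.10 after the first month and bottoms out at 0.8316 in the fourth, a 24.4% decline from the peak.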

#### 2.3 For all three samples, regress excess returns of GMWAX on SPY. 
#### 2.3 a) Report the estimated alpha, beta, and R-squared


In [37]:
def reg_params(df, y_col, X_col, intercept = True, annual_fac=12):
    y = df[y_col]
    if intercept == True:
        X = sm.add_constant(df[X_col])
    else:
        X = df[X_col]
    
    model = sm.OLS(y, X, missing = 'drop').fit()
    reg_df = model.params.to_frame('Regression Parameters')
    reg_df.loc['R-squared'] = model.rsquared
    
    if intercept == True:
        reg_df.loc['const'] *= annual_fac
    
    return reg_df

print("Inception - 2011")
display(reg_params(excess_returns.loc[:'2011'], 'GMWAX', 'SPY'))

print("2012 - Present")
display(reg_params(excess_returns.loc['2012':], 'GMWAX', 'SPY'))

print("Inception - Present")
display(reg_params(excess_returns, 'GMWAX', 'SPY'))

Inception - 2011


Unnamed: 0,Regression Parameters
const,-0.005751
SPY,0.539616
R-squared,0.507129


2012 - Present


Unnamed: 0,Regression Parameters
const,-0.034492
SPY,0.562232
R-squared,0.764506


Inception - Present


Unnamed: 0,Regression Parameters
const,-0.016989
SPY,0.5456
R-squared,0.577744


#### 2.3 b) Is GMWAX a low-beta strategy? Has that changed since the case?
Yes. GMWAX's beta on SPY is roughly 0.54 to 0.56 in every sample, well below 1, so it carries only moderate market exposure. The beta is nearly identical across the two subsamples (0.54 through 2011, 0.56 from 2012 onward), so that exposure has not changed since the case.

#### 2.3 c) Does GMWAX provide alpha? Has that changed across the subsamples?
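No. The annualized alpha (the `const` row) is negative in every sample: about -0.6% through 2011, -3.4% from 2012 onward, and -1.7% over the full sample. Alpha is close to zero in the first subsample and becomes more negative in the second.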
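
For a single regressor, the alpha, beta, and R-squared reported by `reg_params` have a closed form, which makes the annualization step easy to see. The sketch below is a standard-library illustration on made-up data (`market`, `noise`, and `fund` are hypothetical); it is not the statsmodels code path used above, only the same textbook formulas.

```python
import statistics

# Hypothetical monthly excess returns, for illustration only.
market = [-0.03, -0.01, 0.01, 0.03]
# Fund built as 0.002 + 0.5 * market, plus noise that is orthogonal to the market.
noise = [0.001, -0.001, -0.001, 0.001]
fund = [0.002 + 0.5 * m + e for m, e in zip(market, noise)]

x_bar = statistics.mean(market)
y_bar = statistics.mean(fund)

# Closed-form OLS for one regressor plus an intercept.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(market, fund))
s_xx = sum((x - x_bar) ** 2 for x in market)
beta = s_xy / s_xx
alpha_monthly = y_bar - beta * x_bar

# R-squared: share of the fund's variance explained by the market.
residuals = [y - (alpha_monthly + beta * x) for x, y in zip(market, fund)]
ss_res = sum(e ** 2 for e in residuals)
ss_tot = sum((y - y_bar) ** 2 for y in fund)
r_squared = 1 - ss_res / ss_tot

# Only the intercept is annualized; beta and R-squared are unit-free.
alpha_annual = alpha_monthly * 12

print(f"alpha={alpha_annual:.4f} beta={beta:.4f} R2={r_squared:.4f}")
```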

### Section 3 Forecast Regressions
#### 3.1 Consider the lagged regression, where the regressor (X) is a period behind the target (SPY). Estimate and report the R-squared as well as the OLS estimates for alpha and beta. Do this for:
- the dividend-price ratio
- the earnings-price ratio
- the dividend-price ratio, the earnings-price ratio, and the 10-year yield

In [38]:
signals_data = pd.read_excel('../data/gmo_analysis_data.xlsx', sheet_name='signals')
signals_data.rename(columns={'Unnamed: 0':'Date'},inplace=True)
signals_data = signals_data.set_index('Date').dropna()
signals_data = signals_data.shift()
signals_data['SPY'] = returns_data['SPY']
signals_data.head()

Date,DP,EP,US10Y,SPY
1993-02-28,,,,
1993-03-31,2.82,4.44,6.03,
1993-04-30,2.77,4.41,6.03,
1993-05-31,2.82,4.44,6.05,
1993-06-30,2.81,4.38,6.16,


In [39]:
print("Lagged Regression 1: Dividend-Price Ratio")
display(reg_params(signals_data, 'SPY', 'DP'))

print("Lagged Regression 2: Earnings-Price Ratio")
display(reg_params(signals_data, 'SPY', 'EP'))

print("Lagged Regression 3: Dividend-Price Ratio, Earnings-Price Ratio, 10-Year Treasury")
display(reg_params(signals_data, 'SPY', ['DP', 'EP', 'US10Y']))

Lagged Regression 1: Dividend-Price Ratio


Unnamed: 0,Regression Parameters
const,-0.170188
DP,0.012212
R-squared,0.010371


Lagged Regression 2: Earnings-Price Ratio


Unnamed: 0,Regression Parameters
const,-0.046901
EP,0.002687
R-squared,0.005618


Lagged Regression 3: Dividend-Price Ratio, Earnings-Price Ratio, 10-Year Treasury


Unnamed: 0,Regression Parameters
const,-0.219312
DP,0.010283
EP,0.002271
US10Y,-0.000696
R-squared,0.015164


#### 3.2 For each of the three regressions, construct a trading strategy. For each strategy, estimate:
 - mean, volatility, and Sharpe ratio
 - max-drawdown
 - market alpha
 - market beta 
 - market Information Ratio

In [40]:
# The signals were already lagged one period when signals_data was built.
# Strategy weight = 100 * forecasted SPY return, where the forecast is
# alpha + beta * (lagged signal). reg_params annualizes the intercept,
# so divide it by 12 to recover the monthly alpha.
col = 'Regression Parameters'

DP = reg_params(signals_data, 'SPY', 'DP')
DP_weighted = 100 * (DP.loc['const', col]/12 + DP.loc['DP', col] * signals_data['DP'])
DP_returns = (DP_weighted * signals_data['SPY']).dropna()

EP = reg_params(signals_data, 'SPY', 'EP')
EP_weighted = 100 * (EP.loc['const', col]/12 + EP.loc['EP', col] * signals_data['EP'])
EP_returns = (EP_weighted * signals_data['SPY']).dropna()

ThreeFactor = reg_params(signals_data, 'SPY', ['DP', 'EP', 'US10Y'])
ThreeFactor_Weighted = 100 * (ThreeFactor.loc['const', col]/12 + ThreeFactor.loc['EP', col] * signals_data['EP']\
                                             + ThreeFactor.loc['DP', col] * signals_data['DP']\
                                             + ThreeFactor.loc['US10Y', col] * signals_data['US10Y'])
ThreeFactor_Returns = (ThreeFactor_Weighted * signals_data['SPY']).dropna()
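
The weighting rule in the cell above sets the SPY position each month to 100 times the forecasted return and then earns that weight times the realized return. A minimal sketch of the mechanics, assuming made-up coefficients and data (`alpha`, `beta`, `lagged_signal`, and `spy_return` are hypothetical, not the fitted values):

```python
# Hypothetical monthly regression coefficients and data, for illustration only.
alpha = -0.014                                # monthly intercept
beta = 0.012                                  # slope on the lagged signal
lagged_signal = [1.5, 1.2, 1.0, 1.4]          # signal observed at the end of month t-1
spy_return = [0.020, -0.030, -0.010, 0.015]   # SPY return realized in month t

SCALE = 100  # weight = 100 * forecasted return

# The forecast for month t uses only information available at t-1.
forecast = [alpha + beta * x for x in lagged_signal]
weight = [SCALE * f for f in forecast]

# Strategy return: position size times the realized SPY return.
strategy_return = [w * r for w, r in zip(weight, spy_return)]

for f, w, r in zip(forecast, weight, strategy_return):
    print(f"forecast={f:+.4f} weight={w:+.2f} strategy={r:+.4f}")
```

A negative forecast produces a short position: in the third month the forecast is -0.2%, the weight is -0.2, and the strategy gains when SPY falls.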

In [41]:
# Summary stats: mean, vol, and Sharpe, plus alpha, beta, and information ratio
# against a benchmark, and max drawdown
def summary_stats_bm(series, bm, annual_fac=12):
    ss_df = pd.DataFrame(data = None, index = ['Summary Stats'])
    ss_df['Mean'] = series.mean() * annual_fac
    ss_df['Vol'] = series.std() * np.sqrt(annual_fac)
    ss_df['Sharpe (Mean/Vol)'] = ss_df['Mean'] / ss_df['Vol']
    
    y = series
    X = sm.add_constant(bm.loc[series.index])
    reg = sm.OLS(y,X).fit()
    regParams = reg.params
    ss_df['alpha'] = regParams.iloc[0] * annual_fac
    ss_df['SPY beta'] = regParams.iloc[1]
    
    cum_ret = (1 + series).cumprod()
    rolling_max = cum_ret.cummax()
    drawdown = (cum_ret - rolling_max) / rolling_max
    ss_df['Max Drawdown'] = drawdown.min()
    ss_df['Information Ratio'] = (regParams.iloc[0] / reg.resid.std()) * np.sqrt(annual_fac)
    
    return round(ss_df, 4)

In [42]:
print("Dividend Price Strategy")
display(summary_stats_bm(DP_returns, signals_data[['SPY']]))

print("Earnings Price Strategy")
display(summary_stats_bm(EP_returns, signals_data[['SPY']]))

print("Three Factor Strategy")
display(summary_stats_bm(ThreeFactor_Returns, signals_data[['SPY']]))

Dividend Price Strategy


Unnamed: 0,Mean,Vol,Sharpe (Mean/Vol),alpha,SPY beta,Max Drawdown,Information Ratio
Summary Stats,0.1014,0.1691,0.5999,0.0175,0.8775,-0.7128,0.1773


Earnings Price Strategy


Unnamed: 0,Mean,Vol,Sharpe (Mean/Vol),alpha,SPY beta,Max Drawdown,Information Ratio
Summary Stats,0.0899,0.1251,0.7186,0.0223,0.707,-0.354,0.3802


Three Factor Strategy


Unnamed: 0,Mean,Vol,Sharpe (Mean/Vol),alpha,SPY beta,Max Drawdown,Information Ratio
Summary Stats,0.1131,0.1569,0.7211,0.0379,0.7868,-0.5978,0.389


#### 3.3 a) GMO believes a risk premium is compensation for a security's tendency to lose money at "bad times". For the strategies, the market, and GMO, calculate the monthly VaR (0.05)

In [43]:
VaR = pd.DataFrame([DP_returns.quantile(.05), EP_returns.quantile(.05), ThreeFactor_Returns.quantile(.05), 
                    signals_data['SPY'].quantile(.05), 
                    returns_data['GMWAX'].quantile(.05)],
                   index = ['DP Strategy','EP Strategy','3-Factor Strategy','SPY','GMO'], 
                   columns = ['VaR (0.05)'])
display(VaR)

Unnamed: 0,VaR (0.05)
DP Strategy,-0.055534
EP Strategy,-0.056032
3-Factor Strategy,-0.06458
SPY,-0.078975
GMO,-0.047306
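
The 5% VaR here is the empirical 5th percentile of monthly returns. A minimal standard-library sketch on made-up data (`returns` is hypothetical) shows the calculation; `method="inclusive"` interpolates linearly between order statistics, which should correspond to the default interpolation of `Series.quantile`.

```python
import statistics

# Hypothetical monthly returns (21 observations), for illustration only.
returns = [-0.12, -0.07, -0.05, -0.03, -0.02, -0.01, 0.00, 0.01, 0.01, 0.02,
           0.02, 0.03, 0.03, 0.04, 0.04, 0.05, 0.05, 0.06, 0.07, 0.08, 0.10]

# Cut the sample at 19 points into 20 equal-probability bins;
# the first cut point is the 5th percentile.
var_5 = statistics.quantiles(returns, n=20, method="inclusive")[0]

print(round(var_5, 4))  # -0.07
```

With 21 observations the 5th percentile falls exactly on the second-smallest return, so no interpolation is needed in this example.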


#### 3.3 b) The GMO case mentions that stocks underperformed short-term bonds from 2000-2011. Does the dynamic portfolio above underperform the risk-free rate over this time?

In [44]:
print("Risk Free Rate")
display(portfolio_stats_2(returns_data['RF'].loc['2000':'2011']))

print("Dividend Price Strategy")
display(portfolio_stats_2(DP_returns.loc['2000':'2011']))

print("Earnings Price Strategy")
display(portfolio_stats_2(EP_returns.loc['2000':'2011']))

print("Three Factor Strategy")
display(portfolio_stats_2(ThreeFactor_Returns.loc['2000':'2011']))

Risk Free Rate


Unnamed: 0,0
Mean,0.0231
Volatility,0.0058
Sharpe,3.9866


Dividend Price Strategy


Unnamed: 0,0
Mean,0.0473
Volatility,0.2118
Sharpe,0.2233


Earnings Price Strategy


Unnamed: 0,0
Mean,0.0333
Volatility,0.1264
Sharpe,0.2634


Three Factor Strategy


Unnamed: 0,0
Mean,0.0627
Volatility,0.1757
Sharpe,0.3569


No. Over 2000-2011, all three dynamic strategies earned a higher annualized mean return (4.73%, 3.33%, and 6.27%) than the risk-free rate (2.31%).

#### 3.3 c) Based on the regression estimates, in how many periods do we estimate a negative risk premium?

In [45]:
all_returns = ThreeFactor_Returns.to_frame('3-Factor Strategy')
all_returns['DP Strategy'] = DP_returns
all_returns['EP Strategy'] = EP_returns
all_returns['Risk-Free'] = risk_free_rate['US3M']

df_riskprem = pd.DataFrame(data=None, index=['% of periods underperforming risk-free rate'])
for col in all_returns.columns[:3]:
    df_riskprem[col] = len(all_returns[all_returns[col] < all_returns['Risk-Free']])/len(all_returns) * 100
    
display(df_riskprem)

Unnamed: 0,3-Factor Strategy,DP Strategy,EP Strategy
% of periods underperforming risk-free rate,38.141026,40.064103,39.423077
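
The table above counts months in which each strategy's realized return fell below the risk-free rate. A more literal reading of the question counts the months in which the fitted forecast itself is negative. A minimal sketch of that count, assuming made-up coefficients and signal values (none of these numbers come from the regressions above):

```python
# Hypothetical monthly coefficients and lagged signal values, for illustration only.
alpha = -0.014
beta = 0.012
lagged_signal = [1.5, 1.1, 1.0, 1.4, 0.9, 1.3]

# Forecasted return for each month from the fitted line.
forecast = [alpha + beta * x for x in lagged_signal]

# Count the months where the model predicts a negative return.
n_negative = sum(f < 0 for f in forecast)
share_negative = n_negative / len(forecast)

print(n_negative, f"{share_negative:.1%}")
```

If the regression forecasts total rather than excess returns, as it does when the target is `returns_data['SPY']`, the comparison would be against the risk-free rate for that month instead of zero.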


#### 3.3 d) Do you believe the dynamic strategy takes on extra risk?
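Not obviously. The strategies' 5% VaRs (-5.6% to -6.5%) are smaller in magnitude than SPY's (-7.9%), and their volatilities (12.5% to 16.9%) are roughly in line with SPY's. The exception is max drawdown: the DP strategy (-71%) and the three-factor strategy (-60%) both fall further than SPY's -51% full-sample drawdown.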

### 4 Out of Sample Forecasting
#### 4.1 Report the out-of-sample R-squared


In [46]:
# Used for calculating an out of sample R-squared value where we are told where to start (t=60 in this hw)
def OOS_r2(df, factors, start):
    y = df['SPY']
    X = sm.add_constant(df[factors])

    forecast_err, null_err = [], []

    for i,j in enumerate(df.index):
        if i >= start:
            currX = X.iloc[:i]
            currY = y.iloc[:i]
            reg = sm.OLS(currY, currX, missing = 'drop').fit()
            null_forecast = currY.mean()
            reg_predict = reg.predict(X.iloc[[i]])
            actual = y.iloc[[i]]
            forecast_err.append(reg_predict - actual)
            null_err.append(null_forecast - actual)
            
    RSS = (np.array(forecast_err)**2).sum()
    TSS = (np.array(null_err)**2).sum()
    
    return 1 - RSS/TSS

In [47]:
dividend_price_r2 = OOS_r2(signals_data, ['DP'], 60)
earnings_price_r2 = OOS_r2(signals_data, ['EP'], 60)

print("Dividend Price R-squared: ", dividend_price_r2)
print("Earnings Price R-squared: ", earnings_price_r2)

Dividend Price R-squared:  -0.02014701776090866
Earnings Price R-squared:  -0.018315436829004383


#### 4.1 Did this forecast produce a positive out of sample r-squared?
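No. The out-of-sample R-squared is negative for both forecasts (about -0.020 for DP and -0.018 for EP), so neither beats the expanding-window historical mean.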
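
For reference, the statistic computed by `OOS_r2` can be written as follows. The notation is introduced here for illustration: $x_t$ is the already-lagged signal, $\hat{\alpha}_t$ and $\hat{\beta}_t$ are estimated using only observations before $t$, $\bar{r}_t$ is the mean return over those same observations, and $s$ is the first forecast date (row 60 here).

$$
R^2_{OOS} = 1 - \frac{\sum_{t=s}^{T} \left(r_t - \hat{r}_t\right)^2}{\sum_{t=s}^{T} \left(r_t - \bar{r}_t\right)^2},
\qquad \hat{r}_t = \hat{\alpha}_t + \hat{\beta}_t\, x_t
$$

A negative value means the regression forecast produced larger squared errors than simply forecasting the historical mean.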

#### 4.2 Redo problem 3.2 using this OOS forecast

In [50]:
def OOS_strat(df, factors, start, weight):
    returns = []
    y = df['SPY']
    X = sm.add_constant(df[factors])

    for i,j in enumerate(df.index):
        if i >= start:
            currX = X.iloc[:i]
            currY = y.iloc[:i]
            reg = sm.OLS(currY, currX, missing = 'drop').fit()
            pred = reg.predict(X.iloc[[i]])
            w = pred * weight
            returns.append((df.iloc[i]['SPY'] * w).iloc[0])

    df_strat = pd.DataFrame(data = returns, index = df.iloc[-(len(returns)):].index, columns = ['Strat Returns'])
    return df_strat

dp_oos = OOS_strat(signals_data, ['DP'], 60, 100)
# display(dp_oos)

In [52]:
ep_oos = OOS_strat(signals_data, ['EP'], 60, 100)
# display(ep_oos)

In [54]:
print("Dividend Price Strategy Out of Sample")
display(summary_stats_bm(dp_oos['Strat Returns'], signals_data[['SPY']]))

print("Earnings Price Strategy Out of Sample")
display(summary_stats_bm(ep_oos['Strat Returns'], signals_data[['SPY']]))

Dividend Price Strategy Out of Sample


Unnamed: 0,Mean,Vol,Sharpe (Mean/Vol),alpha,SPY beta,Max Drawdown,Information Ratio
Summary Stats,0.0674,0.1892,0.3563,-0.0088,0.8896,-0.7417,-0.0689


Earnings Price Strategy Out of Sample


Unnamed: 0,Mean,Vol,Sharpe (Mean/Vol),alpha,SPY beta,Max Drawdown,Information Ratio
Summary Stats,0.096,0.1778,0.5401,0.0518,0.5161,-0.5594,0.3269


#### 4.2 How much better/worse is the performance of the OOS strategy compared to in-sample version of 3.2?
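Performance is worse out of sample on a risk-adjusted basis. Volatility is higher for both strategies (DP: 16.9% to 18.9%; EP: 12.5% to 17.8%), and the Sharpe ratios fall (DP: 0.60 to 0.36; EP: 0.72 to 0.54). The DP strategy's mean drops from 10.1% to 6.7%, and its information ratio turns negative (0.18 to -0.07). The EP strategy's mean is actually slightly higher (9.0% to 9.6%), and its information ratio declines only modestly (0.38 to 0.33).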

#### 4.3 Redo problem 3.3 using the OOS forecast.
#### 4.3 a) Calculate monthly VaR


In [55]:
VaR = pd.DataFrame([dp_oos['Strat Returns'].quantile(.05), ep_oos['Strat Returns'].quantile(.05),
                    signals_data['SPY'].quantile(.05), 
                    returns_data['GMWAX'].quantile(.05)],
                   index = ['DP OOS Strategy','EP OOS Strategy','SPY','GMO'], 
                   columns = ['VaR (0.05)'])
display(VaR)

Unnamed: 0,VaR (0.05)
DP OOS Strategy,-0.061408
EP OOS Strategy,-0.07616
SPY,-0.078975
GMO,-0.047306


#### 4.3 b) Do these portfolios underperform the risk free rate from 2000-2011?

In [58]:
print("Dividend Price Out of Sample")
display(summary_stats_bm(dp_oos.loc['2000':'2011']['Strat Returns'], returns_data[['SPY']]))

print("Earnings Price Out of Sample")
display(summary_stats_bm(ep_oos.loc['2000':'2011']['Strat Returns'], returns_data[['SPY']]))

print("Risk Free Rate")
portfolio_stats_2(risk_free_rate.loc['2000':'2011'])

Dividend Price Out of Sample


Unnamed: 0,Mean,Vol,Sharpe (Mean/Vol),alpha,SPY beta,Max Drawdown,Information Ratio
Summary Stats,0.0079,0.2412,0.0329,-0.0103,1.0022,-0.7417,-0.0578


Earnings Price Out of Sample


Unnamed: 0,Mean,Vol,Sharpe (Mean/Vol),alpha,SPY beta,Max Drawdown,Information Ratio
Summary Stats,0.0859,0.2212,0.3884,0.0803,0.3098,-0.5594,0.3728


Risk Free Rate


Unnamed: 0,US3M
Mean,0.0231
Volatility,0.0058
Sharpe,3.9866


The Dividend Price strategy underperforms the risk-free rate over 2000-2011 (annualized mean of 0.79% vs. 2.31%), while the Earnings Price strategy outperforms it (8.59%).

#### 4.3 c) How many periods do we expect a negative risk premium?

In [59]:
r_df_OOS = ep_oos.rename(columns={"Strat Returns": "EP Strat"})
r_df_OOS['DP Strat'] = dp_oos['Strat Returns']
r_df_OOS['rf'] = risk_free_rate['US3M']

df_riskprem2 = pd.DataFrame(data=None, index=['% of periods underperforming risk-free rate'])
# Loop over the two strategy columns only; 'rf' is the comparison column.
for col in r_df_OOS.columns[:2]:
    df_riskprem2[col] = len(r_df_OOS[r_df_OOS[col] < r_df_OOS['rf']])/len(r_df_OOS) * 100
    
df_riskprem2

Unnamed: 0,EP Strat,DP Strat
% of periods underperforming risk-free rate,37.037037,39.393939


#### 4.3 d) Do you believe this strategy takes on extra risk? 
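Yes. Both OOS strategies have higher volatility than SPY (18.9% and 17.8% vs. about 15.7%) and deeper max drawdowns (-74% and -56% vs. -51%), although their 5% VaRs (-6.1% and -7.6%) are slightly smaller in magnitude than SPY's (-7.9%). On balance, the strategies do appear to take on extra risk.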