<left>FINM 36700 - Portfolio Theory and Risk Management</left> 
<br>

<h2><center> Homework 7 (GMO) </center></h2>

<left> Group 22 (Raafay, Shrey, Sarp, Aditya, Riccardo) </left>

## 1) GMO

#### <i> This section is not graded, and you do not need to submit your answers. But you are expected to consider these issues and be ready to discuss them. </i>

## 1.1) GMO's approach

### 1.1.a) Why does GMO believe they can more easily predict long-run than short-run asset class performance?

GMO believed that investors demand a risk premium for holding stocks over the long run because stocks tend to lose money at "bad times". GMO also believed that market prices can deviate significantly from fundamental value in the short run, so short-term expected returns can differ from the long-run estimates. Near-term expected returns are therefore noisy, which led GMO to believe it could predict long-run asset class performance more easily than short-run performance. Additionally, GMO believed that while the stock market is like a voting machine in the short term, it is like a weighing machine in the long run.

### 1.1.b) What predicting variables does the case mention are used by GMO? Does this fit with the goal of long-run forecasts?

GMO used the dividend yield that investors were likely to require over the long run and the expected long-run dividend growth rate. Under the "Gordon Growth Model", the long-run required return on stocks is the sum of the fair dividend yield and the long-run dividend growth rate. GMO also built its forecast on the fact that the return on stocks equals the dividend yield, plus the percentage change in the price-earnings multiple, plus the percentage change in profit margins, plus the percentage change in sales per share. These valuation-based variables move slowly, which fits the goal of long-run rather than short-run forecasts.

### 1.1.c) How has this approach led to contrarian positions?

This approach led to contrarian positions because GMO thought that prices could deviate from fundamental value, particularly at the level of broad asset classes such as U.S. stocks, and that prices would revert to fundamental value over time. For instance, when the market has an overly optimistic view of future dividends, prices exceed fair value. Investors eventually realize they were too optimistic, and prices revert toward fair value. Thus, at times of high prices GMO would hold the contrarian view that expected returns are low, and vice versa.

### 1.1.d) How does this approach raise business risk and managerial career risk?

GMO's contrarian approach creates business and managerial career risk if stocks do not revert to their fundamental value. For instance, GMO became bearish on U.S. stocks in 1997 and underweighted stocks in its asset allocation funds. But as U.S. stocks soared between 1997 and 2000, GMO's asset allocation funds underperformed severely and the firm lost nearly 60% of its assets as investors withdrew.

## 1.2) Market Environment

### 1.2.a) We often estimate the market risk premium by looking at a large sample of historic data. What reasons does the case give to be skeptical that the market risk premium will be as high in the future as it has been over the past 50 years?

GMO had a slightly negative outlook for stocks in 2012, even though valuations had fallen from their 1999 peak, although it still believed that stocks would continue to earn a healthy risk premium over the long run. GMO also believed that certain trends in historical data were not necessarily indicative of how markets would behave in the future.

### 1.2.b) In 2007, GMO forecasts real excess equity returns will be negative. What are the biggest drivers of their pessimistic conditional forecast relative to the unconditional forecast? (See Exhibit 9.)

- $\% \Delta(\frac{P}{E})$ was estimated to be negative (-2.8%) for the 7-year forecast
- $\% \Delta(\frac{E}{S})$ was also estimated to be negative (-3.9%) for the 7-year forecast

### 1.2.c) In the 2011 forecast, what components has GMO revised most relative to 2007? Now how does their conditional forecast compare to the unconditional? (See Exhibit 10.)

- $\% \Delta(\frac{P}{E})$ improved from -2.8% in 2007 to 0.0% in 2011

- $\% \Delta(\frac{E}{S})$ became slightly less negative, from -3.9% in 2007 to -3.7% in 2011

- $\% \Delta(S)$ improved from 2.4% in 2007 to 2.9% in 2011

- Dividend yield also increased from 2.3% in 2007 to 2.5% in 2011
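As a rough check on the components listed above, the four terms of the return decomposition can be summed for each vintage. This is a minimal sketch that uses only the percentages quoted in these answers; the dictionary keys and the helper `total_return` are illustrative names, and the simple sum ignores compounding and any exhibit terms not listed here.

```python
# Components in percent per year, as quoted in the answers to 1.2.b and 1.2.c.
forecast_2007 = {
    "dividend_yield": 2.3,
    "pe_change": -2.8,
    "margin_change": -3.9,
    "sales_growth": 2.4,
}
forecast_2011 = {
    "dividend_yield": 2.5,
    "pe_change": 0.0,
    "margin_change": -3.7,
    "sales_growth": 2.9,
}

def total_return(components):
    """Sum the decomposition terms (simple, non-compounded), rounded to 2 decimals."""
    return round(sum(components.values()), 2)

total_2007 = total_return(forecast_2007)
total_2011 = total_return(forecast_2011)
print(f"2007 sum of components: {total_2007}%")
print(f"2011 sum of components: {total_2011}%")
```

With these inputs the components sum to about -2.0% for 2007 and about 1.7% for 2011, so most of the improvement comes from the price-earnings term.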

## 1.3) Consider the asset class forecasts in Exhibit 1.
### 1.3.a) Which asset class did GMO estimate to have a negative 10-year return over 2002-2011?

US Equities

### 1.3.b) Which asset classes substantially outperformed GMO's estimate over that time period?

EM Equities and Foreign Government Debt

### 1.3.c) Which asset classes substantially underperformed GMO's estimate over that time period?

US Treasury Bills and US REITs

## 1.4) Fund Performance
### 1.4.a) In which asset class was GMWAX most heavily allocated throughout the majority of 1997-2011?

U.S. Fixed Income

### 1.4.b) Comment on the performance of GMWAX versus its benchmark. (No calculation needed, simply comment on the comparison in the exhibits.)

-  The fund seems to have been quite successful relative to the benchmark

# Imports

In [1]:
import pandas as pd
import numpy as np
import scipy.stats as stats
from scipy.stats import kurtosis, skew
from scipy.stats import norm
import seaborn as sns
import statsmodels.api as sm
from statsmodels.regression.rolling import RollingOLS


from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn import tree
from sklearn.neural_network import MLPRegressor

import warnings
warnings.filterwarnings("ignore")

%matplotlib inline


import matplotlib.pyplot as plt
plt.rcParams['figure.figsize']=[15, 6]
import matplotlib.cm as cm

# Helper Functions

In [2]:
def performance_summary(return_data):
    """ 
        Returns the Performance Stats for given set of returns
        Inputs: 
            return_data - DataFrame with Date index and Monthly Returns for different assets/strategies.
        Output:
            summary_stats - DataFrame with annualized mean return, vol and Sharpe ratio, plus skewness, excess kurtosis, VaR (0.05),
                            CVaR (0.05) and drawdown statistics based on monthly returns. 
    """
    summary_stats = return_data.mean().to_frame('Mean').apply(lambda x: x*12)
    summary_stats['Volatility'] = return_data.std().apply(lambda x: x*np.sqrt(12))
    summary_stats['Sharpe Ratio'] = summary_stats['Mean']/summary_stats['Volatility']
    
    summary_stats['Skewness'] = return_data.skew()
    summary_stats['Excess Kurtosis'] = return_data.kurtosis()
    summary_stats['VaR (0.05)'] = return_data.quantile(.05, axis = 0)
    summary_stats['CVaR (0.05)'] = return_data[return_data <= return_data.quantile(.05, axis = 0)].mean()
    summary_stats['Min'] = return_data.min()
    summary_stats['Max'] = return_data.max()
    
    wealth_index = 1000*(1+return_data).cumprod()
    previous_peaks = wealth_index.cummax()
    drawdowns = (wealth_index - previous_peaks)/previous_peaks

    summary_stats['Max Drawdown'] = drawdowns.min()
    summary_stats['Peak'] = [previous_peaks[col][:drawdowns[col].idxmin()].idxmax() for col in previous_peaks.columns]
    summary_stats['Bottom'] = drawdowns.idxmin()
    
    recovery_date = []
    for col in wealth_index.columns:
        prev_max = previous_peaks[col][:drawdowns[col].idxmin()].max()
        recovery_wealth = pd.DataFrame([wealth_index[col][drawdowns[col].idxmin():]]).T
        recovery_date.append(recovery_wealth[recovery_wealth[col] >= prev_max].index.min())
    summary_stats['Recovery'] = recovery_date
    
    return summary_stats
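The drawdown logic at the end of `performance_summary` (wealth index, running peak, bottom, recovery) can be traced on a four-period toy series. The following is a dependency-free sketch; `drawdown_stats` and the toy returns are hypothetical and only mirror the steps of the pandas version above.

```python
def drawdown_stats(returns):
    """Return (max_drawdown, peak_idx, bottom_idx, recovery_idx) for periodic returns."""
    # Wealth index: cumulative product of (1 + r).
    wealth, level = [], 1.0
    for r in returns:
        level *= 1.0 + r
        wealth.append(level)

    # Running peak of the wealth index.
    running_peak, peak = [], float("-inf")
    for w in wealth:
        peak = max(peak, w)
        running_peak.append(peak)

    drawdowns = [w / p - 1.0 for w, p in zip(wealth, running_peak)]
    bottom = min(range(len(drawdowns)), key=drawdowns.__getitem__)
    # Peak: highest wealth level observed up to (and including) the bottom.
    peak_idx = max(range(bottom + 1), key=wealth.__getitem__)
    # Recovery: first period at or after the bottom with wealth back at the prior peak.
    recovery = next(
        (i for i in range(bottom, len(wealth)) if wealth[i] >= wealth[peak_idx]),
        None,  # None means the series never recovers within the sample
    )
    return drawdowns[bottom], peak_idx, bottom, recovery

toy_returns = [0.10, -0.20, 0.05, 0.30]
mdd, peak_idx, bottom_idx, recovery_idx = drawdown_stats(toy_returns)
print(mdd, peak_idx, bottom_idx, recovery_idx)
```

Here the peak is in period 0, the bottom in period 1 (a 20% drawdown), and the recovery in period 3, when the wealth index first exceeds its earlier peak.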

In [3]:
def regression_based_performance(factor,fund_ret,rf,constant = True):
    """ 
        Returns the Regression based performance Stats for given set of returns and factors
        Inputs:
            factor - Dataframe containing monthly returns of the regressors
            fund_ret - Dataframe containing monthly excess returns of the regressand fund
            rf - Monthly risk free rate of return
        Output:
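            summary_stats - tuple (beta, treynor ratio, information ratio, alpha, r-squared, tracking error, residuals, model);
                            without a constant: (beta, treynor ratio, r-squared, tracking error, residuals).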
    """
    if constant:
        X = sm.tools.add_constant(factor)
    else:
        X = factor
    y=fund_ret
    model = sm.OLS(y,X,missing='drop').fit()
    
    if constant:
        beta = model.params[1:]
        alpha = round(float(model.params['const']),6) *12

        
    else:
        beta = model.params
    treynor_ratio = ((fund_ret - rf).mean()*12)/beta.iloc[0]
    tracking_error = (model.resid.std()*np.sqrt(12))
    if constant:        
        information_ratio = model.params['const']*12/tracking_error
    r_squared = model.rsquared
    if constant:
        return (beta,treynor_ratio,information_ratio,alpha,r_squared,tracking_error,model.resid,model)
    else:
        return (beta,treynor_ratio,r_squared,tracking_error,model.resid)
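For a single market factor, the statistics returned by `regression_based_performance` reduce to closed-form expressions. The following is a minimal, dependency-free sketch on toy numbers; `single_factor_stats` and the toy series are hypothetical and are not part of the notebook, which uses `statsmodels`. The residual volatility uses the sample standard deviation (n-1 in the denominator), which is the pandas default used above.

```python
import math
import statistics

def single_factor_stats(fund_excess, market_excess, periods_per_year=12):
    """OLS of fund excess returns on market excess returns, with annualized statistics."""
    mean_f = statistics.fmean(fund_excess)
    mean_m = statistics.fmean(market_excess)
    cov = sum((m - mean_m) * (f - mean_f) for m, f in zip(market_excess, fund_excess))
    var = sum((m - mean_m) ** 2 for m in market_excess)
    beta = cov / var
    alpha = mean_f - beta * mean_m  # per-period intercept
    resid = [f - (alpha + beta * m) for f, m in zip(fund_excess, market_excess)]
    tracking_error = statistics.stdev(resid) * math.sqrt(periods_per_year)
    total_ss = sum((f - mean_f) ** 2 for f in fund_excess)
    return {
        "beta": beta,
        "alpha": alpha * periods_per_year,
        "treynor": mean_f * periods_per_year / beta,
        "information_ratio": alpha * periods_per_year / tracking_error,
        "r_squared": 1.0 - sum(e ** 2 for e in resid) / total_ss,
    }

# Toy data built as 0.001 + 0.5 * market + noise, with noise orthogonal to the market.
market = [-0.02, -0.01, 0.0, 0.01, 0.02]
fund = [-0.008, -0.006, 0.003, 0.004, 0.012]
stats_toy = single_factor_stats(fund, market)
print(stats_toy)
```

Because the toy noise is mean-zero and orthogonal to the market series, the regression recovers a beta of 0.5 and a monthly alpha of 0.001 (0.012 annualized).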

In [4]:
def rolling_regression_param(factor,fund_ret,roll_window = 60):
    """ 
        Returns the Rolling Regression parameters for given set of returns and factors
        Inputs:
            factor - Dataframe containing monthly returns of the regressors
            fund_ret - Dataframe containing monthly excess returns of the regressand fund
            roll_window = rolling window for regression
        Output:
            params - Dataframe with time-t as the index and constant and Betas as columns
    """
    X = sm.add_constant(factor)
    y= fund_ret
    rols = RollingOLS(y, X, window=roll_window)
    rres = rols.fit()
    params = rres.params.copy()
    params.index = np.arange(1, params.shape[0] + 1)
    return params
    

# Reading Data

In [5]:
gmo_total_ret = pd.read_excel('gmo_analysis_data.xlsx',sheet_name = 'returns (total)', index_col = 0)
gmo_total_ret.index.name = 'Date'

In [6]:
path = 'gmo_analysis_data.xlsx'
rf = pd.read_excel(path,sheet_name = 'risk-free rate', index_col = 0)
rf.index.name = 'Date'

In [7]:
path = 'gmo_analysis_data.xlsx'
gmo_signals = pd.read_excel(path,sheet_name = 'signals', index_col = 0)
gmo_signals.index.name = 'Date'

In [8]:
gmo_excess_ret = gmo_total_ret.copy()
for col in gmo_excess_ret.columns:
    gmo_excess_ret[col] = gmo_excess_ret[col] - rf['US3M']

gmo_excess_ret.tail()

Date,SPY,GMWAX
2023-06-30,0.060289,0.035234
2023-07-31,0.028108,0.019797
2023-08-31,-0.020885,-0.02565
2023-09-30,-0.052018,-0.020966
2023-10-31,-0.026367,-0.0343


## 2) Analyzing GMO

#### This section utilizes data in the file, `gmo_analysis_data.xlsx`.
#### Examine GMO's performance. Use the risk-free rate to convert the total returns to excess returns

### 2.1) Calculate the mean, volatility, and Sharpe ratio for GMWAX. Do this for three samples:

### • from inception through 2011
### • 2012-present
### • inception - present

In [9]:
sub_samples = {
              '1993-2011' : ['1993','2011'],
              '2012-2022' : ['2012','2022'],
              '1993-2022' : ['1993','2022'],
              }

gmo_sum = []
for k,v in sub_samples.items():
    sub_gmo = gmo_excess_ret.loc[sub_samples[k][0]:sub_samples[k][1],['GMWAX']].dropna()
    gmo_summary = performance_summary(sub_gmo)
    gmo_summary.index = [k]
    gmo_sum.append(gmo_summary)

gmo_summary = pd.concat(gmo_sum)
gmo_summary.loc[:,['Mean','Volatility','Sharpe Ratio']]

Sample,Mean,Volatility,Sharpe Ratio
1993-2011,0.015827,0.125011,0.126603
2012-2022,0.040248,0.09398,0.428265
1993-2022,0.026093,0.112897,0.231123


### Has the mean, vol, and Sharpe changed much since the case?

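Yes. Comparing 1993-2011 with 2012-2022, the annualized mean excess return rose from 1.6% to 4.0% and volatility fell from 12.5% to 9.4%, so the Sharpe ratio rose from 0.13 to 0.43.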
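The figures in the table rely on the usual monthly-to-annual scaling used in `performance_summary`: the mean is multiplied by 12 and the volatility by $\sqrt{12}$. A minimal sketch on made-up monthly excess returns (the helper `annualized_stats` and the toy data are hypothetical):

```python
import math
import statistics

def annualized_stats(monthly_excess, periods_per_year=12):
    """Return (annualized mean, annualized volatility, Sharpe ratio)."""
    mean = statistics.fmean(monthly_excess) * periods_per_year
    # Sample standard deviation (n - 1), matching the pandas default.
    vol = statistics.stdev(monthly_excess) * math.sqrt(periods_per_year)
    return mean, vol, mean / vol

toy_excess = [0.01, -0.02, 0.03, 0.00]
ann_mean, ann_vol, sharpe = annualized_stats(toy_excess)
print(ann_mean, ann_vol, sharpe)
```

Since the inputs are already excess returns, the Sharpe ratio is simply the annualized mean divided by the annualized volatility.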

### 2.2) GMO believes a risk premium is compensation for a security's tendency to lose money at "bad times". For all three samples, analyze extreme scenarios by looking at -
### • Min return
### • 5th percentile (VaR-5th)
### • Maximum  Drawdown

In [10]:
sub_samples = {
              '1993-2011' : ['1993','2011'],
              '2012-2022' : ['2012','2022'],
              '1993-2022' : ['1993','2022'],
              }

gmo_mdd = []
for k,v in sub_samples.items():
    sub_gmo = gmo_total_ret.loc[sub_samples[k][0]:sub_samples[k][1],['GMWAX']].dropna()
    gmo_drawdown = performance_summary(sub_gmo)
    gmo_drawdown = gmo_drawdown.loc[:,['Max Drawdown']]
    gmo_drawdown.index = [k]
    gmo_mdd.append(gmo_drawdown)

gmo_mdd = pd.concat(gmo_mdd)
gmo_mdd_summary = gmo_summary.loc[:,['Min','VaR (0.05)']].merge(gmo_mdd,how='inner',on=gmo_mdd.index).rename(columns={'key_0':'Sub-Sample'})
gmo_mdd_summary.index = gmo_mdd_summary['Sub-Sample']
gmo_mdd_summary = gmo_mdd_summary.drop(['Sub-Sample'],axis = 1)
gmo_mdd_summary

Sub-Sample,Min,VaR (0.05),Max Drawdown
1993-2011,-0.149179,-0.059806,-0.355219
2012-2022,-0.11865,-0.039362,-0.216773
1993-2022,-0.149179,-0.048109,-0.355219


### 2.2.a) Does GMWAX have high or low tail-risk as seen by these stats

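GMWAX seems to have relatively low tail risk at the monthly horizon: over the full sample the worst month is -14.9% and the 5% VaR is -4.8%, although the maximum drawdown of -35.5% is still sizable.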
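The VaR and CVaR columns are historical statistics: the 5% quantile of monthly returns, and the average of the returns at or below that quantile. The sketch below reproduces that logic without pandas, using linear interpolation between order statistics (the default of `DataFrame.quantile`); the helper names and the toy return grid are hypothetical.

```python
import math

def historical_var(returns, q=0.05):
    """Empirical q-quantile with linear interpolation between order statistics."""
    ordered = sorted(returns)
    pos = (len(ordered) - 1) * q
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def historical_cvar(returns, q=0.05):
    """Mean of the returns at or below the historical VaR."""
    var = historical_var(returns, q)
    tail = [r for r in returns if r <= var]
    return sum(tail) / len(tail)

# 21 toy monthly returns from -10% to +10% in 1% steps.
toy_returns = [i / 100 for i in range(-10, 11)]
var_05 = historical_var(toy_returns)
cvar_05 = historical_cvar(toy_returns)
print(var_05, cvar_05)
```

On this grid the 5% VaR is the second-worst return (-9%) and the CVaR is the average of the two worst returns (-9.5%).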

### 2.2.b) Does that vary much across the two subsamples?

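Not much. The tail statistics are somewhat milder in 2012-2022 (minimum -11.9% versus -14.9%, VaR -3.9% versus -6.0%, maximum drawdown -21.7% versus -35.5%), but the overall picture does not change.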

### 2.3) For all three samples, regress excess returns of GMWAX on excess returns of SPY.

In [11]:
reg_sub_sample = []
for k,v in sub_samples.items():    
    fund_ret = gmo_excess_ret.loc[sub_samples[k][0]:sub_samples[k][1],['GMWAX']].dropna()
    factor = gmo_excess_ret.loc[fund_ret.index[0]:fund_ret.index[-1],['SPY']]
    reg = regression_based_performance(factor,fund_ret,0)
    beta_mkt = reg[0][0]
    treynor_ratio = reg[1]
    information_ratio = reg[2]
    alpha = reg[3]
    r_squared = reg[4]
    reg_sub_sample.append(pd.DataFrame([[beta_mkt,treynor_ratio,information_ratio,alpha,r_squared]],columns=['SPY Beta','Treynor Ratio','Information Ratio','Alpha','R-Squared'],index = ['GMWAX '+k]))

reg_performance = pd.concat(reg_sub_sample)


### 2.3.a) Report the estimated alpha, beta, and r-squared.

In [12]:
reg_performance.loc[:,['SPY Beta','Alpha','R-Squared']]

Sample,SPY Beta,Alpha,R-Squared
GMWAX 1993-2011,0.539615,-0.005748,0.507129
GMWAX 2012-2022,0.564932,-0.029664,0.751044
GMWAX 1993-2022,0.547133,-0.015048,0.577321


### 2.3.b) Is GMWAX a low-beta strategy? Has that changed since the case?

GMWAX has a moderate market beta of roughly 0.54-0.56, so it is not low enough to be considered a low-beta strategy. The beta has barely changed since the case (0.54 in 1993-2011 versus 0.56 in 2012-2022).

In [13]:
reg_performance.loc['GMWAX 2012-2022',['SPY Beta','Alpha','R-Squared']].to_frame().T

Sample,SPY Beta,Alpha,R-Squared
GMWAX 2012-2022,0.564932,-0.029664,0.751044


### 2.3.c) Does GMWAX provide alpha? Has that changed across the subsamples?

GMWAX has a negative alpha in both sub-samples, and the alpha has become more negative since the case: the annualized alpha falls from -0.6% in 1993-2011 to -3.0% in 2012-2022.

## 3) Forecast Regressions

#### This section utilizes data in the file,`gmo_analysis_data.xlsx`.

### 3.1) Consider the lagged regression, where the regressor, ($X$), is a period behind the target, ($r^{SPY}$).
\begin{align}
r^{SPY}_t = \alpha^{SPY,X}+(\beta^{SPY,X})'X_{t-1}+\epsilon^{SPY,X}_t
\end{align}
### Estimate (1) and report the $R^2$, as well as the OLS estimates for $\alpha$ and $\beta$. Do this for...
- $X$ as a single regressor, the dividend-price ratio.
- $X$ as a single regressor, the earnings-price ratio.
- $X$ as three regressors, the dividend-price ratio, the earnings-price ratio, and the 10-year yield.

### For each, report the r-squared.
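The only subtlety in the lagged regression is the alignment: the return at time $t$ is paired with the signal observed at $t-1$. The single-regressor case can be sketched without any libraries as follows; `lagged_ols` and the toy series are hypothetical, and the toy returns are constructed to be exactly linear in the lagged signal so the result is easy to verify.

```python
import statistics

def lagged_ols(signal, returns):
    """Regress returns[t] on signal[t-1]; return (alpha, beta, r_squared)."""
    x = signal[:-1]   # X_{t-1}
    y = returns[1:]   # r_t
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    ss_res = sum((b - alpha - beta * a) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return alpha, beta, 1.0 - ss_res / ss_tot

# Toy data: r_t = 0.01 + 0.002 * signal_{t-1}; the first return has no lagged signal.
signal = [1.0, 3.0, 2.0, 5.0, 4.0]
returns = [0.02] + [0.01 + 0.002 * s for s in signal[:-1]]
alpha, beta, r2 = lagged_ols(signal, returns)
print(alpha, beta, r2)
```

In the notebook the same alignment is obtained by calling `.shift(1)` on the signal before passing it to the regression, which also handles the multi-regressor case.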

In [14]:
fund_ret = gmo_total_ret.loc[:,['SPY']]
signals = [['DP'],['EP'],['DP','EP','US10Y']]
df_lst= []
for signal in signals:
    factor = gmo_signals.loc[:,signal].shift(1)
    reg = regression_based_performance(factor,fund_ret,0)
    reg_params = []
    columns=[]
    indexes = []
    if len(signal) > 1:
            for ele in signal:
                columns.append(ele+'-Beta')
                indexes.append(ele)
            index = ', '.join(map(str, indexes))
    else:
        columns.append(str(signal[0]) + '-Beta')
        index = signal[0]
    for i in range(len(signal)):
        reg_params.append(reg[0][i])
    reg_params.append(reg[3]) #alpha
    reg_params.append(reg[4]) #r-squared
    
    lst_col= ['Alpha','R-Squared']
    for col in lst_col:
        columns.append(col)
    df_lst.append(pd.DataFrame([reg_params],columns=columns,index = [index]))

In [15]:
df_lst[0]

Signal,DP-Beta,Alpha,R-Squared
DP,0.009516,-0.113772,0.009359


In [16]:
df_lst[1]

Signal,EP-Beta,Alpha,R-Squared
EP,0.003252,-0.073932,0.008692


In [17]:
df_lst[2]

Signal,DP-Beta,EP-Beta,US10Y-Beta,Alpha,R-Squared
"DP, EP, US10Y",0.008023,0.002694,-0.000982,-0.180768,0.016364


### 3.2) For each of the three regressions, let’s try to utilize the resulting forecast in a trading strategy.
- Build the forecasted SPY returns: $\hat{r}^{SPY}_{t+1}$. Note that this denotes the forecast made using $X_t$ to forecast the $(t+1)$ return.
- Set the scale of the investment in SPY equal to 100 times the forecasted value:
$
w_t = 100 \hat{r}^{SPY}_{t+1}
$
- We are not taking this scaling too seriously. We just want the strategy to go bigger in periods where the forecast is high and to withdraw in periods where the forecast is low, or even negative.
- Calculate the return on this strategy:
$
r^X_{t+1} = w_tr^{SPY}_{t+1}
$
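The weighting rule above can be traced on a few made-up numbers. This is only an illustrative sketch: `strategy_returns` and the toy forecasts are hypothetical, and each forecast is assumed to be already aligned with the return it predicts.

```python
def strategy_returns(forecasts, realized, scale=100.0):
    """Return (weights, strategy returns) with w_t = scale * forecast."""
    weights = [scale * f for f in forecasts]
    rets = [w * r for w, r in zip(weights, realized)]
    return weights, rets

toy_forecasts = [0.01, -0.005, 0.02]   # forecast of next month's SPY return
toy_realized = [0.03, 0.02, -0.01]     # realized SPY return in that month
weights, strat_rets = strategy_returns(toy_forecasts, toy_realized)
print(weights)
print(strat_rets)
```

A negative forecast produces a short position, so the strategy loses money in the second month even though SPY rose, and the large long position in the third month amplifies the loss when SPY falls.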

#### You should now have the trading strategy returns, $r^x$ for each of the forecasts. For each strategy, estimate:
- mean, volatility, Sharpe,
- max-drawdown
- market alpha
- market beta
- market Information ratio

In [18]:
dp_forecast_rtn = (gmo_signals.loc[:,'DP'].shift(1).to_frame() * df_lst[0]['DP-Beta'])+df_lst[0]['Alpha']/12
dp_forecast_rtn = dp_forecast_rtn.rename(columns={'DP':'Forecasted Return'}) * 100
dp_strat_rtn = pd.DataFrame(dp_forecast_rtn['Forecasted Return']*gmo_total_ret.loc[:,['SPY']]['SPY'], columns=dp_forecast_rtn.columns, index=dp_forecast_rtn.index)

In [19]:
ep_forecast_rtn = (gmo_signals.loc[:,'EP'].shift(1).to_frame() * df_lst[1]['EP-Beta'])+df_lst[1]['Alpha']/12
ep_forecast_rtn = ep_forecast_rtn.rename(columns={'EP':'Forecasted Return'}) * 100
ep_strat_rtn = pd.DataFrame(ep_forecast_rtn['Forecasted Return']*gmo_total_ret.loc[:,['SPY']]['SPY'], columns=ep_forecast_rtn.columns, index=ep_forecast_rtn.index)

In [20]:
forecasted_rets = (np.array(gmo_signals.shift(1).loc[:,['DP','EP','US10Y']]) @ np.array(df_lst[2].loc[:,['DP-Beta','EP-Beta','US10Y-Beta']].T))
fac3_forecast_rtn = (pd.DataFrame(forecasted_rets,columns = ['Forecasted Return'],index= gmo_signals.index)) 
fac3_forecast_rtn['Forecasted Return'] = (fac3_forecast_rtn['Forecasted Return'] + float(df_lst[2]['Alpha']/12))*100
fac3_strat_rtn = pd.DataFrame(fac3_forecast_rtn['Forecasted Return'] *gmo_total_ret.loc[:,['SPY']]['SPY'], columns=fac3_forecast_rtn.columns, index=fac3_forecast_rtn.index)

In [21]:
strats = {'DP': dp_strat_rtn.dropna(),
          'EP': ep_strat_rtn.dropna(),
          'DP-EP-US10Y': fac3_strat_rtn.dropna()
         }
factor = gmo_excess_ret.loc[:,['SPY']]
strat_summary =[]
for k,v in strats.items():
    strat = strats[k]
    perf_summary = performance_summary(strat)
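    # The 'Forecasted Return' column holds the strategy's realized return, so this counts months in which it falls below the risk-free rate
    perf_summary['Negative Risk Premium Months'] = len(strat[strat['Forecasted Return'] - rf['US3M'] <0])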
    perf_summary['Total Months'] = len(strat)
    perf_summary.index = [k]
    reg = regression_based_performance(factor[strat.index[0]:],strat,0)
    perf_summary['Market Beta'] = reg[0][0]
    perf_summary['Market Alpha'] = reg[3]
    perf_summary['Market Information Ratio'] = reg[2]
    strat_summary.append(perf_summary)
    

strat_summary_df = pd.concat(strat_summary)
strat_summary_df.loc[:,['Mean','Volatility','Sharpe Ratio','Max Drawdown','Market Beta','Market Alpha','Market Information Ratio']]

Strategy,Mean,Volatility,Sharpe Ratio,Max Drawdown,Market Beta,Market Alpha,Market Information Ratio
DP,0.109539,0.148858,0.735859,-0.656967,0.861746,0.041112,0.549044
EP,0.108055,0.128905,0.838249,-0.385317,0.733554,0.0498,0.732559
DP-EP-US10Y,0.125094,0.145603,0.859145,-0.524606,0.778129,0.0633,0.721235


### 3.3) GMO believes a risk premium is compensation for a security's tendency to lose money at "bad times". Let's consider risk characteristics.

### 3.3.a) For both strategies, the market, and GMO, calculate the monthly VaR for $\pi=.05$. Just use the quantile of the historic data for this VaR calculation.

In [22]:
market_summary = performance_summary(gmo_excess_ret.loc[:,['SPY']])
gmo_summary = performance_summary(gmo_excess_ret.loc[:,['GMWAX']].dropna())
strat_var= pd.concat([strat_summary_df.loc[:,['VaR (0.05)']],market_summary.loc[:,['VaR (0.05)']],gmo_summary.loc[:,['VaR (0.05)']]])
strat_var

Series,VaR (0.05)
DP,-0.052335
EP,-0.053892
DP-EP-US10Y,-0.064082
SPY,-0.073525
GMWAX,-0.047061


### 3.3.b) The GMO case mentions that stocks under-performed short-term bonds from 2000-2011. Does the dynamic portfolio above under-perform the risk-free rate over this time?

In [23]:
strats = {'DP': dp_strat_rtn.dropna(),
          'EP': ep_strat_rtn.dropna(),
          'DP-EP-US10Y': fac3_strat_rtn.dropna(),
          'Risk Free Rate': rf['US3M'].to_frame('Forecasted Return')
         }
strat_summary_0011 =[]
for k,v in strats.items():
    strat = (strats[k]['2000':'2011']['Forecasted Return']).to_frame('Forecasted Returns')
    perf_summary = performance_summary(strat)
    perf_summary.index = [k]
    strat_summary_0011.append(perf_summary)
    

strat_summary_df_0011 = pd.concat(strat_summary_0011)
strat_summary_df_0011.loc[:,['Mean','Volatility','Sharpe Ratio','Max Drawdown']]

Strategy,Mean,Volatility,Sharpe Ratio,Max Drawdown
DP,0.039709,0.186016,0.213468,-0.656967
EP,0.037709,0.134767,0.279805,-0.385317
DP-EP-US10Y,0.061471,0.158851,0.386972,-0.524606
Risk Free Rate,0.023062,0.005785,3.986632,0.0


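All of the dynamic portfolios outperform the risk-free rate over 2000-2011: their annualized mean returns are 4.0% (DP), 3.8% (EP), and 6.1% (DP-EP-US10Y), versus 2.3% for the risk-free rate.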

### 3.3.c) Based on the regression estimates, in how many periods do we estimate a negative risk premium?

In [24]:
neg_risk_premium = strat_summary_df.loc[:,['Negative Risk Premium Months','Total Months']]
neg_risk_premium['Negative Risk Premium Months (%)'] = neg_risk_premium['Negative Risk Premium Months'] *100/ neg_risk_premium['Total Months']
neg_risk_premium

Strategy,Negative Risk Premium Months,Total Months,Negative Risk Premium Months (%)
DP,139,368,37.771739
EP,139,368,37.771739
DP-EP-US10Y,138,368,37.5


### 3.3.d) Do you believe the dynamic strategy takes on extra risk?

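It depends. Comparing the mean and volatility of the dynamic strategies with SPY, they do not appear to take on extra risk: their means are higher, their volatilities are no higher, and their monthly 5% VaR is smaller in magnitude. However, each strategy shows a negative risk premium in about 38% of months, and the DP strategy's maximum drawdown (-65.7%) is deeper than SPY's (-56.0%), so the strategies still look somewhat risky.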

In [25]:
strat_summary_df.loc[:,['Mean','Volatility','Sharpe Ratio','VaR (0.05)','Max Drawdown','Market Beta','Market Alpha','Market Information Ratio']]

Strategy,Mean,Volatility,Sharpe Ratio,VaR (0.05),Max Drawdown,Market Beta,Market Alpha,Market Information Ratio
DP,0.109539,0.148858,0.735859,-0.052335,-0.656967,0.861746,0.041112,0.549044
EP,0.108055,0.128905,0.838249,-0.053892,-0.385317,0.733554,0.0498,0.732559
DP-EP-US10Y,0.125094,0.145603,0.859145,-0.064082,-0.524606,0.778129,0.0633,0.721235


In [26]:
market_summary.loc[:,['Mean','Volatility','Sharpe Ratio','VaR (0.05)','Max Drawdown']]

Asset,Mean,Volatility,Sharpe Ratio,VaR (0.05),Max Drawdown
SPY,0.07946,0.149097,0.532939,-0.073525,-0.560012


## 4) Out-of-Sample Forecasting

This section utilizes data in the file, `gmo_analysis_data.xlsx`.

Reconsider the problem above, of estimating (1) for $x$. The reported $R^2$ was the in-sample $R^2$: it examined how well the forecasts fit in the sample from which the parameters were estimated. <br><br>

**In particular, focus on the case of using both dividend-price and earnings-price as signals.**

Let's consider the out-of-sample r-squared. To do so, we need the following:
- Start at $t=60$.
- Estimate (1) only using data through time $t$.
- Use the estimated parameters of (1), along with $x_t$, to calculate the out-of-sample forecast for the following period, $t+1$.
\begin{align}
\hat{r}^{SPY}_{t+1} = \hat{\alpha}^{SPY,x}_t+(\hat{\beta}^{SPY,x}_t)'x_t 
\end{align}
- Calculate the $t+1$ forecast error,
\begin{align}
  e^x_{t+1} = r^{SPY}_{t+1} - \hat{r}^{SPY}_{t+1}
\end{align}
- Move to $t=61$, and loop through the rest of the sample.

You now have the time-series of out-of-sample prediction errors, $e^x$.

Calculate the time-series of out-of-sample prediction errors $e^0$, which are based on the null forecast:
\begin{align*}
\bar{r}^{SPY}_{t+1} &= \frac{1}{t}\sum^{t}_{i=1}r^{SPY}_i \\
e^0_{t+1} &= r^{SPY}_{t+1} - \bar{r}^{SPY}_{t+1}
\end{align*}


In [27]:
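def OOS_r2(df, factors, start):
    """ 
        Returns the out-of-sample R-squared of an expanding-window forecast regression
        Inputs:
            df - DataFrame with a 'SPY' column of monthly returns
            factors - DataFrame of regressors, already lagged by one period
            start - number of initial observations reserved for the first estimation window
        Output:
            (OOS R-squared, fitted model from the last estimation window)
    """
    y = df['SPY']
    X = sm.add_constant(factors)

    forecast_err, null_err = [], []

    for i,j in enumerate(df.index):
        if i >= start:
            # Estimate using only the observations before period i
            currX = X.iloc[:i]
            currY = y.iloc[:i]
            reg = sm.OLS(currY, currX, missing = 'drop').fit()
            # Null forecast: expanding mean of past returns
            null_forecast = currY.mean()
            reg_predict = reg.predict(X.iloc[[i]])
            actual = y.iloc[[i]]
            forecast_err.append(reg_predict - actual)
            null_err.append(null_forecast - actual)
            
    # Sums of squared forecast errors for the model and for the null forecast
    RSS = (np.array(forecast_err)**2).sum()
    TSS = (np.array(null_err)**2).sum()
    
    return ((1 - RSS/TSS),reg)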

In [28]:
factor = gmo_signals.loc[:,'EP'].shift(1).to_frame()
fund_ret = gmo_total_ret.loc[factor.index[0]:,['SPY']]
reg_ep = OOS_r2(fund_ret,factor,60)
OOS_RSquared_ep = reg_ep[0]
OOS_r2_ep = pd.DataFrame([[OOS_RSquared_ep]], columns = ['OOS R-Squared'], index = ['EP'])
reg_ep_params = reg_ep[1]

In [29]:
factor = gmo_signals.loc[:,'DP'].shift(1).to_frame()
fund_ret = gmo_total_ret.loc[factor.index[0]:,['SPY']]
reg_dp = OOS_r2(fund_ret,factor,60)
OOS_RSquared_dp = reg_dp[0]
OOS_r2_dp = pd.DataFrame([[OOS_RSquared_dp]], columns = ['OOS R-Squared'], index = ['DP'])
reg_dp_params = reg_dp[1]

In [30]:
factor = gmo_signals.loc[:,['DP','EP']].shift(1)
fund_ret = gmo_total_ret.loc[factor.index[0]:,['SPY']]
reg_epdp = OOS_r2(fund_ret,factor,60)
OOS_r2_epdp  = reg_epdp[0]
OOS_r2_epdp = pd.DataFrame([[OOS_r2_epdp]], columns = ['OOS R-Squared'], index = ['DP-EP'])
reg_epdp_params = reg_epdp[1]

In [31]:
factor = gmo_signals.loc[:,['DP','EP','US10Y']].shift(1)
fund_ret = gmo_total_ret.loc[factor.index[0]:,['SPY']]
reg_all = OOS_r2(fund_ret,factor,60)
OOS_RSquared_all  = reg_all[0]
OOS_r2_all = pd.DataFrame([[OOS_RSquared_all]], columns = ['OOS R-Squared'], index = ['All'])
reg_all_params = reg_all[1]

### 4.1) Report the out-of-sample $R^2$:
\begin{align}
 R^2_{OOS} \equiv 1-\frac{\sum^T_{i=61}(e^x_i)^2}{\sum^T_{i=61}(e^0_i)^2} 
\end{align}
### Note that unlike an in-sample r-squared, the out-of-sample r-squared can be anywhere in $(-\infty,1]$.
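The expanding-window procedure and the $R^2_{OOS}$ formula can be checked on a toy series where the answer is known. The sketch below is a dependency-free, single-regressor illustration; `fit_line`, `oos_r_squared`, and the toy data are hypothetical and are not the notebook's `OOS_r2`. The toy returns are exactly linear in the lagged signal, so the model's forecast errors vanish and $R^2_{OOS}$ should equal 1.

```python
import statistics

def fit_line(x, y):
    """Simple OLS of y on x; return (alpha, beta)."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    beta = sxy / sxx
    return mean_y - beta * mean_x, beta

def oos_r_squared(signal, returns, start):
    """1 - (sum of squared model errors) / (sum of squared null errors)."""
    sq_model, sq_null = 0.0, 0.0
    for t in range(start, len(returns)):
        # Estimate on pairs (signal[s-1], returns[s]) for s = 1, ..., t-1 only.
        alpha, beta = fit_line(signal[:t - 1], returns[1:t])
        forecast = alpha + beta * signal[t - 1]
        null_forecast = statistics.fmean(returns[:t])  # expanding mean of past returns
        sq_model += (returns[t] - forecast) ** 2
        sq_null += (returns[t] - null_forecast) ** 2
    return 1.0 - sq_model / sq_null

signal = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 3.0, 5.0]
returns = [0.02] + [0.01 + 0.002 * s for s in signal[:-1]]
r2_oos = oos_r_squared(signal, returns, start=3)
print(r2_oos)
```

With an uninformative signal the model errors can exceed the null errors, which is how a negative $R^2_{OOS}$ arises.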

In [32]:
oos_r2_sum = pd.concat([OOS_r2_dp,OOS_r2_ep,OOS_r2_epdp,OOS_r2_all])
oos_r2_sum

Signal,OOS R-Squared
DP,-0.002074
EP,-0.006394
DP-EP,-0.017227
All,-0.030651


### Did this forecasting strategy produce a positive OOS r-squared?

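No. All four specifications produced a negative OOS r-squared (from -0.002 for DP to -0.031 for all three signals), so the forecasts did worse than the expanding-mean null forecast.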


### 4.2) Re-do problem 3.2 using this OOS forecast. <br><br> How much better/worse is the OOS Earnings-Price ratio strategy compared to the in-sample version of 3.2?

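The out-of-sample EP strategy performs worse than the in-sample version: the Sharpe ratio falls from 0.84 to 0.50, volatility rises from 12.9% to 16.4%, the maximum drawdown deepens from -38.5% to -58.4%, and the information ratio drops from 0.73 to 0.33.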

In [33]:
def OOS_strat(df, factors, start, weight):
    returns = []
    y = df['SPY']
    X = sm.add_constant(factors)

    for i,j in enumerate(df.index):
        if i >= start:
            currX = X.iloc[:i]
            currY = y.iloc[:i]
            reg = sm.OLS(currY, currX, missing = 'drop').fit()
            pred = reg.predict(X.iloc[[i]])
            w = pred * weight
            returns.append((df.iloc[i]['SPY'] * w)[0])

    df_strat = pd.DataFrame(data = returns, index = df.iloc[-(len(returns)):].index, columns = ['Strat Returns'])
    return df_strat

In [34]:
factor = gmo_signals.loc[:,'EP'].shift(1).to_frame()
fund_ret= gmo_total_ret.loc[factor.index[0]:,['SPY']]
OOS_EP_predict = OOS_strat(fund_ret,factor, 60, 100).rename(columns={'Strat Returns':'EP_OOS_Returns'})

In [35]:
factor = gmo_signals.loc[:,'DP'].shift(1).to_frame()
fund_ret= gmo_total_ret.loc[factor.index[0]:,['SPY']]
OOS_DP_predict = OOS_strat(fund_ret,factor, 60, 100).rename(columns={'Strat Returns':'DP_OOS_Returns'})

In [36]:
factor = gmo_signals.loc[:,['DP','EP']].shift(1)
fund_ret= gmo_total_ret.loc[factor.index[0]:,['SPY']]
OOS_EPDP_predict = OOS_strat(fund_ret,factor, 60, 100).rename(columns={'Strat Returns':'DP-EP_OOS_Returns'})

In [37]:
factor = gmo_signals.loc[:,['DP','EP','US10Y']].shift(1)
fund_ret= gmo_total_ret.loc[factor.index[0]:,['SPY']]
OOS_all_predict = OOS_strat(fund_ret,factor, 60, 100).rename(columns={'Strat Returns':'All_OOS_Returns'})

In [38]:
oos_prediction_sum = pd.concat([OOS_DP_predict.T,OOS_EP_predict.T,OOS_all_predict.T])
oos_prediction_sum = oos_prediction_sum.T

strats = {'DP': OOS_DP_predict.dropna(),
          'EP': OOS_EP_predict.dropna(),
          'DP-EP':OOS_EPDP_predict.dropna(),
          'All': OOS_all_predict.dropna(),
          'SPY':gmo_excess_ret.loc[OOS_all_predict.index[0]:,['SPY']].rename(columns={'SPY':'SPY_OOS_Returns'}),
          'US3M':rf['US3M'].to_frame('US3M_OOS_Returns')
         }
factor = gmo_excess_ret.loc[:,['SPY']]
strat_summary =[]
for k,v in strats.items():
    strat = strats[k]
    perf_summary = performance_summary(strat)
    perf_summary['Negative Risk Premium Months'] = len(strat[strat[k+'_OOS_Returns'] - rf['US3M'] <0])
    perf_summary['Total Months'] = len(strat)
    perf_summary.index = [k]
    reg = regression_based_performance(factor[strat.index[0]:],strat,0)
    perf_summary['Market Beta'] = reg[0][0]
    perf_summary['Market Alpha'] = reg[3]
    perf_summary['Market Information Ratio'] = reg[2]
    strat_summary.append(perf_summary)
    

strat_summary_df = pd.concat(strat_summary)
strat_summary_df.loc[:,['Mean','Volatility','Sharpe Ratio','VaR (0.05)','Max Drawdown','Market Beta','Market Alpha','Market Information Ratio']]

Strategy,Mean,Volatility,Sharpe Ratio,VaR (0.05),Max Drawdown,Market Beta,Market Alpha,Market Information Ratio
DP,0.079626,0.173732,0.458326,-0.071179,-0.551925,0.99449,0.012756,0.16344
EP,0.082373,0.163741,0.503066,-0.068431,-0.583693,0.54966,0.04542,0.325618
DP-EP,0.096815,0.226111,0.428174,-0.071698,-0.76091,0.469532,0.065244,0.305011
All,0.112022,0.247893,0.451897,-0.071882,-0.804959,0.490154,0.079068,0.335312
SPY,0.067239,0.15607,0.430826,-0.080066,-0.560012,1.0,0.0,0.733621
US3M,0.023803,0.006165,3.860934,2e-05,0.0,-0.001179,0.023892,3.877706


### 4.3) Re-do problem 3.3 using this OOS forecast. <br><br> Is the point-in-time version of the strategy riskier?

Compared to the full out-of-sample period, mean returns drop sharply during 2000-2011 (for example, EP falls from 8.2% to 3.9% and DP from 8.0% to -1.1%). Volatility also rises for every strategy except DP. Hence, the strategy does seem riskier.

In [39]:
oos_prediction_sum = pd.concat([OOS_DP_predict.T,OOS_EP_predict.T,OOS_all_predict.T])
oos_prediction_sum = oos_prediction_sum.T

strats = {'DP': OOS_DP_predict.dropna(),
          'EP': OOS_EP_predict.dropna(),
          'DP-EP':OOS_EPDP_predict.dropna(),
          'All': OOS_all_predict.dropna(),
          'US3M':rf['US3M'].to_frame('US3M_OOS_Returns')
         }
factor = gmo_excess_ret.loc[:,['SPY']]['2000':'2011']
strat_summary =[]
for k,v in strats.items():
    strat = strats[k]['2000':'2011']
    perf_summary = performance_summary(strat)
    perf_summary['Negative Risk Premium Months'] = len(strat[strat[k+'_OOS_Returns'] - rf['2000':'2011']['US3M'] <0])
    perf_summary['Total Months'] = len(strat)
    perf_summary.index = [k]
    reg = regression_based_performance(factor[strat.index[0]:],strat,0)
    perf_summary['Market Beta'] = reg[0][0]
    perf_summary['Market Alpha'] = reg[3]
    perf_summary['Market Information Ratio'] = reg[2]
    strat_summary.append(perf_summary)
    

strat_summary_df_0011 = pd.concat(strat_summary)
strat_summary_df_0011.loc[:,['Mean','Volatility','Sharpe Ratio','VaR (0.05)','Max Drawdown','Market Beta','Market Alpha','Market Information Ratio']]

Strategy,Mean,Volatility,Sharpe Ratio,VaR (0.05),Max Drawdown,Market Beta,Market Alpha,Market Information Ratio
DP,-0.010895,0.163221,-0.066749,-0.09473,-0.551925,0.952248,-0.006228,-0.124928
EP,0.038768,0.195919,0.197877,-0.085329,-0.583693,0.296065,0.040224,0.211832
DP-EP,0.04329,0.290943,0.148793,-0.100111,-0.76091,0.076108,0.043668,0.150212
All,0.084077,0.328892,0.255636,-0.091413,-0.804959,0.114138,0.084636,0.257752
US3M,0.023062,0.005785,3.986632,3.5e-05,0.0,-0.002853,0.023052,3.997186


In [40]:
neg_risk_premium = strat_summary_df.loc[:,['Negative Risk Premium Months','Total Months']]
neg_risk_premium['Negative Risk Premium Months (%)'] = neg_risk_premium['Negative Risk Premium Months'] *100/ neg_risk_premium['Total Months']
neg_risk_premium

Strategy,Negative Risk Premium Months,Total Months,Negative Risk Premium Months (%)
DP,122,309,39.482201
EP,120,309,38.834951
DP-EP,121,309,39.158576
All,117,309,37.864078
SPY,125,309,40.453074
US3M,0,369,0.0
