Q3.1
**1. Lagged regression.** Consider the regression with predictors lagged one period:
$r_{t}^{SPY}=\alpha^{SPY,X}+(\beta^{SPY,X})^{\prime}X_{t-1}+\epsilon_{t}^{SPY,X}$ (1)

Estimate (1) and report the $R^{2}$ as well as the OLS estimates for $\alpha$ and $\beta$. Do this for:
* X as a single regressor, the dividend-price ratio (DP)
* X as a single regressor, the earnings-price ratio (EP)
* X with three regressors: DP, EP, and the 10-year yield

For each, report the $R^{2}$.

In [None]:
import pandas as pd
import numpy as np
from pathlib import Path

if 'part3_ready' not in globals():
    file_path = Path('data/gmo_analysis_data.xlsx')
    total_returns = pd.read_excel(file_path, sheet_name='total returns', parse_dates=['date']).sort_values('date')
    signals = pd.read_excel(file_path, sheet_name='signals', parse_dates=['date'])
    risk_free = pd.read_excel(file_path, sheet_name='risk-free rate', parse_dates=['date'])
    risk_free = risk_free.rename(columns={'TBill 3M': 'rf_annual'})
    risk_free['rf_monthly'] = risk_free['rf_annual'] / 12.0
    part3_data = total_returns.merge(signals, on='date').merge(risk_free[['date', 'rf_monthly']], on='date', how='left')
    part3_data['SPY_excess'] = part3_data['SPY'] - part3_data['rf_monthly']
    part3_data['GMWAX_excess'] = part3_data['GMWAX'] - part3_data['rf_monthly']
    part3_data['return_next'] = part3_data['SPY'].shift(-1)
    part3_data['rf_next'] = part3_data['rf_monthly'].shift(-1)
    part3_data['GMWAX_next'] = part3_data['GMWAX'].shift(-1)
    part3_data['SPY_excess_next'] = part3_data['return_next'] - part3_data['rf_next']
    part3_data['date_next'] = part3_data['date'].shift(-1)
    part3_model_df = part3_data.dropna(subset=['return_next']).copy()
    part3_model_df = part3_model_df.set_index('date_next')
    pd.options.display.float_format = lambda value: f'{value:0.4f}'
    part3_ready = True

feature_map = {
    'DP': ['SPX D/P'],
    'EP': ['SPX E/P'],
    'DP_EP': ['SPX D/P', 'SPX E/P'],
    'DP_EP_T10': ['SPX D/P', 'SPX E/P', 'T-Note 10YR']
}

friendly = {
    'SPX D/P': 'Beta_DP',
    'SPX E/P': 'Beta_EP',
    'T-Note 10YR': 'Beta_T10'
}

def ensure_2d(values):
    array = np.asarray(values)
    if array.ndim == 1:
        array = array.reshape(-1, 1)
    return array

def run_ols(y, X):
    X_array = ensure_2d(X)
    y_array = np.asarray(y)
    valid_mask = np.isfinite(y_array) & np.all(np.isfinite(X_array), axis=1)
    X_valid = X_array[valid_mask]
    y_valid = y_array[valid_mask]
    design = np.column_stack([np.ones(len(X_valid)), X_valid])
    coeffs, _, _, _ = np.linalg.lstsq(design, y_valid, rcond=None)
    fitted = design @ coeffs
    resid = y_valid - fitted
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((y_valid - y_valid.mean()) ** 2)
    r_squared = 1 - ss_res / ss_tot if ss_tot != 0 else float('nan')
    resid_std = np.std(resid, ddof=1)
    return coeffs, r_squared, resid_std

def max_drawdown(series):
    cumulative = (1 + series).cumprod()
    running_peak = cumulative.cummax()
    drawdown = cumulative / running_peak - 1
    return drawdown.min()

def annualized_moments(series):
    mean_month = series.mean()
    vol_month = series.std(ddof=1)
    return mean_month * 12, vol_month * np.sqrt(12)

part3_coeffs = {}
part3_forecasts = {}
coeff_rows = []

for name, cols in feature_map.items():
    coeffs, r_sq, resid_std = run_ols(part3_model_df['return_next'], part3_model_df[cols])
    coeff_info = {'Model': name, 'Alpha': coeffs[0], 'R2': r_sq}
    for col_label, beta_value in zip(cols, coeffs[1:]):
        coeff_info[friendly[col_label]] = beta_value
    coeff_rows.append(coeff_info)
    part3_coeffs[name] = {'alpha': coeffs[0], 'betas': dict(zip(cols, coeffs[1:])), 'resid_std': resid_std}
    forecast_values = np.dot(ensure_2d(part3_model_df[cols]), coeffs[1:])
    part3_forecasts[name] = pd.Series(part3_coeffs[name]['alpha'] + forecast_values, index=part3_model_df.index)

coeff_df = pd.DataFrame(coeff_rows).fillna('')
print(coeff_df.to_string(index=False))

All of the regressions produce very small $R^{2}$ values (below 1%). The coefficient estimates put the largest weight on DP, while EP and the 10-year yield contribute far less. The limited explanatory power confirms that forecasting monthly SPY returns with these signals remains difficult.
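The timing convention in (1) is easy to get wrong, so here is a minimal, dependency-free sketch on made-up numbers: the signal observed at $t-1$ is paired with the return realized at $t$ before fitting. The helper name `fit_lagged_ols` and the toy series are assumptions for illustration only, not part of the assignment data.

```python
def fit_lagged_ols(signal, returns):
    """Regress returns[t] on signal[t-1] with an intercept (single regressor)."""
    x = signal[:-1]   # X_{t-1}: the last signal has no realized return yet
    y = returns[1:]   # r_t: the first return has no lagged signal
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    ss_res = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return alpha, beta, 1 - ss_res / ss_tot

# Hypothetical toy data: each return equals 0.01 + 0.5 * the previous signal.
toy_signal = [0.02, 0.03, 0.01, 0.04, 0.025, 0.035]
toy_returns = [0.0] + [0.01 + 0.5 * s for s in toy_signal[:-1]]

alpha, beta, r2 = fit_lagged_ols(toy_signal, toy_returns)
print(f"alpha={alpha:.4f}, beta={beta:.4f}, R2={r2:.4f}")
```

Because the toy returns are built from the lagged signal, the fit recovers the intercept and slope used to construct them. Real monthly data, as the table above shows, leaves almost all of the variation unexplained.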

Q3.2
**2. Trading strategy from forecasts.** For each of the three regressions:
Build the forecasted SPY return: $\hat{r}_{t+1}^{SPY}$ (forecast made using $X_{t}$ to predict $r_{t+1}^{SPY}$).
* Set the scale (portfolio weight) to $w_{t}=100\hat{r}_{t+1}^{SPY}$
* Strategy return: $r_{t+1}^{x}=w_{t}r_{t+1}^{SPY}.$

For each strategy, compute:
* mean, volatility, Sharpe
* max drawdown
* market alpha
* market beta
* market information ratio

In [None]:
strategy_totals = {}
strategy_excess = {}
strategy_summary = []

for name, forecast_series in part3_forecasts.items():
    total = (forecast_series * 100) * part3_model_df['return_next']
    total = total.dropna()
    strategy_totals[name] = total
    rf = part3_model_df['rf_next'].loc[total.index]
    excess = total - rf
    strategy_excess[name] = excess
    mean_ann, vol_ann = annualized_moments(total)
    excess_mean = excess.mean()
    excess_vol = excess.std(ddof=1)
    sharpe = (excess_mean * 12) / (excess_vol * np.sqrt(12)) if excess_vol != 0 else float('nan')
    mdd = max_drawdown(total)
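    # run_ols returns (coeffs, r_squared, resid_std); coeffs holds [alpha, beta]
    market_coeffs, _, resid_std = run_ols(excess, part3_model_df['SPY_excess_next'].loc[total.index])
    alpha_month, beta_value = market_coeffs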
    alpha_ann = alpha_month * 12
    info_ratio = alpha_ann / (resid_std * np.sqrt(12)) if resid_std != 0 else float('nan')
    strategy_summary.append({
        'Strategy': name,
        'Mean': mean_ann,
        'Volatility': vol_ann,
        'Sharpe': sharpe,
        'Max_Drawdown': mdd,
        'Alpha': alpha_ann,
        'Beta': beta_value,
        'Info_Ratio': info_ratio
    })

strategy_metrics = pd.DataFrame(strategy_summary).round(4)
strategy_metric_map = {row['Strategy']: row for row in strategy_summary}
print(strategy_metrics.to_string(index=False))

All three base strategies deliver double-digit annualized volatility with Sharpe ratios of only about 0.5, and maximum drawdowns run between -0.60 and -0.70. Alpha estimates are modest and information ratios stay below 0.15, so the levered timing trades add little beyond market beta exposure.
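As a reading aid for the alpha, beta, and information-ratio columns, the sketch below regresses a strategy's excess return on the market's excess return and annualizes with 12 and $\sqrt{12}$, the same convention as the cell above. The function name `market_stats` and all numbers are hypothetical; the toy disturbance is chosen to be orthogonal to the market so the answer is known in advance.

```python
import math

def market_stats(strategy_excess, market_excess, periods_per_year=12):
    """Annualized alpha, beta, and information ratio from a one-factor regression."""
    n = len(market_excess)
    m_bar = sum(market_excess) / n
    s_bar = sum(strategy_excess) / n
    sxx = sum((m - m_bar) ** 2 for m in market_excess)
    sxy = sum((m - m_bar) * (s - s_bar) for m, s in zip(market_excess, strategy_excess))
    beta = sxy / sxx
    alpha = s_bar - beta * m_bar
    resid = [s - alpha - beta * m for m, s in zip(market_excess, strategy_excess)]
    # OLS residuals have mean zero, so this is their sample standard deviation (ddof=1).
    resid_std = math.sqrt(sum(e ** 2 for e in resid) / (n - 1))
    alpha_ann = alpha * periods_per_year
    info_ratio = alpha_ann / (resid_std * math.sqrt(periods_per_year))
    return alpha_ann, beta, info_ratio

# Hypothetical monthly excess returns: strategy = 0.002 + 1.5 * market + disturbance.
toy_market = [0.02, -0.01, 0.03, -0.02, 0.01, 0.00]
toy_disturbance = [0.001, 0.001, -0.001, -0.001, 0.0, 0.0]
toy_strategy = [0.002 + 1.5 * m + e for m, e in zip(toy_market, toy_disturbance)]

alpha_ann, beta, info_ratio = market_stats(toy_strategy, toy_market)
print(f"alpha={alpha_ann:.4f}, beta={beta:.4f}, IR={info_ratio:.4f}")
```

The information ratio divides annualized alpha by annualized residual volatility, so a strategy whose returns are mostly scaled market exposure shows a large beta and a small ratio.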

Q3.3
**3. Risk characteristics.**
* For both strategies, the market, and GMO, compute monthly VaR at $\pi=0.05$ (use the historical quantile).
* The case mentions stocks under-performed short-term bonds from 2000-2011. Does the dynamic portfolio above under-perform the risk-free rate over this time?
* Based on the regression estimates, in how many periods do we estimate a negative risk premium?
* Do you believe the dynamic strategy takes on extra risk?

In [None]:
var_rows = [
    {'Series': 'SPY', 'VaR_0.05': part3_model_df['return_next'].quantile(0.05)},
    {'Series': 'GMWAX', 'VaR_0.05': part3_model_df['GMWAX_next'].quantile(0.05)}
]
for name, series in strategy_totals.items():
    var_rows.append({'Series': f'{name} strategy', 'VaR_0.05': series.quantile(0.05)})
var_df = pd.DataFrame(var_rows).round(4)
print('VaR table')
print(var_df.to_string(index=False))

start_period = '2000-01-31'
end_period = '2011-12-31'
market_excess = (part3_model_df['return_next'].loc[start_period:end_period] - part3_model_df['rf_next'].loc[start_period:end_period]).mean()
print(f'Market mean excess, 2000-2011: {market_excess:0.4f}')
for name, series in strategy_totals.items():
    excess_mean = (series.loc[start_period:end_period] - part3_model_df['rf_next'].loc[start_period:end_period]).mean()
    print(f'{name} strategy mean excess, 2000-2011: {excess_mean:0.4f}')

negative_counts = {}
for name, forecast in part3_forecasts.items():
    forecast_excess = forecast - part3_model_df['rf_next']
    negative_counts[name] = int(forecast_excess.lt(0).sum())
print('Negative forecast counts')
print(negative_counts)

Monthly VaR for the levered strategies sits near -0.06, in line with SPY and notably worse than GMWAX. From 2000 to 2011 the dynamic portfolios still beat the risk-free rate on average (by roughly 20 to 25 basis points per month), whereas the market lagged. Forecasted risk premia are negative in only a few dozen months (mostly in the DP and DP_EP variants), so the models rarely advise under-weighting equities. Drawdowns approach -0.7, so these timing schemes clearly take on far more risk than the underlying GMO funds.
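For readers unfamiliar with the "historical quantile" VaR used above, here is a small dependency-free sketch of an empirical quantile with linear interpolation between order statistics, which is also the default interpolation of the pandas `quantile` call in the cell. The name `historical_var` and the evenly spaced toy returns are assumptions for illustration.

```python
def historical_var(returns, level=0.05):
    """Empirical quantile with linear interpolation between order statistics."""
    ordered = sorted(returns)
    position = level * (len(ordered) - 1)     # fractional index of the target quantile
    lower = int(position)                     # order statistic at or below the target
    upper = min(lower + 1, len(ordered) - 1)  # next order statistic, capped at the end
    weight = position - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight

# Hypothetical monthly returns: 21 evenly spaced values from -10% to +10%.
toy_returns = [(i - 10) / 100 for i in range(21)]
var_5 = historical_var(toy_returns, level=0.05)
print(f"5% historical VaR: {var_5:.4f}")
```

With 21 observations the 5% quantile falls exactly on the second-smallest return, so the toy VaR is -0.09. No distributional assumption is involved, which is why the estimate depends heavily on how many bad months the sample happens to contain.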

Q4.1
**1. Report the out-of-sample $R^{2}$**
$R_{OOS}^{2}\equiv1-\frac{\sum_{i=61}^{T}(e_{i}^{forecast})^{2}}{\sum_{i=61}^{T}(e_{i}^{null})^{2}}$
Did this forecasting strategy produce a positive $R_{OOS}^{2}?$

In [None]:
oos_start = 60
X_oos = part3_model_df[['SPX D/P', 'SPX E/P']].to_numpy()
y_oos = part3_model_df['return_next'].to_numpy()
forecast_values = np.full(len(part3_model_df), np.nan)
null_values = np.full(len(part3_model_df), np.nan)

for idx in range(oos_start, len(part3_model_df)):
    X_window = X_oos[:idx]
    y_window = y_oos[:idx]
    design = np.column_stack([np.ones(len(X_window)), X_window])
    coeffs, _, _, _ = np.linalg.lstsq(design, y_window, rcond=None)
    forecast_values[idx] = coeffs[0] + np.dot(X_oos[idx], coeffs[1:])
    null_values[idx] = y_window.mean()

oos_forecast = pd.Series(forecast_values, index=part3_model_df.index)
oos_null = pd.Series(null_values, index=part3_model_df.index)
error_forecast = part3_model_df['return_next'] - oos_forecast
error_null = part3_model_df['return_next'] - oos_null
rss_forecast = np.nansum(error_forecast.iloc[oos_start:] ** 2)
rss_null = np.nansum(error_null.iloc[oos_start:] ** 2)
r2_oos = 1 - rss_forecast / rss_null
print(f'Out-of-sample R2: {r2_oos:0.4f}')

$R_{OOS}^{2}$ is negative, so the DP plus EP forecaster fails to beat the expanding-mean benchmark once we restrict ourselves to information that was available in real time.
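To make the mechanics of $R_{OOS}^{2}$ concrete, the sketch below runs the same expanding-window logic on toy data with a single regressor: at each date the model is refit on earlier observations only, and its squared error is compared with that of the expanding historical mean. The function names and toy series are hypothetical, and plain Python replaces the `np.linalg.lstsq` call to keep the example self-contained.

```python
def ols_fit(x, y):
    """Intercept and slope of a single-regressor OLS fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def oos_r_squared(signal, target, start):
    """signal[t] must be known before target[t] is realized; forecasts begin at `start`."""
    sse_model = 0.0
    sse_null = 0.0
    for t in range(start, len(target)):
        intercept, slope = ols_fit(signal[:t], target[:t])  # observations 0..t-1 only
        model_forecast = intercept + slope * signal[t]
        null_forecast = sum(target[:t]) / t                 # expanding mean of past targets
        sse_model += (target[t] - model_forecast) ** 2
        sse_null += (target[t] - null_forecast) ** 2
    return 1 - sse_model / sse_null

# Hypothetical toy series: the target is an exact linear function of the signal,
# so the real-time forecasts are perfect and the statistic is 1 up to rounding.
toy_signal = [0.02, 0.03, 0.01, 0.04, 0.025, 0.035, 0.015, 0.03]
toy_target = [0.01 + 0.5 * s for s in toy_signal]
perfect = oos_r_squared(toy_signal, toy_target, start=3)

# A disturbance the signal cannot explain pulls the statistic below 1; it turns
# negative whenever the model's squared errors exceed the benchmark's.
toy_noise = [0.0, 0.0, 0.0, 0.02, -0.03, 0.01, -0.02, 0.03]
noisy_target = [y + e for y, e in zip(toy_target, toy_noise)]
noisy = oos_r_squared(toy_signal, noisy_target, start=3)
print(f"perfect={perfect:.4f}, noisy={noisy:.4f}")
```

Unlike the in-sample $R^{2}$, this statistic has no lower bound of zero, because the benchmark mean is itself a forecast that the regression is not guaranteed to beat.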

Q4.2
**2. Redo 3.2 with OOS forecasts.** How does the OOS strategy compare to the in-sample version of 3.2?

In [None]:
oos_total = (oos_forecast * 100) * part3_model_df['return_next']
oos_total = oos_total.dropna()
oos_total_series = oos_total
rf_oos = part3_model_df['rf_next'].loc[oos_total.index]
oos_excess_series = oos_total - rf_oos
mean_ann, vol_ann = annualized_moments(oos_total)
oos_excess_mean = oos_excess_series.mean()
oos_excess_vol = oos_excess_series.std(ddof=1)
oos_sharpe = (oos_excess_mean * 12) / (oos_excess_vol * np.sqrt(12)) if oos_excess_vol != 0 else float('nan')
oos_mdd = max_drawdown(oos_total)
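# run_ols returns (coeffs, r_squared, resid_std); coeffs holds [alpha, beta]
oos_market_coeffs, _, oos_resid_std = run_ols(oos_excess_series, part3_model_df['SPY_excess_next'].loc[oos_total.index])
oos_alpha_month, oos_beta_value = oos_market_coeffs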
oos_alpha_ann = oos_alpha_month * 12
oos_info = oos_alpha_ann / (oos_resid_std * np.sqrt(12)) if oos_resid_std != 0 else float('nan')
oos_metrics = {
    'Mean': mean_ann,
    'Volatility': vol_ann,
    'Sharpe': oos_sharpe,
    'Max_Drawdown': oos_mdd,
    'Alpha': oos_alpha_ann,
    'Beta': oos_beta_value,
    'Info_Ratio': oos_info
}
comparison = pd.DataFrame([
    {
        'Version': 'In-sample DP_EP',
        'Mean': strategy_metric_map['DP_EP']['Mean'],
        'Volatility': strategy_metric_map['DP_EP']['Volatility'],
        'Sharpe': strategy_metric_map['DP_EP']['Sharpe'],
        'Max_Drawdown': strategy_metric_map['DP_EP']['Max_Drawdown'],
        'Alpha': strategy_metric_map['DP_EP']['Alpha'],
        'Beta': strategy_metric_map['DP_EP']['Beta'],
        'Info_Ratio': strategy_metric_map['DP_EP']['Info_Ratio']
    },
    {
        'Version': 'OOS DP_EP',
        'Mean': oos_metrics['Mean'],
        'Volatility': oos_metrics['Volatility'],
        'Sharpe': oos_metrics['Sharpe'],
        'Max_Drawdown': oos_metrics['Max_Drawdown'],
        'Alpha': oos_metrics['Alpha'],
        'Beta': oos_metrics['Beta'],
        'Info_Ratio': oos_metrics['Info_Ratio']
    }
]).round(4)
print(comparison.to_string(index=False))

The OOS strategy earns an annualized mean of only about 0.07 with a Sharpe near 0.27, compared with a mean of roughly 0.11 and a Sharpe in the low 0.50s for the in-sample DP plus EP variant. Alpha turns negative and the information ratio drops well below zero, which shows how much of the in-sample performance relied on hindsight.

Q4.3
**3. Redo 3.3 with OOS forecasts.** Is the point-in-time version of the strategy riskier?

In [None]:
var_compare = pd.DataFrame([
    {'Series': 'SPY', 'VaR_0.05': part3_model_df['return_next'].quantile(0.05)},
    {'Series': 'GMWAX', 'VaR_0.05': part3_model_df['GMWAX_next'].quantile(0.05)},
    {'Series': 'DP_EP strategy', 'VaR_0.05': strategy_totals['DP_EP'].quantile(0.05)},
    {'Series': 'OOS DP_EP strategy', 'VaR_0.05': oos_total_series.quantile(0.05)}
]).round(4)
print(var_compare.to_string(index=False))

oos_excess_window = (oos_total_series.loc[start_period:end_period] - part3_model_df['rf_next'].loc[start_period:end_period]).mean()
print(f'OOS DP_EP mean excess, 2000-2011: {oos_excess_window:0.4f}')

oos_forecast_excess = (oos_forecast - part3_model_df['rf_next']).iloc[oos_start:]
neg_oos = int(oos_forecast_excess.lt(0).sum())
print(f'OOS DP_EP negative forecast count: {neg_oos}')

OOS VaR remains close to -0.06 and the strategy still experiences a drawdown of roughly -0.68, so the real-time version is at least as risky as its in-sample counterpart. The average excess return from 2000 to 2011 is barely positive, and the model produces only a few dozen bearish forecasts once it is limited to data available at the time, underscoring that the live strategy would have taken substantial risk without materially better downside protection.
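Since the drawdown figure carries much of the risk argument above, here is a short worked sketch of how a maximum drawdown is computed from a return series. It mirrors the logic of the notebook's `max_drawdown` helper (cumulative wealth divided by its running peak, minus one) in plain Python; the name `drawdown_path` and the four toy returns are hypothetical.

```python
def drawdown_path(returns):
    """Drawdown at each date: cumulative wealth relative to its running peak, minus one."""
    wealth = 1.0
    peak = float("-inf")   # the first cumulative value becomes the first peak
    path = []
    for r in returns:
        wealth *= 1 + r
        peak = max(peak, wealth)
        path.append(wealth / peak - 1)
    return path

# Hypothetical returns: wealth goes 1.10 -> 0.88 -> 0.924 -> 0.8316 against a peak of 1.10.
toy_path = drawdown_path([0.10, -0.20, 0.05, -0.10])
toy_mdd = min(toy_path)
print(f"max drawdown: {toy_mdd:.4f}")
```

The partial recovery in the third period does not reset the peak, so the worst drawdown is measured from the original high of 1.10 down to 0.8316, a loss of 24.4%.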