# Return Predictability

This project aims to replicate the equity premium prediction exercise in Welch and Goyal's paper "A Comprehensive Look at the Empirical Performance of Equity Premium Prediction". The data come from Amit Goyal's website and cover monthly observations from January 1950 to December 2022.

To predict the following month's excess market return, a linear regression was estimated on data from the previous 10 years. The regression included the following predictors:

 - D/P
 - Term Spread
 - Default Spread
 - Net Stock Issuance
 
The portfolio's weight in the market was then set by the following rule, where w(t) is the weight and m(t) is the predicted excess market return from the regression on the previous 10 years of data:
 
 w(t) = min{1.5, max{0.5, 100×m(t)}}
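
The rule scales market exposure with the forecast and clips it to the range [0.5, 1.5]. A minimal sketch of it, where `portfolio_weight` is a hypothetical helper name that does not appear in the notebook and the forecasts are made-up values:

```python
import numpy as np

def portfolio_weight(forecast, scale=100.0, lower=0.5, upper=1.5):
    """Clip scale * forecast to [lower, upper]; the defaults mirror w(t) above."""
    return np.clip(scale * np.asarray(forecast, dtype=float), lower, upper)

# Illustrative forecasts of the monthly excess return (made-up values)
forecasts = [-0.001, 0.004, 0.012, 0.020]
weights = portfolio_weight(forecasts)
print(weights)
```

With the default scale of 100, any forecast at or below 0.5% per month maps to the minimum weight of 0.5, and any forecast at or above 1.5% per month maps to the maximum weight of 1.5.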
 
Over the evaluation period, which starts in 1960 once enough history is available for the first regression, the forecast-weighted portfolio earned an average monthly excess return of 0.00598 with a monthly Sharpe ratio of 0.142. A portfolio held at a constant 100% market weight earned an average monthly excess return of 0.00544 with a monthly Sharpe ratio of 0.126 over the same period. The forecast-weighted portfolio also had a slightly lower standard deviation of monthly excess returns (0.0421 versus 0.0433), so its higher average return did not come from taking on additional risk. Overall, this strategy shows potential as an alternative to a constant 100% market weight.
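
The Sharpe ratios quoted above are the mean monthly excess return divided by its standard deviation. The sketch below recomputes them from the rounded means and standard deviations printed at the end of the notebook. The annualized figures multiply by the square root of 12, which is an added assumption (serially uncorrelated monthly returns) and not something the notebook computes; `sharpe_ratio` is a hypothetical helper name.

```python
import math

def sharpe_ratio(mean_excess, std_excess):
    """Sharpe ratio of a return series summarized by its mean and standard deviation."""
    return mean_excess / std_excess

# Means and standard deviations as printed by the notebook (rounded)
timed = sharpe_ratio(0.005981, 0.042078)
market = sharpe_ratio(0.005440, 0.043309)

# Annualization by sqrt(12) is an assumption added for illustration
print(f"timed:  monthly {timed:.3f}, annualized {timed * math.sqrt(12):.2f}")
print(f"market: monthly {market:.3f}, annualized {market * math.sqrt(12):.2f}")
```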


In [45]:
import pandas as pd
import numpy as np
import statsmodels.api as sm

In [46]:
predictor_data = pd.read_excel('PredictorData2022.xlsx', sheet_name='Monthly', index_col=[0])



In [47]:
predictor_data.head()

yyyymm,Index,D12,E12,b/m,tbl,AAA,BAA,lty,ntis,Rfree,infl,ltr,corpr,svar,csp,CRSP_SPvw,CRSP_SPvwx
187101,4.44,0.26,0.4,,,,,,,,,,,,,,
187102,4.5,0.26,0.4,,,,,,,0.004967,,,,,,,
187103,4.61,0.26,0.4,,,,,,,0.004525,,,,,,,
187104,4.74,0.26,0.4,,,,,,,0.004252,,,,,,,
187105,4.86,0.26,0.4,,,,,,,0.004643,,,,,,,


In [48]:
# Keep December 1949 onward; the extra month supplies the lagged predictors for January 1950
predictor_data = predictor_data.loc[194912:].copy()

In [49]:
#calculate variables for the linear regression
predictor_data['D/P'] = predictor_data['D12']/predictor_data['Index']
predictor_data['Term_Spread'] = predictor_data['lty']-predictor_data['tbl']
predictor_data['Default_Spread'] = predictor_data['BAA']-predictor_data['AAA']
predictor_data['issuance'] = predictor_data['ntis']
predictor_data['excess_ret'] = predictor_data['CRSP_SPvw']-predictor_data['Rfree']

In [50]:
# Keep only the regression variables (copy to avoid pandas chained-assignment warnings)
predictor_data = predictor_data[['D/P','Term_Spread','Default_Spread','issuance','excess_ret']].copy()

In [51]:
# Lag all predictors by one month so each row pairs this month's return with last month's values
predictors = ['D/P', 'Term_Spread', 'Default_Spread', 'issuance']
predictor_data[predictors] = predictor_data[predictors].shift(1)

In [52]:
#drop NA values
predictor_data.dropna(inplace=True)

In [53]:
predictor_data.head()

yyyymm,D/P,Term_Spread,Default_Spread,issuance,excess_ret
195001,0.068019,0.0099,0.0073,0.027176,0.018803
195002,0.067449,0.0108,0.0067,0.027102,0.018703
195003,0.067364,0.0102,0.0066,0.025492,0.007185
195004,0.067669,0.0103,0.0066,0.029291,0.044987
195005,0.065302,0.0099,0.0063,0.026398,0.045902
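
Lagging matters because each row should pair the return of month t with predictor values that were already known at the end of month t-1. A toy sketch with made-up numbers shows the effect of `shift(1)` followed by `dropna()`; the column names `x`, `x_lagged`, and `ret` are placeholders, not columns from the dataset.

```python
import pandas as pd

toy = pd.DataFrame(
    {"x": [0.10, 0.20, 0.30, 0.40], "ret": [0.01, -0.02, 0.03, 0.00]},
    index=[195001, 195002, 195003, 195004],
)

# After shifting, the row for 195002 holds the value of x observed in 195001
toy["x_lagged"] = toy["x"].shift(1)

# The first row has no lagged predictor, so dropna() removes it
aligned = toy.dropna()
print(aligned)
```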


In [54]:
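# Inspect one rolling estimation window. `start` and `i` are only defined by the
# loop in the next cell, so this cell runs only after that loop has been executed.
predictor_data.iloc[start:i]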

yyyymm,D/P,Term_Spread,Default_Spread,issuance,excess_ret
201301,0.021909,0.0239,0.0098,-0.011549,0.052361
201302,0.021050,0.0284,0.0093,-0.008017,0.013013
201303,0.021010,0.0275,0.0095,-0.008864,0.037584
201304,0.020464,0.0278,0.0092,-0.008911,0.019621
201305,0.020341,0.0258,0.0086,-0.008005,0.023120
...,...,...,...,...,...
202206,0.015328,0.0192,0.0099,-0.003372,-0.082283
202207,0.016912,0.0165,0.0103,-0.004815,0.092966
202208,0.015605,0.0067,0.0115,-0.006121,-0.042205
202209,0.016406,0.0027,0.0108,-0.009732,-0.093395


In [55]:
predictor_data['forecast'] = np.nan
predictor_data['weight'] = np.nan
predictor_data['excess portfolio return'] = np.nan

predictors = ['D/P', 'Term_Spread', 'Default_Spread', 'issuance']
forecast_col = predictor_data.columns.get_loc('forecast')
weight_col = predictor_data.columns.get_loc('weight')
start = 0

for i in range(119, predictor_data.shape[0]):
    # Rolling estimation window: rows start..i-1 (119 monthly observations)
    knowndata = predictor_data.iloc[start:i]

    X = sm.add_constant(knowndata[predictors])
    y = knowndata['excess_ret']
    model = sm.OLS(y, X).fit()

    if i + 1 < len(predictor_data):
        # Predict from row i's predictors (already lagged one month) and store the result in row i + 1
        row = predictor_data[predictors].iloc[i]
        forecast = model.params['const'] + (model.params[predictors] * row).sum()
        # Assign through a single .iloc call to avoid chained assignment
        predictor_data.iloc[i + 1, forecast_col] = forecast
        predictor_data.iloc[i + 1, weight_col] = min(1.5, max(0.5, 100 * forecast))

    start += 1

# Scale each month's excess return by the weight stored in the previous row
predictor_data['excess portfolio return'] = predictor_data['excess_ret'] * predictor_data['weight'].shift()

In [56]:
predictor_data.tail(5)

yyyymm,D/P,Term_Spread,Default_Spread,issuance,excess_ret,forecast,weight,excess portfolio return
202208,0.015605,0.0067,0.0115,-0.006121,-0.042205,-0.00111,0.5,-0.021102
202209,0.016406,0.0027,0.0108,-0.009732,-0.093395,0.000484,0.5,-0.046698
202210,0.018217,0.0039,0.011,-0.011292,0.077948,0.003974,0.5,0.038974
202211,0.017008,0.0026,0.0116,-0.015252,0.051266,0.012872,1.287152,0.025633
202212,0.016271,-0.0026,0.0117,-0.017011,-0.062084,0.001345,0.5,-0.079912
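
For readers who want to experiment with the rolling scheme outside the notebook, here is a self-contained sketch on synthetic data using only NumPy. `rolling_forecasts` is a hypothetical helper, the least-squares fit uses `np.linalg.lstsq` instead of `statsmodels`, and the alignment is a simplified one (fit on the `window` rows before row t, then predict row t from its already-lagged predictors). It illustrates the idea and is not a line-for-line copy of the loop above.

```python
import numpy as np

def rolling_forecasts(X, y, window=120):
    """One-step-ahead OLS forecasts from a rolling window.

    X: (n, k) array of predictors, already lagged so that row t is known
       before y[t] is realized. y: (n,) array of excess returns.
    Returns an (n,) array with NaN where no forecast is available.
    """
    n = len(y)
    design = np.column_stack([np.ones(n), X])  # prepend an intercept column
    forecasts = np.full(n, np.nan)
    for t in range(window, n):
        # Fit on the `window` rows strictly before t
        coef, *_ = np.linalg.lstsq(design[t - window:t], y[t - window:t], rcond=None)
        forecasts[t] = design[t] @ coef
    return forecasts

# Synthetic data: returns depend weakly on the first predictor plus noise
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))
y = 0.005 + 0.01 * X[:, 0] + rng.normal(scale=0.04, size=300)

forecasts = rolling_forecasts(X, y, window=120)
weights = np.clip(100 * forecasts, 0.5, 1.5)  # same clipping rule as the notebook
timed_returns = weights * y                   # the weight for row t uses rows before t plus X[t]
print(np.nanmean(timed_returns), np.nanmean(y[120:]))
```

Rows without a forecast stay NaN through the clipping step, so `np.nanmean` averages only the months in which the strategy actually holds a position.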


In [57]:
#only use data with forecasted values
predictor_data = predictor_data.loc[196001:]

In [58]:
# Returns of the forecast-weighted portfolio
print('Average monthly excess return:', predictor_data['excess portfolio return'].mean())
print('Monthly standard deviation:', predictor_data['excess portfolio return'].std())
print('Monthly Sharpe ratio:', predictor_data['excess portfolio return'].mean()/predictor_data['excess portfolio return'].std())

Average monthly excess return: 0.005980756441604889
Monthly standard deviation: 0.04207807169339748
Monthly Sharpe ratio: 0.14213475572701534


In [59]:
# Returns of the benchmark held at a constant 100% market weight
print('Average monthly excess return:', predictor_data['excess_ret'].mean())
print('Monthly standard deviation:', predictor_data['excess_ret'].std())
print('Monthly Sharpe ratio:', predictor_data['excess_ret'].mean()/predictor_data['excess_ret'].std())

Average monthly excess return: 0.005440255291005289
Monthly standard deviation: 0.0433089030464988
Monthly Sharpe ratio: 0.1256151716695372
