# Predicting Long-Run Future Stock Returns with the Cyclically Adjusted Price-Earnings Ratio (CAPE)

In 1998, Robert Shiller and John Campbell published the pathbreaking article “Valuation Ratios and the Long-Run Stock Market Outlook.” A follow-up to some of their earlier work on stock market predictability, it established that long-term stock market returns were not random walks but, rather, could be forecast by a valuation measure called the “cyclically adjusted price–earnings ratio,” or CAPE ratio. Shiller and Campbell calculated the CAPE ratio from a long-term, broad-based index of stock market prices and earnings going back to 1871, dividing the index price by the average of the last 10 years of earnings per share, with earnings and stock prices measured in real terms. **They regressed 10-year real stock returns against the CAPE ratio and found that the CAPE ratio is a significant variable that can predict long-run stock returns.** The predictability of real stock returns implies that long-term equity returns are mean reverting. In other words, if the CAPE ratio is above (below) its long-run average, the model predicts below-average (above-average) real stock returns for the next 10 years.

*Jeremy J. Siegel (2016) The Shiller CAPE Ratio: A New Look, Financial Analysts Journal, 72:3, 41-50, DOI: 10.2469/faj.v72.n3.1*
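
The calculation and regression described above can be sketched in a few lines. This is a minimal illustration on synthetic data, not the notebook's or the paper's actual procedure: the helper names `cape_ratio` and `forward_annualized_return`, the 120-month window, the made-up price and earnings series, and the use of price-only returns (dividends ignored) are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

def cape_ratio(real_price, real_eps, window=120):
    """Real price over the trailing `window`-month average of real EPS."""
    return real_price / real_eps.rolling(window).mean()

def forward_annualized_return(real_price, horizon=120):
    """Annualized real price return over the next `horizon` months (dividends ignored)."""
    growth = real_price.shift(-horizon) / real_price
    return growth ** (12 / horizon) - 1

# Synthetic monthly data (illustrative only): steady real earnings growth and a
# valuation multiple that oscillates around 16 on a 20-year cycle, plus noise.
rng = np.random.default_rng(0)
t = np.arange(600)
real_eps = pd.Series(np.exp(0.0015 * t))
log_pe = np.log(16) + 0.4 * np.sin(2 * np.pi * t / 240) + rng.normal(0, 0.02, t.size)
real_price = real_eps * np.exp(log_pe)

# Keep only months with both a full trailing window and a full forward horizon.
sample = pd.DataFrame({
    "CAPE": cape_ratio(real_price, real_eps),
    "FWD_RETURN_10Y": forward_annualized_return(real_price),
}).dropna()

# Least-squares fit: FWD_RETURN_10Y = intercept + slope * CAPE
slope, intercept = np.polyfit(
    sample["CAPE"].to_numpy(), sample["FWD_RETURN_10Y"].to_numpy(), 1
)
print(f"observations={len(sample)}, slope={slope:.4f}, intercept={intercept:.4f}")
```

Because the synthetic multiple mean-reverts by construction, the fitted slope is negative: a high CAPE is followed by lower returns. With real data, note that overlapping 10-year windows are serially correlated, so naive OLS standard errors tend to overstate significance.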

In [1]:
import pandas as pd
import numpy as np
import statsmodels.api as sm
from tqdm import tqdm

In [2]:
classifications = pd.read_csv('https://raw.githubusercontent.com/nathanramoscfa/cape/main/data/classification_data.csv', index_col=0).iloc[:, :-2]
grinold_kroner = pd.read_csv('https://raw.githubusercontent.com/nathanramoscfa/cape/main/data/grinold_kroner_returns.csv', index_col=0)
current_fwd_return_5y_forecast = pd.read_csv('https://raw.githubusercontent.com/nathanramoscfa/cape/main/data/current_fwd_return_5y_forecast.csv', index_col=0)
benchmark_prices = pd.read_csv('https://raw.githubusercontent.com/nathanramoscfa/cape/main/data/benchmark_prices.csv', index_col=0)
benchmark_lt_pe = pd.read_csv('https://raw.githubusercontent.com/nathanramoscfa/cape/main/data/benchmark_lt_pe.csv', index_col=0)

classifications.index.name = 'BENCHMARK_TICKER'
grinold_kroner.index.name = 'BENCHMARK_TICKER'
current_fwd_return_5y_forecast.index.name = 'BENCHMARK_TICKER'

In [3]:
results = pd.read_csv('https://raw.githubusercontent.com/nathanramoscfa/cape/main/data/equity_etf_posterior_returns.csv')
results.columns = ['BENCHMARK_TICKER', 'ETF_TICKER', 'CORRELATION', 'P_VALUE', 'BENCHMARK_NAME', 'PRIOR_RETURN', 'POSTERIOR_RETURN', 'VIEW']
results = results[['ETF_TICKER', 'CORRELATION', 'P_VALUE', 'PRIOR_RETURN', 'POSTERIOR_RETURN', 'VIEW', 'BENCHMARK_NAME', 'BENCHMARK_TICKER']]
results.ETF_TICKER = results.ETF_TICKER.str.replace(' US Equity', '')
results['ETF_NAME'] = classifications.loc[results.ETF_TICKER.values].NAME.values
results['CLASSIFICATION'] = classifications.loc[results.ETF_TICKER.values].CLASSIFICATION.values
results = results.set_index('ETF_TICKER')
results = pd.merge(results, grinold_kroner, left_on='BENCHMARK_TICKER', right_index=True, how='left')
results = pd.merge(results, current_fwd_return_5y_forecast.FWD_RETURN_5Y_FORECAST, left_on='BENCHMARK_TICKER', right_index=True, how='left')
results = results[results['CORRELATION']>=0.95].sort_values(by='POSTERIOR_RETURN', ascending=False)
results.head()

ETF_TICKER,CORRELATION,P_VALUE,PRIOR_RETURN,POSTERIOR_RETURN,VIEW,BENCHMARK_NAME,BENCHMARK_TICKER,ETF_NAME,CLASSIFICATION,LONG_TERM_EARNINGS_YIELD,NOMINAL_EARNINGS_GROWTH,REPRICING_RETURN,GRINOLD_KRONER_RETURN,FWD_RETURN_5Y_FORECAST
PSCD,0.9993,0,0.184952,0.121506,0.17615,S&P 600 Consumer Discretionary Sector GICS Lev...,S6COND Index,Invesco S&P SmallCap Consumer Discretionary ETF,U.S. Small-cap Value ETP,0.0781,0.0375,0.0178,0.1334,0.2189
XRT,0.9682,0,0.184952,0.121506,0.17615,S&P 600 Consumer Discretionary Sector GICS Lev...,S6COND Index,SPDR S&P Retail ETF,U.S. Broad Market Blend ETP,0.0781,0.0375,0.0178,0.1334,0.2189
VIOV,0.9924,0,0.164868,0.096484,0.13455,S&P Small Cap 600 Value Index,SMLV Index,Vanguard S&P Small-Cap 600 Value ETF,U.S. Small-cap Value ETP,0.0683,0.0375,0.004,0.1098,0.1593
FYT,0.9634,0,0.164868,0.096484,0.13455,S&P Small Cap 600 Value Index,SMLV Index,First Trust Small Cap Value AlphaDEX Fund,U.S. Small-cap Value ETP,0.0683,0.0375,0.004,0.1098,0.1593
IJS,0.9933,0,0.164868,0.096484,0.13455,S&P Small Cap 600 Value Index,SMLV Index,iShares S&P Small-Cap 600 Value ETF,U.S. Small-cap Value ETP,0.0683,0.0375,0.004,0.1098,0.1593
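
In the rows shown above, `GRINOLD_KRONER_RETURN` appears to equal the sum of `LONG_TERM_EARNINGS_YIELD`, `NOMINAL_EARNINGS_GROWTH`, and `REPRICING_RETURN`. The short check below reproduces that arithmetic for the two benchmarks in the output. The additive form is inferred from the displayed numbers rather than stated in the source, and the function name `grinold_kroner_return` is hypothetical.

```python
def grinold_kroner_return(earnings_yield, nominal_earnings_growth, repricing_return):
    """Assumed additive decomposition: yield + growth + repricing."""
    return earnings_yield + nominal_earnings_growth + repricing_return

# (earnings yield, nominal earnings growth, repricing return, reported total),
# copied from the results.head() output for each benchmark ticker.
rows = {
    "S6COND Index": (0.0781, 0.0375, 0.0178, 0.1334),
    "SMLV Index": (0.0683, 0.0375, 0.0040, 0.1098),
}

for ticker, (ey, growth, repricing, reported) in rows.items():
    total = grinold_kroner_return(ey, growth, repricing)
    print(f"{ticker}: computed={total:.4f}, reported={reported:.4f}")
```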


In [29]:
df1 = results[['ETF_NAME', 'BENCHMARK_NAME', 'BENCHMARK_TICKER', 'FWD_RETURN_5Y_FORECAST']].drop_duplicates(subset=['BENCHMARK_TICKER'])
df1

ETF_TICKER,ETF_NAME,BENCHMARK_NAME,BENCHMARK_TICKER,FWD_RETURN_5Y_FORECAST
PSCD,Invesco S&P SmallCap Consumer Discretionary ETF,S&P 600 Consumer Discretionary Sector GICS Lev...,S6COND Index,0.2189
VIOV,Vanguard S&P Small-Cap 600 Value ETF,S&P Small Cap 600 Value Index,SMLV Index,0.1593
VICE,AdvisorShares Vice ETF,S&P Supercomposite Casinos & Gaming Sub Indust...,S15CASI Index,0.0647
PSCI,Invesco S&P SmallCap Industrials ETF,S&P 600 Industrials Sector GICS Level 1 Index,S6INDU Index,0.1317
FXD,First Trust Consumer Discretionary AlphaDEX Fund,S&P 400 Consumer Discretionary Sector GICS Lev...,S4COND Index,0.1965
...,...,...,...,...
XLC,Communication Services Select Sector SPDR Fund,MSCI World ex AUS Communication Services Index,MXWOOTC Index,-0.0561
XLU,Utilities Select Sector SPDR Fund,S&P 500 Utilities Sector GICS Level 1 Index,S5UTIL Index,-0.0167
JXI,iShares Global Utilities ETF,MSCI ACWI Utilities Sector Local Index,MSCLUTI Index,-0.0335
DJUL,FT Cboe Vest US Equity Deep Buffer ETF -Jul,Dow Jones Global Titans 50 Index,DJGT Index,-0.0403


In [None]:
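from statsmodels.stats.outliers_influence import variance_inflation_factor

threshold = 10
# Assumption: benchmark_prices has one column per benchmark ticker, so select by
# df1.BENCHMARK_TICKER (df1.columns holds field names, not tickers).
vif_returns = benchmark_prices[df1.BENCHMARK_TICKER.tolist()].copy()
# Repeatedly drop the series with the highest VIF until none exceeds the threshold.
for _ in tqdm(range(vif_returns.shape[1])):
    vif = pd.DataFrame(index=vif_returns.columns)
    # Use j for the column position so the outer loop variable is not shadowed.
    vif["VIF Factor"] = [variance_inflation_factor(vif_returns.values, j) for j in range(vif_returns.shape[1])]
    if vif["VIF Factor"].max() <= threshold:
        break
    vif_returns = vif_returns.drop(columns=vif["VIF Factor"].idxmax())
vif.index.name = 'TICKER'
# px_last = px_last[list(vif.index)]
# lt_pe = lt_pe[list(vif.index)]
# vif.sort_values(by='VIF Factor')
vif_tickers = list(vif.index)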
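
To make the pruning step above concrete, here is a small self-contained sketch of what a variance inflation factor measures, using the textbook definition VIF_j = 1 / (1 - R²_j), where R²_j comes from regressing column j on all other columns. It uses NumPy least squares with an intercept instead of statsmodels, so values can differ from `variance_inflation_factor` when the input has no constant column. The helpers `vif_table` and `prune_by_vif` and the synthetic return series are hypothetical.

```python
import numpy as np
import pandas as pd

def vif_table(returns: pd.DataFrame) -> pd.Series:
    """VIF per column: 1 / (1 - R^2) from regressing it on the other columns."""
    X = returns.to_numpy(dtype=float)
    out = {}
    for j, name in enumerate(returns.columns):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(len(y)), others])  # intercept + regressors
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - resid.var() / y.var()
        out[name] = np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2)
    return pd.Series(out, name="VIF Factor")

def prune_by_vif(returns: pd.DataFrame, threshold: float = 10.0) -> pd.DataFrame:
    """Drop the highest-VIF column until every remaining VIF is <= threshold."""
    kept = returns.copy()
    while kept.shape[1] > 1:
        vif = vif_table(kept)
        if vif.max() <= threshold:
            break
        kept = kept.drop(columns=vif.idxmax())
    return kept

# Synthetic returns: A and B are independent, A_CLONE is A plus a little noise.
rng = np.random.default_rng(0)
base = rng.normal(0, 0.01, size=(500, 2))
demo = pd.DataFrame({
    "A": base[:, 0],
    "B": base[:, 1],
    "A_CLONE": base[:, 0] + rng.normal(0, 0.0005, 500),
})

kept = prune_by_vif(demo)
print(vif_table(demo).round(1))
print("kept:", list(kept.columns))
```

Only one of the two near-duplicate series survives, while the independent series `B` is always kept. If the real inputs are price levels, converting them to returns first (for example with `pct_change().dropna()`) may give more meaningful VIFs, since trending price levels tend to be highly collinear.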

In [23]:
results.loc['SPY'].FWD_RETURN_5Y_FORECAST

-0.0066

In [None]:
# df1[df1.FWD_RETURN_5Y_FORECAST>=results.loc['SPY'].FWD_RETURN_5Y_FORECAST].sort_values(by='FWD_RETURN_5Y_FORECAST', ascending=False)
df1[df1.FWD_RETURN_5Y_FORECAST>=0.10].sort_values(by='FWD_RETURN_5Y_FORECAST', ascending=False)