On March 16th, 2022, the Federal Reserve (FED) raised interest rates by 0.25 percentage points and signaled six more rate hikes before the end of 2022 to fight the highest inflation in four decades.

These interest rate hikes are only one side of the FED’s hawkish monetary policy. The other side consists in shrinking its $9 trillion balance sheet, first by stopping the purchase of securities, which was initiated to support the economy during the pandemic, and second by starting to sell part of its holdings to reduce the money in circulation.

Here, an attempt is made to forecast the evolution of the S&P500 index for hypothetical scenarios of the evolution of the FED balance sheet.


### 1. Historical Data Set
The first step of the exercise consists in gathering the data of interest:

- the S&P500 index historical values,
- the FED balance sheet historical values,
- the historical interest rates for the United States.

The S&P500 historical values can be downloaded easily from Yahoo! Finance using Python and the yahooquery package:

In [49]:
from yahooquery import Ticker
import pandas as pd

sp500 = Ticker("^GSPC").history(period='21Y', interval='1d')
sp500 = sp500.reset_index()
sp500["date"] = pd.to_datetime(sp500["date"])
sp500.set_index("date",inplace=True)

In [50]:
sp500

date,symbol,close,high,volume,open,low,adjclose
2001-08-20,^GSPC,1171.410034,1171.410034,897100000,1161.969971,1160.939941,1171.410034
2001-08-21,^GSPC,1157.260010,1179.849976,1041600000,1171.410034,1156.560059,1157.260010
2001-08-22,^GSPC,1165.310059,1168.560059,1110800000,1157.260010,1153.339966,1165.310059
2001-08-23,^GSPC,1162.089966,1169.859985,986200000,1165.310059,1160.959961,1162.089966
2001-08-24,^GSPC,1184.930054,1185.150024,1043600000,1162.089966,1162.089966,1184.930054
...,...,...,...,...,...,...,...
2022-08-11,^GSPC,4207.270020,4257.910156,3925060000,4227.399902,4201.410156,4207.270020
2022-08-12,^GSPC,4280.149902,4280.470215,3252290000,4225.020020,4219.779785,4280.149902
2022-08-15,^GSPC,4297.140137,4301.790039,3087740000,4269.370117,4256.899902,4297.140137
2022-08-16,^GSPC,4305.200195,4325.279785,3792010000,4290.459961,4277.770020,4305.200195


In [3]:
#path = r'C:\Users\antoine.dedave\Downloads\\'

#fed_bs = pd.read_csv(path + "WALCL.csv")
#rates = pd.read_csv(path + "INTDSRUSM193N.csv")

In [51]:
fed_bs = pd.read_csv("WALCL.csv")
rates = pd.read_csv("INTDSRUSM193N.csv")

In [52]:
fed_bs.set_index("DATE",inplace=True)
rates.set_index("DATE",inplace=True)

# "." marks a missing observation in the CSV; coerce it to NaN while converting to float
rates["INTDSRUSM193N"] = pd.to_numeric(rates["INTDSRUSM193N"], errors="coerce")

fed_bs.index = pd.to_datetime(fed_bs.index)
rates.index = pd.to_datetime(rates.index)

fed = fed_bs.copy()
fed['Rates'] = rates["INTDSRUSM193N"]
fed = fed.ffill()  # carry the last known value forward (fillna(method=...) is deprecated)
fed = fed.dropna()
fed

DATE,WALCL,Rates
2003-01-01,730994,2.25
2003-01-08,723762,2.25
2003-01-15,720074,2.25
2003-01-22,735953,2.25
2003-01-29,712809,2.25
...,...,...
2022-01-26,8860485,0.25
2022-02-02,8873211,0.25
2022-02-09,8878009,0.25
2022-02-16,8911033,0.25


### 2. Monetary policy forecasts


In this context, I consider one scenario of rate hikes and four different scenarios of balance sheet shrinking until 2025.

To do so, target values are set at several dates in the future, and the intermediate values are interpolated using forward fill for interest rates and quadratic interpolation for the balance sheet:

In [53]:
import numpy as np
import datetime as dt

#extend dates range
for date in pd.date_range(start="2022-02-24", end="2027-09-15"):
    fed.loc[date,:] = np.nan

#7 rate hikes of 2022
fed.loc[dt.datetime(2022,3,15),"Rates"] = 0.5
fed.loc[dt.datetime(2022,5,15),"Rates"] = 0.75
fed.loc[dt.datetime(2022,6,15),"Rates"] = 1.0
fed.loc[dt.datetime(2022,7,15),"Rates"] = 1.25
fed.loc[dt.datetime(2022,9,15),"Rates"] = 1.5
fed.loc[dt.datetime(2022,11,15),"Rates"] = 1.75
fed.loc[dt.datetime(2022,12,15),"Rates"] = 2.0

#4 rate hikes of 2023
fed.loc[dt.datetime(2023,3,15),"Rates"] = 2.25
fed.loc[dt.datetime(2023,5,15),"Rates"] = 2.5
fed.loc[dt.datetime(2023,7,15),"Rates"] = 2.75
fed.loc[dt.datetime(2023,9,15),"Rates"] = 3.0
fed.loc[dt.datetime(2027,9,15),"Rates"] = 3.0

#interpolation of interest rates
fed["Rates"] = fed["Rates"].ffill()

#four balance sheet scenarios
fed_forecasts =  [("5T",5000000),("7T",7000000),("8T",8000000),("9T",9000000)]

#set BS values and interpolate
for label,forecast in fed_forecasts:

    fed["WALCL " + label ] = fed["WALCL"]

    fed.loc[dt.datetime(2023,12,15),"WALCL " + label ] = forecast
    fed.loc[dt.datetime(2027,6,15),"WALCL " + label ] = forecast
    
    fed.loc[fed.index<=dt.datetime(2023,12,15),"WALCL " + label ]=fed.loc[fed.index<=dt.datetime(2023,12,15),"WALCL " + label ].interpolate(method="quadratic")
    fed.loc[fed.index>=dt.datetime(2023,12,15),"WALCL " + label ]=fed.loc[fed.index>=dt.datetime(2023,12,15),"WALCL " + label ].interpolate(method="linear")

    fed.loc[fed.index>=dt.datetime(2023,3,15),"WALCL " + label ]=\
        fed.rolling(400,center=True).mean().loc[fed.index>=dt.datetime(2023,3,15),"WALCL " + label ] + \
        fed.loc[fed.index<dt.datetime(2023,3,15),"WALCL " + label].iloc[-1] - \
        fed.rolling(400,center=True).mean().loc[fed.index==dt.datetime(2023,3,15),"WALCL " + label ].iloc[-1]

    fed = fed.rename(columns={"WALCL " + label :"BS" + label })

fed["WALCL"] = fed["WALCL"].fillna(0)    
fed = fed.rename(columns={"WALCL" :"BS" })
fed = fed.dropna()
#fed
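
The anchor-then-fill idea used in the cell above can be illustrated on a tiny calendar. This is only a sketch under stated assumptions: the dates and target values are made up for illustration, and linear interpolation is used in place of the notebook's quadratic interpolation so that the example needs nothing beyond pandas and NumPy.

```python
import numpy as np
import pandas as pd

# Small daily calendar standing in for the forecast horizon (illustrative dates).
idx = pd.date_range("2022-01-01", "2022-01-11", freq="D")
path = pd.DataFrame({"Rates": np.nan, "BS": np.nan}, index=idx)

# Anchor (target) values: hypothetical numbers, not the notebook's scenarios.
path.loc[pd.Timestamp("2022-01-01"), "Rates"] = 0.25
path.loc[pd.Timestamp("2022-01-06"), "Rates"] = 0.50
path.loc[pd.Timestamp("2022-01-01"), "BS"] = 9_000_000.0
path.loc[pd.Timestamp("2022-01-11"), "BS"] = 8_000_000.0

# Rates: forward fill gives a step function that jumps on each hike date.
path["Rates"] = path["Rates"].ffill()

# Balance sheet: interpolate smoothly between the anchors
# (linear here; the notebook uses method="quadratic", which requires SciPy).
path["BS"] = path["BS"].interpolate(method="linear")

print(path)
```

Forward fill suits policy rates because they stay constant between decisions, whereas the balance sheet is expected to move gradually toward its target.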

In [54]:
import plotly.graph_objects as go
from plotly.subplots import make_subplots


fig = make_subplots(specs=[[{"secondary_y": True}]])


fig.add_trace(go.Scatter(
    x=fed[(fed.index<"2022-02-23")].index,
    y=fed[(fed.index<"2022-02-23")]["Rates"],
    name='Interest rates history',
    marker_color="black"
))  
    
    
fig.add_trace(go.Scatter(
    x=fed[(fed.index>="2022-02-23")&(fed.index<="2025-01-26")].index,
    y=fed[(fed.index>="2022-02-23")&(fed.index<="2025-01-26")]["Rates"],
    name='Interest rates forecast ',
    marker_color="black",
    line=dict(dash='dash')
))  
    

fig.update_layout(
    paper_bgcolor='white',
    plot_bgcolor='#fafafa',
    hovermode='closest',
#     title="FED Balance Sheet scenarios",
    xaxis = dict(
        title=""
    ),
    yaxis = dict(
        title="%"
    ),
    showlegend=True)

fig.show()


fig = make_subplots(specs=[[{"secondary_y": True}]])


fig.add_trace(go.Scatter(
    x=fed[(fed.index<"2022-02-23")].index,
    y=fed[(fed.index<"2022-02-23")]["BS"],
    name='Balance Sheet history',
    marker_color="black"
))  
    
    
for label,col in [ ("5T","#0D0628"),\
                     ("7T","#885053"),\
                     ("8T","#FE5F55"),\
                     ("9T","#7FD1B9")]:
        
    fig.add_trace(go.Scatter(
        x=fed[(fed.index>="2022-02-23")&(fed.index<="2025-01-26")].index,
        y=fed[(fed.index>="2022-02-23")&(fed.index<="2025-01-26")]["BS"+label],
        name='Balance Sheet ' + label,
        marker_color=col,
        line=dict(dash='dash')
    ))  
    

# fig.add_trace(go.Scatter(
#     x=d.index,
#     y=d["close"],
#     name='Real',
# ))


fig.update_layout(
    paper_bgcolor='white',
    plot_bgcolor='#fafafa',
    hovermode='closest',
    title="FED Balance Sheet scenarios",
    xaxis = dict(
        title=""
    ),
    yaxis = dict(
        title="Millions of USD"
    ),
    showlegend=True)

fig.show()

### 3. Data pre-processing
At this stage, I have a dataset with the historical FED balance sheet, the historical target interest rates, the historical S&P500 close prices, and the forecasted scenarios for the FED balance sheet and target interest rates.

In [55]:
fed

DATE,BS,Rates,BS5T,BS7T,BS8T,BS9T
2003-01-01,730994.0,2.25,7.309940e+05,7.309940e+05,7.309940e+05,7.309940e+05
2003-01-08,723762.0,2.25,7.237620e+05,7.237620e+05,7.237620e+05,7.237620e+05
2003-01-15,720074.0,2.25,7.200740e+05,7.200740e+05,7.200740e+05,7.200740e+05
2003-01-22,735953.0,2.25,7.359530e+05,7.359530e+05,7.359530e+05,7.359530e+05
2003-01-29,712809.0,2.25,7.128090e+05,7.128090e+05,7.128090e+05,7.128090e+05
...,...,...,...,...,...,...
2027-02-24,0.0,3.00,5.161517e+06,7.098992e+06,8.067730e+06,9.036468e+06
2027-02-25,0.0,3.00,5.161517e+06,7.098992e+06,8.067730e+06,9.036468e+06
2027-02-26,0.0,3.00,5.161517e+06,7.098992e+06,8.067730e+06,9.036468e+06
2027-02-27,0.0,3.00,5.161517e+06,7.098992e+06,8.067730e+06,9.036468e+06


In [56]:
sp500=sp500.iloc[:-1]

In [57]:
data_set = fed.copy()
data_set["close"] = sp500["close"]
data_set["close"] = data_set["close"].ffill()
data_set["close"] = data_set["close"].rolling(15,center=True).mean() 
data_set = data_set.dropna()
data_set


DATE,BS,Rates,BS5T,BS7T,BS8T,BS9T,close
2003-02-26,721980.0,2.25,7.219800e+05,7.219800e+05,7.219800e+05,7.219800e+05,860.708663
2003-03-05,722649.0,2.25,7.226490e+05,7.226490e+05,7.226490e+05,7.226490e+05,861.314665
2003-03-12,717014.0,2.25,7.170140e+05,7.170140e+05,7.170140e+05,7.170140e+05,861.227999
2003-03-19,729923.0,2.25,7.299230e+05,7.299230e+05,7.299230e+05,7.299230e+05,864.645333
2003-03-26,725019.0,2.25,7.250190e+05,7.250190e+05,7.250190e+05,7.250190e+05,869.640002
...,...,...,...,...,...,...,...
2027-02-17,0.0,3.00,5.161517e+06,7.098992e+06,8.067730e+06,9.036468e+06,4305.200195
2027-02-18,0.0,3.00,5.161517e+06,7.098992e+06,8.067730e+06,9.036468e+06,4305.200195
2027-02-19,0.0,3.00,5.161517e+06,7.098992e+06,8.067730e+06,9.036468e+06,4305.200195
2027-02-20,0.0,3.00,5.161517e+06,7.098992e+06,8.067730e+06,9.036468e+06,4305.200195


As the FED balance sheet feature and the S&P500 price target are trending time series, the first pre-processing operation consists in transforming them into stationary time series, which makes a larger set of modeling techniques applicable.

To do so, I difference the time series by computing the log return between two consecutive time steps.

In [58]:
data_set_log_m = data_set.resample('1W').mean()

for c in data_set_log_m.columns:
    
    if "BS" in c: 
        data_set_log_m[c] = np.log(data_set_log_m[c]) - np.log(data_set_log_m[c].shift(1))
        
data_set_log_m["close"] = np.log(data_set_log_m["close"]) - np.log(data_set_log_m["close"].shift(1))

data_set_log_m


divide by zero encountered in log



DATE,BS,Rates,BS5T,BS7T,BS8T,BS9T,close
2003-03-02,,2.25,,,,,
2003-03-09,0.000926,2.25,0.000926,0.000926,0.000926,0.000926,0.000704
2003-03-16,-0.007828,2.25,-0.007828,-0.007828,-0.007828,-0.007828,-0.000101
2003-03-23,0.017844,2.25,0.017844,0.017844,0.017844,0.017844,0.003960
2003-03-30,-0.006741,2.25,-0.006741,-0.006741,-0.006741,-0.006741,0.005760
...,...,...,...,...,...,...,...
2027-01-24,,3.00,0.000000,0.000000,0.000000,0.000000,0.000000
2027-01-31,,3.00,0.000000,0.000000,0.000000,0.000000,0.000000
2027-02-07,,3.00,0.000000,0.000000,0.000000,0.000000,0.000000
2027-02-14,,3.00,0.000000,0.000000,0.000000,0.000000,0.000000
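
The "divide by zero encountered in log" warning and the `-inf` values in the `BS` column arise because the historical `BS` column was filled with 0 for future dates, and the logarithm of 0 is minus infinity. A minimal sketch of the log-return computation on made-up levels (the numbers are illustrative only) shows the effect:

```python
import numpy as np
import pandas as pd

# Illustrative levels; the trailing 0.0 mimics a zero-filled future row.
levels = pd.Series([100.0, 110.0, 99.0, 0.0])

# Log return between consecutive steps: ln(x_t) - ln(x_{t-1}).
with np.errstate(divide="ignore"):  # silence the log(0) warning for the demo
    log_ret = np.log(levels) - np.log(levels.shift(1))

print(log_ret)
```

The first entry is NaN because it has no predecessor, and the last one is `-inf`, which is why summary statistics on the `BS` column report a mean of `-inf`.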


In [59]:
data_set_log_m_no_surprise["BS"].dropna().describe()

count    992.000000
mean           -inf
std             NaN
min            -inf
25%       -0.002666
50%        0.000525
75%        0.004450
max        0.045820
Name: BS, dtype: float64

In [60]:
data_set_log_m["BS"].dropna().describe()

count    992.000000
mean           -inf
std             NaN
min            -inf
25%       -0.002666
50%        0.000581
75%        0.004934
max        0.215993
Name: BS, dtype: float64

The next step consists in adding to the features both a view of the past and an anticipation of the future. The latter seems quite important, as we know that the market mainly prices in what it expects to come.

The view of the past is quite straightforward: the rolling averages of the balance sheet and interest rates over the past 1, 3, 6 and 12 months are computed.

The anticipation of the future is trickier. In this exercise, we consider the anticipation of the near future (6 and 12 months) to be perfect, except for exceptional increases in the balance sheet (quantitative easing kick-offs) triggered by unexpected events: the subprime and Covid crises.
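
As a rough illustration of how backward-looking and forward-looking averages can be built with pandas, the sketch below uses a toy series and a window of 2 (both chosen only for readability). Shifting a rolling mean by minus the window length makes each row see the average of the steps that follow it.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
n = 2  # window length, illustrative

# View of the past: mean of the current value and the n-1 values before it.
past_mean = s.rolling(n).mean()

# Anticipation of the future: the same rolling mean, shifted back by n steps,
# so row t holds the mean of the values at t+1 ... t+n.
future_mean = s.rolling(n).mean().shift(-n)

print(pd.DataFrame({"s": s, "past": past_mean, "future": future_mean}))
```

The last `n` rows of the forward-looking column are NaN because the future values they would need do not exist yet.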

In [61]:
colors = ["#0D0628","#885053","#FE5F55","#7FD1B9"]

fig = make_subplots(specs=[[{"secondary_y": True}]])


fig.add_trace(go.Scatter(
    x=data_set_log_m["BS"].dropna().index,
    y=data_set_log_m["BS"].dropna(),
    marker_color=colors[0],
    name="Realized",
))

fig.add_trace(go.Scatter(
    x=data_set_log_m_no_surprise["BS"].dropna().index,
    y=data_set_log_m_no_surprise["BS"].dropna(),
    marker_color=colors[3],
    name="Expected",
))

fig.update_layout(
    paper_bgcolor='white',
    plot_bgcolor='#fafafa',
    hovermode='closest',
    title="",
    xaxis = dict(
        title=""
    ),
    yaxis = dict(
        title="FED Balance Sheet log return"
    ),
    showlegend=True)

fig.show()

In [62]:

data_set_log_m_no_surprise = data_set_log_m.copy()
data_set_log_m_no_surprise.loc[data_set_log_m_no_surprise["BS"]>=0.05]=0

weeks = 4

for c in data_set_log_m.columns:
    
    if "BS" in c:
        
        for i in [1,3,6,12]:
            data_set_log_m[c+"-"+str(i)+"mean"] = data_set_log_m[c].rolling(i*weeks).mean()
        
        for i in [6,12]:
            data_set_log_m[c+"+"+str(i)+"mean"] = data_set_log_m_no_surprise[c].rolling(i*weeks).mean().shift(-(i)*weeks)

            
for i in [1,3,6,12]:
    data_set_log_m["Rates"+"-"+str(i)+"mean"] = data_set_log_m["Rates"].rolling(i*weeks).mean()


for i in [6,12]:
    data_set_log_m["Rates"+"+"+str(i)+"mean"] = data_set_log_m["Rates"].rolling(i*weeks).mean().shift(-(i)*weeks)

data_set_log_m["week"] = data_set_log_m.index.week

data_set_log_m = data_set_log_m[[c for c in data_set_log_m.columns if c!= "close"] + ["close"]]
data_set_log_m


weekofyear and week have been deprecated, please use DatetimeIndex.isocalendar().week instead, which returns a Series. To exactly reproduce the behavior of week and weekofyear and return an Index, you may call pd.Int64Index(idx.isocalendar().week)



DATE,BS,Rates,BS5T,BS7T,BS8T,BS9T,BS-1mean,BS-3mean,BS-6mean,BS-12mean,...,BS9T+6mean,BS9T+12mean,Rates-1mean,Rates-3mean,Rates-6mean,Rates-12mean,Rates+6mean,Rates+12mean,week,close
2003-03-02,,2.25,,,,,,,,,...,0.000767,0.000939,,,,,2.25,2.156250,9,
2003-03-09,0.000926,2.25,0.000926,0.000926,0.000926,0.000926,,,,,...,0.001594,0.000889,,,,,2.25,2.151042,10,0.000704
2003-03-16,-0.007828,2.25,-0.007828,-0.007828,-0.007828,-0.007828,,,,,...,0.001351,0.000965,,,,,2.25,2.145833,11,-0.000101
2003-03-23,0.017844,2.25,0.017844,0.017844,0.017844,0.017844,,,,,...,0.001156,0.000807,2.25,,,,2.25,2.140625,12,0.003960
2003-03-30,-0.006741,2.25,-0.006741,-0.006741,-0.006741,-0.006741,0.00105,,,,...,0.000696,0.000724,2.25,,,,2.25,2.135417,13,0.005760
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2027-01-24,,3.00,0.000000,0.000000,0.000000,0.000000,,,,,...,,,3.00,3.0,3.0,3.0,,,3,0.000000
2027-01-31,,3.00,0.000000,0.000000,0.000000,0.000000,,,,,...,,,3.00,3.0,3.0,3.0,,,4,0.000000
2027-02-07,,3.00,0.000000,0.000000,0.000000,0.000000,,,,,...,,,3.00,3.0,3.0,3.0,,,5,0.000000
2027-02-14,,3.00,0.000000,0.000000,0.000000,0.000000,,,,,...,,,3.00,3.0,3.0,3.0,,,6,0.000000


In [63]:
fed_forecasts

[('5T', 5000000), ('7T', 7000000), ('8T', 8000000), ('9T', 9000000)]

### 4. Model Training
Before training a model, the dataset is divided into three sets:

- a training set: 80% of the dataset samples until 30 June 2021,
- a validation set: 20% of the dataset samples until 30 June 2021,
- a test set: the dataset samples after 30 June 2021.
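
The three-way split described above can be sketched with pandas alone. This is a simplified stand-in, assuming a toy weekly dataset and using `DataFrame.sample` with a fixed seed instead of scikit-learn's `train_test_split`; the column name `x` is hypothetical.

```python
import numpy as np
import pandas as pd

# Toy weekly dataset (illustrative), indexed by Sundays in 2021.
idx = pd.date_range("2021-01-03", periods=40, freq="W")
samples = pd.DataFrame({"x": np.arange(40.0)}, index=idx)

split_date = "2021-06-30"

# Chronological cut: everything from the split date onward is the test set.
history = samples[samples.index < split_date]
test = samples[samples.index >= split_date]

# Random 80/20 split of the history into training and validation sets.
train = history.sample(frac=0.8, random_state=0)
val = history.drop(train.index)

print(len(train), len(val), len(test))
```

Fixing `random_state` makes the split reproducible, which helps when comparing metrics across runs.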

In [64]:
from sklearn.model_selection import train_test_split

split_date = "2021-06-30"

training_set = data_set_log_m[data_set_log_m.index<split_date].copy()

training_cols = training_set.columns

for label,forecast in fed_forecasts:
    training_cols = [c for c in training_cols if label not in c]
    

training_set = training_set[training_cols]
training_set = training_set.dropna()

test_sets = {}
for l,f in fed_forecasts:
    test_sets[l] = data_set_log_m[data_set_log_m.index>=split_date].copy()
    
train,val = train_test_split(training_set, test_size=0.2)

In [65]:
test_sets

{'5T':                   BS  Rates      BS5T      BS7T      BS8T      BS9T  BS-1mean  \
 DATE                                                                            
 2021-07-04 -0.002892   0.25 -0.002892 -0.002892 -0.002892 -0.002892  0.004460   
 2021-07-11  0.002377   0.25  0.002377  0.002377  0.002377  0.002377  0.004531   
 2021-07-18  0.012746   0.25  0.012746  0.012746  0.012746  0.012746  0.004223   
 2021-07-25  0.004729   0.25  0.004729  0.004729  0.004729  0.004729  0.004240   
 2021-08-01 -0.002315   0.25 -0.002315 -0.002315 -0.002315 -0.002315  0.004384   
 ...              ...    ...       ...       ...       ...       ...       ...   
 2027-01-24       NaN   3.00  0.000000  0.000000  0.000000  0.000000       NaN   
 2027-01-31       NaN   3.00  0.000000  0.000000  0.000000  0.000000       NaN   
 2027-02-07       NaN   3.00  0.000000  0.000000  0.000000  0.000000       NaN   
 2027-02-14       NaN   3.00  0.000000  0.000000  0.000000  0.000000       NaN   
 2027-02-2

In [66]:
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
scaled_cols = training_set.columns

train[scaled_cols] = scaler.fit_transform(train[scaled_cols])
val[scaled_cols] = scaler.transform(val)



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy






In [67]:
from sklearn.ensemble import RandomForestRegressor

def test_(yhat,X_test, y_test):
    
    # invert scaling for forecast
    inv_yhat_full = X_test.copy()
    inv_yhat_full["yhat"] = yhat
    inv_yhat_full[inv_yhat_full.columns]  = scaler.inverse_transform(inv_yhat_full)
    inv_yhat = inv_yhat_full.iloc[:,-1]

    # invert scaling for actual
    inv_y = X_test.copy()
    inv_y["y"] = y_test
    inv_y[inv_y.columns] = scaler.inverse_transform(inv_y)
    inv_y = inv_y.iloc[:,-1]

    # collect de-scaled predictions and actuals side by side
    df_result = pd.DataFrame(index=y_test.index)
    df_result['yhat'] = inv_yhat
    df_result['y']=inv_y
    
    return df_result.sort_index()

def train_validate_RF(X_train, y_train,X_test, y_test):
    
    model = RandomForestRegressor()
    model.fit(X_train, y_train)    
    yhat = model.predict(X_test)

    return (test_(yhat,X_test, y_test),model)


X_train, y_train = train.iloc[:, :-1], train.iloc[:, -1]
X_val, y_val = val.iloc[:, :-1], val.iloc[:, -1]

train_result,model = train_validate_RF(X_train, y_train,X_val, y_val)
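
The `test_` helper above appends the prediction as the last column before calling `inverse_transform`, because the scaler was fitted on every column, target included. The per-column arithmetic behind a min-max scaler with the default (0, 1) range can be sketched with NumPy as follows; the data are made up for illustration.

```python
import numpy as np

# Two columns: one feature and the target (illustrative values).
data = np.array([[1.0, 10.0],
                 [2.0, 30.0],
                 [3.0, 20.0]])

col_min = data.min(axis=0)
col_max = data.max(axis=0)

# Forward scaling: each column is mapped to [0, 1] independently.
scaled = (data - col_min) / (col_max - col_min)

# Inverse scaling recovers the original values exactly.
restored = scaled * (col_max - col_min) + col_min

# A scaled prediction for the target column is de-scaled with that
# column's own min and max, hence the need to keep it in last position.
scaled_prediction = 0.5
prediction = scaled_prediction * (col_max[-1] - col_min[-1]) + col_min[-1]
print(prediction)
```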


In [68]:
X_train

DATE,BS,Rates,BS-1mean,BS-3mean,BS-6mean,BS-12mean,BS+6mean,BS+12mean,Rates-1mean,Rates-3mean,Rates-6mean,Rates-12mean,Rates+6mean,Rates+12mean,week
2008-04-13,0.243145,0.916667,0.211338,0.164614,0.097259,0.125881,0.546672,0.193718,0.916667,0.916667,0.916667,0.950617,0.916667,0.550265,0.269231
2010-09-19,0.188939,0.083333,0.206026,0.145856,0.079324,0.150695,0.688094,0.810558,0.072917,0.052083,0.046875,0.044974,0.083333,0.084656,0.692308
2019-12-29,0.229821,0.458333,0.256681,0.215393,0.172107,0.134282,0.986588,0.864241,0.458333,0.458333,0.458333,0.442681,0.180556,0.091711,0.980769
2015-07-05,0.191430,0.083333,0.221496,0.164300,0.075916,0.122276,0.475714,0.438412,0.083333,0.083333,0.083333,0.084656,0.083333,0.085538,0.500000
2008-03-09,0.152143,0.916667,0.233846,0.164617,0.099559,0.127394,0.512060,0.173160,0.916667,0.916667,0.916667,0.959436,0.916667,0.629630,0.173077
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2017-07-16,0.204032,0.208333,0.213956,0.162584,0.087248,0.106192,0.462259,0.391167,0.208333,0.208333,0.201389,0.165785,0.223958,0.240741,0.519231
2007-06-03,0.217337,1.000000,0.194374,0.178465,0.110890,0.142845,0.500777,0.466126,1.000000,1.000000,1.000000,0.985891,0.944444,0.945326,0.403846
2020-11-22,0.239069,0.000000,0.230236,0.198078,0.094953,0.642347,0.629948,0.724967,0.000000,0.000000,0.000000,0.101411,0.000000,0.000000,0.884615
2013-09-29,0.216184,0.083333,0.252124,0.227065,0.229936,0.368109,0.709571,0.724308,0.083333,0.083333,0.083333,0.084656,0.083333,0.084656,0.730769


In [69]:
from sklearn.metrics import mean_squared_error,mean_absolute_error,r2_score,max_error

display('MSE')
display(mean_squared_error(train_result["y"],train_result["yhat"]))

display('MAE')
display(mean_absolute_error(train_result["y"],train_result["yhat"]))

display('R2')
display(r2_score(train_result["y"],train_result["yhat"]))

display('Max')
display(max_error(train_result["y"],train_result["yhat"]))

'MSE'

5.992345377139711e-06

'MAE'

0.0017576016396788445

'R2'

0.8149633912712595

'Max'

0.010316061471951871

It’s then possible to assess the performance of the model on the validation set and iterate on the feature selection and model parameters until one finds it satisfactory.

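The retained model has a Mean Absolute Error of 0.0014 (the forecasted value being a weekly log return) and an R2 score of 0.9 on the validation set. The run displayed above gives about 0.0018 and 0.81; since neither the train/validation split nor the random forest is seeded, these figures vary from run to run.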
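
For readers unfamiliar with the two metrics, here is a small NumPy sketch of how the Mean Absolute Error and the R2 score are defined. The arrays are invented for the example and have nothing to do with the notebook's results.

```python
import numpy as np

y = np.array([1.0, 2.0, 3.0, 4.0])     # actual values (illustrative)
yhat = np.array([1.1, 1.9, 3.2, 3.8])  # predictions (illustrative)

# Mean Absolute Error: average size of the errors, in the units of y.
mae = np.mean(np.abs(y - yhat))

# R2: share of the variance of y explained by the predictions.
ss_res = np.sum((y - yhat) ** 2)      # residual sum of squares
ss_tot = np.sum((y - y.mean()) ** 2)  # total sum of squares
r2 = 1.0 - ss_res / ss_tot

print(mae, r2)
```

An R2 close to 1 means the predictions track the target closely, while a value near 0 means they do no better than predicting the mean.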

In [70]:
import plotly.graph_objects as go
from plotly.subplots import make_subplots
fig = make_subplots(specs=[[{"secondary_y": True}]])
# y holds the realized values and yhat the model forecast
fig.add_trace(go.Scatter(
    x=train_result.index,
    y=train_result["y"],
    marker_color=colors[2],
    name='Realized'))

fig.add_trace(go.Scatter(
    x=train_result.index,
    y=train_result["yhat"],
    name='Forecast',
    marker_color=colors[0]))

fig.update_layout(
    paper_bgcolor='white',
    plot_bgcolor='#fafafa',
    hovermode='closest',
    title="Prediction on the validation set",
    xaxis = dict(
        title=""
    ),
    yaxis = dict(
        title=""
    ),
    showlegend=True)

fig.show()

In [38]:
model.feature_importances_

array([0.02582074, 0.00147795, 0.06954743, 0.0488537 , 0.0513001 ,
       0.10066238, 0.34304776, 0.07384621, 0.00744974, 0.0181823 ,
       0.03514183, 0.0386274 , 0.05414342, 0.07986426, 0.05203479])

In [39]:
scaled_cols[:-1]

Index(['BS', 'Rates', 'BS-1mean', 'BS-3mean', 'BS-6mean', 'BS-12mean',
       'BS+6mean', 'BS+12mean', 'Rates-1mean', 'Rates-3mean', 'Rates-6mean',
       'Rates-12mean', 'Rates+6mean', 'Rates+12mean', 'week'],
      dtype='object')

In [40]:
import plotly.graph_objects as go
from plotly.subplots import make_subplots

fig = make_subplots(specs=[[{"secondary_y": True}]])

fig.add_trace(go.Bar(
    x=scaled_cols[:-1],
    y=model.feature_importances_,
    marker_color=colors[1],
    name='Forecast',
))



fig.update_layout(
    paper_bgcolor='white',
    plot_bgcolor='#fafafa',
    hovermode='closest',
    title="",
    xaxis = dict(
        title=""
    ),
    yaxis = dict(
        title=""
    ),
    showlegend=False)

fig.show()

The feature importances of the model are shown in the figure above. The anticipated balance sheet change over the coming six months (`BS+6mean`, with an importance of about 0.34) is by far the main feature.
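
To read the importances more easily than from the raw array, they can be paired with the feature names and sorted. The sketch below simply reuses the names and values printed in the two cells above.

```python
import pandas as pd

# Feature names and importances copied from the outputs shown above.
names = ['BS', 'Rates', 'BS-1mean', 'BS-3mean', 'BS-6mean', 'BS-12mean',
         'BS+6mean', 'BS+12mean', 'Rates-1mean', 'Rates-3mean', 'Rates-6mean',
         'Rates-12mean', 'Rates+6mean', 'Rates+12mean', 'week']
importances = [0.02582074, 0.00147795, 0.06954743, 0.0488537, 0.0513001,
               0.10066238, 0.34304776, 0.07384621, 0.00744974, 0.0181823,
               0.03514183, 0.0386274, 0.05414342, 0.07986426, 0.05203479]

# Pair each importance with its feature and rank from most to least important.
ranked = pd.Series(importances, index=names).sort_values(ascending=False)

print(ranked.head(5))
```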

### 5. Model testing
The performance of the final model is then evaluated on the test set, and predictions are computed for each of the monetary policy scenarios defined above.

### Test set

In [71]:
def test_predict(X_test,model,test_set):
    
    yhat_val = model.predict(X_test)

    inv_yhat_test_full = X_test.copy()
    inv_yhat_test_full["yhat"] = yhat_val
    inv_yhat_test_full[inv_yhat_test_full.columns]  = scaler.inverse_transform(inv_yhat_test_full)
    inv_yhat_test = inv_yhat_test_full.iloc[:,-1]
    
    # collect de-scaled predictions next to the realized log returns
    df_result = pd.DataFrame(index=inv_yhat_test.index)
    df_result['yhat'] = inv_yhat_test
    df_result['y']=test_set["close"]

    return df_result.sort_index()



test_and_predict = {}

#predict for the 4 scenarios
for l,f in fed_forecasts:

    test = test_sets[l].copy()

    test_cols = training_cols.copy()

    for i in range(len(training_cols)):

        if "BS" in training_cols[i]:
            test_cols[i] = test_cols[i].replace("BS","BS"+l) 

    test = test[test_cols]
    test[test_cols] = scaler.transform(test[test_cols])
    test = test.dropna()
    X_test, y_test = test.iloc[:, :-1], test.iloc[:, -1]

    forecast = test_predict(X_test,model,test_sets[l])

    test_and_predict[l] = forecast

In [72]:
from sklearn.metrics import mean_squared_error,mean_absolute_error,r2_score,max_error

test_results = test_and_predict["5T"][test_and_predict["5T"].index<="2022-02-20"]

display('MSE')
display(mean_squared_error(test_results["y"],test_results["yhat"]))

display('MAE')
display(mean_absolute_error(test_results["y"],test_results["yhat"]))

display('R2')
display(r2_score(test_results["y"],test_results["yhat"]))

display('Max')
display(max_error(test_results["y"],test_results["yhat"]))

'MSE'

1.324893867652573e-05

'MAE'

0.00277452013956337

'R2'

0.015222903052254488

'Max'

0.008652900175441753

In [73]:
import plotly.graph_objects as go
from plotly.subplots import make_subplots

fig = make_subplots(specs=[[{"secondary_y": True}]])

fig.add_trace(go.Scatter(
    x=test_results.index,
    y=test_results["yhat"],
    marker_color=colors[2],
    name='Forecast',
))

fig.add_trace(go.Scatter(
    x=test_results.index,
    y=test_results["y"],
    name='Realized',
    marker_color=colors[0],
))



fig.update_layout(
    paper_bgcolor='white',
    plot_bgcolor='#fafafa',
    hovermode='closest',
    title="Prediction on the test set",
    xaxis = dict(
        title=""
    ),
    yaxis = dict(
        title=""
    ),
    showlegend=True)

fig.show()

On the test set, the Mean Absolute Error (roughly 0.0025 to 0.0028 depending on the run) is not as low as on the validation set, and the R2 score is close to zero. The figure above nevertheless suggests that the general trend is respected, while short-term volatility and geopolitical events (the war in Ukraine) are not captured, as expected.

### 6. Scenarios predictions and final results
The last step to get our predictions for the four monetary policy scenarios is to invert the log return operation:

In [74]:
last_value = data_set.resample("1w").mean()[data_set.resample("1w").mean().index < test_sets["5T"].index[0]].iloc[-1,-1]  

display(last_value)

def compute_close(l):

    test_and_predict[l]["cum_return"] = test_and_predict[l]["yhat"].cumsum()
    test_and_predict[l]["Price Forecast"] = last_value*np.exp(test_and_predict[l]["cum_return"])
        
    return test_and_predict[l]

4271.322688802084

In [75]:
for l,f in fed_forecasts:
    test_and_predict[l] = compute_close(l)
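
The inversion used in `compute_close` relies on the fact that summing log returns and exponentiating rebuilds the price path from a starting value. A small round-trip check on made-up prices (illustrative numbers only) makes this explicit:

```python
import numpy as np

prices = np.array([100.0, 105.0, 102.0, 110.0])  # illustrative price path

# Forward step: log returns between consecutive prices.
log_ret = np.diff(np.log(prices))

# Inverse step: last known price times exp of the cumulative log return.
last_value = prices[0]
rebuilt = last_value * np.exp(np.cumsum(log_ret))

# The total simple return over the period follows from the same sum.
total_return = np.exp(log_ret.sum()) - 1.0

print(rebuilt, total_return)
```

Because forecast errors accumulate in the cumulative sum, small weekly biases can compound into large differences in the price forecast over several years.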

In [76]:
test_results

DATE,yhat,y
2021-07-04,0.006232,0.003625
2021-07-11,0.006419,0.006714
2021-07-18,0.005592,0.006289
2021-07-25,0.005794,0.004871
2021-08-01,0.005647,0.004155
2021-08-08,0.00573,0.002675
2021-08-15,0.005386,0.002058
2021-08-22,0.005507,0.001841
2021-08-29,0.005604,0.001001
2021-09-05,0.005378,0.002685


In [78]:
import plotly.graph_objects as go
from plotly.subplots import make_subplots

fig = make_subplots(specs=[[{"secondary_y": True}]])
fig2 = make_subplots(specs=[[{"secondary_y": True}]])

for label,df,col in [ ("BS5T",test_and_predict["5T"],"#0D0628"),\
                     ("BS7T",test_and_predict["7T"],"#885053"),\
                     ("BS8T",test_and_predict["8T"],"#FE5F55"),\
                     ("BS9T",test_and_predict["9T"],"#7FD1B9")]:
    
    fig2.add_trace(go.Scatter(
        x=fed[(fed.index>="2021-07-01")&(fed.index<="2025-01-26")].index,
        y=fed[(fed.index>="2021-07-01")&(fed.index<="2025-01-26")][label],
        name='Balance Sheet ' + label,
        marker_color=col
    ))  
    
    
    fig.add_trace(go.Scatter(
        x=df.index,
        y=df["Price Forecast"],
        name='Forecast ' + label,
        marker_color=col
    ))

    
real = data_set.resample("1w").mean()
real = real[real.index.isin(df.index)]
real = real[real.index<="2022-04-03"]
    
fig.add_trace(go.Scatter(
    x=real.index,
    y=real["close"],
    name='Real',
))

fig2.update_layout(
    paper_bgcolor='white',
    plot_bgcolor='#fafafa',
    hovermode='closest',
    title="FED Balance Sheet Scenario",
    xaxis = dict(
        title=""
    ),
    yaxis = dict(
        title="Millions of USD"
    ),
    showlegend=True)

fig.update_layout(
    paper_bgcolor='white',
    plot_bgcolor='#fafafa',
    hovermode='closest',
    title="S&P500 Forecast",
    xaxis = dict(
        title=""
    ),
    yaxis = dict(
        title="Index points"
    ),
    showlegend=True)

fig2.show()
fig.show()

As expected, the more the balance sheet shrinks, the less optimistic the model is about the future.

For the 9 and 8 trillion balance sheet scenarios, the future is not so dark, with returns of around 20% by the end of 2025.

The 7 trillion scenario seems more uncertain, with a forecasted return of no more than 10% by the end of 2025.

Finally, the 5 trillion scenario looks like the Mother of All Crashes, with a return of -50% driven by the pace of the tightening, which hopefully is a bit unrealistic.

One final note: the market probably anticipates the hawkish policy a bit earlier than the model, as the real S&P500 curve starts to diverge from the forecast in January 2022, while the two were almost perfectly aligned before. The divergence could also be explained by geopolitical events (the war in Ukraine) that are not captured by the model.