In [1]:
# installing requirements from txt file
#pip install -r requirements.txt

In [15]:
# importing necessary libraries
import pandas as pd
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, Dropout
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
import yfinance as yf
from datetime import datetime
from sklearn.metrics import mean_squared_error

# **Step 1: Data retrieval and indicators calculation**

The idea is to consider a portfolio consisting only of the SPY ETF, since holding an ETF that replicates the Standard & Poor's 500 has historically been regarded as one of the more reliable long-term investments.

In [16]:
# downloading monthly prices of the SPY ETF; the VIX data is monthly, so we keep returns monthly as well
spy_prices = yf.download('SPY', start = '2005-07-01', end = '2024-12-31', interval = '1mo') # the start date matches the first availability of the VIX futures historical term structure
spy_prices = spy_prices['Adj Close']
spy_prices

[*********************100%%**********************]  1 of 1 completed


Date
2005-07-01     85.602913
2005-08-01     84.800438
2005-09-01     85.118645
2005-10-01     83.459297
2005-11-01     87.127548
                 ...    
2024-08-01    560.071289
2024-09-01    570.086792
2024-10-01    566.732605
2024-11-01    600.528809
2024-12-01    584.114075
Name: Adj Close, Length: 234, dtype: float64

In [17]:
spy_rets = spy_prices.pct_change().dropna() # computing returns and dropping NAs (most importantly, dropping the first observation)
spy_rets.rename('SPY returns', inplace = True) # renaming the column as now we have returns and not prices
spy_rets

Date
2005-08-01   -0.009374
2005-09-01    0.003752
2005-10-01   -0.019495
2005-11-01    0.043953
2005-12-01   -0.007176
                ...   
2024-08-01    0.023365
2024-09-01    0.017883
2024-10-01   -0.005884
2024-11-01    0.059633
2024-12-01   -0.027334
Name: SPY returns, Length: 233, dtype: float64
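
As a quick sanity check of the return calculation, the short sketch below recomputes the first two monthly returns from the adjusted closes printed above. It only illustrates what `pct_change` does; the three prices are copied from the output and nothing is downloaded.

```python
import pandas as pd

# First three adjusted closes, copied from the SPY price output above
prices = pd.Series(
    [85.602913, 84.800438, 85.118645],
    index=pd.to_datetime(["2005-07-01", "2005-08-01", "2005-09-01"]),
)

# pct_change computes P_t / P_{t-1} - 1; the first row has no predecessor
# and becomes NaN, which is why it is dropped
returns = prices.pct_change().dropna()

# The same value computed by hand for August 2005
manual = prices.iloc[1] / prices.iloc[0] - 1

print(returns.round(6))
print(round(manual, 6))
```

The rounded values agree with the first two rows of the returns series shown above.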

Next, we load the VIX futures (UX1 Index) term structure, downloaded from Bloomberg in January 2025 and covering the following months of 2025. The idea is to use this data to forecast SPY returns for 2025 and, more generally, to build scenarios for the upcoming months and see how our strategy behaves in them:

In [18]:
vix_data_ahead = pd.read_excel('VIX_term_structure_20250117.xlsx', header = 0) # loading VIX futures ahead term structure data
vix_data_ahead = vix_data_ahead.drop(vix_data_ahead.index[0]) # removing first unnecessary row

for i in range(2, len(vix_data_ahead) + 1): # turning the Period column into datetime, needed for further analyses (the 'Spot' row at index 1 is skipped)
    vix_data_ahead.loc[i, 'Period'] = datetime.strptime(vix_data_ahead.loc[i, 'Period'], '%m/%Y') # using .loc avoids chained assignment
    
vix_data_ahead

Unnamed: 0,Tenor,Ticker,Period,Last Price,Days to expiration
1,Spot,VIX Index,Spot,15.97,0.0
2,1M,UXF5 Index,2025-01-01 00:00:00,16.1792,30.0
3,1M,UXG5 Index,2025-02-01 00:00:00,17.2382,60.0
4,2M,UXH5 Index,2025-03-01 00:00:00,17.8351,90.0
5,3M,UXJ5 Index,2025-04-01 00:00:00,18.1962,120.0
6,4M,UXK5 Index,2025-05-01 00:00:00,18.4005,150.0
7,5M,UXM5 Index,2025-06-01 00:00:00,18.5484,180.0
8,6M,UXN5 Index,2025-07-01 00:00:00,18.825,210.0
9,7M,UXQ5 Index,2025-08-01 00:00:00,18.8,240.0
10,8M,UXU5 Index,2025-09-01 00:00:00,19.1,270.0


In the table above, the Tenor column gives the maturity of each point on the term structure. The Ticker column gives the Bloomberg ticker for that row, the Period column holds the maturity month as a datetime, and the last two columns hold the price and the days to expiration. We conduct the analysis as if we were at the beginning of January 2025 and keep that date fixed, since updating it in real time would be too computationally expensive.
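
The Period conversion can also be done in one vectorized call. The sketch below is an alternative, not the notebook's method: it assumes the raw Excel column holds 'MM/YYYY' strings plus the 'Spot' label (the frame `raw` is hypothetical), and it turns 'Spot' into `NaT` instead of keeping the string.

```python
import pandas as pd

# Hypothetical raw values, assuming 'MM/YYYY' strings plus a 'Spot' label
raw = pd.DataFrame({"Period": ["Spot", "01/2025", "02/2025"]})

# errors="coerce" turns anything that does not match the format into NaT
parsed = pd.to_datetime(raw["Period"], format="%m/%Y", errors="coerce")

print(parsed)
```

The trade-off is that the column becomes a proper datetime column, at the cost of losing the literal 'Spot' label in the first row.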

In addition, we load the historical VIX futures term structure, which we will use, together with historical SPY returns, to train our model. The loading and data manipulation steps are essentially the same as for the ahead term structure, with an additional sort by date:

In [19]:
vix_data_hist = pd.read_excel('hist_vix_term_structure.xlsx', header = 0) # loading VIX futures historical term structure data
vix_data_hist = vix_data_hist.drop(vix_data_hist.index[0]) # removing first unnecessary row

for i in range(1, len(vix_data_hist) + 1): # turning the Period column into datetime, needed for further analyses
    vix_data_hist.loc[i, 'Period'] = datetime.strptime(vix_data_hist.loc[i, 'Period'], '%m/%Y') # using .loc avoids chained assignment
    
vix_data_hist = vix_data_hist.sort_values(by = 'Period') # in the historical data, futures prices are not sorted by date
vix_data_hist

Unnamed: 0,Tenor,Ticker,Period,Last Price,Days past
166,5Y,UXG05 Index,2005-02-01 00:00:00,11.4800,7170.0
167,61M,UXH05 Index,2005-03-01 00:00:00,12.2200,7140.0
168,63M,UXK05 Index,2005-05-01 00:00:00,13.1600,7080.0
169,66M,UXQ05 Index,2005-08-01 00:00:00,14.4300,6990.0
162,6Y,UXG06 Index,2006-02-01 00:00:00,12.8700,6810.0
...,...,...,...,...,...
5,292M,UXM4 Index,2024-06-01 00:00:00,16.6821,210.0
6,293M,UXN4 Index,2024-07-01 00:00:00,17.1011,180.0
7,294M,UXQ4 Index,2024-08-01 00:00:00,17.3603,150.0
8,295M,UXU4 Index,2024-09-01 00:00:00,17.6000,120.0


We now define three functions to compute the indicators that form the innovative part of the strategy. The rolling volatility, rolling correlation and rate of change indicators will be used to double-check that the switches between momentum and mean-reversion strategies make sense, or to question them.

In [20]:
# creating functions for the three indicators which will compose the innovative part of our approach

def rolling_std(series, time_interval): # defining a function for volatility, which we consider as rolling standard deviation
    return series.rolling(window = time_interval).std()

def rolling_correlation(series1, series2, time_interval): # defining a function for the rolling correlation
    return series1.rolling(window = time_interval).corr(series2)

def ROC(series): # defining a function for the rate of change, which will be primarily used for the VIX slope
    return series.pct_change()
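
To make the behaviour of the three indicator functions concrete, here is a small sketch on toy data. The series `x`, `y` and `with_zero` are made up for illustration; the functions are repeated so the block runs on its own.

```python
import numpy as np
import pandas as pd

def rolling_std(series, time_interval):
    return series.rolling(window=time_interval).std()

def rolling_correlation(series1, series2, time_interval):
    return series1.rolling(window=time_interval).corr(series2)

def ROC(series):
    return series.pct_change()

# Toy inputs: y is an exact linear function of x
x = pd.Series([1.0, 2.0, 4.0, 8.0])
y = 2 * x + 1

vol = rolling_std(x, 3)              # first two entries are NaN (window not full yet)
corr = rolling_correlation(x, y, 3)  # equals 1 once the window is full
roc = ROC(x)                         # x doubles at each step, so the rate of change is 1

# A zero in the series makes the next rate of change infinite
with_zero = ROC(pd.Series([0.0, 1.0]))

print(vol, corr, roc, with_zero, sep="\n\n")
```

The last case is the reason infinite values have to be handled when the rate of change is applied to a slope series that contains zeros.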

Furthermore, we define a function that builds a new term structure with recalculated prices; we apply it to both the historical and the ahead data.
The prices are calculated by linear interpolation at a target maturity, taken as the midpoint between two consecutive maturities, so that the new term structure approximates a constant-maturity one.

In [21]:
def constant_mat_term_structure(vix_term_structure):
    """
    This function computes the linear interpolation of VIX futures prices 
    for generating a constant maturity term structure.
    
    It takes the VIX dataframe, be it either the historical one or the future one, as input,
    and will return the interpolated prices of the VIX futures.
    """
    
    constant_maturity_prices = [] # allocating memory for the prices computed with constant maturity approach
    for i in range(1, len(vix_term_structure)): # looping over all observations of the dataframe fed to the function
        maturity1 = vix_term_structure.loc[i - 1, "Days to expiration"] # first maturity
        maturity2 = vix_term_structure.loc[i, "Days to expiration"] # second maturity
        price1 = vix_term_structure.loc[i - 1, "Last Price"] # first price
        price2 = vix_term_structure.loc[i, "Last Price"] # second price
        
        target_maturity = (maturity1 + maturity2) / 2 # the target maturity is identified as the middle point between maturity 1 and 2
        
        # formula decided to be used for price interpolation
        interpolated_price = price1 * (maturity2 - target_maturity) / (maturity2 - maturity1) + price2 * (target_maturity - maturity1) / (maturity2 - maturity1)
        constant_maturity_prices.append(interpolated_price) # appending each result of the loop in the initially created variable
    
    constant_maturity_prices.insert(0, None) # adding nan for the first row since it doesn't have a previous contract
    vix_term_structure["Constant Maturity Price"] = constant_maturity_prices # adding the newly computed prices to the original table
    return vix_term_structure
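
A worked example may help to see what the interpolation does. The sketch below applies the same formula to the spot VIX and the first future from the ahead table; `interpolate_price` is a hypothetical helper introduced only for this illustration. Because the target is the midpoint, the formula reduces to the simple average of the two prices.

```python
def interpolate_price(price1, maturity1, price2, maturity2, target_maturity):
    """Linear interpolation between two (maturity, price) points."""
    width = maturity2 - maturity1
    weight1 = (maturity2 - target_maturity) / width
    weight2 = (target_maturity - maturity1) / width
    return price1 * weight1 + price2 * weight2

# Spot VIX and the UXF5 contract, copied from the ahead table
price1, maturity1 = 15.97, 0.0
price2, maturity2 = 16.1792, 30.0

target = (maturity1 + maturity2) / 2  # midpoint, as in the function above
interpolated = interpolate_price(price1, maturity1, price2, maturity2, target)

print(round(interpolated, 4))
```

The result matches the first interpolated price obtained for the ahead term structure. The helper also accepts any other target between the two maturities, should a fixed horizon be preferred over the midpoint.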

Running the function for both historical and 2025 VIX futures term structures:

In [22]:
vix_data_ahead = vix_data_ahead.reset_index(drop = True) # resetting the index of the ahead term structure for data manipulation
constant_maturity_ahead = constant_mat_term_structure(vix_data_ahead) # computing constant maturity ahead term structure

vix_data_hist = vix_data_hist.reset_index(drop = True) # resetting the index of the historical term structure for data manipulation
vix_data_hist.rename(columns = {'Days past': 'Days to expiration'}, inplace = True) # temporarily renaming the column, as the function expects 'Days to expiration'
constant_maturity_hist = constant_mat_term_structure(vix_data_hist) # computing constant maturity historical term structure
vix_data_hist.rename(columns = {'Days to expiration': 'Days past'}, inplace = True) # restoring the original column name (the function modifies vix_data_hist in place, so constant_maturity_hist is renamed too)

In [23]:
print("Constant Maturity Term Structure ahead:") # printing the new term structure, made of prices at constant maturity
constant_maturity_ahead[['Period', 'Constant Maturity Price']]

Constant Maturity Term Structure ahead:


Unnamed: 0,Period,Constant Maturity Price
0,Spot,
1,2025-01-01 00:00:00,16.0746
2,2025-02-01 00:00:00,16.7087
3,2025-03-01 00:00:00,17.53665
4,2025-04-01 00:00:00,18.01565
5,2025-05-01 00:00:00,18.29835
6,2025-06-01 00:00:00,18.47445
7,2025-07-01 00:00:00,18.6867
8,2025-08-01 00:00:00,18.8125
9,2025-09-01 00:00:00,18.95


In [24]:
print("Constant Maturity Term Structure historical:") # printing the new term structure, made of prices at constant maturity
constant_maturity_hist[['Period', 'Constant Maturity Price']]

Constant Maturity Term Structure historical:


Unnamed: 0,Period,Constant Maturity Price
0,2005-02-01 00:00:00,
1,2005-03-01 00:00:00,11.85000
2,2005-05-01 00:00:00,12.69000
3,2005-08-01 00:00:00,13.79500
4,2006-02-01 00:00:00,13.65000
...,...,...
164,2024-06-01 00:00:00,16.51625
165,2024-07-01 00:00:00,16.89160
166,2024-08-01 00:00:00,17.23070
167,2024-09-01 00:00:00,17.48015


Comparing the two constant-maturity term structures with the original ones loaded above shows that the interpolated prices stay close to the original prices.

Next, we compute the slope of the constant-maturity VIX futures term structure in normalized form, that is, the difference in prices divided by the difference in days:

In [25]:
# computing the slope of the ahead term structure
constant_maturity_ahead['vix_slope'] = constant_maturity_ahead["Constant Maturity Price"].diff() / constant_maturity_ahead["Days to expiration"].diff()
constant_maturity_ahead['vix_slope']

0         NaN
1         NaN
2    0.021137
3    0.027598
4    0.015967
5    0.009423
6    0.005870
7    0.007075
8    0.004193
9    0.004583
Name: vix_slope, dtype: float64

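Applying the same calculation to the historical constant-maturity term structure. Note that 'Days past' decreases as Period advances, so the denominator is negative and the resulting slope has the opposite sign to the ahead convention; whether to flip it with a negative sign is still to be evaluated: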

In [26]:
# computing the slope of the historical term structure
constant_maturity_hist["vix_slope"] = constant_maturity_hist["Constant Maturity Price"].diff() / constant_maturity_hist["Days past"].diff() # 'Days past' decreases over time, so the sign is opposite to the ahead slope; a negative sign in front may be needed
constant_maturity_hist["vix_slope"]

0           NaN
1           NaN
2     -0.014000
3     -0.012278
4      0.000806
         ...   
164   -0.013038
165   -0.012512
166   -0.011303
167   -0.008315
168   -0.052328
Name: vix_slope, Length: 169, dtype: float64
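
The sign difference between the two slopes can be checked by hand. The sketch below recomputes one slope value from each term structure, using prices and day counts copied from the tables above; the small frames `ahead` and `hist` are built only for this illustration.

```python
import pandas as pd

# Two consecutive rows of the ahead term structure (days to expiration increase)
ahead = pd.DataFrame({"price": [16.0746, 16.7087], "days": [30.0, 60.0]})

# Two consecutive rows of the historical term structure (days past decrease)
hist = pd.DataFrame({"price": [11.85, 12.69], "days": [7140.0, 7080.0]})

ahead_slope = (ahead["price"].diff() / ahead["days"].diff()).iloc[1]
hist_slope = (hist["price"].diff() / hist["days"].diff()).iloc[1]

print(round(ahead_slope, 6))  # prices rise, days rise
print(round(hist_slope, 6))   # prices rise, days fall
```

In both cases the price rises with the later period, yet the historical slope comes out negative because its day count runs backwards. Multiplying the historical slope by -1 would put the two on the same convention, if that is what is intended.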

An initial calculation of the three indicators follows. To compute the rolling correlation between the slope of the VIX futures term structure and the SPY returns, we combine the two series in a single dataframe for ease of computation.

In [27]:
# now the idea is to merge the historical VIX dataframe with returns from SPY, so that we have aligned data and we can compute correlation
constant_maturity_hist = constant_maturity_hist.set_index('Period', drop = True) # setting the dates as index
correlation_dataset = constant_maturity_hist.join(spy_rets, how = 'left') # adding the returns from SPY to the historical VIX dataframe
correlation_dataset = correlation_dataset.drop(['Tenor', 'Ticker', 'Last Price', 'Days past', 'Constant Maturity Price'], axis = 1) # dropping unnecessary columns for correlation analysis

correlation_dataset

Unnamed: 0_level_0,vix_slope,SPY returns
Period,Unnamed: 1_level_1,Unnamed: 2_level_1
2005-02-01,,
2005-03-01,,
2005-05-01,-0.014000,
2005-08-01,-0.012278,-0.009374
2006-02-01,0.000806,0.005726
...,...,...
2024-06-01,-0.013038,0.031951
2024-07-01,-0.012512,0.015374
2024-08-01,-0.011303,0.023365
2024-09-01,-0.008315,0.017883


Using the three functions defined above, we compute the indicators that will serve as a double check after the momentum transformer:

In [29]:
vol_indicator = rolling_std(spy_rets, time_interval = 5).dropna() # calculating volatility indicator on the returns of SPY
correlation_indicator = rolling_correlation(correlation_dataset['vix_slope'], correlation_dataset['SPY returns'], time_interval = 5).dropna() # computing correlation indicator between SPY returns and historical VIX slope
roc_indicator = ROC(correlation_dataset['vix_slope']).dropna() # calculating rate of change of the historical VIX futures constant maturity term structure slope
roc_indicator.replace([np.inf, -np.inf], np.nan, inplace = True) # replacing infinite values with nan, as there are a couple of zeros in the slope columns which generate inf
roc_indicator.dropna(inplace = True) # dropping again rows with nan values

In [31]:
print("Historical rolling volatility indicator:")
print(vol_indicator)

Historical rolling volatility indicator:
Date
2005-12-01    0.024689
2006-01-01    0.026147
2006-02-01    0.026042
2006-03-01    0.020082
2006-04-01    0.013524
                ...   
2024-08-01    0.032965
2024-09-01    0.014218
2024-10-01    0.014054
2024-11-01    0.023751
2024-12-01    0.032741
Name: SPY returns, Length: 229, dtype: float64


In [32]:
print("Historical rolling correlation indicator:")
print(correlation_indicator)

Historical rolling correlation indicator:
Period
2006-08-01    0.411577
2007-02-01    0.156482
2007-03-01    0.276884
2007-04-01   -0.271600
2007-05-01   -0.628516
                ...   
2024-06-01    0.292605
2024-07-01    0.264485
2024-08-01    0.620177
2024-09-01   -0.911279
2024-10-01    0.866637
Length: 162, dtype: float64


In [33]:
print("Historical VIX slope rate of change indicator:")
print(roc_indicator)

Historical VIX slope rate of change indicator:
Period
2005-08-01    -0.123016
2006-02-01    -1.065611
2006-03-01    25.068966
2006-05-01    -1.690476
2006-08-01    -0.022989
                ...    
2024-06-01    -0.420519
2024-07-01    -0.040394
2024-08-01    -0.096577
2024-09-01    -0.264376
2024-10-01     5.293245
Name: vix_slope, Length: 163, dtype: float64


# **Step 2: Training the Momentum Transformer model**

To train the momentum transformer model, we take the correlation_dataset dataframe created above, because it contains everything needed for training: the input variable (x) is the slope of the historical VIX futures term structure, while the target variable (y) is the SPY returns.

In [34]:
correlation_dataset = correlation_dataset.dropna() # dropping rows where the slope or the return is missing
spy_rets_training = correlation_dataset['SPY returns'] # identifying the historical SPY returns for training
vix_slope_training = correlation_dataset['vix_slope'] # identifying the historical VIX slope for training

Computing the three indicators for the training part:

In [35]:
vol_indicator_training = rolling_std(spy_rets_training, time_interval = 5).dropna() # computing volatility indicator for the training part
correlation_indicator_training = rolling_correlation(spy_rets_training, vix_slope_training, time_interval = 5).dropna() # computing correlation indicator for the training part
roc_indicator_training = ROC(vix_slope_training).dropna() # computing roc indicator for the training part
roc_indicator_training.replace([np.inf, -np.inf], np.nan, inplace = True) # replacing infinite values with nan, as there are a couple of zeros in the slope columns which generate inf
roc_indicator_training.dropna(inplace = True) # dropping again rows with nan values

Combining the two variables into a dedicated dataframe with its own name, to make clear that from here on we are grouping the data needed for the momentum transformer model.

In [36]:
transformer_train_data = pd.DataFrame({ # combining historical vix slope and SPY returns data for training
    'Historical VIX futures slope': vix_slope_training,
    'Stock Returns': spy_rets_training
}).dropna()

transformer_train_data

Unnamed: 0_level_0,Historical VIX futures slope,Stock Returns
Period,Unnamed: 1_level_1,Unnamed: 2_level_1
2005-08-01,-0.012278,-0.009374
2006-02-01,0.000806,0.005726
2006-03-01,0.021000,0.012477
2006-05-01,-0.014500,-0.030121
2006-08-01,-0.014167,0.021822
...,...,...
2024-06-01,-0.013038,0.031951
2024-07-01,-0.012512,0.015374
2024-08-01,-0.011303,0.023365
2024-09-01,-0.008315,0.017883


Moving on, we use StandardScaler to standardize the two variables that enter the model. In addition, we feed the data to the model as sequences: instead of providing single observations, we provide windows of consecutive observations, so that the model can learn temporal dependencies over the length of the window rather than treating each data point in isolation. Note that, as written, each window contains both standardized columns (VIX slope and SPY returns), not the slope alone.
In our case, a sequence length of 5 means the model learns from windows of 5 consecutive observations (nominally 5 months of data). The sequence length can therefore be treated as a hyperparameter, and we are going to explore other values as well.

In [37]:
scaler_transformer = StandardScaler() # activating the scaler for standardizing the two variables
scaled_features_transformer = scaler_transformer.fit_transform(transformer_train_data) # standardizing

X_transformer, y_transformer = [], [] # empty lists that will collect the input sequences and the targets
sequence_length = 5 # number of consecutive observations fed to the model at once, in order to try to capture temporal dependencies

for i in range(sequence_length, len(scaled_features_transformer)): # building one sequence per target observation
    X_transformer.append(scaled_features_transformer[i-sequence_length:i]) # each window holds both scaled columns (VIX slope and SPY returns); open question: use the VIX slope only?
    y_transformer.append(spy_rets_training.iloc[i]) # the target is the unscaled SPY return that follows the window

X_transformer = np.array(X_transformer) # turning the list into a numpy array
y_transformer = np.array(y_transformer) # turning the list into a numpy array
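
The windowing logic is easier to follow on a tiny example. The sketch below uses made-up arrays (`features` and `target` are hypothetical) with 8 time steps and 2 features, and shows the shapes the loop produces.

```python
import numpy as np

# Toy data: 8 time steps, 2 features per step, one target per step
features = np.arange(16, dtype=float).reshape(8, 2)
target = np.arange(8, dtype=float)

sequence_length = 5
X, y = [], []
for i in range(sequence_length, len(features)):
    X.append(features[i - sequence_length:i])  # rows i-5 .. i-1
    y.append(target[i])                        # value right after the window

X = np.array(X)
y = np.array(y)

print(X.shape)  # (samples, time steps, features)
print(y)
```

With 8 observations and windows of 5, only 3 samples remain. Consecutive windows share 4 of their 5 rows, which is worth keeping in mind when the samples are later split into training and testing sets.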

Splitting the two variables into a training and a testing part, for now considering an 80/20 split.

In [38]:
X_train_transformer, X_test_transformer, y_train_transformer, y_test_transformer = train_test_split(
    X_transformer, y_transformer, test_size = 0.2, random_state = 42
) # training-testing dataframes split, for now going with 80/20
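
One point to keep in mind: `train_test_split` shuffles the samples by default, so with overlapping time-series windows the test set can contain observations that lie between training observations. A chronological split is a common alternative; the sketch below shows one on toy arrays using plain slicing (passing `shuffle = False` to `train_test_split` has the same effect). This is a suggestion, not what the notebook currently does.

```python
import numpy as np

# Toy samples ordered in time: 10 samples, 2 features each
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# Keep the first 80% for training and the last 20% for testing
split = int(len(X) * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

print(len(X_train), len(X_test))
```

With this split every test sample comes after every training sample, which mimics how the model would be used on future data.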

Once the arrays are prepared, we define the model and fit it on the training data. We opted for a Keras Sequential model with two LSTM layers: the first, with 64 hidden units, learns an initial representation of the data, and the second, with 32 hidden units, refines it into a more compact one.
The output layer uses a sigmoid activation (we are exploring whether to recast the problem as classification), Adam is the chosen optimizer and the mean squared error is the loss function.
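
A property of the sigmoid output is worth checking before training: it only produces values strictly between 0 and 1, while monthly returns are often negative. The NumPy sketch below illustrates this with the December 2024 return from Step 1; the `sigmoid` helper is written by hand for the illustration and is not the Keras layer itself.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, the same mapping applied by a sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-z))

# Outputs over a wide range of pre-activation values
z = np.linspace(-10, 10, 201)
outputs = sigmoid(z)

# A negative monthly return from the data (December 2024)
target = -0.027334

# Smallest squared error any sigmoid output on this grid can reach for that target
best_squared_error = ((outputs - target) ** 2).min()

print(outputs.min(), outputs.max())
print(best_squared_error)
```

Since every output is positive, a negative return can never be matched exactly. If the goal stays regression on raw returns, a linear output (`activation = 'linear'` in the Dense layer) is one option to consider; for a classification setup, the sigmoid would instead be paired with a binary target such as the sign of the return.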

In [39]:
transformer_model = Sequential([ # stacking the layers linearly with Sequential, so that each layer feeds the next
    LSTM(64, input_shape = (X_transformer.shape[1], X_transformer.shape[2]), return_sequences = True), # first LSTM (Long Short-Term Memory) layer, returning the full sequence for the next LSTM
    Dropout(0.2), # first dropout layer (20% rate)
    LSTM(32, return_sequences = False), # second LSTM layer, returning only the last output
    Dropout(0.2), # second dropout layer (20% rate)
    Dense(1, activation = 'sigmoid') # the output is a single value; sigmoid is used for now, recasting as classification is still to be explored
])

transformer_model.compile(optimizer = 'adam', loss = 'mean_squared_error') # compiling with adam optimizer and mean-squared error loss
res = transformer_model.fit(X_train_transformer, y_train_transformer, # fitting the model on the training part of the data
                            epochs = 20, batch_size = 16, 
                            validation_data = (X_test_transformer, y_test_transformer)) # test data used as validation set to monitor the loss after each epoch

Epoch 1/20
8/8 ━━━━━━━━━━━━━━━━━━━━ 8s 193ms/step - loss: 0.2420 - val_loss: 0.2254
Epoch 2/20
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 24ms/step - loss: 0.2120 - val_loss: 0.1945
Epoch 3/20
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 24ms/step - loss: 0.1797 - val_loss: 0.1425
Epoch 4/20
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 24ms/step - loss: 0.1159 - val_loss: 0.0633
Epoch 5/20
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 25ms/step - loss: 0.0428 - val_loss: 0.0097
Epoch 6/20
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 25ms/step - loss: 0.0069 - val_loss: 0.0026
Epoch 7/20
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 24ms/step - loss: 0.0021 - val_loss: 0.0023
Epoch 8/20
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 24ms/step - loss: 0.0019 - val_loss: 0.0023
Epoch 9/20
8/8 ━━━━━━━━━━━━━━━━━━━━

Extracting forecasts and calculating the root mean squared error between them and the test portion of the SPY returns:

In [40]:
forecasts = transformer_model.predict(X_test_transformer) # generating forecasts on the testing part of the x variables
rmse = np.sqrt(mean_squared_error(y_test_transformer, forecasts)) # calculating root mean squared error as error measure between testing part of returns and forecasts
print(rmse)

2/2 ━━━━━━━━━━━━━━━━━━━━ 1s 693ms/step
0.04827738524774549


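The RMSE obtained is around 0.048, that is, roughly 4.8 percentage points of monthly return. This is of the same order of magnitude as the monthly SPY returns themselves (the rolling volatility values displayed in Step 1 lie roughly between 0.013 and 0.033), so the figure alone is not enough to conclude that the forecasts are accurate; it should be compared against a simple baseline.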
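
One simple way to put an RMSE figure in context is to compare it with the RMSE of a naive forecast. The sketch below uses the last five SPY returns printed in Step 1 as a stand-in for a test set (an assumption made only for illustration; the real test set is `y_test_transformer`) and a constant forecast equal to their mean.

```python
import numpy as np

# Stand-in test returns: the last five monthly SPY returns shown in Step 1
y_true = np.array([0.023365, 0.017883, -0.005884, 0.059633, -0.027334])

# Naive baseline: always predict the average return
baseline_pred = np.full_like(y_true, y_true.mean())
baseline_rmse = np.sqrt(np.mean((y_true - baseline_pred) ** 2))

print(baseline_rmse)
```

The RMSE of a constant mean forecast equals the population standard deviation of the returns, so a model only adds value if its RMSE is clearly below the volatility of the series it predicts. In practice the baseline mean would be taken from the training returns rather than from the test set.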

# **Step 3: Using trained model for forecasting 2025 scenarios**

Since the true SPY returns for 2025 do not exist yet, instead of building another machine learning or deep learning model to forecast them, we propose using the SPY price estimates for 2025 provided by the Economy Forecast Agency (EFA). Their website is https://usdforecast.com/ and the specific estimates we use are at https://longforecast.com/spy-stock

In [41]:
spy_prices_pred_2025 = pd.read_excel('spy_pred_2025.xlsx') # loading predicted prices of SPY for 2025
spy_prices_pred_2025 = spy_prices_pred_2025[['Month', 'Close']] # taking only the month and the close prices
spy_prices_pred_2025

Unnamed: 0,Month,Close
0,Feb,603
1,Mar,605
2,Apr,621
3,May,604
4,Jun,637
5,Jul,640
6,Aug,646
7,Sep,679
8,Oct,690
9,Nov,727
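
Since the model works with returns rather than prices, the forecast closes would presumably need the same transformation as the historical prices. The sketch below shows that step on the first four rows of the table above; the `Return` column and the frame `pred` are assumptions of this illustration, not part of the notebook.

```python
import pandas as pd

# First four forecast closes, copied from the table above
pred = pd.DataFrame({
    "Month": ["Feb", "Mar", "Apr", "May"],
    "Close": [603, 605, 621, 604],
})

# Month-over-month implied return; the first month has no predecessor
pred["Return"] = pred["Close"].pct_change()

print(pred)
```

The first row is left without a return here. One could anchor it to the last observed SPY price instead, but that choice is not specified in the source.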


In [42]:
forward_vix_slope = constant_maturity_ahead[['Period', 'vix_slope']] # storing the ahead vix slope in a new variable
forward_vix_slope

Unnamed: 0,Period,vix_slope
0,Spot,
1,2025-01-01 00:00:00,
2,2025-02-01 00:00:00,0.021137
3,2025-03-01 00:00:00,0.027598
4,2025-04-01 00:00:00,0.015967
5,2025-05-01 00:00:00,0.009423
6,2025-06-01 00:00:00,0.00587
7,2025-07-01 00:00:00,0.007075
8,2025-08-01 00:00:00,0.004193
9,2025-09-01 00:00:00,0.004583


# **Step 4: Backtesting**