# Build and Train Long Short-Term Memory (LSTM) Models
Author: Amaris Williams, PhD

Date: October 2025

This analysis is loosely based on https://doi.org/10.21203/rs.3.rs-7509723/v1
### Set Up Environment

In [1]:
# set up environment
import numpy as np
import pandas as pd
import tensorflow as tf
import keras
from keras import activations
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
import matplotlib.pyplot as plt
import seaborn as sns
from tensorflow.keras.callbacks import EarlyStopping
import joblib

sns.set_style('darkgrid')
np.random.seed(222)
keras.utils.set_random_seed(222)  # NumPy's seed alone does not seed Keras weight initialization




### Load data processed in notebook 0

In [2]:
path = "/home/anw/Documents/Python/Data/"
fs1 = pd.read_csv(str(path) + "fs1train.csv", header = [0,1], index_col = 0)
fs2 = pd.read_csv(str(path) + "fs2train.csv", header = [0,1], index_col = 0)
fs3 = pd.read_csv(str(path) + "fs3train.csv", header = [0,1], index_col = 0)
fs4 = pd.read_csv(str(path) + "fs4train.csv", header = [0,1], index_col = 0)
y = np.load(str(path) + "snp500sectorsytrain.npy")
pd.set_option('display.max_columns', None)
print(fs1.shape)
print(fs2.shape)
print(fs3.shape)
print(fs4.shape)
fs1

(3241, 12)
(3241, 11)
(3241, 122)
(3241, 121)


,Close,Volume,log_ret1,sigma_roll1,log_ret5,sigma_roll5,log_ret22,sigma_roll22,log_ret66,log_ret132,sigma_roll132,High
,^SPX,^SPX,^SPX,^SPX,^SPX,^SPX,^SPX,^SPX,^SPX,^SPX,^SPX,^VIX
Date,,,,,,,,,,,,
2010-07-14,0.012789,0.453169,0.173274,0.001215,0.285743,0.089897,0.305275,0.185116,-0.205760,-0.245360,0.357834,0.216518
2010-07-15,0.013138,0.456318,0.185703,0.009364,0.240138,0.077959,0.314743,0.185058,-0.207445,-0.250866,0.357734,0.235425
2010-07-16,0.004710,0.530982,-0.094402,0.229083,0.036918,0.158570,0.149257,0.194752,-0.295990,-0.336492,0.367955,0.247505
2010-07-19,0.006409,0.409912,0.229586,0.046721,0.066108,0.161207,0.169823,0.195809,-0.311151,-0.330827,0.368188,0.236213
2010-07-20,0.009671,0.472437,0.279166,0.088928,0.044292,0.153248,0.201585,0.199586,-0.280057,-0.306467,0.369613,0.237526
...,...,...,...,...,...,...,...,...,...,...,...,...
2023-05-25,0.827895,0.415752,0.254942,0.068306,0.042524,0.073869,0.356084,0.110028,0.162576,-0.010178,0.245227,0.139706
2023-05-26,0.842343,0.372420,0.294008,0.101563,0.122938,0.096505,0.409116,0.117050,0.205592,0.002151,0.246651,0.134585
2023-05-30,0.842362,0.423846,0.174855,0.000130,0.122165,0.096502,0.348049,0.099225,0.189916,0.026018,0.245598,0.118566
2023-05-31,0.835510,0.599475,0.118316,0.048000,0.150946,0.084949,0.302797,0.097702,0.203129,0.017299,0.246027,0.119354


In [3]:
# LSTM layers expect 3-D input: (samples, timesteps, features)
def create_3Darray(Xdata, ydata, lookback):
    """Slice Xdata into overlapping windows of `lookback` rows, each paired with the target row that follows it."""
    if len(Xdata) != len(ydata):
        raise ValueError("Xdata must be same length as ydata.")
    X, y = [], []
    for i in range(len(Xdata) - lookback):
        X.append(Xdata[i:i+lookback])      # sequence of lookback timesteps
        y.append(ydata[i+lookback])        # target at the row right after the window
    return np.array(X), np.array(y)

lookback = 22
fs1, y = create_3Darray(fs1, y,lookback)
fs2, hold = create_3Darray(fs2, np.zeros(len(fs2)), lookback)
fs3, hold = create_3Darray(fs3, np.zeros(len(fs3)), lookback)
fs4, hold = create_3Darray(fs4, np.zeros(len(fs4)), lookback)
del hold

print(f' Feature Set 1: {fs1.shape[2]} features')
print(f' Feature Set 2: {fs2.shape[2]} features')
print(f' Feature Set 3: {fs3.shape[2]} features')
print(f' Feature Set 4: {fs4.shape[2]} features')
print(y.shape)

 Feature Set 1: 12 features
 Feature Set 2: 11 features
 Feature Set 3: 122 features
 Feature Set 4: 121 features
(3219, 3)
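
The windowing step is easier to check on a toy array. The sketch below uses a hypothetical helper, `make_windows`, that mirrors `create_3Darray` on plain NumPy arrays; the toy data and names are assumptions for illustration only. It shows that `N` rows yield `N - lookback` windows, which is why 3241 training rows become 3219 samples.

```python
import numpy as np

def make_windows(X, y, lookback):
    """Hypothetical helper mirroring create_3Darray on plain NumPy arrays."""
    if len(X) != len(y):
        raise ValueError("X and y must have the same length.")
    windows, targets = [], []
    for i in range(len(X) - lookback):
        windows.append(X[i:i + lookback])  # rows i .. i+lookback-1
        targets.append(y[i + lookback])    # target at the row right after the window
    return np.array(windows), np.array(targets)

# Toy data: 6 time steps, 2 features; the target is simply the row index.
X_toy = np.arange(12).reshape(6, 2)
y_toy = np.arange(6)

Xw, yw = make_windows(X_toy, y_toy, lookback=3)
print(Xw.shape)   # (3, 3, 2): 6 - 3 windows, 3 time steps each, 2 features
print(yw)         # [3 4 5]
print(3241 - 22)  # 3219, the number of training samples printed above
```

The first `lookback` targets are never used, because no complete window precedes them.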


## Build the Models

Info on LSTMs: https://medium.com/analytics-vidhya/lstms-explained-a-complete-technically-accurate-conceptual-guide-with-keras-2a650327e8f2

* Increasing ```units``` (the size of the hidden and cell state vectors) gives the layer more capacity to model relationships between features, at the cost of more parameters to fit.
    * Unit counts are often chosen as powers of 2 (32, 64, 128, etc.). This is a convention rather than a requirement; the response that suggested it gave no explanation.
* ```return_sequences``` determines whether the hidden state at every time step is returned (```True```) or only the hidden state at the last time step (```False```).
    * The hidden state is the layer's output at a time step: the output gate (a sigmoid of the current input and the previous hidden state) multiplied by the hyperbolic tangent (tanh) of the cell state (defined below).
    * Set it to ```True``` for every layer of a stacked LSTM architecture except the last, because the next LSTM layer needs a full sequence as input.
    * Set it to ```True``` on the last LSTM layer only if you want one output per time step. If you want a single prediction per input window, as in this notebook, leave it ```False``` on the last layer.
* ```return_state``` determines whether the layer also returns its final hidden state and final cell state in addition to its output (```True```) or returns only the output (```False```).
    * The cell state is the layer's running memory of past inputs. At each step it is multiplied by the forget gate, so older inputs have been scaled more times than newer ones. Forget-gate values close to 0 shrink old information until it is insignificant (forgotten), and values close to 1 preserve it (remembered).
    * This value is almost always left at ```False``` (the default).
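
The relationship between the hidden state, the cell state, and the two return options can be sketched in plain NumPy. This is a minimal illustration of the standard LSTM update, not the Keras implementation: the function `lstm_forward`, the random weights, and the small `units` value are all assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_forward(x_seq, W, U, b, units):
    """Run one LSTM layer over a (timesteps, features) sequence.

    W: (features, 4*units), U: (units, 4*units), b: (4*units,).
    The gate order (input, forget, candidate, output) is a choice made for this sketch.
    """
    h = np.zeros(units)  # hidden state
    c = np.zeros(units)  # cell state
    hidden_states = []
    for x_t in x_seq:
        z = x_t @ W + h @ U + b
        i = sigmoid(z[0 * units:1 * units])  # input gate
        f = sigmoid(z[1 * units:2 * units])  # forget gate
        g = np.tanh(z[2 * units:3 * units])  # candidate values
        o = sigmoid(z[3 * units:4 * units])  # output gate
        c = f * c + i * g                    # forget part of the old memory, add new
        h = o * np.tanh(c)                   # hidden state = output gate * tanh(cell state)
        hidden_states.append(h)
    return np.array(hidden_states), h, c

rng = np.random.default_rng(0)
timesteps, features, units = 22, 12, 4  # 22-step window, 12 features, 4 units (toy size)
x_seq = rng.normal(size=(timesteps, features))
W = rng.normal(scale=0.1, size=(features, 4 * units))
U = rng.normal(scale=0.1, size=(units, 4 * units))
b = np.zeros(4 * units)

all_h, last_h, last_c = lstm_forward(x_seq, W, U, b, units)
print(all_h.shape)   # (22, 4): what return_sequences=True passes to the next layer
print(last_h.shape)  # (4,): what return_sequences=False passes on
```

With ```return_state=True```, a layer would additionally hand back the equivalents of `last_h` and `last_c`.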

Because of the difference in number of features, the models for feature sets 1 and 2 will have different architectures (fewer units and layers) from the models for feature sets 3 and 4.  
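
To get a rough sense of how different the two architectures are, the trainable parameters can be counted by hand. The sketch below assumes the standard LSTM layout (four weight blocks, each with input weights, recurrent weights, and one bias vector) and uses the layer sizes from the model cells in this notebook: 64 and 32 units for feature sets 1 and 2, and 256, 128, and 64 units for feature sets 3 and 4. The helper names are hypothetical.

```python
def lstm_params(input_dim, units):
    # Four blocks (input, forget, candidate, output), each with
    # input weights, recurrent weights, and a bias.
    return 4 * units * (input_dim + units + 1)

def dense_params(input_dim, units):
    return input_dim * units + units

def stacked_params(n_features, lstm_units, n_outputs=3):
    total, input_dim = 0, n_features
    for units in lstm_units:
        total += lstm_params(input_dim, units)
        input_dim = units  # the next layer sees this layer's hidden state
    return total + dense_params(input_dim, n_outputs)

small = stacked_params(12, [64, 32])         # Feature Set 1 architecture
large = stacked_params(122, [256, 128, 64])  # Feature Set 3 architecture
print(small)  # 32227
print(large)  # 634819
```

Calling `model.summary()` on the fitted models is the direct way to confirm these counts.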

### Train and Test the LSTM Model for Feature Set 1
Feature Set 1 contains technical features from ^SPX and the high of ^VIX.

In [4]:
fs1model = Sequential()
fs1model.add(keras.Input(shape=(fs1.shape[1], fs1.shape[2])))
fs1model.add(LSTM(units=64, return_sequences=True))
fs1model.add(LSTM(units=32))
# condense the output of the layers into three predictions: fut_sigma_1, fut_sigma_5, and fut_sigma_22
fs1model.add(Dense(units=3))  
fs1model.compile(optimizer='adam', loss='mean_squared_error')

es = EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True)

fs1model.fit(fs1, y,  # the training data
             epochs=100,  # maximum number of training rounds
             batch_size=1,  # update the weights after every sample; does not affect the shape of predictions
             verbose=2, 
             validation_split=0.1,  # hold out the last 10% of the training data for validation
             callbacks=[es],  # stop early if val_loss has not improved for 5 epochs
             shuffle=False)  # don't shuffle data each epoch, as we have time series data

Epoch 1/100
2897/2897 - 14s - 5ms/step - loss: 0.0032 - val_loss: 0.0044
Epoch 2/100
2897/2897 - 13s - 4ms/step - loss: 0.0056 - val_loss: 0.0044
Epoch 3/100
2897/2897 - 7s - 2ms/step - loss: 0.0060 - val_loss: 0.0045
Epoch 4/100
2897/2897 - 10s - 4ms/step - loss: 0.0059 - val_loss: 0.0043
Epoch 5/100
2897/2897 - 11s - 4ms/step - loss: 0.0047 - val_loss: 0.0042
Epoch 6/100
2897/2897 - 11s - 4ms/step - loss: 0.0049 - val_loss: 0.0045
Epoch 7/100
2897/2897 - 11s - 4ms/step - loss: 0.0038 - val_loss: 0.0047
Epoch 8/100
2897/2897 - 14s - 5ms/step - loss: 0.0044 - val_loss: 0.0046
Epoch 9/100
2897/2897 - 11s - 4ms/step - loss: 0.0042 - val_loss: 0.0051
Epoch 10/100
2897/2897 - 11s - 4ms/step - loss: 0.0038 - val_loss: 0.0055


<keras.src.callbacks.history.History at 0x7c3daf7add30>
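
The point at which training stopped can be reproduced from the log. The sketch below is a simplified re-implementation of the patience rule, not the Keras `EarlyStopping` source; the function name is hypothetical, and the `val_loss` values are copied from the log above, where they are rounded to four decimals.

```python
def early_stopping_trace(val_losses, patience):
    """Return (best_epoch, stop_epoch), both 1-indexed.

    stop_epoch is None if the patience budget is never used up.
    """
    best_loss = float("inf")
    best_epoch = None
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:  # strict improvement resets the counter
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return best_epoch, epoch
    return best_epoch, None

# val_loss per epoch for the Feature Set 1 model, from the training log
fs1_val_loss = [0.0044, 0.0044, 0.0045, 0.0043, 0.0042,
                0.0045, 0.0047, 0.0046, 0.0051, 0.0055]

print(early_stopping_trace(fs1_val_loss, patience=5))  # (5, 10)
```

Because `restore_best_weights=True`, the model used for testing carries the weights from epoch 5, not from the final epoch.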

In [5]:
fs1test = pd.read_csv(str(path) + "fs1test.csv", header = [0,1], index_col = 0)
ytest = np.load(str(path) + "snp500sectorsytest.npy")

fs1test, ytest = create_3Darray(fs1test, ytest, lookback)
print(fs1test.shape)
print(ytest.shape)

(550, 22, 12)
(550, 3)


In [6]:
guesses = fs1model.predict(fs1test, verbose = 2)
guesses = pd.DataFrame(guesses, columns=["fut_sigma_1","fut_sigma_5","fut_sigma_22"])

ytestdf = pd.DataFrame(ytest, columns=["fut_sigma_1","fut_sigma_5","fut_sigma_22"])

mse1 = np.mean((ytestdf["fut_sigma_1"] - guesses["fut_sigma_1"])**2)
mse5 = np.mean((ytestdf["fut_sigma_5"] - guesses["fut_sigma_5"])**2)
mse22 = np.mean((ytestdf["fut_sigma_22"] - guesses["fut_sigma_22"])**2)

print(f'Scaled Mean Squared Error of 1-Day Prediction: {mse1:.3f}')
print(f'Scaled Mean Squared Error of 5-Day Prediction: {mse5:.3f}')
print(f'Scaled Mean Squared Error of 22-Day Prediction: {mse22:.3f}')

results = pd.DataFrame([mse1,mse5,mse22], columns = ["fs1_MSE_scaled"],
                       index = ["1-Day","5-Day","22-Day"])


18/18 - 0s - 10ms/step
Scaled Mean Squared Error of 1-Day Prediction: 0.004
Scaled Mean Squared Error of 5-Day Prediction: 0.006
Scaled Mean Squared Error of 22-Day Prediction: 0.013


In [7]:
# Back transform y to un-scaled volatility
yscaler = joblib.load(str(path) + 'yscaler.gz')
vol = yscaler.inverse_transform(ytest)
predvol = yscaler.inverse_transform(guesses)

vol = pd.DataFrame(vol, columns=["fut_sigma_1","fut_sigma_5","fut_sigma_22"])
predvol = pd.DataFrame(predvol, columns=["fut_sigma_1","fut_sigma_5","fut_sigma_22"])

mse1 = np.mean((vol["fut_sigma_1"] - predvol["fut_sigma_1"])**2)
mse5 = np.mean((vol["fut_sigma_5"] - predvol["fut_sigma_5"])**2)
mse22 = np.mean((vol["fut_sigma_22"] - predvol["fut_sigma_22"])**2)

print(f'Mean Squared Error of 1-Day Prediction: {mse1:.5f}')
print(f'Mean Squared Error of 5-Day Prediction: {mse5:.5f}')
print(f'Mean Squared Error of 22-Day Prediction: {mse22:.5f}')

results = pd.concat([results, 
                     pd.DataFrame([mse1,mse5,mse22], columns = ["fs1_MSE"],
                                  index = ["1-Day","5-Day","22-Day"])],
                    axis = 1)



Mean Squared Error of 1-Day Prediction: 0.00006
Mean Squared Error of 5-Day Prediction: 0.00005
Mean Squared Error of 22-Day Prediction: 0.00004
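
The scaled error grows with the horizon while the un-scaled error shrinks. One possible explanation, assuming `yscaler` is a per-column linear scaler such as scikit-learn's `MinMaxScaler` (this notebook does not show how it was fitted), is that the un-scaled MSE equals the scaled MSE multiplied by the squared range of that column, and the columns have different ranges. The sketch below checks the identity with made-up minimums, ranges, and data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical per-column minimum and range, standing in for a fitted min-max scaler.
col_min = np.array([0.00, 0.01, 0.02])
col_range = np.array([0.10, 0.08, 0.05])

def inverse_transform(scaled):
    # Linear map from the scaled [0, 1] space back to original units
    return scaled * col_range + col_min

# Made-up scaled targets and predictions, one column per horizon
y_true_scaled = rng.uniform(size=(550, 3))
y_pred_scaled = y_true_scaled + rng.normal(scale=0.05, size=(550, 3))

mse_scaled = np.mean((y_true_scaled - y_pred_scaled) ** 2, axis=0)
mse_unscaled = np.mean(
    (inverse_transform(y_true_scaled) - inverse_transform(y_pred_scaled)) ** 2,
    axis=0,
)

# The offset cancels in the difference, leaving only the squared range.
print(np.allclose(mse_unscaled, mse_scaled * col_range ** 2))  # True
```

Under that assumption, errors are only comparable across feature sets within the same horizon, not across horizons.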


### Train and Test LSTM for Feature Set 2
Feature Set 2 contains technical features from ^SPX and NOT ^VIX.

In [8]:
fs2model = Sequential()
fs2model.add(keras.Input(shape=(fs2.shape[1], fs2.shape[2])))
fs2model.add(LSTM(units=64, return_sequences=True))
fs2model.add(LSTM(units=32))
# condense the output of the layers into three predictions: fut_sigma_1, fut_sigma_5, and fut_sigma_22
fs2model.add(Dense(units=3))  
fs2model.compile(optimizer='adam', loss='mean_squared_error')

es = EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True)

fs2model.fit(fs2, y,  # the training data
             epochs=100,  # maximum number of training rounds
             batch_size=1,  # update the weights after every sample; does not affect the shape of predictions
             verbose=2, 
             validation_split=0.1,  # hold out the last 10% of the training data for validation
             callbacks=[es],  # stop early if val_loss has not improved for 5 epochs
             shuffle=False)  # don't shuffle data each epoch, as we have time series data

Epoch 1/100
2897/2897 - 20s - 7ms/step - loss: 0.0034 - val_loss: 0.0048
Epoch 2/100
2897/2897 - 16s - 5ms/step - loss: 0.0050 - val_loss: 0.0046
Epoch 3/100
2897/2897 - 13s - 5ms/step - loss: 0.0035 - val_loss: 0.0064
Epoch 4/100
2897/2897 - 15s - 5ms/step - loss: 0.0067 - val_loss: 0.0058
Epoch 5/100
2897/2897 - 9s - 3ms/step - loss: 0.0059 - val_loss: 0.0090
Epoch 6/100
2897/2897 - 15s - 5ms/step - loss: 0.0057 - val_loss: 0.0076
Epoch 7/100
2897/2897 - 16s - 6ms/step - loss: 0.0059 - val_loss: 0.0060


<keras.src.callbacks.history.History at 0x7c3daf787d90>

In [9]:
fs2test = pd.read_csv(str(path) + "fs2test.csv", header = [0,1], index_col = 0)
ytest = np.load(str(path) + "snp500sectorsytest.npy")

fs2test, ytest = create_3Darray(fs2test, ytest, lookback)
print(fs2test.shape)
print(ytest.shape)

(550, 22, 11)
(550, 3)


In [10]:
guesses = fs2model.predict(fs2test, verbose = 2)
guesses = pd.DataFrame(guesses, columns=["fut_sigma_1","fut_sigma_5","fut_sigma_22"])

ytestdf = pd.DataFrame(ytest, columns=["fut_sigma_1","fut_sigma_5","fut_sigma_22"])

mse1 = np.mean((ytestdf["fut_sigma_1"] - guesses["fut_sigma_1"])**2)
mse5 = np.mean((ytestdf["fut_sigma_5"] - guesses["fut_sigma_5"])**2)
mse22 = np.mean((ytestdf["fut_sigma_22"] - guesses["fut_sigma_22"])**2)

print(f'Scaled Mean Squared Error of 1-Day Prediction: {mse1:.3f}')
print(f'Scaled Mean Squared Error of 5-Day Prediction: {mse5:.3f}')
print(f'Scaled Mean Squared Error of 22-Day Prediction: {mse22:.3f}')

results = pd.concat([results, 
                     pd.DataFrame([mse1,mse5,mse22], columns = ["fs2_MSE_scaled"],
                       index = ["1-Day","5-Day","22-Day"])],
                    axis = 1)


18/18 - 0s - 8ms/step
Scaled Mean Squared Error of 1-Day Prediction: 0.004
Scaled Mean Squared Error of 5-Day Prediction: 0.009
Scaled Mean Squared Error of 22-Day Prediction: 0.017


In [11]:
# Back transform y to un-scaled volatility
predvol = yscaler.inverse_transform(guesses)

predvol = pd.DataFrame(predvol, columns=["fut_sigma_1","fut_sigma_5","fut_sigma_22"])

mse1 = np.mean((vol["fut_sigma_1"] - predvol["fut_sigma_1"])**2)
mse5 = np.mean((vol["fut_sigma_5"] - predvol["fut_sigma_5"])**2)
mse22 = np.mean((vol["fut_sigma_22"] - predvol["fut_sigma_22"])**2)

print(f'Mean Squared Error of 1-Day Prediction: {mse1:.5f}')
print(f'Mean Squared Error of 5-Day Prediction: {mse5:.5f}')
print(f'Mean Squared Error of 22-Day Prediction: {mse22:.5f}')

results = pd.concat([results, 
                     pd.DataFrame([mse1,mse5,mse22], columns = ["fs2_MSE"],
                                  index = ["1-Day","5-Day","22-Day"])],
                    axis = 1)



Mean Squared Error of 1-Day Prediction: 0.00007
Mean Squared Error of 5-Day Prediction: 0.00007
Mean Squared Error of 22-Day Prediction: 0.00006


### Train and Test LSTM for Feature Set 3
Feature Set 3 contains technical features of each S&P 500 sub-index and the high of ^VIX.

In [12]:
fs3model = Sequential()
fs3model.add(keras.Input(shape=(fs3.shape[1], fs3.shape[2])))
fs3model.add(LSTM(units=256, return_sequences=True))
fs3model.add(LSTM(units=128, return_sequences=True))
fs3model.add(LSTM(units=64))
# condense the output of the layers into three predictions: fut_sigma_1, fut_sigma_5, and fut_sigma_22
fs3model.add(Dense(units=3))  
fs3model.compile(optimizer='adam', loss='mean_squared_error')

es = EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True)

fs3model.fit(fs3, y,  # the training data
             epochs=100,  # maximum number of training rounds
             batch_size=1,  # update the weights after every sample; does not affect the shape of predictions
             verbose=2, 
             validation_split=0.1,  # hold out the last 10% of the training data for validation
             callbacks=[es],  # stop early if val_loss has not improved for 5 epochs
             shuffle=False)  # don't shuffle data each epoch, as we have time series data

Epoch 1/100
2897/2897 - 13s - 5ms/step - loss: 0.0025 - val_loss: 0.0043
Epoch 2/100
2897/2897 - 12s - 4ms/step - loss: 0.0049 - val_loss: 0.0043
Epoch 3/100
2897/2897 - 14s - 5ms/step - loss: 0.0037 - val_loss: 0.0044
Epoch 4/100
2897/2897 - 17s - 6ms/step - loss: 0.0051 - val_loss: 0.0044
Epoch 5/100
2897/2897 - 15s - 5ms/step - loss: 0.0054 - val_loss: 0.0044
Epoch 6/100
2897/2897 - 13s - 4ms/step - loss: 0.0052 - val_loss: 0.0047


<keras.src.callbacks.history.History at 0x7c3daf7879d0>

In [13]:
fs3test = pd.read_csv(str(path) + "fs3test.csv", header = [0,1], index_col = 0)
ytest = np.load(str(path) + "snp500sectorsytest.npy")

fs3test, ytest = create_3Darray(fs3test, ytest, lookback)
print(fs3test.shape)
print(ytest.shape)

(550, 22, 122)
(550, 3)


In [14]:
guesses = fs3model.predict(fs3test, verbose = 2)
guesses = pd.DataFrame(guesses, columns=["fut_sigma_1","fut_sigma_5","fut_sigma_22"])

ytestdf = pd.DataFrame(ytest, columns=["fut_sigma_1","fut_sigma_5","fut_sigma_22"])

mse1 = np.mean((ytestdf["fut_sigma_1"] - guesses["fut_sigma_1"])**2)
mse5 = np.mean((ytestdf["fut_sigma_5"] - guesses["fut_sigma_5"])**2)
mse22 = np.mean((ytestdf["fut_sigma_22"] - guesses["fut_sigma_22"])**2)

print(f'Scaled Mean Squared Error of 1-Day Prediction: {mse1:.3f}')
print(f'Scaled Mean Squared Error of 5-Day Prediction: {mse5:.3f}')
print(f'Scaled Mean Squared Error of 22-Day Prediction: {mse22:.3f}')

results = pd.concat([results, 
                     pd.DataFrame([mse1,mse5,mse22], columns = ["fs3_MSE_scaled"],
                       index = ["1-Day","5-Day","22-Day"])],
                    axis = 1)


18/18 - 0s - 12ms/step
Scaled Mean Squared Error of 1-Day Prediction: 0.004
Scaled Mean Squared Error of 5-Day Prediction: 0.007
Scaled Mean Squared Error of 22-Day Prediction: 0.013


In [15]:
# Back transform y to un-scaled volatility
predvol = yscaler.inverse_transform(guesses)

predvol = pd.DataFrame(predvol, columns=["fut_sigma_1","fut_sigma_5","fut_sigma_22"])

mse1 = np.mean((vol["fut_sigma_1"] - predvol["fut_sigma_1"])**2)
mse5 = np.mean((vol["fut_sigma_5"] - predvol["fut_sigma_5"])**2)
mse22 = np.mean((vol["fut_sigma_22"] - predvol["fut_sigma_22"])**2)

print(f'Mean Squared Error of 1-Day Prediction: {mse1:.5f}')
print(f'Mean Squared Error of 5-Day Prediction: {mse5:.5f}')
print(f'Mean Squared Error of 22-Day Prediction: {mse22:.5f}')

results = pd.concat([results, 
                     pd.DataFrame([mse1,mse5,mse22], columns = ["fs3_MSE"],
                                  index = ["1-Day","5-Day","22-Day"])],
                    axis = 1)



Mean Squared Error of 1-Day Prediction: 0.00007
Mean Squared Error of 5-Day Prediction: 0.00005
Mean Squared Error of 22-Day Prediction: 0.00004


### Train and Test LSTM for Feature Set 4
Feature Set 4 contains technical features of each S&P 500 sub-index without ^VIX.

In [16]:
fs4model = Sequential()
fs4model.add(keras.Input(shape=(fs4.shape[1], fs4.shape[2])))
fs4model.add(LSTM(units=256, return_sequences=True))
fs4model.add(LSTM(units=128, return_sequences=True))
fs4model.add(LSTM(units=64))
# condense the output of the layers into three predictions: fut_sigma_1, fut_sigma_5, and fut_sigma_22
fs4model.add(Dense(units=3))  
fs4model.compile(optimizer='adam', loss='mean_squared_error')

es = EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True)

fs4model.fit(fs4, y,  # the training data
             epochs=100,  # maximum number of training rounds
             batch_size=1,  # update the weights after every sample; does not affect the shape of predictions
             verbose=2, 
             validation_split=0.1,  # hold out the last 10% of the training data for validation
             callbacks=[es],  # stop early if val_loss has not improved for 5 epochs
             shuffle=False)  # don't shuffle data each epoch, as we have time series data

Epoch 1/100
2897/2897 - 13s - 4ms/step - loss: 0.0027 - val_loss: 0.0046
Epoch 2/100
2897/2897 - 12s - 4ms/step - loss: 0.0039 - val_loss: 0.0052
Epoch 3/100
2897/2897 - 12s - 4ms/step - loss: 0.0051 - val_loss: 0.0219
Epoch 4/100
2897/2897 - 12s - 4ms/step - loss: 0.0049 - val_loss: 0.0064
Epoch 5/100
2897/2897 - 10s - 4ms/step - loss: 0.0035 - val_loss: 0.0047
Epoch 6/100
2897/2897 - 15s - 5ms/step - loss: 0.0045 - val_loss: 0.0048


<keras.src.callbacks.history.History at 0x7c3daf79be10>

In [17]:
fs4test = pd.read_csv(str(path) + "fs4test.csv", header = [0,1], index_col = 0)
ytest = np.load(str(path) + "snp500sectorsytest.npy")

fs4test, ytest = create_3Darray(fs4test, ytest, lookback)
print(fs4test.shape)
print(ytest.shape)

(550, 22, 121)
(550, 3)


In [18]:
guesses = fs4model.predict(fs4test, verbose = 2)
guesses = pd.DataFrame(guesses, columns=["fut_sigma_1","fut_sigma_5","fut_sigma_22"])

ytestdf = pd.DataFrame(ytest, columns=["fut_sigma_1","fut_sigma_5","fut_sigma_22"])

mse1 = np.mean((ytestdf["fut_sigma_1"] - guesses["fut_sigma_1"])**2)
mse5 = np.mean((ytestdf["fut_sigma_5"] - guesses["fut_sigma_5"])**2)
mse22 = np.mean((ytestdf["fut_sigma_22"] - guesses["fut_sigma_22"])**2)

print(f'Scaled Mean Squared Error of 1-Day Prediction: {mse1:.3f}')
print(f'Scaled Mean Squared Error of 5-Day Prediction: {mse5:.3f}')
print(f'Scaled Mean Squared Error of 22-Day Prediction: {mse22:.3f}')

results = pd.concat([results, 
                     pd.DataFrame([mse1,mse5,mse22], columns = ["fs4_MSE_scaled"],
                       index = ["1-Day","5-Day","22-Day"])],
                    axis = 1)


18/18 - 0s - 19ms/step
Scaled Mean Squared Error of 1-Day Prediction: 0.005
Scaled Mean Squared Error of 5-Day Prediction: 0.007
Scaled Mean Squared Error of 22-Day Prediction: 0.014


In [19]:
# Back transform y to un-scaled volatility
predvol = yscaler.inverse_transform(guesses)

predvol = pd.DataFrame(predvol, columns=["fut_sigma_1","fut_sigma_5","fut_sigma_22"])

mse1 = np.mean((vol["fut_sigma_1"] - predvol["fut_sigma_1"])**2)
mse5 = np.mean((vol["fut_sigma_5"] - predvol["fut_sigma_5"])**2)
mse22 = np.mean((vol["fut_sigma_22"] - predvol["fut_sigma_22"])**2)

print(f'Mean Squared Error of 1-Day Prediction: {mse1:.5f}')
print(f'Mean Squared Error of 5-Day Prediction: {mse5:.5f}')
print(f'Mean Squared Error of 22-Day Prediction: {mse22:.5f}')

results = pd.concat([results, 
                     pd.DataFrame([mse1,mse5,mse22], columns = ["fs4_MSE"],
                                  index = ["1-Day","5-Day","22-Day"])],
                    axis = 1)



Mean Squared Error of 1-Day Prediction: 0.00008
Mean Squared Error of 5-Day Prediction: 0.00006
Mean Squared Error of 22-Day Prediction: 0.00005


### Which model is best?

In [20]:
results

,fs1_MSE_scaled,fs1_MSE,fs2_MSE_scaled,fs2_MSE,fs3_MSE_scaled,fs3_MSE,fs4_MSE_scaled,fs4_MSE
1-Day,0.003977,6.5e-05,0.004493,7.3e-05,0.004242,6.9e-05,0.005097,8.3e-05
5-Day,0.005733,4.6e-05,0.008551,6.8e-05,0.006796,5.4e-05,0.007011,5.6e-05
22-Day,0.012844,4.1e-05,0.017267,5.5e-05,0.012908,4.1e-05,0.014268,4.6e-05


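Feature Set 1 yielded the lowest scaled mean squared error at all three horizons, although its margin over Feature Set 3 at the 22-day horizon is negligible (0.012844 vs. 0.012908, with identical un-scaled values at the displayed precision). In both pairs of otherwise matching feature sets (1 vs. 2 and 3 vs. 4), the version that includes ^VIX has the lower error at every horizon, which suggests that including ^VIX improves prediction of observed volatility. Each model was trained only once, so small differences may reflect run-to-run variation.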
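
One way to read the table is to rank the feature sets within each horizon. The sketch below does this on the scaled MSE values copied from the `results` output above; the dictionary layout and variable names are assumptions made for the example.

```python
# Scaled MSE per horizon and feature set, copied from the results table
scaled_mse = {
    "1-Day":  {"fs1": 0.003977, "fs2": 0.004493, "fs3": 0.004242, "fs4": 0.005097},
    "5-Day":  {"fs1": 0.005733, "fs2": 0.008551, "fs3": 0.006796, "fs4": 0.007011},
    "22-Day": {"fs1": 0.012844, "fs2": 0.017267, "fs3": 0.012908, "fs4": 0.014268},
}

best_by_horizon = {}
for horizon, scores in scaled_mse.items():
    ranked = sorted(scores, key=scores.get)      # lowest error first
    best, runner_up = ranked[0], ranked[1]
    gap = scores[runner_up] / scores[best] - 1   # relative margin over second place
    best_by_horizon[horizon] = best
    print(f"{horizon}: best={best}, runner-up={runner_up}, gap={gap:.1%}")
```

The relative gap to the runner-up gives a sense of how decisive each ranking is.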

### Save the best model for later use

In [23]:
fs1model.save(str(path) + 'fs1model.keras')