# VIX Modeling

In this notebook, we evaluate three different models of the VIX that we may use in our trading algorithm.

First, we import the libraries we need for our models. Some are more common (yfinance, pandas, numpy, matplotlib), while others are less so.

We import `arch_model` from `arch`, a volatility-modeling library that lets us fit a GARCH(1, 1) model to the VIX log returns (see [here](https://bashtage.github.io/arch/univariate/univariate_volatility_modeling.html) and [here](https://bashtage.github.io/arch/univariate/introduction.html) for documentation).

We also import the ARIMA model from the [statsmodels library](https://www.statsmodels.org/dev/generated/statsmodels.tsa.arima.model.ARIMA.html), as well as the `Sequential` model and the `Dense` and `GRU` layers from [Keras](https://keras.io/about/), which we access through tensorflow.

In [None]:
# Import libraries
import yfinance as yf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from arch import arch_model
from statsmodels.tsa.arima.model import ARIMA
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, GRU

Next, we download the necessary VIX data from Yahoo Finance using yfinance (as Mihai did in the other notebook).

Quick note: setting `progress` to `False` hides the progress bar while the data is downloading.

In [None]:
# Download VIX data
vix = yf.download('^VIX', start='2000-01-01', end='2023-03-21', progress=False)

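We then calculate the daily log returns of the VIX for our modeling. `shift(1)` lines up each day's adjusted close with the previous day's, so each log return is the log of today's value divided by yesterday's. The first day of the dataset has no previous value, so its log return is NaN; the second line drops that row from the table.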

In [None]:
# Calculate log returns
vix['log_returns'] = np.log(vix['Adj Close'] / vix['Adj Close'].shift(1))
vix.dropna(inplace=True)
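
To make the shift-and-divide step concrete, here is a minimal sketch on a toy series. The closing levels are invented for illustration and are not real VIX data; the variable names are hypothetical as well.

```python
import numpy as np
import pandas as pd

# Hypothetical closing levels, invented for illustration only
close = pd.Series([20.0, 22.0, 21.0, 21.0], name="close")

# shift(1) moves every value down one row, so row t sits next to row t-1
prev_close = close.shift(1)

# Log return for day t: ln(close_t / close_{t-1}); the first row has no
# previous value, so it comes out as NaN
log_returns = np.log(close / prev_close)

# Equivalent formulation: the difference of the log levels
log_returns_alt = np.log(close).diff()

print(log_returns)
print(len(log_returns.dropna()))  # 3 rows remain after dropping the NaN
```

An unchanged close (21.0 to 21.0) gives a log return of exactly 0, and the two formulations agree up to floating-point rounding.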

### GARCH model
Using `arch_model` from the arch library, we model the log returns of the VIX with a GARCH(1, 1) specification: `p=1` uses one lag of the squared error (the ARCH term) and `q=1` uses one lag of the conditional variance (the GARCH term). We then call `arch_garch_model.fit()` to fit the model. The summary table for the fitted model is printed at the end of the notebook.

In [None]:
# ARCH/GARCH model
arch_garch_model = arch_model(vix['log_returns'], vol='Garch', p=1, q=1)
arch_garch_fit = arch_garch_model.fit()
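
For intuition about what a GARCH(1, 1) describes, the sketch below runs the conditional variance recursion, sigma2[t] = omega + alpha * resid[t-1]^2 + beta * sigma2[t-1], by hand with numpy. The parameter values and the simulated residuals are made up for illustration; they are not the fitted values from the cell above. Starting the recursion at the sample variance is a simplifying assumption, and the arch library may initialise it differently.

```python
import numpy as np

def garch11_variance(resid, omega, alpha, beta):
    """Conditional variance path implied by a GARCH(1,1) for given residuals."""
    resid = np.asarray(resid, dtype=float)
    sigma2 = np.empty_like(resid)
    # Assumption: start the recursion at the sample variance of the residuals
    sigma2[0] = resid.var()
    for t in range(1, len(resid)):
        # One lag of the squared error (p=1) and one lag of the variance (q=1)
        sigma2[t] = omega + alpha * resid[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2

# Stand-in residuals and parameters, invented for illustration only
rng = np.random.default_rng(0)
resid = rng.normal(scale=0.07, size=500)
omega, alpha, beta = 2e-4, 0.10, 0.85

sigma2 = garch11_variance(resid, omega, alpha, beta)

# When alpha + beta < 1 the variance reverts to this long-run level
long_run_var = omega / (1 - alpha - beta)
print(sigma2[:3], long_run_var)
```

A large squared error on one day raises the next day's variance through `alpha`, and `beta` controls how slowly that effect decays.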

### ARIMA model
Next, we use the `ARIMA` class from statsmodels to model the VIX log returns. In `order=(1, 0, 1)`, we set the autoregressive lag to 1, apply no differencing, and set the moving average order to 1. This means that our model uses the previous value and the previous error term to predict the next value of the time series (the VIX log returns).

In [None]:
# ARIMA model
arima_model = ARIMA(vix['log_returns'], order=(1, 0, 1))
arima_fit = arima_model.fit()
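
The "previous value plus previous error" idea can be sketched as a one-step-ahead recursion in plain numpy. The function name, the toy series, and the coefficients below are hypothetical and chosen only for illustration; the model is written around its mean `mu`, and how statsmodels parameterises the constant is worth checking in its documentation.

```python
import numpy as np

def arma11_one_step(y, mu, phi, theta):
    """One-step-ahead predictions for an ARMA(1,1) written around its mean mu."""
    y = np.asarray(y, dtype=float)
    pred = np.empty_like(y)
    eps = np.zeros_like(y)
    # Assumption: with no history, the first prediction is just the mean
    pred[0] = mu
    eps[0] = y[0] - pred[0]
    for t in range(1, len(y)):
        # AR term uses the previous value, MA term uses the previous error
        pred[t] = mu + phi * (y[t - 1] - mu) + theta * eps[t - 1]
        eps[t] = y[t] - pred[t]
    return pred, eps

# Toy log returns and coefficients, invented for illustration only
y = [0.02, -0.01, 0.03, 0.00]
pred, eps = arma11_one_step(y, mu=0.0, phi=0.5, theta=-0.2)
print(pred)
```

With `d=0` there is no differencing step, so the recursion is applied to the log returns directly.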

### Machine Learning that I don't understand

**Description from GPT-3.5**: This code prepares data for and trains a GRU (Gated Recurrent Unit) RNN (Recurrent Neural Network) model to forecast future values of the "log_returns" column of a pandas DataFrame named "vix".

The first part of the code prepares the data for the model by reshaping the "log_returns" column into a 2D array using the "reshape" method, and then scaling the data using the "MinMaxScaler" function from the scikit-learn library. This is done to ensure that the data is normalized to a range of 0 to 1, which is a common practice when working with neural networks.

The next part of the code defines a function called "create_dataset" that creates the input and output data for the time series model. The function takes a dataset and a "look_back" parameter, which specifies the number of time steps to look back when creating the input data. The function then creates a set of input and output pairs for the model by iterating over the dataset and selecting a window of "look_back" time steps as the input, and the next time step as the output.

The code then calls the "create_dataset" function to create the input and output data for the model, and reshapes the input data into a 3D array to match the input shape expected by the GRU RNN model.

The next part of the code defines the GRU RNN model using the Keras library. The model consists of a GRU layer with 4 units, followed by a Dense layer with a single output unit. The model is then compiled using the mean squared error loss function and the Adam optimizer.

Finally, the model is trained using the input and output data created earlier, with a batch size of 1 and a verbose level of 0 to suppress output during training. The model is trained for 100 epochs, which means that the input data is passed through the model 100 times during training. Because the batch size is 1, the weights of the model are updated after every individual sample to minimize the mean squared error loss.

**Description from BARD:** The code first prepares the data for the GRU RNN model by scaling it to the range [0, 1] using a MinMaxScaler. Then, it creates a dataset of input and output data by sliding a window of length look_back (here 1) over the data, using the values inside each window as the input and the value that follows the window as the output. The input data is reshaped to (samples, 1, look_back) and the GRU RNN model is created. The model is compiled and trained on the data for 100 epochs with a batch size of 1 and a verbose level of 0.

The GRU RNN model is a recurrent neural network that uses a gated recurrent unit (GRU) as its basic unit. The GRU has two gates: a reset gate and an update gate. The reset gate decides how much of the previous state to forget, and the update gate decides how much of the new input to keep. The GRU is able to learn long-term dependencies in the data, which makes it well-suited for tasks such as time series forecasting.

The code uses the mean squared error (MSE) as the loss function and the Adam optimizer. The Adam optimizer is an adaptive learning rate optimizer that is often used in deep learning. It uses a combination of the first and second moments of the gradients to update the weights of the model.

The code trains the model for 100 epochs. An epoch is a single pass through the training data. The batch size is 1, which means that the model is trained on one example at a time. The verbose level is 0, which means that the model does not print any progress reports during training.
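
The gate mechanics described above can be sketched as a single GRU time step in numpy. This is an illustration of the textbook formulation, not the exact computation Keras performs (its default `reset_after` variant arranges the biases differently). The weights are random, the helper names are hypothetical, and the convention assumed here is that an update gate close to 1 keeps the old state; some references define it the other way around.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x, h_prev, p):
    """One GRU time step for input x and previous hidden state h_prev."""
    z = sigmoid(x @ p["Wz"] + h_prev @ p["Uz"] + p["bz"])  # update gate
    r = sigmoid(x @ p["Wr"] + h_prev @ p["Ur"] + p["br"])  # reset gate
    # Candidate state: the reset gate scales how much old state feeds in
    h_cand = np.tanh(x @ p["Wh"] + (r * h_prev) @ p["Uh"] + p["bh"])
    # Assumed convention: z near 1 keeps the old state, z near 0 replaces it
    return z * h_prev + (1.0 - z) * h_cand

units, input_dim = 4, 1  # same sizes as the notebook's GRU(4) with look_back=1
rng = np.random.default_rng(0)
params = {}
for gate in ("z", "r", "h"):
    params["W" + gate] = rng.normal(scale=0.5, size=(input_dim, units))
    params["U" + gate] = rng.normal(scale=0.5, size=(units, units))
    params["b" + gate] = np.zeros(units)

x = np.array([0.4])        # one scaled log return, invented for illustration
h_prev = np.zeros(units)   # initial hidden state
h_new = gru_step(x, h_prev, params)
print(h_new.shape)  # (4,)
```

Note that with a single time step per sample, the hidden state always starts at zero, so the recurrent weights have no earlier state to act on.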

In [None]:
# Data preparation for GRU RNN model
train_data = vix['log_returns'].values.reshape(-1, 1)

# Scaling the data
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler(feature_range=(0, 1))
train_data_scaled = scaler.fit_transform(train_data)

# Prepare data for time series model
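def create_dataset(dataset, look_back=1):
    # Each input is a window of look_back values; the target is the value after it
    X, Y = [], []
    # Stop at len(dataset) - look_back so the final value is still used as a target
    for i in range(len(dataset) - look_back):
        x = dataset[i:(i + look_back), 0]
        y = dataset[i + look_back, 0]
        X.append(x)
        Y.append(y)
    return np.array(X), np.array(Y)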

look_back = 1
X_train, y_train = create_dataset(train_data_scaled, look_back)

X_train = np.reshape(X_train, (X_train.shape[0], 1, X_train.shape[1]))

# GRU RNN model
gru_rnn_model = Sequential()
gru_rnn_model.add(GRU(4, input_shape=(1, look_back)))
gru_rnn_model.add(Dense(1))
gru_rnn_model.compile(loss='mean_squared_error', optimizer='adam')
gru_rnn_model.fit(X_train, y_train, epochs=100, batch_size=1, verbose=0)
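
To see what the scaling, windowing, and reshaping steps produce, the sketch below runs them on a five-element toy array using numpy only. The values are invented, `make_windows` is a hypothetical helper that mirrors the idea of `create_dataset` for a 1D array, and the scaling is written out by hand as the min-max formula rather than calling scikit-learn.

```python
import numpy as np

def make_windows(series, look_back):
    """Pair each window of look_back values with the value that follows it."""
    X, y = [], []
    for i in range(len(series) - look_back):
        X.append(series[i:i + look_back])
        y.append(series[i + look_back])
    return np.array(X), np.array(y)

# Toy log returns, invented for illustration only
raw = np.array([-0.10, 0.00, 0.05, 0.30, -0.05])

# Min-max scaling to [0, 1], written out by hand
scaled = (raw - raw.min()) / (raw.max() - raw.min())

# look_back=2 here so the window structure is visible
X, y = make_windows(scaled, look_back=2)

# Recurrent layers expect 3D input; this matches the notebook's layout of
# (samples, 1, look_back)
X_3d = X.reshape(X.shape[0], 1, X.shape[1])

# Predictions on the scaled series can be mapped back with the inverse formula
restored = scaled * (raw.max() - raw.min()) + raw.min()

print(X.shape, y.shape, X_3d.shape)  # (3, 2) (3,) (3, 1, 2)
```

Five values with a window of 2 give three input/target pairs, and the last value of the series is used as the final target.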

### Print Model Summaries
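We then print a summary of each of our models. The GARCH and ARIMA summaries show the estimated parameters along with fit statistics such as the log-likelihood, AIC, and BIC, which give an indication of how well they perform. The GRU summary only lists the layers and their parameter counts.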

In [None]:
# Model summaries
print("ARCH/GARCH Model Summary:")
print(arch_garch_fit.summary())
print("\nARIMA Model Summary:")
print(arima_fit.summary())
print("\nGRU RNN Model Summary:")
gru_rnn_model.summary()
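
As a sanity check on the GRU summary, the parameter counts can be worked out by hand. The sketch below assumes the Keras GRU layer is using `reset_after=True`, which I believe is the TensorFlow 2.x default and which gives each of the three weight blocks two bias vectors. The helper functions are hypothetical and only do the arithmetic.

```python
def gru_param_count(units, input_dim, reset_after=True):
    """Trainable parameters in a GRU layer (three blocks: update, reset, candidate)."""
    kernel = 3 * units * input_dim      # input-to-hidden weights
    recurrent = 3 * units * units       # hidden-to-hidden weights
    # Assumption: reset_after=True keeps separate input and recurrent biases
    bias = 3 * units * (2 if reset_after else 1)
    return kernel + recurrent + bias

def dense_param_count(units, input_dim):
    """Trainable parameters in a Dense layer: weights plus one bias per unit."""
    return units * input_dim + units

gru_params = gru_param_count(units=4, input_dim=1)    # GRU(4) with look_back=1
dense_params = dense_param_count(units=1, input_dim=4)
total = gru_params + dense_params

print(gru_params, dense_params, total)  # 84 5 89
```

If the summary reports a different GRU count, the layer is probably using a different bias layout than the one assumed here.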

### Example output from first run