# Modèle GINN

Dans ce notebook, on s'intéresse à l'amélioration éventuelle des performances d'un LSTM classique à l'aide du modèle GINN présenté dans le papier: https://arxiv.org/pdf/2410.00288 \
Nous utilisons ici également les rendements logarithmiques. \
L'architecture GINN que nous avons choisie est similaire à celle du papier: 
un modèle GARCH (1,1) avec moyenne constante qui nous fournit des prédictions de volatilité et de moyenne \
un calcul de la "vraie" volatilité à partir de la moyenne calculée par le modèle GARCH \
un modèle LSTM qui effectue des prédictions de volatilité à partir des "vraies" volatilités et de la volatilité prédite par le GARCH en utilisant une fonction de coût customisée 
 

In [1]:

from arch import arch_model

In [2]:
import tensorflow as tf
import os
from tensorflow.keras import layers, models,Input,regularizers
import pandas as pd
import yfinance as yf
from sklearn.model_selection import train_test_split
import numpy as np

## Chargement des données et division train/test

In [None]:
def load_data(symbol="SPY", start="1994-01-01", end="2021-01-01", interval='1d'): #1 donnée=1jour
    df = yf.download(symbol, start=start, end=end, interval=interval)
    df.reset_index(inplace=True)
    return df

def preprocess_data(df):
    df = df[['Date', 'Close']]
    df.dropna(inplace=True)
    df['Date'] = pd.to_datetime(df['Date'])
    df.set_index('Date', inplace=True)
    return df

def train_test(list_df, train_size=0.7, test_size=0.3):
    result = []
    for df in list_df:
        X = df.index.values.reshape(-1, 1)
        y = df['Close'].values

        X_train, X_test, y_train, y_test = train_test_split(
            X, y, train_size=train_size, test_size=test_size, shuffle=False
        )
        result.append([X_train, X_test, y_train, y_test])
    return result

def etl_pipeline(train_size=0.7, test_size=0.3):
    # Chargement
    gspc = load_data('^GSPC')
    dji = load_data('^DJI')
    nyse = load_data('^NYA')

    # Preprocess
    gspc_close = preprocess_data(gspc)
    dji_close = preprocess_data(dji)
    nyse_close = preprocess_data(nyse)

    list_df_close = [gspc_close, dji_close, nyse_close]

    # Pour faire le split et shift la data de 90 jours
    def train_test_shift(df, train_size, shift_days=90):
        # Split 
        train_size = int(len(df) * train_size)
        X_train = df.iloc[:train_size]
        X_test = df.iloc[train_size:]

        # Shift 
        y_train = df.iloc[shift_days:train_size + shift_days]
        y_test = df.iloc[train_size + shift_days:]

        X_train = X_train.iloc[:-shift_days]
        X_test = X_test.iloc[:-shift_days]

        return X_train, X_test, y_train, y_test

    # Appliquer train_test_shift à tous les dataframes
    splits = [train_test_shift(df, train_size) for df in list_df_close]

    return {
        "GSPC": {"X_train": splits[0][0], "X_test": splits[0][1], "y_train": splits[0][2], "y_test": splits[0][3]},
        "DJI":  {"X_train": splits[1][0], "X_test": splits[1][1], "y_train": splits[1][2], "y_test": splits[1][3]},
        "NYSE": {"X_train": splits[2][0], "X_test": splits[2][1], "y_train": splits[2][2], "y_test": splits[2][3]},
    }


Nous avons choisi 90 jours de shift comme dans l'article

In [4]:
gspc=load_data('^GSPC')
dji=load_data('^DJI')
nyse=load_data('^NYA')

[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed


In [5]:
gspc_close=preprocess_data(gspc)
dji_close=preprocess_data(dji)
nyse_close=preprocess_data(nyse)

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df.dropna(inplace=True)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df['Date'] = pd.to_datetime(df['Date'])
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df.dropna(inplace=True)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html

In [6]:
print(gspc_close)

Price             Close
Ticker            ^GSPC
Date                   
1994-01-03   465.440002
1994-01-04   466.890015
1994-01-05   467.549988
1994-01-06   467.119995
1994-01-07   469.899994
...                 ...
2020-12-24  3703.060059
2020-12-28  3735.360107
2020-12-29  3727.040039
2020-12-30  3732.040039
2020-12-31  3756.070068

[6799 rows x 1 columns]


In [None]:

result = etl_pipeline()

# pour GSPC
X_gspc_train = result["GSPC"]["X_train"]
X_gspc_test  = result["GSPC"]["X_test"]
y_gspc_train = result["GSPC"]["y_train"]
y_gspc_test  = result["GSPC"]["y_test"]

# pour DJI 
X_dji_train = result["DJI"]["X_train"]
X_dji_test  = result["DJI"]["X_test"]
y_dji_train = result["DJI"]["y_train"]
y_dji_test  = result["DJI"]["y_test"]

# pour NYSE
X_nyse_train = result["NYSE"]["X_train"]
X_nyse_test  = result["NYSE"]["X_test"]
y_nyse_train = result["NYSE"]["y_train"]
y_nyse_test  = result["NYSE"]["y_test"]


[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df.dropna(inplace=True)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df['Date'] = pd.to_datetime(df['Date'])
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df.dropna(inplace=True)
A value is trying to be 

In [8]:
print(X_dji_train)

Price              Close
Ticker              ^DJI
Date                    
1994-01-03   3756.600098
1994-01-04   3783.899902
1994-01-05   3798.820068
1994-01-06   3803.879883
1994-01-07   3820.770020
...                  ...
2012-07-10  12653.120117
2012-07-11  12604.530273
2012-07-12  12573.269531
2012-07-13  12777.089844
2012-07-16  12727.209961

[4669 rows x 1 columns]


In [9]:
print(y_dji_train)

Price              Close
Ticker              ^DJI
Date                    
1994-05-12   3652.840088
1994-05-13   3659.679932
1994-05-16   3671.500000
1994-05-17   3720.610107
1994-05-18   3732.889893
...                  ...
2013-04-01  14572.849609
2013-04-02  14662.009766
2013-04-03  14550.349609
2013-04-04  14606.110352
2013-04-05  14565.250000

[4759 rows x 1 columns]


In [None]:
from statsmodels.tsa.stattools import adfuller

# pour calculer les log returns
def calculate_log_returns(prices):
    return np.log(prices + 1e-8 / prices.shift(1) +1e-8).fillna(0)

#Fonction qui permet de calculer les moyennes et volatilités selon le modèle GARCH (1,1) à moyenne constante
def garch_predictions(prices_series, forecast_horizon=1):
    mean_predictions = []
    volatility_predictions = []
    prediction_dates = []
    true_volatility = []

    prices_series = prices_series.sort_index()

    # Calculer log returns et scale
    log_returns = calculate_log_returns(prices_series) * 100  # Scaling

    # Check d'erreur:
    if np.all(log_returns == log_returns.iloc[0]):
        raise ValueError("Log returns are constant. Check the input data.")

    adf_result = adfuller(log_returns.dropna())
    print(f"ADF Statistic: {adf_result[0]}")
    print(f"p-value: {adf_result[1]}")

    if adf_result[1] > 0.05:
        print("Warning: Log returns may not be stationary.")

    for i in range(90, len(prices_series)):
        window_prices = prices_series.iloc[i-90:i]
        window_log_returns = calculate_log_returns(window_prices) * 100  # Scale log returns

        model = arch_model(window_log_returns, mean='constant', vol='GARCH', p=1, q=1, rescale=False)
        options = {
        'maxiter': 1000,  # Nb d'itérations max
        'ftol': 1e-6,     # Tolerance pour convergence
        'disp': False     # Enlever les commentaires
        }

        #On fit le modèle 
        model_fit = model.fit(disp='off',options=options)

        #On réalsie les prédictions (l'objet forecast contient les moyennes et volatilités prédites)
        forecast = model_fit.forecast(horizon=forecast_horizon)

        predicted_mean = forecast.mean.iloc[-1].values[0]
        predicted_volatility = forecast.variance.iloc[-1].values[0]

        mean_predictions.append(predicted_mean)
        volatility_predictions.append(predicted_volatility)
        prediction_dates.append(prices_series.index[i])

        # Calcul de la "vraie" volatilité à l'aide de la formule sigma_t**2=(r_t-m_t)**2
        current_log_return = np.log(prices_series.iloc[i] / prices_series.iloc[i-1]) 
        true_volatility.append((current_log_return - predicted_mean) ** 2)

    predictions_df = pd.DataFrame({
        'Date': prediction_dates,
        'Predicted_Mean': mean_predictions,
        'Predicted_Volatility': volatility_predictions,
        'True_Volatility': true_volatility
    }).set_index('Date')

    return predictions_df



In [None]:
#Modèle LSTM 
def create_lstm_model():
    model = models.Sequential([
        Input(shape=(90, 1)),  # Define input shape here
        layers.LSTM(256, return_sequences=True),
        layers.Dropout(0.2),
        layers.LSTM(256, return_sequences=True),
        layers.Dropout(0.2),
        layers.LSTM(256),
        layers.Dense(128),
        layers.BatchNormalization(),
        layers.ReLU(),
        layers.Dense(1, kernel_regularizer=regularizers.l2(1e-4))  # Output is a single value (predicted variance)
    ])
    return model





In [None]:

# Calcul parallèle pour accélérer
os.environ["OMP_NUM_THREADS"] = "8"  
os.environ["TF_NUM_INTRAOP_THREADS"] = "8"
os.environ["TF_NUM_INTEROP_THREADS"] = "8"

tf.config.threading.set_intra_op_parallelism_threads(8)
tf.config.threading.set_inter_op_parallelism_threads(8)

prices_series = X_dji_train
predictions_df = garch_predictions(prices_series)

true_volatility = predictions_df['True_Volatility'].values
predicted_volatility = predictions_df['Predicted_Volatility'].values




ADF Statistic: -2.2485887735694807
p-value: 0.18908523296431257


Positive directional derivative for linesearch
See scipy.optimize.fmin_slsqp for code meaning.

Positive directional derivative for linesearch
See scipy.optimize.fmin_slsqp for code meaning.

Positive directional derivative for linesearch
See scipy.optimize.fmin_slsqp for code meaning.

Positive directional derivative for linesearch
See scipy.optimize.fmin_slsqp for code meaning.

Positive directional derivative for linesearch
See scipy.optimize.fmin_slsqp for code meaning.

Positive directional derivative for linesearch
See scipy.optimize.fmin_slsqp for code meaning.

Positive directional derivative for linesearch
See scipy.optimize.fmin_slsqp for code meaning.

Positive directional derivative for linesearch
See scipy.optimize.fmin_slsqp for code meaning.

Positive directional derivative for linesearch
See scipy.optimize.fmin_slsqp for code meaning.

Positive directional derivative for linesearch
See scipy.optimize.fmin_slsqp for code meaning.

Positive directional derivative for line

In [None]:
from sklearn.preprocessing import StandardScaler

true_volatility = predictions_df['True_Volatility'].values
predicted_volatility = predictions_df['Predicted_Volatility'].values

# Preparer la data pour le LSTM
X = np.array([true_volatility[i-90:i] for i in range(90, len(true_volatility))], dtype=np.float32).reshape(-1, 90, 1)
y = np.vstack((true_volatility[90:], predicted_volatility[90:])).astype(np.float32).T  # Shape: [n_samples, 2]

# Standardize X pour de meilleures perfs
scaler_X = StandardScaler()
X_scaled = np.array([scaler_X.fit_transform(seq.reshape(-1, 1)) for seq in X], dtype=np.float32)  # Reshape each sequence to 2D before scaling

# Standardize y pour de meilleures perfs
scaler_y = StandardScaler()
y_scaled = scaler_y.fit_transform(y)  # Standardize y

# Check 
print(f"X shape: {X_scaled.shape}, X dtype: {X_scaled.dtype}")
print(f"y shape: {y_scaled.shape}, y dtype: {y_scaled.dtype}")




  X = np.array([true_volatility[i-90:i] for i in range(90, len(true_volatility))], dtype=np.float32).reshape(-1, 90, 1)
  y = np.vstack((true_volatility[90:], predicted_volatility[90:])).astype(np.float32).T  # Shape: [n_samples, 2]


X shape: (4489, 90, 1), X dtype: float32
y shape: (4489, 2), y dtype: float32


In [None]:


os.environ["OMP_NUM_THREADS"] = "8"  
os.environ["TF_NUM_INTRAOP_THREADS"] = "8"
os.environ["TF_NUM_INTEROP_THREADS"] = "8"
tf.config.threading.set_intra_op_parallelism_threads(8)
tf.config.threading.set_inter_op_parallelism_threads(8)

# Fonction coût utilisée lors de l'entraînement: 

def custom_loss(predicted_volatility, lambda_param=0.5):
    predicted_volatility = tf.convert_to_tensor(predicted_volatility, dtype=tf.float32)

    def loss(y_true, y_pred):
        batch_size = tf.shape(y_true)[0]
        predicted_vol_batch = predicted_volatility[:batch_size]  

        mse = tf.keras.losses.MeanSquaredError()
        mse_true = mse(y_true, y_pred)
        mse_garch = mse(predicted_vol_batch, y_pred)

        return lambda_param * mse_true + (1 - lambda_param) * mse_garch

    return loss


# Initialisation et Compilation
model = create_lstm_model()
optimizer = tf.keras.optimizers.Adam(learning_rate=0.0005, clipnorm=1.0)  # Clip gradients with L2 norm threshold = 1.0
model.compile(optimizer=optimizer, loss=custom_loss(y[:,1],lambda_param=0.01))

# Entrainement
model.fit(X, y_scaled[:, 0], epochs=5, batch_size=32)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x2570666e2c0>

In [None]:
# true_volatility = predictions_df['True_Volatility'].values
# predicted_volatility = predictions_df['Predicted_Volatility'].values

# # Preparing the data for LSTM
# X = np.array([true_volatility[i-90:i] for i in range(90, len(true_volatility))]).reshape(-1, 90, 1)
# y = np.vstack((true_volatility[90:], predicted_volatility[90:])).T

# # Ensure X and y are of type float32
# X = X.astype(np.float32)
# y = y.astype(np.float32)

# # Check shapes and types
# print(f"X shape: {X.shape}, X dtype: {X.dtype}")
# print(f"y shape: {y.shape}, y dtype: {y.dtype}")

# # Model Initialization and Compilation
# model = create_lstm_model()
# model.compile(optimizer='adam', loss="mse")

# # Model Training
# model.fit(X[:,1], y[:,1], epochs=100, batch_size=32)

In [51]:
# Step 1: Predict standardized volatilities using the LSTM model
y_pred_scaled = model.predict(X_scaled)  # Shape: (n_samples, 1)

# Step 2: Inverse transform the predicted volatilities to the original scale
# Create a dummy array with the same shape as y_scaled
dummy_array = np.zeros_like(y_scaled)
dummy_array[:, 1] = y_pred_scaled.flatten()  # Replace the second column with predicted volatilities

# Inverse transform the dummy array
y_pred_inverse = scaler_y.inverse_transform(dummy_array)  # Shape: (n_samples, 2)




In [52]:

# Extract the inverse-transformed predicted volatilities
predicted_volatility_lstm_original_scale = y_pred_inverse[:, 1]  # Shape: (n_samples,)

# Step 3: Combine predicted volatilities with corresponding means
means = predictions_df['Predicted_Mean'].values[90:]  # Shape: (n_samples,)

# Calculate log returns predicted
log_returns_pred = means + np.sqrt(predicted_volatility_lstm_original_scale)

output_df = pd.DataFrame({
    'Date': predictions_df.index[90:],  # Use the corresponding dates
    'Predicted_Mean': means,
    'Predicted_Volatility GINN': predicted_volatility_lstm_original_scale,
    'Log_Return': log_returns_pred
})

# Display the output DataFrame
print(output_df)

           Date  Predicted_Mean  Predicted_Volatility GINN  Log_Return
0    1994-09-20      823.590258                  10.878177  826.888466
1    1994-09-21      823.627961                  10.731100  826.903796
2    1994-09-22      823.659441                  10.582673  826.912543
3    1994-09-23      823.685630                  10.432508  826.915569
4    1994-09-26      823.709501                  10.290258  826.917345
...         ...             ...                        ...         ...
4484 2012-07-10      945.893304                 139.192093  957.691273
4485 2012-07-11      945.835104                 139.378479  957.640970
4486 2012-07-12      946.478102                 139.569885  958.292071
4487 2012-07-13      945.721688                 139.795135  957.545187
4488 2012-07-16      945.688446                 139.985580  957.519997

[4489 rows x 4 columns]


In [56]:
print(y[:,0])

[678329.44 678370.5  678421.06 ... 895825.5  894359.1  894334.06]
