# SigTKAN with a Manual Loop - Time Series Prediction

This notebook implements a version of SigTKAN with a manual time loop (without inheriting from the Keras RNN class) to test whether integrating signatures into TKAN improves performance.
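
To make the notion of a signature concrete, here is a minimal NumPy sketch of the depth-2 signature of a piecewise-linear path, built segment by segment with Chen's identity. The function name `signature_level2` is hypothetical and the code is an illustration only: `keras_sig.SigLayer` may order, scale, or compute its terms differently.

```python
import numpy as np

def signature_level2(path):
    """Depth-2 signature of a piecewise-linear path of shape (time, dim)."""
    path = np.asarray(path, dtype=float)
    dim = path.shape[1]
    s1 = np.zeros(dim)          # level 1: total increment
    s2 = np.zeros((dim, dim))   # level 2: iterated integrals
    for dx in np.diff(path, axis=0):
        # Chen's identity for appending one linear segment
        s2 += np.outer(s1, dx) + 0.5 * np.outer(dx, dx)
        s1 += dx
    return s1, s2

# Toy path: one step right, then two steps up
path = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 2.0]])
s1, s2 = signature_level2(path)
levy_area = 0.5 * (s2[0, 1] - s2[1, 0])  # antisymmetric part of level 2
```

The symmetric part of the level-2 term is fully determined by level 1 (`s2 + s2.T` equals the outer product of `s1` with itself), so the genuinely new information at depth 2 is the antisymmetric part, the Lévy area.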

## Objectives:
1. Implement SigTKAN with a manual loop
2. Compare its performance with the standard TKAN and other models
3. Test on cryptocurrency volume/volatility prediction

In [None]:
# 🧹 Cleanup and initial setup
import os, sys, shutil

# 1. Return to the root directory
os.chdir("/content")
print("📍 Current directory:", os.getcwd())

# 2. Remove any existing copy of sigtkan
if os.path.exists("sigtkan"):
    shutil.rmtree("sigtkan")
    print("🧹 sigtkan folder removed")

# 3. Clone the GitHub repository
!git clone https://github.com/julienmoury/sigtkan.git
%cd sigtkan

# 4. Add the project to the Python path
sys.path.append(os.getcwd())

# 5. Install the dependencies
if os.path.exists("requirements.txt"):
    %pip install -r requirements.txt
else:
    print("⚠️ No requirements.txt found")

# 6. Show where we are and what we have
print("📂 Working directory:", os.getcwd())
print("📁 Contents:", os.listdir())

In [None]:
# Install dependencies and import libraries
import os
import sys
import warnings
warnings.filterwarnings('ignore')

# Install the required packages if they are not already present
try:
    import keras_efficient_kan
    import keras_sig
except ImportError:
    print("Installing missing dependencies...")
    # Use the interpreter running this notebook so the packages land in the right environment
    os.system(f"{sys.executable} -m pip install keras-efficient-kan keras-sig")

import time
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from IPython.display import display

import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense, Layer
from tensorflow.keras.optimizers import Adam
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error

# Configuration for reproducibility
tf.keras.utils.set_random_seed(42)
tf.config.experimental.enable_op_determinism()

print("TensorFlow version:", tf.__version__)
print("Setup complete!")

In [None]:
# Import the KAN, signature, and SigTKAN building blocks
from keras_efficient_kan import KANLinear
from keras_sig import SigLayer
from sigtkan import SigTKANCell  # Reused as the per-step cell inside SigTKANManual

# Import other models for comparison (optional)
try:
    from tkan import TKAN
    from sigkan import SigKAN
    print("TKAN and SigKAN modules imported successfully")
except ImportError as e:
    print(f"Import error: {e}")
    print("Some comparison modules are not available")

In [None]:
class MinMaxScaler:
    """Custom MinMax scaler that handles 1D, 2D, and 3D data."""
    
    def __init__(self, feature_axis=None, minmax_range=(0, 1)):
        self.feature_axis = feature_axis
        self.min_ = None
        self.max_ = None
        self.scale_ = None
        self.minmax_range = minmax_range

    def fit(self, X):
        if X.ndim == 3 and self.feature_axis is not None:
            axis = tuple(i for i in range(X.ndim) if i != self.feature_axis)
            self.min_ = np.min(X, axis=axis)
            self.max_ = np.max(X, axis=axis)
        elif X.ndim == 2:
            self.min_ = np.min(X, axis=0)
            self.max_ = np.max(X, axis=0)
        elif X.ndim == 1:
            self.min_ = np.min(X)
            self.max_ = np.max(X)
        else:
            raise ValueError("Data must be 1D, 2D, or 3D.")

        self.scale_ = self.max_ - self.min_
        return self

    def transform(self, X):
        X_scaled = (X - self.min_) / (self.scale_ + 1e-8)  # Avoid division by zero
        X_scaled = X_scaled * (self.minmax_range[1] - self.minmax_range[0]) + self.minmax_range[0]
        return X_scaled

    def fit_transform(self, X):
        return self.fit(X).transform(X)

    def inverse_transform(self, X_scaled):
        X = (X_scaled - self.minmax_range[0]) / (self.minmax_range[1] - self.minmax_range[0])
        X = X * (self.scale_ + 1e-8) + self.min_  # Same epsilon as in transform
        return X
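
The per-feature scaling of a 3D array can be checked in isolation. The sketch below reproduces the same idea with plain NumPy on random data (the array shape and the seed are arbitrary assumptions): the minimum and maximum are reduced over every axis except the feature axis, and broadcasting applies them along the last axis.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 5, 3))   # (samples, time steps, features)

# Reduce over every axis except the feature axis (axis 2)
x_min = X.min(axis=(0, 1))
x_max = X.max(axis=(0, 1))
scale = x_max - x_min + 1e-8     # epsilon guards against constant features

X_scaled = (X - x_min) / scale   # broadcasts over the last axis
X_back = X_scaled * scale + x_min
```

Using the same `scale` (epsilon included) in both directions keeps the round trip exact up to floating-point error.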

In [None]:
@tf.keras.utils.register_keras_serializable(package="sigtkan_manual", name="SigTKANManual")
class SigTKANManual(Layer):
    """
    Manual implementation of SigTKAN that does not inherit from RNN.
    This version implements the time loop explicitly.
    """
    
    def __init__(
        self,
        units,
        sig_level=2,
        sub_kan_configs=None,
        sub_kan_output_dim=None,
        sub_kan_input_dim=None,
        activation="tanh",
        recurrent_activation="sigmoid",
        return_sequences=False,
        return_state=False,
        **kwargs
    ):
        super().__init__(**kwargs)
        self.units = units
        self.sig_level = sig_level
        self.sub_kan_configs = sub_kan_configs or [None]
        self.sub_kan_output_dim = sub_kan_output_dim
        self.sub_kan_input_dim = sub_kan_input_dim
        self.activation = tf.keras.activations.get(activation)
        self.recurrent_activation = tf.keras.activations.get(recurrent_activation)
        self.return_sequences = return_sequences
        self.return_state = return_state
        
        # Internal state
        self.cell = None
        self.built_cell = False

    def build(self, input_shape):
        batch_size, sequence_length, input_dim = input_shape
        
        # Fall back to the input dimension when the sub-KAN dimensions are not specified
        if self.sub_kan_input_dim is None:
            self.sub_kan_input_dim = input_dim
        if self.sub_kan_output_dim is None:
            self.sub_kan_output_dim = input_dim
            
        # Create a SigTKAN cell for the per-step computation
        self.cell = SigTKANCell(
            units=self.units,
            sig_level=self.sig_level,
            sub_kan_configs=self.sub_kan_configs,
            sub_kan_output_dim=self.sub_kan_output_dim,
            sub_kan_input_dim=self.sub_kan_input_dim,
            activation=self.activation,
            recurrent_activation=self.recurrent_activation
        )
        
        # Build the cell with the per-step input shape
        self.cell.build((batch_size, input_dim))
        self.built_cell = True
        
        super().build(input_shape)

    def call(self, inputs, initial_state=None, training=None):
        """
        Manual loop over the time dimension.
        
        Args:
            inputs: Tensor of shape (batch_size, sequence_length, input_dim)
            initial_state: Initial state (optional)
            training: Training mode flag
        
        Returns:
            outputs: Full sequence or last output, depending on return_sequences
            states: Final states if return_state=True
        """
        batch_size = tf.shape(inputs)[0]
        # The Python loop below is unrolled at trace time, so the sequence
        # length must be known statically.
        sequence_length = inputs.shape[1]
        if sequence_length is None:
            raise ValueError("SigTKANManual requires a fixed sequence length.")
        
        # Initialize the states if none are provided
        if initial_state is None:
            states = self.cell.get_initial_state(
                inputs=inputs, 
                batch_size=batch_size, 
                dtype=inputs.dtype
            )
        else:
            states = initial_state
        
        # Manual loop over the sequence
        step_outputs = []
        for t in range(sequence_length):
            # Extract the input at time t
            input_t = inputs[:, t, :]
            
            # Apply the SigTKAN cell
            output_t, states = self.cell(input_t, states, training=training)
            
            # Keep every step only if the full sequence is requested
            if self.return_sequences:
                step_outputs.append(output_t)
        
        if self.return_sequences:
            # Stack the per-step outputs into (batch, time, features)
            final_output = tf.stack(step_outputs, axis=1)
        else:
            # Return only the last output
            final_output = output_t
        
        if self.return_state:
            return final_output, states
        else:
            return final_output

    def get_config(self):
        config = super().get_config()
        config.update({
            "units": self.units,
            "sig_level": self.sig_level,
            "sub_kan_configs": self.sub_kan_configs,
            "sub_kan_output_dim": self.sub_kan_output_dim,
            "sub_kan_input_dim": self.sub_kan_input_dim,
            "activation": tf.keras.activations.serialize(self.activation),
            "recurrent_activation": tf.keras.activations.serialize(self.recurrent_activation),
            "return_sequences": self.return_sequences,
            "return_state": self.return_state,
        })
        return config
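
The control flow of the manual loop is easier to inspect without the SigTKAN cell. The sketch below runs the same loop in NumPy with a toy tanh cell; `toy_cell`, `manual_rnn`, and the weight shapes are hypothetical stand-ins, not the actual `SigTKANCell` computation.

```python
import numpy as np

def toy_cell(x_t, h, W, U):
    """One recurrent step: new hidden state from input x_t and previous state h."""
    h_new = np.tanh(x_t @ W + h @ U)
    return h_new, h_new  # (output, new state)

def manual_rnn(inputs, W, U, return_sequences=False):
    batch, steps, _ = inputs.shape
    h = np.zeros((batch, U.shape[0]))  # zero initial state
    outputs = []
    for t in range(steps):
        out, h = toy_cell(inputs[:, t, :], h, W, U)
        outputs.append(out)
    # Stack along axis 1 to get (batch, time, units)
    return np.stack(outputs, axis=1) if return_sequences else outputs[-1]

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 24, 3))
W = rng.normal(size=(3, 8)) * 0.1
U = rng.normal(size=(8, 8)) * 0.1
seq = manual_rnn(x, W, U, return_sequences=True)
last = manual_rnn(x, W, U)
```

With `return_sequences=False` the result is simply the last slice of the full sequence, which is the invariant the layer above should also satisfy.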

In [None]:
# Load and prepare the data
df = pd.read_parquet('data.parquet')
df = df[(df.index >= pd.Timestamp('2020-01-01')) & (df.index < pd.Timestamp('2023-01-01'))]

# Select the crypto assets
assets = ['BTC', 'ETH', 'ADA', 'XMR', 'EOS', 'MATIC', 'TRX', 'FTM', 'BNB', 'XLM', 
          'ENJ', 'CHZ', 'BUSD', 'ATOM', 'LINK', 'ETC', 'XRP', 'BCH', 'LTC']

# Keep only the quote asset volume columns of the selected assets
df = df[[c for c in df.columns if 'quote asset volume' in c and any(asset in c for asset in assets)]]
df.columns = [c.replace(' quote asset volume', '') for c in df.columns]

# Build the known temporal features (hour of day, day of week) from the datetime index
known_input_df = pd.DataFrame(
    index=df.index, 
    data=np.column_stack([df.index.hour, df.index.dayofweek]), 
    columns=['hour', 'dayofweek']
)

print(f"Data loaded: {df.shape}")
print(f"Period: {df.index.min()} to {df.index.max()}")
print(f"Columns: {list(df.columns[:5])}..." if len(df.columns) > 5 else f"Columns: {list(df.columns)}")

display(df.head())
display(known_input_df.head())

In [None]:
def generate_sequences(df, sequence_length, n_ahead):
    """
    Generate the sequences used to train the models.
    
    Args:
        df: DataFrame holding the time series data
        sequence_length: Length of the input sequences
        n_ahead: Number of time steps to predict
    
    Returns:
        Dictionary with the prepared training and test data and the fitted scalers
    """
    # Normalize by a two-week rolling median (24 * 14 hourly steps), shifted by n_ahead
    scaler_df = df.copy().shift(n_ahead).rolling(24 * 14).median()
    tmp_df = df.copy() / scaler_df
    tmp_df = tmp_df.iloc[24 * 14 + n_ahead:].fillna(0.)
    scaler_df = scaler_df.iloc[24 * 14 + n_ahead:].fillna(0.)
    
    def prepare_sequences(df, scaler_df, n_history, n_future):
        X, y, y_scaler = [], [], []
        
        for i in range(n_history, len(df) - n_future + 1):
            # Input sequence
            X.append(df.iloc[i - n_history:i].values)
            # Values to predict (first column only)
            y.append(df.iloc[i:i + n_future, 0:1].values)
            # Denormalization factor
            y_scaler.append(scaler_df.iloc[i:i + n_future, 0:1].values)
        
        return np.array(X), np.array(y), np.array(y_scaler)
    
    # Build the sequences
    X, y, y_scaler = prepare_sequences(tmp_df, scaler_df, sequence_length, n_ahead)
    
    # Chronological train/test split
    train_ratio = 0.8
    train_size = int(len(X) * train_ratio)
    
    X_train_raw, X_test_raw = X[:train_size], X[train_size:]
    y_train_raw, y_test_raw = y[:train_size], y[train_size:]
    y_scaler_train, y_scaler_test = y_scaler[:train_size], y_scaler[train_size:]
    
    # Scale the features (fitted on the training set only)
    X_scaler = MinMaxScaler(feature_axis=2)
    X_train = X_scaler.fit_transform(X_train_raw)
    X_test = X_scaler.transform(X_test_raw)
    
    # Scale the targets (fitted on the training set only)
    y_scaler_norm = MinMaxScaler(feature_axis=2)
    y_train = y_scaler_norm.fit_transform(y_train_raw).reshape(y_train_raw.shape[0], -1)
    y_test = y_scaler_norm.transform(y_test_raw).reshape(y_test_raw.shape[0], -1)
    
    return {
        'X_scaler': X_scaler,
        'X_train': X_train, 'X_test': X_test,
        'X_train_raw': X_train_raw, 'X_test_raw': X_test_raw,
        'y_scaler': y_scaler_norm,
        'y_train': y_train, 'y_test': y_test,
        'y_train_raw': y_train_raw, 'y_test_raw': y_test_raw,
        'y_scaler_train': y_scaler_train, 'y_scaler_test': y_scaler_test
    }

# Parameters
SEQUENCE_LENGTH = 24  # 24 hours of history
N_AHEAD = 6          # Predict the next 6 hours

# Generate the data
print("Generating sequences...")
data = generate_sequences(df, SEQUENCE_LENGTH, N_AHEAD)

print(f"X_train shape: {data['X_train'].shape}")
print(f"y_train shape: {data['y_train'].shape}")
print(f"X_test shape: {data['X_test'].shape}")
print(f"y_test shape: {data['y_test'].shape}")
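
The windowing and the chronological split can be verified on a tiny synthetic series. The sketch below mirrors the indexing of `prepare_sequences` with NumPy only; `make_windows` and the toy array are hypothetical, and the rolling-median normalization is left out.

```python
import numpy as np

def make_windows(values, n_history, n_future):
    """Split a (time, features) array into input windows and first-column targets."""
    X, y = [], []
    for i in range(n_history, len(values) - n_future + 1):
        X.append(values[i - n_history:i])   # the n_history steps before i
        y.append(values[i:i + n_future, 0]) # the next n_future steps, column 0
    return np.array(X), np.array(y)

values = np.arange(40, dtype=float).reshape(20, 2)  # toy series: 20 steps, 2 features
X, y = make_windows(values, n_history=4, n_future=2)

# Chronological split: the last 20% of windows form the test set, no shuffling
split = int(len(X) * 0.8)
X_train, X_test = X[:split], X[split:]
```

Each target window starts exactly where its input window ends, so no future value leaks into the inputs.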

In [None]:
def create_sigtkan_manual_model(input_shape, units=64, sig_level=2):
    """
    Create a model built around the manual SigTKAN layer.
    """
    inputs = Input(shape=input_shape, name='input_sequence')
    
    # Manual SigTKAN layer
    sigtkan_out = SigTKANManual(
        units=units,
        sig_level=sig_level,
        return_sequences=False,
        name='sigtkan_manual'
    )(inputs)
    
    # Output head
    dense1 = Dense(units//2, activation='relu', name='dense1')(sigtkan_out)
    outputs = Dense(N_AHEAD, activation='linear', name='output')(dense1)
    
    model = Model(inputs=inputs, outputs=outputs, name='SigTKAN_Manual')
    return model

def create_baseline_lstm_model(input_shape, units=64):
    """
    Create a baseline LSTM model.
    """
    inputs = Input(shape=input_shape, name='input_sequence')
    
    lstm_out = tf.keras.layers.LSTM(
        units=units,
        return_sequences=False,
        name='lstm'
    )(inputs)
    
    dense1 = Dense(units//2, activation='relu', name='dense1')(lstm_out)
    outputs = Dense(N_AHEAD, activation='linear', name='output')(dense1)
    
    model = Model(inputs=inputs, outputs=outputs, name='LSTM_Baseline')
    return model

def create_baseline_gru_model(input_shape, units=64):
    """
    Create a baseline GRU model.
    """
    inputs = Input(shape=input_shape, name='input_sequence')
    
    gru_out = tf.keras.layers.GRU(
        units=units,
        return_sequences=False,
        name='gru'
    )(inputs)
    
    dense1 = Dense(units//2, activation='relu', name='dense1')(gru_out)
    outputs = Dense(N_AHEAD, activation='linear', name='output')(dense1)
    
    model = Model(inputs=inputs, outputs=outputs, name='GRU_Baseline')
    return model

# Input shape
input_shape = (SEQUENCE_LENGTH, data['X_train'].shape[2])
print(f"Input shape: {input_shape}")

# Create the models
print("Creating models...")
sigtkan_model = create_sigtkan_manual_model(input_shape)
lstm_model = create_baseline_lstm_model(input_shape)
gru_model = create_baseline_gru_model(input_shape)

print("Models created successfully!")
print(f"SigTKAN parameters: {sigtkan_model.count_params():,}")
print(f"LSTM parameters: {lstm_model.count_params():,}")
print(f"GRU parameters: {gru_model.count_params():,}")
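
The baseline parameter counts printed above can be cross-checked by hand. The sketch below uses the standard formulas for Keras LSTM and GRU layers (GRU with the default `reset_after=True`) plus the two dense layers of the output head; `input_dim=19` is an assumed feature count, so substitute `data['X_train'].shape[2]` for the real value.

```python
def lstm_params(input_dim, units):
    # 4 gates, each with an input kernel, a recurrent kernel, and a bias
    return 4 * (units * (input_dim + units) + units)

def gru_params(input_dim, units):
    # 3 gates; Keras' default reset_after=True keeps two bias vectors per gate
    return 3 * (units * (input_dim + units) + 2 * units)

def dense_params(in_dim, out_dim):
    return in_dim * out_dim + out_dim

input_dim, units, n_ahead = 19, 64, 6   # input_dim=19 is an assumption
head = dense_params(units, units // 2) + dense_params(units // 2, n_ahead)
lstm_total = lstm_params(input_dim, units) + head
gru_total = gru_params(input_dim, units) + head
```

Because all three models share the same head, any difference in total parameter count comes from the recurrent layer alone.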

In [None]:
# Training configuration
EPOCHS = 50
BATCH_SIZE = 64
LEARNING_RATE = 0.001

# Callbacks
early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss',
    patience=10,
    restore_best_weights=True,
    verbose=1
)

reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(
    monitor='val_loss',
    factor=0.5,
    patience=5,
    min_lr=1e-6,
    verbose=1
)

callbacks = [early_stopping, reduce_lr]

# Training function
def train_model(model, model_name, X_train, y_train, X_test, y_test):
    print(f"\n{'='*50}")
    print(f"Training model: {model_name}")
    print(f"{'='*50}")
    
    # Compile
    model.compile(
        optimizer=Adam(learning_rate=LEARNING_RATE),
        loss='mse',
        metrics=['mae']
    )
    
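    # Training. The test set is passed as validation data, so early stopping
    # and learning-rate scheduling are driven by the test-set loss.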
    start_time = time.time()
    history = model.fit(
        X_train, y_train,
        batch_size=BATCH_SIZE,
        epochs=EPOCHS,
        validation_data=(X_test, y_test),
        callbacks=callbacks,
        verbose=1
    )
    training_time = time.time() - start_time
    
    print(f"Training time: {training_time:.2f} seconds")
    
    return history, training_time

# Dictionary that stores the results
results = {}
models = {
    'SigTKAN_Manual': sigtkan_model,
    'LSTM_Baseline': lstm_model,
    'GRU_Baseline': gru_model
}

print("Starting model training...")
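
The patience rule behind `EarlyStopping` can be sketched in a few lines of plain Python. This is a simplified model of the callback (no `min_delta`, no weight restoring), and the function name and the loss values are made up for illustration.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch), both 0-indexed, for a list of validation losses."""
    best_loss = float("inf")
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # Improvement: remember it and reset the patience counter
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Never ran out of patience: training used every epoch
    return len(val_losses) - 1, best_epoch

losses = [0.9, 0.7, 0.6, 0.65, 0.64, 0.63, 0.62]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=3)
```

Training stops three epochs after the best one even though the loss is slowly improving again, which is why `restore_best_weights=True` matters in the configuration above.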

In [None]:
# Train all models
for model_name, model in models.items():
    try:
        history, training_time = train_model(
            model, model_name, 
            data['X_train'], data['y_train'],
            data['X_test'], data['y_test']
        )
        
        # Predictions
        y_pred_train = model.predict(data['X_train'], verbose=0)
        y_pred_test = model.predict(data['X_test'], verbose=0)
        
        # Metrics (computed on the scaled targets)
        train_mse = mean_squared_error(data['y_train'], y_pred_train)
        test_mse = mean_squared_error(data['y_test'], y_pred_test)
        train_mae = mean_absolute_error(data['y_train'], y_pred_train)
        test_mae = mean_absolute_error(data['y_test'], y_pred_test)
        train_r2 = r2_score(data['y_train'], y_pred_train)
        test_r2 = r2_score(data['y_test'], y_pred_test)
        
        # Store the results
        results[model_name] = {
            'model': model,
            'history': history,
            'training_time': training_time,
            'y_pred_train': y_pred_train,
            'y_pred_test': y_pred_test,
            'metrics': {
                'train_mse': train_mse,
                'test_mse': test_mse,
                'train_mae': train_mae,
                'test_mae': test_mae,
                'train_r2': train_r2,
                'test_r2': test_r2
            }
        }
        
        print(f"\n{model_name} - Results:")
        print(f"  Train MSE: {train_mse:.6f}")
        print(f"  Test MSE: {test_mse:.6f}")
        print(f"  Train R²: {train_r2:.4f}")
        print(f"  Test R²: {test_r2:.4f}")
        
    except Exception as e:
        print(f"Error while training {model_name}: {str(e)}")
        import traceback
        traceback.print_exc()

print("\nTraining complete!")
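
For reference, the three metrics reduce to a few array operations. The sketch below computes them by hand for a single-output toy example (the function name and the numbers are illustrative only; with several output columns, scikit-learn's `r2_score` averages the per-column scores by default, which this sketch does not do).

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mse = np.mean(err ** 2)
    mae = np.mean(np.abs(err))
    ss_res = np.sum(err ** 2)                        # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return {"mse": mse, "mae": mae, "r2": 1.0 - ss_res / ss_tot}

m = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
```

An R² of 0 corresponds to always predicting the mean of the targets, and negative values mean the model does worse than that.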

In [None]:
# Compare the results
print("\n" + "="*80)
print("PERFORMANCE COMPARISON")
print("="*80)

# Comparison table
comparison_data = []
for model_name, result in results.items():
    metrics = result['metrics']
    comparison_data.append({
        'Model': model_name,
        'Train MSE': f"{metrics['train_mse']:.6f}",
        'Test MSE': f"{metrics['test_mse']:.6f}",
        'Train R²': f"{metrics['train_r2']:.4f}",
        'Test R²': f"{metrics['test_r2']:.4f}",
        'Time (s)': f"{result['training_time']:.1f}"
    })

comparison_df = pd.DataFrame(comparison_data)
display(comparison_df)

# Find the best model
best_model_name = min(results.keys(), key=lambda x: results[x]['metrics']['test_mse'])
print(f"\n🏆 Best model (Test MSE): {best_model_name}")

# Comparison plots
fig, axes = plt.subplots(2, 2, figsize=(15, 10))

# Loss curves
axes[0, 0].set_title('Loss curves')
for model_name, result in results.items():
    if 'history' in result:
        history = result['history']
        axes[0, 0].plot(history.history['loss'], label=f'{model_name} (train)')
        axes[0, 0].plot(history.history['val_loss'], '--', label=f'{model_name} (val)')
axes[0, 0].set_xlabel('Epoch')
axes[0, 0].set_ylabel('Loss (MSE)')
axes[0, 0].legend()
axes[0, 0].set_yscale('log')

# Test MSE comparison
model_names = list(results.keys())
test_mse_values = [results[name]['metrics']['test_mse'] for name in model_names]
axes[0, 1].bar(model_names, test_mse_values)
axes[0, 1].set_title('Test MSE by model')
axes[0, 1].set_ylabel('MSE')
axes[0, 1].tick_params(axis='x', rotation=45)

# R² comparison
test_r2_values = [results[name]['metrics']['test_r2'] for name in model_names]
axes[1, 0].bar(model_names, test_r2_values)
axes[1, 0].set_title('Test R² by model')
axes[1, 0].set_ylabel('R²')
axes[1, 0].tick_params(axis='x', rotation=45)

# Training time comparison
training_times = [result['training_time'] for result in results.values()]
axes[1, 1].bar(model_names, training_times)
axes[1, 1].set_title('Training time')
axes[1, 1].set_ylabel('Time (seconds)')
axes[1, 1].tick_params(axis='x', rotation=45)

plt.tight_layout()
plt.show()

In [None]:
# Detailed analysis of the best model's predictions
best_result = results[best_model_name]
y_pred_test = best_result['y_pred_test']

# Visualize the predictions
n_samples_to_plot = min(100, len(data['y_test']))
indices = np.arange(n_samples_to_plot)

fig, axes = plt.subplots(2, 2, figsize=(15, 10))

# Predictions vs. actual values (sample)
axes[0, 0].plot(indices, data['y_test'][:n_samples_to_plot, 0], 'b-', label='Actual', alpha=0.7)
axes[0, 0].plot(indices, y_pred_test[:n_samples_to_plot, 0], 'r-', label='Predicted', alpha=0.7)
axes[0, 0].set_title(f'{best_model_name} - Predicted vs. actual (first time step)')
axes[0, 0].set_xlabel('Sample')
axes[0, 0].set_ylabel('Normalized value')
axes[0, 0].legend()

# Scatter plot - correlation
axes[0, 1].scatter(data['y_test'][:, 0], y_pred_test[:, 0], alpha=0.5)
axes[0, 1].plot([data['y_test'][:, 0].min(), data['y_test'][:, 0].max()], 
                [data['y_test'][:, 0].min(), data['y_test'][:, 0].max()], 'r--')
axes[0, 1].set_title(f'{best_model_name} - Predicted/actual correlation')
axes[0, 1].set_xlabel('Actual values')
axes[0, 1].set_ylabel('Predicted values')

# Error distribution
errors = y_pred_test[:, 0] - data['y_test'][:, 0]
axes[1, 0].hist(errors, bins=50, alpha=0.7, density=True)
axes[1, 0].set_title(f'{best_model_name} - Error distribution')
axes[1, 0].set_xlabel('Error (predicted - actual)')
axes[1, 0].set_ylabel('Density')

# Absolute errors over time
abs_errors = np.abs(errors)
axes[1, 1].plot(indices, abs_errors[:n_samples_to_plot])
axes[1, 1].set_title(f'{best_model_name} - Absolute errors over time')
axes[1, 1].set_xlabel('Sample')
axes[1, 1].set_ylabel('Absolute error')

plt.tight_layout()
plt.show()

# Error statistics
print(f"\n📊 Error statistics ({best_model_name}):")
print(f"  Mean error: {np.mean(errors):.6f}")
print(f"  Mean absolute error: {np.mean(abs_errors):.6f}")
print(f"  Error standard deviation: {np.std(errors):.6f}")
print(f"  Max error: {np.max(abs_errors):.6f}")

In [None]:
# Statistical significance test of the performance differences
from scipy import stats

print("\n" + "="*60)
print("STATISTICAL COMPARISON TESTS")
print("="*60)

# Compare SigTKAN against the baselines
if 'SigTKAN_Manual' in results:
    sigtkan_errors = np.abs(results['SigTKAN_Manual']['y_pred_test'][:, 0] - data['y_test'][:, 0])
    
    for baseline_name in ['LSTM_Baseline', 'GRU_Baseline']:
        if baseline_name in results:
            baseline_errors = np.abs(results[baseline_name]['y_pred_test'][:, 0] - data['y_test'][:, 0])
            
            # Wilcoxon signed-rank test (non-parametric, paired)
            statistic, p_value = stats.wilcoxon(sigtkan_errors, baseline_errors, alternative='two-sided')
            
            print(f"\nWilcoxon test: SigTKAN_Manual vs {baseline_name}")
            print(f"  Statistic: {statistic:.4f}")
            print(f"  p-value: {p_value:.6f}")
            
            if p_value < 0.05:
                if np.mean(sigtkan_errors) < np.mean(baseline_errors):
                    print(f"  → SigTKAN_Manual is significantly BETTER than {baseline_name}")
                else:
                    print(f"  → SigTKAN_Manual is significantly WORSE than {baseline_name}")
            else:
                print(f"  → No significant difference between SigTKAN_Manual and {baseline_name}")

# Convergence analysis
print("\n📈 CONVERGENCE ANALYSIS:")
for model_name, result in results.items():
    if 'history' in result:
        history = result['history']
        val_losses = history.history['val_loss']
        
        # Find the epoch with the best validation loss
        best_epoch = np.argmin(val_losses)
        min_val_loss = val_losses[best_epoch]
        
        print(f"\n{model_name}:")
        print(f"  Best epoch: {best_epoch + 1}")
        print(f"  Best val_loss: {min_val_loss:.6f}")
        print(f"  Total number of epochs: {len(val_losses)}")
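
The statistic behind the Wilcoxon signed-rank test is simple to compute by hand, which helps when interpreting the output above. The sketch below is a plain-Python illustration on made-up integer errors (the function name is hypothetical and no p-value is computed); for the two-sided alternative, `scipy.stats.wilcoxon` reports the smaller of the two rank sums.

```python
def signed_rank_sums(a, b):
    """Return (W_plus, W_minus) for paired samples, dropping zero differences."""
    diffs = [x - y for x, y in zip(a, b) if x != y]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Give tied absolute differences the average of their ranks
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus

# Made-up absolute errors in arbitrary integer units
errors_a = [10, 30, 20, 50, 40]
errors_b = [20, 25, 40, 45, 70]
w_plus, w_minus = signed_rank_sums(errors_a, errors_b)
statistic = min(w_plus, w_minus)
```

The test works on the signs and ranks of the paired differences, so it is sensitive to which model wins more often and by how much in rank terms, not to the difference in mean error.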

## Conclusions and Analysis

### Goals of the experiment
This experiment tests whether integrating signatures into TKAN (SigTKAN) improves predictive performance over the baseline models (LSTM, GRU).

### Manual implementation
- ✅ **SigTKAN with a manual loop**: Implemented without inheriting from the Keras RNN class
- ✅ **Signature integration**: Signature transforms are used to enrich the temporal information
- ✅ **Fair comparison**: Same output head for all models

### Performance metrics
The models were evaluated on:
- **MSE (Mean Squared Error)**
- **MAE (Mean Absolute Error)**
- **R² (coefficient of determination)**: Goodness of fit
- **Training time**: Computational efficiency

### Key points observed
1. **Model complexity**: SigTKAN is more complex because of the signature transforms
2. **Convergence**: The loss curves are analyzed to assess training stability
3. **Generalization**: Train and test performance are compared

### Possible next steps
- Test on other datasets (volatility, other financial assets)
- Tune the hyperparameters (sig_level, architecture of the KAN sub-layers)
- Analyze the impact of different signature levels
- Compare with other advanced architectures (Transformers, etc.)
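
When studying the impact of the signature level, it helps to know how fast the feature count grows. The sketch below counts the signature terms up to a given level for a `dim`-dimensional path; the helper name is hypothetical, and which dimension SigTKAN actually feeds into its signature layer is not assumed here.

```python
def signature_size(dim, level):
    """Number of signature terms up to `level` for a `dim`-dimensional path (level 0 excluded)."""
    return sum(dim ** k for k in range(1, level + 1))

# Growth for a 3-dimensional path
sizes = {level: signature_size(dim=3, level=level) for level in (1, 2, 3, 4)}
```

The count grows geometrically with the level, so raising `sig_level` quickly increases both the memory footprint and the training time.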

In [None]:
# Save the results for later analysis
import json

# Save the metrics and configurations
results_summary = {}
for model_name, result in results.items():
    results_summary[model_name] = {
        'metrics': result['metrics'],
        'training_time': result['training_time'],
        'model_params': result['model'].count_params()
    }

# Save as JSON
with open('sigtkan_manual_results.json', 'w') as f:
    json.dump(results_summary, f, indent=2)

# Save the models (only the best one, to save space)
best_result['model'].save('best_sigtkan_manual_model.keras')

print("✅ Results saved:")
print("  - sigtkan_manual_results.json (metrics)")
print("  - best_sigtkan_manual_model.keras (best model)")
print("\n🎯 Experiment completed successfully!")
print(f"   Best model: {best_model_name}")
print(f"   Test R²: {results[best_model_name]['metrics']['test_r2']:.4f}")
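
The saved JSON summary can be reloaded later without retraining. The sketch below shows the round trip on a made-up summary with the same structure (the model names match this notebook, but the numbers and the temporary file path are placeholders, not results).

```python
import json
import os
import tempfile

# Placeholder values with the same structure as results_summary
summary = {
    "LSTM_Baseline": {"metrics": {"test_mse": 0.02}, "training_time": 40.0},
    "GRU_Baseline": {"metrics": {"test_mse": 0.01}, "training_time": 35.0},
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "results.json")
    with open(path, "w") as f:
        json.dump(summary, f, indent=2)
    with open(path) as f:
        loaded = json.load(f)

# Re-derive the best model from the reloaded metrics
best = min(loaded, key=lambda name: loaded[name]["metrics"]["test_mse"])
```

Only plain numbers and strings go into the summary, which is what keeps it JSON-serializable; model objects and training histories have to be saved separately.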