# LSTM and GRU with TensorFlow

This notebook explores the LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit) architectures, which address the main limitations of vanilla RNNs, in particular vanishing gradients over long sequences.

## Notebook objectives

1. Understand the gating mechanisms in LSTM and GRU
2. Implement LSTM and GRU with TensorFlow/Keras
3. Compare the performance of RNN, LSTM, and GRU
4. Visualize how the gates behave
5. Work through advanced practical applications

## 1. Installation and imports

In [None]:
# Install dependencies
!pip install tensorflow numpy pandas matplotlib seaborn scikit-learn plotly

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix
import plotly.graph_objects as go
import plotly.express as px
from plotly.subplots import make_subplots
import warnings
warnings.filterwarnings('ignore')

# Display configuration
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette("husl")

print(f"TensorFlow version: {tf.__version__}")
print(f"GPU available: {len(tf.config.list_physical_devices('GPU')) > 0}")

## 2. Understanding LSTMs: Architecture and Gates

In [None]:
# Visualize the LSTM gates
def visualize_lstm_gates():
    """
    Plot hand-picked, illustrative activations for the gates of an LSTM cell.

    The values are not taken from a trained model.
    """
    fig = make_subplots(
        rows=2, cols=2,
        subplot_titles=('Forget Gate', 'Input Gate', 'Output Gate', 'Cell State Update'),
        specs=[[{"secondary_y": False}, {"secondary_y": False}],
               [{"secondary_y": False}, {"secondary_y": False}]]
    )
    
    # Illustrative gate values over time (hand-picked, not learned)
    time_steps = np.arange(1, 11)

    # Forget gate (keeps less and less of the previous cell state)
    forget_gate = np.array([0.9, 0.8, 0.7, 0.5, 0.3, 0.2, 0.1, 0.05, 0.02, 0.01])

    # Input gate (lets in more and more new information)
    input_gate = np.array([0.1, 0.2, 0.3, 0.5, 0.7, 0.8, 0.9, 0.95, 0.98, 0.99])

    # Output gate (controls how much of the cell state is exposed as h_t)
    output_gate = np.array([0.5, 0.6, 0.7, 0.8, 0.9, 0.85, 0.8, 0.75, 0.7, 0.65])

    # Cell state; in a real LSTM, c_t = f_t * c_{t-1} + i_t * g_t,
    # and the output gate only affects the hidden state h_t
    cell_state = np.array([0.2, 0.3, 0.5, 0.7, 0.8, 0.85, 0.9, 0.88, 0.85, 0.82])
    
    # Forget Gate
    fig.add_trace(
        go.Scatter(x=time_steps, y=forget_gate, mode='lines+markers', 
                  name='Forget', line=dict(color='red', width=3)),
        row=1, col=1
    )
    
    # Input Gate
    fig.add_trace(
        go.Scatter(x=time_steps, y=input_gate, mode='lines+markers', 
                  name='Input', line=dict(color='green', width=3)),
        row=1, col=2
    )
    
    # Output Gate
    fig.add_trace(
        go.Scatter(x=time_steps, y=output_gate, mode='lines+markers', 
                  name='Output', line=dict(color='blue', width=3)),
        row=2, col=1
    )
    
    # Cell State
    fig.add_trace(
        go.Scatter(x=time_steps, y=cell_state, mode='lines+markers', 
                  name='Cell State', line=dict(color='purple', width=3)),
        row=2, col=2
    )
    
    fig.update_layout(
        title_text="LSTM Gate Activations Over Time",
        showlegend=False,
        height=600
    )

    # Update the axes
    for row in range(1, 3):
        for col in range(1, 3):
            fig.update_xaxes(title_text="Time Step", row=row, col=col)
            fig.update_yaxes(title_text="Activation", range=[0, 1], row=row, col=col)
    
    fig.show()

visualize_lstm_gates()
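
The curves above are hand-picked. As a complement, here is a minimal NumPy sketch of a single LSTM time step, showing where each gate enters the update. The weight layout (`W`, `U`, `b`) and the gate order `i, f, g, o` are assumptions of this sketch, and the weights are random rather than trained.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    W: (input_dim, 4 * units), U: (units, 4 * units), b: (4 * units,).
    The gate order along the last axis (i, f, g, o) is an assumption of this sketch.
    """
    units = h_prev.shape[-1]
    z = x_t @ W + h_prev @ U + b
    i = sigmoid(z[..., 0 * units:1 * units])  # input gate
    f = sigmoid(z[..., 1 * units:2 * units])  # forget gate
    g = np.tanh(z[..., 2 * units:3 * units])  # candidate cell values
    o = sigmoid(z[..., 3 * units:4 * units])  # output gate
    c_t = f * c_prev + i * g                  # additive cell-state update
    h_t = o * np.tanh(c_t)                    # exposed hidden state
    return h_t, c_t, {"input": i, "forget": f, "output": o}

rng = np.random.default_rng(0)
input_dim, units = 3, 4
W = rng.normal(scale=0.5, size=(input_dim, 4 * units))
U = rng.normal(scale=0.5, size=(units, 4 * units))
b = np.zeros(4 * units)

h, c = np.zeros(units), np.zeros(units)
for t, x_t in enumerate(rng.normal(size=(5, input_dim)), start=1):
    h, c, gates = lstm_step(x_t, h, c, W, U, b)
    print(t, {name: round(float(values.mean()), 3) for name, values in gates.items()})
```

Each gate is a sigmoid, so its activations lie strictly between 0 and 1, which is why the plots above use a fixed `[0, 1]` range.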

In [None]:
# Architectural comparison: RNN vs LSTM vs GRU
def compare_architectures():
    """
    Compare the architectures of the different RNN variants.
    """
    architectures = {
        'Characteristic': [
            'Number of gates',
            'Internal state',
            'Complexity',
            'Memory',
            'Training speed',
            'Vanishing gradient',
            'Long-term capacity'
        ],
        'Vanilla RNN': [
            '0',
            'Single hidden state',
            'Low',
            'Limited',
            'Fast',
            'Problematic',
            'Weak'
        ],
        'LSTM': [
            '3 (forget, input, output)',
            'Hidden state + cell state',
            'High',
            'Excellent',
            'Slow',
            'Largely mitigated',
            'Excellent'
        ],
        'GRU': [
            '2 (reset, update)',
            'Hidden state only',
            'Medium',
            'Very good',
            'Moderately fast',
            'Largely mitigated',
            'Very good'
        ]
    }
    
    df_arch = pd.DataFrame(architectures)
    
    # Render the table
    fig, ax = plt.subplots(figsize=(14, 6))
    ax.axis('tight')
    ax.axis('off')
    
    table = ax.table(cellText=df_arch.values,
                    colLabels=df_arch.columns,
                    cellLoc='center',
                    loc='center')
    
    table.auto_set_font_size(False)
    table.set_fontsize(10)
    table.scale(1.2, 2.5)
    
    # Style the header row
    for i in range(len(df_arch.columns)):
        table[(0, i)].set_facecolor('#1E90FF')
        table[(0, i)].set_text_props(weight='bold', color='white')

    # Color the cells by architecture
    colors = ['#FFE6E6', '#E6F3FF', '#E6FFE6']  # Light red, light blue, light green
    for i in range(1, 4):  # RNN, LSTM, GRU columns
        for j in range(1, len(df_arch) + 1):
            table[(j, i)].set_facecolor(colors[i-1])
    
    plt.title('Comparison of RNN Architectures', fontsize=16, fontweight='bold', pad=20)
    plt.show()

compare_architectures()
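
To make the "2 gates, hidden state only" row concrete, here is a minimal NumPy sketch of one GRU time step. The parameter names (`W_z`, `U_z`, ...) are hypothetical, the weights are random, and the interpolation convention (an update gate close to 1 keeps the previous state) is an assumption of this sketch; some references use the opposite convention.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, params):
    """One GRU time step; `params` maps hypothetical names to weight arrays."""
    z = sigmoid(x_t @ params["W_z"] + h_prev @ params["U_z"] + params["b_z"])  # update gate
    r = sigmoid(x_t @ params["W_r"] + h_prev @ params["U_r"] + params["b_r"])  # reset gate
    h_candidate = np.tanh(
        x_t @ params["W_h"] + (r * h_prev) @ params["U_h"] + params["b_h"]
    )
    # Convention assumed here: z close to 1 keeps the previous state
    return z * h_prev + (1.0 - z) * h_candidate

def init_params(input_dim, units, rng):
    """Random parameters for the three blocks (update, reset, candidate)."""
    params = {}
    for gate in ("z", "r", "h"):
        params[f"W_{gate}"] = rng.normal(scale=0.5, size=(input_dim, units))
        params[f"U_{gate}"] = rng.normal(scale=0.5, size=(units, units))
        params[f"b_{gate}"] = np.zeros(units)
    return params

rng = np.random.default_rng(0)
params = init_params(input_dim=3, units=4, rng=rng)
h = np.zeros(4)
for x_t in rng.normal(size=(5, 3)):
    h = gru_step(x_t, h, params)
print(h.shape)
```

Unlike the LSTM, there is no separate cell state: the single vector `h` is both the memory and the output.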

## 3. Implementing the Models

In [None]:
# Builder functions for the different model types
def create_rnn_model(vocab_size, embedding_dim, hidden_dim, max_length, num_classes=2):
    """
    Build a vanilla RNN model.
    """
    model = models.Sequential([
        layers.Input(shape=(max_length,)),  # explicit input so the model is built immediately
        layers.Embedding(vocab_size, embedding_dim),
        layers.SimpleRNN(hidden_dim, dropout=0.3, recurrent_dropout=0.3),
        layers.Dense(num_classes, activation='softmax')
    ])
    return model

def create_lstm_model(vocab_size, embedding_dim, hidden_dim, max_length, num_classes=2):
    """
    Build an LSTM model.
    """
    model = models.Sequential([
        layers.Input(shape=(max_length,)),
        layers.Embedding(vocab_size, embedding_dim),
        layers.LSTM(hidden_dim, dropout=0.3, recurrent_dropout=0.3),
        layers.Dense(num_classes, activation='softmax')
    ])
    return model

def create_gru_model(vocab_size, embedding_dim, hidden_dim, max_length, num_classes=2):
    """
    Build a GRU model.
    """
    model = models.Sequential([
        layers.Input(shape=(max_length,)),
        layers.Embedding(vocab_size, embedding_dim),
        layers.GRU(hidden_dim, dropout=0.3, recurrent_dropout=0.3),
        layers.Dense(num_classes, activation='softmax')
    ])
    return model

def create_bidirectional_lstm(vocab_size, embedding_dim, hidden_dim, max_length, num_classes=2):
    """
    Build a bidirectional LSTM model.
    """
    model = models.Sequential([
        layers.Input(shape=(max_length,)),
        layers.Embedding(vocab_size, embedding_dim),
        layers.Bidirectional(layers.LSTM(hidden_dim, dropout=0.3, recurrent_dropout=0.3)),
        layers.Dense(num_classes, activation='softmax')
    ])
    return model

def create_stacked_lstm(vocab_size, embedding_dim, hidden_dim, max_length, num_classes=2):
    """
    Build a stacked LSTM model (2 layers).
    """
    model = models.Sequential([
        layers.Input(shape=(max_length,)),
        layers.Embedding(vocab_size, embedding_dim),
        layers.LSTM(hidden_dim, dropout=0.3, recurrent_dropout=0.3, return_sequences=True),
        layers.LSTM(hidden_dim, dropout=0.3, recurrent_dropout=0.3),
        layers.Dense(num_classes, activation='softmax')
    ])
    return model

# Instantiate each architecture
params = {
    'vocab_size': 1000,
    'embedding_dim': 100,
    'hidden_dim': 128,
    'max_length': 50,
    'num_classes': 2
}

models_dict = {
    'RNN': create_rnn_model(**params),
    'LSTM': create_lstm_model(**params),
    'GRU': create_gru_model(**params),
    'Bidirectional LSTM': create_bidirectional_lstm(**params),
    'Stacked LSTM': create_stacked_lstm(**params)
}

# Compare parameter counts
for name, model in models_dict.items():
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
    total_params = model.count_params()
    print(f"{name:20s}: {total_params:,} parameters")
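
The recurrent-layer share of these counts can be checked by hand. The sketch below uses the standard formulas for one recurrent layer (kernel, recurrent kernel, bias per block); the helper names are hypothetical, and the GRU formula assumes the `reset_after=True` variant, which carries two bias vectors per block.

```python
def rnn_params(input_dim, units):
    """Kernel + recurrent kernel + bias for a single block."""
    return units * (input_dim + units) + units

def lstm_params(input_dim, units):
    """Four blocks: input gate, forget gate, candidate, output gate."""
    return 4 * rnn_params(input_dim, units)

def gru_params(input_dim, units, reset_after=True):
    """Three blocks: update gate, reset gate, candidate."""
    bias = 2 * units if reset_after else units
    return 3 * (units * (input_dim + units) + bias)

embedding_dim, hidden_dim = 100, 128  # same sizes as the `params` dict above
print(f"SimpleRNN: {rnn_params(embedding_dim, hidden_dim):,}")   # 29,312
print(f"LSTM     : {lstm_params(embedding_dim, hidden_dim):,}")  # 117,248
print(f"GRU      : {gru_params(embedding_dim, hidden_dim):,}")   # 88,320
```

The embedding and dense layers add the same number of parameters to every model, so the differences between architectures come from these recurrent terms.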

## 4. Dataset for the Comparison

In [None]:
# Build a more complex dataset to test memory capacity
def create_complex_sentiment_dataset():
    """
    Build a dataset of complex sentences that require long-term memory.
    """
    # Sentences whose opening sentiment is reversed or confirmed at the end
    complex_positive = [
        "This film was really fantastic at the start but in the end it was even better",
        "At first I thought this product was average but after using it I am very satisfied",
        "The service seemed disappointing at first but the team convinced me of their professionalism",
        "Although the price is high this experience is really worth it",
        "Despite a few minor flaws I highly recommend this item",
        "At first I was skeptical but now I can say that it is excellent",
        "This restaurant had a bad reputation but the reality exceeds expectations",
        "The interface looks complicated but once mastered it is very efficient",
        "Contrary to the negative reviews my experience was wonderful",
        "Although criticized by some I find this product absolutely perfect"
    ]

    complex_negative = [
        "This film started well but turned out to be a big disappointment",
        "I had high expectations but the result is really disappointing",
        "The product seemed promising but the quality is not there",
        "Despite good reviews my personal experience was catastrophic",
        "Although recommended by my friends I cannot share their enthusiasm",
        "At first everything went well but the problems piled up",
        "This service had a good reputation but the reality is quite different",
        "The team seemed competent but their work is sloppy",
        "Contrary to the advertising promises the product is defective",
        "Although expensive this product is absolutely not worth its price"
    ]

    # Simple sentences for contrast
    simple_positive = [
        "Excellent product I recommend it",
        "Very satisfied with my purchase",
        "Perfect and fast service",
        "Exceptional quality well done",
        "Wonderful and pleasant experience"
    ]

    simple_negative = [
        "Disappointing product very bad",
        "Nonexistent customer service",
        "Mediocre quality avoid it",
        "Disastrous and frustrating experience",
        "Total scam do not buy"
    ]

    # Combine all the texts
    all_texts = (
        complex_positive * 3 + complex_negative * 3 +  # Oversample the complex sentences
        simple_positive * 2 + simple_negative * 2
    )

    all_labels = (
        [1] * (len(complex_positive) * 3) +
        [0] * (len(complex_negative) * 3) +
        [1] * (len(simple_positive) * 2) +
        [0] * (len(simple_negative) * 2)
    )

    return all_texts, np.array(all_labels)

# Build the dataset
texts, labels = create_complex_sentiment_dataset()

# Shuffle the examples reproducibly
np.random.seed(42)
indices = np.random.permutation(len(texts))
texts = [texts[i] for i in indices]
labels = labels[indices]

print(f"Dataset created: {len(texts)} examples")
print(f"Class distribution: {np.bincount(labels)}")
print("\nSample sentences:")
for i in range(3):
    print(f"  {texts[i]} → {'Positive' if labels[i] else 'Negative'}")

In [None]:
# Data preparation
max_words = 2000
max_length = 30  # Long enough for the complex sentences

# Tokenization
tokenizer = Tokenizer(num_words=max_words)
tokenizer.fit_on_texts(texts)

# Convert to padded integer sequences
sequences = tokenizer.texts_to_sequences(texts)
X = pad_sequences(sequences, maxlen=max_length)
y = labels

# Train/test split
# Note: every sentence is duplicated in the dataset, so identical sentences can
# land in both splits and the test accuracy is an optimistic estimate.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(f"Training data: {X_train.shape}")
print(f"Test data: {X_test.shape}")
print(f"Mean sequence length: {np.mean([len(s) for s in sequences]):.1f}")
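
With its default arguments, `pad_sequences` pads and truncates on the left, which matters later when aligning words with time steps. The pure-Python sketch below mimics that default behavior; `pad_pre` is a hypothetical helper written for illustration, not the Keras implementation, and it assumes `maxlen >= 1`.

```python
def pad_pre(sequences, maxlen, value=0):
    """Left-pad with `value` and keep only the last `maxlen` tokens of longer sequences."""
    padded = []
    for seq in sequences:
        kept = list(seq)[-maxlen:]  # truncation drops the oldest tokens
        padded.append([value] * (maxlen - len(kept)) + kept)
    return padded

batch = [[5, 8, 2], [7], [1, 2, 3, 4, 5, 6]]
for row in pad_pre(batch, maxlen=4):
    print(row)
# [0, 5, 8, 2]
# [0, 0, 0, 7]
# [3, 4, 5, 6]
```

Because the real tokens end up at the end of each row, the last time steps of the recurrent layer are the ones that see actual words.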

## 5. Experimental Comparison

In [None]:
# Helper to train and evaluate a model
def train_and_evaluate(model, model_name, X_train, y_train, X_test, y_test, epochs=15):
    """
    Train and evaluate a model.
    """
    # Compile
    model.compile(
        optimizer='adam',
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy']
    )

    # Callbacks
    early_stopping = keras.callbacks.EarlyStopping(
        monitor='val_loss', patience=3, restore_best_weights=True
    )

    # Training
    print(f"\nTraining the {model_name} model...")
    history = model.fit(
        X_train, y_train,
        batch_size=32,
        epochs=epochs,
        validation_split=0.2,
        callbacks=[early_stopping],
        verbose=0
    )

    # Evaluation
    test_loss, test_accuracy = model.evaluate(X_test, y_test, verbose=0)
    
    return {
        'model': model,
        'history': history,
        'test_accuracy': test_accuracy,
        'test_loss': test_loss,
        'name': model_name
    }

# Build and train every model
model_params = {
    'vocab_size': max_words,
    'embedding_dim': 100,
    'hidden_dim': 128,
    'max_length': max_length,
    'num_classes': 2
}

model_creators = {
    'RNN': create_rnn_model,
    'LSTM': create_lstm_model,
    'GRU': create_gru_model,
    'Bidirectional LSTM': create_bidirectional_lstm,
    'Stacked LSTM': create_stacked_lstm
}

results = {}

for name, creator in model_creators.items():
    model = creator(**model_params)
    result = train_and_evaluate(model, name, X_train, y_train, X_test, y_test)
    results[name] = result
    print(f"{name}: Accuracy = {result['test_accuracy']:.4f}")
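
The `EarlyStopping` callback used above follows a patience rule. The sketch below reproduces that rule in plain Python on a made-up list of validation losses; `early_stopping_epoch` is a hypothetical helper, and it ignores options such as `min_delta`.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return (stop_epoch, best_epoch), both 0-based, for a list of validation losses."""
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0  # improvement resets the counter
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch

val_losses = [0.70, 0.62, 0.58, 0.59, 0.61, 0.60, 0.55]  # made-up values
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=3)
print(stop_epoch, best_epoch)  # 5 2
```

In this example training stops at epoch 5, before the later improvement at epoch 6 is ever seen. With `restore_best_weights=True`, the weights from epoch 2 would be restored.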

In [None]:
# Visualize the comparison results
def plot_model_comparison(results):
    """
    Compare the performance of the different models.
    """
    model_names = list(results.keys())
    accuracies = [results[name]['test_accuracy'] for name in model_names]
    losses = [results[name]['test_loss'] for name in model_names]
    
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))
    
    # Accuracy comparison
    colors = plt.cm.viridis(np.linspace(0, 1, len(model_names)))
    bars1 = ax1.bar(model_names, accuracies, color=colors, alpha=0.8)
    ax1.set_ylabel('Test Accuracy')
    ax1.set_title('Test Accuracy by Model', fontsize=14, fontweight='bold')
    ax1.set_ylim(0, 1)

    # Annotate each bar with its value
    for bar, acc in zip(bars1, accuracies):
        height = bar.get_height()
        ax1.annotate(f'{acc:.3f}',
                    xy=(bar.get_x() + bar.get_width() / 2, height),
                    xytext=(0, 3),
                    textcoords="offset points",
                    ha='center', va='bottom',
                    fontweight='bold')
    
    # Loss comparison
    bars2 = ax2.bar(model_names, losses, color=colors, alpha=0.8)
    ax2.set_ylabel('Test Loss')
    ax2.set_title('Test Loss by Model', fontsize=14, fontweight='bold')

    # Annotate each bar with its value
    for bar, loss in zip(bars2, losses):
        height = bar.get_height()
        ax2.annotate(f'{loss:.3f}',
                    xy=(bar.get_x() + bar.get_width() / 2, height),
                    xytext=(0, 3),
                    textcoords="offset points",
                    ha='center', va='bottom',
                    fontweight='bold')
    
    # Rotate the x tick labels (without resetting the tick positions)
    for ax in [ax1, ax2]:
        plt.setp(ax.get_xticklabels(), rotation=45, ha='right')
        ax.grid(True, alpha=0.3, axis='y')
    
    plt.tight_layout()
    plt.show()

plot_model_comparison(results)

In [None]:
# Training curves for the best models
def plot_training_histories(results, models_to_plot=('LSTM', 'GRU', 'Bidirectional LSTM')):
    """
    Plot the training curves of the selected models.
    """
    fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(15, 10))
    
    colors = ['blue', 'red', 'green']
    
    for i, model_name in enumerate(models_to_plot):
        if model_name in results:
            history = results[model_name]['history']
            color = colors[i]
            
            # Training accuracy
            ax1.plot(history.history['accuracy'], label=f'{model_name}', color=color, linewidth=2)
            
            # Validation accuracy
            ax2.plot(history.history['val_accuracy'], label=f'{model_name}', color=color, linewidth=2)
            
            # Training loss
            ax3.plot(history.history['loss'], label=f'{model_name}', color=color, linewidth=2)
            
            # Validation loss
            ax4.plot(history.history['val_loss'], label=f'{model_name}', color=color, linewidth=2)
    
    # Axis configuration
    ax1.set_title('Training Accuracy', fontweight='bold')
    ax1.set_xlabel('Epoch')
    ax1.set_ylabel('Accuracy')
    ax1.legend()
    ax1.grid(True, alpha=0.3)
    
    ax2.set_title('Validation Accuracy', fontweight='bold')
    ax2.set_xlabel('Epoch')
    ax2.set_ylabel('Accuracy')
    ax2.legend()
    ax2.grid(True, alpha=0.3)
    
    ax3.set_title('Training Loss', fontweight='bold')
    ax3.set_xlabel('Epoch')
    ax3.set_ylabel('Loss')
    ax3.legend()
    ax3.grid(True, alpha=0.3)
    
    ax4.set_title('Validation Loss', fontweight='bold')
    ax4.set_xlabel('Epoch')
    ax4.set_ylabel('Loss')
    ax4.legend()
    ax4.grid(True, alpha=0.3)
    
    plt.suptitle('Training Curves by Model', fontsize=16, fontweight='bold')
    plt.tight_layout()
    plt.show()

plot_training_histories(results)

## 6. Error Analysis on Complex Sentences

In [None]:
# Analyze predictions on complex sentences
def analyze_complex_predictions(models_dict, tokenizer, max_length):
    """
    Analyze model predictions on complex sentences.

    `models_dict` maps a model name to a result dict with a 'model' entry.
    """
    # Hand-written test sentences
    test_sentences = [
        "At first this film seemed boring but in the end it was absolutely fantastic",  # Complex positive
        "I had high expectations for this product but it turned out to be very disappointing",  # Complex negative
        "Despite the negative reviews my experience was wonderful",  # Complex positive
        "Although recommended by all my friends I find this service catastrophic",  # Complex negative
        "Excellent product perfect",  # Simple positive
        "Very bad disappointing"  # Simple negative
    ]

    true_labels = [1, 0, 1, 0, 1, 0]  # 1=Positive, 0=Negative

    # Prepare the data
    sequences = tokenizer.texts_to_sequences(test_sentences)
    X_test_complex = pad_sequences(sequences, maxlen=max_length)

    # Models to compare
    best_models = ['RNN', 'LSTM', 'GRU']

    results_analysis = []

    for idx, (sentence, true_label) in enumerate(zip(test_sentences, true_labels)):
        # Only add an ellipsis when the sentence is actually truncated
        phrase = sentence if len(sentence) <= 60 else sentence[:60] + '...'
        row = {'Sentence': phrase, 'Truth': 'Pos' if true_label else 'Neg'}

        for model_name in best_models:
            if model_name in models_dict:
                # Prediction
                pred_proba = models_dict[model_name]['model'].predict(
                    X_test_complex[idx:idx+1], verbose=0
                )[0]
                pred_label = np.argmax(pred_proba)
                confidence = pred_proba[pred_label]

                # Record the result
                pred_text = 'Pos' if pred_label == 1 else 'Neg'
                correct = '✓' if pred_label == true_label else '✗'
                row[model_name] = f'{pred_text} {correct} ({confidence:.2f})'

        results_analysis.append(row)

    # Build a DataFrame for display
    df_analysis = pd.DataFrame(results_analysis)

    # Formatted output
    print("Prediction Analysis on Complex Sentences")
    print("=" * 80)
    for _, row in df_analysis.iterrows():
        print(f"\n📝 {row['Sentence']}")
        print(f"   Truth: {row['Truth']}")
        for model_name in best_models:
            if model_name in row:
                print(f"   {model_name:4s}: {row[model_name]}")

    return df_analysis

# Run the analysis
analysis_results = analyze_complex_predictions(results, tokenizer, max_length)

## 7. Visualizing LSTM Hidden States

In [None]:
# Build an LSTM model that exposes its hidden states
def create_lstm_with_states(vocab_size, embedding_dim, hidden_dim, max_length):
    """
    Build an LSTM that returns the hidden state at every time step plus the
    final hidden and cell states.
    """
    inputs = layers.Input(shape=(max_length,))

    # Embedding
    x = layers.Embedding(vocab_size, embedding_dim)(inputs)

    # LSTM with return_sequences=True and return_state=True
    lstm_out, final_h, final_c = layers.LSTM(
        hidden_dim, 
        return_sequences=True, 
        return_state=True
    )(x)

    # Classification
    output = layers.Dense(2, activation='softmax')(final_h)

    # Main model
    main_model = models.Model(inputs=inputs, outputs=output)

    # Model used to extract the states
    state_model = models.Model(inputs=inputs, outputs=[lstm_out, final_h, final_c])

    return main_model, state_model

# Build the state-extraction model
_, lstm_state_model = create_lstm_with_states(
    vocab_size=max_words,
    embedding_dim=100,
    hidden_dim=128,
    max_length=max_length
)

# Copy the weights of the trained LSTM
lstm_trained = results['LSTM']['model']
lstm_state_model.layers[1].set_weights(lstm_trained.layers[0].get_weights())  # Embedding
lstm_state_model.layers[2].set_weights(lstm_trained.layers[1].get_weights())  # LSTM

In [None]:
# Visualize the LSTM states for a complex sentence
def visualize_lstm_states(text, model, tokenizer, max_length):
    """
    Visualize how the LSTM hidden states evolve over a sentence.
    """
    # Preparation
    sequence = tokenizer.texts_to_sequences([text])
    padded = pad_sequences(sequence, maxlen=max_length)

    # Get the states (predict returns NumPy arrays)
    all_hidden, final_h, final_c = model.predict(padded, verbose=0)
    all_hidden = all_hidden[0]  # First (and only) example in the batch

    # Recover the in-vocabulary words of the sequence
    words = [tokenizer.index_word[idx] for idx in sequence[0]]

    # pad_sequences pads and truncates on the left by default, so the real
    # tokens occupy the last `seq_length` time steps
    seq_length = min(len(words), max_length)
    if seq_length == 0:
        print("No in-vocabulary word to visualize.")
        return
    hidden = all_hidden[-seq_length:]
    # Position-prefixed labels stay unique even when a word is repeated
    x_labels = [f"{i + 1}. {w}" for i, w in enumerate(words[-seq_length:])]

    # Build the figure
    fig = make_subplots(
        rows=3, cols=1,
        subplot_titles=[
            f'LSTM Hidden States - "{text}"',
            'Hidden-State Norm',
            'Selected Hidden-State Dimensions'
        ],
        vertical_spacing=0.1
    )

    # 1. Heatmap of the hidden states
    fig.add_trace(
        go.Heatmap(
            z=hidden.T,
            x=x_labels,
            colorscale='RdBu',
            showscale=True
        ),
        row=1, col=1
    )

    # 2. Norm of the hidden states
    norms = np.linalg.norm(hidden, axis=1)
    fig.add_trace(
        go.Scatter(
            x=x_labels,
            y=norms,
            mode='lines+markers',
            line=dict(color='blue', width=3),
            marker=dict(size=8)
        ),
        row=2, col=1
    )

    # 3. A few hidden-state dimensions over time. The layer only returns the
    # final cell state (final_c), so a per-step cell-state trace is not available here.
    for i in range(0, min(5, hidden.shape[1]), 2):
        fig.add_trace(
            go.Scatter(
                x=x_labels,
                y=hidden[:, i],
                mode='lines',
                name=f'Hidden dim {i}',
                line=dict(width=2)
            ),
            row=3, col=1
        )

    # Layout
    fig.update_layout(
        height=800,
        title_text="LSTM State Analysis",
        showlegend=False
    )

    # Rotate the x labels
    for row in range(1, 4):
        fig.update_xaxes(tickangle=45, row=row, col=1)

    fig.show()

# Try it on a complex sentence
complex_sentence = "At first this film seemed boring but in the end it was absolutely fantastic"
visualize_lstm_states(complex_sentence, lstm_state_model, tokenizer, max_length)

## 8. Full Benchmark

In [None]:
# Build a full comparison table
def create_comprehensive_comparison(results):
    """
    Build a full comparison table of the models.
    """
    comparison_data = []

    for name, result in results.items():
        model = result['model']

        # Collect the metrics
        total_params = model.count_params()
        test_acc = result['test_accuracy']
        test_loss = result['test_loss']

        # Epochs actually run before early stopping (a rough proxy for training cost)
        epochs_used = len(result['history'].history['loss'])

        comparison_data.append({
            'Model': name,
            'Parameters': f"{total_params:,}",
            'Test Accuracy': f"{test_acc:.4f}",
            'Test Loss': f"{test_loss:.4f}",
            'Epochs': epochs_used,
            'Complexity': 'Low' if 'RNN' in name and 'LSTM' not in name else 
                          'High' if 'Stacked' in name or 'Bidirectional' in name else 'Medium'
        })

    df_comparison = pd.DataFrame(comparison_data)

    # Sort by accuracy
    df_comparison['Accuracy_num'] = df_comparison['Test Accuracy'].astype(float)
    df_comparison = df_comparison.sort_values('Accuracy_num', ascending=False)
    df_comparison = df_comparison.drop('Accuracy_num', axis=1)

    # Render the table
    fig, ax = plt.subplots(figsize=(14, 8))
    ax.axis('tight')
    ax.axis('off')

    table = ax.table(cellText=df_comparison.values,
                    colLabels=df_comparison.columns,
                    cellLoc='center',
                    loc='center')

    table.auto_set_font_size(False)
    table.set_fontsize(11)
    table.scale(1.2, 3)

    # Style the header
    for i in range(len(df_comparison.columns)):
        table[(0, i)].set_facecolor('#1E90FF')
        table[(0, i)].set_text_props(weight='bold', color='white')

    # Color the rows by performance
    accuracies = df_comparison['Test Accuracy'].astype(float)
    max_acc = accuracies.max()

    for i, accuracy in enumerate(accuracies, 1):
        if accuracy == max_acc:
            color = '#E6FFE6'  # Light green for the best
        elif accuracy >= max_acc - 0.02:
            color = '#FFF8E6'  # Light yellow for close runners-up
        else:
            color = '#FFE6E6'  # Light red for the rest

        for j in range(len(df_comparison.columns)):
            table[(i, j)].set_facecolor(color)

    plt.title('Full Comparison of the RNN/LSTM/GRU Models', 
              fontsize=16, fontweight='bold', pad=20)
    plt.show()

    return df_comparison

# Build the comparison
comparison_df = create_comprehensive_comparison(results)

# Print the conclusions
print("\n🏆 EXPERIMENT CONCLUSIONS")
print("=" * 50)
best_model_name = comparison_df.iloc[0]['Model']
best_acc = comparison_df.iloc[0]['Test Accuracy']
print(f"🥇 Best model: {best_model_name} (Accuracy: {best_acc})")

print("\n📊 General observations:")
print("   • LSTMs generally outperform vanilla RNNs")
print("   • GRUs offer a good performance/complexity trade-off")
print("   • Bidirectional and stacked models can improve performance")
print("   • The parameter count grows significantly with model complexity")

## 9. Advanced Application: Text Generation

In [None]:
# Build a text-generation model with LSTM
def create_text_generation_model(vocab_size, embedding_dim, lstm_units, sequence_length):
    """
    Build an LSTM model for next-word text generation.
    """
    model = models.Sequential([
        layers.Input(shape=(sequence_length,)),
        layers.Embedding(vocab_size, embedding_dim),
        layers.LSTM(lstm_units, return_sequences=True),
        layers.Dropout(0.3),
        layers.LSTM(lstm_units),
        layers.Dropout(0.3),
        layers.Dense(vocab_size, activation='softmax')
    ])
    
    return model

# Prepare data for generation (reusing our existing texts)
def prepare_generation_data(texts, tokenizer, sequence_length=10):
    """
    Build (context, next word) training pairs for text generation.
    """
    # Join all the texts
    full_text = ' '.join(texts)

    # Tokenize
    sequences = tokenizer.texts_to_sequences([full_text])[0]

    # Build the training windows
    X, y = [], []
    
    for i in range(len(sequences) - sequence_length):
        X.append(sequences[i:i + sequence_length])
        y.append(sequences[i + sequence_length])
    
    return np.array(X), np.array(y)

# Text-generation function
def generate_text(model, tokenizer, seed_text, num_words=10, sequence_length=10, temperature=0.8):
    """
    Generate text from a seed.
    """
    # Tokenize the seed
    seed_sequence = tokenizer.texts_to_sequences([seed_text])[0]

    # Left-pad or truncate the seed to the expected length
    if len(seed_sequence) < sequence_length:
        seed_sequence = [0] * (sequence_length - len(seed_sequence)) + seed_sequence
    else:
        seed_sequence = seed_sequence[-sequence_length:]

    generated = seed_sequence.copy()

    # Inverse mapping from index to word
    reverse_word_map = {v: k for k, v in tokenizer.word_index.items()}

    for _ in range(num_words):
        # Predict the next-word distribution
        input_seq = np.array([generated[-sequence_length:]])
        predictions = model.predict(input_seq, verbose=0)[0]

        # Temperature sampling, done in float64 so the probabilities sum to 1
        # within the tolerance required by np.random.choice
        logits = np.log(predictions.astype(np.float64) + 1e-8) / temperature
        exp_preds = np.exp(logits - np.max(logits))
        probabilities = exp_preds / np.sum(exp_preds)

        # Sample the next word
        next_word_idx = int(np.random.choice(len(probabilities), p=probabilities))
        generated.append(next_word_idx)

    # Convert back to text (padding index 0 has no word and is skipped)
    generated_text = []
    for idx in generated:
        if idx in reverse_word_map:
            generated_text.append(reverse_word_map[idx])

    return ' '.join(generated_text)

# Build and train a small generation model
sequence_length_gen = 8
X_gen, y_gen = prepare_generation_data(texts, tokenizer, sequence_length_gen)

print(f"Generation data: {X_gen.shape}, {y_gen.shape}")

# Build the model
gen_model = create_text_generation_model(
    vocab_size=max_words,
    embedding_dim=50,
    lstm_units=64,
    sequence_length=sequence_length_gen
)

gen_model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')

# Train briefly
print("Training the generation model...")
gen_model.fit(X_gen, y_gen, epochs=20, batch_size=32, verbose=0)

# Try the generator
seeds = ["this product", "very good", "I recommend"]

print("\n🎨 TEXT GENERATION")
print("=" * 40)
for seed in seeds:
    generated = generate_text(gen_model, tokenizer, seed, num_words=8, sequence_length=sequence_length_gen)
    print(f"Seed: '{seed}'")
    print(f"Generated: {generated}\n")
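
The temperature used during sampling controls how adventurous the generator is. The NumPy sketch below isolates that step on a made-up three-word distribution; `apply_temperature` is a hypothetical helper that mirrors the rescaling done inside `generate_text`.

```python
import numpy as np

def apply_temperature(probabilities, temperature):
    """Rescale a probability vector: T < 1 sharpens it, T > 1 flattens it."""
    logits = np.log(np.asarray(probabilities, dtype=np.float64) + 1e-8) / temperature
    logits -= logits.max()  # subtract the max for numerical stability
    scaled = np.exp(logits)
    return scaled / scaled.sum()

base = np.array([0.6, 0.3, 0.1])  # made-up next-word distribution
for temperature in (0.5, 1.0, 2.0):
    print(temperature, np.round(apply_temperature(base, temperature), 3))

# Sampling from the sharpened distribution
rng = np.random.default_rng(0)
samples = rng.choice(len(base), size=1000, p=apply_temperature(base, 0.5))
print(np.bincount(samples, minlength=len(base)) / 1000)
```

A temperature of 1.0 leaves the distribution essentially unchanged. Lower values concentrate the mass on the most likely word, and higher values spread it out.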

## 10. Conclusion and Best Practices

### 🏆 Results of Our Experiments

In this notebook we explored the fundamental differences between RNN, LSTM, and GRU. Because the dataset is tiny and contains duplicated sentences, the measured accuracies illustrate the workflow rather than rank the architectures reliably.

### 📊 Key Points

1. **LSTM vs RNN**: The LSTM's gates and additive cell-state update largely mitigate the vanishing-gradient problem
2. **GRU vs LSTM**: GRUs often reach similar performance with fewer parameters
3. **Bidirectional models**: Can improve performance by reading the sequence in both directions, provided the whole sequence is available at prediction time
4. **Stacked models**: Can capture more complex patterns at the cost of more parameters

### 🛠️ Best Practices

- **Start simple**: Try a single GRU layer before moving to more complex architectures
- **Dropout**: Use dropout and recurrent_dropout to limit overfitting
- **Early stopping**: Stop training when the validation loss stops improving
- **Sequence length**: Choose max_length to fit your domain
- **Pre-trained embeddings**: Use Word2Vec/GloVe/FastText when possible
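
As an illustration of the pre-trained embedding advice, the sketch below builds an embedding matrix from a word-to-vector mapping. `word_index` and `pretrained_vectors` are toy stand-ins (a real run would use `tokenizer.word_index` and vectors loaded from a Word2Vec/GloVe/FastText file), and `build_embedding_matrix` is a hypothetical helper.

```python
import numpy as np

def build_embedding_matrix(word_index, pretrained_vectors, vocab_size, embedding_dim, seed=0):
    """Row i holds the vector of the word with index i; row 0 stays at zero for padding."""
    rng = np.random.default_rng(seed)
    # Words without a pre-trained vector keep a small random initialization
    matrix = rng.normal(scale=0.1, size=(vocab_size, embedding_dim))
    matrix[0] = 0.0
    found = 0
    for word, idx in word_index.items():
        if idx < vocab_size and word in pretrained_vectors:
            matrix[idx] = pretrained_vectors[word]
            found += 1
    return matrix, found

# Toy stand-ins for a tokenizer vocabulary and a pre-trained vector file
word_index = {"product": 1, "excellent": 2, "disappointing": 3}
pretrained_vectors = {
    "product": np.array([0.1, 0.2, 0.3, 0.4]),
    "excellent": np.array([0.9, 0.8, 0.7, 0.6]),
}

matrix, found = build_embedding_matrix(word_index, pretrained_vectors, vocab_size=5, embedding_dim=4)
print(matrix.shape, found)  # (5, 4) 2
```

One way to use such a matrix in Keras is to pass `embeddings_initializer=keras.initializers.Constant(matrix)` to the `Embedding` layer, optionally with `trainable=False` to freeze it.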

### ⚡ Optimizations

- **Batch size**: Increase it if you have enough GPU memory/RAM
- **Learning rate**: Use a scheduler to adjust it during training
- **Gradient clipping**: Guards against exploding gradients
- **Mixed precision**: Speeds up training on modern GPUs
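
To show what gradient clipping does, here is a minimal NumPy sketch of clipping by global norm on made-up gradients. `clip_by_global_norm` is a hypothetical helper written for illustration. In Keras, the optimizers expose this kind of behavior through arguments such as `clipnorm` and `global_clipnorm`.

```python
import numpy as np

def clip_by_global_norm(gradients, max_norm):
    """Rescale all gradients together so that their joint L2 norm is at most `max_norm`."""
    global_norm = float(np.sqrt(sum(np.sum(g ** 2) for g in gradients)))
    if global_norm <= max_norm:
        return [g.copy() for g in gradients], global_norm
    scale = max_norm / global_norm
    return [g * scale for g in gradients], global_norm

# Made-up gradients for two weight tensors; their joint norm is 13
grads = [np.array([3.0, 4.0]), np.array([[12.0]])]
clipped, norm_before = clip_by_global_norm(grads, max_norm=1.0)
norm_after = float(np.sqrt(sum(np.sum(g ** 2) for g in clipped)))
print(norm_before, norm_after)
```

All tensors are scaled by the same factor, so the direction of the update is preserved and only its magnitude is capped.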

### 🚀 Next Steps

1. **Attention mechanisms**: Let the model focus on the most relevant time steps to improve performance
2. **Transformers**: A more recent architecture that often replaces RNNs
3. **Transfer learning**: Use pre-trained models such as BERT
4. **Specific applications**: NER, POS tagging, translation, etc.

In [None]:
# Save the best model (highest test accuracy)
best_model_name = max(results, key=lambda name: results[name]['test_accuracy'])
best_model = results[best_model_name]['model']

best_model.save('best_lstm_gru_model.h5')
print(f"✅ Best model ({best_model_name}) saved as 'best_lstm_gru_model.h5'")

# Save the tokenizer
import pickle
with open('lstm_tokenizer.pkl', 'wb') as f:
    pickle.dump(tokenizer, f)
print("✅ Tokenizer saved as 'lstm_tokenizer.pkl'")

print("\n🎉 Notebook complete! You have now worked through LSTM and GRU with TensorFlow.")