# 🚀 Google Colab Setup

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ogautier1980/sandbox-ml/blob/main/cours/08_deep_learning_rnn/08_exercices.ipynb)

**If you are running this notebook on Google Colab**, run the following cell to install the dependencies.

In [None]:
# Install dependencies (Google Colab only)
import sys

IN_COLAB = 'google.colab' in sys.modules

if IN_COLAB:
    print('📦 Installing packages...')

    # Core ML packages
    !pip install -q numpy pandas matplotlib seaborn scikit-learn

    # Detect the chapter and install its specific dependencies
    notebook_name = '08_exercices.ipynb'  # Will be replaced automatically

    # Ch 06-08: Deep Learning
    if any(x in notebook_name for x in ['06_', '07_', '08_']):
        !pip install -q torch torchvision torchaudio

    # Ch 08: NLP
    if '08_' in notebook_name:
        !pip install -q transformers datasets tokenizers
        if 'rag' in notebook_name:
            !pip install -q sentence-transformers faiss-cpu rank-bm25

    # Ch 09: Reinforcement Learning
    if '09_' in notebook_name:
        !pip install -q gymnasium[classic-control]

    # Ch 04: Boosting
    if '04_' in notebook_name and 'boosting' in notebook_name:
        !pip install -q xgboost lightgbm catboost

    # Ch 05: Advanced clustering
    if '05_' in notebook_name:
        !pip install -q umap-learn

    # Ch 11: Time series
    if '11_' in notebook_name:
        !pip install -q statsmodels prophet

    # Ch 12: Advanced vision
    if '12_' in notebook_name:
        !pip install -q ultralytics timm segmentation-models-pytorch

    # Ch 13: Recommendation
    if '13_' in notebook_name:
        !pip install -q scikit-surprise implicit

    # Ch 14: MLOps
    if '14_' in notebook_name:
        !pip install -q mlflow fastapi pydantic

    print('✅ Installation complete!')
else:
    print('ℹ️  Local environment detected, packages are already installed.')

# Chapter 08 - Exercises: RNNs, LSTMs, and Transformers

This notebook contains hands-on exercises on recurrent networks and Transformers.

## Exercises
1. Seq2Seq with attention for translation
2. Text generation with an LSTM
3. Time series forecasting with a GRU
4. Multi-class classification with BERT

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from collections import Counter
import warnings
warnings.filterwarnings('ignore')

plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette('husl')
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

## Exercise 1: Seq2Seq with Attention

### Objective
Implement a sequence-to-sequence model with an attention mechanism that translates digit strings into words.

### Instructions
1. Create a dataset of (digit string, word sequence) pairs, e.g. ("123", "one two three")
2. Implement an LSTM encoder
3. Implement an LSTM decoder with attention
4. Train the model
5. Test on new numbers

In [None]:
# TODO: Digit -> word dataset
number_to_word = {
    '1': 'one', '2': 'two', '3': 'three', '4': 'four', '5': 'five',
    '6': 'six', '7': 'seven', '8': 'eight', '9': 'nine', '0': 'zero'
}

# Generate examples: "123" -> "one two three"
def generate_samples(num_samples=1000, max_len=5):
    samples = []
    for _ in range(num_samples):
        length = np.random.randint(1, max_len + 1)
        number_str = ''.join([str(np.random.randint(0, 10)) for _ in range(length)])
        word_str = ' '.join([number_to_word[d] for d in number_str])
        samples.append((number_str, word_str))
    return samples

samples = generate_samples(1000)
print("Sample pairs:")
for i in range(5):
    print(f"  {samples[i][0]} -> {samples[i][1]}")

In [None]:
# TODO: Vocabulary
# Build vocabularies for the input (digits) and the output (words)
# Special indices: <PAD>=0, <SOS>=1, <EOS>=2

input_vocab = {'<PAD>': 0, '<SOS>': 1, '<EOS>': 2}
for digit in '0123456789':
    input_vocab[digit] = len(input_vocab)

output_vocab = {'<PAD>': 0, '<SOS>': 1, '<EOS>': 2}
for word in number_to_word.values():
    if word not in output_vocab:
        output_vocab[word] = len(output_vocab)

# Inverse vocabulary (index -> word), used to decode predictions
inv_output_vocab = {v: k for k, v in output_vocab.items()}

print(f"Input vocab size: {len(input_vocab)}")
print(f"Output vocab size: {len(output_vocab)}")
print(f"Output vocab: {output_vocab}")
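
Before batching, each (digit string, word sequence) pair has to be turned into index lists of a common length. The block below is a minimal sketch of one way to do that, not part of the original notebook: `encode` and `max_len` are hypothetical names, and the vocabularies are rebuilt inline only so the block runs on its own.

```python
# Rebuild the two vocabularies so this sketch is self-contained.
number_to_word = {
    '1': 'one', '2': 'two', '3': 'three', '4': 'four', '5': 'five',
    '6': 'six', '7': 'seven', '8': 'eight', '9': 'nine', '0': 'zero'
}
input_vocab = {'<PAD>': 0, '<SOS>': 1, '<EOS>': 2}
for digit in '0123456789':
    input_vocab[digit] = len(input_vocab)
output_vocab = {'<PAD>': 0, '<SOS>': 1, '<EOS>': 2}
for word in number_to_word.values():
    output_vocab.setdefault(word, len(output_vocab))


def encode(tokens, vocab, max_len):
    """Hypothetical helper: <SOS> + tokens + <EOS>, right-padded with <PAD>."""
    ids = [vocab['<SOS>']] + [vocab[t] for t in tokens] + [vocab['<EOS>']]
    if len(ids) > max_len:
        raise ValueError(f"sequence of length {len(ids)} exceeds max_len={max_len}")
    return ids + [vocab['<PAD>']] * (max_len - len(ids))


# Up to 5 digits plus <SOS> and <EOS> gives a maximum length of 7.
src = encode(list('123'), input_vocab, max_len=7)                 # one token per digit
tgt = encode('one two three'.split(), output_vocab, max_len=7)    # one token per word
print(src)  # [1, 4, 5, 6, 2, 0, 0]
print(tgt)  # [1, 3, 4, 5, 2, 0, 0]
```

Padding with index 0 matches the `<PAD>=0` convention above, so a loss created with `ignore_index=0` would skip the padded positions.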

In [None]:
# TODO: Encoder LSTM
class Encoder(nn.Module):
    def __init__(self, input_size, embedding_dim, hidden_dim):
        super(Encoder, self).__init__()
        self.embedding = nn.Embedding(input_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim, batch_first=True)
    
    def forward(self, x):
        # x: (batch_size, seq_len)
        embedded = self.embedding(x)  # (batch_size, seq_len, embedding_dim)
        outputs, (hidden, cell) = self.lstm(embedded)
        return outputs, hidden, cell

# TODO: Attention Layer
class Attention(nn.Module):
    def __init__(self, hidden_dim):
        super(Attention, self).__init__()
        self.attn = nn.Linear(hidden_dim * 2, hidden_dim)
        self.v = nn.Linear(hidden_dim, 1, bias=False)
    
    def forward(self, hidden, encoder_outputs):
        # hidden: (1, batch_size, hidden_dim)
        # encoder_outputs: (batch_size, src_len, hidden_dim)
        batch_size = encoder_outputs.shape[0]
        src_len = encoder_outputs.shape[1]
        
        # Repeat hidden for each source timestep
        hidden = hidden.repeat(src_len, 1, 1)  # (src_len, batch_size, hidden_dim)
        hidden = hidden.permute(1, 0, 2)  # (batch_size, src_len, hidden_dim)
        
        # Compute the attention scores
        energy = torch.tanh(self.attn(torch.cat((hidden, encoder_outputs), dim=2)))
        attention = self.v(energy).squeeze(2)  # (batch_size, src_len)
        
        return torch.softmax(attention, dim=1)

# TODO: LSTM decoder with attention
class DecoderWithAttention(nn.Module):
    def __init__(self, output_size, embedding_dim, hidden_dim):
        super(DecoderWithAttention, self).__init__()
        self.embedding = nn.Embedding(output_size, embedding_dim)
        self.attention = Attention(hidden_dim)
        self.lstm = nn.LSTM(embedding_dim + hidden_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, output_size)
    
    def forward(self, x, hidden, cell, encoder_outputs):
        # x: (batch_size, 1)
        embedded = self.embedding(x)  # (batch_size, 1, embedding_dim)
        
        # Attention
        attn_weights = self.attention(hidden, encoder_outputs)  # (batch_size, src_len)
        attn_weights = attn_weights.unsqueeze(1)  # (batch_size, 1, src_len)
        
        # Context vector
        context = torch.bmm(attn_weights, encoder_outputs)  # (batch_size, 1, hidden_dim)
        
        # Concatenate the embedding and the context vector
        lstm_input = torch.cat((embedded, context), dim=2)  # (batch_size, 1, emb+hidden)
        
        # LSTM
        output, (hidden, cell) = self.lstm(lstm_input, (hidden, cell))
        
        # Prediction
        prediction = self.fc(output.squeeze(1))  # (batch_size, output_size)
        
        return prediction, hidden, cell, attn_weights

print("Seq2Seq architecture defined!")
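
To make the tensor shapes in the decoder concrete, the sketch below walks through the weighting step with random tensors standing in for real encoder outputs. The sizes (`batch_size=2`, `src_len=5`, `hidden_dim=8`) and the random scores are illustrative assumptions; only `torch.softmax` and `torch.bmm` mirror the calls used in the classes above.

```python
import torch

torch.manual_seed(0)
batch_size, src_len, hidden_dim = 2, 5, 8

# Random stand-ins for the encoder outputs and the unnormalized attention scores.
encoder_outputs = torch.randn(batch_size, src_len, hidden_dim)
scores = torch.randn(batch_size, src_len)

# Normalize the scores over the source positions.
attn_weights = torch.softmax(scores, dim=1)                        # (batch_size, src_len)

# Batched matrix product: (B, 1, src_len) x (B, src_len, H) -> (B, 1, H)
context = torch.bmm(attn_weights.unsqueeze(1), encoder_outputs)

# The same context written as an explicit weighted sum over source positions.
manual = (attn_weights.unsqueeze(2) * encoder_outputs).sum(dim=1, keepdim=True)

print(context.shape)
print(torch.allclose(context, manual, atol=1e-6))
```

The context vector is therefore a convex combination of the encoder outputs, which is why it has the same last dimension as `encoder_outputs`.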

In [None]:
# TODO: Implement training and test on new examples
# TIP: Use teacher forcing during training
# TIP: For inference, generate one word at a time until <EOS>

print("\n=== EXERCISE 1: Implement training and inference ===")
print("Hint: Use teacher forcing ratio of 0.5")
print("Hint: Train for ~50 epochs with batch_size=32")
print("Expected accuracy: >95% on test set")
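
The teacher forcing hint can be sketched as follows. This is an illustration under assumptions, not the reference solution: `decode_with_teacher_forcing` and `ToyDecoderStep` are hypothetical names, and the toy step uses a plain `nn.GRUCell` without attention so the block stays short. Plugging in `DecoderWithAttention` would need a small wrapper that carries `hidden`, `cell`, and `encoder_outputs` in `state`.

```python
import random

import torch
import torch.nn as nn

random.seed(0)
torch.manual_seed(0)


def decode_with_teacher_forcing(step_fn, state, targets, sos_idx, teacher_forcing_ratio=0.5):
    """Unroll a decoder one token at a time.

    step_fn(prev_tokens, state) -> (logits, new_state), with prev_tokens of
    shape (batch, 1) and logits of shape (batch, vocab).
    targets: (batch, tgt_len) gold indices, without the leading <SOS>.
    Returns logits of shape (batch, tgt_len, vocab).
    """
    batch_size, tgt_len = targets.shape
    prev = torch.full((batch_size, 1), sos_idx, dtype=torch.long, device=targets.device)
    step_logits = []
    for t in range(tgt_len):
        logits, state = step_fn(prev, state)
        step_logits.append(logits)
        if random.random() < teacher_forcing_ratio:
            prev = targets[:, t:t + 1]                   # feed the ground-truth token
        else:
            prev = logits.argmax(dim=1, keepdim=True)    # feed the model's own prediction
    return torch.stack(step_logits, dim=1)


class ToyDecoderStep(nn.Module):
    """Hypothetical stand-in for a decoder step (no attention)."""

    def __init__(self, vocab_size=13, hidden_dim=16):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, hidden_dim)
        self.cell = nn.GRUCell(hidden_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, prev, state):
        state = self.cell(self.emb(prev.squeeze(1)), state)
        return self.out(state), state


step = ToyDecoderStep()
targets = torch.randint(3, 13, (4, 6))     # random word indices, skipping the special tokens
state = torch.zeros(4, 16)                 # would come from the encoder in a real model
logits = decode_with_teacher_forcing(step, state, targets, sos_idx=1)

# ignore_index=0 skips <PAD> positions when real, padded targets are used.
loss = nn.CrossEntropyLoss(ignore_index=0)(logits.reshape(-1, 13), targets.reshape(-1))
loss.backward()
print(logits.shape, loss.item())
```

Here the coin is flipped once per timestep for the whole batch; flipping it once per sequence is an equally common variant. At inference time the same loop applies with `teacher_forcing_ratio=0.0` and a stop condition on `<EOS>`.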

## Exercise 2: Text Generation with an LSTM

### Objective
Build a character-level text generator.

### Instructions
1. Use a simple text (e.g. Shakespeare, Lorem Ipsum)
2. Build a character vocabulary
3. Build sequences (input: N characters, target: the (N+1)-th character)
4. Train an LSTM
5. Generate new text by sampling

In [None]:
# TODO: Sample text
text = """
To be or not to be, that is the question.
Whether tis nobler in the mind to suffer
The slings and arrows of outrageous fortune,
Or to take arms against a sea of troubles
And by opposing end them.
""" * 20  # Repeat to get more data

# TODO: Build the character vocabulary
chars = sorted(list(set(text)))
char_to_idx = {ch: i for i, ch in enumerate(chars)}
idx_to_char = {i: ch for i, ch in enumerate(chars)}
vocab_size = len(chars)

print(f"Text length: {len(text)}")
print(f"Vocab size: {vocab_size}")
print(f"Characters: {''.join(chars)}")

In [None]:
# TODO: Build sequences
seq_length = 40
sequences = []
next_chars = []

for i in range(len(text) - seq_length):
    sequences.append(text[i:i+seq_length])
    next_chars.append(text[i+seq_length])

print(f"Number of sequences: {len(sequences)}")
print(f"Example: '{sequences[0]}' -> '{next_chars[0]}'")

In [None]:
# TODO: LSTM model for character generation
class CharLSTM(nn.Module):
    def __init__(self, vocab_size, embedding_dim, hidden_dim, num_layers):
        super(CharLSTM, self).__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_dim, vocab_size)
    
    def forward(self, x, hidden=None):
        embedded = self.embedding(x)
        if hidden is None:
            output, hidden = self.lstm(embedded)
        else:
            output, hidden = self.lstm(embedded, hidden)
        output = self.fc(output)
        return output, hidden

# TODO: Generation function
def generate_text(model, start_str, length, temperature=1.0):
    """Generate text from a seed string.
    
    temperature: controls creativity (0.5=conservative, 1.5=creative)
    """
    model.eval()
    chars_generated = [ch for ch in start_str]
    input_seq = torch.LongTensor([[char_to_idx[ch] for ch in start_str]]).to(device)
    hidden = None
    
    with torch.no_grad():
        for _ in range(length):
            output, hidden = model(input_seq, hidden)
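            output = output[0, -1, :] / temperature
            # Use float64 and renormalize: float32 rounding can make the
            # probabilities fail np.random.choice's sum-to-1 check
            probs = torch.softmax(output.double(), dim=0).cpu().numpy()
            probs = probs / probs.sum()
            char_idx = np.random.choice(len(probs), p=probs)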
            next_char = idx_to_char[char_idx]
            chars_generated.append(next_char)
            input_seq = torch.LongTensor([[char_idx]]).to(device)
    
    return ''.join(chars_generated)

print("\n=== EXERCISE 2: Implement training and generate text ===")
print("Hint: Train for ~100 epochs")
print("Hint: Try different temperatures (0.5, 1.0, 1.5)")
print("Expected: Generated text should resemble Shakespeare style")
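
The effect of the `temperature` argument can be seen without a trained model. The sketch below applies a temperature-scaled softmax to three made-up logits; `softmax_with_temperature` is a hypothetical helper written in plain Python, mirroring the `output / temperature` followed by softmax in `generate_text`.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Divide the logits by the temperature, then apply a numerically stable softmax."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                       # subtract the max to avoid overflow
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]   # made-up scores for three candidate characters
p_low = softmax_with_temperature(logits, 0.5)
p_mid = softmax_with_temperature(logits, 1.0)
p_high = softmax_with_temperature(logits, 1.5)

for name, probs in [("T=0.5", p_low), ("T=1.0", p_mid), ("T=1.5", p_high)]:
    print(name, [round(p, 3) for p in probs])
```

A temperature below 1 concentrates probability on the top-scoring character, so samples become more repetitive. A temperature above 1 flattens the distribution, so samples become more varied and more error-prone.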

## Exercise 3: Time Series Forecasting with a GRU

### Objective
Predict future values of a time series (e.g. sinusoid + noise).

### Instructions
1. Generate a synthetic time series
2. Build sequences (sliding window)
3. Train a GRU
4. Predict future values
5. Visualize the predictions

In [None]:
# TODO: Generate the time series
def generate_time_series(n_points=1000):
    t = np.linspace(0, 100, n_points)
    # Sum of sinusoids + trend + noise
    series = (
        10 * np.sin(0.1 * t) + 
        5 * np.sin(0.3 * t) + 
        0.05 * t +
        np.random.randn(n_points) * 2
    )
    return series

time_series = generate_time_series(1000)

plt.figure(figsize=(14, 4))
plt.plot(time_series[:200], linewidth=2)
plt.xlabel('Time', fontsize=12)
plt.ylabel('Value', fontsize=12)
plt.title('Time Series Data (first 200 points)', fontsize=14, fontweight='bold')
plt.grid(True, alpha=0.3)
plt.show()

In [None]:
# TODO: Build sequences with a sliding window
def create_sequences(data, seq_length, pred_length=1):
    """Build (X, y) pairs for multi-step prediction."""
    X, y = [], []
    for i in range(len(data) - seq_length - pred_length + 1):
        X.append(data[i:i+seq_length])
        y.append(data[i+seq_length:i+seq_length+pred_length])
    return np.array(X), np.array(y)

seq_length = 50
pred_length = 10  # Predict 10 steps into the future

# TODO: Normalize the data
# TODO: Create the train/test split
# TODO: Create the PyTorch DataLoader

print("\n=== EXERCISE 3: Implement GRU for time series prediction ===")
print(f"Hint: Use sequence length = {seq_length}")
print(f"Hint: Predict {pred_length} steps ahead")
print("Expected: MSE < 5.0 on test set")
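
One possible way to handle the normalization and split TODOs is sketched below. It assumes a chronological 80/20 split and z-score normalization with statistics computed on the training part only. Both are common choices for time series rather than requirements of the exercise, and `make_windows` simply repeats the sliding-window logic of `create_sequences` so the block runs on its own.

```python
import numpy as np


def make_windows(data, seq_length, pred_length):
    """Same sliding-window logic as create_sequences above."""
    X, y = [], []
    for i in range(len(data) - seq_length - pred_length + 1):
        X.append(data[i:i + seq_length])
        y.append(data[i + seq_length:i + seq_length + pred_length])
    return np.array(X), np.array(y)


# Synthetic stand-in series (simplified version of generate_time_series).
rng = np.random.default_rng(0)
t = np.linspace(0, 100, 1000)
series = 10 * np.sin(0.1 * t) + 0.05 * t + rng.normal(0, 2, size=1000)

# Chronological split: the test set lies strictly after the training set.
split = len(series) * 8 // 10
train_raw, test_raw = series[:split], series[split:]

# Normalize with training statistics only, to avoid leaking test information.
mean, std = train_raw.mean(), train_raw.std()
train = (train_raw - mean) / std
test = (test_raw - mean) / std

# Window each part separately so no window straddles the split point.
X_train, y_train = make_windows(train, seq_length=50, pred_length=10)
X_test, y_test = make_windows(test, seq_length=50, pred_length=10)
print(X_train.shape, y_train.shape)  # (741, 50) (741, 10)
print(X_test.shape, y_test.shape)    # (141, 50) (141, 10)
```

The arrays can then be wrapped with `torch.utils.data.TensorDataset` and `DataLoader`, adding a trailing feature dimension so that inputs have shape `(batch, seq_len, 1)`. If the MSE target is meant on the original scale, predictions would need to be mapped back with `pred * std + mean` before computing it.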

In [None]:
# TODO: GRU model
class TimeSeriesGRU(nn.Module):
    def __init__(self, input_dim, hidden_dim, num_layers, output_dim):
        super(TimeSeriesGRU, self).__init__()
        self.gru = nn.GRU(input_dim, hidden_dim, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_dim, output_dim)
    
    def forward(self, x):
        # x: (batch_size, seq_len, input_dim)
        output, hidden = self.gru(x)
        # Take the output at the last timestep
        prediction = self.fc(output[:, -1, :])
        return prediction

# TODO: Train and visualize the predictions
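
A training loop for this kind of model could look like the sketch below. It is an illustration under assumptions, not the expected solution: `TinyGRU` repeats the structure of `TimeSeriesGRU` so the block is self-contained, the data is a noiseless sine wave, and the hyperparameters (hidden size 16, Adam with `lr=1e-2`, 5 epochs) are arbitrary.

```python
import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)


class TinyGRU(nn.Module):
    """Same structure as TimeSeriesGRU above, repeated so the sketch runs alone."""

    def __init__(self, input_dim=1, hidden_dim=16, num_layers=1, output_dim=10):
        super().__init__()
        self.gru = nn.GRU(input_dim, hidden_dim, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        output, _ = self.gru(x)
        return self.fc(output[:, -1, :])     # predict all 10 steps from the last state


# Toy data: windows of 50 points -> the next 10 points.
series = np.sin(np.linspace(0, 20, 300)).astype(np.float32)
n_windows = len(series) - 60 + 1
X = np.stack([series[i:i + 50] for i in range(n_windows)])
y = np.stack([series[i + 50:i + 60] for i in range(n_windows)])
X_t = torch.from_numpy(X).unsqueeze(-1)      # (241, 50, 1): add the feature dimension
y_t = torch.from_numpy(y)                    # (241, 10)

loader = DataLoader(TensorDataset(X_t, y_t), batch_size=32, shuffle=True)
model = TinyGRU()
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

losses = []
for epoch in range(5):
    model.train()
    running = 0.0
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()
        running += loss.item() * len(xb)     # weight by batch size for a true mean
    losses.append(running / len(loader.dataset))

model.eval()
with torch.no_grad():
    preds = model(X_t)
print(preds.shape, losses)
```

For the visualization step, plotting one row of `preds` against the matching row of `y_t` with `plt.plot` is enough to see whether the 10-step forecast follows the curve.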

## Exercise 4: Multi-class Classification with BERT

### Objective
Use BERT to classify texts into several categories.

### Instructions
1. Create a synthetic dataset with 3 categories (tech, sport, politics)
2. Fine-tune BERT for multi-class classification
3. Evaluate with accuracy and a confusion matrix
4. Test on new texts
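
The evaluation step (accuracy and confusion matrix) can be sketched in plain Python as follows. The label lists are made up for illustration and the helper names are hypothetical. In practice `sklearn.metrics.confusion_matrix` and `accuracy_score` do the same job.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up labels: 0=tech, 1=sport, 2=politics
y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0, 2]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
acc = accuracy(y_true, y_pred)
for row in cm:
    print(row)
print(f"accuracy = {acc:.3f}")
```

Off-diagonal entries show which classes get confused with each other. In this toy example one tech text is predicted as sport and one politics text as tech.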

In [None]:
# TODO: Multi-class dataset
tech_texts = [
    "new smartphone release powerful processor",
    "artificial intelligence machine learning breakthrough",
    "software update bug fixes performance improvements",
    "cloud computing data center expansion",
    "cybersecurity threat detection prevention"
] * 40

sport_texts = [
    "football match championship final victory",
    "basketball team playoff tournament win",
    "tennis player grand slam title",
    "olympic games gold medal record",
    "soccer world cup qualifier goal"
] * 40

politics_texts = [
    "election campaign presidential debate vote",
    "government policy reform legislation passed",
    "international summit diplomatic relations treaty",
    "parliament session bill proposal discussion",
    "political party coalition agreement negotiation"
] * 40

# TODO: Combine the lists and create labels (0=tech, 1=sport, 2=politics)
# TODO: Use the transformers library for BERT
# TODO: Fine-tune and evaluate

print("\n=== EXERCISE 4: Implement multi-class classification with BERT ===")
print("Hint: Use 'bert-base-uncased' model")
print("Hint: Set num_labels=3")
print("Expected accuracy: >90%")
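
Combining the three lists into labeled examples and splitting them could look like the sketch below. To keep it compact, each class uses only two of the five sentences listed above, which is an assumption made for the example. The seed and the 80/20 ratio are arbitrary as well.

```python
import random

# Shortened stand-ins for the three lists above (two sentences per class).
tech_texts = ["new smartphone release powerful processor",
              "cloud computing data center expansion"] * 40
sport_texts = ["football match championship final victory",
               "tennis player grand slam title"] * 40
politics_texts = ["election campaign presidential debate vote",
                  "parliament session bill proposal discussion"] * 40

# Labels follow the convention 0=tech, 1=sport, 2=politics.
texts = tech_texts + sport_texts + politics_texts
labels = [0] * len(tech_texts) + [1] * len(sport_texts) + [2] * len(politics_texts)

# Shuffle text/label pairs together, then take an 80/20 split.
pairs = list(zip(texts, labels))
random.Random(42).shuffle(pairs)
split = len(pairs) * 8 // 10
train_pairs, test_pairs = pairs[:split], pairs[split:]
print(len(train_pairs), len(test_pairs))  # 192 48

# Count test sentences that never appear in the training set.
unique_train = {text for text, _ in train_pairs}
unique_test = {text for text, _ in test_pairs}
print(len(unique_test - unique_train))
```

Because each sentence is repeated 40 times, the last line will almost certainly print 0: every test sentence also occurs in the training set. The accuracy target is therefore easy to reach on this data and says little about generalization, which is what the "test on new texts" step is for. From here, the texts would typically be tokenized with `AutoTokenizer.from_pretrained('bert-base-uncased')` and passed to `AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=3)`.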

## Solutions (uncomment after attempting)

Pointers to the solutions are given below. Try solving the exercises on your own first!

In [None]:
# SOLUTION EXERCISE 1: Available in the demonstration notebooks
# SOLUTION EXERCISE 2: See 08_demo_lstm_sentiment.ipynb for a similar architecture
# SOLUTION EXERCISE 3: Adapt the LSTM model with a multi-step output
# SOLUTION EXERCISE 4: See 08_demo_transformers_huggingface.ipynb

## Bonus: Advanced Projects

### Project 1: Simple Chatbot
- Create a dataset of question-answer pairs
- Train a seq2seq model with attention
- Add beam search for generation

### Project 2: Automatic Summarization
- Use a dataset of long texts paired with summaries
- Fine-tune BART or T5
- Evaluate with the ROUGE score

### Project 3: Named Entity Recognition
- Dataset with annotated entities (PER, LOC, ORG)
- Fine-tune BERT for NER
- Visualize the detected entities

### Project 4: Multi-aspect Sentiment Analysis
- Analyze different aspects of a product (quality, price, service)
- Multi-task learning with BERT
- Visualize sentiment per aspect

## Additional Resources

### Key Papers
- Seq2Seq: "Sequence to Sequence Learning with Neural Networks" (Sutskever et al., 2014)
- Attention: "Neural Machine Translation by Jointly Learning to Align and Translate" (Bahdanau et al., 2015)
- Transformer: "Attention is All You Need" (Vaswani et al., 2017)
- BERT: "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding" (Devlin et al., 2018)

### Public Datasets
- IMDB: Sentiment analysis (50k reviews)
- WMT: Machine translation
- SQuAD: Question answering
- GLUE/SuperGLUE: NLP benchmarks

### Libraries
- Hugging Face Transformers: https://huggingface.co/transformers/
- PyTorch: https://pytorch.org/
- spaCy: https://spacy.io/
- NLTK: https://www.nltk.org/