# Building Your Own ChatGPT-Style Model

This code comes from the repository for David Foster's book *Generative Deep Learning*, available [here](https://github.com/davidADSP/Generative_Deep_Learning_2nd_Edition/blob/main/notebooks/09_transformer/gpt/gpt.ipynb).

A similar example is also available [here](https://keras.io/examples/generative/text_generation_with_miniature_gpt/).

# Wine Description Generator 🍷



## Objective 🎯

- Design and train a model that generates wine descriptions capturing the aromas, flavors, and distinctive character of each bottle.

## Data 📊

- Source: [winemag-data-130k-v2.json on Kaggle](https://www.kaggle.com/zynicide/wine-reviews)
- About 130,000 wine reviews collected from *Wine Magazine*.
- Each entry includes fields such as country, designation, price, and variety and, most importantly for this project, a detailed description of the wine.

## Methodology ⚙️

1. **Data cleaning**: Drop entries with missing values and prepare the descriptions for training.
2. **Training the Transformer model**: Use a Transformer architecture to learn the patterns of wine descriptions and generate new ones.



In [41]:
%load_ext autoreload
%autoreload 2
import numpy as np
import json
import re
import string
from IPython.display import display, HTML

import tensorflow as tf
from tensorflow.keras import layers, models, losses, callbacks



## 0. Project parameters <a name="parameters"></a>

In [2]:
# Vocabulary size (number of distinct tokens the model keeps)
VOCAB_SIZE = 10000

# Maximum length of the sequences fed to the model
MAX_LEN = 80

# Dimension of the token embeddings (vector representation of each token)
EMBEDDING_DIM = 256

# Key dimension of each head in multi-head attention
KEY_DIM = 256

# Number of heads in multi-head attention
N_HEADS = 2

# Hidden dimension of the feed-forward network inside the Transformer block
FEED_FORWARD_DIM = 256

# Fraction of the data reserved for validation during training
VALIDATION_SPLIT = 0.2

# Random seed for reproducibility
SEED = 42

# If True, load a previously saved model instead of starting from fresh weights
LOAD_MODEL = True

# Batch size used during training
BATCH_SIZE = 32

# Number of full passes over the dataset during training
EPOCHS = 5


## 1. Load the data <a name="load"></a>

In [42]:
# Load the full dataset
with open("winemag-data-130k-v2.json") as json_data:
    wine_data = json.load(json_data)


In [44]:
wine_data[10]

{'points': '87',
 'title': 'Kirkland Signature 2011 Mountain Cuvée Cabernet Sauvignon (Napa Valley)',
 'description': 'Soft, supple plum envelopes an oaky structure in this Cabernet, supported by 15% Merlot. Coffee and chocolate complete the picture, finishing strong at the end, resulting in a value-priced wine of attractive flavor and immediate accessibility.',
 'taster_name': 'Virginie Boone',
 'taster_twitter_handle': '@vboone',
 'price': 19,
 'designation': 'Mountain Cuvée',
 'variety': 'Cabernet Sauvignon',
 'region_1': 'Napa Valley',
 'region_2': 'Napa',
 'province': 'California',
 'country': 'US',
 'winery': 'Kirkland Signature'}

In [45]:
# Keep entries where all four fields are present and build one text string per review
filtered_data = [
    "wine review : "
    + x["country"]
    + " : "
    + x["province"]
    + " : "
    + x["variety"]
    + " : "
    + x["description"]
    for x in wine_data
    if x["country"] is not None
    and x["province"] is not None
    and x["variety"] is not None
    and x["description"] is not None
]

In [6]:
filtered_data

["wine review : Italy : Sicily & Sardinia : White Blend : Aromas include tropical fruit, broom, brimstone and dried herb. The palate isn't overly expressive, offering unripened apple, citrus and dried sage alongside brisk acidity.",
 "wine review : Portugal : Douro : Portuguese Red : This is ripe and fruity, a wine that is smooth while still structured. Firm tannins are filled out with juicy red berry fruits and freshened with acidity. It's  already drinkable, although it will certainly be better from 2016.",
 'wine review : US : Oregon : Pinot Gris : Tart and snappy, the flavors of lime flesh and rind dominate. Some green pineapple pokes through, with crisp acidity underscoring the flavors. The wine was all stainless-steel fermented.',
 'wine review : US : Michigan : Riesling : Pineapple rind, lemon pith and orange blossom start off the aromas. The palate is a bit more opulent, with notes of honey-drizzled guava and mango giving way to a slightly astringent, semidry finish.',
 "wine r

In [46]:
# Count the wine descriptions
n_wines = len(filtered_data)
print(f"{n_wines} descriptions loaded")


129907 descriptions loaded


In [47]:
example = filtered_data[25]
print(example)

wine review : US : California : Pinot Noir : Oak and earth intermingle around robust aromas of wet forest floor in this vineyard-designated Pinot that hails from a high-elevation site. Small in production, it offers intense, full-bodied raspberry and blackberry steeped in smoky spice and smooth texture.


## Tokenize the data <a name="tokenize"></a>


**Tokenization** is a fundamental step in natural language processing (NLP). It splits a string of text into smaller units called "tokens". Tokens are typically words, subwords, or characters; in this notebook each word and each punctuation mark becomes its own token.






In [48]:
def pad_punctuation(s):
    """
    Add spaces around punctuation so each mark is treated as a separate 'word'.

    Args:
        s (str): String to process.

    Returns:
        str: String with punctuation separated by spaces.
    """

    # Surround every punctuation character (and newline) with spaces.
    # re.escape keeps characters such as ']' and '^' literal inside the character class.
    s = re.sub(f"([{re.escape(string.punctuation)}\n])", r" \1 ", s)

    # Collapse runs of spaces into a single space
    s = re.sub(" +", " ", s)

    return s

# Apply `pad_punctuation` to every element of `filtered_data`
text_data = [pad_punctuation(x) for x in filtered_data]


In [49]:
# Display one padded description
example_data = text_data[30]
example_data

'wine review : France : Beaujolais : Gamay : Red cherry fruit comes laced with light tannins , giving this bright wine an open , juicy character . '

In [50]:
# Convert to a TensorFlow Dataset
text_ds = (
    tf.data.Dataset.from_tensor_slices(text_data)  # Build a dataset from the `text_data` list
    .batch(BATCH_SIZE)  # Group the examples into batches of `BATCH_SIZE`
    .shuffle(1000)  # Shuffle the batches with a buffer of 1000 (shuffling runs after batching)
)


In [51]:
# Create a vectorization layer
vectorize_layer = layers.TextVectorization(
    standardize="lower",                   # Lower-case the text
    max_tokens=VOCAB_SIZE,                 # Maximum vocabulary size (including padding and [UNK])
    output_mode="int",                     # Represent each token by an integer id
    output_sequence_length=MAX_LEN + 1,    # Pad or truncate to MAX_LEN + 1 so input and target can be offset by one token
)


In [52]:
# Fit the layer's vocabulary on the training data
vectorize_layer.adapt(text_ds)

# Retrieve the vocabulary built by the vectorization layer
vocab = vectorize_layer.get_vocabulary()
vocab

['',
 '[UNK]',
 ':',
 ',',
 '.',
 'and',
 'the',
 'wine',
 'a',
 'of',
 'review',
 'with',
 'this',
 'is',
 '-',
 'it',
 'flavors',
 'in',
 "'",
 'to',
 'us',
 's',
 'fruit',
 'on',
 'red',
 'that',
 'aromas',
 'blend',
 'palate',
 'california',
 'acidity',
 'finish',
 'from',
 'but',
 'tannins',
 'drink',
 'cherry',
 'black',
 'ripe',
 'are',
 'italy',
 'has',
 'france',
 'pinot',
 'sauvignon',
 'cabernet',
 'for',
 'by',
 '%',
 'spice',
 'notes',
 'as',
 'an',
 'white',
 'its',
 'oak',
 'fresh',
 'rich',
 'dry',
 'berry',
 'nose',
 'chardonnay',
 'noir',
 'style',
 'full',
 'bordeaux',
 'now',
 'plum',
 'soft',
 'apple',
 'fruits',
 'well',
 'sweet',
 'crisp',
 'blackberry',
 'offers',
 'light',
 'texture',
 'dark',
 'there',
 'while',
 'citrus',
 'bodied',
 'shows',
 'through',
 'spain',
 'vanilla',
 'bright',
 'at',
 'very',
 'pepper',
 'more',
 'syrah',
 'juicy',
 'green',
 'lemon',
 'fruity',
 'raspberry',
 'valley',
 'good',
 'merlot',
 'blanc',
 'firm',
 'washington',
 'some',


In [53]:
# Show the first few index-to-token mappings
for i, word in enumerate(vocab[:10]):
    print(f"{i}: {word}")


0: 
1: [UNK]
2: :
3: ,
4: .
5: and
6: the
7: wine
8: a
9: of


In [54]:
# Show the same example converted to integers
example_tokenised = vectorize_layer(example_data)

# Print the resulting integer sequence
print(example_tokenised.numpy())


[  7  10   2  42   2 580   2 620   2  24  36  22 287 742  11  76  34   3
 429  12  87   7  52 456   3  93 110   4   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0]
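
To make the mapping from text to padded integer sequences more concrete, here is a minimal pure-Python sketch of the same idea. It is not how `TextVectorization` is implemented; the helper names `build_vocab` and `encode` and the toy sentences are assumptions made for illustration. It only mirrors the conventions visible above: id 0 is padding, id 1 is `[UNK]`, and the remaining ids are ordered by token frequency.

```python
from collections import Counter

PAD, UNK = "", "[UNK]"  # ids 0 and 1, mirroring the vocabulary printed above


def build_vocab(texts, max_tokens):
    """Rank lower-cased, whitespace-separated tokens by frequency."""
    counts = Counter(tok for t in texts for tok in t.lower().split())
    most_common = [tok for tok, _ in counts.most_common(max_tokens - 2)]
    return [PAD, UNK] + most_common


def encode(text, vocab, seq_len):
    """Map tokens to ids, then truncate or right-pad with 0 to seq_len."""
    index = {tok: i for i, tok in enumerate(vocab)}
    ids = [index.get(tok, 1) for tok in text.lower().split()][:seq_len]
    return ids + [0] * (seq_len - len(ids))


# Hypothetical toy corpus, already padded around punctuation
toy_texts = [
    "wine review : france : a bright wine .",
    "wine review : italy : a ripe wine .",
]
toy_vocab = build_vocab(toy_texts, max_tokens=8)
print(toy_vocab)
# "spain" is not in the toy vocabulary, so it maps to id 1 ([UNK])
print(encode("wine review : spain : a bright wine .", toy_vocab, seq_len=12))
```

Words that fall outside the `max_tokens` most frequent entries are all mapped to `[UNK]`, which is why `[UNK]` occasionally appears in generated text.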


## Create the training set <a name="create"></a>

In [55]:
def prepare_inputs(text):
    """
    Build training pairs: the tokenized text as input and the same
    sequence shifted by one token as target.

    Args:
        text (tf.Tensor): Batch of raw text strings.

    Returns:
        tuple: Two tensors, the input tokens (x) and the targets shifted by one token (y).
    """

    # Add a trailing dimension so the vectorization layer receives shape (batch, 1)
    text = tf.expand_dims(text, -1)

    # Convert the text to integer sequences
    tokenized_sentences = vectorize_layer(text)

    # x holds every token except the last one
    x = tokenized_sentences[:, :-1]

    # y holds every token except the first one, i.e. the sequence shifted by one token
    y = tokenized_sentences[:, 1:]

    return x, y

# Apply `prepare_inputs` to the dataset
train_ds = text_ds.map(prepare_inputs)


In [56]:
example_input_output = train_ds.take(1).get_single_element()

In [57]:
# Example input
example_input_output[0][0]

<tf.Tensor: shape=(80,), dtype=int64, numpy=
array([   7,   10,    2,   20,    2,   29,    2,  557,    2, 6399,  307,
          3,  127,   95,    3, 1776,  151,    5,   49,    3,   12,  557,
         13, 1028,  299,    5, 1344,  282,    3,   52,  575, 1969,   19,
        123,  467,   33,  139,    6, 4855,    9,   16, 1461,  453,    8,
       7678, 2639,    4,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0])>

In [19]:
# Example target (shifted by one token)
example_input_output[1][0]


<tf.Tensor: shape=(80,), dtype=int64, numpy=
array([  10,    2,   20,    2,   29,    2,  286,    2,    8,  335, 2137,
          9, 6151,  470,    3,   12,   57,   63,    9,  164,   83,   24,
         36,    3,  213,    3,  276,  846,    5,  685,    3,  151,  859,
         14,  111,   26,   23,    6,   60,    4,   38,   67,    5,  204,
         16,  793,    6,   28,    3, 2171,   47,  719,  333,    5,  266,
          9,  522,  841,    4,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0])>
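
The relationship between input and target is easiest to see on a short list. The sketch below applies the same slicing as `prepare_inputs` to a plain Python list; the function name `make_input_target` and the token values are hypothetical.

```python
def make_input_target(token_ids):
    """Split one tokenized sequence into (input, target) shifted by one token."""
    x = token_ids[:-1]  # everything except the last token
    y = token_ids[1:]   # everything except the first token
    return x, y


# Toy sequence of 6 ids (hypothetical values, 0 is padding)
tokens = [7, 10, 2, 42, 4, 0]
x, y = make_input_target(tokens)
for position, (inp, tgt) in enumerate(zip(x, y)):
    print(f"position {position}: input {inp:>2} -> target {tgt}")
```

At every position the target is simply the next token of the input, which is why the vectorization layer produces `MAX_LEN + 1` tokens: after slicing, both `x` and `y` have length `MAX_LEN`.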

## Create the causal attention mask function <a name="causal"></a>

In [20]:
def causal_attention_mask(batch_size, n_dest, n_src, dtype):
    """
    Create a causal attention mask.

    The mask ensures that the prediction for a token only depends on
    the past (the tokens that precede it).

    Args:
        batch_size (int): Batch size.
        n_dest (int): Number of destination tokens.
        n_src (int): Number of source tokens.
        dtype (tf.DType): Element type of the mask.

    Returns:
        tf.Tensor: Causal attention mask of shape [batch_size, n_dest, n_src].
    """

    # Index vectors for the destination (column vector) and source tokens
    i = tf.range(n_dest)[:, None]
    j = tf.range(n_src)

    # Destination i may attend to source j when j <= i + (n_src - n_dest)
    m = i >= j - n_src + n_dest

    # Cast the boolean mask to the requested type
    mask = tf.cast(m, dtype)

    # Add a leading batch axis
    mask = tf.reshape(mask, [1, n_dest, n_src])

    # Repeat the mask for every example in the batch
    mult = tf.concat(
        [tf.expand_dims(batch_size, -1), tf.constant([1, 1], dtype=tf.int32)], 0
    )
    return tf.tile(mask, mult)

# Display the transposed mask: column i lists the source positions visible to token i
np.transpose(causal_attention_mask(1, 10, 10, dtype=tf.int32)[0])


array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [0, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [0, 0, 1, 1, 1, 1, 1, 1, 1, 1],
       [0, 0, 0, 1, 1, 1, 1, 1, 1, 1],
       [0, 0, 0, 0, 1, 1, 1, 1, 1, 1],
       [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
       [0, 0, 0, 0, 0, 0, 1, 1, 1, 1],
       [0, 0, 0, 0, 0, 0, 0, 1, 1, 1],
       [0, 0, 0, 0, 0, 0, 0, 0, 1, 1],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]], dtype=int32)
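
The same rule can be written without TensorFlow. This sketch builds the untransposed mask as a nested list using the condition from `causal_attention_mask` (`i >= j - n_src + n_dest`); the function name `causal_mask` is an assumption for illustration.

```python
def causal_mask(n_dest, n_src):
    """Return mask[i][j] = 1 when destination token i may attend to source token j."""
    offset = n_src - n_dest
    return [
        [1 if j <= i + offset else 0 for j in range(n_src)]
        for i in range(n_dest)
    ]


for row in causal_mask(4, 4):
    print(row)
```

Without the transpose the mask is lower triangular: row `i` has ones in columns `0..i`, so token `i` sees itself and everything before it. The matrix printed above is the transpose of this, which is why it appears upper triangular.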

## Create a Transformer block layer <a name="transformer"></a>

In [21]:
class TransformerBlock(layers.Layer):
    """
    Transformer block layer.

    The block applies causal multi-head attention followed by layer
    normalization, then a feed-forward network followed by a second
    layer normalization, with a residual connection around each sub-layer.

    Attributes:
        num_heads (int): Number of attention heads.
        key_dim (int): Key dimension of each attention head.
        embed_dim (int): Embedding dimension.
        ff_dim (int): Hidden dimension of the feed-forward network.
        dropout_rate (float, optional): Dropout rate. Defaults to 0.1.
    """

    def __init__(self, num_heads, key_dim, embed_dim, ff_dim, dropout_rate=0.1):
        super(TransformerBlock, self).__init__()
        self.num_heads = num_heads
        self.key_dim = key_dim
        self.embed_dim = embed_dim
        self.ff_dim = ff_dim
        self.dropout_rate = dropout_rate

        # Initialize the sub-layers
        self.attn = layers.MultiHeadAttention(
            num_heads, key_dim, output_shape=embed_dim
        )
        self.dropout_1 = layers.Dropout(self.dropout_rate)
        self.ln_1 = layers.LayerNormalization(epsilon=1e-6)
        self.ffn_1 = layers.Dense(self.ff_dim, activation="relu")
        self.ffn_2 = layers.Dense(self.embed_dim)
        self.dropout_2 = layers.Dropout(self.dropout_rate)
        self.ln_2 = layers.LayerNormalization(epsilon=1e-6)

    def call(self, inputs):
        """Forward pass of the Transformer block."""

        # Build the causal attention mask for the current batch
        input_shape = tf.shape(inputs)
        batch_size = input_shape[0]
        seq_len = input_shape[1]
        causal_mask = causal_attention_mask(
            batch_size, seq_len, seq_len, tf.bool
        )

        # Multi-head self-attention, then dropout, residual connection and layer norm
        attention_output, attention_scores = self.attn(
            inputs,
            inputs,
            attention_mask=causal_mask,
            return_attention_scores=True,
        )
        attention_output = self.dropout_1(attention_output)
        out1 = self.ln_1(inputs + attention_output)

        # Feed-forward network, then dropout, residual connection and layer norm
        ffn_1 = self.ffn_1(out1)
        ffn_2 = self.ffn_2(ffn_1)
        ffn_output = self.dropout_2(ffn_2)
        return (self.ln_2(out1 + ffn_output), attention_scores)

    def get_config(self):
        """Return the Transformer block configuration."""
        config = super().get_config()
        config.update(
            {
                "key_dim": self.key_dim,
                "embed_dim": self.embed_dim,
                "num_heads": self.num_heads,
                "ff_dim": self.ff_dim,
                "dropout_rate": self.dropout_rate,
            }
        )
        return config
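
To illustrate what the attention sub-layer computes, here is a heavily simplified single-head sketch in pure Python. It is an illustration of causal scaled dot-product attention only, not the Keras `MultiHeadAttention` implementation: it omits the learned query, key, value and output projections, multiple heads, dropout, the residual connections, layer normalization and the feed-forward network. The function names and the toy input are assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(queries, keys, values):
    """Single-head scaled dot-product attention with a causal mask.

    queries, keys, values: lists of equal-length vectors, one per token.
    Returns (outputs, weights); weights[i][j] is 0 whenever j > i.
    """
    d_k = len(keys[0])
    n_tokens = len(keys)
    outputs, weights = [], []
    for i, q in enumerate(queries):
        # Only positions 0..i are visible to token i
        scores = [
            sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(d_k)
            for j in range(i + 1)
        ]
        w = softmax(scores) + [0.0] * (n_tokens - i - 1)
        weights.append(w)
        # Weighted average of the value vectors
        outputs.append(
            [sum(w[j] * values[j][d] for j in range(n_tokens))
             for d in range(len(values[0]))]
        )
    return outputs, weights


toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical 3-token, 2-dim input
out, att = causal_self_attention(toy, toy, toy)
for row in att:
    print([round(a, 3) for a in row])
```

Each row of `att` sums to 1 and is zero to the right of the diagonal, which is the effect the causal mask has on the attention scores returned by the block.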


## Create the token and position embedding <a name="embedder"></a>

In [22]:
class TokenAndPositionEmbedding(layers.Layer):
    def __init__(self, max_len, vocab_size, embed_dim):
        super(TokenAndPositionEmbedding, self).__init__()
        self.max_len = max_len
        self.vocab_size = vocab_size
        self.embed_dim = embed_dim
        self.token_emb = layers.Embedding(
            input_dim=vocab_size, output_dim=embed_dim
        )
        self.pos_emb = layers.Embedding(input_dim=max_len, output_dim=embed_dim)

    def call(self, x):
        maxlen = tf.shape(x)[-1]
        positions = tf.range(start=0, limit=maxlen, delta=1)
        positions = self.pos_emb(positions)
        x = self.token_emb(x)
        return x + positions

    def get_config(self):
        config = super().get_config()
        config.update(
            {
                "max_len": self.max_len,
                "vocab_size": self.vocab_size,
                "embed_dim": self.embed_dim,
            }
        )
        return config
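
The layer above looks up one vector per token id and one vector per position, then adds them. A minimal sketch of that lookup-and-add step, using random lists in place of trainable `Embedding` weights (the helper names, table sizes and initialisation range are assumptions for illustration):

```python
import random


def make_table(n_rows, dim, rng):
    """Randomly initialised lookup table standing in for a trainable Embedding."""
    return [[rng.uniform(-0.05, 0.05) for _ in range(dim)] for _ in range(n_rows)]


def embed(token_ids, token_table, pos_table):
    """Add the token vector and the position vector at each position."""
    return [
        [t + p for t, p in zip(token_table[tok], pos_table[pos])]
        for pos, tok in enumerate(token_ids)
    ]


rng = random.Random(42)
token_table = make_table(n_rows=10, dim=4, rng=rng)  # toy vocabulary of 10 ids
pos_table = make_table(n_rows=5, dim=4, rng=rng)     # toy maximum length of 5
vectors = embed([7, 2, 7], token_table, pos_table)

# The same token id (7) at positions 0 and 2 gets different vectors
print(vectors[0] != vectors[2])
```

Because the position vector differs at each index, the model can distinguish the same word appearing at different places in a sequence, which attention alone would not do.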

## Build the Transformer model <a name="transformer_decoder"></a>

In [23]:
inputs = layers.Input(shape=(None,), dtype=tf.int32)
x = TokenAndPositionEmbedding(MAX_LEN, VOCAB_SIZE, EMBEDDING_DIM)(inputs)
x, attention_scores = TransformerBlock(
    N_HEADS, KEY_DIM, EMBEDDING_DIM, FEED_FORWARD_DIM
)(x)
outputs = layers.Dense(VOCAB_SIZE, activation="softmax")(x)
gpt = models.Model(inputs=inputs, outputs=[outputs, attention_scores])
gpt.compile("adam", loss=[losses.SparseCategoricalCrossentropy(), None])

In [24]:
gpt.summary()

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_1 (InputLayer)        [(None, None)]            0         
                                                                 
 token_and_position_embeddi  (None, None, 256)         2580480   
 ng (TokenAndPositionEmbedd                                      
 ing)                                                            
                                                                 
 transformer_block (Transfo  ((None, None, 256),       658688    
 rmerBlock)                   (None, 2, None, None))             
                                                                 
 dense_2 (Dense)             (None, None, 10000)       2570000   
                                                                 
Total params: 5809168 (22.16 MB)
Trainable params: 5809168 (22.16 MB)
Non-trainable params: 0 (0.00 Byte)
_____________________
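
The parameter counts in the summary can be reproduced by hand from the hyperparameters defined earlier. The sketch below assumes that the attention layer projects queries, keys and values to `N_HEADS * KEY_DIM` units each, with biases, and projects back to `EMBEDDING_DIM`; this breakdown is consistent with the totals reported above. The helper `dense_params` is introduced here for illustration.

```python
VOCAB_SIZE, MAX_LEN = 10000, 80
EMBEDDING_DIM, KEY_DIM, N_HEADS, FEED_FORWARD_DIM = 256, 256, 2, 256


def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


# Token embedding table plus position embedding table
embedding = VOCAB_SIZE * EMBEDDING_DIM + MAX_LEN * EMBEDDING_DIM

# Multi-head attention: query, key and value projections, then the output projection
qkv = 3 * dense_params(EMBEDDING_DIM, N_HEADS * KEY_DIM)
attn_out = dense_params(N_HEADS * KEY_DIM, EMBEDDING_DIM)
# Two layer norms, each with a scale and an offset vector
layer_norms = 2 * (2 * EMBEDDING_DIM)
# Two dense layers of the feed-forward network
ffn = dense_params(EMBEDDING_DIM, FEED_FORWARD_DIM) + dense_params(
    FEED_FORWARD_DIM, EMBEDDING_DIM
)
block = qkv + attn_out + layer_norms + ffn

# Final dense layer producing one probability per vocabulary entry
head = dense_params(EMBEDDING_DIM, VOCAB_SIZE)

print(embedding, block, head, embedding + block + head)
# prints: 2580480 658688 2570000 5809168
```

Most of the parameters sit in the token embedding and in the output layer, both of which scale with `VOCAB_SIZE`; the single Transformer block accounts for only about 11% of the total.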

In [25]:
if LOAD_MODEL:
    # model.load_weights('./models/model')
    gpt = models.load_model("./models/gpt", compile=True)

## Train the Transformer <a name="train"></a>

In [26]:
# Create a TextGenerator callback
class TextGenerator(callbacks.Callback):
    def __init__(self, index_to_word, top_k=10):
        super().__init__()
        self.index_to_word = index_to_word
        self.word_to_index = {
            word: index for index, word in enumerate(index_to_word)
        }

    def sample_from(self, probs, temperature):
        # temperature controls randomness: values below 1 sharpen the distribution
        probs = probs ** (1 / temperature)
        probs = probs / np.sum(probs)
        return np.random.choice(len(probs), p=probs), probs

    def generate(self, start_prompt, max_tokens, temperature):
        start_tokens = [
            self.word_to_index.get(x, 1) for x in start_prompt.split()
        ]
        sample_token = None
        info = []
        while len(start_tokens) < max_tokens and sample_token != 0:
            x = np.array([start_tokens])
            y, att = self.model.predict(x, verbose=0)
            sample_token, probs = self.sample_from(y[0][-1], temperature)
            info.append(
                {
                    "prompt": start_prompt,
                    "word_probs": probs,
                    "atts": att[0, :, -1, :],
                }
            )
            start_tokens.append(sample_token)
            start_prompt = start_prompt + " " + self.index_to_word[sample_token]
        print(f"\ngenerated text:\n{start_prompt}\n")
        return info

    def on_epoch_end(self, epoch, logs=None):
        self.generate("wine review", max_tokens=80, temperature=1.0)

In [27]:
# Create a model save checkpoint
model_checkpoint_callback = callbacks.ModelCheckpoint(
    filepath="./checkpoint/checkpoint.ckpt",
    save_weights_only=True,
    save_freq="epoch",
    verbose=0,
)

tensorboard_callback = callbacks.TensorBoard(log_dir="./logs")

# Create the text-generation callback from the vocabulary
text_generator = TextGenerator(vocab)

In [28]:
gpt.fit(
    train_ds,
    epochs=EPOCHS,
    callbacks=[model_checkpoint_callback, tensorboard_callback, text_generator],
)

Epoch 1/5
generated text:
wine review : chile : casablanca valley : syrah : excellent things off - value , with spice notes , and it feels fine , and the rest of the movement in leyda valley . this mixes briny , briny and briny flavors of citrus and berry . its finish is fresh but limited . 

Epoch 2/5
generated text:
wine review : spain : northern spain : verdejo : verdejo can [UNK] particular fruit like mas ' s sugar , enough on the nose , then limited in scope . the palate is soft , round and easy , while the finish is fresh and better than past vintages but underwhelming . 

Epoch 3/5
generated text:
wine review : us : california : cabernet sauvignon : this is a stellar example from one of edna valley ' s great vineyard sites , an interesting white blend , whole - cluster - pressed and fermented , which is 100 % cabernet sauvignon . it ' s fruity and woody in the aroma and finishes with a long , spicy note . 

Epoch 4/5
generated text:
wine review : portugal : ribatejo : portuguese

<keras.src.callbacks.History at 0x2c44dc8e0>

In [29]:
# Save the final model
gpt.save("./models/gpt")

INFO:tensorflow:Assets written to: ./models/gpt/assets




## Generate text with the Transformer

In [30]:
def print_probs(info, vocab, top_k=5):
    for step in info:
        # Average the attention weights over the heads: one score per prompt token
        att_scores = np.mean(step["atts"], axis=0)
        max_score = max(att_scores)
        highlighted_text = []
        for word, att_score in zip(step["prompt"].split(), att_scores):
            highlighted_text.append(
                '<span style="background-color:rgba(135,206,250,'
                + str(att_score / max_score)
                + ');">'
                + word
                + "</span>"
            )
        highlighted_text = " ".join(highlighted_text)
        display(HTML(highlighted_text))

        # Print the top_k most probable next tokens for this step
        word_probs = step["word_probs"]
        p_sorted = np.sort(word_probs)[::-1][:top_k]
        i_sorted = np.argsort(word_probs)[::-1][:top_k]
        for p, idx in zip(p_sorted, i_sorted):
            print(f"{vocab[idx]}:   \t{np.round(100*p,2)}%")
        print("--------\n")

In [59]:
info = text_generator.generate(
    "wine review : fr", max_tokens=80, temperature=1.0
)


generated text:
wine review : fr : oregon : pinot noir : all [UNK] is billed as a to the wood lover ' s wine , without elegance . the brothers have been ponzi in their bottles of his winemaking , and the acidity called [UNK] , the wine sources that most of this dundee vineyard ' s overall reserve wine to add power . 



In [32]:
info = text_generator.generate(
    "wine review : italy", max_tokens=80, temperature=0.5
)


generated text:
wine review : italy : tuscany : sangiovese : this opens with a fragrance of wild berry , leather , menthol and a whiff of leather . the palate delivers black cherry , espresso and star anise alongside firm tannins and bright acidity . drink through 2019 . 



In [33]:
info = text_generator.generate(
    "wine review : germany", max_tokens=80, temperature=0.5
)
print_probs(info, vocab)


generated text:
wine review : germany : rheingau : riesling : a hint of mineral and crushed stone lend a savory tone to this dry , full - bodied wine . it ' s crisp and bright on the palate , with bright acidity and a light touch of white peach and nectarine flavors . drink now . 



::   	100.0%
-:   	0.0%
,:   	0.0%
grosso:   	0.0%
and:   	0.0%
--------



mosel:   	58.67%
rheingau:   	26.67%
rheinhessen:   	11.72%
pfalz:   	2.34%
nahe:   	0.28%
--------



::   	100.0%
-:   	0.0%
other:   	0.0%
grosso:   	0.0%
neagra:   	0.0%
--------



riesling:   	99.65%
pinot:   	0.23%
spätburgunder:   	0.11%
rosé:   	0.0%
white:   	0.0%
--------



::   	100.0%
-:   	0.0%
blanc:   	0.0%
grosso:   	0.0%
du:   	0.0%
--------



a:   	38.69%
while:   	21.44%
this:   	9.13%
the:   	6.46%
hints:   	3.31%
--------



hint:   	23.98%
bit:   	16.47%
touch:   	13.82%
wisp:   	8.52%
whiff:   	7.1%
--------



of:   	100.0%
-:   	0.0%
at:   	0.0%
on:   	0.0%
sweetness:   	0.0%
--------



honey:   	14.37%
smoke:   	13.3%
crushed:   	9.1%
petrol:   	6.29%
green:   	6.04%
--------



oil:   	36.47%
and:   	17.73%
notes:   	9.23%
clings:   	8.84%
lend:   	6.74%
--------



crushed:   	32.2%
steel:   	20.8%
spice:   	4.7%
petrol:   	4.26%
dried:   	3.41%
--------



stone:   	77.63%
granite:   	10.46%
slate:   	3.94%
minerals:   	2.46%
mineral:   	2.26%
--------



lend:   	43.81%
accent:   	26.79%
add:   	15.03%
mark:   	6.61%
mingle:   	2.16%
--------



a:   	87.15%
an:   	5.5%
complexity:   	4.9%
savoriness:   	0.56%
depth:   	0.36%
--------



savory:   	53.61%
cool:   	14.28%
fresh:   	11.38%
crisp:   	6.43%
slightly:   	1.97%
--------



tone:   	97.88%
,:   	1.52%
character:   	0.19%
feel:   	0.14%
edge:   	0.13%
--------



to:   	99.99%
of:   	0.01%
,:   	0.0%
that:   	0.0%
.:   	0.0%
--------



this:   	97.65%
the:   	1.46%
fresh:   	0.29%
ripe:   	0.21%
bright:   	0.11%
--------



dry:   	86.36%
off:   	7.17%
wine:   	2.25%
otherwise:   	1.31%
light:   	0.51%
--------



,:   	99.35%
riesling:   	0.24%
-:   	0.16%
wine:   	0.14%
and:   	0.06%
--------



medium:   	63.72%
full:   	12.68%
crisp:   	6.54%
lusciously:   	2.37%
delicately:   	1.68%
--------



-:   	99.97%
bodied:   	0.03%
of:   	0.01%
wine:   	0.0%
,:   	0.0%
--------



bodied:   	100.0%
flavored:   	0.0%
force:   	0.0%
in:   	0.0%
and:   	0.0%
--------



riesling:   	89.29%
wine:   	8.9%
dry:   	0.6%
kabinett:   	0.43%
auslese:   	0.32%
--------



.:   	97.28%
that:   	2.51%
from:   	0.16%
,:   	0.04%
made:   	0.01%
--------



it:   	92.42%
the:   	5.03%
its:   	0.85%
crisp:   	0.49%
ripe:   	0.33%
--------



':   	99.97%
has:   	0.02%
balances:   	0.0%
juxtaposes:   	0.0%
is:   	0.0%
--------



s:   	100.0%
ll:   	0.0%
bursts:   	0.0%
[UNK]:   	0.0%
has:   	0.0%
--------



juicy:   	39.96%
crisp:   	12.03%
rich:   	11.35%
a:   	8.74%
concentrated:   	3.86%
--------



and:   	78.89%
,:   	20.65%
with:   	0.16%
in:   	0.13%
on:   	0.12%
--------



fresh:   	25.53%
refreshing:   	18.39%
clean:   	10.37%
cutting:   	9.52%
juicy:   	6.12%
--------



,:   	59.22%
on:   	20.35%
in:   	16.95%
with:   	3.46%
at:   	0.01%
--------



the:   	99.99%
its:   	0.0%
acidity:   	0.0%
entry:   	0.0%
a:   	0.0%
--------



palate:   	99.52%
finish:   	0.31%
nose:   	0.14%
midpalate:   	0.02%
tongue:   	0.01%
--------



,:   	99.33%
with:   	0.54%
and:   	0.07%
but:   	0.02%
.:   	0.02%
--------



with:   	63.61%
it:   	25.59%
but:   	4.81%
yet:   	2.76%
finishing:   	1.62%
--------



a:   	86.16%
lemon:   	4.59%
crisp:   	1.21%
hints:   	1.0%
fresh:   	0.82%
--------



lemon:   	34.01%
acidity:   	28.78%
lime:   	9.02%
,:   	7.91%
citrus:   	7.51%
--------



and:   	94.44%
,:   	4.01%
that:   	1.05%
.:   	0.38%
highlighting:   	0.07%
--------



a:   	99.38%
an:   	0.28%
crisp:   	0.04%
citrus:   	0.04%
lemon:   	0.02%
--------



lingering:   	36.36%
long:   	33.48%
brisk:   	5.52%
touch:   	3.17%
hint:   	3.13%
--------



,:   	57.15%
touch:   	25.93%
dusting:   	4.3%
-:   	3.98%
body:   	3.78%
--------



of:   	99.9%
.:   	0.04%
to:   	0.04%
,:   	0.01%
that:   	0.0%
--------



lemon:   	19.89%
citrus:   	17.37%
honey:   	13.81%
sweetness:   	12.94%
minerality:   	11.1%
--------



peach:   	74.44%
grapefruit:   	21.87%
cherry:   	1.38%
pepper:   	0.74%
fruit:   	0.59%
--------



and:   	97.72%
.:   	1.56%
,:   	0.65%
flavor:   	0.04%
flavors:   	0.02%
--------



melon:   	22.34%
nectarine:   	14.41%
citrus:   	11.32%
apple:   	11.21%
apricot:   	10.42%
--------



flavors:   	98.84%
.:   	0.92%
flavor:   	0.07%
notes:   	0.06%
fruit:   	0.05%
--------



.:   	97.38%
,:   	1.4%
that:   	1.21%
on:   	0.01%
finish:   	0.0%
--------



drink:   	43.42%
it:   	23.32%
:   	20.59%
finishes:   	9.64%
the:   	2.48%
--------



now:   	99.95%
it:   	0.03%
up:   	0.01%
now–2017:   	0.0%
soon:   	0.0%
--------



.:   	76.18%
through:   	23.44%
and:   	0.14%
for:   	0.09%
to:   	0.05%
--------



:   	100.0%
drink:   	0.0%
it:   	0.0%
the:   	0.0%
enjoy:   	0.0%
--------



In [34]:
info = text_generator.generate(
    "wine review : france", max_tokens=80, temperature=0.5
)
print_probs(info, vocab)


generated text:
wine review : france : bordeaux : bordeaux - style red blend : this is a ripe , fruity wine that has attractive black - currant flavors and a soft texture . the tannins are cushioned by the rich texture of the wine , giving a wine that is already delicious . drink from 2017 . 



::   	100.0%
grosso:   	0.0%
,:   	0.0%
other:   	0.0%
and:   	0.0%
--------



alsace:   	55.36%
bordeaux:   	24.58%
burgundy:   	9.27%
loire:   	3.17%
southwest:   	2.77%
--------



::   	100.0%
-:   	0.0%
.:   	0.0%
,:   	0.0%
other:   	0.0%
--------



bordeaux:   	98.32%
rosé:   	1.57%
merlot:   	0.06%
sauvignon:   	0.03%
cabernet:   	0.01%
--------



-:   	100.0%
::   	0.0%
blend:   	0.0%
white:   	0.0%
red:   	0.0%
--------



style:   	100.0%
blend:   	0.0%
bordeaux:   	0.0%
[UNK]:   	0.0%
wine:   	0.0%
--------



red:   	98.28%
white:   	1.72%
of:   	0.0%
black:   	0.0%
[UNK]:   	0.0%
--------



blend:   	100.0%
::   	0.0%
-:   	0.0%
wine:   	0.0%
sauvignon:   	0.0%
--------



::   	100.0%
(:   	0.0%
grosso:   	0.0%
-:   	0.0%
,:   	0.0%
--------



this:   	54.06%
a:   	26.2%
the:   	14.51%
with:   	1.23%
87:   	0.61%
--------



is:   	87.71%
wine:   	11.36%
has:   	0.3%
blend:   	0.29%
ripe:   	0.19%
--------



a:   	98.63%
an:   	1.08%
the:   	0.19%
ripe:   	0.04%
rich:   	0.02%
--------



ripe:   	47.19%
wine:   	16.17%
rich:   	13.83%
structured:   	4.51%
dense:   	2.27%
--------



,:   	72.46%
and:   	19.84%
wine:   	7.69%
fruity:   	0.0%
while:   	0.0%
--------



fruity:   	49.94%
rich:   	11.85%
juicy:   	8.66%
full:   	7.22%
smooth:   	3.32%
--------



wine:   	97.56%
and:   	2.13%
,:   	0.3%
style:   	0.0%
blend:   	0.0%
--------



with:   	31.2%
that:   	26.46%
,:   	24.88%
.:   	16.61%
from:   	0.81%
--------



has:   	62.98%
is:   	29.59%
shows:   	3.48%
':   	2.75%
will:   	0.41%
--------



a:   	64.85%
attractive:   	14.67%
some:   	3.88%
plenty:   	2.54%
the:   	1.67%
--------



red:   	48.21%
acidity:   	22.36%
black:   	12.98%
tannins:   	12.77%
berry:   	0.86%
--------



-:   	69.08%
currant:   	24.77%
fruits:   	4.18%
currants:   	0.86%
plum:   	0.86%
--------



currant:   	98.99%
plum:   	0.96%
cherry:   	0.04%
fruit:   	0.01%
berry:   	0.0%
--------



flavors:   	71.3%
fruits:   	16.31%
flavor:   	3.82%
fruitiness:   	2.7%
fruit:   	2.68%
--------



and:   	70.78%
.:   	22.52%
that:   	4.21%
,:   	1.9%
as:   	0.28%
--------



a:   	84.57%
attractive:   	2.75%
acidity:   	2.48%
juicy:   	2.09%
balanced:   	1.21%
--------



juicy:   	15.25%
touch:   	11.4%
smooth:   	9.13%
crisp:   	8.42%
light:   	7.68%
--------



texture:   	83.59%
,:   	13.17%
structure:   	0.73%
character:   	0.52%
edge:   	0.49%
--------



.:   	99.71%
that:   	0.24%
of:   	0.03%
,:   	0.02%
and:   	0.0%
--------



it:   	70.84%
the:   	25.45%
drink:   	1.9%
with:   	0.76%
a:   	0.37%
--------



wine:   	57.04%
tannins:   	18.39%
acidity:   	16.97%
wood:   	3.62%
aftertaste:   	1.35%
--------



are:   	98.49%
and:   	0.96%
give:   	0.35%
have:   	0.05%
,:   	0.03%
--------



already:   	50.19%
well:   	17.27%
balanced:   	10.69%
cushioned:   	6.93%
still:   	2.96%
--------



by:   	99.99%
with:   	0.0%
,:   	0.0%
in:   	0.0%
into:   	0.0%
--------



the:   	99.84%
a:   	0.08%
ripe:   	0.03%
attractive:   	0.01%
rich:   	0.01%
--------



ripe:   	26.15%
rich:   	23.11%
generous:   	16.87%
attractive:   	8.64%
acidity:   	4.96%
--------



texture:   	35.38%
fruit:   	18.17%
fruitiness:   	13.73%
black:   	9.3%
berry:   	8.16%
--------



,:   	34.84%
and:   	30.43%
.:   	23.62%
of:   	9.53%
that:   	1.2%
--------



the:   	96.6%
this:   	2.86%
cabernet:   	0.44%
a:   	0.05%
ripe:   	0.01%
--------



wine:   	89.36%
vintage:   	6.51%
cabernet:   	1.95%
fruit:   	1.29%
ripe:   	0.18%
--------



.:   	86.46%
,:   	11.02%
that:   	1.34%
and:   	0.42%
is:   	0.39%
--------



with:   	53.19%
giving:   	28.22%
which:   	6.69%
but:   	3.4%
while:   	2.62%
--------



a:   	96.61%
the:   	1.42%
an:   	1.4%
it:   	0.3%
spice:   	0.05%
--------



wine:   	65.33%
lift:   	10.92%
fine:   	2.98%
ripe:   	2.17%
smooth:   	2.06%
--------



that:   	99.44%
with:   	0.31%
,:   	0.13%
to:   	0.05%
.:   	0.04%
--------



will:   	48.54%
is:   	47.06%
needs:   	1.4%
has:   	1.31%
':   	0.71%
--------



ready:   	97.55%
already:   	1.97%
developing:   	0.18%
now:   	0.05%
likely:   	0.04%
--------



ready:   	31.69%
delicious:   	30.17%
attractive:   	16.57%
drinkable:   	10.96%
accessible:   	4.69%
--------



.:   	98.74%
,:   	0.66%
to:   	0.3%
and:   	0.26%
now:   	0.01%
--------



drink:   	99.31%
:   	0.43%
it:   	0.16%
the:   	0.09%
a:   	0.0%
--------



from:   	97.99%
now:   	1.79%
starting:   	0.19%
this:   	0.03%
soon:   	0.0%
--------



2017:   	68.59%
2018:   	17.3%
2016:   	7.52%
2015:   	2.07%
2019:   	1.78%
--------



.:   	100.0%
and:   	0.0%
to:   	0.0%
,:   	0.0%
for:   	0.0%
--------



:   	100.0%
the:   	0.0%
it:   	0.0%
andré:   	0.0%
george:   	0.0%
--------



In [40]:
info = text_generator.generate(
    "wine review : england", max_tokens=90, temperature=0.5
)
print_probs(info, vocab)


generated text:
wine review : england : england : chardonnay : a touch of apple scent the nose . the palate is all [UNK] . the same , however , the result is a slender , taut and slender , dry and whistle - clean , refreshing and the palate is whistle - clean , pure lemon and lime zest . there are glimpses of white grapefruit and quince , while the finish is persistent . 



::   	100.0%
-:   	0.0%
and:   	0.0%
grosso:   	0.0%
,:   	0.0%
--------



england:   	100.0%
france:   	0.0%
us:   	0.0%
central:   	0.0%
alsace:   	0.0%
--------



::   	100.0%
and:   	0.0%
-:   	0.0%
,:   	0.0%
generosity:   	0.0%
--------



sparkling:   	53.02%
chardonnay:   	27.58%
pinot:   	17.35%
riesling:   	0.73%
sauvignon:   	0.43%
--------



::   	99.98%
-:   	0.02%
,:   	0.0%
and:   	0.0%
blend:   	0.0%
--------



a:   	47.08%
the:   	27.38%
while:   	4.11%
there:   	2.96%
an:   	2.21%
--------



touch:   	93.52%
hint:   	2.63%
little:   	0.7%
trio:   	0.68%
haze:   	0.61%
--------



of:   	99.92%
on:   	0.05%
and:   	0.01%
,:   	0.01%
[UNK]:   	0.0%
--------



golden:   	26.43%
green:   	12.88%
ripe:   	9.48%
the:   	8.92%
a:   	5.38%
--------



and:   	16.85%
hovers:   	11.39%
peel:   	10.5%
plays:   	9.69%
play:   	7.18%
--------



the:   	87.89%
and:   	2.72%
this:   	2.45%
kicks:   	1.77%
plays:   	1.52%
--------



nose:   	99.25%
scent:   	0.29%
first:   	0.07%
mix:   	0.06%
aromatic:   	0.06%
--------



is:   	25.61%
.:   	23.65%
and:   	21.85%
,:   	8.38%
promises:   	5.18%
--------



the:   	98.65%
on:   	0.68%
there:   	0.36%
these:   	0.13%
this:   	0.03%
--------



palate:   	98.79%
mellowness:   	0.28%
frothy:   	0.25%
ripe:   	0.2%
creamy:   	0.12%
--------



is:   	80.58%
follows:   	4.56%
continues:   	4.04%
counters:   	1.86%
reveals:   	1.57%
--------



joined:   	48.65%
more:   	6.66%
tart:   	5.99%
all:   	4.0%
where:   	3.26%
--------



there:   	23.69%
about:   	19.59%
of:   	17.32%
the:   	12.89%
[UNK]:   	9.47%
--------



,:   	45.24%
and:   	9.48%
in:   	8.66%
the:   	6.54%
::   	5.93%
--------



the:   	93.14%
there:   	2.22%
this:   	0.94%
these:   	0.63%
tart:   	0.41%
--------



palate:   	44.68%
same:   	13.35%
fruit:   	8.56%
slender:   	6.5%
body:   	4.67%
--------



,:   	42.45%
notes:   	11.24%
notions:   	7.85%
aromatic:   	5.95%
ripe:   	5.62%
--------



however:   	53.97%
the:   	12.69%
but:   	12.07%
and:   	4.54%
slender:   	2.18%
--------



,:   	99.99%
the:   	0.0%
and:   	0.0%
is:   	0.0%
that:   	0.0%
--------



is:   	59.68%
the:   	27.44%
there:   	2.83%
this:   	2.06%
where:   	1.89%
--------



palate:   	28.99%
flavours:   	13.36%
body:   	11.54%
freshness:   	6.4%
fruit:   	6.36%
--------



is:   	98.76%
of:   	1.23%
reveals:   	0.0%
has:   	0.0%
,:   	0.0%
--------



a:   	89.27%
an:   	2.89%
more:   	1.86%
fresh:   	1.22%
that:   	0.92%
--------



little:   	16.29%
slender:   	15.14%
very:   	10.72%
fresh:   	7.62%
[UNK]:   	3.58%
--------



,:   	94.05%
and:   	3.16%
but:   	1.47%
wine:   	1.25%
[UNK]:   	0.03%
--------



taut:   	55.84%
fresh:   	14.63%
dry:   	6.47%
translucent:   	5.01%
concentrated:   	2.31%
--------



and:   	59.51%
,:   	38.39%
wine:   	1.21%
but:   	0.84%
expression:   	0.03%
--------



streamlined:   	16.0%
linear:   	14.74%
slender:   	11.82%
fresh:   	10.18%
dry:   	6.42%
--------



,:   	73.51%
wine:   	16.55%
but:   	4.75%
body:   	2.51%
palate:   	1.18%
--------



dry:   	64.74%
taut:   	8.45%
translucent:   	7.23%
but:   	3.61%
fresh:   	2.57%
--------



and:   	60.09%
,:   	18.05%
wine:   	17.36%
but:   	2.09%
core:   	1.2%
--------



slender:   	14.22%
precisely:   	10.45%
fresh:   	8.52%
taut:   	7.91%
crisp:   	6.9%
--------



-:   	100.0%
that:   	0.0%
,:   	0.0%
clean:   	0.0%
.:   	0.0%
--------



clean:   	100.0%
refreshing:   	0.0%
crisp:   	0.0%
superclean:   	0.0%
[UNK]:   	0.0%
--------



,:   	85.67%
.:   	9.63%
wine:   	2.97%
palate:   	0.49%
but:   	0.28%
--------



stony:   	17.22%
fresh:   	15.31%
refreshing:   	14.41%
but:   	10.99%
pure:   	7.13%
--------



and:   	77.55%
,:   	12.02%
finish:   	4.39%
wine:   	2.21%
.:   	1.67%
--------



streamlined:   	14.79%
whistle:   	12.16%
linear:   	9.91%
clean:   	9.58%
stony:   	6.85%
--------



finish:   	66.42%
palate:   	10.85%
slender:   	4.01%
slightest:   	3.27%
lasting:   	3.15%
--------



is:   	84.33%
has:   	4.28%
offers:   	3.09%
displays:   	2.6%
shows:   	1.57%
--------



whistle:   	42.01%
streamlined:   	41.74%
pervaded:   	6.03%
slender:   	3.14%
taut:   	1.6%
--------



-:   	99.9%
clean:   	0.09%
,:   	0.01%
of:   	0.0%
that:   	0.0%
--------



clean:   	100.0%
cleansing:   	0.0%
refreshing:   	0.0%
superclean:   	0.0%
tingling:   	0.0%
--------



,:   	48.62%
and:   	43.91%
.:   	6.91%
but:   	0.45%
with:   	0.1%
--------



refreshing:   	17.62%
whistle:   	16.63%
pure:   	11.65%
stony:   	8.44%
fresh:   	8.04%
--------



and:   	36.23%
lemon:   	33.41%
fruit:   	16.25%
flavors:   	4.3%
,:   	3.96%
--------



and:   	45.95%
-:   	34.67%
freshness:   	9.94%
zest:   	2.47%
zestiness:   	2.31%
--------



lime:   	82.47%
apple:   	6.12%
grapefruit:   	3.43%
green:   	2.49%
tart:   	1.22%
--------



zest:   	55.77%
.:   	19.63%
flavors:   	7.06%
,:   	6.06%
that:   	4.33%
--------



.:   	69.36%
,:   	14.44%
flavors:   	12.48%
that:   	2.32%
notes:   	0.7%
--------



the:   	35.97%
this:   	25.52%
there:   	14.65%
it:   	6.77%
:   	5.48%
--------



are:   	51.39%
is:   	37.87%
':   	10.61%
also:   	0.1%
will:   	0.01%
--------



glimpses:   	65.3%
some:   	4.91%
lovely:   	4.21%
hints:   	4.0%
a:   	3.93%
--------



of:   	99.92%
by:   	0.04%
that:   	0.02%
,:   	0.01%
like:   	0.01%
--------



fresh:   	19.94%
red:   	10.05%
tart:   	9.46%
white:   	6.87%
more:   	5.96%
--------



pepper:   	35.84%
peach:   	28.5%
grapefruit:   	13.15%
stone:   	5.18%
cherry:   	5.04%
--------



and:   	77.41%
,:   	10.57%
pith:   	8.13%
zest:   	2.28%
.:   	0.36%
--------



green:   	28.14%
lemon:   	24.55%
a:   	8.72%
stone:   	5.26%
tart:   	4.83%
--------



,:   	86.71%
.:   	7.14%
flavors:   	2.45%
notes:   	1.32%
and:   	0.67%
--------



while:   	30.49%
but:   	29.56%
with:   	27.47%
and:   	6.25%
all:   	1.97%
--------



the:   	80.98%
a:   	8.69%
this:   	6.91%
it:   	1.33%
there:   	0.83%
--------



finish:   	95.27%
midpalate:   	2.27%
body:   	0.72%
palate:   	0.23%
texture:   	0.22%
--------



is:   	99.69%
provides:   	0.08%
makes:   	0.04%
remains:   	0.03%
does:   	0.02%
--------



whistle:   	73.23%
long:   	5.84%
clean:   	5.72%
dry:   	5.2%
[UNK]:   	1.17%
--------



and:   	54.21%
.:   	18.12%
,:   	17.14%
but:   	10.36%
with:   	0.09%
--------



:   	88.31%
drink:   	11.25%
this:   	0.29%
it:   	0.05%
the:   	0.03%
--------
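
The call above passes `temperature=0.5`, and the printed distributions are often sharply peaked. The sketch below shows one common way a temperature can reshape a next-token distribution: raise each probability to the power `1 / T`, then renormalize. This is an assumption about how `text_generator` works, not a copy of its code; `apply_temperature`, `sample_token`, and the probability values are hypothetical and chosen only for illustration.

```python
import random


def apply_temperature(probs, temperature):
    """Rescale a probability distribution as p ** (1 / T), then renormalize."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [p ** (1.0 / temperature) for p in probs]
    total = sum(scaled)
    return [s / total for s in scaled]


def sample_token(probs, temperature, rng=random):
    """Draw one token index from the temperature-adjusted distribution."""
    adjusted = apply_temperature(probs, temperature)
    return rng.choices(range(len(adjusted)), weights=adjusted, k=1)[0]


# Hypothetical next-token probabilities (illustrative values, not model output)
probs = [0.5, 0.3, 0.2]

sharpened = apply_temperature(probs, 0.5)  # T < 1: mass moves to the top token
unchanged = apply_temperature(probs, 1.0)  # T = 1: distribution is left as-is
flattened = apply_temperature(probs, 2.0)  # T > 1: distribution gets flatter

print([round(p, 4) for p in sharpened])
print([round(p, 4) for p in flattened])
```

With a temperature below 1 the most likely token becomes even more likely, which makes generation more repetitive but more predictable. A temperature above 1 does the opposite.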



In [60]:
import tensorflow as tf

# Setup: a toy source sentence with hard-coded token ids (no real tokenizer is used)
D_MODEL = 256  # Size of the embeddings and hidden states in this example
sentence = "Salut, comment vas-tu ?"  # Shown for reference only; not used below
embedding_layer = tf.keras.layers.Embedding(5000, D_MODEL)  # Example vocabulary size
tokenized_sentence = [1, 34, 56, 78, 3]
embedded_sentence = embedding_layer(tf.convert_to_tensor([tokenized_sentence]))

# Encoder: compute one hidden state per input token
encoder_rnn = tf.keras.layers.SimpleRNN(D_MODEL, return_sequences=True)
encoder_states = encoder_rnn(embedded_sentence)

# Query, key, and value projections of the encoder states
Q = tf.keras.layers.Dense(D_MODEL, name="query")(encoder_states)
K = tf.keras.layers.Dense(D_MODEL, name="key")(encoder_states)
V = tf.keras.layers.Dense(D_MODEL, name="value")(encoder_states)

# Encoder self-attention: softmax(Q K^T / sqrt(d)) V
QK = tf.matmul(Q, K, transpose_b=True)
QK_normalized = QK / tf.math.sqrt(tf.cast(D_MODEL, tf.float32))
softmax = tf.nn.softmax(QK_normalized)
attention_encoder = tf.matmul(softmax, V)

# Decoder: assume there is a reply that the decoder is trying to generate
decoder_sentence = "Je vais bien, merci."  # Example reply; not used below
tokenized_decoder_sentence = [2, 35, 57, 79, 4]
embedded_decoder_sentence = embedding_layer(tf.convert_to_tensor([tokenized_decoder_sentence]))

# Compute the decoder hidden states
decoder_rnn = tf.keras.layers.SimpleRNN(D_MODEL, return_sequences=True)
decoder_states = decoder_rnn(embedded_decoder_sentence)

# Cross-attention: the query comes from the decoder, the keys and values from the encoder
Q_decoder = tf.keras.layers.Dense(D_MODEL, name="query_decoder")(decoder_states)

# Decoder attention over the encoder keys and values
QK_decoder = tf.matmul(Q_decoder, K, transpose_b=True)
QK_normalized_decoder = QK_decoder / tf.math.sqrt(tf.cast(D_MODEL, tf.float32))
softmax_decoder = tf.nn.softmax(QK_normalized_decoder)
attention_decoder = tf.matmul(softmax_decoder, V)

# All layers are untrained, so the printed values depend on the random initialization
print(attention_decoder)


tf.Tensor(
[[[ 0.05007175 -0.00486111  0.02492413 ...  0.00042409 -0.0139338
   -0.00778311]
  [ 0.05011487 -0.00488082  0.02498374 ...  0.00042968 -0.0139846
   -0.0077955 ]
  [ 0.05002842 -0.00485839  0.02475856 ...  0.00043114 -0.01390447
   -0.00771669]
  [ 0.05012161 -0.00482507  0.02490799 ...  0.00038137 -0.01394735
   -0.00781874]
  [ 0.05009252 -0.00488605  0.02477339 ...  0.0004489  -0.01395305
   -0.00768763]]], shape=(1, 5, 256), dtype=float32)
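
The five rows of the tensor above are nearly identical. One plausible reading, given that every layer is untrained, is that the scaled scores are all close to zero, so the softmax weights are close to uniform and each position receives roughly the mean of the value vectors. The pure-Python sketch below reproduces that effect on a tiny example. The function names and the `Q`, `K`, `V` values are hypothetical and unrelated to the TensorFlow cell.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(Q, K, V):
    """Return (output, weights) for softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(K[0])
    weights = []
    for q in Q:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K
        ]
        weights.append(softmax(scores))
    output = [
        [sum(w * v[j] for w, v in zip(w_row, V)) for j in range(len(V[0]))]
        for w_row in weights
    ]
    return output, weights


# Hypothetical small-magnitude queries and keys, mimicking untrained projections
Q = [[0.01, 0.02], [0.03, -0.01]]
K = [[0.02, 0.01], [-0.01, 0.03], [0.00, 0.02]]
V = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

output, weights = scaled_dot_product_attention(Q, K, V)
mean_v = [sum(col) / len(V) for col in zip(*V)]  # Column-wise mean of V

print(weights)  # Each row is close to [1/3, 1/3, 1/3]
print(output)   # Each row is close to mean_v
```

After training, the query and key projections would normally produce larger and more varied scores, so the attention rows would stop collapsing onto the same average.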
