# Exercise project 5 - Transformer networks

After building the basic transformer model, I wanted to add a simple GUI for real-time translation. The goal was to extend the previous model with a user interface for interactive translation from English to Spanish. The architecture and training process are the same as in the previous notebook; the additions are described below.

The text is tokenized and preprocessed as before, except that all text pairs are converted to lowercase. The model keeps the same encoder-decoder structure. The GUI is built with ipywidgets for user input and translation output, and translations are decoded with `keras_nlp.samplers.GreedySampler`.
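
To show what greedy decoding means here, below is a minimal pure-Python sketch. It is my own toy illustration, not KerasNLP's implementation: `greedy_decode`, `toy_scores`, and the five-token vocabulary are made-up names and values. At each step the highest-scoring token is appended, and decoding stops at `[END]` or at the maximum length.

```python
def greedy_decode(next_token_scores, start_id, end_id, max_length):
    """Append the highest-scoring next token until end_id or max_length is reached."""
    tokens = [start_id]
    while len(tokens) < max_length:
        scores = next_token_scores(tokens)  # one score per vocabulary id
        best_id = max(range(len(scores)), key=scores.__getitem__)
        tokens.append(best_id)
        if best_id == end_id:
            break
    return tokens


# Hypothetical toy "model" over a 5-token vocabulary:
# 0=[PAD], 1=[UNK], 2=[START], 3=[END], 4="hola"
def toy_scores(tokens):
    if tokens[-1] == 2:  # right after [START], prefer "hola"
        return [0.0, 0.1, 0.0, 0.2, 0.7]
    return [0.0, 0.1, 0.0, 0.8, 0.1]  # afterwards, prefer [END]


decoded = greedy_decode(toy_scores, start_id=2, end_id=3, max_length=40)
print(decoded)  # [2, 4, 3]
```

Greedy decoding is cheap and deterministic, but it never revisits an earlier choice, so one poor token early on can affect the rest of the sentence.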




In [None]:
!pip install -q --upgrade tensorflow
!pip install -q --upgrade rouge-score
!pip install -q --upgrade keras-nlp
!pip install -q --upgrade keras  # Upgrade to Keras 3.

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tf-keras 2.17.0 requires tensorflow<2.18,>=2.17, but you have tensorflow 2.18.0 which is incompatible.

In [None]:
import os
import pathlib
import random
import shutil

import keras
import keras_nlp
import tensorflow as tf
import tensorflow.data as tf_data
from keras import ops
from tensorflow_text.tools.wordpiece_vocab import (
    bert_vocab_from_dataset as bert_vocab,
)

In [None]:
folder_path = "/content/drive/MyDrive/deeplearning2024_VincenzinaSoos/ex_5"
os.chdir(folder_path)
data_path = os.path.join(folder_path, "data")

In [None]:
BATCH_SIZE = 64
EPOCHS = 10  # This should be at least 10 for convergence
MAX_SEQUENCE_LENGTH = 40
ENG_VOCAB_SIZE = 15000
SPA_VOCAB_SIZE = 15000

EMBED_DIM = 256
INTERMEDIATE_DIM = 2048
NUM_HEADS = 8

In [None]:
zip_file = keras.utils.get_file(
    fname="spa-eng.zip",
    origin="http://storage.googleapis.com/download.tensorflow.org/data/spa-eng.zip",
)

# Make sure the /data folder exists, then move the downloaded file into it
os.makedirs(data_path, exist_ok=True)
shutil.move(zip_file, os.path.join(data_path, "spa-eng.zip"))

# Extract the zip file in the /data folder
import zipfile

with zipfile.ZipFile(os.path.join(data_path, "spa-eng.zip"), "r") as zip_ref:
    zip_ref.extractall(data_path)

# Set the path to the extracted `spa.txt` file
text_file = pathlib.Path(data_path) / "spa-eng" / "spa.txt"

Downloading data from http://storage.googleapis.com/download.tensorflow.org/data/spa-eng.zip
2638744/2638744 - 1s 0us/step


In [None]:
# Read as UTF-8 so accented Spanish characters load correctly on any platform
with open(text_file, encoding="utf-8") as f:
    lines = f.read().split("\n")[:-1]
text_pairs = []
for line in lines:
    eng, spa = line.split("\t")
    eng = eng.lower()
    spa = spa.lower()
    text_pairs.append((eng, spa))

print("Loaded text pairs:", text_pairs[:5])  # Preview

Loaded text pairs: [('go.', 've.'), ('go.', 'vete.'), ('go.', 'vaya.'), ('go.', 'váyase.'), ('hi.', 'hola.')]


In [None]:
for _ in range(5):
    print(random.choice(text_pairs))

('i live pretty close to here.', 'yo vivo muy cerca de aquí.')
('tom helped mary start over again.', 'tom ayudó a mary a volver a comenzar.')
("he's afraid of his own shadow.", 'él tiene miedo hasta de su propia sombra.')
('tom took a sip of wine.', 'tom bebió un sorbo de vino.')
('can plants feel pain?', '¿pueden las plantas sentir dolor?')


In [None]:
random.shuffle(text_pairs)
num_val_samples = int(0.15 * len(text_pairs))
num_train_samples = len(text_pairs) - 2 * num_val_samples
train_pairs = text_pairs[:num_train_samples]
val_pairs = text_pairs[num_train_samples : num_train_samples + num_val_samples]
test_pairs = text_pairs[num_train_samples + num_val_samples :]

print(f"{len(text_pairs)} total pairs")
print(f"{len(train_pairs)} training pairs")
print(f"{len(val_pairs)} validation pairs")
print(f"{len(test_pairs)} test pairs")

118964 total pairs
83276 training pairs
17844 validation pairs
17844 test pairs


In [None]:
def train_word_piece(text_samples, vocab_size, reserved_tokens):
    word_piece_ds = tf_data.Dataset.from_tensor_slices(text_samples)
    vocab = keras_nlp.tokenizers.compute_word_piece_vocabulary(
        word_piece_ds.batch(1000).prefetch(2),
        vocabulary_size=vocab_size,
        reserved_tokens=reserved_tokens,
    )
    return vocab

In [None]:
reserved_tokens = ["[PAD]", "[UNK]", "[START]", "[END]"]

eng_samples = [text_pair[0] for text_pair in train_pairs]
eng_vocab = train_word_piece(eng_samples, ENG_VOCAB_SIZE, reserved_tokens)

spa_samples = [text_pair[1] for text_pair in train_pairs]
spa_vocab = train_word_piece(spa_samples, SPA_VOCAB_SIZE, reserved_tokens)

In [None]:
print("English Tokens: ", eng_vocab[100:110])
print("Spanish Tokens: ", spa_vocab[100:110])

English Tokens:  ['him', 'there', 'they', 'go', 'her', 'has', 'will', 're', 'time', 'll']
Spanish Tokens:  ['mi', 'qué', 'le', 'ella', 'te', 'para', 'mary', 'las', 'más', 'al']


In [None]:
eng_tokenizer = keras_nlp.tokenizers.WordPieceTokenizer(
    vocabulary=eng_vocab, lowercase=False
)
spa_tokenizer = keras_nlp.tokenizers.WordPieceTokenizer(
    vocabulary=spa_vocab, lowercase=False
)

In [None]:
eng_input_ex = text_pairs[0][0]
eng_tokens_ex = eng_tokenizer.tokenize(eng_input_ex)
print("English sentence: ", eng_input_ex)
print("Tokens: ", eng_tokens_ex)
print(
    "Recovered text after detokenizing: ",
    eng_tokenizer.detokenize(eng_tokens_ex),
)

print()

spa_input_ex = text_pairs[0][1]
spa_tokens_ex = spa_tokenizer.tokenize(spa_input_ex)
print("Spanish sentence: ", spa_input_ex)
print("Tokens: ", spa_tokens_ex)
print(
    "Recovered text after detokenizing: ",
    spa_tokenizer.detokenize(spa_tokens_ex),
)

English sentence:  languages aren't his forte.
Tokens:  tf.Tensor([1027  446    8   45   88   80 1548   11], shape=(8,), dtype=int32)
Recovered text after detokenizing:  languages aren ' t his forte .

Spanish sentence:  los idiomas no son su fuerte.
Tokens:  tf.Tensor([  97 1831   82  137   96  480   14], shape=(7,), dtype=int32)
Recovered text after detokenizing:  los idiomas no son su fuerte .
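
As a rough illustration of how a WordPiece tokenizer splits a single word, here is a toy sketch of greedy longest-match-first matching. The function name `wordpiece_tokenize` and the tiny `toy_vocab` are my own assumptions for the example; the real `WordPieceTokenizer` also splits on whitespace and punctuation first (which is why `aren't` comes back as `aren ' t` above) and works with the learned 15,000-token vocabularies.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]", suffix_prefix="##"):
    """Split one word into the longest matching vocabulary pieces, left to right."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = suffix_prefix + candidate  # continuation piece
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return [unk_token]  # no piece fits, so the whole word is unknown
        pieces.append(match)
        start = end
    return pieces


# Hypothetical toy vocabulary, not the one trained in this notebook.
toy_vocab = {"idioma", "fuerte", "lang", "##uage", "##s"}

print(wordpiece_tokenize("idiomas", toy_vocab))    # ['idioma', '##s']
print(wordpiece_tokenize("languages", toy_vocab))  # ['lang', '##uage', '##s']
print(wordpiece_tokenize("xyz", toy_vocab))        # ['[UNK]']
```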


In [None]:
def preprocess_batch(eng, spa):
    # Tokenize both languages, pad them, and build the model inputs and targets.
    eng = eng_tokenizer(eng)
    spa = spa_tokenizer(spa)

    # Pad `eng` to `MAX_SEQUENCE_LENGTH`.
    eng_start_end_packer = keras_nlp.layers.StartEndPacker(
        sequence_length=MAX_SEQUENCE_LENGTH,
        pad_value=eng_tokenizer.token_to_id("[PAD]"),
    )
    eng = eng_start_end_packer(eng)

    # Add special tokens (`"[START]"` and `"[END]"`) to `spa` and pad it as well.
    spa_start_end_packer = keras_nlp.layers.StartEndPacker(
        sequence_length=MAX_SEQUENCE_LENGTH + 1,
        start_value=spa_tokenizer.token_to_id("[START]"),
        end_value=spa_tokenizer.token_to_id("[END]"),
        pad_value=spa_tokenizer.token_to_id("[PAD]"),
    )
    spa = spa_start_end_packer(spa)

    return (
        {
            "encoder_inputs": eng,
            "decoder_inputs": spa[:, :-1],
        },
        spa[:, 1:],
    )


def make_dataset(pairs):
    eng_texts, spa_texts = zip(*pairs)
    eng_texts = list(eng_texts)
    spa_texts = list(spa_texts)
    dataset = tf_data.Dataset.from_tensor_slices((eng_texts, spa_texts))
    dataset = dataset.batch(BATCH_SIZE)
    dataset = dataset.map(preprocess_batch, num_parallel_calls=tf_data.AUTOTUNE)
    return dataset.shuffle(2048).prefetch(16).cache()


train_ds = make_dataset(train_pairs)
val_ds = make_dataset(val_pairs)

In [None]:
for inputs, targets in train_ds.take(1):
    print(f'inputs["encoder_inputs"].shape: {inputs["encoder_inputs"].shape}')
    print(f'inputs["decoder_inputs"].shape: {inputs["decoder_inputs"].shape}')
    print(f"targets.shape: {targets.shape}")

inputs["encoder_inputs"].shape: (64, 40)
inputs["decoder_inputs"].shape: (64, 40)
targets.shape: (64, 40)
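
The reason the Spanish side is packed to `MAX_SEQUENCE_LENGTH + 1` is that it gets split into two sequences shifted by one position: the decoder sees tokens up to position t and has to predict the token at t + 1. The sketch below shows this on one short sentence. The `pack` helper is a simplified stand-in I wrote for the example (it is not the `StartEndPacker` code), and `MAX_LEN = 6` and the token ids are made up.

```python
def pack(tokens, sequence_length, start_id, end_id, pad_id=0):
    """Add start/end ids, truncating if needed, then pad to sequence_length."""
    body = tokens[: sequence_length - 2]  # leave room for start and end
    packed = [start_id] + body + [end_id]
    return packed + [pad_id] * (sequence_length - len(packed))


MAX_LEN = 6  # small stand-in for MAX_SEQUENCE_LENGTH

# Ids follow the reserved token order: 0=[PAD], 1=[UNK], 2=[START], 3=[END]
spa = pack([97, 82, 14], MAX_LEN + 1, start_id=2, end_id=3)

decoder_inputs = spa[:-1]  # what the decoder is given
targets = spa[1:]          # what it should predict at each position

print(spa)             # [2, 97, 82, 14, 3, 0, 0]
print(decoder_inputs)  # [2, 97, 82, 14, 3, 0]
print(targets)         # [97, 82, 14, 3, 0, 0]
```

Both slices end up with length `MAX_LEN`, which matches the `(64, 40)` shapes printed above.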


In [None]:
# Encoder
encoder_inputs = keras.Input(shape=(None,), name="encoder_inputs")

x = keras_nlp.layers.TokenAndPositionEmbedding(
    vocabulary_size=ENG_VOCAB_SIZE,
    sequence_length=MAX_SEQUENCE_LENGTH,
    embedding_dim=EMBED_DIM,
)(encoder_inputs)

encoder_outputs = keras_nlp.layers.TransformerEncoder(
    intermediate_dim=INTERMEDIATE_DIM, num_heads=NUM_HEADS
)(inputs=x)
encoder = keras.Model(encoder_inputs, encoder_outputs)


# Decoder
decoder_inputs = keras.Input(shape=(None,), name="decoder_inputs")
encoded_seq_inputs = keras.Input(shape=(None, EMBED_DIM), name="decoder_state_inputs")

x = keras_nlp.layers.TokenAndPositionEmbedding(
    vocabulary_size=SPA_VOCAB_SIZE,
    sequence_length=MAX_SEQUENCE_LENGTH,
    embedding_dim=EMBED_DIM,
)(decoder_inputs)

x = keras_nlp.layers.TransformerDecoder(
    intermediate_dim=INTERMEDIATE_DIM, num_heads=NUM_HEADS
)(decoder_sequence=x, encoder_sequence=encoded_seq_inputs)
x = keras.layers.Dropout(0.5)(x)
decoder_outputs = keras.layers.Dense(SPA_VOCAB_SIZE, activation="softmax")(x)
decoder = keras.Model(
    [
        decoder_inputs,
        encoded_seq_inputs,
    ],
    decoder_outputs,
)
decoder_outputs = decoder([decoder_inputs, encoder_outputs])

transformer = keras.Model(
    [encoder_inputs, decoder_inputs],
    decoder_outputs,
    name="transformer",
)

In [None]:
transformer.summary()
transformer.compile(
    "rmsprop", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)
transformer.fit(train_ds, epochs=EPOCHS, validation_data=val_ds)

Epoch 1/10
1302/1302 - 126s 71ms/step - accuracy: 0.8179 - loss: 1.4815 - val_accuracy: 0.8655 - val_loss: 0.8086
Epoch 2/10
1302/1302 - 54s 42ms/step - accuracy: 0.8714 - loss: 0.7752 - val_accuracy: 0.8939 - val_loss: 0.6028
Epoch 3/10
1302/1302 - 54s 41ms/step - accuracy: 0.8930 - loss: 0.6130 - val_accuracy: 0.9043 - val_loss: 0.5280
Epoch 4/10
1302/1302 - 53s 41ms/step - accuracy: 0.9036 - loss: 0.5361 - val_accuracy: 0.9093 - val_loss: 0.4949
Epoch 5/10
1302/1302 - 54s 41ms/step - accuracy: 0.9100 - loss: 0.4907 - val_accuracy: 0.9121 - val_loss: 0.4826
Epoch 6/10
1302/1302 - 54s 41ms/step - accuracy: 0.9150 - loss: 0.4588 - val_accuracy: 0.9150 - val_loss: 0.4659
[output truncated]

<keras.src.callbacks.history.History at 0x7bf6dfb17670>

In [None]:
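def decode_sequences(input_sentences):
    # This helper decodes one sentence per call.
    batch_size = 1

    # Tokenize the encoder input, then truncate or pad it to MAX_SEQUENCE_LENGTH
    # so it matches the sequence length the embedding layer was built with.
    encoder_input_tokens = ops.convert_to_tensor(eng_tokenizer(input_sentences))
    encoder_input_tokens = encoder_input_tokens[:, :MAX_SEQUENCE_LENGTH]
    if len(encoder_input_tokens[0]) < MAX_SEQUENCE_LENGTH:
        # 0 is the id of "[PAD]", the first reserved token.
        pads = ops.full((1, MAX_SEQUENCE_LENGTH - len(encoder_input_tokens[0])), 0)
        encoder_input_tokens = ops.concatenate(
            [encoder_input_tokens, pads], 1
        )

    # Callback for the sampler: returns the model output for the next token.
    # The final Dense layer already applies softmax, so these are probabilities
    # rather than raw logits; greedy argmax selects the same token either way.
    def next_token(prompt, cache, index):
        logits = transformer([encoder_input_tokens, prompt])[:, index - 1, :]
        hidden_states = None  # only needed for contrastive search
        return logits, hidden_states, cache

    # Build the decoder prompt: "[START]" followed by "[PAD]" placeholders.
    length = MAX_SEQUENCE_LENGTH
    start = ops.full((batch_size, 1), spa_tokenizer.token_to_id("[START]"))
    pad = ops.full((batch_size, length - 1), spa_tokenizer.token_to_id("[PAD]"))
    prompt = ops.concatenate((start, pad), axis=-1)

    generated_tokens = keras_nlp.samplers.GreedySampler()(
        next_token,
        prompt,
        stop_token_ids=[spa_tokenizer.token_to_id("[END]")],
        index=1,  # start sampling right after "[START]"
    )
    generated_sentences = spa_tokenizer.detokenize(generated_tokens)
    return generated_sentences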

In [None]:
test_eng_texts = [pair[0] for pair in test_pairs]
for i in range(2):
    input_sentence = random.choice(test_eng_texts)
    translated = decode_sequences([input_sentence])[0]  # Access the first string from the list
    translated = (
        translated.replace("[PAD]", "")
        .replace("[START]", "")
        .replace("[END]", "")
        .strip()
    )
    print(f"** Example {i} **")
    print("Input sentence:", input_sentence)
    print("Translated sentence:", translated)
    print()

** Example 0 **
Input sentence: i went shopping.
Translated sentence: fui de compras .

** Example 1 **
Input sentence: tom is having his bar mitzvah next month.
Translated sentence: tom está teniendo el barbadiz su mes que viene .
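
The same clean-up chain for special tokens appears in several cells of this notebook, so it could be moved into one small helper. The sketch below is only a suggestion: `clean_translation` is a hypothetical name, and the punctuation spacing fix is an optional extra that the cells in this notebook do not perform.

```python
import re

SPECIAL_TOKENS = ("[PAD]", "[START]", "[END]")


def clean_translation(text):
    """Strip special tokens and tidy the spacing left by detokenization."""
    for token in SPECIAL_TOKENS:
        text = text.replace(token, "")
    text = re.sub(r"\s+", " ", text).strip()     # collapse repeated whitespace
    text = re.sub(r"\s+([.,!?;:])", r"\1", text)  # no space before punctuation
    text = re.sub(r"([¿¡])\s+", r"\1", text)      # no space after ¿ or ¡
    return text


print(clean_translation("[START] fui de compras . [END] [PAD] [PAD]"))
# fui de compras.
```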



In [None]:
rouge_1 = keras_nlp.metrics.RougeN(order=1)
rouge_2 = keras_nlp.metrics.RougeN(order=2)

for test_pair in test_pairs[:30]:
    input_sentence = test_pair[0]
    reference_sentence = test_pair[1]

    # Decode the input sentence
    translated_sentence = decode_sequences([input_sentence])[0]  # Access first element of the list
    translated_sentence = (
        translated_sentence.replace("[PAD]", "")
        .replace("[START]", "")
        .replace("[END]", "")
        .strip()
    )

    # Update ROUGE metrics
    rouge_1(reference_sentence, translated_sentence)
    rouge_2(reference_sentence, translated_sentence)

In [None]:
rouge_1_result = rouge_1.result()
rouge_2_result = rouge_2.result()

print("ROUGE-1 Scores:")
print(f"  Precision: {rouge_1_result['precision'].numpy():.4f}")
print(f"  Recall:    {rouge_1_result['recall'].numpy():.4f}")
print(f"  F1 Score:  {rouge_1_result['f1_score'].numpy():.4f}")

print("\nROUGE-2 Scores:")
print(f"  Precision: {rouge_2_result['precision'].numpy():.4f}")
print(f"  Recall:    {rouge_2_result['recall'].numpy():.4f}")
print(f"  F1 Score:  {rouge_2_result['f1_score'].numpy():.4f}")

ROUGE-1 Scores:
  Precision: 0.5331
  Recall:    0.5101
  F1 Score:  0.5163

ROUGE-2 Scores:
  Precision: 0.3025
  Recall:    0.2948
  F1 Score:  0.2945
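
To make these numbers easier to interpret, here is a simplified sketch of what ROUGE-N measures: the overlap of n-grams between the reference and the generated sentence. It is an assumption-laden toy (plain whitespace tokenization, one sentence pair, and a candidate sentence I made up), so its values will not match `keras_nlp.metrics.RougeN` exactly, since that metric relies on the `rouge-score` package with its own tokenization.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams in a token list."""
    return Counter(tuple(tokens[i : i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(reference, candidate, n=1):
    """Precision, recall and F1 of n-gram overlap (whitespace tokens only)."""
    ref = ngrams(reference.split(), n)
    cand = ngrams(candidate.split(), n)
    overlap = sum((ref & cand).values())  # clipped n-gram matches
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1_score": f1}


reference = "tom bebió un sorbo de vino"
candidate = "tom bebió un vaso de vino"  # hypothetical model output

r1 = rouge_n(reference, candidate, n=1)  # 5 of 6 unigrams match
r2 = rouge_n(reference, candidate, n=2)  # 3 of 5 bigrams match
print(r1)
print(r2)
```

A single wrong word costs one unigram but two bigrams, which is one reason ROUGE-2 is noticeably lower than ROUGE-1 in the results above.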


In [None]:
import ipywidgets as widgets
from IPython.display import display, clear_output

# dropdown
language_dropdown = widgets.Dropdown(
    options=["English to Spanish"],
    value="English to Spanish",
    description="Language:",
)

# text input field
input_text = widgets.Textarea(
    value="",
    placeholder="Enter text here...",
    description="Input:",
    layout=widgets.Layout(width="20%", height="100px"),
)

# output area
output_text = widgets.Textarea(
    value="",
    placeholder="Translation will appear here...",
    description="Output:",
    layout=widgets.Layout(width="20%", height="100px"),
    disabled=True,
)

# translate button
translate_button = widgets.Button(
    description="Translate",
    button_style="warning",  # Color
    tooltip="Click to translate text",
    icon="language",  # Icon
)

# Define the translation function
def translate_text(button):
    clear_output(wait=True)
    display(language_dropdown, input_text, translate_button, output_text)
    if not input_text.value.strip():
        output_text.value = "Please enter text to translate."
        return

    # translation direction
    if language_dropdown.value == "English to Spanish":
        translated = decode_sequences([input_text.value.lower()])[0]

    # Clean up
    translated = (
        translated.replace("[PAD]", "")
        .replace("[START]", "")
        .replace("[END]", "")
        .strip()
    )

    output_text.value = translated


# function to the button on_click event
translate_button.on_click(translate_text)
display(language_dropdown, input_text, translate_button, output_text)


## Personal Reflection / Analysis

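Training and validation accuracy were even higher than in the previous notebook, but the ROUGE scores stayed about the same (ROUGE-1 F1 of 0.5163 and ROUGE-2 F1 of 0.2945 on 30 test sentences). The two metrics measure different things: accuracy is computed per token with the correct previous tokens given to the decoder, and as far as I can tell it also counts `[PAD]` positions, while ROUGE compares complete greedy-decoded sentences against a single reference. Higher accuracy therefore does not guarantee better translations, and the model still makes mistakes on harder sentences such as the bar mitzvah example above.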
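
Part of the gap between accuracy and ROUGE may come from padding: the embedding layers in this notebook are created without masking, so I assume the reported accuracy also counts `[PAD]` positions, which are easy to predict. The toy sketch below (the function `token_accuracy` and the token ids are made up for illustration) compares accuracy with and without padding positions for one sentence with a single wrong token.

```python
def token_accuracy(targets, predictions, pad_id=None):
    """Share of matching tokens; positions where the target is pad_id are skipped."""
    correct = 0
    total = 0
    for target_row, pred_row in zip(targets, predictions):
        for target, pred in zip(target_row, pred_row):
            if pad_id is not None and target == pad_id:
                continue
            total += 1
            correct += int(target == pred)
    return correct / total if total else 0.0


# One sentence of 4 real tokens padded to length 10; the second token is wrong.
targets = [[97, 82, 14, 3, 0, 0, 0, 0, 0, 0]]
predictions = [[97, 55, 14, 3, 0, 0, 0, 0, 0, 0]]

unmasked = token_accuracy(targets, predictions)
masked = token_accuracy(targets, predictions, pad_id=0)
print(unmasked)  # 0.9
print(masked)    # 0.75
```

With sequences padded to 40 tokens and many short sentences in the dataset, this effect could explain why accuracy is already above 0.8 after the first epoch.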


**Keras Translation English to Spanish**

Here I want to give credit to ChatGPT for helping me translate the outputs, as my Spanish is pretty basic and I might miss grammatical errors or misunderstand complex phrases. See the Google doc in the folder `ex_5`, where I showcase the translations.


