<a href="https://colab.research.google.com/github/hurricane195/Intro-to-Deep-Learning/blob/Homework_4/HW4_P2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Problem 2**

Repeat problem 1, this time extending the network with attention. Train the model on the entire dataset and evaluate it. Report the training loss, validation loss, and validation accuracy. Also perform some qualitative validation by asking the network to generate French translations for a few English sentences, and compare the results against problem 1.

In [None]:
#Using a modified example of Dr. Tabkhi's "sequence2sequence" available at https://github.com/HamedTabkhi/Intro-to-DL/blob/main/sequence2sequence.py
#Using a modified example of Dr. Tabkhi's "E2F-loader" available at https://github.com/HamedTabkhi/Intro-to-DL/blob/main/E2F-loader.py
#Using a modified example of Dr. Tabkhi's "attention" available at https://github.com/HamedTabkhi/Intro-to-DL/blob/main/attention.py
#Random help from ChatGPT on formatting, syntax, etc.
#Random help from Chat Colab AI on formatting, syntax, etc.

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, Dataset

In [None]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

In [None]:
class Vocabulary:
    def __init__(self):
        self.word2index = {"<PAD>": 0, "<SOS>": 1, "<EOS>": 2}
        self.index2word = {0: "<PAD>", 1: "<SOS>", 2: "<EOS>"}
        self.word_count = {}
        self.n_words = 3  # Start counting from 3 to account for special tokens

    def add_sentence(self, sentence):
        for word in sentence.split(' '):
            self.add_word(word)

    def add_word(self, word):
        if word not in self.word2index:
            self.word2index[word] = self.n_words
            self.index2word[self.n_words] = word
            self.word_count[word] = 1
            self.n_words += 1
        else:
            self.word_count[word] += 1

def tokenize_and_pad(sentences, vocab):
    max_length = max(len(sentence.split(' ')) for sentence in sentences) + 2  # For SOS and EOS
    tokenized_sentences = []
    for sentence in sentences:
        tokens = [vocab.word2index["<SOS>"]] + [vocab.word2index.get(word, vocab.word2index["<PAD>"]) for word in sentence.split(' ')] + [vocab.word2index["<EOS>"]]
        padded_tokens = tokens + [vocab.word2index["<PAD>"]] * (max_length - len(tokens))
        tokenized_sentences.append(padded_tokens)
    return torch.tensor(tokenized_sentences, dtype=torch.long)
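
To make the padding layout concrete, here is a minimal pure-Python sketch of what `tokenize_and_pad` produces for two short sentences. The toy `word2index` mapping and the helper name `pad_rows` are illustrative assumptions, not part of the notebook, and the real function returns a `torch.long` tensor rather than nested lists.

```python
# Toy vocabulary (hypothetical); the special tokens use the same indices as Vocabulary
word2index = {"<PAD>": 0, "<SOS>": 1, "<EOS>": 2,
              "I": 3, "am": 4, "cold": 5,
              "We": 6, "love": 7, "music": 8, "today": 9}

def pad_rows(sentences, word2index):
    """Wrap each sentence in <SOS>/<EOS> and right-pad with <PAD> to a common length."""
    max_length = max(len(s.split(' ')) for s in sentences) + 2  # room for SOS and EOS
    rows = []
    for s in sentences:
        # Unknown words fall back to <PAD>, mirroring tokenize_and_pad
        ids = ([word2index["<SOS>"]]
               + [word2index.get(w, word2index["<PAD>"]) for w in s.split(' ')]
               + [word2index["<EOS>"]])
        rows.append(ids + [word2index["<PAD>"]] * (max_length - len(ids)))
    return rows

rows = pad_rows(["I am cold", "We love music today"], word2index)
for row in rows:
    print(row)
# [1, 3, 4, 5, 2, 0]
# [1, 6, 7, 8, 9, 2]
```

The shorter sentence gets one trailing `<PAD>` (index 0) so both rows share the length of the longest sentence plus two.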

In [None]:
class EngFrDataset(Dataset):
    def __init__(self, pairs):
        self.eng_vocab = Vocabulary()
        self.fr_vocab = Vocabulary()
        self.pairs = []

        for eng, fr in pairs:
            self.eng_vocab.add_sentence(eng)
            self.fr_vocab.add_sentence(fr)
            self.pairs.append((eng, fr))

        self.eng_tokens = tokenize_and_pad([eng for eng, _ in pairs], self.eng_vocab)
        self.fr_tokens = tokenize_and_pad([fr for _, fr in pairs], self.fr_vocab)

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, idx):
        return self.eng_tokens[idx], self.fr_tokens[idx]

In [None]:
english_to_french = [
    ("I am cold", "J'ai froid"),
    ("You are tired", "Tu es fatigué"),
    ("He is hungry", "Il a faim"),
    ("She is happy", "Elle est heureuse"),
    ("We are friends", "Nous sommes amis"),
    ("I am cold", "J'ai froid"),
    ("You are tired", "Tu es fatigué"),
    ("He is hungry", "Il a faim"),
    ("She is happy", "Elle est heureuse"),
    ("We are friends", "Nous sommes amis"),
    ("They are students", "Ils sont étudiants"),
    ("The cat is sleeping", "Le chat dort"),
    ("The sun is shining", "Le soleil brille"),
    ("We love music", "Nous aimons la musique"),
    ("She speaks French fluently", "Elle parle français couramment"),
    ("He enjoys reading books", "Il aime lire des livres"),
    ("They play soccer every weekend", "Ils jouent au football chaque week-end"),
    ("The movie starts at 7 PM", "Le film commence à 19 heures"),
    ("She wears a red dress", "Elle porte une robe rouge"),
    ("We cook dinner together", "Nous cuisinons le dîner ensemble"),
    ("He drives a blue car", "Il conduit une voiture bleue"),
    ("They visit museums often", "Ils visitent souvent des musées"),
    ("The restaurant serves delicious food", "Le restaurant sert une délicieuse cuisine"),
    ("She studies mathematics at university", "Elle étudie les mathématiques à l'université"),
    ("We watch movies on Fridays", "Nous regardons des films le vendredi"),
    ("He listens to music while jogging", "Il écoute de la musique en faisant du jogging"),
    ("They travel around the world", "Ils voyagent autour du monde"),
    ("The book is on the table", "Le livre est sur la table"),
    ("She dances gracefully", "Elle danse avec grâce"),
    ("We celebrate birthdays with cake", "Nous célébrons les anniversaires avec un gâteau"),
    ("He works hard every day", "Il travaille dur tous les jours"),
    ("They speak different languages", "Ils parlent différentes langues"),
    ("The flowers bloom in spring", "Les fleurs fleurissent au printemps"),
    ("She writes poetry in her free time", "Elle écrit de la poésie pendant son temps libre"),
    ("We learn something new every day", "Nous apprenons quelque chose de nouveau chaque jour"),
    ("The dog barks loudly", "Le chien aboie bruyamment"),
    ("He sings beautifully", "Il chante magnifiquement"),
    ("They swim in the pool", "Ils nagent dans la piscine"),
    ("The birds chirp in the morning", "Les oiseaux gazouillent le matin"),
    ("She teaches English at school", "Elle enseigne l'anglais à l'école"),
    ("We eat breakfast together", "Nous prenons le petit déjeuner ensemble"),
    ("He paints landscapes", "Il peint des paysages"),
    ("They laugh at the joke", "Ils rient de la blague"),
    ("The clock ticks loudly", "L'horloge tic-tac bruyamment"),
    ("She runs in the park", "Elle court dans le parc"),
    ("We travel by train", "Nous voyageons en train"),
    ("He writes a letter", "Il écrit une lettre"),
    ("They read books at the library", "Ils lisent des livres à la bibliothèque"),
    ("The baby cries", "Le bébé pleure"),
    ("She studies hard for exams", "Elle étudie dur pour les examens"),
    ("We plant flowers in the garden", "Nous plantons des fleurs dans le jardin"),
    ("He fixes the car", "Il répare la voiture"),
    ("They drink coffee in the morning", "Ils boivent du café le matin"),
    ("The sun sets in the evening", "Le soleil se couche le soir"),
    ("She dances at the party", "Elle danse à la fête"),
    ("We play music at the concert", "Nous jouons de la musique au concert"),
    ("He cooks dinner for his family", "Il cuisine le dîner pour sa famille"),
    ("They study French grammar", "Ils étudient la grammaire française"),
    ("The rain falls gently", "La pluie tombe doucement"),
    ("She sings a song", "Elle chante une chanson"),
    ("We watch a movie together", "Nous regardons un film ensemble"),
    ("He sleeps deeply", "Il dort profondément"),
    ("They travel to Paris", "Ils voyagent à Paris"),
    ("The children play in the park", "Les enfants jouent dans le parc"),
    ("She walks along the beach", "Elle se promène le long de la plage"),
    ("We talk on the phone", "Nous parlons au téléphone"),
    ("He waits for the bus", "Il attend le bus"),
    ("They visit the Eiffel Tower", "Ils visitent la tour Eiffel"),
    ("The stars twinkle at night", "Les étoiles scintillent la nuit"),
    ("She dreams of flying", "Elle rêve de voler"),
    ("We work in the office", "Nous travaillons au bureau"),
    ("He studies history", "Il étudie l'histoire"),
    ("They listen to the radio", "Ils écoutent la radio"),
    ("The wind blows gently", "Le vent souffle doucement"),
    ("She swims in the ocean", "Elle nage dans l'océan"),
    ("We dance at the wedding", "Nous dansons au mariage"),
    ("He climbs the mountain", "Il gravit la montagne"),
    ("They hike in the forest", "Ils font de la randonnée dans la forêt"),
    ("The cat meows loudly", "Le chat miaule bruyamment"),
    ("She paints a picture", "Elle peint un tableau"),
    ("We build a sandcastle", "Nous construisons un château de sable"),
    ("He sings in the choir", "Il chante dans le chœur")

]


In [None]:
dataset = EngFrDataset(english_to_french)
dataloader = DataLoader(dataset, batch_size=1, shuffle=True)

In [None]:
class Encoder(nn.Module):
    def __init__(self, input_size, hidden_size):
        super(Encoder, self).__init__()
        self.hidden_size = hidden_size
        self.embedding = nn.Embedding(input_size, hidden_size)
        self.lstm = nn.LSTM(hidden_size, hidden_size)

    def forward(self, input, hidden):
        embedded = self.embedding(input).view(1, 1, -1)
        output, hidden = self.lstm(embedded, hidden)
        return output, hidden

    def initHidden(self):
        return (torch.zeros(1, 1, self.hidden_size, device=device), torch.zeros(1, 1, self.hidden_size, device=device))

In [None]:
class AttnDecoder(nn.Module):
    def __init__(self, hidden_size, output_size, max_length=10, dropout_p=0.1):
        super(AttnDecoder, self).__init__()
        self.hidden_size = hidden_size
        self.output_size = output_size
        self.max_length = max_length
        self.dropout_p = dropout_p

        self.embedding = nn.Embedding(self.output_size, self.hidden_size)
        self.attn = nn.Linear(self.hidden_size * 2, self.max_length)
        self.attn_combine = nn.Linear(self.hidden_size * 2, self.hidden_size)
        self.dropout = nn.Dropout(self.dropout_p)
        self.lstm = nn.LSTM(self.hidden_size, self.hidden_size)
        self.out = nn.Linear(self.hidden_size, self.output_size)

    def forward(self, input, hidden, encoder_outputs):
        embedded = self.embedding(input).view(1, 1, -1)
        embedded = self.dropout(embedded)

        attn_weights = torch.softmax(self.attn(torch.cat((embedded[0], hidden[0][0]), 1)), dim=1)
        attn_applied = torch.bmm(attn_weights.unsqueeze(0), encoder_outputs.unsqueeze(0))

        output = torch.cat((embedded[0], attn_applied[0]), 1)
        output = self.attn_combine(output).unsqueeze(0)

        output = torch.relu(output)
        output, hidden = self.lstm(output, hidden)
        output = torch.log_softmax(self.out(output[0]), dim=1)
        return output, hidden, attn_weights
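
The core of the attention step in `forward` is a softmax over one score per source position, followed by a weighted sum of the encoder outputs (the `torch.bmm` call). A minimal pure-Python sketch with made-up numbers follows. The scores and the two-dimensional encoder outputs are illustrative assumptions and the helper names are hypothetical; in the notebook the scores come from `self.attn` applied to the concatenated embedding and hidden state.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def apply_attention(weights, encoder_outputs):
    """Weighted sum of encoder output vectors (one vector per source position)."""
    dim = len(encoder_outputs[0])
    return [sum(w * vec[d] for w, vec in zip(weights, encoder_outputs))
            for d in range(dim)]

scores = [2.0, 0.0, 0.0]                                  # one score per source position
encoder_outputs = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]   # toy 2-dimensional outputs

weights = softmax(scores)                 # non-negative, sums to 1
context = apply_attention(weights, encoder_outputs)

print([round(w, 3) for w in weights])
print([round(c, 3) for c in context])
```

The position with the highest score dominates the context vector, which is then concatenated with the embedding and passed through `attn_combine`.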

In [None]:
encoder = Encoder(dataset.eng_vocab.n_words, 256).to(device)
decoder = AttnDecoder(256, dataset.fr_vocab.n_words, max_length=dataset.eng_tokens.shape[1], dropout_p=0.1).to(device)

encoder_optimizer = optim.SGD(encoder.parameters(), lr=0.01)
decoder_optimizer = optim.SGD(decoder.parameters(), lr=0.01)
criterion = nn.NLLLoss()

In [None]:
def train(input_tensor, target_tensor, encoder, decoder, encoder_optimizer, decoder_optimizer, criterion, max_length=dataset.eng_tokens.shape[1]):
    encoder_hidden = encoder.initHidden()

    encoder_optimizer.zero_grad()
    decoder_optimizer.zero_grad()

    input_length = input_tensor.size(0)
    target_length = target_tensor.size(0)

    loss = 0
    correct_sequences = 0
    total_sequences = 0

    encoder_outputs = torch.zeros(max_length, encoder.hidden_size, device=device)

    for ei in range(input_length):
        encoder_output, encoder_hidden = encoder(input_tensor[ei], encoder_hidden)
        encoder_outputs[ei] = encoder_output[0, 0]

    decoder_input = torch.tensor([[dataset.fr_vocab.word2index["<SOS>"]]], device=device)
    decoder_hidden = encoder_hidden

    for di in range(target_length):
        decoder_output, decoder_hidden, _ = decoder(decoder_input, decoder_hidden, encoder_outputs)
        topv, topi = decoder_output.topk(1)
        decoder_input = topi.squeeze().detach()

        loss += criterion(decoder_output, target_tensor[di].unsqueeze(0))
        if decoder_input.item() == dataset.fr_vocab.word2index["<EOS>"]:
            break

    loss.backward()

    encoder_optimizer.step()
    decoder_optimizer.step()

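    total_sequences += 1
    # Note: this compares only the last predicted token with its target,
    # so "correct_sequences" is a last-token match, not a full-sentence match
    if decoder_input.item() == target_tensor[di].item():
        correct_sequences += 1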

    return loss.item() / target_length, correct_sequences, total_sequences
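
`train` counts a sentence as correct when the last decoded token matches its target. If a stricter full-sentence measure is wanted, one option (an assumption, not what the notebook reports) is to compare whole token sequences up to `<EOS>`. The helper names below are hypothetical.

```python
PAD, SOS, EOS = 0, 1, 2  # same special-token indices as Vocabulary

def strip_special(indices):
    """Keep tokens up to the first <EOS>, dropping <SOS> and <PAD>."""
    kept = []
    for idx in indices:
        if idx == EOS:
            break
        if idx not in (PAD, SOS):
            kept.append(idx)
    return kept

def exact_match(predicted, target):
    """True only if every content token matches, in order."""
    return strip_special(predicted) == strip_special(target)

print(exact_match([1, 5, 6, 2, 0, 0], [1, 5, 6, 2]))  # True
print(exact_match([1, 5, 7, 2], [1, 5, 6, 2]))        # False
```

Padding after `<EOS>` is ignored, so a padded target row and an unpadded prediction can still compare equal.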

In [None]:
def evaluate(input_tensor, target_tensor, encoder, decoder, criterion, max_length=dataset.eng_tokens.shape[1]):
    with torch.no_grad():
        input_tensor = input_tensor.squeeze(0)
        target_tensor = target_tensor.squeeze(0)

        if input_tensor.size(0) == 0 or target_tensor.size(0) == 0:
            return 0, 0, 0, 0  # loss, accuracy, correct_tokens, total_tokens

        encoder_hidden = encoder.initHidden()
        input_length = input_tensor.size(0)
        target_length = target_tensor.size(0)

        encoder_outputs = torch.zeros(max_length, encoder.hidden_size, device=device)

        for ei in range(input_length):
            encoder_output, encoder_hidden = encoder(input_tensor[ei], encoder_hidden)
            encoder_outputs[ei] = encoder_output[0, 0]

        eval_loss = 0
        correct_tokens = 0
        total_tokens = 0

        decoder_input = torch.tensor([[dataset.fr_vocab.word2index["<SOS>"]]], device=device)
        decoder_hidden = encoder_hidden

        for di in range(target_length):
            decoder_output, decoder_hidden, _ = decoder(decoder_input, decoder_hidden, encoder_outputs)
            eval_loss += criterion(decoder_output, target_tensor[di].unsqueeze(0))

            topv, topi = decoder_output.topk(1)
            decoder_input = topi.squeeze().detach()

            if decoder_input.item() == dataset.fr_vocab.word2index["<EOS>"]:
                break

            total_tokens += 1
            if decoder_input.item() == target_tensor[di].item():
                correct_tokens += 1

        eval_loss /= target_length
        accuracy = correct_tokens / total_tokens if total_tokens > 0 else 0

        return eval_loss.item(), accuracy, correct_tokens, total_tokens
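
`evaluate` stops counting as soon as the decoder predicts `<EOS>`. A possible alternative, sketched here under the assumption that predictions and targets are already aligned lists of indices, is to score every target position except `<PAD>`. The function name `token_accuracy` is hypothetical.

```python
PAD = 0  # same padding index as Vocabulary

def token_accuracy(predicted, target, pad_index=PAD):
    """Fraction of non-padding target positions where the prediction matches."""
    correct = 0
    total = 0
    for p, t in zip(predicted, target):
        if t == pad_index:
            continue  # padded positions carry no information
        total += 1
        correct += int(p == t)
    return correct / total if total > 0 else 0.0

acc = token_accuracy([1, 5, 7, 2, 0, 0], [1, 5, 6, 2, 0, 0])
print(acc)  # 0.75
```

Three of the four non-padding positions match, so the two trailing `<PAD>` tokens do not inflate the score.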

In [None]:
def translate(input_tensor, encoder, decoder, max_length=dataset.eng_tokens.shape[1]):
    with torch.no_grad():
        input_length = input_tensor.size(0)
        encoder_hidden = encoder.initHidden()

        encoder_outputs = torch.zeros(max_length, encoder.hidden_size, device=device)

        for ei in range(input_length):
            encoder_output, encoder_hidden = encoder(input_tensor[ei], encoder_hidden)
            encoder_outputs[ei] = encoder_output[0, 0]

        decoder_input = torch.tensor([[dataset.fr_vocab.word2index["<SOS>"]]], device=device)
        decoder_hidden = encoder_hidden

        decoded_words = []

        for di in range(max_length):
            decoder_output, decoder_hidden, _ = decoder(decoder_input, decoder_hidden, encoder_outputs)
            topv, topi = decoder_output.topk(1)
            if topi.item() == dataset.fr_vocab.word2index["<EOS>"]:
                break
            else:
                decoded_words.append(topi.item())

            decoder_input = topi.squeeze().detach()

        return decoded_words

In [None]:
# Training and evaluation loop
n_epochs = 110
for epoch in range(n_epochs):
    total_loss = 0
    total_correct_sequences = 0
    total_sequences = 0
    val_total_loss = 0
    val_total_correct_tokens = 0
    val_total_tokens = 0

    for i, (input_tensor, target_tensor) in enumerate(dataloader):
        input_tensor, target_tensor = input_tensor.to(device), target_tensor.to(device)
        input_tensor = input_tensor.squeeze(0)
        target_tensor = target_tensor.squeeze(0)

        if input_tensor.size(0) == 0 or target_tensor.size(0) == 0:  # Skip empty tensors
            continue

        # Use a separate name for the per-call count so the epoch total is not overwritten
        loss, correct_sequences, n_sequences = train(input_tensor, target_tensor, encoder, decoder, encoder_optimizer, decoder_optimizer, criterion, max_length=dataset.eng_tokens.shape[1])
        total_loss += loss
        total_correct_sequences += correct_sequences
        total_sequences += n_sequences

    with torch.no_grad():
        for i, (input_tensor, target_tensor) in enumerate(dataloader):
            input_tensor, target_tensor = input_tensor.to(device), target_tensor.to(device)
            input_tensor = input_tensor.squeeze(0)
            target_tensor = target_tensor.squeeze(0)

            if input_tensor.size(0) == 0 or target_tensor.size(0) == 0:  # Skip empty tensors
                continue

            eval_loss, accuracy, correct_tokens, total_tokens = evaluate(input_tensor, target_tensor, encoder, decoder, criterion, max_length=dataset.eng_tokens.shape[1])
            val_total_loss += eval_loss
            val_total_correct_tokens += correct_tokens
            val_total_tokens += total_tokens

    # Calculate training and validation losses and accuracies
    train_loss = total_loss / len(dataloader)
    val_loss = val_total_loss / len(dataloader)
    train_accuracy = total_correct_sequences / len(dataset)  # Update training accuracy calculation
    val_accuracy = val_total_correct_tokens / val_total_tokens if val_total_tokens > 0 else 0

    if epoch % 10 == 0:
        # Print epoch-wise training and validation results
        print(f'Epoch {epoch}, Training Loss: {train_loss}, Training Accuracy: {train_accuracy}, Validation Loss: {val_loss}, Validation Accuracy: {val_accuracy}')

# Calculate overall evaluation results
overall_val_loss = val_total_loss / len(dataloader)
overall_val_accuracy = val_total_correct_tokens / val_total_tokens if val_total_tokens > 0 else 0
overall_train_loss = total_loss / len(dataloader)
overall_train_accuracy = total_correct_sequences / len(dataset)

# Printing prediction examples after training
print("\nPrediction Examples:")
n_examples = 10
example_count = 0

for i in range(len(dataset)):
    eng_tensor, fr_tensor = dataset[i]
    eng_tensor = eng_tensor.to(device)
    fr_tensor = fr_tensor.to(device)
    predicted_indices = translate(eng_tensor, encoder, decoder)

    predicted_string = ' '.join([dataset.fr_vocab.index2word[index] for index in predicted_indices if index not in (dataset.fr_vocab.word2index['<SOS>'], dataset.fr_vocab.word2index['<EOS>'], dataset.fr_vocab.word2index['<PAD>'])])
    target_string = ' '.join([dataset.fr_vocab.index2word[index.item()] for index in fr_tensor if index.item() not in (dataset.fr_vocab.word2index['<SOS>'], dataset.fr_vocab.word2index['<EOS>'], dataset.fr_vocab.word2index['<PAD>'])])
    eng_string = ' '.join([dataset.eng_vocab.index2word[index.item()] for index in eng_tensor if index.item() not in (dataset.eng_vocab.word2index['<SOS>'], dataset.eng_vocab.word2index['<EOS>'], dataset.eng_vocab.word2index['<PAD>'])])

    if example_count < n_examples:
        print(f'Example {example_count + 1}, Input: {eng_string}, Target: {target_string}, Predicted: {predicted_string}')
        example_count += 1

    if example_count >= n_examples:
        break

print("\nOverall Evaluation Results:")
print(f'Overall Training Loss: {overall_train_loss}, Overall Training Accuracy: {overall_train_accuracy}, Overall Validation Loss: {overall_val_loss}, Overall Validation Accuracy: {overall_val_accuracy}')

Epoch 0, Training Loss: 2.68213269187183, Training Accuracy: 0.5487804878048781, Validation Loss: 1.2721712836405126, Validation Accuracy: 0.350210970464135
Epoch 10, Training Loss: 1.4092207684485185, Training Accuracy: 0.17073170731707318, Validation Loss: 1.3703952228150718, Validation Accuracy: 0.3201219512195122
Epoch 20, Training Loss: 1.437361919166244, Training Accuracy: 0.18292682926829268, Validation Loss: 1.3298539366663955, Validation Accuracy: 0.4486486486486487
Epoch 30, Training Loss: 1.0310630158680243, Training Accuracy: 0.32926829268292684, Validation Loss: 0.9150663254464545, Validation Accuracy: 0.5922077922077922
Epoch 40, Training Loss: 0.6362986794596501, Training Accuracy: 0.6463414634146342, Validation Loss: 0.5291274706492337, Validation Accuracy: 0.8649885583524027
Epoch 50, Training Loss: 0.17170843630698726, Training Accuracy: 0.975609756097561, Validation Loss: 0.1406495823502177, Validation Accuracy: 0.9935897435897436
Epoch 60, Training Loss: 0.054439338

In [None]:
def generate_english_translations(encoder, decoder, input_sentences, dataset, device):
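    # Note: the encoder was built over the English vocabulary and the decoder over the
    # French vocabulary, so passing French tokens through them and reading the output
    # with eng_vocab does not reverse the trained English-to-French direction
    # Tokenize and pad the French sentences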
    tokenized_french_sentences = tokenize_and_pad(input_sentences, dataset.fr_vocab)

    # Convert tokenized sentences into tensors
    input_tensors = tokenized_french_sentences.to(device)

    # Generate English translations
    with torch.no_grad():
        for input_tensor in input_tensors:
            # Initialize encoder hidden states
            encoder_hidden = encoder.initHidden()

            input_length = input_tensor.size(0)

            # Pass input through the encoder, storing the output of every step
            # so the attention layer sees each source position (not only the last one)
            encoder_outputs = torch.zeros(dataset.eng_tokens.shape[1], encoder.hidden_size, device=device)
            for ei in range(input_length):
                encoder_output, encoder_hidden = encoder(input_tensor[ei].unsqueeze(0), encoder_hidden)
                encoder_outputs[ei] = encoder_output[0, 0]

            # Initialize decoder input with SOS token
            decoder_input = torch.tensor([[dataset.eng_vocab.word2index['<SOS>']]], device=device)
            decoder_hidden = encoder_hidden

            # Initialize list to store predicted indices
            predicted_indices = []

            # Generate translation
            for _ in range(dataset.eng_tokens.shape[1]):  # Use max length from English tokens
                decoder_output, decoder_hidden, _ = decoder(decoder_input, decoder_hidden, encoder_outputs)
                topv, topi = decoder_output.topk(1)
                predicted_indices.append(topi.item())
                decoder_input = topi.squeeze().detach()

                #print("Decoder Input Shape:", decoder_input.shape)  # Print decoder input shape

                if decoder_input.item() == dataset.eng_vocab.word2index['<EOS>']:
                    break

            # Convert predicted indices to English words
            predicted_words = [dataset.eng_vocab.index2word[index] for index in predicted_indices if index not in (dataset.eng_vocab.word2index['<SOS>'], dataset.eng_vocab.word2index['<EOS>'], dataset.eng_vocab.word2index['<PAD>'])]

            # Print the translations
            print("French Sentence:", ' '.join([dataset.fr_vocab.index2word[index.item()] for index in input_tensor if index.item() not in (dataset.fr_vocab.word2index['<SOS>'], dataset.fr_vocab.word2index['<EOS>'], dataset.fr_vocab.word2index['<PAD>'])]))
            print("English Translation:", ' '.join(predicted_words))

In [None]:
# Define the French sentences for translation
french_sentences = [
    "J'ai froid",
    "Tu es fatigué",
    "Il a faim",
    "Elle est heureuse",
    "Nous sommes amis",
    "Ils sont étudiants",
    "Le chat dort",
    "Le soleil brille",
    "Le bébé pleure"

]

In [None]:
# Generate English translations
generate_english_translations(encoder, decoder, french_sentences, dataset, device)

French Sentence: J'ai froid
English Translation: hungry loudly world French sings beautifully swim pool
French Sentence: Tu es fatigué
English Translation: We chirp morning teaches world English 7
French Sentence: Il a faim
English Translation: tired the world French fluently book table dances
French Sentence: Elle est heureuse
English Translation: hungry loudly world French sings beautifully swim pool birds
French Sentence: Nous sommes amis
English Translation: We falls every new world English 7 school
French Sentence: Ils sont étudiants
English Translation: sleeping cake She works French fluently cook
French Sentence: Le chat dort
English Translation: sleeping wears a red dress dress cook
French Sentence: Le soleil brille
English Translation: something new dog world French fluently starts talk
French Sentence: Le bébé pleure
English Translation: tired the world French fluently book table dances
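
The problem statement asks for French translations of English sentences, which matches the direction the encoder and decoder were trained in. A hedged sketch of the glue code for such a qualitative check follows. The helper names `sentence_to_indices` and `indices_to_sentence` are hypothetical, and the toy vocabularies stand in for `dataset.eng_vocab.word2index` and `dataset.fr_vocab.index2word`.

```python
SPECIAL = {"<PAD>", "<SOS>", "<EOS>"}

def sentence_to_indices(sentence, word2index):
    """Map an English sentence to <SOS> + word indices + <EOS> (unknown words -> <PAD>)."""
    return ([word2index["<SOS>"]]
            + [word2index.get(w, word2index["<PAD>"]) for w in sentence.split(' ')]
            + [word2index["<EOS>"]])

def indices_to_sentence(indices, index2word):
    """Turn predicted indices back into text, skipping the special tokens."""
    return ' '.join(index2word[i] for i in indices if index2word[i] not in SPECIAL)

# Toy vocabularies (assumptions for illustration only)
eng_word2index = {"<PAD>": 0, "<SOS>": 1, "<EOS>": 2, "I": 3, "am": 4, "cold": 5}
fr_index2word = {0: "<PAD>", 1: "<SOS>", 2: "<EOS>", 3: "J'ai", 4: "froid"}

source_ids = sentence_to_indices("I am cold", eng_word2index)
print(source_ids)                                         # [1, 3, 4, 5, 2]
print(indices_to_sentence([1, 3, 4, 2], fr_index2word))   # J'ai froid
```

In the notebook, the index list could be wrapped as `torch.tensor(source_ids, device=device)` and passed to the existing `translate(input_tensor, encoder, decoder)`, whose returned indices would then go through `indices_to_sentence` with `dataset.fr_vocab.index2word`.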
