# Part 3. Enhancement
The RNN model used in Part 2 is a basic model to perform the task of sentiment classification. In
this section, you will design strategies to improve upon the previous model you have built. You are
required to implement the following adjustments:

1. Instead of keeping the word embeddings fixed, now update the word embeddings (the same way as model parameters) during the training process.
2. As discussed in Question 1(c), apply your solution for mitigating the influence of OOV words and train your model again.
3. Keeping the above two adjustments, replace your simple RNN model in Part 2 with a biLSTM model and a biGRU model, incorporating recurrent computations in both directions and stacking multiple layers if possible.
4. Keeping the above two adjustments, replace your simple RNN model in Part 2 with a Convolutional Neural Network (CNN) to produce sentence representations and perform sentiment classification.
5. Further improve your model. You are free to use any strategy other than the above-mentioned solutions. Changing hyper-parameters or stacking more layers is not counted towards a meaningful improvement.


## Question 1

Instead of keeping the word embeddings fixed, now update the word embeddings (the same
way as model parameters) during the training process.

### Approach

We reuse the model from the Part 2 notebook, but now backpropagate the loss
into the word embeddings themselves. As the model trains, the embedding vectors
are updated along with the other parameters, so the representation of each word
shifts toward what is useful for sentiment classification.

In [1]:
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from common_utils import EmbeddingMatrix



In [2]:
class RNNModel(nn.Module):
    def __init__(self, vocab_size, embedding_dim, hidden_dim, output_dim, word_embedding:np.ndarray):
        super(RNNModel, self).__init__()

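        # Word2Vec embedding layer
        # freeze=False makes the embedding weights trainable parameters.
        # Cast to float32 so the dtype matches the RNN weights.
        self.word2vec_embeddings = nn.Embedding.from_pretrained(
            torch.tensor(word_embedding, dtype=torch.float32), freeze=False)

        # RNN layer
        # batch_first=True: inputs are assumed to be (batch, seq_len) index
        # tensors, which the slicing in forward() relies on.
        self.rnn = nn.RNN(embedding_dim, hidden_dim, batch_first=True)

        # Fully connected layer
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        # Look up word embeddings: (batch, seq_len) -> (batch, seq_len, embedding_dim)
        x = self.word2vec_embeddings(x)

        # Pass through RNN: (batch, seq_len, hidden_dim)
        x, _ = self.rnn(x)

        # Take the output at the last time step: (batch, hidden_dim)
        x = x[:, -1, :]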

        # Pass through fully connected layer
        x = self.fc(x)

        return x


hidden_dim = 128
output_dim = 1
# Load EmbeddingMatrix
w2v_model = EmbeddingMatrix.load()

# Create RNN model
model = RNNModel(w2v_model.vocab_size,
                 w2v_model.dimension,
                 hidden_dim,
                 output_dim,
                 w2v_model.embedding_matrix)

# Create optimizer
optimizer = optim.Adam(model.parameters())
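
One way to confirm that `freeze=False` has the intended effect is to compare the embedding weights before and after a single optimizer step. The following is a minimal sketch on toy data, not the notebook's training loop: the 5x4 matrix, the mean-pooling classifier, and the helper name `embedding_changes_after_step` are all assumptions made for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy pretrained matrix: 5 words, 4 dimensions (stand-in for the Word2Vec matrix).
pretrained = torch.randn(5, 4)


def embedding_changes_after_step(freeze: bool) -> bool:
    """Run one SGD step and report whether the embedding weights changed."""
    emb = nn.Embedding.from_pretrained(pretrained.clone(), freeze=freeze)
    fc = nn.Linear(4, 1)
    # Only pass parameters that require gradients to the optimizer.
    params = [p for p in list(emb.parameters()) + list(fc.parameters())
              if p.requires_grad]
    optimizer = torch.optim.SGD(params, lr=0.1)

    tokens = torch.tensor([[0, 1, 2]])  # one sentence of three token ids
    target = torch.tensor([[1.0]])

    before = emb.weight.detach().clone()
    logits = fc(emb(tokens).mean(dim=1))  # average word vectors, then classify
    loss = nn.functional.binary_cross_entropy_with_logits(logits, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    return not torch.equal(before, emb.weight.detach())


frozen_changed = embedding_changes_after_step(freeze=True)
trainable_changed = embedding_changes_after_step(freeze=False)
print("frozen embeddings changed:", frozen_changed)
print("trainable embeddings changed:", trainable_changed)
```

Only the rows for tokens that appear in the batch receive a gradient, so words that never occur in the training data keep their pretrained vectors.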

## Question 2

### Approach

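To reduce the impact of OOV words we combine two techniques: FastText, which builds
word vectors from character n-grams and can therefore produce an embedding for a
word it has never seen, and a dedicated `<UNK>` token that serves as a fallback
for any word that still has no vector.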

In [4]:
# Implementation of FastText for word embedding
from gensim.models import FastText

# Load Word2Vec model
w2v_model = EmbeddingMatrix.load()

# Create a list of words from the Word2Vec model
words = list(w2v_model.vocab)

# Create a FastText model with the same dimensions as the Word2Vec model
fasttext_model = FastText(vector_size=w2v_model.dimension, window=5, min_count=1, workers=4)

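# Build the FastText vocabulary. build_vocab expects an iterable of tokenized
# sentences, so the word list is wrapped as a single "sentence"; passing the
# bare list would treat every word as a sentence of characters.
fasttext_model.build_vocab(corpus_iterable=[words])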

# Initialize embeddings with Word2Vec embeddings
for word in words:
    fasttext_model.wv[word] = w2v_model[word]
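
The `<UNK>` fallback mentioned in the approach is not shown in the cell above. A minimal sketch of one common way to do it follows. It assumes the vocabulary is a plain list of words aligned with the rows of a NumPy embedding matrix, and it initializes the `<UNK>` vector as the mean of all known vectors. The helper names `add_unk_row` and `encode` are hypothetical and not part of `common_utils`.

```python
import numpy as np

UNK_TOKEN = "<UNK>"


def add_unk_row(vocab, embedding_matrix):
    """Return (word_to_index, matrix) with one extra row appended for <UNK>."""
    word_to_index = {word: i for i, word in enumerate(vocab)}
    word_to_index[UNK_TOKEN] = len(vocab)
    # Assumption: use the mean of all known vectors as the <UNK> embedding.
    unk_vector = embedding_matrix.mean(axis=0, keepdims=True)
    return word_to_index, np.vstack([embedding_matrix, unk_vector])


def encode(tokens, word_to_index):
    """Map tokens to indices, sending any unknown token to the <UNK> index."""
    unk_index = word_to_index[UNK_TOKEN]
    return [word_to_index.get(tok, unk_index) for tok in tokens]


# Toy vocabulary and 2-dimensional embeddings for illustration only.
vocab = ["good", "bad", "movie"]
matrix = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]], dtype=np.float32)

word_to_index, extended = add_unk_row(vocab, matrix)
ids = encode(["good", "zzz", "movie"], word_to_index)
print(ids)             # "zzz" is out of vocabulary and maps to the <UNK> index
print(extended.shape)  # one more row than the original matrix
```

When the embeddings are trainable, as in Question 1, the `<UNK>` row is updated during training like any other row, provided some training tokens actually map to it.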


# Question 3. Enhancement
(a) Report the accuracy score on the test set when the word embeddings are updated (Part 3.1).
   
(b) Report the accuracy score on the test set when applying your method to deal with OOV words
in Part 3.2.
   
(c) Report the accuracy scores of biLSTM and biGRU on the test set (Part 3.3).
   
(d) Report the accuracy scores of CNN on the test set (Part 3.4).
   
(e) Describe your final improvement strategy in Part 3.5. Report the accuracy on the test set
using your improved model.
   
(f) Compare the results across different solutions above and describe your observations with possible discussions.
