# Introduction to Python and Natural Language Technologies

__Laboratory 07, Deep learning and NLP__

__March 25, 2021__

__Ádám Kovács__


During this laboratory we are going to use the same classification dataset that we used last time: SemEval 2019 - Task 6.
The dataset is about Identifying and Categorizing Offensive Language in Social Media.

__Preparation:__
- You will need the SemEval dataset (the code to download it is included below)
- You will need to install PyTorch:
    - pip install torch
- You will also need pandas, torchtext, numpy and scikit-learn installed; you can find the instructions for them in the lecture notebook.

We are going to use [PyTorch](https://pytorch.org/docs/stable/index.html), an open source library for building optimized deep learning models that can run on GPUs. It is one of the most widely used libraries for building neural networks and deep learning models.

__NOTE: If your notebook/PC is not powerful enough, it is advised to use Google Colab for this laboratory for free access to GPUs. Once you have completed the exercises, you can download the notebook and upload it to the repository.__

In [None]:
!pip install torch



In [234]:
# Import the needed libraries
import pandas as pd
import numpy as np

## 0. Download the dataset and load it into a pandas DataFrame

__Note: you can reuse your code from the previous lab!__

In [236]:
# First we download the data using the code from last week
import os
if not os.path.isdir('./data'):
    os.mkdir('./data')

import urllib.request

# URLopener is deprecated; urlretrieve downloads the file in a single call
urllib.request.urlretrieve("http://sandbox.hlt.bme.hu/~adaamko/offenseval.tsv",
                           "data/offenseval.tsv")

('data/offenseval.tsv', <http.client.HTTPMessage at 0x7f787fd6b110>)

## 0.1 Read in the dataset into a Pandas DataFrame
Use `pd.read_csv` with the correct parameters to read in the dataset. If done correctly, `DataFrame` should have 3 columns, 
`id`, `tweet`, `subtask_a`.

In [237]:
import pandas as pd
import numpy as np

In [238]:
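def read_dataset():
    # YOUR CODE HERE:
    # Same approach as in Lab 6: name the three columns explicitly
    dataset = pd.read_csv("./data/offenseval.tsv", sep="\t", names=["id", "tweet", "subtask_a"])
    # The first row holds the file's own header, so drop it; copy() avoids
    # pandas' SettingWithCopyWarning when a column is added later
    final_dataset = dataset.iloc[1:].copy()
    return final_dataset
    #raise NotImplementedError()

d = read_dataset()
d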

| | id | tweet | subtask_a |
|---|---|---|---|
| 1 | 86426 | @USER She should ask a few native Americans wh... | OFF |
| 2 | 90194 | @USER @USER Go home you’re drunk!!! @USER #MAG... | OFF |
| 3 | 16820 | Amazon is investigating Chinese employees who ... | NOT |
| 4 | 62688 | @USER Someone should'veTaken" this piece of sh... | OFF |
| 5 | 43605 | @USER @USER Obama wanted liberals &amp; illega... | NOT |
| ... | ... | ... | ... |
| 13236 | 95338 | @USER Sometimes I get strong vibes from people... | OFF |
| 13237 | 67210 | Benidorm ✅ Creamfields ✅ Maga ✅ Not too sh... | NOT |
| 13238 | 82921 | @USER And why report this garbage. We don't g... | OFF |
| 13239 | 27429 | @USER Pussy | OFF |


In [239]:
train_data_unprocessed = read_dataset()

assert type(train_data_unprocessed) == pd.core.frame.DataFrame
assert len(train_data_unprocessed.columns) == 3
assert (train_data_unprocessed.columns == ['id', 'tweet', 'subtask_a']).all()

## 0.2 Convert `subtask_a` into a binary label
The task is to classify the given tweets into two categories: _offensive (OFF)_ and _not offensive (NOT)_. Machine learning algorithms need integer labels instead of strings. Add a new column called `label` to the dataframe, and transform the `subtask_a` column into a binary integer label.

In [240]:
def transform(train_data):
    # YOUR CODE HERE
    # Same as Lab 6: map "NOT" to 1 and "OFF" to 0
    train_data["label"] = train_data.subtask_a.apply(lambda x: 1 if x == "NOT" else 0)
    return train_data
    #raise NotImplementedError()

tr_d = transform(d)
tr_d

| | id | tweet | subtask_a | label |
|---|---|---|---|---|
| 1 | 86426 | @USER She should ask a few native Americans wh... | OFF | 0 |
| 2 | 90194 | @USER @USER Go home you’re drunk!!! @USER #MAG... | OFF | 0 |
| 3 | 16820 | Amazon is investigating Chinese employees who ... | NOT | 1 |
| 4 | 62688 | @USER Someone should'veTaken" this piece of sh... | OFF | 0 |
| 5 | 43605 | @USER @USER Obama wanted liberals &amp; illega... | NOT | 1 |
| ... | ... | ... | ... | ... |
| 13236 | 95338 | @USER Sometimes I get strong vibes from people... | OFF | 0 |
| 13237 | 67210 | Benidorm ✅ Creamfields ✅ Maga ✅ Not too sh... | NOT | 1 |
| 13238 | 82921 | @USER And why report this garbage. We don't g... | OFF | 0 |
| 13239 | 27429 | @USER Pussy | OFF | 0 |


In [241]:
from pandas.api.types import is_numeric_dtype

train_data = transform(train_data_unprocessed)

assert "label" in train_data
assert is_numeric_dtype(train_data.label)
assert (train_data.label.isin([0,1])).all()

## 1. Train a simple neural network on this dataset

__HINT: you can reuse the code from the Lecture! Most of the code will be very similar to what we used there!__

In [242]:
#Import pytorch and set a fixed random seed for reproducibility
import torch

SEED = 1234

torch.manual_seed(SEED)
torch.backends.cudnn.deterministic = True
type(SEED)

int

### 1.1 Split the dataset into a train and a validation dataset
Use the random seed for splitting. You should split the dataset into 70% training data and 30% validation data

In [243]:
from sklearn.model_selection import train_test_split as split

def split_data(train_data, random_seed):
    # YOUR CODE HERE
    tr_data, val_data = split(train_data, test_size=0.3, random_state=random_seed)
    return tr_data, val_data
    #raise NotImplementedError()

tr_data, val_data = split_data(train_data, SEED)

In [244]:
tr_data, val_data = split_data(train_data, SEED)
assert len(tr_data) == 9268

### 1.2 Use CountVectorizer to prepare the features for the sentences
You should fit CountVectorizer using _10000_ features
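
To make the idea behind the count features concrete, here is a minimal, dependency-free sketch of a bag-of-words vectorizer. It is an illustration only, not scikit-learn's implementation: the helper names `build_vocab` and `vectorize` are hypothetical, and the whitespace tokenizer is a simplifying assumption (`CountVectorizer` uses its own regular-expression tokenizer by default).

```python
from collections import Counter

def build_vocab(texts, max_features):
    """Map the max_features most frequent tokens to column indices."""
    counts = Counter(token for text in texts for token in text.lower().split())
    most_common = [token for token, _ in counts.most_common(max_features)]
    # Sorting makes the column order independent of the input order
    return {token: index for index, token in enumerate(sorted(most_common))}

def vectorize(text, vocab):
    """Return a count vector; tokens outside the vocabulary are ignored."""
    vector = [0] * len(vocab)
    for token in text.lower().split():
        if token in vocab:
            vector[vocab[token]] += 1
    return vector

toy_tweets = ["you are great", "you are you", "great day"]
toy_vocab = build_vocab(toy_tweets, max_features=3)   # "day" is too rare to be kept
toy_vector = vectorize("you are you", toy_vocab)
```

With `max_features=10000`, each tweet becomes a vector with 10000 entries in the same way, most of which are zero.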

In [245]:
from sklearn.feature_extraction.text import CountVectorizer

def prepare_vectorizer(tr_data):
    # YOUR CODE HERE
    vectorizer = CountVectorizer(max_features=10000)
    # fit() returns the fitted vectorizer itself, which holds the word-to-index vocabulary
    word_to_ix = vectorizer.fit(tr_data.tweet)
    return word_to_ix
    #raise NotImplementedError()


In [246]:
word_to_ix = prepare_vectorizer(tr_data)
VOCAB_SIZE = len(word_to_ix.vocabulary_)
assert VOCAB_SIZE == 10000

### 1.3 Prepare the DataLoader for batch processing

The __prepare_dataloader(..)__ function will take the training and the validation dataset and convert them to bag-of-words count vectors (one entry per vocabulary word) with the help of the fitted CountVectorizer.

You should prepare two FloatTensors for the converted tweets of the training and the validation data.

Then zip together the vectors with the labels as a list of tuples!

__Hint: look at the lecture (but be careful, we had different types of labels there!)__

In [247]:
def prepare_dataloader(tr_data, val_data, word_to_ix):
    # YOUR CODE HERE
    tr_data_vecs = torch.FloatTensor(word_to_ix.transform(tr_data.tweet).toarray())
    tr_labels = tr_data.label.tolist()

    val_data_vecs = torch.FloatTensor(word_to_ix.transform(val_data.tweet).toarray())
    val_labels = val_data.label.tolist()

    # Answer from the instructor (Ádám Kovács):
    # In the lecture the labels were in [1, 2, 3, 4], so they had to be shifted to
    # [0, 1, 2, 3] with (sample, label - 1). Here the labels are already in [0, 1],
    # so no shift is needed. Keeping "label - 1" would produce the invalid label -1
    # and an index error in the training loop of task 1.5 below.

    tr_data_loader = list(zip(tr_data_vecs, tr_labels))
    val_data_loader = list(zip(val_data_vecs, val_labels))
    return tr_data_loader, val_data_loader
    #raise NotImplementedError()

In [248]:
tr_data_loader, val_data_loader = prepare_dataloader(tr_data, val_data, word_to_ix)
assert type(tr_data_loader[0][0]) == torch.Tensor
assert len(tr_data_loader) == 9268
assert type(tr_data_loader[0][1]) == int

- __We have the correct lists now, it is time to initialize the DataLoader objects!__
- __Create two DataLoader objects with the lists we have created__
- __Shuffle the training data but not the validation data!__
- __Set a BATCH_SIZE, experiment with different sized batches to see if it improves the performance__
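
As a rough picture of what batching does to the 9268 training samples, the following sketch splits a list into shuffled batches using only the standard library. It is an illustrative stand-in, not how `DataLoader` is implemented; `iterate_batches` is a hypothetical helper and the integers merely stand in for `(vector, label)` pairs.

```python
import math
import random

def iterate_batches(samples, batch_size, shuffle, seed=1234):
    """Yield consecutive batches; optionally shuffle a copy of the samples first."""
    order = list(samples)
    if shuffle:
        random.Random(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        yield order[start:start + batch_size]

toy_samples = list(range(9268))   # stand-ins for (vector, label) pairs
toy_batches = list(iterate_batches(toy_samples, batch_size=64, shuffle=True))

n_batches = len(toy_batches)              # ceil(9268 / 64) batches per epoch
last_batch_size = len(toy_batches[-1])    # the final batch is smaller than 64
```

A smaller batch size means more weight updates per epoch (and usually slower epochs), which is one reason why changing `BATCH_SIZE` can change the results.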

In [249]:
from torch.utils.data import DataLoader

def create_dataloader_iterators(tr_data_loader, val_data_loader, BATCH_SIZE):
    # YOUR CODE HERE
    # Shuffle only the training data; keep the validation order fixed
    train_iterator = DataLoader(tr_data_loader, batch_size=BATCH_SIZE, shuffle=True)
    valid_iterator = DataLoader(val_data_loader, batch_size=BATCH_SIZE, shuffle=False)
    return train_iterator, valid_iterator
    #raise NotImplementedError()

In [250]:
# Try to experiment with different sized batches and see if changing this will improve the performance or not!
BATCH_SIZE = 64

# The effect of changing BATCH_SIZE is explored in task 2.1

In [251]:
train_iterator, valid_iterator = create_dataloader_iterators(tr_data_loader, val_data_loader, BATCH_SIZE)
assert type(train_iterator) == torch.utils.data.dataloader.DataLoader

In [252]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

### 1.4 Build the model
At first, the model should contain only a single Linear layer that takes the input vectors and transforms them into the dimension of __NUM_LABELS__ (how many classes we are trying to predict). Then, run the output through a softmax activation to produce probabilities of the classes! (The solution below uses log-softmax, which pairs with the `NLLLoss` criterion.)
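
The relationship between the raw scores of the linear layer, (log-)softmax and the negative log-likelihood loss can be checked by hand. The sketch below uses plain Python instead of PyTorch; `log_softmax` and `nll` are hypothetical helpers for a single sample, and the scores are made-up values.

```python
import math

def log_softmax(scores):
    """Numerically stable log-softmax over a list of raw scores."""
    highest = max(scores)
    log_sum = highest + math.log(sum(math.exp(s - highest) for s in scores))
    return [s - log_sum for s in scores]

def nll(log_probs, target):
    """Negative log-likelihood of the target class for one sample."""
    return -log_probs[target]

toy_scores = [2.0, 0.0]   # hypothetical outputs of the linear layer for one tweet
toy_log_probs = log_softmax(toy_scores)
toy_probs = [math.exp(lp) for lp in toy_log_probs]   # sums to 1
toy_loss = nll(toy_log_probs, target=0)              # small, because class 0 is favoured
```

Taking `argmax` over the log-probabilities gives the same predicted class as taking it over the probabilities, because the logarithm is monotonic.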

In [253]:
from torch import nn
import torch.nn.functional as F  # needed for F.log_softmax in forward()

class BoWClassifier(nn.Module):  # inheriting from nn.Module!
    # YOUR CODE HERE
    def __init__(self, num_labels, vocab_size):
        # Calls the init function of nn.Module. Don't get confused by the syntax,
        # just always do it in an nn.Module
        super(BoWClassifier, self).__init__()

        # vocab_size: number of input features (one per vocabulary word)
        # num_labels: number of output nodes, one per class (here 2: OFF and NOT)
        self.linear = nn.Linear(vocab_size, num_labels)

    def forward(self, bow_vec):
        # Log-probabilities over the classes, as expected by nn.NLLLoss
        return F.log_softmax(self.linear(bow_vec), dim=1)
    #raise NotImplementedError()

In [254]:
# SET THE CORRECT INPUT AND OUTPUT DIMENSIONS!
INPUT_DIM = 10000
OUTPUT_DIM = 2
# YOUR CODE HERE
#raise NotImplementedError()

In [255]:
model = BoWClassifier(OUTPUT_DIM, INPUT_DIM)

In [256]:
# Set the optimizer and the loss function!
import torch.optim as optim

optimizer = optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.NLLLoss()

In [257]:
model = model.to(device)
criterion = criterion.to(device)

In [258]:
assert model.linear.in_features == 10000
assert model.linear.out_features == 2

### Implement the following functions:
- __calculate_performance__: This should calculate the batch-wise accuracy of your model!
- __train__ - Train your model on the training data! This function should set the model to training mode, then use the given iterator to iterate through the training samples and make predictions using the provided model. You should then propagate back the error with the loss function and the optimizer. Finally return the average epoch loss and accuracy!
- __evaluate__ - Evaluate your model on the validation dataset. This function is essentially the same as the training function, but you should set your model to eval mode and not propagate the errors back to your weights!
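
One detail worth keeping in mind when averaging a batch-wise metric over an epoch: a plain mean of batch means gives a small final batch the same influence as a full one. The sketch below compares both ways of averaging on made-up counts; the helper names are hypothetical and the difference is usually small in practice.

```python
def mean_of_batch_means(batch_correct, batch_sizes):
    """Average the per-batch accuracies, every batch counting equally."""
    per_batch = [c / n for c, n in zip(batch_correct, batch_sizes)]
    return sum(per_batch) / len(per_batch)

def sample_weighted_mean(batch_correct, batch_sizes):
    """Accuracy over all samples, every sample counting equally."""
    return sum(batch_correct) / sum(batch_sizes)

toy_correct = [48, 48, 1]   # correct predictions per batch (made-up numbers)
toy_sizes = [64, 64, 2]     # the last batch is much smaller

batch_mean = mean_of_batch_means(toy_correct, toy_sizes)
sample_mean = sample_weighted_mean(toy_correct, toy_sizes)
```

Dividing the summed batch scores by the number of batches corresponds to the first variant.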

In [259]:
def calculate_performance(preds, y):
    # YOUR CODE HERE
    rounded_preds = preds.argmax(1)
    # Calculate the correct predictions batch-wise
    correct = (rounded_preds == y).float()
    
    # Calculate the accuracy of your model
    acc = correct.sum() / len(correct)
    return acc
    #raise NotImplementedError()

In [260]:
import torch.nn.functional as F
def train(model, iterator, optimizer, criterion):
    epoch_loss = 0
    epoch_acc = 0
    # YOUR CODE HERE
    model.train()
    
    # We calculate the error on batches, so the iterator returns matrices with shape [BATCH_SIZE, VOCAB_SIZE]
    for texts, labels in iterator:  # one iteration = one batch passed through the model and one weight update
                                    # number of iterations per epoch = ceil(total samples / batch size)
        # We copy the text and label to the correct device
        texts = texts.to(device)
        labels = labels.to(device)
        
        # We reset the gradients from the last step, because PyTorch accumulates gradients
        # across backward() calls; without this, gradients of earlier batches would be added in
        optimizer.zero_grad()
                
        # This runs the forward function on your model (you don't need to call it directly)    
        predictions = model(texts)

        # Calculate the loss and the accuracy on the predictions (the predictions are log probabilities, remember!)
        loss = criterion(predictions, labels)
        acc = calculate_performance(predictions, labels)
        
        # Propagate the error back on the model (this means changing the initial weights in your model)
        loss.backward()
        optimizer.step()
        
        # We add batch-wise loss to the epoch-wise loss
        epoch_loss += loss.item()
        epoch_acc += acc.item()
    #raise NotImplementedError()
    return epoch_loss / len(iterator), epoch_acc / len(iterator)

In [261]:
def evaluate(model, iterator, criterion):
    
    epoch_loss = 0
    epoch_acc = 0
    # YOUR CODE HERE
    # On the validation dataset we don't want training, so we set the model to evaluation mode
    model.eval()  # switches layers such as dropout and batch norm to inference behaviour
    
    # Also tell PyTorch not to track gradients
    # This is needed when you only want to make predictions and use your model in inference mode!
    with torch.no_grad():  # no gradients are computed, so no backpropagation or weight update can happen
    
        # The remaining part is the same, except that the optimizer is not used for backpropagation
        for texts, labels in iterator:  # number of iterations = ceil(validation samples / batch size)
          
            # We copy the text and label to the correct device
            texts = texts.to(device)
            labels = labels.to(device)
            
            predictions = model(texts)
            loss = criterion(predictions, labels)
            
            acc = calculate_performance(predictions, labels)

            epoch_loss += loss.item()
            epoch_acc += acc.item()
    
    return epoch_loss / len(iterator), epoch_acc / len(iterator)

In [262]:
import time

def epoch_time(start_time, end_time):
    elapsed_time = end_time - start_time
    elapsed_mins = int(elapsed_time / 60)
    elapsed_secs = int(elapsed_time - (elapsed_mins * 60))
    return elapsed_mins, elapsed_secs

### 1.5 Training loop!
Below is the training loop of our model! Try to set an EPOCH number that will correctly train your model :) (neither underfitted nor overfitted!)

In [263]:
# Set an EPOCH number!
# One epoch = one full pass of the model over the training data
N_EPOCHS = 15  # Chosen because in the last two epochs the training loss increased
# while the validation loss stagnated

In [264]:
best_valid_loss = float('inf')

for epoch in range(N_EPOCHS):

    start_time = time.time()
    
    train_loss, train_score = train(model, train_iterator, optimizer, criterion)
    valid_loss, valid_score = evaluate(model, valid_iterator, criterion)
    
    end_time = time.time()

    epoch_mins, epoch_secs = epoch_time(start_time, end_time)
    
    print(f'Epoch: {epoch+1:02} | Epoch Time: {epoch_mins}m {epoch_secs}s')
    print(f'\tTrain Loss: {train_loss:.3f} | Train Fscore: {train_score*100:.2f}%')
    print(f'\t Val. Loss: {valid_loss:.3f} |  Val. Fscore: {valid_score*100:.2f}%')

Epoch: 01 | Epoch Time: 0m 0s
	Train Loss: 0.638 | Train Fscore: 66.81%
	 Val. Loss: 0.616 |  Val. Fscore: 68.20%
Epoch: 02 | Epoch Time: 0m 0s
	Train Loss: 0.580 | Train Fscore: 70.68%
	 Val. Loss: 0.592 |  Val. Fscore: 70.34%
Epoch: 03 | Epoch Time: 0m 0s
	Train Loss: 0.539 | Train Fscore: 74.61%
	 Val. Loss: 0.577 |  Val. Fscore: 71.75%
Epoch: 04 | Epoch Time: 0m 0s
	Train Loss: 0.507 | Train Fscore: 77.67%
	 Val. Loss: 0.565 |  Val. Fscore: 72.02%
Epoch: 05 | Epoch Time: 0m 0s
	Train Loss: 0.480 | Train Fscore: 79.52%
	 Val. Loss: 0.557 |  Val. Fscore: 72.77%
Epoch: 06 | Epoch Time: 0m 0s
	Train Loss: 0.457 | Train Fscore: 81.06%
	 Val. Loss: 0.551 |  Val. Fscore: 73.26%
Epoch: 07 | Epoch Time: 0m 0s
	Train Loss: 0.437 | Train Fscore: 82.25%
	 Val. Loss: 0.546 |  Val. Fscore: 73.64%
Epoch: 08 | Epoch Time: 0m 0s
	Train Loss: 0.420 | Train Fscore: 83.40%
	 Val. Loss: 0.542 |  Val. Fscore: 74.45%
Epoch: 09 | Epoch Time: 0m 0s
	Train Loss: 0.404 | Train Fscore: 84.76%
	 Val. Loss: 0.5

### 1.6 Change calculate_performance to calculate FScore instead of accuracy

Our dataset is very imbalanced. We have twice as many NOT offensive tweets as offensive ones. Accuracy is not a good measure for this.

See https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html for fscore calculation.

You should expect a heavy drop in performance when you calculate fscore instead of accuracy!
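
To see why F-score and accuracy can differ on imbalanced data, here is a small hand computation in plain Python. It is a sketch for illustration (`f1_for_class` is a hypothetical helper, and the six labels are made up); in the exercise itself `sklearn.metrics.f1_score` does this work.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1 score treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0   # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

toy_true = [1, 1, 1, 1, 0, 0]   # 1 = NOT (majority), 0 = OFF (minority)
toy_pred = [1, 1, 1, 1, 1, 0]   # one offensive tweet is missed

accuracy = sum(t == p for t, p in zip(toy_true, toy_pred)) / len(toy_true)
f1_off = f1_for_class(toy_true, toy_pred, positive=0)
f1_not = f1_for_class(toy_true, toy_pred, positive=1)

# Support-weighted average of the per-class scores (4 NOT samples, 2 OFF samples)
weighted_f1 = (4 * f1_not + 2 * f1_off) / 6
```

A single missed offensive tweet costs accuracy one sixth, but it halves the recall of the minority class, which pulls the F-score of that class down much further.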

__NOTE: DON'T FORGET TO RERUN THE MODEL INITIALIZATION WHEN YOU RUN THE MODEL MULTIPLE TIMES. IF YOU DON'T REINITIALIZE THE MODEL, IT WILL CONTINUE TRAINING WHERE IT STOPPED LAST TIME INSTEAD OF STARTING FROM SCRATCH!__

These lines:


`model = BoWClassifier(OUTPUT_DIM, INPUT_DIM)
optimizer = optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.NLLLoss()
model = model.to(device)
criterion = criterion.to(device)`

This will reinitialize the model!

In [265]:
from sklearn.metrics import f1_score

def calculate_performance(preds, y):
    # YOUR CODE HERE
    # Turn the log-probabilities into predicted class indices;
    # f1_score expects class labels, not scores
    rounded_preds = preds.argmax(1)

    # f1_score works on plain Python lists (or NumPy arrays), not on
    # torch.Tensor objects, so convert both tensors to lists
    y = y.tolist()
    rounded_preds = rounded_preds.tolist()

    # Batch-wise F-score, weighted by the number of samples in each class
    return f1_score(y, rounded_preds, average='weighted')
  

In [266]:
# RE-INITIALIZE THE MODEL FOR EACH RUN:
model = BoWClassifier(OUTPUT_DIM, INPUT_DIM) 
optimizer = optim.Adam(model.parameters(), lr=1e-3) 
criterion = nn.NLLLoss() 
model = model.to(device) 
criterion = criterion.to(device)
#########
for epoch in range(N_EPOCHS):

    start_time = time.time()
    
    train_loss, train_score = train(model, train_iterator, optimizer, criterion)
    valid_loss, valid_score = evaluate(model, valid_iterator, criterion)
    
    end_time = time.time()

    epoch_mins, epoch_secs = epoch_time(start_time, end_time)
    
    print(f'Epoch: {epoch+1:02} | Epoch Time: {epoch_mins}m {epoch_secs}s')
    print(f'\tTrain Loss: {train_loss:.3f} | Train Fscore: {train_score*100:.2f}%')
    print(f'\t Val. Loss: {valid_loss:.3f} |  Val. Fscore: {valid_score*100:.2f}%')

Epoch: 01 | Epoch Time: 0m 1s
	Train Loss: 0.640 | Train Fscore: 54.11%
	 Val. Loss: 0.617 |  Val. Fscore: 56.71%
Epoch: 02 | Epoch Time: 0m 0s
	Train Loss: 0.580 | Train Fscore: 61.55%
	 Val. Loss: 0.593 |  Val. Fscore: 61.52%
Epoch: 03 | Epoch Time: 0m 0s
	Train Loss: 0.539 | Train Fscore: 68.84%
	 Val. Loss: 0.576 |  Val. Fscore: 65.53%
Epoch: 04 | Epoch Time: 0m 0s
	Train Loss: 0.506 | Train Fscore: 73.86%
	 Val. Loss: 0.566 |  Val. Fscore: 67.73%
Epoch: 05 | Epoch Time: 0m 0s
	Train Loss: 0.479 | Train Fscore: 76.99%
	 Val. Loss: 0.557 |  Val. Fscore: 67.95%
Epoch: 06 | Epoch Time: 0m 0s
	Train Loss: 0.456 | Train Fscore: 78.91%
	 Val. Loss: 0.550 |  Val. Fscore: 69.67%
Epoch: 07 | Epoch Time: 0m 0s
	Train Loss: 0.437 | Train Fscore: 80.73%
	 Val. Loss: 0.545 |  Val. Fscore: 70.23%
Epoch: 08 | Epoch Time: 0m 0s
	Train Loss: 0.419 | Train Fscore: 81.82%
	 Val. Loss: 0.542 |  Val. Fscore: 71.05%
Epoch: 09 | Epoch Time: 0m 0s
	Train Loss: 0.404 | Train Fscore: 83.06%
	 Val. Loss: 0.5

## 2. Add more linear layers to your model and experiment with other hyperparameters

### 2.1 More layers

Currently we only have a single linear layer in our model. Try to add one or more additional linear layers to the model.
You should introduce a HIDDEN_SIZE parameter that will be the size of the intermediate representation between the linear layers. Also add a ReLU activation function between the linear layers.

See more:
- https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html
- https://pytorch.org/tutorials/beginner/examples_nn/two_layer_net_nn.html
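
Adding a hidden layer increases the number of trainable parameters considerably, which is worth knowing when interpreting overfitting on roughly nine thousand training tweets. The count below is simple arithmetic; `linear_params` is a hypothetical helper, and the hidden size of 200 is one example value.

```python
def linear_params(in_features, out_features):
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features

VOCAB = 10000    # input dimension used in this lab
LABELS = 2       # OFF and NOT
HIDDEN = 200     # example hidden size

single_layer = linear_params(VOCAB, LABELS)
two_layers = linear_params(VOCAB, HIDDEN) + linear_params(HIDDEN, LABELS)
ratio = two_layers / single_layer
```

The two-layer model has about a hundred times as many parameters as the single linear layer, so it can fit the training set far more closely.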

In [267]:
from torch import nn

class BoWDeepClassifier(nn.Module):  # inheriting from nn.Module!
    def __init__(self, num_labels, vocab_size, hidden_size):
        # YOUR CODE HERE
        super(BoWDeepClassifier, self).__init__()

        # One hidden layer with a ReLU activation, followed by the output layer
        self.sequence = nn.Sequential(
            nn.Linear(vocab_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, num_labels),
            nn.LogSoftmax(dim=1)
        )

        #raise NotImplementedError()

    def forward(self, bow_vec):
        # YOUR CODE HERE
        output= self.sequence(bow_vec)
        return output
        #raise NotImplementedError()

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

### Write down your experiences with changing the parameters to the cell below

In [None]:
# YOUR CODE HERE
'''
HIDDEN_SIZE = 200
learning_rate = 0.001
BATCH_SIZE = 15
N_EPOCHS = 15

=>Train Loss: 0.012 | Train Fscore: 99.72%
	 Val. Loss: 1.482 |  Val. Fscore: 72.54%
#########
HIDDEN_SIZE = 200
learning_rate = 0.001
BATCH_SIZE = 64
N_EPOCHS = 15

=> The final result : Train Loss: 0.011 | Train Fscore: 99.74%
	 Val. Loss: 1.497 |  Val. Fscore: 72.03%

With one hidden_layer added, the train loss decreased => train score increased
while validation loss increased => validation score decreased
###########
HIDDEN_SIZE = 200
learning_rate = 0.001
BATCH_SIZE = 100
N_EPOCHS = 15

=> Train Loss: 0.012 | Train Fscore: 99.73%
	 Val. Loss: 1.469 |  Val. Fscore: 72.65%

##############
HIDDEN_SIZE = 200
learning_rate = 0.001
BATCH_SIZE = 100
N_EPOCHS = 6

=>Train Loss: 0.059 | Train Fscore: 98.52%
	 Val. Loss: 0.994 |  Val. Fscore: 72.87%

#############
HIDDEN_SIZE = 230
learning_rate = 0.001
BATCH_SIZE = 100
N_EPOCHS = 6

=>Train Loss: 0.049 | Train Fscore: 98.81%
	 Val. Loss: 1.036 |  Val. Fscore: 72.48%
###########
HIDDEN_SIZE = 180
learning_rate = 0.001
BATCH_SIZE = 100
N_EPOCHS = 6

=>Train Loss: 0.057 | Train Fscore: 98.66%
	 Val. Loss: 0.994 |  Val. Fscore: 72.36%

=>> Decreasing the number of epochs to an appropriate value can raise the validation score => the model overfits less
=>> Decreasing/increasing hidden_size => decreases/increases the train score, but the validation score decreased in both cases
=>> Decreasing/increasing BATCH_SIZE => decreases/increases both the train score and the validation score
'''
#raise NotImplementedError()

In [268]:
HIDDEN_SIZE = 200
learning_rate = 0.001
BATCH_SIZE = 64
N_EPOCHS = 15

In [269]:
model = BoWDeepClassifier(OUTPUT_DIM, INPUT_DIM, HIDDEN_SIZE)

optimizer = optim.Adam(model.parameters(), lr=learning_rate)
criterion = nn.NLLLoss()

model = model.to(device)
criterion = criterion.to(device)

In [270]:


# TRAINING LOOP HERE!
best_valid_loss = float('inf')

for epoch in range(N_EPOCHS):

    start_time = time.time()
    
    train_loss, train_score = train(model, train_iterator, optimizer, criterion)
    valid_loss, valid_score = evaluate(model, valid_iterator, criterion)
    
    end_time = time.time()

    epoch_mins, epoch_secs = epoch_time(start_time, end_time)
    
    print(f'Epoch: {epoch+1:02} | Epoch Time: {epoch_mins}m {epoch_secs}s')
    print(f'\tTrain Loss: {train_loss:.3f} | Train Fscore: {train_score*100:.2f}%')
    print(f'\t Val. Loss: {valid_loss:.3f} |  Val. Fscore: {valid_score*100:.2f}%')

Epoch: 01 | Epoch Time: 0m 6s
	Train Loss: 0.600 | Train Fscore: 62.30%
	 Val. Loss: 0.545 |  Val. Fscore: 71.33%
Epoch: 02 | Epoch Time: 0m 5s
	Train Loss: 0.397 | Train Fscore: 81.75%
	 Val. Loss: 0.563 |  Val. Fscore: 74.13%
Epoch: 03 | Epoch Time: 0m 5s
	Train Loss: 0.243 | Train Fscore: 90.38%
	 Val. Loss: 0.678 |  Val. Fscore: 73.12%
Epoch: 04 | Epoch Time: 0m 5s
	Train Loss: 0.150 | Train Fscore: 94.78%
	 Val. Loss: 0.769 |  Val. Fscore: 72.75%
Epoch: 05 | Epoch Time: 0m 5s
	Train Loss: 0.090 | Train Fscore: 97.37%
	 Val. Loss: 0.879 |  Val. Fscore: 72.67%
Epoch: 06 | Epoch Time: 0m 5s
	Train Loss: 0.058 | Train Fscore: 98.43%
	 Val. Loss: 0.996 |  Val. Fscore: 72.17%
Epoch: 07 | Epoch Time: 0m 6s
	Train Loss: 0.040 | Train Fscore: 99.02%
	 Val. Loss: 1.073 |  Val. Fscore: 72.22%
Epoch: 08 | Epoch Time: 0m 6s
	Train Loss: 0.031 | Train Fscore: 99.20%
	 Val. Loss: 1.162 |  Val. Fscore: 72.17%
Epoch: 09 | Epoch Time: 0m 6s
	Train Loss: 0.024 | Train Fscore: 99.43%
	 Val. Loss: 1.2

# ================ PASSING LEVEL ====================

## 3. Implement automatic early-stopping in the training loop
Early stopping is a very easy method to avoid the overfitting of your model.

You should:
- Save the training and the validation loss of the last two epochs (if you are at least in the third epoch)
- If the loss increased in the last two epochs on the training data but decreased or stagnated on the validation data, you should stop the training automatically!
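
For comparison, a widely used alternative criterion is patience-based early stopping, which watches only the validation loss and stops once it has not improved for a fixed number of epochs. The sketch below is not the rule required by this task; the `EarlyStopper` class and its `patience` and `min_delta` parameters are hypothetical names, and the loss values mirror the first validation losses printed by the run in this section.

```python
class EarlyStopper:
    """Signal a stop once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=2, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def should_stop(self, valid_loss):
        if valid_loss < self.best_loss - self.min_delta:
            # Improvement: remember it and reset the counter
            self.best_loss = valid_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

toy_valid_losses = [0.545, 0.557, 0.669, 0.768, 0.876]

stopper = EarlyStopper(patience=2)
stopped_at = None
for epoch, loss in enumerate(toy_valid_losses, start=1):
    if stopper.should_stop(loss):
        stopped_at = epoch
        break
```

In a real training loop the call to `should_stop` would go right after `evaluate(...)`, typically together with saving the model weights whenever the validation loss improves.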

In [272]:
HIDDEN_SIZE = 200
learning_rate = 0.001
BATCH_SIZE = 64
N_EPOCHS = 45

model = BoWDeepClassifier(OUTPUT_DIM, INPUT_DIM, HIDDEN_SIZE)

optimizer = optim.Adam(model.parameters(), lr=learning_rate)
criterion = nn.NLLLoss()

model = model.to(device)
criterion = criterion.to(device)


# YOUR CODE HERE
tr_loss_0=0
val_loss_0=0
tr_loss_1=0
val_loss_1=0
for epoch in range(N_EPOCHS):

    start_time = time.time()
    
    train_loss, train_score = train(model, train_iterator, optimizer, criterion)
    valid_loss, valid_score = evaluate(model, valid_iterator, criterion)
    if epoch == 0:
      tr_loss_0 = train_loss
      val_loss_0 = valid_loss
    if epoch == 1 :
      tr_loss_1 = train_loss
      val_loss_1 = valid_loss
    if epoch > 1:
      if (tr_loss_1 - tr_loss_0) > 0 and (val_loss_1 - val_loss_0) <= 0:  
        break
      else:
        tr_loss_0 = tr_loss_1
        val_loss_0 = val_loss_1
        tr_loss_1 = train_loss
        val_loss_1 = valid_loss

    end_time = time.time()

    epoch_mins, epoch_secs = epoch_time(start_time, end_time)
    
    print(f'Epoch: {epoch+1:02} | Epoch Time: {epoch_mins}m {epoch_secs}s')
    print(f'\tTrain Loss: {train_loss:.3f} | Train Fscore: {train_score*100:.2f}%')
    print(f'\t Val. Loss: {valid_loss:.3f} |  Val. Fscore: {valid_score*100:.2f}%')
#raise NotImplementedError()

Epoch: 01 | Epoch Time: 0m 6s
	Train Loss: 0.597 | Train Fscore: 61.43%
	 Val. Loss: 0.545 |  Val. Fscore: 70.43%
Epoch: 02 | Epoch Time: 0m 5s
	Train Loss: 0.402 | Train Fscore: 81.63%
	 Val. Loss: 0.557 |  Val. Fscore: 73.81%
Epoch: 03 | Epoch Time: 0m 5s
	Train Loss: 0.246 | Train Fscore: 90.13%
	 Val. Loss: 0.669 |  Val. Fscore: 73.75%
Epoch: 04 | Epoch Time: 0m 5s
	Train Loss: 0.147 | Train Fscore: 94.84%
	 Val. Loss: 0.768 |  Val. Fscore: 73.31%
Epoch: 05 | Epoch Time: 0m 5s
	Train Loss: 0.088 | Train Fscore: 97.44%
	 Val. Loss: 0.876 |  Val. Fscore: 72.92%
Epoch: 06 | Epoch Time: 0m 5s
	Train Loss: 0.055 | Train Fscore: 98.52%
	 Val. Loss: 1.018 |  Val. Fscore: 72.71%
Epoch: 07 | Epoch Time: 0m 6s
	Train Loss: 0.038 | Train Fscore: 99.06%
	 Val. Loss: 1.079 |  Val. Fscore: 73.02%
Epoch: 08 | Epoch Time: 0m 7s
	Train Loss: 0.028 | Train Fscore: 99.35%
	 Val. Loss: 1.175 |  Val. Fscore: 72.42%
Epoch: 09 | Epoch Time: 0m 7s
	Train Loss: 0.023 | Train Fscore: 99.40%
	 Val. Loss: 1.2

## 4. Handling class imbalance
Our data is imbalanced: the NOT class (label 1) has about twice as many samples as the OFF class (label 0).

One way of handling imbalanced data is to weight the loss function, so it penalizes errors on the smaller class.

Look at the documentation of the loss function: https://pytorch.org/docs/stable/generated/torch.nn.NLLLoss.html

Set the weights based on the inverse population of the classes (so the fewer samples a class has, the more its errors will be penalized!)
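
The effect of inverse-frequency weights can be worked out by hand. The sketch below is plain Python, not PyTorch: `inverse_frequency_weights` and `weighted_mean_nll` are hypothetical helpers, and the weighted mean (dividing by the sum of the target weights) follows the formula documented for `nn.NLLLoss` with `reduction='mean'`. The class counts are those of the training split; the two predictions are made up.

```python
import math

def inverse_frequency_weights(class_counts):
    """Weight each class by 1/count, then rescale so the weights sum to 1."""
    inverse = [1.0 / count for count in class_counts]
    total = sum(inverse)
    return [value / total for value in inverse]

def weighted_mean_nll(log_probs, targets, weights):
    """Weighted average of -log p(target), normalised by the sum of target weights."""
    numerator = sum(-weights[t] * row[t] for row, t in zip(log_probs, targets))
    denominator = sum(weights[t] for t in targets)
    return numerator / denominator

toy_counts = [3089, 6179]   # label 0 (OFF), label 1 (NOT)
toy_weights = inverse_frequency_weights(toy_counts)

# Two made-up predictions, each with log-probabilities for [label 0, label 1]
toy_log_probs = [[math.log(0.9), math.log(0.1)],
                 [math.log(0.9), math.log(0.1)]]
toy_targets = [0, 1]        # the second prediction is wrong, on the majority class

toy_loss = weighted_mean_nll(toy_log_probs, toy_targets, toy_weights)
unweighted_loss = weighted_mean_nll(toy_log_probs, toy_targets, [1.0, 1.0])
```

Because the mean is normalised by the weights, only the ratio between the class weights matters here, so the unscaled values `1/3089` and `1/6179` give the same loss as the rescaled ones.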

In [273]:
tr_data.groupby("label").size()

label
0    3089
1    6179
dtype: int64

In [274]:
# YOUR CODE HERE
model = BoWDeepClassifier(OUTPUT_DIM, INPUT_DIM, HIDDEN_SIZE)

optimizer = optim.Adam(model.parameters(), lr=learning_rate)

# Weighted loss to deal with the imbalanced training data:
# label 0 has 3089 samples, label 1 has 6179 samples,
# so the weight of label 0 should be larger than the weight of label 1
# 1/3089 is the weight of label 0
# 1/6179 is the weight of label 1
criterion = nn.NLLLoss(weight=torch.tensor([1 / 3089, 1 / 6179]))

model = model.to(device)
criterion = criterion.to(device)
#raise NotImplementedError()

In [275]:
tr_loss_0=0
val_loss_0=0
tr_loss_1=0
val_loss_1=0
for epoch in range(N_EPOCHS):

    start_time = time.time()
    
    train_loss, train_score = train(model, train_iterator, optimizer, criterion)
    valid_loss, valid_score = evaluate(model, valid_iterator, criterion)
    if epoch == 0:
      tr_loss_0 = train_loss
      val_loss_0 = valid_loss
    if epoch == 1 :
      tr_loss_1 = train_loss
      val_loss_1 = valid_loss
    if epoch > 1:
      if (tr_loss_1 - tr_loss_0) > 0 and (val_loss_1 - val_loss_0) <= 0:  
        break
      else:
        tr_loss_0 = tr_loss_1
        val_loss_0 = val_loss_1
        tr_loss_1 = train_loss
        val_loss_1 = valid_loss

    end_time = time.time()

    epoch_mins, epoch_secs = epoch_time(start_time, end_time)
    
    print(f'Epoch: {epoch+1:02} | Epoch Time: {epoch_mins}m {epoch_secs}s')
    print(f'\tTrain Loss: {train_loss:.3f} | Train Fscore: {train_score*100:.2f}%')
    print(f'\t Val. Loss: {valid_loss:.3f} |  Val. Fscore: {valid_score*100:.2f}%')

Epoch: 01 | Epoch Time: 0m 6s
	Train Loss: 0.641 | Train Fscore: 65.40%
	 Val. Loss: 0.592 |  Val. Fscore: 70.38%
Epoch: 02 | Epoch Time: 0m 5s
	Train Loss: 0.418 | Train Fscore: 83.37%
	 Val. Loss: 0.627 |  Val. Fscore: 71.01%
Epoch: 03 | Epoch Time: 0m 5s
	Train Loss: 0.248 | Train Fscore: 91.01%
	 Val. Loss: 0.754 |  Val. Fscore: 73.11%
Epoch: 04 | Epoch Time: 0m 5s
	Train Loss: 0.144 | Train Fscore: 95.39%
	 Val. Loss: 0.907 |  Val. Fscore: 72.94%
Epoch: 05 | Epoch Time: 0m 5s
	Train Loss: 0.090 | Train Fscore: 97.43%
	 Val. Loss: 1.056 |  Val. Fscore: 72.29%
Epoch: 06 | Epoch Time: 0m 6s
	Train Loss: 0.058 | Train Fscore: 98.61%
	 Val. Loss: 1.187 |  Val. Fscore: 72.01%
Epoch: 07 | Epoch Time: 0m 7s
	Train Loss: 0.050 | Train Fscore: 98.86%
	 Val. Loss: 1.200 |  Val. Fscore: 72.03%
Epoch: 08 | Epoch Time: 0m 7s
	Train Loss: 0.033 | Train Fscore: 99.28%
	 Val. Loss: 1.359 |  Val. Fscore: 72.41%
Epoch: 09 | Epoch Time: 0m 7s
	Train Loss: 0.025 | Train Fscore: 99.45%
	 Val. Loss: 1.4

# ================ EXTRA LEVEL ====================