<a href="https://colab.research.google.com/github/viniciusjk/Vinicius-Jokubauskas-Portfolio/blob/main/Sentiment_Analysis_using_Bag_of_Embeddings.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Sentiment Analysis using Bag of Embeddings

This notebook is a project for the course Introdução ao Aprendizado Profundo (Introduction to Deep Learning), taken at Unicamp in the first semester of 2021.

*This notebook is meant to be run on Google Colab, where more computational resources are available.*

In [25]:
!pip install pytorch-lightning==1.6.2 neptune-client==0.9.1

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [26]:
import numpy as np
import pandas as pd
import torch
import pytorch_lightning as pl
import neptune.new as neptune
from torch.utils.data import Dataset, DataLoader
pl.seed_everything(123)

api_token = 'TOKEN FROM NEPTUNE'
project = 'PROJECT FROM NEPTUNE'

INFO:pytorch_lightning.utilities.seed:Global seed set to 123


## Preparing Data

Downloading dataset:

In [27]:
!wget -nc http://files.fast.ai/data/examples/imdb_sample.tgz
!tar -xzf imdb_sample.tgz

File ‘imdb_sample.tgz’ already there; not retrieving.



Loading dataset using Pandas:

In [28]:
df = pd.read_csv('imdb_sample/texts.csv')
df.shape
df.head()

Unnamed: 0,label,text,is_valid
0,negative,Un-bleeping-believable! Meg Ryan doesn't even ...,False
1,positive,This is a extremely well-made film. The acting...,False
2,negative,Every once in a long while a movie will come a...,False
3,positive,Name just says it all. I watched this movie wi...,False
4,negative,This movie succeeds at being one of the most u...,False


Splitting the dataset into train and test sets:

- The validation set is a 10% sample drawn from the training set

In [29]:
train = df[df['is_valid'] == False]
test = df[df['is_valid'] == True]
valid = train.sample(frac=0.10)
train = train[~train.index.isin(valid.index)]
print('train.shape:', train.shape)
print('valid.shape:', valid.shape)
print('test.shape:', test.shape)

train.shape: (720, 3)
valid.shape: (80, 3)
test.shape: (200, 3)
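The split logic above can be sketched on a toy frame. The column names mirror the dataset, but the rows are made up, and the explicit `random_state=123` and the 25% fraction are assumptions for illustration (the notebook itself relies on `pl.seed_everything` and a 10% fraction).

```python
import pandas as pd

# Toy frame with the same columns as texts.csv (contents are made up).
toy = pd.DataFrame({
    'label': ['negative', 'positive'] * 10,
    'text': [f'review {i}' for i in range(20)],
    'is_valid': [False] * 16 + [True] * 4,
})

toy_train = toy[~toy['is_valid']]
toy_test = toy[toy['is_valid']]

# Hold out 25% of the training rows for validation, then drop them from train.
toy_valid = toy_train.sample(frac=0.25, random_state=123)
toy_train = toy_train.drop(toy_valid.index)

print(len(toy_train), len(toy_valid), len(toy_test))
```

Dropping the sampled index from the training frame keeps the three sets disjoint, which is the property the `~train.index.isin(valid.index)` filter enforces above.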


Creating the X and the Y (ground-truth) variables:

In [30]:
X_train = train['text']
Y_train = train['label']
X_valid = valid['text']
Y_valid = valid['label']
X_test = test['text']
Y_test = test['label']

print('X_train.head():', X_train.head())
print('Y_train.head():', Y_train.head())


X_train.head(): 0    Un-bleeping-believable! Meg Ryan doesn't even ...
1    This is a extremely well-made film. The acting...
2    Every once in a long while a movie will come a...
3    Name just says it all. I watched this movie wi...
5    From the start, you know how this movie will e...
Name: text, dtype: object
Y_train.head(): 0    negative
1    positive
2    negative
3    positive
5    negative
Name: label, dtype: object


Converting "positive" and "negative" to boolean variables:

In [31]:
mapping = {'positive': True, 'negative': False}
Y_train_bool = Y_train.map(mapping)
Y_valid_bool = Y_valid.map(mapping)
Y_test_bool = Y_test.map(mapping)
print(Y_train_bool.head())

0    False
1     True
2    False
3     True
5    False
Name: label, dtype: bool


## Function that builds the DataLoaders

- Used by all models

In [32]:
def get_dataloaders(dataset_train, dataset_valid, dataset_test, batch_size):

  dataloader_train = DataLoader(dataset=dataset_train, batch_size=batch_size, shuffle=True)
  dataloader_valid = DataLoader(dataset=dataset_valid, batch_size=batch_size, shuffle=False)
  dataloader_test = DataLoader(dataset=dataset_test, batch_size=batch_size, shuffle=False)
  return dataloader_train, dataloader_valid, dataloader_test
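As a minimal sketch of what these loaders yield, the example below uses a hypothetical `TensorDataset` (not the notebook's dataset) with the same shuffle settings: shuffling for training only.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical toy dataset: 10 samples with 3 features and binary labels.
features = torch.arange(30, dtype=torch.float32).reshape(10, 3)
labels = torch.tensor([0, 1] * 5)
toy_dataset = TensorDataset(features, labels)

# Same settings as get_dataloaders: shuffle only the training loader.
loader_train = DataLoader(toy_dataset, batch_size=4, shuffle=True)
loader_eval = DataLoader(toy_dataset, batch_size=4, shuffle=False)

# Without shuffling, batches follow the dataset order.
batch_sizes = [len(x) for x, _ in loader_eval]
first_batch_x, first_batch_y = next(iter(loader_eval))
print(batch_sizes)
```

The final batch is smaller than `batch_size` whenever the dataset size is not a multiple of it, which matters for any code that assumes equally sized batches.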

## Lightning Model

- Used to train all models

In [33]:
class LitModel(pl.LightningModule):

  def __init__( self, model, hparams, run=None,):
      super().__init__()
      self.model = model
      self.save_hyperparameters(hparams)
      self.run = run
      # `reduce` is a deprecated argument; reduce=None falls back to the default mean reduction
      self.criterion = torch.nn.CrossEntropyLoss(reduction='mean')

      if self.run:
          self.run['hparams'] = hparams

      print(self.hparams)
      
  def forward(self, x):

      logits = self.model(x)
      preds = logits.argmax(dim=1)
      return logits, preds

  """
  ===========================================
                  TRAINING
  ===========================================
  """


  def training_step(self, batch, batch_idx):

      (x, y) = batch
      if self.hparams.get('dropout'):
        if x.is_sparse:
            logits = self.model(torch.nn.Dropout(self.hparams.dropout)(x.to_dense().float()).to_sparse())
        else:
            logits = self.model(torch.nn.Dropout(self.hparams.dropout)(x.float()))
      else:
        logits = self.model(x)
      batch_loss = self.criterion(logits, y)
      loss = batch_loss.mean()


      if self.run:
          self.run['train/batch_loss'].log(batch_loss)

      return {'batch_loss': batch_loss, 'loss': loss}

  def training_epoch_end(self, outputs):
      

      train_epoch_loss = self.avg_calculation(outputs, 'batch_loss')
      self.log('train_epoch_loss', train_epoch_loss)
      if self.run:
          self.run['train/loss_epoch'].log(train_epoch_loss)


  """
  ===========================================
                  VALIDATION
  ===========================================
  """


  def validation_step(self, batch, batch_idx):

      x, y = batch
      logits, preds = self.forward(x)
      
      batch_loss = self.criterion(logits, y)
      # print(y, preds)
      accuracy = (preds == y).float()
      self.log('val_loss_batch', batch_loss)

      return {'loss': batch_loss, 'accuracy': accuracy}

  def validation_epoch_end(self, outputs):

      valid_epoch_loss = self.avg_calculation(outputs, 'loss')
      valid_epoch_accuracy = self.avg_calculation(outputs, 'accuracy')
      # print(valid_epoch_accuracy)
      self.log('valid_epoch_loss', valid_epoch_loss,  on_epoch=True, prog_bar=True)
      self.log('valid_epoch_accuracy', valid_epoch_accuracy,  on_epoch=True, prog_bar=True)
      if self.run:
          self.run['validation/loss'].log(valid_epoch_loss)
          self.run['validation/accuracy'].log(valid_epoch_accuracy)


  """
  ===========================================
                      TEST
  ===========================================
  """

  def test_step(self, batch, batch_idx):

      x, y = batch
      logits, preds = self.forward(x)
      accuracy = (preds == y).float()
      batch_loss = self.criterion(logits, y)
      return {'loss': batch_loss, 'accuracy': accuracy}

  def test_epoch_end(self, outputs):

      test_loss = self.avg_calculation(outputs, 'loss')
      test_accuracy = self.avg_calculation(outputs, 'accuracy')
      self.log('test_loss', test_loss)
      self.log('test_accuracy', test_accuracy)

  def configure_optimizers(self):

      optimizer =  self.hparams.optimizer(self.model.parameters(),
                              lr=self.hparams.lr)
      scheduler = torch.optim.lr_scheduler.MultiplicativeLR(optimizer, lr_lambda=lambda epoch: .99)

      return [optimizer], [scheduler]

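  def avg_calculation(self, outputs, metric):

      # Flatten before concatenating so that scalar losses and per-sample
      # accuracies both work, even when the last batch is smaller than the rest
      return torch.cat([output[metric].reshape(-1) for output in outputs]).float().mean()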
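The scheduler in `configure_optimizers` multiplies the learning rate by a constant 0.99 at every scheduler step. Assuming one step per epoch (Lightning's default interval for schedulers returned this way), the resulting decay can be sketched in plain Python; `lr_at_epoch` is a hypothetical helper, not part of the notebook.

```python
# Learning rate under a constant multiplicative factor.
initial_lr = 0.01   # hparams['lr'] in this notebook
factor = 0.99       # value returned by lr_lambda

def lr_at_epoch(epoch):
    """Learning rate in effect after `epoch` scheduler steps."""
    return initial_lr * factor ** epoch

for epoch in (0, 10, 100):
    print(epoch, lr_at_epoch(epoch))
```

Under these assumptions the learning rate falls to roughly 37% of its initial value after 100 epochs, so the decay is mild over the 10-epoch run and more noticeable over the 100-epoch run.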

## Loading the word embeddings from gensim

We used gensim to download and load the word embeddings. The list of available models is at: https://github.com/RaRe-Technologies/gensim-data#models

In [34]:
import gensim.downloader as api
from sklearn.feature_extraction.text import CountVectorizer
from collections import Counter

word2vec_model = api.load("glove-wiki-gigaword-300")
print(word2vec_model.vectors.shape)
# Print only the first few entries; printing the whole vocabulary exceeds the notebook output rate limit
print(word2vec_model.index2word[:10])


In [35]:
vocab = dict([(w, i) for i, w in enumerate(word2vec_model.index2entity)])


In [36]:
embedding = torch.nn.Embedding.from_pretrained(torch.tensor(word2vec_model.vectors))
# embedding_bag.weight.shape
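The two cells above build a word-to-row mapping and a frozen embedding layer. The sketch below reproduces the idea with a hypothetical four-word vocabulary and made-up 3-dimensional vectors, so it runs without downloading GloVe.

```python
import torch

# Hypothetical 4-word vocabulary with 3-dimensional vectors (not real GloVe values).
toy_words = ['good', 'bad', 'movie', 'plot']
toy_vectors = torch.tensor([[1.0, 0.0, 0.0],
                            [0.0, 1.0, 0.0],
                            [0.0, 0.0, 1.0],
                            [1.0, 1.0, 0.0]])

# Word -> row index, built the same way as `vocab` above.
toy_vocab = {w: i for i, w in enumerate(toy_words)}

# from_pretrained freezes the weights by default (freeze=True).
toy_embedding = torch.nn.Embedding.from_pretrained(toy_vectors)

ids = torch.tensor([toy_vocab['good'], toy_vocab['plot']])
looked_up = toy_embedding(ids)
print(looked_up.shape)
```

Each index simply selects the matching row of the weight matrix, so the vocabulary order must match the row order of the pretrained vectors.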

In [37]:
class EmbedingsDataSet(Dataset):

  def __init__(self, x, y, embedder):
    super(EmbedingsDataSet, self).__init__()

    self.embedder = embedder
    tensor = torch.tensor((self.embedder.transform(x).toarray()))
    self.data = tensor.to_sparse()
    self.target = torch.tensor((y == 'positive').to_numpy()*1)

  def __len__(self):

    return len(self.data)

  def __getitem__(self, index):

    x_token = self.data[index]
    y =  self.target[index]

    return x_token, y.item()

In [38]:
class WordEmbeddingsModelPreTrained(torch.nn.Module):

  def __init__(self, n_in, n_out, embedding, self_attention):
    super(WordEmbeddingsModelPreTrained, self).__init__()

    self.n_in = n_in
    self.n_out = n_out
    self.embedding = embedding
    self.self_attention = self_attention

    self.embedder = embedding
    self.embedder.weight.requires_grad = False
    self.encoder = self.self_attention(encoder=self.embedder)
    self.relu = torch.nn.ReLU()
    self.dense = torch.nn.Linear(self.embedder.weight.shape[1], self.n_out, bias=False)



  def forward(self, x):

    x = self.encoder(x)
    x = self.relu(x)
    return self.dense(x)
  

In [39]:
BoW = CountVectorizer(vocabulary=vocab)

dataset_train = EmbedingsDataSet(X_train, Y_train, BoW)
dataset_valid = EmbedingsDataSet(X_valid, Y_valid, BoW)
dataset_test = EmbedingsDataSet(X_test, Y_test, BoW)
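A minimal sketch of what the vectorizer produces, assuming a hypothetical four-word vocabulary and scikit-learn's default tokenization (lowercasing, tokens of two or more word characters):

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical vocabulary; in the notebook it comes from the GloVe word list.
toy_vocab = {'good': 0, 'bad': 1, 'movie': 2, 'plot': 3}
toy_bow = CountVectorizer(vocabulary=toy_vocab)

# With a fixed vocabulary no fitting is needed; unknown words are ignored.
counts = toy_bow.transform(['Good movie, good plot!', 'Terrible movie']).toarray()
print(counts)

# Column indices of the words present in each text (counts and order are lost).
present = [row.nonzero()[0].tolist() for row in counts]
print(present)
```

The attention classes in this notebook recover word indices from the non-zero columns in the same way, so a repeated word contributes a single index and word order is not preserved.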

## Self Attention Loop (Slower Implementation)

In [40]:
class SelfAttentionSlow(torch.nn.Module):

    def __init__(self, encoder, reduction='mean'):
        super(SelfAttentionSlow, self).__init__()
        self.reduction=reduction
        self.encoder = encoder

    def __call__(self, x):

        if x.dim()<=1:
            resp = self.get_self_attention(x)
            return resp
        else:
            vectors = []
            for i in range(x.shape[0]):
                vectors.append(self.get_self_attention(x[i]))
            
            resp= torch.cat(vectors)

            return resp
    
    def get_self_attention(self, x):
        vectors = []
        x = self.get_index(x)
        x = self.encoder(x)
        for i in range(x.shape[0]):
            vector = []
            for j in range(x.shape[0]):
                vector.append((x[i].dot(x[j])).item())
            vectors.append(vector)

        vectors = torch.tensor(vectors)
        vectors = torch.nn.Softmax(dim=1)(vectors)

        vectors_embedded = []
    
        for i in range(vectors.shape[0]):
            if x.is_cuda:
                vectors_embedded.append(vectors[i].cuda().unsqueeze(1).T.mm(x))
            else:
                vectors_embedded.append(vectors[i].unsqueeze(1).T.mm(x))
        vectors_embedded = torch.cat(vectors_embedded)

        if self.reduction=='mean':
            return vectors_embedded.mean(dim=0, keepdim=True)
        else:
            return vectors_embedded

    def get_index(self, x):

        if x.is_sparse:
            x = x.to_dense()

        words= x.nonzero().squeeze(1)
        return words

In [41]:
hparams = {'batch_size': 40, 'epochs': 10, 'lr':0.01, 'optimizer': torch.optim.Adam, 'dropout': 0.05}
run=None
new_run = False
if new_run: 
  run = neptune.init(project=project, api_token=api_token)
else: print(run)

dataloader_train, dataloader_valid, dataloader_test = get_dataloaders(dataset_train, dataset_valid, dataset_test, batch_size=hparams['batch_size'])

pl_model = LitModel(model=WordEmbeddingsModelPreTrained(dataset_train.data.shape[1], 2, embedding, self_attention=SelfAttentionSlow), hparams=hparams, run=run)


callback_loss = pl.callbacks.ModelCheckpoint(dirpath='./checkpoints/', monitor='valid_epoch_loss', mode='min')
callback_acc = pl.callbacks.ModelCheckpoint(dirpath='./checkpoints/', monitor='valid_epoch_accuracy', mode='max')

trainer = pl.Trainer(max_epochs=hparams['epochs'], callbacks=[callback_loss, callback_acc], gpus=0)

INFO:pytorch_lightning.utilities.rank_zero:GPU available: True, used: False
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:IPU available: False, using: 0 IPUs
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs


None
"batch_size": 40
"dropout":    0.05
"epochs":     10
"lr":         0.01
"optimizer":  <class 'torch.optim.adam.Adam'>



### Training Loop

In [42]:
trainer.fit(pl_model, dataloader_train, dataloader_valid)

  rank_zero_warn(f"Checkpoint directory {dirpath} exists and is not empty.")
INFO:pytorch_lightning.callbacks.model_summary:
  | Name      | Type                          | Params
------------------------------------------------------------
0 | model     | WordEmbeddingsModelPreTrained | 120 M 
1 | criterion | CrossEntropyLoss              | 0     
------------------------------------------------------------
600       Trainable params
120 M     Non-trainable params
120 M     Total params
480.002   Total estimated model params size (MB)



### Testing

#### Checkpoint Loss

In [43]:
pl_model_loss = LitModel.load_from_checkpoint(callback_loss.best_model_path, model=pl_model.model, hparams=hparams, run=None)


"batch_size": 40
"dropout":    0.05
"epochs":     10
"lr":         0.01
"optimizer":  <class 'torch.optim.adam.Adam'>


##### Train Dataset

In [44]:
trainer.test(pl_model_loss, dataloader_train)


────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       Test metric             DataLoader 0
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
      test_accuracy         0.8013888597488403
        test_loss           0.4945432245731354
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────


[{'test_loss': 0.4945432245731354, 'test_accuracy': 0.8013888597488403}]

##### Validation Dataset

In [45]:
trainer.test(pl_model_loss, dataloader_valid)

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       Test metric             DataLoader 0
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
      test_accuracy                0.75
        test_loss           0.5099033117294312
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────


[{'test_loss': 0.5099033117294312, 'test_accuracy': 0.75}]

##### Test Dataset

In [46]:
trainer.test(pl_model_loss, dataloader_test)


────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       Test metric             DataLoader 0
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
      test_accuracy         0.7950000166893005
        test_loss           0.5130002498626709
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────


[{'test_loss': 0.5130002498626709, 'test_accuracy': 0.7950000166893005}]

#### Checkpoint Accuracy

In [47]:
pl_model_acc = LitModel.load_from_checkpoint(callback_acc.best_model_path, model=pl_model.model, hparams=hparams, run=None)


"batch_size": 40
"dropout":    0.05
"epochs":     10
"lr":         0.01
"optimizer":  <class 'torch.optim.adam.Adam'>


##### Train Dataset

In [48]:
trainer.test(pl_model_acc, dataloader_train)

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       Test metric             DataLoader 0
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
      test_accuracy         0.7791666388511658
        test_loss           0.5488478541374207
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────


[{'test_loss': 0.5488478541374207, 'test_accuracy': 0.7791666388511658}]

##### Validation Dataset


In [49]:
trainer.test(pl_model_acc, dataloader_valid)

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       Test metric             DataLoader 0
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
      test_accuracy         0.7875000238418579
        test_loss           0.5548407435417175
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────


[{'test_loss': 0.5548407435417175, 'test_accuracy': 0.7875000238418579}]

##### Test Dataset

In [50]:
trainer.test(pl_model_acc, dataloader_test)

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       Test metric             DataLoader 0
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
      test_accuracy         0.7900000214576721
        test_loss           0.5555382966995239
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────


[{'test_loss': 0.5555382966995239, 'test_accuracy': 0.7900000214576721}]

## Self Attention Matrix


In [51]:
class SelfAttention(torch.nn.Module):

    def __init__(self, encoder, reduction='mean'):
        super(SelfAttention, self).__init__()
        self.reduction=reduction
        self.encoder = encoder
        self.softmax = torch.nn.Softmax(dim=1)

    def __call__(self, x):
        # print(x.dim())
        if x.dim()<=1:
            resp = self.get_self_attention(x)
            return resp
        else:
            vectors = []
            for i in range(x.shape[0]):
                vectors.append(self.get_self_attention(x[i]))
            
            # return vectors
            resp= torch.stack(vectors)
            # print(resp, resp.shape)
            return resp

    
    def get_self_attention(self, x):
        words = self.get_index(x)
        # `words` is already an index tensor; re-wrapping it in torch.tensor triggers a UserWarning
        embedded_matrix = self.encoder(words)
        embedded_matrix_weights = embedded_matrix.mm(embedded_matrix.T)
        soft_embedded = self.softmax(embedded_matrix_weights)
        embedded_matrix = soft_embedded.mm(embedded_matrix)
        if self.reduction=='mean':
            return embedded_matrix.mean(dim=0)
        else:
            return embedded_matrix

    def get_index(self, x):

        if x.is_sparse:
            x = x.to_dense()

        words= x.nonzero().squeeze(1)
        return words
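Both attention classes compute the same quantity: for a matrix X whose rows are the embeddings of the words in one review, they return softmax(X Xᵀ) X, averaged over rows when `reduction='mean'`. There are no learned query, key, or value projections and no scaling. The sketch below checks on random data (a hypothetical 5 words with 8-dimensional embeddings) that the pairwise loop and the matrix product agree.

```python
import torch

torch.manual_seed(0)
X = torch.randn(5, 8)  # hypothetical: 5 words, 8-dimensional embeddings

# Loop version: one dot product per pair of words.
scores_loop = torch.zeros(5, 5)
for i in range(5):
    for j in range(5):
        scores_loop[i, j] = X[i].dot(X[j])
out_loop = torch.softmax(scores_loop, dim=1).mm(X).mean(dim=0)

# Matrix version: all pairwise dot products in a single matrix product.
out_matrix = torch.softmax(X.mm(X.T), dim=1).mm(X).mean(dim=0)

print(torch.allclose(out_loop, out_matrix, atol=1e-4))
```

The matrix form replaces n² Python-level dot products with one matrix multiplication, which is where the speed difference between the two classes comes from.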

In [53]:
hparams = {'batch_size': 40, 'epochs': 100, 'lr':0.01, 'optimizer': torch.optim.Adam, 'dropout': 0.05}
run=None
new_run = False
if new_run: 
  run = neptune.init(project=project, api_token=api_token)
else: print(run)

dataloader_train, dataloader_valid, dataloader_test = get_dataloaders(dataset_train, dataset_valid, dataset_test, batch_size=hparams['batch_size'])

pl_model = LitModel(model=WordEmbeddingsModelPreTrained(dataset_train.data.shape[1], 2, embedding, self_attention=SelfAttention), hparams=hparams, run=run)


callback_loss = pl.callbacks.ModelCheckpoint(dirpath='./checkpoints/', monitor='valid_epoch_loss', mode='min')
callback_acc = pl.callbacks.ModelCheckpoint(dirpath='./checkpoints/', monitor='valid_epoch_accuracy', mode='max')

trainer = pl.Trainer(max_epochs=hparams['epochs'], callbacks=[callback_loss, callback_acc], gpus=1)

INFO:pytorch_lightning.utilities.rank_zero:GPU available: True, used: True
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:IPU available: False, using: 0 IPUs
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs


None
"batch_size": 40
"dropout":    0.05
"epochs":     100
"lr":         0.01
"optimizer":  <class 'torch.optim.adam.Adam'>


### Training Loop

In [54]:
trainer.fit(pl_model, dataloader_train, dataloader_valid)

  rank_zero_warn(f"Checkpoint directory {dirpath} exists and is not empty.")
INFO:pytorch_lightning.accelerators.gpu:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
INFO:pytorch_lightning.callbacks.model_summary:
  | Name      | Type                          | Params
------------------------------------------------------------
0 | model     | WordEmbeddingsModelPreTrained | 120 M 
1 | criterion | CrossEntropyLoss              | 0     
------------------------------------------------------------
600       Trainable params
120 M     Non-trainable params
120 M     Total params
480.002   Total estimated model params size (MB)



### Testing

#### Checkpoint Loss

In [55]:
pl_model_loss = LitModel.load_from_checkpoint(callback_loss.best_model_path, model=pl_model.model, hparams=hparams, run=None)


"batch_size": 40
"dropout":    0.05
"epochs":     100
"lr":         0.01
"optimizer":  <class 'torch.optim.adam.Adam'>


##### Train Dataset

In [56]:
trainer.test(pl_model_loss, dataloader_train)

INFO:pytorch_lightning.accelerators.gpu:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       Test metric             DataLoader 0
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
      test_accuracy         0.8805555701255798
        test_loss           0.3472112715244293
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────


[{'test_loss': 0.3472112715244293, 'test_accuracy': 0.8805555701255798}]

##### Validation Dataset

In [57]:
trainer.test(pl_model_loss, dataloader_valid)

INFO:pytorch_lightning.accelerators.gpu:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]



────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       Test metric             DataLoader 0
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
      test_accuracy         0.8500000238418579
        test_loss            0.375499963760376
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────


[{'test_loss': 0.375499963760376, 'test_accuracy': 0.8500000238418579}]

##### Test Dataset

In [58]:
trainer.test(pl_model_loss, dataloader_test)

INFO:pytorch_lightning.accelerators.gpu:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]



────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       Test metric             DataLoader 0
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
      test_accuracy         0.8449999690055847
        test_loss           0.4412713944911957
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────


[{'test_loss': 0.4412713944911957, 'test_accuracy': 0.8449999690055847}]


#### Checkpoint Accuracy

In [59]:
pl_model_acc = LitModel.load_from_checkpoint(callback_acc.best_model_path, model=pl_model.model, hparams=hparams, run=None)


"batch_size": 40
"dropout":    0.05
"epochs":     100
"lr":         0.01
"optimizer":  <class 'torch.optim.adam.Adam'>


##### Train Dataset

In [60]:
trainer.test(pl_model_acc, dataloader_train)

INFO:pytorch_lightning.accelerators.gpu:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]



────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       Test metric             DataLoader 0
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
      test_accuracy         0.8666666746139526
        test_loss           0.36418619751930237
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────


[{'test_loss': 0.36418619751930237, 'test_accuracy': 0.8666666746139526}]

##### Validation Dataset

In [61]:
trainer.test(pl_model_acc, dataloader_valid)

INFO:pytorch_lightning.accelerators.gpu:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]



────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       Test metric             DataLoader 0
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
      test_accuracy          0.862500011920929
        test_loss            0.381300151348114
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────


[{'test_loss': 0.381300151348114, 'test_accuracy': 0.862500011920929}]

##### Test Dataset

In [62]:
trainer.test(pl_model_acc, dataloader_test)

INFO:pytorch_lightning.accelerators.gpu:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]



────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       Test metric             DataLoader 0
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
      test_accuracy         0.8449999690055847
        test_loss           0.44375181198120117
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────


[{'test_loss': 0.44375181198120117, 'test_accuracy': 0.8449999690055847}]

## Self Attention Speed Test

In [63]:
self_attention_loop = SelfAttentionSlow(embedding)
self_attention_matrix = SelfAttention(embedding)

In [64]:
%%timeit
self_attention_loop(dataset_train.data)

2min 13s ± 1.24 s per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [65]:
%%timeit
self_attention_matrix(dataset_train.data)

5.39 s ± 31.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
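The speedup implied by the two `%%timeit` means can be worked out directly. This is plain arithmetic on the figures reported above and ignores their standard deviations.

```python
# Mean wall-clock times reported by %%timeit above.
loop_seconds = 2 * 60 + 13   # 2min 13s for SelfAttentionSlow
matrix_seconds = 5.39        # 5.39 s for SelfAttention

speedup = loop_seconds / matrix_seconds
print(f'{speedup:.1f}x')
```

On this run the matrix implementation is roughly 25 times faster than the loop implementation over the training set.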
