
# **Introduction**: Fine-Tuning a RoBERTa Transformer for Sentiment Classification of IMDB Movie Reviews


---


**Data:**
*   We use the IMDB dataset, which we found with reference to this research paper.
*   The dataset is a collection of movie reviews obtained from the IMDB website. Each review is labeled with a positive or negative sentiment.
*   There are approximately 50,000 rows of data. Each row has the following fields:
    - review : Review of a movie
    - sentiment : positive or negative




**Language Model Used:**


*   We used RoBERTa as the base transformer model. Research Paper
RoBERTa keeps the BERT architecture and improves the pretraining recipe: it trains longer on more data with larger batches, uses dynamic masking, and drops the next-sentence-prediction objective.




Installing the packages required to fine-tune the model on our dataset

In [None]:
# Installing NLP-Transformers library
!pip install -q transformers

# Installing wandb library for experiment tracking and hyper parameter optimization
!pip install -q wandb

# Code for TPU packages install
!curl -q https://raw.githubusercontent.com/pytorch/xla/master/contrib/scripts/env-setup.py -o pytorch-xla-env-setup.py
!python pytorch-xla-env-setup.py --apt-packages libomp5 libopenblas-dev

Updating... This may take around 2 minutes.
Updating TPU runtime to pytorch-dev20

In [None]:
# Mount google drive to import the dataset and upload our fine-tuned model

from google.colab import drive
drive.mount('/content/drive')
# path = '/content/drive/My Drive/MMD-Project-IMDB-Sentiment-Analysis/'
path = '/content/drive/MyDrive/MMD-Project/'

Mounted at /content/drive


In [None]:
# Importing stock libraries
import numpy as np
import pandas as pd
import torch
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader, RandomSampler, SequentialSampler
from sklearn import metrics
import textwrap

# Importing packages from the Hugging Face transformers library
from transformers import RobertaConfig, RobertaModel, RobertaTokenizerFast

from datetime import datetime
# Importing wandb for logging and hyperparameter tuning
import wandb
import pickle

In [None]:
# Setting up the accelerators

# # GPU
# from torch import cuda
# device = 'cuda' if cuda.is_available() else 'cpu'

# TPU
import torch_xla
import torch_xla.core.xla_model as xm
device = xm.xla_device()

In [None]:
# login to wandb - a very nice app to view performance metrics of the training and validation model
!wandb login

[34m[1mwandb[0m: You can find your API key in your browser here: https://wandb.ai/authorize
[34m[1mwandb[0m: Paste an API key from your profile and hit enter, or press ctrl+c to quit: 
[34m[1mwandb[0m: Appending key for api.wandb.ai to your netrc file: /root/.netrc


# Preprocess class
* This class defines how the data is pre-processed before the tokenization, dataset, and dataloader steps of the workflow. It takes the loaded dataframe and uses the sentiment column to create a new column called encoded_polarity such that:
    - sentiment = negative gives encoded_polarity = 0
    - sentiment = positive (or any other value) gives encoded_polarity = 1

* The sentiment column is then removed from the dataframe.
* The encoded_polarity dictionary and the updated dataframe are returned.
* The processing() method is called in the run() function.

In [61]:
# The processing method will return both the dictionary, and the updated dataframe for further usage.

class Preprocess:
    def __init__(self, df):
        """
        Constructor for the class
        :param df: Input Dataframe to be pre-processed
        """
        self.df = df
        self.encoded_dict = dict()

    def encoding(self, x):
        if x =='negative':
          self.encoded_dict[x] = 0
        else:
          self.encoded_dict[x] = 1
        return self.encoded_dict[x]

    def processing(self):
        self.df['encoded_polarity'] = self.df['sentiment'].apply(lambda x: self.encoding(x))
        self.df.drop(['sentiment'], axis=1, inplace=True)
        return self.encoded_dict, self.df
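
As a rough illustration of the encoding step, the same label convention can be expressed with a pandas `map`. This is only a sketch on a made-up three-row dataframe (`toy_df` and `label_map` are hypothetical names, not part of the notebook). Unlike `Preprocess.encoding`, which sends every non-negative label to 1, `map` would produce NaN for an unexpected label.

```python
import pandas as pd

# Toy stand-in for the IMDB dataframe (hypothetical rows, not real data).
toy_df = pd.DataFrame({
    "review": ["Loved it", "Dull and slow", "A great cast"],
    "sentiment": ["positive", "negative", "positive"],
})

# Same convention as Preprocess.encoding: negative -> 0, positive -> 1.
label_map = {"negative": 0, "positive": 1}
toy_df["encoded_polarity"] = toy_df["sentiment"].map(label_map)

# Drop the text label once the numeric label exists.
toy_df = toy_df.drop(columns=["sentiment"])

print(toy_df)
```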

# IMDBDataset - a custom dataset

- This class accepts the dataframe as input and generates the tokenized output that the RoBERTa model uses for training.
- We use the RoBERTa tokenizer to tokenize the text in the review column of the dataframe.
- The tokenizer's encode_plus method performs the tokenization and generates the necessary outputs, namely ids and attention_mask.
- encoded_polarity is converted into the targets tensor.
- The *IMDBDataset* class is used to create two datasets, one for training and one for validation.
- The training dataset is used to fine-tune the model: 70% of the original data.
- The validation dataset is used to evaluate the performance of the model: the remaining 30%.


In [40]:
# Creating a IMDBDataset class that is used to read the updated dataframe and tokenize the text. 
# The class is used in the return_dataloader function

class IMDBDataset(Dataset):
    def __init__(self, dataframe, tokenizer, max_len):
        self.len = len(dataframe)
        self.data = dataframe
        self.tokenizer = tokenizer
        self.max_len = max_len
        
    def __getitem__(self, index):
        text = str(self.data.review[index])
        text = " ".join(text.split())
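        inputs = self.tokenizer.encode_plus(
            text,
            None,
            add_special_tokens=True,
            max_length=self.max_len,
            padding='max_length',  # replaces the deprecated pad_to_max_length=True
            truncation=True,       # explicitly truncate reviews longer than max_len
            return_token_type_ids=True
        )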
        ids = inputs['input_ids']
        mask = inputs['attention_mask']

        return {
            'ids': torch.tensor(ids, dtype=torch.long),
            'mask': torch.tensor(mask, dtype=torch.long),
            'targets': torch.tensor(self.data.encoded_polarity[index], dtype=torch.float)
        } 
    
    def __len__(self):
        return self.len
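
To make the `ids` and `mask` outputs more concrete, here is a simplified, tokenizer-free sketch of fixed-length padding. The helper `pad_and_mask` and the two middle token ids are made up for illustration. The ids 0, 1, and 2 correspond to `<s>`, `<pad>`, and `</s>` in the roberta-base vocabulary. The real tokenizer also keeps the closing `</s>` when it truncates, which this naive version does not.

```python
PAD_ID = 1  # id of <pad> in the roberta-base vocabulary

def pad_and_mask(token_ids, max_len, pad_id=PAD_ID):
    """Pad (or naively truncate) token ids to max_len and build the attention mask."""
    ids = list(token_ids[:max_len])      # naive truncation, for illustration only
    mask = [1] * len(ids)                # 1 marks a real token
    padding = max_len - len(ids)
    return ids + [pad_id] * padding, mask + [0] * padding  # 0 marks padding

# <s>, two made-up token ids, </s>
ids, mask = pad_and_mask([0, 713, 1569, 2], max_len=6)
print(ids)
print(mask)
```

The model attends only to positions where the mask is 1, so padding does not influence the prediction.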

#### return_dataloader: Called inside the run()
- The return_dataloader function creates the training and validation dataloaders, which feed data to the neural network in batches. This is needed because the whole dataset cannot be loaded into memory at once, so the amount of data loaded and passed to the network at each step needs to be controlled.
- Internally, return_dataloader uses the PyTorch DataLoader class and the custom IMDBDataset class to create the dataloaders for training and validation.
- This control is achieved through parameters such as batch_size and max_len.
- The training and validation dataloaders are used in the training and validation parts of the flow respectively.
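
As a quick worked example of what batch_size controls, the number of batches per epoch is the dataset size divided by the batch size, rounded up (a DataLoader keeps the last, smaller batch by default). The sketch below uses the split sizes printed later in this notebook and the batch sizes set in run(). The helper name `steps_per_epoch` is hypothetical.

```python
import math

def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Number of batches a DataLoader yields when the last partial batch is kept."""
    return math.ceil(num_examples / batch_size)

train_steps = steps_per_epoch(34707, 12)  # training rows, TRAIN_BATCH_SIZE
val_steps = steps_per_epoch(14875, 4)     # validation rows, VALID_BATCH_SIZE
print(train_steps, val_steps)
```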

In [None]:
# Creating a function that returns the dataloader based on the dataframe and the specified train and validation batch size. 

def return_dataloader(df, tokenizer, train_batch_size, validation_batch_size, MAX_LEN, train_size=0.7):
    train_dataset=df.sample(frac=train_size,random_state=200)
    
    val_dataset=df.drop(train_dataset.index).reset_index(drop=True)
    train_dataset = train_dataset.reset_index(drop=True)

    print("FULL Dataset: {}".format(df.shape))
    print("TRAIN Dataset: {}".format(train_dataset.shape))
    print("VALIDATE Dataset: {}".format(val_dataset.shape))

    training_set = IMDBDataset(train_dataset, tokenizer, MAX_LEN)
    validation_set = IMDBDataset(val_dataset, tokenizer, MAX_LEN)

    train_params = {'batch_size': train_batch_size,
                'shuffle': True,
                'num_workers': 1
                }

    val_params = {'batch_size': validation_batch_size,
                    'shuffle': True,
                    'num_workers': 1
                    }

    training_loader = DataLoader(training_set, **train_params)
    validation_loader = DataLoader(validation_set, **val_params)
    

    return training_loader, validation_loader
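
The split logic inside return_dataloader can be checked on a small dataframe. The following sketch, on ten made-up rows, shows that `sample` followed by `drop` yields two disjoint subsets in a 70/30 ratio. `toy_df`, `train_df`, and `val_df` are illustrative names only.

```python
import pandas as pd

# Ten made-up rows standing in for the pre-processed IMDB dataframe.
toy_df = pd.DataFrame({
    "review": [f"review {i}" for i in range(10)],
    "encoded_polarity": [i % 2 for i in range(10)],
})

# Same pattern as return_dataloader: sample 70% for training, keep the rest for validation.
train_df = toy_df.sample(frac=0.7, random_state=200)
val_df = toy_df.drop(train_df.index)

# Check for leakage before the indices are reset.
overlap = set(train_df.index) & set(val_df.index)

train_df = train_df.reset_index(drop=True)
val_df = val_df.reset_index(drop=True)
print(len(train_df), len(val_df), overlap)
```

Resetting the index matters because IMDBDataset looks rows up by position (`self.data.review[index]`).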

# RobertaModelClass

In [None]:
# Creating the customized model by adding a dense layer with ReLU, a dropout layer, and a final dense classification layer on top of RoBERTa.
class RobertaModelClass(torch.nn.Module):
    def __init__(self):
        super(RobertaModelClass, self).__init__()
        self.model_layer = RobertaModel.from_pretrained("roberta-base")
        self.pre_classifier = torch.nn.Linear(768, 768)
        self.dropout = torch.nn.Dropout(0.3)
        self.classifier = torch.nn.Linear(768, 2)

    def forward(self, input_ids, attention_mask):
        output_1 = self.model_layer(input_ids=input_ids, attention_mask=attention_mask)
        hidden_state = output_1[0] 
        pooler = hidden_state[:, 0] 
        pooler = self.pre_classifier(pooler)
        pooler = torch.nn.ReLU()(pooler) 
        pooler = self.dropout(pooler)
        output = self.classifier(pooler)
        return output
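
To follow the tensor shapes through the classification head, the sketch below mirrors the forward pass in NumPy with random stand-in values. It is not the trained model: the weights are random, the arrays only imitate the shapes of the RoBERTa output, and dropout is left out because it acts as an identity at evaluation time.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, seq_len, hidden = 4, 16, 768

# Stand-in for output_1[0], the last hidden state returned by RobertaModel.
hidden_state = rng.normal(size=(batch, seq_len, hidden))

# Random weights standing in for pre_classifier (768 -> 768) and classifier (768 -> 2).
w_pre, b_pre = rng.normal(scale=0.02, size=(hidden, hidden)), np.zeros(hidden)
w_cls, b_cls = rng.normal(scale=0.02, size=(hidden, 2)), np.zeros(2)

cls_vector = hidden_state[:, 0]                 # (batch, 768): vector of the first token <s>
x = np.maximum(cls_vector @ w_pre + b_pre, 0)   # dense layer followed by ReLU
logits = x @ w_cls + b_cls                      # (batch, 2): one score per class

print(cls_vector.shape, logits.shape)
```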


In [None]:
# Function to return model based on the definition of RobertaModel Class
def return_model(device):
    model = RobertaModelClass()
    model = model.to(device) 
    return model

In [None]:
# Function to count the correct predictions in a batch (used to calculate the accuracy of the model)
def calcuate_accuracy(big_idx, targets):
    n_correct = (big_idx==targets).sum().item()
    return n_correct
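
The predicted class is the index of the largest logit, and accuracy is the share of predictions that match the targets. A minimal pure-Python sketch on made-up logits (the helper `count_correct` is hypothetical and only mirrors the `torch.max` plus comparison used in this notebook):

```python
def count_correct(logits, targets):
    """Count rows whose largest logit sits at the target index."""
    predictions = [max(range(len(row)), key=row.__getitem__) for row in logits]
    return sum(int(p == t) for p, t in zip(predictions, targets))

# Made-up logits for four reviews; columns are (negative, positive).
logits = [[2.1, -0.3], [0.2, 1.7], [0.9, 0.4], [-1.0, 0.5]]
targets = [0, 1, 1, 1]

n_correct = count_correct(logits, targets)
accuracy = 100 * n_correct / len(targets)
print(n_correct, accuracy)
```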

# Model Training

In [None]:
# Function to fine tune the model based on the epochs, model, tokenizer and other arguments

def train(epoch, model, device, training_loader, optimizer, loss_function):
    n_correct = 0
    nb_tr_examples, nb_tr_steps = 0, 0
    tr_loss = 0
    

    print("Before Model Training: ", datetime.now())
    model.train() #puts it into training mode
    print("After Model Training: ", datetime.now())

    for _,data in enumerate(training_loader, 0):
        ids = data['ids'].to(device, dtype = torch.long)
        mask = data['mask'].to(device, dtype = torch.long)
        targets = data['targets'].to(device, dtype = torch.long)

        
        outputs = model(ids, mask).squeeze() 

        optimizer.zero_grad() 
        loss = loss_function(outputs, targets)
        tr_loss += loss.item() 
        big_val, big_idx = torch.max(outputs.data, dim=1) 
        n_correct += calcuate_accuracy(big_idx, targets)

        nb_tr_steps += 1
        nb_tr_examples+=targets.size(0) 
        
        if _%100==0: 
            loss_step = tr_loss/nb_tr_steps 
            accu_step = (n_correct*100)/nb_tr_examples 
            wandb.log({"Training Loss per 100 steps": loss_step})
            wandb.log({"Training Accuracy per 100 steps": accu_step})

        optimizer.zero_grad()
        loss.backward()
        
        # For TPU
        xm.optimizer_step(optimizer)
        xm.mark_step()

    # print("\n\nTraining data accuracy metrics: ",metrics.classification_report(targets, outputs))

    print(f'The Total Accuracy for Epoch {epoch}: {(n_correct*100)/nb_tr_examples}')
    epoch_loss = tr_loss/nb_tr_steps
    epoch_accu = (n_correct*100)/nb_tr_examples
    wandb.log({"Training Loss Epoch": epoch_loss})
    wandb.log({"Training Accuracy Epoch": epoch_accu})
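
For reference, `torch.nn.CrossEntropyLoss` with its default mean reduction averages the negative log-softmax probability of the target class over the batch. The NumPy sketch below reproduces that calculation on made-up logits. The function name `cross_entropy` is illustrative and is not the PyTorch implementation.

```python
import numpy as np

def cross_entropy(logits, targets):
    """Mean negative log-softmax probability of the target class."""
    logits = np.asarray(logits, dtype=float)
    targets = np.asarray(targets)
    shifted = logits - logits.max(axis=1, keepdims=True)  # subtract the max for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(targets)), targets].mean())

# Equal logits: the model is undecided, so the loss is ln(2).
uniform_loss = cross_entropy([[0.0, 0.0]], [0])
# A large margin in favour of the correct class gives a loss close to zero.
confident_loss = cross_entropy([[5.0, -5.0]], [0])
print(uniform_loss, confident_loss)
```

A loss near ln(2) therefore means the classifier is doing no better than guessing between the two classes.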

# Model Validation

- During the validation stage, we pass the unseen data, the trained model, and the device details to the function to perform the validation run. This step predicts an encoded_polarity value for each review in a dataset that the model has not seen during training.
- This is then compared to the actual encoded_polarity to give us the validation accuracy and loss.
- This function is called in the run()
- The unseen data is the 30% of the IMDB dataset that was separated during the dataset creation stage. During the validation stage the weights of the model are not updated. The predicted class is the index of the largest output logit.
- The validation accuracy and loss are logged to wandb at every 100th step and per epoch.

In [None]:
# Function to run the validation dataloader to validate the performance of the fine tuned model. 

def valid(model, device, validation_loader, loss_function):
    n_correct = 0; total = 0
    nb_tr_examples, nb_tr_steps = 0, 0
    tr_loss = 0
    model.eval()
    # create lists to hold outputs and targets 
    output_list = []
    target_list = []
  
    with torch.no_grad(): 
        for _,data in enumerate(validation_loader, 0):
            ids = data['ids'].to(device, dtype = torch.long)
            mask = data['mask'].to(device, dtype = torch.long)
            targets = data['targets'].to(device, dtype = torch.long)
            
            outputs = model(ids, mask).squeeze()
            # convert targets & outputs to numpy arrays 
            #targets.cpu().detach().numpy()
            ops = outputs.cpu().detach().numpy()

            target_list.extend(targets.tolist())
            
            loss = loss_function(outputs, targets)
            tr_loss += loss.item()
            big_val, big_idx = torch.max(outputs.data, dim=1)
            n_correct += calcuate_accuracy(big_idx, targets)

            
            big_op = big_idx.cpu().detach().numpy()
            #print(big_op)
            #print(target_list)
            #break
            output_list.extend(big_op.tolist())

            # print(target_list)
            # print(output_list)
            # break


            nb_tr_steps += 1
            nb_tr_examples+=targets.size(0)
            
            if _%100==0:
                loss_step = tr_loss/nb_tr_steps
                accu_step = (n_correct*100)/nb_tr_examples 
                wandb.log({"Validation Loss per 100 steps": loss_step})
                wandb.log({"Validation Accuracy per 100 steps": accu_step})
    

    # print("Test data accuracy metrics: ",metrics.classification_report(targets, outputs))

    epoch_loss = tr_loss/nb_tr_steps
    epoch_accu = (n_correct*100)/nb_tr_examples
    wandb.log({"Validation Loss Epoch": epoch_loss})
    wandb.log({"Validation Accuracy Epoch": epoch_accu})
    print(f'The Validation Accuracy: {(n_correct*100)/nb_tr_examples}')
    return target_list, output_list 


# Driver Function

Flow of the run() function:
- Importing and preprocessing the domain data
- Creation of Dataset and Dataloader: Train and validation parameters are defined and passed to the PyTorch DataLoader construct to create the train and validation dataloaders.
- Neural Network and Optimizer: We define the model and optimizer that will be used for training and to update the weights of the network.
- Training Model and Logging to WandB: Where the actual `train()` function is called and the training takes place.
- Validation: Using the fine-tuned model to get the sentiment and find the accuracy of the validation. 


In [70]:
def run(path, train_again = 1, pickled_file = None):
    
    # WandB – Initialize a new run
    wandb.init(project="MMD-Project-IMDB-Sentiment-Analysis")
    
    # Defining some key variables that will be used later on in the training
    config = wandb.config 
    config.MAX_LEN = 512
    config.TRAIN_BATCH_SIZE = 12
    config.VALID_BATCH_SIZE = 4
    config.EPOCHS = 3
    config.LEARNING_RATE = 1e-05
    
   
    tokenizer = RobertaTokenizerFast.from_pretrained('roberta-base')

    # Reading the dataset and pre-processing it for usage
    df = pd.read_csv(f'{path}output.csv', encoding='latin-1')
    # df = pd.read_csv('/content/drive/My Drive/MMD-Project-IMDB-Sentiment-Analysis/output.csv', encoding='latin-1')
    # df = df.head(10)

    pre = Preprocess(df)
    encoding_dict, df = pre.processing()

    # Creating the training and validation dataloader using the functions defined above
    training_loader, validation_loader = return_dataloader(df, tokenizer, config.TRAIN_BATCH_SIZE, config.VALID_BATCH_SIZE, config.MAX_LEN)

    # Defining the model based on the function and ModelClass defined above
    model = return_model(device)

    # Creating the loss function and optimizer
    loss_function = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(params =  model.parameters(), lr=config.LEARNING_RATE)

    if(train_again==1):
      print('Training....')
      # Fine tuning the model using the train function:
      for epoch in range(config.EPOCHS):
          train(epoch, model, device, training_loader, optimizer, loss_function)
      
      print("\nAfter Model Training: ", datetime.now())

      # Save the fine-tuned model; the context manager closes the file afterwards
      with open(pickled_file, 'wb') as f:
          pickle.dump(model, f)

    else:
      print('Loading from pickled...')
      with open(pickled_file, 'rb') as f:
          model = pickle.load(f)

    # Running the validation function to validate the performance of the trained model
    print('Validating...')
    target_list, output_list = valid(model, device, validation_loader, loss_function)
    return target_list, output_list
    
    #pickle.dump(model, open('/content/drive/My Drive/MMD-Project-IMDB-Sentiment-Analysis/IMDB-Sentiment-Analysis_evalmodel1.pkl', 'wb'))
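
Pickling the whole model object ties the saved file to the exact class definition and to the device the model lived on. A commonly used alternative in PyTorch is to save only the `state_dict`. The sketch below shows the pattern on a tiny linear layer standing in for RobertaModelClass. The file name is hypothetical, and on a TPU the weights would first have to be moved off the XLA device (torch_xla provides `xm.save` for that), which is not shown here.

```python
import os
import tempfile

import torch

# Tiny stand-in for RobertaModelClass so the sketch runs quickly on CPU.
model = torch.nn.Linear(4, 2)

with tempfile.TemporaryDirectory() as tmp_dir:
    checkpoint_path = os.path.join(tmp_dir, "model_state.pt")  # hypothetical file name

    # Save only the learned parameters, not the Python object.
    torch.save(model.state_dict(), checkpoint_path)

    # Rebuild the architecture, then load the saved parameters into it.
    restored = torch.nn.Linear(4, 2)
    restored.load_state_dict(torch.load(checkpoint_path, map_location="cpu"))

weights_match = torch.equal(model.weight, restored.weight)
print(weights_match)
```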




Trigger the run() function from here.
For the first run, set train_again = 1 to train the model and save it. To load the saved model and run only the validation function, set train_again = 0.

t_list and o_list are the lists of target labels and predicted labels for the validation set.

In [72]:

# pickled_file = '/content/drive/My Drive/MMD-Project-IMDB-Sentiment-Analysis/IMDB-Sentiment-Analysis_model1.pkl'
pickled_file = path+'IMDB-Sentiment-Analysis_model1.123.pkl'
train_again = 1

t_list, o_list = run(path, train_again, pickled_file)


FULL Dataset: (49582, 3)
TRAIN Dataset: (34707, 3)
VALIDATE Dataset: (14875, 3)


Some weights of the model checkpoint at roberta-base were not used when initializing RobertaModel: ['lm_head.bias', 'lm_head.layer_norm.weight', 'lm_head.layer_norm.bias', 'lm_head.decoder.weight', 'lm_head.dense.weight', 'lm_head.dense.bias']
- This IS expected if you are initializing RobertaModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


Training....
Before Model Training:  2022-05-08 19:30:21.677237
After Model Training:  2022-05-08 19:30:21.678220


Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.


The Total Accuracy for Epoch 0: 92.83717981963292
Before Model Training:  2022-05-08 20:10:22.942670
After Model Training:  2022-05-08 20:10:22.943904




The Total Accuracy for Epoch 1: 96.32062696286052
Before Model Training:  2022-05-08 20:50:08.442762
After Model Training:  2022-05-08 20:50:08.444282




The Total Accuracy for Epoch 2: 97.79871495663699

After Model Training:  2022-05-08 21:29:51.318956
Validating...




The Validation Accuracy: 95.27394957983194


In [73]:
# Outputting the result from validation just to check if the model has run correctly. Accuracy and other metrics cannot be judged from just looking at this

print(o_list)
print(t_list)

[0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 

In [74]:
from sklearn.metrics import classification_report

print(classification_report(t_list, o_list))

              precision    recall  f1-score   support

           0       0.97      0.93      0.95      7329
           1       0.94      0.97      0.95      7546

    accuracy                           0.95     14875
   macro avg       0.95      0.95      0.95     14875
weighted avg       0.95      0.95      0.95     14875



- As seen from the summary statistics (precision, recall, and F1 score), the model performs well on the validation set.

- The overall accuracy is 95%, and the F1 score is 0.95 for both the negative (0) and positive (1) classes.
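
To show where the numbers in such a report come from, the sketch below computes precision, recall, and F1 for one class directly from target and output lists. The lists are made up for illustration and are not the validation results above. `binary_report` is a hypothetical helper, not part of scikit-learn.

```python
def binary_report(targets, outputs, positive=1):
    """Precision, recall and F1 for the class labelled `positive`."""
    pairs = list(zip(targets, outputs))
    tp = sum(1 for t, o in pairs if t == positive and o == positive)  # true positives
    fp = sum(1 for t, o in pairs if t != positive and o == positive)  # false positives
    fn = sum(1 for t, o in pairs if t == positive and o != positive)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Made-up labels: 1 = positive review, 0 = negative review.
toy_targets = [1, 1, 1, 0, 0, 0, 1, 0]
toy_outputs = [1, 1, 0, 0, 0, 1, 1, 0]

precision, recall, f1 = binary_report(toy_targets, toy_outputs)
print(precision, recall, f1)
```

Passing `positive=0` gives the same three metrics for the negative class, which is how the two rows of the report are obtained.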

# Validation for explicit inputs

- In the next few cells, we show how our model works on explicit inputs that we have specified externally (outside our validation set). 

- We have defined a custom dataloader for this task to contain only the validation requirements (as our previous dataloader contains code for both the training and validation set). 

- We run through our model in the same way as above, just for an input that we have explicitly specified. 


In [75]:
# Custom dataloader for just the validation set as we have already trained our model

def custom_loader(df, tokenizer, validation_batch_size, MAX_LEN):
    df = df.drop('sentiment', axis=1)
    val_dataset=df.reset_index(drop=True)
    print("VAL Dataset: {}".format(val_dataset.shape))
    validation_set = IMDBDataset(val_dataset, tokenizer, MAX_LEN)
    val_params = {'batch_size': validation_batch_size,
                    'shuffle': False,  # keep row order so predictions line up with the input dataframe
                    'num_workers': 1
                    }
    validation_loader = DataLoader(validation_set, **val_params)
    

    return validation_loader

In [76]:
# Loading the fine-tuned model saved during training

# pickled_model = pickle.load(open('/content/drive/My Drive/MMD-Project-IMDB-Sentiment-Analysis/IMDB-Sentiment-Analysis_model1.pkl', 'rb'))
pickled_model = pickle.load(open(path+'IMDB-Sentiment-Analysis_model1.123.pkl', 'rb'))
tokenizer = RobertaTokenizerFast.from_pretrained('roberta-base')

In [77]:
# Ex. Positive and negative reviews for the 2022 Batman movie taken from the IMDB website. 

import pandas as pd
test_review = {"review": ["I'll admit I raised an eyebrow when I saw that Pattinson was cast, but I eat my words, he was awesome, and hopefully will play the part a few more times. This film blew me away, exciting, fast paced, surprisingly gritty, and genuinely had an awesome story.",
                          "Not sure why this is rated so highly. This was the worst Batman movie ever! this was just long and boring."],
               "sentiment": ["positive","negative"],
               "encoded_polarity":[1,0]}
test_review_column_names = ['review','sentiment','encoded_polarity']
df_test = pd.DataFrame(test_review)
df_test.columns = test_review_column_names
df_test

| | review | sentiment | encoded_polarity |
|---|---|---|---|
| 0 | I'll admit I raised an eyebrow when I saw that... | positive | 1 |
| 1 | Not sure why this is rated so highly. This was... | negative | 0 |


In [78]:
valid_loader = custom_loader(df_test, tokenizer, 2,512)

VAL Dataset: (2, 2)


In [79]:
loss_function = torch.nn.CrossEntropyLoss()

In [89]:
wandb.init(project="MMD-Project-IMDB-Sentiment-Analysis")
target_list, output_list = valid(pickled_model, device, valid_loader, loss_function)
print(target_list, output_list)

Validation Accuracy Epoch: 100.0
Validation Accuracy per 100 steps: 100.0
Validation Loss Epoch: 0.00196
Validation Loss per 100 steps: 0.00196




The Validation Accuracy: 100.0
[1, 0] [1, 0]


In [87]:
output_list
target_list

[1, 0]

In [88]:
result_df = df_test.copy()  # work on a copy so df_test itself is not modified
result_df['Predicted output'] = output_list
result_df.rename(columns = {'sentiment':'Actual sentiment'}, inplace = True)
result_df = result_df.drop('encoded_polarity', axis=1)
result_df.loc[result_df['Predicted output'] == 0, 'Predicted sentiment'] = 'negative' 
result_df.loc[result_df['Predicted output'] == 1, 'Predicted sentiment'] = 'positive' 
result_df = result_df.drop('Predicted output', axis=1)
result_df

| | review | Actual sentiment | Predicted sentiment |
|---|---|---|---|
| 0 | I'll admit I raised an eyebrow when I saw that... | positive | positive |
| 1 | Not sure why this is rated so highly. This was... | negative | negative |


In [55]:
pd.options.display.max_colwidth = 50

- When we test our model against this test set, it runs quickly because the set contains only two reviews.

- The model predicts both reviews correctly, giving 100% accuracy on this set. With only two examples, this is a sanity check rather than a measure of performance.

- The result table above shows the same outcome by comparing the predicted sentiment with the actual sentiment of each review.

- The aim of this last section is to show how our model works on an input that we can inspect directly.