<a href="https://colab.research.google.com/github/AnaPao1998/Thesis/blob/main/roBERTa_Fine_Tunning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Fine Tuning Roberta for Sentiment Analysis





<a id='section01'></a>
### Importing Python Libraries and preparing the environment

At this step we import the libraries and modules needed to run the script:
* Pandas
* PyTorch
* PyTorch utilities for Dataset and Dataloader
* Transformers
* tqdm
* sklearn
* RoBERTa model and tokenizer

After that we prepare the device for CUDA execution. This configuration is needed if you want to use an onboard GPU; without one the code falls back to the CPU.

In [None]:
# Installing transformers
!pip install transformers==3.0.2



In [None]:
# Importing the libraries needed
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
import torch
import seaborn as sns
import transformers
import json
from tqdm import tqdm
from torch.utils.data import Dataset, DataLoader
from transformers import RobertaModel, RobertaTokenizer
import logging
logging.basicConfig(level=logging.ERROR)

In [None]:
# Setting up the device for GPU usage
from torch import cuda
device = 'cuda' if cuda.is_available() else 'cpu'

In [None]:
# Uploading files from local 
# DEPRESSION .CSV
from google.colab import files
uploaded = files.upload()

Saving depression.csv to depression (1).csv


In [None]:
# Dataset is now stored as a Pandas Dataframe
import io
depression_df = pd.read_csv(io.BytesIO(uploaded['depression.csv']))

In [None]:
depression_df.head()

Unnamed: 0,renderedContent,pre_processed,label_sentiment,probability_sentiment,label_emotion,probability_emotion,label_irony,probability_irony,Unnamed: 8,depresion
0,"Too many shootings, too many lives lost from e...","many shootings, many lives lost evil people po...",negative,0.722731,sadness,0.641728,non_irony,0.648788,,True
1,i hate having to walk back into my depression ...,hate walk back depression den i’ve completely ...,negative,0.651571,sadness,0.966754,non_irony,0.82253,,True
2,Seriously. 2019 decided to kick my ass one mor...,seriously. decided kick ass one time. i'm list...,negative,0.700923,anger,0.931618,non_irony,0.892144,,True
3,First 2020 depression... yay... 👌🏻 I hate my l...,first depression... yay... 👌🏻 hate life.,negative,0.743294,sadness,0.834031,irony,0.984495,,False
4,I hate that being upfront with your emotional ...,"hate upfront emotional state adult seen ""atten...",negative,0.919534,sadness,0.794845,non_irony,0.844464,,True


In [None]:
# Splitting dataset
# Split imbalanced dataset into train and test sets with stratification
from sklearn.model_selection import train_test_split

# full_X uses only 2 columns of my df
full_X = depression_df[['pre_processed', 'depresion']]
# depression_df = depression_df.head(100)
X=depression_df[['pre_processed']]
y=depression_df[['depresion']]

# Split into train and validation sets
# Stratification is useful for imbalanced data: "y" holds the labels, and stratify=y keeps the class proportions the same in both sets
# Since test_size is 0.20, the 1000 rows are split into 800 for training and 200 for validation
train_data, validation_data, y_train, y_test = train_test_split(full_X, y, test_size=0.20, random_state=1, stratify=y)

# Resetting the index so that positional lookups such as self.text[index] in the Dataset class work
train_data = train_data.reset_index(drop=True)
validation_data = validation_data.reset_index(drop=True)

print((train_data.shape))
print((validation_data.shape))

(800, 2)
(200, 2)
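
To make the effect of `stratify=y` concrete, here is a minimal pure-Python sketch of what a stratified split does. It is not how `train_test_split` is implemented; the helper `stratified_split` and the 840/160 label counts are assumptions chosen for illustration (they are consistent with the 168/32 validation support reported later in this notebook).

```python
import random
from collections import Counter, defaultdict

def stratified_split(labels, test_size=0.2, seed=1):
    """Return (train_idx, test_idx) so that every class is split in the same proportion."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)                       # shuffle within each class
        n_test = round(len(indices) * test_size)   # same fraction taken from every class
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Hypothetical imbalanced labels: 840 positive rows and 160 negative rows
labels = [True] * 840 + [False] * 160
train_idx, test_idx = stratified_split(labels)

train_counts = Counter(labels[i] for i in train_idx)
test_counts = Counter(labels[i] for i in test_idx)
print(len(train_idx), len(test_idx))
print(train_counts, test_counts)
```

Both subsets keep the same share of the minority class, which is the property the split above relies on.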


<a id='section03'></a>
### Preparing the Dataset and Dataloader

I will start by defining a few key variables that will be used later during the training/fine-tuning stage.
Next comes the Dataset class, which defines how the text is pre-processed before it is sent to the neural network. I will also define the Dataloader, which feeds the data to the neural network in batches for training and processing.
Dataset and Dataloader are PyTorch constructs for defining and controlling data pre-processing and its passage to the neural network. For further reading on Dataset and Dataloader, see the [docs at PyTorch](https://pytorch.org/docs/stable/data.html).

#### *SentimentData* Dataset Class
- This class accepts the Dataframe as input and generates the tokenized output that the RoBERTa model uses for training.
- The tokenizer uses the `encode_plus` method to perform tokenization and generate the necessary outputs, namely: `ids`, `attention_mask` and `token_type_ids`.
- To read further into the tokenizer, [refer to this document](https://huggingface.co/transformers/model_doc/roberta.html#robertatokenizer)
- `targets` is the depression label of each text, taken from the `depresion` column.
- The *SentimentData* class is used to create 2 datasets, for training and for validation.
- *Training Dataset* is used to fine tune the model: **80% of the original data**
- *Validation Dataset* is used to evaluate the performance of the model: the remaining 20%. The model has not seen this data during training.
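
To show what `ids` and `attention_mask` look like once a text is brought to a fixed length, here is a simplified sketch. The helper `pad_or_truncate` and the token ids are hypothetical; the real tokenizer does this internally and also keeps the closing special token when it truncates, which this sketch does not. The padding id of 1 matches the `padding_idx=1` shown in the model printout further down.

```python
def pad_or_truncate(token_ids, max_len, pad_id=1):
    """Return (ids, mask), both exactly max_len long."""
    ids = list(token_ids[:max_len])        # cut sequences that are too long
    n_pad = max_len - len(ids)             # number of padding positions needed
    mask = [1] * len(ids) + [0] * n_pad    # 1 = real token, 0 = padding
    ids = ids + [pad_id] * n_pad
    return ids, mask

# Hypothetical token ids: 0 and 2 stand in for the start and end tokens
short_ids, short_mask = pad_or_truncate([0, 100, 200, 2], max_len=6)
long_ids, long_mask = pad_or_truncate(list(range(10)), max_len=6)

print(short_ids, short_mask)
print(long_ids, long_mask)
```

The attention mask tells the model which positions hold real tokens, so the padding does not influence the prediction.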

#### Dataloader
- The Dataloader creates the training and validation loaders that feed data to the neural network in a defined manner. This is needed because the whole dataset cannot be loaded into memory at once, so the amount of data loaded into memory and passed to the neural network has to be controlled.
- This control is achieved with parameters such as `batch_size` (set on the Dataloader) and `max_len` (set on the Dataset).
- The training and validation dataloaders are used in the training and validation parts of the flow respectively.
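
As a rough illustration of how `batch_size` controls what reaches the network, the sketch below wraps a toy dataset in a `DataLoader`. `ToyDataset` and its random-free integer contents are assumptions made for the example; only the sizes (800 samples, batches of 8) mirror this notebook.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class ToyDataset(Dataset):
    """Hypothetical stand-in for SentimentData: integer ids and 0/1 targets."""
    def __init__(self, n_samples, seq_len):
        self.ids = torch.arange(n_samples * seq_len).reshape(n_samples, seq_len)
        self.targets = torch.arange(n_samples) % 2

    def __len__(self):
        return len(self.ids)

    def __getitem__(self, index):
        return {'ids': self.ids[index], 'targets': self.targets[index]}

loader = DataLoader(ToyDataset(800, 16), batch_size=8, shuffle=True, num_workers=0)
first = next(iter(loader))

print(len(loader))            # number of batches in one epoch
print(first['ids'].shape)     # one batch of ids
print(first['targets'].shape)
```

The default collate function stacks the per-sample dictionaries into one dictionary of batched tensors, which is why the training loop can index a batch with keys such as `data['ids']`.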

In [None]:
# Defining some key variables that will be used later on in the training
MAX_LEN = 256
TRAIN_BATCH_SIZE = 8
VALID_BATCH_SIZE = 4
# EPOCHS = 1
LEARNING_RATE = 1e-05
tokenizer = RobertaTokenizer.from_pretrained('roberta-base', truncation=True, do_lower_case=True)

In [None]:
# This class is defined to accept the Dataframe as input and generate tokenized output that is used by the Roberta model for training
class SentimentData(Dataset):
    def __init__(self, dataframe, tokenizer, max_len):
        self.tokenizer = tokenizer
        self.data = dataframe
        self.text = dataframe.pre_processed
        self.targets = self.data.depresion
        self.max_len = max_len

    def __len__(self):
        return len(self.text)

    def __getitem__(self, index):
        text = str(self.text[index])
        text = " ".join(text.split())

        # The tokenizer uses the encode_plus method to perform tokenization and generate the necessary outputs, namely: ids, attention_mask
        inputs = self.tokenizer.encode_plus(
            text,
            None,
            add_special_tokens=True,
            max_length=self.max_len,
            pad_to_max_length=True,
            return_token_type_ids=True
        )
        ids = inputs['input_ids']
        mask = inputs['attention_mask']
        token_type_ids = inputs["token_type_ids"]


        return {
            'ids': torch.tensor(ids, dtype=torch.long),
            'mask': torch.tensor(mask, dtype=torch.long),
            'token_type_ids': torch.tensor(token_type_ids, dtype=torch.long),
            'targets': torch.tensor(self.targets[index], dtype=torch.float)
        }

In [None]:
# train_size = 0.8
# train_data=new_df.sample(frac=train_size,random_state=200)
# validation_data=new_df.drop(train_data.index).reset_index(drop=True)
# train_data = train_data.reset_index(drop=True)

# The SentimentData class is used to create 2 datasets, for training and for validation.
print(f"FULL Dataset: {full_X.shape}")
# Training Dataset is used to fine tune the model: 80% of the original data
print(f"TRAIN Dataset: {train_data.shape}")
# Validation Dataset is used to evaluate the performance of the model. The model has not seen this data during training.
print(f"VALIDATION Dataset: {validation_data.shape}")

training_set = SentimentData(train_data, tokenizer, MAX_LEN)
#change to valid
valid_set = SentimentData(validation_data, tokenizer, MAX_LEN)

FULL Dataset: (1000, 2)
TRAIN Dataset: (800, 2)
VALIDATION Dataset: (200, 2)


In [None]:
train_params = {'batch_size': TRAIN_BATCH_SIZE,
                'shuffle': True,
                'num_workers': 0
                }

#valid params
test_params = {'batch_size': VALID_BATCH_SIZE,
                'shuffle': True,
                'num_workers': 0
                }

# The Dataloader creates the training and validation loaders that feed data to the neural network in a defined manner.
# This is needed because the whole dataset cannot be loaded into memory at once, so the amount of data loaded into memory and
# then passed to the neural network has to be controlled.

training_loader = DataLoader(training_set, **train_params)
validation_loader = DataLoader(valid_set, **test_params)

<a id='section04'></a>
### Creating the Neural Network for Fine Tuning

#### Neural Network
 - We will create a neural network with the `RobertaClass`.
 - This network consists of the RoBERTa language model followed by a `Linear` pre-classifier layer, a ReLU, a `dropout` and finally a `Linear` layer to obtain the final outputs.
 - The data will be fed to the RoBERTa language model as defined in the dataset.
 - The final layer outputs are compared to the depression label to determine the accuracy of the model's predictions.
 - We will create an instance of the network called `mental_health_model`. This instance will be used for training and then to save the final trained model for future inference.
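
The shape flow through the classification head can be checked without downloading RoBERTa. In this sketch a random tensor stands in for the encoder output (an assumption for illustration); the layer sizes follow the `RobertaClass` defined in the next cell.

```python
import torch

torch.manual_seed(0)
batch_size, seq_len, hidden = 4, 16, 768

# Random stand-in for the encoder output of shape (batch, sequence, hidden)
hidden_state = torch.randn(batch_size, seq_len, hidden)

head = torch.nn.Sequential(
    torch.nn.Linear(hidden, hidden),   # pre-classifier
    torch.nn.ReLU(),
    torch.nn.Dropout(0.3),
    torch.nn.Linear(hidden, 2),        # two classes: depression / non-depression
)
head.eval()                            # disable dropout for a deterministic check

cls_vector = hidden_state[:, 0]        # representation of the first token of each sequence
logits = head(cls_vector)
print(cls_vector.shape, logits.shape)
```

Each sample ends up with two raw scores (logits), one per class; the predicted label is the index of the larger score.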
 
#### Loss Function and Optimizer
 - The `Loss Function` and `Optimizer` are defined in the fine-tuning section below.
 - The `Loss Function` is used to calculate the difference between the output produced by the model and the actual label.
 - The `Optimizer` is used to update the weights of the neural network to improve its performance.
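
For intuition about what the loss measures, here is a hand-computed cross-entropy for a single sample in plain Python. The helper `cross_entropy` is written for this illustration only; `torch.nn.CrossEntropyLoss` computes the same quantity per sample and, by default, averages it over the batch.

```python
import math

def cross_entropy(logits, target):
    """Cross-entropy for one sample: -log(softmax(logits)[target])."""
    shift = max(logits)   # subtract the maximum for numerical stability
    log_sum_exp = shift + math.log(sum(math.exp(z - shift) for z in logits))
    return log_sum_exp - logits[target]

undecided = cross_entropy([0.0, 0.0], target=1)          # equal scores for both classes
confident_right = cross_entropy([-2.0, 2.0], target=1)   # high score on the true class
confident_wrong = cross_entropy([2.0, -2.0], target=1)   # high score on the wrong class

print(undecided, confident_right, confident_wrong)
```

An undecided two-class model has a loss of ln 2, roughly 0.693, which is close to the 0.725 printed at the first training step later in this notebook.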

In [None]:
# We will be creating a neural network with the RobertaClass.
class RobertaClass(torch.nn.Module):
# This network will have the Roberta Language model followed by a dropout and finally a Linear layer to obtain the final outputs.
    def __init__(self):
        super(RobertaClass, self).__init__()
        self.l1 = RobertaModel.from_pretrained("roberta-base")
        self.pre_classifier = torch.nn.Linear(768, 768)
        self.dropout = torch.nn.Dropout(0.3)
        # Output size changed to 2 instead of 5 (depression, non-depression)
        # Maps the 768 hidden features to the 2 classes
        self.classifier = torch.nn.Linear(768, 2)

    def forward(self, input_ids, attention_mask, token_type_ids):
        output_1 = self.l1(input_ids=input_ids, attention_mask=attention_mask, token_type_ids=token_type_ids)
        hidden_state = output_1[0]
        pooler = hidden_state[:, 0]
        pooler = self.pre_classifier(pooler)
        # adding a RELU layer
        pooler = torch.nn.ReLU()(pooler)
        # adding a dropout
        pooler = self.dropout(pooler)
        # adding a classifier
        output = self.classifier(pooler)
        return output

In [None]:
# We create an instance of the network called mental_health_model. This instance will be used for training and then to save the final trained model
mental_health_model = RobertaClass()
mental_health_model.to(device)

RobertaClass(
  (l1): RobertaModel(
    (embeddings): RobertaEmbeddings(
      (word_embeddings): Embedding(50265, 768, padding_idx=1)
      (position_embeddings): Embedding(514, 768, padding_idx=1)
      (token_type_embeddings): Embedding(1, 768)
      (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (encoder): BertEncoder(
      (layer): ModuleList(
        (0): BertLayer(
          (attention): BertAttention(
            (self): BertSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-05, eleme

<a id='section05'></a>
### Fine Tuning the Model

After all the effort of loading and preparing the data and datasets, creating the model and defining its loss and optimizer, this is probably the easiest step in the process.

Here we define a training function that trains the model on the training dataset created above for a specified number of epochs (`EPOCHS`). An epoch is one complete pass of the training data through the network.

The following events happen in this function to fine-tune the neural network:
- The dataloader passes data to the model based on the batch size.
- The output from the model and the actual category are compared to calculate the loss.
- The loss value is used to optimize the weights of the neurons in the network.
- Every 5000 steps the running loss value is printed in the console. With 800 training rows and a batch size of 8, an epoch has only 100 steps, so only step 0 is reported here.

As the training output shows, after 1 epoch the average training loss was about 0.446 and the training accuracy was 82.375%.
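
The sequence of events listed above can be sketched on a toy problem that runs in a fraction of a second. The data, the single `Linear` model and the learning rate of 0.1 are assumptions chosen so the example is self-contained; the order of the calls is the same as in the `train` function below.

```python
import torch

torch.manual_seed(0)

# Toy data: 64 samples with 4 features; the label depends on the sign of the first feature
X = torch.randn(64, 4)
y = (X[:, 0] > 0).long()

model = torch.nn.Linear(4, 2)                  # stand-in for the full network
loss_function = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)

losses = []
for step in range(50):
    logits = model(X)                          # forward pass
    loss = loss_function(logits, y)            # compare outputs with targets
    optimizer.zero_grad()                      # clear gradients from the previous step
    loss.backward()                            # backpropagation
    optimizer.step()                           # update the weights
    losses.append(loss.item())

accuracy = (logits.argmax(dim=1) == y).float().mean().item()
print(f"first loss {losses[0]:.3f}, last loss {losses[-1]:.3f}, accuracy {accuracy:.2f}")
```

The loss falls as the weights are updated, which is the same behaviour the fine-tuning loop aims for on the real data.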

In [None]:
# Creating the loss function and optimizer
loss_function = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(params =  mental_health_model.parameters(), lr=LEARNING_RATE)

In [None]:
def calcuate_accuracy(preds, targets):
    n_correct = (preds==targets).sum().item()
    return n_correct

In [None]:
# Defining the training function on the 80% of the dataset used for tuning the RoBERTa model
# It trains the model on the training dataset created above for one epoch; it is called once per epoch (EPOCHS)
# An epoch is one complete pass of the training data through the network.
def train(epoch):
    tr_loss = 0
    n_correct = 0
    nb_tr_steps = 0
    nb_tr_examples = 0
    mental_health_model.train()
    
    # The dataloader passes data to the model based on the batch size
    for i,data in tqdm(enumerate(training_loader, 0)):
        #print(i)
        # Here we pass the data to the GPU
        ids = data['ids'].to(device, dtype = torch.long)
        mask = data['mask'].to(device, dtype = torch.long)
        token_type_ids = data['token_type_ids'].to(device, dtype = torch.long)
        targets = data['targets'].to(device, dtype = torch.long)

        outputs = mental_health_model(ids, mask, token_type_ids)
        # Subsequent output from the model and the actual category are compared to calculate the loss.
        # Here we compare predictions with targets and we calculate the loss 
        # Loss value is used to optimize the weights of the neurons in the network.
        loss = loss_function(outputs, targets)
        tr_loss += loss.item()

        # big_val = highest logit for each sample and big_idx = predicted label (0 or 1)
        big_val, big_idx = torch.max(outputs.data, dim=1)

        # we calculate how many predictions are correct comparing big_idx(label) with the target
        n_correct += calcuate_accuracy(big_idx, targets)

        nb_tr_steps += 1
        nb_tr_examples+=targets.size(0)
        
        # Every 5000 steps the running loss and accuracy are printed in the console
        # to show how the algorithm is doing
        if i%5000==0:
            loss_step = tr_loss/nb_tr_steps
            accu_step = (n_correct*100)/nb_tr_examples 
            print(f"Training Loss per 5000 steps: {loss_step}")
            print(f"Training Accuracy per 5000 steps: {accu_step}")

        # Reset the gradients, backpropagate and take an optimizer step for this iteration
        # Gradients are set to 0 so that information from the previous batch is not carried over (start fresh)
        # Zero the gradients first, then run the backward pass
        optimizer.zero_grad()
        loss.backward()
        # where to move on the function space
        optimizer.step()

    print(f'The Total Accuracy for Epoch {epoch}: {(n_correct*100)/nb_tr_examples}')
    epoch_loss = tr_loss/nb_tr_steps
    epoch_accu = (n_correct*100)/nb_tr_examples
    print(f"Training Loss Epoch: {epoch_loss}")
    print(f"Training Accuracy Epoch: {epoch_accu}")

    return 

In [None]:
EPOCHS = 1
# 800/8 
# training size / batch size of training 
for epoch in range(EPOCHS):
    train(epoch)

0it [00:00, ?it/s]

Training Loss per 5000 steps: 0.7254340648651123
Training Accuracy per 5000 steps: 12.5


100it [36:09, 21.70s/it]

The Total Accuracy for Epoch 0: 82.375
Training Loss Epoch: 0.44582631267607215
Training Accuracy Epoch: 82.375





<a id='section06'></a>
### Validating the Model

During the validation stage we pass the unseen data (the validation dataset) to the model. This step determines how well the model performs on unseen data.

This unseen data is the 20% of `depression.csv` that was separated during the dataset creation stage.
During the validation stage the weights of the model are not updated. Only the final output is compared to the actual value, and this comparison is used to calculate the accuracy of the model.

As the classification report shows, the model reaches 84% accuracy on the validation set, but only because it predicts class 1 for every sample: recall for class 0 is 0.00. This may be improved by training more.
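
Two PyTorch switches keep validation from changing or perturbing the model: `eval()` disables dropout, and `torch.no_grad()` stops gradient tracking. The small sketch below shows both on toy tensors; the layer sizes are arbitrary choices for illustration.

```python
import torch

torch.manual_seed(0)
dropout = torch.nn.Dropout(0.3)
x = torch.ones(1000)

dropout.train()
train_out = dropout(x)     # training mode: some entries are zeroed, the rest rescaled
dropout.eval()
eval_out = dropout(x)      # evaluation mode: the input passes through unchanged

layer = torch.nn.Linear(4, 2)
with torch.no_grad():
    y = layer(torch.randn(3, 4))   # no computation graph is built for this output

print((train_out == 0).sum().item(), torch.equal(eval_out, x), y.requires_grad)
```

Without gradients there is nothing for an optimizer to apply, so the weights stay exactly as they were after training.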

In [None]:
def my_valid(mental_health_model, validation_loader):
    mental_health_model.eval()
    n_correct = 0; n_wrong = 0; total = 0; tr_loss=0; nb_tr_steps=0; nb_tr_examples=0

    with torch.no_grad():
        pred_total=[]
        targets_total = []
        for i, data in tqdm(enumerate(validation_loader, 0)):
            # Here we pass the data to the GPU
            ids = data['ids'].to(device, dtype = torch.long)
            mask = data['mask'].to(device, dtype = torch.long)
            token_type_ids = data['token_type_ids'].to(device, dtype=torch.long)
            targets = data['targets'].to(device, dtype = torch.long)

            # Here we make the predictions
            outputs = mental_health_model(ids, mask, token_type_ids).squeeze()
            loss = loss_function(outputs, targets)
            tr_loss += loss.item()
            big_val, big_idx = torch.max(outputs.data, dim=1)
            for v in big_idx:
              pred_total.append(v)
            for v in targets:
              targets_total.append(v)
    
    return pred_total , targets_total
    #return torch.reshape(torch.stack(pred_total),(-1,)), torch.reshape(torch.stack(targets_total),(-1,))


In [None]:
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score, classification_report, confusion_matrix
predictions, targets = my_valid(mental_health_model, validation_loader)
# tensor to numpy array
#predictions_numpy = predictions.cpu().detach().numpy()
#targets_numpy = targets.cpu().detach().numpy()
#print(predictions_numpy)
#print(targets_numpy)



50it [02:53,  3.46s/it]


In [None]:
# Move each 0-dim tensor to the CPU and convert it to a plain Python int
tmp_pred = [v.cpu().item() for v in predictions]
tmp_targets = [v.cpu().item() for v in targets]


In [None]:
print(classification_report(tmp_targets, tmp_pred))

              precision    recall  f1-score   support

           0       0.00      0.00      0.00        32
           1       0.84      1.00      0.91       168

    accuracy                           0.84       200
   macro avg       0.42      0.50      0.46       200
weighted avg       0.71      0.84      0.77       200

In [None]:
# sklearn metrics expect (y_true, y_pred)
f1_score(tmp_targets, tmp_pred)

0.9130434782608696
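
The F1 value can be reproduced by hand from the support counts in the classification report. The counts below are read off that report (168 positives and 32 negatives, all predicted as class 1); the variable names are only for this illustration.

```python
# Confusion counts for class 1 as the positive class
tp, fn = 168, 0    # all 168 true positives were predicted as 1
fp, tn = 32, 0     # all 32 negatives were also predicted as 1

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
recall_class_0 = tn / (tn + fp)    # share of negatives that were recognised

print(accuracy, precision, recall, f1, recall_class_0)
```

Accuracy and F1 for class 1 look strong even though no negative sample is recognised, which is why the per-class rows and the macro average of the report are worth reading on imbalanced data.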

In [None]:
tmp_targets

[array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(0),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(0),
 array(0),
 array(1),
 array(1),
 array(0),
 array(1),
 array(0),
 array(1),
 array(1),
 array(1),
 array(0),
 array(1),
 array(1),
 array(1),
 array(1),
 array(0),
 array(0),
 array(0),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(0),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(0),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(0),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),

In [None]:
tmp_pred

[array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),
 array(1),

In [None]:
torch.reshape(torch.stack(predictions),(-1,))

tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1])

In [None]:
torch.stack(targets)

tensor([[1, 1, 1, 0],
        [1, 1, 0, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 0, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 0, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 0],
        [1, 1, 0, 0],
        [0, 1, 0, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 0, 1],
        [1, 0, 1, 1],
        [1, 0, 1, 1],
        [0, 0, 1, 0],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 0, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 0],
        [0, 1, 0, 1],
        [1, 1, 0, 1],
        [0, 1, 1, 0],
        [1, 1, 1, 0],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 0, 1],
        [1, 1, 1, 1],
        [0, 1, 1, 0],
        [1, 1, 1, 1],
        [1, 1, 0, 1],
        [1, 1, 1, 1],
        [1, 1, 0, 0],
        [1, 1, 0, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 0],
        [1, 1, 1, 1],
        [1

In [None]:
targets

[tensor([1, 1, 1, 0]),
 tensor([1, 1, 0, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 0, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 0, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 0]),
 tensor([1, 1, 0, 0]),
 tensor([0, 1, 0, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 0, 1]),
 tensor([1, 0, 1, 1]),
 tensor([1, 0, 1, 1]),
 tensor([0, 0, 1, 0]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 0, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 0]),
 tensor([0, 1, 0, 1]),
 tensor([1, 1, 0, 1]),
 tensor([0, 1, 1, 0]),
 tensor([1, 1, 1, 0]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 0, 1]),
 tensor([1, 1, 1, 1]),
 tensor([0, 1, 1, 0]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 0, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 0, 0]),
 tensor([1, 1, 0, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1,

In [None]:
predictions

[tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1, 1, 1, 1]),
 tensor([1,

In [None]:
# from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score, classification_report, confusion_matrix
# predictions2, targets2 = my_valid2(model, testing_loader)
# predictions_numpy2 = predictions.cpu().detach().numpy()
# targets_numpy2 = targets.cpu().detach().numpy()
# #print(predictions_numpy)
# #print(targets_numpy)



In [None]:
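# predictions_numpy / targets_numpy were never created (the conversion cell is commented out),
# so the lists built earlier are used instead; sklearn metrics expect (y_true, y_pred)
recall_score(tmp_targets, tmp_pred)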

In [None]:
#recall_score(predictions_numpy2,targets_numpy2)

In [None]:
precision_score(tmp_targets, tmp_pred)


In [None]:
f1_score(tmp_targets, tmp_pred)


In [None]:
accuracy_score(tmp_targets, tmp_pred)

<a id='section07'></a>
### Saving the Trained Model Artifacts for inference

This is the final step in the process of fine tuning the model. 

The model and its vocabulary are saved locally. These files can then be used in the future to run inference on new text inputs.
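
The cell in this section pickles the whole module object with `torch.save`. A commonly used alternative, shown here as a sketch on a toy layer rather than as this notebook's method, is to save only the `state_dict` and load it into a freshly constructed model. The file name and the `Linear(4, 2)` stand-in are assumptions for illustration.

```python
import os
import tempfile
import torch

model = torch.nn.Linear(4, 2)    # toy stand-in for the fine-tuned network

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy_model_state.bin")   # hypothetical file name
    torch.save(model.state_dict(), path)                  # save the weights only

    restored = torch.nn.Linear(4, 2)                      # rebuild the architecture first
    restored.load_state_dict(torch.load(path))            # then load the saved weights

restored.eval()                                           # switch to inference mode
same_weights = torch.equal(model.weight, restored.weight)
print(same_weights)
```

Saving the weights separately from the class definition tends to be more robust when the code that defines the model changes between training and inference.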

In [None]:
output_model_file = 'pytorch_roberta_sentiment.bin'
output_vocab_file = './'

model_to_save = mental_health_model
torch.save(model_to_save, output_model_file)
tokenizer.save_vocabulary(output_vocab_file)

print('All files saved')
print('This tutorial is completed')