
# Sentiment Analysis with Deep Learning using BERT

In [1]:
from google.colab import drive
drive.mount('/content/drive/')

Mounted at /content/drive/


### Prerequisites

- Intermediate-level knowledge of Python 3 (NumPy and Pandas preferably, but not required)
- Exposure to PyTorch usage
- Basic understanding of Deep Learning and Language Models (BERT specifically)

### Project Outline

**Task 1**: Introduction (this section)

**Task 2**: Exploratory Data Analysis and Preprocessing

**Task 3**: Training/Validation Split

**Task 4**: Loading Tokenizer and Encoding our Data

**Task 5**: Setting up BERT Pretrained Model

**Task 6**: Creating Data Loaders

**Task 7**: Setting Up Optimizer and Scheduler

**Task 8**: Defining our Performance Metrics

**Task 9**: Creating our Training Loop

**Task 10**: Loading and Evaluating our Model

## Task 1: Introduction

### What is BERT

BERT (Bidirectional Encoder Representations from Transformers) is a large-scale transformer-based language model that can be fine-tuned for a variety of tasks.

For more information, the original paper can be found [here](https://arxiv.org/abs/1810.04805). 

[HuggingFace documentation](https://huggingface.co/transformers/model_doc/bert.html)

[Bert documentation](https://characters.fandom.com/wiki/Bert_(Sesame_Street)) ;)

<img src="Images/BERT_diagrams.pdf" width="1000">

## Task 2: Exploratory Data Analysis and Preprocessing

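The SMILE Twitter dataset is cited below, but the cells in this notebook load `/content/training.csv`, whose rows are student comments on online lectures labelled `Positive`, `Negative`, or `Neutral`.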

_Wang, Bo; Tsakalidis, Adam; Liakata, Maria; Zubiaga, Arkaitz; Procter, Rob; Jensen, Eric (2016): SMILE Twitter Emotion dataset. figshare. Dataset. https://doi.org/10.6084/m9.figshare.3187909.v2_

In [1]:
import torch
import pandas as pd
from tqdm.notebook import tqdm

In [3]:
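df = pd.read_csv(
    '/content/training.csv',
    # names= without header=0 reads the file's header line as a data row;
    # that row is filtered out further down.
    names=['id', 'text', 'category', 'aspects']
)
df.set_index('id', inplace=True)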

In [40]:
df.head()

| id | text | category | aspects |
|---|---|---|---|
| id | text | category | aspects |
| 1 | Due to the situation created as a result of th... | Negative | I |
| 2 | During the lectures there were also various pr... | Positive | X |
| 3 | In general I can say that during this difficul... | Positive | M |
| 4 | This situation has continued even during the e... | Negative | G |


In [4]:
df.text.iloc[1]

'Due to the situation created as a result of the pandemic, we, like most institutions in Kosovo, have been forced to attend online lectures'

In [5]:
df.category.value_counts()

Positive    354
Negative     79
Neutral      66
category      1
Name: category, dtype: int64
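
The stray `category    1` count is what one would expect when a CSV header line is read as data, which is how `pd.read_csv` behaves when `names=` is given without `header=0`. A minimal sketch of that behavior, using a made-up in-memory CSV (the two sample rows are hypothetical, not taken from `training.csv`):

```python
from io import StringIO
import pandas as pd

# Hypothetical two-row CSV with a header line, standing in for training.csv.
csv_text = "id,text,category\n1,good lecture,Positive\n2,bad audio,Negative\n"
columns = ['id', 'text', 'category']

# names= without header=: the header line is kept as an ordinary data row.
with_header_row = pd.read_csv(StringIO(csv_text), names=columns)

# header=0 marks line 0 as a header, which names= then replaces.
clean = pd.read_csv(StringIO(csv_text), names=columns, header=0)

print(len(with_header_row), len(clean))
```

Passing `header=0` at load time would make the later `df.category != 'category'` filter unnecessary; the notebook keeps the filter instead.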

In [6]:
df = df[~df.category.str.contains(r'\|')]  # raw string: match a literal '|' (multi-label rows)
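
`str.contains` treats its pattern as a regular expression by default, so the pipe has to be escaped. The sketch below uses the standard `re` module on a toy list to show the difference; the `'happy|surprise'` entry is a hypothetical multi-label value, not one taken from this dataset.

```python
import re

labels = ['Positive', 'Negative', 'happy|surprise']  # last entry is hypothetical

# Unescaped, '|' is alternation between two empty patterns and matches every string.
unescaped = [bool(re.search('|', s)) for s in labels]

# Escaped, it matches only a literal pipe character.
escaped = [bool(re.search(r'\|', s)) for s in labels]

print(unescaped)
print(escaped)
```

In this notebook the class counts are the same before and after the filter, so no category here appears to contain a pipe.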

In [7]:
df = df[df.category != 'category']

In [8]:
df.category.value_counts()

Positive    354
Negative     79
Neutral      66
Name: category, dtype: int64

In [9]:
possible_labels = df.category.unique()

In [10]:
label_dict = {}
# Use a separate loop variable so that possible_labels is not overwritten
for index, possible_label in enumerate(possible_labels):
    label_dict[possible_label] = index


In [11]:
label_dict

{'Negative': 0, 'Positive': 1, 'Neutral': 2}
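
The ids above follow the order in which each category first appears in the data. As a sketch on toy data, the block below reproduces that ordering and shows a sorted alternative whose ids do not depend on row order; the names `label_dict_sorted` and `id_to_label` are illustrative and do not appear in the notebook.

```python
categories = ['Negative', 'Positive', 'Positive', 'Neutral', 'Negative']  # toy data

# Order of first appearance, which is how Series.unique() orders its result.
first_seen = list(dict.fromkeys(categories))
label_dict_first_seen = {name: idx for idx, name in enumerate(first_seen)}

# Sorted alternative: the ids stay the same if the rows are shuffled.
label_dict_sorted = {name: idx for idx, name in enumerate(sorted(set(categories)))}

# Inverse mapping for turning predicted ids back into category names.
id_to_label = {idx: name for name, idx in label_dict_sorted.items()}

print(label_dict_first_seen)
print(label_dict_sorted)
```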

In [12]:
df['label'] = df.category.replace(label_dict)
df.head()

| id | text | category | aspects | label |
|---|---|---|---|---|
| 1 | Due to the situation created as a result of th... | Negative | I | 0 |
| 2 | During the lectures there were also various pr... | Positive | X | 1 |
| 3 | In general I can say that during this difficul... | Positive | M | 1 |
| 4 | This situation has continued even during the e... | Negative | G | 0 |
| 5 | The same process was followed in the Human-Com... | Neutral | L | 2 |


## Task 3: Training/Validation Split

In [13]:
from sklearn.model_selection import train_test_split

In [14]:
X_train, X_val, y_train, y_val = train_test_split(
    df.index.values,
    df.label.values,
    test_size=0.15,
    random_state=17,
    stratify=df.label.values
    )

In [15]:
df['data_type'] = ['not_set'] * df.shape[0]

In [16]:
df.head()

| id | text | category | aspects | label | data_type |
|---|---|---|---|---|---|
| 1 | Due to the situation created as a result of th... | Negative | I | 0 | not_set |
| 2 | During the lectures there were also various pr... | Positive | X | 1 | not_set |
| 3 | In general I can say that during this difficul... | Positive | M | 1 | not_set |
| 4 | This situation has continued even during the e... | Negative | G | 0 | not_set |
| 5 | The same process was followed in the Human-Com... | Neutral | L | 2 | not_set |


In [17]:
df.loc[X_train, 'data_type'] = 'train'
df.loc[X_val, 'data_type'] = 'val'

In [18]:
df.groupby(['category','label','data_type']).count()

| category | label | data_type | text | aspects |
|---|---|---|---|---|
| Negative | 0 | train | 67 | 67 |
| Negative | 0 | val | 12 | 12 |
| Neutral | 2 | train | 56 | 56 |
| Neutral | 2 | val | 10 | 10 |
| Positive | 1 | train | 301 | 301 |
| Positive | 1 | val | 53 | 53 |
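
Because the split was stratified on the label, each class should take up roughly the same share of the training and validation sets. A quick check using the counts from the table above:

```python
# class -> (train count, val count), copied from the groupby output
counts = {'Negative': (67, 12), 'Neutral': (56, 10), 'Positive': (301, 53)}

train_total = sum(train for train, _ in counts.values())
val_total = sum(val for _, val in counts.values())

for name, (train, val) in counts.items():
    print(f'{name}: train share {train / train_total:.3f}, '
          f'val share {val / val_total:.3f}')
```

The shares agree to within about one percentage point, and they also show how imbalanced the data is: roughly 71% of the examples are `Positive`.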


## Task 4: Loading Tokenizer and Encoding our Data

In [19]:
pip install transformers

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting transformers
  Downloading transformers-4.23.1-py3-none-any.whl (5.3 MB)
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1
  Downloading tokenizers-0.13.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.6 MB)
Collecting huggingface-hub<1.0,>=0.10.0
  Downloading huggingface_hub-0.10.1-py3-none-any.whl (163 kB)
Installing collected packages: tokenizers, huggingface-hub, transformers
Successfully installed huggingface-hub-0.10.1 tokenizers-0.13.1 transformers-4.23.1


In [20]:
from transformers import BertTokenizer
from torch.utils.data import TensorDataset

In [21]:
tokenizer = BertTokenizer.from_pretrained(
    'bert-base-uncased',
    do_lower_case = True
    )


In [22]:
df.data_type=='train'

id
1       True
2       True
3       True
4      False
5       True
       ...  
495    False
496     True
497     True
498    False
499     True
Name: data_type, Length: 499, dtype: bool

In [23]:
df[df.data_type=='train']

| id | text | category | aspects | label | data_type |
|---|---|---|---|---|---|
| 1 | Due to the situation created as a result of th... | Negative | I | 0 | train |
| 2 | During the lectures there were also various pr... | Positive | X | 1 | train |
| 3 | In general I can say that during this difficul... | Positive | M | 1 | train |
| 5 | The same process was followed in the Human-Com... | Neutral | L | 2 | train |
| 6 | But I can freely say that we have not had any ... | Positive | G | 1 | train |
| ... | ... | ... | ... | ... | ... |
| 493 | Our direction involves 90% of the work being d... | Positive | L | 1 | train |
| 494 | Evaluation-The evaluation part has been one of... | Positive | V | 1 | train |
| 496 | Learning - I personally think most students ha... | Positive | L | 1 | train |
| 497 | Replacement of the evaluation system in some c... | Neutral | M | 2 | train |


In [24]:
df[df.data_type=='train'].text.values

array(['Due to the situation created as a result of the pandemic, we, like most institutions in Kosovo, have been forced to attend online lectures',
       'During the lectures there were also various projects from which we benefited a lot and these projects although from a distance I think have been quite useful',
       'In general I can say that during this difficult period we have managed to successfully complete the online learning process',
       'The same process was followed in the Human-Computer Interaction (HCI) course',
       'But I can freely say that we have not had any setbacks in any aspect, because we have had support at all times from both the professor of the subject and the institution in general',
       'Self-confidence when I started lecturing on Computer-Human Interaction was a positive belief',
       'At first I was worried about how we would handle the quizzes, how we would engage during the lectures, since we were teaching online for safer health reasons',


We encode the texts with `tokenizer.batch_encode_plus`.

In [25]:
encoded_data_train = tokenizer.batch_encode_plus(
    df[df.data_type=='train'].text.values,
    add_special_tokens=True,
    return_attention_mask=True,
    padding=True,      # pad to the longest text in this call
    truncation=True,   # cut texts longer than max_length
    max_length=256,
    return_tensors='pt'
)
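
To make the `padding`, `truncation`, and `attention_mask` arguments concrete, here is a simplified pure-Python sketch. `pad_batch` is a hypothetical helper, not part of `transformers`, and the token ids are made up apart from the pad id `0`. Unlike the real tokenizer, this sketch truncates by plain slicing and does not keep the closing special token.

```python
def pad_batch(sequences, max_length, pad_id=0):
    """Truncate to max_length, then pad every sequence to the longest one left."""
    truncated = [seq[:max_length] for seq in sequences]
    longest = max(len(seq) for seq in truncated)
    input_ids = [seq + [pad_id] * (longest - len(seq)) for seq in truncated]
    # 1 marks a real token, 0 marks padding the model should ignore.
    attention_mask = [[1] * len(seq) + [0] * (longest - len(seq))
                      for seq in truncated]
    return input_ids, attention_mask


batch = [[101, 7, 8, 102], [101, 7, 8, 9, 10, 11, 102]]  # toy token ids
input_ids, attention_mask = pad_batch(batch, max_length=6)
print(input_ids)
print(attention_mask)
```

Since each split is encoded in a single call, every text in that split ends up padded to the length of its longest text, capped at `max_length`.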


In [26]:
encoded_data_val = tokenizer.batch_encode_plus(
    df[df.data_type=='val'].text.values,
    add_special_tokens=True,
    return_attention_mask=True,
    padding=True,      # pad to the longest text in this call
    truncation=True,   # cut texts longer than max_length
    max_length=256,
    return_tensors='pt'
)


In [27]:
# Tensors for the training split are built in the next cell

In [28]:
input_ids_train = encoded_data_train['input_ids']
attention_masks_train = encoded_data_train['attention_mask']
labels_train = torch.tensor(df[df.data_type=='train'].label.values) 


In [29]:
# Tensors for the validation split
input_ids_val = encoded_data_val['input_ids']
attention_masks_val = encoded_data_val['attention_mask']
labels_val = torch.tensor(df[df.data_type=='val'].label.values)

In [30]:
dataset_train = TensorDataset(
    input_ids_train,
    attention_masks_train,
    labels_train
)


In [31]:
dataset_val = TensorDataset(
    input_ids_val,
    attention_masks_val,
    labels_val
)


In [32]:
len(dataset_train)


424

In [33]:
len(dataset_val)

75

## Task 5: Setting up BERT Pretrained Model

In [34]:
from transformers import BertForSequenceClassification

In [35]:
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', 
     num_labels=len(label_dict),
     output_attentions=False,
     output_hidden_states=False)



Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.LayerNorm.bias', 'cls.predictions.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.dense.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

## Task 6: Creating Data Loaders

In [37]:
from torch.utils.data import DataLoader, RandomSampler, SequentialSampler

In [38]:
# Settings used on a Google Colab GPU instance (K80):
# batch_size = 32
# epochs = 10

In [39]:
batch_size = 32
dataloader_train = DataLoader(
    dataset_train,
    sampler=RandomSampler(dataset_train),
    batch_size=batch_size
)

In [40]:
dataloader_val = DataLoader(
    dataset_val,
    sampler=SequentialSampler(dataset_val),
    batch_size=batch_size 
)
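
With the default `drop_last=False`, a `DataLoader` yields `ceil(n / batch_size)` batches, and the last batch may be smaller than the rest. For the dataset sizes reported earlier (424 training and 75 validation examples), that works out as follows:

```python
import math

batch_size = 32
n_train, n_val = 424, 75  # len(dataset_train) and len(dataset_val)

train_batches = math.ceil(n_train / batch_size)
val_batches = math.ceil(n_val / batch_size)

# Size of the final, partially filled training batch.
last_train_batch = n_train - (train_batches - 1) * batch_size

print(train_batches, val_batches, last_train_batch)
```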

## Task 7: Setting Up Optimizer and Scheduler

In [41]:
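from torch.optim import AdamW  # transformers.AdamW is deprecated in favor of the PyTorch optimizer
from transformers import get_linear_schedule_with_warmup

In [42]:
optimizer = AdamW(
    model.parameters(),
    lr=1e-5,           # the BERT paper suggests 2e-5 to 5e-5 for fine-tuning
    eps=1e-8,
    weight_decay=0.0   # matches the default of the deprecated transformers.AdamW
)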




In [53]:
epochs = 10

scheduler = get_linear_schedule_with_warmup(
        optimizer,
        num_warmup_steps=0,
        num_training_steps=len(dataloader_train)*epochs
)
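
The scheduler scales the base learning rate by a factor that rises linearly during warmup and then decays linearly to zero. The function below is a sketch of that rule as I understand it from the library's documentation; `linear_schedule_factor` is an illustrative name, not a `transformers` function. With `num_warmup_steps=0`, the rate simply decays from `1e-5` to `0` over all training steps.

```python
def linear_schedule_factor(step, num_warmup_steps, num_training_steps):
    """Multiplier applied to the base learning rate at a given optimizer step."""
    if step < num_warmup_steps:
        return step / max(1, num_warmup_steps)
    remaining = num_training_steps - step
    return max(0.0, remaining / max(1, num_training_steps - num_warmup_steps))


base_lr = 1e-5
total_steps = 14 * 10  # 14 batches per epoch (424 examples, batch size 32), 10 epochs

for step in (0, 70, 140):
    print(step, base_lr * linear_schedule_factor(step, 0, total_steps))
```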


## Task 8: Defining our Performance Metrics

The per-class accuracy function below is adapted from the accuracy function in [this tutorial](https://mccormickml.com/2019/07/22/BERT-fine-tuning/#41-bertforsequenceclassification).

In [54]:
import numpy as np

In [55]:
from sklearn.metrics import f1_score

In [56]:
# Example: the model scores each class, e.g. preds = [0.9, 0.05, 0.05];
# np.argmax turns those scores into the predicted class index, here 0.

In [57]:
def f1_score_func(preds, labels):
    preds_flat = np.argmax(preds, axis=1).flatten()
    labels_flat = labels.flatten()
    return f1_score(labels_flat, preds_flat, average='weighted')
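
`average='weighted'` computes an F1 score per class and averages them weighted by each class's number of true examples. The pure-Python sketch below (`weighted_f1` is an illustrative helper, not the scikit-learn implementation) applies that definition to a predictor that always answers `Positive` on a validation split of 12 `Negative`, 53 `Positive`, and 10 `Neutral` examples.

```python
def weighted_f1(labels, preds):
    """Support-weighted mean of per-class F1 scores."""
    total = 0.0
    for cls in set(labels):
        tp = sum(1 for y, p in zip(labels, preds) if y == cls and p == cls)
        fp = sum(1 for y, p in zip(labels, preds) if y != cls and p == cls)
        fn = sum(1 for y, p in zip(labels, preds) if y == cls and p != cls)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        total += f1 * (tp + fn)  # tp + fn is the class support
    return total / len(labels)


# Label ids as in label_dict: 0 = Negative, 1 = Positive, 2 = Neutral.
labels = [0] * 12 + [1] * 53 + [2] * 10
always_positive = [1] * len(labels)

print(weighted_f1(labels, always_positive))
```

The result, about 0.5852, coincides with the validation F1 reported after the first training epoch. That suggests the model starts out predicting the majority class for everything, and it is a reminder that weighted F1 can look reasonable while minority classes are ignored.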


In [58]:
def accuracy_per_class(preds, labels):
    label_dict_inverse = {v: k for k, v in label_dict.items()}
    preds_flat = np.argmax(preds, axis=1).flatten()
    labels_flat = labels.flatten()

    for label in np.unique(labels_flat):
        y_pred = preds_flat[labels_flat == label]
        y_true = labels_flat[labels_flat == label]
        print(f'Class:{label_dict_inverse[label]}')
        print(f'Accuracy:{len(y_pred[y_pred==label])}/{len(y_true)}\n')
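
The "accuracy" printed for each class is the number of correctly predicted examples among the true examples of that class, which is that class's recall. A small pure-Python sketch on toy data (`per_class_hits` is an illustrative helper, not part of the notebook):

```python
from collections import Counter


def per_class_hits(labels, preds):
    """Return {class: (correct, total)} over the true examples of each class."""
    totals = Counter(labels)
    correct = Counter(y for y, p in zip(labels, preds) if y == p)
    return {cls: (correct[cls], totals[cls]) for cls in sorted(totals)}


labels = [0, 0, 1, 1, 1, 2]  # toy true labels
preds = [0, 1, 1, 1, 1, 1]   # toy predictions
hits = per_class_hits(labels, preds)
print(hits)
```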


## Task 9: Creating our Training Loop

Approach adapted from an older version of HuggingFace's `run_glue.py` script. Accessible [here](https://github.com/huggingface/transformers/blob/5bfcd0485ece086ebcbed2d008813037968a9e58/examples/run_glue.py#L128).

In [59]:
import random

seed_val = 17
random.seed(seed_val)
np.random.seed(seed_val)
torch.manual_seed(seed_val)
torch.cuda.manual_seed_all(seed_val)

In [60]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device)
print(device)


cuda


In [62]:
def evaluate(dataloader_val):
    model.eval()
    loss_val_total = 0
    predictions, true_vals = [], []
    for batch in tqdm(dataloader_val):
        batch = tuple(b.to(device) for b in batch)
        inputs = {'input_ids':      batch[0],
                  'attention_mask': batch[1],
                  'labels':         batch[2],
                 }

        with torch.no_grad():        
            outputs = model(**inputs)
            
        loss = outputs[0]
        logits = outputs[1]
        loss_val_total += loss.item()

        logits = logits.detach().cpu().numpy()
        label_ids = inputs['labels'].cpu().numpy()
        predictions.append(logits)
        true_vals.append(label_ids)
    
    loss_val_avg = loss_val_total/len(dataloader_val) 
    
    predictions = np.concatenate(predictions, axis=0)
    true_vals = np.concatenate(true_vals, axis=0)
            
    return loss_val_avg, predictions, true_vals



In [63]:
for epoch in tqdm(range(1, epochs+1)):

    model.train()

    loss_train_total = 0

    progress_bar = tqdm(dataloader_train,
                        desc='Epoch {:1d}'.format(epoch),
                        leave=False,
                        disable=False)
    for batch in progress_bar:
        model.zero_grad()
        batch = tuple(b.to(device) for b in batch)
        inputs = {
            'input_ids':      batch[0],
            'attention_mask': batch[1],
            'labels':         batch[2]
        }
        outputs = model(**inputs)
        loss = outputs[0]
        loss_train_total += loss.item()
        loss.backward()

        # Clip the gradient norm to 1.0 before the optimizer step
        torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
        optimizer.step()
        scheduler.step()
        # loss is already averaged over the batch; len(batch) is 3 (the tuple
        # of tensors), not the batch size, so it must not be used as a divisor
        progress_bar.set_postfix({'training_loss': '{:.3f}'.format(loss.item())})

    #torch.save(model.state_dict(),f'Models/BERT_ft_epoch{epoch}.model')
    tqdm.write(f'\nEpoch {epoch}')  # f-string so the epoch number is printed

    loss_train_avg = loss_train_total/len(dataloader_train)
    tqdm.write(f'Training loss: {loss_train_avg}')

    val_loss, predictions, true_vals = evaluate(dataloader_val)
    val_f1 = f1_score_func(predictions, true_vals)
    tqdm.write(f'Validation loss: {val_loss}')
    tqdm.write(f'F1 Score (weighted): {val_f1}')


Epoch 1
Training loss: 0.7563998614038739
Validation loss: 0.7556255658467611
F1 Score (weighted): 0.5852083333333333

Epoch 2
Training loss: 0.7030011543205806
Validation loss: 0.7177067995071411
F1 Score (weighted): 0.6144316575812638

Epoch 3
Training loss: 0.6427640595606395
Validation loss: 0.706241250038147
F1 Score (weighted): 0.6539773633030448

Epoch 4
Training loss: 0.5932181371109826
Validation loss: 0.7226996819178263
F1 Score (weighted): 0.6918028985507246

Epoch 5
Training loss: 0.5168301931449345
Validation loss: 0.761785884698232
F1 Score (weighted): 0.7155037555037554

Epoch 6
Training loss: 0.4445469783885138
Validation loss: 0.7840244571367899
F1 Score (weighted): 0.7176393976393975

Epoch 7
Training loss: 0.42326629161834717
Validation loss: 0.8020831147829691
F1 Score (weighted): 0.7176393976393975

Epoch 8
Training loss: 0.38665240577289034
Validation loss: 0.8247719208399454
F1 Score (weighted): 0.7176393976393975

Epoch 9
Training loss: 0.38753579344068256
Validation loss: 0.8312948942184448
F1 Score (weighted): 0.7176393976393975

Epoch 10
Training loss: 0.3730886269892965
Validation loss: 0.8269752462704977
F1 Score (weighted): 0.7176393976393975


In [65]:
import os
# torch.save does not create missing directories, so make sure the folder exists
os.makedirs('/content/sample_data/Models', exist_ok=True)
torch.save(model.state_dict(), f'/content/sample_data/Models/BERT_ft_epoch{epoch}.model')

In [66]:
## https://ruslanmv.com/blog/Deep-Learning-using-BERT-and-Pytorch

## Task 10: Loading and Evaluating our Model



In [68]:
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=len(label_dict),
    output_attentions=False,
    output_hidden_states=False)

Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.LayerNorm.bias', 'cls.predictions.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.dense.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

In [69]:
len(label_dict)

3

In [70]:
# Print model's state_dict
print("Model's state_dict:")
for param_tensor in model.state_dict():
    print(param_tensor, "\t", model.state_dict()[param_tensor].size())

Model's state_dict:
bert.embeddings.position_ids 	 torch.Size([1, 512])
bert.embeddings.word_embeddings.weight 	 torch.Size([30522, 768])
bert.embeddings.position_embeddings.weight 	 torch.Size([512, 768])
bert.embeddings.token_type_embeddings.weight 	 torch.Size([2, 768])
bert.embeddings.LayerNorm.weight 	 torch.Size([768])
bert.embeddings.LayerNorm.bias 	 torch.Size([768])
bert.encoder.layer.0.attention.self.query.weight 	 torch.Size([768, 768])
bert.encoder.layer.0.attention.self.query.bias 	 torch.Size([768])
bert.encoder.layer.0.attention.self.key.weight 	 torch.Size([768, 768])
bert.encoder.layer.0.attention.self.key.bias 	 torch.Size([768])
bert.encoder.layer.0.attention.self.value.weight 	 torch.Size([768, 768])
bert.encoder.layer.0.attention.self.value.bias 	 torch.Size([768])
bert.encoder.layer.0.attention.output.dense.weight 	 torch.Size([768, 768])
bert.encoder.layer.0.attention.output.dense.bias 	 torch.Size([768])
bert.encoder.layer.0.attention.output.LayerNorm.weight 	 t

In [71]:
# Print optimizer's state_dict
print("Optimizer's state_dict:")
for var_name in optimizer.state_dict():
    print(var_name, "\t", optimizer.state_dict()[var_name])

Streaming output truncated to the last 5000 lines.
         4.4950e-05,  7.1885e-05,  4.1223e-06,  6.5378e-05, -3.6245e-05,
        -1.2590e-04,  2.6793e-05,  8.0330e-05,  6.8284e-05,  7.8053e-05,
        -2.4236e-05, -7.7125e-05, -3.5099e-05, -9.6682e-06, -1.6879e-04,
         1.4471e-04, -6.2484e-05, -3.4448e-05,  3.0049e-05, -7.9735e-05,
         1.1221e-05, -3.3684e-05, -1.1695e-04,  5.7106e-05, -1.2762e-05,
         8.0112e-05,  1.2194e-04, -2.5004e-05,  1.8140e-05, -2.4629e-05,
         2.6667e-05,  7.0275e-05, -1.9350e-05,  1.2061e-06, -5.4647e-05,
         5.9177e-05, -2.0693e-05,  4.3959e-05, -7.5885e-05,  9.5387e-05,
        -6.3131e-05, -3.0592e-05, -2.6578e-05,  5.9479e-05,  2.4488e-06,
         1.4533e-04, -6.6113e-07, -5.3513e-05, -7.2186e-05, -1.0967e-04,
        -6.5690e-06,  4.3892e-05, -9.9734e-05,  1.2607e-04,  3.2838e-06,
         7.2089e-05,  4.9225e-06, -1.6034e-06, -1.4460e-06,  3.7137e-05,
        -2.4685e-05,  7.7988e-05,  7.0141e-07,  6.6527e-05,

In [75]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')


In [76]:
_ = model.to(device)  # assigning to _ keeps the notebook from printing the model
# Make sure to call input = input.to(device) on any input tensors that you feed to the model

In [77]:
# Same absolute path that the checkpoint was saved to
PATH = '/content/sample_data/Models/BERT_ft_epoch10.model'

In [78]:
model.load_state_dict(
    torch.load(PATH, map_location=torch.device('cpu')))

<All keys matched successfully>

In [79]:
_, predictions, true_vals = evaluate(dataloader_val)



In [80]:
accuracy_per_class(predictions, true_vals)

Class:Negative
Accuracy:6/12

Class:Positive
Accuracy:50/53

Class:Neutral
Accuracy:1/10
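
The per-class counts can be combined into an overall accuracy and an unweighted (macro) average of the per-class rates. The latter treats the three classes equally, so it shows how much the small classes lag behind. The numbers below are copied from the output above.

```python
# class -> (correct, total) on the validation set
results = {'Negative': (6, 12), 'Positive': (50, 53), 'Neutral': (1, 10)}

correct = sum(c for c, _ in results.values())
total = sum(t for _, t in results.values())
overall_accuracy = correct / total

recalls = {name: c / t for name, (c, t) in results.items()}
macro_recall = sum(recalls.values()) / len(recalls)

print(f'Overall accuracy: {correct}/{total} = {overall_accuracy:.3f}')
print(f'Macro-averaged recall: {macro_recall:.3f}')
```

Overall accuracy is 57/75 = 0.76, while the macro-averaged recall is only about 0.51, since most of the correct predictions come from the `Positive` class.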

