<a href="https://colab.research.google.com/github/stang715/sentiment_analysis/blob/main/Project.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Sentiment Analysis with Deep Learning using BERT

### Prerequisites

- Intermediate-level knowledge of Python 3 (NumPy and Pandas preferably, but not required)
- Exposure to PyTorch usage
- Basic understanding of Deep Learning and Language Models (BERT specifically)

### Project Outline

**Task 1**: Introduction (this section)

**Task 2**: Exploratory Data Analysis and Preprocessing

**Task 3**: Training/Validation Split

**Task 4**: Loading Tokenizer and Encoding our Data

**Task 5**: Setting up BERT Pretrained Model

**Task 6**: Creating Data Loaders

**Task 7**: Setting Up Optimizer and Scheduler

**Task 8**: Defining our Performance Metrics

**Task 9**: Creating our Training Loop

**Task 10**: Loading and Evaluating our Model

## Task 1: Introduction

### What is BERT

BERT is a large-scale transformer-based Language Model that can be finetuned for a variety of tasks.

For more information, the original paper can be found [here](https://arxiv.org/abs/1810.04805).

[HuggingFace documentation](https://huggingface.co/transformers/model_doc/bert.html)

[Bert documentation](https://characters.fandom.com/wiki/Bert_(Sesame_Street)) ;)

<img src="Images/BERT_diagrams.pdf" width="1000">

## Task 2: Exploratory Data Analysis and Preprocessing

We will use the SMILE Twitter dataset.

_Wang, Bo; Tsakalidis, Adam; Liakata, Maria; Zubiaga, Arkaitz; Procter, Rob; Jensen, Eric (2016): SMILE Twitter Emotion dataset. figshare. Dataset. https://doi.org/10.6084/m9.figshare.3187909.v2_

In [None]:
import torch
import pandas as pd
from tqdm.notebook import tqdm

In [None]:
# a standard way of creating a pandas data frame
df = pd.read_csv(
    '/content/smile-annotations-final.csv',
    names = ['id', 'text', 'category'])

# set the id to be the index
df.set_index('id', inplace = True)

# after running the code above, we have a dataset ready!

In [None]:
# check the first 5 items in the dataset
df.head()

id,text,category
611857364396965889,@aandraous @britishmuseum @AndrewsAntonio Merc...,nocode
614484565059596288,Dorian Gray with Rainbow Scarf #LoveWins (from...,happy
614746522043973632,@SelectShowcase @Tate_StIves ... Replace with ...,happy
614877582664835073,@Sofabsports thank you for following me back. ...,happy
611932373039644672,@britishmuseum @TudorHistory What a beautiful ...,happy


In [None]:
df.text
# see what the text looks like

id
611857364396965889    @aandraous @britishmuseum @AndrewsAntonio Merc...
614484565059596288    Dorian Gray with Rainbow Scarf #LoveWins (from...
614746522043973632    @SelectShowcase @Tate_StIves ... Replace with ...
614877582664835073    @Sofabsports thank you for following me back. ...
611932373039644672    @britishmuseum @TudorHistory What a beautiful ...
                                            ...                        
613678555935973376    MT @AliHaggett: Looking forward to our public ...
613294681225621504                      @britishmuseum Upper arm guard?
615246897670922240             @MrStuchbery @britishmuseum Mesmerising.
613016084371914753    @NationalGallery The 2nd GENOCIDE against #Bia...
611566876762640384    @britishmuseum Experience #battlewaterloo from...
Name: text, Length: 3085, dtype: object

In [None]:
# look at the full text of the second row (position 1)
df.text.iloc[1]

'Dorian Gray with Rainbow Scarf #LoveWins (from @britishmuseum http://t.co/Q4XSwL0esu) http://t.co/h0evbTBWRq'

In [None]:
# check how the tweets are distributed across the emotion categories
df.category.value_counts()

# pattern: df + column name + value_counts()

# it counts how many times each unique value occurs in that column

nocode               1572
happy                1137
not-relevant          214
angry                  57
surprise               35
sad                    32
happy|surprise         11
happy|sad               9
disgust|angry           7
disgust                 6
sad|disgust             2
sad|angry               2
sad|disgust|angry       1
Name: category, dtype: int64

In [None]:
# first, drop every row whose category combines several emotions with the | symbol
# df[df.category.str.contains(r'\|')] would select exactly those rows,
# so we negate the mask with ~ to exclude them instead
# (| is a regex metacharacter, hence the backslash and the raw string)

df = df[~df.category.str.contains(r'\|')]

In [None]:
# second, remove all rows whose category is "nocode"
df = df[df.category != 'nocode']

In [None]:
# third, check the value counts again
df.category.value_counts()

happy           1137
not-relevant     214
angry             57
surprise          35
sad               32
disgust            6
Name: category, dtype: int64
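
Why the `|` has to be escaped in the filter above can be seen with Python's built-in `re` module. This is a small sketch on a handful of category strings taken from the counts above, not part of the notebook pipeline; the variable names are illustrative.

```python
import re

categories = ["happy", "happy|sad", "nocode", "sad|disgust|angry"]

# Unescaped, "|" means "empty pattern OR empty pattern",
# so it matches every string.
unescaped = [c for c in categories if re.search("|", c)]

# Escaped, it matches only strings containing a literal "|" character.
escaped = [c for c in categories if re.search(r"\|", c)]

print(unescaped)
print(escaped)
```

With the unescaped pattern every row would count as a match, so negating the mask would throw away the whole dataset.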

In [None]:
# fourth, we build a dictionary:
# the key is the emotion, the value is a unique integer from 0 to 5
possible_labels = df.category.unique()

# possible_labels is an array containing all of the unique labels

In [None]:
# create an empty dictionary
label_dictionary = {}

for index,possible_label in enumerate(possible_labels):
  label_dictionary[possible_label] = index


In [None]:
label_dictionary

{'happy': 0,
 'not-relevant': 1,
 'angry': 2,
 'disgust': 3,
 'sad': 4,
 'surprise': 5}
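
The same mapping can be written as a dictionary comprehension, and inverting it gives the index-to-name lookup that is handy when reporting results. A minimal sketch, assuming the label order shown above; `label_to_index` and `index_to_label` are illustrative names, not variables from the notebook.

```python
possible_labels = ["happy", "not-relevant", "angry", "disgust", "sad", "surprise"]

# emotion name -> integer id, in order of first appearance
label_to_index = {label: index for index, label in enumerate(possible_labels)}

# integer id -> emotion name, for turning predictions back into readable labels
index_to_label = {index: label for label, index in label_to_index.items()}

print(label_to_index)
print(index_to_label)
```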

In [None]:
df.head()

id,text,category
614484565059596288,Dorian Gray with Rainbow Scarf #LoveWins (from...,happy
614746522043973632,@SelectShowcase @Tate_StIves ... Replace with ...,happy
614877582664835073,@Sofabsports thank you for following me back. ...,happy
611932373039644672,@britishmuseum @TudorHistory What a beautiful ...,happy
611570404268883969,@NationalGallery @ThePoldarkian I have always ...,happy


In [None]:
# create a new column in the original data frame that holds
# the numeric label (0-5) of each row; we call that column 'label'
df['label'] = df.category.map(label_dictionary)
df.head()



id,text,category,label
614484565059596288,Dorian Gray with Rainbow Scarf #LoveWins (from...,happy,0
614746522043973632,@SelectShowcase @Tate_StIves ... Replace with ...,happy,0
614877582664835073,@Sofabsports thank you for following me back. ...,happy,0
611932373039644672,@britishmuseum @TudorHistory What a beautiful ...,happy,0
611570404268883969,@NationalGallery @ThePoldarkian I have always ...,happy,0


## Task 3: Training/Validation Split

In [None]:
from sklearn.model_selection import train_test_split

# we use sklearn's train_test_split because it supports stratified splitting

In [None]:
X_train, X_val, y_train, y_val = train_test_split(
    df.index.values,  # we split the tweet ids (the index), not the text itself
    df.label.values,
    test_size=0.15,  # hold out 15% of the data for validation
    random_state=17,
    stratify=df.label.values  # keep the class proportions the same in both sets
)

# returns four arrays:
# X_train and X_val (the ids of the training and validation sets),
# y_train and y_val (the matching labels)


In [None]:
# create a new column in our data frame that records which split each row belongs to
# every row starts as 'not_set' until we label the train and val sets
df['data_type'] = ['not_set'] * df.shape[0]

In [None]:
df.head()

id,text,category,label,data_type
614484565059596288,Dorian Gray with Rainbow Scarf #LoveWins (from...,happy,0,not_set
614746522043973632,@SelectShowcase @Tate_StIves ... Replace with ...,happy,0,not_set
614877582664835073,@Sofabsports thank you for following me back. ...,happy,0,not_set
611932373039644672,@britishmuseum @TudorHistory What a beautiful ...,happy,0,not_set
611570404268883969,@NationalGallery @ThePoldarkian I have always ...,happy,0,not_set


In [None]:
df.loc[X_train,'data_type'] = 'train'
df.loc[X_val,'data_type'] = 'val'

In [None]:
# group the data and check that the split is what we expected
df.groupby(['category','label', 'data_type']).count()

category,label,data_type,text
angry,2,train,48
angry,2,val,9
disgust,3,train,5
disgust,3,val,1
happy,0,train,966
happy,0,val,171
not-relevant,1,train,182
not-relevant,1,val,32
sad,4,train,27
sad,4,val,5
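
What stratification buys us can be approximated by hand: hold out about 15% of each class separately. The sketch below uses the class counts from the cleaned dataset and simple per-class rounding; this is an assumption made for illustration, not the actual procedure inside `train_test_split`.

```python
# class counts after cleaning, taken from the value_counts() output
class_counts = {
    "happy": 1137,
    "not-relevant": 214,
    "angry": 57,
    "surprise": 35,
    "sad": 32,
    "disgust": 6,
}
test_size = 0.15

# hold out roughly 15% of every class, so rare classes still reach the validation set
val_counts = {label: round(n * test_size) for label, n in class_counts.items()}
train_counts = {label: n - val_counts[label] for label, n in class_counts.items()}

for label in class_counts:
    print(label, train_counts[label], val_counts[label])
print("total", sum(train_counts.values()), sum(val_counts.values()))
```

Without stratification, a class with only 6 examples could easily end up entirely in one of the two sets.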


## Task 4: Loading Tokenizer and Encoding our Data

In [None]:
# the tokenizer splits text into tokens and maps each token to a numerical id
from transformers import BertTokenizer
from torch.utils.data import TensorDataset

In [None]:
# load the pretrained tokenizer
tokenizer = BertTokenizer.from_pretrained(
    'bert-base-uncased', # the uncased checkpoint was trained on lower-cased text
    do_lower_case = True # so we lower-case our input as well
    )

In [None]:
# now we convert the sentences from text form to encoded form
# for that we use the tokenizer's batch_encode_plus method
# "batch" because it takes multiple strings and converts them all into tokens at once
# we encode the training and validation data separately
encoded_data_train = tokenizer.batch_encode_plus(
   df[df.data_type == 'train'].text.values,  # 1st parameter: the sentences themselves
   add_special_tokens = True,  # add [CLS] and [SEP], which is how BERT knows where a sequence starts and ends
   return_attention_mask = True,  # needed because we use fixed-length inputs:
   # the attention mask marks the real tokens with 1 and the padding with 0
   padding = 'max_length',  # pad all sentences to max_length
   truncation = True,  # cut off anything longer than max_length
   max_length = 256,  # maximum sequence length, counted in tokens (not words)
   return_tensors = 'pt'  # return PyTorch tensors ('tf' would return TensorFlow tensors)
   )

# the same call, with train changed to val, for the validation set

encoded_data_val = tokenizer.batch_encode_plus(
   df[df.data_type == 'val'].text.values,
   add_special_tokens = True,
   return_attention_mask = True,
   padding = 'max_length',
   truncation = True,
   max_length = 256,
   return_tensors = 'pt'
   )


# then we pull out the pieces that BERT needs as input during training

# input ids: each token represented as a number
input_ids_train = encoded_data_train['input_ids']
# attention mask, returned as a PyTorch tensor
attention_masks_train = encoded_data_train['attention_mask']
# labels: build a tensor from the original data frame
labels_train = torch.tensor(df[df.data_type == 'train'].label.values)

# the same for the validation set

input_ids_val = encoded_data_val['input_ids']
attention_masks_val = encoded_data_val['attention_mask']
labels_val = torch.tensor(df[df.data_type == 'val'].label.values)

In [None]:
# now the encoded data is ready

# next we create two datasets: training and validation

dataset_train = TensorDataset(input_ids_train, attention_masks_train, labels_train)
# TensorDataset is the standard PyTorch way of bundling tensors into a dataset

dataset_val = TensorDataset(input_ids_val, attention_masks_val, labels_val)

# indexing either dataset returns all three items (input ids, attention mask, label) of one example

In [None]:
# sanity check: make sure everything so far is what we wanted
len(dataset_train) # check the length of our training dataset

1258

In [None]:
len(dataset_val) # check the length of our validation dataset

223

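Review of what we did in Task 4:
1. Encoded the dataset with the BERT tokenizer, which takes sentences and converts them to numbers so that a PyTorch model can use them.
2. Wrapped the input ids, attention masks and labels in `TensorDataset` objects, one for training and one for validation.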
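
To make the role of padding and the attention mask concrete, here is a plain-Python sketch of what happens to a single sequence. It is a simplification of the tokenizer's behaviour (for instance, it truncates naively without preserving the final `[SEP]`), and `pad_and_mask` is a hypothetical helper, not a `transformers` function. The ids 101, 102 and 0 stand for `[CLS]`, `[SEP]` and `[PAD]` in `bert-base-uncased`; the two ids in between are made up.

```python
def pad_and_mask(token_ids, max_length, pad_id=0):
    """Truncate or pad one sequence and build its attention mask."""
    kept = token_ids[:max_length]            # naive truncation
    n_pad = max_length - len(kept)
    input_ids = kept + [pad_id] * n_pad      # fixed-length ids
    attention_mask = [1] * len(kept) + [0] * n_pad  # 1 = real token, 0 = padding
    return input_ids, attention_mask


example = [101, 7592, 2088, 102]  # [CLS] token token [SEP]
ids, mask = pad_and_mask(example, max_length=8)
print(ids)
print(mask)
```

Every sequence comes out with the same length, which is what lets the encoded tweets be stacked into a single tensor.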

## Task 5: Setting up BERT Pretrained Model

We import `BertForSequenceClassification` from the HuggingFace `transformers` library.

We treat each sentence (tweet) as its own sequence.

Each sequence is classified into one of six classes (the six sentiment labels).

In [None]:
from transformers import BertForSequenceClassification


In [None]:
# the new model is built on pretrained BERT
# we use bert-base-uncased, the smaller variant, which is more computationally efficient
# the larger BERT is significantly heavier and needs more compute even just for inference
# it also takes more time to train
# we load it as BertForSequenceClassification, which is our fine-tuning setup:
# this is the point where we redefine the architecture to include the parts we need
# num_labels says how many output labels the final layer of BERT must have,
# i.e. how many classes it will be able to classify
model = BertForSequenceClassification.from_pretrained(
    'bert-base-uncased',
    num_labels = len(label_dictionary),
    output_attentions = False, # we don't want any unnecessary outputs from the model, so attention weights are not returned
    output_hidden_states = False, # we also don't need the hidden states, i.e. the layer outputs just before the prediction
    # they can be useful when BERT is used as an encoder, but we don't need them here

    )

Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Explanation of the fine-tuning setup above:

BERT takes in text and encodes it in a meaningful way, based on the huge corpus of data it was pretrained on.

Here we add one layer of size six on top of it, because we have six classes to predict. It is a classifier that outputs one score per class.

The scores are spread across the six classes, and we pick the class with the highest one.

Now the model is ready to be trained.

In the next task we create the data loaders.
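
The "pick the right one" step can be sketched in a few lines: the classification layer produces six raw scores (logits), softmax turns them into probabilities, and the prediction is the index of the largest value. The logits below are made up for illustration.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.1, 0.3, -0.5, -1.2, -0.8, 0.0]  # hypothetical output for one tweet
probs = softmax(logits)

# the predicted class is the position of the highest probability
predicted = max(range(len(probs)), key=probs.__getitem__)
print(predicted, [round(p, 3) for p in probs])
```

Because softmax preserves the ordering of the scores, taking the argmax of the logits directly gives the same class.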

## Task 6: Creating Data Loaders

The data loader offers a convenient way to iterate through a dataset in batches.

Here we import the random sampler and the sequential sampler, which control how the data is sampled for each batch.

We'll use the random sampler for training: it makes sense to randomize the order in which the model is exposed to the examples.

We'll use the sequential sampler for the validation dataset. There is no need to shuffle it, because at that point the weights are fixed and the order of the examples does not change the result.


In [None]:
from torch.utils.data import DataLoader, RandomSampler, SequentialSampler

In [None]:
# first, set the batch size; typical values are 32, 64, 128, etc.
# on a virtual machine with limited resources a smaller value such as 4 also works
# this run uses 32

batch_size = 32

In [None]:
# we need two data loaders
# the random sampler shuffles the training data, which prevents the model from
# learning anything from the order of the examples
data_loader_train = DataLoader(
  dataset_train,
  sampler = RandomSampler(dataset_train),
  batch_size = batch_size
)

# the validation data is read in order, since shuffling is unnecessary for evaluation
data_loader_val = DataLoader(
  dataset_val,
  sampler = SequentialSampler(dataset_val),
  batch_size = batch_size
)

# now our datasets are in the data loaders and ready to go
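
The difference between the two samplers, and the number of batches each loader yields, can be sketched without PyTorch. `batch_indices` is a hypothetical helper that only mimics the index order; the dataset sizes (1258 and 223) and the batch size of 32 are the ones used in this notebook.

```python
import random


def batch_indices(n_examples, batch_size, shuffle, seed=17):
    """Return lists of example indices, one list per batch."""
    order = list(range(n_examples))
    if shuffle:  # random sampling: a new order of the same indices
        random.Random(seed).shuffle(order)
    return [order[i:i + batch_size] for i in range(0, n_examples, batch_size)]


train_batches = batch_indices(1258, 32, shuffle=True)   # like a random sampler
val_batches = batch_indices(223, 32, shuffle=False)     # like a sequential sampler

print(len(train_batches), len(val_batches))
print(val_batches[0][:5])
```

The last batch of each loader is smaller than 32, because neither dataset size is a multiple of the batch size.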

## Task 7: Setting Up Optimizer and Scheduler

- The optimizer decides how the weights are updated at each training step, using our learning rate.

- We use AdamW below.

- Adam is a stochastic optimization method for updating the weights; AdamW is a variant that decouples weight decay from the gradient update.

In [None]:
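from torch.optim import AdamW
from transformers import get_linear_schedule_with_warmup
# AdamW comes from PyTorch, since the AdamW class in transformers is deprecated in favour of it
# get_linear_schedule_with_warmup is a function built into the transformers library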

In [None]:
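# set up the optimizer
# it controls how the weights are updated; the scheduler controls how the learning rate changes over time
# 1st parameter: model.parameters()
# 2nd: the learning rate, lr (the BERT paper recommends values between 2e-5 and 5e-5)
# 3rd: eps, a small constant that keeps the update numerically stable
optimizer = AdamW(
   model.parameters(),
   lr = 1e-5, # the BERT paper suggests 2e-5 to 5e-5
   eps = 1e-8,
   weight_decay = 0.0 # stated explicitly, because the default differs between libraries
)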





In [None]:
# set the number of epochs

epochs = 10 # 10 works well in this project

# set up the scheduler
# it takes the optimizer,
# the number of warm-up steps (0 here, so the learning rate starts at its full value)
# and the total number of training steps:
# the length of the training data loader multiplied by the number of epochs
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps = 0,
    num_training_steps = len(data_loader_train) * epochs
)
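
The shape of a linear schedule with warm-up can be written as a small function that returns the factor by which the base learning rate is multiplied at each step. This is a sketch of the idea as I understand it, not the library's source code; `linear_schedule_factor` is an illustrative name, and the 400 total steps assume 40 batches per epoch for 10 epochs.

```python
def linear_schedule_factor(step, num_warmup_steps, num_training_steps):
    """Multiplier for the base learning rate at a given optimizer step."""
    if step < num_warmup_steps:
        # warm-up: rise linearly from 0 to 1
        return step / max(1, num_warmup_steps)
    # afterwards: decay linearly from 1 down to 0 at the last step
    remaining = num_training_steps - step
    return max(0.0, remaining / max(1, num_training_steps - num_warmup_steps))


base_lr = 1e-5
total_steps = 40 * 10  # batches per epoch * epochs

for step in (0, 100, 200, 300, 400):
    factor = linear_schedule_factor(step, 0, total_steps)
    print(step, base_lr * factor)
```

With zero warm-up steps the learning rate starts at its full value and reaches zero on the final step, which is why `scheduler.step()` has to be called once per batch rather than once per epoch.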

## Task 8: Defining our Performance Metrics

- We are getting close to training the model.
- First we define the performance metrics that tell us how well the model actually performs.
- We need a few imports: NumPy and `f1_score` from scikit-learn.

Accuracy metric approach originally used in accuracy function in [this tutorial](https://mccormickml.com/2019/07/22/BERT-fine-tuning/#41-bertforsequenceclassification).

In [None]:
import numpy as np

In [None]:
from sklearn.metrics import f1_score

In [None]:
# this function receives preds (the model's predictions) and labels (the true labels)
# each prediction is a vector of 6 scores, one per class
# eg. preds = [0.9 0.5 0.2 0.4 0.06 0.1 ]
# we convert each vector to the index of its highest score
# eg. the vector above becomes class 0
def f1_score_func(preds, labels):

    preds_flat = np.argmax(preds, axis=1).flatten()  # index of the highest score in each row
    labels_flat = labels.flatten()
    # now the predictions and labels are in the same format

    return f1_score(labels_flat, preds_flat, average='weighted')
    # f1_score as defined in sklearn:
    # 1st parameter is the true labels
    # 2nd is the predicted labels
    # 3rd is the kind of average we want to use
    # the weighted average weights each class's F1 by how many samples that class has
    # eg. the disgust class has only 6 samples, so it is down-weighted in the average
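
To see what the weighted average does, the score can be computed by hand on a toy example: one F1 per class, each weighted by the share of true samples in that class. This is a plain-Python sketch written for illustration, with made-up labels; `weighted_f1` is not the sklearn implementation.

```python
def weighted_f1(labels, preds):
    """Per-class F1, averaged with weights proportional to class support."""
    total = len(labels)
    score = 0.0
    for cls in sorted(set(labels)):
        pairs = list(zip(labels, preds))
        tp = sum(1 for y, p in pairs if y == cls and p == cls)
        fp = sum(1 for y, p in pairs if y != cls and p == cls)
        fn = sum(1 for y, p in pairs if y == cls and p != cls)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        support = tp + fn  # number of true samples of this class
        score += f1 * support / total
    return score


labels = [0, 0, 0, 1, 1, 2]  # made-up true classes
preds = [0, 0, 1, 1, 1, 0]   # made-up predictions
print(weighted_f1(labels, preds))
```

Class 2 is never predicted correctly, but with a single sample it pulls the average down only slightly. This is why a weighted F1 can look high even when rare classes are missed entirely.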

In [None]:
"""
- print the accuracy per class
- i.e. take all the examples whose true label is class 5 and see how many of them were predicted as class 5
- to print class names we invert the earlier dictionary (happy -> 0 becomes 0 -> happy)
"""


def accuracy_per_class(preds, labels):
    label_dic_inverse = {v:k for k, v in label_dictionary.items()}
    # instead of key to value, we now have value to key

    preds_flat = np.argmax(preds, axis=1).flatten()  # index of the highest score in each row
    labels_flat = labels.flatten()

    # iterate through all unique labels
    for label in np.unique(labels_flat):

      # boolean indexing keeps only the examples whose true label is `label`
      y_preds = preds_flat[labels_flat == label] # eg. the predictions made for the truly happy tweets
      y_true = labels_flat[labels_flat == label] # eg. the true labels of those same tweets
      print(f'Class:{label_dic_inverse[label]}') # print the class name
      print(f'Accuracy:{len(y_preds[y_preds == label])}/{len(y_true)}\n') # correct predictions / number of examples


## Task 9: Creating our Training Loop

- We now create the training loop that fine-tunes the BERT model.

- This is the most important part of the project.


Approach adapted from an older version of HuggingFace's `run_glue.py` script. Accessible [here](https://github.com/huggingface/transformers/blob/5bfcd0485ece086ebcbed2d008813037968a9e58/examples/run_glue.py#L128).

That version of the HuggingFace script is outdated, but it still shows how the loop works.

In [None]:
import random

seed_val = 17
random.seed(seed_val)
np.random.seed(seed_val)
torch.manual_seed(seed_val)
torch.cuda.manual_seed_all(seed_val)

In [None]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device)

print(device)

cuda


In [None]:
"""
Evaluate the model on a data loader: returns the average loss,
the raw predictions (logits) and the true labels.

"""


def evaluate(dataloader_val):

    model.eval()

    loss_val_total = 0
    predictions, true_vals = [], []

    for batch in dataloader_val: # to show a progress bar (e.g. in Task 10),
                        # change this to: for batch in tqdm(dataloader_val)
        batch = tuple(b.to(device) for b in batch)

        inputs = {'input_ids':      batch[0],
                  'attention_mask': batch[1],
                  'labels':         batch[2],
                 }

        with torch.no_grad():
            outputs = model(**inputs)

        loss = outputs[0]
        logits = outputs[1]
        loss_val_total += loss.item()

        logits = logits.detach().cpu().numpy()
        label_ids = inputs['labels'].cpu().numpy()
        predictions.append(logits)
        true_vals.append(label_ids)

    loss_val_avg = loss_val_total/len(dataloader_val)

    predictions = np.concatenate(predictions, axis=0)
    true_vals = np.concatenate(true_vals, axis=0)

    return loss_val_avg, predictions, true_vals


In [None]:
!mkdir Models

In [None]:
"""
1. put the model in training mode with model.train()
2. set the training loss to 0 at the start of each epoch, so each epoch gets its own average training loss
3. wrap the data loader in a tqdm progress bar (shows how many batches have been trained and how many are left)
4. within each epoch, train on the data batch by batch
"""
for epoch in tqdm(range(1, epochs+1)):
    model.train()
    loss_train_total = 0
    progress_bar = tqdm(data_loader_train,
               desc= 'Epoch {:1d}'.format(epoch),
               leave=False,     # remove the bar once the epoch is finished
               disable = False) # set to True to hide the bar

    for batch in progress_bar:
      model.zero_grad() # reset the gradients to 0, the standard PyTorch procedure

      # each batch from the data loader holds 3 tensors
      batch = tuple(b.to(device) for b in batch) # make sure every item of the tuple is on the correct device

      # the three inputs that the BERT model accepts
      inputs = {
          'input_ids': batch[0], # 1st item in the tuple
          'attention_mask': batch[1], # marks real tokens (1) versus padding (0)
          'labels': batch[2] # passing the labels makes the model return the loss
      }

      # now get the outputs
      outputs = model(**inputs) # ** unpacks the dictionary into keyword arguments

      loss = outputs[0] # the first output is the loss
      loss_train_total += loss.item() # add up the loss
      loss.backward() # backpropagate

      torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0) # rescale the gradients so their norm is at most 1.0, which avoids extreme updates

      optimizer.step() # step the optimizer
      scheduler.step() # step the scheduler

      progress_bar.set_postfix({'training_loss': '{:.3f}'.format(loss.item())}) # show the current batch loss on the progress bar


    # outside the batch loop but inside the epoch loop: save the model after every epoch
    torch.save(model.state_dict(), f'Models/BERT_ft_epoch{epoch}.model')

    tqdm.write(f'\nEpoch {epoch}') # report which epoch we are on

    loss_train_avg = loss_train_total / len(data_loader_train) # average over the training batches
    tqdm.write(f'Training Loss: {loss_train_avg}')

    val_loss, predictions, true_vals = evaluate(data_loader_val)
    val_f1 = f1_score_func(predictions,true_vals)
    tqdm.write(f'Validation Loss: {val_loss}')
    tqdm.write(f'F1 Score (weighted): {val_f1}')



Epoch 1
Training Loss: 1.3363429041845458
Validation Loss: 0.5395298962082181
F1 Score (weighted): 0.8321753551372641

Epoch 2
Training Loss: 1.3083236691142832
Validation Loss: 0.5410180624042239
F1 Score (weighted): 0.8321753551372641

Epoch 3
Training Loss: 1.365514884037631
Validation Loss: 0.5396051704883575
F1 Score (weighted): 0.8321753551372641

Epoch 4
Training Loss: 1.2939832673541136
Validation Loss: 0.5411584973335266
F1 Score (weighted): 0.8321753551372641

Epoch 5
Training Loss: 1.3283305083002364
Validation Loss: 0.5392290864671979
F1 Score (weighted): 0.8321753551372641

Epoch 6
Training Loss: 1.324245452348675
Validation Loss: 0.5403023787907192
F1 Score (weighted): 0.8321753551372641

Epoch 7
Training Loss: 1.314737833504166
Validation Loss: 0.539145861353193
F1 Score (weighted): 0.8321753551372641

Epoch 8
Training Loss: 1.3099551179579325
Validation Loss: 0.5396607560770852
F1 Score (weighted): 0.8321753551372641

Epoch 9
Training Loss: 1.4064543055636543
Validation Loss: 0.5395066780703408
F1 Score (weighted): 0.8321753551372641

Epoch 10
Training Loss: 1.288321528583765
Validation Loss: 0.5396328078848975
F1 Score (weighted): 0.8321753551372641
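
The gradient clipping step in the training loop can be pictured as follows: measure the overall norm of all gradients and, if it exceeds the limit, scale every gradient down by the same factor. This is a plain-Python sketch of that idea on a flat list of numbers, not the PyTorch implementation; `clip_by_global_norm` is an illustrative name and the small constant in the denominator is an assumption made to avoid division by zero.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Scale gradients so that their combined L2 norm is at most max_norm."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    scale = min(1.0, max_norm / (total_norm + 1e-6))  # never scale up
    return [g * scale for g in grads], total_norm


clipped, norm_before = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
norm_after = math.sqrt(sum(g * g for g in clipped))
print(norm_before, norm_after)
```

The direction of the update is preserved; only its length is capped, which keeps a single unusual batch from producing an extreme weight change.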


## Task 10: Loading and Evaluating our Model

In [None]:
model = BertForSequenceClassification.from_pretrained("bert-base-uncased",
                            num_labels=len(label_dictionary),
                            output_attentions=False,
                            output_hidden_states=False)

# re-create the model architecture; the fine-tuned weights are loaded from disk in a later cell

Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [None]:
model.to(device)
pass

# move the model to the device; the trailing `pass` keeps the notebook from printing the model

In [None]:
model.load_state_dict(
    torch.load('/content/Models/BERT_ft_epoch10.model',
               map_location = device)  # load the weights onto the current device
)

<All keys matched successfully>

In [None]:
# run the evaluation function one more time
# we no longer care about the loss, only about the predictions and the true values
_, predictions, true_vals = evaluate(data_loader_val)




In [None]:
# print the per-class accuracy using the function we built earlier
accuracy_per_class(predictions, true_vals)

Class:happy
Accuracy:164/171

Class:not-relevant
Accuracy:19/32

Class:angry
Accuracy:8/9

Class:disgust
Accuracy:0/1

Class:sad
Accuracy:0/5

Class:surprise
Accuracy:0/5

