<a href="https://colab.research.google.com/github/harshil0217/BERT_headline_classifier_v2/blob/main/model_training.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
from google.colab import drive
drive.mount('/content/drive')
!pwd

Mounted at /content/drive
/content


In [2]:
!git clone https://github.com/harshil0217/BERT_headline_classifier_v2.git
import os
os.chdir('/content/BERT_headline_classifier_v2')

Cloning into 'BERT_headline_classifier_v2'...
remote: Enumerating objects: 66, done.
remote: Counting objects: 100% (66/66), done.
remote: Compressing objects: 100% (48/48), done.
remote: Total 66 (delta 30), reused 42 (delta 16), pack-reused 0
Receiving objects: 100% (66/66), 967.18 KiB | 21.03 MiB/s, done.
Resolving deltas: 100% (30/30), done.


In [3]:
#import needed libraries

import pandas as pd
import numpy as np
import torch

from torch.utils.data import Dataset
from transformers import TrainingArguments, Trainer, AutoTokenizer, AutoModelForSequenceClassification
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from itertools import product

In [4]:
train = pd.read_csv('./data/train.csv')
test = pd.read_csv('./data/test.csv')

In [5]:
train.head()

Unnamed: 0,text,sentiment
0,A survey conducted by Taloustutkimus for Sampo...,negative
1,The total value of the project is estimated to...,neutral
2,"In the first half of 2008 , the Bank 's operat...",negative
3,"In July-September 2009 , Konecranes ' sales de...",negative
4,Earnings per share ( EPS ) amounted to a loss ...,negative


In [6]:
train_labels = train['sentiment']
test_labels = test['sentiment']

In [7]:
# One-hot encode the sentiment labels with pd.get_dummies
train_labels = pd.get_dummies(train_labels)
test_labels = pd.get_dummies(test_labels)

In [8]:
train_labels

Unnamed: 0,negative,neutral,positive
0,True,False,False
1,False,True,False
2,True,False,False
3,True,False,False
4,True,False,False
...,...,...,...
1444,True,False,False
1445,False,False,True
1446,False,True,False
1447,False,False,True


In [9]:
#convert train and test labels to float
train_labels = train_labels.astype(float)
test_labels = test_labels.astype(float)
train_labels

Unnamed: 0,negative,neutral,positive
0,1.0,0.0,0.0
1,0.0,1.0,0.0
2,1.0,0.0,0.0
3,1.0,0.0,0.0
4,1.0,0.0,0.0
...,...,...,...
1444,1.0,0.0,0.0
1445,0.0,0.0,1.0
1446,0.0,1.0,0.0
1447,0.0,0.0,1.0


In [10]:
train_labels = train_labels.values.tolist()
test_labels = test_labels.values.tolist()
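
The label rows produced above are plain lists of floats, so the class name is only implied by column position. The sketch below shows one way to go back and forth between names and rows. It assumes the column order `negative, neutral, positive` seen in the `get_dummies` output; `LABEL_NAMES`, `one_hot`, and `decode` are illustrative names, not part of the notebook.

```python
# Column order assumed from the get_dummies output shown above.
LABEL_NAMES = ["negative", "neutral", "positive"]


def one_hot(label, names=LABEL_NAMES):
    """Return a float one-hot row for a label name."""
    return [1.0 if name == label else 0.0 for name in names]


def decode(row, names=LABEL_NAMES):
    """Map a one-hot (or score) row back to a label name via argmax."""
    best_index = max(range(len(row)), key=lambda i: row[i])
    return names[best_index]


rows = [one_hot(label) for label in ["negative", "neutral", "positive"]]
decoded = [decode(row) for row in rows]
print(rows)
print(decoded)
```

The same `decode` helper could be applied to a row of model scores to recover a single predicted class name.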

In [11]:
train_texts = train['text'].to_list()
test_texts = test['text'].to_list()

In [12]:
tokenizer = AutoTokenizer.from_pretrained('bert-base-cased')

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


In [13]:
train_encodings = tokenizer(train_texts, truncation=True, padding=True)
test_encodings = tokenizer(test_texts, truncation=True, padding=True)

In [14]:
#create dataset for headline classifier data

class HeadlineDataset(Dataset):
    def __init__(self, encodings, labels):
        self.encodings = encodings
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}
        item['labels'] = torch.tensor(self.labels[idx])
        return item

In [15]:
train_dataset = HeadlineDataset(train_encodings, train_labels)
test_dataset = HeadlineDataset(test_encodings, test_labels)

In [16]:
# Metrics callback for the Trainer
def compute_metrics(pred):
    # Convert logits to per-class probabilities (independent sigmoid per class)
    logits = pred.predictions
    probs = torch.sigmoid(torch.tensor(logits))

    # Threshold each class at 0.5 to get a binary indicator matrix
    preds = np.where(probs.numpy() >= 0.5, 1, 0)

    # True labels (one-hot rows)
    labels = pred.label_ids

    # Subset accuracy: a row is correct only if all three indicators match
    accuracy = accuracy_score(labels, preds)

    # Calculate precision, recall, and F1-score
    precision = precision_score(labels, preds, average='weighted')
    recall = recall_score(labels, preds, average='weighted')
    f1 = f1_score(labels, preds, average='weighted')

    return {
        'accuracy': accuracy,
        'precision': precision,
        'recall': recall,
        'f1': f1
    }
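
Since each headline has exactly one sentiment, it may help to see how thresholding independent sigmoids differs from picking the single highest-scoring class. The sketch below is pure Python with made-up logits. `threshold_predict` mirrors the rule used in `compute_metrics`, while `argmax_predict` is an alternative shown only for comparison, not something the notebook does.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def threshold_predict(logits, threshold=0.5):
    """Independent per-class decision, as in compute_metrics above."""
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]


def argmax_predict(logits):
    """Single-label alternative: mark only the highest-scoring class."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return [1 if i == best else 0 for i in range(len(logits))]


# Hypothetical logits for three headlines (not taken from the model).
examples = {
    "confident": [3.0, -2.0, -4.0],
    "no class above 0.5": [-0.3, -1.0, -2.0],
    "two classes above 0.5": [0.4, 0.2, -3.0],
}

for name, logits in examples.items():
    print(name, threshold_predict(logits), argmax_predict(logits))
```

With thresholding, a row can end up with no predicted class or with more than one. Because `accuracy_score` on an indicator matrix requires the whole row to match, such rows count as errors for accuracy even when the true class is among the predicted ones, which is one plausible reason accuracy and weighted recall differ in some epochs.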

In [17]:
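# Load BERT with a freshly initialized 3-way classification head.
# problem_type='multi_label_classification' makes the model train with a
# per-class sigmoid loss (BCEWithLogitsLoss) on the float one-hot labels.
model = AutoModelForSequenceClassification.from_pretrained(
    'bert-base-cased',
    problem_type='multi_label_classification',
    num_labels=3,
)

training_args = TrainingArguments(
    output_dir='.',
    evaluation_strategy="epoch",  # newer transformers releases name this eval_strategy
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=8,
)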

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
    compute_metrics=compute_metrics
)

Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-cased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [18]:
trainer.train()

Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,0.284317,0.820937,0.851417,0.820937,0.832151
2,No log,0.350541,0.826446,0.838612,0.829201,0.831569
3,No log,0.362426,0.84573,0.845904,0.853994,0.849566
4,No log,0.393808,0.84022,0.847481,0.842975,0.844212
5,No log,0.395302,0.85124,0.856468,0.85124,0.8535
6,0.150900,0.452779,0.85124,0.852798,0.853994,0.853292
7,0.150900,0.478673,0.84573,0.849222,0.84573,0.846986
8,0.150900,0.477423,0.842975,0.847882,0.848485,0.84804


TrainOutput(global_step=728, training_loss=0.10814886630236448, metrics={'train_runtime': 59.643, 'train_samples_per_second': 194.357, 'train_steps_per_second': 12.206, 'total_flos': 559962908985312.0, 'train_loss': 0.10814886630236448, 'epoch': 8.0})
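
The "No log" entries in the Training Loss column can be explained with a little arithmetic. The sketch below assumes 1448 training rows (indices 0 to 1447 in the label table), a single device, and the Trainer's default `logging_steps` of 500, since the notebook does not set that argument. `first_logged_epoch` is an illustrative helper, not a library function.

```python
import math

TRAIN_ROWS = 1448      # assumed from the label table (rows 0..1447)
EPOCHS = 8
LOGGING_STEPS = 500    # assumed Trainer default; not set in the notebook


def first_logged_epoch(batch_size):
    """Return (steps per epoch, total steps, first epoch showing a training loss)."""
    steps_per_epoch = math.ceil(TRAIN_ROWS / batch_size)
    total_steps = steps_per_epoch * EPOCHS
    if total_steps < LOGGING_STEPS:
        return steps_per_epoch, total_steps, None  # training loss is never logged
    return steps_per_epoch, total_steps, math.ceil(LOGGING_STEPS / steps_per_epoch)


for batch_size in (8, 16, 32):
    print(batch_size, first_logged_epoch(batch_size))
```

For a batch size of 16 this gives 91 steps per epoch and 728 steps in total, matching `global_step=728`, with the first training loss appearing at epoch 6, as in the table above.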

In [19]:
# Evaluate the final-epoch model on the test set

results = trainer.evaluate()
results

{'eval_loss': 0.4774226248264313,
 'eval_accuracy': 0.8429752066115702,
 'eval_precision': 0.8478819667122157,
 'eval_recall': 0.8484848484848485,
 'eval_f1': 0.8480396903229883,
 'eval_runtime': 0.4642,
 'eval_samples_per_second': 781.993,
 'eval_steps_per_second': 49.548,
 'epoch': 8.0}
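
`trainer.evaluate()` scores the model as it stands after epoch 8, which is not necessarily the best epoch. The sketch below copies the validation loss and accuracy columns from the training table and finds the best epoch under each criterion. The `history` list is hand-copied data for illustration, not something the Trainer returns in this form.

```python
# (epoch, validation_loss, accuracy) copied from the training table above.
history = [
    (1, 0.284317, 0.820937),
    (2, 0.350541, 0.826446),
    (3, 0.362426, 0.845730),
    (4, 0.393808, 0.840220),
    (5, 0.395302, 0.851240),
    (6, 0.452779, 0.851240),
    (7, 0.478673, 0.845730),
    (8, 0.477423, 0.842975),
]

best_by_loss = min(history, key=lambda row: row[1])
best_by_accuracy = max(history, key=lambda row: row[2])  # first epoch wins a tie
final = history[-1]

print("lowest validation loss:", best_by_loss)
print("highest accuracy:", best_by_accuracy)
print("final epoch:", final)
```

Validation loss is lowest at epoch 1 and rises afterwards while accuracy peaks around epochs 5 and 6, which suggests the later epochs are overfitting. If keeping the best checkpoint matters, the `load_best_model_at_end` and `metric_for_best_model` options of `TrainingArguments` may be worth a look.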

In [20]:
best_accuracy = 0
best_hyperparams = {}

learning_rates = [1e-5, 2e-5, 3e-5, 5e-5]
batch_sizes = [8, 16, 32]

# Grid search over learning rate and batch size. Each configuration is
# scored by its final-epoch accuracy on the test set.
for learning_rate, batch_size in product(learning_rates, batch_sizes):
    training_args = TrainingArguments(
        output_dir='.',
        learning_rate=learning_rate,
        evaluation_strategy="epoch",
        per_device_train_batch_size=batch_size,
        per_device_eval_batch_size=batch_size,
        num_train_epochs=8,
    )

    # Reload the pretrained weights so every run starts from the same checkpoint
    model = AutoModelForSequenceClassification.from_pretrained(
        'bert-base-cased',
        problem_type='multi_label_classification',
        num_labels=3,
    )

    trainer = Trainer(
        model=model,
        args=training_args,
        train_dataset=train_dataset,
        eval_dataset=test_dataset,
        compute_metrics=compute_metrics,
    )

    trainer.train()

    eval_results = trainer.evaluate()

    # Strict '>' keeps the earliest configuration when accuracies tie
    if eval_results['eval_accuracy'] > best_accuracy:
        best_accuracy = eval_results['eval_accuracy']
        best_hyperparams = {
            "learning_rate": learning_rate,
            "batch_size": batch_size,
        }




Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-cased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,0.320648,0.804408,0.847657,0.807163,0.823279
2,No log,0.264146,0.820937,0.848086,0.823691,0.832024
3,0.330000,0.270259,0.848485,0.855611,0.856749,0.856011
4,0.330000,0.307048,0.870523,0.875869,0.873278,0.873199
5,0.330000,0.330691,0.85124,0.855728,0.853994,0.854753
6,0.084300,0.351051,0.862259,0.867094,0.865014,0.865413
7,0.084300,0.358978,0.862259,0.865116,0.865014,0.864427
8,0.084300,0.361588,0.859504,0.865334,0.862259,0.86307


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,0.43045,0.556474,0.878263,0.556474,0.547373
2,No log,0.304518,0.804408,0.856526,0.826446,0.838957
3,No log,0.249448,0.853994,0.879533,0.856749,0.867248
4,No log,0.257452,0.859504,0.871876,0.867769,0.867951
5,No log,0.261646,0.848485,0.862302,0.856749,0.858466
6,0.276000,0.260421,0.876033,0.887755,0.878788,0.882863
7,0.276000,0.277274,0.865014,0.874445,0.865014,0.869482
8,0.276000,0.2801,0.862259,0.872178,0.865014,0.868318


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,0.535242,0.366391,0.468656,0.366391,0.410285
2,No log,0.384884,0.730028,0.842496,0.749311,0.787125
3,No log,0.31272,0.831956,0.855528,0.84573,0.849755
4,No log,0.272463,0.831956,0.864857,0.84022,0.851875
5,No log,0.264893,0.842975,0.866009,0.853994,0.859573
6,No log,0.255247,0.85124,0.863535,0.859504,0.861413
7,No log,0.255228,0.84573,0.862092,0.856749,0.859391
8,No log,0.255076,0.856749,0.869969,0.865014,0.867313


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,0.262756,0.826446,0.859259,0.829201,0.839732
2,No log,0.28601,0.84022,0.859956,0.85124,0.854293
3,0.261500,0.289519,0.876033,0.880206,0.876033,0.877663
4,0.261500,0.342698,0.859504,0.863087,0.859504,0.860385
5,0.261500,0.398315,0.856749,0.864879,0.859504,0.861723
6,0.041800,0.439894,0.853994,0.855618,0.859504,0.857497
7,0.041800,0.445283,0.856749,0.858724,0.862259,0.860371
8,0.041800,0.445363,0.853994,0.858275,0.859504,0.858669


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,0.305765,0.831956,0.867191,0.837466,0.849647
2,No log,0.245564,0.853994,0.872459,0.853994,0.862384
3,No log,0.276848,0.842975,0.855793,0.848485,0.852093
4,No log,0.277765,0.859504,0.86446,0.859504,0.861404
5,No log,0.310906,0.862259,0.867273,0.862259,0.864329
6,0.190700,0.333021,0.856749,0.860089,0.865014,0.862485
7,0.190700,0.350823,0.853994,0.864808,0.856749,0.860023
8,0.190700,0.354554,0.862259,0.865197,0.865014,0.864981


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,0.399842,0.688705,0.83067,0.705234,0.734471
2,No log,0.297017,0.798898,0.844793,0.812672,0.825527
3,No log,0.248002,0.856749,0.878883,0.862259,0.86927
4,No log,0.260346,0.859504,0.868491,0.859504,0.862236
5,No log,0.27741,0.84573,0.857964,0.848485,0.851356
6,No log,0.281369,0.853994,0.864952,0.859504,0.862159
7,No log,0.287969,0.856749,0.867677,0.862259,0.864861
8,No log,0.290506,0.865014,0.873008,0.867769,0.870118


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,0.282809,0.84022,0.85712,0.84022,0.83983
2,No log,0.335301,0.853994,0.856855,0.856749,0.85668
3,0.245000,0.373045,0.862259,0.864088,0.862259,0.86198
4,0.245000,0.410856,0.853994,0.85512,0.853994,0.854208
5,0.245000,0.460304,0.84573,0.847945,0.85124,0.849241
6,0.042700,0.478967,0.853994,0.85275,0.853994,0.853163
7,0.042700,0.473512,0.85124,0.855598,0.85124,0.853245
8,0.042700,0.481657,0.856749,0.855839,0.856749,0.856202


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,0.285618,0.84573,0.859607,0.85124,0.853716
2,No log,0.256769,0.856749,0.87172,0.862259,0.86612
3,No log,0.278763,0.853994,0.869515,0.856749,0.86232
4,No log,0.273938,0.856749,0.871525,0.862259,0.866647
5,No log,0.347325,0.856749,0.868029,0.856749,0.860319
6,0.166600,0.333893,0.865014,0.873082,0.867769,0.869912
7,0.166600,0.347713,0.862259,0.872125,0.865014,0.868178
8,0.166600,0.355931,0.859504,0.869225,0.859504,0.863826


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,0.344663,0.787879,0.843484,0.793388,0.815095
2,No log,0.275533,0.820937,0.862762,0.823691,0.838293
3,No log,0.260506,0.84573,0.854636,0.85124,0.85189
4,No log,0.300045,0.848485,0.862464,0.848485,0.853021
5,No log,0.302611,0.856749,0.860951,0.867769,0.863629
6,No log,0.341163,0.831956,0.839687,0.837466,0.838228
7,No log,0.337758,0.859504,0.86551,0.865014,0.864766
8,No log,0.342036,0.865014,0.871286,0.870523,0.870001


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,0.30365,0.831956,0.86646,0.837466,0.833721
2,No log,0.315333,0.84573,0.854926,0.865014,0.859808
3,0.249800,0.397209,0.85124,0.849874,0.85124,0.850062
4,0.249800,0.454585,0.85124,0.85505,0.85124,0.852651
5,0.249800,0.440562,0.856749,0.863083,0.856749,0.859492
6,0.043400,0.478828,0.856749,0.857251,0.859504,0.858029
7,0.043400,0.506937,0.856749,0.860927,0.859504,0.860124
8,0.043400,0.506386,0.853994,0.859896,0.853994,0.856845


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,0.285064,0.829201,0.848461,0.837466,0.83891
2,No log,0.298157,0.848485,0.858663,0.859504,0.858432
3,No log,0.363455,0.853994,0.853688,0.856749,0.854377
4,No log,0.384946,0.848485,0.852195,0.859504,0.855376
5,No log,0.447861,0.831956,0.842161,0.842975,0.841831
6,0.146400,0.474724,0.842975,0.846157,0.84573,0.845933
7,0.146400,0.490471,0.85124,0.853881,0.85124,0.852517
8,0.146400,0.496257,0.85124,0.853881,0.85124,0.852517


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,0.310141,0.815427,0.852496,0.815427,0.831431
2,No log,0.258695,0.823691,0.859294,0.823691,0.84101
3,No log,0.268196,0.85124,0.853446,0.853994,0.853591
4,No log,0.340532,0.84022,0.846778,0.842975,0.844707
5,No log,0.357815,0.85124,0.856396,0.85124,0.850179
6,No log,0.395684,0.842975,0.848348,0.848485,0.848407
7,No log,0.403608,0.84573,0.848705,0.85124,0.849856
8,No log,0.406576,0.842975,0.845825,0.84573,0.845486


In [21]:
print(f"Best Hyperparameters: {best_hyperparams}")
print(f"Best Accuracy: {best_accuracy}")

Best Hyperparameters: {'learning_rate': 2e-05, 'batch_size': 32}
Best Accuracy: 0.8650137741046832
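
The loop only remembers a single winner, so ties and near-ties are easy to miss. The sketch below rebuilds the full results table from the final-epoch accuracies printed above, assuming the twelve tables appear in the same order that `product(learning_rates, batch_sizes)` iterates. The `final_accuracies` list is hand-copied from those tables (rounded to six decimals) and `results` is an illustrative structure, not something the notebook builds.

```python
from itertools import product

learning_rates = [1e-5, 2e-5, 3e-5, 5e-5]
batch_sizes = [8, 16, 32]

# Final-epoch accuracies copied from the twelve tables above, assumed to be in loop order.
final_accuracies = [
    0.859504, 0.862259, 0.856749,   # learning rate 1e-5
    0.853994, 0.862259, 0.865014,   # learning rate 2e-5
    0.856749, 0.859504, 0.865014,   # learning rate 3e-5
    0.853994, 0.851240, 0.842975,   # learning rate 5e-5
]

results = [
    {"learning_rate": lr, "batch_size": bs, "accuracy": acc}
    for (lr, bs), acc in zip(product(learning_rates, batch_sizes), final_accuracies)
]

top = max(r["accuracy"] for r in results)
tied = [r for r in results if r["accuracy"] == top]
first_best = tied[0]  # what a strict '>' comparison in loop order would keep

print("configurations sharing the top accuracy:", tied)
print("reported by the loop:", first_best)
```

Under these assumptions, learning rates 2e-5 and 3e-5 with batch size 32 share the top accuracy, and the loop reports the first of them only because it was reached earlier. The spread across all twelve runs is about two percentage points, so differences this small may not be meaningful without repeated runs.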
