# Project Part 3

[![Kaggle](https://kaggle.com/static/images/open-in-kaggle.svg)](https://kaggle.com/kernels/welcome?src=https://github.com/cbarond/cs39aa_project/blob/main/project_part3.ipynb)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/cbarond/cs39aa_project/blob/main/project_part3.ipynb)


In [1]:
import numpy as np
import pandas as pd
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification,  TrainingArguments, Trainer
from datasets import load_dataset, load_metric
import wandb

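# Pass the hyperparameters to wandb.init so they are recorded with the run;
# they stay readable afterwards as wandb.config['epochs'], etc.
wandb.init(
    project="cs39aa-final",
    config={
        "learning_rate": 0.001,
        "epochs": 45,
        "batch_size": 128,
    },
)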

dataset = load_dataset("emotion")
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f'Device: {device}')

[34m[1mwandb[0m: Logging into wandb.ai. (Learn how to deploy a W&B server locally: https://wandb.me/wandb-server)
[34m[1mwandb[0m: You can find your API key in your browser here: https://wandb.ai/authorize
[34m[1mwandb[0m: Paste an API key from your profile and hit enter, or press ctrl+c to quit:

  ········································


[34m[1mwandb[0m: Appending key for api.wandb.ai to your netrc file: /root/.netrc


Downloading and preparing dataset emotion/default (download: 1.97 MiB, generated: 2.07 MiB, post-processed: Unknown size, total: 4.05 MiB) to /root/.cache/huggingface/datasets/emotion/default/0.0.0/348f63ca8e27b3713b6c04d723efe6d824a56fb3d1449794716c0f0296072705...


Dataset emotion downloaded and prepared to /root/.cache/huggingface/datasets/emotion/default/0.0.0/348f63ca8e27b3713b6c04d723efe6d824a56fb3d1449794716c0f0296072705. Subsequent calls will reuse this data.


Device: cuda


I used `bert-base-cased` and froze every layer except the final classification layer, so only the classifier head on top of BERT is trained.

In [2]:
MODEL_NAME = "bert-base-cased"
MAX_LENGTH=50

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=6, max_length=MAX_LENGTH, output_attentions=False, output_hidden_states=False).to(device)

for param in model.bert.parameters():
    param.requires_grad = False

print(f'Model set: {MODEL_NAME}')

Some weights of the model checkpoint at bert-base-cased were not used when initializing BertForSequenceClassification: ['cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.weight']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at b

Model set: bert-base-cased
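
Because the loop above sets `requires_grad = False` on everything under `model.bert`, only the classification head stays trainable. As a rough sanity check, its size can be worked out by hand. This sketch assumes the head is a single linear layer from the 768-dimensional hidden state of a BERT-base model to the 6 labels; the constant and variable names are illustrative, and the hidden size is typed in rather than read from the model config.

```python
# Size of the trainable classification head, assuming a single linear
# layer (hidden_size -> num_labels) on top of a frozen encoder.
HIDDEN_SIZE = 768   # hidden size of BERT-base models (assumed, not read from the config)
NUM_LABELS = 6      # matches num_labels=6 in the cell above

weights = HIDDEN_SIZE * NUM_LABELS   # one weight per (hidden unit, label) pair
biases = NUM_LABELS                  # one bias per label
trainable_params = weights + biases

print(f"Trainable parameters in the head: {trainable_params}")  # 4614
```

On the loaded model, the same count can be checked with `sum(p.numel() for p in model.parameters() if p.requires_grad)`.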


In [3]:
def tokenize_function(examples):
    return tokenizer(examples['text'], padding="max_length", truncation=True, max_length=MAX_LENGTH)

ds_train = dataset['train'].map(tokenize_function, batched=True)
ds_test = dataset['test'].map(tokenize_function, batched=True)
ds_valid = dataset['validation'].map(tokenize_function, batched=True)
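
The `padding="max_length"` and `truncation=True` options above give every example exactly `MAX_LENGTH` token ids. The sketch below illustrates the idea in plain Python. It is a simplification, not the tokenizer's own code: `pad_or_truncate` is a hypothetical helper, the token ids are made up, and the real tokenizer also keeps its special tokens in place when it truncates.

```python
def pad_or_truncate(token_ids, max_length, pad_id=0):
    """Return (input_ids, attention_mask), both exactly max_length long."""
    ids = list(token_ids)[:max_length]   # drop tokens past max_length
    mask = [1] * len(ids)                # 1 marks a real token
    padding = max_length - len(ids)
    ids += [pad_id] * padding            # fill the remainder with the pad id
    mask += [0] * padding                # 0 marks padding
    return ids, mask


# A short sequence is padded, a long one is cut (ids are illustrative).
short_ids, short_mask = pad_or_truncate([11, 22, 33], max_length=5)
long_ids, long_mask = pad_or_truncate(range(10), max_length=5)

print(short_ids, short_mask)  # [11, 22, 33, 0, 0] [1, 1, 1, 0, 0]
print(long_ids, long_mask)    # [0, 1, 2, 3, 4] [1, 1, 1, 1, 1]
```

The attention mask is what lets the model ignore the padded positions, which is why the tokenizer returns it alongside the ids.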


In [4]:
ds_train = ds_train.shuffle(seed=42)
ds_valid = ds_valid.shuffle(seed=42)
ds_test = ds_test.shuffle(seed=42)

The training arguments take the number of epochs, the learning rate, and the training batch size from the W&B config set in the first cell.

In [5]:
import os
os.environ["WANDB_DISABLED"] = "false"

metric = load_metric("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return metric.compute(predictions=predictions, references=labels)

training_args = TrainingArguments(
    num_train_epochs=wandb.config['epochs'],
    do_train=True,
    report_to='wandb',
    output_dir='/kaggle/working',
    learning_rate=wandb.config['learning_rate'],
    per_device_train_batch_size=wandb.config['batch_size'],
    per_device_eval_batch_size=16,
    save_strategy='epoch',
    evaluation_strategy='epoch',
)

trainer = Trainer(model = model, 
                  args = training_args,
                  train_dataset = ds_train, 
                  eval_dataset = ds_valid,
                  compute_metrics = compute_metrics,
)

torch.set_grad_enabled(True)
print('Training...')
trainer.train()
print('Evaluating...')
# Evaluates on eval_dataset, i.e. the validation split; ds_test is not used here.
trainer.evaluate()

The following columns in the training set don't have a corresponding argument in `BertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `BertForSequenceClassification.forward`,  you can safely ignore this message.
***** Running training *****
  Num examples = 16000
  Num Epochs = 45
  Instantaneous batch size per device = 128
  Total train batch size (w. parallel, distributed & accumulation) = 128
  Gradient Accumulation steps = 1
  Total optimization steps = 5625
Automatic Weights & Biases logging enabled, to disable set os.environ["WANDB_DISABLED"] = "true"


Training...


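| Epoch | Training Loss | Validation Loss | Accuracy |
|------:|--------------:|----------------:|---------:|
| 1 | No log | 1.559422 | 0.341 |
| 2 | No log | 1.486302 | 0.4425 |
| 3 | No log | 1.46942 | 0.4345 |
| 4 | 1.514400 | 1.456451 | 0.4485 |
| 5 | 1.514400 | 1.422071 | 0.478 |
| 6 | 1.514400 | 1.443534 | 0.4435 |
| 7 | 1.514400 | 1.410064 | 0.482 |
| 8 | 1.448100 | 1.402235 | 0.489 |
| 9 | 1.448100 | 1.428491 | 0.449 |
| 10 | 1.448100 | 1.388676 | 0.4805 |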


The following columns in the evaluation set don't have a corresponding argument in `BertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `BertForSequenceClassification.forward`,  you can safely ignore this message.
***** Running Evaluation *****
  Num examples = 2000
  Batch size = 16
Saving model checkpoint to /kaggle/working/checkpoint-125
Configuration saved in /kaggle/working/checkpoint-125/config.json
Model weights saved in /kaggle/working/checkpoint-125/pytorch_model.bin
The following columns in the evaluation set don't have a corresponding argument in `BertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `BertForSequenceClassification.forward`,  you can safely ignore this message.
***** Running Evaluation *****
  Num examples = 2000
  Batch size = 16
Saving model checkpoint to /kaggle/working/checkpoint-250
Configuration saved in /kaggle/working/checkpoint-250/config.json
Model weights saved in

Evaluating...


{'eval_loss': 1.317842721939087,
 'eval_accuracy': 0.5155,
 'eval_runtime': 4.0148,
 'eval_samples_per_second': 498.153,
 'eval_steps_per_second': 31.135,
 'epoch': 45.0}
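
The step counts in the training log can be reproduced from the dataset size and the batch size. This is a small arithmetic sketch, assuming a single device and no gradient accumulation, which is what the log reports; the variable names are illustrative.

```python
import math

num_examples = 16000   # "Num examples" in the training log
batch_size = 128       # per-device train batch size on a single device
num_epochs = 45        # "Num Epochs" in the training log

# One optimization step consumes one batch; a partial last batch still counts.
steps_per_epoch = math.ceil(num_examples / batch_size)
total_steps = steps_per_epoch * num_epochs

print(steps_per_epoch)  # 125
print(total_steps)      # 5625
```

This matches the reported 5625 total optimization steps, and it explains why the per-epoch checkpoints are named `checkpoint-125`, `checkpoint-250`, and so on.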

# Results

Due to hardware limitations and other issues, I was unable to train for more than 45 epochs. With that in mind, the final evaluation on the validation split gave the following results:

Accuracy: 0.5155        
Loss: 1.318
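
For reference, the accuracy reported here is the fraction of examples whose highest-scoring class matches the label, which is what `compute_metrics` measures through the `accuracy` metric. Below is a minimal NumPy sketch of the same calculation; `accuracy_from_logits` is a hypothetical helper and the logits are made-up toy values, not model outputs.

```python
import numpy as np


def accuracy_from_logits(logits, labels):
    """Fraction of rows whose highest-scoring class equals the label."""
    predictions = np.argmax(logits, axis=-1)
    return float(np.mean(predictions == np.asarray(labels)))


# Toy logits for 4 examples and 3 classes (made-up numbers).
toy_logits = np.array([
    [2.0, 0.1, 0.3],   # predicts class 0
    [0.2, 1.5, 0.1],   # predicts class 1
    [0.3, 0.2, 0.9],   # predicts class 2
    [1.1, 0.4, 0.2],   # predicts class 0
])
toy_labels = [0, 1, 1, 0]

print(accuracy_from_logits(toy_logits, toy_labels))  # 0.75
```

On the 2,000 evaluation examples, an accuracy of 0.5155 corresponds to 1,031 correct predictions.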

More data from this run can be found on [Weights & Biases](https://wandb.ai/cbaron/cs39aa-final/runs/2mdl02t9?workspace=user-cbaron).

<img src="images/W&B_Chart_Accuracy.png" alt="Accuracy" width="800"/>
<img src="images/W&B_Chart_Loss.png" alt="Loss" width="800"/>
