<a href="https://colab.research.google.com/github/AtharvBhat/PIAYN/blob/main/PIAYN_TextLRABaseline_3k.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Set-up environment

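We first install HuggingFace Transformers (from source) and Datasets, plus `ml_collections` and `absl-py`, and then clone the Long Range Arena, Nystromformer, and PIAYN repositories.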

In [1]:
!pip install -q git+https://github.com/huggingface/transformers.git

  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done


In [2]:
!pip install -q datasets

In [3]:
!pip install ml_collections
!pip install absl-py



In [4]:
!git clone https://github.com/google-research/long-range-arena.git
!git clone https://github.com/mlpen/Nystromformer.git
!git clone https://github.com/AtharvBhat/PIAYN.git

fatal: destination path 'long-range-arena' already exists and is not an empty directory.
fatal: destination path 'Nystromformer' already exists and is not an empty directory.
fatal: destination path 'PIAYN' already exists and is not an empty directory.


## Set up output folders

In [5]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [6]:
#Set the task
task = 'text'

#Set Seq Length
max_length = 3000

#Set Seed
seed = 42

In [7]:
import os

baseDir = "/content/drive/MyDrive"

#Build the output folder paths: PIAYN/<task>/{models,data}/<max_length>
piaynDir = os.path.join(baseDir, "PIAYN")
piaynTaskDir = os.path.join(piaynDir, task)
piaynTaskModelDir = os.path.join(piaynTaskDir, "models")
piaynTaskDataDir = os.path.join(piaynTaskDir, "data")
piaynTaskDataDirTokens = os.path.join(piaynTaskDataDir, str(max_length))
piaynTaskModelDirTokens = os.path.join(piaynTaskModelDir, str(max_length))

#Create the leaf folders; makedirs also creates any missing parents
for folder in (piaynTaskDataDirTokens, piaynTaskModelDirTokens):
  os.makedirs(folder, exist_ok=True)

## Prepare data

Here we prepare the IMDB dataset, a binary text classification dataset ("is a movie review positive or negative?"). The loader output further down reports 25,000 examples in each of the train, dev, and test files.

In [8]:
from datasets import load_dataset
#from Nystromformer.LRA.datasets import text
import pickle, numpy as np
import torch
import random

#Add all seeds to make it deterministic
if seed is not None:
  torch.manual_seed(seed)
  torch.cuda.manual_seed_all(seed)
  torch.backends.cudnn.deterministic = True
  torch.backends.cudnn.benchmark = False
  np.random.seed(seed)
  random.seed(seed)
  os.environ['PYTHONHASHSEED'] = str(seed)

#copy over the custom text.py file to the Nystromformer location
!cp PIAYN/text.py Nystromformer/LRA/datasets

#Create Train test and dev MAP files for the imdb dataset
!python Nystromformer/LRA/datasets/text.py --max_length=$max_length

train_file = task+'.train.pickle'
dev_file = task+'.dev.pickle'
test_file = task+'.test.pickle'

!cp $train_file $piaynTaskDataDirTokens
!cp $dev_file $piaynTaskDataDirTokens
!cp $test_file $piaynTaskDataDirTokens

max_length:  3000
I0415 02:19:09.706180 140362176395136 dataset_builder.py:811] No config specified, defaulting to first: imdb_reviews/plain_text
I0415 02:19:09.707057 140362176395136 dataset_info.py:361] Load dataset info from /root/tensorflow_datasets/imdb_reviews/plain_text/1.0.0
I0415 02:19:09.709865 140362176395136 dataset_builder.py:299] Reusing dataset imdb_reviews (/root/tensorflow_datasets/imdb_reviews/plain_text/1.0.0)
I0415 02:19:09.710001 140362176395136 dataset_builder.py:511] Constructing tf.data.Dataset for split None, from /root/tensorflow_datasets/imdb_reviews/plain_text/1.0.0
2022-04-15 02:19:10.898373: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
I0415 02:19:11.132364 140362176395136 input_pipeline.py:49] Data sample: {'label': 1, 'text': b'As others have mentioned, all the women that go nude in this film are mostly absolute

In [9]:
from Nystromformer.LRA.code import lra_config
from Nystromformer.LRA.code.dataset import LRADataset
#from Nystromformer.LRA.code.run_tasks import training_config
from torch.utils.data import DataLoader, RandomSampler

#get training config
training_config = lra_config.config[task]["training"]

#Update training config
#training_config["learning_rate"] = 0.05
training_config["weight_decay"] = 0.1
#training_config["eval_frequency"] = 1000

#Check Train Config
print('Training Config: ', training_config)

#get pre-defined model config
model_config = lra_config.config[task]['model']

#Check model Config
print('Model Config: ', model_config)

#Get the dataset
train_dataset = LRADataset(piaynTaskDataDirTokens + '/' + train_file, False)
val_dataset = LRADataset(piaynTaskDataDirTokens + '/' + dev_file, False)
test_dataset = LRADataset(piaynTaskDataDirTokens + '/' + test_file, False)

#Create DataLoader iterators
ds_iter = {
    "train":enumerate(DataLoader(train_dataset, 
                                 #Sample batches randomly for number of specified steps
                                 sampler = RandomSampler(train_dataset, 
                                                         replacement=True, 
                                                         num_samples= training_config["num_train_steps"]*lra_config.config[task]['dataset']['train']), 
                                 batch_size = training_config["batch_size"], 
                                 drop_last = True)),
    "dev":enumerate(DataLoader(val_dataset, batch_size = 32, drop_last = True)),
    "test":enumerate(DataLoader(test_dataset, batch_size = 32, drop_last = True)),
}


Training Config:  {'batch_size': 32, 'learning_rate': 0.0001, 'warmup': 8000, 'lr_decay': 'linear', 'weight_decay': 0.1, 'eval_frequency': 500, 'num_train_steps': 20000, 'num_eval_steps': 781}
Model Config:  {'learn_pos_emb': True, 'tied_weights': False, 'embedding_dim': 64, 'transformer_dim': 64, 'transformer_hidden_dim': 128, 'head_dim': 32, 'num_head': 2, 'num_layers': 2, 'vocab_size': 512, 'max_seq_len': 4000, 'dropout_prob': 0.1, 'attention_dropout': 0.1, 'pooling_mode': 'MEAN', 'num_classes': 2}
Loaded /content/drive/MyDrive/PIAYN/text/data/3000/text.train.pickle... size=25000
Loaded /content/drive/MyDrive/PIAYN/text/data/3000/text.dev.pickle... size=25000
Loaded /content/drive/MyDrive/PIAYN/text/data/3000/text.test.pickle... size=25000
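
The dev and test loaders above use `batch_size = 32` with `drop_last = True`, so a few examples at the end of each 25,000-example split are never scored. A quick arithmetic check in plain Python, using only the numbers printed above, shows how many full batches that leaves; the count lines up with `num_eval_steps: 781` in the training config.

```python
# Numbers taken from the outputs above.
split_size = 25000   # size reported by LRADataset for each split
batch_size = 32      # batch size used for the dev and test loaders

# drop_last=True keeps only full batches.
full_batches = split_size // batch_size
evaluated = full_batches * batch_size
dropped = split_size - evaluated

print(full_batches, evaluated, dropped)  # 781 24992 8
```

Dev and test accuracies in this notebook are therefore computed over 24,992 examples rather than the full 25,000.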


In [10]:
#Check sizes of batches
batch = next((ds_iter['train']))
for k,v in batch[1].items():
  print(k,v.shape)

input_ids_0 torch.Size([32, 3024])
mask_0 torch.Size([32, 3024])
label torch.Size([32])


## Define model

Next, we define our model and pick the device; the model itself is moved to the GPU at the start of training.

In [11]:
from transformers import PerceiverForSequenceClassification
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

In [12]:
from transformers import PerceiverConfig
#get default perceiver config
configuration = PerceiverConfig()

#Update the Perceiver configurations with Preset model configs
#configuration.update(model_config)

#Print Updated Perceiver Configuration
print(configuration)

PerceiverConfig {
  "attention_probs_dropout_prob": 0.1,
  "audio_samples_per_frame": 1920,
  "cross_attention_shape_for_attention": "kv",
  "cross_attention_widening_factor": 1,
  "d_latents": 1280,
  "d_model": 768,
  "hidden_act": "gelu",
  "image_size": 56,
  "initializer_range": 0.02,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 2048,
  "model_type": "perceiver",
  "num_blocks": 1,
  "num_cross_attention_heads": 8,
  "num_frames": 16,
  "num_latents": 256,
  "num_self_attends_per_block": 26,
  "num_self_attention_heads": 8,
  "output_shape": [
    1,
    16,
    224,
    224
  ],
  "qk_channels": null,
  "samples_per_patch": 16,
  "self_attention_widening_factor": 1,
  "train_size": [
    368,
    496
  ],
  "transformers_version": "4.19.0.dev0",
  "use_query_residual": true,
  "v_channels": null,
  "vocab_size": 262
}



In [13]:
def initialize_model(config):
  #Initialize Model
  model = PerceiverForSequenceClassification(config)
  
  #Get Model Parameter Counts
  pytorch_total_params = sum(p.numel() for p in model.parameters())
  pytorch_total_params_Trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
  print('Total Parameters: ', pytorch_total_params, '\nTrainable Parameters: ', pytorch_total_params_Trainable)  

  return model#.to(device)                                          

In [14]:
#Change the perceiver configurations to get total parameters within 10% of BERT
configuration.num_labels = 2
configuration.num_self_attends_per_block = 6
configuration.d_latents = 512
configuration.d_model = 512
configuration.num_layers = 6
if max_length>2000:
  #Integer arithmetic keeps this a plain Python int rather than a NumPy integer
  configuration.max_position_embeddings = max_length + 24 * 2**(max_length//1000 - 1)



model = initialize_model(configuration)

Total Parameters:  14477826 
Trainable Parameters:  14477826
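
To make the position-embedding sizing in the cell above concrete, here is the same rule evaluated in plain Python for a few sequence lengths. The helper name `position_budget` is made up for this sketch; lengths of 2000 or less keep the default of 2048 shown in the printed config.

```python
def position_budget(max_length, default=2048):
    """Mirror the notebook's rule for max_position_embeddings."""
    if max_length > 2000:
        # Add 24 * 2**(k - 1) extra positions, where k = max_length // 1000.
        return max_length + 24 * 2 ** (max_length // 1000 - 1)
    return default

for n in (1000, 2000, 3000, 4000):
    print(n, position_budget(n))
# 1000 2048
# 2000 2048
# 3000 3096
# 4000 4192
```

For this notebook's `max_length = 3000` the rule gives 3096 positions, which covers the 3024-token batches printed earlier.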


## Train the model

Here we train the model using native PyTorch.
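
The training cell drives the learning rate with `OneCycleLR`, using `pct_start = warmup / num_train_steps` and a linear anneal. As a rough picture of what that schedule does with this config (warmup 8000, 20000 steps, peak 1e-4), here is a plain-Python approximation. It assumes PyTorch's default `div_factor=25` and `final_div_factor=1e4` and the two-phase layout; the function name is hypothetical, and the sketch is meant for intuition, not as a substitute for the scheduler.

```python
def one_cycle_linear(step, max_lr=1e-4, total_steps=20000, warmup=8000,
                     div_factor=25.0, final_div_factor=1e4):
    """Approximate a two-phase linear one-cycle schedule (assumed defaults)."""
    initial_lr = max_lr / div_factor          # starting learning rate
    min_lr = initial_lr / final_div_factor    # learning rate at the final step
    pct_start = warmup / total_steps
    warm_end = pct_start * total_steps - 1    # last warmup step
    last = total_steps - 1                    # last training step
    if step <= warm_end:
        # Phase 1: ramp linearly from initial_lr up to max_lr.
        frac = step / warm_end
        return initial_lr + frac * (max_lr - initial_lr)
    # Phase 2: decay linearly from max_lr down to min_lr.
    frac = (step - warm_end) / (last - warm_end)
    return max_lr + frac * (min_lr - max_lr)

for s in (0, 4000, 7999, 14000, 19999):
    print(s, f"{one_cycle_linear(s):.3e}")
```

Under these assumptions the rate climbs for the first 40% of training and then decays almost to zero by the final step.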

In [15]:
device

device(type='cuda')

In [16]:
#transformers.AdamW is deprecated in later releases; torch.optim.AdamW accepts the same arguments used below
from torch.optim import AdamW
from tqdm.notebook import tqdm
from sklearn.metrics import accuracy_score
from datasets import load_metric
import pandas as pd

best_score = 0 
prev_score = 0
maxPatience = 5
currentPatience = 0

#steps = int(training_config["num_train_steps"]/20000)
steps = training_config["num_train_steps"]

optimizer = AdamW(model.parameters(), 
                  lr = training_config["learning_rate"],
                  betas = (0.9, 0.999), 
                  eps = 1e-6, 
                  weight_decay = training_config["weight_decay"])

lr_scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer = optimizer,
    max_lr = training_config["learning_rate"],
    pct_start = training_config["warmup"] / training_config["num_train_steps"],
    anneal_strategy = training_config["lr_decay"],
    total_steps = training_config["num_train_steps"]
)

#amp_scaler = torch.cuda.amp.GradScaler() if model_config["mixed_precision"] else None

#initialize training summary
trainingSummary = pd.DataFrame(columns=['step', 'mean_train_loss', 'mean_train_acc', 'val_acc'])


model.to(device)

#initialize training accuracy metric and loss list
train_accuracy = load_metric("accuracy")
loss_list = list()

for step in tqdm(range(steps)):  # Perform gradient updates for multiple steps
    
    model.train()
    
    #print("Step:", step)
    #for batch in tqdm(train_dataloader):
    batch = next(ds_iter['train'])[1]

    # get the inputs; 
    inputs = batch["input_ids_0"].to(device)
    attention_mask = batch["mask_0"].to(device)
    labels = batch["label"].to(device)

    # zero the parameter gradients
    optimizer.zero_grad()

    # forward + backward + optimize
    outputs = model(inputs=inputs, attention_mask=attention_mask, labels=labels)
    loss = outputs.loss
    loss.backward()
    optimizer.step()
    lr_scheduler.step()

    # evaluate
    predictions = outputs.logits.argmax(-1).cpu().detach().numpy()
    accuracy = accuracy_score(y_true=batch["label"].numpy(), y_pred=predictions)
    references = batch["label"].numpy()
    train_accuracy.add_batch(predictions=predictions, references=references)
    
    #Add to loss list
    loss_list.append(loss.item())

    #print(f"Loss: {loss.item()}, Accuracy: {accuracy}")

    #delete intermediate variables to free up GPU space
    del loss, outputs, inputs, attention_mask, labels, predictions, accuracy


    #Validate and save the model every eval_frequency steps (500 in this config)
    if (step+1)%training_config['eval_frequency']  == 0:
    #if (step+1)%2  == 0:
      
      model.eval()

      print('Validating at Step: ', step)

      val_accuracy = load_metric("accuracy")

      #reset dev iterator
      ds_iter['dev'] = enumerate(DataLoader(val_dataset, batch_size = 32, drop_last = True))

      with torch.no_grad():
        for i, batch in tqdm(ds_iter['dev']):
              
          # get the inputs; 
          inputs = batch["input_ids_0"].to(device)
          attention_mask = batch["mask_0"].to(device)
          labels = batch["label"].to(device)

          # forward pass
          outputs = model(inputs=inputs, attention_mask=attention_mask)
          logits = outputs.logits 
          predictions = logits.argmax(-1).cpu().detach().numpy()
          references = batch["label"].numpy()
          val_accuracy.add_batch(predictions=predictions, references=references)

          #delete intermediate variables to free up GPU space
          del logits, outputs, inputs, attention_mask, labels, predictions, references
      
      #Compute val accuracy
      final_val_score = val_accuracy.compute()['accuracy']

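      #Compute training accuracy since the last evaluation (compute() clears the accumulated batches)
      train_score = train_accuracy.compute()['accuracy']

      #Compute mean training loss over all steps so far (loss_list is never reset)
      train_loss = sum(loss_list)/len(loss_list)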

      #Add to trainingSummary
      trainingSummary.loc[len(trainingSummary.index)] = [step+1, train_loss, train_score, final_val_score]

      #save training summary
      trainingSummary.to_csv(piaynTaskModelDirTokens + '/trainingSummaryToken'+str(max_length)+'.csv')

      #print progress
      print('Step: ', step+1, "\n\tAverage Train Loss: ", train_loss, "\n\tAverage Train Accuracy: ", train_score, "\n\tValidation Accuracy: ", final_val_score)

      #Save if performance better than best model
      if final_val_score >= best_score:
        best_score = final_val_score
        torch.save(model.to('cpu').state_dict(), piaynTaskModelDirTokens + '/trainedPerceiverClassifierToken'+str(max_length)+'.pkl')
        model.to(device)
      else:
        pass  

      #Stop training if patience limit reached
      if final_val_score <= prev_score:
        currentPatience += 1
        if currentPatience >= maxPatience:
          print('Patience Limit reached! Stopping early!')
          torch.save(model.to('cpu').state_dict(), piaynTaskModelDirTokens + '/trainedPerceiverClassifierStep_' + str(step + 1) + 'Token' + str(max_length) + '.pkl')
          break  
      else:
        currentPatience = 0
      
      #Update prev_score
      prev_score = final_val_score



Validating at Step:  499


Step:  500 
	Average Train Loss:  0.6783526061773301 
	Average Train Accuracy:  0.559625 
	Validation Accuracy:  0.569182138284251
Validating at Step:  999


Step:  1000 
	Average Train Loss:  0.6715034549236297 
	Average Train Accuracy:  0.5975625 
	Validation Accuracy:  0.5928697183098591
Validating at Step:  1499


Step:  1500 
	Average Train Loss:  0.663912217994531 
	Average Train Accuracy:  0.626125 
	Validation Accuracy:  0.5949903969270166
Validating at Step:  1999


Step:  2000 
	Average Train Loss:  0.658950630158186 
	Average Train Accuracy:  0.62875 
	Validation Accuracy:  0.6464868758002561
Validating at Step:  2499


Step:  2500 
	Average Train Loss:  0.6532003645181655 
	Average Train Accuracy:  0.6469375 
	Validation Accuracy:  0.6494078104993598
Validating at Step:  2999


Step:  3000 
	Average Train Loss:  0.6480370312531789 
	Average Train Accuracy:  0.652625 
	Validation Accuracy:  0.6477272727272727
Validating at Step:  3499


Step:  3500 
	Average Train Loss:  0.6435472640991211 
	Average Train Accuracy:  0.6639375 
	Validation Accuracy:  0.6212788092189501
Validating at Step:  3999


Step:  4000 
	Average Train Loss:  0.637449493162334 
	Average Train Accuracy:  0.68625 
	Validation Accuracy:  0.6426856594110115
Validating at Step:  4499


Step:  4500 
	Average Train Loss:  0.6324264850550227 
	Average Train Accuracy:  0.69125 
	Validation Accuracy:  0.6401648527528809
Validating at Step:  4999


Step:  5000 
	Average Train Loss:  0.6272587994873524 
	Average Train Accuracy:  0.7000625 
	Validation Accuracy:  0.6413252240717029
Validating at Step:  5499


Step:  5500 
	Average Train Loss:  0.6219743181358685 
	Average Train Accuracy:  0.71175 
	Validation Accuracy:  0.6384443021766966
Validating at Step:  5999


Step:  6000 
	Average Train Loss:  0.6175336322536071 
	Average Train Accuracy:  0.71525 
	Validation Accuracy:  0.6408450704225352
Validating at Step:  6499


Step:  6500 
	Average Train Loss:  0.612746187214668 
	Average Train Accuracy:  0.7220625 
	Validation Accuracy:  0.6336427656850192
Validating at Step:  6999


Step:  7000 
	Average Train Loss:  0.6068795856322561 
	Average Train Accuracy:  0.744875 
	Validation Accuracy:  0.6302016645326505
Validating at Step:  7499


Step:  7500 
	Average Train Loss:  0.6004117758353551 
	Average Train Accuracy:  0.759875 
	Validation Accuracy:  0.6332826504481434
Validating at Step:  7999


Step:  8000 
	Average Train Loss:  0.5941866580508649 
	Average Train Accuracy:  0.76525 
	Validation Accuracy:  0.6266005121638925
Validating at Step:  8499


Step:  8500 
	Average Train Loss:  0.5880669028899249 
	Average Train Accuracy:  0.7768125 
	Validation Accuracy:  0.622679257362356
Validating at Step:  8999


Step:  9000 
	Average Train Loss:  0.5816051018453307 
	Average Train Accuracy:  0.7873125 
	Validation Accuracy:  0.6175576184379001
Validating at Step:  9499


Step:  9500 
	Average Train Loss:  0.574121430354683 
	Average Train Accuracy:  0.80775 
	Validation Accuracy:  0.6215188860435339
Validating at Step:  9999


Step:  10000 
	Average Train Loss:  0.5662740436598659 
	Average Train Accuracy:  0.8233125 
	Validation Accuracy:  0.6146366837387964
Validating at Step:  10499


Step:  10500 
	Average Train Loss:  0.5579936775160688 
	Average Train Accuracy:  0.83775 
	Validation Accuracy:  0.6082346350832266
Validating at Step:  10999


Step:  11000 
	Average Train Loss:  0.5496187527809631 
	Average Train Accuracy:  0.84625 
	Validation Accuracy:  0.6139564660691421
Validating at Step:  11499


Step:  11500 
	Average Train Loss:  0.5404554431820693 
	Average Train Accuracy:  0.8666875 
	Validation Accuracy:  0.6102752880921894
Validating at Step:  11999


Step:  12000 
	Average Train Loss:  0.5312076716975619 
	Average Train Accuracy:  0.8790625 
	Validation Accuracy:  0.6129961587708067
Validating at Step:  12499


Step:  12500 
	Average Train Loss:  0.5218346219027042 
	Average Train Accuracy:  0.8876875 
	Validation Accuracy:  0.6131562099871959
Validating at Step:  12999


Step:  13000 
	Average Train Loss:  0.5121319928100476 
	Average Train Accuracy:  0.9003125 
	Validation Accuracy:  0.6098351472471191
Validating at Step:  13499


Step:  13500 
	Average Train Loss:  0.5023590875993724 
	Average Train Accuracy:  0.911875 
	Validation Accuracy:  0.6026728553137004
Validating at Step:  13999


Step:  14000 
	Average Train Loss:  0.49288830300367303 
	Average Train Accuracy:  0.9170625 
	Validation Accuracy:  0.6100352112676056
Validating at Step:  14499


Step:  14500 
	Average Train Loss:  0.48307512002412617 
	Average Train Accuracy:  0.9280625 
	Validation Accuracy:  0.6063140204865557
Validating at Step:  14999


Step:  15000 
	Average Train Loss:  0.47322898666883506 
	Average Train Accuracy:  0.9365625 
	Validation Accuracy:  0.6117157490396927
Validating at Step:  15499


Step:  15500 
	Average Train Loss:  0.46360833837776894 
	Average Train Accuracy:  0.9423125 
	Validation Accuracy:  0.6052736875800256
Validating at Step:  15999


Step:  16000 
	Average Train Loss:  0.4538252988391323 
	Average Train Accuracy:  0.9514375 
	Validation Accuracy:  0.6085547375160051
Validating at Step:  16499


Step:  16500 
	Average Train Loss:  0.4444375541460785 
	Average Train Accuracy:  0.955125 
	Validation Accuracy:  0.603393085787452
Validating at Step:  16999


Step:  17000 
	Average Train Loss:  0.4350739623882315 
	Average Train Accuracy:  0.961375 
	Validation Accuracy:  0.5987516005121639
Validating at Step:  17499


Step:  17500 
	Average Train Loss:  0.4258082277230386 
	Average Train Accuracy:  0.9673125 
	Validation Accuracy:  0.6067941741357235
Validating at Step:  17999


Step:  18000 
	Average Train Loss:  0.4165585307898517 
	Average Train Accuracy:  0.9730625 
	Validation Accuracy:  0.6095150448143406
Validating at Step:  18499


Step:  18500 
	Average Train Loss:  0.4076571742060116 
	Average Train Accuracy:  0.976125 
	Validation Accuracy:  0.6064340588988476
Validating at Step:  18999


Step:  19000 
	Average Train Loss:  0.39886368110994075 
	Average Train Accuracy:  0.979375 
	Validation Accuracy:  0.5977512804097311
Validating at Step:  19499


Step:  19500 
	Average Train Loss:  0.39040530609413504 
	Average Train Accuracy:  0.9813125 
	Validation Accuracy:  0.6078745198463509
Validating at Step:  19999


Step:  20000 
	Average Train Loss:  0.38242311142771507 
	Average Train Accuracy:  0.98125 
	Validation Accuracy:  0.6073943661971831
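
The log above shows validation accuracy peaking early (0.6494 at step 2500) while training accuracy keeps rising, yet the patience check never fires. The training loop compares each score with the previous evaluation rather than with the best score so far, so any small uptick resets the counter. The sketch below replays that rule on the first twelve validation accuracies from the log (rounded to four places) and contrasts it with a best-score variant. Both helper functions are illustrative and not part of the notebook.

```python
def stop_index_vs_previous(scores, max_patience=5):
    """Notebook rule: count consecutive evaluations that fail to beat the previous one."""
    prev, patience = 0.0, 0
    for i, score in enumerate(scores):
        if score <= prev:
            patience += 1
            if patience >= max_patience:
                return i
        else:
            patience = 0
        prev = score
    return None  # never stopped


def stop_index_vs_best(scores, max_patience=5):
    """Alternative rule (not in the notebook): count evaluations since the best score."""
    best, patience = 0.0, 0
    for i, score in enumerate(scores):
        if score > best:
            best, patience = score, 0
        else:
            patience += 1
            if patience >= max_patience:
                return i
    return None


# Validation accuracy at steps 500, 1000, ..., 6000, rounded from the log.
val_acc = [0.5692, 0.5929, 0.5950, 0.6465, 0.6494, 0.6477,
           0.6213, 0.6427, 0.6402, 0.6413, 0.6384, 0.6408]

print(stop_index_vs_previous(val_acc))  # None
print(stop_index_vs_best(val_acc))      # 9, i.e. the evaluation at step 5000
```

Under the best-score rule, training would have halted at step 5000; whether that is preferable depends on how noisy the validation curve is.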


In [17]:
!nvidia-smi

Fri Apr 15 04:54:22 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   71C    P0    51W / 250W |   8697MiB / 16280MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

In [18]:
del model
torch.cuda.empty_cache()
!nvidia-smi

Fri Apr 15 04:54:23 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   70C    P0    53W / 250W |   1371MiB / 16280MiB |     96%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

## Evaluate the model

Finally, we reload the checkpoint with the highest validation accuracy and evaluate it on the test set. We use the Datasets library to compute the accuracy.
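
The metric object used in the next cell accumulates predictions batch by batch and computes a single accuracy at the end. The following plain-Python stand-in illustrates that pattern; `RunningAccuracy` and `argmax` are hypothetical helpers written for this sketch and are not the Datasets implementation.

```python
class RunningAccuracy:
    """Minimal stand-in for a batch-accumulating accuracy metric."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def add_batch(self, predictions, references):
        # Count matches between predicted and true labels for one batch.
        self.correct += sum(int(p == r) for p, r in zip(predictions, references))
        self.total += len(references)

    def compute(self):
        return {"accuracy": self.correct / self.total}


def argmax(row):
    """Index of the largest logit, as logits.argmax(-1) does per example."""
    return max(range(len(row)), key=row.__getitem__)


# Two toy batches of 2-class logits with their labels (made-up values).
batches = [
    ([[0.2, 1.1], [2.0, -0.3]], [1, 0]),
    ([[0.5, 0.4], [-1.0, 0.7]], [1, 1]),
]

metric = RunningAccuracy()
for logits, labels in batches:
    metric.add_batch([argmax(row) for row in logits], labels)

print(metric.compute())  # {'accuracy': 0.75}
```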

In [19]:
from tqdm.notebook import tqdm
from datasets import load_metric

accuracy = load_metric("accuracy")

#load best performing model checkpoint
model = PerceiverForSequenceClassification(configuration)
model.load_state_dict(torch.load(piaynTaskModelDirTokens + '/trainedPerceiverClassifierToken'+str(max_length)+'.pkl'))
model.to(device)

model.eval()

with torch.no_grad():
  for i, batch in tqdm(ds_iter['test']):
        
        # get the inputs; 
        inputs = batch["input_ids_0"].to(device)
        attention_mask = batch["mask_0"].to(device)
        labels = batch["label"].to(device)

        # forward pass
        outputs = model(inputs=inputs, attention_mask=attention_mask)
        logits = outputs.logits 
        predictions = logits.argmax(-1).cpu().detach().numpy()
        references = batch["label"].numpy()
        accuracy.add_batch(predictions=predictions, references=references)

        #delete intermediate variables to free up GPU space
        del logits, outputs, inputs, attention_mask, labels, predictions, references

final_score = accuracy.compute()
print("Accuracy on test set:", final_score['accuracy'])

Accuracy on test set: 0.6493677976952625
