<a href="https://colab.research.google.com/github/NA-VNEET/Multimodal-BERT-in-Medical-Image-and-Text-Classification/blob/Navneet/run_bert_text_only.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Run Text-Only Experiments

This notebook shows the end-to-end pipeline for fine-tuning a pre-trained BERT model for text classification on our dataset.

Parts of this pipeline are adapted from [McCormick's and Ryan's Tutorial on BERT Fine-Tuning](http://mccormickml.com/2019/07/22/BERT-fine-tuning/) and the
Huggingface `run_mmimdb.py` script to execute the MMBT model. This code can
be accessed [here.](https://github.com/huggingface/transformers/blob/8ea412a86faa8e9edeeb6b5c46b08def06aa03ea/examples/research_projects/mm-imdb/run_mmimdb.py#L305)

## Skip unless on Google Colab


In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [2]:
%pwd

'/content'

### Working Directory
The notebook needs to be executed from the parent directory of the project, i.e. the `LAP` folder, which contains the notebooks, the data/, MMBT/, runs/, etc. directories.

Change the cell below to reflect the correct path to the `LAP` folder in your drive.

In [3]:
%cd /content/drive/MyDrive/LAP
%pwd

/content/drive/MyDrive/LAP


'/content/drive/MyDrive/LAP'

### Checking Directory
If you're in the correct directory, the command in the cell below should list the notebooks and the `MMBT/`, `data/`, `runs/`, and `integrated_gradients/` directories. If you don't get this output, you are not in the correct directory to run the subsequent cells in this notebook.

In [4]:
%ls

10epochs_text_only_findings/        preprocess/
baseline_experiments_results.ipynb  __pycache__/
bertviz_attention.ipynb             README.md
data/                               requirements_bare.txt
false_preds.ipynb                   requirements_no_builds.txt
figures/                            requirements.txt
image_submodel.ipynb                run_bert_text_only.ipynb
integrated_gradients/               run_mmbt.ipynb
LAP_environment_no_versions.yaml    run_mmbt_masked_text_eval.ipynb
LAP_environment.yaml                runs/
LICENSE                             textBert_utils.py
MMBT/                               wandb/


## Check GPU is Available

In [5]:
import torch

# If there's a GPU available...
if torch.cuda.is_available():

    # Tell PyTorch to use the GPU.
    device = torch.device("cuda")

    print('There are %d GPU(s) available.' % torch.cuda.device_count())

    print('We will use the GPU:', torch.cuda.get_device_name(0))

# If not...
else:
    print('No GPU available, using the CPU instead.')
    device = torch.device("cpu")

There are 1 GPU(s) available.
We will use the GPU: Tesla T4


## Install Huggingface Transformers and WandB modules

These should have been installed during your environment set-up; you only need to run these cells in Google Colab.

In [6]:
pip install transformers==4.28.0  # Or another compatible version


Collecting transformers==4.28.0
  Downloading transformers-4.28.0-py3-none-any.whl.metadata (109 kB)
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1 (from transformers==4.28.0)
  Downloading tokenizers-0.13.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.7 kB)
Downloading transformers-4.28.0-py3-none-any.whl (7.0 MB)
Downloading tokenizers-0.13.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB)
Installing collected packages: tokenizers, transformers
  Attempting uninstall

In [7]:
%pip install wandb



## Import Required Modules

In [8]:
from textBert_utils import (
    get_train_val_test_data,
    tokenize_and_encode_data,
    make_tensor_dataset,
    make_dataloader,
    set_seed,
    get_label_frequencies,
    get_multiclass_criterion
)

In [9]:
from MMBT.mmbt_utils import get_multiclass_labels, get_labels

In [10]:
import textBert_utils

In [11]:
import argparse
import pandas as pd
import os
import wandb
import glob
import numpy as np

In [12]:
import logging
import json

In [13]:
from transformers import (
    WEIGHTS_NAME,
    AutoConfig,
    AutoModelForSequenceClassification,
    AutoTokenizer,
)

# Set-up Experiment Hyperparameters and Arguments

Specify the training, validation, and test files to run the experiment on. The default here runs the model on the 'findings' texts; uncomment one of the alternative file sets in the next cell to use a different text source.  

To re-make the training, validation, and test data, please refer to the information in the **data/** directory.  

In the following cell, change the `default` values in the `parser.add_argument` calls for any hyperparameters you want to set, or keep the defaults as they are.  

For multiple experiment runs, please make sure to change the `output_dir` argument so that new results don't overwrite existing ones.
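
One way to avoid accidental overwrites is to derive the output directory from a base name plus a timestamp. The helper below is a minimal sketch of that idea; `make_run_dir` is a hypothetical name and is not part of `textBert_utils`.

```python
import os
from datetime import datetime


def make_run_dir(base_name, root=".", now=None):
    """Return a path like '<root>/<base_name>_YYYYmmdd-HHMMSS' (hypothetical helper)."""
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d-%H%M%S")
    return os.path.join(root, f"{base_name}_{stamp}")


# Example: the result could be used as the default of the --output_dir argument.
run_dir = make_run_dir("10epochs_text_only_findings")
print(run_dir)
```

Because the timestamp has one-second resolution, two runs started within the same second would still collide; for this notebook's interactive use that is unlikely to matter.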

In [14]:
#train_file = "image_labels_impression_frontal_train.csv"
#val_file = "image_labels_impression_frontal_val.csv"
#test_file = "image_labels_impression_frontal_test.csv"

#train_file = "image_multi_labels_major_findings_frontal_train.csv"
#val_file = "image_multi_labels_major_findings_frontal_val.csv"
#test_file = "image_multi_labels_major_findings_frontal_test.csv"


#train_file = "image_labels_major_findings_frontal_train.csv"
#val_file = "image_labels_major_findings_frontal_val.csv"
#test_file = "image_labels_major_findings_frontal_test.csv"


train_file = "image_labels_findings_frontal_train.csv"
val_file = "image_labels_findings_frontal_val.csv"
test_file = "image_labels_findings_frontal_test.csv"

In [15]:
parser = argparse.ArgumentParser(description='Project Hyperparameters and Other Configurations Argument Parser')

In [16]:
parser = argparse.ArgumentParser()

# Required parameters
parser.add_argument(
    "--data_dir",
    default="data/csv",
    type=str,
    help="The input data dir. Should contain the .csv files.",
)
parser.add_argument(
    "--model_name",
    default="bert-base-uncased",
    type=str,
    help="model identifier from huggingface.co/models",
)
parser.add_argument(
    "--output_dir",
    default="10epochs_text_only_findings",
    type=str,
    help="The output directory where the model predictions and checkpoints will be written.",
)


parser.add_argument(
    "--config_name", default="bert-base-uncased", type=str, help="Pretrained config name if not the same as model_name"
)
parser.add_argument(
    "--tokenizer_name",
    default="bert-base-uncased",
    type=str,
    help="Pretrained tokenizer name or path if not the same as model_name",
)

parser.add_argument("--train_batch_size", default=32, type=int, help="Batch size for training.")
parser.add_argument(
    "--eval_batch_size", default=32, type=int, help="Batch size for evaluation."
)
parser.add_argument(
    "--max_seq_length",
    default=300,
    type=int,
    help="The maximum total input sequence length after tokenization. Sequences longer "
    "than this will be truncated, sequences shorter will be padded.",
)
parser.add_argument(
    "--num_image_embeds", default=3, type=int, help="Number of Image Embeddings from the Image Encoder"
)
parser.add_argument("--do_train", default=True, type=bool, help="Whether to run training.")
parser.add_argument("--do_eval", default=True, type=bool, help="Whether to run eval on the dev set.")
parser.add_argument(
    "--evaluate_during_training", default=True, type=bool, help="Run evaluation during training at each logging step."
)


parser.add_argument(
    "--gradient_accumulation_steps",
    type=int,
    default=1,
    help="Number of updates steps to accumulate before performing a backward/update pass.",
)
parser.add_argument("--learning_rate", default=5e-5, type=float, help="The initial learning rate for Adam.")
parser.add_argument("--weight_decay", default=0.1, type=float, help="Weight decay if we apply some.")
parser.add_argument("--adam_epsilon", default=1e-8, type=float, help="Epsilon for Adam optimizer.")
parser.add_argument("--max_grad_norm", default=1.0, type=float, help="Max gradient norm.")
parser.add_argument(
    "--num_train_epochs", default=10.0, type=float, help="Total number of training epochs to perform."
)
parser.add_argument("--patience", default=5, type=int, help="Patience for Early Stopping.")
parser.add_argument(
    "--max_steps",
    default=-1,
    type=int,
    help="If > 0: set total number of training steps to perform. Override num_train_epochs.",
)
parser.add_argument("--warmup_steps", default=0, type=int, help="Linear warmup over warmup_steps.")

parser.add_argument("--logging_steps", type=int, default=25, help="Log every X updates steps.")
parser.add_argument("--save_steps", type=int, default=25, help="Save checkpoint every X updates steps.")
parser.add_argument(
    "--eval_all_checkpoints",
    default=True, type=bool,
    help="Evaluate all checkpoints starting with the same prefix as model_name and ending with step number",
)

parser.add_argument("--num_workers", type=int, default=8, help="number of worker threads for dataloading")

parser.add_argument("--seed", type=int, default=42, help="random seed for initialization")


args = parser.parse_args("")

# Setup CUDA, GPU & distributed training
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
args.n_gpu = torch.cuda.device_count() if torch.cuda.is_available() else 0
args.device = device

# Setup Train/Val/Test filenames
args.train_file = train_file
args.val_file = val_file
args.test_file = test_file

# accommodate multiclass labeling
args.multiclass = False
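
A caution on the boolean flags above: `argparse` applies `type=bool` by calling `bool()` on the raw string, so any non-empty value, including `"False"`, parses as `True`. The defaults in this notebook are unaffected because it calls `parser.parse_args("")`, but the flags would misbehave if values were passed on a command line. The following is a minimal sketch of one common workaround; `str2bool` is a hypothetical helper and not part of this project.

```python
import argparse


def str2bool(value):
    """Convert common true/false spellings to a bool (hypothetical helper)."""
    if isinstance(value, bool):
        return value
    lowered = value.strip().lower()
    if lowered in ("true", "t", "yes", "y", "1"):
        return True
    if lowered in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError(f"Expected a boolean value, got {value!r}")


# type=bool: bool("False") is True because the string is non-empty.
naive = argparse.ArgumentParser()
naive.add_argument("--do_train", default=True, type=bool)
naive_args = naive.parse_args(["--do_train", "False"])

# type=str2bool: the string is interpreted as intended.
fixed = argparse.ArgumentParser()
fixed.add_argument("--do_train", default=True, type=str2bool)
fixed_args = fixed.parse_args(["--do_train", "False"])

print(naive_args.do_train, fixed_args.do_train)
```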

### Check that the Args dict contains correct configurations

In [17]:
args.__dict__

{'data_dir': 'data/csv',
 'model_name': 'bert-base-uncased',
 'output_dir': '10epochs_text_only_findings',
 'config_name': 'bert-base-uncased',
 'tokenizer_name': 'bert-base-uncased',
 'train_batch_size': 32,
 'eval_batch_size': 32,
 'max_seq_length': 300,
 'num_image_embeds': 3,
 'do_train': True,
 'do_eval': True,
 'evaluate_during_training': True,
 'gradient_accumulation_steps': 1,
 'learning_rate': 5e-05,
 'weight_decay': 0.1,
 'adam_epsilon': 1e-08,
 'max_grad_norm': 1.0,
 'num_train_epochs': 10.0,
 'patience': 5,
 'max_steps': -1,
 'warmup_steps': 0,
 'logging_steps': 25,
 'save_steps': 25,
 'eval_all_checkpoints': True,
 'num_workers': 8,
 'seed': 42,
 'n_gpu': 1,
 'device': device(type='cuda'),
 'train_file': 'image_labels_findings_frontal_train.csv',
 'val_file': 'image_labels_findings_frontal_val.csv',
 'test_file': 'image_labels_findings_frontal_test.csv',
 'multiclass': False}

## Set-up WandB

We track our experiments with Weights & Biases (WandB) so that later runs can be logged and compared. You need to sign up for a WandB account before continuing.

In [18]:
wandb.login()

wandb: Using wandb-core as the SDK backend.  Please refer to https://wandb.me/wandb-core for more information.
wandb: Logging into wandb.ai. (Learn how to deploy a W&B server locally: https://wandb.me/wandb-server)
wandb: You can find your API key in your browser here: https://wandb.ai/authorize
wandb: Paste an API key from your profile and hit enter:

 ··········

wandb: No netrc file found, creating one.
wandb: Appending key for api.wandb.ai to your netrc file: /root/.netrc
wandb: Currently logged in as: kalilinux161023 (kalilinux161023-federal-student-aid) to https://api.wandb.ai. Use `wandb login --relogin` to force relogin


True

In [19]:
wandb.init(name="Train_Findings_Texts_10", tags=['Findings', 'frontal'], project="Text_Only", notes="10 epochs 256 size and 32 batch", config=args.__dict__)
run_name = wandb.run.name
wandb_config = wandb.config

## Create Dataset

In [20]:
train, val, test = get_train_val_test_data(wandb_config)

Number of training sentences: 1,707

Number of val sentences: 570

Number of test sentences: 570



In [21]:
train.head()

Unnamed: 0.1,Unnamed: 0,img,label,text
0,2573,CXR2728_IM-1187-1001.png,1,Lungs remain hyperexpanded. No change in the ...
1,1061,CXR2156_IM-0775-1001.png,0,The heart is normal in size. The pulmonary va...
2,862,CXR1732_IM-0482-1001.png,0,Cardiac and mediastinal silhouette are unrema...
3,1581,CXR3265_IM-1551-1001.png,0,The heart size and mediastinal contours appea...
4,1847,CXR3760_IM-1883-1001.png,0,XXXX sternotomy XXXX remain in XXXX. The card...


In [22]:
val.head()

Unnamed: 0.1,Unnamed: 0,img,label,text
0,437,CXR857_IM-2378-1001.png,0,The heart size and pulmonary vascularity appe...
1,1142,CXR2349_IM-0914-1001.png,0,Heart size and vascularity normal. These cont...
2,1024,CXR2075_IM-0708-1001.png,0,The cardiomediastinal silhouette is normal in...
3,1454,CXR3020_IM-1395-1001.png,0,Heart size and mediastinal contours are norma...
4,1759,CXR3583_IM-1762-1001.png,0,"The heart, pulmonary XXXX and mediastinum are..."


In [23]:
test.head()

Unnamed: 0.1,Unnamed: 0,img,label,text
0,1813,CXR3704_IM-1851-1001.png,0,"Lungs are clear without focal consolidation, ..."
1,1132,CXR2328_IM-0898-1001.png,0,Heart size is normal. The lungs are clear. Th...
2,2325,CXR1668_IM-0441-1001.png,1,"No pneumothorax, pleural effusion, or focal a..."
3,605,CXR1222_IM-0150-1001.png,0,The heart and lungs have XXXX XXXX in the int...
4,1073,CXR2198_IM-0808-1001.png,0,"Cardiac silhouette, pulmonary vascular patter..."


# Sentences and Labels

In [24]:
train_sentences = train.text.values
train_labels = train.label.values

val_sentences = val.text.values
val_labels = val.label.values

test_sentences = test.text.values
test_labels = test.label.values

In [25]:
train_sentences[:10]

array([' Lungs remain hyperexpanded. No change in the right middle lobe opacification. No XXXX infiltrates or masses. Pulmonary arteries are prominent centrally.',
       ' The heart is normal in size. The pulmonary vascularity is within normal limits in appearance. No focal air space opacities. No pleural effusions or pneumothorax. No acute bony abnormalities.',
       ' Cardiac and mediastinal silhouette are unremarkable. Lungs are clear. No focal consolidation, pneumothorax, or pleural effusion identified. XXXX and soft tissue are unremarkable.',
       ' The heart size and mediastinal contours appear within normal limits. No focal airspace consolidation, pleural effusion or pneumothorax. No acute bony abnormalities.',
       ' XXXX sternotomy XXXX remain in XXXX. The cardiomediastinal silhouette is within normal limits for appearance. The thoracic aorta is tortuous. No focal areas of pulmonary consolidation. No pneumothorax. No pleural effusion. Moderate degenerative changes of the

In [26]:
train_labels[:10]

array([1, 0, 0, 0, 0, 1, 0, 0, 0, 0])

# Tokenize and Encode with BERT `encode_plus`

The `tokenizer.encode_plus` function combines multiple steps for us:

1. Split the sentence into tokens.
2. Add the special `[CLS]` and `[SEP]` tokens.
3. Map the tokens to their IDs.
4. Pad or truncate all sentences to the same length.
5. Create the attention masks which explicitly differentiate real tokens from `[PAD]` tokens.

These steps are performed inside the `make_tensor_dataset` function.
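
To make the five steps concrete, here is a toy, pure-Python sketch. It is not the Hugging Face implementation: it splits on whitespace instead of using BERT's WordPiece tokenizer, and `toy_encode` and `VOCAB` are hypothetical names. The special-token IDs follow `bert-base-uncased`; the word IDs are read off the token IDs printed further down in this notebook and should be treated as illustrative.

```python
# Toy vocabulary (an assumption for illustration, not the full BERT vocab).
VOCAB = {"[PAD]": 0, "[UNK]": 100, "[CLS]": 101, "[SEP]": 102,
         "no": 2053, "change": 2689, "in": 1999, "the": 1996, "right": 2157}


def toy_encode(sentence, max_length):
    """Mimic the five encode_plus steps with whitespace tokenization."""
    tokens = sentence.lower().split()                      # 1. split into tokens
    tokens = tokens[: max_length - 2]                      # 4a. truncate, leaving room for specials
    tokens = ["[CLS]"] + tokens + ["[SEP]"]                # 2. add special tokens
    ids = [VOCAB.get(t, VOCAB["[UNK]"]) for t in tokens]   # 3. map tokens to IDs
    attention_mask = [1] * len(ids)                        # 5. 1 marks a real token
    padding = max_length - len(ids)
    ids += [VOCAB["[PAD]"]] * padding                      # 4b. pad to max_length
    attention_mask += [0] * padding                        # 5. 0 marks a [PAD] token
    return ids, attention_mask


ids, mask = toy_encode("No change in the right lung", max_length=10)
print(ids)   # "lung" is not in the toy vocabulary, so it maps to [UNK]
print(mask)
```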

# Torch Dataset and DataLoader

In [27]:
train_dataset = make_tensor_dataset(train_sentences, train_labels, wandb_config)
val_dataset = make_tensor_dataset(val_sentences, val_labels, wandb_config)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



Original:   Lungs remain hyperexpanded. No change in the right middle lobe opacification. No XXXX infiltrates or masses. Pulmonary arteries are prominent centrally.
Token IDs: tensor([  101,  8948,  3961, 23760, 10288,  9739,  5732,  1012,  2053,  2689,
         1999,  1996,  2157,  2690, 21833,  6728,  6305,  9031,  1012,  2053,
        22038, 20348, 29543,  2015,  2030, 11678,  1012, 21908, 28915,  2024,
         4069, 25497,  1012,   102,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,  

In [28]:
print(f'{len(train_dataset):>5,} training samples')
print(f'{len(val_dataset):>5,} validation samples')
#print(f'{len(test_dataset):>5,} test samples')

1,707 training samples
  570 validation samples


In [29]:
train_dataset[:3]

(tensor([[  101,  8948,  3961, 23760, 10288,  9739,  5732,  1012,  2053,  2689,
           1999,  1996,  2157,  2690, 21833,  6728,  6305,  9031,  1012,  2053,
          22038, 20348, 29543,  2015,  2030, 11678,  1012, 21908, 28915,  2024,
           4069, 25497,  1012,   102,     0,     0,     0,     0,     0,     0,
              0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
              0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
              0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
              0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
              0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
              0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
              0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
              0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
              0,     0,     0,     0,   

Create an iterator for the dataset using the torch DataLoader class.

In [30]:
data_loaders = {
    'train' : make_dataloader(train_dataset, wandb_config, eval=False),
    'train_size': len(train_dataset),
    'eval' : make_dataloader(val_dataset, wandb_config, eval=True),
    'eval_size' : len(val_dataset)
}
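
As a sanity check on the loader sizes, the number of batches per epoch follows from the dataset size and the batch size. The sketch below assumes the loaders keep the final partial batch (PyTorch's default, `drop_last=False`) and that `gradient_accumulation_steps` is 1, as configured above; `batches_per_epoch` is a hypothetical helper name.

```python
import math


def batches_per_epoch(num_samples, batch_size):
    """Number of batches when the last partial batch is kept (assumed)."""
    return math.ceil(num_samples / batch_size)


train_batches = batches_per_epoch(1707, 32)   # 1,707 training samples
eval_batches = batches_per_epoch(570, 32)     # 570 validation samples
total_steps = train_batches * 10              # num_train_epochs = 10

print(train_batches, eval_batches, total_steps)
```

With the default settings this gives 54 training batches and 18 evaluation batches per epoch, and 540 optimizer steps in total.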

# Fine Tune BERT for Classification

## Setup Logging

In [31]:
# Setup logging
logger = logging.getLogger(__name__)
if not os.path.exists(wandb_config.output_dir):
    os.makedirs(wandb_config.output_dir)
logging.basicConfig(format="%(asctime)s - %(levelname)s - %(name)s -   %(message)s",
                    datefmt="%m/%d/%Y %H:%M:%S",
                    filename=os.path.join(wandb_config.output_dir, f"{os.path.splitext(wandb_config.train_file)[0]}_logging.txt"),
                    level=logging.INFO)
logger.warning("device: %s, n_gpu: %s",
        wandb_config.device,
        wandb_config.n_gpu
)
# Set the verbosity to info of the Transformers logger (on main process only):

# Set seed
set_seed(wandb_config)



## Set up the Model and Train

The code trains the model on the specified training set and evaluates it on the specified validation set.

Outputs and checkpoints are saved in the directory given by the `--output_dir` argument.
TensorBoard data are saved in the `runs/` directory, under a name containing the date and time of the experiment and the filename of the train/test data file.
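
The learning rates logged during this run are consistent with a linear decay from the initial `learning_rate` to zero over the whole run, with no warmup (`warmup_steps=0`). The sketch below reproduces such a schedule under the assumption of 540 total optimizer steps (54 batches per epoch over 10 epochs). It is an illustration only, not the code in `textBert_utils.train`, which may be implemented differently; `linear_schedule_lr` is a hypothetical name.

```python
def linear_schedule_lr(step, base_lr=5e-5, warmup_steps=0, total_steps=540):
    """Linear warmup followed by linear decay to zero (assumed schedule)."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Compare against the "learning_rate" values in the logged records.
for step in (25, 50, 75):
    print(step, linear_schedule_lr(step))
```

At step 25 this gives roughly 4.7685e-05, matching the first logged evaluation record.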

In [32]:
%pdb on
# set up model
if args.multiclass:
    labels = get_multiclass_labels()
    num_labels = len(labels)
else:
    labels = get_labels()
    num_labels = len(labels)
transformer_config = AutoConfig.from_pretrained(wandb_config.model_name, num_labels=num_labels)
tokenizer = AutoTokenizer.from_pretrained(
        wandb_config.tokenizer_name,
        do_lower_case=True,
        cache_dir=None,
    )
transformer_model = AutoModelForSequenceClassification.from_pretrained(wandb_config.model_name, config=transformer_config)
transformer_model.to(device)
logger.info(f"Training/evaluation parameters: {wandb_config}")
# Training
if wandb_config.do_train:
    if wandb_config.multiclass:
        criterion = get_multiclass_criterion(train_labels)
        global_step, tr_loss = textBert_utils.train(data_loaders, wandb_config, transformer_model, criterion)
    else:
        global_step, tr_loss = textBert_utils.train(data_loaders, wandb_config, transformer_model)
    logger.info(" global_step = %s, average loss = %s", global_step, tr_loss)

    # Saving best-practices: if you use default names for the model, you can reload it using from_pretrained()
    logger.info("Saving model checkpoint to %s", wandb_config.output_dir)
    # Save a trained model, configuration and tokenizer using `save_pretrained()`.
    # They can then be reloaded using `from_pretrained()`
    model_to_save = (transformer_model.module if hasattr(transformer_model, "module") else transformer_model)  # Take care of distributed/parallel training
    torch.save(model_to_save.state_dict(), os.path.join(wandb_config.output_dir, WEIGHTS_NAME))
    tokenizer.save_pretrained(wandb_config.output_dir)
    transformer_config.save_pretrained(wandb_config.output_dir)

    # Good practice: save your training arguments together with the trained model
    torch.save(args, os.path.join(wandb_config.output_dir, "training_args.bin"))

    # Load a trained model and vocabulary that you have fine-tuned
    transformer_model = AutoModelForSequenceClassification.from_pretrained(wandb_config.model_name, config=transformer_config)
    transformer_model.load_state_dict(torch.load(os.path.join(wandb_config.output_dir, WEIGHTS_NAME)))
    tokenizer = AutoTokenizer.from_pretrained(wandb_config.output_dir)
    transformer_model.to(device)
logger.info("***** Training Finished *****")
wandb.finish()


Automatic pdb calling has been turned ON



Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.weight', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly i

{"eval_loss": 0.5048417432440652, "eval_accuracy": 0.7719298245614035, "learning_rate": 4.768518518518519e-05, "training_loss": 0.5411369478702546, "step": 25}




{"eval_loss": 0.3432459367646111, "eval_accuracy": 0.8508771929824561, "learning_rate": 4.5370370370370374e-05, "training_loss": 0.44070822894573214, "step": 50}



Batch Iteration: 100%|██████████| 54/54 [02:02<00:00,  2.26s/it]

{"eval_loss": 0.341716190179189, "eval_accuracy": 0.8508771929824561, "learning_rate": 4.305555555555556e-05, "training_loss": 0.32781006932258605, "step": 75}




{"eval_loss": 0.3438637802998225, "eval_accuracy": 0.8631578947368421, "learning_rate": 4.074074074074074e-05, "training_loss": 0.31523105382919314, "step": 100}



Batch Iteration: 100%|██████████| 54/54 [02:06<00:00,  2.34s/it]

{"eval_loss": 0.4263071161177423, "eval_accuracy": 0.8526315789473684, "learning_rate": 3.8425925925925924e-05, "training_loss": 0.20973343178629875, "step": 125}




{"eval_loss": 0.46366561783684623, "eval_accuracy": 0.8526315789473684, "learning_rate": 3.611111111111111e-05, "training_loss": 0.21654624089598656, "step": 150}



Batch Iteration: 100%|██████████| 54/54 [02:08<00:00,  2.37s/it]

{"eval_loss": 0.45301128427187604, "eval_accuracy": 0.8368421052631579, "learning_rate": 3.3796296296296295e-05, "training_loss": 0.23104581281542777, "step": 175}




{"eval_loss": 0.5196867348212335, "eval_accuracy": 0.8403508771929824, "learning_rate": 3.148148148148148e-05, "training_loss": 0.12206920746713877, "step": 200}



{"eval_loss": 0.5107920782433616, "eval_accuracy": 0.8333333333333334, "learning_rate": 2.916666666666667e-05, "training_loss": 0.15701352283358574, "step": 225}



{"eval_loss": 0.6264057734774219, "eval_accuracy": 0.8543859649122807, "learning_rate": 2.6851851851851855e-05, "training_loss": 0.10498273335397243, "step": 250}



{"eval_loss": 0.5224594171676371, "eval_accuracy": 0.8596491228070176, "learning_rate": 2.4537037037037038e-05, "training_loss": 0.09395765330642462, "step": 275}



{"eval_loss": 0.7296255135701762, "eval_accuracy": 0.856140350877193, "learning_rate": 2.2222222222222223e-05, "training_loss": 0.040761339161545035, "step": 300}



{"eval_loss": 0.9112317462762197, "eval_accuracy": 0.8228070175438597, "learning_rate": 1.990740740740741e-05, "training_loss": 0.060564382458105685, "step": 325}



{"eval_loss": 0.6909018605947495, "eval_accuracy": 0.8543859649122807, "learning_rate": 1.7592592592592595e-05, "training_loss": 0.0630000024754554, "step": 350}



{"eval_loss": 0.7589446612530284, "eval_accuracy": 0.8526315789473684, "learning_rate": 1.527777777777778e-05, "training_loss": 0.0274848878569901, "step": 375}



{"eval_loss": 0.7501854060424699, "eval_accuracy": 0.8491228070175438, "learning_rate": 1.2962962962962962e-05, "training_loss": 0.017986237863078715, "step": 400}



{"eval_loss": 0.7369731134838529, "eval_accuracy": 0.856140350877193, "learning_rate": 1.0648148148148148e-05, "training_loss": 0.034348630886524914, "step": 425}
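
The logged metrics above can be compared programmatically rather than by eye. The following is a minimal sketch, not part of the original pipeline: it copies the `eval_loss`, `eval_accuracy` and `step` fields from the log lines shown in this excerpt (steps 175 to 425 only) and reports which step has the highest validation accuracy and which has the lowest validation loss.

```python
import json

# Log lines copied from the output above, trimmed to three fields each.
# Only the steps visible in this excerpt are included.
log_lines = [
    '{"eval_loss": 0.45301128427187604, "eval_accuracy": 0.8368421052631579, "step": 175}',
    '{"eval_loss": 0.5196867348212335, "eval_accuracy": 0.8403508771929824, "step": 200}',
    '{"eval_loss": 0.5107920782433616, "eval_accuracy": 0.8333333333333334, "step": 225}',
    '{"eval_loss": 0.6264057734774219, "eval_accuracy": 0.8543859649122807, "step": 250}',
    '{"eval_loss": 0.5224594171676371, "eval_accuracy": 0.8596491228070176, "step": 275}',
    '{"eval_loss": 0.7296255135701762, "eval_accuracy": 0.856140350877193, "step": 300}',
    '{"eval_loss": 0.9112317462762197, "eval_accuracy": 0.8228070175438597, "step": 325}',
    '{"eval_loss": 0.6909018605947495, "eval_accuracy": 0.8543859649122807, "step": 350}',
    '{"eval_loss": 0.7589446612530284, "eval_accuracy": 0.8526315789473684, "step": 375}',
    '{"eval_loss": 0.7501854060424699, "eval_accuracy": 0.8491228070175438, "step": 400}',
    '{"eval_loss": 0.7369731134838529, "eval_accuracy": 0.856140350877193, "step": 425}',
]

records = [json.loads(line) for line in log_lines]

# The two criteria can disagree, so report both.
best_acc = max(records, key=lambda r: r["eval_accuracy"])
best_loss = min(records, key=lambda r: r["eval_loss"])

print(f"highest eval_accuracy: step {best_acc['step']} ({best_acc['eval_accuracy']:.4f})")
print(f"lowest eval_loss:      step {best_loss['step']} ({best_loss['eval_loss']:.4f})")
```

Among these steps the two criteria pick different checkpoints, which is worth keeping in mind when deciding which checkpoint to evaluate on the test set.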



# Evaluation on the Test Set

## Tokenizer and Test Dataset Preparation

Use the tokenizer saved during the training step to encode the test set.

In [33]:
wandb.init(name="Test_Findings_Texts_10", tags=['Findings', 'frontal'], project="Text_Only", notes="10 epochs 256 size and 32 batch", config=args.__dict__)
# wandb.tensorboard.patch(root_logdir="...")
run_name = wandb.run.name
wandb_config = wandb.config

In [34]:
test_dataset = make_tensor_dataset(test_sentences, test_labels, wandb_config, saved_model=True)

Original:   Lungs are clear without focal consolidation, effusion or pneumothorax. Normal heart size. Bony thorax and soft tissues unremarkable
Token IDs: tensor([  101,  8948,  2024,  3154,  2302, 15918, 17439,  1010,  1041,  4246,
        14499,  2030,  1052,  2638,  2819, 29288,  2527,  2595,  1012,  3671,
         2540,  2946,  1012, 22678, 15321,  8528,  1998,  3730, 14095,  4895,
        28578, 17007,  3085,   102,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,  
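
The output above shows the usual BERT input layout: `101` (`[CLS]`), the word-piece IDs, `102` (`[SEP]`), then `0` (`[PAD]`) up to the maximum length. The sketch below illustrates only that padding step and the matching attention mask in plain Python. `pad_to_max_length` is a hypothetical helper, not the notebook's `make_tensor_dataset`, and the maximum length of 256 is an assumption taken from the run notes ("256 size").

```python
PAD_ID = 0    # [PAD] in the bert-base-uncased vocabulary
CLS_ID = 101  # [CLS]
SEP_ID = 102  # [SEP]


def pad_to_max_length(token_ids, max_length):
    """Right-pad (or cut) a list of token IDs to max_length.

    Returns the padded IDs and an attention mask with 1 for real
    tokens and 0 for padding. Hypothetical helper for illustration.
    """
    ids = list(token_ids)[:max_length]
    n_pad = max_length - len(ids)
    attention_mask = [1] * len(ids) + [0] * n_pad
    return ids + [PAD_ID] * n_pad, attention_mask


# The non-zero token IDs printed in the cell output above.
example_ids = [
    101, 8948, 2024, 3154, 2302, 15918, 17439, 1010, 1041, 4246,
    14499, 2030, 1052, 2638, 2819, 29288, 2527, 2595, 1012, 3671,
    2540, 2946, 1012, 22678, 15321, 8528, 1998, 3730, 14095, 4895,
    28578, 17007, 3085, 102,
]

padded_ids, attention_mask = pad_to_max_length(example_ids, max_length=256)
print(len(padded_ids), sum(attention_mask))
```

The attention mask is what lets the model ignore the padded positions, so the many trailing zeros do not affect the prediction.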

In [35]:
data_loaders['test'] = make_dataloader(test_dataset, wandb_config, eval=True)
data_loaders['test_size'] = len(test_dataset)

In [36]:
# Evaluation
results = {}
if wandb_config.do_eval:
    checkpoints = [wandb_config.output_dir]
    if wandb_config.eval_all_checkpoints:
        # recursive=False makes "**" match exactly one directory level, so only
        # the checkpoint subdirectories are listed, not the parent directory.
        checkpoints = [
            os.path.dirname(c)
            for c in sorted(glob.glob(wandb_config.output_dir + "/**/" + WEIGHTS_NAME, recursive=False))
        ]

    logger.info("Evaluate the following checkpoints: %s", checkpoints)

    for checkpoint in checkpoints:
        global_step = checkpoint.split("-")[-1] if len(checkpoints) > 1 else ""
        prefix = checkpoint.split("/")[-1] if checkpoint.find("checkpoint") != -1 else ""
        transformer_model = AutoModelForSequenceClassification.from_pretrained(wandb_config.model_name, config=transformer_config)
        checkpoint = os.path.join(checkpoint, 'pytorch_model.bin')
        transformer_model.load_state_dict(torch.load(checkpoint))
        transformer_model.to(wandb_config.device)
        if wandb_config.multiclass:
            result = textBert_utils.evaluate(data_loaders, wandb_config, transformer_model, prefix=prefix, test=True, criterion=criterion)
        else:
            result = textBert_utils.evaluate(data_loaders, wandb_config, transformer_model, prefix=prefix, test=True) # test=True uses the test_dataset not val_dataset
        result = dict((k + "_{}".format(global_step), v) for k, v in result.items())
        results.update(result)
    logger.info("***** Evaluation on Test Data Finished *****")
wandb.finish()

Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.weight', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly i
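
The checkpoint discovery in the evaluation cell relies on how `glob` treats `**`. The sketch below reproduces that behaviour on a throwaway directory tree. The folder names `checkpoint-25` and `checkpoint-50` are made up for illustration, and `WEIGHTS_NAME` is set by hand to the same file name the cell loads (`pytorch_model.bin`).

```python
import glob
import os
import tempfile

WEIGHTS_NAME = "pytorch_model.bin"  # file name loaded in the evaluation cell

with tempfile.TemporaryDirectory() as output_dir:
    # Hypothetical layout: weights in the parent folder and in two checkpoint folders.
    for sub in ("", "checkpoint-25", "checkpoint-50"):
        folder = os.path.join(output_dir, sub)
        os.makedirs(folder, exist_ok=True)
        open(os.path.join(folder, WEIGHTS_NAME), "w").close()

    pattern = output_dir + "/**/" + WEIGHTS_NAME

    # recursive=False: "**" behaves like "*", i.e. exactly one directory level.
    found = sorted(os.path.dirname(p) for p in glob.glob(pattern, recursive=False))
    names = [os.path.basename(p) for p in found]

    # recursive=True: "**" also matches zero directories, so the parent is included.
    n_recursive = len(glob.glob(pattern, recursive=True))

    # The same step extraction the evaluation loop uses for global_step.
    steps = [name.split("-")[-1] for name in names]

print(names, steps, n_recursive)
```

Note that `sorted` orders the paths as strings, so a folder such as `checkpoint-100` would come before `checkpoint-25`. This only affects the order in which checkpoints are evaluated, not which ones are found.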

## Saving Test Eval Results

The evaluation code automatically saves the result for each checkpoint in that checkpoint's folder. The next cell simply writes all of the results to a single file.

In [37]:
with open(os.path.join(args.output_dir, f"{os.path.splitext(args.test_file)[0]}_eval_results.txt"), mode='w', encoding='utf-8') as out_f:
    print(results, file=out_f)
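
The cell above writes the Python `repr` of the dictionary, which is easy to read but awkward to load back. As an alternative, here is a minimal sketch that stores the same kind of dictionary as JSON. The keys and numbers in `results` are placeholders invented for illustration, the file name is hypothetical, and the `float(...)` conversion assumes the metric values are numeric scalars (for example NumPy floats).

```python
import json
import os
import tempfile

# Placeholder metrics, NOT results from this notebook.
results = {"eval_loss_": 0.5, "eval_accuracy_": 0.85}

with tempfile.TemporaryDirectory() as out_dir:
    path = os.path.join(out_dir, "test_eval_results.json")  # hypothetical file name

    # Convert values to plain floats so json can serialize them.
    with open(path, mode="w", encoding="utf-8") as out_f:
        json.dump({k: float(v) for k, v in results.items()}, out_f, indent=2)

    # Reading the file back gives an ordinary dict again.
    with open(path, encoding="utf-8") as in_f:
        reloaded = json.load(in_f)

print(reloaded)
```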