In [1]:
BRANCH = 'r1.0.0rc1'

In [2]:
"""
You can either run this notebook locally (if you have all the dependencies and a GPU) or on Google Colab.

Instructions for setting up Colab are as follows:
1. Open a new Python 3 notebook.
2. Import this notebook from GitHub (File -> Upload Notebook -> "GITHUB" tab -> copy/paste GitHub URL)
3. Connect to an instance with a GPU (Runtime -> Change runtime type -> select "GPU" for hardware accelerator)
4. Run this cell to set up dependencies.
"""
# If you're using Google Colab and not running locally, run this cell

# install NeMo
#!python -m pip install git+https://github.com/NVIDIA/NeMo.git@$BRANCH#egg=nemo_toolkit[nlp]

In [3]:
from nemo.utils.exp_manager import exp_manager
from nemo.collections import nlp as nemo_nlp

import os
import wget 
import torch
import pytorch_lightning as pl
from omegaconf import OmegaConf



# Task Description
Given a question and a context, both in natural language, predict the span within the context, defined by a start and an end position, that answers the question.
For every token in the input, the model predicts:
- the likelihood that this token is the start of the span
- the likelihood that this token is the end of the span

We use a pretrained [BERT](https://arxiv.org/pdf/1810.04805.pdf) encoder with two span prediction heads that predict the start and end positions of the answer. The span predictors are token classifiers consisting of a single linear layer.
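
To make the span prediction concrete, here is a minimal sketch of how a final answer span can be chosen from per-token start and end scores. It is an illustration of the general technique only, not NeMo's implementation: `best_span` is a hypothetical helper, the scores are made-up values, and the default length limit of 30 is an assumption chosen for the example.

```python
from typing import List, Tuple


def best_span(start_scores: List[float], end_scores: List[float],
              max_answer_length: int = 30) -> Tuple[int, int, float]:
    """Return (start, end, score) of the highest-scoring valid span.

    A span is valid when start <= end and it covers at most
    max_answer_length tokens. The span score is the sum of the
    start score and the end score.
    """
    best = (0, 0, float("-inf"))
    for start, s_score in enumerate(start_scores):
        # Only consider end positions at or after the start, within the length limit.
        last_end = min(start + max_answer_length, len(end_scores))
        for end in range(start, last_end):
            score = s_score + end_scores[end]
            if score > best[2]:
                best = (start, end, score)
    return best


# Toy scores for a 6-token context (hypothetical values).
start_scores = [0.1, 0.2, 3.0, 0.3, 0.1, 0.0]
end_scores = [0.0, 0.1, 0.5, 2.5, 0.2, 0.1]
start, end, score = best_span(start_scores, end_scores)
print(start, end, score)
```

The exhaustive search is quadratic in the sequence length, which is acceptable for a sketch; practical decoders usually restrict the search to the top few start and end candidates.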

# Dataset
This model expects the dataset to be in [SQuAD](https://rajpurkar.github.io/SQuAD-explorer/) format, i.e. one JSON file per dataset split.
The example below shows a training file. Each title has one or more paragraph entries, each consisting of the text ("context") and a list of question-answer entries. Each question-answer entry has:
* a question
* a globally unique id
* a boolean flag "is_impossible", which indicates whether the question is unanswerable
* if the question is answerable, one answer entry containing the answer text span and its starting character index in the context; if it is not answerable, the "answers" list is empty

The evaluation files (for validation and testing) follow the same format, except that they can provide more than one answer to the same question.
The inference file also follows this format, except that it does not require the "answers" and "is_impossible" keys.




```
{
    "data": [
        {
            "title": "Super_Bowl_50", 
            "paragraphs": [
                {
                    "context": "Super Bowl 50 was an American football game to determine the champion of the National Football League (NFL) for the 2015 season. The American Football Conference (AFC) champion Denver Broncos defeated the National Football Conference (NFC) champion Carolina Panthers 24\u201310 to earn their third Super Bowl title. The game was played on February 7, 2016, at Levi's Stadium in the San Francisco Bay Area at Santa Clara, California. As this was the 50th Super Bowl, the league emphasized the \"golden anniversary\" with various gold-themed initiatives, as well as temporarily suspending the tradition of naming each Super Bowl game with Roman numerals (under which the game would have been known as \"Super Bowl L\"), so that the logo could prominently feature the Arabic numerals 50.", 
                    "qas": [
                        {
                            "question": "Where did Super Bowl 50 take place?", 
                            "is_impossible": false, 
                            "id": "56be4db0acb8001400a502ee", 
                            "answers": [
                                {
                                    "answer_start": 403, 
                                    "text": "Santa Clara, California"
                                }
                            ]
                        },
                        {
                            "question": "What was the winning score of the Super Bowl 50?", 
                            "is_impossible": true, 
                            "id": "56be4db0acb8001400a502ez", 
                            "answers": [
                            ]
                        }
                    ]
                }
            ]
        }
    ]
}
...
```
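
The nesting shown above (data, paragraphs, qas, answers) can be walked with a few loops. The following is a minimal sketch, not part of NeMo: `iter_questions` and `misaligned_answers` are hypothetical helpers, and the tiny dataset is invented for illustration. The check verifies that each answer text really occurs in the context at its `answer_start` character offset, which is a cheap sanity test for custom datasets.

```python
import json
from typing import Dict, Iterator, List, Tuple


def iter_questions(squad: Dict) -> Iterator[Tuple[str, str, str, List[Dict]]]:
    """Yield (question id, question, context, answers) for every question."""
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                # Inference files may omit "answers", so fall back to an empty list.
                yield qa["id"], qa["question"], context, qa.get("answers", [])


def misaligned_answers(squad: Dict) -> List[str]:
    """Return ids of questions whose answer text is not found at answer_start."""
    bad = []
    for qid, _question, context, answers in iter_questions(squad):
        for answer in answers:
            start = int(answer["answer_start"])  # tolerate string or integer offsets
            if context[start:start + len(answer["text"])] != answer["text"]:
                bad.append(qid)
    return bad


# A tiny hypothetical dataset, not taken from SQuAD.
sample = json.loads("""
{"data": [{"title": "Example", "paragraphs": [{
    "context": "NeMo is a toolkit built by NVIDIA.",
    "qas": [
        {"id": "q1", "question": "Who built NeMo?", "is_impossible": false,
         "answers": [{"answer_start": 27, "text": "NVIDIA"}]},
        {"id": "q2", "question": "When was NeMo released?", "is_impossible": true,
         "answers": []}
    ]}]}]}
""")
print([qid for qid, *_ in iter_questions(sample)])
print(misaligned_answers(sample))
```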



## Download the data

In this notebook we are going to download the [SQuAD](https://rajpurkar.github.io/SQuAD-explorer/) dataset to showcase how to do training and inference. There are two datasets, SQuAD1.1 and SQuAD2.0. SQuAD1.1, the previous version of the SQuAD dataset, contains 100,000+ question-answer pairs on 500+ articles. The SQuAD2.0 dataset combines the 100,000 questions in SQuAD1.1 with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones.


To download both datasets, we use [NeMo/examples/nlp/question_answering/get_squad.py](https://github.com/NVIDIA/NeMo/blob/main/examples/nlp/question_answering/get_squad.py).




In [4]:
# set the following paths
DATA_DIR = "/home/ocistudent/Desktop/Spring-21/NeMoTutorials"
WORK_DIR = "/home/ocistudent/Desktop/Spring-21/NeMoTutorials"

In [5]:
# download the get_squad.py script, which downloads and preprocesses the SQuAD data
os.makedirs(WORK_DIR, exist_ok=True)
if not os.path.exists(WORK_DIR + '/get_squad.py'):
    print('Downloading get_squad.py...')
    wget.download(f'https://raw.githubusercontent.com/NVIDIA/NeMo/{BRANCH}/examples/nlp/question_answering/get_squad.py', WORK_DIR)
else:
    print('get_squad.py already exists')

get_squad.py already exists


In [6]:
# download and preprocess the data
! python $WORK_DIR/get_squad.py --destDir $DATA_DIR

[NeMo I 2021-05-05 13:16:05 get_squad:66] /home/ocistudent/Desktop/Spring-21/NeMoTutorials
[NeMo I 2021-05-05 13:16:05 get_squad:47] Downloading: https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v1.1.json
[NeMo I 2021-05-05 13:16:05 get_squad:49] ** Download file already exists, skipping download
[NeMo I 2021-05-05 13:16:05 get_squad:47] Downloading: https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v1.1.json
[NeMo I 2021-05-05 13:16:05 get_squad:49] ** Download file already exists, skipping download
[NeMo I 2021-05-05 13:16:05 get_squad:47] Downloading: https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v2.0.json
[NeMo I 2021-05-05 13:16:05 get_squad:49] ** Download file already exists, skipping download
[NeMo I 2021-05-05 13:16:05 get_squad:47] Downloading: https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v2.0.json
[NeMo I 2021-05-05 13:16:05 get_squad:49] ** Download file already exists, skipping download


After executing the above cell, your data folder will contain a subfolder "squad" with the following 4 files for training and evaluation:
- v1.1/train-v1.1.json
- v1.1/dev-v1.1.json
- v2.0/train-v2.0.json
- v2.0/dev-v2.0.json
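
Before moving on, it can be useful to confirm that all four files are in place. A minimal sketch, assuming the layout listed above; `missing_squad_files` is a hypothetical helper and the `"."` argument is only a placeholder for your own `DATA_DIR`:

```python
from pathlib import Path
from typing import List

# Relative paths expected under <data_dir>/squad, as listed above.
EXPECTED_FILES = [
    "v1.1/train-v1.1.json",
    "v1.1/dev-v1.1.json",
    "v2.0/train-v2.0.json",
    "v2.0/dev-v2.0.json",
]


def missing_squad_files(data_dir: str) -> List[str]:
    """Return the expected SQuAD files that are absent under data_dir/squad."""
    squad_dir = Path(data_dir) / "squad"
    return [name for name in EXPECTED_FILES if not (squad_dir / name).is_file()]


# "." is a placeholder; pass the DATA_DIR used when running get_squad.py.
print(missing_squad_files("."))
```

An empty list means every file was found.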

In [7]:
! ls -LR {DATA_DIR}/squad

/home/ocistudent/Desktop/Spring-21/NeMoTutorials/squad:
v1.1  v2.0

/home/ocistudent/Desktop/Spring-21/NeMoTutorials/squad/v1.1:
dev-v1.1.json  train-v1.1.json

/home/ocistudent/Desktop/Spring-21/NeMoTutorials/squad/v2.0:
dev-v2.0.json  train-v2.0.json


## Data preprocessing

The input into the model is the concatenation of two tokenized sequences:
" [CLS] query [SEP] context [SEP]".
This is the tokenization used for BERT, i.e. the [WordPiece](https://arxiv.org/pdf/1609.08144.pdf) tokenizer, which uses [Google's BERT vocabulary](https://github.com/google-research/bert). The tokenizer is configured with `model.tokenizer.tokenizer_name=bert-base-uncased` and is automatically instantiated using [Huggingface](https://huggingface.co/)'s API. 
The benefit of this tokenizer is that it is compatible with a pretrained BERT model, which we can fine-tune instead of training the question answering model from scratch. However, we also support other tokenizers, such as `model.tokenizer.tokenizer_name=sentencepiece`. Unlike the BERT WordPiece tokenizer, the [SentencePiece](https://github.com/google/sentencepiece) tokenizer model first needs to be created from a text file.
See [02_NLP_Tokenizers.ipynb](https://colab.research.google.com/github/NVIDIA/NeMo/blob/main/tutorials/nlp/02_NLP_Tokenizers.ipynb) for more details on how to use NeMo Tokenizers.
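
The "[CLS] query [SEP] context [SEP]" layout can be sketched in a few lines. This is an illustration under stated assumptions rather than NeMo's preprocessing code: `build_inputs` is a hypothetical helper, whitespace splitting stands in for WordPiece, and the overlapping-window handling of long contexts follows the common BERT SQuAD recipe, in which a context that does not fit is split into windows whose starts are `doc_stride` tokens apart.

```python
from typing import List


def build_inputs(query_tokens: List[str], context_tokens: List[str],
                 max_seq_length: int = 384, doc_stride: int = 128,
                 max_query_length: int = 64) -> List[List[str]]:
    """Build "[CLS] query [SEP] context-window [SEP]" sequences.

    Long contexts are split into overlapping windows whose start positions
    are doc_stride tokens apart, so every sequence fits in max_seq_length.
    """
    query = query_tokens[:max_query_length]
    # Three positions are reserved for [CLS] and the two [SEP] tokens.
    window = max_seq_length - len(query) - 3
    if window <= 0:
        raise ValueError("max_seq_length is too small for the query")
    sequences = []
    start = 0
    while True:
        chunk = context_tokens[start:start + window]
        sequences.append(["[CLS]", *query, "[SEP]", *chunk, "[SEP]"])
        if start + window >= len(context_tokens):
            break  # this window already reaches the end of the context
        start += min(window, doc_stride)
    return sequences


# Whitespace splitting stands in for WordPiece here (an assumption for brevity).
query = "where did super bowl 50 take place ?".split()
context = [f"tok{i}" for i in range(20)]
for seq in build_inputs(query, context, max_seq_length=20, doc_stride=4):
    # Print the sequence length and the first and last context token of each window.
    print(len(seq), seq[len(query) + 2], seq[-2])
```

Because the windows overlap, an answer that is cut off at the edge of one window is likely to appear whole in a neighboring one.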

# Data and Model Parameters


Note that this is only an example to showcase usage and is not optimized for accuracy. Below, we download and adjust the model configuration to create a toy example that uses only a small fraction of the original dataset. 

To train the full SQuAD model, leave the model parameters from the configuration file unchanged. This keeps `num_samples=-1`, which uses the entire dataset and makes training significantly slower. We recommend using a bash script and multiple GPUs to speed this up. 


In [8]:
# This is the model configuration file that we will download, do not change this
MODEL_CONFIG = "question_answering_squad_config.yaml"

# model parameters, play with these
BATCH_SIZE = 12
MAX_SEQ_LENGTH = 384
# specify the BERT-like model you want to use
PRETRAINED_BERT_MODEL = "bert-base-uncased"
TOKENIZER_NAME = "bert-base-uncased" # tokenizer name

# Number of data examples used for training, validation, test and inference
TRAIN_NUM_SAMPLES = VAL_NUM_SAMPLES = TEST_NUM_SAMPLES = 5000 
INFER_NUM_SAMPLES = 5

TRAIN_FILE = f"{DATA_DIR}/squad/v1.1/train-v1.1.json"
VAL_FILE = f"{DATA_DIR}/squad/v1.1/dev-v1.1.json"
TEST_FILE = f"{DATA_DIR}/squad/v1.1/dev-v1.1.json"
INFER_FILE = f"{DATA_DIR}/squad/v1.1/dev-v1.1.json"

INFER_PREDICTION_OUTPUT_FILE = "output_prediction.json"
INFER_NBEST_OUTPUT_FILE = "output_nbest.json"

# training parameters
LEARNING_RATE = 0.00003

# number of epochs
MAX_EPOCHS = 1

# Model Configuration

The model is defined in a config file that declares several important sections:
- **model**: all arguments related to the model: language model, span prediction, optimizer and schedulers, datasets, and any other related information

- **trainer**: any argument to be passed to PyTorch Lightning

In [9]:
# download the model's default configuration file 
config_dir = WORK_DIR + '/configs/'
os.makedirs(config_dir, exist_ok=True)
if not os.path.exists(config_dir + MODEL_CONFIG):
    print('Downloading config file...')
    wget.download(f'https://raw.githubusercontent.com/NVIDIA/NeMo/{BRANCH}/examples/nlp/question_answering/conf/{MODEL_CONFIG}', config_dir)
else:
    print('config file already exists')

config file already exists


In [10]:
# this line will print the entire default config of the model
config_path = f'{WORK_DIR}/configs/{MODEL_CONFIG}'
print(config_path)
config = OmegaConf.load(config_path)
print(OmegaConf.to_yaml(config))

/home/ocistudent/Desktop/Spring-21/NeMoTutorials/configs/question_answering_squad_config.yaml
name: QA
pretrained_model: null
do_training: true
trainer:
  gpus: 1
  num_nodes: 1
  max_epochs: 2
  max_steps: null
  accumulate_grad_batches: 1
  precision: 16
  amp_level: O1
  accelerator: ddp
  gradient_clip_val: 0.0
  val_check_interval: 1.0
  checkpoint_callback: false
  logger: false
  num_sanity_val_steps: 0
  log_every_n_steps: 1
model:
  nemo_path: null
  dataset:
    version_2_with_negative: false
    doc_stride: 128
    max_query_length: 64
    max_seq_length: 384
    max_answer_length: 30
    null_score_diff_threshold: 0.0
    n_best_size: 20
    use_cache: true
    do_lower_case: true
    num_workers: 2
    pin_memory: false
    drop_last: false
  train_ds:
    file: null
    batch_size: 24
    shuffle: true
    num_samples: -1
    num_workers: ${model.dataset.num_workers}
    drop_last: ${model.dataset.drop_last}
    pin_memory: ${model.dataset.pin_memory}
  validation_ds:
   

## Setting up data within the config

Among other things, the config file contains dictionaries called `dataset`, `train_ds`, `validation_ds` and `test_ds`. These configure the Dataset and DataLoader of the corresponding split.

Specify data paths using `model.train_ds.file`, `model.validation_ds.file` and `model.test_ds.file`.

Let's now add the data paths to the config.

In [12]:
config.model.train_ds.file = TRAIN_FILE
config.model.validation_ds.file = VAL_FILE
config.model.test_ds.file = TEST_FILE

config.model.train_ds.num_samples = TRAIN_NUM_SAMPLES
config.model.validation_ds.num_samples = VAL_NUM_SAMPLES
config.model.test_ds.num_samples = TEST_NUM_SAMPLES

config.model.tokenizer.tokenizer_name = TOKENIZER_NAME

print(OmegaConf.to_yaml(config))

name: QA
pretrained_model: null
do_training: true
trainer:
  gpus: 1
  num_nodes: 1
  max_epochs: 2
  max_steps: null
  accumulate_grad_batches: 1
  precision: 16
  amp_level: O1
  accelerator: ddp
  gradient_clip_val: 0.0
  val_check_interval: 1.0
  checkpoint_callback: false
  logger: false
  num_sanity_val_steps: 0
  log_every_n_steps: 1
model:
  nemo_path: null
  dataset:
    version_2_with_negative: false
    doc_stride: 128
    max_query_length: 64
    max_seq_length: 384
    max_answer_length: 30
    null_score_diff_threshold: 0.0
    n_best_size: 20
    use_cache: true
    do_lower_case: true
    num_workers: 2
    pin_memory: false
    drop_last: false
  train_ds:
    file: /home/ocistudent/Desktop/Spring-21/NeMoTutorials/squad/v1.1/train-v1.1.json
    batch_size: 24
    shuffle: true
    num_samples: 5000
    num_workers: ${model.dataset.num_workers}
    drop_last: ${model.dataset.drop_last}
    pin_memory: ${model.dataset.pin_memory}
  validation_ds:
    file: /home/ocistude

# Building the PyTorch Lightning Trainer

NeMo models are primarily PyTorch Lightning modules - and therefore are entirely compatible with the PyTorch Lightning ecosystem!

Let's first instantiate a Trainer object!

In [13]:
# let's modify some trainer configs
# check whether a GPU is available and use it if so
cuda = 1 if torch.cuda.is_available() else 0
config.trainer.gpus = cuda
config.trainer.precision = 16 if torch.cuda.is_available() else 32

# For mixed precision training, use precision=16 and amp_level=O1

config.trainer.max_epochs = MAX_EPOCHS

# Remove distributed training flags if only running on a single GPU or CPU
config.trainer.accelerator = None

print("Trainer config - \n")
print(OmegaConf.to_yaml(config.trainer))

trainer = pl.Trainer(**config.trainer)

GPU available: True, used: True
TPU available: None, using: 0 TPU cores
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]
Using native 16bit precision.


Trainer config - 

gpus: 1
num_nodes: 1
max_epochs: 1
max_steps: null
accumulate_grad_batches: 1
precision: 16
amp_level: O1
accelerator: null
gradient_clip_val: 0.0
val_check_interval: 1.0
checkpoint_callback: false
logger: false
num_sanity_val_steps: 0
log_every_n_steps: 1



# Setting up a NeMo Experiment

NeMo has an experiment manager that handles logging and checkpointing for us, so let's use it!

In [14]:
config.exp_manager.exp_dir = WORK_DIR
exp_dir = exp_manager(trainer, config.get("exp_manager", None))

# the exp_dir provides a path to the current experiment for easy access
exp_dir = str(exp_dir)

[NeMo I 2021-05-05 13:18:30 exp_manager:208] Experiments will be logged at /home/ocistudent/Desktop/Spring-21/NeMoTutorials/QA/2021-05-05_13-18-30
[NeMo I 2021-05-05 13:18:30 exp_manager:548] TensorboardLogger has been set up


# Using an Out-of-the-Box Model

In [15]:
# list available pretrained models
nemo_nlp.models.QAModel.list_available_models()

[PretrainedModelInfo(
 	pretrained_model_name=qa_squadv1.1_bertbase,
 	description=Question answering model finetuned from NeMo BERT Base Uncased on SQuAD v1.1 dataset which obtains an exact match (EM) score of 82.78% and an F1 score of 82.78%.,
 	location=https://api.ngc.nvidia.com/v2/models/nvidia/nemo/qa_squadv1_1_bertbase/versions/1.0.0rc1/files/qa_squadv1.1_bertbase.nemo
 ),
 PretrainedModelInfo(
 	pretrained_model_name=qa_squadv2.0_bertbase,
 	description=Question answering model finetuned from NeMo BERT Base Uncased on SQuAD v2.0 dataset which obtains an exact match (EM) score of 75.04% and an F1 score of 78.08%.,
 	location=https://api.ngc.nvidia.com/v2/models/nvidia/nemo/qa_squadv2_0_bertbase/versions/1.0.0rc1/files/qa_squadv2.0_bertbase.nemo
 ),
 PretrainedModelInfo(
 	pretrained_model_name=qa_squadv1_1_bertlarge,
 	description=Question answering model finetuned from NeMo BERT Large Uncased on SQuAD v1.1 dataset which obtains an exact match (EM) score of 85.44% and an F1 scor

In [16]:
# load pretrained model
pretrained_model_name="qa_squadv1.1_bertbase"
model = nemo_nlp.models.QAModel.from_pretrained(model_name=pretrained_model_name)

[NeMo I 2021-05-05 13:19:29 cloud:66] Downloading from: https://api.ngc.nvidia.com/v2/models/nvidia/nemo/qa_squadv1_1_bertbase/versions/1.0.0rc1/files/qa_squadv1.1_bertbase.nemo to /home/ocistudent/.cache/torch/NeMo/NeMo_1.0.0rc1/qa_squadv1.1_bertbase/42a5d611ce54ad0e2cc05000e259932f/qa_squadv1.1_bertbase.nemo
[NeMo I 2021-05-05 13:24:40 common:654] Instantiating model from pre-trained checkpoint


[NeMo W 2021-05-05 13:24:43 modelPT:180] Using /tmp/tmp61qo5m_i/tokenizer.vocab_file instead of tokenizer.vocab_file.


Using bos_token, but it is not set yet.
Using eos_token, but it is not set yet.
[NeMo W 2021-05-05 13:25:35 modelPT:132] Please call the ModelPT.setup_training_data() method and provide a valid configuration file to setup the train data loader.
    Train config : 
    file: /datasets/squad/v1.1/train-v1.1.json
    batch_size: 3
    shuffle: true
    num_samples: -1
    num_workers: 2
    drop_last: false
    pin_memory: false
    
[NeMo W 2021-05-05 13:25:35 modelPT:139] Please call the ModelPT.setup_validation_data() or ModelPT.setup_multiple_validation_data() method and provide a valid configuration file to setup the validation data loader(s). 
    Validation config : 
    file: /datasets/squad/v1.1/dev-v1.1.json
    batch_size: 3
    shuffle: false
    num_samples: -1
    num_workers: 2
    drop_last: false
    pin_memory: false
    
[NeMo W 2021-05-05 13:25:35 modelPT:146] Please call the ModelPT.setup_test_data() or ModelPT.setup_multiple_test_data() method and provide a valid con






[NeMo I 2021-05-05 13:28:07 modelPT:376] Model QAModel was successfully restored from /home/ocistudent/.cache/torch/NeMo/NeMo_1.0.0rc1/qa_squadv1.1_bertbase/42a5d611ce54ad0e2cc05000e259932f/qa_squadv1.1_bertbase.nemo.


# Model Training

Before initializing the model, we might want to modify some of the model configs.

In [17]:
# complete list of supported BERT-like models
nemo_nlp.modules.get_pretrained_lm_models_list()

['megatron-bert-345m-uncased',
 'megatron-bert-345m-cased',
 'megatron-bert-uncased',
 'megatron-bert-cased',
 'biomegatron-bert-345m-uncased',
 'biomegatron-bert-345m-cased',
 'bert-base-uncased',
 'bert-large-uncased',
 'bert-base-cased',
 'bert-large-cased',
 'bert-base-multilingual-uncased',
 'bert-base-multilingual-cased',
 'bert-base-chinese',
 'bert-base-german-cased',
 'bert-large-uncased-whole-word-masking',
 'bert-large-cased-whole-word-masking',
 'bert-large-uncased-whole-word-masking-finetuned-squad',
 'bert-large-cased-whole-word-masking-finetuned-squad',
 'bert-base-cased-finetuned-mrpc',
 'bert-base-german-dbmdz-cased',
 'bert-base-german-dbmdz-uncased',
 'cl-tohoku/bert-base-japanese',
 'cl-tohoku/bert-base-japanese-whole-word-masking',
 'cl-tohoku/bert-base-japanese-char',
 'cl-tohoku/bert-base-japanese-char-whole-word-masking',
 'TurkuNLP/bert-base-finnish-cased-v1',
 'TurkuNLP/bert-base-finnish-uncased-v1',
 'wietsedv/bert-base-dutch-cased',
 'distilbert-base-uncased

In [19]:
# add the model parameters specified above to the config
config.model.language_model.pretrained_model_name = PRETRAINED_BERT_MODEL
config.model.train_ds.batch_size = BATCH_SIZE
config.model.validation_ds.batch_size  = BATCH_SIZE
config.model.test_ds.batch_size = BATCH_SIZE
config.model.optim.lr = LEARNING_RATE

print("Updated model config - \n")
print(OmegaConf.to_yaml(config.model))

Updated model config - 

nemo_path: null
dataset:
  version_2_with_negative: false
  doc_stride: 128
  max_query_length: 64
  max_seq_length: 384
  max_answer_length: 30
  null_score_diff_threshold: 0.0
  n_best_size: 20
  use_cache: true
  do_lower_case: true
  num_workers: 2
  pin_memory: false
  drop_last: false
train_ds:
  file: /home/ocistudent/Desktop/Spring-21/NeMoTutorials/squad/v1.1/train-v1.1.json
  batch_size: 12
  shuffle: true
  num_samples: 5000
  num_workers: ${model.dataset.num_workers}
  drop_last: ${model.dataset.drop_last}
  pin_memory: ${model.dataset.pin_memory}
validation_ds:
  file: /home/ocistudent/Desktop/Spring-21/NeMoTutorials/squad/v1.1/dev-v1.1.json
  batch_size: 12
  shuffle: false
  num_samples: 5000
  num_workers: ${model.dataset.num_workers}
  drop_last: ${model.dataset.drop_last}
  pin_memory: ${model.dataset.pin_memory}
test_ds:
  file: /home/ocistudent/Desktop/Spring-21/NeMoTutorials/squad/v1.1/dev-v1.1.json
  batch_size: 12
  shuffle: false
  num_sa

In [21]:
# initialize the model
# the datasets will be prepared for training and evaluation during initialization
model = nemo_nlp.models.QAModel(cfg=config.model, trainer=trainer)

Using bos_token, but it is not set yet.
Using eos_token, but it is not set yet.
100%|██████████| 442/442 [00:18<00:00, 23.33it/s]

[NeMo I 2021-05-05 13:32:59 qa_dataset:119] loading from /home/ocistudent/Desktop/Spring-21/NeMoTutorials/squad/v1.1/train-v1.1.json_cache_train_BertTokenizer_30522_384_128_64_5000



100%|██████████| 48/48 [00:02<00:00, 21.19it/s]

[NeMo I 2021-05-05 13:33:02 qa_dataset:119] loading from /home/ocistudent/Desktop/Spring-21/NeMoTutorials/squad/v1.1/dev-v1.1.json_cache_eval_BertTokenizer_30522_384_128_64_5000



100%|██████████| 48/48 [00:02<00:00, 17.35it/s]

[NeMo I 2021-05-05 13:33:05 qa_dataset:119] loading from /home/ocistudent/Desktop/Spring-21/NeMoTutorials/squad/v1.1/dev-v1.1.json_cache_eval_BertTokenizer_30522_384_128_64_5000





## Monitoring Training Progress
Optionally, you can create a Tensorboard visualization to monitor training progress.

In [23]:
try:
  from google import colab
  COLAB_ENV = True
except (ImportError, ModuleNotFoundError):
  COLAB_ENV = False

# Load the TensorBoard notebook extension
if COLAB_ENV:
  %load_ext tensorboard
  %tensorboard --logdir {exp_dir}
else:
  print("To use tensorboard, please use this notebook in a Google Colab environment.")

To use tensorboard, please use this notebook in a Google Colab environment.


In [24]:
# start the training
trainer.fit(model)

[NeMo I 2021-05-05 13:36:28 modelPT:685] Optimizer config = AdamW (
    Parameter Group 0
        amsgrad: False
        betas: (0.9, 0.999)
        eps: 1e-08
        lr: 3e-05
        weight_decay: 0.0
    )
[NeMo I 2021-05-05 13:36:28 lr_scheduler:617] Scheduler "<nemo.core.optim.lr_scheduler.SquareRootAnnealing object at 0x7fd1a19d9d00>" 
    will be used during training (effective maximum steps = 426) - 
    Parameters : 
    (warmup_steps: null
    warmup_ratio: 0.0
    last_epoch: -1
    max_steps: 426
    )



  | Name       | Type            | Params
-----------------------------------------------
0 | bert_model | BertEncoder     | 109 M 
1 | classifier | TokenClassifier | 1.5 K 
2 | loss       | SpanningLoss    | 0     
-----------------------------------------------
109 M     Trainable params
0         Non-trainable params
109 M     Total params
    
    



    



[NeMo I 2021-05-05 13:41:31 qa_model:172] val exact match 60.08
[NeMo I 2021-05-05 13:41:31 qa_model:173] val f1 70.8809637707633


Epoch 0, global step 425: val_loss reached 1.66743 (best 1.66743), saving model to "/home/ocistudent/Desktop/Spring-21/NeMoTutorials/QA/2021-05-05_13-18-30/checkpoints/QA---val_loss=1.67-epoch=0.ckpt" as top 3
Saving latest checkpoint...





1

After training for 1 epoch, exact match on the evaluation data should be around 59.2%, F1 around 70.2%.
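
For readers unfamiliar with the two metrics, here is a minimal sketch of how exact match and F1 are conventionally computed for SQuAD. It assumes the conventions of the official SQuAD v1.1 evaluation script (lowercasing, stripping punctuation and the articles a/an/the, then comparing tokens); NeMo's own implementation may differ in details, and `best_score` is a hypothetical helper for the case where an evaluation file lists several reference answers.

```python
import re
import string
from collections import Counter
from typing import Callable, List


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, truth: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(truth))


def f1_score(prediction: str, truth: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    truth_tokens = normalize_answer(truth).split()
    # Multiset intersection counts each shared token at most as often as it occurs in both.
    overlap = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


def best_score(metric: Callable[[str, str], float], prediction: str,
               truths: List[str]) -> float:
    """Score against several reference answers by taking the maximum."""
    return max(metric(prediction, truth) for truth in truths)


references = ["Santa Clara, California", "Levi's Stadium"]
print(best_score(exact_match, "Levi's Stadium", references))
print(best_score(f1_score, "Levi's Stadium in Santa Clara", references))
```

The dataset-level numbers are the averages of these per-question scores, expressed as percentages.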

# Evaluation

To see how the model performs, let’s run evaluation on the test dataset.

In [25]:
model.setup_test_data(test_data_config=config.model.test_ds)
trainer.test(model)

100%|██████████| 48/48 [00:02<00:00, 21.32it/s]

[NeMo I 2021-05-05 13:41:59 qa_dataset:119] loading from /home/ocistudent/Desktop/Spring-21/NeMoTutorials/squad/v1.1/dev-v1.1.json_cache_eval_BertTokenizer_30522_384_128_64_5000



    



[NeMo I 2021-05-05 13:43:25 qa_model:172] test exact match 60.08
[NeMo I 2021-05-05 13:43:25 qa_model:173] test f1 70.8809637707633

--------------------------------------------------------------------------------
DATALOADER:0 TEST RESULTS
{'test_exact_match': 60.08,
 'test_f1': 70.8809637707633,
 'test_loss': tensor(1.6674, device='cuda:0')}
--------------------------------------------------------------------------------


[{'test_loss': 1.6674314737319946,
  'test_exact_match': 60.08,
  'test_f1': 70.8809637707633}]

# Inference

To use the model for creating predictions, let’s run inference on the unlabeled inference dataset.

In [26]:
# store the inference predictions under the experiment output folder
output_prediction_file = f"{exp_dir}/{INFER_PREDICTION_OUTPUT_FILE}"
output_nbest_file = f"{exp_dir}/{INFER_NBEST_OUTPUT_FILE}"
all_preds, all_nbests = model.inference(file=INFER_FILE, batch_size=5, num_samples=INFER_NUM_SAMPLES, output_nbest_file=output_nbest_file, output_prediction_file=output_prediction_file)

100%|██████████| 48/48 [00:02<00:00, 21.46it/s]


In [27]:
for _, item in all_preds.items():
    print(f"question: {item[0]} answer: {item[1]}")
# The prediction file contains the predicted answer for each question id of the first INFER_NUM_SAMPLES examples.
! python -m json.tool $exp_dir/$INFER_PREDICTION_OUTPUT_FILE

question: Which NFL team represented the AFC at Super Bowl 50? answer: Denver Broncos
question: Which NFL team represented the NFC at Super Bowl 50? answer: Carolina Panthers
question: Where did Super Bowl 50 take place? answer: Levi's Stadium
question: Which NFL team won Super Bowl 50? answer: Denver Broncos
question: What color was used to emphasize the 50th anniversary of the Super Bowl? answer: gold
{
    "56be4db0acb8001400a502ec": [
        "Which NFL team represented the AFC at Super Bowl 50?",
        "Denver Broncos"
    ],
    "56be4db0acb8001400a502ed": [
        "Which NFL team represented the NFC at Super Bowl 50?",
        "Carolina Panthers"
    ],
    "56be4db0acb8001400a502ee": [
        "Where did Super Bowl 50 take place?",
        "Levi's Stadium"
    ],
    "56be4db0acb8001400a502ef": [
        "Which NFL team won Super Bowl 50?",
        "Denver Broncos"
    ],
    "56be4db0acb8001400a502f0": [
        "What color was used to emphasize the 50th a

If you have NeMo installed locally, you can also train the model with 
[NeMo/examples/nlp/question_answering/question_answering_squad.py](https://github.com/NVIDIA/NeMo/blob/main/examples/nlp/question_answering/question_answering_squad.py).

To run the training script, use:

`python question_answering_squad.py model.train_ds.file=TRAIN_FILE model.validation_ds.file=VAL_FILE model.test_ds.file=TEST_FILE`
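
The `key=value` arguments in the command above address nested config entries by dotted path. As a rough illustration of that mapping only, here is a plain-Python sketch. It is not how the script itself parses its arguments: `apply_overrides` is a hypothetical helper, the starting config is a cut-down example, and all values are kept as strings for simplicity.

```python
from typing import Any, Dict, List


def apply_overrides(config: Dict[str, Any], overrides: List[str]) -> Dict[str, Any]:
    """Apply "a.b.c=value" overrides to a nested dict in place and return it."""
    for item in overrides:
        dotted_key, value = item.split("=", 1)
        *parents, leaf = dotted_key.split(".")
        node = config
        for name in parents:
            # Create intermediate sections on demand.
            node = node.setdefault(name, {})
        node[leaf] = value  # values stay strings in this sketch
    return config


# Cut-down, hypothetical starting config.
cfg = {"model": {"train_ds": {"file": None, "batch_size": 24}}}
apply_overrides(cfg, ["model.train_ds.file=train-v1.1.json", "trainer.gpus=8"])
print(cfg)
```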

To improve the performance of the model, train on multiple GPUs with a global batch size of 24. For example, if you use 8 GPUs with `trainer.gpus=8`, set `model.train_ds.batch_size=3`.
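
The arithmetic behind this recommendation can be written down as a small helper. This is a sketch assuming data-parallel training in which every GPU processes its own batch of `model.train_ds.batch_size` examples per step; `per_gpu_batch_size` is a hypothetical function, and the `num_nodes` and `accumulate_grad_batches` parameters mirror the trainer options of the same names.

```python
def per_gpu_batch_size(global_batch_size: int, gpus: int, num_nodes: int = 1,
                       accumulate_grad_batches: int = 1) -> int:
    """Return the per-GPU batch size that yields the requested global batch size."""
    # Number of per-GPU batches that contribute to one optimizer step.
    batches_per_step = gpus * num_nodes * accumulate_grad_batches
    if global_batch_size % batches_per_step != 0:
        raise ValueError(
            "global batch size must be divisible by "
            "gpus * num_nodes * accumulate_grad_batches"
        )
    return global_batch_size // batches_per_step


print(per_gpu_batch_size(24, gpus=8))  # 3
```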