<a href="https://www.nvidia.com/dli"> <img src="images/DLI_Header.png" alt="Header" style="width: 400px;"/> </a>

# 7.0 NER Fine-Tuning
## (part of Lab 2)

In this notebook, you'll use the NVIDIA TAO (Train, Adapt, and Optimize) Toolkit to fine-tune a [BERT](https://arxiv.org/abs/1810.04805)-based model for a named entity recognition (NER) task in a restaurant context using the [MIT Restaurant Corpus](https://groups.csail.mit.edu/sls/downloads/restaurant) dataset. To do so, you will use the [Token Classification](https://docs.nvidia.com/metropolis/TAO/tao-user-guide/text/nlp/token_classification.html) task in TAO.

**[7.1 Named Entity Recognition](#7.1-Named-Entity-Recognition)<br>**
**[7.2 TAO Toolkit `token_classification` Task](#7.2-TAO-Toolkit-token_classification-Task)<br>**
&nbsp;&nbsp;&nbsp;&nbsp;[7.2.1 Path Setup](#7.2.1-Path-Setup)<br>
&nbsp;&nbsp;&nbsp;&nbsp;[7.2.2 Specification Files](#7.2.2-Specification-Files)<br>
**[7.3 General NER Inference](#7.3-General-NER-Inference)<br>**
&nbsp;&nbsp;&nbsp;&nbsp;[7.3.1 NER Inference with a GMB Context](#7.3.1-NER-Inference-with-a-GMB-Context)<br>
&nbsp;&nbsp;&nbsp;&nbsp;[7.3.2 Exercise: NER Inference with a Restaurant Context](#7.3.2-Exercise:-NER-Inference-with-a-Restaurant-Context)<br>
**[7.4 NER Training](#7.4-NER-Training)<br>**
&nbsp;&nbsp;&nbsp;&nbsp;[7.4.1 Restaurant Data Exploration](#7.4.1-Restaurant-Data-Exploration)<br>
&nbsp;&nbsp;&nbsp;&nbsp;[7.4.2 `train` Command](#7.4.2-train-Command)<br>
&nbsp;&nbsp;&nbsp;&nbsp;[7.4.3 Faster Training with AMP](#7.4.3-Faster-Training-with-AMP)<br>
&nbsp;&nbsp;&nbsp;&nbsp;[7.4.4 Change the Language Model](#7.4.4-Change-the-Language-Model)<br>
&nbsp;&nbsp;&nbsp;&nbsp;[7.4.5 Evaluate the Trained Model](#7.4.5-Evaluate-the-Trained-Model)<br>
**[7.5 NER Fine-Tuning](#7.5-NER-Fine-Tuning)<br>**
&nbsp;&nbsp;&nbsp;&nbsp;[7.5.1 Exercise: Evaluate the Fine-Tuned Model](#7.5.1-Exercise:-Evaluate-the-Fine-Tuned-Model)<br>
&nbsp;&nbsp;&nbsp;&nbsp;[7.5.2 Inference on the Fine-Tuned Model](#7.5.2-Inference-on-the-Fine-Tuned-Model)<br>
**[7.6 Export for Deployment](#7.6-Export-for-Deployment)<br>**
&nbsp;&nbsp;&nbsp;&nbsp;[7.6.1 Exercise: NER Model Export to ONNX](#7.6.1-Exercise:-NER-Model-Export-to-ONNX)<br>

### Notebook Dependencies
The steps in this notebook assume that you have:

1. **NGC Credentials**<br>Be sure you have added your NGC credentials as described in the [NGC Setup notebook](003_Intro_NGC_Setup.ipynb).

In [1]:
# Check running docker containers. This should be empty.
!docker ps

CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES


In [2]:
# If not empty, clear Docker containers.
# When no containers are running, `docker kill` prints a usage error that can be ignored.
!docker kill $(docker ps -q)
# Check for clean environment - this should be empty
!docker ps

"docker kill" requires at least 1 argument.
See 'docker kill --help'.

Usage:  docker kill [OPTIONS] CONTAINER [CONTAINER...]

Kill one or more running containers
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES


---
# 7.1 Named Entity Recognition

NER, also referred to as entity chunking, identification, token classification, or extraction, is the task of detecting and classifying key information (entities) in text.  In the general example you used for the Riva Contact app, the entities classified were person, location, organization, time, and miscellaneous. 
For example, in a sentence: `Mary lives in Santa Clara and works at NVIDIA`, we should detect that `Mary` is a person, `Santa Clara` is a location and `NVIDIA` is an organization.

Using TAO, we can train a new model to recognize different entities, such as cuisine, dish, hours, or restaurant_name, for a new domain context.

Using `tao info`, review the tasks TAO can perform.  In our ASR examples we used the `speech_to_text` task.  For NER, we will use the `token_classification` task.

In [3]:
# Check the token_classification capability in your TAO version
!tao info --verbose

Configuration of the TAO Toolkit Instance

dockers: 		
	nvidia/tao/tao-toolkit-tf: 			
		docker_registry: nvcr.io
		docker_tag: v3.21.08-py3
		tasks: 
			1. augment
			2. bpnet
			3. classification
			4. detectnet_v2
			5. dssd
			6. emotionnet
			7. faster_rcnn
			8. fpenet
			9. gazenet
			10. gesturenet
			11. heartratenet
			12. lprnet
			13. mask_rcnn
			14. multitask_classification
			15. retinanet
			16. ssd
			17. unet
			18. yolo_v3
			19. yolo_v4
			20. converter
	nvidia/tao/tao-toolkit-pyt: 			
		docker_registry: nvcr.io
		docker_tag: v3.21.08-py3
		tasks: 
			1. speech_to_text
			2. speech_to_text_citrinet
			3. text_classification
			4. question_answering
			5. token_classification
			6. intent_slot_classification
			7. punctuation_and_capitalization
	nvidia/tao/tao-toolkit-lm: 			
		docker_registry: nvcr.io
		docker_tag: v3.21.08-py3
		tasks: 
			1. n_gram
format_version: 1.0
toolkit_version: 3.21.08
published_date: 08/17/2021


---
# 7.2 TAO Toolkit `token_classification` Task

The [token_classification](https://docs.nvidia.com/tao/tao-toolkit/text/nlp/token_classification.html) task provides commands to run data preprocessing, training, fine-tuning, evaluation, inference, and export. All configurations happen through YAML spec files. The `tao token_classification --help` usage information output is as follows:

```
usage: token_classification [-h] -r RESULTS_DIR [-k KEY]
                            [-e EXPERIMENT_SPEC_FILE] [-g GPUS]
                            [-m RESUME_MODEL_WEIGHTS] [-o OUTPUT_SPECS_DIR]
                            {dataset_convert,evaluate,export,finetune,infer,infer_onnx,train,download_specs}

Train Adapt Optimize Toolkit

positional arguments:
  {dataset_convert,evaluate,export,finetune,infer,infer_onnx,train,download_specs}
                        Subtask for a given task/model.

optional arguments:
  -h, --help            show this help message and exit
  -r RESULTS_DIR, --results_dir RESULTS_DIR
                        Path to a folder where the experiment outputs should
                        be written. (DEFAULT: ./)
  -k KEY, --key KEY     User specific encoding key to save or load a .tlt
                        model.
  -e EXPERIMENT_SPEC_FILE, --experiment_spec_file EXPERIMENT_SPEC_FILE
                        Path to the experiment spec file.
  -g GPUS, --gpus GPUS  Number of GPUs to use. The default value is 1.
  -m RESUME_MODEL_WEIGHTS, --resume_model_weights RESUME_MODEL_WEIGHTS
                        Path to a pre-trained model or model to continue
                        training.
  -o OUTPUT_SPECS_DIR, --output_specs_dir OUTPUT_SPECS_DIR
                        Path to a target folder where experiment spec files
                        will be downloaded.
```                        

This should look pretty familiar as it is almost identical in form to the `speech_to_text` task!  As before, additional arguments can be added to the end of the command to override values in the spec file.
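To make the override mechanism concrete, here is a minimal sketch of how a dotted `key=value` argument can be mapped onto a nested configuration. It is a plain-Python illustration, not TAO's actual implementation. The helper name `apply_override` is hypothetical, and the sample values are a small stand-in for a real spec file.

```python
def apply_override(config, override):
    """Apply one 'a.b.c=value' override to a nested dict (illustrative only)."""
    dotted_key, value = override.split("=", 1)
    *parents, leaf = dotted_key.split(".")
    node = config
    for name in parents:
        # Walk down the hierarchy, creating intermediate levels if needed.
        node = node.setdefault(name, {})
    # The value is kept as a string here; a real config system would parse its type.
    node[leaf] = value
    return config


# A tiny stand-in for a spec file (hypothetical subset of values).
spec = {
    "trainer": {"max_epochs": 5},
    "model": {"language_model": {"pretrained_model_name": "bert-base-uncased"}},
}

for arg in ["trainer.max_epochs=3", "data_dir=/workspace/mount/data/restaurant"]:
    apply_override(spec, arg)

print(spec["trainer"]["max_epochs"])
print(spec["data_dir"])
```

Keys that are not overridden keep the values from the spec file, which is why only a handful of arguments need to appear on the command line.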

## 7.2.1 Path Setup

Define some folder locations and an encryption key.

In [4]:
import os.path
from shutil import rmtree

# The source mount is our workspace on the host (this lab instance)
source_mount = "/dli/task/tao"
# The destination mount is our mapped workspace within the TAO docker container's file structure
destination_mount = "/workspace/mount"

# The following paths are set relative to the TAO docker container
# The path to the specification yaml files
SPECS_DIR=os.path.join(destination_mount, 'specs')

# The results are saved at this path by default
RESULTS_DIR=os.path.join(destination_mount, 'results')

# The data are located at this path by default
DATA_DIR=os.path.join(destination_mount, 'data')

# The models are located at this path by default
MODELS_DIR=os.path.join(destination_mount, 'models')

# Set your encryption key, and use the same key for all commands. Please use "tlt_encode" if you'd like to deploy the models later with NVIDIA Riva.
KEY='tlt_encode'
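Because the same files are addressed by two different paths (one on the host, one inside the container), it can help to see the mapping spelled out. The sketch below is only an illustration: the helper `host_to_container` is hypothetical and simply re-roots a host path under the destination mount, using the two mount points defined above as defaults.

```python
import posixpath


def host_to_container(host_path,
                      source_mount="/dli/task/tao",
                      destination_mount="/workspace/mount"):
    """Translate a host path under source_mount to its path inside the container."""
    relative = posixpath.relpath(host_path, source_mount)
    if relative == ".." or relative.startswith("../"):
        # The path lies outside the mounted workspace, so the container cannot see it.
        raise ValueError(f"{host_path} is not under {source_mount}")
    return posixpath.normpath(posixpath.join(destination_mount, relative))


print(host_to_container("/dli/task/tao/specs/token_classification/infer.yaml"))
```

In this notebook, shell commands that run on the host (such as `cat` or `ls`) use paths under `source_mount`, while arguments passed to `tao` use paths under `destination_mount`.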

## 7.2.2 Specification Files
Fetch the example specification YAML files for the `token_classification` task. We can load example files with the [download_specs subtask](https://docs.nvidia.com/tao/tao-toolkit/text/nlp/token_classification.html#downloading-sample-spec-files), then modify them or override them later:

In [5]:
%%time
# The first time, TAO takes about 3 minutes to load and run

# Delete the token_classification specification directory if it already exists
folder = source_mount + '/specs/token_classification'
if os.path.exists(folder):
    rmtree(folder)
    
# Download specification files for token_classification 
!tao token_classification download_specs \
    -o $SPECS_DIR/token_classification \
    -r $RESULTS_DIR

2022-04-27 07:05:30,235 [INFO] root: Registry: ['nvcr.io']
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/root/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
[NeMo W 2022-04-27 07:05:34 experimental:27] Module <class 'nemo.collections.nlp.modules.common.megatron.megatron_bert.MegatronBertEncoder'> is experimental, not ready for production and is not fully supported. Use at your own risk.
INFO: Generating new fontManager, this may take some time...
[NeMo I 2022-04-27 07:05:36 tlt_logging:20] Experiment configuration:
    exp_manager:
      task_name: download_specs
      explicit_log_dir: /workspace/mount/results
    source_data_dir: /opt/conda/lib/python3.8/site-packages/nlp/token_classification/experiment_specs
    target_data_dir: /workspace/mount/specs/token_classification
    workflow: nlp
   

---
# 7.3 General NER Inference

For our general model, we will use the [Named Entity Recognition Bert Model](https://ngc.nvidia.com/catalog/models/nvidia:tlt-riva:namedentityrecognition_english_bert), a TAO compatible NER pretrained model available on NGC.

The model was trained on the [Groningen Meaning Bank (GMB) corpus](https://gmb.let.rug.nl/) for entity recognition. The GMB dataset is a fairly large corpus with annotations. Note that GMB is not completely human-annotated and it is not considered 100% correct.  The following entity classes appear in the dataset:
```
LOC = Geographical Entity
ORG = Organization
PER = Person
GPE = Geopolitical Entity
TIME = Time indicator
ART = Artifact           --|
EVE = Event              --|-- combined as MISC
NAT = Natural Phenomenon --|

```
For this model, the classes ART, EVE, and NAT were combined into a MISC class due to the small number of examples for these classes. 
This NER classifier achieves a macro F1 score of 74.21 on the GMB dataset. The macro score computes the F1 score for each label and averages them without taking label imbalance into account:

\begin{array}{rcl} \text{Macro F1-score} & = & \frac{1}{N} \sum_{i=1}^{N} {\text{F1-score}_i} \\ \end{array}

where $N$ is the number of labels and $i$ is the label index.
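As a small worked example of the macro average, the sketch below computes a per-label F1 score from precision and recall and then takes the unweighted mean. The precision and recall values are hypothetical and chosen only for illustration; they are not results from the GMB model.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(per_label_f1):
    """Unweighted mean of the per-label F1 scores."""
    return sum(per_label_f1) / len(per_label_f1)


# Hypothetical (precision, recall) pairs for three labels.
scores = {"PER": (0.90, 0.80), "LOC": (0.70, 0.70), "MISC": (0.50, 0.25)}

per_label = [f1_score(p, r) for p, r in scores.values()]
macro = macro_f1(per_label)
print(f"macro F1 = {macro:.4f}")
```

Every label contributes equally to the result, so a rare label with a poor score (here `MISC`) pulls the macro average down as much as a frequent one would.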

This model is already available in the `tao/models/` directory.

In [6]:
# check model on /tao/models
MODEL_DOWNLOAD_DIR=os.path.join(source_mount, 'models')
!ls $MODEL_DOWNLOAD_DIR/namedentityrecognition_english_bert.tlt

/dli/task/tao/models/namedentityrecognition_english_bert.tlt


## 7.3.1 NER Inference with a GMB Context

We need to use the `tao token_classification infer` command for inference.  <br> The corresponding [infer.yaml](tao/specs/token_classification/infer.yaml) file is straightforward and includes some "simulated" user input: 

```yaml
input_batch:
  - 'We bought four shirts from the Nvidia gear store in Santa Clara.'
  - 'Nvidia is a company.'
```


Try querying the general NER model. Feel free to try out custom inputs as an exercise by changing the data and running the inference command again.

In [7]:
# TAO inference NER general model with text in the general domain
!tao token_classification infer \
    -e $SPECS_DIR/token_classification/infer.yaml \
    -g 1 \
    -m $MODELS_DIR/namedentityrecognition_english_bert.tlt \
    -k $KEY \
    -r $RESULTS_DIR/bert-base/

2022-04-27 07:16:19,664 [INFO] root: Registry: ['nvcr.io']
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/root/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
[NeMo W 2022-04-27 07:16:23 experimental:27] Module <class 'nemo.collections.nlp.modules.common.megatron.megatron_bert.MegatronBertEncoder'> is experimental, not ready for production and is not fully supported. Use at your own risk.
INFO: Generating new fontManager, this may take some time...
[NeMo W 2022-04-27 07:16:26 experimental:27] Module <class 'nemo.collections.nlp.modules.common.megatron.megatron_bert.MegatronBertEncoder'> is experimental, not ready for production and is not fully supported. Use at your own risk.
[NeMo I 2022-04-27 07:16:27 tlt_logging:20] Experiment configuration:
    exp_manager:
      task_name: infer
      explici

You should find results like the following towards the end of the output.  They are also available in the results log at [tao/results/bert-base/infer.log](tao/results/bert-base/infer.log).

In [8]:
!grep Results $source_mount/results/bert-base/infer.log

[NeMo I 2022-04-27 07:16:46 infer:75] Results: Kareem[B-PER] Benzema[I-PER] confident Real[B-ORG] Madrid[I-ORG] will reach Champions League final
[NeMo I 2022-04-27 07:16:46 infer:75] Results: Vinicius[B-PER] enroute to Manchester[B-LOC] City[I-LOC] goal post.
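If you want to post-process these log lines, the tagged words can be grouped back into entities. The following is a minimal sketch that assumes the `word[B-TAG]` / `word[I-TAG]` format shown in the output above; the function name `extract_entities` and the regular expression are illustrative and not part of TAO.

```python
import re

# One tagged token looks like "Madrid[I-ORG]"; trailing punctuation such as "." may follow.
TAGGED = re.compile(r"^(.+?)\[([BI])-([A-Za-z_]+)\]\W*$")


def extract_entities(log_line):
    """Group B-/I- tagged words from one 'Results:' line into (text, label) pairs."""
    # Keep only the part after the log prefix, if the prefix is present.
    text = log_line.split("Results: ", 1)[-1]
    entities = []
    words, label = [], None
    for token in text.split():
        match = TAGGED.match(token)
        if match and match.group(2) == "I" and match.group(3) == label:
            # Continuation of the entity that is currently open.
            words.append(match.group(1))
            continue
        if words:
            # Any other token closes the open entity.
            entities.append((" ".join(words), label))
            words, label = [], None
        if match:
            # A B- tag (or a stray I- tag) starts a new entity.
            words, label = [match.group(1)], match.group(3)
    if words:
        entities.append((" ".join(words), label))
    return entities


line = ("[NeMo I 2022-04-27 07:16:46 infer:75] Results: Kareem[B-PER] Benzema[I-PER] "
        "confident Real[B-ORG] Madrid[I-ORG] will reach Champions League final")
print(extract_entities(line))
```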


## 7.3.2 Exercise: NER Inference with a Restaurant Context
Now try querying with sentences we might find in a restaurant context. Execute the following cell to populate a new YAML file,  `infer_restaurant.yaml`.  Then run NER inference as before and check the output.  What do you expect to see?  

In [9]:
%%writefile $source_mount/specs/token_classification/infer_restaurant.yaml

# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
# TAO Spec file for inference using a previously trained BERT model for a token classification task.

# "Simulate" user input: batch with six samples.
input_batch:
  - "I would like to order a pizza for 6pm"
  - "what sauce is in Pilau."
  - "mhh nice Swahili dish."
  - "any good cheap kenyan restaurants nearby"
  - "any good ice cream parlors around"
  - "any good place to get a pie at an affordable price"

Writing /dli/task/tao/specs/token_classification/infer_restaurant.yaml


Using what you've learned previously, run inference using the new `infer_restaurant.yaml` configuration file. If you get stuck, you can take a look at the [solution](solutions/ex7.3.2.ipynb).

In [10]:
# TODO infer on NER model with the infer_restaurant.yaml examples
!tao token_classification infer \
    -e $SPECS_DIR/token_classification/infer_restaurant.yaml \
    -g 1 \
    -m $MODELS_DIR/namedentityrecognition_english_bert.tlt \
    -k $KEY \
    -r $RESULTS_DIR/bert-base/

2022-04-27 07:23:33,480 [INFO] root: Registry: ['nvcr.io']
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/root/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
[NeMo W 2022-04-27 07:23:37 experimental:27] Module <class 'nemo.collections.nlp.modules.common.megatron.megatron_bert.MegatronBertEncoder'> is experimental, not ready for production and is not fully supported. Use at your own risk.
INFO: Generating new fontManager, this may take some time...
[NeMo W 2022-04-27 07:23:40 experimental:27] Module <class 'nemo.collections.nlp.modules.common.megatron.megatron_bert.MegatronBertEncoder'> is experimental, not ready for production and is not fully supported. Use at your own risk.
[NeMo I 2022-04-27 07:23:41 tlt_logging:20] Experiment configuration:
    exp_manager:
      task_name: infer
      explici

In [11]:
!grep Results $source_mount/results/bert-base/infer.log

[NeMo I 2022-04-27 07:16:46 infer:75] Results: Kareem[B-PER] Benzema[I-PER] confident Real[B-ORG] Madrid[I-ORG] will reach Champions League final
[NeMo I 2022-04-27 07:16:46 infer:75] Results: Vinicius[B-PER] enroute to Manchester[B-LOC] City[I-LOC] goal post.
[NeMo I 2022-04-27 07:23:58 infer:75] Results: I would like to order a pizza for 6pm[B-TIME]
[NeMo I 2022-04-27 07:23:58 infer:75] Results: what sauce is in Pilau[B-LOC].
[NeMo I 2022-04-27 07:23:58 infer:75] Results: mhh nice Swahili dish.
[NeMo I 2022-04-27 07:23:58 infer:75] Results: any good cheap kenyan[B-GPE] restaurants nearby
[NeMo I 2022-04-27 07:23:58 infer:75] Results: any good ice cream parlors around
[NeMo I 2022-04-27 07:23:58 infer:75] Results: any good place to get a pie at an affordable price


Good job running the inference!  Is the result useful?  

The current labels are not well suited for the restaurant context!

---
# 7.4 NER Training

To get useful information in a restaurant context, we need to train a robust classifier on a dataset that has the appropriate entities labeled. We will begin with a pretrained BERT language model to encode the text, as it already inherently understands word relationships. By default, TAO Toolkit uses the `bert-base-uncased` language model (110M parameters). Then, we will train a classifier to recognize restaurant entities.  Fortunately, we have an annotated dataset for restaurants that we can use.

## 7.4.1 Restaurant Data Exploration

The [Restaurant dataset](https://groups.csail.mit.edu/sls/downloads/restaurant) is labeled using the [IOB format](https://en.wikipedia.org/wiki/Inside%E2%80%93outside%E2%80%93beginning_(tagging)) (short for "inside", "outside", and "beginning"). The following entity classes appear in the dataset:

```
Amenity, Cuisine, Dish, Hours, Location, Price, Rating, Restaurant_Name
```

In [12]:
# set data path and explore data
DATA_DOWNLOAD_DIR = os.path.join(source_mount, 'data/restaurant')
!ls $DATA_DOWNLOAD_DIR

label_ids.csv		    labels_train.txt		  restauranttrain.bio
labels_dev.txt		    labels_train_label_stats.tsv  text_dev.txt
labels_dev_label_stats.tsv  restauranttest.bio		  text_train.txt


In [13]:
# print first test example
!head -7 $DATA_DOWNLOAD_DIR/restauranttest.bio

O	a
B-Rating	four
I-Rating	star
O	restaurant
B-Location	with
I-Location	a
B-Amenity	bar


### IOB Tagging

The files in the dataset, `restauranttrain.bio` and `restauranttest.bio`, must be converted to an IOB format that is compatible with the [TAO Token Classification module](https://docs.nvidia.com/metropolis/TAO/tao-user-guide/text/nlp/token_classification.html#data-input-for-token-classification-model). TAO Toolkit requires the input to be in two files:
-  `text.txt`: Each line of the text.txt file contains text sequences, where words are separated with spaces.
-  `labels.txt`: Each line contains corresponding labels for each word in text.txt; the labels are separated with spaces.

For the first test example printed previously, the TAO input format should be a `text.txt` file mapped to a `labels.txt` as follows:
```text
  text.txt:   a four     star     restaurant  with        a          bar
labels.txt:   O B-Rating I-Rating O           B-Location  I-Location B-Amenity
```
To generate the TAO-compatible dataset, we can use the [conversion script](https://github.com/NVIDIA/NeMo/blob/main/examples/nlp/token_classification/data/import_from_iob_format.py) from the NVIDIA NeMo toolkit.

We don't need to do that here, as the preprocessed dataset is already available in the `data/restaurant` directory for this class.
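To illustrate what such a conversion involves, the sketch below turns `.bio`-style lines into the two parallel formats. It is not the NeMo script. It assumes that each line holds a label followed by a word, separated by whitespace (as in the `head` output above), and that sentences are separated by blank lines, which is an assumption about the file layout. The function name is hypothetical.

```python
def bio_to_text_and_labels(bio_lines):
    """Convert 'LABEL<whitespace>word' lines into parallel text and label lines."""
    texts, labels = [], []
    words, tags = [], []
    for raw in bio_lines:
        line = raw.strip()
        if not line:
            # Assumed sentence boundary: flush the sentence collected so far.
            if words:
                texts.append(" ".join(words))
                labels.append(" ".join(tags))
                words, tags = [], []
            continue
        tag, word = line.split()
        tags.append(tag)
        words.append(word)
    if words:
        # Flush the last sentence if the input does not end with a blank line.
        texts.append(" ".join(words))
        labels.append(" ".join(tags))
    return texts, labels


sample = [
    "O\ta", "B-Rating\tfour", "I-Rating\tstar", "O\trestaurant",
    "B-Location\twith", "I-Location\ta", "B-Amenity\tbar",
]
texts, labels = bio_to_text_and_labels(sample)
print(texts[0])
print(labels[0])
```

Each entry in `texts` would become one line of `text.txt`, and the entry at the same index in `labels` would become the matching line of `labels.txt`.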

In [14]:
# show test text samples
!head $DATA_DOWNLOAD_DIR/text_dev.txt

a four star restaurant with a bar 
any asian cuisine around 
any bbq places open before 5 nearby 
any dancing establishments with reasonable pricing 
any good cheap german restaurants nearby 
any good ice cream parlors around 
any good place to get a pie at an affordable price 
any good vegan spots nearby 
any mexican places have a tameles special today 
any place along the road has a good beer selection that also serves ribs 


In [15]:
# show test labels samples
!head $DATA_DOWNLOAD_DIR/labels_dev.txt

O B-Rating I-Rating O B-Location I-Location B-Amenity 
O B-Cuisine O B-Location 
O B-Cuisine O B-Hours I-Hours I-Hours B-Location 
O B-Location I-Location O B-Price O 
O O B-Price B-Cuisine O B-Location 
O B-Rating B-Cuisine I-Cuisine I-Cuisine B-Location 
O B-Rating O O O O B-Dish O O B-Price O 
O O B-Cuisine O B-Location 
O B-Cuisine O O O B-Dish B-Amenity I-Amenity 
O O B-Location I-Location I-Location O O B-Rating B-Dish O O O O B-Dish 


## 7.4.2 `train` Command

To train a model using TAO, we must configure the spec file and run the `tao token_classification train` command. More details about the command can be found in the [documentation](https://docs.nvidia.com/tao/tao-toolkit/text/nlp/token_classification.html#training-a-token-classification-model), including [required arguments](https://docs.nvidia.com/tao/tao-toolkit/text/nlp/token_classification.html#required-arguments-for-training) and an example command:

```yaml
REQUIRED ARGUMENTS
-e: The experiment specification file to set up training.

-r: Path to the directory to store the results/logs. Note, the trained-model.tlt would be saved in this specified folder under a subfolder checkpoints; in our case it will be saved here: /results/token_classification/train/checkpoints/trained-model.tlt

-k: Encryption key

data_dir: Path to the data_dir with the processed data files.

model.label_ids: Path to the label_ids.csv file, usually stored at data_dir

```

```sh
# EXAMPLE COMMAND
tao token_classification train [-h] \
    -e /specs/nlp/token_classification/train.yaml \
    -r /results/token_classification/train/ \
    -g 1 \
    -k $KEY \
    data_dir=/path/to/data_dir \
    model.label_ids=/path/to/label_ids.csv \
    trainer.max_epochs=5 \
    training_ds.num_samples=-1 \
    validation_ds.num_samples=-1
```


This command relies on the [train.yaml](tao/specs/token_classification/train.yaml) specification file. Through the spec file, you can tune many knobs such as the model, dataset, hyperparameters, and optimizers.
Each `token_classification` command (`dataset_convert`, `train`, `finetune`, `evaluate`, `infer`, and so on) has a dedicated spec file with configurations pertinent to it. 

Take a look at the training spec file you downloaded earlier:

In [16]:
# This line will print the entire training config
!cat $source_mount/specs/token_classification/train.yaml

# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
# TLT Spec file for training of the BERT model on a Token Classification task:
# Named Entity Recognition on GMB dataset

trainer:
  max_epochs: 5

model:
  tokenizer:
      tokenizer_name: ${model.language_model.pretrained_model_name} # or sentencepiece
      vocab_file: null # path to vocab file
      tokenizer_model: null # only used if tokenizer is sentencepiece
      special_tokens: null

  language_model:
    pretrained_model_name: bert-base-uncased
    lm_checkpoint: null
    config_file: null # json file, precedence over config
    config: null

  head:
    num_fc_layers: 2
    fc_dropout: 0.5
    activation: 'relu'
    use_transformer_init: True

  # Path to file with label_ids, generated with dataset_convert.py.
  # Those labels are used by the model as labels (names of target classes, their number).
  label_ids: ???

# Path to directory containing both finetuning and validation data.
data_dir: ???

training_ds:
 

The code cell below uses the default `train.yaml`. It is configured to use the `bert-base-uncased` pretrained model. Additionally, these configurations can be overridden by adding the overrides to the `tao` command. Here, we override the `data_dir`, `model.label_ids`, `trainer.max_epochs`, `training_ds.num_samples`, and `validation_ds.num_samples` configurations to suit our needs. <br>

To get good results, train for a few epochs; the right number depends on the size of the dataset.

*NOTE: All file paths correspond to the destination-mount directory that is visible in the TAO docker container and used in the backend.*

In [17]:
%%time 
# TAO train NER model - this takes a few minutes
!tao token_classification train \
    -e $SPECS_DIR/token_classification/train.yaml \
    -g 1  \
    -k $KEY \
    -r $RESULTS_DIR/bert-base_ner \
    data_dir={destination_mount}/data/restaurant \
    model.label_ids={destination_mount}/data/restaurant/label_ids.csv \
    trainer.max_epochs=5 \
    training_ds.num_samples=-1 \
    validation_ds.num_samples=-1

2022-04-27 07:26:57,159 [INFO] root: Registry: ['nvcr.io']
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/root/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
[NeMo W 2022-04-27 07:27:00 experimental:27] Module <class 'nemo.collections.nlp.modules.common.megatron.megatron_bert.MegatronBertEncoder'> is experimental, not ready for production and is not fully supported. Use at your own risk.
INFO: Generating new fontManager, this may take some time...
[NeMo W 2022-04-27 07:27:04 experimental:27] Module <class 'nemo.collections.nlp.modules.common.megatron.megatron_bert.MegatronBertEncoder'> is experimental, not ready for production and is not fully supported. Use at your own risk.
[NeMo I 2022-04-27 07:27:05 tlt_logging:20] Experiment configuration:
    restore_from: ???
    exp_manager:
      explicit

The train command produces `trained-model.tlt` saved at `$RESULTS_DIR/bert-base_ner/checkpoints/trained-model.tlt`. 
This file can be fed directly into the fine-tuning stage.

## 7.4.3 Faster Training with AMP
There are a number of parameters you can change for training. For example, the batch size (`training_ds.batch_size`) may influence the validation accuracy. Larger batch sizes train faster; however, you may get slightly better results with smaller batches.

An important consideration is the [Automatic Mixed Precision (AMP)](https://developer.nvidia.com/automatic-mixed-precision) setting.  To accelerate the training without loss of quality, it is possible to train with these parameters:  `trainer.amp_level="O1"` and `trainer.precision=16` for reduced precision.
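To see why reduced precision needs some care, the sketch below rounds a few numbers to IEEE 754 half precision (FP16) using only the Python standard library. It illustrates a general property of FP16 arithmetic and the idea of scaling small values before casting them; it does not describe how TAO implements AMP internally. The helper `to_fp16` and the scale factor of 1024 are assumptions made for the demonstration.

```python
import struct


def to_fp16(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


small_gradient = 1e-8
loss_scale = 1024  # hypothetical scale factor, for illustration only

# A tiny change relative to 1.0 is rounded away in FP16.
print(to_fp16(1.0001))
# A very small value underflows to zero in FP16.
print(to_fp16(small_gradient))
# The same value survives the cast once it has been scaled up.
print(to_fp16(small_gradient * loss_scale))
```

This is the motivation for mixing precisions rather than casting everything to 16 bits: the fast FP16 path handles most of the arithmetic, while values that would be lost in FP16 are scaled or kept in higher precision.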

Experiment by training again using mixed precision:

In [None]:
%%time 
# TAO train NER model with mixed precision
!tao token_classification train \
    -e $SPECS_DIR/token_classification/train.yaml \
    -g 1  \
    -k $KEY \
    -r $RESULTS_DIR/bert-base_ner_fp16 \
    data_dir={destination_mount}/data/restaurant \
    model.label_ids={destination_mount}/data/restaurant/label_ids.csv \
    trainer.max_epochs=3 \
    trainer.amp_level="O1" \
    trainer.precision=16 \
    training_ds.num_samples=-1 \
    validation_ds.num_samples=-1

Compare the two training runs, with and without AMP, in terms of:
- Training duration
- NER model performance

Discuss your observations with the instructor.

## 7.4.4 Change the Language Model

Before training the NER classifier, the input text is encoded using a language model. TAO Toolkit supports four BERT and Megatron language models: 
- `bert-base-cased`
- `bert-base-uncased`
- `megatron-bert-345m-cased`
- `megatron-bert-345m-uncased`

By default, TAO Toolkit uses `bert-base-uncased`. This is the encoder you've used so far for training. To specify a different language model, add the `pretrained_model_name` argument to the launch command:
```python
    model.language_model.pretrained_model_name=<language-model-name>
```

In [18]:
%%time 
# TAO train NER model with Megatron. This takes a few minutes
!tao token_classification train \
    -e $SPECS_DIR/token_classification/train.yaml \
    -g 1  \
    -k $KEY \
    -r $RESULTS_DIR/megatron-base_ner5 \
    data_dir={destination_mount}/data/restaurant \
    model.label_ids={destination_mount}/data/restaurant/label_ids.csv \
    exp_manager.create_checkpoint_callback=false \
    trainer.amp_level="O1" \
    trainer.precision=16 \
    training_ds.num_samples=-1 \
    validation_ds.num_samples=-1 \
    trainer.max_epochs=3 \
    model.language_model.pretrained_model_name=megatron-bert-345m-uncased 

2022-04-27 07:34:58,113 [INFO] root: Registry: ['nvcr.io']
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/root/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
[NeMo W 2022-04-27 07:35:02 experimental:27] Module <class 'nemo.collections.nlp.modules.common.megatron.megatron_bert.MegatronBertEncoder'> is experimental, not ready for production and is not fully supported. Use at your own risk.
INFO: Generating new fontManager, this may take some time...
[NeMo W 2022-04-27 07:35:06 experimental:27] Module <class 'nemo.collections.nlp.modules.common.megatron.megatron_bert.MegatronBertEncoder'> is experimental, not ready for production and is not fully supported. Use at your own risk.
[NeMo I 2022-04-27 07:35:07 tlt_logging:20] Experiment configuration:
    restore_from: ???
    exp_manager:
      explicit

## 7.4.5 Evaluate the Trained Model

For the token classification task, several metrics are recorded for the evaluation:
- Test loss
- F1 score, precision and recall per class
- F1 score, precision and recall aggregated with micro, macro and weighted average

Check out [this article](https://towardsdatascience.com/performance-metrics-confusion-matrix-precision-recall-and-f1-score-a8fe076a2262) to learn more about the performance metrics. 
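The difference between the macro and weighted averages can be sketched in a few lines. The per-class F1 scores and support counts below are copied from the evaluation report of the run recorded later in this section; your own run will produce different numbers. The function names are illustrative.

```python
# (label, F1 in percent, support) taken from this notebook's recorded evaluation report.
REPORT = [
    ("O", 96.14, 8659), ("B-Amenity", 75.80, 533), ("B-Cuisine", 87.50, 532),
    ("B-Dish", 84.10, 288), ("B-Hours", 74.66, 212), ("B-Location", 90.31, 812),
    ("B-Price", 85.55, 171), ("B-Rating", 82.38, 201), ("B-Restaurant_Name", 93.70, 402),
    ("I-Amenity", 78.37, 524), ("I-Cuisine", 74.54, 135), ("I-Dish", 77.90, 121),
    ("I-Hours", 87.99, 295), ("I-Location", 90.32, 788), ("I-Price", 76.52, 66),
    ("I-Rating", 82.79, 125), ("I-Restaurant_Name", 91.83, 392),
]


def macro_average(rows):
    """Every class counts equally, regardless of how often it occurs."""
    return sum(f1 for _, f1, _ in rows) / len(rows)


def weighted_average(rows):
    """Each class is weighted by its support (number of true tokens)."""
    total = sum(support for _, _, support in rows)
    return sum(f1 * support for _, f1, support in rows) / total


print(f"macro F1:    {macro_average(REPORT):.2f}")
print(f"weighted F1: {weighted_average(REPORT):.2f}")
```

The weighted average is dominated by the very frequent `O` label, so it comes out higher than the macro average, which is more sensitive to the weaker, rarer entity classes.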

The evaluation spec YAML is as simple as:

In [19]:
# Print the evaluation spec file 
!cat $source_mount/specs/token_classification/evaluate.yaml

# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
# TLT Spec file for evaluation of a Token Classification model

# Name of the .tlt from which the model will be loaded.
restore_from: trained-model.tlt

data_dir: ???

# Test settings: dataset.
test_ds:
  text_file: text_dev.txt
  labels_file: labels_dev.txt
  batch_size: 1
  shuffle: false
  num_samples: -1 # number of samples to be considered, -1 means the whole the dataset


Note that the `data_dir` is not defined, which is an indication that we should override it in the command.  To evaluate the model, we use `tao token_classification evaluate` and override `data_dir`. Other arguments follow the same pattern as before.

In [20]:
# TAO evaluate NER model
!tao token_classification evaluate  \
   -e $SPECS_DIR/token_classification/evaluate.yaml \
   -r $RESULTS_DIR/bert-base_ner/evaluate \
   -g 1 \
   -m $RESULTS_DIR/bert-base_ner/checkpoints/trained-model.tlt \
   -k $KEY \
   data_dir={destination_mount}/data/restaurant

2022-04-27 07:42:50,551 [INFO] root: Registry: ['nvcr.io']
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/root/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
[NeMo W 2022-04-27 07:42:54 experimental:27] Module <class 'nemo.collections.nlp.modules.common.megatron.megatron_bert.MegatronBertEncoder'> is experimental, not ready for production and is not fully supported. Use at your own risk.
INFO: Generating new fontManager, this may take some time...
[NeMo W 2022-04-27 07:42:57 experimental:27] Module <class 'nemo.collections.nlp.modules.common.megatron.megatron_bert.MegatronBertEncoder'> is experimental, not ready for production and is not fully supported. Use at your own risk.
[NeMo I 2022-04-27 07:42:58 tlt_logging:20] Experiment configuration:
    restore_from: /workspace/mount/results/bert-base_

You can observe the F1 score, precision and recall metrics per class:

```
    label                                                precision    recall       f1           support   
    O (label_id: 0)                                         96.67      95.61      96.14       8659
    B-Amenity (label_id: 1)                                 74.37      77.30      75.80        533
    B-Cuisine (label_id: 2)                                 89.57      85.53      87.50        532
    B-Dish (label_id: 3)                                    82.83      85.42      84.10        288
    B-Hours (label_id: 4)                                   71.74      77.83      74.66        212
    B-Location (label_id: 5)                                90.48      90.15      90.31        812
    B-Price (label_id: 6)                                   82.97      88.30      85.55        171
    B-Rating (label_id: 7)                                  79.00      86.07      82.38        201
    B-Restaurant_Name (label_id: 8)                         94.90      92.54      93.70        402
    I-Amenity (label_id: 9)                                 75.99      80.92      78.37        524
    I-Cuisine (label_id: 10)                                74.26      74.81      74.54        135
    I-Dish (label_id: 11)                                   71.23      85.95      77.90        121
    I-Hours (label_id: 12)                                  84.42      91.86      87.99        295
    I-Location (label_id: 13)                               90.04      90.61      90.32        788
    I-Price (label_id: 14)                                  89.80      66.67      76.52         66
    I-Rating (label_id: 15)                                 84.87      80.80      82.79        125
    I-Restaurant_Name (label_id: 16)                        93.40      90.31      91.83        392
    -------------------
    micro avg                                               91.88      91.88      91.88      14256
    macro avg                                               83.91      84.75      84.14      14256
    weighted avg                                            92.07      91.88      91.94      14256
```

You can also observe the test loss (`test_loss`) and the aggregated F1 score, precision, and recall on the entire test set. These aggregated values correspond to the `macro avg` row of the per-class report:
```
DATALOADER:0 TEST RESULTS
{'f1': tensor(84.1423, device='cuda:0'),
 'precision': tensor(83.9138, device='cuda:0'),
 'recall': tensor(84.7453, device='cuda:0'),
 'test_loss': tensor(0.2424, device='cuda:0')}
```
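
As a quick sanity check on how the averages are formed, the sketch below recomputes the macro (unweighted) and support-weighted F1 from the rounded per-class values in the report. Because the inputs are rounded to two decimals, the results can differ from the reported numbers in the last digit.

```python
# (f1, support) per label, copied from the per-class report above
per_class = {
    "O": (96.14, 8659),
    "B-Amenity": (75.80, 533),
    "B-Cuisine": (87.50, 532),
    "B-Dish": (84.10, 288),
    "B-Hours": (74.66, 212),
    "B-Location": (90.31, 812),
    "B-Price": (85.55, 171),
    "B-Rating": (82.38, 201),
    "B-Restaurant_Name": (93.70, 402),
    "I-Amenity": (78.37, 524),
    "I-Cuisine": (74.54, 135),
    "I-Dish": (77.90, 121),
    "I-Hours": (87.99, 295),
    "I-Location": (90.32, 788),
    "I-Price": (76.52, 66),
    "I-Rating": (82.79, 125),
    "I-Restaurant_Name": (91.83, 392),
}

total_support = sum(support for _, support in per_class.values())

# Macro average: every label counts equally, regardless of how often it occurs
macro_f1 = sum(f1 for f1, _ in per_class.values()) / len(per_class)

# Weighted average: each label contributes in proportion to its support
weighted_f1 = sum(f1 * support for f1, support in per_class.values()) / total_support

print(f"total support: {total_support}")
print(f"macro F1:      {macro_f1:.2f}")
print(f"weighted F1:   {weighted_f1:.2f}")
```

The gap between the two averages comes mostly from the dominant `O` label, which has a high F1 and by far the largest support.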

## 7.5 NER Fine-Tuning

The TAO Toolkit command for fine-tuning is very similar to the training command. Instead of `tao token_classification train`, use `tao token_classification finetune`. This command generates a fine-tuned model, `finetuned-model.tlt`, in `$RESULTS_DIR/bert-base-finetuned_ner/checkpoints`.

Fine-tuning starts from the weights of the previously trained model instead of random weights for the token classification model. Specify that model's checkpoint with the `-m` argument and the fine-tuning spec file with the `-e` argument.

Token classification fine-tuning in TAO allows you to:
- Fine-tune the token classifier on additional data
- Fine-tune on a subset of labels by removing or merging entities in the dataset

Because the "Cuisine" and "Dish" labels are semantically very close in this context, this demonstration merges them into a single entity, "Dish". The merged data is provided in the directory `tao/data/restaurant_finetune`.
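
The preprocessing that produced the merged dataset is not shown in this notebook. As a sketch of how such a merge could be done, assuming each line of a labels file holds space-separated IOB tags (one tag per word), the hypothetical helper `merge_entity` below renames one entity type while keeping the `B-`/`I-` prefix.

```python
def merge_entity(label_line, source="Cuisine", target="Dish"):
    """Rename one entity type in a line of space-separated IOB tags."""
    merged = []
    for tag in label_line.split():
        # "B-Cuisine" -> ("B", "-", "Cuisine"); "O" has no separator and is kept as-is
        prefix, sep, entity = tag.partition("-")
        if sep and entity == source:
            tag = f"{prefix}-{target}"
        merged.append(tag)
    return " ".join(merged)


print(merge_entity("O O B-Cuisine I-Cuisine O B-Dish"))
# -> O O B-Dish I-Dish O B-Dish
```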

In [21]:
# We should not find any "Cuisine" labels as they have been renamed "Dish"
!grep Cuisine $source_mount/data/restaurant_finetune/labels_dev.txt |wc -l
!grep Dish $source_mount/data/restaurant_finetune/labels_dev.txt |wc -l

0
772
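
Note that the counts above are numbers of lines containing a match, not numbers of individual tags. For a per-tag count, a small sketch along the following lines could be used. It assumes the same space-separated label format, and `count_tags` is a hypothetical helper, not part of TAO.

```python
from collections import Counter


def count_tags(label_lines):
    """Count every IOB tag across an iterable of space-separated label lines."""
    counts = Counter()
    for line in label_lines:
        counts.update(line.split())
    return counts


# Tiny in-memory sample standing in for a labels file
sample = ["O B-Dish I-Dish O", "B-Dish O B-Location"]
counts = count_tags(sample)
print(counts["B-Dish"], counts["I-Dish"], counts["B-Cuisine"])
# -> 2 1 0
```

To run it on the real data, pass an open file handle for `labels_dev.txt` instead of the sample list.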


In [22]:
# Print the fine-tuning spec file 
!cat $source_mount/specs/token_classification/finetune.yaml

# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
# TLT Spec file for finetuning of the pre-trained TokenClassification model


data_dir: ???

# Fine-tuning settings: training dataset.
finetuning_ds:
  num_samples: -1 # number of samples to be considered, -1 means all the dataset

# Fine-tuning settings: validation dataset.
validation_ds:
  num_samples: -1 # number of samples to be considered, -1 means all the dataset

# Fine-tuning settings: different optimizer.
optim:
  name: adam
  lr: 2e-5

trainer:
  max_epochs: 3

In [23]:
# TAO NER model fine-tuning; command-line overrides such as trainer.max_epochs=2 take precedence over finetune.yaml
!tao token_classification finetune \
   -e $SPECS_DIR/token_classification/finetune.yaml \
   -r $RESULTS_DIR/bert-base-finetuned_ner/ \
   -m $RESULTS_DIR/bert-base_ner/checkpoints/trained-model.tlt \
   -g 1 \
   data_dir={destination_mount}/data/restaurant_finetune \
   trainer.max_epochs=2 \
   trainer.amp_level="O1" \
   trainer.precision=16 \
   -k $KEY

2022-04-27 08:11:41,809 [INFO] root: Registry: ['nvcr.io']
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/root/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
[NeMo W 2022-04-27 08:11:45 experimental:27] Module <class 'nemo.collections.nlp.modules.common.megatron.megatron_bert.MegatronBertEncoder'> is experimental, not ready for production and is not fully supported. Use at your own risk.
INFO: Generating new fontManager, this may take some time...
[NeMo W 2022-04-27 08:11:49 experimental:27] Module <class 'nemo.collections.nlp.modules.common.megatron.megatron_bert.MegatronBertEncoder'> is experimental, not ready for production and is not fully supported. Use at your own risk.
[NeMo I 2022-04-27 08:11:50 tlt_logging:20] Experiment configuration:
    restore_from: /workspace/mount/results/bert-base_

## 7.5.1 Exercise: Evaluate the Fine-Tuned Model

Based on what you've learned, evaluate the performance of the fine-tuned NER model. If you get stuck, you can look at the [solution](solutions/ex7.5.1.ipynb).  If you are unsure of the location of the fine-tuned model, check the outputs from the fine-tuning or the logs.

In [24]:
# TODO evaluate the fine-tuned model
!tao token_classification evaluate  \
   -e $SPECS_DIR/token_classification/evaluate.yaml \
   -r $RESULTS_DIR/bert-base-finetuned_ner/evaluate \
   -g 1 \
   -m $RESULTS_DIR/bert-base-finetuned_ner/checkpoints/finetuned-model.tlt \
   -k $KEY \
   data_dir={destination_mount}/data/restaurant_finetune

2022-04-27 08:18:18,441 [INFO] root: Registry: ['nvcr.io']
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/root/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
[NeMo W 2022-04-27 08:18:22 experimental:27] Module <class 'nemo.collections.nlp.modules.common.megatron.megatron_bert.MegatronBertEncoder'> is experimental, not ready for production and is not fully supported. Use at your own risk.
INFO: Generating new fontManager, this may take some time...
[NeMo W 2022-04-27 08:18:26 experimental:27] Module <class 'nemo.collections.nlp.modules.common.megatron.megatron_bert.MegatronBertEncoder'> is experimental, not ready for production and is not fully supported. Use at your own risk.
[NeMo I 2022-04-27 08:18:27 tlt_logging:20] Experiment configuration:
    restore_from: /workspace/mount/results/bert-base-

Based on the evaluation results, you can either continue fine-tuning the model for more epochs, or move on to inference.

## 7.5.2 Inference on the Fine-Tuned Model

Try running inference with the fine-tuned NER model on a few sentences from the restaurant context.

In [25]:
!cat /dli/task/tao/specs/token_classification/infer_restaurant.yaml


# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
# TAO Spec file for inference using a previously pretrained BERT model for a text classification task.

# "Simulate" user input: batch with four samples.
input_batch:
  - "I would like to order a pizza for 6pm"
  - "what sauce is in Pilau."
  - "mhh nice Swahili dish."
  - "any good cheap kenyan restaurants nearby"
  - "any good ice cream parlors around"
  - "any good place to get a pie at an affordable price"


Inference can be done using the command `tao token_classification infer` as follows:

In [26]:
# TAO fine-tuned NER inference 
!tao token_classification infer \
    -e $SPECS_DIR/token_classification/infer_restaurant.yaml \
    -g 1 \
    -m $RESULTS_DIR/bert-base-finetuned_ner/checkpoints/finetuned-model.tlt \
    -k $KEY \
    -r $RESULTS_DIR/bert-base-finetuned_ner/

2022-04-27 08:20:16,124 [INFO] root: Registry: ['nvcr.io']
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/root/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
[NeMo W 2022-04-27 08:20:19 experimental:27] Module <class 'nemo.collections.nlp.modules.common.megatron.megatron_bert.MegatronBertEncoder'> is experimental, not ready for production and is not fully supported. Use at your own risk.
INFO: Generating new fontManager, this may take some time...
[NeMo W 2022-04-27 08:20:23 experimental:27] Module <class 'nemo.collections.nlp.modules.common.megatron.megatron_bert.MegatronBertEncoder'> is experimental, not ready for production and is not fully supported. Use at your own risk.
[NeMo I 2022-04-27 08:20:24 tlt_logging:20] Experiment configuration:
    exp_manager:
      task_name: infer
      explici

In [27]:
!grep Results $source_mount/results/bert-base-finetuned_ner/infer.log

[NeMo I 2022-04-27 08:20:42 infer:75] Results: I would like to order a pizza[B-Dish] for 6pm[B-Hours]
[NeMo I 2022-04-27 08:20:42 infer:75] Results: what sauce[B-Dish] is in Pilau[B-Dish].
[NeMo I 2022-04-27 08:20:42 infer:75] Results: mhh nice[B-Rating] Swahili[B-Dish] dish[I-Dish].
[NeMo I 2022-04-27 08:20:42 infer:75] Results: any good[B-Rating] cheap[I-Price] kenyan[B-Dish] restaurants nearby[B-Location]
[NeMo I 2022-04-27 08:20:42 infer:75] Results: any good[B-Rating] ice[B-Dish] cream[I-Dish] parlors[I-Dish] around[B-Location]
[NeMo I 2022-04-27 08:20:42 infer:75] Results: any good[B-Rating] place to get a pie[B-Dish] at an affordable[B-Price] price


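This model should be able to recognize several useful entities within the restaurant context. The predictions are not perfect: in the output above, for example, `cheap` is tagged `I-Price` without a preceding `B-Price`.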
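
To use these predictions downstream, the tagged words usually need to be grouped into entities. The sketch below is one possible way to do that, assuming the `word[TAG]` format seen in the `Results:` lines above. The helper `extract_entities` is hypothetical and not part of TAO. An `I-` tag that does not continue an entity of the same type is treated as the start of a new entity.

```python
import re

# Matches one annotated word such as "pizza[B-Dish]" (trailing punctuation is ignored)
TAGGED = re.compile(r"(\S+?)\[([BI])-([A-Za-z_]+)\]")


def extract_entities(result_line):
    """Group B-/I- tagged words from one 'Results:' line into (label, text) pairs."""
    text = result_line.split("Results:", 1)[-1]
    entities = []
    current_label = None
    for token in text.split():
        match = TAGGED.match(token)
        if match is None:
            current_label = None  # an untagged word closes any open entity
            continue
        word, prefix, label = match.groups()
        if prefix == "I" and label == current_label and entities:
            # Continuation of the previous entity of the same type
            entities[-1] = (label, entities[-1][1] + " " + word)
        else:
            entities.append((label, word))
        current_label = label
    return entities


line = "[NeMo I 2022-04-27 08:20:42 infer:75] Results: mhh nice[B-Rating] Swahili[B-Dish] dish[I-Dish]."
print(extract_entities(line))
# -> [('Rating', 'nice'), ('Dish', 'Swahili dish')]
```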

---
## 7.6 Export for Deployment
With TAO, we can export the fine-tuned model in a format that can be deployed using NVIDIA Riva.

In [28]:
# TAO export to Riva
!tao token_classification export \
     -e $SPECS_DIR/token_classification/export.yaml \
     -r $RESULTS_DIR/export/ \
     -m $RESULTS_DIR/bert-base-finetuned_ner/checkpoints/finetuned-model.tlt \
     -k $KEY \
     export_to=exported-model-NER.riva \
     export_format=RIVA

2022-04-27 08:22:11,873 [INFO] root: Registry: ['nvcr.io']
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/root/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
[NeMo W 2022-04-27 08:22:15 experimental:27] Module <class 'nemo.collections.nlp.modules.common.megatron.megatron_bert.MegatronBertEncoder'> is experimental, not ready for production and is not fully supported. Use at your own risk.
INFO: Generating new fontManager, this may take some time...
[NeMo W 2022-04-27 08:22:19 experimental:27] Module <class 'nemo.collections.nlp.modules.common.megatron.megatron_bert.MegatronBertEncoder'> is experimental, not ready for production and is not fully supported. Use at your own risk.
[NeMo I 2022-04-27 08:22:19 tlt_logging:20] Experiment configuration:
    restore_from: /workspace/mount/results/bert-base-

Verify that your model was exported as expected.

In [29]:
!ls /dli/task/tao/results/export/exported-model-NER.riva

/dli/task/tao/results/export/exported-model-NER.riva


## 7.6.1 Exercise: NER Model Export to ONNX

Using what you've learned, export the fine-tuned NER model to ONNX format.  Name the final model `exported-model-NER.eonnx`.  If you get stuck, you can look at the [solution](solutions/ex7.6.1.ipynb).


In [30]:
# TODO export the fine-tuned model to "exported-model-NER.eonnx"
!tao token_classification export \
     -e $SPECS_DIR/token_classification/export.yaml \
     -r $RESULTS_DIR/export/ \
     -m $RESULTS_DIR/bert-base-finetuned_ner/checkpoints/finetuned-model.tlt \
     -k $KEY \
     export_to=exported-model-NER.eonnx \
     export_format=ONNX

2022-04-27 08:24:15,648 [INFO] root: Registry: ['nvcr.io']
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/root/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
[NeMo W 2022-04-27 08:24:19 experimental:27] Module <class 'nemo.collections.nlp.modules.common.megatron.megatron_bert.MegatronBertEncoder'> is experimental, not ready for production and is not fully supported. Use at your own risk.
INFO: Generating new fontManager, this may take some time...
[NeMo W 2022-04-27 08:24:22 experimental:27] Module <class 'nemo.collections.nlp.modules.common.megatron.megatron_bert.MegatronBertEncoder'> is experimental, not ready for production and is not fully supported. Use at your own risk.
[NeMo I 2022-04-27 08:24:23 tlt_logging:20] Experiment configuration:
    restore_from: /workspace/mount/results/bert-base-

In [31]:
import os

# Check that the ONNX export from the previous cell exists
if os.path.exists('/dli/task/tao/results/export/exported-model-NER.eonnx'):
    print("You did it!")
else:
    print("Sorry, the model isn't there.")

You did it!


---
<h2 style="color:green;">Congratulations!</h2>

In this notebook, you have:
- Gained an understanding of IOB formatting for NER datasets
- Trained and fine-tuned an NER model with TAO Toolkit
- Launched TAO with an implicit Docker container to run NER inference on text samples
- Exported the model to both ONNX and RIVA formats

Next, you'll deploy the model on NVIDIA Riva. Move on to [NLP Deployment with Riva](008_NLP_Deploy_NER.ipynb).

