<a href="https://www.nvidia.com/dli"> <img src="images/DLI_Header.png" alt="Header" style="width: 400px;"/> </a>

# 2.0 Build a Text Classifier
### (NVIDIA NeMo v1.0)

<img style="float: right;" src="images/nemo/nemo-app-stack.png" width=400>

In this notebook, you'll build an application to classify medical disease abstracts into one of three categories: cancer diseases, neurological diseases and disorders, and "other" for anything else.
You'll use [NVIDIA NeMo](https://developer.nvidia.com/nvidia-nemo) (Neural Modules) to quickly set up the problem from the command line. 

**[2.1 NeMo Overview](#2.1-NeMo-Overview)**<br>
&nbsp;&nbsp;&nbsp;&nbsp;[2.1.1 NeMo Models](#2.1.1-NeMo-Models)<br>
**[2.2 Text Classification from the Command Line](#2.2-Text-Classification-from-the-Command-Line)**<br>
&nbsp;&nbsp;&nbsp;&nbsp;[2.2.1 Prepare the Data](#2.2.1-Prepare-the-Data)<br>
&nbsp;&nbsp;&nbsp;&nbsp;[2.2.2 Configuration File](#2.2.2-Configuration-File)<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[2.2.2.1 OmegaConf Tool](#2.2.2.1-OmegaConf-Tool)<br>
&nbsp;&nbsp;&nbsp;&nbsp;[2.2.3 Hydra-Enabled Python Script](#2.2.3-Hydra-Enabled-Python-Script)<br>
&nbsp;&nbsp;&nbsp;&nbsp;[2.2.4 Exercise: Run an Experiment](#2.2.4-Exercise:-Run-an-Experiment)<br>
&nbsp;&nbsp;&nbsp;&nbsp;[2.2.5 Visualize the Results with TensorBoard](#2.2.5-Visualize-the-Results-with-TensorBoard)<br>
&nbsp;&nbsp;&nbsp;&nbsp;[2.2.6 Exercise: Change the Language Model (Optional!)](#2.2.6-Exercise:-Change-the-Language-Model-(Optional!))<br>
**[2.3 PyTorch Lightning Model and Trainer Workflow](#2.3-PyTorch-Lightning-Model-and-Trainer-Workflow)**<br>
&nbsp;&nbsp;&nbsp;&nbsp;[2.3.1 Script Key Features](#2.3.1-Script-Key-Features)<br>
&nbsp;&nbsp;&nbsp;&nbsp;[2.3.2 Model Training from Scratch](#2.3.2-Model-Training-from-Scratch)<br>
&nbsp;&nbsp;&nbsp;&nbsp;[2.3.3 Exercise: Query the Model](#2.3.3-Exercise:-Query-the-Model)<br>


---
# 2.1 NeMo Overview
NeMo is an open source toolkit for building conversational AI applications. NeMo is built around [Neural Modules](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/core/core.html#neural-modules), conceptual blocks of neural networks that take typed inputs and produce typed outputs. Such modules typically represent data layers, encoders, decoders, language models, loss functions, or methods of combining activations. NeMo makes it easy to combine and re-use these building blocks while providing a level of semantic correctness checking via its neural type system.

The NeMo deep learning framework is based on [PyTorch Lightning](https://github.com/PyTorchLightning/pytorch-lightning), a PyTorch wrapper that organizes PyTorch code for neural network training.  PyTorch Lightning provides easy, high-performance options for multi-GPU, multi-node, and mixed-precision training. Creating a deep neural network project, or **experiment**, with PyTorch Lightning requires two main components:
1. [LightningModule](https://pytorch-lightning.readthedocs.io/en/stable/common/lightning_module.html)
2. [Trainer](https://pytorch-lightning.readthedocs.io/en/stable/common/trainer.html)

The _LightningModule_ is used to organize PyTorch code into computation, optimizers, and loops for training, validation, and test.  This abstraction makes deep learning experiments easier to understand and reproduce. 

The _Trainer_ is then able to take the LightningModule and automate everything needed for deep learning training.
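
To make this division of labor concrete, here is a framework-free toy sketch. `ToyModule` and `ToyTrainer` are hypothetical names invented for illustration and are not the PyTorch Lightning API; they only mimic the idea that the module declares the computation and the update rule, while the trainer owns the loop.

```python
# Toy illustration only: these classes are NOT PyTorch Lightning classes.
class ToyModule:
    """Holds the model parameter, the loss computation, and the update rule."""

    def __init__(self, lr=0.1):
        self.w = 0.0   # single weight for the model y = w * x
        self.lr = lr

    def training_step(self, batch):
        # Return the mean squared error and its gradient with respect to w.
        n = len(batch)
        loss = sum((self.w * x - y) ** 2 for x, y in batch) / n
        grad = sum(2 * (self.w * x - y) * x for x, y in batch) / n
        return loss, grad

    def optimizer_step(self, grad):
        # Plain gradient descent.
        self.w -= self.lr * grad


class ToyTrainer:
    """Owns the training loop so the module does not have to."""

    def __init__(self, max_epochs):
        self.max_epochs = max_epochs

    def fit(self, module, batches):
        history = []
        for _ in range(self.max_epochs):
            for batch in batches:
                loss, grad = module.training_step(batch)
                module.optimizer_step(grad)
                history.append(loss)
        return history


# Data generated from y = 2 * x, split into two batches.
batches = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0), (0.5, 1.0)]]
toy_module = ToyModule(lr=0.1)
losses = ToyTrainer(max_epochs=50).fit(toy_module, batches)
print(f"learned w = {toy_module.w:.3f}, final loss = {losses[-1]:.6f}")
```

The real LightningModule and Trainer follow the same split, with the Trainer additionally handling devices, precision, checkpointing, and logging.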

## 2.1.1 NeMo Models

[NeMo models](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/core/core.html) are LightningModules that come equipped with all supporting infrastructure for training and reproducibility. This includes the deep learning model architecture, data preprocessing, optimizer, checkpoints, and experiment logging. NeMo models, like LightningModules, are also PyTorch modules and are fully compatible with the broader PyTorch ecosystem. Any NeMo model can be taken and plugged into any PyTorch workflow.  

**Every NeMo model has an example configuration file and training script that can be found in the [NVIDIA NeMo GitHub Repo](https://github.com/NVIDIA/NeMo/tree/main/examples).**

For this class, we'll use a local repo copy included in this environment, based on the downloadable [NGC NeMo container](https://ngc.nvidia.com/catalog/containers/nvidia:nemo), and focus on NLP models.  Execute the following cell to see a tree of NeMo models in the `examples/nlp` directory.

In [1]:
!tree nemo/examples/nlp -L 2

nemo/examples/nlp
├── dialogue_state_tracking
│   ├── conf
│   └── sgd_qa.py
├── entity_linking
│   ├── build_index.py
│   ├── conf
│   ├── data
│   ├── query_index.py
│   └── self_alignment_pretraining.py
├── glue_benchmark
│   ├── glue_benchmark.py
│   └── glue_benchmark_config.yaml
├── information_retrieval
│   ├── bert_dpr.py
│   ├── bert_joint_ir.py
│   ├── conf
│   ├── construct_random_negatives.py
│   └── get_msmarco.sh
├── intent_slot_classification
│   ├── conf
│   ├── data
│   └── intent_slot_classification.py
├── language_modeling
│   ├── bert_pretraining.py
│   ├── conf
│   ├── convert_weights_to_nemo1.0.py
│   ├── get_wkt2.sh
│   └── transformer_lm.py
├── machine_translation
│   ├── conf
│   ├── create_tarred_monolingual_dataset.py
│   ├── create_tarred_parallel_datase

There are a number of models listed covering several classic NLP tasks.  We will focus on [text classification](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/nlp/text_classification.html) in this notebook and [token classification](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/nlp/token_classification.html) in the next notebook on named entity recognition (NER). 

Notice that each NeMo model type includes a `conf` folder for configuration files and at least one Python training script file.  

Execute the following cell to see more detail for the text classification model:

In [None]:
TC_DIR = "/dli/task/nemo/examples/nlp/text_classification"
!tree $TC_DIR

The config file, `text_classification_config.yaml`, specifies model, training, and experiment management details, such as file locations, pretrained models, and hyperparameters.

The Python script, `text_classification_with_bert.py`, encapsulates everything you need to run a text classification experiment defined by the configuration file.  It employs Facebook's [Hydra](https://hydra.cc/) tool for configuration management, which allows you to run the entire experiment just with the script, using command line options to override the config values!

The key to building an experiment quickly is to understand what the default config file includes and what needs to be changed for your own project.

---
# 2.2 Text Classification from the Command Line
The question we want to answer is: 

**Given a medical disease abstract, is the abstract about cancer, a neurological disorder, or something else?**

This is a 3-class text classification problem.  We'll use the NeMo [text classification model](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/nlp/text_classification.html) with three classes (labels): "cancer" (0), "neurological disorders" (1), and "other" (2).  

## 2.2.1 Prepare the Data
You've already explored the [NCBI-disease corpus](https://www.ncbi.nlm.nih.gov/CBBresearch/Dogan/DISEASE/) and the text classification dataset derived from it in the [Explore the Data](010_ExploreData.ipynb) notebook.  Recall that the text classification files consist of tab-delimited abstracts and labels, with a row of headers.

In [None]:
TC3_DATA_DIR = '/dli/task/data/NCBI_tc-3'
!ls $TC3_DATA_DIR/*.tsv

In [None]:
# Take a look at the tab separated data
print("*****\ntrain.tsv sample\n*****")
!head -n 3 $TC3_DATA_DIR/train.tsv
print("\n\n*****\ndev.tsv sample\n*****")
!head -n 3 $TC3_DATA_DIR/dev.tsv
print("\n\n*****\ntest.tsv sample\n*****")
!head -n 3 $TC3_DATA_DIR/test.tsv



Note a few features of the data:
* The preprocessed data is already in the 

   ```
   [WORD][SPACE][WORD][SPACE][WORD][TAB][LABEL]
   ``` 
   format specified in the [documentation](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/nlp/text_classification.html).
* There is a header row, "sentence label", that should be removed.
* The text is quite long, so `max_seq_length` values will need to take this into account for training.
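
As a rough way to check these points before training, the sketch below parses `text<TAB>label` lines and counts how many examples exceed a word limit. The sample rows and the helper names `read_examples` and `count_long_examples` are made up for illustration. Whitespace word counts are only a proxy: BERT-style tokenizers split words into subtokens, so real sequence lengths are typically longer.

```python
import csv
import io

# Hypothetical stand-in for a few rows of a headerless NeMo-format file.
sample_tsv = (
    "mutations in the BRCA1 gene are linked to breast cancer\t0\n"
    "Huntington disease is a progressive neurological disorder\t1\n"
    "deficiency of the fifth component of complement was found\t2\n"
)


def read_examples(handle):
    """Yield (text, label) pairs from tab-delimited lines: text<TAB>label."""
    # QUOTE_NONE keeps quotation marks inside abstracts as ordinary characters.
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) != 2:
            raise ValueError(f"expected 2 tab-separated fields, got {len(row)}")
        text, label = row
        yield text, int(label)


def count_long_examples(examples, max_words):
    """Count examples whose whitespace word count exceeds max_words."""
    return sum(1 for text, _ in examples if len(text.split()) > max_words)


examples = list(read_examples(io.StringIO(sample_tsv)))
labels = sorted({label for _, label in examples})
n_long = count_long_examples(examples, max_words=8)
print(f"{len(examples)} examples, labels {labels}, {n_long} longer than 8 words")
```

A header row such as "sentence label" would fail the `int(label)` conversion, which is one reason it has to be removed first.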

Start by removing the header rows.  There are a number of ways to do this, but since it is a simple change we can use the `sed` stream editor from the shell:

In [None]:
!sed 1d $TC3_DATA_DIR/train.tsv > $TC3_DATA_DIR/train_nemo_format.tsv
!sed 1d $TC3_DATA_DIR/dev.tsv > $TC3_DATA_DIR/dev_nemo_format.tsv
!sed 1d $TC3_DATA_DIR/test.tsv > $TC3_DATA_DIR/test_nemo_format.tsv
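
If you prefer to stay in Python, the same header removal can be sketched as follows. `drop_header` is a hypothetical helper, and the demonstration runs on a throwaway temporary file rather than on the course data.

```python
import os
import tempfile


def drop_header(src_path, dst_path):
    """Copy src_path to dst_path without its first line; return lines written."""
    written = 0
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        next(src, None)  # skip the header row, if the file is not empty
        for line in src:
            dst.write(line)
            written += 1
    return written


# Demonstrate on a throwaway file rather than the course data.
with tempfile.TemporaryDirectory() as tmp_dir:
    src_path = os.path.join(tmp_dir, "train.tsv")
    dst_path = os.path.join(tmp_dir, "train_nemo_format.tsv")
    with open(src_path, "w", encoding="utf-8") as f:
        f.write("sentence\tlabel\nfirst abstract text\t0\nsecond abstract text\t2\n")
    n_written = drop_header(src_path, dst_path)
    with open(dst_path, encoding="utf-8") as f:
        result_lines = f.read().splitlines()

print(n_written, result_lines)
```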

In [None]:
# Take a look at the tab separated data
# Labels: 0 = cancer, 1 = neurological disorders, 2 = other
print("*****\ntrain_nemo_format.tsv sample\n*****")
!head -n 3 $TC3_DATA_DIR/train_nemo_format.tsv
print("\n\n*****\ndev_nemo_format.tsv sample\n*****")
!head -n 3 $TC3_DATA_DIR/dev_nemo_format.tsv
print("\n\n*****\ntest_nemo_format.tsv sample\n*****")
!head -n 3 $TC3_DATA_DIR/test_nemo_format.tsv

In [None]:
TC3_DATA_DIR = '/dli/task/data/NCBI_tc-3'
!ls $TC3_DATA_DIR/*.tsv

## 2.2.2 Configuration File
List the config file `text_classification_config.yaml` and take a look at the keys and default values.  Note the hierarchy of the keys and, especially, the three top-level keys: `trainer`, `model`, and `exp_manager`.

```yaml
trainer:
  gpus:
  num_nodes:
  max_epochs:
  ...
  
model:
  nemo_path:
  tokenizer:  
  language_model:
  classifier_head:
  ...

exp_manager:
  ...
```

In [3]:
CONFIG_DIR = "/dli/task/nemo/examples/nlp/text_classification/conf"
CONFIG_FILE = "text_classification_config.yaml"
!cat $CONFIG_DIR/$CONFIG_FILE

# Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Config file for text classification with pre-trained BERT models

trainer:
  gpus: 1 # number of GPUs (0 for CPU), or list of the GPUs to use e.g. [0, 1]
  num_nodes: 1
  max_epochs: 100
  max_steps: null # precedence over max_epochs
  accumulate_grad_batches: 1 # accumulates grads every k batches
  gradient_clip_val: 0.0
  amp_level: O0 # O1/O2 for mixed precision
  precision: 32 # S

### 2.2.2.1 OmegaConf Tool
The YAML config file provides default values for most of the parameters, but there are a few items that must be specified for the text classification experiment in order to run it at all.  

Each YAML section is a bit easier to view using the [omegaconf](https://omegaconf.readthedocs.io/en/2.1_branch/#) package, which allows you to access and manipulate the configuration keys using a "dot" protocol.  

Start by instantiating an `OmegaConf` object from the config file. Keys in the object can be changed, added, viewed, saved, and so on.  

For example, to look at just the `model` section, we can load the config file and specify just the `config.model` section to view through a print statement:

In [None]:
from omegaconf import OmegaConf

config = OmegaConf.load(CONFIG_DIR + "/" + CONFIG_FILE)
print(OmegaConf.to_yaml(config.model))

Details about the model arguments can be found in the [documentation](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/nlp/text_classification.html#model-arguments).  The `dataset.num_classes` value as well as locations of the data in `train_ds.file_path`, `validation_ds.file_path`, and `test_ds.file_path` are required.

To make sure we don't run out of memory, we can limit the `dataset.max_seq_length` to 128.  It also looks like the `infer_samples` are related to movie reviews, so we can change those to sentences that are meaningful in the disease domain.

There are some other parameters we might want to change later, but for now, this is all we absolutely must provide.  

Next take a look at the `trainer` subsection:

In [None]:
print(OmegaConf.to_yaml(config.trainer))

We only have one GPU right now, so that setting is fine, but we might want to limit the `max_epochs` to just a few to start with.  As with the `model` configs, there are some other parameters we might want to change, but we can go with the default for our first try.  

Finally, what about the `exp_manager`?

In [None]:
print(OmegaConf.to_yaml(config.exp_manager))

This section is fine as it is. When the `exp_dir` is `null`, it will default to placing the experiment results in a new directory named `nemo_experiments`.

## 2.2.3 Hydra-Enabled Python Script
To recap, the parameters we need to change or override are:

* `model.dataset.num_classes`: set to 3
* `model.dataset.max_seq_length`: set to 128
* `model.train_ds.file_path`: set to train_nemo_format.tsv
* `model.validation_ds.file_path`: set to dev_nemo_format.tsv
* `model.test_ds.file_path`: set to test_nemo_format.tsv
* `model.infer_samples`: set to relevant sentences
* `trainer.max_epochs`: set to 3
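
Each dotted key in the list above addresses one entry in the nested configuration. The sketch below shows the idea with a plain dictionary. It is not Hydra's parser (Hydra has its own, richer override grammar); `apply_override` is a hypothetical helper, and the starting values in `toy_config` are placeholders rather than the real defaults.

```python
def apply_override(config, override):
    """Apply one 'a.b.c=value' override to a nested dict, in place."""
    dotted_key, _, raw_value = override.partition("=")
    *parents, leaf = dotted_key.split(".")
    node = config
    for name in parents:
        node = node.setdefault(name, {})
    # Interpret integers where possible; otherwise keep the raw string.
    try:
        node[leaf] = int(raw_value)
    except ValueError:
        node[leaf] = raw_value
    return config


# Placeholder values, not the actual defaults from the YAML file.
toy_config = {
    "trainer": {"max_epochs": 100},
    "model": {"dataset": {"num_classes": 2, "max_seq_length": 256}},
}
for item in ["model.dataset.num_classes=3",
             "model.dataset.max_seq_length=128",
             "trainer.max_epochs=3"]:
    apply_override(toy_config, item)
print(toy_config)
```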

We can train, infer, and test it all **in one command** using the text classification training script!  

The script uses Hydra to manage the config file, so that means we can just override the values we want to from the command line as follows:

In [None]:
%%time
# The training takes about 2 minutes to run

TC_DIR = "/dli/task/nemo/examples/nlp/text_classification"

# set the values we want to override
NUM_CLASSES = 3
MAX_SEQ_LENGTH = 128
PATH_TO_TRAIN_FILE = "/dli/task/data/NCBI_tc-3/train_nemo_format.tsv"
PATH_TO_VAL_FILE = "/dli/task/data/NCBI_tc-3/dev_nemo_format.tsv"
PATH_TO_TEST_FILE = "/dli/task/data/NCBI_tc-3/test_nemo_format.tsv"
# disease domain inference sample answers should be 0, 1, 2 
INFER_SAMPLES_0 = "In contrast no mutations were detected in the p53 gene suggesting that this tumour suppressor is not frequently altered in this leukaemia "
INFER_SAMPLES_1 = "The first predictive testing for Huntington disease  was based on analysis of linked polymorphic DNA markers to estimate the likelihood of inheriting the mutation for HD"
INFER_SAMPLES_2 = "Further studies suggested that low dilutions of C5D serum contain a factor or factors interfering at some step in the hemolytic assay of C5 rather than a true C5 inhibitor or inactivator"
MAX_EPOCHS = 3

# Run the training script, overriding the config values in the command line
!python $TC_DIR/text_classification_with_bert.py \
        model.dataset.num_classes=$NUM_CLASSES \
        model.dataset.max_seq_length=$MAX_SEQ_LENGTH \
        model.train_ds.file_path=$PATH_TO_TRAIN_FILE \
        model.validation_ds.file_path=$PATH_TO_VAL_FILE \
        model.test_ds.file_path=$PATH_TO_TEST_FILE \
        model.infer_samples=["$INFER_SAMPLES_0","$INFER_SAMPLES_1","$INFER_SAMPLES_2"] \
        trainer.max_epochs=$MAX_EPOCHS

At the start of each training experiment, there is a printed log of the experiment specification including any parameters added or overridden via the command-line. It also shows additional information, such as which GPUs are available, where logs are saved, and some samples from the datasets with their corresponding inputs to the model. The log also provides some stats on the lengths of sequences in the dataset.

After each epoch, a summary table reports metrics on the validation set, including precision, recall, and F1 score. The F1 score is the harmonic mean of precision and recall, so it accounts for both false positives and false negatives and is often more informative than accuracy alone, particularly when the classes are imbalanced.
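
As a rough illustration of how these metrics relate, the following sketch computes per-class precision, recall, and F1 from made-up predictions. The labels and predictions are invented for the example, and `per_class_metrics` is a hypothetical helper, not the function NeMo uses to build its report.

```python
def per_class_metrics(y_true, y_pred, label):
    """Return (precision, recall, f1) for one class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Made-up labels: 0 = cancer, 1 = neurological disorders, 2 = other.
y_true = [0, 0, 0, 1, 1, 2, 2, 2, 2, 2]
y_pred = [0, 0, 2, 1, 2, 2, 2, 2, 2, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy = {accuracy:.2f}")
for label in (0, 1, 2):
    precision, recall, f1 = per_class_metrics(y_true, y_pred, label)
    print(f"class {label}: precision={precision:.2f} "
          f"recall={recall:.2f} f1={f1:.2f}")
```

In this toy data, class 1 has perfect precision but only half of its examples are found, which a single accuracy number would hide.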

At the end of training, NeMo saves the last checkpoint at the path specified by `model.nemo_path`.  Since we left this value at the default, it should have been written to our workspace in `.nemo` format.

In [None]:
!ls *.nemo

The results you achieved in the experiment may not have been very good.  However, it is pretty easy to try another experiment with just a few changes.  Longer training, adjusted learning rate, and changing the batch size for the training and validation datasets can improve results.

## 2.2.4 Exercise: Run an Experiment
Try running another similar experiment using the same text classification problem, this time with some suggested improvements:
  
* Set the mixed-precision `amp_level` to "O1" with a `precision` of 16 to make the model run faster with little or no reduction in accuracy.
* Adjust the number of epochs upward a bit to improve results (larger `max_epochs`).
* Increase the learning rate a little so that each weight update responds more strongly to the estimated error.

The new values have been provided for you in the cell below.  Add the command with appropriate overrides and run the cell.  If you get stuck, refer to the [solution](solutions/ex2.2.4.ipynb).

In [None]:
%%time
# The training takes about 2 minutes to run

TC_DIR = "/dli/task/nemo/examples/nlp/text_classification"

# set the values we want to override
NUM_CLASSES = 3
MAX_SEQ_LENGTH = 128
PATH_TO_TRAIN_FILE = "/dli/task/data/NCBI_tc-3/train_nemo_format.tsv"
PATH_TO_VAL_FILE = "/dli/task/data/NCBI_tc-3/dev_nemo_format.tsv"
PATH_TO_TEST_FILE = "/dli/task/data/NCBI_tc-3/test_nemo_format.tsv"
# disease domain inference sample answers should be 0, 1, 2 
INFER_SAMPLES_0 = "In contrast no mutations were detected in the p53 gene suggesting that this tumour suppressor is not frequently altered in this leukaemia "
INFER_SAMPLES_1 = "The first predictive testing for Huntington disease  was based on analysis of linked polymorphic DNA markers to estimate the likelihood of inheriting the mutation for HD"
INFER_SAMPLES_2 = "Further studies suggested that low dilutions of C5D serum contain a factor or factors interfering at some step in the hemolytic assay of C5 rather than a true C5 inhibitor or inactivator"
MAX_EPOCHS = 5
AMP_LEVEL = 'O1'
PRECISION = 16
LR = 5.0e-05
BATCH_SIZE = 32
# Override the config values in the command line
!python $TC_DIR/text_classification_with_bert.py \
        model.dataset.num_classes=$NUM_CLASSES \
        model.dataset.max_seq_length=$MAX_SEQ_LENGTH \
        model.train_ds.file_path=$PATH_TO_TRAIN_FILE \
        model.validation_ds.file_path=$PATH_TO_VAL_FILE \
        model.test_ds.file_path=$PATH_TO_TEST_FILE \
        model.infer_samples=["$INFER_SAMPLES_0","$INFER_SAMPLES_1","$INFER_SAMPLES_2"] \
        trainer.max_epochs=$MAX_EPOCHS \
        trainer.amp_level=$AMP_LEVEL \
        trainer.precision=$PRECISION \
        model.train_ds.batch_size=$BATCH_SIZE \
        model.optim.lr=$LR

How did the result from this experiment compare to the previous one?  Check the F1 scores and inference results in the output.

## 2.2.5 Visualize the Results with TensorBoard
The [experiment manager](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/core/core.html?highlight=tensorboard#experiment-manager) saves results for viewing with TensorBoard. Execute the following cell to create a link to TensorBoard for your instance, then click on the link to open TensorBoard in a tab on your browser.

In [None]:
%%js
const href = window.location.hostname +'/tensorboard/';
let a = document.createElement('a');
let link = document.createTextNode('Open Tensorboard!');
a.appendChild(link);
a.href = "http://" + href;
a.style.color = "navy"
a.target = "_blank"
element.append(a);

To compare the performance of the models you've run, select the "train loss" scalar.  You can see all the models you've run compared together or select individual models for comparison.  The example below shows the first experiment in orange and the exercise experiment in blue.  You can see that the loss was smaller in the second experiment.

<img src="images/tensorboard_01.png" width=1000px>

## 2.2.6 Exercise: Change the Language Model (Optional!)
So far, you've used the basic `bert-base-uncased` language model, but that is just one of many you could try.  Run the following cell to see what language models are available.

In [1]:
# complete list of supported BERT-like models
from nemo.collections import nlp as nemo_nlp
nemo_nlp.modules.get_pretrained_lm_models_list()

['megatron-bert-345m-uncased',
 'megatron-bert-345m-cased',
 'megatron-bert-uncased',
 'megatron-bert-cased',
 'biomegatron-bert-345m-uncased',
 'biomegatron-bert-345m-cased',
 'bert-base-uncased',
 'bert-large-uncased',
 'bert-base-cased',
 'bert-large-cased',
 'bert-base-multilingual-uncased',
 'bert-base-multilingual-cased',
 'bert-base-chinese',
 'bert-base-german-cased',
 'bert-large-uncased-whole-word-masking',
 'bert-large-cased-whole-word-masking',
 'bert-large-uncased-whole-word-masking-finetuned-squad',
 'bert-large-cased-whole-word-masking-finetuned-squad',
 'bert-base-cased-finetuned-mrpc',
 'bert-base-german-dbmdz-cased',
 'bert-base-german-dbmdz-uncased',
 'cl-tohoku/bert-base-japanese',
 'cl-tohoku/bert-base-japanese-whole-word-masking',
 'cl-tohoku/bert-base-japanese-char',
 'cl-tohoku/bert-base-japanese-char-whole-word-masking',
 'TurkuNLP/bert-base-finnish-cased-v1',
 'TurkuNLP/bert-base-finnish-uncased-v1',
 'wietsedv/bert-base-dutch-cased',
 'distilbert-base-uncased

For this exercise, choose a new language model, such as `megatron-bert-345m-cased`.  

You may need to restart the notebook kernel to clear memory.  If you use a large model, you can also save GPU memory by reducing the `batch_size` to 32, 16, or even 8 and reducing the `max_seq_length` to 64. There is no right answer to this exercise; it is an opportunity for you to experiment.  Some of the models can take several minutes to run, so feel free to move on to the next notebook and return here when you have time later. If you get stuck, take a look at an example [solution](solutions/ex2.2.6.ipynb).  Be sure to take note of the loss and F1 results with this model, or check TensorBoard for a visualization of the differences.
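
To get a feel for why these two settings matter, the sketch below compares the number of token positions per batch for a few combinations. This is only a back-of-the-envelope proxy under a simplifying assumption: activation memory grows at least in proportion to tokens per batch (attention grows faster with sequence length), while the memory for model weights and optimizer state does not shrink at all. `tokens_per_batch` is a hypothetical helper.

```python
def tokens_per_batch(batch_size, max_seq_length):
    """Upper bound on token positions the model processes in one batch."""
    return batch_size * max_seq_length


baseline = tokens_per_batch(batch_size=32, max_seq_length=128)
for batch_size, max_seq_length in [(32, 128), (16, 128), (32, 64), (8, 64)]:
    n = tokens_per_batch(batch_size, max_seq_length)
    print(f"batch_size={batch_size:>2} max_seq_length={max_seq_length:>3} "
          f"tokens={n:>4} ({n / baseline:.2f}x baseline)")
```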

In [2]:
# Restart the kernel
import IPython
app = IPython.Application.instance()
app.kernel.do_shutdown(True)

{'status': 'ok', 'restart': True}

In [10]:
%%time

# TODO Try your own experiment with a different language model!


# The training takes about____ to run

TC_DIR = "/dli/task/nemo/examples/nlp/text_classification"

# set the values we want to override
NUM_CLASSES = 3
MAX_SEQ_LENGTH = 64
PATH_TO_TRAIN_FILE = "/dli/task/data/NCBI_tc-3/train_nemo_format.tsv"
PATH_TO_VAL_FILE = "/dli/task/data/NCBI_tc-3/dev_nemo_format.tsv"
PATH_TO_TEST_FILE = "/dli/task/data/NCBI_tc-3/test_nemo_format.tsv"
# disease domain inference sample answers should be 0, 1, 2 
INFER_SAMPLES_0 = "In contrast no mutations were detected in the p53 gene suggesting that this tumour suppressor is not frequently altered in this leukaemia "
INFER_SAMPLES_1 = "The first predictive testing for Huntington disease  was based on analysis of linked polymorphic DNA markers to estimate the likelihood of inheriting the mutation for HD"
INFER_SAMPLES_2 = "Further studies suggested that low dilutions of C5D serum contain a factor or factors interfering at some step in the hemolytic assay of C5 rather than a true C5 inhibitor or inactivator"
MAX_EPOCHS = 5
AMP_LEVEL = 'O1'
PRECISION = 16
LR = 5.0e-05
PRETRAINED_MODEL_NAME = 'megatron-bert-345m-cased'
BATCH_SIZE = 32
# Override the config values in the command line
!python $TC_DIR/text_classification_with_bert.py \
        model.dataset.num_classes=$NUM_CLASSES \
        model.dataset.max_seq_length=$MAX_SEQ_LENGTH \
        model.train_ds.file_path=$PATH_TO_TRAIN_FILE \
        model.validation_ds.file_path=$PATH_TO_VAL_FILE \
        model.test_ds.file_path=$PATH_TO_TEST_FILE \
        model.train_ds.batch_size=$BATCH_SIZE \
        model.validation_ds.batch_size=$BATCH_SIZE \
        model.test_ds.batch_size=$BATCH_SIZE \
        model.infer_samples=["$INFER_SAMPLES_0","$INFER_SAMPLES_1","$INFER_SAMPLES_2"] \
        trainer.max_epochs=$MAX_EPOCHS \
        trainer.amp_level=$AMP_LEVEL \
        trainer.precision=$PRECISION \
        model.optim.lr=$LR \
        model.language_model.pretrained_model_name=$PRETRAINED_MODEL_NAME

    Use OmegaConf.to_yaml(cfg)
    
    
[NeMo I 2021-10-05 12:37:31 text_classification_with_bert:110] 
    Config Params:
    trainer:
      gpus: 1
      num_nodes: 1
      max_epochs: 5
      max_steps: null
      accumulate_grad_batches: 1
      gradient_clip_val: 0.0
      amp_level: O1
      precision: 16
      accelerator: ddp
      log_every_n_steps: 1
      val_check_interval: 1.0
      resume_from_checkpoint: null
      num_sanity_val_steps: 0
      checkpoint_callback: false
      logger: false
    model:
      nemo_path: text_classification_model.nemo
      tokenizer:
        tokenizer_name: ${model.language_model.pretrained_model_name}
        vocab_file: null
        tokenizer_model: null
        special_tokens: null
      language_model:
        pretrained_model_name: megatron-bert-345m-cased
        lm_checkpoint: null
        config_file: null
        config: null
      classifier_head:
        num_output_layers: 2
        fc_dropout: 0.1
      class_labels:
        c

---
# 2.3 PyTorch Lightning Model and Trainer Workflow
The NeMo model script is the quickest way to get up and running.  Sometimes, though, you may want to create your own script or work through your project in a more customized manner.  In that case, you can step through the PyTorch Lightning workflow, which is otherwise abstracted (encapsulated and hidden) within the model training script. We'll take a look at the script, then try working through the same workflow from scratch without the script or Hydra.  

## 2.3.1 Script Key Features
You can open the [text_classification_with_bert.py](nemo/examples/nlp/text_classification/text_classification_with_bert.py) script to see exactly what is happening.  

Here's an abbreviated version with logging and initial comments removed:

```python
import pytorch_lightning as pl
from omegaconf import DictConfig

from nemo.collections.nlp.models.text_classification import TextClassificationModel
from nemo.collections.nlp.parts.nlp_overrides import NLPDDPPlugin
from nemo.core.config import hydra_runner
from nemo.utils.exp_manager import exp_manager


@hydra_runner(config_path="conf", config_name="text_classification_config")
def main(cfg: DictConfig) -> None:
    trainer = pl.Trainer(plugins=[NLPDDPPlugin()], **cfg.trainer)
    exp_manager(trainer, cfg.get("exp_manager", None))

    if not cfg.model.train_ds.file_path:
        raise ValueError("'train_ds.file_path' need to be set for the training!")

    model = TextClassificationModel(cfg.model, trainer=trainer)
    trainer.fit(model)

    if cfg.model.nemo_path:
        # '.nemo' file contains the last checkpoint and the params to initialize the model
        model.save_to(cfg.model.nemo_path)

    # We evaluate the trained model on the test set if test_ds is set in the config file
    if cfg.model.test_ds.file_path:
        trainer.test(model=model, ckpt_path=None, verbose=False)

    # perform inference on a list of queries.
    if "infer_samples" in cfg.model and cfg.model.infer_samples:       
        # max_seq_length=512 is the maximum length BERT supports.
        results = model.classifytext(queries=cfg.model.infer_samples, batch_size=16, max_seq_length=512)

if __name__ == '__main__':
    main()
```
The Hydra decorator, `@hydra_runner`, connects the configuration file and provides the mechanism for the command line overrides. 

Once the configuration is established, the key steps are:
1. Instantiate the trainer with `trainer = pl.Trainer(plugins=[NLPDDPPlugin()], **cfg.trainer)`
1. Instantiate the model with `model = TextClassificationModel(cfg.model, trainer=trainer)`
1. Train the model with `trainer.fit(model)`

Additional steps for optional inference and evaluation are:
* Evaluate with `trainer.test(model=model, ckpt_path=None, verbose=False)`
* Infer with `results = model.classifytext(queries=cfg.model.infer_samples, batch_size=16, max_seq_length=512)`
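
The decorator pattern itself can be sketched without Hydra. The toy below is not how `@hydra_runner` is implemented; `toy_config_runner` and `toy_main` are hypothetical names, the config is a flat dictionary of strings, and overrides are passed as a Python list instead of being read from the command line. It only shows how a decorator can assemble a configuration and hand it to `main`.

```python
import functools


def toy_config_runner(default_config):
    """Hypothetical decorator: build a config, then pass it to the function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(overrides=()):
            cfg = dict(default_config)      # copy so the defaults stay untouched
            for item in overrides:          # each item looks like "key=value"
                key, _, value = item.partition("=")
                cfg[key] = value
            return func(cfg)
        return wrapper
    return decorator


@toy_config_runner({"max_epochs": "100", "lr": "2e-5"})
def toy_main(cfg):
    # A real script would build the trainer and the model here.
    return f"training for {cfg['max_epochs']} epochs at lr={cfg['lr']}"


message = toy_main(overrides=["max_epochs=3"])
print(message)
```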

## 2.3.2 Model Training from Scratch
Execute the following cell to restart the notebook kernel to clear variables and GPU memory.

In [2]:
# Restart the kernel
import IPython
app = IPython.Application.instance()
app.kernel.do_shutdown(True)

{'status': 'ok', 'restart': True}

We start by importing the `nemo_nlp` collection, the experiment manager, PyTorch, PyTorch Lightning, and OmegaConf.

In [2]:
from nemo.collections import nlp as nemo_nlp
from nemo.utils.exp_manager import exp_manager

import torch
import pytorch_lightning as pl
from omegaconf import OmegaConf

When running these training steps manually without the script or Hydra, the correct configuration must be set prior to instantiation.  We've already determined the changes we want to make to the config file for this project.  Previously, we used the Hydra override feature in the command line, but those changes are made this time using `OmegaConf`. The syntax looks similar, except this time we are directly changing the `OmegaConf` object, `config`, in Python, and will pass that object to `trainer`, `exp_manager`, and `model`.

The default language model is `bert-base-uncased`.  To override it, add lines like these to the cell below, after `config` has been loaded:
```python
    PRETRAINED_MODEL_NAME = 'bert-base-cased'
    config.model.language_model.pretrained_model_name=PRETRAINED_MODEL_NAME
```

In [11]:
# Instantiate the OmegaConf object by loading the config file
TC_DIR = "/dli/task/nemo/examples/nlp/text_classification"
CONFIG_FILE = "text_classification_config.yaml"
config = OmegaConf.load(TC_DIR + "/conf/" + CONFIG_FILE)

# set the values we want to change
NUM_CLASSES = 3
MAX_SEQ_LENGTH = 128
PATH_TO_TRAIN_FILE = "/dli/task/data/NCBI_tc-3/train_nemo_format.tsv"
PATH_TO_VAL_FILE = "/dli/task/data/NCBI_tc-3/dev_nemo_format.tsv"
PATH_TO_TEST_FILE = "/dli/task/data/NCBI_tc-3/test_nemo_format.tsv"
# disease domain inference sample answers should be 0, 1, 2 
INFER_SAMPLES = ["Germline mutations in BRCA1 are responsible for most cases of inherited breast and ovarian cancer ",
        "The first predictive testing for Huntington disease  was based on analysis of linked polymorphic DNA markers to estimate the likelihood of inheriting the mutation for HD",
        "Further studies suggested that low dilutions of C5D serum contain a factor or factors interfering at some step in the hemolytic assay of C5 rather than a true C5 inhibitor or inactivator"
        ]
MAX_EPOCHS = 5
AMP_LEVEL = 'O1'
PRECISION = 16
LR = 5.0e-05

# set the config values using omegaconf
config.model.dataset.num_classes = NUM_CLASSES
config.model.dataset.max_seq_length = MAX_SEQ_LENGTH
config.model.train_ds.file_path = PATH_TO_TRAIN_FILE
config.model.validation_ds.file_path = PATH_TO_VAL_FILE
config.model.test_ds.file_path = PATH_TO_TEST_FILE
config.model.infer_samples = INFER_SAMPLES
config.trainer.max_epochs = MAX_EPOCHS
config.trainer.amp_level = AMP_LEVEL
config.trainer.precision = PRECISION
config.model.optim.lr = LR

Now that `config` has been updated with the correct values, instantiate the trainer and experiment manager.

In [12]:
# Instantiate the trainer and experiment manager
trainer = pl.Trainer(**config.trainer)
exp_manager(trainer, config.exp_manager)

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
Using native 16bit precision.


[NeMo I 2021-10-05 12:49:27 exp_manager:216] Experiments will be logged at /dli/task/nemo_experiments/TextClassification/2021-10-05_12-49-27
[NeMo I 2021-10-05 12:49:27 exp_manager:563] TensorboardLogger has been set up


PosixPath('/dli/task/nemo_experiments/TextClassification/2021-10-05_12-49-27')

In [13]:
# Instantiate the model 
model = nemo_nlp.models.TextClassificationModel(config.model, trainer=trainer)

Using bos_token, but it is not set yet.
Using eos_token, but it is not set yet.


[NeMo I 2021-10-05 12:49:57 text_classification_dataset:120] Read 683 examples from /dli/task/data/NCBI_tc-3/train_nemo_format.tsv.
[NeMo I 2021-10-05 12:49:57 text_classification_dataset:238] *** Example ***
[NeMo I 2021-10-05 12:49:57 text_classification_dataset:239] example 0: ['Two', 'new', 'arylsulfatase', 'A', '(ARSA)', 'mutations', 'in', 'a', 'juvenile', 'metachromatic', 'leukodystrophy', '(MLD)', 'patient.', 'Fragments', 'of', 'the', 'arylsulfatase', 'A', '(', 'ARSA', ')', 'gene', 'from', 'a', 'patient', 'with', 'juvenile-onset', 'metachromatic', 'leukodystrophy', '(', 'MLD', ')', 'were', 'amplified', 'by', 'PCR', 'and', 'ligated', 'into', 'MP13', 'cloning', 'vectors', '.', 'Clones', 'hybridizing', 'with', 'cDNA', 'for', 'human', 'ARSA', 'were', 'selected', ',', 'examined', 'for', 'appropriate', 'size', 'inserts', ',', 'and', 'used', 'to', 'prepare', 'single-stranded', 'phage', 'DNA', '.', 'Examination', 'of', 'the', 'entire', 'coding', 'and', 'most', 'of', 'the', 'intronic', '

[NeMo W 2021-10-05 12:50:08 text_classification_dataset:250] Found 664 out of 683 sentences with more than 128 subtokens. Truncated long sentences from the end.


[NeMo I 2021-10-05 12:50:08 data_preprocessing:299] Some stats of the lengths of the sequences:
[NeMo I 2021-10-05 12:50:08 data_preprocessing:301] Min: 74 |                  Max: 129 |                  Mean: 128.36163982430455 |                  Median: 129.0
[NeMo I 2021-10-05 12:50:08 data_preprocessing:307] 75 percentile: 129.00
[NeMo I 2021-10-05 12:50:08 data_preprocessing:308] 99 percentile: 129.00
[NeMo I 2021-10-05 12:50:08 text_classification_dataset:120] Read 100 examples from /dli/task/data/NCBI_tc-3/dev_nemo_format.tsv.
[NeMo I 2021-10-05 12:50:08 text_classification_dataset:238] *** Example ***
[NeMo I 2021-10-05 12:50:08 text_classification_dataset:239] example 0: ['BRCA1', 'is', 'secreted', 'and', 'exhibits', 'properties', 'of', 'a', 'granin.', 'Germline', 'mutations', 'in', 'BRCA1', 'are', 'responsible', 'for', 'most', 'cases', 'of', 'inherited', 'breast', 'and', 'ovarian', 'cancer', '.', 'However', ',', 'the', 'function', 'of', 'the', 'BRCA1', 'protein', 'has', 'remai

[NeMo W 2021-10-05 12:50:10 text_classification_dataset:250] Found 99 out of 100 sentences with more than 128 subtokens. Truncated long sentences from the end.


[NeMo I 2021-10-05 12:50:10 data_preprocessing:299] Some stats of the lengths of the sequences:
[NeMo I 2021-10-05 12:50:10 data_preprocessing:301] Min: 120 |                  Max: 129 |                  Mean: 128.91 |                  Median: 129.0
[NeMo I 2021-10-05 12:50:10 data_preprocessing:307] 75 percentile: 129.00
[NeMo I 2021-10-05 12:50:10 data_preprocessing:308] 99 percentile: 129.00
[NeMo I 2021-10-05 12:50:10 text_classification_dataset:120] Read 10 examples from /dli/task/data/NCBI_tc-3/test_nemo_format.tsv.
[NeMo I 2021-10-05 12:50:10 text_classification_dataset:238] *** Example ***
[NeMo I 2021-10-05 12:50:10 text_classification_dataset:239] example 0: ['Clustering', 'of', 'missense', 'mutations', 'in', 'the', 'ataxia-telangiectasia', 'gene', 'in', 'a', 'sporadic', 'T-cell', 'leukaemia.', 'Ataxia-telangiectasia', '(', 'A-T', ')', 'is', 'a', 'recessive', 'multi-system', 'disorder', 'caused', 'by', 'mutations', 'in', 'the', 'ATM', 'gene', 'at', '11q22-q23', '(', 'ref', '.

[NeMo W 2021-10-05 12:50:10 text_classification_dataset:250] Found 10 out of 10 sentences with more than 128 subtokens. Truncated long sentences from the end.


[NeMo I 2021-10-05 12:50:10 data_preprocessing:299] Some stats of the lengths of the sequences:
[NeMo I 2021-10-05 12:50:10 data_preprocessing:301] Min: 129 |                  Max: 129 |                  Mean: 129.0 |                  Median: 129.0
[NeMo I 2021-10-05 12:50:10 data_preprocessing:307] 75 percentile: 129.00
[NeMo I 2021-10-05 12:50:10 data_preprocessing:308] 99 percentile: 129.00


[NeMo W 2021-10-05 12:50:10 modelPT:197] You tried to register an artifact under config key=tokenizer.vocab_file but an artifact forit has already been registered.
      self.cfg.update_node(config_path, return_path)
    
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.predictions.transform.LayerNorm.weight', 'cls.predictions.bias', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.dense.bias']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenc
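The log above reports how many examples were read from each file. As a quick sanity check before training, the class balance of a file can be tallied with a few lines of standard-library Python. This is a minimal sketch, assuming each line holds the text, a tab, and an integer label in the last field, with no header row. `count_labels` is a hypothetical helper, not part of NeMo.

```python
from collections import Counter


def count_labels(tsv_path):
    """Tally the integer label found in the last tab-separated field of each line."""
    counts = Counter()
    with open(tsv_path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line.strip():
                continue  # skip blank lines
            # Assumed layout: "<text>\t<label>"
            _text, label = line.rsplit("\t", 1)
            counts[int(label)] += 1
    return counts
```

Calling `count_labels(PATH_TO_VAL_FILE)` would be expected to give the same per-class counts that appear in the `support` column of the validation reports during training.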

In [14]:
%%time
# start model training and save result
# The training takes about 2 minutes to run
trainer.fit(model)
model.save_to(config.model.nemo_path)

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


[NeMo I 2021-10-05 12:50:14 modelPT:748] Optimizer config = Adam (
    Parameter Group 0
        amsgrad: False
        betas: [0.9, 0.999]
        eps: 1e-08
        lr: 5e-05
        weight_decay: 0.01
    )
[NeMo I 2021-10-05 12:50:14 lr_scheduler:617] Scheduler "<nemo.core.optim.lr_scheduler.WarmupAnnealing object at 0x7f7e511b7730>" 
    will be used during training (effective maximum steps = 55) - 
    Parameters : 
    (warmup_steps: null
    warmup_ratio: 0.1
    last_epoch: -1
    max_steps: 55
    )


initializing ddp: GLOBAL_RANK: 0, MEMBER: 1/1

  | Name                  | Type                 | Params
---------------------------------------------------------------
0 | loss                  | CrossEntropyLoss     | 0     
1 | bert_model            | BertEncoder          | 109 M 
2 | classifier            | SequenceClassifier   | 592 K 
3 | classification_report | ClassificationReport | 0     
---------------------------------------------------------------
110 M     Trainable params
0         Non-trainable params
110 M     Total params
440.301   Total estimated model params size (MB)


Training: 0it [00:00, ?it/s]

Validating: 0it [00:00, ?it/s]

[NeMo I 2021-10-05 12:50:20 text_classification_model:165] val_report: 
    label                                                precision    recall       f1           support   
    label_id: 0                                              0.00       0.00       0.00         32
    label_id: 1                                              0.00       0.00       0.00         24
    label_id: 2                                             44.00     100.00      61.11         44
    -------------------
    micro avg                                               44.00      44.00      44.00        100
    macro avg                                               14.67      33.33      20.37        100
    weighted avg                                            19.36      44.00      26.89        100
    


Epoch 0, global step 10: val_loss reached 1.06308 (best 1.06308), saving model to "/dli/task/nemo_experiments/TextClassification/2021-10-05_12-49-27/checkpoints/TextClassification--val_loss=1.06-epoch=0.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2021-10-05 12:50:28 text_classification_model:165] val_report: 
    label                                                precision    recall       f1           support   
    label_id: 0                                            100.00      78.12      87.72         32
    label_id: 1                                              0.00       0.00       0.00         24
    label_id: 2                                             58.67     100.00      73.95         44
    -------------------
    micro avg                                               69.00      69.00      69.00        100
    macro avg                                               52.89      59.38      53.89        100
    weighted avg                                            57.81      69.00      60.61        100
    


Epoch 1, global step 21: val_loss reached 0.81228 (best 0.81228), saving model to "/dli/task/nemo_experiments/TextClassification/2021-10-05_12-49-27/checkpoints/TextClassification--val_loss=0.81-epoch=1.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2021-10-05 12:50:37 text_classification_model:165] val_report: 
    label                                                precision    recall       f1           support   
    label_id: 0                                             90.62      90.62      90.62         32
    label_id: 1                                             75.00      50.00      60.00         24
    label_id: 2                                             75.00      88.64      81.25         44
    -------------------
    micro avg                                               80.00      80.00      80.00        100
    macro avg                                               80.21      76.42      77.29        100
    weighted avg                                            80.00      80.00      79.15        100
    


Epoch 2, global step 32: val_loss reached 0.62659 (best 0.62659), saving model to "/dli/task/nemo_experiments/TextClassification/2021-10-05_12-49-27/checkpoints/TextClassification--val_loss=0.63-epoch=2.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2021-10-05 12:50:45 text_classification_model:165] val_report: 
    label                                                precision    recall       f1           support   
    label_id: 0                                             87.88      90.62      89.23         32
    label_id: 1                                             78.26      75.00      76.60         24
    label_id: 2                                             86.36      86.36      86.36         44
    -------------------
    micro avg                                               85.00      85.00      85.00        100
    macro avg                                               84.17      84.00      84.06        100
    weighted avg                                            84.90      85.00      84.94        100
    


Epoch 3, global step 43: val_loss reached 0.53359 (best 0.53359), saving model to "/dli/task/nemo_experiments/TextClassification/2021-10-05_12-49-27/checkpoints/TextClassification--val_loss=0.53-epoch=3.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2021-10-05 12:50:56 text_classification_model:165] val_report: 
    label                                                precision    recall       f1           support   
    label_id: 0                                             87.88      90.62      89.23         32
    label_id: 1                                             79.17      79.17      79.17         24
    label_id: 2                                             88.37      86.36      87.36         44
    -------------------
    micro avg                                               86.00      86.00      86.00        100
    macro avg                                               85.14      85.39      85.25        100
    weighted avg                                            86.00      86.00      85.99        100
    


Epoch 4, global step 54: val_loss reached 0.50618 (best 0.50618), saving model to "/dli/task/nemo_experiments/TextClassification/2021-10-05_12-49-27/checkpoints/TextClassification--val_loss=0.51-epoch=4.ckpt" as top 3
Saving latest checkpoint...
      conf.update_node(conf_path, item.path)
    


CPU times: user 1min 8s, sys: 25.3 s, total: 1min 33s
Wall time: 1min 36s
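The scheduler log reports `max_steps: 55`, and the checkpoints are saved at global steps 10, 21, 32, 43, and 54, which is 11 optimizer steps per epoch. The arithmetic behind those numbers can be reproduced as follows. This is a sketch, assuming a training batch size of 64 on a single GPU with no gradient accumulation (the batch size is not shown in this section), and assuming the warmup length is obtained by truncating `warmup_ratio * max_steps`.

```python
import math

num_train_examples = 683   # "Read 683 examples" in the training log
batch_size = 64            # assumption: not shown in this section
max_epochs = 5             # MAX_EPOCHS set above
warmup_ratio = 0.1         # from the scheduler log

# The last, partially filled batch still counts as a step
steps_per_epoch = math.ceil(num_train_examples / batch_size)
max_steps = steps_per_epoch * max_epochs
warmup_steps = int(warmup_ratio * max_steps)  # assumption: truncation

# Zero-based global step at the end of each epoch
checkpoint_steps = [steps_per_epoch * (epoch + 1) - 1 for epoch in range(max_epochs)]

print(steps_per_epoch, max_steps, warmup_steps)
print(checkpoint_steps)
```

With so few steps in total, changing `MAX_EPOCHS` or the batch size noticeably changes how long the learning rate spends in warmup.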


Evaluate the model with `trainer.test`, which will automatically use the file path to the test set we updated in `config`.

In [15]:
trainer.test(model=model, verbose=False)

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Testing: 0it [00:00, ?it/s]

[NeMo I 2021-10-05 12:51:51 text_classification_model:165] test_report: 
    label                                                precision    recall       f1           support   
    label_id: 0                                             75.00      75.00      75.00          4
    label_id: 1                                            100.00     100.00     100.00          2
    label_id: 2                                             75.00      75.00      75.00          4
    -------------------
    micro avg                                               80.00      80.00      80.00         10
    macro avg                                               83.33      83.33      83.33         10
    weighted avg                                            80.00      80.00      80.00         10
    


[{'test_loss': 0.6855399012565613,
  'test_precision': 79.99999237060547,
  'test_f1': 79.99999237060547,
  'test_recall': 79.99999237060547}]
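The report shows three kinds of averages. The sketch below recomputes them from the per-class rows of the `test_report`, using only the precision, recall, and support values printed above; the variable names are illustrative.

```python
# Per-class values copied from the test_report above (label_id 0, 1, 2)
precision = [75.0, 100.0, 75.0]   # percent
recall = [75.0, 100.0, 75.0]      # percent
support = [4, 2, 4]               # number of true examples per class

total = sum(support)

# Macro average: unweighted mean over the classes
macro_precision = sum(precision) / len(precision)

# Weighted average: mean over the classes, weighted by support
weighted_precision = sum(p * s for p, s in zip(precision, support)) / total

# Micro average: pool the counts first; true positives = recall * support
true_positives = [r / 100 * s for r, s in zip(recall, support)]
micro_recall = 100 * sum(true_positives) / total

print(round(macro_precision, 2), round(weighted_precision, 2), round(micro_recall, 2))
```

When every example has exactly one label, micro-averaged precision, recall, and F1 all equal the accuracy, which is consistent with the three identical values of about 80 returned by `trainer.test`. With only 10 test examples, each prediction moves that score by 10 points, so the number should be read with caution.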

Finally, run inference on the samples stored in `config`. Printing `config.model.infer_samples` shows them as a list of strings.

To run inference for text classification, use the `model.classifytext` method, which returns the predicted label for each query.

In [16]:
print(config.model.infer_samples)

['Germline mutations in BRCA1 are responsible for most cases of inherited breast and ovarian cancer ', 'The first predictive testing for Huntington disease  was based on analysis of linked polymorphic DNA markers to estimate the likelihood of inheriting the mutation for HD', 'Further studies suggested that low dilutions of C5D serum contain a factor or factors interfering at some step in the hemolytic assay of C5 rather than a true C5 inhibitor or inactivator']


In [17]:
model.classifytext(queries=config.model.infer_samples, batch_size=64, max_seq_length=128)

[0, 1, 2]
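The returned values are label ids. A small lookup makes the output easier to read. This is a sketch: the id-to-name mapping is an assumption inferred from the three sample abstracts and their expected labels, and `LABEL_NAMES` and `to_names` are hypothetical names, not part of NeMo.

```python
# Assumed mapping, inferred from the sample abstracts and their expected labels
LABEL_NAMES = {0: "cancer", 1: "neurological", 2: "other"}


def to_names(label_ids):
    """Convert predicted label ids into readable class names."""
    return [LABEL_NAMES[label_id] for label_id in label_ids]


predicted = [0, 1, 2]  # the list returned by the classifytext call above
print(to_names(predicted))
```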

## 2.3.3 Exercise: Query the Model
What if we want to run inference on additional queries? The `model.classifytext` method we just used takes the queries as an argument, so they do not _have_ to come from the config file. We can simply create a list of strings for our queries.

In [None]:
my_queries = [
    'Clustering of missense mutations in the ataxia-telangiectasia gene in a sporadic T-cell leukaemia',
    'Myotonic dystrophy protein kinase is involved in the modulation of the Ca2+ homeostasis in skeletal muscle cells.',
    'Constitutional RB1-gene mutations in patients with isolated unilateral retinoblastoma.',
    'Hereditary deficiency of the fifth component of complement in man. I. Clinical, immunochemical, and family studies.'
]

Run inference on the `my_queries` list. If you get stuck, refer to the [solution](solutions/ex2.3.3.ipynb).

In [18]:
# TODO Run inference over the my_queries list

# Instantiate the OmegaConf object by loading the config file
TC_DIR = "/dli/task/nemo/examples/nlp/text_classification"
CONFIG_FILE = "text_classification_config.yaml"
config = OmegaConf.load(TC_DIR + "/conf/" + CONFIG_FILE)

# set the values we want to change
NUM_CLASSES = 3
MAX_SEQ_LENGTH = 128
PATH_TO_TRAIN_FILE = "/dli/task/data/NCBI_tc-3/train_nemo_format.tsv"
PATH_TO_VAL_FILE = "/dli/task/data/NCBI_tc-3/dev_nemo_format.tsv"
PATH_TO_TEST_FILE = "/dli/task/data/NCBI_tc-3/test_nemo_format.tsv"
# queries for the exercise (four abstracts, so expect four predicted labels)
my_queries = [
    'Clustering of missense mutations in the ataxia-telangiectasia gene in a sporadic T-cell leukaemia',
    'Myotonic dystrophy protein kinase is involved in the modulation of the Ca2+ homeostasis in skeletal muscle cells.',
    'Constitutional RB1-gene mutations in patients with isolated unilateral retinoblastoma.',
    'Hereditary deficiency of the fifth component of complement in man. I. Clinical, immunochemical, and family studies.'
]
MAX_EPOCHS = 5
AMP_LEVEL = 'O1'
PRECISION = 16
LR = 5.0e-05

# set the config values using omegaconf
config.model.dataset.num_classes = NUM_CLASSES
config.model.dataset.max_seq_length = MAX_SEQ_LENGTH
config.model.train_ds.file_path = PATH_TO_TRAIN_FILE
config.model.validation_ds.file_path = PATH_TO_VAL_FILE
config.model.test_ds.file_path = PATH_TO_TEST_FILE
config.model.infer_samples = my_queries
config.trainer.max_epochs = MAX_EPOCHS
config.trainer.amp_level = AMP_LEVEL
config.trainer.precision = PRECISION
config.model.optim.lr = LR

---
<h2 style="color:green;">Congratulations!</h2>

You've built a text classifier with three classes and learned:
* How to use NeMo NLP model config files and scripts to quickly create experiments
* How to override the config `model`, `trainer`, and `exp_manager` settings
* How to train, evaluate, and run inference with a text classifier using a single command line
* How to train, evaluate, and run inference with a text classifier using PyTorch Lightning

You're ready to try a different NLP task.<br>

Move on to [3.0 Build a Named Entity Recognizer](030_NamedEntityRecognition.ipynb).
