# Call Center - Model Transfer Learning and Fine-Tuning

TAO Toolkit is a Python-based AI toolkit for taking purpose-built pre-trained AI models and customizing them with your own data. Transfer learning carries the features learned by an existing neural network over to a new one. It is often used to improve on the baseline performance of state-of-the-art models when creating a large training dataset is not feasible.

For this call center solution, the speech-to-text and sentiment analysis models are fine-tuned on call center data to improve model performance on business-specific terminology.

For more information on the TAO Toolkit, please visit [here](https://developer.nvidia.com/tao).

![Transfer Learning Toolkit](https://developer.nvidia.com/sites/default/files/akamai/embedded-transfer-learning-toolkit-software-stack-1200x670px.png)

### Installing necessary dependencies 

For ease of use, please install TAO Toolkit inside a Python virtual environment. We recommend performing this step first and then launching the notebook from the virtual environment. Please refer to the README for these instructions.

## Importing Libraries

In [1]:
import os
import re
import glob
import wave
import random
import contextlib
from tqdm.notebook import tqdm

from utils_TLT import (
    prepare_train_test_manifests,
)

Use these constants to affect different aspects of this pipeline:
- `DATA_DIR`: base folder where data is stored
- `DATASET_NAME`: name of the dataset
- `RIVA_MODEL_DIR`: directory where the exported models will be saved (.riva and .rmir)
- `STT_MODEL_NAME`: name of the speech-to-text model 
- `SEA_MODEL_NAME`: name of the sentiment analysis model 

In the variable names, the `STT` prefix refers to the speech-to-text model and the `SEA` prefix to the sentiment analysis model.

#### NOTE: MAKE SURE THESE CONSTANTS ALIGN WITH `Call Center - Sentiment Analysis Pipeline.ipynb`

In [2]:
DATA_DIR = "data"
DATASET_NAME = "ReleasedDataset_mp3"
RIVA_MODEL_DIR = "/sfl_data/riva/models"

STT_MODEL_NAME = "speech-to-text-model.riva"
SEA_MODEL_NAME = "sentiment-analysis-model.riva"

## Setting up directories 

After installing TAO Toolkit, the next step is to set up the mounts. The TAO Toolkit launcher uses Docker containers under the hood, and **for our data and results directories to be visible to the container, they need to be mapped**. The launcher is configured through the file `~/.tao_mounts.json`. Apart from the mounts, you can also configure additional options such as environment variables and the amount of shared memory available to the TAO Toolkit launcher.

The code below creates a `~/.tao_mounts.json` file. This maps the directories in which we save the data, specs, results and cache. You should configure it for your specific case so these directories are correctly visible to the Docker container. The `source` directories are found on the host machine and use the `HOST` tag in the variable names (e.g. `STT_HOST_CONFIG_DIR`). The `destination` directories are found in the Docker container created by the TAO Toolkit and use the `TAO` tag in the variable names (e.g. `STT_TAO_CONFIG_DIR`).
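As a rough Python counterpart to the mapping described above, the `Mounts` section could be generated from a dictionary of directories instead of being typed by hand. This is only a sketch: `build_tao_mounts` is a hypothetical helper, the host paths are placeholders, and the result is printed rather than written to `~/.tao_mounts.json` so that nothing on disk is changed.

```python
import json

def build_tao_mounts(mapping):
    """Return a mounts config from {host_dir: container_dir} (hypothetical helper)."""
    return {
        "Mounts": [
            {"source": host, "destination": container}
            for host, container in mapping.items()
        ]
    }

# Placeholder host paths; replace them with your own directories.
config = build_tao_mounts({
    "/path/to/data": "/data",
    "/path/to/tao/config": "/config",
    "/path/to/tao/results": "/results",
    "/path/to/tao/.cache": "/root/.cache",
})
print(json.dumps(config, indent=4))
```

Keeping the mapping in one dictionary makes it easier to keep the `HOST` and `TAO` variables in step with the file the launcher reads.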

In [3]:
HOST_DATA_DIR    = "/sfl_data/devs/diego/NetApp_JarvisDemo/data"

# Speech to Text #
STT_HOST_CONFIG_DIR  = "/sfl_data/tao/config/speech_to_text"
STT_HOST_RESULTS_DIR = "/sfl_data/tao/results/speech_to_text"
STT_HOST_CACHE_DIR   = "/sfl_data/tao/.cache/speech_to_text"

# Sentiment Analysis #
SEA_HOST_CONFIG_DIR  = "/sfl_data/tao/config/sentiment_analysis"
SEA_HOST_RESULTS_DIR = "/sfl_data/tao/results/sentiment_analysis"
SEA_HOST_CACHE_DIR   = "/sfl_data/tao/.cache/sentiment_analysis"

In [4]:
!mkdir -p $STT_HOST_CONFIG_DIR
!mkdir -p $STT_HOST_RESULTS_DIR
!mkdir -p $STT_HOST_CACHE_DIR

!mkdir -p $SEA_HOST_CONFIG_DIR
!mkdir -p $SEA_HOST_RESULTS_DIR
!mkdir -p $SEA_HOST_CACHE_DIR

In [5]:
%%bash
tee ~/.tao_mounts.json <<'EOF'
{
   "Mounts":[
       {
           "source": "/sfl_data/devs/diego/NetApp_JarvisDemo/data" ,
           "destination": "/data"
       },
       {
           "source": "/sfl_data/tao/config" ,
           "destination": "/config"
       },
       {
           "source": "/sfl_data/tao/results" ,
           "destination": "/results"
       },
       {
           "source": "/sfl_data/tao/.cache",
           "destination": "/root/.cache"
       }
   ],
   "DockerOptions": {
        "shm_size": "128G",
        "ulimits": {
            "memlock": -1,
            "stack": 67108864
         }
   }
}
EOF


Check if the GPUs are available using the `nvidia-smi` command.

In [6]:
!nvidia-smi

Mon Sep 20 17:01:21 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla V100-SXM2...  On   | 00000000:06:00.0 Off |                    0 |
| N/A   37C    P0    82W / 300W |   7670MiB / 32510MiB |     48%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla V100-SXM2...  On   | 00000000:07:00.0 Off |                    0 |
| N/A   35C    P0    44W / 300W |      0MiB / 32510MiB |      0%      Default |
|       
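The same check can be scripted so that a notebook stops early when no GPU is visible. The sketch below assumes only that `nvidia-smi` may or may not be on the `PATH`; `list_gpus` is a hypothetical helper and returns an empty list when the tool is missing or fails.

```python
import shutil
import subprocess

def list_gpus():
    """Return one 'index, name, memory' string per GPU, or [] if none are visible."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,name,memory.total", "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]

gpus = list_gpus()
print(f"{len(gpus)} GPU(s) visible")
```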

You can check the docker image versions and the tasks that TAO Toolkit can perform with `tao --help` or `tao info`.

# [NetApp DataOps Toolkit](https://github.com/NetApp/netapp-dataops-toolkit)

The massive volume of calls that a call center must process on a daily basis means that a database can be quickly overwhelmed by audio files. Efficiently managing the processing and transfer of these audio files is an integral part of the model training and fine-tuning.

The data processing steps can be facilitated through the use of the **NetApp DataOps Toolkit**. This toolkit is a Python library that makes it simple for developers, data scientists, DevOps engineers, and data engineers to perform various data management tasks, such as provisioning a new data volume, near-instantaneously cloning a data volume, and near-instantaneously snapshotting a data volume for traceability/baselining. 

Installation and usage of the **NetApp DataOps Toolkit** for Traditional Environments requires that Python 3.6 or above be installed on the local host. Additionally, the toolkit requires that pip for Python3 be installed.

For more information on the **NetApp DataOps Toolkit**, click [here](https://github.com/NetApp/netapp-dataops-toolkit).

To install the **NetApp DataOps Toolkit** for Traditional Environments, run the following command.

```
python3 -m pip install netapp-dataops-traditional
```

A config file must be created before the **NetApp DataOps Toolkit** for Traditional Environments can be used to perform data management operations. To create a config file, run the following command. This command will create a config file named 'config.json' in '~/.netapp_dataops/'.

```
netapp_dataops_cli.py config
```

# Speech-to-Text

Speech-to-text (or automatic speech recognition, ASR) is part of NVIDIA's TAO Conversational AI Toolkit. This toolkit can train models for common conversational AI tasks such as text classification, question answering, speech recognition, and more.

For an overview of the Conversational AI Toolkit, click [here](https://ngc.nvidia.com/catalog/collections/nvidia:tao:tao_conversationalai).

### Set TAO Toolkit Paths

`NOTE`: The following paths are set from the perspective of the TAO Toolkit Docker.

In [7]:
# the data directory structure is based off main_RIVA.ipynb
# the config and results are manually created
STT_TAO_DATA_DIR = "/data"
STT_TAO_CONFIG_DIR = "/config/speech_to_text"
STT_TAO_RESULTS_DIR = "/results/speech_to_text"

# The encryption key from config.sh. Use the same key for all commands
KEY = 'tlt_encode'
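Because every path passed to a `tao` command is resolved inside the container, it can help to translate host paths mechanically. The sketch below is illustrative only: `host_to_container` and the `MOUNTS` list are hypothetical, and the host paths are placeholders for the `source` entries of your own mounts file.

```python
from pathlib import PurePosixPath

# Mount pairs as (host source, container destination); the host paths are placeholders.
MOUNTS = [
    ("/path/to/tao/config", "/config"),
    ("/path/to/tao/results", "/results"),
]

def host_to_container(host_path, mounts=MOUNTS):
    """Translate a host path into the path seen inside the container."""
    path = PurePosixPath(host_path)
    for source, destination in mounts:
        try:
            relative = path.relative_to(source)
        except ValueError:
            continue  # host_path is not under this mount
        return str(PurePosixPath(destination) / relative)
    raise ValueError(f"{host_path} is not under any mounted directory")

print(host_to_container("/path/to/tao/config/speech_to_text/finetune.yaml"))
```

A path that falls outside every mount raises an error, which is usually the cause when a `tao` command cannot find a file that exists on the host.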

### Downloading Specs

TAO's Conversational AI Toolkit works from spec files, which make it easy to edit hyperparameters on the fly. The first step is to download these spec files. You may modify or rewrite the specs, or override individual values through the launcher. The default spec files are downloaded with the `download_specs` command.

The `-o` argument indicates the folder where the default configuration files will be downloaded, and `-r` tells the script where to save the logs. Make sure `-o` points to an empty folder; otherwise the config files will not be downloaded. In particular, if you have already downloaded the config files, this command will not overwrite them.
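Since a non-empty target folder means the specs are silently not downloaded, a quick check before running the command can save some confusion. This is a minimal sketch with a hypothetical `is_empty_dir` helper. Note that Python runs on the host, so the path to check is the host directory (for example `STT_HOST_CONFIG_DIR`), not the container path passed to `-o`.

```python
import os
import tempfile

def is_empty_dir(path):
    """True if path does not exist yet or is a directory with no entries."""
    if not os.path.exists(path):
        return True
    return os.path.isdir(path) and not os.listdir(path)

with tempfile.TemporaryDirectory() as tmp:
    print(is_empty_dir(tmp))   # freshly created directory
    open(os.path.join(tmp, "finetune.yaml"), "w").close()
    print(is_empty_dir(tmp))   # now holds a spec file
```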

For more information on how to build and deploy models using the TAO Toolkit, visit [here](https://developer.nvidia.com/blog/building-and-deploying-conversational-ai-models-using-tao-toolkit/).

In [8]:
!tao speech_to_text download_specs \
    -r $STT_TAO_RESULTS_DIR \
    -o $STT_TAO_CONFIG_DIR

2021-09-20 17:01:22,711 [INFO] root: Registry: ['nvcr.io']
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/root/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
[NeMo W 2021-09-20 21:01:26 experimental:27] Module <class 'nemo.collections.asr.losses.ctc.CTCLoss'> is experimental, not ready for production and is not fully supported. Use at your own risk.
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /root/nltk_data...
[nltk_data]   Unzipping taggers/averaged_perceptron_tagger.zip.
[nltk_data] Downloading package cmudict to /root/nltk_data...
[nltk_data]   Unzipping corpora/cmudict.zip.
[NeMo W 2021-09-20 21:01:27 experimental:27] Module <class 'nemo.collections.asr.data.audio_to_text.AudioToCharDataset'> is experimental, not ready for production and is not fully supporte

### Overwrite Specs

The default speech-to-text fine-tuning spec is configured for Russian rather than English. `finetune.yaml` must be overwritten for the pipeline to work in English.

`NOTE`: **THE PATH TO `finetune.yaml` MUST ALIGN WITH `STT_HOST_CONFIG_DIR`**. If you change the `STT_HOST_CONFIG_DIR`, make sure you change the path between `tee` and `<<`.

In [9]:
STT_HOST_CONFIG_DIR

'/sfl_data/tao/config/speech_to_text'

In [10]:
%%bash
tee /sfl_data/tao/config/speech_to_text/finetune.yaml <<'EOF'

trainer:
  max_epochs: 1   # This is low for demo purposes

# Whether or not to change the decoder vocabulary.
# Note that this MUST be set if the labels change, e.g. to a different language's character set
# or if additional punctuation characters are added.
change_vocabulary: true

# Fine-tuning settings: training dataset
finetuning_ds:
  manifest_filepath: ???
  sample_rate: 16000
  labels: [" ", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m",
           "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "'"]
  batch_size: 16
  trim_silence: false
  max_duration: 16.7
  shuffle: true
  is_tarred: false
  tarred_audio_filepaths: null

# Fine-tuning settings: validation dataset
validation_ds:
  manifest_filepath: ???
  sample_rate: 16000
  labels: [" ", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m",
           "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "'"]
  batch_size: 32
  shuffle: false

# Fine-tuning settings: optimizer
optim:
  name: novograd
  lr: 0.001
EOF
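The `labels` list in the spec defines every character the decoder can emit, so transcripts in the manifests should contain only those characters. The sketch below shows one way to check and normalize a transcript against that set; both helpers are hypothetical and are not part of TAO Toolkit or of this project's utilities.

```python
import re

# Same 28 characters as the `labels` entries in the spec.
LABELS = set(" abcdefghijklmnopqrstuvwxyz'")

def unsupported_chars(text):
    """Return the sorted characters of text that are not in LABELS."""
    return sorted(set(text) - LABELS)

def normalize_transcript(text):
    """Lowercase text, drop characters outside LABELS, and collapse spaces."""
    kept = "".join(ch for ch in text.lower() if ch in LABELS)
    return re.sub(r" +", " ", kept).strip()

sample = "Q3 revenue grew 12%."
print(unsupported_chars(sample))
print(normalize_transcript(sample))
```

Dropping digits, as this sketch does, loses information; in practice numbers are usually spelled out before training, which is a preprocessing choice outside the scope of this example.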



## Training

Download the Jasper pretrained model [here](https://ngc.nvidia.com/catalog/models/nvidia:tlt-jarvis:speechtotext_english_jasper/files). The wget command below has been modified to automatically save the model in its target location (`STT_HOST_CONFIG_DIR`). Note that the pretrained model only needs to be downloaded once.

In [11]:
if not os.path.exists(os.path.join(STT_HOST_CONFIG_DIR, "speechtotext_english_jasper.tlt")):
    !wget -O $STT_HOST_CONFIG_DIR/speechtotext_english_jasper.tlt https://api.ngc.nvidia.com/v2/models/nvidia/tlt-jarvis/speechtotext_english_jasper/versions/trainable_v1.2/files/speechtotext_english_jasper.tlt

## Fine-Tuning

Before fine-tuning, we need to create a manifest for the train and test sets. There are a handful of parameters used to create this manifest:
- `N_SAMPLE`: number of calls to sample
- `MAX_DURATION`: maximum duration for the WAV files (for Jasper, 16.7 is the maximum)
- `SET_SIZES`: dictionary of set sizes (must include "train", "valid" and "test")
- `COMPANY_BLACKLIST`: blacklist of files to remove
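To make the role of `MAX_DURATION` and `SET_SIZES` concrete, the sketch below filters and splits a list of manifest entries. It is not the implementation of `prepare_train_test_manifests`, which lives in this project's `utils_TLT` module and is not shown here; `split_manifest` is a hypothetical helper, the entries are synthetic, and the `audio_filepath` / `duration` / `text` fields follow the JSON-lines layout commonly used for NeMo-style ASR manifests.

```python
import json
import random

def split_manifest(entries, set_sizes, max_duration, seed=0):
    """Filter entries by duration, shuffle, and split them by the given fractions."""
    if abs(sum(set_sizes.values()) - 1.0) > 1e-9:
        raise ValueError("set sizes must sum to 1")
    kept = [e for e in entries if e["duration"] <= max_duration]
    random.Random(seed).shuffle(kept)
    splits, start = {}, 0
    names = list(set_sizes)
    for i, name in enumerate(names):
        # The last split takes the remainder so that no entry is dropped.
        end = len(kept) if i == len(names) - 1 else start + int(len(kept) * set_sizes[name])
        splits[name] = kept[start:end]
        start = end
    return splits

# Synthetic entries with durations between 5 and 19 seconds.
entries = [
    {"audio_filepath": f"/data/clip_{i}.wav", "duration": 5.0 + i % 15, "text": "hello"}
    for i in range(100)
]
splits = split_manifest(entries, {"train": 0.75, "valid": 0.20, "test": 0.05}, max_duration=16.7)
for name, rows in splits.items():
    print(name, len(rows))
print(json.dumps(splits["train"][0]))
```

Each manifest file would then hold one such JSON object per line.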

In [12]:
COMPANY_BLACKLIST = [
    "Hormel Foods Corp._20170223",
    "Kraft Heinz Co_20170503",
    "Amazon.com Inc._20170202",
    "Vulcan Materials_20170802",
    "Masco Corp._20171024",
    "Fortive Corp_20170207",
    "Salesforce.com_20170228",
    "Home Depot_20170516",
    "Hasbro Inc._20170206",
    "Exxon Mobil Corp._20171027",
    "Biogen Inc._20170126",
    "Goodyear Tire & Rubber_20170428",
    "Alaska Air Group Inc_20171025",
    "FleetCor Technologies Inc_20170803",
    "Roper Technologies_20170209",
    "Foot Locker Inc_20170224",
    "Starbucks Corp._20170126",
    "Dover Corp._20170720",
    "Xerox_20170801",
    "AT&T Inc._2017042",
    "AT&T Inc._20170425",
    "Salesforce.com_20170822",
    "Varian Medical Systems_20171025",
]

N_SAMPLE = 200
MAX_DURATION = 16.7
SET_SIZES = {
    "train": 0.75,
    "valid": 0.20,
    "test": 0.05,
}

In [13]:
prepare_train_test_manifests(
    output_path       = STT_HOST_CONFIG_DIR,
    host_data_dir     = HOST_DATA_DIR,
    tlt_data_dir      = STT_TAO_DATA_DIR,
    dataset_name      = DATASET_NAME,
    set_sizes         = SET_SIZES,
    max_duration      = MAX_DURATION,
    company_blacklist = COMPANY_BLACKLIST,
    n_sample          = N_SAMPLE,
)

  0%|          | 0/200 [00:00<?, ?it/s]

Unable to load earnings call 'Amazon.com Inc._20170202'
[ERROR] list index out of range
Unable to load earnings call 'Foot Locker Inc_20170224'
[ERROR] list index out of range
Unable to load earnings call 'F5 Networks_20170726'
[ERROR] list index out of range
Unable to load earnings call 'Xcel Energy Inc_20170202'
[ERROR] list index out of range
Unable to load earnings call 'Goodyear Tire & Rubber_20170428'
[ERROR] list index out of range
Skipped Iron Mountain Incorporated_20170728
Unable to load earnings call 'Biogen Inc._20170126'
[ERROR] list index out of range
Unable to load earnings call 'ResMed_20170427'
[ERROR] list index out of range
Unable to load earnings call 'JPMorgan Chase & Co._20170714'
[ERROR] list index out of range
Unable to load earnings call 'Celgene Corp._20170427'
[ERROR] list index out of range
Unable to load earnings call 'Comerica Inc._20170418'
[ERROR] list index out of range
Unable to load earnings call 'Home Depot_20170516'
[ERROR] list index out of range
Un

Once the pretrained model is in place and the manifests are created, the following command can be used to fine-tune the ASR model.

In [14]:
!tao speech_to_text finetune \
     -e $STT_TAO_CONFIG_DIR/finetune.yaml \
     -g 1 \
     -k $KEY \
     -m $STT_TAO_CONFIG_DIR/speechtotext_english_jasper.tlt \
     -r $STT_TAO_RESULTS_DIR/finetune \
     finetuning_ds.manifest_filepath=$STT_TAO_CONFIG_DIR/train_manifest.json \
     validation_ds.manifest_filepath=$STT_TAO_CONFIG_DIR/valid_manifest.json \
     trainer.max_epochs=20 \
     finetuning_ds.max_duration=$MAX_DURATION \
     validation_ds.max_duration=$MAX_DURATION \
     finetuning_ds.trim_silence=false \
     validation_ds.trim_silence=false \
     finetuning_ds.batch_size=16 \
     validation_ds.batch_size=16 \
     finetuning_ds.num_workers=16 \
     validation_ds.num_workers=16 \
     trainer.gpus=1 \
     optim.lr=0.001

2021-09-20 17:01:51,118 [INFO] root: Registry: ['nvcr.io']
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/root/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
[NeMo W 2021-09-20 21:01:55 experimental:27] Module <class 'nemo.collections.asr.losses.ctc.CTCLoss'> is experimental, not ready for production and is not fully supported. Use at your own risk.
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /root/nltk_data...
[nltk_data]   Unzipping taggers/averaged_perceptron_tagger.zip.
[nltk_data] Downloading package cmudict to /root/nltk_data...
[nltk_data]   Unzipping corpora/cmudict.zip.
[NeMo W 2021-09-20 21:01:55 experimental:27] Module <class 'nemo.collections.asr.data.audio_to_text.AudioToCharDataset'> is experimental, not ready for production and is not fully supporte

[NeMo W 2021-09-20 21:02:10 modelPT:145] Please call the ModelPT.setup_training_data() method and provide a valid configuration file to setup the train data loader.
    Train config : 
    manifest_filepath: /data/fisher_min5sec/small_manifest.json
    batch_size: 32
    sample_rate: 16000
    labels:
    - ' '
    - a
    - b
    - c
    - d
    - e
    - f
    - g
    - h
    - i
    - j
    - k
    - l
    - m
    - 'n'
    - o
    - p
    - q
    - r
    - s
    - t
    - u
    - v
    - w
    - x
    - 'y'
    - z
    - ''''
    num_workers: null
    trim_silence: true
    shuffle: true
    max_duration: 16.7
    is_tarred: false
    tarred_audio_filepaths: null
    
[NeMo W 2021-09-20 21:02:10 modelPT:152] Please call the ModelPT.setup_validation_data() or ModelPT.setup_multiple_validation_data() method and provide a valid configuration file to setup the validation data loader(s). 
    Validation config : 
    manifest_filepath: /data/fisher_min5sec/small_manifest.json
    batch_

Restored states from the checkpoint file at /results/speech_to_text/finetune/checkpoints/finetuned-model-last.ckpt
Restored states from the checkpoint file at /results/speech_to_text/finetune/checkpoints/finetuned-model-last.ckpt
Validation sanity check: 0it [00:00, ?it/s][NeMo W 2021-09-20 21:03:29 patch_utils:49] torch.stft() signature has been updated for PyTorch 1.7+
    Please update PyTorch to remain compatible with later versions of NeMo.
    
Training: 0it [00:00, ?it/s]                                                    
[NeMo I 2021-09-20 21:05:05 finetune:138] Experiment logs saved to '/results/speech_to_text/finetune'
[NeMo I 2021-09-20 21:05:05 finetune:139] Fine-tuned model saved to '/results/speech_to_text/finetune/checkpoints/finetuned-model.tlt'
2021-09-20 17:05:07,658 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.


## Evaluate

Once the model has been fine-tuned, its performance needs to be assessed. The cell below evaluates the fine-tuned model on the test manifest; evaluating the base Jasper model in the same way gives the baseline for comparison.

In [15]:
!tao speech_to_text evaluate \
     -e $STT_TAO_CONFIG_DIR/evaluate.yaml \
     -g 1 \
     -k $KEY \
     -m $STT_TAO_RESULTS_DIR/finetune/checkpoints/finetuned-model.tlt \
     -r $STT_TAO_RESULTS_DIR/evaluate \
     test_ds.manifest_filepath=$STT_TAO_CONFIG_DIR/test_manifest.json

2021-09-20 17:05:18,564 [INFO] root: Registry: ['nvcr.io']
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/root/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
[NeMo W 2021-09-20 21:05:22 experimental:27] Module <class 'nemo.collections.asr.losses.ctc.CTCLoss'> is experimental, not ready for production and is not fully supported. Use at your own risk.
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /root/nltk_data...
[nltk_data]   Unzipping taggers/averaged_perceptron_tagger.zip.
[nltk_data] Downloading package cmudict to /root/nltk_data...
[nltk_data]   Unzipping corpora/cmudict.zip.
[NeMo W 2021-09-20 21:05:23 experimental:27] Module <class 'nemo.collections.asr.data.audio_to_text.AudioToCharDataset'> is experimental, not ready for production and is not fully supporte

[NeMo I 2021-09-20 21:06:15 collections:173] Dataset loaded with 991 files totalling 2.53 hours
[NeMo I 2021-09-20 21:06:15 collections:174] 0 files were filtered totalling 0.00 hours
initializing ddp: GLOBAL_RANK: 0, MEMBER: 1/1
initializing ddp: GLOBAL_RANK: 0, MEMBER: 1/1
Added key: store_based_barrier_key:1 to store for rank: 0
    
Testing: 0it [00:00, ?it/s][NeMo W 2021-09-20 21:06:30 patch_utils:49] torch.stft() signature has been updated for PyTorch 1.7+
    Please update PyTorch to remain compatible with later versions of NeMo.
    
Testing: 100%|██████████████████████████████████| 31/31 [05:55<00:00, 11.48s/it]
--------------------------------------------------------------------------------
DATALOADER:0 TEST RESULTS
{'test_loss': tensor(143.2543, device='cuda:0'),
 'test_wer': tensor(0.4080, device='cuda:0')}
--------------------------------------------------------------------------------
[NeMo I 2021-09-20 21:12:12 evaluate:99] Experiment logs saved to '/results/speech_to_te
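The `test_wer` value reported above is the word error rate: the number of word substitutions, insertions and deletions needed to turn the model output into the reference transcript, divided by the number of reference words. A value of 0.4080 therefore corresponds to roughly 41 word errors per 100 reference words. The function below is an illustrative sketch of that definition for a single sentence pair, not the NeMo implementation.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # previous[j] = edit distance between the reference prefix seen so far and hyp[:j]
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            current.append(min(substitution, previous[j] + 1, current[j - 1] + 1))
        previous = current
    return previous[-1] / len(ref)

print(word_error_rate("net revenue rose five percent",
                      "net revenue rose by five per cent"))
```

Because insertions count as errors, the rate can exceed 1.0 when the hypothesis is much longer than the reference.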

## Export Model to RIVA

With TAO Toolkit, you can also export your model in a format that can be deployed using NVIDIA RIVA.

In [16]:
!tao speech_to_text export \
     -e $STT_TAO_CONFIG_DIR/export.yaml \
     -g 1 \
     -k $KEY \
     -m $STT_TAO_RESULTS_DIR/finetune/checkpoints/finetuned-model.tlt \
     -r $STT_TAO_RESULTS_DIR/riva \
     export_format=JARVIS \
     export_to=$STT_MODEL_NAME

2021-09-20 17:12:25,262 [INFO] root: Registry: ['nvcr.io']
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/root/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
[NeMo W 2021-09-20 21:12:29 experimental:27] Module <class 'nemo.collections.asr.losses.ctc.CTCLoss'> is experimental, not ready for production and is not fully supported. Use at your own risk.
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /root/nltk_data...
[nltk_data]   Unzipping taggers/averaged_perceptron_tagger.zip.
[nltk_data] Downloading package cmudict to /root/nltk_data...
[nltk_data]   Unzipping corpora/cmudict.zip.
[NeMo W 2021-09-20 21:12:30 experimental:27] Module <class 'nemo.collections.asr.data.audio_to_text.AudioToCharDataset'> is experimental, not ready for production and is not fully supporte

Copy the model to the RIVA model directory (`RIVA_MODEL_DIR`). **NOTE: Make sure this directory aligns with the config.sh file for RIVA.**

In [17]:
!cp $STT_HOST_RESULTS_DIR/riva/$STT_MODEL_NAME $RIVA_MODEL_DIR/$STT_MODEL_NAME

### Business Use Case

There are business-specific words and phrases that base ASR models (such as Jasper and QuartzNet) are not likely to have seen in their training. Fine-tuning integrates this industry-specific vocabulary into the model.

# Sentiment Analysis

### Set TAO Toolkit Paths

`NOTE`: The following paths are set from the perspective of the TAO Toolkit Docker.

In [None]:
SEA_TAO_DATA_DIR = "/data"
SEA_TAO_CONFIG_DIR = "/config/sentiment_analysis"
SEA_TAO_RESULTS_DIR = "/results/sentiment_analysis"

# The encryption key from config.sh. Use the same key for all commands
KEY = 'tlt_encode'

### Downloading Specs
The next step is to download the spec files. You may modify or rewrite these specs, or override individual values through the launcher. The default spec files are downloaded with the `download_specs` command.

The `-o` argument indicates the folder where the default specification files will be downloaded, and `-r` tells the script where to save the logs. **Make sure `-o` points to an empty folder!**

In [None]:
!tao text_classification download_specs \
    -r $SEA_TAO_RESULTS_DIR \
    -o $SEA_TAO_CONFIG_DIR

<a id='dataset-convert'></a>
### Dataset Convert

The `dataset_convert` command takes two directory overrides:
- `source_data_dir`: directory path for the downloaded dataset
- `target_data_dir`: directory path for the processed dataset

In [None]:
# For BERT dataset conversion
!tao text_classification dataset_convert \
    -e $SEA_TAO_CONFIG_DIR/dataset_convert.yaml \
    -r $SEA_TAO_CONFIG_DIR/dataset_convert \
    dataset_name=sst2 source_data_dir=$SEA_TAO_DATA_DIR/SST-2 target_data_dir=$SEA_TAO_DATA_DIR/processed

<a id='training'></a>
### Training

Training a model using TAO Toolkit is as simple as configuring your spec file and running the train command. The code cell below uses the default `train.yaml`, provided as a reference. It is configured by default to use the `bert-base-uncased` pretrained model. These settings can be overridden from the TAO launcher command line: the cell below overrides `training_ds.file_path`, `validation_ds.file_path`, `model.class_labels.class_labels_file`, `trainer.max_epochs` and `optim.lr`. We encourage you to take a look at the .yaml spec files we provide!

Good results typically require training for 20-50 epochs, depending on the size of the data. The 10 epochs used in the cell below are for demonstration purposes.


For training a Text Classification model in TAO Toolkit, we use the `tao text_classification train` command with the following args:
- `-e`: Path to the spec file
- `-g`: Number of GPUs to use
- `-k`: User specified encryption key to use while saving/loading the model
- `-r`: Path to a folder where the outputs should be written. Make sure this is mapped in `~/.tao_mounts.json`
- Any overrides to the spec file, e.g. `trainer.max_epochs`
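The argument pattern listed above is the same for every `tao` command in this notebook, so the command line can be assembled programmatically. The helper below is a hypothetical sketch: it only builds and prints the argument list using the flags described in this notebook and does not run anything. Passing the overrides as a dictionary also makes it impossible to specify the same key twice by accident.

```python
import shlex

def build_tao_command(task, action, spec, results_dir, key,
                      gpus=1, model=None, overrides=None):
    """Assemble a `tao <task> <action>` argument list (hypothetical helper)."""
    command = ["tao", task, action,
               "-e", spec, "-g", str(gpus), "-k", key, "-r", results_dir]
    if model is not None:
        command += ["-m", model]
    # Each override becomes a key=value argument after the flags.
    for name, value in (overrides or {}).items():
        command.append(f"{name}={value}")
    return command

command = build_tao_command(
    "text_classification", "train",
    spec="/config/sentiment_analysis/train.yaml",
    results_dir="/results/sentiment_analysis/train",
    key="tlt_encode",
    overrides={"trainer.max_epochs": 10,
               "training_ds.file_path": "/data/processed/train.tsv"},
)
print(shlex.join(command))
```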

In [None]:
# For BERT training on SST-2
!tao text_classification train \
    -e $SEA_TAO_CONFIG_DIR/train.yaml \
    -g 1  \
    -k $KEY \
    -r $SEA_TAO_RESULTS_DIR/train \
    training_ds.file_path=$SEA_TAO_DATA_DIR/processed/train.tsv \
    validation_ds.file_path=$SEA_TAO_DATA_DIR/processed/dev.tsv \
    model.class_labels.class_labels_file=$SEA_TAO_DATA_DIR/processed/label_ids.csv \
    trainer.max_epochs=10 \
    optim.lr=0.00002

<a id='ft'></a>
### Fine-Tuning

The fine-tuning command is very similar to the training command: `tao text_classification finetune` replaces `tao text_classification train`, the spec file is the one for fine-tuning, and `-m` points to the model to start from. All commands in TAO Toolkit follow a similar pattern.


In [None]:
!tao text_classification finetune \
    -e $SEA_TAO_CONFIG_DIR/finetune.yaml \
    -g 1 \
    -m $SEA_TAO_RESULTS_DIR/train/checkpoints/trained-model.tlt \
    -k $KEY \
    -r $SEA_TAO_RESULTS_DIR/finetune \
    finetuning_ds.file_path=$SEA_TAO_DATA_DIR/processed/train.tsv \
    validation_ds.file_path=$SEA_TAO_DATA_DIR/processed/dev.tsv \
    trainer.max_epochs=10 \
    optim.lr=0.00002

<a id='evaluation'></a>
### Evaluation
The evaluation spec .yaml is as simple as:

```
test_ds:
  file_path: ??? # e.g. $DATA_DIR/test.tsv
  batch_size: 32
  shuffle: false
  num_samples: 500
```

Below, we use `tao text_classification evaluate` and override the test data configuration by specifying `test_ds.file_path`. Other arguments follow the same pattern as before.

In [None]:
!tao text_classification evaluate \
    -e $SEA_TAO_CONFIG_DIR/evaluate.yaml \
    -g 1 \
    -m $SEA_TAO_RESULTS_DIR/finetune/checkpoints/finetuned-model.tlt \
    -k $KEY \
    -r $SEA_TAO_RESULTS_DIR/evaluate \
    test_ds.file_path=$SEA_TAO_DATA_DIR/SST-2/dev.tsv \
    test_ds.batch_size=32

<a id='export-riva'></a>
### Export to Riva

With TAO Toolkit, you can also export your model in a format that can be deployed using [NVIDIA Riva](https://developer.nvidia.com/riva), a highly performant application framework for multi-modal conversational AI services using GPUs. The command is the same as for exporting to ONNX; the only variation is the value of `export_format`, which is set here as a command-line override.

In [None]:
!tao text_classification export \
    -e $SEA_TAO_CONFIG_DIR/export.yaml \
    -g 1 \
    -m $SEA_TAO_RESULTS_DIR/finetune/checkpoints/finetuned-model.tlt \
    -k $KEY \
    -r $SEA_TAO_RESULTS_DIR/riva \
    export_format=JARVIS \
    export_to=$SEA_MODEL_NAME

Copy the model to the RIVA model directory (`RIVA_MODEL_DIR`). **`NOTE`: Make sure this directory aligns with the config.sh file for RIVA.**

In [None]:
!cp $SEA_HOST_RESULTS_DIR/riva/$SEA_MODEL_NAME $RIVA_MODEL_DIR/$SEA_MODEL_NAME

### Business Use Case

There are customer-specific words and phrases that base BERT models are not likely to have seen in their training. Fine-tuning integrates this industry-specific vocabulary into the model.

# TAO Toolkit-RIVA Integration

In the previous sections, the speech-to-text and sentiment analysis models were saved to the .riva format. The next step is to convert these models to the .rmir format so they can be deployed on a RIVA server.

In [None]:
# ServiceMaker Docker image (must be the same as the image_init_speech in config.sh)
RIVA_SM_CONTAINER = "nvcr.io/nvidia/riva/riva-speech:1.4.0-beta-servicemaker"

# Model names (must be the same as <MODEL_NAMES> in config.sh)
RIVA_STT_MODEL_NAME = STT_MODEL_NAME.replace(".riva",".rmir")
RIVA_SEA_MODEL_NAME = SEA_MODEL_NAME.replace(".riva",".rmir")

print(f"Saved models '{RIVA_STT_MODEL_NAME}' and '{RIVA_SEA_MODEL_NAME}'")

Check that all the trained models are in the correct directory.

In [None]:
trained_models = os.listdir(RIVA_MODEL_DIR)
trained_models = [f for f in trained_models if ".riva" in f]

assert STT_MODEL_NAME in trained_models, f"Missing the speech-to-text model named '{STT_MODEL_NAME}'"
assert SEA_MODEL_NAME in trained_models, f"Missing the sentiment analysis model named '{SEA_MODEL_NAME}'"

### 1. Riva-build

This step builds a Riva-ready version of the model. Its only output is an intermediate format (called an RMIR) of an end-to-end pipeline for the supported services within Riva. Here it is applied to the fine-tuned Jasper speech-to-text model and to the sentiment analysis model.

riva-build is responsible for the combination of one or more exported models (.riva files) into a single file containing an intermediate format called Riva Model Intermediate Representation (.rmir). This file contains a deployment-agnostic specification of the whole end-to-end pipeline along with all the assets required for the final deployment and inference.

In [None]:
!docker pull $RIVA_SM_CONTAINER

#### Builds the speech-to-text model

In [None]:
!docker run --rm --gpus 0 -v $RIVA_MODEL_DIR:/data $RIVA_SM_CONTAINER -- \
            riva-build speech_recognition -f /data/$RIVA_STT_MODEL_NAME:$KEY /data/$STT_MODEL_NAME:$KEY \
            --offline --decoder_type=greedy

#### Builds the sentiment analysis model

In [None]:
!docker run --rm --gpus 0 -v $RIVA_MODEL_DIR:/data $RIVA_SM_CONTAINER -- \
            riva-build text_classification -f /data/$RIVA_SEA_MODEL_NAME:$KEY /data/$SEA_MODEL_NAME:$KEY

### 2. Riva-deploy

The deployment tool takes as input one or more Riva Model Intermediate Representation (RMIR) files and a target model repository directory. It creates an ensemble configuration specifying the pipeline for the execution and finally writes all those assets to the output model repository directory.

Note that this step is analogous to running `riva_init.sh`.

In [None]:
!docker run --rm --gpus 0 -v $RIVA_MODEL_DIR:/data $RIVA_SM_CONTAINER -- \
            riva-deploy -f /data/$RIVA_STT_MODEL_NAME:$KEY /data/models

In [None]:
!docker run --rm --gpus 0 -v $RIVA_MODEL_DIR:/data $RIVA_SM_CONTAINER -- \
            riva-deploy -f /data/$RIVA_SEA_MODEL_NAME:$KEY /data/models