In [None]:
"""
You can either run this notebook locally (if you have all the dependencies and a GPU) or on Google Colab.

Instructions for setting up Colab are as follows:
1. Open a new Python 3 notebook.
2. Import this notebook from GitHub (File -> Upload Notebook -> "GITHUB" tab -> copy/paste GitHub URL)
3. Connect to an instance with a GPU (Runtime -> Change runtime type -> select "GPU" for hardware accelerator)
4. Run this cell to set up dependencies.
"""
# If you're using Google Colab and not running locally, run this cell

# install NeMo
BRANCH = 'main'
!python -m pip install git+https://github.com/NVIDIA/NeMo.git@$BRANCH#egg=nemo_toolkit[nlp]

In [None]:
from nemo.collections import nlp as nemo_nlp
from nemo.utils.exp_manager import exp_manager
from nemo.utils import logging

import os
import wget
import torch
import pytorch_lightning as pl
from omegaconf import OmegaConf

# Task Description
**Joint Intent and Slot classification** is the task of classifying the Intent of a query and detecting all relevant Slots (Entities)
for this Intent.
For example, in the query `What is the weather in Santa Clara tomorrow morning?`, we would like to classify the query
as a `weather` Intent, and detect `Santa Clara` as a `location` slot and `tomorrow morning` as a `date_time` slot.
Intent and Slot names are usually task-specific and defined as labels in the training data.
This is a fundamental step that is executed in any task-driven Conversational Assistant.

Our BERT-based model implementation lets you train a single model that performs both of these tasks together.


# Dataset and NeMo data format

In this tutorial we are going to use a virtual assistant interaction dataset that can be downloaded from here: https://github.com/xliuhw/NLU-Evaluation-Data.
There are about 10K training and 1K testing queries which cover 64 different Intents and 55 Slots.

To work with the NeMo NLP classification model, this dataset should first be converted to the NeMo format, which requires the following files:
- **dict.intents.csv** - list of all intent names in the data. One line per intent name.
- **dict.slots.csv** - list of all slot names in the data. One line per slot name. It is possible to use the B- and I- notations to distinguish between the first and intermediate tokens of multi-token slots, or to use a single slot type for every token of a multi-token slot. We recommend the latter, since it is simpler and there is no visible degradation in performance.
- **train.tsv/test.tsv** - contain the original queries, one per line, and the intent number separated by a tab. For example: `what alarms do i have set right now	0`. Intent numbers correspond to the line of the intent in the intent dictionary file (dict.intents.csv), starting from 0. The first line of these files is a header line: `sentence \tab label`.
- **train_slots.tsv/test_slots.tsv** - contain one line per query, where each token is replaced by the number of its slot label in the slots dictionary file (dict.slots.csv), starting from 0. The 'out-of-scope' label is usually located in the last line of the dictionary. Example: `54 0 0 54 54 12 12` (numbers separated by spaces). There is no header line in these files.
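
To make the file layout above concrete, here is a minimal sketch that reads the four files back and pairs every query with its intent name and per-token slot names. The helper names (`read_lines`, `decode_examples`) and the three-line demo dataset are hypothetical and only illustrate the format; they are not part of NeMo.

```python
import os
import tempfile


def read_lines(path):
    """Return the non-empty lines of a text file, without trailing newlines."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f if line.strip()]


def decode_examples(data_dir, prefix="train"):
    """Pair each query with its intent name and per-token slot names."""
    intents = read_lines(os.path.join(data_dir, "dict.intents.csv"))
    slots = read_lines(os.path.join(data_dir, "dict.slots.csv"))
    # the first line of <prefix>.tsv is the header, so skip it
    queries = read_lines(os.path.join(data_dir, f"{prefix}.tsv"))[1:]
    # the slot file has no header line
    slot_rows = read_lines(os.path.join(data_dir, f"{prefix}_slots.tsv"))

    examples = []
    for query_line, slot_line in zip(queries, slot_rows):
        sentence, intent_id = query_line.rsplit("\t", 1)
        tokens = sentence.split()
        slot_names = [slots[int(i)] for i in slot_line.split()]
        examples.append((intents[int(intent_id)], list(zip(tokens, slot_names))))
    return examples


# tiny hand-made dataset (hypothetical labels, not the real assistant data)
demo_dir = tempfile.mkdtemp()
demo_files = {
    "dict.intents.csv": "alarm_query\nweather_query\n",
    "dict.slots.csv": "date\nplace_name\nO\n",
    "train.tsv": "sentence\tlabel\nweather in boston tomorrow\t1\n",
    "train_slots.tsv": "2 2 1 0\n",
}
for name, content in demo_files.items():
    with open(os.path.join(demo_dir, name), "w", encoding="utf-8") as f:
        f.write(content)

examples = decode_examples(demo_dir)
print(examples[0])
```

Once the real dataset has been converted, the same function could be pointed at the `nemo_format` directory to spot-check a few decoded examples.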

NeMo provides the **import_datasets.py** converter for a few reference datasets (Assistant / Atis / Snips), which converts them to the NeMo data format for the Intent and Slot classification model. If you have your own annotated dataset in a different format, you will need to write a data converter. One recommended format for your own annotation is to have one text file per intent, containing all examples of that intent, with one line per query in a form like: `did i set an alarm to [alarm_type : wake up] in the [timeofday : morning]`, using brackets to define slot names. This is very similar to the assistant format from this example, and you can use its converter to the NeMo format with small changes.

You can run this utility as follows:

**python examples/nlp/intent_slot_classification/data/import_datasets.py --dataset_name=assistant --source_data_dir=source_dir_name --target_data_dir=target_dir_name**
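
If you write your own converter for the bracketed annotation format described above, its core step is turning one annotated line into tokens and per-token slot labels. The following is a minimal sketch of that step, assuming whitespace tokenization and a single label for every token of a multi-token slot; `parse_annotated_query` and `SLOT_PATTERN` are hypothetical names and this is not the code used by `import_datasets.py`.

```python
import re

# matches "[slot_name : slot value]" as in the annotation example
SLOT_PATTERN = re.compile(r"\[\s*(\w+)\s*:\s*([^\]]+?)\s*\]")


def parse_annotated_query(line, outside_label="O"):
    """Split an annotated query into tokens and one slot label per token."""
    tokens, labels = [], []
    position = 0
    for match in SLOT_PATTERN.finditer(line):
        # plain text before the bracket is outside of any slot
        for token in line[position:match.start()].split():
            tokens.append(token)
            labels.append(outside_label)
        slot_name, slot_value = match.group(1), match.group(2)
        # every token of a multi-token slot gets the same label (no B-/I- prefixes)
        for token in slot_value.split():
            tokens.append(token)
            labels.append(slot_name)
        position = match.end()
    # remaining plain text after the last bracket
    for token in line[position:].split():
        tokens.append(token)
        labels.append(outside_label)
    return tokens, labels


tokens, labels = parse_annotated_query(
    "did i set an alarm to [alarm_type : wake up] in the [timeofday : morning]"
)
for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```

Mapping the label names to their line numbers in dict.slots.csv then gives the numeric rows expected in the slot files.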


# Download, preprocess and explore the dataset
## Download the dataset and convert it to the NeMo format

In [None]:
# you can replace DATA_DIR and NEMO_DIR with your own locations
DATA_DIR = "."
NEMO_DIR = '.'

# download the converter files from github for the purpose of this tutorial
wget.download(f'https://raw.githubusercontent.com/NVIDIA/NeMo/{BRANCH}/examples/nlp/intent_slot_classification/data/import_datasets.py', NEMO_DIR)
wget.download(f'https://raw.githubusercontent.com/NVIDIA/NeMo/{BRANCH}/examples/nlp/intent_slot_classification/data/assistant_utils.py', NEMO_DIR)

In [None]:
# download and unzip the example dataset from github
print('Downloading dataset...')
wget.download('https://github.com/xliuhw/NLU-Evaluation-Data/archive/master.zip', DATA_DIR)
! unzip {DATA_DIR}/NLU-Evaluation-Data-master.zip -d {DATA_DIR}

In [None]:
# convert the dataset to the NeMo format
!python {NEMO_DIR}/import_datasets.py --dataset_name=assistant --source_data_dir={DATA_DIR}/NLU-Evaluation-Data-master --target_data_dir={DATA_DIR}/nemo_format


## Data exploration
You can see the dataset in both the original and NeMo's formats. We have here 64 different Intents and 55 Slots, which could be typical commands for virtual assistants. The out-of-scope slot has the name 'O' and is the last in the dictionary of Slots. We can also see examples of queries and the format of the training intent and slot files.

In [None]:
# list of queries divided by intent files in the original training dataset
! ls -l {DATA_DIR}/NLU-Evaluation-Data-master/dataset/trainset

In [None]:
# print all intents from the NeMo format intent dictionary
!echo 'Intents: ' $(wc -l < {DATA_DIR}/nemo_format/dict.intents.csv)
! cat {DATA_DIR}/nemo_format/dict.intents.csv

In [None]:
# print all slots from the NeMo format slot dictionary
!echo 'Slots: ' $(wc -l < {DATA_DIR}/nemo_format/dict.slots.csv)
! cat {DATA_DIR}/nemo_format/dict.slots.csv

In [None]:
# examples from the intent training file
! head -n 10 {DATA_DIR}/nemo_format/train.tsv

In [None]:
# examples from the slot training file
! head -n 10 {DATA_DIR}/nemo_format/train_slots.tsv

# Training model

## Model configuration

Our Joint Intent and Slot classification model is comprised of the pretrained [BERT](https://arxiv.org/pdf/1810.04805.pdf) model with an Intent and Slot Classification layer on top of it.

All model and training parameters are defined in the **intent_slot_classification_config.yaml** config file. This file is located in the folder **examples/nlp/intent_slot_classification/conf/**. It contains 2 main sections:
- **model**: All arguments that are related to the Model - language model, token classifier, optimizer and schedulers, datasets and any other related information

- **trainer**: Any argument to be passed to PyTorch Lightning

We will download the config file from the repository for the purpose of the tutorial. If you have a version of NeMo installed locally, you can use it from the above folder.

In [None]:
# download the model config file from repository for the purpose of this example
wget.download(f'https://raw.githubusercontent.com/NVIDIA/NeMo/{BRANCH}/examples/nlp/intent_slot_classification/conf/intent_slot_classification_config.yaml', NEMO_DIR)

# print content of the config file
config_file = "intent_slot_classification_config.yaml"
print(config_file)
config = OmegaConf.load(config_file)
print(OmegaConf.to_yaml(config))

## Setting up Data within the config

Among other things, the config file contains dictionaries called train_ds and validation_ds. These are the configurations used to set up the Dataset and DataLoader for the corresponding data split.

The converter utility creates both training and evaluation files in the same directory, so we need to set the `model.data_dir` parameter to this directory. Also notice that some config lines, including `model.data_dir`, have `???` in place of paths. This means that values for these fields must be specified by the user.

The `config.model.intent_loss_weight` parameter balances the training loss between the Intent and Slot losses and is a number between 0 and 1. Its default value is 0.6, which gives slightly higher priority to the Intent loss and works quite well empirically. You can experiment with this value if you like.
You can also try changing the `config.model.class_balancing` parameter to `weighted_loss` and see if you get better accuracy.
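
As a rough illustration of what this balance means, the sketch below combines two loss values as a weighted sum, with `intent_loss_weight` applied to the Intent loss and the remainder to the Slot loss. The function `joint_loss` is a hypothetical helper written for this tutorial, and the weighted-sum form is an assumption about how the two losses are combined, not a copy of NeMo's code.

```python
def joint_loss(intent_loss, slot_loss, intent_loss_weight=0.6):
    """Weighted sum of the two task losses (assumed form of the balance)."""
    if not 0.0 <= intent_loss_weight <= 1.0:
        raise ValueError("intent_loss_weight must be between 0 and 1")
    return intent_loss_weight * intent_loss + (1.0 - intent_loss_weight) * slot_loss


# with weight 0.0 only the slot loss counts, with 1.0 only the intent loss
for weight in (0.0, 0.6, 1.0):
    print(weight, joint_loss(intent_loss=2.0, slot_loss=1.0, intent_loss_weight=weight))
```
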

Let's now add the data directory path to the config.

In [None]:
config.model.data_dir = f'{DATA_DIR}/nemo_format'

## Building the PyTorch Lightning Trainer

NeMo models are primarily PyTorch Lightning modules and are therefore entirely compatible with the PyTorch Lightning ecosystem. The `config.trainer.max_epochs` parameter defines the number of training epochs. Usually 50-100 epochs or fewer should be enough to train on your data. Let's instantiate the Trainer object.

In [None]:
# let's modify some trainer configs
# check if a GPU is available and use it
cuda = 1 if torch.cuda.is_available() else 0
config.trainer.gpus = cuda

config.trainer.precision = 16 if torch.cuda.is_available() else 32

# for mixed precision training, uncomment the line below (precision should be set to 16 and amp_level to O1):
# config.trainer.amp_level = 'O1'

# remove distributed training flags
config.trainer.accelerator = None

# setup a small number of epochs for demonstration purposes of this tutorial
config.trainer.max_epochs = 5

trainer = pl.Trainer(**config.trainer)

## Setting up a NeMo Experiment

NeMo has an experiment manager that handles logging and checkpointing for us, so let's use it. Model checkpoints created during training will be saved in the experiment directory.

In [None]:
exp_dir = exp_manager(trainer, config.get("exp_manager", None))
# the exp_dir provides a path to the current experiment for easy access
print(str(exp_dir))

## Initializing the model and Training

Initial statistics of the dataset will be displayed at the beginning of training, and an Intent and Slot classification report will be displayed after each training epoch.

In [None]:
# initialize the model
model = nemo_nlp.models.IntentSlotClassificationModel(config.model, trainer=trainer)

# train
trainer.fit(model)

After training for 5 epochs, which should take no more than a few minutes, you can expect the classification report for this dataset to be around these numbers (the accuracy will gradually continue to improve for this dataset up to about 50 epochs of training):
```
Intents:
    label                                                precision    recall       f1           support   
    alarm_query (label_id: 0)                               94.74      94.74      94.74         19
    alarm_remove (label_id: 1)                             100.00     100.00     100.00         11
    alarm_set (label_id: 2)                                 85.71      94.74      90.00         19
    audio_volume_down (label_id: 3)                          0.00       0.00       0.00          8
    audio_volume_mute (label_id: 4)                        100.00      86.67      92.86         15
    audio_volume_up (label_id: 5)                           56.52     100.00      72.22         13
    calendar_query (label_id: 6)                            55.00      57.89      56.41         19
    calendar_remove (label_id: 7)                           88.89      84.21      86.49         19
    calendar_set (label_id: 8)                              81.25      68.42      74.29         19
    cooking_recipe (label_id: 9)                            86.36     100.00      92.68         19
    datetime_convert (label_id: 10)                          0.00       0.00       0.00          8
    datetime_query (label_id: 11)                           65.52     100.00      79.17         19
    email_addcontact (label_id: 12)                        100.00      12.50      22.22          8
    email_query (label_id: 13)                              83.33      78.95      81.08         19
    email_querycontact (label_id: 14)                       62.50      78.95      69.77         19
    email_sendemail (label_id: 15)                          70.83      89.47      79.07         19
    general_affirm (label_id: 16)                           95.00     100.00      97.44         19
    general_commandstop (label_id: 17)                     100.00     100.00     100.00         19
    general_confirm (label_id: 18)                         100.00     100.00     100.00         19
    general_dontcare (label_id: 19)                        100.00     100.00     100.00         19
    general_explain (label_id: 20)                         100.00      94.74      97.30         19
    general_joke (label_id: 21)                            100.00     100.00     100.00         12
    general_negate (label_id: 22)                           95.00     100.00      97.44         19
    general_praise (label_id: 23)                          100.00      94.74      97.30         19
    general_quirky (label_id: 24)                           40.00      10.53      16.67         19
    general_repeat (label_id: 25)                          100.00     100.00     100.00         19
    iot_cleaning (label_id: 26)                             84.21     100.00      91.43         16
    iot_coffee (label_id: 27)                               94.74      94.74      94.74         19
    iot_hue_lightchange (label_id: 28)                      94.44      89.47      91.89         19
    iot_hue_lightdim (label_id: 29)                        100.00      83.33      90.91         12
    iot_hue_lightoff (label_id: 30)                         89.47      89.47      89.47         19
    iot_hue_lighton (label_id: 31)                           0.00       0.00       0.00          3
    iot_hue_lightup (label_id: 32)                          81.25      92.86      86.67         14
    iot_wemo_off (label_id: 33)                             60.00     100.00      75.00          9
    iot_wemo_on (label_id: 34)                             100.00      14.29      25.00          7
    lists_createoradd (label_id: 35)                        78.95      78.95      78.95         19
    lists_query (label_id: 36)                              78.95      78.95      78.95         19
    lists_remove (label_id: 37)                             90.00      94.74      92.31         19
    music_likeness (label_id: 38)                           70.59      66.67      68.57         18
    music_query (label_id: 39)                              77.78      73.68      75.68         19
    music_settings (label_id: 40)                            0.00       0.00       0.00          7
    news_query (label_id: 41)                               77.78      73.68      75.68         19
    play_audiobook (label_id: 42)                           90.00      94.74      92.31         19
    play_game (label_id: 43)                                80.00      84.21      82.05         19
    play_music (label_id: 44)                               53.85      73.68      62.22         19
    play_podcasts (label_id: 45)                            89.47      89.47      89.47         19
    play_radio (label_id: 46)                               93.75      78.95      85.71         19
    qa_currency (label_id: 47)                              95.00     100.00      97.44         19
    qa_definition (label_id: 48)                            85.00      89.47      87.18         19
    qa_factoid (label_id: 49)                               45.16      73.68      56.00         19
    qa_maths (label_id: 50)                                100.00     100.00     100.00         14
    qa_stock (label_id: 51)                                 95.00     100.00      97.44         19
    recommendation_events (label_id: 52)                    94.44      89.47      91.89         19
    recommendation_locations (label_id: 53)                 94.74      94.74      94.74         19
    recommendation_movies (label_id: 54)                   100.00     100.00     100.00         10
    social_post (label_id: 55)                              90.00      94.74      92.31         19
    social_query (label_id: 56)                             94.74     100.00      97.30         18
    takeaway_order (label_id: 57)                           93.75      78.95      85.71         19
    takeaway_query (label_id: 58)                           85.71      94.74      90.00         19
    transport_query (label_id: 59)                          83.33      78.95      81.08         19
    transport_taxi (label_id: 60)                          100.00     100.00     100.00         18
    transport_ticket (label_id: 61)                         89.47      89.47      89.47         19
    transport_traffic (label_id: 62)                       100.00     100.00     100.00         19
    weather_query (label_id: 63)                           100.00      89.47      94.44         19
    -------------------
    micro avg                                               85.04      85.04      85.04       1076
    macro avg                                               81.13      80.81      79.36       1076
    weighted avg                                            84.10      85.04      83.54       1076
    
Slots:
    label                                                precision    recall       f1           support   
    alarm_type (label_id: 0)                                 0.00       0.00       0.00          0
    app_name (label_id: 1)                                   0.00       0.00       0.00          6
    artist_name (label_id: 2)                                0.00       0.00       0.00         21
    audiobook_author (label_id: 3)                           0.00       0.00       0.00          1
    audiobook_name (label_id: 4)                             0.00       0.00       0.00         18
    business_name (label_id: 5)                             60.00      56.60      58.25         53
    business_type (label_id: 6)                              0.00       0.00       0.00         24
    change_amount (label_id: 7)                              0.00       0.00       0.00         25
    coffee_type (label_id: 8)                                0.00       0.00       0.00          4
    color_type (label_id: 9)                                 0.00       0.00       0.00         12
    cooking_type (label_id: 10)                              0.00       0.00       0.00          0
    currency_name (label_id: 11)                            84.09      75.51      79.57         49
    date (label_id: 12)                                     57.95      91.07      70.83        112
    definition_word (label_id: 13)                           0.00       0.00       0.00         20
    device_type (label_id: 14)                              74.55      51.25      60.74         80
    drink_type (label_id: 15)                                0.00       0.00       0.00          0
    email_address (label_id: 16)                             0.00       0.00       0.00         14
    email_folder (label_id: 17)                              0.00       0.00       0.00          1
    event_name (label_id: 18)                              100.00      13.24      23.38         68
    food_type (label_id: 19)                                51.72      69.77      59.41         43
    game_name (label_id: 20)                                60.00      14.29      23.08         21
    game_type (label_id: 21)                                 0.00       0.00       0.00          0
    general_frequency (label_id: 22)                         0.00       0.00       0.00          9
    house_place (label_id: 23)                              93.33      42.42      58.33         33
    ingredient (label_id: 24)                                0.00       0.00       0.00          6
    joke_type (label_id: 25)                                 0.00       0.00       0.00          4
    list_name (label_id: 26)                                 0.00       0.00       0.00         21
    meal_type (label_id: 27)                                 0.00       0.00       0.00          0
    media_type (label_id: 28)                                0.00       0.00       0.00         37
    movie_name (label_id: 29)                                0.00       0.00       0.00          0
    movie_type (label_id: 30)                                0.00       0.00       0.00          0
    music_album (label_id: 31)                               0.00       0.00       0.00          0
    music_descriptor (label_id: 32)                          0.00       0.00       0.00          3
    music_genre (label_id: 33)                               0.00       0.00       0.00          9
    news_topic (label_id: 34)                                0.00       0.00       0.00         17
    order_type (label_id: 35)                                0.00       0.00       0.00         17
    person (label_id: 36)                                   44.86      92.31      60.38         52
    personal_info (label_id: 37)                             0.00       0.00       0.00         20
    place_name (label_id: 38)                               71.25      77.03      74.03        148
    player_setting (label_id: 39)                            0.00       0.00       0.00          1
    playlist_name (label_id: 40)                             0.00       0.00       0.00          1
    podcast_descriptor (label_id: 41)                        0.00       0.00       0.00         13
    podcast_name (label_id: 42)                              0.00       0.00       0.00          4
    radio_name (label_id: 43)                               66.67      10.53      18.18         38
    relation (label_id: 44)                                  0.00       0.00       0.00         17
    song_name (label_id: 45)                                 0.00       0.00       0.00         22
    time (label_id: 46)                                     70.27      78.20      74.02        133
    time_zone (label_id: 47)                                 0.00       0.00       0.00          9
    timeofday (label_id: 48)                                 0.00       0.00       0.00         28
    transport_agency (label_id: 49)                          0.00       0.00       0.00          9
    transport_descriptor (label_id: 50)                      0.00       0.00       0.00          0
    transport_name (label_id: 51)                            0.00       0.00       0.00          4
    transport_type (label_id: 52)                           78.38      82.86      80.56         35
    weather_descriptor (label_id: 53)                        0.00       0.00       0.00         17
    O (label_id: 54)                                        92.42      98.80      95.50       5920
    -------------------
    micro avg                                               89.10      89.10      89.10       7199
    macro avg                                               21.86      18.56      18.18       7199
    weighted avg                                            84.42      89.10      86.01       7199
```
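
In the Slots report above, the micro-averaged F1 (89.10) is far higher than the macro-averaged F1 (18.18). The `O` label accounts for 5920 of the 7199 tokens, so it dominates the micro and weighted averages, while the macro average treats every label equally, including the many rare slots that still score 0 after 5 epochs. The sketch below reproduces this effect on made-up counts; the function names and the toy numbers are hypothetical and the code is not NeMo's report implementation.

```python
def safe_div(numerator, denominator):
    """Division that returns 0.0 instead of failing on an empty denominator."""
    return numerator / denominator if denominator else 0.0


def f1_score(tp, fp, fn):
    """F1 from true positives, false positives and false negatives."""
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return safe_div(2 * precision * recall, precision + recall)


def averaged_f1(counts):
    """counts maps label -> (tp, fp, fn); returns micro, macro and weighted F1."""
    per_label = {label: f1_score(*c) for label, c in counts.items()}
    support = {label: tp + fn for label, (tp, fp, fn) in counts.items()}
    total_tp = sum(tp for tp, _, _ in counts.values())
    total_fp = sum(fp for _, fp, _ in counts.values())
    total_fn = sum(fn for _, _, fn in counts.values())
    return {
        # micro: pool all decisions, so frequent labels dominate
        "micro": f1_score(total_tp, total_fp, total_fn),
        # macro: plain mean over labels, so rare labels count as much as 'O'
        "macro": sum(per_label.values()) / len(per_label),
        # weighted: mean over labels weighted by their support
        "weighted": safe_div(
            sum(per_label[label] * support[label] for label in per_label),
            sum(support.values()),
        ),
    }


# made-up token counts: one dominant label and two rare ones
toy_counts = {
    "O": (95, 10, 5),
    "date": (6, 4, 4),
    "song_name": (0, 0, 5),
}
averages = averaged_f1(toy_counts)
for name, value in averages.items():
    print(f"{name} avg f1: {100 * value:.2f}")
```
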

## Evaluation
To see how the model performs, we can evaluate the trained model on a test data file. Here we load the best checkpoint (the one with the lowest validation loss) and create a model (eval_model) from the checkpoint. We will use the same trainer for testing.

In [None]:
# extract the path of the best checkpoint from the training, you may update it to any other saved checkpoint file
checkpoint_path = trainer.checkpoint_callback.best_model_path

# load the model from this checkpoint
eval_model = nemo_nlp.models.IntentSlotClassificationModel.load_from_checkpoint(checkpoint_path=checkpoint_path)

In [None]:
# we will setup testing data reusing the same config (test section)
eval_model.setup_test_data(test_data_config=config.model.test_ds)

# run the evaluation on the test dataset
trainer.test(model=eval_model, ckpt_path=None, verbose=False)

## Inference from Examples
The next step is to see how the trained model classifies Intents and Slots for given queries from this domain. To improve the predictions you may need to train the model for more than 5 epochs.


In [None]:
queries = [
    'set alarm for seven thirty am',
    'lower volume by fifty percent',
    'what is my schedule for tomorrow',
]

pred_intents, pred_slots = eval_model.predict_from_examples(queries, config.model.test_ds)

logging.info('The prediction results of some sample queries with the trained model:')
for query, intent, slots in zip(queries, pred_intents, pred_slots):
    logging.info(f'Query : {query}')
    logging.info(f'Predicted Intent: {intent}')
    logging.info(f'Predicted Slots: {slots}')

## Training Script

If you have NeMo installed locally (e.g. cloned from GitHub), you can also train the model with the example script: `examples/nlp/intent_slot_classification/intent_slot_classification.py`.
This script contains an example of how to train, evaluate and perform inference with the IntentSlotClassificationModel.

To run a training script, use:

`cd examples/nlp/intent_slot_classification`

`python intent_slot_classification.py model.data_dir=PATH_TO_DATA_DIR`

By default, this script uses the examples/nlp/intent_slot_classification/conf/intent_slot_classification_config.yaml config file. You may update any of the parameters inside this config file or, alternatively, provide them on the command line.
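
The `model.data_dir=PATH_TO_DATA_DIR` argument above is a dotted-key override: the part before `=` is a path into the nested config and the part after it is the new value. The sketch below illustrates that idea on a plain nested dictionary. It is a simplified illustration only: `apply_overrides` is a hypothetical helper, the toy config values are made up, and the real command-line parsing (including type conversion, which this sketch skips) is done by the configuration framework the script uses.

```python
def apply_overrides(config, overrides):
    """Apply "a.b.c=value" strings to a nested dict, in place."""
    for override in overrides:
        dotted_key, raw_value = override.split("=", 1)
        *parents, leaf = dotted_key.split(".")
        node = config
        for key in parents:
            # walk down the config, creating missing sections on the way
            node = node.setdefault(key, {})
        # values stay strings here; no type conversion is attempted
        node[leaf] = raw_value
    return config


# toy config with made-up values
config_dict = {
    "model": {"data_dir": "???", "intent_loss_weight": 0.6},
    "trainer": {"max_epochs": 50},
}
apply_overrides(config_dict, ["model.data_dir=PATH_TO_DATA_DIR", "trainer.max_epochs=5"])
print(config_dict)
```
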

# Multi-Label Intent Classification
---

This task is similar to the previous one, but instead of classifying a single intent per utterance, the model is trained to predict multiple intents. For this tutorial, we will be using the ATIS dataset.
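
One common way to represent several intents for one utterance is a multi-hot vector with one position per intent. The sketch below shows that encoding for an ATIS-style combined label. It is only an illustration of the idea: `to_multi_hot` is a hypothetical helper, the intent list is a small made-up subset, and the on-disk format produced by the converter script used in this tutorial may differ.

```python
def to_multi_hot(intent_names, all_intents):
    """Encode a collection of intent names as a 0/1 vector over all_intents."""
    index = {name: i for i, name in enumerate(all_intents)}
    vector = [0] * len(all_intents)
    for name in intent_names:
        vector[index[name]] = 1
    return vector


# small illustrative subset of intents
all_intents = ["atis_airfare", "atis_flight", "atis_ground_service"]

# a combined label is split into its individual intents before encoding
combined_label = "atis_flight+atis_airfare"
multi_hot = to_multi_hot(combined_label.split("+"), all_intents)
print(multi_hot)
```
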

## Download the dataset and convert it to the NeMo format

In [None]:
# locations for the downloaded data and the converter scripts (replace with your own)
DATA_DIR = "/home/rchen/datafolder"
NEMO_DIR = '/home/rchen/nemofolder'
BRANCH = 'main'

# download the raw ATIS files from github
ATIS_URL = 'https://raw.githubusercontent.com/howl-anderson/ATIS_dataset/master/data/raw_data/ms-cntk-atis'
file_names = [
    'atis.dict.intent.csv',
    'atis.dict.slots.csv',
    'atis.dict.vocab.csv',
    'atis.test.intent.csv',
    'atis.test.pkl',
    'atis.test.query.csv',
    'atis.test.slots.csv',
    'atis.train.intent.csv',
    'atis.train.pkl',
    'atis.train.query.csv',
    'atis.train.slots.csv',
]

for file_name in file_names:
    wget.download(f'{ATIS_URL}/{file_name}', DATA_DIR)


# download the converter files from github for the purpose of this tutorial
wget.download(f'https://raw.githubusercontent.com/NVIDIA/NeMo/{BRANCH}/examples/nlp/intent_slot_classification/data/import_datasets.py', NEMO_DIR)
wget.download(f'https://raw.githubusercontent.com/NVIDIA/NeMo/{BRANCH}/examples/nlp/intent_slot_classification/data/assistant_utils.py', NEMO_DIR)
wget.download(f'https://raw.githubusercontent.com/chenrichard10/NeMo/{BRANCH}/examples/nlp/multi_label_intent_slot_classification/data/convert_datasets.py', NEMO_DIR)

# Get original atis dataset
!python {NEMO_DIR}/import_datasets.py --dataset_name=atis --source_data_dir={DATA_DIR} --target_data_dir={DATA_DIR}/nemo_format
# Script will create new files at {DATA_DIR}/new_format
!mkdir {DATA_DIR}/new_format
!python {NEMO_DIR}/convert_datasets.py --source_data_dir={DATA_DIR}/nemo_format --target_data_dir={DATA_DIR}/new_format


DATA_DIR = f"{DATA_DIR}/new_format"

## Training the Model

In [1]:
from nemo.collections import nlp as nemo_nlp
from nemo.utils.exp_manager import exp_manager
from nemo.utils import logging

import os
import wget
import torch
import pytorch_lightning as pl
from omegaconf import OmegaConf

In [8]:
# download the model config file from repository for the purpose of this example
wget.download(f'https://raw.githubusercontent.com/chenrichard10/NeMo/{BRANCH}/examples/nlp/multi_label_intent_slot_classification/conf/multi_label_intent_slot_classification_config.yaml', NEMO_DIR)

# print content of the config file
# load the config file that was downloaded above
config_file = f"{NEMO_DIR}/multi_label_intent_slot_classification_config.yaml"
print(config_file)
config = OmegaConf.load(config_file)
print(OmegaConf.to_yaml(config))

trainer:
  gpus: 2
  num_nodes: 1
  max_epochs: 5
  max_steps: null
  accumulate_grad_batches: 1
  precision: 32
  accelerator: ddp
  log_every_n_steps: 1
  val_check_interval: 1.0
  resume_from_checkpoint: null
  checkpoint_callback: false
  logger: false
model:
  nemo_path: null
  data_dir: ???
  class_labels:
    intent_labels_file: intent_labels.csv
    slot_labels_file: slot_labels.csv
  class_balancing: null
  intent_loss_weight: 0.6
  pad_label: -1
  ignore_extra_tokens: false
  ignore_start_end: true
  train_ds:
    prefix: train
    batch_size: 32
    shuffle: true
    num_samples: -1
    num_workers: 8
    drop_last: false
    pin_memory: false
  validation_ds:
    prefix: test
    batch_size: 32
    shuffle: fals

In [6]:
import time

# DATA_DIR already points to the new_format directory (set after the conversion step)
config.model.data_dir = DATA_DIR
config.model.validation_ds.prefix = "dev"
config.model.test_ds.prefix = "dev"
config.model.class_balancing = "weighted_loss"
config.model.head.num_output_layers = 2
config.trainer.max_epochs = 40
run_name = "test"

# checks if we have GPU available and uses it
cuda = 1 if torch.cuda.is_available() else 0
config.trainer.gpus = cuda

# config.trainer.precision = 16 if torch.cuda.is_available() else 32
config.trainer.precision = 32

# remove distributed training flags
config.trainer.accelerator = None

trainer = pl.Trainer(**config.trainer)
config.exp_manager.exp_dir = os.path.join(DATA_DIR, "output/" + run_name)
config.exp_manager.create_checkpoint_callback = True
config.exp_manager.version = time.strftime('%Y-%m-%d_%H-%M-%S')

exp_dir = exp_manager(trainer, config.get("exp_manager", None))
model = nemo_nlp.models.MultiLabelIntentSlotClassificationModel(config.model, trainer=trainer)
trainer.fit(model)


## Evaluation

To see how the model performs, we can evaluate the trained model on a test data file. Here we restore the model (eval_model) from the `.nemo` file saved during training. The path to this file is printed in the output of the training process.

In [None]:
# specify checkpoint path with .nemo file (Look for /MultiLabelIntentSlot.nemo)
checkpoint_path = "/home/rchen/datafolder/new_format/output/test/MultiLabelIntentSlot/2022-02-14_08-52-53/checkpoints/MultiLabelIntentSlot.nemo"

# load the model from this checkpoint
eval_model = nemo_nlp.models.MultiLabelIntentSlotClassificationModel.restore_from(checkpoint_path)

### Optimizing Threshold
Before using our model, we find the probability rounding threshold that gives us the best results on the validation set. 

In [None]:
eval_model.optimize_threshold(config.model.test_ds)
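
How `optimize_threshold` searches is not shown in this notebook. As an illustration only, assuming the idea is to try candidate thresholds and keep the one with the best score, a sweep over micro-averaged F1 could look like the sketch below. The function names, the candidate grid, and the choice of micro F1 as the score are all assumptions made for this example.

```python
from typing import List, Optional, Sequence, Tuple


def micro_f1(probs: Sequence[Sequence[float]],
             labels: Sequence[Sequence[int]],
             threshold: float) -> float:
    """Micro-averaged F1 over all (example, intent) pairs at one threshold."""
    tp = fp = fn = 0
    for p_row, y_row in zip(probs, labels):
        for p, y in zip(p_row, y_row):
            pred = 1 if p >= threshold else 0
            if pred and y:
                tp += 1
            elif pred and not y:
                fp += 1
            elif y:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(probs: Sequence[Sequence[float]],
                   labels: Sequence[Sequence[int]],
                   candidates: Optional[List[float]] = None) -> Tuple[float, float]:
    """Return (threshold, score) for the best-scoring candidate threshold."""
    if candidates is None:
        # Assumed grid: 0.05, 0.10, ..., 0.95
        candidates = [i / 20 for i in range(1, 20)]
    scored = [(micro_f1(probs, labels, t), t) for t in candidates]
    # max() keeps the first (lowest) threshold when several tie.
    score, threshold = max(scored, key=lambda st: st[0])
    return threshold, score
```

Because each intent is an independent yes/no decision in the multi-label setting, a single threshold shared across intents is the simplest choice; per-intent thresholds would be a natural variation.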

### Inference from Examples
As in the previous example, we can run inference to see how the trained model classifies Intents and Slots for queries from this domain. To improve the predictions, you may need to train the model for more than 10 epochs.


In [None]:
queries = [
    'i would like to find a flight from charlotte to las vegas that makes a stop in st. louis',
    'on april first i need a ticket from tacoma to san jose departing before 7 am',
    'how much is the limousine service in boston',
]

# We use the optimized threshold for predictions
pred_intents, pred_slots, pred_list = eval_model.predict_from_examples(queries, config.model.test_ds, eval_model.threshold)
logging.info('The prediction results of some sample queries with the trained model:')
    
for query, intent, slots in zip(queries, pred_intents, pred_slots):
    logging.info(f'Query : {query}')
    logging.info(f'Predicted Intents: {intent}')
    logging.info(f'Predicted Slots: {slots}')

### Data Augmentation
---

When the training data contains few utterances with multiple labels, data augmentation can be very useful. New multi-label examples can be created by concatenating utterances with the word "and" or a period. A script is provided below to help with augmentation, and it can be adapted to your use case.
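
The concatenation idea can be sketched in a few lines. This is not the provided script: the in-memory example format (text, intent list, one slot label per word), the function names, and the outside-slot label `"O"` are assumptions made for illustration.

```python
import random
from typing import List, Tuple

# Hypothetical in-memory format: (text, intents, one slot label per word)
Example = Tuple[str, List[str], List[str]]


def mix_examples(a: Example, b: Example,
                 connector: str = "and",
                 outside_slot: str = "O") -> Example:
    """Join two utterances into one multi-label example."""
    text_a, intents_a, slots_a = a
    text_b, intents_b, slots_b = b
    if connector == ".":
        # The period attaches to the last word of the first utterance,
        # so no extra word (and no extra slot label) is introduced.
        text = f"{text_a}. {text_b}"
        slots = slots_a + slots_b
    else:
        # The connector is a new word and gets the outside-slot label.
        text = f"{text_a} {connector} {text_b}"
        slots = slots_a + [outside_slot] + slots_b
    intents = sorted(set(intents_a) | set(intents_b))
    return text, intents, slots


def augment(examples: List[Example], num_new: int, seed: int = 0) -> List[Example]:
    """Create `num_new` mixed examples from random pairs of utterances."""
    rng = random.Random(seed)
    mixed = []
    for _ in range(num_new):
        a, b = rng.sample(examples, 2)
        mixed.append(mix_examples(a, b, rng.choice(["and", "."])))
    return mixed
```

The key invariant is that the slot label list stays aligned with the words of the joined utterance, while the intent set becomes the union of the two source intents.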

In [3]:
# download the data augmentation script from the repository for the purpose of this example
DATA_DIR = "/home/rchen/datafolder"
NEMO_DIR = '/home/rchen/nemofolder'
BRANCH = 'main'
wget.download(f'https://raw.githubusercontent.com/chenrichard10/NeMo/{BRANCH}/examples/nlp/multi_label_intent_slot_classification/data/augment_training_data.py', NEMO_DIR)


'/home/rchen/nemofolder/augment_training_data (1).py'

In [5]:
!python {NEMO_DIR}/augment_training_data.py --source_data_dir={DATA_DIR}/new_format --target_data_dir={DATA_DIR}/augmented_data --num_mixed=10

In [9]:
import time

config.model.data_dir = f"{DATA_DIR}/augmented_data"
config.model.validation_ds.prefix = "dev"
config.model.test_ds.prefix = "dev"
config.model.class_balancing = "weighted_loss"
config.model.head.num_output_layers = 2
config.trainer.max_epochs = 40
run_name = "test"

# checks if we have GPU available and uses it
cuda = 1 if torch.cuda.is_available() else 0
config.trainer.gpus = cuda

# config.trainer.precision = 16 if torch.cuda.is_available() else 32
config.trainer.precision = 32

# remove distributed training flags
config.trainer.accelerator = None

trainer = pl.Trainer(**config.trainer)
config.exp_manager.exp_dir = os.path.join(DATA_DIR, "output", run_name)
config.exp_manager.create_checkpoint_callback = True
config.exp_manager.version = time.strftime('%Y-%m-%d_%H-%M-%S')

exp_dir = exp_manager(trainer, config.get("exp_manager", None))
model = nemo_nlp.models.MultiLabelIntentSlotClassificationModel(config.model, trainer=trainer)
trainer.fit(model)

      "Setting `max_steps = None` is deprecated in v1.5 and will no longer be supported in v1.7."
    
      f"Setting `Trainer(checkpoint_callback={checkpoint_callback})` is deprecated in v1.5 and will "
    
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs


[NeMo I 2022-02-18 08:36:02 exp_manager:283] Experiments will be logged at /home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02


[NeMo W 2022-02-18 08:36:02 exp_manager:880] The checkpoint callback was told to monitor a validation value and trainer's max_steps was set to -1. Please ensure that max_steps will run for at least 1 epochs to ensure that checkpointing will not error out.
      "`ModelCheckpoint(every_n_val_epochs)` is deprecated in v1.4 and will be removed in v1.6."
    


[NeMo I 2022-02-18 08:36:02 tokenizer_utils:125] Getting HuggingFace AutoTokenizer with pretrained_model_name: bert-base-uncased, vocab_file: None, special_tokens_dict: {}, and use_fast: False


Using eos_token, but it is not set yet.
Using bos_token, but it is not set yet.


[NeMo I 2022-02-18 08:36:07 multi_label_intent_slot_classification_descriptor:90]  Stats calculating for train mode...
[NeMo I 2022-02-18 08:36:07 multi_label_intent_slot_classification_descriptor:117] Three most popular intents in train mode:
[NeMo I 2022-02-18 08:36:07 multi_label_intent_slot_classification_descriptor:123] Three most popular slots in train mode:
[NeMo I 2022-02-18 08:36:07 data_preprocessing:194] label: 128, 71207 out of 108030 (65.91%).
[NeMo I 2022-02-18 08:36:07 data_preprocessing:194] label: 48, 6968 out of 108030 (6.45%).
[NeMo I 2022-02-18 08:36:07 data_preprocessing:194] label: 78, 6697 out of 108030 (6.20%).
[NeMo I 2022-02-18 08:36:07 multi_label_intent_slot_classification_descriptor:126] Total Number of Intents: 7170
[NeMo I 2022-02-18 08:36:07 multi_label_intent_slot_classification_descriptor:127] Intent Label Frequencies: {0: [6813, 357], 1: [6721, 449], 2: [6168, 1002], 3: [6633, 537], 4: [6940, 230], 5: [6944, 226], 6: [7002, 168], 7: [6941, 229], 8: [7

[NeMo I 2022-02-18 08:36:07 intent_slot_classification_model:149] Labels mapping saved to : /home/rchen/datafolder/augmented_data/slot_labels.csv
[NeMo I 2022-02-18 08:36:07 intent_slot_classification_model:148] Labels: {'abbreviation': 0, 'aircraft': 1, 'airfare': 2, 'airline': 3, 'airport': 4, 'capacity': 5, 'cheapest': 6, 'city': 7, 'day_name': 8, 'distance': 9, 'flight': 10, 'flight_no': 11, 'flight_time': 12, 'ground_fare': 13, 'ground_service': 14, 'meal': 15, 'quantity': 16, 'restriction': 17}
[NeMo I 2022-02-18 08:36:07 intent_slot_classification_model:149] Labels mapping saved to : /home/rchen/datafolder/augmented_data/intent_labels.csv


[NeMo W 2022-02-18 08:36:07 modelPT:203] You tried to register an artifact under config key=tokenizer.vocab_file but an artifact for it has already been registered.


[NeMo I 2022-02-18 08:36:07 tokenizer_utils:125] Getting HuggingFace AutoTokenizer with pretrained_model_name: bert-base-uncased, vocab_file: /home/rchen/.cache/huggingface/nemo_nlp_tmp/ae0d012864bdb2474ba67537c3f5e0fa/vocab.txt, special_tokens_dict: {}, and use_fast: False


Using eos_token, but it is not set yet.
Using bos_token, but it is not set yet.


[NeMo I 2022-02-18 08:36:11 multi_label_intent_slot_classification_descriptor:90]  Stats calculating for train mode...
[NeMo I 2022-02-18 08:36:11 multi_label_intent_slot_classification_descriptor:117] Three most popular intents in train mode:
[NeMo I 2022-02-18 08:36:11 multi_label_intent_slot_classification_descriptor:123] Three most popular slots in train mode:
[NeMo I 2022-02-18 08:36:11 data_preprocessing:194] label: 128, 71207 out of 108030 (65.91%).
[NeMo I 2022-02-18 08:36:11 data_preprocessing:194] label: 48, 6968 out of 108030 (6.45%).
[NeMo I 2022-02-18 08:36:11 data_preprocessing:194] label: 78, 6697 out of 108030 (6.20%).
[NeMo I 2022-02-18 08:36:11 multi_label_intent_slot_classification_descriptor:126] Total Number of Intents: 7170
[NeMo I 2022-02-18 08:36:11 multi_label_intent_slot_classification_descriptor:127] Intent Label Frequencies: {0: [6813, 357], 1: [6721, 449], 2: [6168, 1002], 3: [6633, 537], 4: [6940, 230], 5: [6944, 226], 6: [7002, 168], 7: [6941, 229], 8: [7

[NeMo I 2022-02-18 08:36:11 intent_slot_classification_model:149] Labels mapping saved to : /home/rchen/datafolder/augmented_data/slot_labels.csv
[NeMo I 2022-02-18 08:36:11 intent_slot_classification_model:148] Labels: {'abbreviation': 0, 'aircraft': 1, 'airfare': 2, 'airline': 3, 'airport': 4, 'capacity': 5, 'cheapest': 6, 'city': 7, 'day_name': 8, 'distance': 9, 'flight': 10, 'flight_no': 11, 'flight_time': 12, 'ground_fare': 13, 'ground_service': 14, 'meal': 15, 'quantity': 16, 'restriction': 17}
[NeMo I 2022-02-18 08:36:11 intent_slot_classification_model:149] Labels mapping saved to : /home/rchen/datafolder/augmented_data/intent_labels.csv


[NeMo W 2022-02-18 08:36:11 modelPT:203] You tried to register an artifact under config key=class_labels.intent_labels_file but an artifact for it has already been registered.
[NeMo W 2022-02-18 08:36:11 modelPT:203] You tried to register an artifact under config key=class_labels.slot_labels_file but an artifact for it has already been registered.


[NeMo I 2022-02-18 08:36:16 intent_slot_classification_dataset:92] Setting max length to: 50
[NeMo I 2022-02-18 08:36:16 data_preprocessing:387] Some stats of the lengths of the sequences:
[NeMo I 2022-02-18 08:36:16 data_preprocessing:393] Min: 3 |                  Max: 69 |                  Mean: 17.54044630404463 |                  Median: 15.0
[NeMo I 2022-02-18 08:36:16 data_preprocessing:395] 75 percentile: 21.00
[NeMo I 2022-02-18 08:36:16 data_preprocessing:396] 99 percentile: 47.00
[NeMo I 2022-02-18 08:36:16 intent_slot_classification_dataset:121] 37 are longer than 50
[NeMo I 2022-02-18 08:36:17 intent_slot_classification_dataset:92] Setting max length to: 35
[NeMo I 2022-02-18 08:36:17 data_preprocessing:387] Some stats of the lengths of the sequences:
[NeMo I 2022-02-18 08:36:17 data_preprocessing:393] Min: 4 |                  Max: 35 |                  Mean: 12.647256438969764 |                  Median: 12.0
[NeMo I 2022-02-18 08:36:17 data_preprocessing:395] 75 percenti

[NeMo W 2022-02-18 08:36:17 modelPT:203] You tried to register an artifact under config key=tokenizer.vocab_file but an artifact for it has already been registered.
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.predictions.transform.dense.weight', 'cls.seq_relationship.weight', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias', 'cls.predictions.bias']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassificatio

[NeMo I 2022-02-18 08:36:27 modelPT:566] Optimizer config = Adam (
    Parameter Group 0
        amsgrad: False
        betas: (0.9, 0.999)
        eps: 1e-08
        lr: 2e-05
        weight_decay: 0.01
    )
[NeMo I 2022-02-18 08:36:27 lr_scheduler:837] Scheduler "<nemo.core.optim.lr_scheduler.WarmupAnnealing object at 0x7fabe78fab50>" 
    will be used during training (effective maximum steps = 9000) - 
    Parameters : 
    (last_epoch: -1
    max_steps: 9000
    warmup_steps: null
    warmup_ratio: null
    )



  | Name                         | Type                           | Params
--------------------------------------------------------------------------------
0 | bert_model                   | BertEncoder                    | 109 M 
1 | classifier                   | SequenceTokenClassifier        | 1.3 M 
2 | intent_loss                  | BCEWithLogitsLoss              | 0     
3 | slot_loss                    | CrossEntropyLoss               | 0     
4 | total_loss                   | AggregatorLoss                 | 0     
5 | intent_classification_report | MultiLabelClassificationReport | 0     
6 | slot_classification_report   | ClassificationReport           | 0     
--------------------------------------------------------------------------------
110 M     Trainable params
0         Non-trainable params
110 M     Total params
443.106   Total estimated model params size (MB)


Validation sanity check: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:36:29 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                               0.00       0.00       0.00          0
    aircraft (label_id: 1)                                   0.00       0.00       0.00          0
    airfare (label_id: 2)                                    4.26     100.00       8.16          2
    airline (label_id: 3)                                    0.00       0.00       0.00          0
    airport (label_id: 4)                                    0.00       0.00       0.00          0
    capacity (label_id: 5)                                   0.00       0.00       0.00          0
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                       0.00       0.00       0.00          0
    day_name (label

Training: 0it [00:00, ?it/s]

Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:36:49 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              30.28     100.00      46.48         33
    aircraft (label_id: 1)                                   6.45      66.67      11.76          9
    airfare (label_id: 2)                                   14.86      36.07      21.05         61
    airline (label_id: 3)                                   17.31      67.50      27.55         40
    airport (label_id: 4)                                   14.68      88.89      25.20         18
    capacity (label_id: 5)                                  18.35      95.24      30.77         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                       4.00      83.33       7.63          6
    day_name (label

Epoch 0, global step 224: val_loss reached 2.07278 (best 2.07278), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=2.0728-epoch=0.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:37:13 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              70.21     100.00      82.50         33
    aircraft (label_id: 1)                                  12.50     100.00      22.22          9
    airfare (label_id: 2)                                   56.63      77.05      65.28         61
    airline (label_id: 3)                                   41.94      97.50      58.65         40
    airport (label_id: 4)                                   41.86     100.00      59.02         18
    capacity (label_id: 5)                                  58.82      95.24      72.73         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                       5.56      83.33      10.42          6
    day_name (label

Epoch 1, global step 449: val_loss reached 1.62399 (best 1.62399), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=1.6240-epoch=1.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:37:37 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              82.50     100.00      90.41         33
    aircraft (label_id: 1)                                  18.37     100.00      31.03          9
    airfare (label_id: 2)                                   71.25      93.44      80.85         61
    airline (label_id: 3)                                   53.42      97.50      69.03         40
    airport (label_id: 4)                                   78.26     100.00      87.80         18
    capacity (label_id: 5)                                  80.77     100.00      89.36         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      16.13      83.33      27.03          6
    day_name (label

Epoch 2, global step 674: val_loss reached 1.33731 (best 1.33731), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=1.3373-epoch=2.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:38:01 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              89.19     100.00      94.29         33
    aircraft (label_id: 1)                                  45.00     100.00      62.07          9
    airfare (label_id: 2)                                   90.32      91.80      91.06         61
    airline (label_id: 3)                                   69.64      97.50      81.25         40
    airport (label_id: 4)                                   85.71     100.00      92.31         18
    capacity (label_id: 5)                                  84.00     100.00      91.30         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      13.64      50.00      21.43          6
    day_name (label

Epoch 3, global step 899: val_loss reached 1.13710 (best 1.13710), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=1.1371-epoch=3.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:38:25 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              89.19     100.00      94.29         33
    aircraft (label_id: 1)                                  36.00     100.00      52.94          9
    airfare (label_id: 2)                                   80.82      96.72      88.06         61
    airline (label_id: 3)                                   79.59      97.50      87.64         40
    airport (label_id: 4)                                   78.26     100.00      87.80         18
    capacity (label_id: 5)                                  77.78     100.00      87.50         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      13.64      50.00      21.43          6
    day_name (label

Epoch 4, global step 1124: val_loss reached 1.00026 (best 1.00026), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=1.0003-epoch=4.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:38:49 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              89.19     100.00      94.29         33
    aircraft (label_id: 1)                                  42.86     100.00      60.00          9
    airfare (label_id: 2)                                   88.06      96.72      92.19         61
    airline (label_id: 3)                                   84.78      97.50      90.70         40
    airport (label_id: 4)                                   78.26     100.00      87.80         18
    capacity (label_id: 5)                                  91.30     100.00      95.45         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      13.64      50.00      21.43          6
    day_name (label

Epoch 5, global step 1349: val_loss reached 0.89741 (best 0.89741), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.8974-epoch=5.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:39:12 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              86.84     100.00      92.96         33
    aircraft (label_id: 1)                                  34.62     100.00      51.43          9
    airfare (label_id: 2)                                   92.31      98.36      95.24         61
    airline (label_id: 3)                                   80.00     100.00      88.89         40
    airport (label_id: 4)                                   81.82     100.00      90.00         18
    capacity (label_id: 5)                                  75.00     100.00      85.71         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      18.18      66.67      28.57          6
    day_name (label

Epoch 6, global step 1574: val_loss reached 0.83292 (best 0.83292), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.8329-epoch=6.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:39:36 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              89.19     100.00      94.29         33
    aircraft (label_id: 1)                                  39.13     100.00      56.25          9
    airfare (label_id: 2)                                   89.06      93.44      91.20         61
    airline (label_id: 3)                                   83.33     100.00      90.91         40
    airport (label_id: 4)                                   69.23     100.00      81.82         18
    capacity (label_id: 5)                                  77.78     100.00      87.50         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      17.39      66.67      27.59          6
    day_name (label

Epoch 7, global step 1799: val_loss reached 0.79223 (best 0.79223), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.7922-epoch=7.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:40:01 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              86.84     100.00      92.96         33
    aircraft (label_id: 1)                                  42.86     100.00      60.00          9
    airfare (label_id: 2)                                   90.48      93.44      91.94         61
    airline (label_id: 3)                                   81.63     100.00      89.89         40
    airport (label_id: 4)                                   72.00     100.00      83.72         18
    capacity (label_id: 5)                                  77.78     100.00      87.50         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      16.67      66.67      26.67          6
    day_name (label

Epoch 8, global step 2024: val_loss reached 0.74538 (best 0.74538), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.7454-epoch=8.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:40:25 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              76.74     100.00      86.84         33
    aircraft (label_id: 1)                                  29.03     100.00      45.00          9
    airfare (label_id: 2)                                   80.26     100.00      89.05         61
    airline (label_id: 3)                                   80.00     100.00      88.89         40
    airport (label_id: 4)                                   78.26     100.00      87.80         18
    capacity (label_id: 5)                                  91.30     100.00      95.45         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      30.77      66.67      42.11          6
    day_name (label

Epoch 9, global step 2249: val_loss reached 0.72313 (best 0.72313), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.7231-epoch=9.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:40:49 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              94.29     100.00      97.06         33
    aircraft (label_id: 1)                                  40.91     100.00      58.06          9
    airfare (label_id: 2)                                   89.39      96.72      92.91         61
    airline (label_id: 3)                                   81.63     100.00      89.89         40
    airport (label_id: 4)                                   78.26     100.00      87.80         18
    capacity (label_id: 5)                                  77.78     100.00      87.50         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      20.00      66.67      30.77          6
    day_name (label

Epoch 10, global step 2474: val_loss reached 0.70312 (best 0.70312), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.7031-epoch=10.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:41:13 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              78.57     100.00      88.00         33
    aircraft (label_id: 1)                                  40.91     100.00      58.06          9
    airfare (label_id: 2)                                   83.56     100.00      91.04         61
    airline (label_id: 3)                                   75.47     100.00      86.02         40
    airport (label_id: 4)                                   75.00     100.00      85.71         18
    capacity (label_id: 5)                                  75.00     100.00      85.71         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      18.18      66.67      28.57          6
    day_name (label

Epoch 11, global step 2699: val_loss reached 0.68170 (best 0.68170), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.6817-epoch=11.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:41:37 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              82.50     100.00      90.41         33
    aircraft (label_id: 1)                                  42.86     100.00      60.00          9
    airfare (label_id: 2)                                   80.56      95.08      87.22         61
    airline (label_id: 3)                                   88.89     100.00      94.12         40
    airport (label_id: 4)                                   81.82     100.00      90.00         18
    capacity (label_id: 5)                                  77.78     100.00      87.50         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      25.00      66.67      36.36          6
    day_name (label

Epoch 12, global step 2924: val_loss reached 0.68214 (best 0.68170), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.6821-epoch=12.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:42:01 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              70.21     100.00      82.50         33
    aircraft (label_id: 1)                                  42.86     100.00      60.00          9
    airfare (label_id: 2)                                   81.33     100.00      89.71         61
    airline (label_id: 3)                                   88.89     100.00      94.12         40
    airport (label_id: 4)                                   78.26     100.00      87.80         18
    capacity (label_id: 5)                                  84.00     100.00      91.30         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      19.05      66.67      29.63          6
    day_name (label

Epoch 13, global step 3149: val_loss reached 0.65975 (best 0.65975), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.6597-epoch=13.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:42:25 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              68.75     100.00      81.48         33
    aircraft (label_id: 1)                                  31.03     100.00      47.37          9
    airfare (label_id: 2)                                   81.33     100.00      89.71         61
    airline (label_id: 3)                                   74.07     100.00      85.11         40
    airport (label_id: 4)                                   81.82     100.00      90.00         18
    capacity (label_id: 5)                                  80.77     100.00      89.36         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      40.00      66.67      50.00          6
    day_name (label

Epoch 14, global step 3374: val_loss reached 0.65164 (best 0.65164), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.6516-epoch=14.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:42:49 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              70.21     100.00      82.50         33
    aircraft (label_id: 1)                                  33.33     100.00      50.00          9
    airfare (label_id: 2)                                   80.26     100.00      89.05         61
    airline (label_id: 3)                                   76.92     100.00      86.96         40
    airport (label_id: 4)                                   72.00     100.00      83.72         18
    capacity (label_id: 5)                                  80.77     100.00      89.36         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      40.00      66.67      50.00          6
    day_name (label

Epoch 15, global step 3599: val_loss reached 0.64915 (best 0.64915), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.6492-epoch=15.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:43:13 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              76.74     100.00      86.84         33
    aircraft (label_id: 1)                                  42.86     100.00      60.00          9
    airfare (label_id: 2)                                   81.33     100.00      89.71         61
    airline (label_id: 3)                                   75.47     100.00      86.02         40
    airport (label_id: 4)                                   81.82     100.00      90.00         18
    capacity (label_id: 5)                                  84.00     100.00      91.30         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      66.67      66.67      66.67          6
    day_name (label

Epoch 16, global step 3824: val_loss reached 0.63700 (best 0.63700), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.6370-epoch=16.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:43:37 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              68.75     100.00      81.48         33
    aircraft (label_id: 1)                                  42.86     100.00      60.00          9
    airfare (label_id: 2)                                   80.26     100.00      89.05         61
    airline (label_id: 3)                                   83.33     100.00      90.91         40
    airport (label_id: 4)                                   81.82     100.00      90.00         18
    capacity (label_id: 5)                                  91.30     100.00      95.45         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      16.00      66.67      25.81          6
    day_name (label

Epoch 17, global step 4049: val_loss reached 0.62733 (best 0.62733), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.6273-epoch=17.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:44:01 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              73.33     100.00      84.62         33
    aircraft (label_id: 1)                                  37.50     100.00      54.55          9
    airfare (label_id: 2)                                   80.26     100.00      89.05         61
    airline (label_id: 3)                                   74.07     100.00      85.11         40
    airport (label_id: 4)                                   90.00     100.00      94.74         18
    capacity (label_id: 5)                                  80.77     100.00      89.36         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      50.00      66.67      57.14          6
    day_name (label

Epoch 18, global step 4274: val_loss reached 0.64208 (best 0.62733), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.6421-epoch=18.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:44:25 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              73.33     100.00      84.62         33
    aircraft (label_id: 1)                                  42.86     100.00      60.00          9
    airfare (label_id: 2)                                   78.21     100.00      87.77         61
    airline (label_id: 3)                                   76.92     100.00      86.96         40
    airport (label_id: 4)                                   90.00     100.00      94.74         18
    capacity (label_id: 5)                                  87.50     100.00      93.33         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      80.00      66.67      72.73          6
    day_name (label

Epoch 19, global step 4499: val_loss was not in top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:44:47 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              66.00     100.00      79.52         33
    aircraft (label_id: 1)                                  42.86     100.00      60.00          9
    airfare (label_id: 2)                                   83.56     100.00      91.04         61
    airline (label_id: 3)                                   74.07     100.00      85.11         40
    airport (label_id: 4)                                   90.00     100.00      94.74         18
    capacity (label_id: 5)                                  80.77     100.00      89.36         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      17.39      66.67      27.59          6
    day_name (label

Epoch 20, global step 4724: val_loss reached 0.62522 (best 0.62522), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.6252-epoch=20.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:45:11 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              70.21     100.00      82.50         33
    aircraft (label_id: 1)                                  42.86     100.00      60.00          9
    airfare (label_id: 2)                                   80.26     100.00      89.05         61
    airline (label_id: 3)                                   70.18     100.00      82.47         40
    airport (label_id: 4)                                   81.82     100.00      90.00         18
    capacity (label_id: 5)                                  80.77     100.00      89.36         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      25.00      66.67      36.36          6
    day_name (label

Epoch 21, global step 4949: val_loss reached 0.61566 (best 0.61566), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.6157-epoch=21.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:45:35 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              67.35     100.00      80.49         33
    aircraft (label_id: 1)                                  42.86     100.00      60.00          9
    airfare (label_id: 2)                                   82.43     100.00      90.37         61
    airline (label_id: 3)                                   74.07     100.00      85.11         40
    airport (label_id: 4)                                   90.00     100.00      94.74         18
    capacity (label_id: 5)                                  80.77     100.00      89.36         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      20.00      66.67      30.77          6
    day_name (label

Epoch 22, global step 5174: val_loss reached 0.61255 (best 0.61255), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.6125-epoch=22.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:45:59 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              70.21     100.00      82.50         33
    aircraft (label_id: 1)                                  37.50     100.00      54.55          9
    airfare (label_id: 2)                                   83.56     100.00      91.04         61
    airline (label_id: 3)                                   74.07     100.00      85.11         40
    airport (label_id: 4)                                   81.82     100.00      90.00         18
    capacity (label_id: 5)                                  84.00     100.00      91.30         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      17.39      66.67      27.59          6
    day_name (label

Epoch 23, global step 5399: val_loss reached 0.62332 (best 0.61255), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.6233-epoch=23.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:46:23 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              71.74     100.00      83.54         33
    aircraft (label_id: 1)                                  34.62     100.00      51.43          9
    airfare (label_id: 2)                                   79.22     100.00      88.41         61
    airline (label_id: 3)                                   75.47     100.00      86.02         40
    airport (label_id: 4)                                   90.00     100.00      94.74         18
    capacity (label_id: 5)                                  80.77     100.00      89.36         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      36.36      66.67      47.06          6
    day_name (label

Epoch 24, global step 5624: val_loss reached 0.62206 (best 0.61255), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.6221-epoch=24.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:46:47 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              70.21     100.00      82.50         33
    aircraft (label_id: 1)                                  42.86     100.00      60.00          9
    airfare (label_id: 2)                                   76.25     100.00      86.52         61
    airline (label_id: 3)                                   76.92     100.00      86.96         40
    airport (label_id: 4)                                   90.00     100.00      94.74         18
    capacity (label_id: 5)                                  87.50     100.00      93.33         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      19.05      66.67      29.63          6
    day_name (label

Epoch 25, global step 5849: val_loss reached 0.61229 (best 0.61229), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.6123-epoch=25.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:47:12 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              70.21     100.00      82.50         33
    aircraft (label_id: 1)                                  37.50     100.00      54.55          9
    airfare (label_id: 2)                                   74.39     100.00      85.31         61
    airline (label_id: 3)                                   74.07     100.00      85.11         40
    airport (label_id: 4)                                   90.00     100.00      94.74         18
    capacity (label_id: 5)                                  80.77     100.00      89.36         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      22.22      66.67      33.33          6
    day_name (label

Epoch 26, global step 6074: val_loss reached 0.60944 (best 0.60944), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.6094-epoch=26.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:47:36 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              70.21     100.00      82.50         33
    aircraft (label_id: 1)                                  39.13     100.00      56.25          9
    airfare (label_id: 2)                                   82.43     100.00      90.37         61
    airline (label_id: 3)                                   75.47     100.00      86.02         40
    airport (label_id: 4)                                   85.71     100.00      92.31         18
    capacity (label_id: 5)                                  87.50     100.00      93.33         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      23.53      66.67      34.78          6
    day_name (label

Epoch 27, global step 6299: val_loss reached 0.59828 (best 0.59828), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.5983-epoch=27.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:48:00 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              66.00     100.00      79.52         33
    aircraft (label_id: 1)                                  33.33     100.00      50.00          9
    airfare (label_id: 2)                                   83.56     100.00      91.04         61
    airline (label_id: 3)                                   75.47     100.00      86.02         40
    airport (label_id: 4)                                   90.00     100.00      94.74         18
    capacity (label_id: 5)                                  84.00     100.00      91.30         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      25.00      66.67      36.36          6
    day_name (label

Epoch 28, global step 6524: val_loss reached 0.60369 (best 0.59828), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.6037-epoch=28.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:48:24 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              66.00     100.00      79.52         33
    aircraft (label_id: 1)                                  33.33     100.00      50.00          9
    airfare (label_id: 2)                                   80.26     100.00      89.05         61
    airline (label_id: 3)                                   74.07     100.00      85.11         40
    airport (label_id: 4)                                   85.71     100.00      92.31         18
    capacity (label_id: 5)                                  80.77     100.00      89.36         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      18.18      66.67      28.57          6
    day_name (label

Epoch 29, global step 6749: val_loss reached 0.59403 (best 0.59403), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.5940-epoch=29.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:48:49 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              70.21     100.00      82.50         33
    aircraft (label_id: 1)                                  32.14     100.00      48.65          9
    airfare (label_id: 2)                                   80.26     100.00      89.05         61
    airline (label_id: 3)                                   76.92     100.00      86.96         40
    airport (label_id: 4)                                   90.00     100.00      94.74         18
    capacity (label_id: 5)                                  80.77     100.00      89.36         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      30.77      66.67      42.11          6
    day_name (label

Epoch 30, global step 6974: val_loss reached 0.59348 (best 0.59348), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.5935-epoch=30.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:49:12 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              66.00     100.00      79.52         33
    aircraft (label_id: 1)                                  37.50     100.00      54.55          9
    airfare (label_id: 2)                                   80.26     100.00      89.05         61
    airline (label_id: 3)                                   76.92     100.00      86.96         40
    airport (label_id: 4)                                   78.26     100.00      87.80         18
    capacity (label_id: 5)                                  84.00     100.00      91.30         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      19.05      66.67      29.63          6
    day_name (label

Epoch 31, global step 7199: val_loss reached 0.59299 (best 0.59299), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.5930-epoch=31.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:49:36 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              68.75     100.00      81.48         33
    aircraft (label_id: 1)                                  40.91     100.00      58.06          9
    airfare (label_id: 2)                                   80.26     100.00      89.05         61
    airline (label_id: 3)                                   76.92     100.00      86.96         40
    airport (label_id: 4)                                   90.00     100.00      94.74         18
    capacity (label_id: 5)                                  87.50     100.00      93.33         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      50.00      66.67      57.14          6
    day_name (label

Epoch 32, global step 7424: val_loss reached 0.58983 (best 0.58983), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.5898-epoch=32.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:50:00 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              70.21     100.00      82.50         33
    aircraft (label_id: 1)                                  39.13     100.00      56.25          9
    airfare (label_id: 2)                                   78.21     100.00      87.77         61
    airline (label_id: 3)                                   76.92     100.00      86.96         40
    airport (label_id: 4)                                   90.00     100.00      94.74         18
    capacity (label_id: 5)                                  84.00     100.00      91.30         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      23.53      66.67      34.78          6
    day_name (label

Epoch 33, global step 7649: val_loss reached 0.59223 (best 0.58983), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.5922-epoch=33.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:50:24 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              73.33     100.00      84.62         33
    aircraft (label_id: 1)                                  42.86     100.00      60.00          9
    airfare (label_id: 2)                                   80.26     100.00      89.05         61
    airline (label_id: 3)                                   76.92     100.00      86.96         40
    airport (label_id: 4)                                   90.00     100.00      94.74         18
    capacity (label_id: 5)                                  87.50     100.00      93.33         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      80.00      66.67      72.73          6
    day_name (label

Epoch 34, global step 7874: val_loss was not in top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:50:46 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              68.75     100.00      81.48         33
    aircraft (label_id: 1)                                  40.91     100.00      58.06          9
    airfare (label_id: 2)                                   84.72     100.00      91.73         61
    airline (label_id: 3)                                   75.47     100.00      86.02         40
    airport (label_id: 4)                                   85.71     100.00      92.31         18
    capacity (label_id: 5)                                  84.00     100.00      91.30         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      23.53      66.67      34.78          6
    day_name (label

Epoch 35, global step 8099: val_loss reached 0.58468 (best 0.58468), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.5847-epoch=35.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:51:10 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              70.21     100.00      82.50         33
    aircraft (label_id: 1)                                  37.50     100.00      54.55          9
    airfare (label_id: 2)                                   78.21     100.00      87.77         61
    airline (label_id: 3)                                   76.92     100.00      86.96         40
    airport (label_id: 4)                                   85.71     100.00      92.31         18
    capacity (label_id: 5)                                  84.00     100.00      91.30         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      28.57      66.67      40.00          6
    day_name (label

Epoch 36, global step 8324: val_loss reached 0.58941 (best 0.58468), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.5894-epoch=36.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:51:34 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              71.74     100.00      83.54         33
    aircraft (label_id: 1)                                  39.13     100.00      56.25          9
    airfare (label_id: 2)                                   80.26     100.00      89.05         61
    airline (label_id: 3)                                   76.92     100.00      86.96         40
    airport (label_id: 4)                                   90.00     100.00      94.74         18
    capacity (label_id: 5)                                  87.50     100.00      93.33         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      66.67      66.67      66.67          6
    day_name (label

Epoch 37, global step 8549: val_loss was not in top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:51:56 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              71.74     100.00      83.54         33
    aircraft (label_id: 1)                                  40.91     100.00      58.06          9
    airfare (label_id: 2)                                   80.26     100.00      89.05         61
    airline (label_id: 3)                                   76.92     100.00      86.96         40
    airport (label_id: 4)                                   90.00     100.00      94.74         18
    capacity (label_id: 5)                                  87.50     100.00      93.33         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      36.36      66.67      47.06          6
    day_name (label

Epoch 38, global step 8774: val_loss reached 0.58874 (best 0.58468), saving model to "/home/rchen/datafolder/output/test/MultiLabelIntentSlot/2022-02-18_08-36-02/checkpoints/MultiLabelIntentSlot--val_loss=0.5887-epoch=38.ckpt" as top 3


Validating: 0it [00:00, ?it/s]

[NeMo I 2022-02-18 08:52:20 intent_slot_classification_model:296] Intent report: 
    label                                                precision    recall       f1           support   
    abbreviation (label_id: 0)                              71.74     100.00      83.54         33
    aircraft (label_id: 1)                                  40.91     100.00      58.06          9
    airfare (label_id: 2)                                   80.26     100.00      89.05         61
    airline (label_id: 3)                                   76.92     100.00      86.96         40
    airport (label_id: 4)                                   90.00     100.00      94.74         18
    capacity (label_id: 5)                                  84.00     100.00      91.30         21
    cheapest (label_id: 6)                                   0.00       0.00       0.00          0
    city (label_id: 7)                                      36.36      66.67      47.06          6
    day_name (label

Epoch 39, global step 8999: val_loss was not in top 3
Saving latest checkpoint...
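
The per-intent rows in the reports above can be sanity-checked by hand. The sketch below recomputes precision, recall, and F1 from raw counts. The counts are not logged; they are inferred here from the reported support and percentages (for example, `aircraft` has support 9 and 100% recall, so 9 true positives, and 42.86% precision corresponds to 9 of 21 predictions being correct). The helper `prf` is a hypothetical name for illustration, not a NeMo function.

```python
def prf(tp, fp, fn):
    """Return (precision, recall, f1) in percent; 0.0 when a denominator is zero."""
    precision = 100.0 * tp / (tp + fp) if tp + fp else 0.0
    recall = 100.0 * tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return round(precision, 2), round(recall, 2), round(f1, 2)


# Counts inferred from the report rows (assumption, see the text above).
aircraft = prf(tp=9, fp=12, fn=0)  # row: 42.86 / 100.00 / 60.00, support 9
city = prf(tp=4, fp=1, fn=2)       # row: 80.00 / 66.67 / 72.73, support 6
# A label with no support and no predictions yields all zeros.
empty = prf(tp=0, fp=0, fn=0)

print(aircraft, city, empty)
```

Because `city` has only 6 supporting examples, a handful of extra false positives moves its precision a long way between epochs, which is consistent with the large swings in that row.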


In [10]:
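# Search for the intent decision threshold on the test set; the log output
# reports the thresholds that maximize F1 and precision.
model.optimize_threshold(config.model.test_ds)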

[NeMo I 2022-02-18 08:53:44 intent_slot_classification_dataset:92] Setting max length to: 35
[NeMo I 2022-02-18 08:53:44 data_preprocessing:387] Some stats of the lengths of the sequences:
[NeMo I 2022-02-18 08:53:44 data_preprocessing:393] Min: 4 |                  Max: 35 |                  Mean: 12.647256438969764 |                  Median: 12.0
[NeMo I 2022-02-18 08:53:44 data_preprocessing:395] 75 percentile: 15.00
[NeMo I 2022-02-18 08:53:44 data_preprocessing:396] 99 percentile: 24.00
[NeMo I 2022-02-18 08:53:44 intent_slot_classification_dataset:121] 0 are longer than 35
[NeMo I 2022-02-18 08:53:47 multi_label_intent_slot_classification_model:346] Maximum Threshold for F1-Score: 0.6700000000000002, [Precision, Recall, F1-Score]: [0.926595744680851, 0.9592511013215859, 0.9426406926406926]
[NeMo I 2022-02-18 08:53:47 multi_label_intent_slot_classification_model:349] Maximum Threshold for Precision: 0.9500000000000004, [Precision, Recall, F1-Score]: [0.9725609756097561, 0.70264317
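
The log lines above report the threshold that maximizes each metric. As a rough illustration of the idea, and not NeMo's actual implementation, the sketch below sweeps candidate thresholds over per-label probabilities and keeps the one with the best micro-averaged score. The function names, the 0.01 step, and the toy `probs`/`labels` data are all assumptions made for this example.

```python
def micro_prf(probs, labels, threshold):
    """Micro-averaged (precision, recall, f1) for multi-label predictions."""
    tp = fp = fn = 0
    for row_probs, row_labels in zip(probs, labels):
        for p, y in zip(row_probs, row_labels):
            predicted = p >= threshold
            if predicted and y:
                tp += 1
            elif predicted and not y:
                fp += 1
            elif y:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def best_threshold(probs, labels, metric_index=2):
    """Return the first threshold in 0.01..0.99 that maximizes the chosen metric.

    metric_index: 0 = precision, 1 = recall, 2 = F1.
    """
    candidates = [i / 100 for i in range(1, 100)]
    return max(candidates, key=lambda t: micro_prf(probs, labels, t)[metric_index])


# Hypothetical toy data: 3 queries, 3 intent labels each.
probs = [[0.9, 0.2, 0.4], [0.3, 0.8, 0.6], [0.7, 0.1, 0.55]]
labels = [[1, 0, 0], [0, 1, 1], [1, 0, 0]]

threshold = best_threshold(probs, labels)
print(threshold, micro_prf(probs, labels, threshold))
```

Choosing the metric matters: in the run above, the F1-optimal threshold is about 0.67, while the precision-optimal threshold is about 0.95 and comes with a much lower recall (roughly 0.70).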

In [12]:
queries = [
    'i would like to find a flight from charlotte to las vegas that makes a stop in st. louis',
    'on april first i need a ticket from tacoma to san jose departing before 7 am',
    'how much is the limousine service in boston',
]

# We use the optimized threshold for predictions
pred_intents, pred_slots, pred_list = model.predict_from_examples(queries, config.model.test_ds, model.threshold)
logging.info('The prediction results of some sample queries with the trained model:')
    
for query, intent, slots in zip(queries, pred_intents, pred_slots):
    logging.info(f'Query : {query}')
    logging.info(f'Predicted Intents: {intent}')
    logging.info(f'Predicted Slots: {slots}')

[NeMo I 2022-02-18 08:54:29 intent_slot_classification_dataset:92] Setting max length to: 22
[NeMo I 2022-02-18 08:54:29 data_preprocessing:387] Some stats of the lengths of the sequences:
[NeMo I 2022-02-18 08:54:29 data_preprocessing:393] Min: 10 |                  Max: 22 |                  Mean: 16.666666666666668 |                  Median: 18.0
[NeMo I 2022-02-18 08:54:29 data_preprocessing:395] 75 percentile: 20.00
[NeMo I 2022-02-18 08:54:29 data_preprocessing:396] 99 percentile: 21.92
[NeMo I 2022-02-18 08:54:29 intent_slot_classification_dataset:121] 0 are longer than 22
[NeMo I 2022-02-18 08:54:30 3019254592:9] The prediction results of some sample queries with the trained model:
[NeMo I 2022-02-18 08:54:30 3019254592:12] Query : i would like to find a flight from charlotte to las vegas that makes a stop in st. louis
[NeMo I 2022-02-18 08:54:30 3019254592:13] Predicted Intents: [('flight', 0.95)]
[NeMo I 2022-02-18 08:54:30 3019254592:14] Predicted Slots: O O O O O O O B-time