# Installing NeMo from source


You can run this notebook either locally (if you have all the dependencies and a GPU) or on Google Colab.

Instructions for setting up Colab are as follows:
1. Open a new Python 3 notebook.
2. Import this notebook from GitHub (File -> Upload Notebook -> "GITHUB" tab -> copy/paste GitHub URL)
3. Connect to an instance with a GPU (Runtime -> Change runtime type -> select "GPU" for hardware accelerator)
4. Run the cell below to set up dependencies.


In [None]:
import os 
BRANCH = 'main'
!apt-get update && apt-get install -y libsndfile1 ffmpeg
!git clone https://github.com/NVIDIA/NeMo --branch $BRANCH
os.chdir('NeMo')
!./reinstall.sh
os.chdir('..')


# Overview

This tutorial covers four tasks:

1. Intent and Slot Classification using Assistant Dataset and a BERT model
2. Intent Classification using Schema Guided Dialogue Dataset and a GPT2 model
3. Answer Extender using MS Marco NLGen Dataset and a BART model
4. Zero Shot Slot Filling using Assistant Dataset

Feel free to skip to the task that interests you most after installing NeMo from source.

# 1. Intent and Slot Classification using Assistant Dataset

## 1.1 Task Description

**Joint Intent and Slot Classification** is the task of classifying the Intent of a query and detecting all Slots (Entities)
relevant to that Intent.
For example, in the query `What is the weather in Santa Clara tomorrow morning?`, we would like to classify the query
as a `weather` Intent, and detect `Santa Clara` as a `location` slot and `tomorrow morning` as a `date_time` slot.
Intent and Slot names are usually task-specific and defined as labels in the training data.
This is a fundamental step in any task-driven Conversational Assistant.

The model in this tutorial is trained on both tasks jointly and then predicts Intents and Slots together.
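
As a rough illustration of what the two kinds of labels look like for the query above, the sketch below attaches one Intent to the whole query and one Slot label to each token, with `O` for tokens outside every slot. The `LabeledQuery` container and the `token_slot_labels` helper are hypothetical names used only here; they are not part of NeMo, and whitespace tokenization is a simplifying assumption.

```python
from dataclasses import dataclass, field


@dataclass
class LabeledQuery:
    """One query with its intent and slot annotations (hypothetical container)."""
    text: str
    intent: str
    slots: dict = field(default_factory=dict)  # slot name -> phrase in the query


def token_slot_labels(query):
    """Return (token, slot label) pairs; tokens outside every slot get 'O'."""
    # Assumption: whitespace tokenization, with the trailing '?' stripped.
    tokens = query.text.rstrip("?").split()
    labels = ["O"] * len(tokens)
    for slot_name, phrase in query.slots.items():
        phrase_tokens = phrase.split()
        width = len(phrase_tokens)
        # Label the first occurrence of the slot phrase in the query.
        for start in range(len(tokens) - width + 1):
            if tokens[start:start + width] == phrase_tokens:
                labels[start:start + width] = [slot_name] * width
                break
    return list(zip(tokens, labels))


example = LabeledQuery(
    text="What is the weather in Santa Clara tomorrow morning?",
    intent="weather",
    slots={"location": "Santa Clara", "date_time": "tomorrow morning"},
)
pairs = token_slot_labels(example)
print(example.intent)
print(pairs)
```

The Intent is a single label per query, while the Slots form a sequence of labels, which is why the results later in this section report the two separately.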

Note: There is a similar model available at [Joint Intent Slot Classification Colab](https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/nlp/Joint_Intent_and_Slot_Classification.ipynb). However, that model only supports BERT-style models, while the model in this tutorial also supports other types of models such as GPT2.


## 1.2 Download Assistant dataset and convert to NeMo format

This is a virtual assistant interaction dataset that can be downloaded from https://github.com/xliuhw/NLU-Evaluation-Data.
It contains about 10K training and 1K testing queries covering 64 Intents and 55 Slots.

An example is:

* utterance: what alarms have i set for tomorrow 
* intent: alarm_query
* slots: date(tomorrow)


Note: While only the Assistant dataset is used here, `import_datasets.py` is also compatible with ATIS and SNIPS.

In [None]:
# download and unzip the example dataset from github
!wget https://github.com/xliuhw/NLU-Evaluation-Data/archive/master.zip
!unzip master.zip
# convert the dataset to the NeMo format
!python NeMo/scripts/dataset_processing/nlp/intent_and_slot/import_datasets.py --dataset_name=assistant --source_data_dir=./NLU-Evaluation-Data-master --target_data_dir=./assistant
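
Before training, it can help to check that the conversion produced output. The following is a minimal sketch that lists every file in the target directory with its line count; `summarize_dir` is a hypothetical helper, and the sketch makes no assumption about which file names the conversion script writes.

```python
from pathlib import Path


def summarize_dir(data_dir):
    """Return {file name: number of lines} for every regular file in data_dir."""
    root = Path(data_dir)
    if not root.is_dir():
        # Directory missing: the conversion step probably has not run yet.
        return {}
    summary = {}
    for path in sorted(root.iterdir()):
        if path.is_file():
            with path.open(encoding="utf-8", errors="replace") as handle:
                summary[path.name] = sum(1 for _ in handle)
    return summary


# './assistant' is the --target_data_dir used in the cell above.
for name, line_count in summarize_dir("./assistant").items():
    print(f"{name}: {line_count} lines")
```

An empty listing suggests the download or conversion step failed and should be rerun before moving on.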

## 1.3 Training and/or Testing the model




In [None]:
# model.dataset.data_dir: folder to load data from
# model.dataset.dialogues_example_dir: folder that stores predictions for each sample
!(python NeMo/examples/nlp/dialogue/dialogue.py \
  do_training=True \
  model.dataset.data_dir='./assistant' \
  model.dataset.dialogues_example_dir='./assistant_bert_examples' \
  model.dataset.task='assistant' \
  model.language_model.pretrained_model_name='bert-base-uncased' \
  exp_manager.create_wandb_logger=False)


**Results after 3 epochs**

Intent report: 
```
    label                                                precision    recall       f1           support   
    alarm_query (label_id: 0)                              100.00      94.44      97.14         18
    alarm_remove (label_id: 1)                             100.00      90.91      95.24         11
    alarm_set (label_id: 2)                                 94.12      94.12      94.12         17
    audio_volume_down (label_id: 3)                         75.00      42.86      54.55          7
    audio_volume_mute (label_id: 4)                        100.00      92.86      96.30         14
    audio_volume_up (label_id: 5)                           72.22     100.00      83.87         13
    calendar_query (label_id: 6)                            87.50      77.78      82.35         18
    calendar_remove (label_id: 7)                           94.44     100.00      97.14         17
    calendar_set (label_id: 8)                              94.44      94.44      94.44         18
    cooking_recipe (label_id: 9)                            85.71      70.59      77.42         17
    datetime_convert (label_id: 10)                         88.89     100.00      94.12          8
    datetime_query (label_id: 11)                           89.47     100.00      94.44         17
    email_addcontact (label_id: 12)                         80.00     100.00      88.89          8
    email_query (label_id: 13)                             100.00      83.33      90.91         18
    email_querycontact (label_id: 14)                       78.95      88.24      83.33         17
    email_sendemail (label_id: 15)                          94.44      94.44      94.44         18
    general_affirm (label_id: 16)                          100.00     100.00     100.00         17
    general_commandstop (label_id: 17)                     100.00     100.00     100.00         18
    general_confirm (label_id: 18)                         100.00     100.00     100.00         17
    general_dontcare (label_id: 19)                        100.00     100.00     100.00         18
    general_explain (label_id: 20)                         100.00     100.00     100.00         17
    general_joke (label_id: 21)                             91.67     100.00      95.65         11
    general_negate (label_id: 22)                          100.00     100.00     100.00         18
    general_praise (label_id: 23)                          100.00     100.00     100.00         17
    general_quirky (label_id: 24)                           60.00      50.00      54.55         18
    general_repeat (label_id: 25)                          100.00     100.00     100.00         17
    iot_cleaning (label_id: 26)                            100.00     100.00     100.00         15
    iot_coffee (label_id: 27)                               85.71     100.00      92.31         18
    iot_hue_lightchange (label_id: 28)                     100.00      94.12      96.97         17
    iot_hue_lightdim (label_id: 29)                        100.00     100.00     100.00         12
    iot_hue_lightoff (label_id: 30)                        100.00     100.00     100.00         17
    iot_hue_lighton (label_id: 31)                         100.00      50.00      66.67          4
    iot_hue_lightup (label_id: 32)                          84.62      91.67      88.00         12
    iot_wemo_off (label_id: 33)                            100.00     100.00     100.00          9
    iot_wemo_on (label_id: 34)                             100.00      85.71      92.31          7
    lists_createoradd (label_id: 35)                        90.00     100.00      94.74         18
    lists_query (label_id: 36)                             100.00      94.12      96.97         17
    lists_remove (label_id: 37)                             88.89      88.89      88.89         18
    music_likeness (label_id: 38)                          100.00      93.75      96.77         16
    music_query (label_id: 39)                             100.00     100.00     100.00         17
    music_settings (label_id: 40)                           77.78     100.00      87.50          7
    news_query (label_id: 41)                               72.73      88.89      80.00         18
    play_audiobook (label_id: 42)                          100.00     100.00     100.00         17
    play_game (label_id: 43)                                93.75      83.33      88.24         18
    play_music (label_id: 44)                               85.00     100.00      91.89         17
    play_podcasts (label_id: 45)                           100.00      88.89      94.12         18
    play_radio (label_id: 46)                               84.21      94.12      88.89         17
    qa_currency (label_id: 47)                              85.00      94.44      89.47         18
    qa_definition (label_id: 48)                            89.47     100.00      94.44         17
    qa_factoid (label_id: 49)                               64.00      88.89      74.42         18
    qa_maths (label_id: 50)                                 84.62      84.62      84.62         13
    qa_stock (label_id: 51)                                 87.50      77.78      82.35         18
    recommendation_events (label_id: 52)                    87.50      82.35      84.85         17
    recommendation_locations (label_id: 53)                 83.33      83.33      83.33         18
    recommendation_movies (label_id: 54)                   100.00      60.00      75.00         10
    social_post (label_id: 55)                             100.00      94.12      96.97         17
    social_query (label_id: 56)                            100.00      82.35      90.32         17
    takeaway_order (label_id: 57)                           92.31      70.59      80.00         17
    takeaway_query (label_id: 58)                           93.75      83.33      88.24         18
    transport_query (label_id: 59)                          81.25      76.47      78.79         17
    transport_taxi (label_id: 60)                          100.00     100.00     100.00         16
    transport_ticket (label_id: 61)                         85.00      94.44      89.47         18
    transport_traffic (label_id: 62)                        93.75      88.24      90.91         17
    weather_query (label_id: 63)                            89.47     100.00      94.44         17
    -------------------
    micro avg                                               91.16      91.16      91.16        996
    macro avg                                               91.66      90.44      90.48        996
    weighted avg                                            91.72      91.16      91.04        996
```
Slot report: 
```
    label                                                precision    recall       f1           support   
    alarm_type (label_id: 0)                                 0.00       0.00       0.00          2
    app_name (label_id: 1)                                   0.00       0.00       0.00          1
    artist_name (label_id: 2)                               17.39      80.00      28.57          5
    audiobook_author (label_id: 3)                           0.00       0.00       0.00          0
    audiobook_name (label_id: 4)                            64.52      74.07      68.97         27
    business_name (label_id: 5)                             81.48      84.62      83.02         52
    business_type (label_id: 6)                             80.00      80.00      80.00         20
    change_amount (label_id: 7)                             57.14      66.67      61.54          6
    coffee_type (label_id: 8)                              100.00      33.33      50.00          3
    color_type (label_id: 9)                                75.00      92.31      82.76         13
    cooking_type (label_id: 10)                              0.00       0.00       0.00          1
    currency_name (label_id: 11)                           100.00      96.43      98.18         28
    date (label_id: 12)                                     87.88      87.22      87.55        133
    definition_word (label_id: 13)                          85.00      85.00      85.00         20
    device_type (label_id: 14)                              84.75      76.92      80.65         65
    drink_type (label_id: 15)                                0.00       0.00       0.00          0
    email_address (label_id: 16)                            64.29     100.00      78.26          9
    email_folder (label_id: 17)                            100.00      50.00      66.67          2
    event_name (label_id: 18)                               80.00      75.00      77.42         64
    food_type (label_id: 19)                                84.38      77.14      80.60         35
    game_name (label_id: 20)                                93.55      78.38      85.29         37
    game_type (label_id: 21)                                 0.00       0.00       0.00          0
    general_frequency (label_id: 22)                         0.00       0.00       0.00          9
    house_place (label_id: 23)                              80.95      91.89      86.08         37
    ingredient (label_id: 24)                                0.00       0.00       0.00          1
    joke_type (label_id: 25)                               100.00     100.00     100.00          5
    list_name (label_id: 26)                                89.29      69.44      78.12         36
    meal_type (label_id: 27)                                 0.00       0.00       0.00          3
    media_type (label_id: 28)                               78.95      83.33      81.08         36
    movie_name (label_id: 29)                                0.00       0.00       0.00          1
    movie_type (label_id: 30)                                0.00       0.00       0.00          0
    music_album (label_id: 31)                               0.00       0.00       0.00          0
    music_descriptor (label_id: 32)                          0.00       0.00       0.00          2
    music_genre (label_id: 33)                              81.82      90.00      85.71         10
    news_topic (label_id: 34)                               80.00      30.77      44.44         13
    order_type (label_id: 35)                              100.00      42.11      59.26         19
    person (label_id: 36)                                   70.79     100.00      82.89         63
    personal_info (label_id: 37)                            76.19      94.12      84.21         17
    place_name (label_id: 38)                               82.86      84.47      83.65        103
    player_setting (label_id: 39)                           75.00      42.86      54.55          7
    playlist_name (label_id: 40)                             0.00       0.00       0.00          3
    podcast_descriptor (label_id: 41)                       92.31      54.55      68.57         22
    podcast_name (label_id: 42)                             66.67      16.67      26.67         12
    radio_name (label_id: 43)                               94.87      94.87      94.87         39
    relation (label_id: 44)                                 90.91      90.91      90.91         11
    song_name (label_id: 45)                               100.00       6.67      12.50         15
    time (label_id: 46)                                     77.57      84.69      80.98         98
    time_zone (label_id: 47)                                44.44     100.00      61.54          4
    timeofday (label_id: 48)                                86.96      80.00      83.33         25
    transport_agency (label_id: 49)                         80.00      57.14      66.67          7
    transport_descriptor (label_id: 50)                      0.00       0.00       0.00          5
    transport_name (label_id: 51)                            0.00       0.00       0.00          0
    transport_type (label_id: 52)                           88.89     100.00      94.12         40
    weather_descriptor (label_id: 53)                       87.50      87.50      87.50          8
    O (label_id: 54)                                        97.07      97.52      97.30       5408
    -------------------
    micro avg                                               94.24      94.24      94.24       6582
    macro avg                                               64.87      59.93      59.17       6582
    weighted avg                                            94.23      94.24      93.95       6582
```
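
The gap between the macro average and the other averages in the slot report is easier to read with the arithmetic spelled out. The sketch below assumes the reports use the standard definitions: F1 as the harmonic mean of precision and recall, the macro average as an unweighted mean over labels, and the weighted average as a mean weighted by support. The three rows are copied from the slot report above and serve only as an illustration, so the printed averages are for those three rows and not for the full report.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_average(scores):
    """Unweighted mean: every label counts the same, however rare it is."""
    return sum(scores) / len(scores)


def weighted_average(scores, supports):
    """Mean weighted by support: frequent labels dominate the result."""
    return sum(s * n for s, n in zip(scores, supports)) / sum(supports)


# (f1, support) for three rows copied from the slot report above.
rows = {"alarm_type": (0.00, 2), "date": (87.55, 133), "O": (97.30, 5408)}
f1_values = [f1 for f1, _ in rows.values()]
supports = [n for _, n in rows.values()]

# Precision and recall of the alarm_query row in the intent report.
print(round(f1_score(100.00, 94.44), 2))
print(round(macro_average(f1_values), 2))
print(round(weighted_average(f1_values, supports), 2))
```

Because the `O` label has far more support than any slot label, rare slots with an F1 of zero pull the macro average down while barely moving the weighted average.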

## 1.4 (Optional) To train/test a GPT2 model on the Assistant dataset, run the cell below

In [None]:
# model.dataset.data_dir: folder to load data from
# model.dataset.dialogues_example_dir: folder that stores predictions for each sample
# model.tokenizer.special_tokens="{pad_token:'<|endoftext|>'}": gpt2 doesn't specify a pad token, so its EOS token is used as the pad token
# model.dataset.target_template=with_slots: performs slot filling together with intent classification
!(python NeMo/examples/nlp/dialogue/dialogue.py \
  do_training=True \
  model.dataset.data_dir='./assistant' \
  model.dataset.dialogues_example_dir='./assistant_gpt2_examples' \
  model.dataset.task='assistant' \
  model.language_model.pretrained_model_name='gpt2' \
  trainer.max_epochs=1 \
  model.tokenizer.special_tokens="{pad_token:'<|endoftext|>'}" \
  model.dataset.target_template=with_slots \
  model.dataset.eval_mode=generation \
  exp_manager.create_wandb_logger=False)
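
To make the pad token comment in the cell above concrete, here is a minimal sketch of batching variable-length sequences when the EOS token doubles as padding. It uses plain Python lists and does not reproduce NeMo's data pipeline: `pad_batch` is a hypothetical helper, right-padding is an assumption, and the input token ids are made up. The value 50256 is the id of `<|endoftext|>` in the standard GPT-2 vocabulary.

```python
GPT2_EOS_ID = 50256  # id of '<|endoftext|>' in the standard GPT-2 vocabulary


def pad_batch(sequences, pad_id=GPT2_EOS_ID):
    """Right-pad token-id lists to equal length and build attention masks."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(list(seq) + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks padding the model should ignore.
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return input_ids, attention_mask


# Hypothetical token ids, chosen only for illustration.
ids, mask = pad_batch([[10, 11, 12], [20]])
print(ids)
print(mask)
```

The attention mask is what lets the model tell a padding EOS apart from a genuine end-of-text token, since both share the same id.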

**After 1 epoch:**

More epochs would likely improve these results.

Intent report:

```
    label                                                precision    recall       f1           support   
    transport query (label_id: 0)                           72.73      84.21      78.05         19
    weather query (label_id: 1)                             94.74      94.74      94.74         19
    play game (label_id: 2)                                 92.86      68.42      78.79         19
    qa currency (label_id: 3)                              100.00     100.00     100.00         19
    qa maths (label_id: 4)                                 100.00     100.00     100.00         14
    iot wemo off (label_id: 5)                              75.00     100.00      85.71          9
    datetime convert (label_id: 6)                          46.67      87.50      60.87          8
    email addcontact (label_id: 7)                          70.00      87.50      77.78          8
    music likeness (label_id: 8)                            57.89      61.11      59.46         18
    music query (label_id: 9)                               78.57      57.89      66.67         19
    general negate (label_id: 10)                           95.00     100.00      97.44         19
    email sendemail (label_id: 11)                          92.86      68.42      78.79         19
    general affirm (label_id: 12)                           95.00     100.00      97.44         19
    play audiobook (label_id: 13)                           57.69      78.95      66.67         19
    general praise (label_id: 14)                          100.00      94.74      97.30         19
    alarm set (label_id: 15)                                85.71      94.74      90.00         19
    general explain (label_id: 16)                         100.00      89.47      94.44         19
    iot wemo on (label_id: 17)                              83.33      71.43      76.92          7
    cooking recipe (label_id: 18)                           90.00      94.74      92.31         19
    music settings (label_id: 19)                           60.00      42.86      50.00          7
    social post (label_id: 20)                              84.21      84.21      84.21         19
    recommendation events (label_id: 21)                    72.73      84.21      78.05         19
    audio volume up (label_id: 22)                          76.47     100.00      86.67         13
    lists remove (label_id: 23)                             73.08     100.00      84.44         19
    transport ticket (label_id: 24)                         94.74      94.74      94.74         19
    general joke (label_id: 25)                            100.00     100.00     100.00         12
    play podcasts (label_id: 26)                            94.12      84.21      88.89         19
    iot hue lightchange (label_id: 27)                      85.71      63.16      72.73         19
    audio volume mute (label_id: 28)                        84.62      73.33      78.57         15
    general dontcare (label_id: 29)                         95.00     100.00      97.44         19
    qa definition (label_id: 30)                            77.27      89.47      82.93         19
    email querycontact (label_id: 31)                       58.33      73.68      65.12         19
    general commandstop (label_id: 32)                     100.00     100.00     100.00         19
    calendar remove (label_id: 33)                          94.44      89.47      91.89         19
    news query (label_id: 34)                              100.00      57.89      73.33         19
    calendar query (label_id: 35)                           63.16      63.16      63.16         19
    social query (label_id: 36)                             88.24      83.33      85.71         18
    transport traffic (label_id: 37)                        90.48     100.00      95.00         19
    transport taxi (label_id: 38)                          100.00      94.44      97.14         18
    alarm query (label_id: 39)                             100.00      94.74      97.30         19
    iot hue lightoff (label_id: 40)                         88.89      84.21      86.49         19
    takeaway order (label_id: 41)                           81.25      68.42      74.29         19
    iot coffee (label_id: 42)                              100.00      94.74      97.30         19
    recommendation movies (label_id: 43)                    75.00      90.00      81.82         10
    iot hue lightup (label_id: 44)                          78.57      78.57      78.57         14
    email query (label_id: 45)                              85.71      94.74      90.00         19
    lists createoradd (label_id: 46)                        82.35      73.68      77.78         19
    play radio (label_id: 47)                               84.21      84.21      84.21         19
    audio volume down (label_id: 48)                       100.00      87.50      93.33          8
    general quirky (label_id: 49)                           30.00      15.79      20.69         19
    play music (label_id: 50)                               71.43      52.63      60.61         19
    qa stock (label_id: 51)                                 90.48     100.00      95.00         19
    iot cleaning (label_id: 52)                             93.33      87.50      90.32         16
    iot hue lightdim (label_id: 53)                        100.00     100.00     100.00         12
    recommendation locations (label_id: 54)                100.00      89.47      94.44         19
    general repeat (label_id: 55)                          100.00     100.00     100.00         19
    takeaway query (label_id: 56)                           77.27      89.47      82.93         19
    alarm remove (label_id: 57)                            100.00     100.00     100.00         11
    datetime query (label_id: 58)                           75.00      63.16      68.57         19
    iot hue lighton (label_id: 59)                          60.00     100.00      75.00          3
    qa factoid (label_id: 60)                               50.00      57.89      53.66         19
    calendar set (label_id: 61)                             75.00      78.95      76.92         19
    general confirm (label_id: 62)                         100.00     100.00     100.00         19
    lists query (label_id: 63)                              66.67      73.68      70.00         19
    label_id: 64                                             0.00       0.00       0.00          0
    -------------------
    micro avg                                               83.55      83.55      83.55       1076
    macro avg                                               83.53      83.93      83.01       1076
    weighted avg                                            84.26      83.55      83.30       1076
    
```

```
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       Test metric             DataLoader 0
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
        intent_f1            83.55018615722656
    intent_precision         83.55018615722656
      intent_recall          83.55018615722656
         slot_f1             73.99985919756773
slot_joint_goal_accuracy     65.89219330855019
     slot_precision          73.85223048327137
       slot_recall           74.14807930607186
  test_intent_accuracy       83.55018587360595
     test_loss_epoch       0.019178826361894608
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
```

# 2. Schema Guided Dialogue (SGD)

## 2.1 Task Description
---

SGD is a multi-domain intent classification dataset from Google with close to 100k examples.

An example is:

* utterance: I will be eating there at 11:30 am so make the reservation for then.
* intent: ReserveRestaurant
* slots: {"time": "11:30 am"}




## 2.2 Download the dataset

In [None]:
!git clone https://github.com/google-research-datasets/dstc8-schema-guided-dialogue.git
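
The following sketch shows one way to pull (utterance, intent, slots) triples like the example above out of a dialogue. It assumes the dialogue layout used by the cloned repository: a list of `turns`, each with a `speaker`, an `utterance`, and `frames` whose `state` holds `active_intent` and `slot_values`. The sample dialogue is hand-written around the example in this section and is not copied from the dataset, the service name in it is a placeholder, and `user_turn_labels` is a hypothetical helper.

```python
# Hand-written sample shaped like one SGD dialogue; not copied from the dataset.
SAMPLE_DIALOGUE = {
    "dialogue_id": "example_0",
    "turns": [
        {
            "speaker": "USER",
            "utterance": "I will be eating there at 11:30 am so make the reservation for then.",
            "frames": [
                {
                    "service": "Restaurants_1",  # placeholder service name
                    "state": {
                        "active_intent": "ReserveRestaurant",
                        "slot_values": {"time": ["11:30 am"]},
                    },
                }
            ],
        },
        {"speaker": "SYSTEM", "utterance": "Please confirm the booking.", "frames": []},
    ],
}


def user_turn_labels(dialogue):
    """Yield (utterance, active_intent, slot_values) for each USER frame."""
    for turn in dialogue.get("turns", []):
        if turn.get("speaker") != "USER":
            continue  # only user turns carry a dialogue state
        for frame in turn.get("frames", []):
            state = frame.get("state", {})
            yield (
                turn["utterance"],
                state.get("active_intent", "NONE"),
                state.get("slot_values", {}),
            )


labels = list(user_turn_labels(SAMPLE_DIALOGUE))
print(labels)
```

To try this on real data, load one of the JSON files from the cloned repository with `json.load` and pass each dialogue in the resulting list to `user_turn_labels`.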

## 2.3 Training and/or Testing the model


In [None]:
# model.dataset.data_dir: folder to load data from
# model.dataset.dialogues_example_dir: folder that stores predictions for each sample
# model.tokenizer.special_tokens="{pad_token:'<|endoftext|>'}": gpt2 doesn't specify a pad token, therefore using its EOS token as the pad token

!(python NeMo/examples/nlp/dialogue/dialogue.py \
  do_training=True \
  model.dataset.data_dir='./dstc8-schema-guided-dialogue' \
  model.dataset.dialogues_example_dir='./sgd_gpt2_predictions' \
  model.dataset.task='sgd' \
  model.language_model.pretrained_model_name='gpt2' \
  trainer.max_epochs=1 \
  model.tokenizer.special_tokens="{pad_token:'<|endoftext|>'}" \
  exp_manager.create_wandb_logger=False)


In [None]:
!ls sgd_gpt2_predictions

**After 1 epoch:**

More epochs would be needed to reach convergence.


```
    label                                                precision    recall       f1           support   
    check balance (label_id: 0)                              0.00       0.00       0.00          0
    find trains (label_id: 1)                               80.20      91.95      85.68        348
    make payment (label_id: 2)                              83.12      28.07      41.97        228
    book appointment (label_id: 3)                          86.93      87.15      87.04        397
    get cars available (label_id: 4)                        96.88      90.51      93.58        274
    get event dates (label_id: 5)                            0.00       0.00       0.00          0
    buy bus ticket (label_id: 6)                            78.61      91.33      84.49        173
    add event (label_id: 7)                                  0.00       0.00       0.00          0
    get alarms (label_id: 8)                                58.33      77.78      66.67         45
    reserve car (label_id: 9)                               83.75      72.43      77.68        185
    get events (label_id: 10)                                0.00       0.00       0.00          0
    reserve roundtrip flights (label_id: 11)                 0.00       0.00       0.00          0
    lookup music (label_id: 12)                             89.83      86.89      88.33         61
    book house (label_id: 13)                               91.13      92.50      91.81        200
    search oneway flight (label_id: 14)                     74.77      47.70      58.25        174
    buy event tickets (label_id: 15)                        72.19      95.31      82.15        128
    find apartment (label_id: 16)                            0.00       0.00       0.00          0
    schedule visit (label_id: 17)                           77.27      66.06      71.23        386
    play media (label_id: 18)                               92.94      86.81      89.77         91
    get ride (label_id: 19)                                 99.41      98.82      99.12        170
    reserve oneway flight (label_id: 20)                     0.00       0.00       0.00          0
    find bus (label_id: 21)                                 96.64      87.53      91.86        361
    find restaurants (label_id: 22)                         77.14      91.22      83.59        148
    get times for movie (label_id: 23)                       0.00       0.00       0.00          0
    transfer money (label_id: 24)                            0.00       0.00       0.00          0
    request payment (label_id: 25)                          46.71      63.39      53.79        112
    play movie (label_id: 26)                              100.00      65.11      78.87        321
    search house (label_id: 27)                             97.91      91.83      94.77        306
    search roundtrip flights (label_id: 28)                 67.49      82.41      74.21        199
    find provider (label_id: 29)                            95.11      90.53      92.77        602
    find attractions (label_id: 30)                        100.00      89.01      94.19         91
    reserve hotel (label_id: 31)                            56.75      97.04      71.62        169
    lookup song (label_id: 32)                               0.00       0.00       0.00          0
    add alarm (label_id: 33)                                95.68      60.18      73.89        221
    find home by area (label_id: 34)                        48.95      59.79      53.83        194
    get available time (label_id: 35)                        0.00       0.00       0.00          0
    buy movie tickets (label_id: 36)                       100.00      29.39      45.42        473
    reserve restaurant (label_id: 37)                       95.71      84.80      89.92        342
    find movies (label_id: 38)                              62.40      97.61      76.14        335
    get weather (label_id: 39)                             100.00      87.69      93.44        195
    search hotel (label_id: 40)                             99.35      52.60      68.78        289
    find events (label_id: 41)                              99.57      82.56      90.27        281
    play song (label_id: 42)                                 0.00       0.00       0.00          0
    rent movie (label_id: 43)                                0.00       0.00       0.00          0
    get train tickets (label_id: 44)                        45.83       5.56       9.91        198
    none (label_id: 45)                                     55.77      98.90      71.32        728
    label_id: 46                                             0.00       0.00       0.00          0
    -------------------
    micro avg                                               77.23      77.23      77.23       8425
    macro avg                                               82.01      76.68      76.56       8425
    weighted avg                                            83.23      77.23      76.86       8425

```

# 3. MS Marco

## 3.1 Task Description

MS Marco NLGen is a dataset from Microsoft for a task that takes a question and an extracted answer as input and outputs a fluent answer.

An example is:


*   question: What county is Nine Mile in?
*   extracted_answer: Onondaga
*   fluent_answer: Nine Mile is in Onondaga county.


## 3.2 Download and unzip files

In [None]:
!mkdir -p ms_marco
os.chdir('ms_marco')
!wget https://msmarco.blob.core.windows.net/msmarco/train_v2.1.json.gz
!wget https://msmarco.blob.core.windows.net/msmarco/dev_v2.1.json.gz

!gunzip train_v2.1.json.gz
!gunzip dev_v2.1.json.gz

!python ../NeMo/examples/nlp/dialogue/remove_ms_marco_samples_without_wellFormedAnswers.py --filename train_v2.1.json 
!python ../NeMo/examples/nlp/dialogue/remove_ms_marco_samples_without_wellFormedAnswers.py --filename dev_v2.1.json 

os.chdir('..')
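
The filtering step above keeps only the samples that have a well-formed (fluent) answer. The following is a minimal sketch of that idea on toy data, not the actual NeMo script. It assumes a column-oriented layout (each field maps a sample id to a value) and assumes that a missing well-formed answer is stored as the string `"[]"`; both are assumptions about the file format, and the second toy sample is invented for illustration.

```python
import json


def keep_well_formed(data: dict) -> dict:
    """Keep only samples whose wellFormedAnswers entry is a non-empty list."""
    well_formed = data["wellFormedAnswers"]
    keep_ids = [
        sample_id
        for sample_id, value in well_formed.items()
        # Assumption: missing answers are stored as the string "[]", not a list.
        if isinstance(value, list) and any(answer.strip() for answer in value)
    ]
    # Apply the same id filter to every field so the columns stay aligned.
    return {
        field: {sample_id: column[sample_id] for sample_id in keep_ids}
        for field, column in data.items()
    }


# Toy data in the assumed layout; sample "1" is hypothetical.
toy = {
    "query": {"0": "What county is Nine Mile in?", "1": "weather in santa clara"},
    "answers": {"0": ["Onondaga"], "1": ["No Answer Present."]},
    "wellFormedAnswers": {"0": ["Nine Mile is in Onondaga county."], "1": "[]"},
}

filtered = keep_well_formed(toy)
print(json.dumps(filtered, indent=2))
```

Only sample `"0"` survives, because it is the only one with a fluent answer to train on.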

## Training and/or Testing the model


In [None]:
# model.dataset.data_dir: folder to load data from
# model.dataset.dialogues_example_dir: folder that stores predictions for each sample

!(python NeMo/examples/nlp/dialogue/dialogue.py \
  do_training=True \
  model.dataset.dialogues_example_dir='./marco_bart_predictions' \
  model.dataset.data_dir='./ms_marco' \
  model.save_model=True \
  model.dataset.task='ms_marco' \
  model.language_model.pretrained_model_name='facebook/bart-base' \
  trainer.max_epochs=1 \
  model.dataset.debug_mode=False \
  exp_manager.create_wandb_logger=False)

**After 1 epoch:**

Train for more epochs to improve performance.

```
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       Test metric             DataLoader 0
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
          bleu               65.46179962158203
           f1                78.24439835896995
        precision            81.92473076099847
         recall              76.72508929408436
      test_accuracy         25.563487607283225
        test_loss           0.4419259166606655
     test_loss_epoch        0.4420809745788574
        test_ppl            1.5557004846779854
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
```
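
The `test_ppl` value in the table above is numerically consistent with perplexity being the exponential of the test loss (natural logarithm). The short check below uses only the two numbers reported in the table. That the metric is computed this way inside NeMo is an inference from the numbers, not something confirmed here.

```python
import math

# Values copied from the test metric table above.
test_loss = 0.4419259166606655
reported_ppl = 1.5557004846779854

# Perplexity as exp(mean cross-entropy loss).
ppl_from_loss = math.exp(test_loss)

print(f"exp(test_loss) = {ppl_from_loss:.4f}")
print(f"reported ppl   = {reported_ppl:.4f}")
```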

# 4. Zero Shot Slot Filling using Assistant Dataset

## 4.1 Task Description

**Zero Shot Slot Filling** is the task of identifying and filling slots without requiring fine-tuning on labelled data. The model first detects mentions as BIO spans (Beginning, Inside, Outside) and then assigns each mention to a slot class based on the similarity between the mention and the descriptions of the slot classes.

For example, in the query:  `What is the weather in Santa Clara tomorrow morning?`, we would like to detect `Santa Clara` as a `location` slot and `tomorrow morning` as a `date_time` slot.
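
The two stages described above can be sketched in a few lines of plain Python. This is an illustration of the idea only, not NeMo's implementation: the BIO tags are written by hand instead of being predicted, and the 2-dimensional "embeddings" for mentions and slot descriptions are made-up toy vectors.

```python
import math


def extract_mentions(tokens, tags):
    """Group tokens into mentions using B (begin), I (inside), O (outside) tags."""
    mentions, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            if current:
                mentions.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            current.append(token)
        else:
            if current:
                mentions.append(" ".join(current))
            current = []
    if current:
        mentions.append(" ".join(current))
    return mentions


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def classify(mention_vec, description_vecs):
    """Return the slot class whose description is most similar to the mention."""
    return max(description_vecs, key=lambda name: cosine(mention_vec, description_vecs[name]))


tokens = "what is the weather in santa clara tomorrow morning".split()
tags = ["O", "O", "O", "O", "O", "B", "I", "B", "I"]  # hand-written for illustration
mentions = extract_mentions(tokens, tags)

# Toy vectors standing in for encoder outputs (hypothetical values).
description_vecs = {"location": [1.0, 0.1], "date_time": [0.1, 1.0]}
mention_vecs = {"santa clara": [0.9, 0.2], "tomorrow morning": [0.2, 0.8]}

slots = {mention: classify(mention_vecs[mention], description_vecs) for mention in mentions}
print(slots)
```

Because classification relies on slot descriptions rather than a fixed output layer, a new slot class can in principle be added by supplying a new description.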

Slot names are usually task-specific and defined as labels in the training data.
This is a fundamental step that is executed in any task-driven Conversational Assistant.

Our model can be fine-tuned on an in-domain dataset, and the fine-tuned model can then be tested on an out-of-domain dataset.



## 4.2 Download Assistant dataset and convert to NeMo format

(If you already downloaded the Assistant dataset in section 1.2, skip section 4.2.)

This is a virtual assistant interaction data set that can be downloaded from here: https://github.com/xliuhw/NLU-Evaluation-Data.
There are about 10K training and 1K testing queries which cover 64 various Intents and 55 Slots. 

An example is:

* utterance: what alarms have i set for tomorrow 
* intent: alarm_query
* slots: date(tomorrow)


In [None]:
# download and unzip the example dataset from github
!wget https://github.com/xliuhw/NLU-Evaluation-Data/archive/master.zip
!unzip master.zip
# convert the dataset to the NeMo format
!python NeMo/scripts/dataset_processing/nlp/intent_and_slot/import_datasets.py --dataset_name=assistant --source_data_dir=./NLU-Evaluation-Data-master --target_data_dir=./assistant

## 4.3 Pre-process the dataset and Generate slot class description


In [None]:
# pre-process dataset for zero shot slot filling
!python NeMo/examples/nlp/dialogue/preprocess_for_zero_shot_slot_filling.py --preprocess_file_path ./assistant/ --dataset assistant

# generate description of the slot class from pre-processed dataset
!python NeMo/examples/nlp/dialogue/generate_description_for_zero_shot_slot_filling.py --preprocess_file_path ./assistant/with_entity/ --dataset assistant

## 4.4 Training and/or Testing the model in domain

In [None]:
# model.dataset.data_dir: folder to load data from
# model.dataset.dialogues_example_dir: folder that stores predictions for each sample
# trainer.max_epochs: number of epochs for training
# model.bio_slot_loss_weight: mention detection (BIO tagging) loss weight
# model.nemo_path: path where the trained NeMo model is saved
# model.optim.lr: learning rate for training
# the best overall F1 score on the Assistant dataset was obtained with: bio_slot_loss_weight=0.5, max_epochs=10, optim.lr=0.00001
  
!(python NeMo/examples/nlp/dialogue/dialogue.py \
  do_training=True \
  model.dataset.data_dir="./assistant/with_entity" \
  model.dataset.dialogues_example_dir="./assistant/with_entity_prediction" \
  model.dataset.task='zero_shot_slot_filling' \
  model.language_model.pretrained_model_name='bert-base-uncased' \
  exp_manager.create_wandb_logger=False \
  trainer.max_epochs=10 \
  model.bio_slot_loss_weight=0.5 \
  model.nemo_path="nemo_experiments/assistant/assistant_0.5_0.5_epoch_10_lr_0.00001.nemo" \
  model.optim.lr=0.00001)
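
One plausible reading of `model.bio_slot_loss_weight` is a convex combination of the two training objectives: the mention detection (BIO tagging) loss and the slot similarity loss. The checkpoint name above (`assistant_0.5_0.5_...`) hints at a pair of weights that sum to 1, but this is an assumption, not NeMo's documented behavior. The function below is a hypothetical sketch of that reading.

```python
def combined_loss(bio_loss: float, similarity_loss: float, bio_slot_loss_weight: float = 0.5) -> float:
    """Hypothetical weighted sum of the two losses (assumed, not taken from NeMo)."""
    if not 0.0 <= bio_slot_loss_weight <= 1.0:
        raise ValueError("bio_slot_loss_weight must be between 0 and 1")
    # The similarity loss receives the remaining weight.
    return bio_slot_loss_weight * bio_loss + (1.0 - bio_slot_loss_weight) * similarity_loss


# Toy loss values for illustration only.
print(combined_loss(2.0, 1.0, bio_slot_loss_weight=0.5))
print(combined_loss(2.0, 1.0, bio_slot_loss_weight=0.7))
```

Under this reading, a larger weight puts more emphasis on finding mention boundaries and less on assigning the slot class.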

**Results after 10 epochs**

BIO slot report: 
```
    label                                                precision    recall       f1           support   
    0 (label_id: 0)                                         97.06      95.86      96.45       3404
    1 (label_id: 1)                                         87.86      91.38      89.58        800
    2 (label_id: 2)                                         86.23      87.60      86.91        629
    -------------------
    micro avg                                               94.04      94.04      94.04       4833
    macro avg                                               90.38      91.61      90.98       4833
    weighted avg                                            94.12      94.04      94.07       4833
```
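
The `macro avg` and `weighted avg` rows in the BIO slot report above can be reproduced from the per-class rows. The sketch below copies the three rows from the report: the macro average is the unweighted mean over classes, and the weighted average weights each class by its support. Small differences in the last digit are expected because the inputs are already rounded.

```python
# label: (precision, recall, f1, support), copied from the BIO slot report above.
rows = {
    "0": (97.06, 95.86, 96.45, 3404),
    "1": (87.86, 91.38, 89.58, 800),
    "2": (86.23, 87.60, 86.91, 629),
}
total_support = sum(row[3] for row in rows.values())


def macro(column: int) -> float:
    """Unweighted mean of one metric column over all classes."""
    return sum(row[column] for row in rows.values()) / len(rows)


def weighted(column: int) -> float:
    """Support-weighted mean of one metric column."""
    return sum(row[column] * row[3] for row in rows.values()) / total_support


for name, column in [("precision", 0), ("recall", 1), ("f1", 2)]:
    print(f"{name:<10} macro={macro(column):.2f} weighted={weighted(column):.2f}")
```

The support-weighted recall equals the overall accuracy, which is why it matches the `micro avg` row (94.04) in this single-label setting.
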
Slot similarity report: 
```
    label                                                precision    recall       f1           support   
    alarm type (label_id: 0)                                 0.00       0.00       0.00          0
    app name (label_id: 1)                                  83.33     100.00      90.91          5
    artist name (label_id: 2)                               88.89      72.73      80.00         11
    audiobook author (label_id: 3)                         100.00     100.00     100.00          1
    audiobook name (label_id: 4)                            69.23      90.00      78.26         10
    business name (label_id: 5)                             92.68     100.00      96.20         38
    business type (label_id: 6)                             89.47      94.44      91.89         18
    change amount (label_id: 7)                             81.82     100.00      90.00          9
    coffee type (label_id: 8)                              100.00      75.00      85.71          4
    color type (label_id: 9)                               100.00      90.91      95.24         11
    cooking type (label_id: 10)                              0.00       0.00       0.00          0
    currency name (label_id: 11)                           100.00      94.44      97.14         18
    date (label_id: 12)                                     95.06      93.90      94.48         82
    definition word (label_id: 13)                         100.00     100.00     100.00         16
    device type (label_id: 14)                              95.24     100.00      97.56         40
    drink type (label_id: 15)                                0.00       0.00       0.00          0
    email address (label_id: 16)                           100.00     100.00     100.00          4
    email folder (label_id: 17)                            100.00     100.00     100.00          1
    event name (label_id: 18)                               92.68      79.17      85.39         48
    food type (label_id: 19)                                92.00      95.83      93.88         24
    game name (label_id: 20)                               100.00      93.75      96.77         16
    game type (label_id: 21)                                 0.00       0.00       0.00          0
    general frequency (label_id: 22)                        75.00      60.00      66.67          5
    house place (label_id: 23)                             100.00      95.83      97.87         24
    ingredient (label_id: 24)                              100.00      50.00      66.67          4
    joke type (label_id: 25)                               100.00     100.00     100.00          4
    list name (label_id: 26)                                61.11      91.67      73.33         12
    meal type (label_id: 27)                                 0.00       0.00       0.00          0
    media type (label_id: 28)                               93.33      87.50      90.32         32
    movie name (label_id: 29)                                0.00       0.00       0.00          0
    movie type (label_id: 30)                                0.00       0.00       0.00          0
    music album (label_id: 31)                               0.00       0.00       0.00          0
    music descriptor (label_id: 32)                          0.00       0.00       0.00          2
    music genre (label_id: 33)                              77.78     100.00      87.50          7
    news topic (label_id: 34)                               85.71      66.67      75.00          9
    order type (label_id: 35)                              100.00     100.00     100.00         17
    person (label_id: 36)                                   86.67      97.50      91.76         40
    personal info (label_id: 37)                            92.86     100.00      96.30         13
    place name (label_id: 38)                               98.72      92.77      95.65         83
    player setting (label_id: 39)                           33.33     100.00      50.00          1
    playlist name (label_id: 40)                             0.00       0.00       0.00          1
    podcast descriptor (label_id: 41)                      100.00      83.33      90.91          6
    podcast name (label_id: 42)                            100.00      50.00      66.67          2
    radio name (label_id: 43)                              100.00      83.33      90.91         12
    relation (label_id: 44)                                 84.62      84.62      84.62         13
    song name (label_id: 45)                                71.43      55.56      62.50          9
    time (label_id: 46)                                     93.33      93.33      93.33         60
    time zone (label_id: 47)                                71.43     100.00      83.33          5
    timeofday (label_id: 48)                                92.59      96.15      94.34         26
    transport agency (label_id: 49)                        100.00     100.00     100.00          9
    transport descriptor (label_id: 50)                      0.00       0.00       0.00          0
    transport name (label_id: 51)                          100.00     100.00     100.00          2
    transport type (label_id: 52)                          100.00     100.00     100.00         34
    weather descriptor (label_id: 53)                      100.00      83.33      90.91         12
    other (label_id: 54)                                     0.00       0.00       0.00          0
    -------------------
    micro avg                                               91.88      91.88      91.88        800
    macro avg                                               86.63      85.59      84.93        800
    weighted avg                                            92.92      91.88      92.05        800
```

Overall slot report: 
```
    label                                                precision    recall       f1           support   
    alarm type (label_id: 0)                                 0.00       0.00       0.00          0
    app name (label_id: 1)                                  83.33      83.33      83.33          6
    artist name (label_id: 2)                               87.50      66.67      75.68         21
    audiobook author (label_id: 3)                          50.00     100.00      66.67          1
    audiobook name (label_id: 4)                            50.00      83.33      62.50         18
    business name (label_id: 5)                             86.44      98.08      91.89         52
    business type (label_id: 6)                             68.00      70.83      69.39         24
    change amount (label_id: 7)                             83.33      80.00      81.63         25
    coffee type (label_id: 8)                              100.00      75.00      85.71          4
    color type (label_id: 9)                                62.50      83.33      71.43         12
    cooking type (label_id: 10)                              0.00       0.00       0.00          0
    currency name (label_id: 11)                            95.65      88.00      91.67         25
    date (label_id: 12)                                     83.61      91.07      87.18        112
    definition word (label_id: 13)                         100.00     100.00     100.00         20
    device type (label_id: 14)                              92.59      93.75      93.17         80
    drink type (label_id: 15)                                0.00       0.00       0.00          0
    email address (label_id: 16)                           100.00      71.43      83.33         14
    email folder (label_id: 17)                            100.00     100.00     100.00          1
    event name (label_id: 18)                               74.19      67.65      70.77         68
    food type (label_id: 19)                                76.74      76.74      76.74         43
    game name (label_id: 20)                               100.00      90.48      95.00         21
    game type (label_id: 21)                                 0.00       0.00       0.00          0
    general frequency (label_id: 22)                        42.86      33.33      37.50          9
    house place (label_id: 23)                              94.12      96.97      95.52         33
    ingredient (label_id: 24)                              100.00      16.67      28.57          6
    joke type (label_id: 25)                               100.00     100.00     100.00          4
    list name (label_id: 26)                                51.52      80.95      62.96         21
    meal type (label_id: 27)                                 0.00       0.00       0.00          0
    media type (label_id: 28)                               82.76      64.86      72.73         37
    movie name (label_id: 29)                                0.00       0.00       0.00          0
    movie type (label_id: 30)                                0.00       0.00       0.00          0
    music album (label_id: 31)                               0.00       0.00       0.00          0
    music descriptor (label_id: 32)                          0.00       0.00       0.00          3
    music genre (label_id: 33)                              69.23     100.00      81.82          9
    news topic (label_id: 34)                               73.33      64.71      68.75         17
    order type (label_id: 35)                               66.67      94.12      78.05         17
    person (label_id: 36)                                   76.56      96.08      85.22         51
    personal info (label_id: 37)                            86.67      68.42      76.47         19
    place name (label_id: 38)                               93.86      80.45      86.64        133
    player setting (label_id: 39)                           12.50     100.00      22.22          1
    playlist name (label_id: 40)                             0.00       0.00       0.00          1
    podcast descriptor (label_id: 41)                      100.00      84.62      91.67         13
    podcast name (label_id: 42)                             33.33      25.00      28.57          4
    radio name (label_id: 43)                               93.75      78.95      85.71         38
    relation (label_id: 44)                                 75.00      70.59      72.73         17
    song name (label_id: 45)                                60.00      40.91      48.65         22
    time (label_id: 46)                                     82.48      85.61      84.01        132
    time zone (label_id: 47)                                77.78     100.00      87.50          7
    timeofday (label_id: 48)                                80.65      89.29      84.75         28
    transport agency (label_id: 49)                        100.00     100.00     100.00          9
    transport descriptor (label_id: 50)                      0.00       0.00       0.00          0
    transport name (label_id: 51)                           80.00     100.00      88.89          4
    transport type (label_id: 52)                           97.06      94.29      95.65         35
    weather descriptor (label_id: 53)                       90.91      66.67      76.92         15
    other (label_id: 54)                                    96.59      96.32      96.46       3291
    -------------------
    micro avg                                               92.53      92.53      92.53       4523
    macro avg                                               76.34      77.14      74.44       4523
    weighted avg                                            92.99      92.53      92.56       4523

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃           Test metric            ┃           DataLoader 0           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│           bio_slot_f1            │        94.04096984863281         │
│        bio_slot_precision        │        94.04096984863281         │
│         bio_slot_recall          │        94.04096984863281         │
│         overall_slot_f1          │        92.52709197998047         │
│      overall_slot_precision      │        92.52708435058594         │
│       overall_slot_recall        │        92.52708435058594         │
│        slot_similarity_f1        │              91.875              │
│    slot_similarity_precision     │              91.875              │
│      slot_similarity_recall      │              91.875              │
│         unified_slot_f1          │        80.28195816548065         │
│ unified_slot_joint_goal_accuracy │         71.2624584717608         │
│      unified_slot_precision      │        79.27740863787376         │
│       unified_slot_recall        │        81.31229235880399         │
│             val_loss             │        0.6432915925979614        │
└──────────────────────────────────┴──────────────────────────────────┘
```
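
The `unified_slot_f1` value in the table above is consistent with the harmonic mean of `unified_slot_precision` and `unified_slot_recall`. The check below uses only the numbers reported in the table.

```python
# Values copied from the test metric table above.
precision = 79.27740863787376
recall = 81.31229235880399
reported_f1 = 80.28195816548065

# F1 as the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"computed F1 = {f1:.4f}")
print(f"reported F1 = {reported_f1:.4f}")
```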

## 4.5 (Optional) Download CoNLL-2003 dataset and convert to NeMo format


The shared task of CoNLL-2003 concerns language-independent named entity recognition. We will concentrate on four types of named entities: persons, locations, organizations and names of miscellaneous entities that do not belong to the previous three groups.


For more details see https://www.clips.uantwerpen.be/conll2003/ner/ and https://www.aclweb.org/anthology/W03-0419

**Licensing Information**

From the CoNLL2003 shared task page: https://www.clips.uantwerpen.be/conll2003/ner/

"The English data is a collection of news wire articles from the Reuters Corpus. The annotation has been done by people of the University of Antwerp. Because of copyright reasons we only make available the annotations. In order to build the complete data sets you will need access to the Reuters Corpus. It can be obtained for research purposes without any charge from NIST."


The copyrights are defined below, from the Reuters Corpus page: https://trec.nist.gov/data/reuters/reuters.html

"The stories in the Reuters Corpus are under the copyright of Reuters Ltd and/or Thomson Reuters, and their use is governed by the following agreements:

Organizational agreement https://trec.nist.gov/data/reuters/org_appl_reuters_v4.html

This agreement must be signed by the person responsible for the data at your organization, and sent to NIST.

Individual agreement https://trec.nist.gov/data/reuters/ind_appl_reuters_v4.html

This agreement must be signed by all researchers using the Reuters Corpus at your organization, and kept on file at your organization."


In [None]:
# download and unzip the example dataset from deepai.org
!wget https://data.deepai.org/conll2003.zip
!unzip conll2003.zip -d ./conll_2003

# convert the CoNLL-2003 IOB format (short for inside, outside, beginning) dataset to the NeMo format
!python NeMo/examples/nlp/token_classification/data/import_from_iob_format.py --data_file=./conll_2003/train.txt
!python NeMo/examples/nlp/token_classification/data/import_from_iob_format.py --data_file=./conll_2003/test.txt
!python NeMo/examples/nlp/token_classification/data/import_from_iob_format.py --data_file=./conll_2003/valid.txt
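
To make the conversion step more concrete, here is a minimal sketch of reading CoNLL-2003 style IOB data into one text line and one label line per sentence. It is an illustration, not the `import_from_iob_format.py` script itself. It assumes the token is in the first column and the entity tag in the last column, that blank lines separate sentences, and that `-DOCSTART-` lines mark document boundaries.

```python
SAMPLE = """\
-DOCSTART- -X- -X- O

EU NNP B-NP B-ORG
rejects VBZ B-VP O
German JJ B-NP B-MISC
call NN I-NP O
. . O O

Peter NNP B-NP B-PER
Blackburn NNP I-NP I-PER
"""


def iob_to_text_and_labels(raw: str):
    """Return parallel lists: space-joined tokens and space-joined tags per sentence."""
    sentences, labels = [], []
    words, tags = [], []
    for line in raw.splitlines():
        line = line.strip()
        if not line or line.startswith("-DOCSTART-"):
            # A sentence boundary: flush what has been collected so far.
            if words:
                sentences.append(" ".join(words))
                labels.append(" ".join(tags))
                words, tags = [], []
            continue
        columns = line.split()
        words.append(columns[0])   # token
        tags.append(columns[-1])   # entity tag
    if words:
        sentences.append(" ".join(words))
        labels.append(" ".join(tags))
    return sentences, labels


sentences, labels = iob_to_text_and_labels(SAMPLE)
for text, label in zip(sentences, labels):
    print(text, "|", label)
```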

## 4.6 (Optional) Pre-process the CoNLL-2003 dataset and Generate slot class description

In [None]:
# pre-process dataset for zero shot slot filling
!python NeMo/examples/nlp/dialogue/preprocess_for_zero_shot_slot_filling.py --preprocess_file_path ./conll_2003/ --dataset conll_2003

# generate description of the slot class from pre-processed dataset
!python NeMo/examples/nlp/dialogue/generate_description_for_zero_shot_slot_filling.py --preprocess_file_path ./conll_2003/with_entity/ --dataset conll_2003

## 4.7 (Optional) Training and/or Testing the model using CoNLL-2003 Dataset



In [None]:
# model.dataset.data_dir: folder to load data from
# model.dataset.dialogues_example_dir: folder that stores predictions for each sample
# trainer.max_epochs: number of epochs for training
# model.bio_slot_loss_weight: mention detection (BIO tagging) loss weight
# model.nemo_path: path where the trained NeMo model is saved
# model.optim.lr: learning rate for training
# the best overall F1 score on the CoNLL-2003 dataset was obtained with: bio_slot_loss_weight=0.7, max_epochs=5, optim.lr=0.00005

  
!(python NeMo/examples/nlp/dialogue/dialogue.py \
  do_training=True \
  model.dataset.data_dir="./conll_2003/with_entity" \
  model.dataset.dialogues_example_dir="./conll_2003/with_entity_prediction" \
  model.dataset.task='zero_shot_slot_filling' \
  model.language_model.pretrained_model_name='bert-base-uncased' \
  exp_manager.create_wandb_logger=False \
  trainer.max_epochs=1 \
  model.bio_slot_loss_weight=0.7 \
  model.nemo_path="nemo_experiments/conll2003/conll2003_0.7_0.3_epoch_1_lr_0.00005.nemo" \
  model.optim.lr=0.00005)

**After 1 epoch:**

The best overall F1 score was obtained by fine-tuning with bio_slot_loss_weight=0.7, max_epochs=5, optim.lr=0.00005.

BIO slot report: 
```
    label                                                precision    recall       f1           support   
    0 (label_id: 0)                                         98.80      97.89      98.34      38900
    1 (label_id: 1)                                         92.51      93.85      93.18       4082
    2 (label_id: 2)                                         90.54      95.03      92.73       5973
    -------------------
    micro avg                                               97.21      97.21      97.21      48955
    macro avg                                               93.95      95.59      94.75      48955
    weighted avg                                            97.27      97.21      97.23      48955
```
Slot similarity report: 
```
    label                                                precision    recall       f1           support   
    other (label_id: 0)                                      0.00       0.00       0.00          0
    location (label_id: 1)                                  93.54      96.05      94.78       1266
    miscellaneous (label_id: 2)                             89.77      90.41      90.09        563
    organization (label_id: 3)                              92.43      91.45      91.94       1228
    person (label_id: 4)                                    99.00      96.59      97.78       1025
    -------------------
    micro avg                                               94.02      94.02      94.02       4082
    macro avg                                               93.68      93.62      93.65       4082
    weighted avg                                            94.06      94.02      94.03       4082
```

Overall slot report: 
```
    label                                                precision    recall       f1           support   
    other (label_id: 0)                                     98.74      98.58      98.66      30551
    location (label_id: 1)                                  76.88      90.61      83.18       1523
    miscellaneous (label_id: 2)                             77.34      72.99      75.10        748
    organization (label_id: 3)                              84.72      78.04      81.24       1826
    person (label_id: 4)                                    95.38      93.60      94.48       1874
    -------------------
    micro avg                                               96.44      96.44      96.44      36522
    macro avg                                               86.61      86.76      86.53      36522
    weighted avg                                            96.52      96.44      96.45      36522


┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃           Test metric            ┃           DataLoader 0           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│           bio_slot_f1            │        97.20763397216797         │
│        bio_slot_precision        │         97.2076416015625         │
│         bio_slot_recall          │         97.2076416015625         │
│         overall_slot_f1          │        96.44049835205078         │
│      overall_slot_precision      │        96.44049835205078         │
│       overall_slot_recall        │        96.44049835205078         │
│        slot_similarity_f1        │        94.02253723144531         │
│    slot_similarity_precision     │        94.02253723144531         │
│      slot_similarity_recall      │        94.02253723144531         │
│         unified_slot_f1          │        81.62207207376734         │
│ unified_slot_joint_goal_accuracy │        74.56458635703919         │
│      unified_slot_precision      │        82.10873246250604         │
│       unified_slot_recall        │        81.14114658925979         │
│             val_loss             │        0.1337154656648636        │
└──────────────────────────────────┴──────────────────────────────────┘
```

## 4.8 (Optional) Testing the model out of domain

Test the Zero Shot Slot Filling model on the CoNLL-2003 dataset, using the model that was fine-tuned on the Assistant dataset.

In [None]:
# model.dataset.data_dir: folder to load data from
# model.dataset.dialogues_example_dir: folder that stores predictions for each sample
# model.nemo_path: path of the trained NeMo model to load
# the loaded model was fine-tuned on the Assistant dataset with: bio_slot_loss_weight=0.5, max_epochs=10, optim.lr=0.00001
  
!(python NeMo/examples/nlp/dialogue/dialogue.py \
  do_training=False \
  model.dataset.data_dir="./conll_2003/with_entity" \
  model.dataset.dialogues_example_dir="./train_assistant_test_conll2003/with_entity_prediction" \
  model.dataset.task='zero_shot_slot_filling' \
  model.language_model.pretrained_model_name='bert-base-uncased' \
  exp_manager.create_wandb_logger=False \
  model.nemo_path="nemo_experiments/assistant/assistant_0.5_0.5_epoch_10_lr_0.00001.nemo")


**Results for the Zero Shot Slot Filling model fine-tuned on the Assistant dataset and transferred to the CoNLL-2003 dataset**

BIO slot report: 
```
    label                                                precision    recall       f1           support   
    0 (label_id: 0)                                         89.20      82.35      85.64      38900
    1 (label_id: 1)                                         50.27      47.62      48.91       4082
    2 (label_id: 2)                                         42.99      66.03      52.08       5973
    -------------------
    micro avg                                               77.47      77.47      77.47      48955
    macro avg                                               60.82      65.34      62.21      48955
    weighted avg                                            80.32      77.47      78.48      48955
```
Slot similarity report: 
```
    label                                                precision    recall       f1           support   
    other (label_id: 0)                                      0.00       0.00       0.00          0
    location (label_id: 1)                                  73.79      18.01      28.95       1266
    miscellaneous (label_id: 2)                             46.13      22.20      29.98        563
    organization (label_id: 3)                              46.01      82.57      59.09       1228
    person (label_id: 4)                                    73.82      92.68      82.18       1025
    -------------------
    micro avg                                               56.76      56.76      56.76       4082
    macro avg                                               59.93      53.87      50.05       4082
    weighted avg                                            61.62      56.76      51.53       4082
```

Overall slot report: 
```
    label                                                precision    recall       f1           support   
    other (label_id: 0)                                     90.60      91.04      90.82      30551
    location (label_id: 1)                                  31.89       7.75      12.47       1523
    miscellaneous (label_id: 2)                             12.55      21.79      15.93        748
    organization (label_id: 3)                              28.13      31.76      29.84       1826
    person (label_id: 4)                                    55.49      62.01      58.57       1874
    -------------------
    micro avg                                               81.69      81.69      81.69      36522
    macro avg                                               43.73      42.87      41.52      36522
    weighted avg                                            81.63      81.69      81.32      36522

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃           Test metric            ┃           DataLoader 0           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│           bio_slot_f1            │        77.46501922607422         │
│        bio_slot_precision        │        77.46501922607422         │
│         bio_slot_recall          │        77.46501922607422         │
│         overall_slot_f1          │        81.69322967529297         │
│      overall_slot_precision      │        81.69322967529297         │
│       overall_slot_recall        │        81.69322967529297         │
│        slot_similarity_f1        │        56.761390686035156        │
│    slot_similarity_precision     │        56.761390686035156        │
│      slot_similarity_recall      │        56.761390686035156        │
│         unified_slot_f1          │        19.54190263978709         │
│ unified_slot_joint_goal_accuracy │        8.925979680696662         │
│      unified_slot_precision      │        20.694242864054182        │
│       unified_slot_recall        │        18.51112723754233         │
│             val_loss             │        1.477256178855896         │
└──────────────────────────────────┴──────────────────────────────────┘
```