In [None]:
BRANCH = 'main'

In [None]:
"""
You can either run this notebook locally (if you have all the dependencies and a GPU) or on Google Colab.

Instructions for setting up Colab are as follows:
1. Open a new Python 3 notebook.
2. Import this notebook from GitHub (File -> Upload Notebook -> "GITHUB" tab -> copy/paste GitHub URL)
3. Connect to an instance with a GPU (Runtime -> Change runtime type -> select "GPU" for hardware accelerator)
4. Run this cell to set up dependencies.
"""
# If you're using Google Colab and not running locally, run this cell

# install NeMo
!python -m pip install git+https://github.com/NVIDIA/NeMo.git@{BRANCH}#egg=nemo_toolkit[nlp]

In [4]:
# from nemo.collections import nlp as nemo_nlp
# from nemo.utils.exp_manager import exp_manager

import os
import wget
import torch
import pytorch_lightning as pl
from omegaconf import OmegaConf

# Task Description
**Joint Intent and Slot classification** is the task of classifying the Intent of a query and detecting all relevant Slots (Entities)
for this Intent in that query.
For example, in the query:  `What is the weather in Santa Clara tomorrow morning?`, we would like to classify the query
as `Weather` Intent, and detect `Santa Clara` as a `location` slot and `tomorrow morning` as a `date_time` slot.
Intent and Slot names are usually task-specific and are defined as labels in the training data.
This is a fundamental step that is executed in any task-driven Conversational Assistant.
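
To make the task concrete, the sketch below shows one possible in-memory representation of the expected result for the example query above. The `Prediction` class and its field names are illustrative assumptions and are not part of NeMo.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Prediction:
    """Hypothetical container for a joint intent/slot result (not a NeMo class)."""
    intent: str
    slots: List[Tuple[str, str]]  # (slot name, text span) pairs


query = "What is the weather in Santa Clara tomorrow morning?"
prediction = Prediction(
    intent="Weather",
    slots=[("location", "Santa Clara"), ("date_time", "tomorrow morning")],
)

# Every slot span is expected to be a substring of the original query.
for slot_name, span in prediction.slots:
    assert span in query
    print(f"{slot_name}: {span}")
```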

* Our BERT-based model implementation lets you train a single model that predicts both intents and slots together.
* The training loss is a weighted combination of the intent and slot losses. You can change the balance with the `model.intent_loss_weight` parameter (a value between 0 and 1) in the **intent_slot_classification_config.yaml** file. The default value is 0.6, which gives slightly higher weight to the intent loss and works well in practice.
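
The effect of `intent_loss_weight` can be sketched in plain Python. This assumes the combined loss is the weighted sum `w * intent_loss + (1 - w) * slot_loss`; the text above only says the parameter balances the two losses, so treat the exact formula and the `joint_loss` helper as assumptions rather than the library's implementation.

```python
def joint_loss(intent_loss: float, slot_loss: float, intent_loss_weight: float = 0.6) -> float:
    """Hypothetical helper: weighted sum of the intent and slot losses."""
    if not 0.0 <= intent_loss_weight <= 1.0:
        raise ValueError("intent_loss_weight must be between 0 and 1")
    # The slot loss receives whatever weight the intent loss does not use.
    return intent_loss_weight * intent_loss + (1.0 - intent_loss_weight) * slot_loss


# With the default weight, the intent loss contributes 60% of its value
# and the slot loss 40% of its value.
default_total = joint_loss(intent_loss=1.0, slot_loss=2.0)

# A weight of 0.5 treats both losses equally.
balanced_total = joint_loss(intent_loss=1.0, slot_loss=2.0, intent_loss_weight=0.5)

print(default_total, balanced_total)
```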

# Dataset and NeMo data format

In this tutorial we use a virtual assistant interaction dataset that can be downloaded from https://github.com/xliuhw/NLU-Evaluation-Data.
There are about 10K training and 1K testing queries, which cover 64 different Intents and 55 Slots.

To work with the NeMo NLP classification model, this dataset must first be converted to the NeMo format, which requires the following files:
- **dict.intents.csv** - list of all intent names in the data, one intent name per line.
- **dict.slots.csv** - list of all slot names in the data, one slot name per line. You can either use B- and I- notation to distinguish the first token from the subsequent tokens of a multi-token slot, or use a single slot type for every token of a multi-token slot. We recommend the latter, since it is simpler and shows no visible degradation in performance.
- **train.tsv/test.tsv** - contain the original queries, one per line, followed by a tab and the intent number. For example: `what alarms do i have set right now	0`. The intent number is the zero-based line index of the intent in the intent dictionary file (dict.intents.csv). The first line of these files is a header: `sentence\tlabel` (the two column names separated by a tab).
- **train_slots.tsv/test_slots.tsv** - contain one line per query, in which each token is replaced by the number of its slot in the slots dictionary file (dict.slots.csv), starting from 0. The 'out-of-scope' label is usually the last line of the dictionary. Example: `54 0 0 54 54 12 12` (numbers separated by spaces). These files have no header line.
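
The four files described above can be illustrated with a miniature, made-up dataset. The intent names, slot names, and the query below are assumptions chosen for the example (the real dictionaries are much larger); the sketch writes the files to a temporary directory and then decodes them back into names.

```python
import tempfile
from pathlib import Path

# Hypothetical miniature dictionaries; the out-of-scope label is on the last line.
intents = ["alarm_query", "weather_query"]
slots = ["date", "place_name", "O"]

data_dir = Path(tempfile.mkdtemp())
(data_dir / "dict.intents.csv").write_text("\n".join(intents) + "\n")
(data_dir / "dict.slots.csv").write_text("\n".join(slots) + "\n")
# train.tsv: header line, then "<query><TAB><intent number>".
(data_dir / "train.tsv").write_text(
    "sentence\tlabel\nwhat is the weather in paris tomorrow\t1\n"
)
# train_slots.tsv: one slot number per token, no header line.
(data_dir / "train_slots.tsv").write_text("2 2 2 2 2 1 0\n")

intent_names = (data_dir / "dict.intents.csv").read_text().splitlines()
slot_names = (data_dir / "dict.slots.csv").read_text().splitlines()
query_lines = (data_dir / "train.tsv").read_text().splitlines()[1:]  # skip header
slot_lines = (data_dir / "train_slots.tsv").read_text().splitlines()

decoded = []
for query_line, slot_line in zip(query_lines, slot_lines):
    sentence, intent_id = query_line.split("\t")
    tokens = sentence.split()
    slot_ids = [int(slot_id) for slot_id in slot_line.split()]
    # Each token must have exactly one slot number.
    assert len(tokens) == len(slot_ids)
    token_slots = [(token, slot_names[slot_id]) for token, slot_id in zip(tokens, slot_ids)]
    decoded.append((intent_names[int(intent_id)], token_slots))

print(decoded[0])
```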

NeMo provides the **import_datasets.py** converter for a few reference datasets (Assistant/Atis/Snips), which converts them to the NeMo data format for the Intent and Slot classification model. If you have your own annotated dataset in a different format, you will need to write a data converter. One recommended format for your own annotations is one text file per intent containing all examples of that intent, with one query per line in a form like: `did i set an alarm to [alarm_type : wake up] in the [timeofday : morning]`, using brackets to mark slot names. This is very similar to the Assistant format, so you can reuse its converter to the NeMo format with small changes.
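
As a rough illustration of what such a converter has to do, the sketch below parses one bracket-annotated line into tokens and per-token slot names. The function name, the regular expression, and the `"O"` label for tokens outside any slot are assumptions made for this example; this is not the code used by the NeMo converter. It assigns a single slot type to every token of a multi-token slot, without B-/I- prefixes.

```python
import re
from typing import List, Tuple

# Matches "[slot_name : slot text]" and captures the name and the text.
SLOT_PATTERN = re.compile(r"\[\s*(\w+)\s*:\s*([^\]]+?)\s*\]")


def parse_annotated_query(line: str, outside_label: str = "O") -> Tuple[List[str], List[str]]:
    """Split a bracket-annotated query into tokens and per-token slot names."""
    tokens: List[str] = []
    labels: List[str] = []
    position = 0
    for match in SLOT_PATTERN.finditer(line):
        # Plain text before the bracket belongs to no slot.
        for token in line[position:match.start()].split():
            tokens.append(token)
            labels.append(outside_label)
        slot_name, slot_text = match.group(1), match.group(2)
        # Every token inside the bracket gets the same slot name.
        for token in slot_text.split():
            tokens.append(token)
            labels.append(slot_name)
        position = match.end()
    # Remaining plain text after the last bracket.
    for token in line[position:].split():
        tokens.append(token)
        labels.append(outside_label)
    return tokens, labels


tokens, labels = parse_annotated_query(
    "did i set an alarm to [alarm_type : wake up] in the [timeofday : morning]"
)
for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```

Mapping each slot name to its line number in dict.slots.csv then gives the space-separated numbers expected in the slots file.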

You can run this utility as follows:

**python examples/nlp/intent_slot_classification/data/import_datasets.py --dataset_name=assistant --source_data_dir=source_dir_name --target_data_dir=target_dir_name**
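
If you prefer to launch the converter from Python rather than from a shell, the command above can be assembled as an argument list. This is a minimal sketch: `build_import_command` is a hypothetical helper, the script path is assumed to be relative to the root of a NeMo checkout, and the directory names are the same placeholders used in the command above.

```python
import shlex
from typing import List


def build_import_command(
    source_data_dir: str,
    target_data_dir: str,
    dataset_name: str = "assistant",
    script: str = "examples/nlp/intent_slot_classification/data/import_datasets.py",
) -> List[str]:
    """Hypothetical helper that assembles the converter command shown above."""
    return [
        "python",
        script,
        f"--dataset_name={dataset_name}",
        f"--source_data_dir={source_data_dir}",
        f"--target_data_dir={target_data_dir}",
    ]


cmd = build_import_command("source_dir_name", "target_dir_name")
# Print the command in a form that can be pasted into a shell.
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` when the working directory is the NeMo repository root.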


# Download, preprocess and explore the dataset

In [50]:
DATA_DIR = "/Users/vgetselevich/Development/data"
NEMO_DIR = "/Users/vgetselevich/Development/NeMoOriginal"
EXAMPLE_DIR = f'{NEMO_DIR}/examples/nlp/intent_slot_classification'

In [15]:
# download and unzip example dataset from github
print('Downloading dataset...')
wget.download('https://github.com/xliuhw/NLU-Evaluation-Data/archive/master.zip', DATA_DIR)
! unzip {DATA_DIR}/NLU-Evaluation-Data-master.zip -d {DATA_DIR}

  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_4/merged/entities/device_type.json  
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_4/merged/entities/drink_type.json  
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_4/merged/ent

  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_5/merged/entities/event_name.json  
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_5/merged/entities/food_type.json  
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_5/merged/entities/game_name.json  
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_5/merged/entities/game_type.json  
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_5/merged/entiti

   creating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_6/merged/
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_6/merged/agent.json  
   creating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_6/merged/entities/
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_6/merged/entities/alarm_type.json  
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_6/merged/entities/app_name.json  
  inflating: /Users/vgetselevich/D

  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_6/merged/intents/recommendation_locations.json  
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_6/merged/intents/recommendation_movies.json  
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_6/merged/intents/social_post.json  
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_6/merged/intents/social_query.json  
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValid

  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_7/merged/entities/timeofday.json  
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_7/merged/entities/transport_agency.json  
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_7/merged/entities/transport_descriptor.json  
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_7/merged/entities/transport_name.json  
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation

  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_7/merged/intents/qa_currency.json  
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_7/merged/intents/qa_definition.json  
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_7/merged/intents/qa_factoid.json  
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_7/merged/intents/qa_maths.json  
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_7/merged/inten

  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_8/merged/intents/qa_maths.json  
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_8/merged/intents/qa_stock.json  
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_8/merged/intents/recommendation_events.json  
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_8/merged/intents/recommendation_locations.json  
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/K

  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_9/merged/intents/email_sendemail.json  
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_9/merged/intents/general_affirm.json  
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_9/merged/intents/general_commandstop.json  
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_9/merged/intents/general_confirm.json  
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4ApiaiReal/Apiai_trainset_2018_03_22-13_01_25_169/CrossValidation

  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4LuisReal/Luis_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_10/merged/Luis_KFold_10_2018_03_22-13_01_25_169_20180327170202385.json  
   creating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4LuisReal/Luis_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_2/
   creating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4LuisReal/Luis_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_2/merged/
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4LuisReal/Luis_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_2/merged/Luis_KFold_2_2018_03_22-13_01_25_169_20180327170202385.json  
   creating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4LuisReal/Luis_trainset_2018_03_22-13_01_25_169/CrossValidation/KFold_3/
   creatin

   creating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4RasaReal/rasa_json_2018_03_22-13_01_25_169_80Train/CrossValidation/KFold_2/mergedTestset/
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4RasaReal/rasa_json_2018_03_22-13_01_25_169_80Train/CrossValidation/KFold_2/mergedTestset/RasaNluTestset_Merged_2018_03_22-13_01_25_169.json  
   creating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4RasaReal/rasa_json_2018_03_22-13_01_25_169_80Train/CrossValidation/KFold_2/rasaTrainConfig/
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4RasaReal/rasa_json_2018_03_22-13_01_25_169_80Train/CrossValidation/KFold_2/rasaTrainConfig/configLatest_KFold_2_mitie_2018_03_22-13_01_25_169.json  
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4RasaReal/rasa_json_2018_03_22-13_01_25

  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4RasaReal/rasa_json_2018_03_22-13_01_25_169_80Train/CrossValidation/KFold_7/merged/RasaNlu_Merged_KFold_7_2018_03_22-13_01_25_169.json  
   creating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4RasaReal/rasa_json_2018_03_22-13_01_25_169_80Train/CrossValidation/KFold_7/mergedTestset/
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4RasaReal/rasa_json_2018_03_22-13_01_25_169_80Train/CrossValidation/KFold_7/mergedTestset/RasaNluTestset_Merged_2018_03_22-13_01_25_169.json  
   creating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4RasaReal/rasa_json_2018_03_22-13_01_25_169_80Train/CrossValidation/KFold_7/rasaTrainConfig/
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4RasaReal/rasa_json_2018_03_22-13_01_25_169_80Train/

   creating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4WatsonReal/Watson_2018_03_22-13_01_25_169_trainset/CrossValidation/KFold_6/
   creating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4WatsonReal/Watson_2018_03_22-13_01_25_169_trainset/CrossValidation/KFold_6/merged/
   creating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4WatsonReal/Watson_2018_03_22-13_01_25_169_trainset/CrossValidation/KFold_6/merged/entities/
  inflating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4WatsonReal/Watson_2018_03_22-13_01_25_169_trainset/CrossValidation/KFold_6/merged/entities/watsonEntities_2018_03_22-13_01_25_169.csv  
   creating: /Users/vgetselevich/Development/data/NLU-Evaluation-Data-master/CrossValidation/out4WatsonReal/Watson_2018_03_22-13_01_25_169_trainset/CrossValidation/KFold_6/merged/intents/
  inflating: /Users/vgetselevich

In [51]:
# convert the dataset to NeMo format
!python {EXAMPLE_DIR}/data/import_datasets.py --dataset_name=assistant --source_data_dir={DATA_DIR}/NLU-Evaluation-Data-master --target_data_dir={DATA_DIR}/nemo_format


[NeMo W 2020-09-10 14:21:49 experimental:28] Module <class 'nemo.collections.nlp.modules.common.megatron.megatron_bert.MegatronBertEncoder'> is experimental, not ready for production and is not fully supported. Use at your own risk.
[NeMo W 2020-09-10 14:21:49 experimental:28] Module <class 'nemo.collections.nlp.modules.common.sequence_token_classifier.SequenceTokenClassifier'> is experimental, not ready for production and is not fully supported. Use at your own risk.
[NeMo I 2020-09-10 14:21:49 assistant_utils:124] robot dataset has already been processed and stored at /Users/vgetselevich/Development/data/nemo_format


In [45]:
# list the per-intent query files in the original training dataset
! ls -l {DATA_DIR}/NLU-Evaluation-Data-master/dataset/trainset

total 2976
-rw-r--r--  1 vgetselevich  staff  22459 Apr 30  2019 alarm_query.csv
-rw-r--r--  1 vgetselevich  staff  13045 Apr 30  2019 alarm_remove.csv
-rw-r--r--  1 vgetselevich  staff  23552 Apr 30  2019 alarm_set.csv
-rw-r--r--  1 vgetselevich  staff   7911 Apr 30  2019 audio_volume_down.csv
-rw-r--r--  1 vgetselevich  staff  14525 Apr 30  2019 audio_volume_mute.csv
-rw-r--r--  1 vgetselevich  staff  14732 Apr 30  2019 audio_volume_up.csv
-rw-r--r--  1 vgetselevich  staff  27361 Apr 30  2019 calendar_query.csv
-rw-r--r--  1 vgetselevich  staff  27068 Apr 30  2019 calendar_remove.csv
-rw-r--r--  1 vgetselevich  staff  34492 Apr 30  2019 calendar_set.csv
-rw-r--r--  1 vgetselevich  staff  25138 Apr 30  2019 cooking_recipe.csv
-rw-r--r--  1 vgetselevich  staff  16357 Apr 30  2019 datetime_convert.csv
-rw-r--r--  1 vgetselevich  staff  23226 Apr 30  2019 datetime_query.csv
-rw-r--r--  1 vgetselevich  staff  12869 Apr 30  2019 email_addcontact.csv
-rw-r--r--  1 vgetselevich

In [41]:
# print all intents from the NeMo-format intent dictionary
!echo 'Intents: ' $(wc -l < {DATA_DIR}/nemo_format/dict.intents.csv)
! cat {DATA_DIR}/nemo_format/dict.intents.csv

Intents:  64
alarm_query
alarm_remove
alarm_set
audio_volume_down
audio_volume_mute
audio_volume_up
calendar_query
calendar_remove
calendar_set
cooking_recipe
datetime_convert
datetime_query
email_addcontact
email_query
email_querycontact
email_sendemail
general_affirm
general_commandstop
general_confirm
general_dontcare
general_explain
general_joke
general_negate
general_praise
general_quirky
general_repeat
iot_cleaning
iot_coffee
iot_hue_lightchange
iot_hue_lightdim
iot_hue_lightoff
iot_hue_lighton
iot_hue_lightup
iot_wemo_off
iot_wemo_on
lists_createoradd
lists_query
lists_remove
music_likeness
music_query
music_settings
news_query
play_audiobook
play_game
play_music
play_podcasts
play_radio
qa_currency
qa_definition
qa_factoid
qa_maths
qa_stock
recommendation_events
recommendation_locations
recommendation_movies
social_post
social_query
takeaway_order
takeaway_query
transport_query
transport_taxi
transport_ticket
transport_traffic
weather_query


In [42]:
# print all slots from the NeMo-format slot dictionary
!echo 'Slots: ' $(wc -l < {DATA_DIR}/nemo_format/dict.slots.csv)
! cat {DATA_DIR}/nemo_format/dict.slots.csv

Slots:  55
alarm_type
app_name
artist_name
audiobook_author
audiobook_name
business_name
business_type
change_amount
coffee_type
color_type
cooking_type
currency_name
date
definition_word
device_type
drink_type
email_address
email_folder
event_name
food_type
game_name
game_type
general_frequency
house_place
ingredient
joke_type
list_name
meal_type
media_type
movie_name
movie_type
music_album
music_descriptor
music_genre
news_topic
order_type
person
personal_info
place_name
player_setting
playlist_name
podcast_descriptor
podcast_name
radio_name
relation
song_name
time
time_zone
timeofday
transport_agency
transport_descriptor
transport_name
transport_type
weather_descriptor
O


In [44]:
# examples from the intent training file
! head -n 10 {DATA_DIR}/nemo_format/train.tsv

sentence	label
what alarms do i have set right now	0
checkout today alarm of meeting	0
report alarm settings	0
see see for me the alarms that you have set tomorrow morning	0
is there an alarm for ten am	0
confirm the alarm time	0
show my alarms	0
at what time have you set alarm for me	0
please list active alarms	0
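
The numeric `label` column appears to be an index into `dict.intents.csv`: every alarm query above carries label `0`, and `alarm_query` is the first line of the dictionary. The sketch below decodes a few rows under that assumption (0-based line index equals label id). `read_intent_examples`, `INTENT_NAMES`, and `SAMPLE_TSV` are illustrative names, not part of NeMo, and the rows are copied from the output above rather than read from disk.

```python
import csv
import io

# First lines of dict.intents.csv as printed earlier.
# Assumption: the 0-based line index is the label id used in train.tsv.
INTENT_NAMES = ["alarm_query", "alarm_remove", "alarm_set"]

# A few rows copied from the train.tsv output (tab-separated: sentence, label).
SAMPLE_TSV = (
    "sentence\tlabel\n"
    "show my alarms\t0\n"
    "confirm the alarm time\t0\n"
)


def read_intent_examples(tsv_text, intent_names):
    """Return (sentence, intent_name) pairs from NeMo-format intent TSV text."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    return [(row["sentence"], intent_names[int(row["label"])]) for row in reader]


examples = read_intent_examples(SAMPLE_TSV, INTENT_NAMES)
for sentence, intent in examples:
    print(f"{intent}: {sentence}")
```

To decode the real file, the same function could be given the contents of `train.tsv` and the full list of lines from `dict.intents.csv`.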


In [46]:
# examples from the slot training file
! head -n 10 {DATA_DIR}/nemo_format/train_slots.tsv

54 54 54 54 54 54 54 54
54 12 54 54 54
54 54 54
54 54 54 54 54 54 54 54 54 54 12 48
54 54 54 54 54 46 46
54 54 54 54
54 54 54
54 54 54 54 54 54 54 54 54
54 54 54 54
54 54 54 54 54 54
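
Each line of `train_slots.tsv` seems to hold one slot id per whitespace-separated token of the matching sentence in `train.tsv` (the slot file has no header row, so slot line 1 pairs with the first sentence after the header). The ids look like 0-based indices into `dict.slots.csv`, with `54` being `O`, the label for tokens outside any slot. The sketch below decodes one pair under those assumptions. `slot_spans` is a hypothetical helper, and merging adjacent tokens that share a slot id into one span is a simplification chosen here, since the format has no begin/inside markers.

```python
# Slot names in the order printed from dict.slots.csv.
# Assumption: the 0-based line index is the id used in train_slots.tsv.
SLOT_NAMES = """
alarm_type app_name artist_name audiobook_author audiobook_name business_name
business_type change_amount coffee_type color_type cooking_type currency_name
date definition_word device_type drink_type email_address email_folder
event_name food_type game_name game_type general_frequency house_place
ingredient joke_type list_name meal_type media_type movie_name movie_type
music_album music_descriptor music_genre news_topic order_type person
personal_info place_name player_setting playlist_name podcast_descriptor
podcast_name radio_name relation song_name time time_zone timeofday
transport_agency transport_descriptor transport_name transport_type
weather_descriptor O
""".split()


def slot_spans(sentence, slot_line):
    """Return (slot_name, text) spans, merging adjacent tokens with the same slot."""
    tokens = sentence.split()
    ids = [int(i) for i in slot_line.split()]
    if len(tokens) != len(ids):
        raise ValueError("sentence and slot line must have the same number of tokens")
    spans = []
    previous = None
    for token, slot_id in zip(tokens, ids):
        name = SLOT_NAMES[slot_id]
        if name == "O":
            previous = None  # tokens outside any slot break the current span
            continue
        if name == previous:
            spans[-1] = (name, spans[-1][1] + " " + token)
        else:
            spans.append((name, token))
        previous = name
    return spans


# Fifth sentence of train.tsv and fifth line of train_slots.tsv from the outputs above.
print(slot_spans("is there an alarm for ten am", "54 54 54 54 54 46 46"))
```

Applied to the fourth pair above, the same helper yields separate `date` and `timeofday` spans for `tomorrow` and `morning`.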


# Training model

## Model configuration

Our Joint Intent and Slot classification model consists of a pretrained [BERT](https://arxiv.org/pdf/1810.04805.pdf) model with an intent classification layer and a slot classification layer on top.

All model and training parameters are defined in the **intent_slot_classification_config.yaml** config file. It contains two main sections:
- **model**: all arguments related to the model: the language model, token classifier, optimizer and schedulers, datasets, and any other related information

- **trainer**: any argument to be passed to PyTorch Lightning

In [56]:
MODEL_CONFIG = "intent_slot_classification_config.yaml"
config_file = f'{EXAMPLE_DIR}/conf/{MODEL_CONFIG}'
print(config_file)
config = OmegaConf.load(config_file)
# Older OmegaConf releases lack OmegaConf.to_yaml(); fall back to config.pretty() there
print(OmegaConf.to_yaml(config) if hasattr(OmegaConf, 'to_yaml') else config.pretty())

/Users/vgetselevich/Development/NeMoOriginal/examples/nlp/intent_slot_classification/conf/intent_slot_classification_config.yaml