Slot filling and intent detection tasks of spoken language understanding

Basic models for slot filling and intent detection:
- An implementation of the "focus" part of the paper "Encoder-decoder with focus-mechanism for sequence labelling based spoken language understanding".
- An implementation of BLSTM-CRF based on jiesutd/NCRFpp.
- An implementation of joint training of slot filling and intent detection tasks (Bing Liu and Ian Lane, 2016).
Basic models + ELMo / BERT / XLNET
Tutorials on the ATIS, SNIPS, MIT_Restaurant_Movie_corpus (w/o intent), and E-commerce Shopping Assistant (ECSA) from Alibaba (w/o intent, in Chinese) datasets.

Setup

python 3.6.x
pytorch 1.1
pip install gpustat [if gpu is used]
embeddings: pip install embeddings
ELMo in allennlp: pip install allennlp
BERT/XLNET in transformers: pip install transformers

About the evaluations of intent detection on ATIS and SNIPS datasets.

In ATIS, one utterance may carry multiple intents, whereas every SNIPS utterance has exactly one intent. For example, "show me all flights and fares from denver to san francisco <=> atis_flight && atis_airfare". Intent detection on ATIS is therefore commonly trained and evaluated with a well-known convention.

NOTE: Following the paper "What is left to be understood in ATIS?", almost all work on ATIS takes the first intent as the label when training a "softmax" intent classifier. At evaluation time, a prediction is counted as correct if it matches any one of the utterance's intents.
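The convention above can be sketched in a few lines of Python. This is an illustration only, not the repository's evaluation code: the function names are hypothetical, and the `&&` separator is an assumption taken from the example above (the data files may encode multiple intents differently).

```python
from typing import List

# Assumption: multiple intents are joined as in the example "atis_flight && atis_airfare".
INTENT_SEPARATOR = "&&"


def split_intents(label: str) -> List[str]:
    """Split a possibly multi-intent label into individual intent names."""
    return [part.strip() for part in label.split(INTENT_SEPARATOR) if part.strip()]


def training_label(label: str) -> str:
    """Training target for a softmax classifier: the first listed intent."""
    return split_intents(label)[0]


def intent_accuracy(predictions: List[str], references: List[str]) -> float:
    """A prediction is correct if it matches any of the reference intents."""
    if not references:
        return 0.0
    correct = sum(
        1 for pred, ref in zip(predictions, references) if pred in split_intents(ref)
    )
    return correct / len(references)


refs = ["atis_flight && atis_airfare", "atis_airline"]
preds = ["atis_airfare", "atis_flight"]
print(training_label(refs[0]))       # atis_flight
print(intent_accuracy(preds, refs))  # 0.5
```

The first prediction is accepted because `atis_airfare` is one of the two reference intents, even though a classifier trained this way would only ever have seen `atis_flight` as the target for that utterance.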

TODO:

Add char-embeddings

Tutorials A: Slot filling and intent detection with pretrained word embeddings

The pretrained word embeddings are taken from the CNN-BLSTM language model of ELMo, in which word embeddings are produced by char-CNNs. Extract them for the ATIS, SNIPS and MIT_Restaurant_Movie_corpus (w/o intent) datasets with:

  python3 scripts/get_ELMo_word_embedding_for_a_dataset.py \
          --in_files data/atis-2/{train,valid,test} \
          --output_word2vec local/word_embeddings/elmo_1024_cased_for_atis.txt
  python3 scripts/get_ELMo_word_embedding_for_a_dataset.py \
          --in_files data/snips/{train,valid,test} \
          --output_word2vec local/word_embeddings/elmo_1024_cased_for_snips.txt
  python3 scripts/get_ELMo_word_embedding_for_a_dataset.py \
          --in_files data/MIT_corpus/{movie_eng,movie_trivia10k13,restaurant}/{train,valid,test} \
          --output_word2vec local/word_embeddings/elmo_1024_cased_for_MIT_corpus.txt

Alternatively, use the Glove and KazumaChar embeddings, which are also used in the TRADE dialogue state tracker:

  python3 scripts/get_Glove-KazumaChar_word_embedding_for_a_dataset.py \
          --in_files data/atis-2/{train,valid,test} \
          --output_word2vec local/word_embeddings/glove-kazumachar_400_cased_for_atis.txt
  python3 scripts/get_Glove-KazumaChar_word_embedding_for_a_dataset.py \
          --in_files data/snips/{train,valid,test} \
          --output_word2vec local/word_embeddings/glove-kazumachar_400_cased_for_snips.txt
  python3 scripts/get_Glove-KazumaChar_word_embedding_for_a_dataset.py \
          --in_files data/MIT_corpus/{movie_eng,movie_trivia10k13,restaurant}/{train,valid,test} \
          --output_word2vec local/word_embeddings/glove-kazumachar_400_cased_for_MIT_corpus.txt
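To inspect an extracted embedding file, a small reader like the one below may help. It is a sketch under an assumption: the `--output_word2vec` file is taken to be plain text with one `word v1 v2 ... vd` entry per line, optionally preceded by a `count dim` header as in the word2vec text format. The actual layout written by the scripts has not been verified here, and `load_text_embeddings` is a hypothetical helper, not part of this repository.

```python
import io
from typing import Dict, List, TextIO


def load_text_embeddings(handle: TextIO) -> Dict[str, List[float]]:
    """Read 'word v1 v2 ... vd' lines into a dict, skipping an optional 'count dim' header."""
    vectors: Dict[str, List[float]] = {}
    for line_no, line in enumerate(handle):
        parts = line.rstrip().split(" ")
        if line_no == 0 and len(parts) == 2 and all(p.isdigit() for p in parts):
            continue  # word2vec-style header line
        if len(parts) < 2:
            continue  # blank or malformed line
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


# In-memory sample standing in for a real file such as the ones generated above.
sample = io.StringIO("2 3\nflights 0.1 0.2 0.3\ndenver -0.4 0.5 0.6\n")
emb = load_text_embeddings(sample)
print(len(emb), len(emb["denver"]))  # 2 3
```

For a real file, pass `open(path, encoding="utf-8")` instead of the `StringIO` object.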

Run the scripts below; each one trains a model and evaluates it at every epoch.

BLSTM model:

bash run/atis_with_pretrained_word_embeddings.sh slot_tagger
bash run/snips_with_pretrained_word_embeddings.sh slot_tagger
bash run/MIT_corpus_with_pretrained_word_embeddings.sh slot_tagger

BLSTM-CRF model:

bash run/atis_with_pretrained_word_embeddings.sh slot_tagger_with_crf
bash run/snips_with_pretrained_word_embeddings.sh slot_tagger_with_crf
bash run/MIT_corpus_with_pretrained_word_embeddings.sh slot_tagger_with_crf
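What the CRF layer adds on top of the BLSTM is a tag-transition score, so that the best tag sequence is chosen jointly rather than per token. The sketch below shows Viterbi decoding over per-token emission scores and a transition matrix in plain Python. It is a simplified illustration, not the NCRFpp-based implementation used here (which is batched and written in PyTorch); the function name and the toy scores are made up.

```python
from typing import List


def viterbi_decode(emissions: List[List[float]], transitions: List[List[float]]) -> List[int]:
    """Return the highest-scoring tag sequence.

    emissions[t][j]: score of tag j at position t (e.g. produced by the BLSTM).
    transitions[i][j]: score of moving from tag i to tag j.
    """
    num_tags = len(transitions)
    scores = list(emissions[0])
    backpointers: List[List[int]] = []
    for t in range(1, len(emissions)):
        new_scores, pointers = [], []
        for j in range(num_tags):
            # Best previous tag for reaching tag j at position t.
            best_prev = max(range(num_tags), key=lambda i: scores[i] + transitions[i][j])
            pointers.append(best_prev)
            new_scores.append(scores[best_prev] + transitions[best_prev][j] + emissions[t][j])
        scores = new_scores
        backpointers.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(range(num_tags), key=lambda j: scores[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy tag set: 0 = O, 1 = B-x, 2 = I-x. The transition O -> I-x is heavily penalised.
emissions = [[2.0, 1.0, 0.0], [0.0, 0.5, 1.0]]
transitions = [[0.0, 0.0, -10.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
print(viterbi_decode(emissions, transitions))  # [0, 1]
```

With all transition scores set to zero the same emissions decode to `[0, 2]`, i.e. the invalid sequence O, I-x that a per-token argmax would produce.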

Enc-dec focus model (BLSTM-LSTM), which is the same as the Encoder-Decoder NN (with aligned inputs) of Liu and Lane (2016):

bash run/atis_with_pretrained_word_embeddings.sh slot_tagger_with_focus
bash run/snips_with_pretrained_word_embeddings.sh slot_tagger_with_focus
bash run/MIT_corpus_with_pretrained_word_embeddings.sh slot_tagger_with_focus
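The idea behind the focus mechanism can be illustrated by comparing how the decoder's context vector is formed. With soft attention, the context at each decoder step is a weighted sum over all encoder states; with focus, the context at step t is simply the encoder state aligned with position t, which is equivalent to a fixed one-hot attention weight. The plain-Python sketch below shows only this difference. The function names are hypothetical and the code is not taken from this repository's models.

```python
from typing import List

Vector = List[float]


def attention_context(encoder_states: List[Vector], weights: List[float]) -> Vector:
    """Soft attention: weighted sum of all encoder hidden states."""
    dim = len(encoder_states[0])
    return [sum(w * h[k] for w, h in zip(weights, encoder_states)) for k in range(dim)]


def focus_context(encoder_states: List[Vector], step: int) -> Vector:
    """Focus: the context at decoder step t is the encoder state at position t."""
    return list(encoder_states[step])


encoder_states = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
one_hot = [0.0, 1.0, 0.0]  # attention weight fixed on the aligned position (t = 1)
print(focus_context(encoder_states, 1))            # [0.0, 2.0]
print(attention_context(encoder_states, one_hot))  # [0.0, 2.0]
```

Because slot filling emits exactly one label per input word, the alignment is known in advance, so nothing has to be learned for the attention weights.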

Tutorials B: Slot filling and intent detection with ELMo

Run the scripts below; each one trains a model and evaluates it at every epoch.

ELMo + BLSTM / BLSTM-CRF / Enc-dec focus (BLSTM-LSTM) models:

slot_intent_model=slot_tagger # slot_tagger, slot_tagger_with_crf, slot_tagger_with_focus
bash run/atis_with_elmo.sh ${slot_intent_model}
bash run/snips_with_elmo.sh ${slot_intent_model}
bash run/MIT_corpus_with_elmo.sh ${slot_intent_model}

Tutorials C: Slot filling and intent detection with BERT

Model architectures (figures not included here):

Joint BERT, referred to below as "pure BERT".

Our BERT + BLSTM (BLSTM-CRF / Enc-dec focus).

Run the scripts below; each one trains a model and evaluates it at every epoch.

Pure BERT model (with or without CRF):

slot_model=NN # NN, NN_crf
intent_input=CLS # none, CLS, max, CLS_max
bash run/atis_with_pure_bert.sh ${slot_model} ${intent_input}
bash run/snips_with_pure_bert.sh ${slot_model} ${intent_input}
bash run/MIT_corpus_with_pure_bert.sh ${slot_model} ${intent_input}
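The `intent_input` values are not described further in this README. One plausible reading, based only on the option names, is that they select which encoder outputs feed the intent classifier: nothing (`none`), the `[CLS]` vector (`CLS`), a max-pooling over the token vectors (`max`), or the concatenation of both (`CLS_max`). The sketch below illustrates that reading in plain Python. Everything in it is an assumption, including the function name, the position of `[CLS]` at index 0, and the exclusion of `[CLS]` from the max-pooling; check the model code for the actual behaviour.

```python
from typing import List, Optional

Vector = List[float]


def intent_features(hidden: List[Vector], mode: str) -> Optional[Vector]:
    """Build the intent classifier input from encoder outputs (assumed semantics).

    hidden[0] is assumed to be the [CLS] position; the rest are token vectors.
    """
    if mode == "none":
        return None  # assumed: no intent classifier, slot filling only
    cls_vec = list(hidden[0])
    # Element-wise maximum over the token vectors (assumed to exclude [CLS]).
    max_vec = [max(column) for column in zip(*hidden[1:])]
    if mode == "CLS":
        return cls_vec
    if mode == "max":
        return max_vec
    if mode == "CLS_max":
        return cls_vec + max_vec
    raise ValueError(f"unknown intent_input: {mode}")


hidden = [[0.5, -0.5], [1.0, 0.0], [0.0, 2.0]]
print(intent_features(hidden, "CLS"))      # [0.5, -0.5]
print(intent_features(hidden, "max"))      # [1.0, 2.0]
print(intent_features(hidden, "CLS_max"))  # [0.5, -0.5, 1.0, 2.0]
```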

BERT + BLSTM / BLSTM-CRF / Enc-dec focus (BLSTM-LSTM) models:

slot_intent_model=slot_tagger # slot_tagger, slot_tagger_with_crf, slot_tagger_with_focus
bash run/atis_with_bert.sh ${slot_intent_model}
bash run/snips_with_bert.sh ${slot_intent_model}
bash run/MIT_corpus_with_bert.sh ${slot_intent_model}

For the optimizer, you can try BertAdam or AdamW. In my experiments, I chose BertAdam.

Tutorials D: Slot filling and intent detection with XLNET

Run the scripts below; each one trains a model and evaluates it at every epoch.

Pure XLNET model (with or without CRF):

slot_model=NN # NN, NN_crf
intent_input=CLS # none, CLS, max, CLS_max
bash run/atis_with_pure_xlnet.sh ${slot_model} ${intent_input}
bash run/snips_with_pure_xlnet.sh ${slot_model} ${intent_input}
bash run/MIT_corpus_with_pure_xlnet.sh ${slot_model} ${intent_input}

XLNET + BLSTM / BLSTM-CRF / Enc-dec focus (BLSTM-LSTM) models:

slot_intent_model=slot_tagger # slot_tagger, slot_tagger_with_crf, slot_tagger_with_focus
bash run/atis_with_xlnet.sh ${slot_intent_model}
bash run/snips_with_xlnet.sh ${slot_intent_model}
bash run/MIT_corpus_with_xlnet.sh ${slot_intent_model}

For the optimizer, you can try BertAdam or AdamW.

Results:

For the "NLU + BERT/XLNET" models, the hyper-parameters were not carefully tuned.
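Slot F1-score in this line of work is usually computed at the chunk level, in the style of the CoNLL `conlleval` script: a predicted slot counts only if its span and type both match the gold annotation exactly. The sketch below shows that computation for BIO-tagged sequences. It assumes BIO tagging and is not the evaluation code used to produce the numbers in this section; the function names and slot labels are illustrative.

```python
from typing import List, Set, Tuple

Chunk = Tuple[int, int, str]  # (start, end_exclusive, slot_type)


def bio_chunks(tags: List[str]) -> Set[Chunk]:
    """Extract slot chunks from one BIO-tagged sentence."""
    chunks: Set[Chunk] = set()
    start, slot = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing chunk
        continues = tag.startswith("I-") and slot == tag[2:]
        if not continues:
            if slot is not None:
                chunks.add((start, i, slot))
            # A B- tag, or an I- tag with no matching open chunk, starts a new chunk.
            start, slot = (i, tag[2:]) if tag != "O" else (None, None)
    return chunks


def slot_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Micro-averaged chunk-level F1 over a list of sentences."""
    n_gold = n_pred = n_correct = 0
    for gold_tags, pred_tags in zip(gold, pred):
        g, p = bio_chunks(gold_tags), bio_chunks(pred_tags)
        n_gold += len(g)
        n_pred += len(p)
        n_correct += len(g & p)
    if n_correct == 0:
        return 0.0
    precision, recall = n_correct / n_pred, n_correct / n_gold
    return 2 * precision * recall / (precision + recall)


gold = [["O", "B-fromloc", "I-fromloc", "O", "B-toloc"]]
pred = [["O", "B-fromloc", "O", "O", "B-toloc"]]
print(slot_f1(gold, pred))  # 0.5
```

In the example, the truncated `fromloc` span earns no partial credit, so only one of two predicted chunks is correct.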

Results of ATIS:

models	intent Acc (%)	slot F1-score (%)
Atten. enc-dec NN with aligned inputs (Liu and Lane, 2016)	98.43	95.87
Atten.-BiRNN (Liu and Lane, 2016)	98.21	95.98
Enc-dec focus (Zhu and Yu, 2017)	-	95.79
Slot-Gated (Goo et al., 2018)	94.1	95.2
Intent Gating & self-attention	98.77	96.52
BLSTM-CRF + ELMo	97.42	95.62
Joint BERT	97.5	96.1
Joint BERT + CRF	97.9	96.0
BLSTM (A. Pre-train word emb. of ELMo)	98.10	95.67
BLSTM-CRF (A. Pre-train word emb. of ELMo)	98.54	95.39
Enc-dec focus (A. Pre-train word emb. of ELMo)	98.43	95.78
BLSTM (A. Pre-train word emb. of Glove & KazumaChar)	98.66	95.55
BLSTM-CRF (A. Pre-train word emb. of Glove & KazumaChar)	98.21	95.74
Enc-dec focus (A. Pre-train word emb. of Glove & KazumaChar)	98.66	95.86
BLSTM (B. +ELMo)	98.66	95.52
BLSTM-CRF (B. +ELMo)	98.32	95.62
Enc-dec focus (B. +ELMo)	98.66	95.70
BLSTM (C. +BERT)	99.10	95.94
BLSTM (D. +XLNET)	98.77	96.08

Results of SNIPS:

The cased BERT-base model gives better results than the uncased model.

models	intent Acc (%)	slot F1-score (%)
Slot-Gated (Goo et al., 2018)	97.0	88.8
BLSTM-CRF + ELMo	99.29	93.90
Joint BERT	98.6	97.0
Joint BERT + CRF	98.4	96.7
BLSTM (A. Pre-train word emb. of ELMo)	99.14	95.75
BLSTM-CRF (A. Pre-train word emb. of ELMo)	99.00	96.92
Enc-dec focus (A. Pre-train word emb. of ELMo)	98.71	96.22
BLSTM (A. Pre-train word emb. of Glove & KazumaChar)	99.14	96.24
BLSTM-CRF (A. Pre-train word emb. of Glove & KazumaChar)	98.86	96.31
Enc-dec focus (A. Pre-train word emb. of Glove & KazumaChar)	98.43	96.06
BLSTM (B. +ELMo)	98.71	96.32
BLSTM-CRF (B. +ELMo)	98.57	96.61
Enc-dec focus (B. +ELMo)	99.14	96.69
BLSTM (C. +BERT)	98.86	96.92
BLSTM-CRF (C. +BERT)	98.86	97.00
Enc-dec focus (C. +BERT)	98.71	97.17
BLSTM (D. +XLNET)	98.86	97.05

Slot F1-scores of MIT_Restaurant_Movie_corpus(w/o intent):

models	Restaurant	Movie_eng	Movie_trivia10k13
Dom-Gen-Adv	74.25	83.03	63.51
Joint Dom Spec & Gen-Adv	74.47	85.33	65.33
Data Augmentation via Joint Variational Generation	73.0	82.9	65.7
BLSTM (A. Pre-train word emb. of ELMo)	77.54	85.37	67.97
BLSTM-CRF (A. Pre-train word emb. of ELMo)	79.77	87.36	71.83
Enc-dec focus (A. Pre-train word emb. of ELMo)	78.77	86.68	70.85
BLSTM (A. Pre-train word emb. of Glove & KazumaChar)	78.02	86.33	68.55
BLSTM-CRF (A. Pre-train word emb. of Glove & KazumaChar)	79.84	87.61	71.90
Enc-dec focus (A. Pre-train word emb. of Glove & KazumaChar)	79.98	86.82	71.10

Slot F1-scores of E-commerce Shopping Assistant (ECSA) from Alibaba(w/o intent, in Chinese):

models	slot F1-score (%)
Basic BiLSTM-CRF	43.02
Pure BERT	46.96
Pure BERT-CRF	47.75

Inference Mode

An example here:

bash run/atis_with_pretrained_word_embeddings_for_inference_mode__an_example.sh

Reference

Su Zhu and Kai Yu, "Encoder-decoder with focus-mechanism for sequence labelling based spoken language understanding," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017, pp. 5675-5679.
