NLU datasets with task-oriented dialogue

Datasets for natural language understanding (NLU) and dialogue state tracking (DST) in task-oriented dialogue, intended for research use. Other surveys of dialogue-system datasets exist, such as AtmaHou's Task-Oriented-Dialogue-Dataset-Survey (I am one of the contributors), but this one focuses on how to build a semantic parser for spoken dialogue systems.

If you want to know more about NLU for task-oriented dialogue, please see the recommended papers.

There is an implementation of joint training of slot filling and intent detection for SLU, which is evaluated on ATIS and SNIPS datasets.

| Items | description | example |
| --- | --- | --- |
| NLU | Natural Language Understanding, which should contain text classification, sequence labelling and semantic parsing tasks. | |
| DST | Dialogue State Tracking | DSTC 2 |
| domain | dialogue domain | movie, music, flight, restaurant, ... |
| intent | An abstract meaning that refers to a sentence or sub-sentence. | The intent of "show me a movie named Titanic" is "find_movie" |
| slot | An attribute or key, which should have a value. | "show me a movie named Titanic" has a slot-value pair "movie_name = Titanic" |
| act type | a general speech action | inform, deny, confirm, request, hello, bye, ... |
| dialogue act | act_type(slot=value,...) | inform(movie_name = Titanic), request(price), ... |

  • Intent Detection or intent classification: sentence classification task
  • Slot Tagging: sequence labelling task
  • Slot Filling: equivalent to slot tagging when every slot value can be aligned to a span of the input sentence. Otherwise, the slot value must be predicted by classification or generation.
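
To make the link between slot tagging and slot filling concrete, here is a minimal sketch that turns per-token tags into slot-value pairs for the "Titanic" example above. It assumes a BIO tag scheme (`B-`/`I-`/`O`), which is a common convention but not something the datasets below are guaranteed to use, and `bio_to_slots` is a hypothetical helper name.

```python
from typing import Dict, List, Optional


def bio_to_slots(tokens: List[str], tags: List[str]) -> Dict[str, str]:
    """Collect slot-value pairs from BIO tags aligned with the tokens.

    Assumption: tags look like "B-<slot>", "I-<slot>" or "O".
    If a slot occurs twice, the later span overwrites the earlier one.
    """
    slots: Dict[str, str] = {}
    name: Optional[str] = None
    words: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if name is not None:  # close the span that was open
                slots[name] = " ".join(words)
            name, words = tag[2:], [token]
        elif tag.startswith("I-") and name == tag[2:]:
            words.append(token)  # continue the open span
        else:  # "O" or an inconsistent I- tag closes the open span
            if name is not None:
                slots[name] = " ".join(words)
            name, words = None, []
    if name is not None:
        slots[name] = " ".join(words)
    return slots


tokens = "show me a movie named Titanic".split()
tags = ["O", "O", "O", "O", "O", "B-movie_name"]
frame = {"intent": "find_movie", "slots": bio_to_slots(tokens, tags)}
print(frame)
```

This only works when the value appears verbatim in the input. A value that cannot be aligned to a token span (for example a normalised date) has no tag sequence to read it from, which is why slot filling then falls back to classification or generation.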

Datasets with single turn (not a dialogue)

| dataset | domain | semantic annotation | tasks | url |
| --- | --- | --- | --- | --- |
| ATIS | book flight | intent, slot | Intent classification, slot tagging | |
| MIT corpus | Restaurant & Movie | slot | slot tagging | |
| SNIPS | Playlist, Restaurant, Weather, Music, RateBook, etc. | intent, slot | Intent classification, slot tagging | |
| facebook TOP semantic parsing | navigation and event | hierarchical intent, slot | constituency parsing | |
| Facebook Multilingual Task Oriented Dataset | ALARM, REMINDER, and WEATHER | intent, slot | Intent classification, slot tagging | |
| snips_slu_data_v1.0 | SmartLights, SmartSpeaker | intent, slot | Intent classification, slot tagging | |
| SMP2017-ECDT (in Chinese) | flight, hotel, Chit-chat | intent | Intent classification | |
| E-commerce Shopping Assistant (ECSA) (in Chinese) | E-commerce Shopping | slot | slot tagging | |
| Clinc Intent Detection | Banking, Work, Meta, Auto, Travel, Home, Utility, Kitchen, Small Talk, Credit Cards | intent | Intent classification and out-of-scope detection | |
| FewJoint (in Chinese) | Many domains for few-shot learning | intent, slot | Intent classification, slot tagging | Dataset; Baseline |

Datasets with multiple turns (dialogue with context)

| dataset | #domains | cross_domains | semantic annotation | NLU/DST tasks | url |
| --- | --- | --- | --- | --- | --- |
| cam DSTC 2&3 | 2 | No | dialogue act | NLU (slot filling), DST (slot-value pairs) | |
| DSTC 4 | ~5 | Yes | speech action, slot | NLU (slot tagging), DST (slot-value pairs) | (challenge participants only) |
| google Sim-R/Sim-M/Sim-gen | 3 | No | act type, slot | NLU (slot tagging), DST (slot-value pairs) | |
| cam MultiWOZ 2.0/2.1 | 5 | Yes | multi-domains, slot-value pairs | DST (slot-value pairs) | |
| maluuba Frames | 1 | No | intent, dialogue act | NLU (intent classification, slot tagging), DST (slot-value pairs) | |
| Microsoft Dialogue Challenge | 3 | No | dialogue act | NLU (slot tagging) | |
| dstc8-schema-guided-dialogue | 17 | Yes | multi-domains, slot-value pairs, request-slots | DST | |
| MultiDoGo | 6 | Yes | over 81K dialogues harvested across six domains | NLU, DST | |
| Taskmaster-1/2 | 6+7 | No | 13,215 + 17,289 task-based dialogs comprising multiple domains | NLU/DST | |
| CrossWOZ (in Chinese) | 5 | Yes | 5,012 task-based dialogs comprising five domains | NLU/DST | |


More information about each dataset.

ATIS

  • single turn;
  • input sentences: natural language;
  • data size (single domain of "flight information searching"):
    • training set: 4978 utterances;
    • test set: 893 utterances;
  • semantic annotation: intent (sentence class), slot (sequence labelling)
    • intent number: 18
    • slot number: 83
  • Download:

MIT corpus

  • single turn;
  • input sentences: natural language;
  • data size:
    • MIT_Restaurant domain:
      • training set: 7660 utterances;
      • test set: 1521 utterances;
    • MIT_Movie domain (simple query):
      • training set: 9775 utterances;
      • test set: 2443 utterances;
    • MIT_Movie domain (complex query):
      • training set: 7816 utterances;
      • test set: 1953 utterances;
  • semantic annotation: slot (sequence labelling)
  • Download:


TOP semantic parsing

  • single turn;
  • input sentences: natural language;
  • data size:
    • training set: 35741 queries
    • test set: 9042 queries
  • semantic annotation: hierarchical intents, slot (it is a tree)
    • intent number: 25
    • slot number: 36
  • Download:
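
Because the annotation is a tree of intents and slots, it helps to see how such a tree can be read into a nested structure. The sketch below assumes a bracketed serialization in which `[IN:...` opens an intent node, `[SL:...` opens a slot node, and a standalone `]` closes the current node. The example string, its label names, and the helpers `parse_top` and `labels` are illustrative assumptions, not taken from the dataset files.

```python
from typing import List


def parse_top(annotation: str) -> dict:
    """Parse a whitespace-tokenised bracketed string into nested dicts.

    Each node is {"label": str, "children": [str or node, ...]}.
    """
    root = {"label": "ROOT", "children": []}
    stack = [root]
    for token in annotation.split():
        if token.startswith("["):  # open a new intent or slot node
            node = {"label": token[1:], "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
        elif token == "]":  # close the innermost open node
            if len(stack) == 1:
                raise ValueError("unbalanced closing bracket")
            stack.pop()
        else:  # plain word: attach to the innermost open node
            stack[-1]["children"].append(token)
    if len(stack) != 1:
        raise ValueError("unclosed bracket")
    if len(root["children"]) != 1 or not isinstance(root["children"][0], dict):
        raise ValueError("expected exactly one top-level node")
    return root["children"][0]


def labels(node: dict) -> List[str]:
    """List node labels in depth-first order."""
    found = [node["label"]]
    for child in node["children"]:
        if isinstance(child, dict):
            found.extend(labels(child))
    return found


# Made-up example in the assumed format.
tree = parse_top("[IN:GET_EVENT what is happening [SL:DATE_TIME this weekend ] ]")
print(labels(tree))
```

Unlike flat slot tagging, a slot node here may itself contain another intent node, which is why the task is listed as constituency parsing rather than sequence labelling.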

SMP2017-ECDT (in Chinese)

DSTC 2&3

  • multiple turns: human-machine dialogues;
  • input sentences:
    • transcription by human;
    • ASR output: n-best, word confusion network;
  • data size:
    • DSTC 2 (Restaurant Information Domain): source domain
      • training set: about 2k dialogues;
      • test set: about 1k dialogues;
    • DSTC 3 (Tourist Information Domain): extended domain
      • seed data: about 10 dialogues;
      • test set: about 2k dialogues;
  • semantic annotation: dialogue act
    • DSTC 2: 8 slots;
    • DSTC 3: 13 slots;
  • Download:
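
A dialogue act pairs an act type with slot or slot-value arguments. As a minimal sketch, assuming acts are written in the `act_type(slot=value,...)` notation from the terminology table (the DSTC 2&3 release itself may store them in a different format), they can be parsed as follows. `parse_dialogue_act` is a hypothetical helper, and it does not handle values that contain commas.

```python
import re
from typing import Dict, Optional, Tuple

# act_type, then a parenthesised, comma-separated argument list
ACT_PATTERN = re.compile(r"^\s*(\w+)\s*\((.*)\)\s*$")


def parse_dialogue_act(text: str) -> Tuple[str, Dict[str, Optional[str]]]:
    """Split "act_type(slot=value,...)" into the act type and its slots.

    A slot without "=value" (as in request(price)) maps to None.
    """
    match = ACT_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a dialogue act: {text!r}")
    act_type, body = match.groups()
    slots: Dict[str, Optional[str]] = {}
    for item in body.split(","):
        item = item.strip()
        if not item:  # empty argument list, e.g. hello()
            continue
        slot, sep, value = item.partition("=")
        slots[slot.strip()] = value.strip() if sep else None
    return act_type, slots


print(parse_dialogue_act("inform(movie_name = Titanic)"))
print(parse_dialogue_act("request(price)"))
```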


DSTC 4

  • multiple turns: human-human dialogues;
  • input sentences: natural language, transcription by human;
  • data size:
    • The data covers tourist information for Singapore, collected from Skype calls.
    • 35 dialogues totalling 31,034 utterances and 273,580 words.
  • semantic annotation: speech action, slot, dialogue state (slot-value pairs) at the sub-dialogue level
  • Download: challenge participants only

google Sim-R/Sim-M/Sim-gen

  • multiple turns: conversations between an agent and a simulated user;
  • input sentences: natural language;
  • data size:
| Dataset | Slots | Train | Dev | Test |
| --- | --- | --- | --- | --- |
| Sim-R (Restaurant) | price_range, location, restaurant_name, category, num_people, date, time | 1116 | 349 | 775 |
| Sim-M (Movie) | theatre_name, movie, date, time, | 384 | 120 | 264 |
| Sim-GEN (Movie) | theatre_name, movie, date, time, | 100K | 10K | 10K |

cam MultiWOZ 2.0/2.1

  • multiple turns: human-human dialogues collected in a Wizard-of-Oz (WOZ) setup;
  • input sentences: natural language;
  • data size: 3,406 single-domain dialogues (including booking where the domain allows it) and 7,032 multi-domain dialogues, each spanning 2 to 5 domains.
  • semantic annotation: dialogue state (slot-value pairs)
  • Download:
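
A dialogue state made of slot-value pairs is usually carried across turns, with newer values replacing older ones. The sketch below shows only that bookkeeping step under simplifying assumptions: the per-turn slot-value pairs are given rather than predicted from utterances, and the slot names, the example turns, and the helper `update_state` are hypothetical.

```python
from typing import Dict, List

State = Dict[str, str]


def update_state(state: State, turn_slots: Dict[str, str]) -> State:
    """Return a new state in which this turn's values override earlier ones."""
    new_state = dict(state)  # copy so earlier turn states stay unchanged
    new_state.update(turn_slots)
    return new_state


# Made-up slot-value pairs for three consecutive user turns.
turns: List[Dict[str, str]] = [
    {"restaurant-food": "italian"},
    {"restaurant-area": "centre"},
    {"restaurant-food": "chinese"},  # the user changes their mind
]

state: State = {}
history: List[State] = []
for turn_slots in turns:
    state = update_state(state, turn_slots)
    history.append(state)

print(history[-1])
```

A real tracker has to estimate `turn_slots` (or the full state) from the dialogue context, which is the hard part of DST; the accumulation shown here is just the target structure it is trained to produce.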

maluuba Frames

Microsoft Dialogue Challenge

  • multiple turns:
    • human-human dialogues collected via Amazon Mechanical Turk;
    • Built-in user simulators are provided;
  • input sentences: natural language;
  • data size:
| Task | Intents | Slots | Dialogues |
| --- | --- | --- | --- |
| Movie-Ticket Booking | 11 | 29 | 2890 |
| Restaurant Reservation | 11 | 30 | 4103 |
| Taxi Ordering | 11 | 29 | 3094 |