Corpora for evaluating NLU Services/Platforms such as Dialogflow, LUIS, Watson, Rasa etc.


This project contains natural language data for human-robot interaction in the home domain, which we collected and annotated for evaluating NLU services/platforms.

If you use the data and publish the results please let us know and cite our IWSDS 2019 paper:

(It is also available at arXiv.)

  author    = {Xingkun Liu and Arash Eshghi and Pawel Swietojanski and Verena Rieser},
  title     = {Benchmarking Natural Language Understanding Services for building Conversational Agents},
  booktitle = {Proceedings of the Tenth International Workshop on Spoken Dialogue Systems Technology (IWSDS)},
  month     = {April},
  year      = {2019},
  address   = {Ortigia, Siracusa (SR), Italy},
  publisher = {Springer},
  pages     = {xxx--xxx},
  url       = {http://www.xx.xx/xx/}


All data are released under the CC BY-SA 3.0 license.


It contains

  1. Collected-Original-Data (25K): the original collected data, with numbers, dates etc. normalized. It contains the pre-designed human-robot interaction questions and the users' answers, organized in CSV format.

  2. AnnotatedData (25716 Lines): annotated for Intents and Entities, organized in csv format.

    The annotated CSV file has the following columns: userid, answerid, scenario, intent, status, answer_annotation, notes, suggested_entities, answer_normalised, answer, question

    Most of them come from the original data collection; we keep them here so that later processing can be traced.

    "answer" contains the original user answers.
    "answer_normalised" contains the normalised form of "answer".
    "notes" was used by the annotators, who left a note there whenever they changed anything.
    "status" was used for annotation and post-processing. The post-processing scripts ignore an utterance if this column's content starts with 'IRR_'.
    "answer_annotation" contains the annotation results. It is used, along with "scenario", "intent" and "status", to generate the train/test datasets.

  3. The 10-fold cross-validation we used (here for reference only)

NB: The CSV file uses a semicolon (;) as the field/column delimiter! Parsing it with a comma (,) as the delimiter may corrupt the data.
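As a rough illustration of the two points above (the semicolon delimiter and the 'IRR_' status prefix), the following Python sketch reads semicolon-delimited rows and drops the ignored utterances. It is not the project's own post-processing code. The sample rows, the status value `IRR_example` and the helper name `load_usable_rows` are made up for the example; only the column names come from the description above.

```python
import csv
import io

# Hypothetical two-row sample in the documented column order.
# Note the commas inside the answers: they are safe because ';' is the delimiter.
SAMPLE = (
    "userid;answerid;scenario;intent;status;answer_annotation;notes;"
    "suggested_entities;answer_normalised;answer;question\n"
    "1;10;alarm;set;;wake me up at five, please;;;"
    "wake me up at five, please;wake me up at 5, please;q1\n"
    "2;11;alarm;set;IRR_example;asdf;;;asdf;asdf;q1\n"
)


def load_usable_rows(handle):
    """Read semicolon-delimited rows, skipping those whose status starts with 'IRR_'."""
    reader = csv.DictReader(handle, delimiter=";")
    return [row for row in reader if not (row["status"] or "").startswith("IRR_")]


rows = load_usable_rows(io.StringIO(SAMPLE))
for row in rows:
    print(row["scenario"], row["intent"], row["answer"])
```

To read a real file, the same function could be called on a handle opened with `open(path, newline="", encoding="utf-8")`; the UTF-8 encoding is an assumption and may need adjusting.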

CrossValidation contains the generated data for the different NLU services we used in our evaluations. It is uploaded here for reference only, as it can be regenerated from the annotated CSV data using our scripts. NB: the script shuffles the data each time it is run, so the generated data may not be exactly the same each time.
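To show why an unseeded shuffle yields different folds on every run, here is a generic Python sketch of a shuffled 10-fold split. It is not the project's Java script, whose exact splitting strategy is not described here. The function name `k_fold_splits` and the `seed` parameter are assumptions added for illustration; fixing the seed makes the split repeatable.

```python
import random


def k_fold_splits(items, k=10, seed=None):
    """Shuffle items, then yield one (train, test) pair per fold."""
    shuffled = list(items)
    # With seed=None the order differs on each run; a fixed seed makes it repeatable.
    random.Random(seed).shuffle(shuffled)
    # Deal the shuffled items into k folds of near-equal size.
    folds = [shuffled[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test


# Hypothetical stand-ins for annotated utterances.
utterances = [f"utt{i}" for i in range(25)]
splits_a = list(k_fold_splits(utterances, k=10, seed=13))
splits_b = list(k_fold_splits(utterances, k=10, seed=13))
print(len(splits_a), splits_a == splits_b)
```

Each utterance appears in exactly one test fold, and every train/test pair together covers the full data set.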

autoGeneFromRealAnno/: the train set and test set generated from the annotated CSV file. The other four subdirectories in CrossValidation/ (out4ApiaiReal, out4LuisReal, out4RasaReal and out4WatsonReal) hold the input data converted for Dialogflow, LUIS, Rasa and Watson respectively. The 'merged' directory inside contains the training input data.

Scripts for the Data Preparation and Evaluation

The Java code for preparing the data and evaluating performance, and the Python scripts for querying the services/platforms, are available here.


Please contact us if you have any questions.
