Overview
Many Natural Language Processing (NLP) tasks depend on using Named Entities (NEs) that are contained in texts and in external knowledge sources. While this is easy for humans, current neural methods that rely on learned word embeddings may not perform well on these tasks, especially in the presence of Out-Of-Vocabulary (OOV) or rare NEs. The datasets comprise extended versions of dialog bAbI tasks 1, 2 and 4 and OOV versions of the Children's Book Test (CBT) test set.
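To make the OOV problem concrete, here is a minimal sketch of how a fixed vocabulary treats unseen entities. The tokens, entity names and the `encode` helper are all hypothetical and made up for illustration; they are not taken from the datasets or from the paper's model.

```python
# Toy vocabulary built from a single "training" utterance (hypothetical data).
train_tokens = "i would like to book a table at resto_rome_cheap".split()

UNK = "<unk>"
vocab = {UNK: 0}
for tok in train_tokens:
    # Assign the next free id to each token the first time it is seen.
    vocab.setdefault(tok, len(vocab))


def encode(tokens):
    """Map tokens to ids; any token missing from the vocabulary gets the UNK id."""
    return [vocab.get(tok, vocab[UNK]) for tok in tokens]


# Two different entities that never appeared in training (hypothetical names).
ids_a = encode(["resto_paris_expensive"])
ids_b = encode(["resto_tokyo_moderate"])

# Both collapse to the same id, so an embedding lookup cannot tell them apart.
print(ids_a, ids_b, ids_a == ids_b)
```

Because both unseen entities share one id, a model that only looks up learned embeddings receives identical input for them, which is the failure case the OOV test sets are meant to expose.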
NE-Table: A Neural key-value table for Named Entities, RANLP 2019
Janarthanan Rajendran*, Jatin Ganhotra*, Xiaoxiao Guo, Mo Yu, Satinder Singh and Lazaros Polymenakos
https://dblp.org/rec/conf/ranlp/RajendranGGYSP19
(*Equal Contribution)
Extended Dialog bAbI tasks
Adaptation of the "Dialog bAbI tasks data" dataset released by Facebook, available at https://research.fb.com/downloads/babi/, under the CC BY 3.0 Unported license, available at https://creativecommons.org/licenses/by/3.0/legalcode
CBT-OOV datasets
Adaptation of "The Children's Book Test (CBT)" dataset released by Facebook, available at https://research.fb.com/downloads/babi/, under the GNU Free Documentation License (Version 1.3, 3 November 2008), available at https://www.gnu.org/licenses/fdl-1.3.en.html
License
The dataset is released under the CC BY-SA 4.0 license. For the full license, see LICENSE.txt. Please cite the following paper if you use this dataset in your work:
@inproceedings{DBLP:conf/ranlp/RajendranGGYSP19,
author = {Janarthanan Rajendran and
Jatin Ganhotra and
Xiaoxiao Guo and
Mo Yu and
Satinder Singh and
Lazaros Polymenakos},
editor = {Ruslan Mitkov and
Galia Angelova},
title = {NE-Table: {A} Neural key-value table for Named Entities},
booktitle = {Proceedings of the International Conference on Recent Advances in
Natural Language Processing, {RANLP} 2019, Varna, Bulgaria, September
2-4, 2019},
pages = {980--993},
publisher = {{INCOMA} Ltd.},
year = {2019},
url = {https://doi.org/10.26615/978-954-452-056-4\_114},
doi = {10.26615/978-954-452-056-4\_114},
timestamp = {Fri, 31 Jan 2020 12:36:51 +0100},
biburl = {https://dblp.org/rec/conf/ranlp/RajendranGGYSP19.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
Contact
For more details on the datasets, see the paper NE-Table: A Neural key-value table for Named Entities, RANLP 2019 (https://dblp.org/rec/conf/ranlp/RajendranGGYSP19).
For questions on the Extended Dialog bAbI tasks, contact Janarthanan Rajendran: rjana (at) umich (dot) edu
For questions on the CBT-OOV datasets, contact Jatin Ganhotra: jatinganhotra (at) us (dot) ibm (dot) com
Dataset Metadata
The following table is necessary for this dataset to be indexed by search engines such as Google Dataset Search.
property | value
---|---
name | Extended dialog bAbI tasks and CBT-OOV datasets
alternateName | Extended dialog bAbI tasks 1, 2 and 4 and OOV versions of the CBT test set
url | https://github.com/IBM/ne-table-datasets
sameAs | https://github.com/IBM/ne-table-datasets
description | Many Natural Language Processing (NLP) tasks depend on using Named Entities (NEs) that are contained in texts and in external knowledge sources. While this is easy for humans, the present neural methods that rely on learned word embeddings may not perform well for these NLP tasks, especially in the presence of Out-Of-Vocabulary (OOV) or rare NEs. The datasets contain extended versions of dialog bAbI tasks 1,2 and 4 and OOV versions of the CBT test set.
provider |
citation | https://dblp.org/rec/conf/ranlp/RajendranGGYSP19
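Dataset search engines typically read this kind of metadata as schema.org `Dataset` markup. The sketch below shows how the rows of the table could be serialized as JSON-LD. The property values are copied from the table; the variable names are illustrative, the `provider` entry is left out because its value is not shown in the table, and this is not necessarily how the repository itself generates its markup.

```python
import json

# Property/value pairs taken from the metadata table.
# "provider" is omitted because its value is not given there.
dataset_metadata = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Extended dialog bAbI tasks and CBT-OOV datasets",
    "alternateName": (
        "Extended dialog bAbI tasks 1, 2 and 4 and OOV versions "
        "of the CBT test set"
    ),
    "url": "https://github.com/IBM/ne-table-datasets",
    "sameAs": "https://github.com/IBM/ne-table-datasets",
    "description": (
        "Many Natural Language Processing (NLP) tasks depend on using "
        "Named Entities (NEs) that are contained in texts and in external "
        "knowledge sources. While this is easy for humans, the present "
        "neural methods that rely on learned word embeddings may not "
        "perform well for these NLP tasks, especially in the presence of "
        "Out-Of-Vocabulary (OOV) or rare NEs. The datasets contain "
        "extended versions of dialog bAbI tasks 1,2 and 4 and OOV "
        "versions of the CBT test set."
    ),
    "citation": "https://dblp.org/rec/conf/ranlp/RajendranGGYSP19",
}

# Serialize to a JSON-LD string, e.g. for a <script type="application/ld+json"> tag.
json_ld = json.dumps(dataset_metadata, indent=2)
print(json_ld)
```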