Examples

All examples live under the egs directory, and each one is named after its dataset. Datasets whose names start with "mock" are used for testing.
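The naming convention above can be checked with a few lines of Python. This is only a minimal sketch: the helper name split_examples is hypothetical and not part of the repository, and it assumes each example is a direct subdirectory of egs.

```python
from pathlib import Path


def split_examples(egs_root):
    """Return (real, mock) example directory names found under egs_root.

    Hypothetical helper for illustration; assumes one subdirectory per dataset.
    """
    # Only directories count as examples; plain files such as a README are skipped.
    names = sorted(p.name for p in Path(egs_root).iterdir() if p.is_dir())
    # By the convention described above, a "mock" prefix marks a test dataset.
    mock = [n for n in names if n.startswith("mock")]
    real = [n for n in names if not n.startswith("mock")]
    return real, mock
```

Calling split_examples("egs") from the repository root would return the real dataset examples and the mock test datasets as two separate sorted lists.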

Examples for NLP

| DataSet | Supported Tasks | Description |
|---|---|---|
| ATIS | Sequence labeling / Text classification / NLU joint learning | Air Travel Information System (ATIS) pilot corpus. |
| CoNLL2003 | Sequence labeling | The CoNLL 2003 NER task consists of newswire text from the Reuters RCV1 corpus tagged with four different entity types (PER, LOC, ORG, MISC). |
| MSRA_NER | Sequence labeling | The MSRA dataset is an NER dataset in the news domain. |
| SNLI | Sentence Matching | The Stanford Natural Language Inference corpus is a new, freely available collection of labeled sentence pairs, written by humans doing a novel grounded task based on image captioning. |
| Quora_QP | Sentence Matching | Data collected from the Quora platform. Quora is a place to gain and share knowledge about anything. |
| Yahoo_Answer | Document Classification | The Yahoo Answers data are taken from Zhang et al. (2015). This is a topic classification task with 10 classes. Each document includes the question title, question context and best answer. |
| Trec | Document Classification | This data collection contains all the data used in the learning question classification experiments, including the question class definitions. |

Examples for Speech

| DataSet | Supported Tasks | Description |
|---|---|---|
| hkust | ASR | |
| voxceleb | Speaker Verification | |
| sre16 | Speaker Verification | |
| iemocap | Emotion | |