Classification of Non-referential It on Question Answer Pairs

License

Notifications You must be signed in to change notification settings

emorynlp/qa-it

QA-It: Non-referential It for Question Answering

The QA-It dataset provides manual annotations for classifying non-referential it. It is unique in that it is annotated on question answer pairs collected from multiple genres, which makes it useful for developing advanced question answering systems. The annotation scheme makes clear distinctions between four types of pronominal it and provides guidelines for many erroneous cases.

Citation

Format

Our corpus is in TSV format, with the following columns:

  • 0: the source of the data.
  • 1: the genre.
  • 2: the document size (token count).
  • 3: the total count of pronominal it.
  • 4: a question answer pair, where the question and the answer are delimited by >>>>> and all pronominal its are surrounded by double square brackets, [[it]].
  • 5: the classes of the pronominal its in the question answer pair, where each class is delimited by , (e.g., 1,2 implies that the classes of the first and the second pronominal its are 1 and 2, respectively).
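The column layout above can be read with a few lines of Python. This is only a minimal sketch, not code shipped with the corpus: the names `QAItRecord`, `parse_line`, and `marked_its` are hypothetical, the sample line is made up for illustration, and the code assumes each line has exactly six tab-separated columns with no header row.

```python
import re
from typing import List, NamedTuple

QA_DELIMITER = ">>>>>"                     # separates question from answer (column 4)
MARKED_IT = re.compile(r"\[\[(.+?)\]\]")   # matches annotated pronouns such as [[it]]


class QAItRecord(NamedTuple):
    """One row of the corpus (hypothetical container, not part of the release)."""
    source: str
    genre: str
    doc_size: int
    it_count: int
    question: str
    answer: str
    classes: List[int]


def parse_line(line: str) -> QAItRecord:
    """Split one TSV line into the six documented columns."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 6:
        raise ValueError(f"expected 6 columns, got {len(cols)}")
    source, genre, doc_size, it_count, qa_pair, classes = cols
    # Column 4 holds the question and the answer, delimited by >>>>>.
    question, _, answer = qa_pair.partition(QA_DELIMITER)
    return QAItRecord(
        source=source,
        genre=genre,
        doc_size=int(doc_size),
        it_count=int(it_count),
        question=question.strip(),
        answer=answer.strip(),
        # Column 5 is a comma-delimited list of class ids, one per marked "it".
        classes=[int(c) for c in classes.split(",") if c.strip()],
    )


def marked_its(text: str) -> List[str]:
    """Return the tokens wrapped in double square brackets, in order."""
    return MARKED_IT.findall(text)


# Made-up example line, not taken from the corpus.
sample = "example-source\texample-genre\t9\t2\tIs [[it]] raining ? >>>>> Yes , [[it]] is .\t1,1"
record = parse_line(sample)
```

Pairing `marked_its(record.question) + marked_its(record.answer)` with `record.classes` gives one class per annotated pronoun, following the ordering described for column 5.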

The classes have the following meanings:

  • 1: Non-referential (pleonastic)
  • 2: Referential - nominal
  • 3: Referential - others
  • 4: Error
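As a small illustration, the class ids can be mapped to their descriptions and tallied across rows. This is a hedged sketch: `CLASS_LABELS`, `decode_classes`, and `count_classes` are hypothetical names, and the input strings are made-up values for the class column rather than corpus data.

```python
from collections import Counter
from typing import Iterable, List

# Class ids and descriptions as listed above.
CLASS_LABELS = {
    1: "Non-referential (pleonastic)",
    2: "Referential - nominal",
    3: "Referential - others",
    4: "Error",
}


def decode_classes(column: str) -> List[str]:
    """Turn a class column such as "1,2" into human-readable labels."""
    return [CLASS_LABELS[int(c)] for c in column.split(",")]


def count_classes(columns: Iterable[str]) -> Counter:
    """Count how often each class id occurs across many class columns."""
    counts: Counter = Counter()
    for column in columns:
        counts.update(int(c) for c in column.split(","))
    return counts


labels = decode_classes("1,2")
distribution = count_classes(["1,2", "1", "4,3,1"])
```

An unknown class id raises a `KeyError` in `decode_classes`, which is a deliberate choice in this sketch so that malformed rows surface early instead of being silently skipped.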

Acknowledgement

We gratefully acknowledge the support from the Emory URC Award. The contents of this material are those of the authors and do not necessarily reflect the views of Emory University.

Contact
