# Noun Verb Dataset

This dataset contains naturally occurring English sentences that feature non-trivial noun-verb ambiguity.

## Motivation

English part-of-speech taggers regularly make egregious errors related to noun-verb ambiguity, despite having achieved 97%+ accuracy on the WSJ Penn Treebank since 2002. These mistakes have been difficult to quantify, and they make taggers less useful for downstream tasks such as translation and text-to-speech synthesis.

Below are some examples from the dataset:

  1. Certain insects can damage plumerias, such as mites, *flies*, or aphids. **NOUN**
  2. *Mark* which area you want to distress. **VERB**

All tested part-of-speech taggers mistag both of these examples, tagging *flies* as a verb and *Mark* as a noun.
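Sentences like these are easy to spot-check against any off-the-shelf tagger. The snippet below is a minimal sketch using spaCy, chosen purely for illustration; the tags produced depend entirely on which tagger and model you run.

```python
import spacy

# Spot-check an off-the-shelf tagger on the two examples above.
# spaCy and the "en_core_web_sm" model are illustrative choices
# (the model must be installed separately); any POS tagger works.
nlp = spacy.load("en_core_web_sm")

sentences = [
    "Certain insects can damage plumerias, such as mites, flies, or aphids.",
    "Mark which area you want to distress.",
]
for sent in sentences:
    doc = nlp(sent)
    # token.pos_ holds the coarse (universal) part-of-speech tag.
    print([(token.text, token.pos_) for token in doc])
```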

## Description

The dataset contains sentences in CoNLL format. Each sentence has a single token that has been manually annotated as either VERB or NON-VERB. The sentences come from multiple domains. Where applicable, the URL of the page a sentence was drawn from is included in a comment line before it.

The dataset is split into Train/Dev/Test sections. For the Dev and Test sections, the annotations include either VERB or NON-VERB in the XPOS, UPOS, and FEATS columns. For the Train section, the XPOS and UPOS columns instead contain a predicted fine-grained POS tag, obtained by running an automatic tagger and selecting its highest-scoring tag that matches the gold coarse-grained VERB/NON-VERB label.

An example of a training sentence is shown below:

```
# https://www.wikihow.com/Not-Get-Bored-on-a-Long-Car-Ride
1   License _   NN  NN  POS=NON-VERB|fPOS=NON-VERB  -1  _   _   _
2   plates  _   _   _   _   -1  _   _   _
3   of  _   _   _   _   -1  _   _   _
4   cars    _   _   _   _   -1  _   _   _
5   from    _   _   _   _   -1  _   _   _
6   your    _   _   _   _   -1  _   _   _
7   area    _   _   _   _   -1  _   _   _
8   or  _   _   _   _   -1  _   _   _
9   your    _   _   _   _   -1  _   _   _
10  destination _   _   _   _   -1  _   _   _
11  .   _   _   _   _   -1  _   _   _
```
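For anyone loading the files, here is a minimal reader sketch. It assumes the layout shown above (tab-separated columns, `#` comment lines, blank lines between sentences, and the annotated token marked via `POS=...` in the FEATS column); the repository does not formally specify the format, so treat these as assumptions.

```python
def read_noun_verb_conll(path):
    """Yield (tokens, target_index, label) for each sentence in a .conll file.

    Assumes the layout shown above: tab-separated columns, '#' comment
    lines, blank lines between sentences, and the annotated token marked
    with POS=VERB or POS=NON-VERB in the FEATS column (index 5).
    """
    tokens, target = [], None
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if line.startswith("#"):  # source-URL comment line
                continue
            if not line:  # blank line ends the current sentence
                if tokens and target is not None:
                    yield tokens, target[0], target[1]
                tokens, target = [], None
                continue
            cols = line.split("\t")
            tokens.append(cols[1])  # FORM column
            for feat in cols[5].split("|"):
                if feat.startswith("POS="):
                    target = (len(tokens) - 1, feat.split("=", 1)[1])
    if tokens and target is not None:  # file may not end with a blank line
        yield tokens, target[0], target[1]
```

For example, `for tokens, i, label in read_noun_verb_conll("dev.conll"): print(tokens[i], label)` prints each annotated token alongside its gold VERB/NON-VERB label.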

The number of examples in each split is shown below:

| Data Split | Train | Dev  | Test |
| ---------- | ----- | ---- | ---- |
| Total      | 23458 | 2367 | 5907 |

These numbers are slightly lower than those reported in Table 5 of the paper, by 796, 33, and 93 examples, respectively.
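As a sanity check, the counts above can be reproduced with the `read_noun_verb_conll` sketch from the Description section:

```python
# Reproduce the split sizes in the table above.
for split in ("train.conll", "dev.conll", "test.conll"):
    print(split, sum(1 for _ in read_noun_verb_conll(split)))
```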

## Updated Results

We reran the experiments reported in Tables 1 and 9 of the paper on the slightly filtered data. The updated results are reported as means and standard deviations over 10 runs:

| Model           | WSJ        | NV       | Homographs (Micro) | Homographs (Macro) |
| --------------- | ---------- | -------- | ------------------ | ------------------ |
| WSJ             | 98.07±0.06 | 73.9±1.5 | 95.7±0.3           | 95.8±0.3           |
| WSJ+ELMo        | 97.99±0.06 | 81.5±1.1 | 96.8±0.2           | 96.8±0.2           |
| WSJ+Silver      | 98.00±0.07 | 86.4±0.5 | 96.0±0.2           | 96.1±0.2           |
| WSJ+ELMo+Silver | 98.01±0.06 | 88.7±0.4 | 96.7±0.2           | 96.8±0.2           |
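The Homographs columns presumably follow the standard micro/macro distinction: micro accuracy pools all homograph examples, while macro averages per-word accuracies so that rare homographs count as much as frequent ones. A sketch under that assumption:

```python
from collections import defaultdict

def micro_macro_accuracy(examples):
    """examples: iterable of (homograph_word, was_tagged_correctly) pairs.

    Micro pools every example into one accuracy; macro averages each
    word's accuracy so rare homographs weigh the same as frequent ones.
    (Standard definitions; an assumption about the columns above.)
    """
    per_word = defaultdict(lambda: [0, 0])  # word -> [n_correct, n_total]
    for word, correct in examples:
        per_word[word][0] += int(correct)
        per_word[word][1] += 1
    n_correct = sum(c for c, _ in per_word.values())
    n_total = sum(t for _, t in per_word.values())
    micro = n_correct / n_total
    macro = sum(c / t for c, t in per_word.values()) / len(per_word)
    return micro, macro
```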

An updated version of Table 7, tuned on the Noun-Verb development set, is as follows:

| Model           | WSJ        | NV       |
| --------------- | ---------- | -------- |
| WSJ             | 98.03±0.10 | 76.9±0.5 |
| WSJ+ELMo        | 97.90±0.09 | 83.4±0.3 |
| WSJ+Silver      | 97.97±0.09 | 87.1±0.3 |
| WSJ+ELMo+Silver | 97.95±0.08 | 89.1±0.2 |

## Contact

If you have a technical question regarding the dataset or publication, please create an issue in this repository.

## Citation

If you use or discuss this dataset in your work, please cite our paper:

```
@InProceedings{NOUNVERB,
  title = {A Challenge Set and Methods for Noun-Verb Ambiguity},
  author = {Ali Elkahky and Kellie Webster and Daniel Andor and Emily Pitler},
  booktitle = {Proceedings of EMNLP},
  year = {2018}
}
```