Frame Disambiguation with CrowdTruth


This repository contains a ground truth corpus for semantic frame disambiguation, acquired with crowdsourcing and processed with CrowdTruth metrics that capture ambiguity in annotations by measuring inter-annotator disagreement.

The dataset contains annotations for over 9,000 sentence-word pairs from the FrameNet corpus v1.7. Each sentence-word pair was annotated for frame disambiguation by 15 workers. The crowdsourced data was collected on Amazon Mechanical Turk.
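To give a feel for how disagreement among the 15 workers can be turned into an ambiguity signal, here is a minimal sketch. It is a simplified, unweighted illustration and not the CrowdTruth 2.0 implementation, which additionally weights votes by worker and annotation quality. The function name `frame_scores` and the frame labels are hypothetical. Each frame's score is the cosine similarity between the vote-count vector for a sentence-word pair and the unit vector for that frame.

```python
from collections import Counter
from math import sqrt


def frame_scores(worker_annotations):
    """Score each frame for one sentence-word pair.

    worker_annotations: one list of selected frame labels per worker.
    Returns {frame: cosine(vote_vector, unit_vector_for_frame)}, which
    reduces to votes_for_frame / euclidean_norm(vote_vector).
    """
    # Count each frame at most once per worker.
    counts = Counter(
        frame for frames in worker_annotations for frame in set(frames)
    )
    norm = sqrt(sum(c * c for c in counts.values()))
    if norm == 0:
        return {}
    return {frame: c / norm for frame, c in counts.items()}


# Hypothetical example: 15 workers split between two candidate frames.
votes = [["Frame_A"]] * 9 + [["Frame_B"]] * 6
scores = frame_scores(votes)
```

A pair on which all workers pick the same frame gets a score of 1.0 for that frame, while a split vote yields lower scores spread over several frames, which is one way to read the pair as ambiguous.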

The corpus has been referenced in the following papers:

To replicate the data processing from the paper, run the Jupyter notebook CrowdTruth metrics.ipynb. It requires the CrowdTruth metrics Python package (version 2.0 or later) to be installed.

The data aggregated with CrowdTruth metrics is available in the folder data/output/.

The raw crowdsourcing data is available in the folder data/input/.
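As a quick way to see what the two folders contain after cloning, the sketch below lists their files. It assumes it is run from the repository root; the helper name `list_data_files` is hypothetical, and no particular file names or formats are assumed.

```python
from pathlib import Path


def list_data_files(repo_root="."):
    """Return the file names found in data/input and data/output.

    A folder that does not exist maps to an empty list.
    """
    root = Path(repo_root)
    listing = {}
    for folder in ("data/input", "data/output"):
        path = root / folder
        if path.is_dir():
            listing[folder] = sorted(
                p.name for p in path.iterdir() if p.is_file()
            )
        else:
            listing[folder] = []
    return listing


if __name__ == "__main__":
    for folder, names in list_data_files().items():
        print(folder, names)
```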

If you find this data useful in your research, please consider citing:

  Author = {Anca Dumitrache and Lora Aroyo and Chris Welty},
  Title = {A Crowdsourced Frame Disambiguation Corpus with Ambiguity},
  Booktitle = {Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)},
  Year = {2019}




