CORBON 2017 Shared Task data and materials

This repository contains the datasets and materials for the Shared Task on coreference resolution, to be held as part of the CORBON workshop at EACL 2017. The description of the shared task can be found here: http://corbon.nlp.ipipan.waw.pl/index.php/shared-task/

This repository contains:

  1. raw parallel data (data/raw): the raw, sentence-aligned English-German and English-Russian News-Commentary11 corpora (Tiedemann, 2012), split into documents and tokenised with the EuroParl tools (Koehn, 2005).

  2. coreference-resolved English data (data/training): the English part of the News-Commentary11 corpus, with coreference resolved automatically by the Berkeley Entity Resolution System in coref-predict mode (Durrett and Klein, 2013).

  3. annotation guidelines (Parallel_annotation_guidelines.pdf): parallel pronominal coreference annotation guidelines, as described in Grishina and Stede (2015).

  4. sample annotation (data/sample_annotation): sample files annotated according to the guidelines in (3).

  5. test data for German and Russian (data/test).
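
As a quick sanity check after cloning, the directory layout listed above can be verified with a short script. This is only a minimal sketch: the directory names are taken from the list above, while the function name `summarise_checkout` and the idea of counting files are illustrative assumptions, not part of the shared task tooling. It makes no assumption about the file formats inside each directory.

```python
from pathlib import Path

# Directory names as listed in this README, relative to the repository root.
EXPECTED_DIRS = [
    "data/raw",
    "data/training",
    "data/sample_annotation",
    "data/test",
]


def summarise_checkout(repo_root):
    """Map each expected directory to its file count, or None if it is missing.

    Hypothetical helper for illustration; not shipped with the repository.
    """
    root = Path(repo_root)
    summary = {}
    for rel in EXPECTED_DIRS:
        directory = root / rel
        if directory.is_dir():
            # Count regular files at any depth below the directory.
            summary[rel] = sum(1 for p in directory.rglob("*") if p.is_file())
        else:
            summary[rel] = None
    return summary


if __name__ == "__main__":
    # Assumes the script is run from the root of a local clone.
    for rel, count in summarise_checkout(".").items():
        status = "missing" if count is None else f"{count} files"
        print(f"{rel}: {status}")
```

Run from the root of a local clone, it prints one line per directory, which makes an incomplete download easy to spot.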