
Support evaluating with incomplete gold standards #17

Open
jnothman opened this issue Aug 4, 2015 · 0 comments

Comments


jnothman commented Aug 4, 2015

A user on the TAC mailing list wished to evaluate a mix of 2014 and pre-2014 EL tasks, such that precision indicates the precision of linking/clustering while ignoring spurious mentions.

While it is not hard to remove spurious mentions using `grep`, this may be worth facilitating in one of the following ways:

  1. a command to output the subset of a dataset that aligns to the gold standard.
  2. an `--ignore-spurious` flag to `evaluate`, `significance` and `confidence`.
  3. an `is_aligned` attribute on annotations that is `True` for all gold annotations and is set with respect to some gold standard when loading annotations from a system output.

Option (3) would appear to be the most flexible and in line with the current design.
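As a rough illustration of option (3), here is a minimal sketch of marking system annotations as aligned against a gold standard. All names here (`Annotation`, `mark_alignment`, the span fields) are hypothetical and not taken from the existing codebase; alignment is approximated as exact span match for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical minimal annotation: a mention span linked to a KB entry."""
    doc_id: str
    start: int
    end: int
    kb_id: str


def mark_alignment(system, gold):
    """Pair each system annotation with an is_aligned flag.

    An annotation is considered aligned if its (doc_id, start, end) span
    occurs in the gold standard. Unaligned (spurious) mentions can then be
    excluded when computing linking/clustering precision.
    """
    gold_spans = {(a.doc_id, a.start, a.end) for a in gold}
    return [(a, (a.doc_id, a.start, a.end) in gold_spans) for a in system]


gold = [Annotation("d1", 0, 5, "E1")]
system = [
    Annotation("d1", 0, 5, "E2"),    # aligned span, possibly wrong link
    Annotation("d1", 10, 15, "E3"),  # spurious: no gold span here
]
flags = [aligned for _, aligned in mark_alignment(system, gold)]
# flags == [True, False]
```

In a real implementation the flag would presumably be set once at load time, so that downstream measures can filter on it uniformly rather than each re-deriving the alignment.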
