Resolving annotation conflicts #27

hiroshinoji · 2016-02-16T09:48:49Z

Currently, if we apply two annotators that annotate the same element, both annotations are added to the result. Stanford CoreNLP instead overwrites the old annotation. Following this, I implemented a method that checks whether the same element already exists when adding XML elements. Such duplicates occur, e.g., when running a joint parser that predicts both POS tags and trees after applying a POS tagger.
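The check described above can be sketched roughly as follows. This is a minimal illustration in Python using the standard `xml.etree.ElementTree` module, not the project's actual implementation; the helper name `add_or_replace` and the `id` attributes are hypothetical.

```python
import xml.etree.ElementTree as ET


def add_or_replace(parent, new_child):
    """Append new_child to parent, first removing any direct child with the same tag.

    Hypothetical helper: it mimics the "override the old annotation" behavior.
    """
    # findall returns a list, so removing while looping over it is safe.
    for old in parent.findall(new_child.tag):
        parent.remove(old)
    parent.append(new_child)
    return parent


# A sentence that already carries <tokens> from an earlier annotator.
sentence = ET.fromstring('<sentence><tokens id="old"/></sentence>')

# A later annotator produces <tokens> again; the old element is overridden.
add_or_replace(sentence, ET.fromstring('<tokens id="new"/>'))
print(ET.tostring(sentence, encoding="unicode"))
```

After the call, the sentence holds a single `tokens` element, the one added last.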

I plan to push this modification, but I am also wondering whether overriding is the best way to resolve conflicts. Maybe it would be better to also output a warning, but that may be future work.

hiroshinoji · 2016-02-16T10:09:19Z

Now CabochaAnnotator replaces the old annotations (chunks and dependencies) if they exist.
323a3b0#diff-9b2b4b9eb3146599a3ce60c12afa4ddeR46

hiroshinoji · 2016-02-18T02:50:41Z

Other options:

add an option to keep the old annotations;
distinguish different annotations with the same tag using an attribute.

In any case, each annotation should have an attribute recording which annotator produced it, e.g.:

<tokens annotators="juman">...</tokens>
<tokens annotators="knp">...</tokens>
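With such an attribute, later stages can pick the annotation they want by annotator name. The following is a rough Python sketch under that assumption; the helper names `tag_annotator` and `find_by_annotator` are hypothetical and only the `annotators` attribute comes from the example above.

```python
import xml.etree.ElementTree as ET


def tag_annotator(element, annotator):
    """Record the producing annotator on the element (hypothetical helper)."""
    element.set("annotators", annotator)
    return element


def find_by_annotator(parent, tag, annotator):
    """Return direct children with the given tag that were produced by annotator."""
    return [e for e in parent.findall(tag) if e.get("annotators") == annotator]


# Two annotators both emit <tokens>; both are kept and stay distinguishable.
sentence = ET.Element("sentence")
sentence.append(tag_annotator(ET.Element("tokens"), "juman"))
sentence.append(tag_annotator(ET.Element("tokens"), "knp"))

knp_tokens = find_by_annotator(sentence, "tokens", "knp")
print(len(sentence.findall("tokens")), len(knp_tokens))
```

The cost of this design is that every consumer has to say which annotator's output it wants whenever more than one element with the same tag is present.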

hiroshinoji · 2016-03-16T07:27:34Z

I've changed the cabocha behavior in d17b751 to keep the old annotation, because the annotator name (cabocha) is now recorded on every element.

It may be better to support an option that decides whether to keep or replace the old annotation, as in -knp.replaceJumanTokens.

In general, keeping annotations of the same type from different annotators seems to make the lower-level processing a bit complicated, so it might be better for the default behavior to replace the old annotation.
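A keep-or-replace switch with replacement as the default could look roughly like the Python sketch below. It is only an illustration of the idea: the function `add_annotation` and its `replace_old` parameter are hypothetical and do not correspond to an existing option.

```python
import xml.etree.ElementTree as ET


def add_annotation(parent, new_child, annotator, replace_old=True):
    """Add new_child under parent and record which annotator produced it.

    replace_old is a hypothetical flag: True drops existing children with the
    same tag (the suggested default), False keeps them next to the new one.
    """
    new_child.set("annotators", annotator)
    if replace_old:
        for old in parent.findall(new_child.tag):
            parent.remove(old)
    parent.append(new_child)
    return parent


# Default: the new chunks replace the old ones.
replaced = ET.fromstring('<sentence><chunks annotators="first"/></sentence>')
add_annotation(replaced, ET.Element("chunks"), "cabocha")

# Opt-out: both annotations are kept, distinguished by the attribute.
kept = ET.fromstring('<sentence><chunks annotators="first"/></sentence>')
add_annotation(kept, ET.Element("chunks"), "cabocha", replace_old=False)

print(len(replaced.findall("chunks")), len(kept.findall("chunks")))
```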

hiroshinoji added the discussion label Feb 17, 2016
