Handling Missing Annotations on certain sentence #27
Comments
Hi, I'm not sure what you mean exactly. Can you give an example?
Example (to generate an .m2 file like this one, assuming two annotators, with some sentences annotated by only one of them):
Aha, right, OK. In some files (mainly NUCLE and CoNLL), whenever an annotator is "missing", the implication is that they made no changes to the sentence. So in your example above, Annotator 1 thought the second sentence was already correct, and Annotator 0 thought the third sentence was already correct. If it makes things easier, you can add a noop edit for each "missing" annotator to explicitly indicate that they made no changes to the sentence; e.g.
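A minimal sketch of that suggestion in Python, assuming the usual M2 layout (one `S` line per sentence followed by `A` lines, blocks separated by a blank line, annotator ID as the last `|||`-separated field). The helper name `add_noop_edits` is hypothetical and not part of ERRANT; it only makes sense when a missing annotator really means "no changes".

```python
# Standard M2 noop edit line; the final field is the annotator ID.
NOOP = "A -1 -1|||noop|||-NONE-|||REQUIRED|||-NONE-|||{}"


def add_noop_edits(m2_text, annotator_ids):
    """Hypothetical helper: give every M2 block an edit for each annotator.

    Assumes a missing annotator made no changes to the sentence.
    """
    blocks_out = []
    for block in m2_text.strip().split("\n\n"):
        lines = block.split("\n")
        source, edits = lines[0], lines[1:]
        # Collect the annotator IDs that already have edits in this block.
        seen = {int(e.rsplit("|||", 1)[1]) for e in edits if e.startswith("A ")}
        for ann in annotator_ids:
            if ann not in seen:
                edits.append(NOOP.format(ann))
        # Stable sort keeps each annotator's edits in their original order.
        edits.sort(key=lambda e: int(e.rsplit("|||", 1)[1]))
        blocks_out.append("\n".join([source] + edits))
    return "\n\n".join(blocks_out) + "\n"
```

Calling `add_noop_edits(m2_text, [0, 1])` on a two-annotator file would add an explicit noop line wherever annotator 0 or 1 has no edits for a sentence.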
I think the case you mentioned can be handled by the current errant_parallel API by setting the correction equal to the source sentence. In my case, however, not every annotator has annotated every sentence.
Ah, I see. Yes, ERRANT was never designed for this: it's generally assumed that all annotators will annotate the same number of sentences, because otherwise correct sentences are indistinguishable from unannotated sentences. There's no easy way around this; I wouldn't want to add a special symbol for missing annotations because it would affect other datasets too. The only other option I can think of is to change the annotator IDs based on how many annotations there are; e.g. in Sentence1, all 3 annotators saw the sentence and made edits. This kind of structure will work in ERRANT, but note that the M2 annotator IDs will no longer refer to a specific annotator (if that's important, which it usually isn't).
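One possible way to do that renumbering, sketched in Python under the same M2 layout assumptions (blank-line-separated blocks, annotator ID as the last `|||` field). `renumber_annotators` is a hypothetical helper, not an ERRANT function. It assumes an annotator who saw a sentence but changed nothing already has an explicit noop edit, since otherwise they cannot be told apart from one who never saw it.

```python
def renumber_annotators(m2_text):
    """Hypothetical helper: renumber annotator IDs per sentence as 0..k-1.

    IDs are assigned in order of first appearance within each block, so
    they no longer identify a specific annotator across sentences.
    """
    blocks_out = []
    for block in m2_text.strip().split("\n\n"):
        lines = block.split("\n")
        mapping = {}  # original ID -> new per-sentence ID
        new_lines = [lines[0]]
        for line in lines[1:]:
            if not line.startswith("A "):
                new_lines.append(line)  # leave anything unexpected untouched
                continue
            body, ann = line.rsplit("|||", 1)
            new_id = mapping.setdefault(ann, len(mapping))
            new_lines.append(f"{body}|||{new_id}")
        blocks_out.append("\n".join(new_lines))
    return "\n\n".join(blocks_out) + "\n"
```

With this, a sentence seen only by original annotators 1 and 2 ends up with IDs 0 and 1, while a sentence seen only by annotator 2 ends up with ID 0.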
I am not able to generate M2 files when annotations are missing for certain sentences for some of the annotators. Choosing orig==annotated has its side effects. Am I missing something?