Doccano is duplicating the words displayed for some unknown reason #1105
Comments
Issue-Label Bot is automatically applying the label Links: app homepage, dashboard and code for this bot. |
Could you please show me an example of pre-labeled data? It helps me a lot to investigate the problem. |
Hi @Hironsan, sure, here you go:
|
This one is when I imported the pre-labelled text into doccano. After labelling, somehow it had duplicates. {"id": 6111, "text": "10. The“Adsorption (2018) 24:691” article should be acknowledged , it discussed the same material and separation but from a different approach. Similarly,“Chem. Mater., 2018, 30 (2), 447-455” also discussed the same material for a novel application, which should be briefly incorporated into the introduction about the target materials.", "meta": {}, "annotation_approver": "xxx@yyy.com", "labels": [[7, 33, "LOCATION"], [51, 64, "ACTION"], [42, 48, "MODAL"], [113, 116, "TRIGGER"], [42, 48, "MODAL"], [52, 64, "ACTION"], [4, 41, "CONTENT"], [154, 191, "CONTENT"], [256, 262, "MODAL"], [274, 286, "ACTION"], [297, 309, "LOCATION"]]} |
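The exported record above contains the triple [42, 48, "MODAL"] twice. As a stopgap on the export side, exact duplicates can be dropped after export. This is a minimal sketch, assuming the JSONL layout shown above; `dedupe_labels` is a hypothetical helper, not part of doccano, and the record below is shortened for illustration.

```python
import json


def dedupe_labels(record):
    """Return a copy of record with exact duplicate [start, end, label] triples removed.

    Hypothetical helper for post-processing an export; the first occurrence
    of each triple is kept and the original order is preserved.
    """
    seen = set()
    unique = []
    for start, end, label in record["labels"]:
        key = (start, end, label)
        if key not in seen:
            seen.add(key)
            unique.append([start, end, label])
    return {**record, "labels": unique}


# Shortened version of the exported line quoted above.
line = (
    '{"id": 6111, "text": "shortened for illustration", "meta": {}, '
    '"labels": [[42, 48, "MODAL"], [51, 64, "ACTION"], '
    '[42, 48, "MODAL"], [52, 64, "ACTION"]]}'
)
cleaned = dedupe_labels(json.loads(line))
print(cleaned["labels"])
# [[42, 48, 'MODAL'], [51, 64, 'ACTION'], [52, 64, 'ACTION']]
```

Note that [51, 64, "ACTION"] and [52, 64, "ACTION"] differ by one offset, so this exact-match approach keeps both; deciding which of two overlapping spans is correct still needs a human.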
Thank you. I looked at the data and have a hypothesis. Did you turn on the |
Collaborative annotation is now checked, but I do not remember if I checked it when I created the project and imported the dataset, or afterwards. Shall I disable it and try to export again? |
OK, the good news is that once I disable "Collaborative annotation", I no longer see the duplicates in doccano. However, when I export the annotations from doccano, the duplicates are still in the JSON file. |
The way to reproduce the problem
Problem
class SequenceAnnotation(Annotation):
    ...
    class Meta:
        unique_together = ('document', 'user', 'label', 'start_offset', 'end_offset')
Solution idea
I can't come up with anything right now. I will think about a solution. |
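One way to read the constraint quoted above: because 'user' is part of the uniqueness key, the same span labeled by two different users is not treated as a duplicate. The following is a plain-Python sketch of that behavior, not doccano or Django code; `Row`, `violates_unique`, and the user names are hypothetical, and it assumes the pre-labeled import and the manual annotation were stored under different users.

```python
from collections import namedtuple

# Hypothetical stand-in for a SequenceAnnotation row.
Row = namedtuple("Row", "document user label start_offset end_offset")


def violates_unique(rows, fields):
    """Return True if two rows share the same values for all given fields."""
    seen = set()
    for row in rows:
        key = tuple(getattr(row, name) for name in fields)
        if key in seen:
            return True
        seen.add(key)
    return False


# Same document, label, and span, stored under two different users (assumed).
rows = [
    Row(6111, "user_a", "MODAL", 42, 48),
    Row(6111, "user_b", "MODAL", 42, 48),
]

with_user = ("document", "user", "label", "start_offset", "end_offset")
without_user = ("document", "label", "start_offset", "end_offset")

print(violates_unique(rows, with_user))     # False: both rows are allowed
print(violates_unique(rows, without_user))  # True: the span is a duplicate
```

Under this reading, both rows are valid for the database, and any view or export that merges annotations across users would show the span twice.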
Thanks @Hironsan! The main issue I have now is that even after disabling the share annotation option, the view is fixed for user B, but the exported JSON still contains both annotations. To bypass the problem, I am querying the PostgreSQL db directly with a WHERE clause user_id = user B. Looking forward to a cleaner solution. |
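The workaround described above can be sketched roughly as follows. This uses an in-memory SQLite database as a stand-in for PostgreSQL so the example is runnable; the table name `annotation`, its columns, and the user ids are all assumptions for illustration and do not reflect doccano's actual schema.

```python
import sqlite3

# In-memory SQLite stand-in for the PostgreSQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE annotation ("
    "document_id INTEGER, user_id INTEGER, label TEXT, "
    "start_offset INTEGER, end_offset INTEGER)"
)
conn.executemany(
    "INSERT INTO annotation VALUES (?, ?, ?, ?, ?)",
    [
        (6111, 1, "MODAL", 42, 48),   # hypothetical user A (importer)
        (6111, 2, "MODAL", 42, 48),   # hypothetical user B (annotator)
        (6111, 2, "ACTION", 52, 64),  # hypothetical user B (annotator)
    ],
)

# Keep only the annotations that belong to user B.
rows = conn.execute(
    "SELECT start_offset, end_offset, label FROM annotation "
    "WHERE document_id = ? AND user_id = ? ORDER BY start_offset",
    (6111, 2),
).fetchall()
print(rows)
# [(42, 48, 'MODAL'), (52, 64, 'ACTION')]
conn.close()
```

Filtering by user sidesteps the duplicates only when one user's annotations are the ones to keep; it does not merge corrections made by several annotators.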
I have imported a pre-labeled dataset to doccano and asked some colleagues to check and fix the annotations. In some cases, the sentences displayed in doccano have one annotated word duplicated, see the following screenshots:
When I open doccano, the word "should" is duplicated:
https://ibb.co/3WKKd2m
I remove both duplicate annotations, and "should" is shown only once:
https://ibb.co/SJnV1mL
When I try to annotate "should", the label selection window does not pop up.
When I annotate something before the word "should":
https://ibb.co/nzB28bx
The annotation is placed on a wrong position:
https://ibb.co/7QpWvjR
If I remove all annotations and redo them again, all looks ok.