Offsets incorrectly displaying in annotation #1008
Comments
The text before the incorrect "Ecuador":
Yes, these problems usually happen because of emoji and special characters.
Is a fix in progress to count emoji and special characters correctly, so that this offset issue no longer occurs?
@lluissalord what LS version do you have now?
@makseq I am using version 1.3. An example of what is happening to me is below, where "Jr" should be labelled as SENIORITY. The current result for this case is:
So the value in the result is correct, but the visualization in Label Studio is not. Besides, if I try to label it correctly, the "start" and "end" no longer match what they should be when counting characters:
@lluissalord could you please share a sample text, your labeling config, and a full result that's displaying incorrectly? It'd be super helpful for debugging the issue.
Sample text: "👋🏽 Hola, soy Roberto Jr"
Labelling config:
Result:
Thank you! I'll get back to you after further investigation.
@lluissalord after checking your example, we found that this is an emoji issue. We're having problems calculating the length of composite emojis that contain more than one Unicode character. We're currently working on a fix that will be released during the LS 1.3 lifecycle.
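For anyone who wants to see the mismatch on the sample text above, here is a minimal Python sketch. It assumes (the thread does not confirm this) that the drift comes from one side counting Unicode code points and the other counting UTF-16 code units, which is what JavaScript's `String.length` reports. The helper names `utf16_len`, `to_utf16_offset`, and `to_codepoint_offset` are hypothetical and are not part of Label Studio.

```python
# Sample text from this thread: waving hand (U+1F44B) followed by a
# skin-tone modifier (U+1F3FD), i.e. one visible emoji made of two code points.
TEXT = "\U0001F44B\U0001F3FD Hola, soy Roberto Jr"


def utf16_len(s: str) -> int:
    """Number of UTF-16 code units in s (hypothetical helper)."""
    return len(s.encode("utf-16-le")) // 2


def to_utf16_offset(s: str, codepoint_offset: int) -> int:
    """Convert a code-point offset (Python indexing) to a UTF-16 offset."""
    return utf16_len(s[:codepoint_offset])


def to_codepoint_offset(s: str, utf16_offset: int) -> int:
    """Convert a UTF-16 code-unit offset back to a code-point offset."""
    units = 0
    for index, char in enumerate(s):
        if units >= utf16_offset:
            return index
        # Characters outside the Basic Multilingual Plane take two units.
        units += 2 if ord(char) > 0xFFFF else 1
    return len(s)


start = TEXT.index("Jr")
end = start + len("Jr")
print("code points: ", start, end)
print("UTF-16 units:", to_utf16_offset(TEXT, start), to_utf16_offset(TEXT, end))
```

Under that assumption, the same "Jr" span has two different (start, end) pairs depending on the unit, and the gap equals the number of extra UTF-16 units the emoji contributes.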
Hi all! @Lolologist @lluissalord
Hi @hlomzik I can confirm that the PR fixed my use case. Thank you!!
Hi @nicholasrq @makseq Do we know when the PR from @hlomzik (#1559) will be included in master?
@lluissalord Hey, I think it's already in the master branch of LS.
I tested the new version 1.3.0post1 and it did not work, so I assumed it was not in master.
@lluissalord Sorry, I was in a hurry. Could you check again from master?
@makseq It is working as expected on master. Thank you! 😄
Currently having the same problem with v1.4. Any idea when this fix is expected to be released?
@JulesBelveze Are you on the master branch of the LS GitHub repository?
Nope, I'm on v1.4.0. I was just wondering if there's a patch release expected soon.
Yes, we are going to release 1.4.1 and include this patch too.
In the attached pictures you can see the proper text for "Ecuador" as an entity, and the annotation tool showing it incorrectly.
My first guess is that it is related to the emoji being used, but if so, it isn't all emoji:
As you can see in the third image, offsets continue correctly in that particular case after being exposed to some emoji.
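One possible reason why only some emoji shift the offsets, sketched in Python below: emoji differ in how many code points and UTF-16 code units they occupy. This is an illustration only. The attached images are not reproduced here, so the three sample emoji are my own picks, not the ones from the screenshots, and `unit_counts` is a hypothetical helper.

```python
def unit_counts(s: str) -> tuple[int, int]:
    """Return (code points, UTF-16 code units) for s."""
    return len(s), len(s.encode("utf-16-le")) // 2


# Illustrative samples, chosen for this sketch (not taken from the screenshots).
SAMPLES = {
    "U+263A": "\u263A",                        # single code point inside the BMP
    "U+1F604": "\U0001F604",                   # single code point outside the BMP
    "U+1F44B U+1F3FD": "\U0001F44B\U0001F3FD",  # base emoji + skin-tone modifier
}

for label, emoji in SAMPLES.items():
    code_points, utf16_units = unit_counts(emoji)
    print(f"{label}: {code_points} code point(s), {utf16_units} UTF-16 unit(s)")
```

An emoji whose two counts are equal would leave offsets untouched under either counting scheme, which would be consistent with offsets staying correct after some emoji but not others.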