Span with surrounding quotation marks appears as if without ones #1033

Open
cjer opened this issue Aug 8, 2018 · 3 comments
Labels
🐛Bug Something isn't working RTL


cjer commented Aug 8, 2018

A span annotation that begins or ends with a quotation mark (and, I believe, other punctuation marks as well) is displayed as if the punctuation were not part of the annotated span. As a result, annotators missed many of these small boundary errors.

Examples:

An annotation on "הראל" looks the same on screen as one on הראל, but, as the screenshots below show, the two are annotated differently, which the curation interface recognizes.
[4 screenshots]

The same goes for חטיבת "הראל" and חטיבת "הראל

image

reckart added the RTL and 🐛Bug labels Aug 8, 2018

reckart (Member) commented Aug 8, 2018

Sounds like an edge case. I assume the quotes are not Unicode RTL characters, right? Can you verify whether the quotes are detected as separate tokens? "הראל" should consist of three tokens. You could check that, e.g., by exporting the data as TSV and seeing whether ", הראל and " each appear on a separate line. If they do, then we have an unknown bug. If they do not and this is a mixed RTL-LTR token, then it sounds like a known bug (#283).
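The TSV check suggested above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: that each token row of the export is tab-separated with the token id, the character offsets and the token text in the first three columns, and that header and sentence lines start with `#`. The function name `token_texts` and the sample data are hypothetical, not taken from a real export.

```python
HAREL = "\u05d4\u05e8\u05d0\u05dc"  # the Hebrew word from the report


def token_texts(tsv):
    """Return the token text of every token row in a TSV export."""
    tokens = []
    for line in tsv.splitlines():
        # Skip blank lines and header/sentence lines (assumed to start with '#').
        if not line.strip() or line.startswith("#"):
            continue
        columns = line.split("\t")
        # Assumed layout: token id, character offsets, token text, annotations...
        if len(columns) >= 3:
            tokens.append(columns[2])
    return tokens


# Hypothetical export fragment for the quoted word, one token per row.
sample = (
    '#Text="' + HAREL + '"\n'
    '1-1\t0-1\t"\t_\n'
    "1-2\t1-5\t" + HAREL + "\t_\n"
    '1-3\t5-6\t"\t_\n'
)

tokens = token_texts(sample)
# Three separate rows mean the quotes were tokenized on their own.
print(len(tokens), tokens == ['"', HAREL, '"'])
```

If the quotes and the word come back as three entries, the tokenizer has split them; a single entry containing both the quotes and the Hebrew letters would point to a mixed RTL-LTR token instead.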

reckart (Member) commented Aug 8, 2018

Actually, re-reading #283, it sounds like the same thing.

cjer (Author) commented Aug 8, 2018

Yes, these are regular ASCII quotation marks. And yes, they are recognized as separate tokens.
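For anyone wanting to confirm the same thing on their own data, the standard `unicodedata` module reports a character's code point, name and bidirectional class. The helper name `describe` is hypothetical. The sketch only shows that the ASCII quotation mark is directionally neutral while Hebrew letters are strongly right-to-left; whether that is what causes the rendering problem is not established in this thread.

```python
import unicodedata


def describe(ch):
    """Return (code point, Unicode name, bidirectional class) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.bidirectional(ch))


# ASCII quotation mark, then the first letter of the Hebrew word from the report.
for ch in '"\u05d4':
    print(describe(ch))
# ('U+0022', 'QUOTATION MARK', 'ON')
# ('U+05D4', 'HEBREW LETTER HE', 'R')
```

A class of `ON` (other neutral) means the character has no direction of its own and takes it from its surroundings, which is why a quote at the edge of an RTL word can end up drawn somewhere other than where the annotation boundary suggests.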

reckart added this to the Bug backlog milestone Sep 24, 2019