Unicode right-to-left character encoding issue #344

Open
jacobthill opened this issue Feb 7, 2023 · 0 comments
Labels
bug Something isn't working

Comments

@jacobthill
Contributor

Some Arabic strings have an unencoded character in them: e.g.

\u200f is the Unicode right-to-left mark (U+200F), a non-printing character. It is present in the original data but is not visible when rendered. We didn't have an issue with this before, but with the new Airflow process this character isn't getting encoded. The issue is likely in intake. We need this character either to render properly or to be removed. I'm not sure of the implications of removing it, but here is a way to do that:

https://stackoverflow.com/questions/46897952/remove-right-to-left-character-u200f-in-python-hebrew
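
For reference, a minimal sketch of the removal approach, assuming the affected values are plain Python `str` objects by the time they reach intake. The names `RLM`, `contains_rlm`, and `strip_rlm` are hypothetical and are not part of the existing pipeline; where such a step would be called from is still to be determined.

```python
# U+200F RIGHT-TO-LEFT MARK: a zero-width, non-printing directional character.
RLM = "\u200f"


def contains_rlm(text: str) -> bool:
    """Return True if the string contains at least one U+200F character."""
    return RLM in text


def strip_rlm(text: str) -> str:
    """Return a copy of the string with every U+200F character removed.

    Hypothetical helper: only this one code point is removed; other
    directional marks (for example U+200E) are left untouched.
    """
    return text.replace(RLM, "")


if __name__ == "__main__":
    # Illustrative value only, not taken from the real data.
    sample = "\u200f" + "\u0645\u0631\u062d\u0628\u0627"
    print(contains_rlm(sample))             # True before cleaning
    print(contains_rlm(strip_rlm(sample)))  # False after cleaning
```

Removing the mark should not change which letters are stored, but it may change how mixed Arabic and Latin text is ordered on display, so it would be worth checking a few rendered records before applying this across the whole dataset.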

@jacobthill jacobthill added the bug Something isn't working label Feb 7, 2023