Support for emojis in text translation #77

Closed
fleschutz opened this issue Apr 8, 2021 · 1 comment
Comments

@fleschutz

Given the English text: "Well done 👍"

The text itself is translated perfectly into any language. However, depending on the target language, the emoji is translated to "" or "?" or, in German, "Benachrichtigung" ("notification").

Would it be possible to detect the emoji and leave that character as it is?
Hint: in Unicode 13.0 there are 4 character ranges allocated for emojis: U+1F300 to U+1FAD6 (127744 to 129750), U+1F004 to U+1F251 (126980 to 127569), U+00A9 to U+00AE (169 to 174) and U+200D to U+3299 (8205 to 12953).
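
To illustrate the request, here is a minimal Python sketch of the idea: split the input into emoji and non-emoji runs, translate only the non-emoji runs, and copy the emoji through unchanged. It is not the project's actual fix. The names `EMOJI_RANGES`, `is_emoji`, and `translate_preserving_emojis` are hypothetical, and `translate` stands in for whatever translation function is in use. The ranges are the ones quoted above. Note that the last one (8205 to 12953) is broad and also covers non-emoji characters such as general punctuation and Japanese kana, so a real implementation would probably want a narrower table.

```python
from itertools import groupby
from typing import Callable

# Inclusive code point ranges quoted in the issue (an assumption, not an
# authoritative emoji table; the last range also contains non-emoji characters).
EMOJI_RANGES = [
    (0x1F300, 0x1FAD6),  # 127744 to 129750
    (126980, 127569),
    (169, 174),
    (8205, 12953),
]


def is_emoji(ch: str) -> bool:
    """Return True if the single character falls inside one of the ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in EMOJI_RANGES)


def translate_preserving_emojis(text: str, translate: Callable[[str], str]) -> str:
    """Translate non-emoji runs with `translate`; copy emoji runs verbatim."""
    parts = []
    # groupby yields consecutive runs of characters sharing the same is_emoji value.
    for emoji_run, chars in groupby(text, key=is_emoji):
        segment = "".join(chars)
        parts.append(segment if emoji_run else translate(segment))
    return "".join(parts)


if __name__ == "__main__":
    # str.upper is a stand-in for a real translation function.
    print(translate_preserving_emojis("Well done 👍", str.upper))
```

One design caveat: translating each run separately means the translator never sees the whole sentence at once, and it may trim whitespace next to an emoji, so a real integration would need to handle spacing and context explicitly.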

@PJ-Finlay
Collaborator

This has already been fixed in the code and should work for any newly trained models going forward.
