Choices do not work with ❌ emoji #898
Comments
The tokenizer behavior of splitting on some emoji is described here: https://github.com/Microsoft/botbuilder-dotnet/issues/706
Yes, this is correct: ❌ (U+274C) is being recognized as a breaking character (like punctuation) and 💚 (U+1F49A) isn't. Emoji are a little more complex than the current code assumes. Simply working on code points is not enough, because a full emoji will often span multiple code points, and the length is variable. In some cases, like the flag of Wales, an emoji takes up to 7 code points (and in this case 14 UTF-16 characters). Refer to https://unicode.org/emoji/charts/full-emoji-list.html for the full list. So this amounts to quite a heavy redesign of this code. Moving to the next milestone. However, the tokenizer is replaceable. The behavior we are talking about here is just the behavior of the default tokenizer. If you have an urgent need for slightly different tokenizing code, you might consider plugging in a custom tokenizer.
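For anyone who needs a workaround in the meantime, here is a minimal sketch of the two points above: counting code points versus UTF-16 characters, and a replacement tokenizer that splits only on whitespace so no emoji is ever treated as a breaking character. The `Token` shape and the names `codePointCount` and `whitespaceTokenizer` are assumptions made for illustration; check them against the tokenizer types your SDK version actually exports before plugging anything in.

```typescript
// Hypothetical token shape for illustration; verify against the SDK's own type.
interface Token {
    start: number;      // index of the token's first UTF-16 code unit
    end: number;        // index of the token's last UTF-16 code unit (inclusive)
    text: string;       // original text of the token
    normalized: string; // lower-cased text used for matching
}

// Count Unicode code points instead of UTF-16 code units.
function codePointCount(text: string): number {
    let count = 0;
    for (let i = 0; i < text.length; i++) {
        const unit = text.charCodeAt(i);
        // A high surrogate followed by a low surrogate forms one code point.
        if (unit >= 0xd800 && unit <= 0xdbff && i + 1 < text.length) {
            const next = text.charCodeAt(i + 1);
            if (next >= 0xdc00 && next <= 0xdfff) {
                i++;
            }
        }
        count++;
    }
    return count;
}

// Sketch of a tokenizer that breaks on whitespace only.
// Emoji of any length stay inside a single token.
function whitespaceTokenizer(text: string): Token[] {
    const tokens: Token[] = [];
    const pattern = /\S+/g;
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(text)) !== null) {
        tokens.push({
            start: match.index,
            end: match.index + match[0].length - 1,
            text: match[0],
            normalized: match[0].toLowerCase(),
        });
    }
    return tokens;
}

// Flag of Wales: black flag + five tag characters + cancel tag.
const walesFlag = "\u{1F3F4}\u{E0067}\u{E0062}\u{E0077}\u{E006C}\u{E0073}\u{E007F}";
console.log(codePointCount(walesFlag), walesFlag.length);
console.log(whitespaceTokenizer("\u274C Cancel").map((t) => t.text));
```

The trade-off of this sketch is that punctuation attached to a word (for example "Cancel!") stays in the token, so it only suits choice lists where that is acceptable.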
Pushing to 4.7 due to resource constraints in 4.6.
@johnataylor - any update on this?
We are not planning enhanced emoji support in R9; moving to R10.
Closing and tracking this with the issue in botframework-sdk.
Versions
"botbuilder": "^4.3.4",
v10.14.0
OSX
Describe the bug
The choice prompt doesn't work with the ❌ emoji, whereas 💚, for example, does work in the same list of options.
To Reproduce
Expected behavior
Should work with every emoji
Additional context
Debugging a bit, I've found that the defaultTokenizer doesn't recognize the emoji and thinks it's a breaking char?
[bug]
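To illustrate how a tokenizer could end up treating ❌ but not 💚 as a breaking character, here is a rough sketch of a range-based check. The range table and the name `isBreakingCodePoint` are hypothetical and are not copied from the library; the point is only that a broad "symbols and punctuation" range would include U+274C (in the Dingbats block) while U+1F49A lies far outside it.

```typescript
// Hypothetical breaking-character ranges, for illustration only.
// The real tokenizer's table may differ.
const BREAKING_RANGES: Array<[number, number]> = [
    [0x0000, 0x002f], // ASCII controls, space, and leading punctuation
    [0x2000, 0x2bff], // General Punctuation through Miscellaneous Symbols and Arrows
];

// True when the code point falls inside any of the ranges above.
function isBreakingCodePoint(codePoint: number): boolean {
    return BREAKING_RANGES.some(([lo, hi]) => codePoint >= lo && codePoint <= hi);
}

console.log(isBreakingCodePoint(0x274c));  // cross mark
console.log(isBreakingCodePoint(0x1f49a)); // green heart
```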