Custom unicode handling for Android backspace via JNI to ICU #17960
Can you file an issue to fully explain the problem? This would help with both understanding this and for future archeology to figure out what was going on. We may want to consider using the emoji handling that is already included in the ICU library that the C++ engine imports. This would ensure that we are using a standardized solution to this problem and that we don't have repeated logic in Java and C++. We should explore the tradeoffs of a direct custom Java solution like yours versus a JNI call into C++ that uses ICU.
I started an emulator with Android 5.1 and wasn't able to delete characters with TextKeyListener. I tried using ICU for Java, but it requires Android API level 21. I saw that the Android source code still handles some newer emojis itself, alongside the ICU library, before updating to the latest version. How can I use the C++ engine modules from this Java code?
I am able to delete complex emojis like 👨👩👦 using the existing implementation with I'll defer to Gary on the C++ and performance questions. Thanks for opening this by the way! It may unblock my PR for Dpad navigation through emoji characters (#17420) if we can figure that out. |
I can reproduce the issue using an emulator running an API 21 (Lollipop) image. In that environment the I think the Could the The JNI-based alternative would involve adding a new JNI exported function to the engine that provides an API like |
I don't trust the Android APIs for this specific problem, since we have already run into problems with them. If we are going to rely on something to avoid implementing this ourselves, I prefer the JNI solution. BreakIterator iterates over whole graphemes, whereas we don't always want to delete a complete grapheme.
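To make the BreakIterator point concrete, here is a minimal sketch of finding the boundary that a grapheme-based backspace would delete back to. It uses the JDK's `java.text.BreakIterator` as an assumption; the class and method names are hypothetical and not taken from this PR, and how completely emoji sequences are recognized depends on the runtime's Unicode data.

```java
import java.text.BreakIterator;

public class GraphemeBoundarySketch {
    // Returns the offset of the character (grapheme) boundary that precedes
    // `offset`, or 0 if no earlier boundary exists.
    public static int previousBoundary(String text, int offset) {
        BreakIterator it = BreakIterator.getCharacterInstance();
        it.setText(text);
        int boundary = it.preceding(offset);
        return boundary == BreakIterator.DONE ? 0 : boundary;
    }

    public static void main(String[] args) {
        // 'a', then 'e' followed by U+0301 COMBINING ACUTE ACCENT.
        String text = "ae\u0301";
        // The iterator treats "e\u0301" as one unit, so a backspace based on
        // this boundary would remove both chars at once.
        System.out.println(previousBoundary(text, text.length()));
    }
}
```

A boundary-based delete always removes the whole unit, which is the limitation raised above: an editor may prefer to strip only the trailing accent or modifier, and that requires custom logic.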
So I'm using JNI and the ICU module from the native engine.
And I don't know why. Can you help me with this, please?
You should look at I am not familiar enough with JNI to be able to tell you what is immediately wrong just from looking, but in any case, to stay consistent with the rest of the engine, you should move this into FlutterJNI.java anyways. There should be plenty of example methods already implemented in there to work off of. |
I tried |
Hmm, which tests are you running exactly to encounter that? |
InputConnectionAdaptor last two tests. |
I posted a version of this that finds the character boundary by using the Can you try it and see if it works for your use case? |
public static final int LINE_FEED = 0x0A;
public static final int CARRIAGE_RETURN = 0x0D;
public static final int COMBINING_ENCLOSING_KEYCAP = 0x20E3;
public static final int CANCEL_TAG = 0xE007F;
I believe the Unicode spec refers to this as TAG_END; we should probably stick to the same naming scheme.
Jk, it looks like they do call this CANCEL_TAG in Android. This is fine, but I still think TAG_END may be a more appropriate name. Up to you if you want to change it. It just depends on whether you want to follow Android's or Unicode's lead.
I think it's better to follow the Android guideline because we are using it as the reference.
}

// Emoji Tag Sequence
if (codePoint == CANCEL_TAG) { // tag_base
Did you mean // tag_end?
That was a mistake. Fixed.
codePoint = Character.codePointBefore(text, lastOffset);
lastOffset -= Character.charCount(codePoint);
}
if (lastOffset == 0) { // tag_end not found. Just delete the end.
Did you mean tag_base?
fixed.
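For readers following the tag_base / tag_end discussion, here is a rough, self-contained sketch of backspacing over an emoji tag sequence (tag_base, one or more tag_spec characters, then tag_end, per UTS #51). It is not the code from this PR: the class name, `charsToDelete`, and the `isTagBase` stand-in are hypothetical, and a real implementation would use a full emoji property lookup instead of a single hard-coded base.

```java
public class TagSequenceSketch {
    static final int CANCEL_TAG = 0xE007F;     // tag_end in UTS #51
    static final int TAG_SPEC_FIRST = 0xE0020; // first tag_spec code point
    static final int TAG_SPEC_LAST = 0xE007E;  // last tag_spec code point

    // Hypothetical stand-in for a real emoji lookup: accepts only
    // U+1F3F4 WAVING BLACK FLAG, the base of the subdivision flags.
    static boolean isTagBase(int codePoint) {
        return codePoint == 0x1F3F4;
    }

    // Returns how many UTF-16 chars a backspace at `offset` should delete.
    public static int charsToDelete(CharSequence text, int offset) {
        if (offset <= 0) {
            return 0;
        }
        int last = Character.codePointBefore(text, offset);
        int lastLen = Character.charCount(last);
        if (last != CANCEL_TAG) {
            return lastLen; // not a tag sequence; delete one code point
        }
        // Walk backward over the tag_spec characters.
        int pos = offset - lastLen;
        int specCount = 0;
        while (pos > 0) {
            int cp = Character.codePointBefore(text, pos);
            if (cp < TAG_SPEC_FIRST || cp > TAG_SPEC_LAST) {
                break;
            }
            pos -= Character.charCount(cp);
            specCount++;
        }
        // A valid sequence needs at least one tag_spec and a tag_base.
        if (specCount > 0 && pos > 0) {
            int base = Character.codePointBefore(text, pos);
            if (isTagBase(base)) {
                return offset - (pos - Character.charCount(base));
            }
        }
        // tag_base not found: delete only the trailing tag_end.
        return lastLen;
    }
}
```

With this sketch, the England flag (U+1F3F4 followed by the tags for "gbeng" and U+E007F) is removed as a single unit, while a stray U+E007F with no base before it removes only itself.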
}
return offset - deleteCharCount;
}
Since you have unrolled the state machine, can you add some notation here indicating that the following few if statements are meant to act like a fall through and do not always return like the if statements above?
The separation will make the code much more parse-able.
Done
private static final String SAMPLE_TEXT =
    "Lorem ipsum dolor sit amet," + "\nconsectetur adipiscing elit.";

private static final String SAMPLE_EMOJI_TEXT =
You should add some additional edge cases, including handling of invalid sequences. Invalid sequences may be generated programmatically, and we should make sure we can handle/recover from these cases.
Done
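As one illustration of the invalid sequences mentioned above, text built programmatically can contain unpaired surrogates. The sketch below, with hypothetical names that do not come from this PR, shows that stepping back with `Character.codePointBefore` and `Character.charCount` stays well defined in that case: an unpaired surrogate is returned as a single-char code point, so the step never overshoots.

```java
public class SafeBackstepSketch {
    // Returns the offset one code point before `offset`, clamped at 0.
    public static int stepBack(CharSequence text, int offset) {
        if (offset <= 0) {
            return 0;
        }
        int codePoint = Character.codePointBefore(text, offset);
        return offset - Character.charCount(codePoint);
    }

    public static void main(String[] args) {
        String valid = "a\uD83D\uDE00"; // 'a' + U+1F600 as a surrogate pair
        String loneLow = "a\uDE00";     // 'a' + unpaired low surrogate
        String loneHigh = "a\uD83D";    // 'a' + unpaired high surrogate

        // Each call steps back to offset 1, leaving the 'a' untouched.
        System.out.println(stepBack(valid, valid.length()));
        System.out.println(stepBack(loneLow, loneLow.length()));
        System.out.println(stepBack(loneHigh, loneHigh.length()));
    }
}
```

Cases like these are cheap to add to a test string alongside well-formed emoji.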
Implementation roughly LGTM, will take another look during final pass.
I handled invalid sequences and unknown cases in the latest changes.
Did some manual testing with this, and it seems to be working well. LGTM, I'll see if @jason-simmons has any final thoughts.
The Android backspace key is currently handled by TextKeyListener from the native Android API, as introduced in my older pull request #17393.
I found that this function has problems on lower Android API levels, and I also think Flutter needs its own way to handle this functionality.
So I started scraping emoji characters from http://www.unicode.org/reports/tr51/#emoji_data with a Python script and handled emojis based on the Unicode documentation.
The utility I wrote supports all characters in Emoji 13.0.
I also handled some states that were not handled in the Android source code.