Custom unicode handling for Android backspace via JNI to ICU #17960

DeMonkeyCoder · 2020-04-26T12:25:25Z

The Android backspace key is currently handled by TextKeyListener from the native Android API, as of my older pull request #17393.
I found that this function has problems on lower Android API levels, and I also think Flutter needs a way to handle this functionality by itself.
So I started scraping emoji characters from http://www.unicode.org/reports/tr51/#emoji_data with a Python script and handled emoji based on the Unicode documentation.
The utility I wrote supports all characters in Emoji 13.0.

I also handled some states that the Android source code does not handle.

GaryQian · 2020-04-28T00:21:19Z

Can you file an issue to fully explain the problem? This would help with both understanding this and for future archeology to figure out what was going on.

We may want to consider using the emoji handling that is already included in the ICU library that the C++ engine imports. This would ensure that we use a standardized solution to this problem and don't duplicate logic in Java and C++. We should explore the tradeoffs of a direct custom Java solution like yours versus a solution that calls through JNI into C++ to use ICU.

DeMonkeyCoder · 2020-04-28T06:06:53Z

I started an emulator with Android 5.1 and wasn't able to delete characters with TextKeyListener.

I tried using ICU for Java, but it requires Android API level 21.
http://site.icu-project.org/download/66#TOC-ICU4J-Download

I saw that the Android source code still handles some new emoji itself, alongside the ICU library, until ICU is updated to the latest version.
https://android.googlesource.com/platform/frameworks/base/+/master/core/java/android/text/Emoji.java#75

How can I use C++ engine modules inside this Java code?
Also, how can I compare the performance of the two solutions (JNI versus my Java code)?
@GaryQian

justinmc · 2020-04-28T23:59:16Z

I am able to delete complex emojis like 👨‍👩‍👦 using the existing implementation with TextKeyListener.getInstance().onKeyDown. I'm using an Android Pixel 3 emulator with API 28.
Can you post specific steps to reproduce? Maybe open an issue in the framework repository like Gary said, or find an existing issue that covers it.

I'll defer to Gary on the C++ and performance questions.

Thanks for opening this by the way! It may unblock my PR for Dpad navigation through emoji characters (#17420) if we can figure that out.

jason-simmons · 2020-04-29T00:49:17Z

I can reproduce the issue using an emulator running an API 21 (Lollipop) image. In that environment the TextKeyListener.onKeyDown based implementation does not remove any text when the delete key is pressed.

I think the getOffsetBefore approach is a good improvement, but I'd like to avoid reimplementing a grapheme cluster breaker in the Android embedding.

Could the InputConnectionAdaptor call Android's android.text.TextUtils.getOffsetBefore on older platforms and use the android.icu.text.BreakIterator on platforms that offer it (API 24 and later)?
TextUtils.getOffsetBefore will not handle the latest emoji, but it should provide accuracy equivalent to native text fields on the Android versions where it would be used.

The JNI-based alternative would involve adding a new JNI exported function to the engine that provides an API like getOffsetBefore. The native implementation would use the grapheme breaker in the ICU C++ library that is already linked into the Flutter engine. I'd prefer to reuse that if there is no sufficiently accurate breaker that is already available to Java.
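A minimal sketch of the getOffsetBefore-style API discussed above, built on a break iterator. It uses java.text.BreakIterator from the JDK purely as a stand-in so the example runs outside Android; on a device the comment suggests android.icu.text.BreakIterator (API 24 and later) or the engine's ICU through JNI, and their cluster rules may differ from the JDK's. The class and method names here are hypothetical.

```java
import java.text.BreakIterator;

final class OffsetBeforeSketch {
  /**
   * Returns the index of the character boundary that precedes {@code offset},
   * or 0 when there is no earlier boundary.
   */
  static int getOffsetBefore(CharSequence text, int offset) {
    if (offset <= 0) {
      return 0;
    }
    // Stand-in for an ICU break iterator; this is an assumption of the sketch.
    BreakIterator iterator = BreakIterator.getCharacterInstance();
    iterator.setText(text.toString());
    int previous = iterator.preceding(Math.min(offset, text.length()));
    return previous == BreakIterator.DONE ? 0 : previous;
  }

  public static void main(String[] args) {
    String text = "abce\u0301"; // "abc" followed by e + combining acute accent
    System.out.println(getOffsetBefore(text, text.length()));
  }
}
```

A backspace handler would then delete the range between the returned offset and the cursor.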

DeMonkeyCoder · 2020-04-29T05:11:56Z

I don't trust the Android APIs for this specific problem, since we have already run into issues with them. If we are going to reuse something to avoid reimplementing this, I prefer the JNI solution.
Working on it.

BreakIterator iterates over the graphemes while we don't always want to delete a complete grapheme.
Consider سً which is a single grapheme in Persian or Arabic locale (with zero width joiner I think) but after a backspace, should be converted to ‌س and shouldn't be completely deleted.
I'm working on bringing Emoji handlers from the native icu module.
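To illustrate the point about not always deleting a whole grapheme, here is a small sketch that removes only the last code point. For the example above (the letter U+0633 followed by the combining mark U+064B), a grapheme-based delete would remove both, while this removes only the mark. The helper name is hypothetical, and this is not the logic the PR ships, which also has to treat emoji sequences as a unit.

```java
final class CodePointBackspaceSketch {
  /** Removes the last Unicode code point of {@code text}, if any. */
  static String deleteLastCodePoint(String text) {
    if (text.isEmpty()) {
      return text;
    }
    // Step back one code point so surrogate pairs are removed as a whole.
    int start = text.offsetByCodePoints(text.length(), -1);
    return text.substring(0, start);
  }

  public static void main(String[] args) {
    // U+0633 (seen) followed by U+064B (fathatan).
    String withMark = "\u0633\u064B";
    System.out.println(deleteLastCodePoint(withMark).length());
  }
}
```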

DeMonkeyCoder · 2020-04-29T11:00:07Z

So I'm using JNI and icu module from the native engine.
I tested on emulators and code worked fine but in flutter tests I got:

java.lang.UnsatisfiedLinkError: io.flutter.plugin.editing.EmojiUtils.nativeEmojiUtilsIsRegionalIndicator(I)Z
	at io.flutter.plugin.editing.EmojiUtils.nativeEmojiUtilsIsRegionalIndicator(Native Method)

And I don't know why. Can you help me with this, please?

GaryQian · 2020-04-29T12:24:30Z

You should look at FlutterJNI.java. We currently implement all JNI methods in there; an explanation is given in the docs at the top.

I am not familiar enough with JNI to be able to tell you what is immediately wrong just from looking, but in any case, to stay consistent with the rest of the engine, you should move this into FlutterJNI.java anyways. There should be plenty of example methods already implemented in there to work off of.

DeMonkeyCoder · 2020-04-29T14:51:41Z

I tried going through a FlutterJNI instance and still get the same error.
But it works normally on emulators.

GaryQian · 2020-04-29T19:09:20Z

Hmm, which tests are you running exactly to encounter that?

DeMonkeyCoder · 2020-04-29T19:46:10Z

The last two InputConnectionAdaptor tests.
It is also failing with the same error in Cirrus CI Linux Android Debug Engine.
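The thread does not state the cause of this error. One explanation consistent with the symptom (works on emulators, fails in tests and CI) is that the test process never loads the engine's native library, so the native method has no implementation bound to it. The sketch below reproduces that failure mode in plain Java and falls back to a Java check. The class and method names are hypothetical, and this is an assumption about the cause, not what the PR did.

```java
final class NativeLookupSketch {
  // Hypothetical native method: no library in this sketch registers it,
  // so calling it on a plain JVM throws UnsatisfiedLinkError.
  private static native boolean nativeIsRegionalIndicator(int codePoint);

  /** Tries the native check first and falls back to a pure-Java range test. */
  static boolean isRegionalIndicator(int codePoint) {
    try {
      return nativeIsRegionalIndicator(codePoint);
    } catch (UnsatisfiedLinkError e) {
      // Regional indicator symbols occupy U+1F1E6..U+1F1FF.
      return codePoint >= 0x1F1E6 && codePoint <= 0x1F1FF;
    }
  }

  public static void main(String[] args) {
    System.out.println(isRegionalIndicator(0x1F1E6));
  }
}
```

In a test suite, an alternative design is to route native calls through one object (as suggested above with FlutterJNI) so tests can substitute a fake for it.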

jason-simmons · 2020-04-29T22:03:22Z

I posted a version of this that finds the character boundary by using the icu::BreakIterator through JNI:
#18041

Can you try it and see if it works for your use case?

GaryQian · 2020-05-01T09:36:55Z

shell/platform/android/io/flutter/plugin/editing/FlutterTextUtils.java

+  public static final int LINE_FEED = 0x0A;
+  public static final int CARRIAGE_RETURN = 0x0D;
+  public static final int COMBINING_ENCLOSING_KEYCAP = 0x20E3;
+  public static final int CANCEL_TAG = 0xE007F;


I believe the Unicode spec refers to this as TAG_END, should probably stick to same naming scheme

http://unicode.org/reports/tr51/#def_emoji_tag_sequence

Jk, it looks like they do call this CANCEL_TAG in Android, this is fine, but I still think TAG_END may be a more appropriate name. Up to you if you want to change it. Just depends on if you want to follow Android or Unicode's lead.

I think it's better to follow the Android guideline because we are using it as the reference.

GaryQian · 2020-05-01T09:40:28Z

shell/platform/android/io/flutter/plugin/editing/FlutterTextUtils.java

+    }
+
+    // Emoji Tag Sequence
+    if (codePoint == CANCEL_TAG) { // tag_base


did you mean // tag_end?

was a mistake. fixed.

GaryQian · 2020-05-01T09:42:10Z

shell/platform/android/io/flutter/plugin/editing/FlutterTextUtils.java

+        codePoint = Character.codePointBefore(text, lastOffset);
+        lastOffset -= Character.charCount(codePoint);
+      }
+      if (lastOffset == 0) { // tag_end not found. Just delete the end.


did you mean tag_base?
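For readers following the tag_base / tag_end naming, here is a sketch of a backward scan over an emoji tag sequence as defined in UTS #51 (tag_base, then one or more tag_spec code points in U+E0020..U+E007E, then tag_end U+E007F). It is not the PR's code. It assumes any non-tag code point before the tag_spec run is an acceptable tag_base, whereas a real implementation would check that it is an emoji. The class and method names are hypothetical.

```java
final class TagSequenceSketch {
  static final int CANCEL_TAG = 0xE007F; // tag_end in UTS #51 terms

  static boolean isTagSpec(int codePoint) {
    return codePoint >= 0xE0020 && codePoint <= 0xE007E;
  }

  /** Returns the offset that a backspace at {@code offset} should delete back to. */
  static int offsetBeforeTagSequence(CharSequence text, int offset) {
    if (offset <= 0) {
      return 0;
    }
    int codePoint = Character.codePointBefore(text, offset);
    int lastOffset = offset - Character.charCount(codePoint);
    if (codePoint != CANCEL_TAG) {
      return lastOffset; // not a tag sequence: delete one code point
    }
    int tagEndStart = lastOffset;
    boolean sawTagSpec = false;
    // Walk backwards over the tag_spec run.
    while (lastOffset > 0) {
      codePoint = Character.codePointBefore(text, lastOffset);
      if (!isTagSpec(codePoint)) {
        break;
      }
      lastOffset -= Character.charCount(codePoint);
      sawTagSpec = true;
    }
    if (!sawTagSpec || lastOffset == 0) {
      return tagEndStart; // tag_base not found: delete only the tag_end
    }
    // codePoint now holds the candidate tag_base; delete it with its tags.
    return lastOffset - Character.charCount(codePoint);
  }
}
```

With the England flag (U+1F3F4 followed by the tags for "gbeng" and U+E007F), a backspace at the end of the string deletes the whole sequence, while a stray U+E007F after ordinary text deletes only that code point.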

GaryQian · 2020-05-01T09:58:35Z

shell/platform/android/io/flutter/plugin/editing/FlutterTextUtils.java

+      }
+      return offset - deleteCharCount;
+    }
+


Since you have unrolled the state machine, can you add some notation here indicating that the following few if statements are meant to act like a fall through and do not always return like the if statements above?

The separation will make the code much more parse-able.

GaryQian · 2020-05-01T10:52:25Z

shell/platform/android/test/io/flutter/plugin/editing/InputConnectionAdaptorTest.java

  private static final String SAMPLE_TEXT =
      "Lorem ipsum dolor sit amet," + "\nconsectetur adipiscing elit.";

+  private static final String SAMPLE_EMOJI_TEXT =


You should add some additional edge cases, including handling of invalid sequences. Invalid sequences may be generated programmatically, and we should make sure we can handle/recover from these cases.
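One example of an invalid sequence of the kind mentioned above is an unpaired regional indicator. Flags are pairs of regional indicators, paired left to right within a run, so a run of odd length ends in a lone indicator. The sketch below deletes a full pair when the run before the cursor has even length and a single indicator otherwise. It is an illustration with hypothetical names, not the PR's implementation or its tests.

```java
final class RegionalIndicatorSketch {
  static boolean isRegionalIndicator(int codePoint) {
    return codePoint >= 0x1F1E6 && codePoint <= 0x1F1FF;
  }

  /** Returns the offset that a backspace at {@code offset} should delete back to. */
  static int offsetBeforeFlag(CharSequence text, int offset) {
    if (offset <= 0) {
      return 0;
    }
    int codePoint = Character.codePointBefore(text, offset);
    int charCount = Character.charCount(codePoint);
    if (!isRegionalIndicator(codePoint)) {
      return offset - charCount; // ordinary code point
    }
    // Count the run of consecutive regional indicators ending at offset.
    int runLength = 0;
    int position = offset;
    while (position > 0) {
      int current = Character.codePointBefore(text, position);
      if (!isRegionalIndicator(current)) {
        break;
      }
      position -= Character.charCount(current);
      runLength++;
    }
    // Even run: the last two indicators form a flag. Odd run: the last one is unpaired.
    int indicatorsToDelete = (runLength % 2 == 0) ? 2 : 1;
    return offset - indicatorsToDelete * charCount;
  }
}
```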

GaryQian · 2020-05-01T11:09:53Z

Implementation roughly LGTM, will take another look during final pass.

DeMonkeyCoder · 2020-05-01T13:54:11Z

I handled invalid sequences and unknown cases in the latest changes.

GaryQian

Did some manual testing with this, and seems to be working well. LGTM, I'll see if @jason-simmons has any final thoughts.

googlebot added the cla: yes label Apr 26, 2020

auto-assign bot requested a review from GaryQian April 26, 2020 12:27

DeMonkeyCoder force-pushed the android-backspace branch from c33eef7 to 0aed344 Compare April 26, 2020 12:38

DeMonkeyCoder mentioned this pull request Apr 26, 2020

Make DPAD movement consider grapheme clusters #17420

Merged

DeMonkeyCoder force-pushed the android-backspace branch from 0aed344 to 0929649 Compare April 26, 2020 12:45

DeMonkeyCoder changed the title ~~Handle android backspace by handling last unicode characters~~ Handle android backspace by handling the last Unicode characters Apr 26, 2020

DeMonkeyCoder force-pushed the android-backspace branch 5 times, most recently from ac6e8e8 to 5718c22 Compare April 26, 2020 14:10

DeMonkeyCoder force-pushed the android-backspace branch from 0602667 to e3dd4d4 Compare April 29, 2020 10:52

DeMonkeyCoder force-pushed the android-backspace branch 3 times, most recently from 4078a0b to 349f694 Compare April 29, 2020 14:51

DeMonkeyCoder mentioned this pull request Apr 30, 2020

Handle backspace on Android by using the engine's ICU grapheme breaker #18041

Closed

DeMonkeyCoder force-pushed the android-backspace branch 3 times, most recently from 123bb09 to 228d308 Compare April 30, 2020 06:18

DeMonkeyCoder force-pushed the android-backspace branch from 7694a55 to a2ce047 Compare May 1, 2020 06:03

GaryQian reviewed May 1, 2020

DeMonkeyCoder force-pushed the android-backspace branch 2 times, most recently from 3af0e4e to 23f37cb Compare May 1, 2020 13:53

DeMonkeyCoder requested a review from GaryQian May 1, 2020 13:53

InputConnectionAdaptor cleanup and test

9bf671b

DeMonkeyCoder force-pushed the android-backspace branch from 23f37cb to 9bf671b Compare May 1, 2020 13:55

GaryQian mentioned this pull request May 1, 2020

RTL indicator back to start after clicking space flutter/flutter#22627

Closed

DeMonkeyCoder mentioned this pull request May 2, 2020

Backspace/delete not working at word boundary after mixing RTL/LTR text flutter/flutter#56149

Closed

Merge branch 'master' into android-backspace

d853f9f

DeMonkeyCoder force-pushed the android-backspace branch from cab91a4 to d853f9f Compare May 2, 2020 06:50

GaryQian assigned askeksa-google May 5, 2020

GaryQian approved these changes May 5, 2020

jason-simmons approved these changes May 6, 2020

askeksa-google assigned GaryQian and jason-simmons and unassigned askeksa-google May 6, 2020

GaryQian added the waiting for tree to go green This PR is approved and tested, but waiting for the tree to be green to land. label May 6, 2020

GaryQian changed the title ~~Handle android backspace by handling the last Unicode characters~~ Custom unicode handling for Android backspace via JNI to ICU May 6, 2020

fluttergithubbot merged commit 9cdb5a9 into flutter:master May 6, 2020

engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request May 6, 2020

9cdb5a9 Custom unicode handling for Android backspace via JNI to ICU (f…

71f8108

…lutter/engine#17960)

engine-flutter-autoroll mentioned this pull request May 6, 2020

Roll engine d6aa099de7c7..419afe38c970 (3 commits) flutter/flutter#56498

Closed

engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request May 7, 2020

9cdb5a9 Custom unicode handling for Android backspace via JNI to ICU (f…

a3701fa

…lutter/engine#17960)

engine-flutter-autoroll mentioned this pull request May 7, 2020

Roll engine d6aa099de7c7..3953c3ccd15a (4 commits) flutter/flutter#56506

Merged
