Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] IME support #5777

Open
JoelEinbinder opened this issue Mar 10, 2021 · 25 comments
Open

[Feature] IME support #5777

JoelEinbinder opened this issue Mar 10, 2021 · 25 comments

Comments

@JoelEinbinder
Copy link
Contributor

Right now our IME support is limited. We insert text, but do not allow for testing complicated scenarios with composition events.

The Web api: https://developer.mozilla.org/en-US/docs/Web/API/CompositionEvent
The Mac api: https://developer.apple.com/documentation/appkit/nstextinputclient?language=objc
The Chromium mojo api: https://source.chromium.org/chromium/chromium/src/+/master:third_party/blink/public/mojom/input/input_handler.mojom

This request comes from @trueadm on twitter: https://twitter.com/trueadm/status/1369332999242792967

My initial api proposal:

await page.keyboard.startComposition(start, end); // optional start and end
await page.keyboard.updateComposition('foo');
await page.keyboard.updateComposition('bar');
await page.keyboard.press('A'); // what should this do? Send keyboard events without generating text?
await page.keyboard.endComposition('baz'); // optional ending text

The only time I've seen composition events used in the wild is in CodeMirror to draw a synthetic composition highlight. It would be helpful to have some better use cases here to make sure that the api handles them without growing too large.

  • Does the composition state need to be inspected, or just controlled? That is, do we need keyboard.compositionRange()?
  • Do we instead need a high level composition api. Like keyboard.typeWithComposition("è") types AltGr+` followed by e?
  • Do we need be able to trigger composition via keys. Right now our keyboard api largely ignores dead keys that would start composition.
@JoelEinbinder JoelEinbinder self-assigned this Mar 10, 2021
@trueadm
Copy link

trueadm commented Mar 10, 2021

Thanks for taking the time to write this feature up @JoelEinbinder. Testing the complexities around IME is really important to us at Facebook, both for our internal projects and also our open-source projects.

Your initial proposal looks good, however, I'd maybe opt for this not to be on page.keyboard as composition can come from sources that aren't keyboard related (software keyboards and accessibility tools). Maybe page.composition is a better fit. Also I believe we'd want to ensure that selection changes are reflected in composition updates, as typing in languages like Hiragana can cause selection to move around during updates.

For keyboard presses, or keydown/keyup, we'd still want to trigger those events, but they shouldn't affect the output of the DOM or of selection. It's common to have ArrowDown/ArrowRight key presses occur during composition, but these shouldn't cause their default behaviors to occur.

Does the composition state need to be inspected, or just controlled? That is, do we need keyboard.compositionRange()?

I believe we just want to control it. I think we can do some investigation work around this after we have a MVP and see from that how accurately we can model test cases around user workflows.

Do we instead need a high level composition api. Like keyboard.typeWithComposition("è") types AltGr+` followed by e?

I think we should be careful here. As this behavior can change depending on keyboard layout if I'm not mistaken, which makes tests harder to properly emulate. However, I might be wrong, maybe this is something we should consider.

Do we need be able to trigger composition via keys. Right now our keyboard api largely ignores dead keys that would start composition.

Dead keys aren't always fired, so I don't think we need to worry too much about them. For example, on a Mac, holding down e brings up the IME to add an accent over the e but this doesn't fire a dead key AFAIK. Also the combinations needed to invoke composition can be complex depending on the OS and the language pack in use.

@JoelEinbinder
Copy link
Contributor Author

Also I believe we'd want to ensure that selection changes are reflected in composition updates, as typing in languages like Hiragana can cause selection to move around during updates.

The cursor should change on updateComposition based on the length of the text there. I'll do some testing with CJK to see if they end up separately triggering selection updates.

Do we instead need a high level composition api. Like keyboard.typeWithComposition("è") types AltGr+` followed by e?

I think we should be careful here. As this behavior can change depending on keyboard layout if I'm not mistaken, which makes tests harder to properly emulate. However, I might be wrong, maybe this is something we should consider.

Its definitely layout specific. We currently have the US keyboard layout mapped out. Adding dead keys and other layouts is not too hard, but it sounds like you don't need this.

For example, on a Mac, holding down e brings up the IME to add an accent over the e but this doesn't fire a dead key AFAIK.

It doesn't. It also doesn't fire composition start/end, but rather does a text replacement for the character before the cursor. The web page sees the initial keydown/input, a bunch of repeated key downs without text as the e key is held, a keydown for the digit 1 to select the variation, and then an input event with the new text.
Is the complexity here important? Right now we would just fire the final input event with è. This is similar to what you'd get on an iOS keyboard.

Can you give an example of where a component breaks with IME? You have to be doing something a little bit strange for composition events to be relevant, so I want to make sure I understand what we are testing for.

@trueadm
Copy link

trueadm commented Mar 10, 2021

It doesn't. It also doesn't fire composition start/end, but rather does a text replacement for the character before the cursor. The web page sees the initial keydown/input, a bunch of repeated key downs without text as the e key is held, a keydown for the digit 1 to select the variation, and then an input event with the new text.
Is the complexity here important? Right now we would just fire the final input event with è. This is similar to what you'd get on an iOS keyboard.

Can you give an example of where a component breaks with IME? You have to be doing something a little bit strange for composition events to be relevant, so I want to make sure I understand what we are testing for.

If I hold down e the IME appears, if I press RightArrow, it enters composition. This is an important bug, because this is broken in DraftJS right now (see FB.com or Twitter). We've fixed it in a new editor we're working on, but bugs like this are why we need good IME regression tests.

Another interesting thing is that when you use a software keyboard with FF, it uses composition for insertions. On Safari and Chrome it does not, it just does text replacement. These edge cases are where the bulk of bugs we see at FB stem from.

Another issue we found was around Japanese/Korean IME input. For years the last character of a word entered via IME on Draft meant the last character went missing! Don't get me started on Android IME issues too. :P

I'd be happy to setup a video call at some point if you wanted to talk about this and dive into more specifics.

@JoelEinbinder
Copy link
Contributor Author

Well if you are building a rich text editor or a code editor, that's a problem I can wrap my head around.

I think the lower level api makes a lot of sense, because anyone building an editor will have enough expertise to send the exact events they need rather than relying on Playwright to make an educated guess based on browser/platform/language.

No promises on time frame, but I'll make a prototype in webkit and see what the api looks like.

@trueadm
Copy link

trueadm commented Mar 10, 2021

@JoelEinbinder That's awesome. Thank you again for this. Another interesting idea might be if there was a way to record the native user invoked events in the form of Playwright API commands. We can obviously do a hacky form of this today by adding capture listeners to a page and recording all the events, and convert them into Playwright tests, but we want to distinguish between events that the user makes vs that from a library or framework.

@JoelEinbinder
Copy link
Contributor Author

Check out Playwright codegen.

I'd make it work with IME after adding this api.

@trueadm
Copy link

trueadm commented Mar 10, 2021

@JoelEinbinder It would also be good if we had a way of selecting a range of text with Playwright. Something like:

await page.composition.setSelection(
  'div.editor',
  anchorPath,
  anchorOffset,
  focusPath,
  focusOffset,
);

Where the paths are the DOM path from the given element selector (i.e. [0, 1, 0]). I realize you can kind of do this with `page.evaluate` but it's good to get a guarantee that selection was applied before continuing (as it's async). Otherwise things listening to `selectionchange` may not correctly get triggered.

This is useful when emulating how software keyboards change selection and insert text (without triggering key events).

@steveluscher
Copy link

Here are some visuals for the folks following along.

Long-press diacritic composition on Mac OS

Hold down the ‘e’ key then press TAB, Left/RightArrow, or 1–7 to select a diacritic.

Screen.Recording.2021-03-10.at.12.13.48.PM.mov

Long-press diacritic composition on Twitter in Chrome

Same as above. Broken, as you can see – you get an unintended ‘e’.

Screen.Recording.2021-03-10.at.12.15.09.PM.mov

Hiragana keyboard input on Mac OS

Use the Japanese – Romaji keyboard then type ‘watashi’ followed by TAB several times.

Screen.Recording.2021-03-10.at.11.41.09.AM.mov

Hiragana keyboard input on Facebook

Same as above, in the Facebook post composer. Broken – pressing TAB commits the composition instead of moving you to the next selection in the candidate window.

Screen.Recording.2021-03-10.at.11.39.48.AM.mov

@trueadm
Copy link

trueadm commented Mar 23, 2021

@JoelEinbinder I've been use Playwright internally for our new text editor and the experience has been highly positive. I was wondering if you had any updates, or if there was any way we could help contribute towards this feature, even if it's to get it into an experimental form?

@JoelEinbinder
Copy link
Contributor Author

JoelEinbinder commented Mar 23, 2021 via email

@trueadm
Copy link

trueadm commented Mar 23, 2021

@JoelEinbinder This is an awesome update! Do you have a working branch, so I can see how you've done this? I'm really interested in getting a better understanding of what happens under the hood.

In terms of replicating what Draft does in a unit test might be tricky, as the project is largely unmaintained.

However, the cause is likely to do with the fact that Draft doesn't apply the same selection that occurs during compositionstart during compositionend where it gets the event.data and inserts the text. That's how we've tackled this problem with our new editor we're working on at least – we save the selection and re-apply it before inserting the text, so it replaces the existing content. So in terms of a test, it would likely be as trivial to create once I know the API calls for how you've implemented this feature.

@trueadm
Copy link

trueadm commented Apr 6, 2021

@JoelEinbinder Quickly catching up. Do you have any updates? Is there anything you need from my side of things? Can I help anywhere too?

@JoelEinbinder
Copy link
Contributor Author

Chromium patch in review: https://chromium-review.googlesource.com/c/chromium/src/+/2764592

@aslushnikov aslushnikov added v1.13 and removed v1.12 labels May 20, 2021
@mxschmitt mxschmitt added this to the v1.13.0 milestone Jun 14, 2021
@mxschmitt mxschmitt removed the v1.13 label Jun 14, 2021
@prontiol
Copy link

prontiol commented Jun 15, 2021

Yay! v1.13.0 sounds promising!
@JoelEinbinder Any updates? Is there a branch we can test it out before the release?

@trueadm
Copy link

trueadm commented Oct 14, 2021

@mxschmitt @JoelEinbinder are there any updates regarding this feature? Is there anything blocked that we can help with from our side?

@luin
Copy link
Contributor

luin commented Jun 20, 2023

Hi there 👋 I'm currently working on the next version of Quill, and IME support is a crucial component as many bugs have arisen on the Quill repo due to IME. Would like to see if there is anything I can do to help to move this forward.

@ruifigueira
Copy link
Contributor

ruifigueira commented Jul 26, 2023

I was just looking into this ticket (I've been working on a related one, #24249).

CDP currently has an Input.imeSetComposition which seems to work, at least partially:

test('ime', async ({ page }) => {
  await page.goto('https://w3c.github.io/uievents/tools/key-event-viewer.html');
  const client = await page.context().newCDPSession(page);
  await page.locator('#input').focus();
  await client.send('Input.imeSetComposition', {
    selectionStart: -1,
    selectionEnd: -1,
    text: '😂'
  });
  // to cancel
  await client.send('Input.imeSetComposition', {
    selectionStart: -1,
    selectionEnd: -1,
    text: ''
  });
})

image

Cancel the composition is supported, but I'm trying to figure out how to commit it. In their docs, they state:

Use imeCommitComposition to commit the final text.

I'm assuming that imeCommitComposition is (or should be) a CDP method, but I can't find it, or any reference in chromium source code other than that comment.

@ruifigueira
Copy link
Contributor

ruifigueira commented Jul 26, 2023

I look a little bit into chromium code, and it seems that the trick to commit is to ensure that this call is made:

https://github.com/chromium/chromium/blob/f32e479557fb27f8e9dc993bddd22c7a426f5f5e/content/browser/devtools/protocol/input_handler.cc#L1181-L1184

And that's what Input.insertText CDP method actually does.

So, I tried to use it and it apparently is triggering very realistic events:

test('ime', async ({ page }) => {
  await page.goto('https://w3c.github.io/uievents/tools/key-event-viewer.html');
  const client = await page.context().newCDPSession(page);
  await page.locator('#input').focus();
  await client.send('Input.imeSetComposition', {
    selectionStart: -1,
    selectionEnd: -1,
    text: '😂😂',
  });
  await client.send('Input.imeSetComposition', {
    selectionStart: 1,
    selectionEnd: 2,
    text: '😭',
  });
  // to commit
  await client.send('Input.insertText', {
    text: '😂😭',
  });
});

image

@luin
Copy link
Contributor

luin commented Sep 19, 2023

@ruifigueira Thanks for the info! The CDP approach works good and we've already used it to cover some regressions 👍. Hope it would work on other browsers, especially on Safari as most IME bugs that we encountered happened on Safari 😄

@ruifigueira
Copy link
Contributor

@luin, if you really, really need to have compositionupdate events with webkit browser, this hack will work:

import { test, expect } from '@playwright/test';

test('test', async ({ page, playwright, browserName }) => {
  test.skip(browserName !== 'webkit');
  await page.goto('https://w3c.github.io/uievents/tools/key-event-viewer.html');

  // hack to get WKSession
  const session = (await (playwright as any)._toImpl(page))._delegate._session;
  
  // triggers a compositionstart and compositionupdate. Available parameters:
  // https://github.com/microsoft/playwright/blob/175ae09fefb971236e984a20667cd04a52aa29d0/packages/playwright-core/src/server/webkit/protocol.d.ts#L7151-L7157
  await session.send('Page.setComposition', { text: 'example', selectionStart: 0, selectionLength: 0 });
  
  await expect(page.locator('#output span').filter({ hasText: 'compositionstart' })).toHaveCount(1);
  await expect(page.locator('#output span').filter({ hasText: 'compositionupdate' })).toHaveCount(1);
  await expect(page.locator('#output span').filter({ hasText: 'compositionend' })).toHaveCount(0);
  await expect(page.locator("#input")).toHaveValue('example');
  
  await page.pause();
  
  // commits (triggers compositionend)
  await page.keyboard.insertText("example commited");
  await expect(page.locator('#output span').filter({ hasText: 'compositionend' })).toHaveCount(1);
  
  await page.pause();
});

You'll need to launch tests with npx playwright test --project webkit. It won't run with playwright extension on VSCode because it reuses browser context.

@luin
Copy link
Contributor

luin commented Nov 15, 2023

@ruifigueira Thanks for the insight! This approach works well and I've already used it in Quill repo. It successfully covers all the IME issues we've encountered.

It'd be awesome to see something similar get implemented officially

@ruifigueira
Copy link
Contributor

ruifigueira commented Nov 15, 2023

I think that IME in chrome and webkit could be easily implemented. The major problem is with firefox, which requires changes in juggler. There's a PR #9921 with some work on that but it's incomplete

@gurschitz
Copy link

@luin, if you really, really need to have compositionupdate events with webkit browser, this hack will work:

import { test, expect } from '@playwright/test';

test('test', async ({ page, playwright, browserName }) => {
  test.skip(browserName !== 'webkit');
  await page.goto('https://w3c.github.io/uievents/tools/key-event-viewer.html');

  // hack to get WKSession
  const session = (await (playwright as any)._toImpl(page))._delegate._session;
  
  // triggers a compositionstart and compositionupdate. Available parameters:
  // https://github.com/microsoft/playwright/blob/175ae09fefb971236e984a20667cd04a52aa29d0/packages/playwright-core/src/server/webkit/protocol.d.ts#L7151-L7157
  await session.send('Page.setComposition', { text: 'example', selectionStart: 0, selectionLength: 0 });
  
  await expect(page.locator('#output span').filter({ hasText: 'compositionstart' })).toHaveCount(1);
  await expect(page.locator('#output span').filter({ hasText: 'compositionupdate' })).toHaveCount(1);
  await expect(page.locator('#output span').filter({ hasText: 'compositionend' })).toHaveCount(0);
  await expect(page.locator("#input")).toHaveValue('example');
  
  await page.pause();
  
  // commits (triggers compositionend)
  await page.keyboard.insertText("example commited");
  await expect(page.locator('#output span').filter({ hasText: 'compositionend' })).toHaveCount(1);
  
  await page.pause();
});

You'll need to launch tests with npx playwright test --project webkit. It won't run with playwright extension on VSCode because it reuses browser context.

Unfortunately, this solution is not working anymore in 1.40+ because Page.setComposition got removed.

@ruifigueira
Copy link
Contributor

You're right.

I tried with pw:protocol debug level and the message is properly sent but it's webkit that responds with the error:

  pw:protocol SEND ► {"id":93,"method":"Target.sendMessageToTarget","params":{"message":"{\"id\":92,\"method\":\"Page.setComposition\",\"params\":{\"text\":\"example\",\"selectionStart\":0,\"selectionLength\":0}}","targetId":"page-7"},"pageProxyId":"6"} +4ms
  pw:protocol ◀ RECV {"method":"Target.dispatchMessageFromTarget","params":{"targetId":"page-7","message":"{\"error\":{\"code\":-32601,\"message\":\"'Page.setComposition' was not found\",\"data\":[{\"code\":-32601,\"message\":\"'Page.setComposition' was not found\"}]},\"id\":92}"},"browserContextId":"8000000000000002","pageProxyId":"6"} +0ms

I think webkit patch was updated to remove that, but the patch in the project doesn't reflect those changes because it still contains that instruction:

https://github.com/microsoft/playwright/blob/v1.40.1/browser_patches/webkit/patches/bootstrap.diff#L1199-L1209

@ivailop7
Copy link
Contributor

Is there any update on potential support for IME input in Playwright?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests