Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Korean IME does not work as expected #4226

Closed
rkttu opened this issue Jan 15, 2020 · 24 comments · Fixed by #4796
Closed

Korean IME does not work as expected #4226

rkttu opened this issue Jan 15, 2020 · 24 comments · Fixed by #4796
Assignees
Labels
Area-Input Related to input processing (key presses, mouse, etc.) Area-TerminalControl Issues pertaining to the terminal control (input, selection, keybindings, mouse interaction, etc.) Help Wanted We encourage anyone to jump in on these. Issue-Bug It either shouldn't be doing this or needs an investigation. Priority-1 A description (P1) Product-Terminal The new Windows Terminal. Resolution-Fix-Committed Fix is checked in, but it might be 3-4 weeks until a release.
Milestone

Comments

@rkttu
Copy link

rkttu commented Jan 15, 2020

Environment

Windows build number: Windows 10 2004 19041.21
Windows Terminal version (if applicable): 0.8.10091.0

Any other software?

Steps to reproduce

  • Run Windows PowerShell tab
  • Typing text $x = '한글'

image

Expected behavior

  • The resulted text should be $x = '한글'

Actual behavior

  • But the resulted text was $x = '한그ㄱ글'
@ghost ghost added Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting Needs-Tag-Fix Doesn't match tag requirements labels Jan 15, 2020
@yjh0502
Copy link

yjh0502 commented Jan 15, 2020

The bug breaks my daily workflow :(

@zadjii-msft
Copy link
Member

Thanks for the bug report! @rkttu /@yjh0502 Does this repro in a legacy console (pwsh.exe) window? Or does this only happen in the Windows Terminal?

@zadjii-msft zadjii-msft added Area-Input Related to input processing (key presses, mouse, etc.) Area-TerminalControl Issues pertaining to the terminal control (input, selection, keybindings, mouse interaction, etc.) Needs-Author-Feedback The original author of the issue/PR needs to come back and respond to something Product-Terminal The new Windows Terminal. Priority-1 A description (P1) Issue-Bug It either shouldn't be doing this or needs an investigation. labels Jan 15, 2020
@ghost ghost removed the Needs-Tag-Fix Doesn't match tag requirements label Jan 15, 2020
@rkttu
Copy link
Author

rkttu commented Jan 15, 2020

@zadjii-msft Currently it only happens in WT, but legacy console's behavior also weird. It does not display character compositions. For example, to make a character '한', user can typing the key 'ㅎ', 'ㅏ', and 'ㄴ'. Usually, this sequence displayed like 'ㅎ' -> '하' -> '한', but currently it displays just a complete character directly (ex. '' -> '한').

FYI, I'm using Dubeolsik (두벌식) Korean IME.

@ghost ghost added Needs-Attention The core contributors need to come back around and look at this ASAP. and removed Needs-Author-Feedback The original author of the issue/PR needs to come back and respond to something labels Jan 15, 2020
@cupsos
Copy link

cupsos commented Jan 16, 2020

@zadjii-msft Same issue here. Terminal with WSL(Debian), pwsh, powershell, cmd not work expectly. //'한그ㄱ글'

Legacy WSL(Debian), pwsh, powershell, cmd can type properly, //'한글'
but not showing character compositions

@DHowett-MSFT DHowett-MSFT removed Needs-Attention The core contributors need to come back around and look at this ASAP. Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting labels Jan 16, 2020
@DHowett-MSFT DHowett-MSFT added this to the Terminal v1.0 milestone Jan 16, 2020
@DHowett-MSFT DHowett-MSFT added the Help Wanted We encourage anyone to jump in on these. label Jan 16, 2020
@zadjii-msft
Copy link
Member

This is another great repro gif from @JuhoKang in #4311

Windows build number: Microsoft Windows [Version 10.0.18362.592]
Windows Terminal version (if applicable): Windows Terminal (Preview) Version: 0.8.10091.0

ex) Type 안녕하세요 with the keyboard

koreannotnormal

Expected behavior

show 안녕하세요 on the terminal

Actual behavior

shows 안녀ㄴ녕하ㅎ하셍세세요요

@ioklo
Copy link

ioklo commented Jan 29, 2020

FYI, expected behavior during composition.
ime

@hatsunearu
Copy link

microsoft/vscode#89853 Possibly similar issue happening in VS Code

@rkttu
Copy link
Author

rkttu commented Feb 2, 2020

@zadjii-msft Is there any progress on this issue? It seems like quite a lot of people encountered this problem.

@zadjii-msft
Copy link
Member

@rkttu Nope, when there is progress, someone will make sure to chime in on this thread. It's been triaged as a P1 bug for 1.0, so we won't be shipping 1.0 without a fix for this, so stay tuned. If anyone is particularly passionate about this bug, we'd be happy to review a PR. Until someone's been assigned to this bug, you can be sure you won't be stepping on our toes ☺️

@hatsunearu
Copy link

Possible workaround (not tested): try removing Droid Sans Fallback from the fonts list of the app if it's there.

microsoft/vscode#89853 (comment)

@simnalamburt
Copy link

Note that this problem did not occur in version 0.7.3451.0.

@simnalamburt
Copy link

Possible workaround (not tested): try removing Droid Sans Fallback from the fonts list of the app if it's there.

microsoft/vscode#89853 (comment)

@hatsunearu Unfortunately microsoft/vscode#89853 is caused by wrong glyph rendering and that workaround dosn't work here.

@simnalamburt
Copy link

simnalamburt commented Feb 26, 2020

@guswns0528 discovered that this bug is first caused by dfa7b4a1. dfa7b4a1 itself is right commit so we can't simply revert it.

Looks like this issue is caused by strange behavior of Core text APIs. Originally, the Composition complete event should only be fired when the letter composition is completed. Like this:

  • If you are typing , composition event should be fired twice.
    , and

But in here it's fired before that. Like this:

  • If you type in here, composition event is fired three times.
    , , and

It might be a bug of Core text APIs but core text API is just a wrapper of Text Services Framework, which is super stable framework inheritted from Windows XP era. So I need to investigate further.

I'm currently debugging this issue and I'll comment on here if I find something. Please feel free to share any information if you find something. Thanks

@DHowett-MSFT
Copy link
Contributor

@simnalamburt thanks! Note that @leonMSFT is currently working in this area and has a couple pull requests out for this; perhaps you two can coordinate?

@simnalamburt
Copy link

Cool! Currently I am suspecting wrong parameter of NotifyTextChanged function as the cause of the bug. But in fact, since I first saw the Core text API today, I still don't know what's wrong. Any help or information is always welcomed!

@leonMSFT
Copy link
Contributor

leonMSFT commented Feb 26, 2020

@simnalamburt thanks a lot for the investigation! I tried to see if I can repro the two vs three composition completed events issue, but I only see two composition completed events. Maybe you could provide a screenshot of the debugging logs you used to find this? Update: I just found this out, but you're likely seeing multiple composition completed events after the first character is finished because when we call NotifyTextChanged to reset the text server's buffer, it'll fire another composition completed event. However, since we have the if-statement to check if there's anything in _inputBuffer, we don't do anything on our side.

I was also taking a look at this bug earlier, and perhaps my findings could explain why you're seeing the bug you're seeing. Here's the behavior I'm observing and why I think we're messing up Korean IME input:

So, going through your example keysequence, pressing , would result in three TextUpdated events being received, with the _inputBuffer and the _textBlock having the character .

Now the user presses , and what will happen is the following:

  1. TextUpdated is received with the character .
  2. CompositionCompleted is received, signaling that is the finished composition.
  3. As part of our CompositionCompletedHandler, we send the contents of _inputBuffer (which is 한) to the terminal and reset the _inputBuffer and _textBlock. Then we also notify the text server that they should also make their "buffer" empty as well. (This is the NotifyTextChanged call that you mentioned).

What should happen now is that we should receive a TextUpdated event with the text as 한ㄱ.
However, we're actually not getting any TextUpdated events after our CompositionCompletedHandler is finished because we're telling the text server to reset their text buffer. Since their buffer is empty, it won't tell us that we should update our text to be anything.

This is why you'll run into the weird issue where you'll be pressing , which works fine, and you'll see on the screen, but once you make another input, like , nothing happens. The keypress triggers the CompositionCompleted event for the previous character 한, which tells the text server to clear their buffer. So, you will need to press again to make ㄱ show up on the screen.

So, the core of the problem is that we need to send the IME input to the terminal when we believe composition is finished, and we naturally also need to clear our buffer whenever we send some input to the terminal. We also need to keep the text server's buffer and our _inputBuffer in sync, so whenever we clear our buffer, we tell the text server to clear theirs as well.

As a small test, I've tried commenting out the code where we're telling the text server to reset their text buffer, and lo and behold, text comes out as you would expect, without having to double-press any characters. The only problem here is that if we don't reset our text buffer, every CompositionCompleted event will cause us to send the whole _inputBuffer (which included literally everything you've ever typed while in IME mode) to the terminal, resulting in lots of duplicate input.

I'm currently trying to think of a way around this, but I'm giving you a summary of my findings so maybe you can also repro and investigate further to see if I've missed something! 😄

@simnalamburt
Copy link

simnalamburt commented Feb 28, 2020

@leonMSFT Wow thanks for your detailed explanation! Now I understand what was going on in my development environment.

Currently I’m trying to leave some unfinished characters in text buffer instead of totally clearing it in CompositionCompletedHandler.

Please share any information or updates and let me help anything I can! Actually there are lots of people waiting for this issue to be resolved since there are not much options that Korean developer can choose in Windows. Any share will be helpful and the whole Korean developer community will be grateful to you! 😄

@leonMSFT
Copy link
Contributor

@simnalamburt Yup, not clearing the whole text buffer, but leaving unfinished characters in is key! Luckily, I think I'm close to getting the fix for this out! 🎉I'm specifically testing out trying to type out this sequence: 안녕하세요, which was provided earlier in this thread.

koreanoutput

It seems to work as expected, but before I have a PR out for this fix, I'll need to make sure I haven't messed up any other IME input modes, so hang tight! 😃

One thing I would like your help on is letting me know of other sample character sequences that might possibly break the way I'm handling Korean IME! I don't know Korean at all, so having the sequence laid out in english characters like it was above with "dkssudgktpdy" (which comes out as 안녕하세요) really helped!

@simnalamburt
Copy link

simnalamburt commented Feb 29, 2020 via email

@simnalamburt
Copy link

I tried to reproduce the scenario that you described, but I'm having trouble. I typed and I got 2 textUpdate event instead of 0 after typing "한".

# Typed ㅎ
_compositionStartedHandler() called.
_textUpdatingHandler() called.
Text:  ㅎ
Range: [0, 0]

# Typed ㅏ
_textUpdatingHandler() called.
Text:  하
Range: [0, 1]

# Typed ㄴ
_textUpdatingHandler() called.
Text:  한
Range: [0, 1]

# Typed ㄱ
_compositionCompletedHandler() called.
_SendAndClearText() called.
inputBuffer was L"한" and cleared.
_compositionStartedHandler() called.
_textUpdatingHandler() called.
Text:  ㄱ
Range: [0, 0]
_textUpdatingHandler() called.
Text:  ㄱ
Range: [0, 0]
_compositionCompletedHandler() called.
_SendAndClearText() called.
inputBuffer was L"ㄱㄱ" and cleared.

Is there anything that I misunderstood, or did something changed with 31c9d19#diff-7708ccd4133d008adca4935827f7ddb7?

simnalamburt@acf74bc8ad947 this is a patch that I used for tracing.

@leonMSFT
Copy link
Contributor

leonMSFT commented Mar 2, 2020

That's really weird! I pulled your branch from your fork (and the branch called patch-4226) and tried to do the same thing you were doing and this is what I'm getting:

testingdebug1

After pressing , as you can see, _textUpdatingHandler, _compositionCompletedHandler and _SendAndClearText are called in sequence, and another _compositionCompletedHandler is called afterwards. I'm not seeing the two extra _textUpdatingHandler calls that you're seeing though. 😢

@simnalamburt
Copy link

That's strange. My issue (2 text update event) is being reproduced consistently on two computers, mine and PC of @guswns0528. I wonder what the difference is.

My Windows specifications:

  • Windows Terminal version: simnalamburt@acf74bc8ad947
  • Windows Edition: Windows 10 Pro
  • Windows Version: 1909
  • Windows OS build: 18363.657
  • Processor type: 64-bit operating system, x64-based processor
  • Windows display language: English (United States)
  • Default app language, Default input language: 한국어

@ghost ghost added the In-PR This issue has a related PR label Mar 4, 2020
@leonMSFT
Copy link
Contributor

leonMSFT commented Mar 4, 2020

From the details you've provided, I don't really see a difference 😢. However! I finally have a PR out, so feel free to take a look and play around with it!

@simnalamburt
Copy link

Me and @guswns0528 tried #4796 and it worked perfectly! Still don't know why the 2 TextUpdate events were fired but looks like your patch fixed it anyway. 👍

A video typing '감사합니다!' in new microsoft terminal

@ghost ghost closed this as completed in #4796 Mar 4, 2020
@ghost ghost added Resolution-Fix-Committed Fix is checked in, but it might be 3-4 weeks until a release. and removed In-PR This issue has a related PR labels Mar 4, 2020
ghost pushed a commit that referenced this issue Mar 4, 2020
## Summary of the Pull Request
Korean IME was not working correctly due to way we were clearing the input buffer inside of `TSFInputControl`. We wanted to clear our input buffer and tell TSF to clear its input buffer as well when we receive a `CompositionCompleted` event. This works fine in some IME languages such as Chinese and Japanese. However, Korean IME composes characters differently in such a way where we can't tell TSF to clear their buffer during a `CompositionCompleted` event because it would clear the character that triggered the `CompositionCompleted` event in the first place.

The solution in this PR is to keep our `_inputBuffer` intact until the user presses <kbd>Enter</kbd> or <kbd>Esc</kbd>, in which case we clear our buffer and the TSF buffer. I've chosen these two keys because it seems to make sense to clear the buffer after text is sent to the terminal with <kbd>Enter</kbd>, and <kbd>Esc</kbd> usually means to cancel a current composition anyway.

This means we need to keep track of our last known "Composition Start Point", which is represented by `_activeTextStart`. Whenever we complete a composition, we'll send the portion of the input buffer between `_activeTextStart` and the end of the input buffer to the terminal. Then, we'll update `_activeTextStart` to be the end of the input buffer so that the next time we send text to the terminal, we'll only send the portion of our buffer that's "active".

## PR Checklist
* [x] Closes #4226
* [x] CLA signed
* [x] Tests added/passed

## Validation Steps Performed
Manual testing with Chinese, Japanese, and Korean IME.
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Input Related to input processing (key presses, mouse, etc.) Area-TerminalControl Issues pertaining to the terminal control (input, selection, keybindings, mouse interaction, etc.) Help Wanted We encourage anyone to jump in on these. Issue-Bug It either shouldn't be doing this or needs an investigation. Priority-1 A description (P1) Product-Terminal The new Windows Terminal. Resolution-Fix-Committed Fix is checked in, but it might be 3-4 weeks until a release.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants