Korean IME does not work as expected #4226
Comments
The bug breaks my daily workflow :( |
@zadjii-msft Currently it only happens in WT, but the legacy console's behavior is also weird: it does not display character compositions. For example, to make the character '한', the user types the keys 'ㅎ', 'ㅏ', and 'ㄴ'. Usually this sequence is displayed as 'ㅎ' -> '하' -> '한', but currently only the completed character is displayed directly (e.g. '' -> '한'). FYI, I'm using the Dubeolsik (두벌식) Korean IME. |
@zadjii-msft Same issue here. In Terminal, WSL (Debian), pwsh, powershell, and cmd do not work as expected: //'한그ㄱ글'. In the legacy console, WSL (Debian), pwsh, powershell, and cmd all type properly: //'한글' |
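For readers unfamiliar with Hangul composition, the 'ㅎ' -> '하' -> '한' sequence above can be reproduced with the standard Unicode arithmetic for precomposed Hangul syllables. This is only an illustrative sketch: the function names `compose` and `composition_preview` are made up for this example and have nothing to do with the Terminal code base.

```python
# Jamo tables in the order used by the Unicode Hangul Syllables block.
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"            # 19 initial consonants
MEDIALS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"         # 21 vowels
FINALS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # index 0 = no final


def compose(initial, medial, final=""):
    """Combine jamo into one precomposed syllable (U+AC00 onward)."""
    index = (INITIALS.index(initial) * 21 + MEDIALS.index(medial)) * 28
    return chr(0xAC00 + index + FINALS.index(final))


def composition_preview(jamo):
    """What an IME would typically show after each keystroke of ONE syllable."""
    steps = []
    for n in range(1, len(jamo) + 1):
        typed = jamo[:n]
        # A lone consonant is shown as-is; afterwards the syllable is recomposed.
        steps.append(typed[0] if n == 1 else compose(*typed))
    return steps


preview = composition_preview("ㅎㅏㄴ")
print(" -> ".join(preview))
```

Each intermediate state is a different code point, which is why the IME has to keep replacing the in-progress character rather than appending to it.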
This is another great repro gif from @JuhoKang in #4311
|
microsoft/vscode#89853 Possibly similar issue happening in VS Code |
@zadjii-msft Is there any progress on this issue? It seems like quite a lot of people encountered this problem. |
@rkttu Nope, when there is progress, someone will make sure to chime in on this thread. It's been triaged as a P1 bug for 1.0, so we won't be shipping 1.0 without a fix for this, so stay tuned. If anyone is particularly passionate about this bug, we'd be happy to review a PR. Until someone's been assigned to this bug, you can be sure you won't be stepping on our toes |
Possible workaround (not tested): try removing Droid Sans Fallback from the fonts list of the app if it's there. |
Note that this problem did not occur in version 0.7.3451.0. |
@hatsunearu Unfortunately microsoft/vscode#89853 is caused by incorrect glyph rendering, and that workaround doesn't work here. |
@guswns0528 discovered that this bug was first introduced by dfa7b4a1. dfa7b4a1 itself is a correct commit, so we can't simply revert it. It looks like this issue is caused by strange behavior of the Core Text APIs. Originally, the composition-completed event should only be fired when the letter composition is completed. Like this:
But here it's fired before that. Like this:
It might be a bug in the Core Text APIs, but the Core Text API is just a wrapper around the Text Services Framework, which is a very stable framework inherited from the Windows XP era. So I need to investigate further. I'm currently debugging this issue and I'll comment here if I find something. Please feel free to share any information if you find something. Thanks |
@simnalamburt thanks! Note that @leonMSFT is currently working in this area and has a couple pull requests out for this; perhaps you two can coordinate? |
Cool! I currently suspect that a wrong parameter passed to the NotifyTextChanged function is the cause of the bug. But in fact, since I first saw the Core Text API today, I still don't know what's wrong. Any help or information is always welcome! |
@simnalamburt thanks a lot for the investigation!
So, going through your example keysequence, pressing ㅎㅏㄴ, would result in three Now the user presses ㄱ, and what will happen is the following:
What should happen now is that we should receive a This is why you'll run into the weird issue where you'll be pressing ㅎㅏㄴ, which works fine, and you'll see So, the core of the problem is that we need to send the IME input to the terminal when we believe composition is finished, and we naturally also need to clear our buffer whenever we send some input to the terminal. We also need to keep the text server's buffer and our As a small test, I've tried commenting out the code where we're telling the text server to reset their text buffer, and lo and behold, text comes out as you would expect, without having to double-press any characters. The only problem here is that if we don't reset our text buffer, every CompositionCompleted event will cause us to send the whole I'm currently trying to think of a way around this, but I'm giving you a summary of my findings so maybe you can also repro and investigate further to see if I've missed something! 😄 |
@leonMSFT Wow, thanks for your detailed explanation! Now I understand what was going on in my development environment. Currently I’m trying to leave some unfinished characters in the text buffer instead of totally clearing it in CompositionCompletedHandler. Please share any information or updates, and let me know if there is anything I can help with! There are actually lots of people waiting for this issue to be resolved, since Korean developers don't have many options to choose from on Windows. Anything you share will be helpful, and the whole Korean developer community will be grateful to you! 😄 |
@simnalamburt Yup, not clearing the whole text buffer, but leaving unfinished characters in is key! Luckily, I think I'm close to getting the fix for this out! 🎉 I'm specifically testing by typing out this sequence: 안녕하세요, which was provided earlier in this thread. It seems to work as expected, but before I have a PR out for this fix, I'll need to make sure I haven't messed up any other IME input modes, so hang tight! 😃 One thing I would like your help with is letting me know of other sample character sequences that might break the way I'm handling Korean IME! I don't know Korean at all, so having the sequence laid out in English characters like it was above with "dkssudgktpdy" (which comes out as 안녕하세요) really helped! |
Wow it was very fast! I lost my chance to become hero lol
Actually, there are not many corner cases in Korean IME, and your sample
video (안녕하세요) showed that it perfectly handles one of the best-known corner
cases in Korean IME, called “도깨비불 현상” (literally, the "goblin fire" phenomenon).
If 안녕하세요 works perfectly, I expect the other cases to work fine, but I’ll
share a few more samples just in case.
gksrnrdj whgdk (한국어 좋아)
To test whether aborting composition with space works fine
gksrnrdj<kbd>Enter</kbd>whgdk (`한국어\n좋아`)
To test whether aborting composition with enter works fine
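The sample key sequences above, and the “도깨비불 현상” corner case (a consonant first attaches to the current syllable as a final and then jumps to the next syllable when a vowel follows), can be illustrated with a toy Dubeolsik composer. This is a deliberately simplified sketch, not how any real IME or the Terminal works: it assumes unshifted lowercase keys only, ignores compound vowels and compound finals, and `compose_keys` is a hypothetical name.

```python
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
MEDIALS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"
FINALS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")

# Unshifted Dubeolsik layout: QWERTY key -> jamo.
DUBEOLSIK = dict(zip("qwertyuiopasdfghjklzxcvbnm",
                     "ㅂㅈㄷㄱㅅㅛㅕㅑㅐㅔㅁㄴㅇㄹㅎㅗㅓㅏㅣㅋㅌㅊㅍㅠㅜㅡ"))


def syllable(cho, jung, jong=""):
    """Precomposed syllable from initial, medial, and optional final jamo."""
    index = (INITIALS.index(cho) * 21 + MEDIALS.index(jung)) * 28
    return chr(0xAC00 + index + FINALS.index(jong))


def compose_keys(keys):
    """Turn a QWERTY key sequence into Hangul text (simplified rules)."""
    out = []
    cho = jung = jong = ""  # the syllable currently being composed

    def flush():
        nonlocal cho, jung, jong
        # Emit a full syllable if possible, otherwise the lone jamo (or nothing).
        out.append(syllable(cho, jung, jong) if cho and jung else cho + jung)
        cho = jung = jong = ""

    for key in keys:
        jamo = DUBEOLSIK.get(key)
        if jamo is None:               # space, Enter, ...: abort composition
            flush()
            out.append(key)
        elif jamo in MEDIALS:          # vowel
            if cho and not jung:
                jung = jamo
            elif jong:
                # The corner case: the final consonant moves to the next
                # syllable and becomes its initial (핫 + ㅔ -> 하세).
                moved, jong = jong, ""
                flush()
                cho, jung = moved, jamo
            else:
                flush()
                out.append(jamo)
        else:                          # consonant
            if cho and jung and not jong and jamo in FINALS:
                jong = jamo
            else:
                flush()
                cho = jamo
    flush()
    return "".join(out)


print(compose_keys("dkssudgktpdy"))
print(compose_keys("gksrnrdj whgdk"))
```

In this model a space or Enter simply flushes the in-progress syllable, which is the behavior the two samples above are meant to exercise.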
Actually, there are a bunch of things to test further, like
- Test if an alternative Korean IME like 세벌식 (Sebeolsik) works fine
- Test if switching IME in the middle of composition works fine (it’s very
common for Korean people)
- etc
But these cases might be too complicated to ask you to test, so just share your
patch or open a draft PR. I have a bunch of Korean and Japanese developer
friends interested in this issue, and they will battle-test it for you!
On Sat, Feb 29, 2020 at 09:27, Leon Liang <notifications@github.com> wrote:
|
I tried to reproduce the scenario that you described, but I'm having trouble. I typed ㅎㅏㄴㄱ and I got 2 textUpdate events instead of 0 after typing "한".
Is there anything that I misunderstood, or did something change with 31c9d19#diff-7708ccd4133d008adca4935827f7ddb7? simnalamburt@acf74bc8ad947 is the patch that I used for tracing. |
That's really weird! I pulled your branch from your fork (and the branch called patch-4226) and tried to do the same thing you were doing and this is what I'm getting: After pressing ㄱ, as you can see, |
That's strange. My issue (2 text update events) is reproduced consistently on two computers, mine and @guswns0528's PC. I wonder what the difference is. My Windows specifications:
|
From the details you've provided, I don't really see a difference 😢. However! I finally have a PR out, so feel free to take a look and play around with it! |
@guswns0528 and I tried #4796 and it worked perfectly! We still don't know why the 2 TextUpdate events were fired, but it looks like your patch fixed it anyway. 👍 |
## Summary of the Pull Request

Korean IME was not working correctly due to the way we were clearing the input buffer inside of `TSFInputControl`. We wanted to clear our input buffer and tell TSF to clear its input buffer as well when we receive a `CompositionCompleted` event. This works fine in some IME languages such as Chinese and Japanese. However, Korean IME composes characters differently, in such a way that we can't tell TSF to clear its buffer during a `CompositionCompleted` event, because doing so would clear the character that triggered the `CompositionCompleted` event in the first place.

The solution in this PR is to keep our `_inputBuffer` intact until the user presses <kbd>Enter</kbd> or <kbd>Esc</kbd>, in which case we clear our buffer and the TSF buffer. I've chosen these two keys because it seems to make sense to clear the buffer after text is sent to the terminal with <kbd>Enter</kbd>, and <kbd>Esc</kbd> usually means to cancel a current composition anyway.

This means we need to keep track of our last known "Composition Start Point", which is represented by `_activeTextStart`. Whenever we complete a composition, we'll send the portion of the input buffer between `_activeTextStart` and the end of the input buffer to the terminal. Then, we'll update `_activeTextStart` to be the end of the input buffer so that the next time we send text to the terminal, we'll only send the portion of our buffer that's "active".

## PR Checklist
* [x] Closes #4226
* [x] CLA signed
* [x] Tests added/passed

## Validation Steps Performed
Manual testing with Chinese, Japanese, and Korean IME.
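
The bookkeeping described in the summary can be modelled in a few lines. The sketch below is a toy Python model, not the actual `TSFInputControl` implementation: the class and method names are hypothetical, only `input_buffer` and `active_text_start` mirror the `_inputBuffer` and `_activeTextStart` members named in the PR text, and the order of events in the demo is an assumption made for illustration (the real TSF event sequence may differ).

```python
class InputBufferSketch:
    """Toy model of the PR's buffer handling (hypothetical, for illustration)."""

    def __init__(self):
        self.input_buffer = ""       # mirrors `_inputBuffer`
        self.active_text_start = 0   # mirrors `_activeTextStart`
        self.sent_to_terminal = []   # stand-in for text sent to the terminal

    def text_updated(self, start, end, new_text):
        # Replace the range [start, end) of the buffer with new_text.
        self.input_buffer = (
            self.input_buffer[:start] + new_text + self.input_buffer[end:]
        )

    def composition_completed(self):
        # Send only the "active" portion, then advance the start point.
        active = self.input_buffer[self.active_text_start:]
        if active:
            self.sent_to_terminal.append(active)
        self.active_text_start = len(self.input_buffer)

    def key_pressed(self, key):
        # Enter and Esc are the only keys that clear the buffer in this model.
        if key in ("Enter", "Esc"):
            self.input_buffer = ""
            self.active_text_start = 0


# Assumed event order while typing 한글 and then pressing Enter.
buf = InputBufferSketch()
buf.text_updated(0, 0, "ㅎ")
buf.text_updated(0, 1, "하")
buf.text_updated(0, 1, "한")
buf.composition_completed()
buf.text_updated(1, 1, "ㄱ")
buf.text_updated(1, 2, "그")
buf.text_updated(1, 2, "글")
buf.composition_completed()
print(buf.sent_to_terminal)
buf.key_pressed("Enter")
```

The point of the design is that completing a composition never shrinks the buffer, so a character that arrives together with the completion event is not lost; only the slice after `active_text_start` is ever sent, so nothing is sent twice.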
Environment
Steps to reproduce
$x = '한글'
Expected behavior
$x = '한글'
Actual behavior
$x = '한그ㄱ글'