win32 encoding: fix more inconsistencies #2235

kreudom · 2022-07-10T03:29:56Z

While looking into #2196 I was able to find some more inconsistencies in the win32 encoding. I discovered the following differences:

CTRL + ALT + key combinations should be sent with the same character code that would be generated by AltGr + key, i.e. either \x00 or a character depending on the current layout. I fixed the first case for now, which should fix win32 input mode: not processing Ctrl + Alt keyboard combinations properly #2196 in most cases.
AltGr + key should be sent with character code \x00 like above if that combination has no associated character. Right now wezterm falls back on the character code of key.
SHIFT + ascii_letter sends the correct character, but misses the SHIFT flag. This is caused by the normalize_shift function. I don't know if this is intentional behaviour for non-windows systems, so I didn't want to change that for now.

Raw test data here

While we could try to implement all these different cases, I did some testing and realized the ToUnicode function would actually return the correct character in all these cases if the control keys weren't manually removed from the key set before, as it currently happens here:

wezterm/window/src/os/windows/window.rs

Lines 2143 to 2153 in d5060ec

    
           // If control is pressed, clear that out and remember it in our 
        
           // own set of modifiers. 
        
           // We used to also remove shift from this set, but it impacts 
        
           // handling of eg: ctrl+shift+' (which is equivalent to ctrl+" in a US English 
        
           // layout. 
        
           // The shift normalization is now handled by the normalize_shift() method. 
        
           if modifiers.contains(Modifiers::CTRL) { 
        
               keys[VK_CONTROL as usize] = 0; 
        
               keys[VK_LCONTROL as usize] = 0; 
        
               keys[VK_RCONTROL as usize] = 0; 
        
           }

However, this would change the KeyEvents wezterm generates, possibly breaking existing configurations and keyboard encoding for allow_win32_input_mode = false.

One possible idea would be to add a new optional field to the KeyEvent structure which is only used when using the win32 encoding. What's your opinion on all this?

wez · 2022-07-10T14:45:21Z

I'm not opposed to adding something like a win32_uni_char field alongside the scan_code field:

wezterm/wezterm-input-types/src/lib.rs

Lines 1076 to 1078 in d7fa541

    
           /// The *other* OS and hardware dependent key code for the key 
        
           #[cfg(windows)] 
        
           pub scan_code: u32,

to hold what we get from ToUnicode in this case.

However! ToUnicode has internal global state that makes it tricky to call multiple times with varying parameters: it can destroy or be surprisingly affected by dead key states, so we'd need to be careful to verify that things are working correctly! I think robust results might be to:

Save a pristine copy of the keys we get from the GetKeyboardState call
Later in that function, when we understand the dead key state, call ToUnicode with the pristine state to get the data we need. That may require following up with additional call(s) to replay/correct the dead key state afterwards. You can see some of the gymnastics that happens to probe the layout in the KeyboardLayoutInfo code.
We might be able to avoid the tricky parts: if they are only tricky while we are in the middle of dead key processing, then those key presses probably wouldn't make it to the terminal, so perhaps it is valid to only make this extra ToUnicode call in the non-dead-key code path?

kreudom · 2022-07-10T23:21:11Z

The last commits are my shot at implementing this. Some notes:

I only call ToUnicode twice in the same case when it would have been called once anyway. This should ensure ToUnicode stays in the correct state.
I don't call ToUnicode when handling dead keys, I assume the probing process already identified the correct characters, since ToUnicode is called with all modifiers intact when probing.
There is some kind of race condition happening when two KeyEvents are emitted by an invalid dead key combination. Both keys sometimes arrived in the terminal swapped around.

I didn't look far into it, but I assume this happened because the first of the two keys was encoded with the default encoding and sent with pane.key_down while the second key was encoded with the win32 encoding and sent using pane.writer.write_all.

My changes now let both keys be encoded with win32 encoding, this seems to have fixed the problem. But I thought I'd let you know in case it happens again.

All in all, these changes make the win32 encoding a lot more accurate. There's still a few differences, mainly regarding dead key handling, but I'm not sure fixing those is worth the effort.

This field is populated by calling ToUnicode with an unmodified keystate. It will be used to fix edge cases in the win32 keyboard encoding in an upcoming commit.

This allows us to remove a lot of manual edge case handling and fixes multiple encoding bugs.

Shift normalization removes the shift modifier from an event. This modifier still needs to be encoded when using win32 keyboard encoding. Luckily we can fall back on using the individual left and right shift modifiers, since the win32 keyboard encoding doesn't distinguish those.

This lets both keys be encoded with the win32 keyboard encoding if an invalid dead key combination is pressed. Before this change, the first key was always sent with normal terminal encoding. This change also works around a race condition that would frequently occur when using win32 encoding where the first key of an invalid combination would appear after the second. Furthermore, pressing any two dead keys will now immediately emit both of them. This is the same behaviour as in conhost.

This pulls in almost all of the original PR in #2235. I skipped a dead key case that I recall being tricky: I didn't want to break the non win32-input mode version of that. I'd be happy to have that case re-evaluated in a smaller PR where we can focus on its details. Co-authored-by: Dominik Kreutzer <kreudom@gmail.com>

wez · 2023-04-16T21:08:59Z

Thanks for your patience with this; I manually merged this, except for one dead key case that I recall as being important for the non-win32 input mode. We can follow up on that in a separate PR if that still causes problems!

kreudom marked this pull request as ready for review July 10, 2022 23:21

kreudom added 4 commits July 18, 2022 02:04

win32: add win32_uni_char field to KeyEvent

46b6dd3

This field is populated by calling ToUnicode with an unmodified keystate. It will be used to fix edge cases in the win32 keyboard encoding in an upcoming commit.

win32: use win32_uni_char field for win32 keyboard encoding

f3544dd

This allows us to remove a lot of manual edge case handling and fixes multiple encoding bugs.

kreudom force-pushed the main branch from 8cbe349 to b723b5b Compare July 18, 2022 00:28

wez added Windows Issue applies to Microsoft Windows keyboard Keyboard mapping/handling labels Nov 19, 2022

wez mentioned this pull request Mar 30, 2023

Wezterm preventing CTRL + / from entering vim #3180

Open

wez closed this Apr 16, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

win32 encoding: fix more inconsistencies #2235

win32 encoding: fix more inconsistencies #2235

kreudom commented Jul 10, 2022

wez commented Jul 10, 2022

kreudom commented Jul 10, 2022

wez commented Apr 16, 2023

	// If control is pressed, clear that out and remember it in our
	// own set of modifiers.
	// We used to also remove shift from this set, but it impacts
	// handling of eg: ctrl+shift+' (which is equivalent to ctrl+" in a US English
	// layout.
	// The shift normalization is now handled by the normalize_shift() method.
	if modifiers.contains(Modifiers::CTRL) {
	keys[VK_CONTROL as usize] = 0;
	keys[VK_LCONTROL as usize] = 0;
	keys[VK_RCONTROL as usize] = 0;
	}

win32 encoding: fix more inconsistencies #2235

win32 encoding: fix more inconsistencies #2235

Conversation

kreudom commented Jul 10, 2022

wez commented Jul 10, 2022

kreudom commented Jul 10, 2022

wez commented Apr 16, 2023