Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Win32 input mode breaks WSL .exe interop #16343

Closed
lhecker opened this issue Nov 20, 2023 · 13 comments
Closed

Win32 input mode breaks WSL .exe interop #16343

lhecker opened this issue Nov 20, 2023 · 13 comments
Assignees
Labels
Issue-Bug It either shouldn't be doing this or needs an investigation. Needs-Tag-Fix Doesn't match tag requirements Priority-0 Bugs that we consider release-blocking/recall-class (P0) Product-Conpty For console issues specifically related to conpty zInbox-Bug Ignore me!

Comments

@lhecker
Copy link
Member

lhecker commented Nov 20, 2023

Windows Terminal version

1.19.2682.0

Windows build number

No response

Steps to reproduce

Expected Behavior

Input is preserved as it is typed.

Actual Behavior

There are two independent issues at play here:

  • ConPTY deadlocks on startup
    ConhostInternalGetSet::ReturnResponse translates a VT response into a sequence of INPUT_RECORDs and calls InputBuffer::Write. That one calls TerminalInput::HandleKey which translates KEY_EVENTs type into Win32 input sequences. In other words, VT responses are being double-encoded in conhost.
    At some point when ConPTY starts, VtIo::StartIfNeeded() is called which calls RequestCursor() under WSL (= "\x1b[6n" request) and waits for a response. ConPTY now deadlocks because it won't double-decode the response.
  • Console apps receive still-encoded events as input
    Even with that issue fixed, there's still a change in behavior, as the Win32 console application still receives undecoded Win32 input sequences as a series of INPUT_RECORDs. I.e. typing "a" yields 2 dozen INPUT_RECORDs.

Note to self:

@lhecker lhecker added Product-Conpty For console issues specifically related to conpty Issue-Bug It either shouldn't be doing this or needs an investigation. Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting Priority-0 Bugs that we consider release-blocking/recall-class (P0) zInbox-Bug Ignore me! labels Nov 20, 2023
@lhecker
Copy link
Member Author

lhecker commented Nov 20, 2023

@j4james FYI this regression was caused by #15476, but I haven't really understood yet why it happens or how to fix it. I'll try to figure it out later, but if you immediately know what the issue is (and it's not a problem for you) let me know, and I'll fix it ASAP. 🙂

@zadjii-msft zadjii-msft added this to the Terminal v1.20 milestone Nov 20, 2023
@zadjii-msft
Copy link
Member

@craigloewen-msft as an FYI. We may have broken windows interop for WSL for insiders. We're looking into it.

@zadjii-msft zadjii-msft removed the Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting label Nov 20, 2023
@microsoft-github-policy-service microsoft-github-policy-service bot added the Needs-Tag-Fix Doesn't match tag requirements label Nov 20, 2023
@lhecker
Copy link
Member Author

lhecker commented Nov 20, 2023

Probably also causes #16307

@j4james
Copy link
Collaborator

j4james commented Nov 20, 2023

From what @DHowett said in #15476 (comment), my understanding was that the conpty DLL was tightly coupled to the openconsole version. So I'm wondering whether it's possible that you're getting a version mismatch by replacing the system conhost, if you aren't also using the corresponding conpty DLL?

If it's not something like that, I have no idea what the issue could be. But I haven't had a chance to look at the code in detail (and probably won't have much time again until the weekend).

@j4james
Copy link
Collaborator

j4james commented Nov 20, 2023

I have an idea what the problem might be, but I don't really know how WSL works behind the scenes so my understanding may be incorrect.

  1. I'm assuming when you launch a Windows executable from within WSL it acts like a third-party conpty client, and creates a conpty connection.
  2. I'm assuming it would do this without the PSEUDOCONSOLE_WIN32_INPUT_MODE flag, but since that flag was removed in PR 15475, we now act as if it's always set.
  3. Without that flag set, conhost would not previously have sent DECSET sequences to enable win32 input mode and focus event mode, but now it does that regardless.
  4. It was assumed that change would be harmless, because a client that didn't understand those modes would ignore them, but I suspect WSL is actually passing them up to the parent terminal.
  5. This would result in the parent terminal enabling win32 input mode, and from then on any keystrokes would be sent to WSL in a format that it doesn't understand (possibly also affecting VT query responses).

If the above assumptions are correct, we're probably going to need to revert most if not all of PR #15476 in order to fix this, and come up with another solution for #15461.

@lhecker
Copy link
Member Author

lhecker commented Nov 21, 2023

The way I see it, the fundamental problem is that the same Mode::Win32 flag is being used for both "input" into and "output" from the InputBuffer and indiscriminately of the purpose of the read.

By default it should only take affect when INPUT_RECORDs are being read for communication between a terminal and ConPTY but since we only have a single InputBuffer::Read() and a single InputBuffer::Write() function it taints even all of the public APIs that don't get those win32-input-sequences. This is especially weird for internal APIs like the aforementioned ConhostInternalGetSet::ReturnResponse which is already is text-based and hence the double-encoding. So, if I don't misunderstand the purpose of the Win32 DECSET sequence I think it's actually fine that it's being passed up through WSL. It just shouldn't be immediately visible to console clients.

Do we support console clients requesting win32-input-sequences? If so, then that doesn't change the above IMO, because a client requesting it shouldn't change the encoding for any other attached client.

Fixing the double-encoding is trivial with just a couple lines changed, but fixing the Win32 sequences being visible to console clients is difficult since it requires us to delay encoding to the time of Read() and make it dependent on the requester. I've started writing a modernized InputBuffer but it's fairly messy given our current architecture and the vast compatibility requirements. 🤦

I suspect we'll end up having to revert most of #15476 after all... 😢

@DHowett
Copy link
Member

DHowett commented Nov 21, 2023

Do we support console clients requesting win32-input-sequences?

I don't see why we shouldn't! It's not different from enabling mouse mode with DECSET 1003.

If so, then that doesn't change the above IMO, because a client requesting it shouldn't change the encoding for any other attached client.

As above, it's not different from enabling mouse mode with DECSET 1003. Unfortunately, we can't know who's requesting it and on whose behalf it is being requested. Even if we track the client application, how do we know that ssh.exe isn't passing input to two different processes--one of which supports win32 input, and one of which does not?

@lhecker
Copy link
Member Author

lhecker commented Nov 21, 2023

Even if we track the client application, how do we know that ssh.exe isn't passing input to two different processes--one of which supports win32 input, and one of which does not?

In my vision, win32-input-sequences would be requested automatically by ConPTY at all times, just like it works right now. It's just that console clients (= in case of SSH on the server side) should not see those in form of 20 INPUT_RECORDs per actual input character if they didn't ask for it. That part is easy to do as far as I can see.

@lhecker
Copy link
Member Author

lhecker commented Nov 21, 2023

See... I'm actually an idiot. The reason it ACTUALLY happens is because of this:

// Fortunately, for VT input, each keystroke comes in as an individual
// write operation. So, if at the end of processing a string for the
// InputEngine, we find that we're not in the Ground state, that implies
// that we've processed some input, but not dispatched it yet. This

confused meme

Unfortunately, that statement is entirely wrong. The most basic example for this is that our input buffer is a basic

char buffer[256];

and if your input pipe got more characters than that, the comment's assumption will be broken, because the read will be split up into multiple chunks. Second of all, however, WSL doesn't use anonymous pipes and so they can split up inputs into arbitrary chunks down to individual bytes. That's why we see broken win32 input sequences randomly in WSL.

@DHowett
Copy link
Member

DHowett commented Nov 21, 2023

That part is easy to do as far as I can see.

Yyeehehhhhh.. eh. Eaugh.

When an application requests ENABLE_VIRTUAL_TERMINAL_INPUT under ConPTY, we pass the input received through ConPTY directly through to the application. It allows the terminal and the client application to support input modalities that conhost doesn't know about.

@j4james
Copy link
Collaborator

j4james commented Nov 26, 2023

I did a little hack experiment with fixing the ConhostInternalGetSet::ReturnResponse implementation, so it sends the response directly to InputBuffer::_HandleTerminalInputCallback (instead of creating input events, which get converted into VT sequences, which then converted back into input events).

This fixes the problem with VT query responses not working correctly in conhost when win32-input mode is set (I'm almost sure we already had an issue for that, but I couldn't find it). It also fixes the conpty deadlock on startup that you're getting when running an exe from within WSL.

I can't reproduce the second problem though. Running your ReadConsoleInputW test app, I only see two events per key press, i.e. the up and down events. The keypresses are not showing up as a bunch of undecoded win32-input sequences.

That said, you still have a problem when you exit the exe and return to WSL, because the terminal is still in win32-input mode at that point. Since your typical WSL shell is not going to be expecting that, you get a lot of garbage output at the prompt when you type.

I can work around that by adding a win32-input reset after the exe call, e.g. testinput.exe; printf "\e[?9001l", but that's obviously not a practical solution for users. Someone needs to be responsible for restoring that mode, and I'm not sure where that should happen.

lhecker added a commit that referenced this issue Nov 30, 2023
Since all VT parameters are treated to be at least 1 (and 1 if they're
absent or 0), `modifierParam > 0` was always true. This meant that
`ENHANCED_KEY` was always being set. It's unclear why `ENHANCED_KEY`
was used there, but it's likely not needed in general.

Closes #16266

## Validation Steps Performed
* Can't test this unless we fix the win32 input mode issue #16343
DHowett pushed a commit that referenced this issue Dec 4, 2023
Since all VT parameters are treated to be at least 1 (and 1 if they're
absent or 0), `modifierParam > 0` was always true. This meant that
`ENHANCED_KEY` was always being set. It's unclear why `ENHANCED_KEY`
was used there, but it's likely not needed in general.

Closes #16266

## Validation Steps Performed
* Can't test this unless we fix the win32 input mode issue #16343 ❌

(cherry picked from commit be9fc20)
Service-Card-Id: 91159301
Service-Version: 1.19
DHowett pushed a commit that referenced this issue Dec 4, 2023
This changeset avoids re-encoding output from `AdaptDispatch`
via the win32-input-mode mechanism when VT input is enabled.
That is, an `AdaptDispatch` output like `\x1b[C` would otherwise
result in dozens of characters of input.

Related to #16343

## Validation Steps Performed
* Replace conhost with this
* Launch a Win32 application inside WSL
* ASCII keyboard inputs are represented as single `INPUT_RECORD`s ✅
DHowett pushed a commit that referenced this issue Dec 4, 2023
This changeset avoids re-encoding output from `AdaptDispatch`
via the win32-input-mode mechanism when VT input is enabled.
That is, an `AdaptDispatch` output like `\x1b[C` would otherwise
result in dozens of characters of input.

Related to #16343

## Validation Steps Performed
* Replace conhost with this
* Launch a Win32 application inside WSL
* ASCII keyboard inputs are represented as single `INPUT_RECORD`s ✅

(cherry picked from commit 0da37a1)
Service-Card-Id: 91246942
Service-Version: 1.19
DHowett pushed a commit that referenced this issue Dec 5, 2023
When ConPTY exits it should attempt to restore the state as it was
before it started. This is particularly important for the win32
input mode sequences, as Linux shells don't know what to do with it.

Related to #16343

## Validation Steps Performed
* Replace conhost with this
* Launch a Win32 application inside WSL
* Exit that application
* Shell prompt doesn't get filled with win32 input mode sequences ✅
DHowett pushed a commit that referenced this issue Dec 5, 2023
When ConPTY exits it should attempt to restore the state as it was
before it started. This is particularly important for the win32
input mode sequences, as Linux shells don't know what to do with it.

Related to #16343

## Validation Steps Performed
* Replace conhost with this
* Launch a Win32 application inside WSL
* Exit that application
* Shell prompt doesn't get filled with win32 input mode sequences ✅

(cherry picked from commit 70e51ae)
Service-Card-Id: 91246943
Service-Version: 1.19
DHowett pushed a commit that referenced this issue Dec 5, 2023
This is my proposal to avoid aborting ConPTY input parsing because a
read accidentally got split up into more than one chunk. This happens a
lot with WSL for me, as I often get (for instance) a
`\x1b[67;46;99;0;32;` input followed immediately by a `1_` input. The
current logic would cause both of these to be flushed out to the client
application.

This PR fixes the issue by only flushing either a standalone escape
character or a escape+character combination. It basically limits the
previous code to just `VTStates::Ground` and `VTStates::Escape`.

I'm not using the `_state` member, because `VTStates::OscParam` makes no
distinction between `\x1b]` and `\x1b]1234` and I only want to flush the
former. I felt like checking the contents of `run` directly is easier to
understand.

Related to #16343

## Validation Steps Performed
* win32-input-mode sequences are now properly buffered ✅
* Standalone alt-key combinations are still being flushed ✅
DHowett pushed a commit that referenced this issue Dec 6, 2023
This is my proposal to avoid aborting ConPTY input parsing because a
read accidentally got split up into more than one chunk. This happens a
lot with WSL for me, as I often get (for instance) a
`\x1b[67;46;99;0;32;` input followed immediately by a `1_` input. The
current logic would cause both of these to be flushed out to the client
application.

This PR fixes the issue by only flushing either a standalone escape
character or a escape+character combination. It basically limits the
previous code to just `VTStates::Ground` and `VTStates::Escape`.

I'm not using the `_state` member, because `VTStates::OscParam` makes no
distinction between `\x1b]` and `\x1b]1234` and I only want to flush the
former. I felt like checking the contents of `run` directly is easier to
understand.

Related to #16343

## Validation Steps Performed
* win32-input-mode sequences are now properly buffered ✅
* Standalone alt-key combinations are still being flushed ✅

(cherry picked from commit 5f5ef10)
Service-Card-Id: 91270261
Service-Version: 1.19
DHowett pushed a commit that referenced this issue Dec 15, 2023
Even with the previous fixes we still randomly encounter win32-
input-mode sequences that are broken up in exactly such a way that
e.g. lone escape keys are encounters. Those for instance clear the
current prompt. The remaining parts of the sequence are then visible.

This changeset fixes the issue by skipping the entire force-to-ground
code whenever we saw at least 1 win32-input-mode sequence.

Related to #16343

## Validation Steps Performed
* Host a ConPTY inside ConPTY (= double the trouble) with cmd.exe
* Paste random amounts of text
* In the old code spurious `[..._` strings are seen
* In the new code they're consistently gone ✅
DHowett pushed a commit that referenced this issue Dec 15, 2023
Even with the previous fixes we still randomly encounter win32-
input-mode sequences that are broken up in exactly such a way that
e.g. lone escape keys are encounters. Those for instance clear the
current prompt. The remaining parts of the sequence are then visible.

This changeset fixes the issue by skipping the entire force-to-ground
code whenever we saw at least 1 win32-input-mode sequence.

Related to #16343

## Validation Steps Performed
* Host a ConPTY inside ConPTY (= double the trouble) with cmd.exe
* Paste random amounts of text
* In the old code spurious `[..._` strings are seen
* In the new code they're consistently gone ✅

(cherry picked from commit bc18348)
Service-Card-Id: 91337332
Service-Version: 1.19
@zadjii-msft
Copy link
Member

We're pretty sure this was fixed, just not actually marked as closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Issue-Bug It either shouldn't be doing this or needs an investigation. Needs-Tag-Fix Doesn't match tag requirements Priority-0 Bugs that we consider release-blocking/recall-class (P0) Product-Conpty For console issues specifically related to conpty zInbox-Bug Ignore me!
Projects
None yet
Development

No branches or pull requests

4 participants