.exe Interop not working on commands specified in boot command option in wsl.conf #11121

@Kytech

Description

Version

Microsoft Windows 11 [Version 10.0.22000.493]

WSL Version

  • WSL 2
  • WSL 1 (issue not applicable to WSL 1)

Kernel Version

5.10.60.1

Distro Version

Debian 11.2

Other Software

Occurs when any .exe app is involved in this scenario. My specific scenario involves:

socat version 1.7.4.1 (WSL side)
npiperelay version 0.1.0 (windows .exe)
OpenSSH_8.4p1 Debian-5, OpenSSL 1.1.1k 25 Mar 2021 (WSL side)

Repro Steps

Add the following to /etc/wsl.conf:

[boot]
command = something that calls a .exe file and communicates over stdio

# Example from my setup. For testing, umask=000 is probably the better choice, to rule out
# permissions on the socket file itself as a factor.
command =  socat "UNIX-LISTEN:/run/user/ssh-agent-pipe,fork,umask=077" EXEC:"/mnt/c/Users/<WindowsUserprofile>/bin/npiperelay.exe -ep -ei -s //./pipe/openssh-ssh-agent",nofork

Restart WSL (wait out the 8-second rule or run wsl --shutdown first). Then run a program in WSL that depends on the interop communication working. In my example, this is:

SSH_AUTH_SOCK=/run/user/ssh-agent-pipe ssh-add -L

Expected Behavior

In general, the communication works without error. In the case above with socat, npiperelay and ssh, SSH keys added to the Windows-side SSH agent are listed.

Actual Behavior

An error is produced indicating that the communication has failed. In the socat and ssh example:

$ SSH_AUTH_SOCK=/run/user/ssh-agent-pipe ssh-add -L
error fetching identities: communication with agent failed

Additionally, in the original issue (linked at the end of this description), others who hit the same problem with other programs started this way reported that network interop was not working either. The program would listen on a port, but traffic sent from the Windows side to that port on localhost was not forwarded to the WSL process.

Diagnostic Logs

The command works as intended when run directly in a regular WSL terminal window, with the socat process backgrounded using bash job control.

Through some trial and error, it appears that a large part of the issue is that the WSL_INTEROP env variable is not set for the command specified in wsl.conf. If I manually set this env var for the boot command so that it points to a socket in /run/WSL that an active WSL terminal also has in its own WSL_INTEROP variable, everything works as expected. However, once the terminal that originally held that WSL_INTEROP value exits, the socket appears to be closed: the process started by the [boot] command no longer communicates properly, even though the socket file still exists in /run/WSL. Using the socket named 1_interop does not work either, even though it always seems to be present; it produces the same connection error in the ssh, socat, and npiperelay example.
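For anyone who wants to confirm the missing variable on their own setup, here is a minimal sketch of the check. It reads the environment a process was started with from /proc/<pid>/environ. The function name interop_of_pid is just a placeholder for illustration, and the pgrep call in the usage comment assumes socat is the process started by the boot command.

```shell
#!/bin/sh
# Print the WSL_INTEROP value a running process was started with.
# Returns non-zero if the variable is unset or empty, or if the
# process environment cannot be read.
# interop_of_pid is a hypothetical helper name, not part of WSL.
interop_of_pid() {
    pid="$1"
    environ="/proc/$pid/environ"
    [ -r "$environ" ] || return 1
    # /proc/<pid>/environ is NUL-separated; turn it into one VAR=value per line.
    tr '\0' '\n' < "$environ" | sed -n 's/^WSL_INTEROP=//p' | grep .
}

# Example usage (run as root if the boot command runs as root):
#   interop_of_pid "$(pgrep -o socat)" || echo "WSL_INTEROP not set for socat"
#   interop_of_pid "$$"    # compare against the interactive shell
```

In an interactive terminal the second call should print a path under /run/WSL, while the first would be expected to print nothing for a process started from [boot], if the diagnosis above is right.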

This simply seems like a case of interop not being set up for applications that run through the command option of the wsl.conf [boot] section. Ideally, a socket should be created and made available through the WSL_INTEROP env var for such commands, with the interop connection staying alive as long as the process started by command is alive. This interop socket should not close or become inoperable during the lifetime of that process, even if it is long-running, and should only be removed or closed once the process exits.
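Until that is the case, the manual workaround described above can be roughly sketched as a wrapper. This is only a stopgap under the assumptions already noted: it borrows a socket that belongs to some other WSL session, so it only helps while that session is alive, and it finds nothing at all if no session exists yet when the boot command runs. The function names find_interop_socket and run_with_interop are placeholders, not anything WSL provides.

```shell
#!/bin/sh
# Print the path of the first usable-looking interop socket in a directory
# (default /run/WSL). Skips 1_interop, which did not work in my testing.
# Returns non-zero if no candidate socket is found.
find_interop_socket() {
    dir="${1:-/run/WSL}"
    for sock in "$dir"/*_interop; do
        [ -S "$sock" ] || continue                       # must be a socket
        [ "${sock##*/}" = "1_interop" ] && continue      # known not to work
        printf '%s\n' "$sock"
        return 0
    done
    return 1
}

# Run a command with WSL_INTEROP pointing at a borrowed socket.
# The export happens in a subshell so the caller's environment is untouched.
run_with_interop() {
    sock="$(find_interop_socket)" || return 1
    ( export WSL_INTEROP="$sock"; exec "$@" )
}

# Example usage from a script referenced by the [boot] command:
#   run_with_interop socat ... EXEC:"/mnt/c/.../npiperelay.exe ..."
```

This does not address the actual bug: the borrowed socket stops working when its owning terminal exits, which is exactly why a dedicated socket for the boot command is needed.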


Re-filing #8056 due to issue being automatically closed without any follow-up from maintainers. See previous issue for comments from others who encountered this issue.
