Description
Version
Microsoft Windows 11 [Version 10.0.22000.493]
WSL Version
- WSL 2
- WSL 1
Kernel Version
5.10.60.1
Distro Version
Debian 11.2
Other Software
Occurs when any .exe app is involved in this scenario. My specific scenario involves:
socat version 1.7.4.1 (WSL side)
npiperelay version 0.1.0 (windows .exe)
OpenSSH_8.4p1 Debian-5, OpenSSL 1.1.1k 25 Mar 2021 (WSL side)
Repro Steps
Add the following to the WSL config file at /etc/wsl.conf:
[boot]
command = something that calls a .exe file and communicates over stdio
# Example from my use case setup, though it probably needs umask=000 for testing purposes to eliminate any factors of
# permissions for the actual socket file itself.
command = socat "UNIX-LISTEN:/run/user/ssh-agent-pipe,fork,umask=077" EXEC:"/mnt/c/Users/<WindowsUserprofile>/bin/npiperelay.exe -ep -ei -s //./pipe/openssh-ssh-agent",nofork
Restart WSL (wait out the 8-second rule or run wsl --shutdown first). Then attempt to run a program in WSL that expects communication through interop to work. In my example, this is:
SSH_AUTH_SOCK=/run/user/ssh-agent-pipe ssh-add -L
Expected Behavior
In general, the communication works without error. In the case above with socat, npiperelay and ssh, SSH keys added to the Windows-side SSH agent are listed.
Actual Behavior
An error is produced indicating that the communication has failed. In the socat and ssh example:
$ SSH_AUTH_SOCK=/run/user/ssh-agent-pipe ssh-add -L
error fetching identities: communication with agent failed
Diagnostic Logs
The script works as intended when executed directly in a regular WSL terminal window when backgrounding the socat process with the bash job control features.
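To compare a socat started from a terminal with one started by the boot command, the environment each process was launched with can be read from /proc. The following is a minimal sketch, not part of the original setup: `interop_from_environ` is a hypothetical helper name, and reading the environ file of a process started by the boot command will likely require root.

```shell
#!/bin/sh
# Hypothetical helper: print the WSL_INTEROP value stored in a
# /proc/<pid>/environ style file (NUL-separated KEY=VALUE pairs).
# Returns non-zero when the variable is absent or empty.
interop_from_environ() {
    tr '\0' '\n' < "$1" | sed -n 's/^WSL_INTEROP=//p' | grep .
}

# Usage sketch (commented out; assumes a socat process is running):
#   pid=$(pgrep -o socat)
#   interop_from_environ "/proc/$pid/environ" || echo "WSL_INTEROP not set"
```

If the helper prints a path for the terminal-launched socat but fails for the boot-launched one, the two processes differ in exactly the variable discussed here.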
Through some trial and error, it appears that a large part of the issue is that the WSL_INTEROP env variable is not set for the command specified in wsl.conf. Manually setting this env var for the process in the wsl.conf boot command, pointing it at a socket in /run/WSL that is also the value of WSL_INTEROP in an active WSL terminal, results in everything working as expected. However, once the terminal that originally had that WSL_INTEROP value exits, the socket appears to be closed: the process started by the [boot] command no longer communicates properly, even though the file in /run/WSL still exists. Attempting to use the socket named 1_interop does not appear to work either, even though it seems to always be present; it results in the same connection error in the ssh, socat, and npiperelay example.
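The manual workaround described above can be sketched as a small wrapper that the boot command would call instead of socat directly. This is only an illustration under the assumptions in this report: `find_interop_socket` is a hypothetical helper, the /run/WSL layout and the `*_interop` naming are taken from the observations above, and a socket file that exists is not guaranteed to be live, so the workaround remains as fragile as described.

```shell
#!/bin/sh
# Hypothetical helper: print the first socket matching <dir>/*_interop.
# 1_interop is skipped because it did not work in the testing described above.
# Returns non-zero when no candidate socket is found.
find_interop_socket() {
    dir=${1:-/run/WSL}
    for candidate in "$dir"/*_interop; do
        case $candidate in
            */1_interop) continue ;;
        esac
        if [ -S "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Wrapper usage sketch (commented out; the socat line is the one from the repro steps):
#   WSL_INTEROP=$(find_interop_socket) || exit 1
#   export WSL_INTEROP
#   exec socat "UNIX-LISTEN:/run/user/ssh-agent-pipe,fork,umask=077" \
#       EXEC:"/mnt/c/Users/<WindowsUserprofile>/bin/npiperelay.exe -ep -ei -s //./pipe/openssh-ssh-agent",nofork
```

Because the chosen socket belongs to some other session, the relay would still break once that session ends.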
This simply seems like a case of interop not being set up for applications that run through the command setting in the [boot] section of wsl.conf. Ideally, a socket should be created and made available through the WSL_INTEROP env var for such commands, with the interop connection staying alive as long as the process started by command is alive. This interop socket should not close or become inoperable during the lifetime of that process, even if it is long-running, and should only be removed/closed once the process exits.