SSH connect failures on Mitogen 0.2.9 on WSL Ubuntu 18.04 #681
Comments
Hi, We are experiencing the exact same issue when running a playbook in WSL with Ubuntu over multiple hosts. There are no issues when running a playbook with a single host or when running with Edit: Running with I would gladly help out with additional troubleshooting but I need some pointers on where to start. Environment: |
Same thing ( |
Same here, single connection works fine (--limit single host), else I get the same error. Using WSL1 Debian Buster |
Could someone try latest |
I'm still seeing failures on
|
Does anyone know if there's a way to get a WSL machine to test with? We use Azure Devops to test but afaik there's no WSL env we can enable |
You can probably run the azure devops agent inside a WSL instance and use that as the agent pool in your devops pipeline. |
Also reproducible for me most of the time, it seems much more prone to doing it on "copy" tasks for some reason. I'm surprised because it was all working fine a while ago, so I suspect WSL has updated or something. If I can help with any debug details let me know and I will try. |
We'd need a WSL instance for that right? 🤔 is there an OSS-supported test env (like Travis, Circle, Azure devops, etc) that offer WSL instances? |
I wonder if WSL added a timeout on connection or something? 🤔 The error of |
I'm still on WSL1 and definitely seeing the problem. Sadly, I don't know of any test envs that provide WSL instances to test. |
Could it be due to an ssh timeout error maybe? I found https://www.reddit.com/r/bashonubuntuonwindows/comments/bj617c/how_to_keep_wsl_shell_open_when_ssh_session/ . Wild shot in the dark but if it used to work with the same code and now doesn't then maybe WSL changed their default ssh session connection time? |
I'll dig through the linked post and do some experimenting but an initial look through it doesn't seem to apply, as there is no delay at all between the success and failures. One - and only one - random machine always succeeds and the others immediately fail. It feels more like when it is trying to open a bunch of SSH connections in parallel but only one is being allowed, the rest are immediately rejected by the underlying subsystems (networking maybe?). It's important to note that for me, at least, I'm not sure it ever worked properly. I don't think I tried connecting to an inventory with multiple hosts on WSL before encountering this problem. |
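One way to test the "parallel connections get rejected" theory outside of Ansible and Mitogen is to open plain TCP connections to several hosts at the same moment from inside WSL. The following is only a minimal sketch: the function names `try_connect` and `probe_parallel` and the example hostnames are made up for illustration, and it exercises raw TCP only, not the `ssh` child processes that Mitogen actually starts. If every connection succeeds here, the cause probably lies elsewhere.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def try_connect(host, port=22, timeout=5.0):
    """Open one TCP connection; return (host, port, None) or (host, port, error)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return host, port, None
    except OSError as exc:
        # Covers refused connections, timeouts and DNS failures alike.
        return host, port, repr(exc)


def probe_parallel(targets, timeout=5.0):
    """Connect to every (host, port) pair concurrently, one thread per target."""
    if not targets:
        return []
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        futures = [pool.submit(try_connect, h, p, timeout) for h, p in targets]
        # Results come back in the same order as the targets list.
        return [f.result() for f in futures]


# Example use (hypothetical hostnames, replace with hosts from your inventory):
#   for host, port, err in probe_parallel([("host1.example.com", 22),
#                                          ("host2.example.com", 22)]):
#       print(host, port, "ok" if err is None else err)
```

Running it once with the Windows Defender firewall on and once with it off might also help confirm or rule out the firewall as the thing rejecting the extra connections.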
Ok. I'm not too sure why the underlying subsystems would be rejecting the other connections 😞 maybe @dw knows? He fixed WSL stuff last time: 22bab87 and 56943d3 . I do see other ssh-related WSL issues have been filed in the past: microsoft/WSL#3503, not sure if relevant though. |
Just as an additional point, I am seeing the failures and I am only targeting a single host. I agree it seems like a very quick failure. |
Anyone tried WSL2 yet with this? |
Just to chime in with a possible workaround, I was able to work around this by disabling the Windows Defender firewall. I'm not sure why that solves it. All prior steps in the playbook execute successfully. I can also confirm the LAN IP the playbook was run against is accessible with both the firewall on and off. The task in the playbook is:
And the backtrace from the failed execution of the task is:
My platform is WSL1 with Ubuntu 18.04.3 LTS, on Windows 10 1904.985. |
I'm seeing consistent failures when trying to connect via SSH when multiple hosts are specified in the inventory:
One host connects; all of the other host connections fail. If there are more than two hosts in the inventory, all but one fail with the same errors. Repeated runs show that which hosts fail appears to be random.
Environment:
Mitogen 0.2.9
Windows 10 Pro, V. 1809, OS build 17763.914
WSL Ubuntu 18.04.3 LTS
ansible 2.7.11
  config file = /home/gchaix/repos/xxx/ansible/ansible.cfg
  configured module search path = [u'/home/gchaix/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /home/gchaix/.local/lib/python2.7/site-packages/ansible
  executable location = /home/gchaix/.local/bin/ansible
  python version = 2.7.15+ (default, Oct 7 2019, 17:39:04) [GCC 7.4.0]
Host target OS is generally CentOS 7.x but this also appears to be happening with other distros (Ubuntu, etc.)
No patches on Ansible or Mitogen. I tried running it with Mitogen current master, same behavior. This feels like it might be related to #319 but I'm not familiar enough with the internals of WSL to really say for certain. Interestingly, running Ansible with -vvv seems to bypass the issue, as all host connections succeed, whereas running with just --verbose produces failures and the output above.