WSL freezing on startup with networkingMode=mirrored after installing 2.1.0.0 even after downgrade #11005

Closed
1 of 2 tasks
DavidZidar opened this issue Jan 9, 2024 · 19 comments
@DavidZidar

Windows Version

Microsoft Windows [Version 10.0.22631.3007]

WSL Version

2.0.9.0 - 2.1.0.0

Are you using WSL 1 or WSL 2?

  • WSL 2 (selected)
  • WSL 1

Kernel Version

5.15.137.3-1

Distro Version

Debian 12

Other Software

No response

Repro Steps

Set the following in .wslconfig

[experimental]
networkingMode=mirrored

Then update to version 2.1.0.0 using wsl --update --pre-release
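
A minimal sketch of the .wslconfig change from the repro steps, written in Python. It assumes the file lives at %UserProfile%\.wslconfig and uses the [experimental] section shown above; the helper name set_networking_mode is made up for illustration. Because configparser drops comments when it rewrites a file, the sketch only prints the result rather than saving it.

```python
import configparser
import io
from pathlib import Path


def set_networking_mode(config_text: str, mode: str,
                        section: str = "experimental") -> str:
    """Return config_text with networkingMode set to `mode` in `section`.

    The section name follows the repro steps in this issue (assumption:
    your WSL version reads the key from that section).
    """
    parser = configparser.ConfigParser(interpolation=None)
    # configparser lowercases keys by default; keep "networkingMode" as written.
    parser.optionxform = str
    parser.read_string(config_text)
    if not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, "networkingMode", mode)
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


if __name__ == "__main__":
    path = Path.home() / ".wslconfig"
    current = path.read_text() if path.exists() else ""
    # Print the proposed file content; nothing is written to disk.
    print(set_networking_mode(current, "mirrored"))
```

Switching the second argument to "NAT" gives the configuration to fall back to while mirrored mode is broken.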

Expected Behavior

WSL should work.

Actual Behavior

WSL freezes, seemingly forever, during startup of a distribution. I can't even list distributions using wsl --list while it's frozen.

I have uninstalled the updates and see the same problem with 2.0.9.0 and 2.0.14.0 even though it worked fine before installing 2.1.0.0.

Diagnostic Logs

No response

github-actions bot commented Jan 9, 2024

Hi I'm an AI powered bot that finds similar issues based off the issue title.

Please view the issues below to see if they solve your problem, and if the issue describes your problem please consider closing this one and thumbs upping the other issue to help us prioritize it. Thank you!

Open similar issues:

Closed similar issues:

Note: You can give me feedback by thumbs upping or thumbs downing this comment.

@wan84

wan84 commented Jan 12, 2024

vim .wslconfig
networkingMode=NAT

@keith-horton
Member

@DavidZidar
Author

@keith-horton I did my best. The script runs wsl.exe several times, and it froze each time, so I had to forcefully terminate the WSL service after a while to let the script continue. I don't know whether these logs are useful, but I uploaded them to Azure Blob Storage so I can delete them later.
https://blob.steamcore.se/github/WslNetworkingLogs-2024-01-17_23-49-23.zip

@keith-horton
Member

Thanks! There are 2 different issues happening.

  1. vSwitch is in a bad state and is failing to create new endpoints to mirror networks into the WSL container.
  2. Eventually, our WSL service crashes.

Could you follow the instructions here https://github.com/Microsoft/WSL/blob/master/CONTRIBUTING.md#11-reporting-a-wsl-process-crash to capture a memory dump of WSL crashing?

Thanks!

@DavidZidar
Author

@keith-horton It doesn't crash; it becomes unresponsive. I collected this dump shortly after it seemed to halt. I hope it helps.
https://blob.steamcore.se/github/WslLogs-2024-01-18_22-44-58.zip

@keith-horton
Member

Hi there. In the previous traces submitted, we could clearly see our service at some stage no longer responding to calls, and eventually crashing and completely restarting.

@DavidZidar
Author

I killed the process tree using Process Explorer to get the script moving along. Isn't that what you were seeing?

@DavidZidar
Author

I just tried v2.1.1. It didn't solve the issue, but I noticed that mirrored mode worked once after a fresh boot. If I then shut down WSL using wsl --shutdown and try to open a new session, it is frozen again.

@keith-horton
Member

Thanks for sticking with this David.

Can you run the script https://github.com/microsoft/WSL/blob/master/diagnostics/collect-networking-logs.ps1, then stop it once you have a repro?

Just so it's clear: when you say it's frozen, does that mean:

  • the WSL Linux shell goes away?
  • or it stays but doesn't respond to keyboard input?
  • or it responds to keyboard input but has no networking?
  • does "wsl --version" or "wsl --status" complete, or fail with an error?

(This time, please don't kill any processes.)

If running "wsl" commands hangs or fails, can you get a process dump from Task Manager? Right-click on wslservice.exe and click "Create memory dump file".

Thanks!

@DavidZidar
Author

DavidZidar commented Jan 31, 2024

@keith-horton I'm not sure what you mean by "stop it". That script doesn't work without killing the processes because it can't even collect the "before" data.

I'm talking about this line here:

& wsl.exe -e $networkingBashScript 2>&1 > $folder/linux_network_configuration_before.log

I left it running for 10 minutes, and the log file linux_network_configuration_before.log is still 0 bytes.

> the WSL Linux shell goes away?

Windows Terminal seems to time out waiting for a shell if I try to open a new tab.
If I run wsl.exe in an existing tab it stays frozen forever, no prompt appears.

> or it stays but doesn't respond to keyboard input?
> or it responds to keyboard input but has no networking?

No, it doesn't even start.

> does "wsl --version" or "wsl --status" complete, or fail with an error?

wsl --version works:

WSL version: 2.1.1.0
Kernel version: 5.15.146.1-2
WSLg version: 1.0.60
MSRDC version: 1.2.5105
Direct3D version: 1.611.1-81528511
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.22631.3085

But wsl --status never completes: no output appears, and the prompt never returns unless I press Ctrl+C.

Here is a full dump file of wslservice.exe when it's frozen.
https://blob.steamcore.se/github/wslservice-dmp.zip
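
The difference between "wsl --version works" and "wsl --status never completes" can be checked without waiting indefinitely. The following is a rough Python sketch, not part of the WSL diagnostics; the function name probe and the 30-second timeout are arbitrary choices. On timeout it terminates only the wsl.exe client it started, not wslservice.exe, but if you are collecting traces for the maintainers, check with them before running anything that kills processes.

```python
import subprocess
from typing import Sequence


def probe(command: Sequence[str], timeout_s: float = 30.0) -> str:
    """Run a command and classify it as 'ok', 'failed', 'hung', or 'missing'."""
    try:
        result = subprocess.run(
            command,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child it started before re-raising.
        return "hung"
    except FileNotFoundError:
        # The executable is not on PATH (e.g. not running on Windows).
        return "missing"
    return "ok" if result.returncode == 0 else "failed"


if __name__ == "__main__":
    for args in (["wsl.exe", "--version"], ["wsl.exe", "--status"]):
        print(" ".join(args), "->", probe(args))
```

With the behaviour described in this comment, the first command would be expected to report "ok" and the second "hung".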

@keith-horton
Member

Thanks... this makes sense now. Yeah, the script is stuck trying to make a call through wsl.exe, but since the service is hung, that is also blocked. We're trying to see what is going on within Linux as to why our init instance doesn't seem to run. Can you take a manual trace?

It uses the wpr profile from the same WSL/diagnostics folder. Start the trace, reproduce the freeze, then stop the trace:

wpr -start wsl_networking.wprp -filemode

wpr -stop .\wsl_trace.etl
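
The start/stop pair can be wrapped so the trace is always stopped, even if the repro step throws. This is a Python sketch under assumptions: the context manager wpr_trace and the injectable run parameter are invented here for illustration, and it only issues the two wpr commands quoted above. wpr generally needs an elevated prompt.

```python
import subprocess
from contextlib import contextmanager
from typing import Callable, Iterator, Sequence

Runner = Callable[[Sequence[str]], None]


def _run(command: Sequence[str]) -> None:
    """Default runner: execute the command and raise if it fails."""
    subprocess.run(command, check=True)


@contextmanager
def wpr_trace(profile: str, output: str, run: Runner = _run) -> Iterator[None]:
    """Start a wpr trace on entry and save it to `output` on exit."""
    run(["wpr", "-start", profile, "-filemode"])
    try:
        yield
    finally:
        # Stop and save the trace even if the body raised.
        run(["wpr", "-stop", output])
```

Usage would look like `with wpr_trace("wsl_networking.wprp", r".\wsl_trace.etl"): input("Reproduce the freeze, then press Enter")`.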

Thanks David. We cannot reproduce this and have not heard of it from others.

Can you also share what distro you have installed, and any other software you have installed in the Linux partition?
Thanks!

@DavidZidar
Author

Interesting, I didn't even consider my distribution being a problem since wsl --status isn't even working.

Your question made me try a few things. I installed a new blank Ubuntu distro, and that one seems to work fine. Then, while listing my installed software, I realized I have a network mount in /etc/fstab (ceph-fuse) that is mounted using x-systemd.automount. I disabled it, and now my distro starts, but I have no network device, just lo and loopback0. Running ifup eth0 only tells me "unknown interface eth0".

So it seems that WSL freezes because systemd locks up when there is no network device in mirrored mode. Are there other steps I can take to debug this now that we have gotten this far?

Here's a trace from when it froze:
https://blob.steamcore.se/github/wsl_trace-frozen.zip

And here's a trace from when it started but had no eth0:
https://blob.steamcore.se/github/wsl_trace-no-eth0.zip
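
For anyone checking whether their own distro has a similar mount, here is a small Python sketch that flags /etc/fstab entries which are automounted by systemd or look network-backed. The list of file system hints and the function names are assumptions made for illustration, not an exhaustive or official check.

```python
from pathlib import Path
from typing import List, NamedTuple

# Illustrative substrings only; extend for your own setup.
NETWORK_FS_HINTS = ("ceph", "nfs", "cifs", "sshfs")


class FstabEntry(NamedTuple):
    device: str
    mount_point: str
    fs_type: str
    options: List[str]


def parse_fstab(text: str) -> List[FstabEntry]:
    """Parse fstab text, skipping blank lines, comments, and short lines."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue
        options = fields[3].split(",") if len(fields) > 3 else []
        entries.append(FstabEntry(fields[0], fields[1], fields[2], options))
    return entries


def risky_mounts(entries: List[FstabEntry]) -> List[FstabEntry]:
    """Entries that use x-systemd.automount, _netdev, or a network-like type."""
    risky = []
    for entry in entries:
        automount = "x-systemd.automount" in entry.options
        netdev = "_netdev" in entry.options
        network_fs = any(h in entry.fs_type.lower() for h in NETWORK_FS_HINTS)
        if automount or netdev or network_fs:
            risky.append(entry)
    return risky


if __name__ == "__main__":
    fstab = Path("/etc/fstab")
    if fstab.exists():
        for entry in risky_mounts(parse_fstab(fstab.read_text())):
            print(entry.mount_point, entry.fs_type, ",".join(entry.options))
```

Commenting out a flagged entry and restarting the distro is one way to test whether it is involved in a hang.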

@DavidZidar
Author

I also ran the collect-networking-logs.ps1 script now that WSL starts with no eth0.

https://blob.steamcore.se/github/WslNetworkingLogs-2024-01-31_20-42-19.zip

@DavidZidar
Author

Sorry for the spam, but I just double-checked and the fresh Ubuntu install doesn't have working networking either. It has both an eth0 and an eth1, but neither has an IP address.
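
A quick way to list interfaces that lack an IPv4 address inside the distro is to parse the JSON output of `ip -j addr show`. The Python sketch below is only an illustration; the function names are made up, and ignoring lo and loopback0 by default is an assumption based on the interface names mentioned in this thread.

```python
import json
from typing import Dict, List, Sequence


def ipv4_by_interface(ip_json: str) -> Dict[str, List[str]]:
    """Map each interface name to its IPv4 addresses from `ip -j addr show`."""
    result: Dict[str, List[str]] = {}
    for link in json.loads(ip_json):
        result[link["ifname"]] = [
            addr["local"]
            for addr in link.get("addr_info", [])
            if addr.get("family") == "inet" and "local" in addr
        ]
    return result


def interfaces_without_ipv4(
    ip_json: str, ignore: Sequence[str] = ("lo", "loopback0")
) -> List[str]:
    """Names of interfaces with no IPv4 address, excluding loopback devices."""
    return [
        name
        for name, addrs in ipv4_by_interface(ip_json).items()
        if not addrs and name not in ignore
    ]
```

Feeding it the text returned by `subprocess.check_output(["ip", "-j", "addr", "show"], text=True)` would, in the situation described here, be expected to list both eth0 and eth1.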

@DavidZidar
Author

I just stumbled across issue #11263, so I tried removing the internal Hyper-V switch that I had configured but wasn't even using. Suddenly, mirrored mode seems to be working again.
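
To see which internal Hyper-V switches exist before removing one, the Get-VMSwitch cmdlet from the Hyper-V PowerShell module can be queried and its JSON output filtered. This Python sketch is an assumption-laden illustration: the constant LIST_SWITCHES_PS and the function internal_switch_names are invented here, and the listing only shows candidates; it does not tell you which switch, if any, is at fault.

```python
import json
from typing import List

# PowerShell pipeline to run in an elevated prompt. SwitchType is converted
# to text so the JSON carries "Internal" rather than an enum number.
LIST_SWITCHES_PS = (
    "Get-VMSwitch | "
    "Select-Object Name, @{n='SwitchType'; e={$_.SwitchType.ToString()}} | "
    "ConvertTo-Json"
)


def internal_switch_names(switches_json: str) -> List[str]:
    """Return names of switches whose SwitchType is 'Internal'."""
    if not switches_json.strip():
        # No switches at all: ConvertTo-Json prints nothing.
        return []
    data = json.loads(switches_json)
    if isinstance(data, dict):
        # ConvertTo-Json emits a bare object when there is a single switch.
        data = [data]
    return [s["Name"] for s in data if s.get("SwitchType") == "Internal"]
```

The JSON could be obtained with `subprocess.run(["powershell.exe", "-NoProfile", "-Command", LIST_SWITCHES_PS], capture_output=True, text=True).stdout`.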

@keith-horton
Member

@DavidZidar, thanks for the follow-up - glad it's working now!
Sorry - I don't know what was going so badly wrong deep in vSwitch.

@DavidZidar
Author

@keith-horton I can imagine it's complicated stuff. Thanks for investigating anyway! Unless you would like me to keep this issue open, I'll close it in favor of the other one.

@keith-horton
Member

Sure, feel free to close this one.
