Slow Linux VM boot [hv-sock proxy (vsudd) is not reachable] #1995

Closed
rn opened this issue May 2, 2018 · 3 comments
Comments

@rn
Contributor

rn commented May 2, 2018

There have been many reports with this error message:

Docker hv-sock proxy (vsudd) is not reachable

It indicates that the Hyper-V Linux VM did not start (completely). This can happen for several reasons, but a few root causes are common.

One of them is that the Linux VM is booting very slowly. Docker for Windows uses a timeout to detect problems with booting the VM; a slow boot exceeds that timeout, and the above error message is displayed.

Other root causes are possible; this issue deals with the Slow Linux VM boot case.

How to diagnose "Slow Linux VM boot":

Look at the log file (Diagnose and Feedback in the whale systray icon, then click on log file). Then look for the start of the kernel boot. It typically starts like this (note there may be several boots in the log file):

[14:56:59.464][Linux          ][Info   ] Trying to connect to vsudd...
[14:57:00.166][Moby           ][Info   ] Connected
[14:57:01.236][Moby           ][Info   ] [    0.000000] Linux version 4.9.60-linuxkit-aufs (root@4a42478ffb9a) (gcc version 6.3.0 (Alpine 6.3.0) ) #1 SMP Mon Nov 6 16:00:12 UTC 2017
[14:57:01.270][Moby           ][Info   ] [    0.000000] Command line: BOOT_IMAGE=/boot/kernel console=ttyS0 page_poison=1 ntp=gateway vsyscall=emulate panic=1 root=/dev/sr0 text
[14:57:01.299][Moby           ][Info   ] [    0.000000] x86/fpu: Legacy x87 FPU detected.
[14:57:01.318][Moby           ][Info   ] [    0.000000] x86/fpu: Using 'eager' FPU context switches.
[14:57:01.335][Moby           ][Info   ] [    0.000000] e820: BIOS-provided physical RAM map:
[14:57:01.354][Moby           ][Info   ] [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009ffff] usable
[14:57:01.370][Moby           ][Info   ] [    0.000000] BIOS-e820: [mem 0x00000000000c0000-0x00000000000fffff] reserved
[14:57:01.384][Moby           ][Info   ] [    0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000007eee9fff] usable
[14:57:01.408][Moby           ][Info   ] [    0.000000] BIOS-e820: [mem 0x000000007eeea000-0x000000007eef1fff] ACPI data

The numbers to the right of the [Info ] column are the timestamps from the kernel boot ([ 0.000000] in this case). As the kernel boots, these timestamps increase. On an idle system the first phase of the boot should take about 2-8 seconds, depending on your hardware. The first phase is the kernel booting until it switches to userspace. In the log file the end of that phase looks like this:

[18:11:00.706][Moby           ][Info   ] [    4.449386] Write protecting the kernel read-only data: 14336k
[18:11:00.722][Moby           ][Info   ] [    4.467857] Freeing unused kernel memory: 2040K
[18:11:00.743][Moby           ][Info   ] [    4.485663] Freeing unused kernel memory: 1368K
[18:11:02.142][Moby           ][Info   ] Welcome to LinuxKit

The Welcome to LinuxKit message is the first userspace message and, on this system (a typical laptop), it took ~4.5s to get there.
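To read this number off a long log without scrolling, something like the following Python sketch could help. It is not part of Docker for Windows; the regular expression and the function name `seconds_to_userspace` are assumptions based only on the line layout shown in the excerpts above.

```python
import re
from typing import Iterable, Optional

# Assumed layout, taken from the excerpts in this issue:
# [HH:MM:SS.mmm][Source][Level] [  kernel-seconds] message
LINE_RE = re.compile(
    r"^\[(?P<wall>\d{2}:\d{2}:\d{2}\.\d{3})\]"
    r"\[(?P<source>[^\]]*)\]"
    r"\[(?P<level>[^\]]*)\]\s"
    r"(?:\[\s*(?P<kernel>\d+\.\d+)\]\s?)?"
    r"(?P<message>.*)$"
)


def seconds_to_userspace(lines: Iterable[str]) -> Optional[float]:
    """Return the last kernel timestamp seen before 'Welcome to LinuxKit'.

    Returns None if the welcome message (or any kernel timestamp before it)
    is not found, e.g. because the boot never reached userspace.
    """
    last_kernel = None
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match is None:
            continue  # not a log line in the assumed layout
        if match.group("kernel") is not None:
            last_kernel = float(match.group("kernel"))
        elif match.group("message").startswith("Welcome to LinuxKit"):
            return last_kernel
    return None


sample = [
    "[18:11:00.722][Moby           ][Info   ] [    4.467857] Freeing unused kernel memory: 2040K",
    "[18:11:00.743][Moby           ][Info   ] [    4.485663] Freeing unused kernel memory: 1368K",
    "[18:11:02.142][Moby           ][Info   ] Welcome to LinuxKit",
]
print(seconds_to_userspace(sample))
```

If the log contains several boots, this returns the value for the first one that reached userspace; pass only the lines of the boot you are interested in.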

In various diagnostics, we have seen the boot take well over 30, or even 60, seconds to get to that point, or never finish the first phase at all before the timeout kicks in.

So if you see the kernel timestamps climbing beyond 30 seconds, that is a good indication that your VM is booting slowly.

Here is an example:

[11:33:04.022][Moby           ][Info   ] [   37.242906] cdrom: Uniform CD-ROM driver Revision: 3.20
[11:33:04.347][Moby           ][Info   ] [   37.620167] sr 1:0:0:1: [sr1] scsi-1 drive
[11:33:04.661][Moby           ][Info   ] [   37.620825] sd 0:0:0:0: [sda] Write Protect is off
[11:33:05.320][Moby           ][Info   ] [   37.623339] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[11:33:05.503][Moby           ][Info   ] [   37.636945]  sda: sda1
[11:33:05.807][Moby           ][Info   ] [   37.662678] sd 0:0:0:0: [sda] Attached SCSI disk
[11:33:06.233][Moby           ][Info   ] [   39.397258] sd 0:0:0:0: Attached scsi generic sg0 type 0
[11:33:06.638][Moby           ][Info   ] [   39.823022] sr 0:0:0:1: Attached scsi generic sg1 type 5
[11:33:06.949][Moby           ][Info   ] [   40.220400] sr 1:0:0:1: Attached scsi generic sg2 type 5
[11:33:07.310][Moby           ][Info   ] [   40.549676] tun: Universal TUN/TAP device driver, 1.6

This is halfway through the first phase of the kernel boot and it has already taken 40 seconds. That is very slow.
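The check described above (kernel timestamps passing 30 seconds before userspace starts) can be sketched in Python as follows. This is only an illustration: the 30-second threshold is the rule of thumb from this issue, not a Docker setting, and the names `first_phase_seconds` and `is_slow_boot` are made up for this example.

```python
import re

# Matches Moby lines that carry a kernel timestamp, e.g.
# [11:33:07.310][Moby           ][Info   ] [   40.549676] tun: ...
KERNEL_RE = re.compile(r"\]\[Moby\s*\]\[[^\]]*\]\s\[\s*(\d+\.\d+)\]\s?(.*)$")

SLOW_BOOT_SECONDS = 30.0  # rule of thumb from this issue, not a Docker setting


def first_phase_seconds(lines):
    """Highest kernel timestamp in the first boot phase of the latest boot.

    A 'Linux version' line is treated as the start of a new boot, and
    'Welcome to LinuxKit' as the end of the first phase. Returns None if
    no kernel timestamp was found.
    """
    latest = None
    counting = True
    for line in lines:
        if "Welcome to LinuxKit" in line:
            counting = False  # userspace reached; ignore later timestamps
            continue
        match = KERNEL_RE.search(line)
        if match is None:
            continue
        timestamp, message = float(match.group(1)), match.group(2)
        if message.startswith("Linux version"):
            latest, counting = timestamp, True  # a new boot starts here
        elif counting:
            latest = timestamp if latest is None else max(latest, timestamp)
    return latest


def is_slow_boot(lines, threshold=SLOW_BOOT_SECONDS):
    seconds = first_phase_seconds(lines)
    return seconds is not None and seconds > threshold


slow_sample = [
    "[11:33:06.949][Moby           ][Info   ] [   40.220400] sr 1:0:0:1: Attached scsi generic sg2 type 5",
    "[11:33:07.310][Moby           ][Info   ] [   40.549676] tun: Universal TUN/TAP device driver, 1.6",
]
print(first_phase_seconds(slow_sample), is_slow_boot(slow_sample))
```

To run it against a real log, pass the open file object (or its lines) to `is_slow_boot`; the path of the log file is whatever the Diagnose and Feedback dialog opens on your machine.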

Is there a workaround?

The most common cause seems to be high CPU (or other resource) utilisation when the VM is booting up. For example, when I artificially push the CPU utilisation to 100% with a couple of endless loops, the first phase of the VM boot takes close to 80 seconds (while it takes only 4.5s on an idle system).

  • Check the CPU utilisation on your system using Task Manager.
  • Stop processes which consume a significant amount of CPU (e.g., don't run bitcoin miners in the background when starting Docker for Windows).

You can also try the latest Edge releases (18.05.0-ce-rc1-win63 (17439) or newer), where the timeout is more tolerant and based more on the output from the VM. Note, however, that even with these changes, if your CPUs are fully loaded the boot is still likely to fail (it will just take longer to fail), but the changes may help with medium CPU loads.

@rn
Contributor Author

rn commented May 2, 2018

We are collating here all the diagnostics from users in other issues where we have seen the Slow Linux VM boot problem. Could you please try to reduce the CPU load, especially when you start Docker for Windows? It looks like Hyper-V VMs are very sensitive to CPU load.

If your system is completely idle and you still see the issue, please also let us know, so we can diagnose it further.

@docker-robott
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with a /remove-lifecycle stale comment.
Stale issues will be closed after an additional 30d of inactivity.

Prevent issues from auto-closing with a /lifecycle frozen comment.

If this issue is safe to close now please do so.

Send feedback to Docker Community Slack channels #docker-for-mac or #docker-for-windows.
/lifecycle stale

@docker-robott
Collaborator

Closed issues are locked after 30 days of inactivity.
This helps our team focus on active issues.

If you have found a problem that seems similar to this, please open a new issue.

Send feedback to Docker Community Slack channels #docker-for-mac or #docker-for-windows.
/lifecycle locked

@docker docker locked and limited conversation to collaborators Jun 28, 2020