Slow image pulling when using VZ VM type #648

Closed
fungiboletus opened this issue Mar 10, 2023 · 21 comments · Fixed by #848
@fungiboletus

fungiboletus commented Mar 10, 2023

Description

Pulling images is very slow when I set the VM type to VZ.

Running the Docker hello-world image from a clean colima instance takes 3 seconds with the default parameters, and 17 seconds when --vm-type=vz is set. The time seems to be spent initiating the connection, because the download speed itself looks fine. It could be a slow DNS query, a slow TCP handshake, or a slow TLS handshake; I don't know.

Version

Colima Version: 0.5.4
Lima Version: 0.15.0
Qemu Version: 7.2.0

Operating System

macOS M1 >= 13 (Ventura)

Output of colima status

INFO[0000] colima is running using macOS Virtualization.Framework 
INFO[0000] arch: aarch64                                
INFO[0000] runtime: docker                              
INFO[0000] mountType: virtiofs                          
INFO[0000] socket: unix:///Users/fungiboletus/.colima/default/docker.sock 

Reproduction Steps

  1. colima start --cpu 4 --memory 4
  2. time docker run -it --rm hello-world
  3. colima delete
  4. colima start --cpu 4 --memory 4 --vm-type=vz
  5. time docker run -it --rm hello-world
  6. colima delete
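
The steps above can be scripted so that the comparison is easy to repeat. The following is only a sketch: `elapsed_seconds` and `compare_vm_types` are hypothetical helper names, the `-it` flags are dropped because no interactive terminal is needed once the output is discarded, and `colima delete` may still ask for confirmation on each pass.

```shell
#!/bin/sh
# Sketch of the reproduction steps: time the first `docker run` on a fresh
# instance, once with the default settings and once with --vm-type=vz.

# elapsed_seconds CMD [ARGS...]: run CMD quietly and print the elapsed
# wall-clock time in whole seconds.
elapsed_seconds() {
  start=$(date +%s)
  "$@" >/dev/null 2>&1
  end=$(date +%s)
  echo $((end - start))
}

# compare_vm_types (hypothetical helper): repeat steps 1-6 in a loop.
compare_vm_types() {
  for vm_flag in "" "--vm-type=vz"; do
    # $vm_flag is left unquoted on purpose so the empty case adds no argument.
    colima start --cpu 4 --memory 4 $vm_flag || return 1
    secs=$(elapsed_seconds docker run --rm hello-world)
    echo "${vm_flag:-default}: first run took ${secs}s"
    # Delete the instance so the next pass starts without a cached image.
    colima delete
  done
}
```

Sourcing the file and calling `compare_vm_types` prints one line per VM type. The timings have one-second resolution, which is enough to tell the 3 second case from the 17 second case reported here.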

Expected behaviour

The duration of pulling and running the hello-world image should be similar.

Additional context

No response

@abiosoft
Owner

Hi, thanks for reporting.
From your experience, does it happen always or only the first time and subsequent pulls are normal?

@fungiboletus
Author

No problems 🙂

No, it's always slow to pull all images (and layers). I also tried to debug inside a container, but while it did feel slow, it didn't feel that slow to curl/ping google. I could run a few more tests and get precise numbers if needed.

@PiotrKlimczak

PiotrKlimczak commented Mar 11, 2023

I just installed Colima for the first time and noticed the same problem. My setup is identical to yours.
Every single repo interaction does nothing for the first few seconds, regardless of whether the repo is external or internal and regardless of the operation; even login is slow.

@baschny

baschny commented Mar 16, 2023

Same for me. I just installed it to try the performance with vmType: vz and mountType: virtiofs on macOS 13.2.1 (x86), and docker pulls take ages to even start.

With brew:

==> Pouring lima--0.15.0.ventura.bottle.tar.gz
==> Pouring colima--0.5.4.ventura.bottle.tar.gz

@AndreasA

AndreasA commented Apr 5, 2023

I have done some tests and I think it might be related to DNS and IPv6.
This is with the default dns setting of [] in the YAML configuration, which from what I gather can be forced on startup using:

colima start --dns "" --disk 200 --cpu 6 --memory 10 --ssh-agent --vm-type vz --mount-type virtiofs

when using ping:

time colima ssh -- ping  www.github.com -c 1

PING www.github.com (...): 56 data bytes
64 bytes from ....: seq=0 ttl=42 time=0.497 ms

--- www.github.com ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.497/0.497/0.497 ms
colima ssh -- ping www.github.com -c 1  0,12s user 0,09s system 17% cpu 1,203 total

requires over 1 second for just one ping command to finish executing (it takes quite long to even start).

if forcing ping with IPv4:

time colima ssh -- ping -4 www.github.com -c 1

PING www.github.com (....): 56 data bytes
64 bytes from ....: seq=0 ttl=42 time=0.256 ms

--- www.github.com ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.256/0.256/0.256 ms
colima ssh -- ping -4 www.github.com -c 1  0,11s user 0,08s system 100% cpu 0,193 total

it takes only about 0.2 seconds for everything to finish.

And if I do a docker pull:

docker image rm hello-world:latest

time docker pull hello-world:latest
latest: Pulling from library/hello-world
...: Pull complete
Digest: ....
Status: Downloaded newer image for hello-world:latest
docker.io/library/hello-world:latest
docker pull hello-world:latest  0,04s user 0,02s system 0% cpu 16,414 total

if I override the default DNS with google dns:

colima start --dns 8.8.8.8 --dns 8.8.4.4 --disk 200 --cpu 6 --memory 10 --ssh-agent --vm-type vz --mount-type virtiofs

ping without forcing IP v4:

time colima ssh -- ping www.github.com -c 1

PING www.github.com (...): 56 data bytes
64 bytes from ...: seq=0 ttl=42 time=0.260 ms

--- www.github.com ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.260/0.260/0.260 ms
colima ssh -- ping www.github.com -c 1  0,12s user 0,09s system 104% cpu 0,199 total

results in basically the same response as before upon forcing IP v4.

This also matches the docker pull results:

docker image rm hello-world:latest

time docker pull hello-world:latest
latest: Pulling from library/hello-world
...: Pull complete
Digest: ...
Status: Downloaded newer image for hello-world:latest
docker.io/library/hello-world:latest
docker pull hello-world:latest  0,04s user 0,02s system 2% cpu 2,318 total

Not sure if the issue is in colima directly or maybe in lima. If it is the latter, somebody should create an issue there, if no such issue exists yet.

However, specifying the correct DNS manually might not always be that easy, as it might change depending on the local/corporate network or VPN. Though I guess for the colima VM itself that might not matter, except if there is a private docker registry that is only reachable using the correct DNS server 😄
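
The pull timings in this comment can be repeated before and after changing the DNS settings with a small helper. This is a sketch only: `time_fresh_pull` is a hypothetical name, it relies on nothing beyond the `docker image rm` and `docker pull` commands already shown, and it reports whole seconds.

```shell
#!/bin/sh
# time_fresh_pull IMAGE [RUNS]: remove any local copy of IMAGE, pull it
# again, and print the elapsed whole seconds. Repeats RUNS times (default 3).
time_fresh_pull() {
  image=$1
  runs=${2:-3}
  i=1
  while [ "$i" -le "$runs" ]; do
    # Ignore the error when the image is not present locally yet.
    docker image rm "$image" >/dev/null 2>&1 || true
    start=$(date +%s)
    docker pull "$image" >/dev/null || return 1
    end=$(date +%s)
    echo "run $i: $image pulled in $((end - start))s"
    i=$((i + 1))
  done
}
```

For example, `time_fresh_pull hello-world:latest` could be run once on an instance started with the default DNS setting and once on an instance started with `--dns 8.8.8.8 --dns 8.8.4.4`, to see whether the gap between roughly 16 seconds and roughly 2 seconds is reproducible.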

@AndreasA

AndreasA commented Apr 6, 2023

OK. I have done some more testing and the best solution seems to be to supply --dns 192.168.5.3 on startup. This way it sets the correct default DNS and seems to force DNS requests to use IPv4 internally, as the issue does not occur.

Also, applying dns does not seem to change the file at all after initial instance creation, but it still seems to fix the slow DNS resolution on subsequent starts.

So e.g.:

colima start --dns 192.168.5.3 --disk 200 --cpu 6 --memory 10 --ssh-agent --vm-type vz --mount-type virtiofs

@AndreasA

Hi @abiosoft, any idea regarding an official fix here that does not require setting the DNS on start? Or do you know if this might actually be a lima issue?

@abiosoft
Owner

@AndreasA I am wondering why you need to set the DNS to 192.168.5.3 because it is actually the default DNS when --vm-type=vz.

Or this could possibly be an ipv6 issue and maybe try this workaround #686 (comment).

@AndreasA

@abiosoft it is most likely an IPv6 issue. Specifying the DNS solves it. It doesn't matter what is supplied (probably as long as an IPv4 address is used), so supplying the default IP works as well.

However, my setup already uses link-local for network settings and the issue still occurs.

For some reason specifying the DNS manually solves it, even if it is set to the default DNS. Not 100% sure why setting it to e.g. the default works as well, but according to ping tests it seems IPv6 related.

@fungiboletus
Author

Maybe we could have IPv6 disabled by default, as is tradition.

@AndreasA

AndreasA commented Apr 12, 2023

@abiosoft I noticed that if --dns 192.168.5.3 is provided, names like host.docker.internal will not resolve. So disabling IPv6 for now, as @fungiboletus suggested, seems to be the safest option. Disabling it manually already, e.g. using limactl, is not possible, is it?

Otherwise, great work with colima so far. It works way better (especially regarding performance) than docker desktop 😄 Of course there are still some issues like this one (for now), but mostly it works quite well.

@AndreasA

AndreasA commented Apr 26, 2023

Sadly I just noticed that the --dns 192.168.5.3 is a poor workaround as various things do not work correctly:

so it is probably better to keep the slower pull times for now.

@terev
Contributor

terev commented May 7, 2023

Seems related to lima-vm/lima#1333 .

@kconley-sq

For some reason specifying the DNS manually solves it, even if it is set to the default DNS. Not 100% sure why setting it to e.g. the default works as well, but according to ping tests it seems IPv6 related.

One thing I noticed is that providing any DNS resolver via --dns when starting a Colima instance disables the underlying Lima VM's host resolver:

l.HostResolver.Enabled = len(conf.Network.DNSResolvers) == 0

I read others are reporting that the use of Lima's host resolver with vz VMs may explain issues like lima-vm/lima#1333.

@caire-bear

Thanks for the writeup @AndreasA ! I was also able to repro the problem and fix following your steps.

Just adding a +1 here. I'm def not a docker expert, but could tell something was off for the last month+, enough to finally start googling. This is a pretty huge usability impact, especially if using docker compose and using large images. We run a kafka cluster locally and those images are large. I was seeing a lot of connection timeouts talking to registries and downloading images seemed slower than it should have been.

With the 8.8.8.8 workaround above, the speed of getting everything up and running from a cold cache is night and day different. Feels like it should.

Should this workaround be documented somewhere in the README/docs until the upstream fix has been figured out?

@aaronlehmann

It sounds like this is fixed now on lima's master branch: lima-vm/lima#1333 (comment)

However, I did notice that upgrading lima to this commit causes DNS to stop working in VMs that were already created on an old version. It sounds like a possible fix for this is under investigation.

@aaronlehmann

The DNS issue appears to be fixed in lima's master branch, but I ran into a problem where upgrading lima breaks DNS for an existing colima VM: lima-vm/lima#1783

The lima maintainers seem to think this is a colima-specific issue, but I'm a bit out of my depth trying to figure out the details. @abiosoft, would you be able to take a look at this?

@aaronlehmann

(@abiosoft: Also let me know if it's better to file a separate issue for the lima upgrade problem mentioned above, as it seems it may not be connected...)

@abiosoft
Owner

abiosoft commented Oct 5, 2023

@aaronlehmann I would recommend opening a separate issue for that.

Thanks.

@aaronlehmann

Sure thing! I've opened #827.

@AndreasA

AndreasA commented Nov 2, 2023

Since the lima 0.18.x update this seems to work fine (another person also confirmed that it seems to work on their M2, where there were a lot of issues before). Does anyone else still have issues here? If not, maybe the issue could be closed with a corresponding note.

@abiosoft abiosoft added this to the v0.6.0 milestone Nov 12, 2023
jesse-c added a commit to SeldonIO/MLServer that referenced this issue May 30, 2024
* build: Lock GitHub runners' OS

This was motivated by our macOS jobs failing [2] because
colima is missing. It looks like this is because the
latest versions of the macOS runner no longer have
colima installed by default [1].

colima is now explicitly installed.

[1] actions/runner-images#6216
[2] `/Users/runner/work/_temp/f19ffbff-27a9-4fc7-80b6-97791d2de141.sh: line 9: colima: command not found`

* build: Lock Colima

* build: Move macOS Docker installation to script

* build: Move macOS libomp activation to script

* build: Use latest Colima

The > 0.6.0 releases actually fix the issue we have linked [1][2][3].

[1] abiosoft/colima#577
[2] https://github.com/jesse-c/MLServer/blob/c3acd60995a72141027eff506e4fd330fe824179/hack/install-docker-macos.sh#L18-L20
[3] > Switch to new user-v2 network. Fixes abiosoft/colima#648, abiosoft/colima#603, abiosoft/colima#577, abiosoft/colima#779, abiosoft/colima#137, abiosoft/colima#740.