Colima hangs a short while after starting #552
Comments
Thanks for reporting this. I have experienced the same with the VZ VM type and am still troubleshooting. If you do not need the faster filesystem access, QEMU is more stable at the moment. |
@blame-git I have been trying to reproduce this but couldn't find a way |
@balajiv113 I am not sure what fixed it precisely, but I suspect it's lima-vm/lima#1261. I am still monitoring it, but it seems to have stabilised now and is no longer freezing. |
I'm still experiencing this issue from time to time, but I haven't found a way to systematically reproduce it |
I’m still able to reproduce, but it is less frequent now. What seems to trigger it for me is a container that downloads files from the internet at high speed and writes them out to volumes (mounted via virtiofs) |
Oops, didn’t mean to close |
Do these steps make colima hang? If so, can you share some commands to reproduce them? |
Bumping, I'm having the exact same issue. |
Also having this issue and it drives me nuts :( |
If it's consistently reproducible, please do share the steps. I'm having a hard time reproducing this. It does happen, but not always. |
@balajiv113, there are no weird steps actually. Just start colima, start a couple of containers, and it hangs after 5 min. I only have mongodb, nats, mysql and redis (from bitnami) containers, within a custom network. |
If you could share the docker-compose for this setup it would be great. |
Here you go
|
Thanks for the compose file. Will try the same with M1 |
@balajiv113 it only happens if VM type is |
@abiosoft - Yes, trying with vz only |
This is also happening to me with a mongodb setup. After working with the containers for a while, it hangs. Once it happens, the error persists. Deleting the VM and starting from scratch solves the issue until it happens again some days later. edit: I'm using qemu on an Intel MacBook Pro |
Having the same issue with my M1 MBP, I have to run colima stop -f and start again frequently :( |
The network stack for vz was updated in lima-vm/lima#1383 (targeted for v0.16). If you are interested in trying it out, use the latest lima master with this new network stack. |
I get this issue when I switch from MacOS
|
So the lima VM is started successfully, I can shell into it from
|
Oh I figured it out, to use containerd I uninstalled The fix was to remove the link and install |
I can reproduce this issue on MacOS by running a Syncthing instance with the following docker-compose. Colima hangs after a few minutes to a few hours. It might be correlated with the download speed. Last time it hung after downloading 19.5 GiB at ~20.4 MiB/s.
colima version: 0.5.4
|
Note: This comment is preliminary. I hope to narrow things down and provide a set up where I can reproduce the issue reliably. This looks like gremlin territory, so no promises. Not sure if this is related to the thread above, but I'm getting some hangs similar to what's described in this thread. The difference though is that I'm not using VZ. I'm using the default setup, with QEMU. It's an Intel Mac, running Ventura 13.4. This worked fine until today, when I added an extra container to my setup. The hangs appear to happen at the end of a "composer install" or an "npm install". The first time, the hang cured itself after a few minutes and the containers were accessible again. The second time, in the same session, it didn't recover.
I did a Restarting colima seems to get things working again, so I don't know how long it will take to reproduce the issue again. Colima had been running for about a week without a restart, but through computer sleeps, if that's relevant. Some filesystem synchronisation issues seem to be at play. Npm complains about permissions on a file in the cache. This cache is on a Docker volume. Composer gave an error when I ran it just now on a local mount point. What these two have in common is that they both operate on lots of small files very quickly. If this is related, I guess the filesystem could get stuck, bringing everything down with it.
Context: My setup is basically based on this repository: https://gitlab.com/nucleware/docker-dev . Please excuse the lack of documentation. I made that to ease my multi-project PHP development setup, and it still had many pitfalls and annoyances. My volumes are mounted using the "local" driver, even though that repo defaults to nfs on a Mac.
Update 1: Restarting colima didn't actually make everything work again. I could run a shell in my containers, but I couldn't connect to my traefik container with my browser. I had to reboot my Mac to be able to connect. The filesystem problems are still there. |
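The "lots of small files very quickly" pattern described in the comment above can be approximated with a small script. This is only a rough sketch for exercising a mounted directory, not a confirmed reproducer of the hang; the function name `churn_small_files` and its default `count` and `size` values are made up for illustration.

```python
import os
import tempfile
import time


def churn_small_files(directory, count=1000, size=256):
    """Create, read back, and delete `count` small files in `directory`.

    Returns the elapsed time in seconds. (Hypothetical helper, not part of
    colima, lima, npm, or composer.)
    """
    payload = os.urandom(size)
    start = time.monotonic()
    paths = []
    # Write phase: many tiny files in quick succession.
    for i in range(count):
        path = os.path.join(directory, f"churn-{i:06d}.bin")
        with open(path, "wb") as fh:
            fh.write(payload)
        paths.append(path)
    # Read-back and delete phase: verify content, then remove each file.
    for path in paths:
        with open(path, "rb") as fh:
            if fh.read() != payload:
                raise RuntimeError(f"content mismatch in {path}")
        os.remove(path)
    return time.monotonic() - start


if __name__ == "__main__":
    # Assumption: when testing a mount, replace the temporary directory with a
    # path that is bind-mounted or volume-mounted into the container.
    with tempfile.TemporaryDirectory() as tmp:
        elapsed = churn_small_files(tmp)
        print(f"churned 1000 files in {elapsed:.2f}s")
```

Running it once against a directory inside the image and once against a mounted directory gives a crude comparison of how the two paths behave under the same load.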
I am getting this frequently too. At first I thought it was due to problems with my corporate VPN interfering with lima, but maybe it isn't. Unfortunately I can't go back to using qemu easily since I need the x86_64 emulation for some containers that haven't been compiled for arm. |
I'm experiencing a similar freeze using I'm running an Apache Spark workload, very IO-bound, so this might have to do with the filesystem after all. The issue happens every time. |
I'm seeing this on an M1 Mac, in a dev-environment container that's trying to do a |
Experiencing a similar hang, on Intel Mac. All I'm doing is starting an ubuntu:22.04 image, installing some build dependencies, and trying to build binutils. i.e.
Not sure it'll help, but it starts hanging at this point:
EDIT: Actually, this may be quite important to the hang, but the build happens in a mounted directory. So the clone happens on the native file system. |
@Ptival can we have a minimal example of a Dockerfile based on this that is guaranteed to "hang"? |
@lucaspar Not sure whether you saw my edit before asking, but I believe the crux of the problem lies in the build happening in a mounted, native directory, rather than a directory of the image. I just tried running the build in a directory inside the image, and it seems to go further. I don't think you can mount a native directory as part of a Dockerfile, so I think the problem cannot be reproduced that way. My steps are, more or less, the following:
|
Same issue for me: I start my build in a container and after a couple of minutes Colima hangs. I have an Intel MacBook Pro with macOS 13.6, using qemu with sshfs.
This happens all the time, please let me know if I can provide more info |
I am having the same issue on macOS Ventura 13.6 (Intel). I start two docker containers, Mongo and Redis, and then start three different applications in debug mode. After a few seconds Colima becomes unresponsive and I get the following error when I execute docker ps:
error during connect: Get "http://%2FUsers%2Fnkokkoris%2F.colima%2Fdefault%2Fdocker.sock/v1.24/containers/json": EOF
After a few minutes docker becomes responsive again, but as long as I have the application opening connections to Redis this keeps happening. If I only use the Mongo image there is no problem. |
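To tell a hung daemon apart from a slow one, it can help to probe the Docker socket directly with a timeout instead of waiting on docker ps. The following is a minimal sketch, assuming the socket lives at ~/.colima/default/docker.sock (the path that appears URL-encoded in the error above) and that the daemon answers the Docker Engine API's `/_ping` endpoint; the function name `docker_socket_responsive` is hypothetical.

```python
import os
import socket


def docker_socket_responsive(sock_path, timeout=5.0):
    """Return True if the Docker API at `sock_path` answers GET /_ping in time."""
    request = b"GET /_ping HTTP/1.1\r\nHost: docker\r\nConnection: close\r\n\r\n"
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            sock.connect(sock_path)
            sock.sendall(request)
            reply = sock.recv(1024)
    except OSError:
        # Covers a missing socket, a refused connection, and a timeout.
        return False
    # An empty reply means the peer closed without answering, which matches
    # the EOF reported by the docker client.
    status_line = reply.split(b"\r\n", 1)[0]
    return status_line.startswith(b"HTTP/1.") and b" 200 " in status_line


if __name__ == "__main__":
    # Assumed default profile path; adjust for other colima profiles.
    path = os.path.expanduser("~/.colima/default/docker.sock")
    print("responsive" if docker_socket_responsive(path) else "not responsive")
```

Calling this in a loop while the applications connect to Redis would show roughly when the socket stops answering and when it recovers.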
FYI I switched to vz with virtiofs; since then I haven't seen the issue, and the build is faster. |
@mdenna-synaptics Do you mind sharing your |
Delete the current colima session and settings:
colima delete
Configure colima to use virtiofs:
colima start --vm-type vz --mount-type virtiofs |
I'm a bit out of my comfort zone here :), but this issue is bugging me as well. Your comment (@ryancurrah) seems to apply to me as well. I have no idea how exactly that VM should behave to be honest, but I was just poking around. One of the things that's quite peculiar I think is the fact that in the process tree it has this hanging process:
I found the
However:
Basically everything that I try to do with Not sure if any of this is the cause or the effect, but figured I'd write it down and see if it helps in figuring out this issue. |
Description
Shortly after starting a colima instance it seems to hang, and I’m unable to ssh into the session
From ~/.lima/colima/ha.stderr.log I see:
Version
Colima Version:
Lima Version:
limactl version 0.14.2
Qemu Version:
qemu-img version 7.2.0
macOS Version:
13.1 22C65
Operating System
Output of colima status
vm-type: vz
mount type: virtiofs
Reproduction Steps
Expected behaviour
No response
Additional context
No response