buildah pull sometimes hangs forever (v1.23.0) #3662
Comments
Using nix, I pulled this version of buildah:
and I couldn't reproduce the issue after trying ~10 times. After switching back to 1.23.0 (using nix as well for consistency), I reproduced it on the first try. For reference, that was with:
|
WDYT @vrothberg ? |
@giuseppe PTAL |
podman/buildah fall back to pulling the image without the "partial pull" feature when it is not supported (in fact, the message above is just a debug log), so the hang is probably happening later on. Could it be registry related, with the registry blocking two quick requests for the same image? Do you have an image that we can use to reproduce the issue? |
Thanks for the feedback. I'll see if I can provide a reproducer. |
I could reproduce with this image: https://hub.docker.com/r/deubeuliou/buildah-issue-3662 but only when it's on an AWS ECR registry (I used a private registry provided by my employer). Besides, I tried pulling the nginx image from AWS gallery (https://gallery.ecr.aws/nginx/nginx) and I couldn't reproduce. Here's how the image was built:
I then pushed the 3 tags to my AWS ECR registry, removed the local images, downloaded the 1st tag and then repeatedly attempted to |
I still think the issue is caused by the remote registry throttling your requests. The debug log above doesn't have any effect (in fact, it doesn't even make an additional request when the annotation is not present). Could you try debugging the network connection with wireshark? |
I can try that. Are you suggesting that the bandwidth will be limited but non-zero? |
yes, or hangs for a while |
Using I can confirm, however, that the hang is not always associated with the "blob type not supported for partial retrieval" log; I'll remove it from the title. |
I suppose I can try and bisect the issue and/or I can try to reproduce with a public registry on the same AWS account where I'm currently having the issue. If that works, I could send you the URL privately. I'll probably do that after the holiday season. |
Good news: I have bisected it down to 980d352. The bad news is, this commit is:
I'll try and bisect that as well. |
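The bisection described here can be automated with git bisect run, which marks a commit as bad whenever the check command exits with a status between 1 and 127 (125 means "skip"). The sketch below is only an illustration: bisect_hang is a hypothetical helper, and the check command is assumed to rebuild buildah and attempt a pull under timeout(1), whose exit status 124 on a hang counts as "bad".

```shell
#!/usr/bin/env bash
# Hypothetical helper: bisect between a known-good and a known-bad revision.
# Usage: bisect_hang <good-rev> <bad-rev> <check-command> [args...]
# The check command must exit 0 on a good commit and non-zero on a bad one.
bisect_hang() {
    local good="$1" bad="$2"
    shift 2
    # "git bisect start <bad> <good>" records both endpoints in one step.
    git bisect start "$bad" "$good" || return 1
    # git re-runs the check on each candidate commit until it can name
    # the first bad one.
    git bisect run "$@"
}
```

A call could look like `bisect_hang <last good commit> 980d352 ./check-hang.sh`, where check-hang.sh is an assumed script name; `git bisect reset` returns the work tree to its original state afterwards.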
Ok... you're not going to believe this... I checked out the last good commit and then checked out the various components that were upgraded by the "bad" commit. I have no idea why it only happens under the specific circumstances I'm experiencing it, though. Using the |
@mtrmac PTAL |
That’s almost certainly vbauerster/mpb#100. (To confirm, it would help to capture a full Go backtrace.) It seems that even the newest Buildah v1.23.1 still depends on that version; it was fixed in #3526 but there hasn’t been a Buildah release since. |
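For the Go backtrace mentioned here, one common approach relies on the Go runtime's default signal handling: a Go program that receives SIGQUIT exits with a stack dump on stderr, unless it installs its own handler for that signal. A minimal sketch, assuming buildah keeps that default; dump_go_stacks is a hypothetical helper, not part of buildah:

```shell
#!/usr/bin/env bash
# Hypothetical helper: ask a hung Go process for a stack dump.
# Usage: dump_go_stacks <pid>
dump_go_stacks() {
    local pid="$1"
    # kill -0 sends no signal; it only checks that the process exists
    # and that we are allowed to signal it.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "no process with pid $pid" >&2
        return 1
    fi
    # With default handling, SIGQUIT makes the Go runtime print its
    # goroutine stacks to the process's own stderr and then exit.
    kill -QUIT "$pid"
}
```

While a pull is hanging in one terminal, running `dump_go_stacks "$(pgrep -o -x buildah)"` in another should make the dump appear in the terminal where the pull is running. There may be more than one buildah process, in which case the right PID has to be picked by hand.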
@mtrmac: indeed, I can't reproduce it with this commit. Thanks. I'm closing this; until there's a new release and it hits my distro, I'll recompile it locally. |
To fix this I had to update buildah, and also delete and prune the buildah images stored on disk with |
Description
buildah pull sometimes hangs forever. Running it with --log-level debug shows this error and hangs immediately thereafter:
I'm pulling images that are constructed in an iterative fashion (each image is constructed from the previous one) and the reproducibility seems to vary depending on the layer. For instance, I have an image that only adds an environment variable (the filesystem diff is empty) and that one seems to reproduce the issue more than the others.
Besides, the images are constructed with buildah but then pushed to the local docker daemon which, in turn, pushes them to an AWS ECR registry.
Steps to reproduce the issue:
Just run buildah pull <some image on AWS ECR>. It sometimes works; in that case delete the image and try again.
Output of rpm -q buildah or apt list buildah:
Output of buildah version:
Output of cat /etc/*release:
Output of uname -a:
Output of cat /etc/containers/storage.conf:
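The steps to reproduce can be scripted as a pull-and-delete loop that stops as soon as a pull hangs. This is only a sketch: repro_pull and the BUILDAH, ATTEMPTS, and PULL_TIMEOUT variables are hypothetical names introduced for illustration, and the 300-second limit is an arbitrary assumption about how long a healthy pull may take.

```shell
#!/usr/bin/env bash
# Hypothetical reproducer: pull, delete, and repeat until a pull hangs.
# BUILDAH, ATTEMPTS, and PULL_TIMEOUT are illustrative knobs, not buildah options.
# Usage: repro_pull <image>
repro_pull() {
    local image="$1"
    local attempts="${ATTEMPTS:-10}"
    local limit="${PULL_TIMEOUT:-300}"   # seconds before a pull counts as hung
    local buildah="${BUILDAH:-buildah}"
    local i rc

    for ((i = 1; i <= attempts; i++)); do
        rc=0
        # timeout(1) exits with status 124 when it has to kill the command.
        timeout "$limit" "$buildah" --log-level debug pull "$image" || rc=$?
        if [ "$rc" -eq 124 ]; then
            echo "attempt $i: pull hung for more than ${limit}s"
            return 1
        elif [ "$rc" -ne 0 ]; then
            echo "attempt $i: pull failed with status $rc"
            return 2
        fi
        # The pull worked, so delete the image and try again.
        "$buildah" rmi "$image" >/dev/null
    done
    echo "no hang in $attempts attempts"
    return 0
}
```

Calling `repro_pull <some image on AWS ECR>` returns 1 on the first hung pull, which also makes it usable as the check in an automated bisection.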