New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ci: Prune dangling images on RESTART_CI_DOCKER_BEFORE_RUN #27777
ci: Prune dangling images on RESTART_CI_DOCKER_BEFORE_RUN #27777
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. |
This should fix the out-of-space error, see https://cirrus-ci.com/task/6118746146734080?logs=ci#L4903 |
Looks like it is working, but not claiming a lot of space: https://cirrus-ci.com/task/6656469207089152?logs=ci#L408
|
Not sure what is going on. Seems there are unreachable zombie containers in the overlay2 directory. Let's try |
What amount do you expect? |
I already wiped all servers, but I was expecting that about 50GB of zombie stuff is cleared:
|
Maybe add:
? |
I listed all images, volumes, containers, networks, and the log file(s) and the list was empty, apart from 3 small image layers. |
(CI is green, but it will probably take 6 months to find out if podman is running out of storage space as well) |
Could explicit logging of disk space usage help in the meanwhile? |
This is already done, see https://cirrus-ci.com/task/4965521926389760?logs=ci#L356 |
Right. I mean, no need to wait for 6 month. If numbers are the same after a week then this PR works as intended, no? |
Ok, I'll check back a week after merge. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK fa12307
systemctl restart docker | ||
podman container kill --all # Similar to "systemctl restart docker" | ||
echo "Prune all dangling images" | ||
docker image prune --force |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be prudent to use the --filter
option here to only prune images created by the CI? This may be especially helpful to people running the CI locally/unsandboxed. We'd just need to add a --label "bitcoin-ci"
argument to docker build
.
docker image prune --force | |
docker image prune --force --filter label=bitcoin-ci |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do dangling image (layer)s have a label? Anyway, this isn't for local runs, only for the persistent runners, guarded via RESTART_CI_DOCKER_BEFORE_RUN
. Happy to rename RESTART_CI_DOCKER_BEFORE_RUN
to DANGER_RESTART_CI_DOCKER_BEFORE_RUN
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do dangling image (layer)s have a label?
Yeah they do, just tested it to confirm.
guarded via RESTART_CI_DOCKER_BEFORE_RUN
Fair enough, looks like this is unlikely to be hit locally. I don't think renaming to DANGER_RESTART_CI_DOCKER_BEFORE_RUN
is an improvement, so happy to keep as is. Just thought adding a label and only applying batch operations to our own images is a good practice, e.g. if in the future we remove the RESTART_CI_DOCKER_BEFORE_RUN
condition, given that it's almost no overhead. But up to you, it's no big deal and I don't have a strong opinion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, could make sense in a follow-up to apply the label to both places and then unguard the image prune
from DANGER_RESTART_CI_DOCKER_BEFORE_RUN
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in #27793
@@ -42,7 +42,9 @@ if [ -z "$DANGER_RUN_CI_ON_HOST" ]; then | |||
|
|||
if [ -n "${RESTART_CI_DOCKER_BEFORE_RUN}" ] ; then | |||
echo "Restart docker before run to stop and clear all containers started with --rm" | |||
systemctl restart docker | |||
podman container kill --all # Similar to "systemctl restart docker" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this works? Even though the CLIs have a compatible syntax, I think they operate on their own set of resources. I'm pretty sure podman ps
right before this line won't show anything, so I think no containers are killed after this line? At least, that's what I get when I try locally spinning up a container with docker run
and then try to show/kill it with podman ps
or podman container kill --all
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I already switched all machines to podman-docker
, because it is not possible to install docker.io
and podman-docker
at the same time. So docker run
is actually just an alias for podman run
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I can't comment on that. Since there is no mention of podman
anywhere else in the codebase/documentation, I think this may make maintaining the CI more opaque, but perhaps that's an okay tradeoff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I don't expect anyone to use RESTART_CI_DOCKER_BEFORE_RUN
beside the persistent workers, so it is an undocumented setting. Anyone using it is expected to infer the documentation from the two echo calls and two lines of code.
systemctl restart docker | ||
podman container kill --all # Similar to "systemctl restart docker" | ||
echo "Prune all dangling images" | ||
docker image prune --force |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do dangling image (layer)s have a label?
Yeah they do, just tested it to confirm.
guarded via RESTART_CI_DOCKER_BEFORE_RUN
Fair enough, looks like this is unlikely to be hit locally. I don't think renaming to DANGER_RESTART_CI_DOCKER_BEFORE_RUN
is an improvement, so happy to keep as is. Just thought adding a label and only applying batch operations to our own images is a good practice, e.g. if in the future we remove the RESTART_CI_DOCKER_BEFORE_RUN
condition, given that it's almost no overhead. But up to you, it's no big deal and I don't have a strong opinion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -42,7 +42,9 @@ if [ -z "$DANGER_RUN_CI_ON_HOST" ]; then | |||
|
|||
if [ -n "${RESTART_CI_DOCKER_BEFORE_RUN}" ] ; then | |||
echo "Restart docker before run to stop and clear all containers started with --rm" | |||
systemctl restart docker | |||
podman container kill --all # Similar to "systemctl restart docker" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I can't comment on that. Since there is no mention of podman
anywhere else in the codebase/documentation, I think this may make maintaining the CI more opaque, but perhaps that's an okay tradeoff.
…EFORE_RUN fa12307 ci: Use podman for persistent workers (MarcoFalke) fa9c65a ci: Prune dangling images on RESTART_CI_DOCKER_BEFORE_RUN (MarcoFalke) Pull request description: This should prevent the persistent workers from running out of disk space. Containers are already removed, but not images. This is required since CI images are built and cached. ACKs for top commit: hebasto: ACK fa12307 Tree-SHA512: 07c4faec57d659d1762e4e6d776c882ee48d4bac6ce6d438d56d9ab13277be3e39d6aa38816165a5a3e0938ac5d47674ee2921b6e115a4bb54e3e4910b34c4b6
@@ -42,7 +42,9 @@ if [ -z "$DANGER_RUN_CI_ON_HOST" ]; then | |||
|
|||
if [ -n "${RESTART_CI_DOCKER_BEFORE_RUN}" ] ; then | |||
echo "Restart docker before run to stop and clear all containers started with --rm" | |||
systemctl restart docker | |||
podman container kill --all # Similar to "systemctl restart docker" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this doesn't work some of the time:
- https://cirrus-ci.com/task/5798762795237376?logs=ci#L267
- https://cirrus-ci.com/task/4640353173635072?logs=ci#L266
++ echo 'Restart docker before run to stop and clear all containers started with --rm'
Restart docker before run to stop and clear all containers started with --rm
++ podman container kill --all
++ echo 'Prune all dangling images'
Prune all dangling images
++ docker image prune --force
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
+++ docker run --rm --interactive --detach --tty --mount type=bind,src=/tmp/cirrus-build,dst=/ro_base,readonly --mount type=volume,src=ci_native_qt5_ccache,dst=/tmp/ccache_dir --mount type=volume,src=ci_native_qt5_depends,dst=/tmp/cirrus-build/depends --mount type=volume,src=ci_native_qt5_previous_releases,dst=/tmp/cirrus-build/releases/x86_64-pc-linux-gnu -w /tmp/cirrus-build --env-file /tmp/env --name ci_native_qt5 ci_native_qt5
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
time="2023-06-05T06:45:57+02:00" level=warning msg="The input device is not a TTY. The --tty and --interactive flags might not work properly"
Error: error creating container storage: the container name "ci_native_qt5" is already in use by "1b176e38eff6bdc928f2e213260cb9c8abb9efaa5cd6b0cccd09033a1b707601". You have to remove that container to be able to reuse that name.: that name is already in use
++ CI_CONTAINER_ID=
Exit status: 125
faaa627 ci: Use podman stop over podman kill (MarcoFalke) Pull request description: This should avoid a race where the kill is not done when spinning up the new container. podman stop waits 10 seconds by default. Fixes bitcoin/bitcoin#27777 (comment) ACKs for top commit: dimitaracev: ACK [faaa627](bitcoin/bitcoin@faaa627) Tree-SHA512: d46a32429629dcfa711a2d0abe79100f5593d72f8e5eded7b505b0f270e28abfeba0a96bec43e9951a76a96bb23fe2c7092433e5e0c66510e3e3b6c3cb58f4db
faaa627 ci: Use podman stop over podman kill (MarcoFalke) Pull request description: This should avoid a race where the kill is not done when spinning up the new container. podman stop waits 10 seconds by default. Fixes bitcoin#27777 (comment) ACKs for top commit: dimitaracev: ACK [faaa627](bitcoin@faaa627) Tree-SHA512: d46a32429629dcfa711a2d0abe79100f5593d72f8e5eded7b505b0f270e28abfeba0a96bec43e9951a76a96bb23fe2c7092433e5e0c66510e3e3b6c3cb58f4db
Github-Pull: bitcoin#27777 Rebased-From: fa9c65a
Github-Pull: bitcoin#27777 Rebased-From: fa12307
6233049 ci: Switch to `amd64` container in "ARM" task (Hennadii Stepanov) d845a3e test: add coverage for `/deploymentinfo` passing a blockhash (brunoerg) 72ead86 rest: bugfix, fix crash error when calling `/deploymentinfo` (brunoerg) 6f7a0ae ci: Use podman stop over podman kill (MarcoFalke) de56daa ci: Use podman for persistent workers (MarcoFalke) 71f626e ci: Prune dangling images on RESTART_CI_DOCKER_BEFORE_RUN (MarcoFalke) Pull request description: Currently backports: * #27777 * #27844 * #27853 * #27886 Effectively also backports: #27562. ACKs for top commit: stickies-v: ACK 6233049 Tree-SHA512: d0f3d5c4cd0cf9792f3270078b49ad6661e28e7a883f91e5981e8cfe95d01c6e415762ee2e3a5fea17d198299989a8b5412961c2cc788cef975b2ee45d84550c
Github-Pull: bitcoin#27777 Rebased-From: fa9c65a
Github-Pull: bitcoin#27777 Rebased-From: fa12307
Backported for 24.x in #28535. |
9077f21 depends: fix unusable memory_resource in macos qt build (fanquake) dccacf0 build, macos: Fix `qt` package build with new Xcode 15 linker (Hennadii Stepanov) 4359649 ci: Switch to `amd64` container in "ARM" task (Hennadii Stepanov) 805f98b ci: Nuke Android APK task, Use credits for tsan (MarcoFalke) cb5512d test: ensure old fee_estimate.dat not read on restart and flushed (ismaelsadeeq) 01f8ee4 tx fees, policy: read stale fee estimates with a regtest-only option (ismaelsadeeq) 1c98029 tx fees, policy: do not read estimates of old fee_estimates.dat (ismaelsadeeq) 77979e0 tx fees, policy: periodically flush fee estimates to fee_estimates.dat (ismaelsadeeq) 67b6d99 Do not use std::vector = {} to release memory (Pieter Wuille) defdc15 ci: Use podman stop over podman kill (MarcoFalke) 7f1357d ci: Use podman for persistent workers (MarcoFalke) 0db69a3 ci: Prune dangling images on RESTART_CI_DOCKER_BEFORE_RUN (MarcoFalke) Pull request description: Backports to the 24.x branch. Currently: * #27622 * #27777 * #27834 * #27844 * #27886 * #28452 * #28543 * #28571 ACKs for top commit: stickies-v: ACK 9077f21 Tree-SHA512: abaafc9a048b67b494993134fd332457ea52695ec007b963c283f962ec40c3b6b3a7e98407481be55d3271a595088a0281cc84b79dad4f24d260381ea0153076
…ng images selectively e44c574 ci: always prune all dangling bitcoin-ci-test images (stickies-v) ce16997 ci: add label to docker images (stickies-v) Pull request description: Follow-up from bitcoin/bitcoin#27777 (comment). Labeling the docker images produced by the CI allows us/the user to apply batch operations to all images (including dangling ones) produced by the ci without affecting other, non-bitcoin-ci images. With labeling, we can safely always prune dangling bitcoin-ci-test images without checking for `RESTART_CI_DOCKER_BEFORE_RUN`, which we enable on our persistent runners. ACKs for top commit: fanquake: utACK e44c574 Tree-SHA512: 1009fb1be78fbc80b5341ba92eac2991e77d050e1ab6048d1d9a65af73413a6be7afc1e1c764eb3f347f363af31245b93fdb38f6ac016d775aad4a0f36e4c98f
…s selectively e44c574 ci: always prune all dangling bitcoin-ci-test images (stickies-v) ce16997 ci: add label to docker images (stickies-v) Pull request description: Follow-up from bitcoin#27777 (comment). Labeling the docker images produced by the CI allows us/the user to apply batch operations to all images (including dangling ones) produced by the ci without affecting other, non-bitcoin-ci images. With labeling, we can safely always prune dangling bitcoin-ci-test images without checking for `RESTART_CI_DOCKER_BEFORE_RUN`, which we enable on our persistent runners. ACKs for top commit: fanquake: utACK e44c574 Tree-SHA512: 1009fb1be78fbc80b5341ba92eac2991e77d050e1ab6048d1d9a65af73413a6be7afc1e1c764eb3f347f363af31245b93fdb38f6ac016d775aad4a0f36e4c98f
This should prevent the persistent workers from running out of disk space. Containers are already removed, but not images. This is required since CI images are built and cached.