
auto update containers in systemd units #5480

Merged
merged 2 commits into containers:master from auto-updates on Mar 18, 2020


vrothberg
Member

@vrothberg vrothberg commented Mar 12, 2020

Add support to auto-update containers running in systemd units as
generated with podman generate systemd --new.

podman restart --auto-update looks up containers with a specified
"io.containers.autoupdate" label (i.e., the auto-update policy).

If the label is present and set to "image", Podman reaches out to the
corresponding registry to check if the image has been updated. We
consider an image to be updated if the digest in the local storage
differs from that of the remote image. If an image must be
updated, Podman pulls it down and restarts the container. Note that the
restarting sequence relies on systemd.
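
For illustration, opting a container into this policy only requires setting the label at creation time; a minimal sketch, assuming a hypothetical container name "mytest" and an illustrative image:

# Create a container with the auto-update policy label, using a
# fully-qualified image reference (see the note on image IDs below)
$ podman create --name mytest \
      --label "io.containers.autoupdate=image" \
      registry.fedoraproject.org/fedora:latest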

At container-creation time, Podman looks up the "PODMAN_SYSTEMD_UNIT"
environment variable and stores it verbatim in a container label.
This variable is now set by all systemd units generated by
podman-generate-systemd and is set to %n (i.e., the name of the
systemd unit starting the container). This data is then used in the
auto-update sequence to instruct systemd (via D-Bus) to restart the
unit and hence the container.
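
The relevant fragment of a unit generated by podman-generate-systemd would then look roughly like this (abridged sketch; most generated options are omitted):

[Service]
# %n expands to the full unit name, e.g. "container-mytest.service"
Environment=PODMAN_SYSTEMD_UNIT=%n
ExecStart=/usr/bin/podman run --name mytest registry.fedoraproject.org/fedora:latest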

Note that this implementation of auto-updates relies on systemd and
requires a fully-qualified image reference to be used to create the
container. This enforcement is necessary to know which image to
actually check and pull. If we used an image ID, we would not know
which image to check/pull anymore.

Fixes: #3575
Signed-off-by: Valentin Rothberg rothberg@redhat.com

@openshift-ci-robot openshift-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/XL labels Mar 12, 2020
@openshift-ci-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: vrothberg

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 12, 2020
@vrothberg
Member Author

Note that it's still a work in progress. I still need to perform some more tests and add at least some e2e tests to catch error cases.

@saschagrunert
Member

Add support to auto-update containers running in systemd units as
generated with podman generate systemd --new.

Nice work! 👍 I just have some thoughts, without having looked at the code.

podman restart --auto-update looks up containers with a specified
"io.containers.autoupdate" label (i.e., the auto-update policy).

I would expect that this happens automatically; do you think it would make sense to have something like a systemd timer to really auto-update the image without manual intervention?

If the label is present and set to "image", Podman reaches out to the
corresponding registry to check if the image has been updated. We
consider an image to be updated if the digest in the local storage is
different than the one of the remote image. If an image must be
updated, Podman pulls it down and restarts the container. Note that the
restarting sequence relies on systemd.

Would it be possible to catch update issues? For example, we stop the old container, start the new one, and if the new one dies we recover back to the old version. Maybe with an additional label that the update failed?

Note that this implementation of auto-updates relies on systemd and
requires a fully-qualified image reference to be used to create the
container. This enforcement is necessary to know which image to
actually check and pull. If we used an image ID, we would not know
which image to check/pull anymore.

I'm not sure if I get this correctly, but we could just resolve the image during the initial pull, right? So if I just specify alpine and the image is available locally, then we could still resolve it with the remote resource to obtain the full reference… (would be something like docker.io/library/alpine:latest in that case).

@vrothberg
Member Author

podman restart --auto-update looks up containers with a specified
"io.containers.autoupdate" label (i.e., the auto-update policy).

I would expect that this happens automatically; do you think it would make sense to have something like a systemd timer to really auto-update the image without manual intervention?

Absolutely. That's actually the end goal: to have a oneshot timer running. I'll add a .service file for that once the PR is in better shape.
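
Such a timer pair could look roughly like this (a sketch only; the podman-auto-update.{service,timer} files added later in this PR may differ in detail):

# podman-auto-update.service
[Unit]
Description=Podman auto-update service

[Service]
Type=oneshot
ExecStart=/usr/bin/podman auto-update

# podman-auto-update.timer
[Unit]
Description=Podman auto-update timer

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target

Enabling it would then be a matter of systemctl enable --now podman-auto-update.timer.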

If the label is present and set to "image", Podman reaches out to the
corresponding registry to check if the image has been updated. We
consider an image to be updated if the digest in the local storage is
different than the one of the remote image. If an image must be
updated, Podman pulls it down and restarts the container. Note that the
restarting sequence relies on systemd.

Would it be possible to catch update issues? For example, we stop the old container, start the new one, and if the new one dies we recover back to the old version. Maybe with an additional label that the update failed?

I plan to add rollbacks in the future, but not with this PR. It would imply retagging the previous image ID to the RawImageName and restarting the service. Keeping the door open for rollbacks also impacted the current design (see #5480 (comment)).
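
In manual terms, such a rollback would boil down to something like the following (hypothetical image ID and unit name):

# Point the name the unit pulls from back at the previous image
$ podman tag <previous-image-id> registry.fedoraproject.org/fedora:latest
# Let systemd re-create the container from the re-tagged image
$ systemctl restart container-mytest.service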

Note that this implementation of auto-updates relies on systemd and
requires a fully-qualified image reference to be used to create the
container. This enforcement is necessary to know which image to
actually check and pull. If we used an image ID, we would not know
which image to check/pull anymore.

I'm not sure if I get this correctly, but we could just resolve the image during the initial pull, right? So if I just specify alpine and the image is available locally, then we could still resolve it with the remote resource to obtain the full reference… (would be something like docker.io/library/alpine:latest in that case).

I want to avoid the problems of unqualified-search-registries and all its ambiguities.

(would be something like docker.io/library/alpine:latest in that case).

That really depends on the registries.conf, which is why I want to enforce fully-qualified references.
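
For example, with a registries.conf along these lines (illustrative, using the v2 TOML format), the short name "alpine" could resolve to a different registry on every host:

# /etc/containers/registries.conf
unqualified-search-registries = ["registry.fedoraproject.org", "docker.io"]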

@vrothberg vrothberg force-pushed the auto-updates branch 3 times, most recently from 1815919 to c002336 on March 13, 2020 16:03
@vrothberg
Member Author

  • Refined the backend
  • All logic is now in pkg/autoupdate
  • Added podman auto-update command and a man page
  • Added a podman-auto-update.{service,timer} (untested)
  • Addressed all review comments

I am satisfied with the code now. Still need to perform some more manual tests and add some e2e tests.
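
For reference, a single update pass over all labeled containers is triggered with:

# Check registries for new images and restart updated containers
$ podman auto-update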

@edsantiago, I would love to have your eyes and hands on this PR.

@vrothberg
Member Author

vrothberg commented Mar 13, 2020

Note to self: create a follow-up PR to allow podman-inspect to return the new RawImageName field.

@vrothberg vrothberg force-pushed the auto-updates branch 3 times, most recently from 953bce7 to 51fabb7 on March 16, 2020 13:29
@vrothberg vrothberg changed the title [WIP] auto update containers in systemd units auto update containers in systemd units Mar 16, 2020
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 16, 2020
@vrothberg
Member Author

Ready from my side. I will add systemd tests in the near future but want to split that part out, as I also want to actually run the generated unit files.

@vrothberg vrothberg force-pushed the auto-updates branch 2 times, most recently from a6caf3b to 18881a7 on March 16, 2020 14:45
Move the dbus-connection code from libpod's healthcheck to pkg/systemd
to allow for sharing the logic.  Needed for the auto-updates work.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
@vrothberg vrothberg force-pushed the auto-updates branch 2 times, most recently from d6fc3ac to b7ab7e3 on March 17, 2020 11:40
@edsantiago
Collaborator

I'm out today, but when playing with this yesterday (with a local registry) I was stumped by the lack of --tls-verify=false in auto-update. Is that worth considering? (Or perhaps you've already added it; I haven't checked this morning.)

@vrothberg
Member Author

I'm out today, but when playing with this yesterday (with a local registry) I was stumped by the lack of --tls-verify=false in auto-update. Is that worth considering? (Or perhaps you've already added it; I haven't checked this morning.)

I have considered it, but I want users to specify everything in registries.conf. In the case of --tls-verify=false, the registry has to be marked as insecure. Using --tls-verify for run, create, pull, etc. is fine, as we will pull only one image, but in the case of auto-update we might pull N images from M registries/repositories.
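
Marking a registry as insecure there would look roughly like this (illustrative local registry address, v2 TOML format):

# /etc/containers/registries.conf
[[registry]]
location = "localhost:5000"
insecure = true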

But that's just my two cents. I am curious what others think about it.

@rhatdan
Member

rhatdan commented Mar 17, 2020

I agree; I think this should be handled in registries.conf.

@TomSweeneyRedHat TomSweeneyRedHat left a comment
Member

In general, LGTM once the placement gets settled between you and @mheon.
I definitely smell a blog post out of this once it lands.

@vrothberg vrothberg added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 17, 2020
@vrothberg
Member Author

Added the do-not-merge label to prevent this from getting in before cutting 1.8.1; merging this PR would require a minor version bump (i.e., 1.9.0).

Add support to auto-update containers running in systemd units as
generated with `podman generate systemd --new`.

`podman auto-update` looks up containers with a specified
"io.containers.autoupdate" label (i.e., the auto-update policy).

If the label is present and set to "image", Podman reaches out to the
corresponding registry to check if the image has been updated.  We
consider an image to be updated if the digest in the local storage
differs from that of the remote image.  If an image must be
updated, Podman pulls it down and restarts the container.  Note that the
restarting sequence relies on systemd.

At container-creation time, Podman looks up the "PODMAN_SYSTEMD_UNIT"
environment variable and stores it verbatim in a container label.
This variable is now set by all systemd units generated by
`podman-generate-systemd` and is set to `%n` (i.e., the name of the
systemd unit starting the container).  This data is then used in the
auto-update sequence to instruct systemd (via D-Bus) to restart the
unit and hence the container.

Note that this implementation of auto-updates relies on systemd and
requires a fully-qualified image reference to be used to create the
container.  This enforcement is necessary to know which image to
actually check and pull.  If we used an image ID, we would not know
which image to check/pull anymore.

Fixes: containers#3575
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
# Load the new systemd unit and start it
$ mv ./container-bc219740a210455fa27deacc96d50a9e20516492f1417507c13ce1533dbdcd9d.service ~/.config/systemd/user
$ systemctl --user daemon-reload
$ systemctl --user start container-bc219740a210455fa27deacc96d50a9e20516492f1417507c13ce1533dbdcd9d.service
@edsantiago
Collaborator

This doesn't actually work as described, but it's unrelated to your PR.

First: on my test setup (f31, root), I have to restorecon /etc/systemd/system/xxx.service, otherwise systemctl says "no such unit".

Second, if the initial source container is started with --name foo, systemctl start will constantly fail until you podman rm -f the original source container:

Mar 18 08:22:40 ci-vm-10-0-137-242.hosted.upshift.rdu2.redhat.com podman[68087]: Error: error creating container storage: the container name "mytest" is already in use by "5d6d296390e453b59a9e109a8de78369949f1de0d2fedcb3c6c0568d4e0ceafd". You have to remove that container to be able to reuse that name.: that name is already in use

@vrothberg
Member Author

First: on my test setup (f31, root), I have to restorecon /etc/systemd/system/xxx.service, otherwise systemctl says "no such unit".

Shouldn't it go to /usr/lib/systemd/system/ for system services? That's where I place them, and SELinux behaved.

Second, if the initial source container is started with --name foo, systemctl start will constantly fail until you podman rm -f the original source container.

That's a super helpful observation! I think we should not restart but stop-start the services, so that everything gets properly cleaned up. I'll investigate.

Thanks a lot for testing and the feedback, @edsantiago !

@edsantiago
Collaborator

Shouldn't it go to /usr/lib/systemd/system/ for system services?

Probably not: that's for installed packages; the recommendation is to use /etc/systemd/system for local configuration. Regardless, if I (root) am sitting in my home directory when I run podman generate systemd --files, then mv that file elsewhere, SELinux will have problems because the file will be labeled admin_home_t instead of systemd_unit_file_t.

This brings up a gripe I've long had, though: why does podman write the unit file to pwd? That causes pain for users who then have to figure out the correct destination, mv it, restorecon, etc. Is it too late to fix podman so it actually figures out the correct root/rootless path and writes the file to the proper place?

we should not restart but stop-start the services

As I recall, systemctl stop podman-xx.service did not help (perhaps because of the --cidfile option). Only podman rm -f (or podman stop, podman rm) helped.

@vrothberg
Member Author

Second, if the initial source container is started with --name foo, systemctl start will constantly fail until you podman rm -f the original source container.

Misread the sentence. Yes, the initial container will remain. We want to add --replace, which will remove existing containers in case of a name collision.

@vrothberg
Member Author

@mheon, I think we can merge it. We marked it as experimental (same as v2), which doesn't require a minor version bump. Once that's in (and my todo list is smaller), I will tackle the remaining bits (tests, better docs for systemd in general, and --replace).

@mheon
Member

mheon commented Mar 18, 2020

Alright, SGTM

@edsantiago
Collaborator

Aside from the file path and name issues, and the lack of a fallback mechanism (which I assume will be addressed some day), LGTM.

@mheon
Member

mheon commented Mar 18, 2020

Code LGTM

@mheon
Member

mheon commented Mar 18, 2020

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Mar 18, 2020
@rhatdan
Member

rhatdan commented Mar 18, 2020

@edsantiago @vrothberg Units should go under /etc, not in /usr/lib. /usr/lib is for the distro; /etc is for the admin.

Also, the reason you have an SELinux issue is the difference between mv and cp. If you cp to the destination directory, the file will be labeled correctly. If you mv, it will keep the label it was created with. If you created the unit file in your home directory and then mv it to /etc/systemd/system, it will look to SELinux like user home-directory content.

Back in 2012, Dan Walsh wrote about this:
https://danwalsh.livejournal.com/56534.html

mv -Z /src /dest

will do what you expected.
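
Applied to the case at hand, that would be something like (hypothetical unit file name):

$ mv -Z ./container-mytest.service /etc/systemd/system/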

@edsantiago
Collaborator

Thanks, @rhatdan. Is it fair to say that a regular user should not have to worry about those things? I have opened #5538 with the intention that podman take care of that automatically.

@rhatdan
Member

rhatdan commented Mar 18, 2020

/hold cancel

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 18, 2020
@openshift-merge-robot openshift-merge-robot merged commit 45e7cbf into containers:master Mar 18, 2020
@vrothberg vrothberg deleted the auto-updates branch March 19, 2020 10:13
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 25, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 25, 2023