Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restart container runtimes when certificates store changes #2076

Merged

Conversation

arnaldo2792
Copy link
Contributor

@arnaldo2792 arnaldo2792 commented Apr 15, 2022

Issue number:
Fixes #2021

Description of changes:

ca-certificates: restart container runtimes when certificates store changes

The containerd and docker daemons cache the certificates store, so any
updates to the store will be ignored by the daemons. With this, the
daemons will be restarted whenever the certificates store is updated.

Testing done:
In aws-ecs-1, aws-k8s-1.21:

  • I tried to pull down an image from a local registry, which uses custom CA certificates
  • I validated the pull failed before the CA certificates store was updated
  • I validated the pull succeeded after the CA certificates store was updated

ECS:

# Pull image from local registry, with pki.local-registry.trusted=false, which failed as expected
bash-5.0# docker pull <ip>.us-west-2.compute.amazonaws.com:5000/alpine
Using default tag: latest
Error response from daemon: Get "https://<ip>.us-west-2.compute.amazonaws.com:5000/v2/": x509: certificate signed by unknown authority

# Update certificates store to trust custom CA certificate
bash-5.0# apiclient set pki.local-registry.trusted=true
# Successfully pull down image
bash-5.0# docker pull <ip>.us-west-2.compute.amazonaws.com:5000/alpine
Using default tag: latest
latest: Pulling from alpine
df9b9388f04a: Pull complete
Digest: sha256:a777c9c66ba177ccfea23f2a216ff6721e78a662cd17019488c417135299cd89
Status: Downloaded newer image for <ip>.us-west-2.compute.amazonaws.com:5000/alpine:latest
<ip>.us-west-2.compute.amazonaws.com:5000/alpine:latest

k8s:
I deployed a pod that used an image in my local registry, since k8s variants don't have docker. After I updated the certificates store (like in the ECS test), the image was successfully pulled:

  Normal   Pulling    43s (x3 over 79s)  kubelet            Pulling image "<ip>.us-west-2.compute.amazonaws.com:5000/alpine:latest"
  Warning  Failed     43s (x3 over 79s)  kubelet            Failed to pull image "<ip>.us-west-2.compute.amazonaws.com:5000/alpine:latest": rpc error: code = Unknown desc = failed to pull and unpack image "<ip>.us-west-2.compute.amazonaws.com:5000/alpine:latest": failed to resolve reference "<ip>.us-west-2.compute.amazonaws.com:5000/alpine:latest": failed to do request: Head "https://<ip>.us-west-2.compute.amazonaws.com:5000/v2/alpine/manifests/latest": x509: certificate signed by unknown authority
  Warning  Failed     43s (x3 over 79s)  kubelet            Error: ErrImagePull
  Normal   BackOff    30s (x3 over 78s)  kubelet            Back-off pulling image "<ip>.us-west-2.compute.amazonaws.com:5000/alpine:latest"
  Warning  Failed     30s (x3 over 78s)  kubelet            Error: ImagePullBackOff
  Normal   Pulling    20s                kubelet            Pulling image "<ip>.us-west-2.compute.amazonaws.com:5000/alpine:latest"
  Normal   Pulled     20s                kubelet            Successfully pulled image "<ip>.us-west-2.compute.amazonaws.com:5000/alpine:latest" in 73.145474ms
  Normal   Created    20s                kubelet            Created container image
  Normal   Started    20s                kubelet            Started container image

  • aws-k8s-1.22 upgrade/downgrade
  • aws-k8s-1.19 upgrade/downgrade
  • aws-k8s-1.22-nvidia upgrade/downgrade
  • aws-ecs-1 upgrade/downgrade
  • aws-ecs-1-nvidia upgrade/downgrade
  • vmware-k8s-1.22 upgrade/downgrade

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

Copy link
Contributor

@zmrow zmrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎧

@bcressey
Copy link
Contributor

It'd be worth rebasing this to exercise more of the functionality after the pem crate update in #2049.

@arnaldo2792
Copy link
Contributor Author

(Rebase to pull down migrations directory for next release)

@arnaldo2792
Copy link
Contributor Author

(Rebase to pull down latest changes in build.rs files)

@arnaldo2792
Copy link
Contributor Author

Forced push includes migration and configuration files for newly added variants

@arnaldo2792
Copy link
Contributor Author

(Rebased to pull down fixes in aws-ecs-1-nvidia)

@arnaldo2792
Copy link
Contributor Author

(Fix since I was using runtime, and the new buildsys library emits variant_runtime)


fn main() {
let variant = Variant::from_env().unwrap();
variant.emit_cfgs();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

sources/models/shared-defaults/containerd-pki.toml Outdated Show resolved Hide resolved
@arnaldo2792 arnaldo2792 requested a review from bcressey June 4, 2022 01:16
Comment on lines 9 to 10
/// We updated the 'affected-services' list metadata for 'settings.network.hostname' to include the
/// hosts "service" on upgrade, and to remove it on downgrade.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is not accurate.

@arnaldo2792
Copy link
Contributor Author

(Forced push to fix comment in migration)

…hanges

The containerd and docker daemons cache the certificates store, so any
updates to the store will be ignored by the daemons. With this, the
daemons will be restarted whenever the certificates store is updated.

Signed-off-by: Arnaldo Garcia Rincon <agarrcia@amazon.com>
@arnaldo2792
Copy link
Contributor Author

(Forced push to fix merge conflicts with Release.toml and sources/Cargo.toml)

@arnaldo2792 arnaldo2792 merged commit e013df9 into bottlerocket-os:develop Jun 6, 2022
@arnaldo2792 arnaldo2792 deleted the fix-ca-certificates branch June 7, 2022 16:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Container clients don't use the updated certificates store
5 participants