Fix the Node bootstrap problem #68
Conversation
Skipping CI for Draft Pull Request.
Force-pushed from fa2c9ec to a1d6440.
Force-pushed from a1d6440 to 8555564.
Force-pushed from 8555564 to fb3985c.
/test pull-gardener-extension-registry-cache-e2e-kind
Force-pushed from 0c7af68 to 2abb4f1.
Force-pushed from 2abb4f1 to 3258a26.
Seems like a networking error. The test is passing locally.
/test pull-gardener-extension-registry-cache-e2e-kind
/assign
Some minor findings, otherwise looks good
Review comments (now resolved) were left on:
- pkg/component/registryconfigurationcleaner/registry_configuration_cleaner.go
- pkg/component/registryconfigurationcleaner/registry_configuration_cleaner_test.go
- pkg/webhook/operatingsystemconfig/scripts/configure-containerd-registries.sh
Force-pushed from 85602c2 to 5f8ddc3.
/approve
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ialidzhikov

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.
/lgtm
LGTM label has been added. Git tree hash: 6e9b98d14c61b217d9fe1fdce2aee2e03ece8869
How to categorize this PR?
/kind bug
What this PR does / why we need it:
Currently, when a Shoot requests caches for all upstreams that are used by Shoot system components, things regress in several aspects. In OSS Gardener these upstreams would be `quay.io` (for the calico images), `registry.k8s.io` (for kube-proxy and others) and `eu.gcr.io` (for the rest). Example extension configuration for that case:
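The configuration example is collapsed in the rendered PR page; the following is a hedged sketch of what such a Shoot extension configuration looks like (the exact `apiVersion` and field names of the registry-cache provider config are assumptions and may differ by extension version):

```yaml
kind: Shoot
apiVersion: core.gardener.cloud/v1beta1
spec:
  extensions:
  - type: registry-cache
    providerConfig:
      # Assumed apiVersion/kind of this extension's provider config.
      apiVersion: registry.extensions.gardener.cloud/v1alpha1
      kind: RegistryConfig
      # One cache per upstream used by Shoot system components.
      caches:
      - upstream: quay.io
      - upstream: registry.k8s.io
      - upstream: eu.gcr.io
```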
On creation of such a Shoot, the image pull times of Shoot system components are abnormally high: the `quay.io/calico/cni`, `quay.io/calico/calico-node` and `registry.k8s.io/kube-proxy` images are pulled in 2m, 3m or even 5m, while it usually takes up to 10s to pull these images from the upstream.

The reason for this behaviour is essentially the design of the extension. We configure a containerd mirror/registry using the cluster IP of the registry cache Service in the Shoot cluster. For the Service's cluster IP to be reachable from a newly joining Node, the networking has to be set up and the kube-proxy and calico components have to be running. Otherwise, containerd cannot reach the Service's cluster IP and falls back to the upstream registry.
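For illustration, this is a minimal sketch of the per-upstream mirror configuration containerd reads from `/etc/containerd/certs.d/<upstream>/hosts.toml` (the cluster IP is reused from the logs below; the exact content written by the extension may differ):

```toml
# /etc/containerd/certs.d/quay.io/hosts.toml
# Default server containerd falls back to when the mirror is unreachable.
server = "https://quay.io"

# The registry cache, addressed via the cluster IP of its Service.
[host."http://10.4.196.64:5000"]
  capabilities = ["pull", "resolve"]
```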
The reason for the abnormally high image pull times is that initially (until kube-proxy starts running) image pull requests from containerd to the Service's cluster IP time out after 30s. After this timeout, the fallback to the upstream (for example eu.gcr.io or quay.io) happens. We see that containerd makes many requests - a HEAD request to resolve the manifest by tag, a GET request for the manifest by SHA, GET requests for blobs. Each of these requests times out after 30s and only then the fallback to the upstream host happens. In the end the image pull succeeds, but it succeeds in minutes. Yesterday I was doing experiments with the `docker.io/library/alpine:3.13.2` and `eu.gcr.io/gardener-project/gardener/ops-toolbelt:0.18.0` images in a setup where the Service IP in the containerd configuration is misconfigured (the Service is deleted and containerd refers to a no longer existing cluster IP).

We see that after kube-proxy starts running, requests to the cluster IP of the registry-cache Service no longer time out after 30s (`dial tcp 10.4.196.64:5000: i/o timeout`) but are rejected right away (`dial tcp 10.4.225.49:5000: connect: connection refused`).

We believe that this is because kube-proxy starts running and configures iptables rules on the Node for the Service IPs. At that time the registry cache Pods are not yet running, hence the Service does not have any available Endpoints. Most probably kube-proxy in that case configures a "black hole" for such a Service because it does not have any available Endpoints.
In total, with the current design of the extension:
- until kube-proxy starts running on a newly joining Node, every containerd request to a cache's cluster IP times out after 30s before falling back to the upstream, inflating image pulls to minutes;
- after kube-proxy starts running but before the registry cache Pods are available, requests to the cache are refused and containerd immediately falls back to the upstream, so the cache is bypassed.
To address these issues, the extension is redesigned as follows. During the OSC mutation, we no longer append the `hosts.toml` file with the registry config. Instead, we append a new systemd unit. The systemd unit waits for each cache to be available (its Service IP starts returning `HTTP 200`) and only after that it creates the corresponding `hosts.toml` file. For the uninstall case, a DaemonSet is deployed that cleans up the `hosts.toml` files and the unit (if needed). A sketch of the unit's logic is shown below.

The drawback of this approach is that the registry-cache extension won't be able to cache Shoot system components even if a cache for the corresponding upstream is requested. However, with the new design, we no longer regress the image pull of the Shoot system components.
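The actual script lives in `pkg/webhook/operatingsystemconfig/scripts/configure-containerd-registries.sh` (reviewed above); the following is only an illustrative sketch of its wait-then-configure logic, with an assumed `<upstream>,<cache endpoint>` input format:

```bash
#!/bin/bash
# Illustrative sketch, not the script from this PR: for each configured
# cache, wait until its Service IP returns HTTP 200, then write the
# containerd hosts.toml for the corresponding upstream.
for registry in "quay.io,http://10.4.196.64:5000"; do
  upstream="${registry%,*}"
  endpoint="${registry#*,}"

  # Block until the cache is available (the registry's /v2/ endpoint
  # starts returning HTTP 200).
  until [ "$(curl -s -o /dev/null -w '%{http_code}' "${endpoint}/v2/")" = "200" ]; do
    sleep 5
  done

  # Only now configure the containerd mirror for this upstream.
  mkdir -p "/etc/containerd/certs.d/${upstream}"
  cat > "/etc/containerd/certs.d/${upstream}/hosts.toml" <<EOF
server = "https://${upstream}"

[host."${endpoint}"]
  capabilities = ["pull", "resolve"]
EOF
done
```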
Which issue(s) this PR fixes:
Part of #3
Special notes for your reviewer:
N/A
Release note: