[node-agent] Restrict `Node` watches via label/field selector to prevent watching all nodes
#9672
1987a9e to b1ca5eb
/assign
nice PR, thanks 🚀
b1ca5eb to c7ea476
/lgtm
LGTM label has been added. Git tree hash: 0362278f7a0f456397faf8fd6c81bfd5ef4bfebc
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: oliver-goetz The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
…/resourcemanager/controller/node/criticalcomponents` We will introduce another controller for nodes, so let's prepare the folder structure
No longer needed/used
The health check controller in `node-agent` needs the full `Node` object anyway: https://github.com/gardener/gardener/blob/2cdfaa5545f3fc07e93997d6bb52fafdcc57d4ef/pkg/nodeagent/controller/healthcheck/reconciler.go#L45-L46 Hence, we can always work with the full object. Currently (without this change), `node-agent` starts two watches for `Node`s: one with the full object, another one with metadata-only.
When the node name is known, we can use a field selector for its name. Otherwise, we use a label selector for the hostname.
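The selector choice described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `nodeSelector` and `selectorFor` are hypothetical names, and the use of the well-known `kubernetes.io/hostname` label as the hostname label is an assumption.

```go
package main

import "fmt"

// nodeSelector holds the selector strings a Node watch could be restricted with.
// Exactly one of the two fields is set by selectorFor.
type nodeSelector struct {
	Field string // field selector, e.g. "metadata.name=<node name>"
	Label string // label selector, e.g. "kubernetes.io/hostname=<hostname>"
}

// selectorFor is a hypothetical helper: it prefers a field selector on the
// node name and falls back to a label selector on the hostname.
// Assumption: the hostname label is the well-known "kubernetes.io/hostname".
func selectorFor(nodeName, hostname string) nodeSelector {
	if nodeName != "" {
		return nodeSelector{Field: "metadata.name=" + nodeName}
	}
	return nodeSelector{Label: "kubernetes.io/hostname=" + hostname}
}

func main() {
	// Node name already known: restrict by name.
	fmt.Printf("%+v\n", selectorFor("node-1", "host-1"))
	// Node name not yet known: restrict by hostname label.
	fmt.Printf("%+v\n", selectorFor("", "host-1"))
}
```

Either selector limits the watch to a single `Node`, so the API server no longer has to stream every node update to every agent.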
c7ea476 to 79d2ade
/lgtm
LGTM label has been added. Git tree hash: d1fb9e1fb3dea815fb373db25e2c60115f834590
How to categorize this PR?
/area cost scalability
/kind enhancement
What this PR does / why we need it:
Today, each `gardener-node-agent` watches all `Node`s, which is quite costly in terms of network I/O. In fact, each of them has two watches for `Node`s: one with the full object, and one with metadata-only.
This is because of #8885. In that PR, we added a linear mapping approach for the reconciliation delay (to prevent too many kubelet restarts at the same time, e.g. when a shoot changes its Kubernetes patch version). Back then, we were very keen on using metadata-only for this, which made the approach acceptable.
However, #8786 was developed in parallel, and in that PR, the full `Node` object was required:
gardener/pkg/nodeagent/controller/healthcheck/reconciler.go, lines 45 to 46 in 2cdfaa5
This invalidated the approach of listing all nodes in #8885 since it's simply too costly.
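For context, the linear mapping idea can be sketched as below. This is only an illustration under assumptions: the function name `linearDelays`, sorting by node name, and spreading delays evenly over `[0, maxDelay)` are all hypothetical and may differ from the actual implementation.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// linearDelays is a hypothetical helper: it sorts the node names and assigns
// each node a delay proportional to its position, spread evenly over
// [0, maxDelay). The input slice is not modified.
func linearDelays(nodeNames []string, maxDelay time.Duration) map[string]time.Duration {
	sorted := append([]string(nil), nodeNames...)
	sort.Strings(sorted)

	delays := make(map[string]time.Duration, len(sorted))
	for i, name := range sorted {
		// Position i of n nodes maps to i/n of the maximum delay.
		delays[name] = time.Duration(int64(maxDelay) * int64(i) / int64(len(sorted)))
	}
	return delays
}

func main() {
	names := []string{"node-c", "node-a", "node-d", "node-b"}
	delays := linearDelays(names, 4*time.Minute)
	for _, name := range []string{"node-a", "node-b", "node-c", "node-d"} {
		fmt.Println(name, delays[name])
	}
}
```

The important property is that computing any single delay requires knowing all nodes, which is why each agent had to list and watch every `Node`.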
With this PR, we introduce a new controller in `gardener-resource-manager` which still follows the linear mapping approach for computing the reconciliation delays for nodes. It adds its computation result to the node annotations with key `node-agent.gardener.cloud/reconciliation-delay`. `gardener-node-agent` can now simply read this value and use it for its reconciliation delay. This allows us to restrict its WATCH to only the `Node` it is responsible for (with the help of label/field selectors).
In addition, I removed the metadata-only usage of `Node` objects in `gardener-node-agent` to prevent having two watches for `Node`s as explained above. Since at least one controller requires the full object, we can always work with the full object only.
Which issue(s) this PR fixes:
Follow-up of #8885 and #8786
Related to #8023
Special notes for your reviewer:
/cc @oliver-goetz
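To illustrate the agent side of the change, here is a minimal sketch of reading the delay from the node's annotations. The annotation key is the one named in the description. The helper name `reconciliationDelay`, the fallback behaviour, and the assumption that the value is a Go duration string (e.g. `90s`) are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// Annotation key as named in the PR description.
const delayAnnotation = "node-agent.gardener.cloud/reconciliation-delay"

// reconciliationDelay is a hypothetical helper: it reads the delay from the
// node's annotations and returns fallback if the annotation is missing,
// unparsable, or negative.
// Assumption: the value is formatted as a Go duration string such as "90s".
func reconciliationDelay(annotations map[string]string, fallback time.Duration) time.Duration {
	raw, ok := annotations[delayAnnotation]
	if !ok {
		return fallback
	}
	d, err := time.ParseDuration(raw)
	if err != nil || d < 0 {
		return fallback
	}
	return d
}

func main() {
	annotations := map[string]string{delayAnnotation: "90s"}
	fmt.Println(reconciliationDelay(annotations, 0))
	fmt.Println(reconciliationDelay(nil, 5*time.Second))
}
```

Because the value lives on the agent's own `Node` object, reading it needs no knowledge of any other node.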
Release note: