-
Notifications
You must be signed in to change notification settings - Fork 2.1k
feat: Introduce object limits #2626
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mrueg The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
This issue is currently awaiting triage. If kube-state-metrics contributors determine this is a relevant issue, they will accept it by applying the The Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
9f17eee
to
915ef4e
Compare
adee22a
to
dcffe5f
Compare
e1c99cf
to
b7f6f2b
Compare
pkg/watch/watch.go
Outdated
} | ||
i.metrics.ListRequestsTotal.WithLabelValues("success", i.resource).Inc() | ||
|
||
if i.limit != 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'd want to validate against negative values, also s/int/uint preferably. I'm reviewing this on my back from work so apologies if I missed something obvious here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, to be honest, I have no idea why the kubernetes API made limits a signed int. https://pkg.go.dev/k8s.io/apimachinery/pkg/apis/meta/v1#ListOptions
00b2c6b
to
93b4b9f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have an e2e test-case as well that covers --object-limit
? Also one nit, otherwise LGTM.
@@ -202,5 +204,9 @@ func (o *Options) Validate() error { | |||
return fmt.Errorf("value for --auto-gomemlimit-ratio=%f must be greater than 0 and less than or equal to 1", o.AutoGoMemlimitRatio) | |||
} | |||
|
|||
if o.ObjectLimit < 0 { | |||
return fmt.Errorf("value for --object-limit=%d must be equal or greater than 0", o.ObjectLimit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want this to be greater than zero? I see in a couple of instances in watch.go
that we check for this being greater than zero? Or maybe we want the conditions to reflect this value (0) as well, and as such, accommodate this in watch.go
as well?
The latter would make more sense as in we keep the same expectations as cache.ListerWatcher
allows upstream.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It throws an error because we assume that a negative object limit is invalid. If it's zero, we don't want to throw an error because this is the default and means that it's not set.
3613b0d
to
fcd7a8b
Compare
00bacd5
to
9ec4b78
Compare
This change allows user-controlled limits on how many objects KSM will list from the API. This is helpful to prevent resource exhaustion on KSM, in case the API creates too many resources. The object limit it set globally and applied per resource watched.
@rexagod I added another e2e test, please take a look if you have some spare cycles. |
What this PR does / why we need it:
This change allows user-controlled limits on how many objects KSM will list from the API. This is helpful to prevent resource exhaustion on KSM, in case the API creates too many resources.
The object limit it set globally and applied per resource watched.
This is currently a WIP as I'm not sure if it will work as expected and it needs further testing.
How does this change affect the cardinality of KSM: (increases, decreases or does not change cardinality)
Introduces a new metric on ksm telemetry, which is static over KSM config lifetime:
kube_state_metrics_list_limit
This will help with alerting when a threshold is reached.
Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):Fixes #2622