Increase api latency threshold for cluster-scoped list calls #52732
/test all
/retest
/lgtm
/retest
@liggitt @smarterclayton - Can one of you PTAL?
/test pull-kubernetes-e2e-gce-gpu
test/e2e/framework/metrics_util.go
Outdated
// as list response sizes are bigger in general for big clusters. We also use a higher threshold
// for list calls with cluster scope (all namespaces).
apiListCallLatencyThreshold time.Duration = 5 * time.Second
apiClusterScopeListCallThreshold time.Duration = 10 * time.Second
Do you literally mean cluster scoped (i.e., node objects) or cross-namespace lists? If the latter, please choose a less confusing name.
I meant the former (which includes both non-namespaced and all-namespaced calls) - changed the comment to make it clearer.
test/e2e/framework/metrics_util.go
Outdated
if !isListCall ||
	!isBigCluster ||
	(!isClusterScopedCall && latency > apiListCallLatencyThreshold) ||
	(latency > apiClusterScopeListCallThreshold) {
This is a really confusing logic statement. I suggest one if/switch to set the threshold, and then this if should read: if latency > threshold.
Changed it. LG?
6738e88 to f373645
Yay, more precise metrics! :)
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: gmarek, shyamjvs, spiffxp. Associated issue requirement bypassed by: spiffxp. The full list of commands accepted by this bot can be found here.
/test all [submit-queue is verifying that this PR is safe to merge]
/test pull-kubernetes-kubemark-e2e-gce-big
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.
cc @jpbetz
A recent change from @smarterclayton (#52237) added scope to the apiserver metrics. As a result, our current threshold for list calls is no longer sufficient for all-namespace calls, which are now measured separately from namespaced lists. For example (from our last 5k run):
cc @kubernetes/sig-scalability-misc @kubernetes/sig-api-machinery-misc @wojtek-t