Admission controllers can cause unnecessary significant load on apiserver #22422

wojtek-t · 2016-03-03T09:35:17Z

Recently we observed high latencies for POST pods in our scalability tests.
For more context see: #22340

It appeared to be throttling related, e.g.:

W0302 21:38:28.190204       5 request.go:627] Throttling request took 1.281100127s, request: GET:http://127.0.0.1:8080/api/v1/namespaces/e2e-tests-load-6cqci/limitranges
I0302 21:38:28.191235       5 handlers.go:152] GET /api/v1/namespaces/e2e-tests-load-6cqci/limitranges: (853.53µs) 200 [[kube-apiserver/v1.2.0 (linux/amd64) kubernetes/2cf9157] 127.0.0.1:58515]
I0302 21:38:28.193396       5 trace.go:57] Trace "Create /api/v1/namespaces/e2e-tests-load-6cqci/pods" (started 2016-03-02 21:38:26.908795629 +0000 UTC):
[20.412µs] [20.412µs] About to convert to expected version
[239.558µs] [219.146µs] Conversion done
[1.282815499s] [1.282575941s] About to store object in database
[1.284377959s] [1.56246ms] Object stored in database
[1.284389884s] [11.925µs] Self-link added
[1.284548334s] [158.45µs] END

As an example, look at the resource quota admission controller.
In #20446, a fallback to an LRU cache was added for the case where the Indexer has no objects.
However, note what happens if we send multiple POST pods to the apiserver at the same time.
In Admit(), if there are no results in the Indexer, we list ResourceQuotas.
But if there are multiple calls at the same time, each of them can issue its own LIST before the cache is updated:
https://github.com/kubernetes/kubernetes/blob/master/plugin/pkg/admission/resourcequota/admission.go#L131
Those concurrent LISTs can then be throttled, as in the log above.

What we should do is: if there is a List() in flight, we should wait until it is finished and lookupCache is updated, instead of issuing another one.

@derekwaynecarr


derekwaynecarr · 2016-03-03T15:10:04Z

@wojtek-t - thanks for the analysis. @deads2k - fyi, this is an interesting side-effect of the LRU cache pattern, not sure if we have adopted it in other places as well, but good to keep in mind.

derekwaynecarr · 2016-03-03T15:17:04Z

To clarify, it was adopted in LimitRange and ResourceQuota admission plug-ins in separate PRs.

wojtek-t · 2016-03-03T15:26:51Z

@derekwaynecarr - it took me some time, but it was a useful experience. Thanks for assigning yourself.

fejta-bot · 2017-12-25T13:51:38Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

fejta-bot · 2018-01-24T15:40:26Z

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle rotten
/remove-lifecycle stale

fejta-bot · 2018-02-23T15:46:49Z

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

aimuz · 2022-09-23T07:24:29Z

/reopen

k8s-ci-robot · 2022-09-23T07:24:33Z

@aimuz: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

wojtek-t added the sig/scalability and team/control-plane labels Mar 3, 2016

wojtek-t added this to the next-candidate milestone Mar 3, 2016

wojtek-t mentioned this issue Mar 3, 2016

e2e flake: density and load are failing with too high latency of api calls #22340

Closed

derekwaynecarr self-assigned this Mar 3, 2016

smarterclayton mentioned this issue May 5, 2016

Moving StorageFactory building logic to genericapiserver #24787

Merged

wojtek-t removed the team/control-plane label May 30, 2017

k8s-ci-robot added the lifecycle/stale label Dec 25, 2017

k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Jan 24, 2018

k8s-ci-robot closed this as completed Feb 23, 2018

This was referenced Dec 14, 2020

If there are multiple operations at the same time and cache has just expired, pacoxu/kubernetes#905

Open

If there are multiple operations at the same time and cache has just expired, pacoxu/kubernetes#1193

Open

k8s-ci-robot mentioned this issue Sep 23, 2022

Fixed: 22422 use singleflight to alleviate simultaneous calls to #112696

Merged

DrAuYueng mentioned this issue Jan 9, 2023

Reduce simultaneous calls to quota queries #114919

Closed

flavianmissi added a commit to flavianmissi/kubernetes that referenced this issue Apr 3, 2024

resourcequota: use singleflight.Group to reduce apiserver load

dfa7da9

relates to kubernetes#22422 and kubernetes#123806 Signed-off-by: Flavian Missi <fmissi@redhat.com>

flavianmissi mentioned this issue Apr 3, 2024

resourcequota: use singleflight.Group to reduce apiserver load #124163

Merged

flavianmissi added a commit to flavianmissi/kubernetes that referenced this issue Apr 10, 2024

resourcequota: use singleflight.Group to reduce apiserver load

e13ff5e

relates to kubernetes#22422 and kubernetes#123806

jingczhang pushed a commit to nokia/kubernetes that referenced this issue May 7, 2024

resourcequota: use singleflight.Group to reduce apiserver load

32390f2

relates to kubernetes#22422 and kubernetes#123806
