flowschema for etcd operator traffic #462

Merged
merged 1 commit into from Oct 5, 2020

tkashem
Contributor

@tkashem tkashem commented Oct 5, 2020

Traffic from the control plane operators (kas-o, oas-o, auth operator, etcd operator) is important, so these operators now have a dedicated concurrency pool.

See openshift/cluster-kube-apiserver-operator#966 for more details.

@hexfusion
Contributor

/hold
@tkashem can you walk us through this? I understand that we have a dedicated concurrency pool.

questions:

  • does this pool have a limit?
  • if yes what happens if we exceed it?
  • if yes, how can the operator learn the limit thresholds and how close we are to them?
  • if yes did we have a limit before?
  • if yes what happened if we exceeded it?

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 5, 2020
@hexfusion
Contributor

maybe a link to docs would help :)

@tkashem
Contributor Author

tkashem commented Oct 5, 2020

https://kubernetes.io/blog/2020/04/06/kubernetes-1-18-feature-api-priority-and-fairness-alpha/

does this pool have a limit?

Yes. All control plane operators (kas-o, etcd-o, oauth-o, oas-o) share the same concurrency pool, which currently has a share of 10. With the default settings (max-mutating=1000 and max-readonly=3000), that translates to about 175 concurrent requests.
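
To make the arithmetic concrete, here is a minimal sketch of the rule documented for API Priority and Fairness: a priority level's concurrency limit is the server-wide limit multiplied by that level's fraction of all configured shares, rounded up. The function name and the total of 229 shares are assumptions for illustration only; the real total depends on which priority levels a given cluster defines.

```python
import math


def assured_concurrency(max_readonly: int, max_mutating: int,
                        level_shares: int, total_shares: int) -> int:
    """Concurrency limit of one priority level under API Priority and Fairness.

    The server-wide limit is the sum of the two max-inflight settings; each
    priority level receives a slice proportional to its shares, rounded up.
    """
    server_limit = max_readonly + max_mutating
    return math.ceil(server_limit * level_shares / total_shares)


# total_shares=229 is a hypothetical value, not read from a real cluster.
limit = assured_concurrency(max_readonly=3000, max_mutating=1000,
                            level_shares=10, total_shares=229)
print(limit)
```

With these inputs the result is 175, in line with the figure quoted above. A different share total would shift the number accordingly.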

if yes what happens if we exceed it?

The apiserver will reject the request with a 429 (Too Many Requests).
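
For a client, a 429 usually means "back off and try again". The sketch below shows one way a caller might handle it, preferring a `Retry-After` header when one is present. The `call_with_retry` helper and the `(status, headers)` response shape are hypothetical and are not taken from the operator's code.

```python
import time
from typing import Callable, Dict, Tuple

Response = Tuple[int, Dict[str, str]]  # (HTTP status, response headers)


def call_with_retry(send: Callable[[], Response],
                    max_attempts: int = 5,
                    default_delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send() until it stops returning 429 or the attempts run out."""
    response = send()
    for _ in range(max_attempts - 1):
        status, headers = response
        if status != 429:
            break
        # Prefer the server's Retry-After hint (in seconds) when it parses.
        try:
            delay = float(headers.get("Retry-After", default_delay))
        except ValueError:
            delay = default_delay
        sleep(delay)
        response = send()
    return response


# Demo with canned responses instead of a real apiserver.
canned = iter([(429, {"Retry-After": "2"}), (429, {}), (200, {})])
waits = []
final = call_with_retry(lambda: next(canned), sleep=waits.append)
print(final[0], waits)
```

Passing `sleep` as a parameter keeps the helper testable without real delays.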

if yes, how can the operator learn the limit thresholds and how close we are to them?

There are metrics in place that show whether requests are queued for this particular request flow (all requests originating from the etcd operator).
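
One way to check for queued requests is to scan the apiserver's `/metrics` output for the flow-control gauges. The sketch below assumes the gauge `apiserver_flowcontrol_current_inqueue_requests` and a flow schema label; the label's exact spelling has varied across Kubernetes versions, so it is a parameter here. The flow schema name and the sample text are hypothetical.

```python
import re
from typing import Dict

# Matches one sample line of the form: name{label="value",...} number
_SAMPLE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)')
_LABEL = re.compile(r'(\w+)="([^"]*)"')


def queued_requests(metrics_text: str, flow_schema: str,
                    metric: str = "apiserver_flowcontrol_current_inqueue_requests",
                    label: str = "flow_schema") -> float:
    """Sum the queued-request gauge over all series of one flow schema."""
    total = 0.0
    for line in metrics_text.splitlines():
        match = _SAMPLE.match(line)
        if not match or match.group("name") != metric:
            continue  # comment line, unlabeled sample, or a different metric
        labels: Dict[str, str] = dict(_LABEL.findall(match.group("labels")))
        if labels.get(label) == flow_schema:
            total += float(match.group("value"))
    return total


# Hypothetical scrape output; names and values are made up for the demo.
SAMPLE_TEXT = '''\
# TYPE apiserver_flowcontrol_current_inqueue_requests gauge
apiserver_flowcontrol_current_inqueue_requests{flow_schema="etcd-operator-example",priority_level="operators-example"} 3
apiserver_flowcontrol_current_inqueue_requests{flow_schema="other",priority_level="workload-low"} 7
'''
print(queued_requests(SAMPLE_TEXT, "etcd-operator-example"))
```

The parser is deliberately simple and would not handle label values that contain `}`; a Prometheus query over the same metric is the more usual route in practice.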

if yes did we have a limit before?

The previous limit was not as fine-grained as this. Previously we only had the server-wide max-inflight limits (readonly=3000 and mutating=1000).

if yes what happened if we exceeded it?

Same: the apiserver would reject it with a 429. The difference now is that if a bad actor (a different user) floods the apiserver with requests, it won't have any effect on the etcd operator traffic.
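
The isolation argument can be illustrated with a toy model in which every priority level has its own cap on in-flight requests. This is a simplified sketch, not the apiserver's implementation: real API Priority and Fairness also queues requests before rejecting them, and requests here never complete. The level names and the 500 limit are hypothetical.

```python
from typing import Dict


class PriorityLevelPools:
    """Toy model: each priority level has its own cap on in-flight requests."""

    def __init__(self, limits: Dict[str, int]) -> None:
        self.limits = dict(limits)
        self.in_flight = {name: 0 for name in limits}

    def admit(self, level: str) -> int:
        """Return 200 and occupy a seat if the level has room, else 429."""
        if self.in_flight[level] >= self.limits[level]:
            return 429
        self.in_flight[level] += 1
        return 200


pools = PriorityLevelPools({"operators": 175, "everyone-else": 500})

# A noisy client exhausts its own pool and starts receiving 429s...
flood = [pools.admit("everyone-else") for _ in range(10_000)]
# ...while operator traffic is still admitted from its dedicated pool.
operator = [pools.admit("operators") for _ in range(50)]

print(flood.count(429), operator.count(429))
```

In this model the flood sees 9500 rejections while the operator sees none, because the two flows never compete for the same seats.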

@tkashem
Contributor Author

tkashem commented Oct 5, 2020

@hexfusion this shows how many requests per second etcd operator is generating. :)

image

@hexfusion
Contributor

@hexfusion this shows how many requests per second etcd operator is generating. :)

image

ok so plenty of room to expand! thanks for details

@hexfusion
Contributor

/hold cancel

cc @retroflexer @ironcladlou

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 5, 2020
@deads2k
Contributor

deads2k commented Oct 5, 2020

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Oct 5, 2020
@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k, tkashem


@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 5, 2020
@openshift-ci-robot

@tkashem: The following test failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/prow/e2e-disruptive | 272fc26 | link | /test e2e-disruptive |


@tkashem
Contributor Author

tkashem commented Oct 5, 2020

/retest

@openshift-merge-robot openshift-merge-robot merged commit 3945354 into openshift:master Oct 5, 2020