
Configurable throughput for clients to the API server. #447

Merged

Conversation

@RaunakShah (Contributor) commented Jun 19, 2020

Adds configurable throughput for clients to the API server.
Uses a separate client for the leader election goroutine.

Testing
Kubernetes version: 1.17.4
CSI driver: Dummy driver that always returns success for CreateVolume() and DeleteVolume() callbacks, with randomized VolumeIDs.

Test: Use 64 threads to create and delete PVCs in a loop for 5 minutes. The number of operations successfully executed within this time gives the throughput (a sketch of a comparable benchmark loop is included after the results below).
Logs - https://gist.github.com/RaunakShah/ac18639695ff968df84021c1ba12df95

  1. QPS = 5.0, Burst = 10 (Default)
    a. Number of successful Create and Delete operations - 728
    b. Ops/sec - 2.435
    I have not pasted the entire log file here but there were 1780 instances of "Throttling ..." within these 5 minutes.
root@k8-master-440:/var/log/containers# cat !$ | grep Throttl | wc -l
cat vsphere-csi-controller-69cbc685fb-8gn9p_kube-system_csi-provisioner-bc80ce4ae96c1b0025f73b2166da2028fc59be55c9aa5b1ea12d848df0fa3fa6.log | grep Throttl | wc -l
1780
  2. QPS = 10.0, Burst = 20
    a. Number of successful Create and Delete operations - 1404
    b. Ops/sec - 4.69

  3. QPS = 25.0, Burst = 50
    a. Number of successful Create and Delete operations - 1666
    b. Ops/sec - 5.57

  4. QPS = 50.0, Burst = 100
    a. Number of successful Create and Delete operations - 1703
    b. Ops/sec - 5.696
    Zero instances of "Throttling ..." within these 5 minutes.

root@k8-master-440:/var/log/containers# cat !$ | grep Throt | wc -l
cat vsphere-csi-controller-7bcbdbc485-9g9vp_kube-system_csi-provisioner-16fff97dcb70cf23f09004656e5b32e574d59be39d4f1ec5ebd5f3a36b25f29a.log | grep Throt | wc -l
0
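The exact benchmark tool is not included in this PR (see the question from @chrishenzie further down), so the following is only a minimal sketch of a comparable load generator. Assumptions not stated in the PR: client-go v0.18+ context-aware clientset methods, a kubeconfig at /root/.kube/config, an existing StorageClass named benchmark-sc backed by the dummy driver, and the default namespace.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Assumption: kubeconfig path; in-cluster config would also work.
	cfg, err := clientcmd.BuildConfigFromFlags("", "/root/.kube/config")
	if err != nil {
		panic(err)
	}
	cs := kubernetes.NewForConfigOrDie(cfg)

	scName := "benchmark-sc" // assumed StorageClass backed by the dummy CSI driver
	duration := 5 * time.Minute
	ctx, cancel := context.WithTimeout(context.Background(), duration)
	defer cancel()

	var ops int64
	var wg sync.WaitGroup
	for w := 0; w < 64; w++ { // 64 concurrent workers, as in the test above
		wg.Add(1)
		go func(worker int) {
			defer wg.Done()
			for i := 0; ctx.Err() == nil; i++ {
				name := fmt.Sprintf("bench-%d-%d", worker, i)
				pvc := &corev1.PersistentVolumeClaim{
					ObjectMeta: metav1.ObjectMeta{Name: name},
					Spec: corev1.PersistentVolumeClaimSpec{
						AccessModes:      []corev1.PersistentVolumeAccessMode{corev1.ReadWriteOnce},
						StorageClassName: &scName,
						Resources: corev1.ResourceRequirements{
							Requests: corev1.ResourceList{
								corev1.ResourceStorage: resource.MustParse("1Gi"),
							},
						},
					},
				}
				if _, err := cs.CoreV1().PersistentVolumeClaims("default").Create(ctx, pvc, metav1.CreateOptions{}); err != nil {
					continue
				}
				// A fuller benchmark would wait for the PVC to become Bound
				// (i.e. CreateVolume succeeded) before deleting it.
				if err := cs.CoreV1().PersistentVolumeClaims("default").Delete(ctx, name, metav1.DeleteOptions{}); err != nil {
					continue
				}
				atomic.AddInt64(&ops, 2) // one Create + one Delete
			}
		}(w)
	}
	wg.Wait()
	fmt.Printf("ops: %d, ops/sec: %.2f\n", ops, float64(ops)/duration.Seconds())
}
```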

What type of PR is this?
/kind bug

What this PR does / why we need it:
This change adds two new flags, "kube-api-qps" and "kube-api-burst", to configure the QPS and Burst of clients to the API server. The client defaults (QPS 5, Burst 10) are not very efficient at larger scales.
The default QPS and Burst for clients are changed from 5 to 50.
This change also uses a separate client for the leader election goroutine (a minimal sketch of the resulting wiring is shown below).
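A minimal sketch of the approach described above, assuming in-cluster config and the standard library flag package (the actual csi-provisioner code also supports a kubeconfig flag and uses a Float32 flag helper, so details differ); this is not the PR's exact code:

```go
package main

import (
	"flag"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/klog"
)

var (
	kubeAPIQPS   = flag.Float64("kube-api-qps", 5, "QPS to use while communicating with the kubernetes apiserver.")
	kubeAPIBurst = flag.Int("kube-api-burst", 10, "Burst to use while communicating with the kubernetes apiserver.")
)

func main() {
	flag.Parse()

	config, err := rest.InClusterConfig()
	if err != nil {
		klog.Fatalf("Failed to create config: %v", err)
	}
	// Apply the configurable rate limits to the main provisioning client.
	config.QPS = float32(*kubeAPIQPS)
	config.Burst = *kubeAPIBurst

	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		klog.Fatalf("Failed to create client: %v", err)
	}

	// Build a second clientset used only by the leader election goroutine, so
	// that bursts of provisioning calls cannot starve lease renewals (#416).
	leaderElectionClient, err := kubernetes.NewForConfig(rest.CopyConfig(config))
	if err != nil {
		klog.Fatalf("Failed to create leader-election client: %v", err)
	}

	_ = clientset
	_ = leaderElectionClient
	// ... construct the CSI provisioner and run leader election here ...
}
```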

Which issue(s) this PR fixes:

Fixes #416 (frequently switching leader when dynamically provisioning many PV objects at the same time)

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

Use a separate client for the leader election goroutine and add kube-api-qps and kube-api-burst configurable parameters for the provisioner's Kubernetes client.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jun 19, 2020
@k8s-ci-robot (Contributor)

Hi @RaunakShah. Thanks for your PR.

I'm waiting for a kubernetes-csi member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Jun 19, 2020
@RaunakShah (Contributor, Author)

/assign @xing-yang @jsafrane

@xing-yang (Contributor)

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jun 19, 2020
Review comments on cmd/csi-provisioner/csi-provisioner.go (outdated, resolved):
@@ -74,6 +74,9 @@ var (

defaultFSType = flag.String("default-fstype", "", "The default filesystem type of the volume to provision when fstype is unspecified in the StorageClass. If the default is not set and fstype is unset in the StorageClass, then no fstype will be set")

kubeAPIQPS = flag.Float32("kube-api-qps", 50, "QPS to use while talking with the kubernetes apiserver. Defaults to 50.0.")
kubeAPIBurst = flag.Int("kube-api-burst", 50, "Burst to use while talking with the kubernetes apiserver. Defaults to 50.")

Contributor:
We probably want to have different sets of QPS/Burst config parameters for the leader election client and the client for create/delete operations.

Contributor (Author):
I thought about that too, but it seemed a little superfluous to add two more flags just for a leader election client. There's also a snapshot client that uses the same config, so that would need its own QPS and Burst too then..

Contributor:
I think it's fine for the snapshot client to use the same config as the client that handles create/delete operations.
Let's see if @msau42 has any suggestions here.

@@ -124,10 +127,15 @@ func main() {
if err != nil {
klog.Fatalf("Failed to create config: %v", err)
}

config.QPS = *kubeAPIQPS
config.Burst = *kubeAPIBurst
Collaborator:
Can you explain how Burst interacts with QPS?

Also, are you able to get a profile of what operations are coming from the provisioner? I want to make sure we're not hiding inefficiencies or bugs that could be fixed, such as not using informers or not rate limiting/backing off. It may be worth talking to sig-scalability or sig-api-machinery about what reasonable qps values would be, because there are tradeoffs if we give everything a high qps.

Contributor (Author):
NewForConfig() uses a token-bucket approach. QPS is the sustained number of queries allowed per second, and Burst is the size of the token bucket, which refills at the QPS rate. So requests to the API server are only throttled once a burst of "Burst" requests has drained the bucket faster than it refills.
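As a small illustration (not code from this PR): when no custom RateLimiter is set, client-go builds a token-bucket limiter from QPS and Burst, roughly equivalent to the following, where the first Burst requests pass immediately and later requests wait for the bucket to refill at the QPS rate.

```go
package main

import (
	"fmt"
	"time"

	"k8s.io/client-go/util/flowcontrol"
)

func main() {
	// Defaults used by client-go: QPS = 5, Burst = 10.
	limiter := flowcontrol.NewTokenBucketRateLimiter(5.0, 10)

	start := time.Now()
	for i := 1; i <= 20; i++ {
		limiter.Accept() // requests 1-10 return immediately; later ones wait roughly 200ms each
		fmt.Printf("request %d admitted after %v\n", i, time.Since(start).Round(time.Millisecond))
	}
}
```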

There are some Get() operations on the API server that could potentially be replaced with informers, but Create()/Update()/Delete() are still limited by the throughput of these clients. I've updated the testing section with more information but using QPS/Burst of 50/100 increases ops/sec of CreateVolume and DeleteVolume from ~2.5 to ~5.5.

Can we potentially tag someone from sig-scalability or sig-api-machinery here for their input?

Collaborator:
I'm surprised we need a qps of 50 to achieve 5.5 ops per second. Does that imply we're making 10 api calls per operation?

cc @wojtek-t if you have any thoughts on reasonable client qps/burst values

Forgive my ignorance here, but is csi-provisioner an O(1) instance in the cluster, or can there be a lot of them (e.g. one per node or something like that)?
If it is O(1), setting it to 50 sounds fine; if there are a lot of them, that sounds like too much to me.

BTW - I second Michelle's question - if you need 50 qps to achieve 5.5 ops, that doesn't sound very promising...

Collaborator:

@RaunakShah thanks for putting together the benchmarks! The results are concerning to me: increasing qps 10x only achieves a 2x increase in ops throughput. Could you also try increasing qps 2x and 5x to get some more datapoints?

(Also separately, I think your benchmarking work is very valuable, and we should consider adding it into our CI and also try to get CSI tested in sig-scalability's test framework as well)

Contributor (Author):

Yup, I'll play around a little and add some more results shortly!
I'm not sure if these results are entirely accurate because I've done some hacking on the CSI driver as well, but they're a reasonable baseline, I would think.

@RaunakShah (Contributor, Author) commented Jun 25, 2020:

@msau42 I've updated the PR description with the test results. Please take a look!

I also did some further analysis. Interestingly, there were "Throttling ... " messages for four operations:

  1. Posting events:
{"log":"I0625 17:10:17.542869       1 request.go:565] Throttling request took 221.204186ms, request: POST:https://10.96.0.1:443/api/v1/namespaces/default/events\n","stream":"stderr","time":"2020-06-25T17:10:17.54314291Z"}
  2. Posting PVs:
{"log":"I0625 17:10:17.741553       1 request.go:565] Throttling request took 402.941089ms, request: POST:https://10.96.0.1:443/api/v1/persistentvolumes\n","stream":"stderr","time":"2020-06-25T17:10:17.741860753Z"}
  3. Deleting PVs:
{"log":"I0625 17:10:21.741317       1 request.go:565] Throttling request took 3.330344146s, request: DELETE:https://10.96.0.1:443/api/v1/persistentvolumes/pvc-90b8a0f3-6914-4ae3-9744-74fbc6e0f3c4\n","stream":"stderr","time":"2020-06-25T17:10:21.741666156Z"}
  4. Get PVs:
{"log":"I0625 17:10:20.541396       1 request.go:565] Throttling request took 2.367093054s, request: GET:https://10.96.0.1:443/api/v1/persistentvolumes/pvc-c6fca517-345a-42b5-bde1-ea707f16e084\n","stream":"stderr","time":"2020-06-25T17:10:20.541660324Z"}

(1), (2), and (3) above probably can't be replaced, but the external-provisioner lib does a Get() on the API server for every provision and delete operation (ref: https://github.com/kubernetes-sigs/sig-storage-lib-external-provisioner/blob/master/controller/controller.go#L1290 and https://github.com/kubernetes-sigs/sig-storage-lib-external-provisioner/blob/master/controller/controller.go#L1428). The same clientset is used to initialize ProvisionController - https://github.com/kubernetes-csi/external-provisioner/blob/master/cmd/csi-provisioner/csi-provisioner.go#L246

I did another test by adding a pvLister in sig-storage-lib-external-provisioner instead of querying the API server, and got a ~2x increase in throughput for the default qps and burst! I still see "Throttling ... " messages up to a QPS of 25, but not for any Get() operations (a rough sketch of the lister approach is included after the results below):

  1. QPS = 5.0, Burst = 10
    a. Number of successful Create and Delete operations - 1425
    b. Ops/sec - 4.76
root@k8-master-440:/var/log/containers/with-lib-change# cat qps5.log | wc -l
2450
  2. QPS = 10.0, Burst = 20
    a. Number of successful Create and Delete operations - 1586
    b. Ops/sec - 5.30
root@k8-master-440:/var/log/containers/with-lib-change# cat qps10.log | wc -l
3137
  3. QPS = 25.0, Burst = 50
    a. Number of successful Create and Delete operations - 1694
    b. Ops/sec - 5.66
root@k8-master-440:/var/log/containers/with-lib-change# cat qps25.log | wc -l
0

Since this change doesn't address sig-storage-lib-external-provisioner, I kept this out of the PR description, but I think it's worth addressing in subsequent changes.
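For reference, a rough sketch of the pvLister experiment described above (names such as newPVLister and lookupPV are illustrative; the real change would live inside sig-storage-lib-external-provisioner): an informer-backed PersistentVolume lister serves the per-operation lookups from a local cache instead of issuing rate-limited GET calls.

```go
package pvcache

import (
	"fmt"
	"time"

	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	corelisters "k8s.io/client-go/listers/core/v1"
)

// newPVLister starts a shared informer for PersistentVolumes and returns a
// lister that reads from the informer's local cache.
func newPVLister(cs kubernetes.Interface, stopCh <-chan struct{}) corelisters.PersistentVolumeLister {
	factory := informers.NewSharedInformerFactory(cs, time.Hour /* resync */)
	lister := factory.Core().V1().PersistentVolumes().Lister()
	factory.Start(stopCh)
	factory.WaitForCacheSync(stopCh)
	return lister
}

// lookupPV replaces a GET /api/v1/persistentvolumes/<name> call (which counts
// against the client's QPS/Burst budget) with a cache read; the trade-off is
// that the cache can lag slightly behind the API server.
func lookupPV(lister corelisters.PersistentVolumeLister, name string) error {
	pv, err := lister.Get(name)
	if err != nil {
		return err // includes a NotFound error when the PV is absent from the cache
	}
	fmt.Printf("found PV %s in phase %s\n", pv.Name, pv.Status.Phase)
	return nil
}
```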

Collaborator:

That's really great! Assuming we can fix lib-external-provisioner, it doesn't look like increasing the default qps/burst values will significantly improve throughput. What do you think of this plan?

  • Make improvements to sig-storage-lib-external-provisioner to use informers
  • Separate out clientset used for leader election
  • Make qps/burst configurable but use client default values as the default
  • Future: tuning other parameters like backoff ratio/interval
  • Future: Add benchmarking integration test using mock driver to our CI
  • Future: Add e2e scalability tests using mock driver to sig-scalability test framework

Contributor (Author):

Yup, sounds good to me. I've updated this PR accordingly. Should I go ahead and raise an issue/PR/both on lib-external-provisioner?

Collaborator:

Yup that sounds good!

@RaunakShah RaunakShah force-pushed the configurable_throughput branch 2 times, most recently from d101568 to 7778d54 on June 26, 2020 03:12
Defaults for QPS and Burst remain unchanged.
This change also creates a new client for leader election.
@msau42 (Collaborator) commented Jun 26, 2020

/lgtm
/approve

Can you also extend the release note to say that the leader election client is separate? Go ahead and "/hold cancel" when updated, thanks!
/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 26, 2020
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 26, 2020
@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: msau42, RaunakShah

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 26, 2020
@RaunakShah (Contributor, Author)

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 26, 2020
@k8s-ci-robot k8s-ci-robot merged commit 6bc88c9 into kubernetes-csi:master Jun 26, 2020
@chrishenzie (Contributor)

Hi @RaunakShah, do you mind sharing the benchmark setup/tool you used to collect these metrics? I'm trying to gain more context on this issue and am curious how contributors collect these metrics.
