optimize watch-cache getlist #116327

Merged
merged 4 commits into from Apr 11, 2023


sxllwx
Member

@sxllwx sxllwx commented Mar 7, 2023

What type of PR is this?

What this PR does / why we need it:

Faster watch-cache get-list.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

In order to facilitate verification, I split it into two commits:

  • Add benchmarks
  • Optimize GetList

Here are my own benchmark results, FYI:

# before
$ go test -bench=BenchmarkCacher_GetList  -count 5 . -run=none --benchmem
goos: linux
goarch: amd64
pkg: k8s.io/apiserver/pkg/storage/cacher
cpu: AMD EPYC 7K62 48-Core Processor
BenchmarkCacher_GetList-16    	14	 137743757 ns/op	129016510 B/op	   67975 allocs/op
BenchmarkCacher_GetList-16    	15	 121673849 ns/op	151072125 B/op	   66776 allocs/op
BenchmarkCacher_GetList-16    	15	  88535782 ns/op	151072186 B/op	   66774 allocs/op
BenchmarkCacher_GetList-16    	18	  83706540 ns/op	126228029 B/op	   63987 allocs/op
BenchmarkCacher_GetList-16    	19	 124967934 ns/op	149816737 B/op	   63252 allocs/op
PASS

# after

$ go test -bench=BenchmarkCacher_GetList  . -run=none --benchmem --count=5
goos: linux
goarch: amd64
pkg: k8s.io/apiserver/pkg/storage/cacher
cpu: AMD EPYC 7K62 48-Core Processor
BenchmarkCacher_GetList-16    	      44	  24302640 ns/op	 2528573 B/op	    5749 allocs/op
BenchmarkCacher_GetList-16    	      45	  23997310 ns/op	 2490390 B/op	    5623 allocs/op
BenchmarkCacher_GetList-16    	      46	  23521017 ns/op	 2453680 B/op	    5501 allocs/op
BenchmarkCacher_GetList-16    	      48	  23682351 ns/op	 2384704 B/op	    5273 allocs/op
BenchmarkCacher_GetList-16    	      49	  23772900 ns/op	 2352476 B/op	    5166 allocs/op
PASS

# compare
benchcmp before.text after.text                                                                      
benchmark                      old ns/op     new ns/op     delta
BenchmarkCacher_GetList-16     137743757     24302640      -82.36%
BenchmarkCacher_GetList-16     121673849     23997310      -80.28%
BenchmarkCacher_GetList-16     88535782      23521017      -73.43%
BenchmarkCacher_GetList-16     83706540      23682351      -71.71%
BenchmarkCacher_GetList-16     124967934     23772900      -80.98%

benchmark                      old allocs     new allocs     delta
BenchmarkCacher_GetList-16     67975          5749           -91.54%
BenchmarkCacher_GetList-16     66776          5623           -91.58%
BenchmarkCacher_GetList-16     66774          5501           -91.76%
BenchmarkCacher_GetList-16     63987          5273           -91.76%
BenchmarkCacher_GetList-16     63252          5166           -91.83%

benchmark                      old bytes     new bytes     delta
BenchmarkCacher_GetList-16     129016510     2528573       -98.04%
BenchmarkCacher_GetList-16     151072125     2490390       -98.35%
BenchmarkCacher_GetList-16     151072186     2453680       -98.38%
BenchmarkCacher_GetList-16     126228029     2384704       -98.11%
BenchmarkCacher_GetList-16     149816737     2352476       -98.43%

Does this PR introduce a user-facing change?

Kube-apiserver: Improved memory use when performing GetList on the cache.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added the following labels on Mar 7, 2023:

  • do-not-merge/work-in-progress: Indicates that a PR should not merge because it is a work in progress.
  • size/M: Denotes a PR that changes 30-99 lines, ignoring generated files.
  • do-not-merge/release-note-label-needed: Indicates that a PR should not merge because it's missing one of the release note labels.
  • cncf-cla: yes: Indicates the PR's author has signed the CNCF CLA.
  • do-not-merge/needs-kind: Indicates a PR lacks a `kind/foo` label and requires one.
  • do-not-merge/needs-sig: Indicates an issue or PR lacks a `sig/foo` label and requires one.
  • needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.
  • needs-priority: Indicates a PR lacks a `priority/foo` label and requires one.
  • area/apiserver
  • sig/api-machinery: Categorizes an issue or PR as relevant to SIG API Machinery.

and removed the do-not-merge/needs-sig label.
@k8s-ci-robot k8s-ci-robot requested review from deads2k and enj March 7, 2023 13:27
@sxllwx sxllwx changed the title [apiserver] Optimize watch-cache getlist Optimize watch-cache getlist Mar 7, 2023
@sxllwx sxllwx changed the title Optimize watch-cache getlist optimize watch-cache getlist Mar 7, 2023
@sxllwx sxllwx force-pushed the optimize/watch-cache-getlist branch from 40dcba9 to ca8a72f Compare March 7, 2023 13:44
@sxllwx sxllwx marked this pull request as ready for review March 7, 2023 13:48
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 7, 2023
@liggitt
Member

liggitt commented Mar 7, 2023

cc @wojtek-t

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 7, 2023
pred := storage.SelectionPredicate{
	Label: labels.Everything(),
	Field: fields.Everything(),
}
Member

This benchmark assumes that we actually return all objects.

The other extremely important use case is where most of the objects are actually filtered out (e.g. a Kubelet listing its own pods).
Can you add a second benchmark that simulates this case (i.e. only, say, 50 out of those 50,000 pods are returned)?
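As a rough illustration of the two scenarios (return everything vs. return 50 of 50,000), here is a self-contained sketch. It does not use the cacher or `storage.SelectionPredicate`; the `pod` type and the `makePods` and `list` helpers are hypothetical stand-ins, and the numbers it prints say nothing about the real `GetList` path.

```go
package main

import (
	"fmt"
	"testing"
)

// pod is a hypothetical stand-in for a cached object; only the field
// used for filtering is modeled.
type pod struct {
	name     string
	nodeName string
}

// makePods builds total pods spread round-robin over the given number of nodes.
func makePods(total, nodes int) []pod {
	pods := make([]pod, total)
	for i := range pods {
		pods[i] = pod{
			name:     fmt.Sprintf("pod-%d", i),
			nodeName: fmt.Sprintf("node-%d", i%nodes),
		}
	}
	return pods
}

// list returns the pods accepted by match, mimicking a filtered LIST.
func list(pods []pod, match func(pod) bool) []pod {
	var out []pod
	for _, p := range pods {
		if match(p) {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	pods := makePods(50000, 1000) // 50 pods per node

	cases := []struct {
		name  string
		match func(pod) bool
	}{
		{"all", func(pod) bool { return true }},
		{"one-node", func(p pod) bool { return p.nodeName == "node-7" }},
	}

	for _, tc := range cases {
		// testing.Benchmark runs the function synchronously, outside "go test".
		res := testing.Benchmark(func(b *testing.B) {
			b.ReportAllocs()
			for i := 0; i < b.N; i++ {
				_ = list(pods, tc.match)
			}
		})
		fmt.Printf("%-8s returned=%d %s %s\n",
			tc.name, len(list(pods, tc.match)), res, res.MemString())
	}
}
```

Keeping both cases in one table makes it easy to see whether an optimization that helps the "return everything" path also helps, or hurts, the selective path.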

Member Author

Your prompt reply is much appreciated 😄! Use-cases have been added and I provided the latest benchmark results at #116327 (comment). PTAL

Member Author

Fixed, PTAL @wojtek-t

@wojtek-t wojtek-t self-assigned this Mar 7, 2023
@fedebongio
Contributor

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Mar 7, 2023
@sxllwx sxllwx force-pushed the optimize/watch-cache-getlist branch from ca8a72f to 3073b0c Compare March 8, 2023 03:33
@k8s-ci-robot k8s-ci-robot removed the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Mar 8, 2023
@k8s-ci-robot k8s-ci-robot added kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. do-not-merge/needs-kind Indicates a PR lacks a `kind/foo` label and requires one. labels Mar 29, 2023
@sxllwx
Member Author

sxllwx commented Apr 3, 2023

/ping @lavalamp

- add comment to explain why we need to apply for a slice of runtime.Object instead of making a slice of ListObject.Items directly.
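The commit message above mentions collecting a slice of `runtime.Object` rather than sizing `ListObject.Items` directly. The following is only a sketch of that general pattern, not the code from this PR: `item`, `itemList` and `fillItems` are hypothetical names, and the stated motivation is an assumption. The idea illustrated is that when most objects are filtered out, growing a slice of pointers is cheap, and the typed `Items` slice can then be allocated once at its exact final length.

```go
package main

import (
	"fmt"
	"reflect"
)

// item and itemList are hypothetical stand-ins for an API object and its list type.
type item struct {
	Name string
	Node string
}

type itemList struct {
	Items []item
}

// fillItems copies the objects accepted by match into the Items field of listPtr.
// It first gathers the matches in a slice of pointers, then allocates Items
// once at the exact final length and copies each element into place.
func fillItems(listPtr interface{}, objs []*item, match func(*item) bool) error {
	ptr := reflect.ValueOf(listPtr)
	if ptr.Kind() != reflect.Ptr || ptr.IsNil() || ptr.Elem().Kind() != reflect.Struct {
		return fmt.Errorf("need a non-nil pointer to a list struct, got %T", listPtr)
	}
	itemsVal := ptr.Elem().FieldByName("Items")
	if !itemsVal.IsValid() || itemsVal.Kind() != reflect.Slice {
		return fmt.Errorf("%T has no Items slice", listPtr)
	}

	// Pass 1: remember which objects were selected (pointers only).
	var selected []*item
	for _, o := range objs {
		if match(o) {
			selected = append(selected, o)
		}
	}

	// Pass 2: size the destination exactly once, then fill it.
	itemsVal.Set(reflect.MakeSlice(itemsVal.Type(), len(selected), len(selected)))
	for i, o := range selected {
		itemsVal.Index(i).Set(reflect.ValueOf(o).Elem())
	}
	return nil
}

func main() {
	objs := make([]*item, 0, 50000)
	for i := 0; i < 50000; i++ {
		objs = append(objs, &item{
			Name: fmt.Sprintf("pod-%d", i),
			Node: fmt.Sprintf("node-%d", i%1000),
		})
	}

	var list itemList
	err := fillItems(&list, objs, func(o *item) bool { return o.Node == "node-7" })
	fmt.Println(err, len(list.Items), cap(list.Items))
}
```

With this layout the destination slice has no spare capacity, whereas preallocating `Items` for all 50,000 candidates would reserve space for structs that are never returned.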
@sxllwx
Member Author

sxllwx commented Apr 4, 2023

/retest

@lavalamp
Member

lavalamp commented Apr 4, 2023

/lgtm
/approve

Thank you! (this should merge when we open up for 1.28)

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 4, 2023
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: e7fd8ed2f9067a86cc64347e394289cd6ade3bc1

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: lavalamp, sxllwx

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 4, 2023
@sxllwx
Member Author

sxllwx commented Apr 5, 2023

> /lgtm
> /approve
>
> Thank you! (this should merge when we open up for 1.28)

OK, thank you for your help. It has been a pleasure to work with you.

@wojtek-t
Member

wojtek-t commented Apr 5, 2023

Sorry for the delay, this LGTM too.

/lgtm
/label tide/merge-method-squash

@k8s-ci-robot k8s-ci-robot added the tide/merge-method-merge Denotes a PR that should use a standard merge by tide when it merges. label Apr 5, 2023
@wojtek-t wojtek-t added tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. and removed tide/merge-method-merge Denotes a PR that should use a standard merge by tide when it merges. labels Apr 5, 2023
@k8s-ci-robot k8s-ci-robot merged commit 75f17eb into kubernetes:master Apr 11, 2023
12 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.28 milestone Apr 11, 2023
charles-chenzz pushed a commit to charles-chenzz/kubernetes that referenced this pull request Apr 12, 2023
* ftr(watch-cache): add benchmarks

* ftr(kube-apiserver): faster watch-cache getlist

* refine: testcase name

* - refine var name make it easier to convey meaning
- add comment to explain why we need to apply for a slice of runtime.Object instead of making a slice of ListObject.Items directly.
@sxllwx
Member Author

sxllwx commented Jun 1, 2023

I want to add a release note here. I hope it's not too late.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Jun 1, 2023
rayowang pushed a commit to rayowang/kubernetes that referenced this pull request Feb 9, 2024
Labels
  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • area/apiserver
  • cncf-cla: yes: Indicates the PR's author has signed the CNCF CLA.
  • kind/cleanup: Categorizes issue or PR as related to cleaning up code, process, or technical debt.
  • lgtm: "Looks good to me", indicates that a PR is ready to be merged.
  • needs-priority: Indicates a PR lacks a `priority/foo` label and requires one.
  • release-note: Denotes a PR that will be considered when it comes time to generate release notes.
  • sig/api-machinery: Categorizes an issue or PR as relevant to SIG API Machinery.
  • size/L: Denotes a PR that changes 100-499 lines, ignoring generated files.
  • tide/merge-method-squash: Denotes a PR that should be squashed by tide when it merges.
  • triage/accepted: Indicates an issue or PR is ready to be actively worked on.
7 participants