Use github.com/klauspost/compress/gzip in kubernetes #104071

Closed

mborsz opened this issue Aug 2, 2021 · 18 comments
Assignees
Labels
  • area/code-organization: Issues or PRs related to kubernetes code organization
  • kind/feature: Categorizes issue or PR as related to a new feature.
  • lifecycle/stale: Denotes an issue or PR has remained open with no activity and has become stale.
  • sig/api-machinery: Categorizes an issue or PR as relevant to SIG API Machinery.
  • sig/architecture: Categorizes an issue or PR as relevant to SIG Architecture.
  • triage/accepted: Indicates an issue or PR is ready to be actively worked on.

Comments

@mborsz
Member

mborsz commented Aug 2, 2021

What would you like to be added:

This is an optimization proposal: in some cases (when a lot of large LIST calls happen at roughly the same time), it looks like a significant part of the CPU usage is generated by gzip compression.

There is an alternative implementation of the gzip library which seems to be less CPU-intensive.

I think we should investigate whether migrating from Go's compress/gzip library to the klauspost/compress/gzip library is possible and, if so, do it.

My POC with promising preliminary results is here: #99300

Personally I don't have the capacity to continue that PR, so I'm creating this issue so that someone else can pick up this optimization (it looks like a good first issue).
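
To make the proposal concrete, here is a minimal sketch (not taken from #99300; the payload and level are illustrative) of why the migration is attractive: klauspost/compress/gzip is intended as a drop-in replacement, so in principle only the import path changes while the writer API stays the same.

    package main

    import (
        "bytes"
        "fmt"

        // "compress/gzip"                        // stdlib implementation
        gzip "github.com/klauspost/compress/gzip" // drop-in replacement with the same API
    )

    // compress gzips payload at the given level; the call pattern is identical
    // for both packages, which is what makes the swap low-friction.
    func compress(payload []byte, level int) ([]byte, error) {
        var buf bytes.Buffer
        gw, err := gzip.NewWriterLevel(&buf, level)
        if err != nil {
            return nil, err
        }
        if _, err := gw.Write(payload); err != nil {
            return nil, err
        }
        if err := gw.Close(); err != nil {
            return nil, err
        }
        return buf.Bytes(), nil
    }

    func main() {
        out, err := compress(bytes.Repeat([]byte("kubernetes "), 1000), 4)
        if err != nil {
            panic(err)
        }
        fmt.Printf("compressed to %d bytes\n", len(out))
    }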

@mborsz added the kind/feature label on Aug 2, 2021
@k8s-ci-robot added the needs-sig and needs-triage labels on Aug 2, 2021
@mritunjaysharma394

/assign

@neolit123
Member

/sig api-machinery

@k8s-ci-robot added the sig/api-machinery label and removed the needs-sig label on Aug 2, 2021
@caesarxuchao
Member

/cc @wojtek-t
/triage accepted

@k8s-ci-robot added the triage/accepted label and removed the needs-triage label on Aug 3, 2021
@neolit123
Member

neolit123 commented Aug 4, 2021

I think we should investigate whether migrating from Go's compress/gzip library to the klauspost/compress/gzip library is possible and, if so, do it.

switching from stdlib to a third party library comes with a cost:

  • looks like the library is maintained by a single person, is that a concern? my guess is not given how many small vendors we have in k/k.
  • has anyone done evaluation of the algorithm implementation that we want to use? security?

once this is in k/k taking it out becomes difficult, one reason would be the perf regression if the library is actually faster.

/area code-organization
/sig architecture

@k8s-ci-robot added the area/code-organization label on Aug 4, 2021
@k8s-ci-robot
Contributor

@neolit123: The label(s) sig/secuirty cannot be applied, because the repository doesn't have them.

In response to this:

I think we should investigate whether migrating from Go's compress/gzip library to the klauspost/compress/gzip library is possible and, if so, do it.

switching from stdlib to a third party library comes with a cost:

  • looks like the library is maintained by a single person, is that a concern? my guess is not given how many small vendors we have in k/k.
  • has anyone done evaluation of the algorithm implementation that we want to use? security?

once this is in k/k taking it out becomes difficult, one reason would be the perf regression if the library is actually faster.

/area code-organization
/sig architecture secuirty

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot added the sig/architecture label on Aug 4, 2021
@mritunjaysharma394

I think we should investigate whether migrating from Go's compress/gzip library to the klauspost/compress/gzip library is possible and, if so, do it.

switching from stdlib to a third party library comes with a cost:

  • looks like the library is maintained by a single person, is that a concern? my guess is not given how many small vendors we have in k/k.

I personally feel that this shouldn't be a problem, as the library is actively maintained; in fact, its latest version, 1.13.3, was released only 3 days ago (https://github.com/klauspost/compress). But of course, we would love to have the views of other folks in the community.

  • has anyone done evaluation of the algorithm implementation that we want to use? security?

I think we will have to reach out to other community members to discuss the security aspect, but I tried to do an evaluation of the algorithm implementation that makes this library faster. Here are some inferences; I would rather quote them from the blog of the library's maintainer:

  • Minimum matches are 4 bytes, this leads to fewer searches and better compression.
  • Stronger hash (iSCSI CRC32) for matches on x64 with SSE 4.2 support. This leads to fewer hash collisions.
  • Literal byte matching using SSE 4.2 for faster long-match comparisons.
  • Bulk hashing on matches.
  • Much faster dictionary indexing with NewWriterDict()/Reset().
  • CRC32 optimized for 10x speedup on SSE 4.2. Available separately.
  • Make Bit Coder faster by assuming we are on a 64 bit CPU.
  • Remove some branches by splitting the main deflate loop.

Another couple of interesting points to consider:

  • the real speedup depends a lot on the data
  • there's another library from the same maintainer called pgzip, which is not meant as a general-purpose gzip replacement but rather for large amounts of data; since it processes blocks in parallel, it has a speed advantage over the other compressors (see the sketch just below this list)
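
As a rough illustration of that second point, here is a sketch of what using pgzip could look like, assuming its writer keeps the stdlib-style API plus a SetConcurrency knob; the payload and the block/goroutine counts are made up for the example.

    package main

    import (
        "bytes"
        "fmt"

        "github.com/klauspost/pgzip"
    )

    func main() {
        // A deliberately large payload; parallel compression only pays off on
        // big writes, which is why pgzip is not a general-purpose replacement.
        payload := bytes.Repeat([]byte("some large LIST response "), 400000)

        var buf bytes.Buffer
        w := pgzip.NewWriter(&buf)
        // Compress in 1 MiB blocks using up to 4 goroutines (illustrative values).
        if err := w.SetConcurrency(1<<20, 4); err != nil {
            panic(err)
        }
        if _, err := w.Write(payload); err != nil {
            panic(err)
        }
        if err := w.Close(); err != nil {
            panic(err)
        }
        fmt.Printf("in: %d bytes, out: %d bytes\n", len(payload), buf.Len())
    }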

I am also attaching the links to the three blog posts by the maintainer himself, which helped me out a lot. The posts also include spreadsheets of performance test results, which can help us reach a better understanding.

Thanks a lot, looking forward!

once this is in k/k taking it out becomes difficult, one reason would be the perf regression if the library is actually faster.

/area code-organization
/sig architecture

@neolit123
Member

neolit123 commented Aug 5, 2021

that is a good analysis, @mritunjaysharma394. thank you for that.

of course, i'm not suggesting the project is poorly maintained or insecure.
but one example is that we are currently trying to decouple k/k from https://github.com/PuerkitoBio/purell, which is a well-written (single-person) library for URL normalization but has become stale and unmaintained.

the judgement is in the hands of the owning SIGs here.

@liggitt
Member

liggitt commented Aug 6, 2021

my first thought is that if there is a demonstrable bottleneck due to the stdlib gzip implementation, I'd like to see if that could be reported/improved there first. This is a large and difficult to review dependency to add... I'd like to avoid it if possible.

@mritunjaysharma394

mritunjaysharma394 commented Aug 7, 2021

thanks a lot for your feedback @neolit123 and @liggitt.

my first thought is that if there is a demonstrable bottleneck due to the stdlib gzip implementation,

I am not sure if it's a bottleneck, but switching does seem to reduce CPU usage: not only does PR #99300 show good results in this direction, the benchmark tests that I ran on my own system also showed good results.

As an example, I would like to show two such benchmark runs:

  • For stdlib gzip, level=4
goarch: amd64
pkg: k8s.io/apiserver/pkg/endpoints/handlers/responsewriters
cpu: Intel(R) Core(TM) i7-1065G7 CPU @ 1.30GHz
BenchmarkSerializeObject1000PodsPB-8                  63          17689684 ns/op
--- BENCH: BenchmarkSerializeObject1000PodsPB-8
    writers_test.go:457: Payload size: 2633726, expected output size: 214575, ratio: 0.08
    writers_test.go:457: Payload size: 2633726, expected output size: 214576, ratio: 0.08
    writers_test.go:457: Payload size: 2633726, expected output size: 214565, ratio: 0.08
BenchmarkSerializeObject10000PodsPB-8                  6         169622770 ns/op
--- BENCH: BenchmarkSerializeObject10000PodsPB-8
    writers_test.go:457: Payload size: 26337133, expected output size: 2130033, ratio: 0.08
    writers_test.go:457: Payload size: 26337133, expected output size: 2130034, ratio: 0.08
BenchmarkSerializeObject100000PodsPB-8                 1        1808848476 ns/op
--- BENCH: BenchmarkSerializeObject100000PodsPB-8
    writers_test.go:457: Payload size: 263371335, expected output size: 21276773, ratio: 0.08
BenchmarkSerializeObject1000PodsJSON-8                48          26911183 ns/op
--- BENCH: BenchmarkSerializeObject1000PodsJSON-8
    writers_test.go:457: Payload size: 4118115, expected output size: 256439, ratio: 0.06
    writers_test.go:457: Payload size: 4118115, expected output size: 256473, ratio: 0.06
BenchmarkSerializeObject10000PodsJSON-8                5         213174191 ns/op
--- BENCH: BenchmarkSerializeObject10000PodsJSON-8
    writers_test.go:457: Payload size: 41180777, expected output size: 2548333, ratio: 0.06
    writers_test.go:457: Payload size: 41180777, expected output size: 2548365, ratio: 0.06
    writers_test.go:457: Payload size: 41180777, expected output size: 2548706, ratio: 0.06
    writers_test.go:457: Payload size: 41180777, expected output size: 2548454, ratio: 0.06
BenchmarkSerializeObject100000PodsJSON-8               1        2125218034 ns/op
--- BENCH: BenchmarkSerializeObject100000PodsJSON-8
    writers_test.go:457: Payload size: 411808348, expected output size: 25466194, ratio: 0.06
PASS
ok      k8s.io/apiserver/pkg/endpoints/handlers/responsewriters 25.309s
  • For github.com/klauspost/compress, level=4
goos: linux
goarch: amd64
pkg: k8s.io/apiserver/pkg/endpoints/handlers/responsewriters
cpu: Intel(R) Core(TM) i7-1065G7 CPU @ 1.30GHz
BenchmarkSerializeObject1000PodsPB-8                 160           7328892 ns/op
--- BENCH: BenchmarkSerializeObject1000PodsPB-8
    writers_test.go:458: Payload size: 2633726, expected output size: 220879, ratio: 0.08
    writers_test.go:458: Payload size: 2633726, expected output size: 221082, ratio: 0.08
    writers_test.go:458: Payload size: 2633726, expected output size: 221055, ratio: 0.08
BenchmarkSerializeObject10000PodsPB-8                 16          69526779 ns/op
--- BENCH: BenchmarkSerializeObject10000PodsPB-8
    writers_test.go:458: Payload size: 26337133, expected output size: 2193218, ratio: 0.08
    writers_test.go:458: Payload size: 26337133, expected output size: 2191140, ratio: 0.08
BenchmarkSerializeObject100000PodsPB-8                 2         672789538 ns/op
--- BENCH: BenchmarkSerializeObject100000PodsPB-8
    writers_test.go:458: Payload size: 263371335, expected output size: 21891137, ratio: 0.08
    writers_test.go:458: Payload size: 263371335, expected output size: 21907392, ratio: 0.08
BenchmarkSerializeObject1000PodsJSON-8               100          11422098 ns/op
--- BENCH: BenchmarkSerializeObject1000PodsJSON-8
    writers_test.go:458: Payload size: 4118115, expected output size: 273601, ratio: 0.07
    writers_test.go:458: Payload size: 4118115, expected output size: 273499, ratio: 0.07
BenchmarkSerializeObject10000PodsJSON-8               10         111397635 ns/op
--- BENCH: BenchmarkSerializeObject10000PodsJSON-8
    writers_test.go:458: Payload size: 41180777, expected output size: 2718504, ratio: 0.07
    writers_test.go:458: Payload size: 41180777, expected output size: 2719112, ratio: 0.07
BenchmarkSerializeObject100000PodsJSON-8               1        1075152066 ns/op
--- BENCH: BenchmarkSerializeObject100000PodsJSON-8
    writers_test.go:458: Payload size: 411808348, expected output size: 27155758, ratio: 0.07
PASS
ok      k8s.io/apiserver/pkg/endpoints/handlers/responsewriters 17.863s

While I am still learning to read benchmark results, at a high level it seems to be a good option: the total run time drops from 25.309s to 17.863s, and per operation, for example, BenchmarkSerializeObject1000PodsPB goes from 17689684 ns/op to 7328892 ns/op (roughly 2.4x faster).

I'd like to see if that could be reported/improved there first.

How can we do this and what should be our approach? For example, should we try to lower the compression level to 1? I tested level 1 with stdlib gzip, and here are the benchmark results:

goos: linux
goarch: amd64
pkg: k8s.io/apiserver/pkg/endpoints/handlers/responsewriters
cpu: Intel(R) Core(TM) i7-1065G7 CPU @ 1.30GHz
BenchmarkSerializeObject1000PodsPB-8                 118           9953215 ns/op
--- BENCH: BenchmarkSerializeObject1000PodsPB-8
    writers_test.go:457: Payload size: 2633726, expected output size: 268882, ratio: 0.10
    writers_test.go:457: Payload size: 2633726, expected output size: 267803, ratio: 0.10
    writers_test.go:457: Payload size: 2633726, expected output size: 267717, ratio: 0.10
BenchmarkSerializeObject10000PodsPB-8                 12          97536616 ns/op
--- BENCH: BenchmarkSerializeObject10000PodsPB-8
    writers_test.go:457: Payload size: 26337133, expected output size: 2662035, ratio: 0.10
    writers_test.go:457: Payload size: 26337133, expected output size: 2662535, ratio: 0.10
BenchmarkSerializeObject100000PodsPB-8                 2         957113640 ns/op
--- BENCH: BenchmarkSerializeObject100000PodsPB-8
    writers_test.go:457: Payload size: 263371335, expected output size: 26630714, ratio: 0.10
    writers_test.go:457: Payload size: 263371335, expected output size: 26639922, ratio: 0.10
BenchmarkSerializeObject1000PodsJSON-8                84          13318834 ns/op
--- BENCH: BenchmarkSerializeObject1000PodsJSON-8
    writers_test.go:457: Payload size: 4118115, expected output size: 391552, ratio: 0.10
    writers_test.go:457: Payload size: 4118115, expected output size: 390221, ratio: 0.09
BenchmarkSerializeObject10000PodsJSON-8                8         130358136 ns/op
--- BENCH: BenchmarkSerializeObject10000PodsJSON-8
    writers_test.go:457: Payload size: 41180777, expected output size: 3911634, ratio: 0.09
    writers_test.go:457: Payload size: 41180777, expected output size: 3906551, ratio: 0.09
BenchmarkSerializeObject100000PodsJSON-8               1        1294101085 ns/op
--- BENCH: BenchmarkSerializeObject100000PodsJSON-8
    writers_test.go:457: Payload size: 411808348, expected output size: 39017018, ratio: 0.09
PASS
ok      k8s.io/apiserver/pkg/endpoints/handlers/responsewriters 20.716s
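
To make the level comparison concrete, here is a small stdlib-only sketch (a synthetic payload, not the fixture from writers_test.go) that compresses the same data at level 1 and level 4 and prints the output sizes; it illustrates the trade-off visible above, roughly 0.10 ratio at level 1 versus 0.08 at level 4, in exchange for less CPU.

    package main

    import (
        "bytes"
        "compress/gzip"
        "fmt"
    )

    // gzipSize returns the compressed size of payload at the given gzip level.
    func gzipSize(payload []byte, level int) int {
        var buf bytes.Buffer
        gw, err := gzip.NewWriterLevel(&buf, level)
        if err != nil {
            panic(err)
        }
        if _, err := gw.Write(payload); err != nil {
            panic(err)
        }
        if err := gw.Close(); err != nil {
            panic(err)
        }
        return buf.Len()
    }

    func main() {
        payload := bytes.Repeat([]byte(`{"kind":"Pod","metadata":{"name":"pod-0"}}`), 50000)
        for _, level := range []int{gzip.BestSpeed, 4} { // BestSpeed == 1
            fmt.Printf("level %d: %d -> %d bytes\n", level, len(payload), gzipSize(payload, level))
        }
    }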

This is a large and difficult to review dependency to add... I'd like to avoid it if possible.

I think I have been able to identify the files that use stdlib gzip in k/k; they are:

  • kubernetes/cluster/images/conformance/go-runner/tar.go
  • kubernetes/cluster/images/conformance/go-runner/tar_test.go
  • kubernetes/staging/src/k8s.io/apimachinery/pkg/util/proxy/transport.go
  • kubernetes/staging/src/k8s.io/apimachinery/pkg/util/proxy/transport_test.go
  • kubernetes/staging/src/k8s.io/apimachinery/pkg/util/proxy/upgradeaware_test.go
  • kubernetes/staging/src/k8s.io/apiserver/pkg/endpoints/apiserver_test.go (benchmark)
  • kubernetes/staging/src/k8s.io/apiserver/pkg/endpoints/handlers/responsewriters/writers.go
  • kubernetes/staging/src/k8s.io/apiserver/pkg/endpoints/handlers/responsewriters/writers_test.go
  • kubernetes/pkg/kubelet/logs/container_log_manager.go

If we decide to change the dependency in all of them, it could become a complex task; however, if it is just responsewriters/writers.go or whichever files call gzip.NewWriter(), it may be more doable. I am not very sure about this, and I think the motivation that @mborsz stated, which I quote below, has to be considered:

In some cases (when a lot of large LIST calls happen at roughly the same time), it looks like a significant part of the CPU usage is generated by gzip compression.

Thanks a lot, and looking forward to the feedback 😄

@liggitt
Member

liggitt commented Aug 9, 2021

I'd like to see if that could be reported/improved there first.

How can we do this and what should be our approach?

My naive approach would be to profile the stdlib impl and this proposed replacement and see if there are corresponding steps in the stdlib impl that perform poorly, to see if there are possible individual improvements that could be made.
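
One minimal way to get such a profile (only a sketch; the existing writers benchmarks can produce the same data with go test -bench ... -cpuprofile=cpu.out) is to run the compression path under runtime/pprof and then look at the hot functions with go tool pprof:

    package main

    import (
        "bytes"
        "compress/gzip"
        "os"
        "runtime/pprof"
    )

    func main() {
        f, err := os.Create("cpu.out")
        if err != nil {
            panic(err)
        }
        defer f.Close()

        if err := pprof.StartCPUProfile(f); err != nil {
            panic(err)
        }
        defer pprof.StopCPUProfile()

        // Synthetic stand-in for a large LIST response body.
        payload := bytes.Repeat([]byte(`{"kind":"Pod","metadata":{"name":"pod-0"}}`), 100000)
        for i := 0; i < 50; i++ {
            var buf bytes.Buffer
            gw, err := gzip.NewWriterLevel(&buf, 4)
            if err != nil {
                panic(err)
            }
            if _, err := gw.Write(payload); err != nil {
                panic(err)
            }
            if err := gw.Close(); err != nil {
                panic(err)
            }
        }
        // Inspect with `go tool pprof -top cpu.out`, then repeat the run with
        // github.com/klauspost/compress/gzip and compare which steps dominate.
    }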

If we decide to change the dependency in all of them, it could become a complex task; however, if it is just responsewriters/writers.go or whichever files call gzip.NewWriter(), it may be more doable.

I was referring to the review of the dependency itself, not the call sites.

@mritunjaysharma394

I'd like to see if that could be reported/improved there first.

How can we do this and what should be our approach?

My naive approach would be to profile the stdlib impl and this proposed replacement and see if there are corresponding steps in the stdlib impl that perform poorly, to see if there are possible individual improvements that could be made.

Thanks a lot @liggitt! This leaves me with another question, do we need to modify the individual components in the staging/src/k8s.io/apiserver/pkg/endpoints/handlers/responsewriters/writers.go or modify stdlib impl (maybe a fork of it, which I think we very likely won't do)?

Thanks!

@liggitt
Member

liggitt commented Aug 11, 2021

Thanks a lot @liggitt! This leaves me with another question, do we need to modify the individual components in the staging/src/k8s.io/apiserver/pkg/endpoints/handlers/responsewriters/writers.go or modify stdlib impl

I was suggesting profiling and proposing changes to the stdlib implementation at https://github.com/golang/go/blob/master/src/compress/gzip, then picking them up here once they are released in a new golang version.

@mritunjaysharma394

mritunjaysharma394 commented Aug 11, 2021

Thanks a lot @liggitt! This leaves me with another question, do we need to modify the individual components in the staging/src/k8s.io/apiserver/pkg/endpoints/handlers/responsewriters/writers.go or modify stdlib impl

I was suggesting profiling and proposing changes to the stdlib implementation at https://github.com/golang/go/blob/master/src/compress/gzip, then picking them up here once they are released in a new golang version.

Thank you so much @liggitt. I have read the stdlib gzip code and the drop-in replacement, and I agree with your suggestion; I will try to propose the corresponding changes upstream so that a new Go version is more optimal for us as well as in general.

Till then, I would love to know your opinion on this issue: should we close it and create an issue in golang's repo to suggest our changes? Also, I guess we will then also have to close the PR #104118 I was working on? Thank you so much for all the help!

@liggitt
Member

liggitt commented Aug 17, 2021

Should we close it and create an issue in golang’s repo to suggest our changes?

Yeah, if there's a performance issue in the stdlib gzip impl with a clear fix/alternative, taking the issue/proposed change upstream to https://github.com/golang/go would make more sense to me, rather than a kubernetes-specific issue/change, or changing kubernetes to use a non-stdlib gzip lib.

@mritunjaysharma394

Thank you so much @liggitt, I will try to take up this issue there then 😊

thanks for all the help!

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on Nov 15, 2021
@enj
Member

enj commented Dec 6, 2021

Per #104071 (comment) I believe this can be closed as this is something that should be fixed in Go.

/close

@k8s-ci-robot
Contributor

@enj: Closing this issue.

In response to this:

Per #104071 (comment) I believe this can be closed as this is something that should be fixed in Go.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
