Use github.com/klauspost/compress/gzip in kubernetes #104071
Comments
/assign |
/sig api-machinery |
/cc @wojtek-t |
Switching from the stdlib to a third-party library comes with a cost:
once this is in k/k, taking it out becomes difficult. One reason would be the perf regression if the library is actually faster. /area code-organization |
@neolit123: The label(s) In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
I personally feel that this shouldn't be a problem, as the library is actively maintained; in fact, its latest version, 1.13.3, was released just 3 days ago (https://github.com/klauspost/compress). Of course, we would love to hear the views of other folks in the community.
I think we will have to reach out to other community members to discuss the security aspect, but I tried to evaluate the algorithm implementation that makes this library faster. Here are some inferences, which I would rather quote from the blog of the library's maintainer:
Another couple of interesting points to consider are:
I am also attaching links to the three blog posts by the maintainer himself, which helped me a lot. The posts include spreadsheets of performance test results as well, which can help us reach a better understanding.
Thanks a lot, looking forward!
|
that is good analysis, @mritunjaysharma394 . thank you for that. of course, i'm not questioning whether the project is poorly maintained or insecure. the judgement is in the hands of the owning SIGs here. |
my first thought is that if there is a demonstrable bottleneck due to the stdlib gzip implementation, I'd like to see if that could be reported/improved there first. This is a large and difficult to review dependency to add... I'd like to avoid it if possible. |
thanks a lot for your feedback @neolit123 and @liggitt.
I am not sure if it's a bottleneck, but optimizing CPU usage in this direction seems to have shown good results, not only in PR #99300 but also in the benchmark tests that I ran on my system. As an example, I would like to show the results of two such benchmarks:
While I am still learning to understand benchmark tests better, at a high level, and especially looking at 25.309s --> 17.863s in the final test result line, it seems to be a good option.
How can we do this, and what should our approach be? Should we try lowering the level to 1? I tested level 1 with stdlib gzip, and here are the benchmark results:
I think I have found the files that use stdlib gzip in k/k. They are:
If we decide to change the dependency in all of them, then it can surely become a complex task; however, if it is just responsewriters/writers.go or any other file that uses
Thanks a lot, and looking forward to the feedback 😄 |
My naive approach would be to profile the stdlib impl and this proposed replacement and see if there are corresponding steps in the stdlib impl that perform poorly, to see if there are possible individual improvements that could be made.
I was referring to the review of the dependency itself, not the call sites. |
Thanks a lot @liggitt! This leaves me with another question: do we need to modify the individual components in staging/src/k8s.io/apiserver/pkg/endpoints/handlers/responsewriters/writers.go, or modify the stdlib impl (maybe via a fork, which I think we very likely won't do)? Thanks! |
I was suggesting profiling and proposing changes to the stdlib implementation at https://github.com/golang/go/blob/master/src/compress/gzip, then picking them up here once they are released in a new golang version. |
Thank you so much @liggitt. I have read the stdlib gzip code and the drop-in replacement, and I agree with your suggestion. I will try to propose the corresponding changes for a new golang version to make it more optimal for us as well as in general. Until then, I would love to know your opinion on this issue: should we close it and create an issue in golang's repo to suggest our changes? I guess we will then also have to close PR #104118, which I was working on? Thank you so much for all the help! |
Yeah, if there's a performance issue in the stdlib gzip impl with a clear fix/alternative, taking the issue/proposed change upstream to https://github.com/golang/go would make more sense to me, rather than a kubernetes-specific issue/change, or changing kubernetes to use a non-stdlib gzip lib |
Thank you so much @liggitt, I will try to take up this issue there then 😊 thanks for all the help! |
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
Per #104071 (comment) I believe this can be closed as this is something that should be fixed in Go. /close |
@enj: Closing this issue. In response to this:
What would you like to be added:
This is an optimization proposal: in some cases (when a lot of large LIST calls happen at roughly the same time), it looks like a significant part of CPU usage is generated by gzip compression.
There is an alternative implementation of the gzip library which seems to be less CPU-consuming.
I think we should investigate whether migrating from golang's compress/gzip library to the klauspost/compress/gzip library is possible, and do that.
My POC, with promising preliminary results, is here: #99300
Personally, I don't have the capacity to continue that PR, so I am creating this issue so that someone else can pick up this optimization (looks like a good first issue).
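To make the shape of the proposed migration concrete, here is a minimal sketch of a handler that gzips its response using the stdlib package. It is not the code from #99300 or from the apiserver: `listHandler`, `fetch`, the `/pods` path, and the chosen level are hypothetical. Because the replacement is described in this thread as drop-in, the assumption is that only the import path would need to change at a call site like this.

```go
package main

import (
	// The proposal amounts to replacing this import path with
	// "github.com/klauspost/compress/gzip" (assumed API-compatible).
	"compress/gzip"
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
	"strings"
)

// gzipLevel is a level picked for this sketch only.
const gzipLevel = gzip.BestSpeed

// listHandler serves body, gzip-compressed when the client accepts it.
func listHandler(body string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !strings.Contains(r.Header.Get("Accept-Encoding"), "gzip") {
			io.WriteString(w, body) // write error ignored in this sketch
			return
		}
		zw, err := gzip.NewWriterLevel(w, gzipLevel)
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		defer zw.Close()
		w.Header().Set("Content-Encoding", "gzip")
		io.WriteString(zw, body) // write error ignored in this sketch
	})
}

// fetch calls h in-process and returns the response's Content-Encoding
// together with the decoded body.
func fetch(h http.Handler, acceptGzip bool) (string, []byte, error) {
	req := httptest.NewRequest(http.MethodGet, "/pods", nil)
	if acceptGzip {
		req.Header.Set("Accept-Encoding", "gzip")
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	res := rec.Result()
	defer res.Body.Close()

	encoding := res.Header.Get("Content-Encoding")
	var r io.Reader = res.Body
	if encoding == "gzip" {
		zr, err := gzip.NewReader(res.Body)
		if err != nil {
			return "", nil, err
		}
		defer zr.Close()
		r = zr
	}
	body, err := io.ReadAll(r)
	return encoding, body, err
}

func main() {
	h := listHandler(strings.Repeat(`{"kind":"Pod"}`, 1000))
	encoding, body, err := fetch(h, true)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Content-Encoding=%q, decoded %d bytes\n", encoding, len(body))
}
```

Keeping the compression behind a single call site like this is what would make such a swap small at the call site; as the discussion notes, the larger cost is reviewing and carrying the dependency itself.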