Reduce memory allocations in kube proxy #46033
Force-pushed from 3dd0222 to 524c13a.
We can probably optimize much more, but I would like to wait with that until all the other changes are merged, to see where we end up (in particular the periodic runner).
Nice. Sort of sad how easy it is to write slice-abusive code.
Just a few nits. Will approve, you can fixup and self-lgtm
pkg/proxy/iptables/proxier.go
Outdated
@@ -417,6 +425,11 @@ func NewProxier(ipt utiliptables.Interface,
		recorder:      recorder,
		healthChecker: healthChecker,
		healthzServer: healthzServer,
		iptablesLines: bytes.NewBuffer(nil),
nit: This isn't really parsed into lines yet - call it iptablesData or iptablesRaw?
Done.
	filterRules := bytes.NewBuffer(nil)
	natChains := bytes.NewBuffer(nil)
	natRules := bytes.NewBuffer(nil)
	// Reset all buffers used later.
Comment that this is to avoid re-allocations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
@@ -228,36 +228,27 @@ func saveChain(chain *fakeChain, data *bytes.Buffer) {
}

func (f *fakeIPTables) Save(tableName utiliptables.Table) ([]byte, error) {
Do we still need plain Save() any more?
We are still using it in a few places in kubelet. Though I'm happy to send a PR migrating it to SaveInto (should be pretty simple), I would prefer to do it in a separate PR. Will send one on Monday.
pkg/util/iptables/iptables.go
Outdated
	args := []string{"-t", string(table)}
	glog.V(4).Infof("running iptables-save %v", args)
	cmd := runner.exec.Command(cmdIPTablesSave, args...)
	cmd.SetStdout(buffer)
Add a comment on this vs CombinedOutput() ?
done
/approve

[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: thockin, wojtek-t
Needs approval from an approver in each of these OWNERS files:
You can indicate your approval by writing
Force-pushed from 524c13a to 919f931.
Thanks Tim - comments applied.
Applying label based on Tim's comment above.
Force-pushed from 919f931 to f53440e.
Just fixed compile errors that appeared (I didn't rename all occurrences of the variable).
Force-pushed from f53440e to a3da8d7.
Sorry - fixed gofmt.
Just nits. Looks good overall.
pkg/proxy/iptables/proxier.go
Outdated
	glog.V(3).Infof("Restoring iptables rules: %s", lines)
	err = proxier.iptables.RestoreAll(lines, utiliptables.NoFlushTables, utiliptables.RestoreCounters)
	if glog.V(4) {
remove if condition here?
Sorry - initially it was different in my code.
It sometimes makes sense to have such ifs when the arguments of glog are expensive to compute. Imagine you have the following line:
glog.V(4).Infof("xxxL: %v", callFunction(args))
Then callFunction(args) will be called no matter what the verbosity is and whether the log gets printed or not. For expensive computations, that's simply a waste of time.
That is no longer the case here, so I will send a PR to remove it (or fold it into one of the other PRs I already have).
pkg/proxy/iptables/proxier.go
Outdated
@@ -1497,8 +1497,8 @@ func (proxier *Proxier) syncProxyRules(reason syncReason) {
	proxier.iptablesLines.Write(proxier.natChains.Bytes())
	proxier.iptablesLines.Write(proxier.natRules.Bytes())

	if glog.V(4) {
Why? I feel like it has to be at least 4. Otherwise there is not enough actionable information to debug e2e tests.
This is extremely expensive (in my tests on just a 100-node cluster, this line is responsible for ~10% of the whole cpu usage and ~20% of all memory allocations and GC).
I think we can just bump the verbosity in tests if problems come up.
@wojtek-t: The following test(s) failed:
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
Automatic merge from submit-queue
	glog.V(3).Infof("Restoring iptables rules: %s", lines)
	err = proxier.iptables.RestoreAll(lines, utiliptables.NoFlushTables, utiliptables.RestoreCounters)
	if glog.V(5) {
weird constructs like this warrant a comment
In this case it's no longer needed - I will send a PR removing it on Monday.
#46201 is already out for review
Automatic merge from submit-queue (batch tested with PRs 45534, 37212, 46613, 46350)

Speed up and reduce number of memory allocations in kube-proxy

This is a second (and last) PR in this series - it solves all the very-low-hanging fruit. This PR:
- reduces cpu usage by ~25%
- reduces memory allocations by ~3x (together with #46033, by 10-12x)

Without this PR:
```
(pprof) top
8.59GB of 8.79GB total (97.75%)
Dropped 238 nodes (cum <= 0.04GB)
Showing top 10 nodes out of 64 (cum >= 0.11GB)
      flat  flat%   sum%        cum   cum%
    3.66GB 41.60% 41.60%     8.72GB 99.17%  k8s.io/kubernetes/pkg/proxy/iptables.(*Proxier).syncProxyRules
    3.07GB 34.96% 76.56%     3.07GB 34.96%  runtime.rawstringtmp
    0.62GB  7.09% 83.65%     0.62GB  7.09%  runtime.hashGrow
    0.34GB  3.82% 87.46%     0.34GB  3.82%  runtime.stringtoslicebyte
    0.29GB  3.24% 90.71%     0.58GB  6.61%  encoding/base32.(*Encoding).EncodeToString
    0.22GB  2.47% 93.18%     0.22GB  2.47%  strings.genSplit
    0.18GB  2.04% 95.22%     0.18GB  2.04%  runtime.convT2E
    0.11GB  1.22% 96.44%     0.73GB  8.36%  runtime.mapassign
    0.10GB  1.08% 97.52%     0.10GB  1.08%  syscall.ByteSliceFromString
    0.02GB  0.23% 97.75%     0.11GB  1.25%  syscall.SlicePtrFromStrings
```
With this PR:
```
(pprof) top
2.98GB of 3.08GB total (96.78%)
Dropped 246 nodes (cum <= 0.02GB)
Showing top 10 nodes out of 70 (cum >= 0.10GB)
      flat  flat%   sum%        cum   cum%
    1.99GB 64.60% 64.60%     1.99GB 64.60%  runtime.rawstringtmp
    0.58GB 18.95% 83.55%     0.58GB 18.95%  runtime.hashGrow
    0.10GB  3.40% 86.95%     0.69GB 22.47%  runtime.mapassign
    0.09GB  2.86% 89.80%     0.09GB  2.86%  syscall.ByteSliceFromString
    0.08GB  2.63% 92.44%     0.08GB  2.63%  runtime.convT2E
    0.03GB  1.13% 93.56%     0.03GB  1.13%  syscall.Environ
    0.03GB  0.99% 94.56%     0.03GB  0.99%  bytes.makeSlice
    0.03GB  0.97% 95.52%     0.03GB  1.06%  os.Stat
    0.02GB  0.65% 96.18%     3.01GB 97.79%  k8s.io/kubernetes/pkg/proxy/iptables.(*Proxier).syncProxyRules
    0.02GB   0.6% 96.78%     0.10GB  3.35%  syscall.SlicePtrFromStrings
```
Memory allocation (and Go garbage collection) seems to be one of the most expensive operations in kube-proxy (I've seen profiles where it accounted for more than 50%).
The commits are mostly independent from each other, and all of them are mostly about reusing already-allocated memory.
This PR is reducing memory allocation by ~5x (results below from 100-node load test):
before:
after: