New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fqdn: use map to dedup to reduce memory usage of dns gc job #25142
Conversation
cc @christarazi mind triggering test to verify that the logic is sound? |
/test Job 'Cilium-PR-K8s-1.26-kernel-4.19' failed: Click to show.Test Name
Failure Output
Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.26-kernel-4.19/169/ If it is a flake and a GitHub issue doesn't already exist to track it, comment Then please upload the Jenkins artifacts to that issue. |
@odinuge thanks for the PR! this seems like a pretty straight-forward improvement. |
Hi! Yes, I'll rebase and squash tomorrow morning. Thanks |
31a7e38
to
ce0d16c
Compare
On some host we have a lot of DNS lookups, and this job allocates a lot of memory and burn a lot of cpu. We often see loglines like these; "FQDN garbage collector work deleted 16211 name entries" This will now dedup using a map directly instead of deduping via merging slices. Peek from a pprof, where this comes on top (running cilium v1.11.13); Type: alloc_space Showing nodes accounting for 24333619.24MB, 100% of 24333619.24MB total ----------------------------------------------------------+------------- flat flat% sum% cum cum% calls calls% + context ----------------------------------------------------------+------------- 4686095.18MB 95.24% | github.com/cilium/cilium/daemon/cmd.(*Daemon).bootstrapFQDN.func1 /go/src/github.com/cilium/cilium/daemon/cmd/fqdn.go:236 230562.07MB 4.69% | github.com/cilium/cilium/daemon/cmd.(*Daemon).bootstrapFQDN.func1 /go/src/github.com/cilium/cilium/daemon/cmd/fqdn.go:246 2609.33MB 0.053% | github.com/cilium/cilium/pkg/fqdn.(*DNSCache).cleanupExpiredEntries /go/src/github.com/cilium/cilium/pkg/fqdn/cache.go:256 997.05MB 0.02% | github.com/cilium/cilium/daemon/cmd.(*Daemon).bootstrapFQDN.func1 /go/src/github.com/cilium/cilium/daemon/cmd/fqdn.go:250 175.82MB 0.0036% | github.com/cilium/cilium/pkg/fqdn.(*DNSCache).GC /go/src/github.com/cilium/cilium/pkg/fqdn/cache.go:339 4920439.45MB 20.22% 20.22% 4920439.45MB 20.22% | github.com/cilium/cilium/pkg/fqdn.KeepUniqueNames /go/src/github.com/cilium/cilium/pkg/fqdn/helpers.go:127 Signed-off-by: Odin Ugedal <ougedal@palantir.com> Signed-off-by: Odin Ugedal <odin@uged.al>
ce0d16c
to
fde0b53
Compare
Should be good to go now thorn3r |
/test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, thank you!
On some host we have a lot of DNS lookups, and this job allocates a lot of memory and burn a lot of cpu. We often see loglines like these; "FQDN garbage collector work deleted 16211 name entries"
This will now dedup using a map directly instead of deduping via merging slices. It doesn't seem like the code care about the order, so I think this should be fine. It would be nice to benchmark this, but its a bit difficult in the way the job is set up. Happy to iterate on that after we can test this.
Peek from a pprof, where this comes on top (running cilium v1.11.13);
Please ensure your pull request adheres to the following guidelines:
description and a
Fixes: #XXX
line if the commit addresses a particularGitHub issue.
Fixes: <commit-id>
tag, thenplease add the commit author[s] as reviewer[s] to this issue.
Fixes: #issue-number