CO-RE pragma adds a lot of memory overhead #1063
Have you collected a profile using pprof?
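For reference, a heap profile like the ones below can be collected with the standard runtime/pprof package. A minimal sketch (the output file name is illustrative), inspected afterwards with go tool pprof:

package main

import (
	"log"
	"os"
	"runtime"
	"runtime/pprof"
)

func main() {
	f, err := os.Create("heap.pprof")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// Run a GC first so the profile reflects up-to-date heap statistics.
	runtime.GC()
	if err := pprof.WriteHeapProfile(f); err != nil {
		log.Fatal(err)
	}
}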
They look identical from the Go allocations perspective:
(pprof) top10
Showing nodes accounting for 1024.70kB, 100% of 1024.70kB total
Showing top 10 nodes out of 13
flat flat% sum% cum cum%
512.50kB 50.01% 50.01% 512.50kB 50.01% runtime.allocm
512.20kB 49.99% 100% 512.20kB 49.99% runtime.malg
0 0% 100% 512.50kB 50.01% runtime.mstart
0 0% 100% 512.50kB 50.01% runtime.mstart0
0 0% 100% 512.50kB 50.01% runtime.mstart1
0 0% 100% 512.50kB 50.01% runtime.newm
0 0% 100% 512.20kB 49.99% runtime.newproc.func1
0 0% 100% 512.20kB 49.99% runtime.newproc1
0 0% 100% 512.50kB 50.01% runtime.resetspinning
0 0% 100% 512.50kB 50.01% runtime.schedule
(pprof) top10
Showing nodes accounting for 1024.70kB, 100% of 1024.70kB total
Showing top 10 nodes out of 13
flat flat% sum% cum cum%
512.50kB 50.01% 50.01% 512.50kB 50.01% runtime.allocm
512.20kB 49.99% 100% 512.20kB 49.99% runtime.malg
0 0% 100% 512.50kB 50.01% runtime.mstart
0 0% 100% 512.50kB 50.01% runtime.mstart0
0 0% 100% 512.50kB 50.01% runtime.mstart1
0 0% 100% 512.50kB 50.01% runtime.newm
0 0% 100% 512.20kB 49.99% runtime.newproc.func1
0 0% 100% 512.20kB 49.99% runtime.newproc1
0 0% 100% 512.50kB 50.01% runtime.resetspinning
0 0% 100% 512.50kB 50.01% runtime.schedule
cc @lmb
I extracted the
The memory increase you're seeing is triggered by CO-RE, but the CO-RE code itself doesn't allocate a lot. To do CO-RE relocations we need to parse the vmlinux BTF. Parsing that is expensive and the parsed representation uses a bunch of resident memory. If you only ever load BPF at the start of your program, you can use https://pkg.go.dev/github.com/cilium/ebpf/btf#FlushKernelSpec to release the memory used by the cached kernel BTF.
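A minimal sketch of that pattern, assuming a standalone CO-RE object file (the file name is a placeholder, not taken from this issue):

package main

import (
	"log"

	"github.com/cilium/ebpf"
	"github.com/cilium/ebpf/btf"
)

func main() {
	// "xdp_prog.o" is a placeholder for a CO-RE object compiled against vmlinux.h.
	coll, err := ebpf.LoadCollection("xdp_prog.o")
	if err != nil {
		log.Fatalf("loading collection: %s", err)
	}
	defer coll.Close()

	// All BPF has been loaded; release the parsed kernel BTF that was
	// cached for the CO-RE relocations.
	btf.FlushKernelSpec()

	// ... attach programs and run as usual ...
}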
At what point, just after the load? I can try that in the xdp example in my WIP PR and rerun.
After loading the BPF objects.
So I applied the following changes to the xdp test program:

diff --git a/examples/xdp/main.go b/examples/xdp/main.go
index a669a86..67fc76d 100644
--- a/examples/xdp/main.go
+++ b/examples/xdp/main.go
@@ -17,6 +17,7 @@ import (
"time"
"github.com/cilium/ebpf"
+ "github.com/cilium/ebpf/btf"
"github.com/cilium/ebpf/link"
)
@@ -41,7 +42,7 @@ func main() {
log.Fatalf("loading objects: %s", err)
}
defer objs.Close()
-
+ btf.FlushKernelSpec()
// Attach the program.
l, err := link.AttachXDP(link.XDPOptions{
Program: objs.XdpProgFunc,

However, memory seems to be getting worse than before? Note: the repro changes are here https://github.com/msherif1234/ebpf/tree/test_fix_core if you would like to see the same.
What you are seeing is an interaction with the Go GC. Flushing the caches doesn't automatically trigger a GC. In the XDP example no more GCs are triggered, while in a "normal" application you will be allocating things, so sooner or later the memory should be returned. If you don't allocate, you need to trigger the GC yourself:

package main
import (
"runtime/debug"
"testing"
"time"
"github.com/cilium/ebpf/linux"
)
func TestLoad(t *testing.T) {
_, _ = linux.Types()
linux.FlushCaches()
runtime.GC()
time.Sleep(time.Minute)
}

Execute that using GODEBUG=gctrace=1.
The last line is where the memory held by BTF is released. https://www.ardanlabs.com/blog/2019/05/garbage-collection-in-go-part2-gctraces.html has a good explanation of what the debug output means. Still, this doesn't release memory to the OS immediately! It seems the GC has a heuristic for when to return memory. You can short-circuit this by replacing runtime.GC with debug.FreeOSMemory. My recommendation: if your application doesn't allocate at all after attaching the BPF, add a runtime.GC call after flushing the kernel BTF caches.
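A sketch of that recommendation as a small helper, assuming no further BPF is loaded afterwards (the package and function names are made up for illustration):

package bpfutil

import (
	"runtime/debug"

	"github.com/cilium/ebpf/btf"
)

// releaseLoadTimeMemory is a hypothetical helper: call it once, after all
// BPF objects have been loaded and attached.
func releaseLoadTimeMemory() {
	// Drop the parsed vmlinux BTF that was cached for CO-RE relocations.
	btf.FlushKernelSpec()

	// debug.FreeOSMemory forces a garbage collection and then tries to
	// return as much memory to the operating system as possible, instead
	// of waiting for the runtime's own release heuristics.
	debug.FreeOSMemory()
}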
My application dynamically allocates memory because I am using
BPF memory and Go memory are completely distinct.
* Optimize ebpf agent map memory usage
  - switch to using a pointer to metric instead of metric
  - manually trigger GC after flow eviction completes
* Fix memory and cpu scale issue worked around in #133; following up on cilium/ebpf#1063 it seems we have a way to fix the resource issues
Signed-off-by: msherif1234 <mmahmoud@redhat.com>
(cherry picked from commit b9c9a03)
Describe the bug
When the eBPF kernel module uses the kernel-generated header vmlinux.h, I noticed a huge memory and CPU spike.

To Reproduce
Using the cilium xdp example to repro this issue:
- check memory usage
- disable the CO-RE pragma using this diff
- then rebuild, run, and check the memory; you will notice the memory consumption spike
The issue is seen with libbpf v1.0.1 and libbpf v1.2.
I hit the same with the TC hook as well, so I think it's a general issue impacting all eBPF hooks.
Expected behavior
CO-RE macros shouldn't add such a huge overhead to the application's resources.