
CO-RE pragma adds lots of memory overhead #1063

Closed · msherif1234 opened this issue Jun 14, 2023 · 10 comments
Labels: bug (Something isn't working)

msherif1234 commented Jun 14, 2023

Describe the bug
When the eBPF kernel module uses the kernel-generated header vmlinux.h, I noticed a huge memory and CPU spike.

To Reproduce
Use the cilium xdp example to reproduce this issue:

cd examples
make -C ../
go run -exec sudo ./xdp/ <interface-name>
  • check memory usage
    [screenshot: memory usage with CO-RE enabled]

  • disable CO-RE pragma using this diff

diff --git a/examples/xdp/xdp.c b/examples/xdp/xdp.c
index 7e6c3c5..7b38acf 100644
--- a/examples/xdp/xdp.c
+++ b/examples/xdp/xdp.c
@@ -1,6 +1,6 @@
 //go:build ignore
 
-//#define BPF_NO_PRESERVE_ACCESS_INDEX
+#define BPF_NO_PRESERVE_ACCESS_INDEX
 #include "vmlinux.h"
 typedef __u8 u8;
 typedef __u16 u16;
  • then rebuild, run, and check memory again; you will notice the memory consumption spike is gone
    [screenshot: memory usage with CO-RE disabled]

  • The issue is seen with both libbpf v1.0.1 and libbpf v1.2.

  • I hit the same with the TC hook as well, so I think it's a general issue impacting all eBPF hooks.

Expected behavior
CO-RE macros shouldn't add such a huge overhead to the application's resources.

brycekahle (Contributor) commented Jun 14, 2023

Have you collected a profile using pprof to see if there is a change in Go allocations?

msherif1234 (Author) commented Jun 14, 2023

Have you collected a profile using pprof to see if there is a change in Go allocations?

They look identical from a Go allocations perspective:

  • run with CO-RE
(pprof) top10
Showing nodes accounting for 1024.70kB, 100% of 1024.70kB total
Showing top 10 nodes out of 13
      flat  flat%   sum%        cum   cum%
  512.50kB 50.01% 50.01%   512.50kB 50.01%  runtime.allocm
  512.20kB 49.99%   100%   512.20kB 49.99%  runtime.malg
         0     0%   100%   512.50kB 50.01%  runtime.mstart
         0     0%   100%   512.50kB 50.01%  runtime.mstart0
         0     0%   100%   512.50kB 50.01%  runtime.mstart1
         0     0%   100%   512.50kB 50.01%  runtime.newm
         0     0%   100%   512.20kB 49.99%  runtime.newproc.func1
         0     0%   100%   512.20kB 49.99%  runtime.newproc1
         0     0%   100%   512.50kB 50.01%  runtime.resetspinning
         0     0%   100%   512.50kB 50.01%  runtime.schedule
  • run with CO-RE disabled
(pprof) top10
Showing nodes accounting for 1024.70kB, 100% of 1024.70kB total
Showing top 10 nodes out of 13
      flat  flat%   sum%        cum   cum%
  512.50kB 50.01% 50.01%   512.50kB 50.01%  runtime.allocm
  512.20kB 49.99%   100%   512.20kB 49.99%  runtime.malg
         0     0%   100%   512.50kB 50.01%  runtime.mstart
         0     0%   100%   512.50kB 50.01%  runtime.mstart0
         0     0%   100%   512.50kB 50.01%  runtime.mstart1
         0     0%   100%   512.50kB 50.01%  runtime.newm
         0     0%   100%   512.20kB 49.99%  runtime.newproc.func1
         0     0%   100%   512.20kB 49.99%  runtime.newproc1
         0     0%   100%   512.50kB 50.01%  runtime.resetspinning
         0     0%   100%   512.50kB 50.01%  runtime.schedule

msherif1234 (Author) commented:

cc @lmb

lmb self-assigned this Jun 15, 2023
lmb (Collaborator) commented Jun 15, 2023

I extracted the loadBpf call into a test and collected a memory profile.

(pprof) top     
Showing nodes accounting for 45474.43kB, 100% of 45474.43kB total
Showing top 10 nodes out of 30
      flat  flat%   sum%        cum   cum%
14768.18kB 32.48% 32.48% 14768.18kB 32.48%  github.com/cilium/ebpf/btf.indexTypes
13825.40kB 30.40% 62.88% 15948.08kB 35.07%  github.com/cilium/ebpf/btf.inflateRawTypes
 6329.13kB 13.92% 76.80% 10937.92kB 24.05%  github.com/cilium/ebpf/btf.readTypes
 4096.79kB  9.01% 85.81%  4608.80kB 10.13%  encoding/binary.Read
 2283.86kB  5.02% 90.83%  3307.88kB  7.27%  github.com/cilium/ebpf/btf.readStringTable
 2122.68kB  4.67% 95.50%  2122.68kB  4.67%  github.com/cilium/ebpf/btf.inflateRawTypes.func3

The memory increase you're seeing is triggered by CO-RE, but the CO-RE code itself doesn't allocate a lot. To do CO-RE relocations we need to parse the vmlinux BTF. Parsing that is expensive and the parsed representation uses a bunch of resident memory.

If you only ever load BPF at the start of your program, you can use https://pkg.go.dev/github.com/cilium/ebpf/btf#FlushKernelSpec to release the memory used by the cached kernel BTF.
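
For illustration, a minimal sketch of where that call could go, assuming a collection loaded via the public ebpf.LoadCollection API (the object file name is a placeholder, not the example's generated loader):

package main

import (
	"log"

	"github.com/cilium/ebpf"
	"github.com/cilium/ebpf/btf"
)

func main() {
	// Loading a CO-RE object parses the kernel's vmlinux BTF and caches it.
	coll, err := ebpf.LoadCollection("bpf_bpfel.o") // placeholder object file
	if err != nil {
		log.Fatalf("loading collection: %s", err)
	}
	defer coll.Close()

	// No further BPF is loaded after this point, so drop the cached kernel
	// BTF and let the Go GC reclaim the parsed types seen in the profile.
	btf.FlushKernelSpec()

	// ... attach the programs and run ...
}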

msherif1234 (Author) commented Jun 15, 2023

At what point, just after the load? I can try that in the xdp example in my WIP PR and rerun.

lmb (Collaborator) commented Jun 15, 2023

After loadBpf() returns.

msherif1234 (Author) commented Jun 15, 2023

So I applied the following changes to the xdp test program:

diff --git a/examples/xdp/main.go b/examples/xdp/main.go
index a669a86..67fc76d 100644
--- a/examples/xdp/main.go
+++ b/examples/xdp/main.go
@@ -17,6 +17,7 @@ import (
        "time"
 
        "github.com/cilium/ebpf"
+       "github.com/cilium/ebpf/btf"
        "github.com/cilium/ebpf/link"
 )
 
@@ -41,7 +42,7 @@ func main() {
                log.Fatalf("loading objects: %s", err)
        }
        defer objs.Close()
-
+       btf.FlushKernelSpec()
        // Attach the program.
        l, err := link.AttachXDP(link.XDPOptions{
                Program:   objs.XdpProgFunc,

However, memory seems to be getting worse than before?
[screenshot: memory usage after adding FlushKernelSpec]

Note: the repro changes are at https://github.com/msherif1234/ebpf/tree/test_fix_core if you would like to see the same.

lmb (Collaborator) commented Jun 19, 2023

What you are seeing is an interaction with the Go GC. Flushing the caches doesn't automatically trigger a GC. In the XDP example no further GCs are triggered, while in a "normal" application you will keep allocating, so sooner or later the memory should be returned anyway. If you don't allocate, you need to run runtime.GC() manually after flushing the caches:

package main

import (
	"runtime"
	"testing"
	"time"

	"github.com/cilium/ebpf/linux"
)

func TestLoad(t *testing.T) {
	// Parsing the kernel BTF is what allocates the bulk of the memory.
	_, _ = linux.Types()

	// Drop the cached, parsed BTF...
	linux.FlushCaches()

	// ...and force a collection so that memory actually becomes reclaimable.
	runtime.GC()

	time.Sleep(time.Minute)
}

Execute that using GODEBUG=gctrace=1 and you'll see something like:

$ go test -c ./examples/xdp/ && GODEBUG=gctrace=1 ./xdp.test 
gc 1 @0.002s 2%: 0.013+1.1+0.009 ms clock, 0.20+0.049/1.1/0+0.15 ms cpu, 3->4->3 MB, 4 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 2 @0.007s 3%: 0.010+1.5+0.009 ms clock, 0.17+0.060/2.9/0+0.14 ms cpu, 9->10->9 MB, 9 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 3 @0.034s 1%: 0.013+2.5+0.008 ms clock, 0.21+0.33/5.4/0.32+0.13 ms cpu, 19->20->15 MB, 20 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 4 @0.059s 2%: 0.050+11+0.004 ms clock, 0.80+0.44/12/0.005+0.073 ms cpu, 33->33->24 MB, 33 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 5 @0.102s 2%: 0.019+6.2+0.011 ms clock, 0.31+0.051/20/34+0.18 ms cpu, 46->48->47 MB, 49 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 6 @0.138s 2%: 0.019+0.24+0.002 ms clock, 0.31+0/0.16/0+0.042 ms cpu, 65->65->0 MB, 94 MB goal, 0 MB stacks, 0 MB globals, 16 P (forced)
PASS

The last line is where the memory held by BTF is released. https://www.ardanlabs.com/blog/2019/05/garbage-collection-in-go-part2-gctraces.html has a good explanation for what the debug output means.

Still, this doesn't release memory to the OS immediately! It seems like the GC has a heuristic for when to return memory. You can short-circuit this by replacing runtime.GC() with debug.FreeOSMemory(). After doing this I see RES return to normal-ish levels.

My recommendation: if your application doesn't allocate at all after attaching the BPF, add a runtime.GC call after FlushCaches. Otherwise let the GC do its thing. I'd be wary of adding debug.FreeOSMemory to a production app.
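
As a rough sketch of the two options, using the public btf.FlushKernelSpec API rather than the linux package from the test above (the helper function and its flag are made up for illustration):

package bpfsetup

import (
	"runtime"
	"runtime/debug"

	"github.com/cilium/ebpf/btf"
)

// releaseBTFMemory is a hypothetical helper, meant to be called once all
// BPF objects have been loaded.
func releaseBTFMemory(returnToOS bool) {
	// Drop the cached, parsed vmlinux BTF so it becomes garbage.
	btf.FlushKernelSpec()

	if returnToOS {
		// Forces a GC and returns freed pages to the OS immediately.
		// Handy for measurements, but heavy-handed for a production app.
		debug.FreeOSMemory()
		return
	}

	// Only needed if the application barely allocates afterwards;
	// otherwise a regular GC cycle will collect the BTF eventually.
	runtime.GC()
}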

msherif1234 (Author) commented:

My application dynamically allocates memory because I am using BPF_F_NO_PREALLOC, so I don't think I can use runtime.GC(), correct?

lmb (Collaborator) commented Jun 19, 2023

BPF memory and Go memory are completely distinct. runtime.GC has no effect on BPF memory.
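
To make the distinction concrete, here is a small sketch (the map spec is made up for illustration): the kernel only frees the map below once its file descriptor is closed, while runtime.GC only ever touches the Go heap.

package main

import (
	"log"
	"runtime"

	"github.com/cilium/ebpf"
	"golang.org/x/sys/unix"
)

func main() {
	// Kernel-side memory: with BPF_F_NO_PREALLOC the kernel allocates map
	// entries on demand. The Go GC never sees any of this memory.
	m, err := ebpf.NewMap(&ebpf.MapSpec{
		Type:       ebpf.Hash,
		KeySize:    4,
		ValueSize:  8,
		MaxEntries: 1024,
		Flags:      unix.BPF_F_NO_PREALLOC,
	})
	if err != nil {
		log.Fatalf("creating map: %s", err)
	}
	// The kernel frees the map only once the last reference to it (FD, pin,
	// attached program) is gone; runtime.GC plays no part in that.
	defer m.Close()

	// Go-side memory only: this can reclaim the flushed BTF cache, but it
	// leaves the kernel map above untouched.
	runtime.GC()
}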

lmb closed this as completed Jun 19, 2023
openshift-merge-robot pushed a commit to netobserv/netobserv-ebpf-agent that referenced this issue Jun 23, 2023
* Optimize ebpf agent map memory usage
- switch to using a pointer to metric instead of metric
- manually trigger GC after flow eviction completes

Signed-off-by: msherif1234 <mmahmoud@redhat.com>

* Fix memory and CPU scale issue workaround in #133

following up on cilium/ebpf#1063
it seems we have a way to fix the resource issues

Signed-off-by: msherif1234 <mmahmoud@redhat.com>
(cherry picked from commit b9c9a03)

---------

Signed-off-by: msherif1234 <mmahmoud@redhat.com>