Improve PID monitoring (step 2) #13530

Merged
merged 26 commits into from Aug 23, 2022

thiagoftsm
Contributor

@thiagoftsm thiagoftsm commented Aug 18, 2022

Summary

Fixes #12343

This PR reduces the CPU and memory usage of ebpf.plugin.
It also disables the apps and cgroup integration by default on legacy kernels and on some LTS kernels that do not have trampoline support.
It also brings CO-RE code to more threads.

Test Plan
  1. Compile this branch.
  2. Start netdata using the default configuration for ebpf.plugin and verify that, if your kernel was compiled with CONFIG_DEBUG_INFO_BTF (zgrep BTF /proc/config.gz), the integration with apps and cgroup is enabled by default; otherwise, it should not be enabled.
  3. Verify that there are no libbpf errors (grep libbpf /var/log/netdata/error.log) for threads other than socket. For the socket thread, errors are expected now that kernel 5.19 has been released. This will be fixed in the next PR.
  4. Now stop netdata and enable all threads; the charts should appear normally.
  5. Verify that the memory used by the eBPF programs was reduced: they should show 10992 instead of the traditional default of 32768 when you run bpftool map show.
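
The checks in steps 2, 3 and 5 can be scripted. The following is a minimal sketch, not part of this PR: the function names are hypothetical, and it assumes that the 10992 and 32768 values in step 5 refer to the `max_entries` field printed by `bpftool map show`.

```shell
#!/bin/sh
# Hypothetical helpers for the test plan above; names are illustrative only.

# Step 2: succeed if the kernel config enables BTF.
# $1: path to a gzip-compressed kernel config (defaults to /proc/config.gz).
kernel_has_btf() {
    zgrep -q '^CONFIG_DEBUG_INFO_BTF=y' "${1:-/proc/config.gz}" 2>/dev/null
}

# Step 3: print the number of log lines that mention libbpf.
# $1: path to the Netdata error log (defaults to the path used in step 3).
count_libbpf_errors() {
    log="${1:-/var/log/netdata/error.log}"
    [ -r "$log" ] || { echo 0; return 0; }
    # grep -c prints 0 but exits non-zero when nothing matches.
    grep -c libbpf "$log" || true
}

# Step 5: read `bpftool map show` output on stdin and print each
# max_entries value (ASSUMPTION: this is the field that step 5 refers to).
list_max_entries() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "max_entries") print $(i + 1) }'
}

# Example usage (run as root on the host under test):
#   kernel_has_btf && echo "BTF available: apps/cgroup integration expected"
#   count_libbpf_errors
#   bpftool map show | list_max_entries | sort -n | uniq -c
```

Each helper takes its input as a path or on stdin, so it can be tried against saved files, such as the bpftool outputs attached below, without a running agent.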
Additional Information

This PR was tested on:

| Linux Distribution | Kernel version | Threads | bpftool |
|---|---|---|---|
| Slackware Current | 5.19.2 | slackware_5_19_threads.txt | slackware_5_19_bpftool.txt |
| Arch Linux | 5.19.2 | arch_5_19_threads.txt | arch_5_19_bpftool.txt |
| Ubuntu 22.04 | 5.15.0-33-generic | ubuntu_5_15_threads.txt | ubuntu_5_15_bpftool.txt |
| Alma 9 | 5.14.0-70.17.1.el9_0.x86_64 | alma_5_14_threads.txt | -- |
| Debian 11 | 5.10.0-16-amd64 | debian_5_10_threads.txt | debian_5_10_bpftool.txt |
| Alma 8.6 | 4.18.0-372.19.1.el8_6 | alma_4_18_threads.txt | alma8_4_18_bpftool.txt |
| Ubuntu 18.04 | 4.15.0-180 | ubuntu_4_15_threads.txt | ubuntu_4_15_bpftool.txt |

For users: How does this change affect me?

Describe how the PR affects users:
- Which area of Netdata is affected by the change? ebpf.plugin
- Can they see the change or is it under the hood? If they can see it, where? The plugin will use fewer resources on the host.
- How is the user impacted by the change? Better performance from the plugin.
- What are the benefits of the change? Plugin improvement.

@thiagoftsm thiagoftsm marked this pull request as draft August 18, 2022 02:09
@github-actions github-actions bot added area/collectors Everything related to data collection area/docs collectors/ebpf labels Aug 18, 2022
@github-actions github-actions bot added the area/packaging Packaging and operating systems support label Aug 18, 2022
@thiagoftsm thiagoftsm marked this pull request as ready for review August 21, 2022 18:57
@thiagoftsm thiagoftsm requested review from underhood and removed request for vkalintiris August 21, 2022 18:57
@vlvkobal
Contributor

==30926== Thread 3 EBPF SOCKET:
==30926== Conditional jump or move depends on uninitialised value(s)
==30926==    at 0x4847D18: strlen (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==30926==    by 0x17C14E: fixup_verifier_log (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17BB65: bpf_object_load_prog (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17C5C1: bpf_object__load_progs (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17DEF1: bpf_object_load (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17E1CB: bpf_object__load (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x189AE3: bpf_object__load_skeleton (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x12B019: socket_bpf__load (socket.skel.h:127)
==30926==    by 0x12B019: ebpf_socket_load_and_attach (ebpf_socket.c:404)
==30926==    by 0x12B13B: ebpf_socket_load_bpf (ebpf_socket.c:3890)
==30926==    by 0x12F2C7: ebpf_socket_thread (ebpf_socket.c:3944)
==30926==    by 0x15860F: thread_start (threads.c:185)
==30926==    by 0x4E0D78C: start_thread (pthread_create.c:442)
==30926==
==30926== Conditional jump or move depends on uninitialised value(s)
==30926==    at 0x17BC9A: find_prev_line (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17C188: fixup_verifier_log (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17BB65: bpf_object_load_prog (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17C5C1: bpf_object__load_progs (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17DEF1: bpf_object_load (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17E1CB: bpf_object__load (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x189AE3: bpf_object__load_skeleton (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x12B019: socket_bpf__load (socket.skel.h:127)
==30926==    by 0x12B019: ebpf_socket_load_and_attach (ebpf_socket.c:404)
==30926==    by 0x12B13B: ebpf_socket_load_bpf (ebpf_socket.c:3890)
==30926==    by 0x12F2C7: ebpf_socket_thread (ebpf_socket.c:3944)
==30926==    by 0x15860F: thread_start (threads.c:185)
==30926==    by 0x4E0D78C: start_thread (pthread_create.c:442)
==30926==
==30926== Conditional jump or move depends on uninitialised value(s)
==30926==    at 0x484846C: strncmp (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==30926==    by 0x17C1B2: fixup_verifier_log (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17BB65: bpf_object_load_prog (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17C5C1: bpf_object__load_progs (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17DEF1: bpf_object_load (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17E1CB: bpf_object__load (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x189AE3: bpf_object__load_skeleton (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x12B019: socket_bpf__load (socket.skel.h:127)
==30926==    by 0x12B019: ebpf_socket_load_and_attach (ebpf_socket.c:404)
==30926==    by 0x12B13B: ebpf_socket_load_bpf (ebpf_socket.c:3890)
==30926==    by 0x12F2C7: ebpf_socket_thread (ebpf_socket.c:3944)
==30926==    by 0x15860F: thread_start (threads.c:185)
==30926==    by 0x4E0D78C: start_thread (pthread_create.c:442)
==30926==
==30926== Conditional jump or move depends on uninitialised value(s)
==30926==    at 0x484847F: strncmp (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==30926==    by 0x17C1B2: fixup_verifier_log (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17BB65: bpf_object_load_prog (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17C5C1: bpf_object__load_progs (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17DEF1: bpf_object_load (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17E1CB: bpf_object__load (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x189AE3: bpf_object__load_skeleton (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x12B019: socket_bpf__load (socket.skel.h:127)
==30926==    by 0x12B019: ebpf_socket_load_and_attach (ebpf_socket.c:404)
==30926==    by 0x12B13B: ebpf_socket_load_bpf (ebpf_socket.c:3890)
==30926==    by 0x12F2C7: ebpf_socket_thread (ebpf_socket.c:3944)
==30926==    by 0x15860F: thread_start (threads.c:185)
==30926==    by 0x4E0D78C: start_thread (pthread_create.c:442)
==30926==
==30926== Conditional jump or move depends on uninitialised value(s)
==30926==    at 0x4848484: strncmp (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==30926==    by 0x17C1B2: fixup_verifier_log (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17BB65: bpf_object_load_prog (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17C5C1: bpf_object__load_progs (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17DEF1: bpf_object_load (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17E1CB: bpf_object__load (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x189AE3: bpf_object__load_skeleton (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x12B019: socket_bpf__load (socket.skel.h:127)
==30926==    by 0x12B019: ebpf_socket_load_and_attach (ebpf_socket.c:404)
==30926==    by 0x12B13B: ebpf_socket_load_bpf (ebpf_socket.c:3890)
==30926==    by 0x12F2C7: ebpf_socket_thread (ebpf_socket.c:3944)
==30926==    by 0x15860F: thread_start (threads.c:185)
==30926==    by 0x4E0D78C: start_thread (pthread_create.c:442)
==30926==
==30926== Conditional jump or move depends on uninitialised value(s)
==30926==    at 0x484846C: strncmp (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==30926==    by 0x17C229: fixup_verifier_log (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17BB65: bpf_object_load_prog (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17C5C1: bpf_object__load_progs (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17DEF1: bpf_object_load (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17E1CB: bpf_object__load (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x189AE3: bpf_object__load_skeleton (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x12B019: socket_bpf__load (socket.skel.h:127)
==30926==    by 0x12B019: ebpf_socket_load_and_attach (ebpf_socket.c:404)
==30926==    by 0x12B13B: ebpf_socket_load_bpf (ebpf_socket.c:3890)
==30926==    by 0x12F2C7: ebpf_socket_thread (ebpf_socket.c:3944)
==30926==    by 0x15860F: thread_start (threads.c:185)
==30926==    by 0x4E0D78C: start_thread (pthread_create.c:442)
==30926==
==30926== Conditional jump or move depends on uninitialised value(s)
==30926==    at 0x484847F: strncmp (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==30926==    by 0x17C229: fixup_verifier_log (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17BB65: bpf_object_load_prog (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17C5C1: bpf_object__load_progs (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17DEF1: bpf_object_load (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17E1CB: bpf_object__load (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x189AE3: bpf_object__load_skeleton (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x12B019: socket_bpf__load (socket.skel.h:127)
==30926==    by 0x12B019: ebpf_socket_load_and_attach (ebpf_socket.c:404)
==30926==    by 0x12B13B: ebpf_socket_load_bpf (ebpf_socket.c:3890)
==30926==    by 0x12F2C7: ebpf_socket_thread (ebpf_socket.c:3944)
==30926==    by 0x15860F: thread_start (threads.c:185)
==30926==    by 0x4E0D78C: start_thread (pthread_create.c:442)
==30926==
==30926== Conditional jump or move depends on uninitialised value(s)
==30926==    at 0x4848484: strncmp (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==30926==    by 0x17C229: fixup_verifier_log (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17BB65: bpf_object_load_prog (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17C5C1: bpf_object__load_progs (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17DEF1: bpf_object_load (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17E1CB: bpf_object__load (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x189AE3: bpf_object__load_skeleton (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x12B019: socket_bpf__load (socket.skel.h:127)
==30926==    by 0x12B019: ebpf_socket_load_and_attach (ebpf_socket.c:404)
==30926==    by 0x12B13B: ebpf_socket_load_bpf (ebpf_socket.c:3890)
==30926==    by 0x12F2C7: ebpf_socket_thread (ebpf_socket.c:3944)
==30926==    by 0x15860F: thread_start (threads.c:185)
==30926==    by 0x4E0D78C: start_thread (pthread_create.c:442)
==30926==
--30926-- WARNING: unhandled eBPF command 28
--30926-- WARNING: unhandled eBPF command 28
==30926== Conditional jump or move depends on uninitialised value(s)
==30926==    at 0x4847D18: strlen (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==30926==    by 0x4DE25C7: __vfprintf_internal (vfprintf-process-arg.c:397)
==30926==    by 0x4DE2C74: buffered_vfprintf (vfprintf-internal.c:1748)
==30926==    by 0x16B030: __base_pr (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x16B11C: libbpf_print (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17BC17: bpf_object_load_prog (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17C5C1: bpf_object__load_progs (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17DEF1: bpf_object_load (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17E1CB: bpf_object__load (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x189AE3: bpf_object__load_skeleton (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x12B019: socket_bpf__load (socket.skel.h:127)
==30926==    by 0x12B019: ebpf_socket_load_and_attach (ebpf_socket.c:404)
==30926==    by 0x12B13B: ebpf_socket_load_bpf (ebpf_socket.c:3890)
==30926==
==30926== Syscall param write(buf) points to uninitialised byte(s)
==30926==    at 0x4E7DECF: __libc_write (write.c:26)
==30926==    by 0x4E7DECF: write (write.c:24)
==30926==    by 0x4E052CC: _IO_file_write@@GLIBC_2.2.5 (fileops.c:1180)
==30926==    by 0x4E0466F: new_do_write (fileops.c:448)
==30926==    by 0x4E05A00: _IO_new_file_xsputn (fileops.c:1254)
==30926==    by 0x4E05A00: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1196)
==30926==    by 0x4DE2CE2: buffered_vfprintf (vfprintf-internal.c:1766)
==30926==    by 0x16B030: __base_pr (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x16B11C: libbpf_print (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17BC17: bpf_object_load_prog (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17C5C1: bpf_object__load_progs (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17DEF1: bpf_object_load (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x17E1CB: bpf_object__load (in /home/vlad/netdata/ebpf.plugin)
==30926==    by 0x189AE3: bpf_object__load_skeleton (in /home/vlad/netdata/ebpf.plugin)
==30926==  Address 0x8cb1855 is on thread 3's stack
==30926==  in frame #4, created by buffered_vfprintf (vfprintf-internal.c:1716)

@thiagoftsm
Contributor Author

@vlvkobal these errors are related to the internal libbpf, which is statically linked with the threads, and to the CO-RE code generated by the compiler. To fix this properly, we will need to either change the compiler output in ebpf-co-re or create a patch for libbpf.

A new libbpf version was released today, and it brings huge changes. I will investigate this further in the libbpf code before we use it here. I prefer to address this issue in another PR.

Labels
area/collectors Everything related to data collection area/docs area/packaging Packaging and operating systems support collectors/ebpf
Development

Successfully merging this pull request may close these issues.

[Feat]: change the default eBPF load mode to CO-RE
3 participants