Memory CO-RE #12684

Merged
merged 22 commits into from Apr 26, 2022
@thiagoftsm thiagoftsm commented Apr 14, 2022

Summary

This PR adds CO-RE (Compile Once, Run Everywhere) support to all threads related to memory actions.
It also makes small changes to the directory cache and cachestat threads to speed up thread loading.
While developing this PR, I organized the commits as follows (with some exceptions):

Commits
1. Move load from main thread to a helper
2. Add global variables used to test `btf` symbols.
3. Add CO-RE code
4. Change config file for thread.
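
The decision that steps 2 and 3 enable can be sketched roughly as below. This is only an illustration under stated assumptions: the enum, its values, and `pick_load_mode` are hypothetical names invented for this example and are not taken from the plugin source. The idea is that the thread falls back to the legacy path when no BTF is available, and prefers trampolines when the traced function is present in BTF.

```c
#include <stdbool.h>

/* Hypothetical load modes; these names are illustrative, not the plugin's own enums. */
typedef enum {
    LOAD_LEGACY_KPROBE,   /* pre-compiled object attached through kprobes */
    LOAD_CORE_TRAMPOLINE, /* CO-RE object attached through trampolines */
    LOAD_CORE_KPROBE      /* CO-RE object attached through kprobes */
} load_mode_t;

/*
 * Pick a load mode from two facts probed at startup (both are assumptions
 * of this sketch):
 *   has_btf       - kernel type information (BTF) is available
 *   target_in_btf - the function to trace was found among the BTF symbols
 */
static load_mode_t pick_load_mode(bool has_btf, bool target_in_btf)
{
    if (!has_btf)
        return LOAD_LEGACY_KPROBE;

    return target_in_btf ? LOAD_CORE_TRAMPOLINE : LOAD_CORE_KPROBE;
}
```

A caller would evaluate `pick_load_mode()` once per thread and then load the matching object; keeping the decision in a helper keeps it out of the main thread function, in the spirit of step 1.
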
Test Plan
  1. Clone this branch.
  2. Change your /etc/netdata/ebpf.d.conf to:
[ebpf programs]
    cachestat = yes
    dcstat = yes
    swap = yes
  3. Compile this branch.
  4. Start netdata and wait a few seconds.
  5. Run the following script:
Script:
#!/bin/bash
# Fetch the load-method chart and one chart per memory thread from the local agent.
# URLs are quoted so the shell never treats `?` as a glob character.
curl -k -o load_ebpf.txt "http://localhost:19999/api/v1/data?chart=netdata.ebpf_load_methods"
curl -k -o global_cachestat.txt "http://localhost:19999/api/v1/data?chart=mem.cachestat_ratio"
curl -k -o apps_cachestat_dirties.txt "http://localhost:19999/api/v1/data?chart=apps.cachestat_dirties"
curl -k -o apps_dc_reference.txt "http://localhost:19999/api/v1/data?chart=apps.dc_reference"
curl -k -o apps_swap_read_call.txt "http://localhost:19999/api/v1/data?chart=apps.swap_read_call"
curl -k -o services_cachestat.txt "http://localhost:19999/api/v1/data?chart=services.cachestat_ratio"

grep eBPF /var/log/netdata/* > log.txt
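
Once the script has run, each output file should exist and contain data. A minimal sketch of such a check follows; `report_is_nonempty` is a hypothetical helper written for this example and is not part of Netdata.

```c
#include <stdbool.h>
#include <sys/stat.h>

/* Return true when `path` exists, is a regular file, and holds at least one byte. */
static bool report_is_nonempty(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return false; /* missing file or no permission to inspect it */

    return S_ISREG(st.st_mode) && st.st_size > 0;
}
```

For example, `report_is_nonempty("load_ebpf.txt")` returning false would suggest that the agent was not reachable or that the chart was not created.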

Additional Information

On Slackware I did not add any cgroup charts, because Slackware does not have systemd ( 🥳 ).
You can get all reports using this link.

This PR was already tested on:

| Linux Distribution | Kernel version | Load Chart | LOG | Global Cachestat | APPS cachestat | Services Cachestat | Apps SWAP | Apps Directory Cache |
|---|---|---|---|---|---|---|---|---|
| Slackware Current | 5.17.3 | slackware_5_17_3_load_ebpf.txt | slackware_5_17_3_log.txt | slackware_5_17_3_global_cachestat.txt | slackware_5_17_3_apps_cachestat_dirties.txt | | slackware_5_17_3_apps_swap_read_call.txt | slackware_5_17_3_apps_dc_reference.txt |
| Arch Linux | 5.17.1-arch1 | arch_5_17_1_load_ebpf.txt | arch_5_17_1_log.txt | arch_5_17_1_global_cachestat.txt | arch_5_17_1_apps_cachestat_dirties.txt | arch_5_17_1_services_cachestat.txt | arch_5_17_1_apps_swap_read_call.txt | arch_5_17_1_apps_dc_reference.txt |
| Ubuntu 21.04 | 5.11.0-49-generic | ubuntu_5_11_load_ebpf.txt | ubuntu_5_11_log.txt | ubuntu_5_11_global_cachestat.txt | ubuntu_5_11_apps_cachestat_dirties.txt | ubuntu_5_11_services_cachestat.txt | ubuntu_5_11_apps_swap_read_call.txt | ubuntu_5_11_apps_dc_reference.txt |
| Manjaro 21.1 | 5.10.109-1 | manjaro_5_10_1_load_ebpf.txt | manjaro_5_10_1_log.txt | manjaro_5_10_1_global_cachestat.txt | manjaro_5_10_1_apps_cachestat_dirties.txt | manjaro_5_10_1_services_cachestat.txt | manjaro_5_10_1_apps_swap_read_call.txt | manjaro_5_10_1_apps_dc_reference.txt |
| Alma 8.5 | 4.18.0-348.20.1.el8_5.x86_64 | alma_4_18_load_ebpf.txt | alma_4_18_log.txt | alma_4_18_global_cachestat.txt | alma_4_18_apps_cachestat_dirties.txt | alma_4_18_services_cachestat.txt | alma_4_18_apps_swap_read_call.txt | alma_4_18_apps_dc_reference.txt |
| Slackware 15.0 | 4.14.266 | slackware_4_14_266_load_ebpf.txt | slackware_4_14_266_log.txt | slackware_4_14_266_global_cachestat.txt | slackware_4_14_266_apps_cachestat_dirties.txt | | slackware_4_14_266_apps_swap_read_call.txt | slackware_4_14_266_apps_dc_reference.txt |

For users: How does this change affect me?

Describe how the PR affects users:
- Which area of Netdata is affected by the change? Dashboard and performance.
- Can they see the change or is it under the hood? If they can see it, where? This is not visible to users, because we are only changing the way we collect data.
- How is the user impacted by the change? Performance.
- Are there any benefits of the change? The charts used to monitor SWAP, Cachestat, and Directory cache will have less impact on the host, because these threads will use `trampolines` instead of `kprobes`.

@thiagoftsm thiagoftsm marked this pull request as draft April 14, 2022 01:44
@github-actions github-actions bot added area/collectors Everything related to data collection collectors/ebpf labels Apr 14, 2022
@github-actions github-actions bot added the area/packaging Packaging and operating systems support label Apr 18, 2022
@thiagoftsm thiagoftsm marked this pull request as ready for review April 19, 2022 03:04
@vlvkobal
collectors/ebpf.plugin/ebpf_dcstat.c: In function 'ebpf_dc_load_and_attach':
collectors/ebpf.plugin/ebpf_dcstat.c:191:29: warning: unused variable 'mt' [-Wunused-variable]
  191 |     netdata_ebpf_targets_t *mt = em->targets;
      |   

@@ -1002,8 +1296,9 @@ void *ebpf_cachestat_thread(void *ptr)
cachestat_counter_dimension_name, cachestat_counter_dimension_name,
algorithms, NETDATA_CACHESTAT_END);

pthread_mutex_lock(&lock);

BTW, do you plan to use libuv threading and synchronization primitives in the eBPF plugin?

@thiagoftsm

This is a change that we cannot rule out, mainly because Microsoft has been working to improve eBPF support. Since we are currently working to finish the CO-RE integration, this could be the next piece of work on the plugin (ping @cpipilas).

@thiagoftsm

collectors/ebpf.plugin/ebpf_dcstat.c:191:29: warning: unused variable 'mt' [-Wunused-variable]
191 | netdata_ebpf_targets_t *mt = em->targets;

It was fixed in the last commit, thanks for the report!
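
For readers unfamiliar with `-Wunused-variable`, here is a minimal sketch of the two usual remedies. The types and function names are hypothetical stand-ins, and this does not claim to be the change made in the fixing commit: either the unused local is dropped, or it is actually read.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the plugin's types; only the shape matters here. */
typedef struct { int mode; } targets_t;
typedef struct { targets_t *targets; } module_t;

/* Remedy 1: drop the local entirely when nothing reads it. */
static int load_without_local(module_t *em)
{
    return em->targets != NULL ? 0 : -1;
}

/* Remedy 2: keep the local and read it, so the compiler sees a use. */
static int load_with_local(module_t *em)
{
    targets_t *mt = em->targets;

    if (mt == NULL)
        return -1;

    return mt->mode;
}
```

Either form compiles without the warning; which one fits depends on whether the function needs the targets at all.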

@thiagoftsm thiagoftsm merged commit 2244973 into netdata:master Apr 26, 2022
@thiagoftsm thiagoftsm deleted the memory_co_re branch April 26, 2022 11:40