[Bug]: incorrect documentation about ebpf plugin configuration #16986

Open
nodiscc opened this issue Feb 10, 2024 · 6 comments · May be fixed by #17073

Comments

@nodiscc
Contributor

nodiscc commented Feb 10, 2024

Bug description

https://learn.netdata.cloud/docs/data-collection/ebpf/ebpf-socket indicates that one must run sudo ./edit-config ebpf.d/network.conf to create the ebpf plugin configuration. This creates /etc/netdata/ebpf.d/network.conf.

However after restarting netdata, no ebpf.plugin charts are shown. Netdata logs the following message:

ebpf.plugin[649749]: Does not have a configuration file inside `/etc/netdata/ebpf.d.conf. It will try to load stock file.

sudo mv /etc/netdata/ebpf.d/network.conf /etc/netdata/ebpf.d.conf and restarting netdata fixes the problem.

I believe that either the documentation is wrong, or the configuration is not being loaded from /etc/netdata/ebpf.d/network.conf when it should be.
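
A minimal sketch for checking which of the two files discussed above is actually in place. The function name `ebpf_config_status` is hypothetical, and the default directory is an assumption taken from the "User Configurations" path in the build info below. It only reports file presence and says nothing about which file the plugin actually reads.

```shell
# Report which eBPF configuration files exist under a Netdata config directory.
# Assumption: the directory defaults to /etc/netdata; pass another path as $1.
ebpf_config_status() {
    confdir="${1:-/etc/netdata}"
    for f in ebpf.d.conf ebpf.d/network.conf; do
        if [ -f "$confdir/$f" ]; then
            echo "present: $f"
        else
            echo "missing: $f"
        fi
    done
}
```

Calling `ebpf_config_status /etc/netdata` prints one line per file, so the output can be pasted into a report like this one.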

Expected behavior

Following https://learn.netdata.cloud/docs/data-collection/ebpf/ebpf-socket should yield a working ebpf.plugin configuration/display ebpf charts.

Steps to reproduce

  1. Follow https://learn.netdata.cloud/docs/data-collection/ebpf/ebpf-socket and create/edit /etc/netdata/ebpf.d/network.conf according to the documentation
  2. Restart netdata, wait for a while
  3. No eBPF charts are displayed, and the netdata logs contain an error message about failing to find the plugin config file.
    ...

Installation method

manual setup of official DEB/RPM packages

System info

Linux my.example.org 6.1.0-17-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.69-1 (2023-12-30) x86_64 GNU/Linux
/etc/os-release:PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
/etc/os-release:NAME="Debian GNU/Linux"
/etc/os-release:VERSION_ID="12"
/etc/os-release:VERSION="12 (bookworm)"
/etc/os-release:VERSION_CODENAME=bookworm
/etc/os-release:ID=debian

Netdata build info

Packaging:
    Netdata Version ____________________________________________ : v1.44.2
    Installation Type __________________________________________ : binpkg-deb
    Package Architecture _______________________________________ : x86_64
    Package Distro _____________________________________________ :  
    Configure Options __________________________________________ :  '--build=x86_64-linux-gnu' '--includedir=${prefix}/include' '--mandir=${prefix}/share/man' '--infodir=${prefix}/share/info' '--disable-option-checking' '--disable-silent-rules' '--libdir=${prefix}/lib/x86_64-linux-gnu' '--libexecdir=${prefix}/lib/x86_64-linux-gnu' '--disable-maintainer-mode' '--prefix=/usr' '--sysconfdir=/etc' '--localstatedir=/var' '--libdir=/usr/lib' '--libexecdir=/usr/libexec' '--with-user=netdata' '--with-math' '--with-zlib' '--with-webdir=/var/lib/netdata/www' '--disable-dependency-tracking' 'build_alias=x86_64-linux-gnu' 'CFLAGS=-g -O2 -ffile-prefix-map=/usr/src/netdata=. -fstack-protector-strong -Wformat -Werror=format-security' 'LDFLAGS=-Wl,-z,relro' 'CPPFLAGS=-Wdate-time -D_FORTIFY_SOURCE=2' 'CXXFLAGS=-g -O2 -ffile-prefix-map=/usr/src/netdata=. -fstack-protector-strong -Wformat -Werror=format-security'
Default Directories:
    User Configurations ________________________________________ : /etc/netdata
    Stock Configurations _______________________________________ : /usr/lib/netdata/conf.d
    Ephemeral Databases (metrics data, metadata) _______________ : /var/cache/netdata
    Permanent Databases ________________________________________ : /var/lib/netdata
    Plugins ____________________________________________________ : /usr/libexec/netdata/plugins.d
    Static Web Files ___________________________________________ : /var/lib/netdata/www
    Log Files __________________________________________________ : /var/log/netdata
    Lock Files _________________________________________________ : /var/lib/netdata/lock
    Home _______________________________________________________ : /var/lib/netdata
Operating System:
    Kernel _____________________________________________________ : Linux
    Kernel Version _____________________________________________ : 6.1.0-17-amd64
    Operating System ___________________________________________ : Debian GNU/Linux
    Operating System ID ________________________________________ : debian
    Operating System ID Like ___________________________________ : unknown
    Operating System Version ___________________________________ : 12 (bookworm)
    Operating System Version ID ________________________________ : none
    Detection __________________________________________________ : /etc/os-release
Hardware:
    CPU Cores __________________________________________________ : 4
    CPU Frequency ______________________________________________ : 3194000000
    RAM Bytes __________________________________________________ : 3931627520
    Disk Capacity ______________________________________________ : 375809638400
    CPU Architecture ___________________________________________ : x86_64
    Virtualization Technology __________________________________ : kvm
    Virtualization Detection ___________________________________ : systemd-detect-virt
Container:
    Container __________________________________________________ : none
    Container Detection ________________________________________ : systemd-detect-virt
    Container Orchestrator _____________________________________ : none
    Container Operating System _________________________________ : none
    Container Operating System ID ______________________________ : none
    Container Operating System ID Like _________________________ : none
    Container Operating System Version _________________________ : none
    Container Operating System Version ID ______________________ : none
    Container Operating System Detection _______________________ : none
Features:
    Built For __________________________________________________ : Linux
    Netdata Cloud ______________________________________________ : YES
    Health (trigger alerts and send notifications) _____________ : YES
    Streaming (stream metrics to parent Netdata servers) _______ : YES
    Back-filling (of higher database tiers) ____________________ : YES
    Replication (fill the gaps of parent Netdata servers) ______ : YES
    Streaming and Replication Compression ______________________ : YES (zstd lz4 gzip)
    Contexts (index all active and archived metrics) ___________ : YES
    Tiering (multiple dbs with different metrics resolution) ___ : YES (5)
    Machine Learning ___________________________________________ : YES
Database Engines:
    dbengine ___________________________________________________ : YES
    alloc ______________________________________________________ : YES
    ram ________________________________________________________ : YES
    map ________________________________________________________ : YES
    save _______________________________________________________ : YES
    none _______________________________________________________ : YES
Connectivity Capabilities:
    ACLK (Agent-Cloud Link: MQTT over WebSockets over TLS) _____ : YES
    static (Netdata internal web server) _______________________ : YES
    h2o (web server) ___________________________________________ : YES
    WebRTC (experimental) ______________________________________ : NO
    Native HTTPS (TLS Support) _________________________________ : YES
    TLS Host Verification ______________________________________ : YES
Libraries:
    LZ4 (extremely fast lossless compression algorithm) ________ : YES
    ZSTD (fast, lossless compression algorithm) ________________ : YES
    zlib (lossless data-compression library) ___________________ : YES
    Judy (high-performance dynamic arrays and hashtables) ______ : YES (bundled)
    dlib (robust machine learning toolkit) _____________________ : YES (bundled)
    protobuf (platform-neutral data serialization protocol) ____ : YES (system)
    OpenSSL (cryptography) _____________________________________ : YES
    libdatachannel (stand-alone WebRTC data channels) __________ : NO
    JSON-C (lightweight JSON manipulation) _____________________ : YES
    libcap (Linux capabilities system operations) ______________ : NO
    libcrypto (cryptographic functions) ________________________ : YES
    libm (mathematical functions) ______________________________ : YES
    jemalloc ___________________________________________________ : NO
    TCMalloc ___________________________________________________ : NO
Plugins:
    apps (monitor processes) ___________________________________ : YES
    cgroups (monitor containers and VMs) _______________________ : YES
    cgroup-network (associate interfaces to CGROUPS) ___________ : YES
    proc (monitor Linux systems) _______________________________ : YES
    tc (monitor Linux network QoS) _____________________________ : YES
    diskspace (monitor Linux mount points) _____________________ : YES
    freebsd (monitor FreeBSD systems) __________________________ : NO
    macos (monitor MacOS systems) ______________________________ : NO
    statsd (collect custom application metrics) ________________ : YES
    timex (check system clock synchronization) _________________ : YES
    idlejitter (check system latency and jitter) _______________ : YES
    bash (support shell data collection jobs - charts.d) _______ : YES
    debugfs (kernel debugging metrics) _________________________ : YES
    cups (monitor printers and print jobs) _____________________ : YES
    ebpf (monitor system calls) ________________________________ : YES
    freeipmi (monitor enterprise server H/W) ___________________ : YES
    nfacct (gather netfilter accounting) _______________________ : YES
    perf (collect kernel performance events) ___________________ : YES
    slabinfo (monitor kernel object caching) ___________________ : YES
    Xen ________________________________________________________ : NO
    Xen VBD Error Tracking _____________________________________ : NO
    Logs Management ____________________________________________ : YES
Exporters:
    AWS Kinesis ________________________________________________ : NO
    GCP PubSub _________________________________________________ : NO
    MongoDB ____________________________________________________ : YES
    Prometheus (OpenMetrics) Exporter __________________________ : YES
    Prometheus Remote Write ____________________________________ : YES
    Graphite ___________________________________________________ : YES
    Graphite HTTP / HTTPS ______________________________________ : YES
    JSON _______________________________________________________ : YES
    JSON HTTP / HTTPS __________________________________________ : YES
    OpenTSDB ___________________________________________________ : YES
    OpenTSDB HTTP / HTTPS ______________________________________ : YES
    All Metrics API ____________________________________________ : YES
    Shell (use metrics in shell scripts) _______________________ : YES
Debug/Developer Features:
    Trace All Netdata Allocations (with charts) ________________ : NO
    Developer Mode (more runtime checks, slower) _______________ : NO

Additional info

No response

@nodiscc added the bug and needs triage labels Feb 10, 2024
@ilyam8 added the collectors/ebpf label and removed the needs triage label Feb 10, 2024
@nodiscc
Contributor Author

nodiscc commented Feb 21, 2024

I would add that even with this configuration in /etc/netdata/ebpf.d.conf:

[global]
    ebpf load mode = entry
    apps = yes
    cgroups = yes
    update every = 5
    pid table size = 32768
    btf path = /sys/kernel/btf/
    maps per core = yes
    lifetime = 300

[network connections]
    enabled = yes
    resolve hostnames = no
    resolve service names = yes
    ports = *
    ips = *
    hostnames = *

Netdata does not show the metrics described at https://learn.netdata.cloud/docs/collecting-metrics/ebpf/ebpf-socket. I am interested in app.ebpf_sock_bytes_sent and app.ebpf_sock_bytes_received, but I am unable to find them on the agent dashboard using search, by manually looking through the Applications or Networking related charts, by searching through /api/v1/charts, etc.
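
The search through /api/v1/charts mentioned above can be sketched as a small filter. The function name `find_ebpf_sock_charts` is hypothetical, and the match is a plain-text grep over the JSON rather than a real JSON parse, so treat it as a rough check only.

```shell
# Print identifiers containing "ebpf_sock" from a /api/v1/charts JSON
# document read on stdin. Plain-text match: quoted tokens made of letters,
# digits, underscores and dots that contain "ebpf_sock".
find_ebpf_sock_charts() {
    grep -o '"[A-Za-z0-9_.]*ebpf_sock[A-Za-z0-9_.]*"' | tr -d '"' | sort -u
}
```

Assuming the agent listens on its default address, usage would look like `curl -s http://localhost:19999/api/v1/charts | find_ebpf_sock_charts`. Empty output means no matching identifier appears in the response.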

I have made sure that /sys/kernel/btf is available on this kernel

$ uname -a
Linux my.example.org 6.1.0-18-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.76-1 (2024-02-01) x86_64 GNU/Linux

$ ls /sys/kernel/btf/
aesni_intel    crc32c_generic     drm_shmem_helper     intel_rapl_common  ledtrig_audio           nf_conntrack_netbios_ns  nft_chain_nat    sha256_ssse3           snd_pcm     usbcore                virtio_ring
ata_generic    crc32c_intel       efi_pstore           intel_rapl_msr     libata                  nf_conntrack_netlink     nft_compat       sha512_generic         snd_timer   virtio                 vmlinux
ata_piix       crc32_pclmul       ehci_hcd             ip6_tables         libchacha               nf_defrag_ipv4           overlay          sha512_ssse3           soundcore   virtio_balloon         wireguard
autofs4        crct10dif_common   ehci_pci             ip6t_REJECT        libchacha20poly1305     nf_defrag_ipv6           poly1305_x86_64  snd                    sr_mod      virtio_blk             x_tables
battery        crct10dif_pclmul   evdev                ip6t_rpfilter      libcrc32c               nf_log_syslog            psmouse          snd_hda_codec          tcp_diag    virtio_console         xt_conntrack
binfmt_misc    cryptd             ext4                 ip6_udp_tunnel     libcurve25519_generic   nf_nat                   qemu_fw_cfg      snd_hda_codec_generic  tls         virtio_dma_buf         xt_CT
button         crypto_simd        failover             ip_set             loop                    nfnetlink                scsi_common      snd_hda_core           tun         virtio_gpu             xt_LOG
cdrom          curve25519_x86_64  fuse                 ip_set_hash_ip     mbcache                 nfnetlink_acct           scsi_mod         snd_hda_intel          udp_diag    virtio_net             xt_MASQUERADE
chacha_x86_64  dm_mod             ghash_clmulni_intel  ip_tables          net_failover            nf_reject_ipv4           serio_raw        snd_hwdep              udp_tunnel  virtio_pci             xt_multiport
configfs       drm                i2c_piix4            ipt_REJECT         nf_conntrack            nf_reject_ipv6           sg               snd_intel_dspcfg       uhci_hcd    virtio_pci_legacy_dev  xt_set
crc16          drm_kms_helper     inet_diag            jbd2               nf_conntrack_broadcast  nf_tables                sha1_ssse3       snd_intel_sdw_acpi     usb_common  virtio_pci_modern_dev  xt_tcpudp
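
The BTF check above can also be scripted. This is a minimal sketch: the function name `has_kernel_btf` is hypothetical, and testing for the `vmlinux` entry (visible in the listing above) is an assumption about what matters, not a statement of what ebpf.plugin itself verifies.

```shell
# Succeed if the kernel's own BTF data is readable at <btf_dir>/vmlinux.
# The default directory mirrors the "btf path" value in the configuration above.
has_kernel_btf() {
    btf_dir="${1:-/sys/kernel/btf}"
    [ -r "$btf_dir/vmlinux" ]
}
```

For example, `has_kernel_btf && echo "BTF available"` prints the message only when the file is readable.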

@thiagoftsm linked a pull request Feb 27, 2024 that will close this issue
@thiagoftsm
Contributor

Hello @nodiscc ,

Thank you for your feedback. We are now working with sockets again, and we will consider everything you wrote when making improvements.

Best regards!

@ilyam8
Member

ilyam8 commented Feb 29, 2024

@thiagoftsm, excuse me, what exactly are you going to consider?

@nodiscc is saying that he followed https://learn.netdata.cloud/docs/collecting-metrics/ebpf/ebpf-socket and got

ebpf.plugin[649749]: Does not have a configuration file inside `/etc/netdata/ebpf.d.conf. It will try to load stock file.

He did some testing and found out that

sudo mv /etc/netdata/ebpf.d/network.conf /etc/netdata/ebpf.d.conf and restarting netdata fixes the problem.

So what are you going to consider? Is he wrong, is our documentation wrong, or is there a bug in the ebpf plugin?

@ilyam8
Member

ilyam8 commented Feb 29, 2024

@nodiscc

ebpf.plugin[649749]: Does not have a configuration file inside `/etc/netdata/ebpf.d.conf. It will try to load stock file.

This is expected because it is a different file. The ebpf plugin has modules, and their configurations are in the ebpf.d directory. ebpf.d/network.conf is the configuration file of the module; ebpf.d.conf is a different file.
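
To see what each of these files actually defines, one option is to list their section headers and compare them. The function name `conf_sections` is hypothetical, and it assumes the `[section]` header layout shown in the configuration quoted earlier in this thread.

```shell
# Print the section names ("[name]" headers) found in a Netdata-style
# configuration file, one per line, in file order.
conf_sections() {
    sed -n 's/^[[:space:]]*\[\(.*\)][[:space:]]*$/\1/p' "$1"
}
```

For instance, `conf_sections /etc/netdata/ebpf.d.conf` could be compared against the stock copy, which is assumed here to live under the "Stock Configurations" directory from the build info (`/usr/lib/netdata/conf.d`).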

@nodiscc
Contributor Author

nodiscc commented Mar 1, 2024

Expected because it is a different file

Hi @ilyam8, sure, I understand that this message may be part of normal operation. The problem is that the charts are not shown when I create ebpf.d/network.conf. Some charts are shown when I create ebpf.d.conf, but not the ones I'm interested in.

Can you reproduce the problem on your side? As I said, you just need to follow https://learn.netdata.cloud/docs/collecting-metrics/ebpf/ebpf-socket

@hugovalente-pm
Contributor

@thiagoftsm could you help confirm whether the current documentation at https://learn.netdata.cloud/docs/collecting-metrics/ebpf/ebpf-socket is correct? Are we missing something the user needs in order to enable this specific module?
