Sensor Probes

cyb3rjerry edited this page May 23, 2026 · 1 revision

The eBPF sensor library lives at internal/runner/sensor/. Probes are written in CO-RE BPF C (bpf/sensor.bpf.c), compiled with clang + bpf2go, and embedded into the runner binary via go:embed.

Lifecycle

  • Sensor.New(Options): at runner startup. Loads BPF objects, attaches every probe once, opens the ringbuf reader.
  • AddCgroup(opts): per-job. Inserts the cgroup_id + run_id into cgmap and populates path_filter with the watched paths.
  • RemoveCgroup(cgroup_id): per-job teardown. Removes the cgmap entry + the path_filter entries for this cgroup.
  • Events(ctx): long-lived event channel. A reader goroutine decodes ringbuf records into typed Go events.
  • Close(): detaches all probes, releases BPF objects.

Probes stay attached across jobs; cgmap and path_filter mutations are the only per-job state. The runner also pre-creates /sys/fs/cgroup/.../fangs/<run_id>/ and registers that directory's inode in cgmap before docker-start. Together these close the container-start vs sensor-attach race window: events fire from the first syscall inside the container.
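As a rough illustration of the "register its inode" step, the sketch below resolves the ID of a pre-created cgroup directory from userspace. It assumes a cgroup v2 hierarchy on a 64-bit host, where the value returned by bpf_get_current_cgroup_id() matches the directory's inode number; cgroup_id_from_path is a hypothetical helper name, not part of the sensor library.

```c
#include <stdint.h>
#include <sys/stat.h>

/* Hypothetical helper: resolve the cgroup ID for a cgroup v2 directory.
 * Assumption: the kernel-side cgroup ID equals the directory's inode number. */
int cgroup_id_from_path(const char *path, uint64_t *id_out)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;                 /* path missing or unreadable; errno is set */
    if (!S_ISDIR(st.st_mode))
        return -1;                 /* a cgroup is always a directory */

    *id_out = (uint64_t)st.st_ino;
    return 0;
}
```

The runner would call something like this on the fangs/<run_id> directory right after creating it and pass the result to AddCgroup before the container starts.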

Probe inventory

| Probe | Section | What it captures |
| --- | --- | --- |
| handle_openat | tracepoint/syscalls/sys_enter_openat | File access; LPM-trie filtered to watched prefixes; @cred tag |
| handle_execve | tracepoint/syscalls/sys_enter_execve | Process exec; binary path, argv (8×64 B), 5-level ancestry |
| handle_connect | tracepoint/syscalls/sys_enter_connect | TCP+UDP connect; v4 and v6; port=0 dropped |
| handle_tcp_v4_connect | kprobe/tcp_v4_connect | TCP v4 connects via io_uring (bypasses sys_enter) |
| handle_tcp_v6_connect | kprobe/tcp_v6_connect | Same, v6 |
| handle_sendto | tracepoint/syscalls/sys_enter_sendto | DNS via sendto+addr; TLS ClientHello detection |
| handle_sendmsg | tracepoint/syscalls/sys_enter_sendmsg | DNS via single-message sendmsg |
| handle_sendmmsg | tracepoint/syscalls/sys_enter_sendmmsg | DNS via curl/glibc-2.30+ batched A+AAAA |
| handle_write | tracepoint/syscalls/sys_enter_write | TLS ClientHello on TCP sockets (Node's BoringSSL path) |
| handle_ssl_ctrl | uprobe:libssl/SSL_ctrl | TLS SNI via OpenSSL SSL_set_tlsext_host_name |

BPF maps

| Map | Type | Purpose |
| --- | --- | --- |
| cgmap | HASH | watched cgroup_id → run_id |
| path_filter | LPM_TRIE | watched-path allowlist + @cred tag per prefix |
| events | RINGBUF | 64 MB capture buffer |
| drops_counter | PERCPU_ARRAY | bumped when ringbuf reserve fails (overflow indicator) |
| cgmap_misses | HASH | diagnostic: cgroup_ids that hit lookup_cgroup but missed cgmap |

Cgroup matching

The lookup_cgroup helper:

  1. bpf_get_current_cgroup_id() → check cgmap directly. Match → return.
  2. Walk bpf_get_current_ancestor_cgroup_id(level) from 1..8. Deepest match wins (LPM-style). Match → return.
  3. No match → record in cgmap_misses (diagnostic), return NULL.

The ancestor walk is what makes the pre-attach pattern work. Docker nests the container's leaf cgroup under our pre-created /fangs/<run_id> parent, so the leaf cgroup_id never matches cgmap directly, but one of its ancestors does.

The walk covers up to 8 ancestor levels. Docker hierarchies are typically ≤4 levels deep.
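The matching order above can be modelled in plain userspace C. This is a sketch of the logic only, not the BPF source: the ancestors array stands in for successive bpf_get_current_ancestor_cgroup_id(level) calls (with 0 meaning "no cgroup at that level"), a linear table stands in for the cgmap hash, and the names cg_entry, cg_find, and lookup_cgroup_model are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ANCESTOR_LEVEL 8

/* Hypothetical stand-in for one cgmap entry: cgroup_id -> run_id. */
struct cg_entry {
    uint64_t cgroup_id;
    uint8_t  run_id[16];
};

/* Linear scan standing in for the hash-map lookup. */
const struct cg_entry *cg_find(const struct cg_entry *tbl, size_t n, uint64_t id)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].cgroup_id == id)
            return &tbl[i];
    return NULL;
}

/* ancestors[level] models the ancestor ID at that level; index 0 is unused. */
const struct cg_entry *lookup_cgroup_model(const struct cg_entry *tbl, size_t n,
                                           uint64_t current_id,
                                           const uint64_t ancestors[MAX_ANCESTOR_LEVEL + 1])
{
    /* Step 1: direct match on the task's own cgroup. */
    const struct cg_entry *hit = cg_find(tbl, n, current_id);
    if (hit)
        return hit;

    /* Step 2: walk levels 1..8; a later (deeper) match overrides an earlier one. */
    const struct cg_entry *best = NULL;
    for (int level = 1; level <= MAX_ANCESTOR_LEVEL; level++) {
        if (ancestors[level] == 0)
            continue;
        hit = cg_find(tbl, n, ancestors[level]);
        if (hit)
            best = hit;
    }

    /* Step 3: NULL here is where the real helper would record a cgmap miss. */
    return best;
}
```

Keeping the last hit rather than returning on the first one is what gives the "deepest match wins" behaviour when both a parent and a nested cgroup are watched.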

Event types

Every event begins with a 72-byte fangs_event_header:

struct fangs_event_header {
    __u64 ts_ns;        // bpf_ktime_get_ns
    __u64 cgroup_id;    // matched cgroup_id from lookup_cgroup
    __u8  run_id[16];   // copied from cgmap value
    __u32 pid;          // current task's tgid
    __u32 tid;          // current task's pid
    __u32 ppid;         // task->real_parent->tgid via BPF_CORE_READ
    __u32 uid;
    __u32 gid;
    char  comm[16];     // bpf_get_current_comm
    __u8  type;         // discriminator (1=file, 2=exec, 3=net, 4=dns, 5=tls)
    __u8  tags;         // EVENT_TAG_* bits
    __u8  _pad[2];
};

The tags field can carry:

  • EVENT_TAG_INTERESTING (bit 0) — generic high-signal marker
  • EVENT_TAG_CRED_ACCESS (bit 1) — set on file_access events whose path_filter entry was @cred-tagged
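A userspace decoder has to agree with the kernel side on this layout, so a compile-time check is a cheap safeguard. The sketch below mirrors the header with fixed-width stdint types and asserts the 72-byte size; the EVENT_TAG_* values are derived from the bit positions listed above, and is_cred_access is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdint.h>

#define EVENT_TAG_INTERESTING (1u << 0)   /* bit 0 */
#define EVENT_TAG_CRED_ACCESS (1u << 1)   /* bit 1 */

/* Userspace mirror of the kernel-side header, field for field. */
struct fangs_event_header {
    uint64_t ts_ns;
    uint64_t cgroup_id;
    uint8_t  run_id[16];
    uint32_t pid, tid, ppid, uid, gid;
    char     comm[16];
    uint8_t  type;
    uint8_t  tags;
    uint8_t  _pad[2];
};

/* Fail the build if the compiler lays the struct out differently. */
_Static_assert(sizeof(struct fangs_event_header) == 72, "header must be 72 bytes");
_Static_assert(offsetof(struct fangs_event_header, type) == 68, "type must sit at offset 68");

/* Hypothetical helper: non-zero when the event was tagged as credential access. */
int is_cred_access(const struct fangs_event_header *h)
{
    return (h->tags & EVENT_TAG_CRED_ACCESS) != 0;
}
```

The explicit _pad[2] keeps the size at a multiple of 8 so that event bodies following the header start at an aligned offset.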

File access (openat)

struct openat_event {
    fangs_event_header h;
    __s32 dfd;
    __s32 flags;
    __u16 path_len;
    __u8  truncated;
    __u8  _pad;
    char  path[256];
};

Filter pipeline:

  1. lookup_cgroup — drop if not in any watched cgroup.
  2. bpf_probe_read_user_str(filename) into e->path.
  3. path_filter LPM trie lookup using e->path as key. No match → drop.
  4. If matching action == PATH_ACTION_KEEP_CRED_TAGGED, set EVENT_TAG_INTERESTING | EVENT_TAG_CRED_ACCESS on the tags byte.

On lookup, the key's prefix length is set to PATH_LEN * 8 bits (the full key width), so the trie returns the longest stored prefix that matches the path. AddCgroup populates the trie from the operator's watched-path list.
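The effect of the trie lookup can be approximated in userspace with a byte-wise longest-prefix scan. This is a model of the behaviour, not the kernel data structure. Only PATH_ACTION_KEEP_CRED_TAGGED appears on this page; PATH_ACTION_DROP, PATH_ACTION_KEEP, the numeric values, and the names path_rule and match_path are assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Assumed action values; only KEEP_CRED_TAGGED is named in the docs. */
enum path_action {
    PATH_ACTION_DROP = 0,              /* hypothetical: no rule matched */
    PATH_ACTION_KEEP = 1,              /* hypothetical: plain watched path */
    PATH_ACTION_KEEP_CRED_TAGGED = 2
};

struct path_rule {
    const char      *prefix;
    enum path_action action;
};

/* Return the action of the longest rule prefix that matches path. */
enum path_action match_path(const struct path_rule *rules, size_t n, const char *path)
{
    enum path_action best = PATH_ACTION_DROP;
    size_t best_len = 0;
    int found = 0;

    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(rules[i].prefix);

        /* strncmp stops at the path's NUL, so a short path never matches a longer prefix. */
        if (strncmp(path, rules[i].prefix, len) != 0)
            continue;
        if (!found || len > best_len) {
            found = 1;
            best_len = len;
            best = rules[i].action;
        }
    }
    return best;
}
```

With rules for /home/ and /home/user/.ssh/, a path under .ssh picks up the more specific rule, which is how a @cred-tagged prefix can sit inside a broader watched directory.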

Exec (execve)

struct exec_event {
    fangs_event_header h;
    __u8  argc;
    __u8  _pad[3];
    __u8  argv_lens[8];
    char  argv[8 * 64];
    char  binary_path[256];
    Ancestor ancestors[5];   // pid + ppid + comm[16] each
};

argv capture:

  • Reads the first 8 argv pointers via bpf_probe_read_user.
  • For each non-NULL pointer, reads up to 64 bytes of the string into the corresponding argv[i * 64] slot.
  • argv_lens[i] records the captured length per slot.
  • Tail args are truncated: arguments beyond the eighth are not captured, and each captured argument is cut at 64 bytes (operator can grep for truncated=1).

Ancestry: walks task->real_parent 5 levels deep via BPF_CORE_READ. Each ancestor's pid, ppid, and comm get recorded. Loop is #pragma unroll-ed for verifier-friendliness on older kernels.
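On the consumer side, the fixed-slot layout can be turned back into a command line as sketched below. The code assumes argc may exceed the 8 captured slots and that argv_lens[i] may or may not count the trailing NUL, so it clamps both; join_argv and the two size macros are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARGV_SLOTS    8
#define ARGV_SLOT_LEN 64

/* Join the captured argv slots into out, separated by spaces.
 * Returns the number of arguments written; stops early if out is too small. */
int join_argv(uint8_t argc, const uint8_t argv_lens[ARGV_SLOTS],
              const char argv[ARGV_SLOTS * ARGV_SLOT_LEN],
              char *out, size_t out_sz)
{
    size_t used = 0;
    int joined = 0;
    int n = argc < ARGV_SLOTS ? argc : ARGV_SLOTS;   /* only 8 slots exist */

    if (out_sz == 0)
        return 0;
    out[0] = '\0';

    for (int i = 0; i < n; i++) {
        const char *slot = argv + (size_t)i * ARGV_SLOT_LEN;
        size_t len = argv_lens[i];

        if (len > ARGV_SLOT_LEN)
            len = ARGV_SLOT_LEN;                      /* never read past the slot */
        const char *nul = memchr(slot, '\0', len);
        if (nul)
            len = (size_t)(nul - slot);               /* drop a counted NUL, if any */

        size_t need = len + (joined ? 1 : 0);
        if (used + need + 1 > out_sz)
            break;                                    /* keep room for the final NUL */
        if (joined)
            out[used++] = ' ';
        memcpy(out + used, slot, len);
        used += len;
        out[used] = '\0';
        joined++;
    }
    return joined;
}
```

Joining with spaces loses the original argument boundaries when an argument itself contains a space; a consumer that needs exact boundaries would keep the slots separate instead.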

Network connect (connect + kprobes)

struct net_connect_event {
    fangs_event_header h;
    __u8  family;        // AF_INET=2, AF_INET6=10
    __u8  source;        // NET_SOURCE_SYSCALL=1, NET_SOURCE_KPROBE=2
    __u16 dest_port;     // host byte order
    __u32 sockfd;
    __u8  dest_addr[16]; // IPv4 uses lower 4 bytes
};

The kprobe arms (tcp_v4_connect, tcp_v6_connect) attach to the kernel functions that both syscall-path and io_uring-path TCP connects pass through. The source byte distinguishes the two origins:

  • NET_SOURCE_SYSCALL (1) → sys_enter_connect tracepoint
  • NET_SOURCE_KPROBE (2) → kprobe on tcp_v{4,6}_connect

Userspace dedups: if a kprobe event arrives within 100ms of a matching (pid, family, ip, port) syscall event, the kprobe is dropped. io_uring connects (no preceding syscall) fire only the kprobe — they survive the dedup.

port=0 is filtered in the kernel, which drops glibc getaddrinfo source-address-selection probes.
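The userspace dedup rule can be sketched as a small table of recent syscall-sourced connects. This is an illustrative model rather than the runner's Go implementation: the 64-slot ring, the conn_key layout, and the function names are assumptions, while the 100 ms window and the (pid, family, ip, port) key come from the description above.

```c
#include <stdint.h>
#include <string.h>

#define NET_SOURCE_SYSCALL 1
#define NET_SOURCE_KPROBE  2
#define DEDUP_WINDOW_NS    (100ull * 1000 * 1000)   /* 100 ms */
#define RECENT_SLOTS       64                       /* assumed capacity */

struct conn_key {
    uint32_t pid;
    uint8_t  family;
    uint16_t port;
    uint8_t  addr[16];
};

struct recent_conn { struct conn_key key; uint64_t ts_ns; int used; };
struct net_dedup   { struct recent_conn slots[RECENT_SLOTS]; unsigned next; };

/* Field-wise comparison; memcmp on the struct would also compare padding. */
int conn_key_equal(const struct conn_key *a, const struct conn_key *b)
{
    return a->pid == b->pid && a->family == b->family && a->port == b->port &&
           memcmp(a->addr, b->addr, sizeof a->addr) == 0;
}

/* Returns 1 if the event should be emitted, 0 if it is a kprobe duplicate. */
int net_dedup_accept(struct net_dedup *d, const struct conn_key *k,
                     uint8_t source, uint64_t ts_ns)
{
    if (source == NET_SOURCE_SYSCALL) {
        /* Remember syscall events so a following kprobe can be matched. */
        struct recent_conn *r = &d->slots[d->next++ % RECENT_SLOTS];
        r->key = *k;
        r->ts_ns = ts_ns;
        r->used = 1;
        return 1;
    }

    for (unsigned i = 0; i < RECENT_SLOTS; i++) {
        const struct recent_conn *r = &d->slots[i];
        if (!r->used || !conn_key_equal(&r->key, k))
            continue;
        uint64_t delta = ts_ns >= r->ts_ns ? ts_ns - r->ts_ns : r->ts_ns - ts_ns;
        if (delta <= DEDUP_WINDOW_NS)
            return 0;                 /* same connect already seen via the tracepoint */
    }
    return 1;                         /* no preceding syscall: likely io_uring */
}
```

A zero-initialised struct net_dedup is a valid empty state. Syscall events are always emitted; only kprobe events are candidates for dropping.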

DNS query

struct dns_query_event {
    fangs_event_header h;
    __u8  family;
    __u8  _pad[1];
    __u16 dest_port;     // always 53 by filter
    __u16 query_len;
    __u8  _pad2[2];
    __u8  dest_addr[16];
    __u8  query[200];    // raw bytes; userspace parses question section
};

DNS capture spans three syscalls because clients send queries through different paths:

| Caller | Syscall |
| --- | --- |
| Classic sendto(fd, buf, len, 0, addr, addrlen) | sys_enter_sendto |
| connect(fd, addr); send(fd, buf, len) (glibc default since ~2.30) | sys_enter_sendto with NULL addr → walks task->files->fdt->fd[sockfd]->private_data->sk to read skc_dport and skc_daddr |
| sendmmsg(fd, mmsghdr_vec, vlen) (curl, glibc batched A+AAAA) | sys_enter_sendmmsg; emits one event per vec entry, up to 2 |
| sendmsg(fd, msghdr) (rare custom resolvers) | sys_enter_sendmsg |

All share the dns_dest resolution path. The userspace DNS-question parser (internal/runner/sensor/parsing.go) walks the length-prefixed labels of the name in the raw payload.
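The actual parser is Go code in parsing.go; as an illustration of the same idea, the C sketch below walks the length-prefixed labels of the first question in a captured query. It assumes the payload starts at the 12-byte DNS header (UDP framing) and rejects any length byte above 63, since compression pointers are not expected in a query's first question. dns_parse_qname is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

#define DNS_HEADER_LEN 12

/* Decode the first question name into dotted form.
 * Returns the name length (excluding NUL), or -1 on malformed or truncated input. */
int dns_parse_qname(const uint8_t *pkt, size_t pkt_len, char *out, size_t out_sz)
{
    size_t pos = DNS_HEADER_LEN;
    size_t used = 0;

    if (pkt_len < DNS_HEADER_LEN + 1 || out_sz == 0)
        return -1;

    for (;;) {
        if (pos >= pkt_len)
            return -1;                        /* ran off the captured bytes */
        uint8_t len = pkt[pos++];
        if (len == 0)
            break;                            /* zero-length root label ends the name */
        if (len > 63)
            return -1;                        /* pointer or reserved label type */
        if (pos + len > pkt_len)
            return -1;                        /* label extends past the capture */
        if (used + len + 2 > out_sz)
            return -1;                        /* room for '.', the label, and NUL */
        if (used > 0)
            out[used++] = '.';
        for (uint8_t i = 0; i < len; i++)
            out[used++] = (char)pkt[pos + i];
        pos += len;
    }
    out[used] = '\0';
    return (int)used;
}
```

Because query[] is capped at 200 bytes, a name cut off mid-label returns -1 here rather than a partial hostname; a consumer may prefer to surface the partial name with a truncation flag.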

TLS SNI

struct tls_sni_event {
    fangs_event_header h;
    __u8  source;          // 1=libssl, 2=node-internal (future), 3=tcp_clienthello
    __u8  _pad[1];
    __u16 sni_len;
    __u16 raw_payload_len;
    __u8  _pad2[2];
    char  sni[256];        // populated when source=libssl
    __u8  raw_payload[512]; // populated when source=tcp_clienthello
};

Three capture mechanisms:

| Source | How | Coverage |
| --- | --- | --- |
| libssl (1) | uprobe on SSL_ctrl in libssl.so when called with cmd=55 (SSL_CTRL_SET_TLSEXT_HOSTNAME) | Catches every OpenSSL-based client (curl, Python requests, most Node TLS libs) |
| node-internal (2) | future uprobe on Node-internal TLS; currently DEFERRED-REDUNDANT because mechanism 3 covers it | |
| tcp_clienthello (3) | sys_enter_write + sys_enter_sendto detect the TLS record-header signature \x16\x03\x01..\x01 and capture raw bytes; userspace parses the SNI extension | Catches statically-linked Node (Alpine), Go binaries, anything bypassing libssl |

Userspace dedup: 5-second window keyed on (pid, sni). When two mechanisms report the same SNI, the second one carries DuplicateOf="<first-source>" for visibility but is still emitted.
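For the tcp_clienthello source, userspace has to dig the hostname out of raw_payload. The sketch below shows one way to do that in C, following the standard TLS ClientHello layout (record header, handshake header, version, random, session ID, cipher suites, compression methods, extensions). It is not the runner's parser; the function names are hypothetical, and it only reads the first entry of the server_name list.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a big-endian 16-bit value. */
uint16_t tls_rd16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Signature from the table above: \x16\x03\x01 .. \x01 (handshake record, ClientHello). */
int tls_is_client_hello(const uint8_t *b, size_t n)
{
    return n >= 6 && b[0] == 0x16 && b[1] == 0x03 && b[2] == 0x01 && b[5] == 0x01;
}

/* Copy the SNI host name into out. Returns its length, or -1 if absent or malformed. */
int tls_extract_sni(const uint8_t *b, size_t n, char *out, size_t out_sz)
{
    if (!tls_is_client_hello(b, n) || out_sz == 0)
        return -1;

    size_t pos = 5 + 4 + 2 + 32;          /* record hdr, handshake hdr, version, random */
    if (pos + 1 > n) return -1;
    pos += 1 + (size_t)b[pos];            /* session_id */
    if (pos + 2 > n) return -1;
    pos += 2 + (size_t)tls_rd16(b + pos); /* cipher_suites */
    if (pos + 1 > n) return -1;
    pos += 1 + (size_t)b[pos];            /* compression_methods */
    if (pos + 2 > n) return -1;

    size_t ext_end = pos + 2 + (size_t)tls_rd16(b + pos);
    pos += 2;
    if (ext_end > n)
        ext_end = n;                      /* capture may be cut at the buffer size */

    while (pos + 4 <= ext_end) {
        uint16_t type = tls_rd16(b + pos);
        uint16_t len  = tls_rd16(b + pos + 2);
        pos += 4;
        if (pos + len > ext_end)
            return -1;
        if (type == 0x0000) {             /* server_name extension */
            if (len < 5 || b[pos + 2] != 0x00)
                return -1;                /* expect name_type host_name (0) */
            uint16_t name_len = tls_rd16(b + pos + 3);
            if (5u + name_len > len || (size_t)name_len + 1 > out_sz)
                return -1;
            memcpy(out, b + pos + 5, name_len);
            out[name_len] = '\0';
            return (int)name_len;
        }
        pos += len;
    }
    return -1;
}
```

Clamping ext_end to the captured length lets the parser still find an SNI extension that sits early in a ClientHello larger than the 512-byte raw_payload buffer.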

Drop counter

drops_counter is a per-CPU u64 array (single key=0) bumped whenever bpf_ringbuf_reserve() returns NULL — meaning the ringbuf was full and the event was dropped at probe time.

On shutdown, userspace reads the per-CPU values, sums them, and reports the total as part of ScanResult. The orchestrator stores events_dropped per run, and the Prometheus counter fangs_events_dropped_total accumulates lifetime drops.

A non-zero drop rate indicates that the ringbuf is undersized for the workload. Tuning options:

  1. Reduce sandbox concurrency on the runner.
  2. Sandbox commands that do less I/O.
  3. Bump the ringbuf size in bpf/sensor.bpf.c and rebuild (currently 64 MB; the size must stay a power of two, and a doubling halves the overflow rate for most workloads).
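Reading a per-CPU array from userspace yields one u64 per possible CPU, so the reported figure is a plain sum. The sketch below shows that reduction plus a drop ratio that can help judge whether tuning is needed; both function names are hypothetical, and the delivered-event count is assumed to come from the reader's own bookkeeping.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum the per-CPU slots of drops_counter (key 0) into one total. */
uint64_t sum_percpu_u64(const uint64_t *per_cpu, size_t ncpus)
{
    uint64_t total = 0;

    for (size_t i = 0; i < ncpus; i++)
        total += per_cpu[i];
    return total;
}

/* Fraction of events lost to ringbuf overflow; 0.0 when nothing was produced. */
double drop_ratio(uint64_t dropped, uint64_t delivered)
{
    uint64_t produced = dropped + delivered;

    if (produced == 0)
        return 0.0;
    return (double)dropped / (double)produced;
}
```

Tracking the ratio rather than the raw count makes runs with very different event volumes comparable.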

Known gaps

| Gap | What's not observed |
| --- | --- |
| openat2 syscall | Modern glibc may use it; we only hook openat. Most npm-install workloads still hit the legacy openat path because Node + npm don't use the new flags |
| io_uring file I/O | Read/write via io_uring sqe queue bypasses every read/write tracepoint. Rare in current npm packages |
| mmap-with-prot-write + dirty pages | Memory-mapped writes don't surface as syscalls; we'd miss file modification through that path |
| DNS-over-HTTPS to already-baselined hosts | Cloudflare's 1.1.1.1 via cloudflare-dns.com is invisible because the DoH-tunneled query is encrypted inside the existing baselined SNI |
| Raw-socket netlink → uevent IPC | Edge case; not observed |

Each gap is a future probe addition. The delta-vs-baseline approach means a new probe retroactively catches new behaviors as soon as fingerprints from the new event type start landing in the runs table.

Kernel version targeting

The probes are built against a vmlinux.h generated from the build host's BTF and compiled with -target bpf -O2 -g. Because they use CO-RE, they adapt to the runtime kernel's BTF without recompilation, provided the structures they read (task_struct, files_struct, struct sock, etc.) still expose the fields our BPF_CORE_READ relocations reference.

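Minimum runtime kernel: 4.18 for tracepoint + BTF + ringbuf. Note that upstream kernels gained BPF_MAP_TYPE_RINGBUF only in 5.8, so a 4.18 kernel needs vendor backports of these features. 5.5+ recommended for the connected-UDP sock-fd walk path used in DNS capture.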
