
[bpf-ci-bot] Flaky test: task_local_storage/sys_enter_exit #448

@kernel-patches-review-bot

Summary

The task_local_storage/sys_enter_exit BPF selftest is flaky. It fails
intermittently across multiple independent patch series and on the
baseline for-next_test branch, producing spurious failures in BPF CI
(kernel-patches/bpf).

Failure Pattern

When the test fails, the values are always the same:

test_sys_enter_exit:FAIL:enter_cnt unexpected enter_cnt: actual 2 != expected 3
test_sys_enter_exit:PASS:exit_cnt 0 nsec
test_sys_enter_exit:FAIL:mismatch_cnt unexpected mismatch_cnt: actual 3 != expected 0

That is: enter_cnt=2, exit_cnt=3, mismatch_cnt=3. (The exit_cnt check
passes, so exit_cnt equals its expected value of 3.)

Affected CI Runs (recent examples)

  • for-next_test run 22314858282 (Feb 23): test_progs on x86_64 gcc-15
  • series/1056598=>bpf-next run 22315036624 (Feb 23): test_progs_no_alu32
  • series/1055783=>bpf-next run 22314940225 (Feb 23): test_progs_no_alu32
  • to-test run 22246444389 (Feb 21): test_progs_cpuv4 on x86_64 llvm-21

The test passes on other runs with the same code (e.g., run 22326871125),
which indicates a flaky failure rather than a deterministic regression.

Root Cause Analysis

The test in prog_tests/task_local_storage.c sets target_pid before
calling task_local_storage__attach():

skel->bss->target_pid = sys_gettid();    // set pid before attach

err = task_local_storage__attach(skel);   // attach BPF programs

sys_gettid();
sys_gettid();

/* 3x syscalls: 1x attach and 2x gettid */
ASSERT_EQ(skel->bss->enter_cnt, 3, "enter_cnt");

The test assumes that exactly 1 syscall during the attach() call will
be observed by the BPF programs. Specifically, it expects the
bpf(BPF_LINK_CREATE) syscall that attaches on_exit (the second
program) to be caught by the already-attached on_enter program.

The skeleton attaches programs in ELF order: on_enter (tp_btf/sys_enter)
first, then on_exit (tp_btf/sys_exit). When on_enter is attached via
bpf(BPF_LINK_CREATE), it registers with the sys_enter tracepoint.
The subsequent bpf(BPF_LINK_CREATE) for on_exit triggers sys_enter,
which should fire on_enter.

However, this relies on the tracepoint registration from the first
BPF_LINK_CREATE being immediately visible to the tracepoint dispatch
code during the second BPF_LINK_CREATE. The tracepoint callback array is
published via rcu_assign_pointer() and read via rcu_dereference_raw()
(essentially READ_ONCE). While this should work on the same CPU due to
program ordering, the user thread can be migrated to another CPU between
the two syscalls. On that CPU, the new tracepoint callback may not yet be
visible if the RCU publication has not fully propagated, because
rcu_dereference_raw() does not provide acquire semantics.

When the on_enter callback is not visible during the second
BPF_LINK_CREATE:

  • sys_enter fires but on_enter does NOT run → no enter counted
  • on_exit is registered during the syscall
  • sys_exit fires and on_exit DOES run → exit_cnt=1, creating a
    fresh task storage entry with value 0

This off-by-one between the enter and exit counts carries through every
subsequent syscall: each later enter/exit pair stays out of step, so all
three exits register a mismatch. The result is the observed enter_cnt=2,
exit_cnt=3, mismatch_cnt=3 failure.
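
The cascade can be replayed with a small userspace model. This is only a
sketch, not the selftest's BPF code: it assumes that on_enter stamps the
task storage with a value derived from enter_cnt and that on_exit compares
the stored value against the same expression built from exit_cnt. That
behavior is inferred from the counters in the failure output. The names
MODEL_MAGIC, struct counters, model_enter(), model_exit() and replay() are
all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for whatever constant the BPF programs use. */
#define MODEL_MAGIC 1000L

struct counters {
	long enter_cnt;
	long exit_cnt;
	long mismatch_cnt;
	long storage;	/* models the per-task storage slot; 0 when freshly created */
};

/* Assumed on_enter behavior: count the entry and stamp the storage. */
static void model_enter(struct counters *c)
{
	c->enter_cnt++;
	c->storage = MODEL_MAGIC + c->enter_cnt;
}

/* Assumed on_exit behavior: count the exit and check the stamp. */
static void model_exit(struct counters *c)
{
	c->exit_cnt++;
	if (c->storage != MODEL_MAGIC + c->exit_cnt)
		c->mismatch_cnt++;
}

/*
 * Replay the test sequence: one attach syscall whose sys_enter may or may
 * not be observed, followed by gettid_calls fully observed syscalls.
 */
static struct counters replay(bool attach_enter_seen, int gettid_calls)
{
	struct counters c = {0};

	assert(gettid_calls >= 0);

	if (attach_enter_seen)
		model_enter(&c);
	model_exit(&c);		/* on_exit always sees the attach syscall return */

	for (int i = 0; i < gettid_calls; i++) {
		model_enter(&c);
		model_exit(&c);
	}
	return c;
}
```

Under these assumptions, replay(false, 2) ends with enter_cnt=2, exit_cnt=3,
mismatch_cnt=3, matching the failing runs, while replay(true, 2) ends with
3, 3, 0, matching what the test expects.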

Proposed Fix

Move the target_pid assignment to after the attach() call. This
way, no syscalls during the attach phase match the BPF program's PID
filter (since target_pid is still 0), eliminating the dependency on
how many syscalls attach() generates and their tracepoint visibility:

pid_t pid;

pid = sys_gettid();              // get PID (programs not attached yet)
err = task_local_storage__attach(skel);  // attach (target_pid == 0, nothing counted)
skel->bss->target_pid = pid;    // set PID (memory write, no syscall)

sys_gettid();                    // counted: enter_cnt=1, exit_cnt=1
sys_gettid();                    // counted: enter_cnt=2, exit_cnt=2

ASSERT_EQ(skel->bss->enter_cnt, 2, "enter_cnt");
ASSERT_EQ(skel->bss->exit_cnt, 2, "exit_cnt");
ASSERT_EQ(skel->bss->mismatch_cnt, 0, "mismatch_cnt");

This makes the test deterministic: only the two explicit sys_gettid()
calls are counted, with no dependence on the internal syscall behavior
of task_local_storage__attach().
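
As a rough check of that claim, the sketch below models only the PID filter.
It is a simplified, hypothetical model (struct pid_model, hook_enter(),
hook_exit() and run_order() are illustration names, not selftest code): it
assumes each hook counts a syscall only when the caller's PID equals
target_pid, and it lets the attach phase issue an arbitrary number of
syscalls, some of whose sys_enter events are missed.

```c
#include <assert.h>
#include <stdbool.h>

struct pid_model {
	int target_pid;	/* 0 until the test sets it, as in skel->bss */
	int enter_cnt;
	int exit_cnt;
};

/* Assumed filter: only syscalls from target_pid are counted. */
static void hook_enter(struct pid_model *m, int pid)
{
	if (pid == m->target_pid)
		m->enter_cnt++;
}

static void hook_exit(struct pid_model *m, int pid)
{
	if (pid == m->target_pid)
		m->exit_cnt++;
}

/*
 * pid_before_attach selects the current ordering (true) or the proposed
 * one (false). The first missed_enters attach syscalls lose their
 * sys_enter event; every sys_exit event is delivered.
 */
static struct pid_model run_order(int pid, bool pid_before_attach,
				  int attach_syscalls, int missed_enters,
				  int gettid_calls)
{
	struct pid_model m = {0};

	assert(pid > 0 && missed_enters <= attach_syscalls);

	if (pid_before_attach)
		m.target_pid = pid;

	for (int i = 0; i < attach_syscalls; i++) {
		if (i >= missed_enters)
			hook_enter(&m, pid);
		hook_exit(&m, pid);
	}

	m.target_pid = pid;	/* plain memory write, no syscall */

	for (int i = 0; i < gettid_calls; i++) {
		hook_enter(&m, pid);
		hook_exit(&m, pid);
	}
	return m;
}
```

In this model, run_order(1234, true, 1, 1, 2) reproduces the skewed
enter_cnt=2, exit_cnt=3, whereas run_order(1234, false, 5, 3, 2) yields
enter_cnt=2, exit_cnt=2 no matter how many attach-time syscalls occur or
how many of their sys_enter events are missed.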

A patch implementing this fix is attached as
0001-selftests-bpf-fix-flaky-task_local_storage-sys_enter_exit-test.patch.
