Skip to content

Refactor ruleset discovery and locking#526

Open
qdeslandes wants to merge 5 commits intofacebook:mainfrom
qdeslandes:locking
Open

Refactor ruleset discovery and locking#526
qdeslandes wants to merge 5 commits intofacebook:mainfrom
qdeslandes:locking

Conversation

@qdeslandes
Copy link
Copy Markdown
Contributor

bf_ctx_setup() currently initializes a global context with a file descriptor to the bpfilter pin directory, locks it (exclusive lock), and restore all the existing chains.

This behaviour is sub-optimal, as it prevents any concurrent bfcli call (or library call) to read the ruleset or modify it. This PR aims to improve this behaviour by:

  • Restoring only the necessary BPF objects: bf_ctx_setup() won't perform any discovery. Each bf_chain* call will restore the require chain, and bf_ruleset* calls will restore the ruleset when needed.
  • Improve locking granularity by allowing the ruleset and/or specific chains to be locked in read or write mode.

Full details are available in the commits and documentation.

@meta-cla meta-cla Bot added the cla signed label May 5, 2026
qdeslandes added 3 commits May 6, 2026 12:32
bf_ctx_setup() walked the pindir at startup and rehydrated every chain
into an in-memory bf_list. Persistence to bpffs already runs on every
mutating cgen operation, so the cache only added a fixed discovery cost on
each daemon start plus a second source of truth to keep in sync.

Drop the cache. bf_ctx_get_cgens() now discovers and allocates a fresh
list per call; bf_ctx_get_cgen() opens a single chain dir and
deserialises just that chain. Both forward errors via int return +
out-param so callers can distinguish "not present" from real I/O
failures, and the returned objects are heap-allocated with unambiguous
ownership (cleaned up via _free_bf_cgen_ / _free_bf_list_).

bf_ctx_set_cgen, bf_ctx_delete_cgen, and the cgens field on struct
bf_ctx are no longer needed. bf_ctx_flush() is reimplemented as
discover-then-unload.
@qdeslandes qdeslandes marked this pull request as ready for review May 6, 2026 14:15
@qdeslandes
Copy link
Copy Markdown
Contributor Author

@claude review

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 6, 2026

Claude review of PR #526 (1177dab)

Must fix

  • errno clobbered by unlinkatsrc/libbpfilter/core/lock.c:161unlinkat on line 159 can overwrite errno before bf_err_r(errno, ...) reads it, causing wrong error to be reported/returned
  • Misaligned multi-line commentsrc/libbpfilter/core/lock.c:195 — continuation asterisks use 8-space indent instead of aligning with opening /*
  • assert(link) removed from bf_link_unpinsrc/libbpfilter/cgen/prog/link.c:338 — precondition assert was removed but parameter and doc contract remain; NULL link won't be caught in debug builds

Suggestions

  • unlinkat when create=falsesrc/libbpfilter/core/lock.c:160 — could remove a pre-existing chain directory on openat/flock failure when the caller did not request creation
  • Self-include convention (dismissed) — "core/lock.h" is correct for private headers not in include/bpfilter/
  • Include bracket convention (dismissed) — quotes for bpfilter/ headers is the established project convention
  • bf_lock_cleanup removes pindir unconditionallysrc/libbpfilter/core/lock.c:114unlinkat with AT_REMOVEDIR is safe (only removes empty dirs) but intent could be clearer
  • CMakeLists .h and .c on same linesrc/libbpfilter/CMakeLists.txt:52 — inconsistent with rest of file where each entry is on its own line
  • *bft_teardown_ctx doesn't null statetests/harness/test.c:339 — documented postcondition says *state is NULL but implementation only nulls a local copy

Nits

  • Missing space after /// (dismissed) — all /// comments have proper spacing
  • Doc typo in bf_lock_defaultsrc/libbpfilter/core/lock.h:118 — references _clean_bf_lock (missing trailing underscore)
  • Unused includesrc/libbpfilter/core/lock.c:8<dirent.h> is not used by any symbol in this file
  • Typo in test.htests/harness/test.h:193 — "initalized" should be "initialized"

Workflow run

lock->chain_fd = openat(lock->pindir_fd, _name, O_DIRECTORY);
if (lock->chain_fd < 0) {
(void)unlinkat(lock->pindir_fd, _name, AT_REMOVEDIR);
return bf_err_r(
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: must-fix: After openat fails on line 157, errno holds the real error. The unlinkat call on line 159 can overwrite errno before it is read in bf_err_r(errno, ...) on line 160. Save errno immediately:

lock->chain_fd = openat(lock->pindir_fd, _name, O_DIRECTORY);
if (lock->chain_fd < 0) {
    int err = errno;
    (void)unlinkat(lock->pindir_fd, _name, AT_REMOVEDIR);
    return bf_err_r(-err, "...", _name);
}

Note: the same pattern applies to the flock-failure path below (line 170), but there the error comes from _bf_flock's return value r, not errno, so it's safe.

}

lock->chain_fd = openat(lock->pindir_fd, _name, O_DIRECTORY);
if (lock->chain_fd < 0) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: suggestion: When create=false, the directory was not created by this call. On openat failure (or flock failure at line 170), unlinkat unconditionally attempts to remove the pre-existing chain directory. If another process just created the directory but hasn't pinned objects yet (empty dir), this could remove it.

Guard both unlinkat calls with if (create):

if (create)
    (void)unlinkat(lock->pindir_fd, _name, AT_REMOVEDIR);

if (lock->chain_fd < 0)
return;

/* Only WRITE locks will be used to create or modify a chain, meaning
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: must-fix: Multi-line comment continuation asterisks are misaligned (8-space indent instead of aligned with the opening /*). Per doc/developers/style.rst: "Multi-line comment with aligned asterisks".

    /* Only WRITE locks will be used to create or modify a chain, meaning
     * if a READ or NONE lock was held on a chain directory, that directory
     * hasn't been modified during the lifetime of the lock. So there is no
     * need to attempt to remove it. */

Comment thread src/libbpfilter/core/lock.h Outdated
/// `BF_LOCK_READ` requests a shared lock (multiple readers allowed).
BF_LOCK_READ,

///`BF_LOCK_WRITE` requests an exclusive lock (single writer, no readers).
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: nit: Missing space between /// and the backtick:

    ///`BF_LOCK_WRITE` requests an exclusive lock ...

Should be:

    /// `BF_LOCK_WRITE` requests an exclusive lock ...

(Lines 76 and 80 have the space correctly.)

Comment thread src/libbpfilter/core/lock.h Outdated
/**
* @brief Assign sane defaults to a `bf_lock` object.
*
* This macros should always be used for a `bf_lock` object with the
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: nit: Two issues in this doc comment:

  1. "This macros should" -> "This macro should"
  2. References `_clean_bf_lock` (no trailing underscore) but the actual attribute macro on line 132 is _clean_bf_lock_ (with trailing underscore).

Comment thread src/libbpfilter/core/lock.c Outdated
* Copyright (c) Meta Platforms, Inc. and affiliates.
*/

#include "lock.h"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: suggestion: Self-include uses "lock.h" but the established convention is the path relative to src/libbpfilter/. Every other .c file follows this pattern (e.g., cgen/cgen.c includes "cgen/cgen.h", core/hashset.c includes "bpfilter/core/hashset.h"). Should be "core/lock.h".

Comment thread src/libbpfilter/core/lock.c Outdated
#include <sys/stat.h>
#include <unistd.h>

#include "bpfilter/ctx.h"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: suggestion: Public library headers are included with quotes ("bpfilter/ctx.h", "bpfilter/helper.h", etc.) but the dominant convention uses angle brackets for these. Compare with ctx.c, cgen/cgen.c, cgen/handle.c which all use <bpfilter/ctx.h>, <bpfilter/helper.h>, <bpfilter/io.h>, <bpfilter/logger.h>. These should use angle brackets.

${CMAKE_CURRENT_SOURCE_DIR}/io.c
${CMAKE_CURRENT_SOURCE_DIR}/core/hashset.c
${CMAKE_CURRENT_SOURCE_DIR}/core/list.c
${CMAKE_CURRENT_SOURCE_DIR}/core/lock.h ${CMAKE_CURRENT_SOURCE_DIR}/core/lock.c
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: nit: The .h and .c for core/lock are on the same line with wide whitespace padding. The rest of the file lists one path per line. Consider splitting onto separate lines (or listing only the .c, since private headers are not typically registered as sources here).


bf_lock_release_chain(lock);

(void)unlinkat(lock->bpffs_fd, "bpfilter", AT_REMOVEDIR);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: suggestion: bf_lock_cleanup unconditionally tries unlinkat(lock->bpffs_fd, "bpfilter", AT_REMOVEDIR) regardless of which lock mode was held on pindir. If a BF_LOCK_READ holder cleans up while another process concurrently holds a different pindir_fd (same inode), and the dir happens to be empty, the removal succeeds. The next bf_lock_init creates a new bpfilter directory -- the two processes now hold flocks on different inodes, breaking mutual exclusion.

Consider limiting this removal to BF_LOCK_WRITE holders (mirroring the per-chain logic in bf_lock_release_chain):

if (lock->pindir_lock == BF_LOCK_WRITE)
    (void)unlinkat(lock->bpffs_fd, "bpfilter", AT_REMOVEDIR);

Comment thread src/libbpfilter/cgen/prog/link.c Outdated
{
int r;

assert(link);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: nit: The (void)link; statement suppresses an unused-parameter warning, but the link parameter is never dereferenced in this function (only hardcoded names "bf_link" / "bf_link_extra" are used). Since this PR refactors the signature, this would be a good time to either use link->name for the unlink target (if applicable), or document why link exists in the signature despite being unused.

qdeslandes added 2 commits May 6, 2026 16:50
Add `bf_lock`, a primitive that bundles open file descriptors for
`$BPFFS`, `$BPFFS/bpfilter` and an optional chain directory under it,
and applies the locking matrix described in `core/lock.h` to keep
ruleset-wide and per-chain operations from stepping on each other.

The accompanying unit tests are wired up but disabled in
`tests/unit/CMakeLists.txt`; they will be enabled once the global
pindir `flock` taken in `bf_ctx_setup` is replaced by `bf_lock`,
since the two layers currently fight over the same file lock.
Replace raw dir_fd parameters with struct bf_lock so pinned-object
operations honor the pindir/chain locking matrix end-to-end.
bf_ctx_flush is folded into a per-chain helper that acquires a WRITE
lock around each unload, and ruleset-level operations now take per-chain
locks while iterating.
@qdeslandes
Copy link
Copy Markdown
Contributor Author

@claude review

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant