Skip to content

lib: cgen: group same-key maps using bitmasks#520

Draft
pzmarzly wants to merge 4 commits intofacebook:mainfrom
pzmarzly:push-ytzrrosklspt
Draft

lib: cgen: group same-key maps using bitmasks#520
pzmarzly wants to merge 4 commits intofacebook:mainfrom
pzmarzly:push-ytzrrosklspt

Conversation

@pzmarzly
Copy link
Copy Markdown
Contributor

@pzmarzly pzmarzly commented Apr 21, 2026

Closes #381.

We're grouping all sets in a chain by key format. Then, sets of the same format will turn into a single BPF map, with bitmasks in its values that indicate which sets the element appears in. With the 64 maps per program limit, we should now be able to have 64 unique key formats per program, which is much less likely to hit than just 64 sets.

@pzmarzly pzmarzly requested a review from qdeslandes as a code owner April 21, 2026 15:00
@meta-cla meta-cla Bot added the cla signed label Apr 21, 2026
Comment on lines +66 to +92
/**
* @brief Emit bitmask check instructions after a map lookup.
*
* Set map values are bitmasks, sized to fit one bit per set in the
* group. After map_lookup_elem returns a non-NULL pointer in BPF_REG_0,
* this function emits instructions to:
* 1. Load the byte containing this set's bit from the pointer.
* 2. AND it with the set's bit.
* 3. Jump to the next rule if the bit is not set.
*/
static int _bf_emit_bitmask_check(struct bf_program *program, size_t bit_index)
{
assert(program);

/* r0 points to the map value. Load the byte containing our bit. */
EMIT(program, BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_0,
(int16_t)(bit_index / CHAR_BIT)));

/* AND with the set's bit within that byte. */
EMIT(program,
BPF_ALU64_IMM(BPF_AND, BPF_REG_0, 1U << (bit_index % CHAR_BIT)));

/* Jump to next rule if the result is 0 (bit not set). */
EMIT_FIXUP_JMP_NEXT_RULE(program, BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 0));

return 0;
}
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really know BPF bytecode, this function is vibecoded. Can you check if it looks right?

@pzmarzly
Copy link
Copy Markdown
Contributor Author

@claude review

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 21, 2026

Claude review of PR #520 (7f30d9d)

Must fix

  • assert after dereferencesrc/libbpfilter/cgen/program.c:693 — fixed in v2: assert is now before any dereference
  • missing NULL check on bf_vector_getsrc/libbpfilter/cgen/program.c:868 — no longer applicable: code was restructured
  • missing assert in cleanup functionsrc/libbpfilter/cgen/program.c:103_bf_set_group_free is missing assert(group) at the top, unlike every other cleanup function in the codebase
  • LPM trie read-modify-write correctnesssrc/libbpfilter/cgen/program.c:918 — for LPM trie maps, bf_bpf_map_lookup_elem does longest-prefix match, not exact match; the OR-based bitmask merge can propagate bits from shorter prefixes to longer ones, causing false positives

Suggestions

  • BPF_ALU64_IMM to BPF_ALU32_IMMsrc/libbpfilter/cgen/matcher/set.c:85 — fixed in v2: now uses BPF_ALU32_IMM
  • duplicated group computationsrc/libbpfilter/cgen/matcher/set.c:35 — fixed in v2: grouping now centralized in _bf_program_build_set_groups
  • consolidate parallel arrayssrc/libbpfilter/cgen/program.c:700 — no longer applicable: restructured to use bf_set_group
  • NULL set name in format stringssrc/libbpfilter/cgen/matcher/set.c:72set->name can be NULL for anonymous sets; passing it to %s is UB. Also in fixup.c:79 and program.c:430
  • total_elems overestimatesrc/libbpfilter/cgen/program.c:887total_elems over-counts when keys overlap across sets; safe but worth a comment
  • group->map assignment orderingsrc/libbpfilter/cgen/program.c:905 — non-owning alias stored before ownership transfer; fragile against future early returns
  • e2e isolation test missing exit checktests/e2e/parsing/set_group.sh:34 — ping exit code not verified

Nits

  • unnecessary calloc castsrc/libbpfilter/cgen/program.c:721 — no longer applicable
  • %lu for size_tsrc/libbpfilter/cgen/program.c:857 — pre-existing pattern throughout codebase; not a regression
  • unnecessary bf_list_ops_free castsrc/libbpfilter/cgen/program.c:157 — cast is redundant; other bf_list_default calls pass free callbacks without casting
  • set_size param namingsrc/libbpfilter/cgen/prog/map.h:74set_size could be misread as "number of sets"; consider n_elems

Workflow run

Comment thread src/libbpfilter/cgen/program.c
Comment thread src/libbpfilter/cgen/program.c Outdated
Comment thread src/libbpfilter/cgen/matcher/set.c
Comment thread src/libbpfilter/cgen/matcher/set.c Outdated
Comment thread src/libbpfilter/cgen/program.c Outdated
Comment thread src/libbpfilter/cgen/program.c Outdated
Comment thread src/libbpfilter/cgen/program.c Outdated
Comment thread src/libbpfilter/cgen/prog/map.h
Copy link
Copy Markdown
Contributor

@qdeslandes qdeslandes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this, I think's it's a good first implementation, but there are a few things that could be improved to simplify it.

When grouping sets, using a dedicated structure would be helpful:

bf_set_group {
    key;
    sets;
};

When looping over sets, you can ignore the empty ones, then for non-empty ones find if a set group exists already and add it there. You don't necessary need a "leader".

Fixups can hold a pointer to a set directly, it will be easier to resolve them, instead of relying on a set index. This is possible as the fixup (and bf_program) aren't serialized, they're only temporary objects (which was not the case before).

Comment thread src/libbpfilter/cgen/matcher/set.c Outdated
@pzmarzly pzmarzly force-pushed the push-ytzrrosklspt branch 3 times, most recently from f15cbd1 to a49eca0 Compare April 22, 2026 15:03
@pzmarzly pzmarzly force-pushed the push-ytzrrosklspt branch from a49eca0 to 7f30d9d Compare April 22, 2026 15:06
@pzmarzly
Copy link
Copy Markdown
Contributor Author

Fixups can hold a pointer to a set directly, it will be easier to resolve them, instead of relying on a set index. This is possible as the fixup (and bf_program) aren't serialized, they're only temporary objects (which was not the case before).

That's amazing. Trying this.

@claude review


#define _free_bf_set_group_ __attribute__((__cleanup__(_bf_set_group_free)))

static void _bf_set_group_free(struct bf_set_group **group)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: must-fix: Missing assert(group) at the top of _bf_set_group_free. Every other cleanup function in the codebase (bf_fixup_free, bf_set_free, bf_map_free, _bf_btf_free, etc.) validates its double-pointer argument with assert before dereferencing. Without it, a NULL group pointer causes a silent crash rather than a diagnostic assertion failure.

static void _bf_set_group_free(struct bf_set_group **group)
{
    assert(group);

    if (!*group)
        return;

if (r)
return bf_err_r(r, "failed to add set elements to the map");
memset(value, 0, value_size);
(void)bf_bpf_map_lookup_elem(new_map->fd, elem, value);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: must-fix: For LPM trie maps, bf_bpf_map_lookup_elem performs longest-prefix matching, not exact-key matching. When inserting an element from set B (e.g. 192.168.1.0/24), the lookup may return the value for a shorter prefix from set A (e.g. 192.168.0.0/16) that was already inserted. The subsequent OR would incorrectly propagate set A's bit into set B's entry.

This means a packet matching 192.168.1.0/24 would appear to belong to both set A and set B, producing false positives. Consider either:

  • Skipping the lookup for LPM trie groups and always writing only the current set's bit, or
  • Preventing LPM trie sets from being grouped together when their prefixes can overlap.

if (r)
return bf_err_r(r, "set '%s' not assigned to any group", set->name);

r = bf_stub_rule_check_protocol(program, meta);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: suggestion: set->name can be NULL for anonymous sets (documented in set.h:72). Passing NULL to %s in bf_err_r is undefined behavior. Use set->name ?: "<anonymous>" to match the convention in bf_set_dump (set.c:364). Same issue at line 143.

case BF_FIXUP_TYPE_SET_MAP_FD:
DUMP(prefix, "set_index: %lu", fixup->attr.set_index);
DUMP(prefix, "set: %s",
fixup->attr.set_ptr ? fixup->attr.set_ptr->name : "(null)");
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: suggestion: fixup->attr.set_ptr->name can be NULL for anonymous sets. The ternary guards against set_ptr being NULL but not name being NULL. Use fixup->attr.set_ptr->name ?: "<anonymous>" to match the convention in bf_set_dump (set.c:364).

return bf_err_r(
-ENOENT, "set map fixup: set '%s' not in any loaded group",
fixup->attr.set_ptr ? fixup->attr.set_ptr->name : "(null)");
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: suggestion: Same NULL name issue: fixup->attr.set_ptr->name can be NULL for anonymous sets. Use fixup->attr.set_ptr->name ?: "<anonymous>" in the non-NULL branch.

bf_list_node_get_data(bf_list_get_head(&group->sets));
size_t n_sets = bf_list_size(&group->sets);
size_t value_size = (n_sets + CHAR_BIT - 1) / CHAR_BIT;
size_t total_elems = 0;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: suggestion: total_elems sums element counts across all sets in the group, but duplicate keys across sets are merged in the map (via the lookup-then-update pattern). The BPF map's max_entries is set to total_elems, which over-provisions when keys overlap. This is safe (kernel accepts over-sized maps) but worth a brief comment noting it is an upper bound, since it is not immediately obvious.

values = malloc(nelems);
if (!values)
return bf_err_r(-errno, "failed to allocate map values");
group->map = new_map;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: suggestion: group->map = new_map stores a non-owning alias before bf_list_push on line 928 transfers ownership (NULLs new_map). If a future change inserts an early return between these lines, _free_bf_map_ would destroy the map while group->map still points to it. Consider moving the group->map assignment to after bf_list_push succeeds:

struct bf_map *map_ref = new_map;
r = bf_list_push(&new_prog->handle->sets, (void **)&new_map);
if (r)
    return r;
group->map = map_ref;

set a (ip4.saddr) in { 192.0.2.40 }
set b (ip4.saddr) in { ${HOST_IP_ADDR} }
rule (ip4.saddr) in a counter DROP"
ping -c 1 -W 1 ${NS_IP_ADDR}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: suggestion: The isolation test pings ${NS_IP_ADDR} but does not check the exit code. If ping fails for an unrelated reason, the test passes silently. Consider:

ping -c 1 -W 1 ${NS_IP_ADDR} || { echo "ERROR: ping should have succeeded"; exit 1; }

assert(program);

bf_list_clean(&program->set_groups);
program->set_groups =
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: nit: The explicit cast (bf_list_ops_free)_bf_set_group_free is unnecessary. The rest of the codebase passes free callbacks without casting (e.g., bf_list_default(bf_fixup_free, NULL) on line 116, bf_list_default(bf_set_free, bf_set_pack) in chain.c). Same at line 260.

* also as filename when pinning the map to the system. Can't be NULL or
* empty.
* @param set Set to create the map from. Can't be NULL.
* @param set_size Number of elements to reserve room for in the map.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: nit: The set_size parameter name could be misread as "number of sets" rather than "number of elements to reserve". Consider n_elems to match the naming convention in the underlying _bf_map_new call.

@pzmarzly pzmarzly marked this pull request as draft April 22, 2026 15:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Merge maps filtering on sets with identical keys

2 participants