lib: add bf_vector and use it in cgen#487
Conversation
Claude review of PR #487 (dd1fe11)Well-structured PR that introduces Must fix
Suggestions
Nits
|
2bbe579 to
419e434
Compare
419e434 to
e77f462
Compare
e77f462 to
305dd19
Compare
305dd19 to
f5a94cd
Compare
f5a94cd to
e9d06a6
Compare
e9d06a6 to
4cd032f
Compare
703e225 to
a8759b5
Compare
a8759b5 to
c828488
Compare
c828488 to
309f17f
Compare
309f17f to
fba6566
Compare
fba6566 to
c37916b
Compare
c37916b to
975b503
Compare
975b503 to
dd1fe11
Compare
| (elem) && (elem) < (vec)->data + ((vec)->size * (vec)->elem_size); \ | ||
| (elem) = (elem) + (vec)->elem_size) | ||
|
|
||
| /** |
There was a problem hiding this comment.
Claude: suggestion: bf_vector_foreach performs pointer arithmetic on void * ((elem) + (vec)->elem_size and (vec)->data + ...), which is a GCC extension not valid in standard C.
Since vector.h is a public header under include/bpfilter/core/, any downstream consumer using this macro with -pedantic -std=c17 will get a compile error. The pedantic integration test (tests/integration/pedantic_c.sh) includes all public headers with -pedantic-errors -std=c17 -Werror — the macro definition alone won't trigger an error (unexpanded macros aren't compiled), but any call site under those flags will fail.
For comparison, bf_list_foreach in list.h avoids this by iterating over typed pointers (bf_list_node *). Consider casting through char *:
| /** | |
| #define bf_vector_foreach(vec, elem) \ | |
| for (void *(elem) = (vec)->data; \ | |
| (elem) && (char *)(elem) < (char *)(vec)->data + ((vec)->size * (vec)->elem_size); \ | |
| (elem) = (void *)((char *)(elem) + (vec)->elem_size)) |
| _program->flavor = bf_hook_to_flavor(chain->hook); | ||
| _program->runtime.ops = bf_flavor_ops_get(_program->flavor); | ||
| _program->runtime.chain = chain; | ||
| _program->img = bf_vector_default(sizeof(struct bpf_insn)); |
There was a problem hiding this comment.
Claude: suggestion: The old _BF_PROGRAM_DEFAULT_IMG_SIZE was 64 (1 << 6); the new generic _BF_VECTOR_INIT_CAP is 8. Since BPF instruction images routinely contain hundreds of instructions, starting at capacity 8 means ~4 extra reallocations (8 -> 16 -> 32 -> 64 -> 128) per program generation compared to the old code.
Consider adding an initial reserve in bf_program_new after creating the vector:
_program->img = bf_vector_default(sizeof(struct bpf_insn));
bf_vector_reserve(&_program->img, 64);Alternatively, the caller-side reserve preserves the generic _BF_VECTOR_INIT_CAP for other use cases while optimizing the hot path.
There was a problem hiding this comment.
+1 with Claude, programs are at least 200 instructions, so we can reserve 512 elements.
| * @brief Iterate over every element of a `bf_vector`. | ||
| * | ||
| * `elem` is declared as a pointer to the element type and will point to each | ||
| * element in turn. Do not add or remove elements during iteration. |
There was a problem hiding this comment.
Do not add
This should not be an issue.
remove elements during iteration
Same here, as we don't shrink the memory buffer.
| * | ||
| * @param vec Initialised vector. Must be non-NULL. | ||
| * @param index Index of the element. | ||
| * @return Pointer to the element, or NULL if @p index is out of bounds. |
There was a problem hiding this comment.
@p index -> `index`
As a general rule, use backticks for parameters, variables, functions...
| void *bf_vector_get(const bf_vector *vec, size_t index); | ||
|
|
||
| /** | ||
| * @brief Overwrite the n-th element. |
There was a problem hiding this comment.
"Replace the n-th element" would be clearer.
| * @brief Take ownership of the backing buffer. | ||
| * | ||
| * Returns the raw data pointer and resets the vector so it will | ||
| * re-allocate on the next add. |
There was a problem hiding this comment.
"on the next bf_vector_add"
|
|
||
| assert(vec); | ||
|
|
||
| if (!vec->elem_size) |
There was a problem hiding this comment.
This seems off as we never check for vec->elem_size. Either we assume the user might modify vec->elem_size (because it has access to bf_vector structure), in which case we need to validate it in each function, or bf_vector should be moved into vector.c (I would suggest this).
| ctx->insn_idx); | ||
| } | ||
|
|
||
| size_t off = ctx->program->img.size - ctx->insn_idx - 1U; |
There was a problem hiding this comment.
This variable should be defined at the beginning of the scope.
| _program->flavor = bf_hook_to_flavor(chain->hook); | ||
| _program->runtime.ops = bf_flavor_ops_get(_program->flavor); | ||
| _program->runtime.chain = chain; | ||
| _program->img = bf_vector_default(sizeof(struct bpf_insn)); |
There was a problem hiding this comment.
+1 with Claude, programs are at least 200 instructions, so we can reserve 512 elements.
| int bf_program_emit(struct bf_program *program, struct bpf_insn insn) | ||
| { | ||
| int r; | ||
|
|
||
| assert(program); | ||
|
|
||
| if (program->img_size == program->img_cap) { | ||
| r = bf_program_grow_img(program); | ||
| if (r) | ||
| return r; | ||
| } | ||
|
|
||
| program->img[program->img_size++] = insn; | ||
|
|
||
| return 0; | ||
| return bf_vector_add(&program->img, &insn); | ||
| } |
There was a problem hiding this comment.
static inline in program.h.
| @@ -535,8 +519,8 @@ | |||
|
|
|||
| /* This call could fail and return an error, in which case it is not | |||
| * properly handled. However, this shouldn't be an issue as we previously | |||
| * test whether enough room is available in cgen.img, which is currently | |||
| * the only reason for EMIT() to fail. */ | |||
| * reserved enough room in program->img, which is currently the only | |||
| * reason for EMIT() to fail. */ | |||
| EMIT(program, insn); | |||
There was a problem hiding this comment.
The original code is weird in order to ensure the state of bf_program is always valid (i.e. we don't insert a fixup for an instruction that doesn't exist).
We don't need to maintain a valid bf_program state anymore, as it doesn't survive past the generation step. Hence, you can push the fixup to the list, then use bf_program_emit(), and return an error on failure. It will be properly taken care of by bpfilter.
| int bf_program_emit_fixup_elfstub(struct bf_program *program, | ||
| enum bf_elfstub_id id) | ||
| { | ||
| _free_bf_fixup_ struct bf_fixup *fixup = NULL; | ||
| int r; | ||
|
|
||
| assert(program); | ||
|
|
||
| if (program->img_size == program->img_cap) { | ||
| r = bf_program_grow_img(program); | ||
| if (r) | ||
| return r; | ||
| } | ||
| r = bf_vector_reserve(&program->img, program->img.size + 1); | ||
| if (r) | ||
| return r; | ||
|
|
||
| r = bf_fixup_new(&fixup, BF_FIXUP_ELFSTUB_CALL, program->img_size, NULL); | ||
| r = bf_fixup_new(&fixup, BF_FIXUP_ELFSTUB_CALL, program->img.size, NULL); | ||
| if (r) | ||
| return r; | ||
|
|
||
| fixup->attr.elfstub_id = id; | ||
|
|
||
| r = bf_list_add_tail(&program->fixups, fixup); | ||
| if (r) | ||
| return r; | ||
|
|
||
| TAKE_PTR(fixup); | ||
|
|
||
| EMIT(program, BPF_CALL_REL(0)); | ||
|
|
||
| return 0; | ||
| } |
There was a problem hiding this comment.
Same as bf_program_emit_fixup.
I initially planned to use bf_vector in bf_hashset, but I got convinced by @yaakov-stein that it's not a good idea. However, we could use bf_vector in other places, especially where we use bf_dynbuf or where we handroll vector-like behavior.
Commits:
lib: core: add bf_vector- implementation. Vector doubles in size when it hits the limits. Has a helper to remove elements as well, though shrinking is not implemented.cgen: use bf_vector for img- Instead of managingimg,img_size, andimg_capby hand, use bf_vector.