bpf: Add user-space-publisher ringbuffer map type #3517
Master branch: fb8d784
We want to support a ringbuf map type where samples are published from user-space, to be consumed by BPF programs. BPF currently supports a kernel -> user-space circular ringbuffer via the BPF_MAP_TYPE_RINGBUF map type. We'll need to define a new map type for user-space -> kernel, as none of the helpers exported for BPF_MAP_TYPE_RINGBUF will apply to a user-space producer ringbuffer, and we'll want to add one or more helper functions that would not apply for a kernel-producer ringbuffer.

This patch therefore adds a new BPF_MAP_TYPE_USER_RINGBUF map type definition. The map type is useless in its current form, as there is no way to access or use it for anything until we add one or more BPF helpers. A follow-on patch will therefore add a new helper function that allows BPF programs to run callbacks on samples that are published to the ringbuffer.

Signed-off-by: David Vernet <void@manifault.com>
In a prior change, we added a new BPF_MAP_TYPE_USER_RINGBUF map type which will allow user-space applications to publish messages to a ringbuffer that is consumed by a BPF program in kernel-space. In order for this map-type to be useful, it will require a BPF helper function that BPF programs can invoke to drain samples from the ringbuffer, and invoke callbacks on those samples. This change adds that capability via a new BPF helper function:

    bpf_user_ringbuf_drain(struct bpf_map *map, void *callback_fn, void *ctx, u64 flags)

BPF programs may invoke this function to run callback_fn() on a series of samples in the ringbuffer. callback_fn() has the following signature:

    long callback_fn(struct bpf_dynptr *dynptr, void *context);

Samples are provided to the callback in the form of struct bpf_dynptr *'s, which the program can read using BPF helper functions for querying struct bpf_dynptr's.

In order to support bpf_user_ringbuf_drain(), a new PTR_TO_DYNPTR register type is added to the verifier to reflect a dynptr that was allocated by a helper function and passed to a BPF program. Unlike PTR_TO_STACK dynptrs which are allocated on the stack by a BPF program, PTR_TO_DYNPTR dynptrs need not use reference tracking, as the BPF helper is trusted to properly free the dynptr before returning. The verifier currently only supports PTR_TO_DYNPTR registers that are also DYNPTR_TYPE_LOCAL.

Note that while the corresponding user-space libbpf logic will be added in a subsequent patch, this patch does contain an implementation of the .map_poll() callback for BPF_MAP_TYPE_USER_RINGBUF maps. This .map_poll() callback guarantees that an epoll-waiting user-space producer will receive at least one event notification whenever at least one sample is drained in an invocation of bpf_user_ringbuf_drain(), provided that the function is not invoked with the BPF_RB_NO_WAKEUP flag.

Sending an event notification for every sample is not an option, as it could cause the system to hang due to invoking irq_work_queue() in too-frequent succession. To optimize for the common case, however, bpf_user_ringbuf_drain() will also send an event notification whenever a sample being drained causes the ringbuffer to no longer be full. This heuristic may not help some user-space producers: a producer can publish samples of varying size, so draining the first sample may make the ringbuffer no longer full without freeing enough space for the producer's next sample. In this case, the producer may have to wait until bpf_user_ringbuf_drain() returns to receive an event notification.

Signed-off-by: David Vernet <void@manifault.com>
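The notification rules described in this commit message can be illustrated with a small user-space model. This is only a sketch in plain C, not the kernel implementation: every name in it (model_ringbuf, model_drain(), MODEL_NO_WAKEUP, and so on) is hypothetical, "full" is simplified to used == capacity, and two behaviors are assumptions that the text above does not spell out: a nonzero callback return stops the drain early, and the no-wakeup flag suppresses both kinds of notification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for BPF_RB_NO_WAKEUP; not a kernel definition. */
#define MODEL_NO_WAKEUP (1u << 0)

struct model_sample { size_t len; };

struct model_ringbuf {
	struct model_sample samples[16]; /* pending samples, oldest first */
	size_t nr_samples;               /* samples waiting to be drained */
	size_t capacity;                 /* total bytes in the buffer */
	size_t used;                     /* bytes held by pending samples */
	unsigned int wakeups;            /* notifications sent to the producer */
};

typedef long (*model_cb)(const struct model_sample *s, void *ctx);

/* Example callback: adds each sample's length to a size_t counter. */
long model_count_bytes(const struct model_sample *s, void *ctx)
{
	*(size_t *)ctx += s->len;
	return 0; /* 0 means "keep draining" in this model (assumption) */
}

/* Runs cb on pending samples and returns how many were drained. */
long model_drain(struct model_ringbuf *rb, model_cb cb, void *ctx,
		 unsigned int flags)
{
	bool may_wake = !(flags & MODEL_NO_WAKEUP);
	size_t n = 0;

	while (n < rb->nr_samples) {
		const struct model_sample *s = &rb->samples[n];
		bool was_full = rb->used == rb->capacity;
		long ret = cb(s, ctx);

		assert(s->len <= rb->used);
		rb->used -= s->len;
		n++;

		/* Heuristic: notify when this sample made the buffer non-full. */
		if (was_full && s->len > 0 && may_wake)
			rb->wakeups++;
		if (ret)
			break; /* assumption: nonzero return stops the drain */
	}

	/* Drop the drained samples from the front of the pending list. */
	memmove(rb->samples, rb->samples + n,
		(rb->nr_samples - n) * sizeof(rb->samples[0]));
	rb->nr_samples -= n;

	/* Guarantee: one notification per call that drained anything. */
	if (n > 0 && may_wake)
		rb->wakeups++;

	return (long)n;
}
```

In this model, draining a full 64-byte buffer that holds samples of 32, 16 and 16 bytes produces two notifications: one when the first sample makes the buffer non-full, and one when the call returns. A producer waiting for 48 free bytes would not be helped by the first notification, which is the limitation the commit message points out.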
Now that all of the logic is in place in the kernel to support user-space produced ringbuffers, we can add the user-space logic to libbpf. This patch therefore adds the following public symbols to libbpf:

    struct user_ring_buffer *
    user_ring_buffer__new(int map_fd, const struct user_ring_buffer_opts *opts);
    void *user_ring_buffer__reserve(struct user_ring_buffer *rb, __u32 size);
    void *user_ring_buffer__reserve_blocking(struct user_ring_buffer *rb, __u32 size, int timeout_ms);
    void user_ring_buffer__submit(struct user_ring_buffer *rb, void *sample);
    void user_ring_buffer__discard(struct user_ring_buffer *rb, void *sample);
    void user_ring_buffer__free(struct user_ring_buffer *rb);

A user-space producer must first create a struct user_ring_buffer * object with user_ring_buffer__new(), and can then reserve samples in the ringbuffer using one of the following two symbols:

    void *user_ring_buffer__reserve(struct user_ring_buffer *rb, __u32 size);
    void *user_ring_buffer__reserve_blocking(struct user_ring_buffer *rb, __u32 size, int timeout_ms);

With user_ring_buffer__reserve(), a pointer to an @size region of the ringbuffer will be returned if sufficient space is available in the buffer. user_ring_buffer__reserve_blocking() provides similar semantics, but will block for up to @timeout_ms in epoll_wait if there is insufficient space in the buffer. This function has the guarantee from the kernel that it will receive at least one event-notification per invocation to bpf_user_ringbuf_drain(), provided that at least one sample is drained, and the BPF program did not pass the BPF_RB_NO_WAKEUP flag to bpf_user_ringbuf_drain().

Once a sample is reserved, it must either be committed to the ringbuffer with user_ring_buffer__submit(), or discarded with user_ring_buffer__discard().

Signed-off-by: David Vernet <void@manifault.com>
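The reserve, then submit-or-discard lifecycle described above can be sketched with a single-threaded toy model. This is not libbpf code: all toy_* names, the 8-byte record header with busy/discard flag bits, the fixed 64-byte ring, and the errno values are assumptions chosen for illustration. The backing array is twice the ring size so that a record which wraps past the end stays contiguous, and toy_consume() stands in for the kernel-side consumer.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy model; none of these names are libbpf symbols. */
#define TOY_HDR_SZ  8u
#define TOY_BUSY    (1u << 31) /* reserved, not yet submitted or discarded */
#define TOY_DISCARD (1u << 30) /* consumer should skip this record */
#define TOY_RING_SZ 64u        /* must be a power of two */
#define TOY_FLAGS   (TOY_BUSY | TOY_DISCARD)

struct toy_ring {
	uint64_t producer_pos;
	uint64_t consumer_pos;
	unsigned char data[2 * TOY_RING_SZ];
};

/* Header plus payload, rounded up to a multiple of 8 bytes. */
static uint32_t toy_record_size(uint32_t size)
{
	return (size + TOY_HDR_SZ + 7) / 8 * 8;
}

/* Returns a pointer to a size-byte region, or NULL if it does not fit. */
void *toy_reserve(struct toy_ring *rb, uint32_t size)
{
	uint32_t total, avail, off, len;

	if ((size & TOY_FLAGS) || toy_record_size(size) > TOY_RING_SZ) {
		errno = E2BIG;
		return NULL;
	}
	total = toy_record_size(size);
	avail = TOY_RING_SZ - (uint32_t)(rb->producer_pos - rb->consumer_pos);
	if (avail < total) {
		errno = ENOSPC;
		return NULL;
	}
	off = (uint32_t)(rb->producer_pos & (TOY_RING_SZ - 1));
	len = size | TOY_BUSY; /* mark the record as in progress */
	memcpy(rb->data + off, &len, sizeof(len));
	rb->producer_pos += total;
	return rb->data + off + TOY_HDR_SZ;
}

/* Clears the busy bit in a record's header, optionally setting a flag. */
static void toy_commit(void *sample, uint32_t flag)
{
	unsigned char *hdr = (unsigned char *)sample - TOY_HDR_SZ;
	uint32_t len;

	memcpy(&len, hdr, sizeof(len));
	len = (len & ~TOY_BUSY) | flag;
	memcpy(hdr, &len, sizeof(len));
}

void toy_submit(struct toy_ring *rb, void *sample)
{
	(void)rb;
	toy_commit(sample, 0);
}

void toy_discard(struct toy_ring *rb, void *sample)
{
	(void)rb;
	toy_commit(sample, TOY_DISCARD);
}

/* Consumer stand-in: returns how many submitted samples were consumed. */
int toy_consume(struct toy_ring *rb)
{
	int delivered = 0;

	while (rb->consumer_pos < rb->producer_pos) {
		uint32_t off = (uint32_t)(rb->consumer_pos & (TOY_RING_SZ - 1));
		uint32_t len;

		memcpy(&len, rb->data + off, sizeof(len));
		if (len & TOY_BUSY)
			break; /* oldest record is still being written */
		if (!(len & TOY_DISCARD))
			delivered++;
		rb->consumer_pos += toy_record_size(len & ~TOY_FLAGS);
	}
	return delivered;
}
```

The model shows why every reserved sample must be either submitted or discarded: the consumer stops at the first record that is still marked busy, so a reservation that is never resolved blocks everything published after it. A blocking reserve could be layered on top by retrying toy_reserve() after waiting for a notification whenever it fails for lack of space, which is the role epoll_wait plays in user_ring_buffer__reserve_blocking().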
This change includes selftests that validate the expected behavior and APIs of the new BPF_MAP_TYPE_USER_RINGBUF map type.

Signed-off-by: David Vernet <void@manifault.com>
At least one diff in series https://patchwork.kernel.org/project/netdevbpf/list/?series=668997 expired. Closing PR.
Pull request for series with
subject: bpf: Add user-space-publisher ringbuffer map type
version: 3
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=668997