io_uring: add buf-ring + multishot recv to IoUringImpl #44668
Open
aburan28 wants to merge 3 commits into
This is a no-behavior-change preparation step for multishot recv.

The ``CompletionCb`` callback type now takes a ``uint32_t flags`` argument that carries the raw ``cqe->flags`` value from the kernel. For multishot completions a follow-up change will inspect:

* ``IORING_CQE_F_BUFFER``: a buffer was selected from a buf-ring; the buffer ID is encoded in the upper bits.
* ``IORING_CQE_F_MORE``: the SQE will produce further completions.

The worker callback ignores ``flags`` for now. Injected completions are defined to always carry ``flags == 0``.

All ``forEveryCompletion`` callers (worker, impl tests) updated. ``IoUringSocket::on*`` virtual methods are intentionally unchanged in this commit; only ``onRead`` will need flags, in the multishot recv change.

Signed-off-by: Adam Buran <a.buran28@gmail.com>
Signed-off-by: Adam Buran <aburan28@gmail.com>
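To illustrate how a consumer might interpret the ``flags`` value described in this commit, here is a minimal sketch. It is not code from the PR: ``DecodedFlags`` and ``decodeCqeFlags`` are hypothetical names, and the constants are redefined locally with the values from ``<linux/io_uring.h>`` (``IORING_CQE_F_BUFFER``, ``IORING_CQE_F_MORE``, ``IORING_CQE_BUFFER_SHIFT``) so the snippet builds without kernel headers.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Local copies of the kernel UAPI values, so the sketch is self-contained.
constexpr uint32_t kCqeFBuffer = 1u << 0;  // IORING_CQE_F_BUFFER
constexpr uint32_t kCqeFMore = 1u << 1;    // IORING_CQE_F_MORE
constexpr uint32_t kCqeBufferShift = 16;   // IORING_CQE_BUFFER_SHIFT

// Hypothetical helper type: what a completion callback could extract from flags.
struct DecodedFlags {
  std::optional<uint16_t> bid; // Set only when the kernel selected a buffer.
  bool more;                   // True when the SQE will produce further completions.
};

inline DecodedFlags decodeCqeFlags(uint32_t flags) {
  DecodedFlags out{std::nullopt, (flags & kCqeFMore) != 0};
  if ((flags & kCqeFBuffer) != 0) {
    // The buffer ID occupies the upper 16 bits of cqe->flags.
    out.bid = static_cast<uint16_t>(flags >> kCqeBufferShift);
  }
  return out;
}

// Example: flags meaning "buffer 7 was selected, more completions will follow".
inline void decodeExample() {
  const DecodedFlags d = decodeCqeFlags((7u << kCqeBufferShift) | kCqeFBuffer | kCqeFMore);
  assert(d.bid.has_value() && *d.bid == 7 && d.more);
}
```

With ``flags == 0``, as defined for injected completions, this helper reports no buffer and no further completions.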
Adds the kernel-managed buffer ring lifecycle and the ``recv`` multishot opcode to ``IoUringImpl``. This is the plumbing layer for switching the io_uring socket read path off the per-read ``readv`` allocation; the worker change comes in a follow-up PR.

New ``IoUring`` virtuals:

* ``setupBufRing(group_id, count, buf_size)``: register a buffer ring with the kernel. The buffers live in a single contiguous allocation owned by ``IoUringImpl``. Validates that ``count`` is a non-zero power of two and rejects double-setup. Falls back to ``IoUringResult::Failed`` on kernels that lack ``IORING_REGISTER_PBUF_RING`` (< 5.19).
* ``prepareRecvMultishot(fd, group_id, user_data)``: submits a recv with ``IOSQE_BUFFER_SELECT`` so the kernel pulls a buffer from the ring. The same SQE may produce multiple completions, signalled by ``IORING_CQE_F_MORE`` in ``cqe->flags``.
* ``getBufferForBid(group_id, bid)``: look up the storage backing a particular kernel-selected buffer; the consumer reads up to ``cqe->res`` bytes and then recycles.
* ``recycleBuffer(group_id, bid)``: return a consumed buffer to the ring so the kernel can reuse it.

For now only one buf-ring is supported per ``IoUring`` instance.

Test:

* ``SetupBufRingValidatesInputs``: exercises the rejection paths (bad count, bad buf_size, double-setup).
* ``MultishotRecvDeliversBuffersAndStaysArmed``: end-to-end with a real socketpair and a real ring: arm a multishot recv, write twice, verify both completions deliver buffers, the bid is in range, the data matches, and the SQE stays armed (F_MORE set on the first completion). Skips when the kernel lacks buf-ring support.

Signed-off-by: Adam Buran <a.buran28@gmail.com>
Signed-off-by: Adam Buran <aburan28@gmail.com>
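The buffer bookkeeping described above (power-of-two validation, a single contiguous allocation, lookup by bid, recycling) can be sketched without any kernel interaction. The following is an illustration only, not the PR's implementation: ``BufPoolSketch`` and its methods are hypothetical names, and ``markInUse`` stands in for the kernel selecting a buffer. A real implementation would hand buffers back through liburing's buf-ring helpers rather than a local bitmap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for the buffer storage behind a buf-ring; not Envoy's IoUringImpl.
class BufPoolSketch {
public:
  // Returns nullptr on invalid input, mirroring the "bad count / bad buf_size" rejection paths.
  static std::unique_ptr<BufPoolSketch> create(uint32_t count, uint32_t buf_size) {
    const bool power_of_two = count != 0 && (count & (count - 1)) == 0;
    if (!power_of_two || buf_size == 0) {
      return nullptr;
    }
    return std::unique_ptr<BufPoolSketch>(new BufPoolSketch(count, buf_size));
  }

  // Storage backing buffer `bid`, or nullptr when the bid is out of range.
  uint8_t* bufferForBid(uint32_t bid) {
    if (bid >= count_) {
      return nullptr;
    }
    return storage_.data() + static_cast<size_t>(bid) * buf_size_;
  }

  // Simulates the kernel selecting `bid` for a completion.
  bool markInUse(uint32_t bid) {
    if (bid >= count_ || !available_[bid]) {
      return false;
    }
    available_[bid] = false;
    return true;
  }

  // Returns a consumed buffer to the pool; rejects out-of-range or double recycles.
  bool recycle(uint32_t bid) {
    if (bid >= count_ || available_[bid]) {
      return false;
    }
    available_[bid] = true;
    return true;
  }

private:
  BufPoolSketch(uint32_t count, uint32_t buf_size)
      : count_(count), buf_size_(buf_size),
        storage_(static_cast<size_t>(count) * buf_size), available_(count, true) {
    assert(count_ != 0 && buf_size_ != 0);
  }

  const uint32_t count_;
  const uint32_t buf_size_;
  std::vector<uint8_t> storage_; // One contiguous allocation for all buffers.
  std::vector<bool> available_;  // True while the buffer is owned by the ring.
};
```

In this sketch a consumer would call ``bufferForBid`` with the bid decoded from ``cqe->flags``, read up to ``cqe->res`` bytes, and then call ``recycle``. Rejecting a double recycle is a design choice of the sketch, not something the PR text specifies.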
This was referenced Apr 27, 2026
Member
Let's mark this as a draft until the PR it depends on is merged.
Signed-off-by: Adam Buran <aburan28@gmail.com>
Contributor
Please mark as draft if not ready to merge.
Commit Message:
io_uring: add buf-ring + multishot recv API to IoUringImpl

Plumbing layer for switching the io_uring socket read path off the per-read `readv` allocation. Adds the kernel-managed buffer ring (`IORING_REGISTER_PBUF_RING`) lifecycle and `IORING_OP_RECV` multishot to `IoUringImpl`. The worker change that consumes this comes in a follow-up PR.

New `IoUring` virtuals:

* `setupBufRing(group_id, count, buf_size)`: registers a buffer ring with the kernel. Buffers live in a single contiguous allocation owned by `IoUringImpl`. Validates that `count` is a non-zero power of two; rejects double-setup. Returns `Failed` on kernels < 5.19 (no `IORING_REGISTER_PBUF_RING`).
* `prepareRecvMultishot(fd, group_id, user_data)`: submits a recv with `IOSQE_BUFFER_SELECT`. The same SQE may produce multiple completions, signalled by `IORING_CQE_F_MORE` in `cqe->flags`.
* `getBufferForBid(group_id, bid)`: looks up the storage backing a kernel-selected buffer.
* `recycleBuffer(group_id, bid)`: returns a consumed buffer to the ring.

Only one buf-ring per `IoUring` instance for now.

Depends on #44667 (`CompletionCb` flags-arg refactor, required so `forEveryCompletion` can surface `cqe->flags` to the multishot consumer).

Additional Description:
The full multishot read path touches the worker, the server-socket read state machine, and the proto config. Landing the `IoUringImpl` API on its own gives reviewers a smaller, well-scoped surface to vet against the liburing/kernel contract.
AI usage disclosure: Portions of the code and/or PR description were drafted
with the assistance of Claude (Anthropic). I reviewed and understand all
submitted code.
Risk Level: Low
(API-only addition. Nothing in this PR calls the new virtuals; the worker-side caller arrives in #44669. No existing path changes.)
Testing:

* `SetupBufRingValidatesInputs` covers rejection paths (bad `count`, bad `buf_size`, double-setup).
* `MultishotRecvDeliversBuffersAndStaysArmed` uses a real `IoUringImpl` + socketpair: arms a multishot recv, writes twice, verifies both completions deliver buffers, bids are in range, data matches, and `F_MORE` stays set across recycles. Skips when the kernel lacks buf-ring support.
Docs Changes: N/A. New methods are documented inline in `envoy/common/io/io_uring.h`.
Release Notes: N/A (internal API additions; not yet exposed to extension authors or operators; exposure happens in #44670).
Platform Specific Features:
io_uring is Linux-only. Buf-ring requires kernel 5.19+; `setupBufRing` returns `Failed` on older kernels and callers are expected to fall back. No platform support change beyond the existing io_uring build gating.
Runtime guard: N/A. No call sites in this PR — the new API is dead code
until #44669 lands.