Conversation
1 issue found across 9 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="src/virtio-net.c">
<violation number="1" location="src/virtio-net.c:372">
P2: `goto tx_publish` skips `dev->tx_wait_for_tap = false`, leaving stale back-pressure state if a previous iteration hit EAGAIN. Move the assignment above the label so it applies to all paths that publish USED.</violation>
</file>
Address ten correctness and safety issues across the virtqueue chain walking, descriptor validation, and back-pressure paths.
Chain walking. Both devices walked exactly one (virtio-net) or three (virtio-blk) descriptors per request, so a guest that split a packet or a >4 KiB block I/O across multiple descriptors got partial transfers and the next chain descriptor was reinterpreted as a new request head. Both device emulators now snapshot the entire chain via VRING_DESC_F_NEXT, sized to the VIRTQ_SIZE-bounded stack array, and route data through readv/writev (net) or per-segment diskimg I/O (blk).
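The chain walk described above can be sketched as follows. The descriptor layout and flag value come from the virtio 1.1 packed-ring spec, but `walk_chain`, the 256-entry `VIRTQ_SIZE`, and the exact error convention are assumptions for illustration, not the repository's actual code.

```c
#include <assert.h>
#include <stdint.h>

#define VIRTQ_SIZE 256               /* assumed host-side queue depth */
#define VRING_DESC_F_NEXT (1u << 0)

/* Packed-ring descriptor layout, per the virtio 1.1 spec. */
struct vring_packed_desc {
    uint64_t addr;
    uint32_t len;
    uint16_t id;
    uint16_t flags;
};

/*
 * Snapshot one descriptor chain starting at `head`.  Packed rings place
 * chained descriptors in consecutive slots, so we follow F_NEXT through
 * the ring, wrapping modulo qsize, and cap at VIRTQ_SIZE so a malformed
 * chain cannot overrun the caller's stack array.  Returns the number of
 * descriptors copied, or -1 on an over-long chain.
 */
static int walk_chain(const struct vring_packed_desc *ring, uint16_t qsize,
                      uint16_t head, struct vring_packed_desc *snap)
{
    int n = 0;
    uint16_t i = head;

    for (;;) {
        if (n >= VIRTQ_SIZE)
            return -1;               /* longer than the ring: malformed */
        snap[n++] = ring[i];
        if (!(ring[i].flags & VRING_DESC_F_NEXT))
            return n;                /* last descriptor of the chain */
        i = (uint16_t)((i + 1) % qsize);
    }
}
```

Once the snapshot exists, the readable and writable segments can be copied into an `iovec` array for a single `readv`/`writev`, rather than one syscall per descriptor.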
Descriptor direction. Neither device validated VRING_DESC_F_WRITE before touching guest memory, so a buggy or malicious guest could hand a "readable-only" status descriptor and have the device write into memory it never offered for writes. Every consumer site now checks direction matches the request type and rejects mismatches with VIRTIO_BLK_S_IOERR or an empty USED publication.
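The direction check reduces to one predicate per descriptor. A minimal sketch, assuming a helper name not taken from the source:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VRING_DESC_F_WRITE (1u << 1)  /* driver marks buffer device-writable */

/* The device may write only into descriptors flagged F_WRITE and read
 * only from descriptors without it.  `device_writes` says which way this
 * request needs the buffer to go; a mismatch means the request must be
 * rejected (IOERR or empty USED) before any guest memory is touched. */
static bool desc_direction_ok(uint16_t flags, bool device_writes)
{
    return device_writes == ((flags & VRING_DESC_F_WRITE) != 0);
}
```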
NULL guards. virtq_get_avail can legitimately return NULL on a malformed chain; the prior code dereferenced the result on virtio-blk's data and status reads, which is a host segfault, not a queue stall. Every consumer site now NULL-checks. On a malformed chain the device stalls the queue rather than publishing USED with chain[0].id (the buffer ID lives on the chain's last descriptor in packed virtqueues, so publishing with a head id risks pointing the driver at an unrelated in-flight chain).
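The consumer-side pattern looks roughly like this; the accessor here is a stub standing in for `virtq_get_avail` (its real signature may differ), and the result enum is invented for the sketch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct virtq_desc { uint64_t addr; uint32_t len; };

enum handle_result { HANDLED, STALLED };

/* Stub standing in for virtq_get_avail: returns NULL when the available
 * chain is malformed. */
static const struct virtq_desc *virtq_get_avail_stub(int malformed)
{
    static const struct virtq_desc d = { 0x1000, 512 };
    return malformed ? NULL : &d;
}

/* Consumer pattern: on NULL, stall the queue (return without touching
 * guest memory and without publishing USED) instead of dereferencing,
 * since the only id we could publish is the head's, which may alias an
 * unrelated in-flight chain. */
static enum handle_result handle_request(int malformed)
{
    const struct virtq_desc *desc = virtq_get_avail_stub(malformed);
    if (desc == NULL)
        return STALLED;  /* do not publish USED with a guessed buffer id */
    /* ... process desc->addr / desc->len ... */
    return HANDLED;
}
```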
Always-publish-USED. Every error path in virtio-blk/virtio-net used to return without flipping USED on the head, leaking descriptors and eventually drifting the device's view of the ring out of sync with the driver's. Whenever the chain walk itself succeeds, the device now publishes USED: IOERR with len=1 on validation failure, the real status byte plus device-writable byte count on success.
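A packed-ring USED publication can be sketched as below. The flag values follow the virtio 1.1 spec (a used descriptor carries AVAIL and USED bits both equal to the device's wrap counter); the function name and parameters are assumptions:

```c
#include <assert.h>
#include <stdint.h>

#define VRING_PACKED_DESC_F_AVAIL (1u << 7)
#define VRING_PACKED_DESC_F_USED  (1u << 15)

struct vring_packed_desc {
    uint64_t addr;
    uint32_t len;
    uint16_t id;
    uint16_t flags;
};

/* Publish a used element: write id and len first, then release-store the
 * flags so the driver can never observe the flags flip before the payload
 * fields.  `len` counts device-written bytes only, so a validation failure
 * still publishes with len = 1 (the bare status byte). */
static void publish_used(struct vring_packed_desc *slot, uint16_t id,
                         uint32_t len, int used_wrap_count)
{
    uint16_t flags = used_wrap_count
        ? (uint16_t)(VRING_PACKED_DESC_F_AVAIL | VRING_PACKED_DESC_F_USED)
        : 0;

    slot->id  = id;
    slot->len = len;
    __atomic_store_n(&slot->flags, flags, __ATOMIC_RELEASE);
}
```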
Stack-array bounds. Per-iteration desc_snap[VIRTQ_SIZE] / iov[VIRTQ_SIZE] arrays are sized to the host-advertised queue depth. virtio-pci now clamps guest writes to queue_size at VIRTQ_SIZE and the walkers cap at VIRTQ_SIZE, so a guest cannot make the walk overrun the stack arrays.
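The clamp on the guest's queue-size write is a one-liner; the helper name and the fallback-to-full-depth policy here are assumptions:

```c
#include <assert.h>
#include <stdint.h>

#define VIRTQ_SIZE 256   /* assumed compile-time queue depth */

/* Clamp a guest-written queue size so later chain walks can never index
 * past the VIRTQ_SIZE-sized stack arrays.  A zero or oversized write
 * falls back to the full advertised depth. */
static uint16_t clamp_queue_size(uint32_t guest_val)
{
    if (guest_val == 0 || guest_val > VIRTQ_SIZE)
        return VIRTQ_SIZE;
    return (uint16_t)guest_val;
}
```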
Atomic event flag stores. virtq_handle_avail reads guest_event->flags with __ATOMIC_ACQUIRE; the worker threads paired this with plain stores, so the compiler was free to tear or reorder against the surrounding completion writes. A new virtq_set_guest_event_flags helper replaces every plain store with __atomic_store_n / __ATOMIC_RELEASE.
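The helper is likely no more than a wrapper around the GCC/Clang atomic builtin; the paired acquire-side getter below is added only to make the sketch testable:

```c
#include <assert.h>
#include <stdint.h>

/* Release store pairs with the acquire load in virtq_handle_avail: the
 * other side can only observe the new flags after all preceding
 * completion writes (used id/len) are visible, and the store itself can
 * neither tear nor be reordered past them by the compiler. */
static void virtq_set_guest_event_flags(uint16_t *flags, uint16_t val)
{
    __atomic_store_n(flags, val, __ATOMIC_RELEASE);
}

/* Illustrative reader side (the real one lives in virtq_handle_avail). */
static uint16_t virtq_get_guest_event_flags(const uint16_t *flags)
{
    return __atomic_load_n(flags, __ATOMIC_ACQUIRE);
}
```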
virtio-blk SG arithmetic. data_size widened from uint16_t to uint32_t so >64 KiB segments don't silently truncate before reaching diskimg I/O. Per-segment writable_total now accumulates in uint64_t with __builtin_add_overflow, capping at UINT32_MAX-1 so the reported used.len plus the trailing status byte fits the packed-ring uint32_t.
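The overflow-checked accumulation can be sketched like this; `add_writable` is an assumed name, but the `__builtin_add_overflow` plus `UINT32_MAX - 1` cap matches the behavior described:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Accumulate per-segment device-writable byte counts in 64 bits, trapping
 * wraparound with __builtin_add_overflow and capping at UINT32_MAX - 1 so
 * used.len (total plus the trailing status byte) still fits in the packed
 * ring's uint32_t len field. */
static bool add_writable(uint64_t *total, uint32_t seg_len)
{
    if (__builtin_add_overflow(*total, (uint64_t)seg_len, total))
        return false;                /* arithmetic overflow: reject chain */
    if (*total > UINT32_MAX - 1)
        *total = UINT32_MAX - 1;     /* leave room for the status byte */
    return true;
}
```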
VIRTIO_BLK_F_FLUSH. Without the negotiated FLUSH feature the Linux virtio-blk driver runs in writeback-without-barriers mode, so guest fsync(2) returns success while data sits in the host page cache. The device now advertises VIRTIO_BLK_F_FLUSH, dispatches T_FLUSH to a new diskimg_flush() (fdatasync), and fsyncs the backing file at exit.
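The flush handler is likely a thin wrapper over `fdatasync`; the name `diskimg_flush` comes from the description above, but the signature is assumed:

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* fdatasync forces the file's data (and the metadata required to read it
 * back) to stable storage, without the extra timestamp flushes a full
 * fsync performs.  Dispatched for VIRTIO_BLK_T_FLUSH requests. */
static int diskimg_flush(int fd)
{
    return fdatasync(fd);
}
```

With `VIRTIO_BLK_F_FLUSH` negotiated, the guest driver translates its own barrier/fsync requests into `T_FLUSH` commands instead of assuming writes are durable on completion.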
TX back-pressure. The TX worker previously required tapfd POLLOUT in its poll predicate; with a level-triggered ioeventfd POLLIN that meant 100% CPU until the TAP drained. The poll set now drops the TAP predicate by default, drains the ioeventfd on a guest kick, and arms POLLOUT only after a transient writev() EAGAIN. On EAGAIN the chain's next_avail_idx and used_wrap_count are rolled back so the TAP-blocked chain is retried on POLLOUT rather than silently dropped.
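The back-pressure state machine can be sketched as two helpers. The field names follow the description (`next_avail_idx`, `used_wrap_count`), but the struct layout and function names are assumptions:

```c
#include <assert.h>
#include <poll.h>
#include <stdbool.h>
#include <stdint.h>

/* Per-queue TX state; a sketch, not the repository's actual layout. */
struct tx_state {
    uint16_t next_avail_idx;
    bool     used_wrap_count;
    bool     wait_for_tap;    /* set after a transient writev() EAGAIN */
};

/* Poll predicate for the TAP fd: the ioeventfd is always watched for a
 * guest kick, but the TAP is watched for POLLOUT only while
 * back-pressured, so an idle TAP no longer keeps the worker spinning. */
static short tap_events(const struct tx_state *tx)
{
    return tx->wait_for_tap ? POLLOUT : 0;
}

/* On EAGAIN, rewind to the chain head so the same chain is retried once
 * POLLOUT fires, instead of being silently dropped. */
static void rollback_chain(struct tx_state *tx, uint16_t head_idx,
                           bool head_wrap)
{
    tx->next_avail_idx  = head_idx;
    tx->used_wrap_count = head_wrap;
    tx->wait_for_tap    = true;
}
```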
Summary by cubic
Hardened virtio-blk and virtio-net by walking full packed-virtqueue descriptor chains, enforcing descriptor direction, using atomic event-flag stores, and improving TX back-pressure. This prevents partial I/O, ring desyncs, and busy spins, and adds flush support so guest fsync is durable.

Bug Fixes
- Walk full descriptor chains and route data through readv/writev (net) or per-segment disk I/O (blk).
- Validate VRING_DESC_F_WRITE and header sizes; NULL-check chain elements; stall on malformed chains.
- Always publish accurate used.len via virtq_publish_used.
- Clamp queue_size and chain length to VIRTQ_SIZE; bound stack arrays; widen blk data_size to uint32_t and cap 64-bit totals.
- Use atomic stores via virtq_set_guest_event_flags and for USED flips to fix ordering.
- Drain the ioeventfd, arm POLLOUT only after EAGAIN, roll back indices to retry, and stop the CPU spin; RX/TX handle the virtio header correctly.

New Features
- Advertise VIRTIO_BLK_F_FLUSH; handle T_FLUSH via fdatasync and flush the backing file on device shutdown.

Written for commit b461e1a.