ipc/mqueue: add fcntl(F_MQ_PEEK) for non-destructive message inspection #911
Open
vfsci-bot[bot] wants to merge 3 commits into vfs.base.ci from
Add the user-visible interface for non-destructive POSIX message queue inspection via fcntl(2).

POSIX message queues have no way to inspect queued messages without consuming them: mq_receive() always dequeues the message it returns. This makes it impossible for checkpoint/restore tools such as CRIU to save and replay message queue contents without destroying the queue state in the process.

struct mq_peek_attr describes the request: the caller specifies an index into the queue in receive order (0 = next message that mq_receive() would return, i.e. highest priority, FIFO within same priority) and a buffer to receive the payload. On return, msg_prio is filled with the message priority and the return value is the number of bytes copied.

F_MQ_PEEK = F_LINUX_SPECIFIC_BASE + 17 is the new fcntl command that accepts a pointer to struct mq_peek_attr.

Link: checkpoint-restore/criu#2285
Signed-off-by: Shaurya Rane <ssrane_b23@ee.vjti.ac.in>
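As a rough illustration of how a caller might drive this interface from userspace, here is a minimal sketch. The commit message names only msg_prio, so the struct layout below (including the `index`, `buf`, `buf_len` and `reserved` fields and the `_sketch` names) is hypothetical and is not the layout from the patch. The value 1024 for F_LINUX_SPECIFIC_BASE is taken from the kernel UAPI headers. On a kernel without this series the call is expected to fail with -1.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* F_LINUX_SPECIFIC_BASE is 1024 in the kernel UAPI headers; it is defined
 * here only when the libc headers do not expose it. */
#ifndef F_LINUX_SPECIFIC_BASE
#define F_LINUX_SPECIFIC_BASE 1024
#endif

/* From the commit message: F_MQ_PEEK = F_LINUX_SPECIFIC_BASE + 17. */
#ifndef F_MQ_PEEK
#define F_MQ_PEEK (F_LINUX_SPECIFIC_BASE + 17)
#endif

/* HYPOTHETICAL layout: only msg_prio is named in the commit message.
 * The real struct mq_peek_attr in the patch may differ in every field. */
struct mq_peek_attr_sketch {
    uint64_t index;     /* position in receive order, 0 = next to be received */
    uint64_t buf;       /* userspace buffer address */
    uint64_t buf_len;   /* size of that buffer in bytes */
    uint32_t msg_prio;  /* filled in by the kernel on success */
    uint32_t reserved;  /* assumed padding, kept zero */
};

/* Peek at message `index` on the queue descriptor `mqfd` without dequeuing it.
 * Returns the number of bytes copied, or -1 with errno set by fcntl(). */
long mq_peek_sketch(int mqfd, uint64_t index, void *buf, size_t len,
                    unsigned int *prio)
{
    struct mq_peek_attr_sketch attr;
    int n;

    assert(buf != NULL || len == 0);

    memset(&attr, 0, sizeof(attr));
    attr.index = index;
    attr.buf = (uint64_t)(uintptr_t)buf;
    attr.buf_len = len;

    n = fcntl(mqfd, F_MQ_PEEK, &attr);
    if (n < 0)
        return -1;
    if (prio)
        *prio = attr.msg_prio;
    return n;
}
```

A dump tool could call mq_peek_sketch() with index 0, 1, 2, ... until it returns -1, saving each payload and priority in turn. This relies on Linux representing an mqd_t as a file descriptor.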
struct msg_msgseg and the DATALEN_MSG / DATALEN_SEG macros are currently private to ipc/msgutil.c. struct msg_msg (already in the public kernel header include/linux/msg.h) carries a pointer to msg_msgseg, so msg_msgseg is an incomplete type for all callers outside msgutil.c.

Move the definition of struct msg_msgseg and the two DATALEN macros to include/linux/msg.h so that other IPC code can safely copy multi-segment message payloads into a kernel buffer under a spinlock, without calling store_msg(), which performs copy_to_user() and therefore cannot be used under a spinlock.

ipc/msgutil.c already includes <linux/msg.h>, so it picks up the definitions from the header with no functional change.

Signed-off-by: Shaurya Rane <ssrane_b23@ee.vjti.ac.in>
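To illustrate why callers need the segment definition and the two capacity macros, here is a userspace model of copying a segmented payload into one flat buffer with plain memcpy(). It is a sketch, not kernel code: the `_sketch` types and functions are hypothetical, the capacities are toy values (the kernel derives DATALEN_MSG and DATALEN_SEG from PAGE_SIZE), and the payload is modelled as an explicit array, whereas in the kernel the data follows each header in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-ins for DATALEN_MSG / DATALEN_SEG; illustrative values only. */
#define SKETCH_DATALEN_MSG 8
#define SKETCH_DATALEN_SEG 16

struct seg_sketch {                 /* models struct msg_msgseg */
    struct seg_sketch *next;
    unsigned char data[SKETCH_DATALEN_SEG];
};

struct msg_sketch {                 /* models the relevant part of struct msg_msg */
    size_t m_ts;                    /* total payload size in bytes */
    struct seg_sketch *next;        /* first continuation segment, or NULL */
    unsigned char data[SKETCH_DATALEN_MSG];
};

void msg_free_sketch(struct msg_sketch *msg)
{
    struct seg_sketch *seg = msg ? msg->next : NULL;

    while (seg) {
        struct seg_sketch *tmp = seg->next;
        free(seg);
        seg = tmp;
    }
    free(msg);
}

/* Split `len` bytes from `src` across a head block and chained segments. */
struct msg_sketch *msg_build_sketch(const void *src, size_t len)
{
    const unsigned char *p = src;
    struct msg_sketch *msg = calloc(1, sizeof(*msg));
    struct seg_sketch **tail;
    size_t chunk;

    if (!msg)
        return NULL;
    msg->m_ts = len;
    chunk = len < SKETCH_DATALEN_MSG ? len : SKETCH_DATALEN_MSG;
    memcpy(msg->data, p, chunk);
    p += chunk;
    len -= chunk;

    tail = &msg->next;
    while (len > 0) {
        struct seg_sketch *seg = calloc(1, sizeof(*seg));

        if (!seg) {
            msg_free_sketch(msg);
            return NULL;
        }
        chunk = len < SKETCH_DATALEN_SEG ? len : SKETCH_DATALEN_SEG;
        memcpy(seg->data, p, chunk);
        p += chunk;
        len -= chunk;
        *tail = seg;
        tail = &seg->next;
    }
    return msg;
}

/* Copy up to `dst_len` payload bytes into `dst` by walking the segment chain.
 * Only memcpy() is used, so the copy cannot fault or sleep. */
size_t msg_flatten_sketch(const struct msg_sketch *msg, void *dst, size_t dst_len)
{
    unsigned char *out = dst;
    const struct seg_sketch *seg;
    size_t left, chunk, copied = 0;

    assert(msg != NULL && dst != NULL);

    left = msg->m_ts < dst_len ? msg->m_ts : dst_len;
    chunk = left < SKETCH_DATALEN_MSG ? left : SKETCH_DATALEN_MSG;
    memcpy(out, msg->data, chunk);
    copied += chunk;
    left -= chunk;

    for (seg = msg->next; seg && left > 0; seg = seg->next) {
        chunk = left < SKETCH_DATALEN_SEG ? left : SKETCH_DATALEN_SEG;
        memcpy(out + copied, seg->data, chunk);
        copied += chunk;
        left -= chunk;
    }
    return copied;
}
```

Without the segment struct and the per-block capacities visible to the caller, the loop in msg_flatten_sketch() could not be written outside the file that defines them, which is the motivation the commit message gives for moving them to the header.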
…spection

Add support for F_MQ_PEEK, a new fcntl command that reads a POSIX message queue message by index without removing it from the queue.

Background:
CRIU (Checkpoint/Restore In Userspace) supports live container migration and process checkpoint/restore. POSIX message queues are a widely-used IPC mechanism, but CRIU cannot checkpoint processes that hold open mqueue file descriptors: there is no kernel interface to inspect queued messages non-destructively. The SysV IPC analogue (MSG_COPY for msgrcv) was introduced specifically for CRIU in commit 4a674f3 ("ipc: introduce message queue copy feature"). This patch provides the equivalent for POSIX mqueues.

Implementation:
The queue stores messages in a red-black tree (info->msg_tree) keyed by priority, with each tree node holding a FIFO list of messages at that priority level. mq_peek_at_offset() walks this structure in receive order (highest priority first, FIFO within priority) to locate the message at the requested index without modifying any state.

Message payload is copied into a kvmalloc'd kernel buffer under info->lock using pure memcpy() (no page faults possible). This correctly handles multi-segment messages by walking the msg_msgseg chain. The lock is released before copy_to_user() transfers the kernel buffer to userspace.

A new include/linux/mqueue.h kernel header is added to declare do_mq_peek() for use from fs/fcntl.c, following the same pattern as include/linux/memfd.h for memfd_fcntl().

Concurrency:
The snapshot is consistent within the spin_lock() critical section. Between two F_MQ_PEEK calls the queue may change (messages may be sent or received). This is documented snapshot semantics, analogous to /proc entries. CRIU freezes the target process via ptrace before dumping, so in practice the queue is stable for the entire checkpoint sequence.

Link: checkpoint-restore/criu#2285
Signed-off-by: Shaurya Rane <ssrane_b23@ee.vjti.ac.in>
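The receive-order walk described under "Implementation" can be modelled in userspace as follows. This is a simplified sketch and not the code of mq_peek_at_offset(): the red-black tree is replaced by a linked list of priority levels already sorted from highest to lowest, and all `_sketch` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct peek_msg_sketch {
    struct peek_msg_sketch *next;        /* next message in FIFO order */
    const char *payload;
};

struct prio_level_sketch {               /* stands in for one tree node */
    unsigned int priority;
    struct peek_msg_sketch *fifo_head;   /* oldest message at this priority */
    struct prio_level_sketch *lower;     /* next level with a lower priority */
};

/* Return the message at position `index` in receive order (highest priority
 * first, FIFO within a priority), or NULL if the queue holds fewer messages.
 * Nothing is unlinked or modified, so the queue state is left untouched. */
const struct peek_msg_sketch *
peek_at_index_sketch(const struct prio_level_sketch *highest, size_t index,
                     unsigned int *prio_out)
{
    const struct prio_level_sketch *level;
    const struct peek_msg_sketch *m;

    for (level = highest; level; level = level->lower) {
        /* The model requires levels in strictly descending priority order. */
        assert(!level->lower || level->lower->priority < level->priority);

        for (m = level->fifo_head; m; m = m->next) {
            if (index == 0) {
                if (prio_out)
                    *prio_out = level->priority;
                return m;
            }
            index--;
        }
    }
    return NULL;
}
```

With two messages at priority 9 and one at priority 1, indexes 0 and 1 select the priority-9 messages in arrival order, index 2 selects the priority-1 message, and index 3 returns NULL.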
Series: https://patchwork.kernel.org/project/linux-fsdevel/list/?series=1072498
Submitter: Shaurya Rane
Version: 1
Patches: 3/3
Message-ID: <20260325190025.40312-1-ssrane_b23@ee.vjti.ac.in>
Base: vfs.base.ci
Lore: https://lore.kernel.org/linux-fsdevel/20260325190025.40312-1-ssrane_b23@ee.vjti.ac.in
Automated by ml2pr