Skip to content

Commit

Permalink
futex: Implement mechanism to wait on any of several futexes
Browse files Browse the repository at this point in the history
This is a new futex operation, called FUTEX_WAIT_MULTIPLE, which allows
a thread to wait on several futexes at the same time, and be awoken by
any of them.  In a sense, it implements one of the features that was
supported by pooling on the old FUTEX_FD interface.

The use case lies in the Wine implementation of the Windows NT interface
WaitMultipleObjects. This Windows API function allows a thread to sleep
waiting on the first of a set of event sources (mutexes, timers, signal,
console input, etc) to signal.  Considering this is a primitive
synchronization operation for Windows applications, being able to quickly
signal events on the producer side, and quickly go to sleep on the
consumer side is essential for good performance of those running over Wine.

Wine developers have an implementation that uses eventfd, but it suffers
from FD exhaustion (there is applications that go to the order of
multi-milion FDs), and higher CPU utilization than this new operation.

The futex list is passed as an array of `struct futex_wait_block`
(pointer, value, bitset) to the kernel, which will enqueue all of them
and sleep if none was already triggered. It returns a hint of which
futex caused the wake up event to userspace, but the hint doesn't
guarantee that is the only futex triggered.  Before calling the syscall
again, userspace should traverse the list, trying to re-acquire any of
the other futexes, to prevent an immediate -EWOULDBLOCK return code from
the kernel.

This was tested using three mechanisms:

1) By reimplementing FUTEX_WAIT in terms of FUTEX_WAIT_MULTIPLE and
running the unmodified tools/testing/selftests/futex and a full linux
distro on top of this kernel.

2) By an example code that exercises the FUTEX_WAIT_MULTIPLE path on a
multi-threaded, event-handling setup.

3) By running the Wine fsync implementation and executing multi-threaded
applications, in particular modern games, on top of this implementation.

Changes were tested for the following ABIs: x86_64, i386 and x32.
Support for x32 applications is not implemented since it would
take a major rework adding a new entry point and splitting the current
futex 64 entry point in two and we can't change the current x32 syscall
number without breaking user space compatibility.

Included Valve's Proton compatibility code.

Adjusted for v5.9: Removed `put_futex_key` calls.

CC: Steven Rostedt <rostedt@goodmis.org>
Cc: Richard Yao <ryao@gentoo.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Co-developed-by: Zebediah Figura <z.figura12@gmail.com>
Signed-off-by: Zebediah Figura <z.figura12@gmail.com>
Co-developed-by: Steven Noonan <steven@valvesoftware.com>
Signed-off-by: Steven Noonan <steven@valvesoftware.com>
Co-developed-by: Pierre-Loup A. Griffais <pgriffais@valvesoftware.com>
Signed-off-by: Pierre-Loup A. Griffais <pgriffais@valvesoftware.com>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
[Added compatibility code]
Co-developed-by: André Almeida <andrealmeid@collabora.com>
Signed-off-by: André Almeida <andrealmeid@collabora.com>
Signed-off-by: Alexandre Frade <kernel@xanmod.org>
  • Loading branch information
krisman-at-collabora authored and xanmod committed Jul 29, 2021
1 parent 62c41f2 commit e32cf8b
Show file tree
Hide file tree
Showing 2 changed files with 370 additions and 1 deletion.
20 changes: 20 additions & 0 deletions include/uapi/linux/futex.h
Expand Up @@ -21,6 +21,7 @@
#define FUTEX_WAKE_BITSET 10
#define FUTEX_WAIT_REQUEUE_PI 11
#define FUTEX_CMP_REQUEUE_PI 12
#define FUTEX_WAIT_MULTIPLE 31

#define FUTEX_PRIVATE_FLAG 128
#define FUTEX_CLOCK_REALTIME 256
Expand All @@ -40,6 +41,8 @@
FUTEX_PRIVATE_FLAG)
#define FUTEX_CMP_REQUEUE_PI_PRIVATE (FUTEX_CMP_REQUEUE_PI | \
FUTEX_PRIVATE_FLAG)
#define FUTEX_WAIT_MULTIPLE_PRIVATE (FUTEX_WAIT_MULTIPLE | \
FUTEX_PRIVATE_FLAG)

/* Size argument to futex2 syscall */
#define FUTEX_32 2
Expand Down Expand Up @@ -181,4 +184,21 @@ struct robust_list_head {
(((op & 0xf) << 28) | ((cmp & 0xf) << 24) \
| ((oparg & 0xfff) << 12) | (cmparg & 0xfff))

/*
* Maximum number of multiple futexes to wait for
*/
#define FUTEX_MULTIPLE_MAX_COUNT 128

/**
* struct futex_wait_block - Block of futexes to be waited for
* @uaddr: User address of the futex
* @val: Futex value expected by userspace
* @bitset: Bitset for the optional bitmasked wakeup
*/
struct futex_wait_block {
__u32 __user *uaddr;
__u32 val;
__u32 bitset;
};

#endif /* _UAPI_LINUX_FUTEX_H */

0 comments on commit e32cf8b

Please sign in to comment.