futex2: Implement wait and wake functions
Create a new set of futex syscalls known as futex2. This new interface
aims to provide more maintainable code, while removing obsolete
features and adding new functionality.
Implement wait and wake semantics for futexes, along with the base
infrastructure for future operations. The whole wait path is designed to
be used by N waiters, which makes it easier to implement vectorized wait.
* Syscalls implemented by this patch:
- futex_wait(void *uaddr, unsigned int val, unsigned int flags,
struct timespec *timo)
The user thread is put to sleep, waiting for a futex_wake() at uaddr,
if the value at *uaddr is the same as val (otherwise, the syscall
returns immediately with -EAGAIN). timo is an optional timeout value
for the operation.
Return 0 on success, error code otherwise.
- futex_wake(void *uaddr, unsigned long nr_wake, unsigned int flags)
Wake `nr_wake` threads waiting at uaddr.
Return the number of woken threads on success, error code otherwise.
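The compare-then-sleep behaviour described above can be tried out with
the existing futex() syscall, used here purely as a stand-in: this
message does not give the futex2 syscall numbers, so the sketch below
does not call the new interface. The wrapper names legacy_futex_wait()
and legacy_futex_wake() are hypothetical, and Linux with glibc is
assumed.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for syscall() */
#endif
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>        /* FUTEX_WAIT_PRIVATE, FUTEX_WAKE_PRIVATE */
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>        /* SYS_futex */
#include <unistd.h>

/*
 * Stand-in for futex_wait() built on the legacy futex() syscall.
 * Sleeps only if *uaddr == val; no timeout is used in this sketch.
 * Returns 0 when woken, or a negative errno value (-EAGAIN if the
 * value at *uaddr differs from val).
 */
long legacy_futex_wait(uint32_t *uaddr, uint32_t val)
{
    assert(uaddr != NULL);
    long ret = syscall(SYS_futex, uaddr, FUTEX_WAIT_PRIVATE, val,
                       NULL, NULL, 0);
    return ret == -1 ? -errno : ret;
}

/*
 * Stand-in for futex_wake(): wakes up to nr_wake waiters at uaddr.
 * Returns the number of woken threads, or a negative errno value.
 */
long legacy_futex_wake(uint32_t *uaddr, int nr_wake)
{
    assert(uaddr != NULL);
    long ret = syscall(SYS_futex, uaddr, FUTEX_WAKE_PRIVATE, nr_wake,
                       NULL, NULL, 0);
    return ret == -1 ? -errno : ret;
}
```

With a word holding 1, legacy_futex_wait(&word, 0) returns -EAGAIN
right away instead of sleeping, and legacy_futex_wake(&word, 1) returns
0 when nobody is waiting.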
** The `flags` argument
The flags are used to specify the size of the futex word
(FUTEX_[8, 16, 32]). It's mandatory to define one, since there's no
default size.
By default, the timeout uses a monotonic clock, but a realtime clock can be
used instead by setting the FUTEX_REALTIME_CLOCK flag.
By default, futexes are of the private type, which means that the user address
will only be accessed by threads that share the same memory region. This allows
for some internal optimizations, so they are faster. However, if the address
needs to be shared between different processes (like using `mmap()` or `shm()`),
the futex needs to be defined as shared, and the flag FUTEX_SHARED_FLAG is used
to set that.
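As a rough illustration of the rule that exactly one size must be given,
here is a small userspace check. The flag names come from this message,
but the numeric values are placeholders made up for the example; the
real header may encode the size as a multi-bit field rather than as
independent bits. FUTEX_SIZE_BITS and futex2_flags_have_one_size() are
hypothetical names.

```c
#include <assert.h>     /* static_assert (C11) */
#include <stdbool.h>

/* Placeholder values, NOT the real ABI: assumed only for this sketch. */
#define FUTEX_8               0x01u
#define FUTEX_16              0x02u
#define FUTEX_32              0x04u
#define FUTEX_REALTIME_CLOCK  0x08u
#define FUTEX_SHARED_FLAG     0x10u

/* Hypothetical helper mask covering the three size flags. */
#define FUTEX_SIZE_BITS       (FUTEX_8 | FUTEX_16 | FUTEX_32)

static_assert((FUTEX_SIZE_BITS &
               (FUTEX_REALTIME_CLOCK | FUTEX_SHARED_FLAG)) == 0,
              "size bits must not overlap the other flags");

/* True if flags name exactly one futex word size (there is no default). */
bool futex2_flags_have_one_size(unsigned int flags)
{
    unsigned int size = flags & FUTEX_SIZE_BITS;

    /* Non-zero and a power of two means exactly one size bit is set. */
    return size != 0 && (size & (size - 1)) == 0;
}
```

Under these assumptions, FUTEX_32 | FUTEX_SHARED_FLAG passes the check,
while FUTEX_SHARED_FLAG alone (no size) and FUTEX_8 | FUTEX_16 (two
sizes) do not.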
By default, the operation has no NUMA-awareness, meaning that the user can't
choose the memory node where the kernel side futex data will be stored. The
user can choose the node where it wants to operate by setting the
FUTEX_NUMA_FLAG and using the following structure (where X can be 8, 16, or
32):
struct futexX_numa {
__uX value;
__sX hint;
};
This structure should be passed as the `void *uaddr` of futex functions. The
address of the structure will be used to wait on/wake on, and the
`value` will be compared to `val` as usual. The `hint` member is used to
define which node the futex will use. When waiting, the futex will be
registered on a kernel-side table stored on that node; when waking, the futex
will be searched for on that given table. That means that there's no redundancy
between tables, and a wrong `hint` value will lead to undesired behavior.
Userspace is responsible for dealing with node migration issues that may
occur. `hint` can range from [0, MAX_NUMA_NODES], for specifying a node, or
-1, to use the same node the current process is using.
When not using FUTEX_NUMA_FLAG on a NUMA system, the futex will be stored on a
global table on some node, defined at compilation time.
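For X = 32, the structure and the documented `hint` range could look
like the sketch below. The value of MAX_NUMA_NODES is not stated in this
message, so EXAMPLE_MAX_NUMA_NODES is a placeholder, and
futex_numa_hint_valid() is a hypothetical userspace helper, not part of
the patch.

```c
#include <assert.h>        /* static_assert (C11) */
#include <linux/types.h>   /* __u32, __s32 */
#include <stdbool.h>
#include <stddef.h>        /* offsetof */

/* Placeholder: the real MAX_NUMA_NODES value is not given here. */
#define EXAMPLE_MAX_NUMA_NODES 64

/* The futexX_numa layout from this message, with X = 32. */
struct futex32_numa {
    __u32 value;   /* compared against `val`, as for a plain futex */
    __s32 hint;    /* NUMA node to use, or -1 for the current node */
};

/*
 * The structure's address is what gets waited/woken on, so `value`
 * is expected to sit at the very start of the structure.
 */
static_assert(offsetof(struct futex32_numa, value) == 0,
              "value must be the first member");

/* True if hint is -1 or lies within [0, EXAMPLE_MAX_NUMA_NODES]. */
bool futex_numa_hint_valid(__s32 hint)
{
    return hint == -1 || (hint >= 0 && hint <= EXAMPLE_MAX_NUMA_NODES);
}
```

A waiter and its waker would need to agree on the same `hint`, since
each node has its own table and a futex registered on one node is not
looked up on another.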
** The `timo` argument
As per the Y2038 work done in the kernel, new interfaces shouldn't add timeout
options known to be buggy. Given that, `timo` should be a 64-bit timeout on
all platforms, using an absolute timeout value.
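Since the timeout is absolute, a caller that thinks in terms of "wait
at most N nanoseconds" has to turn that into a deadline first. A minimal
sketch, assuming the default monotonic clock: struct timo64 is a
hypothetical stand-in with 64-bit fields (the exact type the syscall
takes is not spelled out beyond `struct timespec *timo`), and
deadline_after() and monotonic_deadline() are illustrative names.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() */
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical 64-bit timeout layout, assumed for this sketch only. */
struct timo64 {
    int64_t tv_sec;
    int64_t tv_nsec;
};

/* Return now + rel_ns as a normalized absolute time (rel_ns >= 0). */
struct timo64 deadline_after(struct timo64 now, int64_t rel_ns)
{
    struct timo64 d;
    int64_t nsec;

    assert(rel_ns >= 0);
    nsec = now.tv_nsec + rel_ns % NSEC_PER_SEC;
    d.tv_sec = now.tv_sec + rel_ns / NSEC_PER_SEC + nsec / NSEC_PER_SEC;
    d.tv_nsec = nsec % NSEC_PER_SEC;
    return d;
}

/* Fill *out with the monotonic deadline rel_ns from now; 0 on success. */
int monotonic_deadline(int64_t rel_ns, struct timo64 *out)
{
    struct timespec now;

    if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
        return -1;
    *out = deadline_after((struct timo64){ now.tv_sec, now.tv_nsec },
                          rel_ns);
    return 0;
}
```

If FUTEX_REALTIME_CLOCK is set, the deadline would presumably have to be
computed from CLOCK_REALTIME instead, so that it is on the same clock
the kernel compares against.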
Signed-off-by: André Almeida <andrealmeid@collabora.com>
Rebased-by: Joshua Ashton <joshua@froggi.es>