Introduce support for running code in fibers #39771
Traditionally, asynchronous programming in systemd has been achieved using sd-event along with the asynchronous interfaces of sd-bus and sd-varlink. This works well when the system is reacting to events and all code triggered by those events can run without blocking. In these scenarios, the global Manager object is passed as userdata to the callback, and the callback can use the stack as usual, declaring local state and ensuring proper cleanup via _cleanup_. Control flow structures, such as loops, work as expected, and everything runs smoothly.

However, challenges arise when the code needs to perform long-running operations within these callbacks. Since the system cannot block execution within the callback, we can't directly invoke a long-running operation and wait for its result without introducing complexities. Instead, we need to initiate the long-running task, register for completion with sd-event, sd-bus, or sd-varlink, and provide a callback to be invoked when the operation completes.

This callback, however, only receives a single userdata pointer, which forces us to bundle all local variables into a struct and pass it along as part of the callback. On top of that, after queuing the asynchronous operation, the caller continues executing. As the caller's stack unwinds when the function exits, the resources and state within the local scope may be prematurely cleaned up. Therefore, the struct must store copies of the local variables or ensure proper reference counting to prevent premature resource cleanup.

When multiple long-running operations need to be initiated within a loop, the complexity grows further. We must introduce additional shared state to track the completion of all operations before we can run any code that depends on their results.

Furthermore, since the daemon may be shut down at any time, we must track the lifecycle of each long-running operation in the global Manager struct, ensuring proper cleanup even when stack unwinding can no longer manage the resources for us.

Fibers, or green threads, provide a more natural way of handling asynchronous operations. By enabling cooperative multitasking within a single thread, fibers allow us to write code that looks like it’s running synchronously, but with the ability to yield control at predefined points, such as when waiting for long-running tasks to complete.

With fibers, we can simplify the control flow by running asynchronous operations within a fiber, allowing us to "pause" execution while waiting for the long-running operation to finish and then "resume" the operation once it's complete. This eliminates the need for multiple callback chains, extensive state tracking, and the potential pitfalls of stack unwinding.

This commit introduces the ability to execute long-running operations in a non-blocking manner while maintaining the simplicity and readability of synchronous code. The fiber-based approach will significantly improve the handling of complex workflows, making the code easier to write and maintain.

The implementation is based on ucontext.h's makecontext() (with a fallback to the venerable sigaltstack() approach on musl), sigsetjmp()/siglongjmp() and sd-event. ucontext.h provides us with alternate stacks that we can switch between. We use sigsetjmp()/siglongjmp() instead of swapcontext() because the latter forcibly saves/restores a per context signal mask every time it is called. Using sigsetjmp()/siglongjmp(), we can avoid the unnecessary syscall and maintain a per thread signal mask, which makes much more sense than having a per fiber signal mask.

The default stack size is the same as a regular thread.
Because we use mmap() to allocate the stack, the memory won't actually be used until it is paged in by the kernel, so we don't actually use 8MB per fiber.

To integrate fibers with the event loop, each fiber is assigned a deferred event source which resumes the fiber when enabled. The deferred event source is oneshot by default so the fiber will run immediately until it yields or suspends. If it yields, the deferred event source is enabled again (oneshot) immediately. If it suspends, before it suspends, one or more event sources are registered with sd-event that will enable the deferred event source (oneshot) to resume the fiber once the operation it is waiting for completes.

Yielding or suspending the fiber is done by calling sd_fiber_yield() or sd_fiber_suspend() respectively. Both of these return zero on success or any error value from the async operation that caused the fiber to resume.

This is also how fiber cancellation is implemented. When a fiber is cancelled, sd_fiber_yield() and sd_fiber_suspend() will return ECANCELED when the fiber is resumed, allowing the fiber to unwind its stack (which allows cleanup to happen automatically) and finish.

Instead of having applications work directly with fibers, we hide them behind a generic futures interface to represent long-running operations, regardless of whether those operations are running on a fiber or not. Aside from fibers, the futures library (sd-future) will for example allow waiting for sd-event sources and doing sd-bus calls in the background as well. Fibers can suspend until a future is ready with sd_fiber_await() or by having the future wake up the fiber explicitly in its callback. A future always defaults to waking up the current fiber.

Each future kind plugs into the library by providing an sd_future_ops vtable (alloc, free, cancel, set_priority). The library treats the impl pointer returned by alloc() as a black box. Future implementations retrieve it via sd_future_get_private().

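The fiber mechanics described above (an mmap()ed stack, a body that runs until it yields, and a caller that resumes it) can be sketched in plain C as follows. This is only an illustration under stated assumptions: every `demo_*` name is hypothetical and not part of sd-future, a plain loop stands in for the sd-event deferred source, and the switch is done with swapcontext() for brevity, whereas the description says the real implementation switches with sigsetjmp()/siglongjmp() to avoid the per-switch signal mask syscall.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <ucontext.h>

#define DEMO_STACK_SIZE (256 * 1024)

/* All names are illustrative; none of them are systemd APIs. */
typedef struct DemoFiber {
        ucontext_t caller;   /* context to return to on yield or finish */
        ucontext_t self;     /* the fiber's own saved context */
        void *stack;         /* lazily paged-in stack from mmap() */
        int steps;           /* progress made by the fiber body */
        int finished;
} DemoFiber;

static DemoFiber *demo_current;

/* Give control back to whoever resumed us. */
static void demo_yield(void) {
        DemoFiber *f = demo_current;
        swapcontext(&f->self, &f->caller);
}

/* Fiber body: straight-line code with a loop and yield points. */
static void demo_body(void) {
        DemoFiber *f = demo_current;

        for (int i = 0; i < 3; i++) {
                f->steps++;
                demo_yield();
        }
        f->finished = 1;
        /* Returning switches to uc_link, i.e. back to the caller. */
}

static int demo_fiber_init(DemoFiber *f) {
        memset(f, 0, sizeof(*f));
        f->stack = mmap(NULL, DEMO_STACK_SIZE, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
        if (f->stack == MAP_FAILED)
                return -1;
        if (getcontext(&f->self) < 0)
                return -1;
        f->self.uc_stack.ss_sp = f->stack;
        f->self.uc_stack.ss_size = DEMO_STACK_SIZE;
        f->self.uc_link = &f->caller;
        makecontext(&f->self, demo_body, 0);
        return 0;
}

/* Run the fiber until its next yield (or until it finishes). */
static void demo_resume(DemoFiber *f) {
        demo_current = f;
        swapcontext(&f->caller, &f->self);
        demo_current = NULL;
}

/* Drive one fiber to completion; returns resumes * 10 + steps. */
int demo_run(void) {
        DemoFiber f;
        int resumes = 0;

        if (demo_fiber_init(&f) < 0)
                return -1;
        while (!f.finished) {
                demo_resume(&f);
                resumes++;
        }
        munmap(f.stack, DEMO_STACK_SIZE);
        return resumes * 10 + f.steps;
}
```

Calling `demo_run()` resumes the fiber four times: three resumes that each end in a yield, plus a final one in which the body returns.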
A future starts in SD_FUTURE_PENDING and transitions exactly once to SD_FUTURE_RESOLVED, carrying an integer result. Consumers can react to that transition either by installing a one-shot callback with sd_future_set_callback() (callback-style code) or by waiting on it from a fiber via sd_fiber_await() (synchronous-looking fiber code). sd_fiber_await() is itself built on a "wait future" that resolves when its target resolves; sd_future_new_wait() exposes the same primitive directly so non-fiber callers can chain futures without involving a fiber.

Cancellation is cooperative: sd_future_cancel() invokes the future impl's cancel callback, which is responsible for tearing down its work and ultimately resolving the promise with -ECANCELED. For fiber futures this is what surfaces as the ECANCELED return from sd_fiber_yield()/sd_fiber_suspend() mentioned above.

Fire-and-forget fibers (created by passing a NULL ret to sd_fiber_new()) take a self-reference on their future so they outlive the caller's scope. The self-ref is dropped when the fiber resolves. This floating mechanism (sd_fiber_set_floating()) is restricted to fiber futures because they uniquely guarantee resolution; allowing it for arbitrary future kinds would risk silent leaks for kinds that may never resolve.

Note that fiber cleanup depends on the runtime operating normally. Each fiber's _cleanup_-style cleanups live on the fiber's own stack and run only when the fiber is resumed and allowed to unwind, which requires a working event loop to drive it to completion. The exit event source registered for top-level fibers ensures unwind on a normal sd_event_exit(), but if the event loop itself terminates abnormally (e.g. an unrecoverable allocation failure mid-dispatch) before all fibers have resolved, their stacks never unwind and any resources they own leak.

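The pending-to-resolved transition and the one-shot callback described above can be sketched as a tiny state machine. This is a hypothetical illustration, not the sd-future implementation: the `Demo*` types, the `demo_future_*()` functions and the choice of `-ESTALE` for a second resolution are all assumptions, and cancellation is collapsed into a direct resolve with `-ECANCELED` instead of going through a per-kind cancel callback.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the states named in the text. */
typedef enum DemoState { DEMO_PENDING, DEMO_RESOLVED } DemoState;

typedef struct DemoFuture DemoFuture;
typedef void (*demo_callback_t)(DemoFuture *f, void *userdata);

struct DemoFuture {
        DemoState state;
        int result;                /* valid once state == DEMO_RESOLVED */
        demo_callback_t callback;  /* one-shot, cleared before it runs */
        void *userdata;
};

/* Resolve the future exactly once; later attempts are refused. */
int demo_future_resolve(DemoFuture *f, int result) {
        demo_callback_t cb;

        if (f->state != DEMO_PENDING)
                return -ESTALE; /* assumption: error code picked for the demo */

        f->state = DEMO_RESOLVED;
        f->result = result;

        cb = f->callback;
        f->callback = NULL;     /* one-shot: never invoked twice */
        if (cb)
                cb(f, f->userdata);
        return 0;
}

/* Simplified cancellation: resolve with -ECANCELED. */
int demo_future_cancel(DemoFuture *f) {
        return demo_future_resolve(f, -ECANCELED);
}

/* Example consumer callback: counts how often it was invoked. */
static void demo_count_cb(DemoFuture *f, void *userdata) {
        (void) f;
        (*(int *) userdata)++;
}
```

A consumer would initialise a `DemoFuture` in `DEMO_PENDING` with `demo_count_cb` installed; after `demo_future_cancel()` the result is `-ECANCELED`, the callback has run once, and any further resolve attempt is rejected.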
The code lives in libsystemd as sd-future (not exported) for the following reasons:
- We may want to make this a public libsystemd API in the future
- The code can't live in src/basic as it makes heavy use of sd-event
- The code can't live in src/shared as sd-bus and sd-event make use of it

The log and log-context headers are updated with functions to allow fibers to have their own log prefix and log context.
Add a family of sd_fiber_*() I/O wrappers that, when called from a
fiber, behave like blocking I/O from the caller's perspective but
yield to the event loop instead of blocking the thread:
sd_fiber_read / sd_fiber_write
sd_fiber_readv / sd_fiber_writev
sd_fiber_recv / sd_fiber_send
sd_fiber_connect
sd_fiber_recvmsg / sd_fiber_sendmsg
sd_fiber_recvfrom / sd_fiber_sendto
sd_fiber_accept
sd_fiber_ppoll
Most of them share a single helper, fiber_io_operation(), which when
invoked outside a fiber falls through to the underlying syscall
directly, preserving the regular blocking behaviour. Inside a fiber
the helper flips the fd to non-blocking (restoring its original mode
on return), tries the syscall once on the fast path, and on EAGAIN/
EWOULDBLOCK creates an sd-event-backed IO future via future_new_io(),
suspends the fiber, and retries the syscall once the event source
fires.
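The steps attributed to fiber_io_operation() above (flip to non-blocking, try once, wait on EAGAIN, retry, restore the mode) can be sketched as follows. This is a minimal sketch under assumptions, not the actual helper: `demo_read_suspending()` and the `wait` callback are hypothetical, and the callback stands in for "create an IO future and suspend the fiber". `demo_feed_on_wait()` plays the role of the event loop running another fiber that produces data in the meantime.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical hook: called when the fd is not ready yet. */
typedef int (*demo_wait_fn)(int fd, void *userdata);

ssize_t demo_read_suspending(int fd, void *buf, size_t n,
                             demo_wait_fn wait, void *userdata) {
        ssize_t r;
        int flags;

        flags = fcntl(fd, F_GETFL);
        if (flags < 0)
                return -errno;

        /* Flip the fd to non-blocking for the duration of the call. */
        if (!(flags & O_NONBLOCK) && fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
                return -errno;

        for (;;) {
                r = read(fd, buf, n);           /* fast path first */
                if (r >= 0)
                        break;
                if (errno == EINTR)
                        continue;
                if (errno != EAGAIN && errno != EWOULDBLOCK) {
                        r = -errno;
                        break;
                }

                /* Not ready: "suspend" until the fd is readable, then retry. */
                int k = wait(fd, userdata);
                if (k < 0) {
                        r = k;
                        break;
                }
        }

        /* Restore the original mode on return. */
        if (!(flags & O_NONBLOCK))
                (void) fcntl(fd, F_SETFL, flags);

        return r;
}

/* Stand-in for another fiber writing to the pipe while we are suspended. */
static int demo_feed_on_wait(int fd, void *userdata) {
        int write_fd = *(int *) userdata;

        (void) fd;
        return write(write_fd, "hi", 2) == 2 ? 0 : -errno;
}
```

With an empty pipe, the first read() fails with EAGAIN, the wait hook runs, and the retry returns the two bytes; the read end is back in blocking mode afterwards.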
future_new_io() itself is added to sd-event/event-future.{c,h} as a
new IoFuture kind. It wraps sd_event_add_io() into an sd_future:
oneshot enable, EPOLLERR translated via SO_ERROR (suppressed for
non-sockets), and the fd duplicated with F_DUPFD_CLOEXEC to avoid
EEXIST when multiple sources watch the same descriptor.
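The EEXIST problem mentioned above is a property of epoll itself: registering the same descriptor twice on one epoll instance fails, while a duplicate of that descriptor can be registered alongside it. The sketch below shows this with raw epoll; the `demo_*` helpers are hypothetical and not the future_new_io() code.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Register fd for readability; returns 0 or a negative errno. */
int demo_watch(int epfd, int fd) {
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };

        return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0 ? -errno : 0;
}

/* Register a private duplicate of fd instead; returns the duplicate
 * (owned by the caller) or a negative errno. */
int demo_watch_dup(int epfd, int fd) {
        int copy, r;

        copy = fcntl(fd, F_DUPFD_CLOEXEC, 3);
        if (copy < 0)
                return -errno;

        r = demo_watch(epfd, copy);
        if (r < 0) {
                close(copy);
                return r;
        }
        return copy;
}
```

Watching the same pipe end twice with `demo_watch()` yields `-EEXIST` on the second call, whereas `demo_watch_dup()` succeeds and returns a new descriptor number.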
Together these let fiber-using code write straight-line socket and
pipe I/O without bundling state into callbacks.
Some helpers in src/basic block the calling thread: ppoll_usec_full() (used by fd_wait_for_event()), loop_read(), loop_read_exact(), loop_write_full() and pidref_wait_for_terminate_full(). That's the right behaviour outside a fiber but not inside one, where blocking the thread also stalls every other fiber running on the same event loop.

Rewriting every caller to pick a fiber or non-fiber variant explicitly would be a lot of churn and would split otherwise-shared code paths in two. Instead, the helpers detect at runtime whether they're running on a fiber and dispatch to a suspending variant when they are. FiberOps in fiber-ops.h holds five function pointers (ppoll, read, write, timeout, cancel_wait_unref); a fiber_ops global constant is populated whenever we enter a fiber with functions that delegate to suspending variants of common syscalls. With this approach, the variants themselves stay in libsystemd, which is required because they make use of sd-event.

- loop_read()/loop_read_exact() take the fiber read hook on a fiber unless the caller asked for a non-blocking attempt (do_poll=false) and the fd is already non-blocking; in that case we fall through to read() to preserve the existing return-EAGAIN-immediately semantic. The hook itself suspends on EAGAIN until data is available, so neither the do_poll knob nor the explicit fd_wait_for_event() retry loop are needed on the fiber path.
- loop_write_full() likewise takes the fiber write hook on a fiber, except when timeout=0 with an already-non-blocking fd (preserving the fast-return-EAGAIN semantic). The fiber path runs inside a FIBER_OPS_TIMEOUT() scope so the caller's timeout is honoured via a deadline future, mirroring SD_FIBER_TIMEOUT() but reachable from src/basic without pulling in sd-future.h.
- pidref_wait_for_terminate_full() polls the pidfd via fd_wait_for_event() before each waitid() when either a finite timeout is set or we're on a fiber, and requires pidref->fd >= 0 in those cases (returning -ENOMEDIUM otherwise, extending the rule that already applied to finite timeouts). The poll suspends the fiber via the ppoll hook above; the subsequent waitid() doesn't block because the pidfd is already signalled.
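The runtime dispatch described above, where a low-level helper consults a table of hooks that is only set while a fiber is running, can be sketched like this. All names (`DemoOps`, `demo_loop_read()`, `demo_enter_fiber()` and so on) are hypothetical, the table has a single hook instead of five, and the "suspending" hook merely counts its invocations; the point is only that the helper needs no link-time dependency on the code providing the hooks.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical, reduced version of a hook table. */
typedef struct DemoOps {
        ssize_t (*read)(int fd, void *buf, size_t n);
} DemoOps;

/* NULL while no fiber is running. */
static const DemoOps *demo_ops = NULL;

/* Low-level helper: the same call site works on and off a fiber. */
ssize_t demo_loop_read(int fd, void *buf, size_t n) {
        ssize_t r;

        if (demo_ops && demo_ops->read)
                return demo_ops->read(fd, buf, n);

        r = read(fd, buf, n);   /* plain blocking path */
        return r < 0 ? -errno : r;
}

/* Higher-layer side: a "suspending" variant (only counts calls here). */
static int demo_hook_calls;

static ssize_t demo_suspending_read(int fd, void *buf, size_t n) {
        ssize_t r;

        demo_hook_calls++;
        r = read(fd, buf, n);
        return r < 0 ? -errno : r;
}

static const DemoOps demo_fiber_ops = { .read = demo_suspending_read };

void demo_enter_fiber(void) { demo_ops = &demo_fiber_ops; }
void demo_leave_fiber(void) { demo_ops = NULL; }
```

Outside of `demo_enter_fiber()`/`demo_leave_fiber()` the helper calls read() directly; between them every call is routed through the hook.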
…iber

sd_event_run() blocks the calling thread on the event loop's epoll fd until something happens. When the caller is a fiber, that's the wrong behaviour: blocking the thread also stalls every other fiber and the outer event loop driving them. The most common way to hit this is a fiber that creates its own inner event loop (e.g. a server-style fiber that wants to dispatch its own sources independently of whatever loop the test or supervising fiber is running on). With the existing implementation the inner sd_event_run() would hold the thread while the outer scheduler should be free to advance other fibers.

Add an event_run_suspend() variant in sd-event/event-future.c that performs the same prepare/wait/dispatch dance, but when the fast path finds nothing ready it (a) creates an IO future watching the inner event loop's epoll fd on the *outer* event loop, (b) optionally creates a time future for the timeout, and (c) suspends the fiber. When either future fires the fiber is resumed and the prepare/wait/dispatch sequence runs once more to actually dispatch what's pending. sd_event_run() checks sd_fiber_is_running() and delegates to this variant when on a fiber; profile_delays accounting is intentionally skipped on that path since the underlying prepare/wait/dispatch primitives already account for themselves.
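Watching an inner loop's epoll fd from an outer loop works because an epoll descriptor is itself pollable: it becomes readable when the inner instance has pending events. The sketch below demonstrates this with raw epoll and hypothetical `demo_*` helpers; it does not use sd-event or the event_run_suspend() code.

```c
#include <assert.h>
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Register fd for readability; returns 0 or a negative errno. */
int demo_epoll_add(int epfd, int fd) {
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };

        return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0 ? -errno : 0;
}

/* Non-blocking check: returns one ready fd, -EAGAIN if nothing is
 * pending, or another negative errno on failure. */
int demo_epoll_ready_fd(int epfd) {
        struct epoll_event ev;
        int n;

        n = epoll_wait(epfd, &ev, 1, 0);
        if (n < 0)
                return -errno;
        if (n == 0)
                return -EAGAIN;
        return ev.data.fd;
}
```

If a pipe's read end is registered on an inner epoll instance and the inner instance is registered on an outer one, writing to the pipe makes the outer instance report the inner fd, after which the inner instance reports the pipe.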
Two changes to teach sd-bus how to behave when called from a fiber, in
order of increasing depth:
1. sd_bus_call() now redirects to a new bus_call_suspend() helper when
   the caller is a fiber whose event loop is the same one the bus is
   attached to. The plain bus_poll() path serializes all bus traffic on
   the slot's reply (only one method call can be in flight per
   sd_bus*), which would defeat the point of running multiple fibers
   against one bus. bus_call_suspend() builds on the async sd-bus API:
   it wraps the call in a new BusFuture (sd-bus/bus-future.{c,h}) that
   resolves when the reply or method-error arrives, lets the fiber
   await that future, and surfaces the reply to the caller via
   future_get_bus_reply(). Because the futures live on the event loop
   rather than a per-bus slot, multiple fibers can drive concurrent
   method calls against the same bus.
2. A new private SD_BUS_VTABLE_METHOD_FIBER flag dispatches a vtable
   method handler on its own fiber, so handlers are free to use
   sd_bus_call() against the same bus, sd_fiber_sleep(), loop_read(),
   etc. without stalling the event loop for other connections or
   handlers. The flag stays out of sd-bus-vtable.h (its bit value is
   reserved there to prevent collisions): the fiber runtime is a
   systemd-internal implementation detail.
Lifecycle of fiber-dispatched handlers is tracked on the bus itself: a
new bus->fiber_futures set holds a ref to each in-flight handler.
bus_enter_closing() cancels every entry and process_closing() returns
with the bus still in CLOSING state until the set drains, so we can be
sure no fiber handler outlives the bus. bus_fiber_resolved() removes
the entry on completion. bus_free()'s assert(set_isempty()) makes the
invariant load-bearing.
Note that plain sd_bus_call() already works correctly on a fiber as it
calls ppoll_usec() which has already been modified to suspend when
running on a fiber.
To exercise these changes the existing thread-based client/server
sd-bus tests (test-bus-chat, test-bus-objects, test-bus-peersockaddr,
test-bus-server, test-bus-watch-bind) are migrated to fibers, and a
new test-bus-fiber is added that covers SD_BUS_VTABLE_METHOD_FIBER —
including handlers that issue nested sd_bus_call() on the same bus, the
cancel-on-close path, and concurrent dispatches across multiple fibers.
Add varlink_server_bind_fiber() and varlink_server_bind_fiber_many()
in varlink-util.{c,h} for registering a method handler that should
run on a dedicated fiber per dispatch. The fiber-bound methods live
in a separate s->fiber_methods map alongside the regular s->methods;
bind_internal()/bind_many_internal() are factored out so the regular
and fiber bind variants share their parsing/insertion code.
Registering the same method in both maps is rejected because the
dispatcher consults the regular map first and would otherwise
silently shadow the fiber binding.
varlink_dispatch_fiber() builds a VarlinkFiberData (refs to the
connection, parameters, and method name), spawns a fiber via
sd_fiber_new(), and makes the future floating so the fiber
self-manages its lifetime — neither the dispatcher nor the
connection has to track it. The fiber's priority is set to one
below the connection's quit event source so that on graceful
shutdown the fiber's exit handler fires (and runs its cleanup)
before varlink's quit_callback() closes the connection underneath
it; this is what lets a fiber-bound handler reply or flush its
sentinel on a still-open connection during shutdown.
The connection state transitions are reordered so they happen before
the fiber spawn rather than after the synchronous callback returns:
the fiber runs after dispatch has already moved past PROCESSING, which
matches the behaviour expected for a deferred reply (the fiber may
either reply immediately, or stash the connection and reply later, in
which case the post-callback logic treats it as a PENDING_METHOD).
Note that all the synchronous varlink APIs (sd_varlink_call() and friends)
already behave properly when on a fiber because they call json_stream_wait()
which calls ppoll_usec() which we already fixed to suspend when called from
a fiber.
The client/server varlink tests are migrated to fibers (threads → mock
server fibers on the same event loop) to exercise the new paths.
The synchronous qmp_client_call() pumps the event loop until its reply arrives, pinning the parsed reply on c->current so it can hand out borrowed pointers to the caller. That model only fits one in-flight sync call: a second qmp_client_call() on the same client clears c->current before issuing its own send, invalidating the first caller's borrowed pointers. On a single-threaded event loop that was fine, but with fibers two concurrent calls on the same client can interleave through the pump (json_stream_wait() suspends the running fiber) and trample each other. To fix this, make qmp_client_call() detect when it's running on a fiber whose event loop matches the client and transparently delegate to qmp_client_call_suspend(), which makes use of a new QmpFuture to allow multiple concurrent calls to qmp_client_call(). To make this work concurrently, we also change qmp_client_call() to hand out references and copies of errors so that we don't have to store the borrowed pointers we hand out in the QmpClient struct.
The mock servers used to be driven out-of-band: each test created a
socketpair, forked a child, ran a hand-coded request/response script
against the raw fd, and sent SIGTERM to tear it down. That worked but
required pidref/process-util/signal plumbing in every test, two
distinct execution contexts that couldn't share state, and a JsonStream
attached to the mock side that pretended to be event-loop-driven while
actually being driven manually via blocking reads.
Now that JsonStream suspends when on a fiber, the mocks can live
inside the same process and event loop as the client. Each mock is
rewritten as an sd-fiber that runs alongside the client fiber, so the
mock fiber yields on I/O and the event loop schedules the client in the
meantime. Both sides progress cooperatively, with no fork/SIGTERM/PID
tracking and no manual phase tracking.
Two cleanups fall out of the rewrite:
- A QMP_TEST(name, mock_fn) { ... } macro encapsulates the per-test
scaffolding (event loop, socketpair, mock fiber spawn, exit-on-idle
shim) and injects an already-connected QmpClient *client into the
test body. Each test now reads as a flat sequence of
qmp_client_call() invocations against that client.
- Repeated mock command/reply scripting is factored into
mock_qmp_expect(), mock_qmp_reply(), mock_qmp_expect_and_reply(),
mock_qmp_handshake(), and mock_qmp_query_status_running(). The
greeting JSON is built with sd_json_buildo() instead of being parsed
from a literal.
The file shrinks from 756 to 494 lines, mostly through deletions.
    static void install_atfork(void) {
            /* __register_atfork() either returns 0 or -ENOMEM, in its glibc implementation. Since it's
             * only half-documented (glibc doesn't document it but LSB does, though only superficially)
             * we'll check for errors only in the most generic fashion possible. */
            atfork_ret = pthread_atfork(/* prepare= */ NULL, /* parent= */ NULL, reset_current_fiber);
            if (atfork_ret != 0)
                    log_debug_errno(atfork_ret, "pthread_atfork() failed: %m");
    }
You should note that atfork handlers aren't run for clone()/clone3()/raw_clone() and posix_spawn().
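The difference the reviewer points out can be observed with a small experiment. The sketch below is an assumption-laden illustration, not code from the PR: the `demo_*` names are hypothetical, and the raw clone call passes only the flags and a NULL stack, which matches the argument order on x86-64 and aarch64 but not on every architecture.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int demo_handler_ran;

/* Child-side atfork handler: records that it was invoked. */
static void demo_child_handler(void) {
        demo_handler_ran = 1;
}

int demo_install(void) {
        return pthread_atfork(NULL, NULL, demo_child_handler);
}

/* Reap the child and return its exit status, or -1 on failure. */
static int demo_child_status(pid_t pid) {
        int status;

        if (pid < 0)
                return -1;
        if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
                return -1;
        return WEXITSTATUS(status);
}

/* fork(): the child exits with 1 if the handler ran in it. */
int demo_fork(void) {
        pid_t pid = fork();

        if (pid == 0)
                _exit(demo_handler_ran);
        return demo_child_status(pid);
}

/* Raw clone syscall: bypasses libc's fork(), so no handlers run.
 * Assumption: (flags, stack) argument order as on x86-64/aarch64. */
int demo_raw_clone(void) {
        pid_t pid = (pid_t) syscall(SYS_clone, (unsigned long) SIGCHLD, NULL);

        if (pid == 0)
                _exit(demo_handler_ran);
        return demo_child_status(pid);
}
```

After `demo_install()`, a child created by `demo_fork()` reports that the handler ran, while a child created by `demo_raw_clone()` reports that it did not, so any per-process fiber state would not be reset on that path.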
I am not very familiar with the systemd codebase, so I don't know how much sense this makes in the context of systemd, but: My understanding is that code which contains suspension points has to be written differently from regular code, because it needs to handle state changes that happen during suspension - like IPC calls that attempt to release an object that is referenced by a suspended stack frame, or IPC calls that change the size of an array that the suspended stack frame will access, or similar. So I wonder if it makes sense to mark functions that can contain fiber stuff in some way, to make this clearer to the programmer - either with an annotation at the function definition, or with a function name suffix, or something like that.
I've thought about this but I couldn't think of a nice way to do it in C. A function name suffix would be viral: as soon as we make one core function potentially suspend, every function using it would have to be annotated, and that would apply recursively of course. My current thinking is that this isn't any different from regular multithreaded programming, where none of these annotations exist. It's just a fact of life that if you run multiple fibers in a program, you have to make sure they're not working with mutable shared state or be very careful if they do.
Yes, and I think that would be a good thing, because every function using it probably needs to be checked for things like non-refcounted pointers that are live on the stack across suspension points?
The way I think about this is: In regular multithreaded programming, when you're looking at a function, you normally only have to think about locks held by that function or its callers - unless a function drops a lock that was acquired further up on the call stack, but that's weird and rare. With fibers, when looking at a function, you instead have to think about suspension points in callees, which I think is harder. I'm not an expert on this, but AFAIK web browsers have had many security issues roughly like this because some JavaScript functions implemented in native code need to call back into JavaScript callbacks in some cases; this can lead to issues if, for example, native code performs checks on the length of an array, then does a callback into attacker-controlled JavaScript, which can change the length of that array, and then proceeds with the assumption that the array still has the same length.
Only if the function using it is ever invoked on a fiber itself and only for pointers shared across fibers. If other fibers can never access a pointer because it was allocated by the fiber itself and never shared, then there's no need to check it after resuming.
Only if you're operating on state shared with other fibers and want to access it without a lock or doing the necessary checks. But this is the same as with multithreading: you have to take a lock to safely operate on shared state. Sure, a fiber can try to play games by not taking a lock if it knows it won't suspend in between accesses to the shared state, which a thread can't because it can get preempted at any time. But if you don't try to play those games, then a fiber is the same as a thread when it comes to thinking about concurrent access to shared state.
This can totally happen here as well, but only if you're operating on shared state with other fibers. If you don't, then there's nothing to worry about. Fibers aren't some magic thing that saves you from thinking about concurrent access to mutable shared state. They very much suffer from the same problems you get with threads if you operate on mutable shared state. With fibers you can do the annotations, but all they will allow you to do is skip certain checks across function calls that you know won't suspend. The other approach you can take (which I'm more a fan of) is to simply assume that functions you call can suspend, and hence if you're working with shared mutable state, you need to operate on a data structure that can handle that or have appropriate locking in place across function calls that can potentially suspend.
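The hazard discussed in this thread (a bounds check that goes stale across a suspension point) can be sketched in a few lines. This is a hypothetical illustration with made-up `Demo*` names: the `suspend` callback stands in for any callee that may suspend the fiber, during which another fiber mutates the shared array.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Shared mutable state; storage is fixed so the demo stays in bounds. */
typedef struct DemoArray {
        int items[4];
        size_t len;     /* number of logically valid entries */
} DemoArray;

/* Stand-in for a call that may suspend the current fiber. */
typedef void (*demo_suspend_fn)(DemoArray *shared);

/* Another fiber shrinking the array while we are suspended. */
static void demo_shrink(DemoArray *shared) {
        shared->len = 1;
}

/* Checks once, suspends, then trusts the stale check. */
int demo_get_unchecked(DemoArray *a, size_t idx, demo_suspend_fn suspend, int *ret) {
        if (idx >= a->len)
                return -ERANGE;
        suspend(a);
        *ret = a->items[idx];   /* may read a logically removed entry */
        return 0;
}

/* Assumes any callee may suspend and revalidates afterwards. */
int demo_get_rechecked(DemoArray *a, size_t idx, demo_suspend_fn suspend, int *ret) {
        if (idx >= a->len)
                return -ERANGE;
        suspend(a);
        if (idx >= a->len)      /* state may have changed while suspended */
                return -ERANGE;
        *ret = a->items[idx];
        return 0;
}
```

With a four-element array that is shrunk to one element during the suspension, the first variant still returns the removed entry at index 2, while the second reports `-ERANGE`. Locking across the call, as suggested in the thread, is the other way to close the gap.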