103 await any or fail deadlocks when only later futures complete #104
Merged
EdmondDantes merged 3 commits on May 9, 2026
Bug: await_any_or_fail([$f1, $f2]) deadlocked when only $f2 was
completed; it worked fine when $f1 was completed.
Root cause is in async_API.c::async_await_futures():
  * The iteration loop registers a callback per trigger:
      - if the trigger is already closed, REPLAY fires the callback
        synchronously (the callback updates resolved_count and
        results, and calls ZEND_ASYNC_RESUME on the awaiting
        coroutine);
      - otherwise zend_async_resume_when() registers a real listener
        on the trigger.
  * After the loop, the function checks
        coroutine->waker->events.nNumOfElements > 0
    and, if that holds, calls ZEND_ASYNC_SUSPEND() without checking
    whether the waiting condition has already been met.
  * If a *later* element of the iterable is already closed, the
    earlier elements have already had pending listeners installed.
    REPLAY for the closed element synchronously satisfies the
    waiting condition (resolved_count >= waiting_count), but the
    function still suspends because waker->events is non-empty.
    No one will ever fire those leftover triggers, so the awaiter
    is stuck. With Coroutine triggers the bug rarely surfaces
    because the producer coroutine has not had a chance to run
    before iteration starts; with Future triggers (which another
    coroutine can complete synchronously before iteration even
    reaches them) it is easy to hit.
Fix: before suspending, check AWAIT_ITERATOR_IS_FINISHED. If the
condition is already satisfied, skip ZEND_ASYNC_SUSPEND but still
call zend_async_waker_clean() so the leftover callbacks on the
unresolved triggers are removed and refcounts decremented.
Discovered by the chaos test harness (#102,
fuzzy_tests/await/await_any.feature). Adds regression test
tests/await/093-awaitAnyOrFail_with_future_triggers.phpt, which
exercises every slot of a 3-Future array.
Verified: full ext/async test suite (927/927) and new regression
test pass.
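The root cause and fix described above can be modelled in a few lines of plain C. This is only a toy sketch, not the code in async_API.c: toy_outcome, toy_await_any and the apply_fix flag are hypothetical names, the listener count stands in for waker->events.nNumOfElements, and the assumption that iteration stops once the condition is met is mine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in ext/async. */
typedef struct {
    bool   suspended;      /* would the awaiter call SUSPEND? */
    size_t listeners_left; /* stands in for waker->events.nNumOfElements */
} toy_outcome;

/* closed[i] says whether trigger i completed before iteration reached it. */
static toy_outcome toy_await_any(const bool *closed, size_t n, bool apply_fix)
{
    const size_t waiting = 1; /* "any" needs one resolved trigger */
    size_t resolved = 0;
    size_t listeners = 0;

    /* Assumption of this sketch: iteration stops once the condition holds. */
    for (size_t i = 0; i < n && resolved < waiting; i++) {
        if (closed[i]) {
            resolved++;   /* REPLAY: callback runs synchronously */
        } else {
            listeners++;  /* a real listener is registered */
        }
    }

    toy_outcome out = { false, listeners };

    if (apply_fix && resolved >= waiting) {
        /* Fixed path: skip SUSPEND, but still drop leftover listeners
         * (the role zend_async_waker_clean() plays in the description). */
        out.listeners_left = 0;
        return out;
    }

    /* Original decision: suspend whenever any listener is registered. */
    if (listeners > 0) {
        out.suspended = true;
    }
    return out;
}
```

Calling toy_await_any() with closed = {false, true} and apply_fix = false reports a suspension even though one trigger is already resolved, which is the deadlock shape; with apply_fix = true the same input returns without suspending and with no listeners left.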
Architectural complement to the previous async_API.c fix. The waker
itself was vulnerable to the same class of bug whenever a registered
event closed before the awaiter actually entered SUSPEND. This is most
easily reachable via Future, but conceptually possible for any event
type (timer firing during synchronous setup, I/O completing in the
same scheduler tick, etc.).

start_waker_events() is called by SUSPEND immediately before the real
context switch. Previously it just invoked event->start() on every
trigger, which is a no-op for Coroutine/Future events and only does
something useful for libuv-backed events. Closed events were therefore
ignored, even though their callbacks were already in the waker: the
coroutine would suspend with stale callbacks that never fire.

Now: for every trigger whose event is already closed, replay each
registered callback right here, in scheduler context. RESUME from
inside the callback hits the short path
(in_scheduler_context && coroutine == current) and sets
waker->status = WAKER_RESULT, which the existing fast-return check
directly below the call uses to skip the actual suspension. Open
events still go through the normal start() path.

Verified:
  ext/async/tests/                   928 / 928 (167 skipped, externals)
  ext/async/fuzzy_tests/             44 / 44 per scheduler
  fuzz matrix (6 schedulers × 44)    264 / 264
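A rough sketch of the replay-at-start idea, again as a self-contained toy rather than the actual waker code. Everything prefixed toy_ is hypothetical; the callback simply models "RESUME on the current coroutine from scheduler context" by setting a result status, and start_calls counts how often an open event would have had start() invoked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these types do not mirror the real waker layout. */
typedef enum { TOY_WAKER_WAITING, TOY_WAKER_RESULT } toy_status;

typedef struct {
    bool closed;      /* event finished before SUSPEND was entered */
    int  start_calls; /* how many times start() ran on this event */
} toy_event;

typedef struct {
    toy_status status;
    toy_event *events;
    size_t     count;
} toy_waker;

/* Models a registered callback resuming its own coroutine. */
static void toy_callback(toy_waker *w)
{
    w->status = TOY_WAKER_RESULT;
}

/* Closed events get their callback replayed; open ones are started. */
static void toy_start_waker_events(toy_waker *w)
{
    for (size_t i = 0; i < w->count; i++) {
        if (w->events[i].closed) {
            toy_callback(w);
        } else {
            w->events[i].start_calls++;
        }
    }
}

/* Returns true if a real context switch would still be needed. */
static bool toy_suspend(toy_waker *w)
{
    w->status = TOY_WAKER_WAITING;
    toy_start_waker_events(w);
    /* Fast-return check: a replayed callback already produced a result. */
    return w->status != TOY_WAKER_RESULT;
}
```

With one open and one closed event, toy_suspend() returns false (no context switch), while the open event is still started exactly once; with only open events it returns true, as before.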
The cache survived across requests, while enum cases live in the
request-scoped constants table. On the next request the cached
pointers were dangling and their slots had been reused (typically by a
Channel object), producing
    Cannot assign Async\Channel to property
    Async\ChannelException::$reason of type Async\ChannelCloseReason
and downstream SEGVs in zend_verify_property_type during
shutdown_destructors. Fix: resolve the case on every call; the lookup
is a hash fetch.
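The difference between a process-lifetime cache and a per-call lookup can be illustrated with a toy table. This is a sketch under stated assumptions: the toy_ names and the "CASE_A"/"CASE_B" entries are invented, a linear scan stands in for the hash fetch, and both tables are kept alive so the stale pointer can be inspected safely (in the real bug the old table is gone and the pointer dangles).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: hypothetical stand-in for a request-scoped table. */
typedef struct {
    const char *name;
    int         request_id; /* which request this entry belongs to */
} toy_case;

typedef struct {
    toy_case cases[2];
    size_t   count;
} toy_constants;

/* Rebuilds the table, as would happen at the start of each request. */
static void toy_request_startup(toy_constants *t, int request_id)
{
    t->cases[0] = (toy_case){ "CASE_A", request_id };
    t->cases[1] = (toy_case){ "CASE_B", request_id };
    t->count = 2;
}

/* Resolve on every call (linear scan standing in for a hash fetch). */
static const toy_case *toy_resolve_case(const toy_constants *t,
                                        const char *name)
{
    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->cases[i].name, name) == 0) {
            return &t->cases[i];
        }
    }
    return NULL;
}

/* Anti-pattern: the static pointer outlives the request that filled it. */
static const toy_case *toy_resolve_cached(const toy_constants *t,
                                          const char *name)
{
    static const toy_case *cache = NULL;
    if (cache == NULL) {
        cache = toy_resolve_case(t, name);
    }
    return cache;
}
```

After toy_request_startup() has run for a second table, toy_resolve_cached() still returns the entry from the first request, whereas toy_resolve_case() returns the entry that belongs to the table it was given.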