Lock-order inversion deadlock in AsyncCurrentValue when cancellation races an update
A deadlock occurs in AsyncCurrentValue due to a lock-order inversion between the internal mutex and the Swift runtime's task-status lock. When FlagChangeStream is being consumed and the consuming task is cancelled concurrently with a value update, both threads block waiting for each other's lock indefinitely.
This reliably reproduces within seconds (typically on the first few hundred iterations of the attached test).
Vexil version: 3.0.0-beta.2 / main branch
Swift version: swift-driver version: 1.148.6 Apple Swift version 6.3.1 (swiftlang-6.3.1.1.2 clang-2100.0.123.102) Target: arm64-apple-macosx26.0
Environment: Xcode 26.4.1 (17E202), macOS 26.4 (25E246)
✅ Checklist

- [x] Reproduced on the `main` branch of this package
🔢 Steps to Reproduce
- Unzip the attached reproducer project ([VexilReproducer.zip])
- Run `swift test`
- The test deadlocks within seconds. A watchdog aborts after 55s and writes thread backtraces to `/tmp/vexil-deadlock-sample-<PID>.txt`
The test races `AsyncCurrentValue.update` (which fires `continuation.resume()` inside the mutex via `wrappedValue.didSet`) against `Task.cancel()` on a task suspended in `FlagChangeStream.AsyncIterator.next(isolation:)`.
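Reduced to its shape, the racing test looks roughly like this (a sketch only; `valueStream` is a hypothetical accessor standing in for however the reproducer actually obtains its `FlagChangeStream`):

```swift
// Sketch of the race (inside an async test function); AsyncCurrentValue<Int>
// and the `valueStream` accessor are stand-ins for the reproducer's real setup.
for iteration in 0 ..< 100_000 {
    let current = AsyncCurrentValue(iteration)

    // Consumer: suspends in FlagChangeStream.AsyncIterator.next(isolation:).
    let consumer = Task {
        for await _ in current.valueStream { }
    }

    // Thread A: update() fires continuation.resume() inside the mutex
    // via wrappedValue.didSet.
    let updater = Task.detached {
        current.update { $0 += 1 }
    }

    // Thread B: cancel() takes the task-status lock, then the onCancel:
    // handler tries to take the mutex -> lock-order inversion.
    consumer.cancel()
    await updater.value
}
```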
Deadlock diagram

```
Thread A (update):                        Thread B (cancel):

AsyncCurrentValue.update {                Task.cancel()
  mutex.withLock {                          ↓
  ┌─── HOLDS: mutex                       withStatusRecordLock {
  │                                       ┌─── HOLDS: task-status lock
  │   wrappedValue.didSet {               │
  │     continuation.resume()             │   onCancel: {
  │       ↓                               │     mutex.withLock {
  │     flagAsAndEnqueueOnExecutor        │       ↓
  │       ↓                               │     BLOCKED waiting for mutex
  │     withStatusRecordLock {            │
  │       ↓                               │
  │     BLOCKED waiting for               │
  │     task-status lock                  │
  └───────────────────────────────────────┘
```
🎯 Expected behavior
The test completes all 100,000 iterations without hanging. Cancelling a task that is suspended on `FlagChangeStream` should not deadlock, regardless of concurrent updates.
🕵️♀️ Actual behavior
The test deadlocks within seconds. The `sample` output shows two threads blocked in `_os_unfair_lock_lock_slow` (`__ulock_wait2`):
- **Thread A** — `AsyncCurrentValue.update → Mutex.withLock → wrappedValue.didSet → continuation.resume → flagAsAndEnqueueOnExecutor → withStatusRecordLock` — blocked waiting for the task-status lock
- **Thread B** — `swift_task_cancelImpl → withStatusRecordLock → onCancel: closure → Mutex.withLock` — blocked waiting for the mutex
Relevant `sample` output:

```
Thread_4432711   DispatchQueue_15: com.apple.root.default-qos.cooperative (concurrent)
  noDeadlockOnConcurrentCancelAndUpdate()                  DeadlockTests.swift:82
  swift_task_cancelImpl
  withStatusRecordLock
  FlagChangeStream.AsyncIterator.next onCancel: closure    FlagChangeStream.swift:66
  Mutex<>.withLock                                         Lock.swift:37
  _os_unfair_lock_lock_slow
  __ulock_wait2                                            ← BLOCKED on mutex

Thread_4432713   DispatchQueue_15: com.apple.root.default-qos.cooperative (concurrent)
  closure #3 in noDeadlockOnConcurrentCancelAndUpdate()    DeadlockTests.swift:81
  AsyncCurrentValue.update                                 AsyncCurrentValue.swift:72
  Mutex<>.withLock                                         Lock.swift:37
  AsyncCurrentValue.State.wrappedValue.didSet              AsyncCurrentValue.swift:26
  CheckedContinuation.resume
  swift_continuation_throwingResumeImpl
  flagAsAndEnqueueOnExecutor
  updateStatusRecord → withStatusRecordLock
  _os_unfair_lock_lock_slow
  __ulock_wait2                                            ← BLOCKED on task-status lock
```
💡 Suggested Fix
The root cause is that `continuation.resume()` is called inside the mutex's critical section (in `wrappedValue.didSet`), which acquires the task-status lock while the mutex is held. The fix is to collect continuations while holding the lock, then resume them after releasing it. This removes the mutex → task-status-lock edge of the cycle, so the cancellation handler can safely take the mutex while the runtime holds the task-status lock:
```swift
func update<R: Sendable>(_ body: (inout sending Wrapped) throws -> R) rethrows -> R {
    let continuationsToResume: [(Int, Wrapped, CheckedContinuation<(Int, Wrapped)?, Never>)]
    let result: R
    (result, continuationsToResume) = try allocation.mutex.withLock { state in
        let result = try body(&state.wrappedValue)   // perform the mutation
        // Detach pending waiters while still holding the lock…
        let toResume = state.pendingContinuations
        state.pendingContinuations = []
        return (result, toResume.map { (state.generation, state.wrappedValue, $0) })
    }
    // …and resume them AFTER releasing the mutex, so resume() can take the
    // task-status lock without the mutex held (no lock-order inversion).
    for (generation, value, continuation) in continuationsToResume {
        continuation.resume(returning: (generation, value))
    }
    return result
}
```
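The same drain-then-resume discipline, reduced to a self-contained sketch with no Vexil types (using the standard library's `Synchronization.Mutex`; `Box` and its members are made-up names for illustration):

```swift
import Synchronization

// Minimal illustration of the pattern: mutate state and drain waiters
// inside the critical section, resume them only after the lock is released.
final class Box<T: Sendable>: Sendable {
    private let state: Mutex<(value: T, waiters: [CheckedContinuation<T, Never>])>

    init(_ value: T) {
        state = Mutex((value: value, waiters: []))
    }

    // Suspends until the next update().
    func next() async -> T {
        await withCheckedContinuation { continuation in
            state.withLock { $0.waiters.append(continuation) }
        }
    }

    func update(_ newValue: T) {
        // Collect continuations inside the critical section…
        let waiters = state.withLock { state -> [CheckedContinuation<T, Never>] in
            state.value = newValue
            let drained = state.waiters
            state.waiters = []
            return drained
        }
        // …and resume them outside it, where acquiring the task-status
        // lock cannot invert against this mutex.
        for waiter in waiters {
            waiter.resume(returning: newValue)
        }
    }
}
```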
VexilReproducer.zip