Eliminate priority inversions in the metadata completion runtime #40542
Conversation
@swift-ci Please test
Build failed
Build failed
Force-pushed 34b6548 to cade366
@swift-ci Please test
Build failed
Force-pushed cade366 to 2c265e4
@swift-ci Please test
Build failed
Build failed
This looks fantastic, thanks! Now to see if I can get my brain to read through it properly....
- take critical sections by reference
- forward critical sections into calls (except when called multiple times)
- add `withLock` methods where they were missing
Force-pushed 2c265e4 to 662716a
@swift-ci Please test
Build failed
Build failed
@swift-ci Please test macOS
@swift-ci Please test Linux
Build failed
The current system is based on `MetadataCompletionQueueEntry` objects which are allocated and then enqueued on dependencies. Blocking is achieved using a condition variable associated with the lock on the appropriate metadata cache. Condition variables are inherently susceptible to priority inversions because the waiting threads have no dynamic knowledge of which thread will notify the condition. In the current system, threads that unblock dependencies synchronously advance their dependent metadata completions, which means the signaling thread is unreliable even if we could represent it in condition variables. As a result, the current system is wholly unsuited for eliminating these priority inversions.

The new system is based on `AtomicWaitQueue`, an object containing a lock. The queue is eagerly allocated, and the lock is held, whenever a thread is doing work that other threads might wish to block on. In the metadata completion system, this means whenever we construct a metadata cache entry and the metadata isn't already allocated and transitively complete after said construction. Blocking is done by safely acquiring a shared reference to the queue object (which, in the current implementation, requires briefly taking a lock that's global to the surrounding metadata cache) and then acquiring the contained lock. For typical lock implementations, this avoids priority inversions by temporarily propagating the priority of the waiting threads to the locking thread.

Dependencies are unblocked by simply releasing the lock held in the queue. The unblocking thread doesn't know exactly which metadata are blocked on it and doesn't make any effort to directly advance their completion; instead, each blocked thread will wake up and then attempt to advance the dependent metadata completion itself, eliminating a source of priority overhang that affected the old system. Successive rounds of unblocking (e.g. when a metadata makes partial progress but isn't yet complete) can be achieved by creating a new queue and unlocking the old one. We can still record dependencies and use them to dynamically diagnose metadata cycles.

The new system allocates more eagerly than the old one. Formerly, metadata completions which were never blocked never needed to allocate a `MetadataCompletionQueueEntry`; on the other hand, we were unable to actually deallocate those entries once they were allocated. The new system will allocate a queue for most metadata completions, although, on the positive side, we can reliably deallocate these queues. Cache entries are also now slightly smaller because some of the excess storage for status has been folded into the queue.

The fast path of an actual read of the metadata remains a simple load-acquire. Slow paths may require a bit more locking. On Darwin, the metadata cache lock can now use `os_unfair_lock` instead of `pthread_mutex_t` (which is a massive improvement) because it does not need to support associated condition variables. In the future, the excess locking could be eliminated with some sort of generational scheme. Sadly, those are not portable, and I didn't want to take that on up-front.

rdar://76127798
Force-pushed 662716a to 749ed09
@swift-ci Please test
Build failed
@swift-ci Please test macOS
Read through it all, may have understood some of it. The fact that waiters may be waiting for different states really complicates things, but this seems like a solid approach (and implementation) for solving that. Thanks for the extensive comments explaining various parts, and for putting in the work to fix this problem.