snatch: Use a reentrant but unfair read lock #5400

SludgePhD · 2024-03-16T00:54:41Z

Connections
Should fix #5279

Description
RwLocks typically use some mechanism to ensure that readers cannot starve waiting writers, even if they continually re-acquire new read locks. This means that attempts to acquire a read lock can block when a writer is trying to acquire the lock, even if there is already an existing read lock.

This mechanism has the unfortunate consequence of making deadlocks possible the instant a single thread wants to acquire 2 reads locks at the same time (and wgpu is full of code that does that).

Luckily, parking_lot also offers an explicitly reentrant but unfair way to acquire a read lock. This PR switches the snatch lock to use that, hopefully resolving the deadlock.

Testing
Ran my workload. It didn't deadlock. Previously it deadlocked within a fraction of a second.

Checklist

Run cargo fmt.
Run cargo clippy. If applicable, add:
- --target wasm32-unknown-unknown
- --target wasm32-unknown-emscripten
Run cargo xtask test to run tests.
Add change to CHANGELOG.md. See simple instructions inside file.

jimblandy · 2024-03-21T14:58:59Z

This mechanism has the unfortunate consequence of making deadlocks possible the instant a single thread wants to acquire 2 reads locks at the same time (and wgpu is full of code that does that).

It's true that:

Recursive read locking on RwLocks is not as bad as recursive access to ordinary mutexes.
It's not strictly necessary that pending requests for write access should block read requests from threads that already have a read lock anyway.

However, as far as we know, wgpu should not be acquiring read locks that it already holds. Do you have a stack for the point at which the thread tries to acquire a second read lock?

SludgePhD · 2024-03-21T15:27:30Z

Recursive read locking on RwLocks is not as bad as recursive access to ordinary mutexes.

I'm not sure what you mean by "not as bad". Recursively acquiring a RwLock's read lock will sporadically deadlock outside of your control, whereas recursively acquiring a non-reentrant mutex will deadlock (or panic, which the libstd mutex allows) deterministically. Arguably the deterministic behavior is preferable, since it happens more reliably and thus is easier to catch?

It's not strictly necessary that pending requests for write access should block read requests from threads that already have a read lock anyway.

True, this is not something that will happen deterministically (if that's what you mean by "strictly"), but in order to make a RwLock fair, an implementation has to ensure that readers cannot starve a waiting writer. The only way to accomplish this, as far as I'm aware, is to make attempts to acquire another read lock block until the writer gets a turn, even when there is already a read lock around somewhere.

In the case of parking lot, which uses eventual fairness, this fair locking strategy is not always triggered, but only sporadically, or when it detects that the critical section is longer than some threshold.

Do you have a stack for the point at which the thread tries to acquire a second read lock?

Yes, I posted it in #5279 (comment).

jimblandy · 2024-03-22T15:34:07Z

This is a digression, but:

I'm not sure what you mean by "not as bad".

Rust Mutex doesn't have a recursive variant because that would let you get multiple &mut references to the contents, which would be UB. Recursive read locks at least only hand out shared references.

I don't know how lock_api::read_recursive works, but one could imagine an implementation that notices when the calling thread already has a read lock, and grants a second read lock regardless of whether there is a write request pending.

Certainly deterministic behavior is a good thing.

jimblandy · 2024-03-22T15:59:23Z

I don't think we should take this PR. I outline in #5279 (comment) what I think is the correct fix for that issue. If for some reason that doesn't work, and we do need to resort to this fix, we can re-open this PR then.

snatch: Use a reentrant but unfair read lock

8b2e510

SludgePhD requested a review from a team as a code owner March 16, 2024 00:54

jimblandy closed this Mar 22, 2024

SludgePhD mentioned this pull request Mar 22, 2024

Avoid recursive snatch lock acquisitions #5426

Merged

6 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

snatch: Use a reentrant but unfair read lock #5400

snatch: Use a reentrant but unfair read lock #5400

SludgePhD commented Mar 16, 2024

jimblandy commented Mar 21, 2024

SludgePhD commented Mar 21, 2024

jimblandy commented Mar 22, 2024

jimblandy commented Mar 22, 2024

snatch: Use a reentrant but unfair read lock #5400

snatch: Use a reentrant but unfair read lock #5400

Conversation

SludgePhD commented Mar 16, 2024

jimblandy commented Mar 21, 2024

SludgePhD commented Mar 21, 2024

jimblandy commented Mar 22, 2024

jimblandy commented Mar 22, 2024