Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

snatch: Use a reentrant but unfair read lock #5400

Closed
wants to merge 1 commit into from

Conversation

SludgePhD
Copy link
Contributor

Connections
Should fix #5279

Description
RwLocks typically use some mechanism to ensure that readers cannot starve waiting writers, even if they continually re-acquire new read locks. This means that attempts to acquire a read lock can block when a writer is trying to acquire the lock, even if there is already an existing read lock.

This mechanism has the unfortunate consequence of making deadlocks possible the instant a single thread wants to acquire 2 reads locks at the same time (and wgpu is full of code that does that).

Luckily, parking_lot also offers an explicitly reentrant but unfair way to acquire a read lock. This PR switches the snatch lock to use that, hopefully resolving the deadlock.

Testing
Ran my workload. It didn't deadlock. Previously it deadlocked within a fraction of a second.

Checklist

  • Run cargo fmt.
  • Run cargo clippy. If applicable, add:
    • --target wasm32-unknown-unknown
    • --target wasm32-unknown-emscripten
  • Run cargo xtask test to run tests.
  • Add change to CHANGELOG.md. See simple instructions inside file.

@SludgePhD SludgePhD requested a review from a team as a code owner March 16, 2024 00:54
@jimblandy
Copy link
Member

This mechanism has the unfortunate consequence of making deadlocks possible the instant a single thread wants to acquire 2 reads locks at the same time (and wgpu is full of code that does that).

It's true that:

  1. Recursive read locking on RwLocks is not as bad as recursive access to ordinary mutexes.
  2. It's not strictly necessary that pending requests for write access should block read requests from threads that already have a read lock anyway.

However, as far as we know, wgpu should not be acquiring read locks that it already holds. Do you have a stack for the point at which the thread tries to acquire a second read lock?

@SludgePhD
Copy link
Contributor Author

Recursive read locking on RwLocks is not as bad as recursive access to ordinary mutexes.

I'm not sure what you mean by "not as bad". Recursively acquiring a RwLock's read lock will sporadically deadlock outside of your control, whereas recursively acquiring a non-reentrant mutex will deadlock (or panic, which the libstd mutex allows) deterministically. Arguably the deterministic behavior is preferable, since it happens more reliably and thus is easier to catch?

It's not strictly necessary that pending requests for write access should block read requests from threads that already have a read lock anyway.

True, this is not something that will happen deterministically (if that's what you mean by "strictly"), but in order to make a RwLock fair, an implementation has to ensure that readers cannot starve a waiting writer. The only way to accomplish this, as far as I'm aware, is to make attempts to acquire another read lock block until the writer gets a turn, even when there is already a read lock around somewhere.

In the case of parking lot, which uses eventual fairness, this fair locking strategy is not always triggered, but only sporadically, or when it detects that the critical section is longer than some threshold.

Do you have a stack for the point at which the thread tries to acquire a second read lock?

Yes, I posted it in #5279 (comment).

@jimblandy
Copy link
Member

This is a digression, but:

I'm not sure what you mean by "not as bad".

Rust Mutex doesn't have a recursive variant because that would let you get multiple &mut references to the contents, which would be UB. Recursive read locks at least only hand out shared references.

I don't know how lock_api::read_recursive works, but one could imagine an implementation that notices when the calling thread already has a read lock, and grants a second read lock regardless of whether there is a write request pending.

Certainly deterministic behavior is a good thing.

@jimblandy
Copy link
Member

I don't think we should take this PR. I outline in #5279 (comment) what I think is the correct fix for that issue. If for some reason that doesn't work, and we do need to resort to this fix, we can re-open this PR then.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Deadlock in Device
2 participants