Skip to content

Conversation

MichaReiser
Copy link
Contributor

@MichaReiser MichaReiser commented Sep 24, 2025

We've seen in astral-sh/ruff#20477 that Salsa can hang (it gets stuck in a fetch_with_retry and Memo::retry_provisional cycle) if a cycle head query panics.

I invested some time to see if there's an easy way to identify a Memo of a panicked query, but without success. That's why I decided to explore if we can clean up any provisional memo if a cycle head panics, and that turns out to be much easier.

/// Replaces any inserted memo with a fixpoint initial memo without a value if the current thread panics.
///
/// A regular query doesn't insert any memo if it panics and the query
/// simply gets re-executed if any later called query depends on the panicked query (and will panic again unless the query isn't deterministic).
///
/// Unfortunately, this isn't the case for cycle heads because Salsa first inserts the fixpoint initial memo and later inserts
/// provisional memos for every iteration. Detecting whether a query has previously panicked
/// in `fetch` (e.g., `validate_same_iteration`) and requires re-execution is probably possible but not very straightforward
/// and it's easy to get it wrong, which results in infinite loops where `Memo::provisional_retry` keeps retrying to get the latest `Memo`
/// but `fetch` doesn't re-execute the query for reasons.
///
/// Specifically, a Memo can linger after a panic, which is then incorrectly returned
/// by `fetch_cold_cycle` because it passes the `shallow_verified_memo` check instead of inserting
/// a new fix point initial value if that happens.
///
/// We could insert a fixpoint initial value here, but it seems unnecessary.

I also used this PR as an opportunity to add some more logging which should help debug similar issues more easily in the future.

Copy link

netlify bot commented Sep 24, 2025

Deploy Preview for salsa-rs canceled.

Name Link
🔨 Latest commit c0f2c21
🔍 Latest deploy log https://app.netlify.com/projects/salsa-rs/deploys/68d4152c324e7e00087ac9f4

Copy link

codspeed-hq bot commented Sep 24, 2025

CodSpeed Performance Report

Merging #993 will improve performances by 11.24%

Comparing MichaReiser:cleanup-provisional-memos-on-panic (c0f2c21) with master (e257df1)

Summary

⚡ 4 improvements
✅ 8 untouched

Benchmarks breakdown

Benchmark BASE HEAD Change
amortized[Input] 2.4 µs 2.2 µs +11.24%
amortized[InternedInput] 2.3 µs 2.1 µs +10.33%
amortized[SupertypeInput] 3 µs 2.8 µs +5.79%
new[InternedInput] 4.6 µs 4.3 µs +7.01%

@MichaReiser MichaReiser requested a review from carljm September 24, 2025 15:59
@MichaReiser MichaReiser added the bug Something isn't working label Sep 24, 2025
@MichaReiser
Copy link
Contributor Author

This is something that's sort of impossible to test reliably without shuttle but shuttle still deadlocks (when I'm very confident that it shouldn't because there's a log resuming the deadlocked thread directly one line above)

@MichaReiser MichaReiser changed the title fix: Cleanup provisional cycle head memos on panic fix: Cleanup provisional cycle head memos when query panics Sep 24, 2025
@carljm
Copy link
Contributor

carljm commented Sep 24, 2025

Any idea why this is improving several benchmarks? Seems surprising. Is that just noisy benchmarks? That also seems surprising that it would be consistent improvement across four different ones.

@MichaReiser
Copy link
Contributor Author

Any idea why this is improving several benchmarks? Seems surprising. Is that just noisy benchmarks? That also seems surprising that it would be consistent improvement across four different ones.

Nope. Maybe the first change that builds with Rust 1.90?

Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@MichaReiser MichaReiser added this pull request to the merge queue Sep 24, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Sep 24, 2025
@MichaReiser MichaReiser added this pull request to the merge queue Sep 24, 2025
Merged via the queue into salsa-rs:master with commit 29ab321 Sep 24, 2025
12 checks passed
@MichaReiser MichaReiser deleted the cleanup-provisional-memos-on-panic branch September 24, 2025 17:58
@github-actions github-actions bot mentioned this pull request Sep 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants