Evaluate free constants and statics in definition order under the parallel front-end #157353
Draft
xmakro wants to merge 1 commit into
… front-end

The crate analysis pass forces the value of every free `static` and non-generic `const`, so that an item whose value fails to evaluate is reported even when the item is unused. This forcing runs inside `par_hir_body_owners`, which under the parallel front-end hands the items to worker threads in an arbitrary order.

A constant whose initializer refers to another constant evaluates that other constant as a nested, depth-limited query. When a worker thread starts evaluating a constant before the constants it depends on have been cached, the nested evaluation builds a query stack whose depth tracks the length of the still-uncached dependency chain. For a long chain of constants this depth depends on thread scheduling and can spuriously exceed the recursion limit, so the same valid crate compiles under the single-threaded front-end but intermittently fails under the parallel one.

Chains of associated constants reach this same depth-limited path: the per-body const-checking that runs during analysis evaluates the constants an item refers to, so an associated constant defined in terms of earlier ones evaluates the whole chain. Include non-generic associated constants in the eager forcing so that their chains are evaluated in definition order as well, with the same parent-aware monomorphization guard used for free constants. As a result an associated constant whose value fails to evaluate is now reported at its definition, like a free constant, instead of at a use site.

Force these values sequentially in definition order before the parallel loop when more than one thread is in use. This matches the single-threaded evaluation order: each constant's dependencies are already cached by the time it is evaluated, so the query depth stays shallow. Genuine unbounded recursion still overflows, because it has a single evaluation entry point whose depth does not depend on scheduling.

Re-enable `tests/ui/consts/chained-constants-stackoverflow.rs` under the parallel front-end.
001ad70 to
049476e
Compare
Summary
Under the parallel front-end, `tests/ui/consts/chained-constants-stackoverflow.rs` intermittently fails with "queries overflow the depth limit", even though it is a valid program that compiles fine single-threaded. This is the depth-limit nondeterminism tracked in issue 146616.
Cause
`check_crate` forces the value of every free `static` and non-generic `const`, so that an item whose value fails to evaluate is still reported even when the item is unused. This forcing runs inside `par_hir_body_owners`, which under the parallel front-end hands the items to worker threads in an arbitrary order.

A constant whose initializer refers to another constant evaluates that other constant as a nested, depth-limited query. When a worker thread starts evaluating a constant before the constants it depends on have been cached, the nested evaluation builds a query stack whose depth tracks the length of the still-uncached dependency chain. For a long chain of constants (this test has 350) that depth depends on thread scheduling and can exceed the recursion limit, so the same crate compiles single-threaded but intermittently fails in parallel.
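To make the depth argument concrete, here is a minimal single-threaded sketch. It is a toy model, not rustc code: `eval` and `max_depth_for_order` are hypothetical names, the cache is a plain `HashMap`, and constant `i` is assumed to depend only on constant `i - 1`. It shows how the nesting depth follows the order in which the items are forced.

```rust
use std::collections::HashMap;

/// Toy model: constant `i` is constant `i - 1` plus one; constant 0 is a literal.
/// Records in `max_depth` the deepest nested evaluation that was not a cache hit.
fn eval(i: usize, cache: &mut HashMap<usize, u64>, depth: usize, max_depth: &mut usize) -> u64 {
    if let Some(&v) = cache.get(&i) {
        return v; // dependency already cached: no further nesting
    }
    *max_depth = (*max_depth).max(depth);
    let v = if i == 0 { 0 } else { eval(i - 1, cache, depth + 1, max_depth) + 1 };
    cache.insert(i, v);
    v
}

/// Forces every constant in the given order and returns the deepest nesting seen.
fn max_depth_for_order(order: impl Iterator<Item = usize>) -> usize {
    let mut cache = HashMap::new();
    let mut max_depth = 0;
    for i in order {
        eval(i, &mut cache, 1, &mut max_depth);
    }
    max_depth
}

fn main() {
    let n = 350; // chain length mentioned in the PR description
    println!("definition order: {}", max_depth_for_order(0..n));
    println!("last item first:  {}", max_depth_for_order((0..n).rev()));
}
```

In this model, forcing in definition order never nests past depth 1, while forcing the last item first nests once per link in the chain. A parallel scheduler can land anywhere between those two extremes.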
Fix
When more than one thread is in use, force these values sequentially in definition order before the parallel loop. This matches the single-threaded evaluation order: each constant's dependencies are already cached by the time it is evaluated, so the query depth stays shallow.
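The shape of this fix can be sketched with a small threaded model. This is an illustration under assumptions, not the actual change: `Model`, `eval`, and `force_all` are hypothetical names, the query cache is modeled as a `Mutex<HashMap>`, and each worker takes a strided share of the items, highest index first, to mimic an unfavorable schedule.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Toy stand-in for a shared query cache; not rustc's data structures.
struct Model {
    cache: Mutex<HashMap<usize, u64>>,
    max_depth: AtomicUsize,
}

impl Model {
    /// Constant `i` is constant `i - 1` plus one; constant 0 is a literal.
    fn eval(&self, i: usize, depth: usize) -> u64 {
        // Look up, then release the lock before any nested evaluation.
        let cached = self.cache.lock().unwrap().get(&i).copied();
        if let Some(v) = cached {
            return v;
        }
        self.max_depth.fetch_max(depth, Ordering::Relaxed);
        let v = if i == 0 { 0 } else { self.eval(i - 1, depth + 1) + 1 };
        self.cache.lock().unwrap().insert(i, v);
        v
    }
}

/// Forces `n` chained constants on `threads` workers (assumed >= 1) and returns
/// the deepest nesting observed. With `presequence`, a sequential pass in
/// definition order runs before the parallel loop.
fn force_all(n: usize, threads: usize, presequence: bool) -> usize {
    let model = Model { cache: Mutex::new(HashMap::new()), max_depth: AtomicUsize::new(0) };
    if presequence && threads > 1 {
        for i in 0..n {
            model.eval(i, 1); // dependencies are cached before each item runs
        }
    }
    thread::scope(|s| {
        for t in 0..threads {
            let model = &model;
            s.spawn(move || {
                // Strided share of the items, visited highest index first.
                let mine: Vec<usize> = (t..n).step_by(threads).collect();
                for &i in mine.iter().rev() {
                    model.eval(i, 1);
                }
            });
        }
    });
    model.max_depth.load(Ordering::Relaxed)
}

fn main() {
    println!("with sequential pass:    {}", force_all(350, 8, true));
    println!("without sequential pass: {}", force_all(350, 8, false));
}
```

With the sequential pass the parallel loop only sees cache hits, so the observed depth is 1 regardless of scheduling. Without it, the depth in this model approaches the chain length and varies with how the workers interleave.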
The fix changes only the evaluation order, never the recursion limit, so genuine overflow tests are unaffected. Those tests each have a single evaluation entry point whose depth does not depend on scheduling (one body owner recursing linearly, or codegen monomorphizing a generic const), so they still overflow with identical output. The side-effecting `maybe_check_static_with_link_section` check is kept in the parallel loop so that it still runs exactly once.

The previously ignored test is re-enabled under the parallel front-end.
Testing
- `chained-constants-stackoverflow.rs`: 0 overflows across 240 runs at `-Zthreads=8/16/32` with the fix, and 80 of 80 runs overflow without it.
- The genuine overflow tests (`consts/recursive-block`, `query-system/query_depth`, `consts/recursive-const-in-impl`, `generic-const-items/recursive`) still overflow deterministically with their existing expected output.
- `./x test --set rust.parallel-frontend-threads=8` on the affected tests, plus a sweep of `tests/ui/consts` and `tests/ui/statics`: all pass.
Trade-off
This serializes evaluation of free constant and static values under the parallel front-end. The heavy per-body work (type checking, borrow checking, MIR) remains fully parallel in the separate loops in `rustc_interface`. The cost is limited to crates with many independent, expensive top-level constants.