Conversation
|
I'm curious to see whether removing the cache lookup has any measurable perf effect. @bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Clean up query-forcing functions
There was a problem hiding this comment.
Looks ok. A couple of comments. I'm a bit unsure about the second commit so I'll wait to see if @Zoxc or @zetanumbers has anything to say about it.
| // lead to having to re-execute compile_codegen_unit, possibly unnecessarily. | ||
| // | ||
| // For more background, see the comment in `plumbing.rs` at | ||
| // <https://github.com/rust-lang/rust/commit/0454a41>. |
There was a problem hiding this comment.
It feels weird to refer to a deleted comment. Seems like either this reference should be removed, or part/all of the deleted comment should be replicated here.
There was a problem hiding this comment.
The linked comment is basically a longer and more detailed version of the comment that's just above these changed lines.
The main reason I wanted to add a link here is that someone looking at these codegen_unit calls could reasonably want more context on why they were added, and the actual history is rather tricky to track down due to a number of intermediate moves. But at the same time I think it would be too disruptive to add much more inline context here.
Perhaps I could try to find a more agreeable way to do this.
| ), | ||
| promote_from_disk_fn: (can_recover && is_cache_on_disk) | ||
| .then_some(promote_from_disk_inner::<Q>), | ||
| } |
There was a problem hiding this comment.
Should the second Q::query_vtable call be hoisted into a closure here as well?
There was a problem hiding this comment.
I suppose it could be.
Note that this is effectively me changing my mind on 4284edc. At that time I thought GetQueryVTable might be useful for other kinds of static lookup in the future, but after various other simplifications and cleanups we seem to be doing fine with just using QueryVTable almost everywhere.
This comment has been minimized.
This comment has been minimized.
|
Finished benchmarking commit (00d810c): comparison URL. Overall result: ❌✅ regressions and improvements - no action neededBenchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf. @bors rollup=never Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (secondary 2.3%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (secondary -2.5%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 485.484s -> 483.408s (-0.43%) |
|
Perf results look like pure noise. No need to avoid rollup. |
This assertion and its comment are much visually heavier than the part of the function that actually performs important work. This assertion is also useless, because the `codegen_unit` query does not have a `force_from_dep_node_fn` in its DepKindVTable, so the assertion would never be reached even in the situation it is trying to guard against. This assertion is also redundant, because we have plenty of incremental tests that fail if the corresponding calls to `tcx.ensure_ok().codegen_unit(..)` are removed.
In the serial compiler, we should never be trying to force a query node that has a cached value, because we would have already marked the node red or green when putting its value in cache. In the parallel compiler, we immediately re-check the cache while holding the state shard lock anyway, and trying to avoid that lock doesn't seem worthwhile.
The different parts of this function used to be split across different crates, but nowadays they are both in `rustc_query_impl`, so they can be combined. This commit also hoists the vtable lookup to a closure in `make_dep_kind_vtable_for_query`, to reduce the amount of code that deals with `GetQueryVTable`, and to make the inner function more consistent with other functions in `execution`.
This effectively reverses <rust-lang@4284edc>. At that time, I thought GetQueryVTable might be useful for other kinds of static lookup in the future, but after various other simplifications and cleanups that now seems less likely, and this style is more consistent with other vtable-related functions.
|
This is fine |
This PR takes the
force_queryfunction, inlines it into its only callerforce_from_dep_node_inner, and renames the resulting function toforce_query_dep_node. Combining the functions became possible after the removal ofrustc_query_system, because they are now in the same crate.There are two other notable cleanups along the way:
r? nnethercote