Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cache param env canonicalization #117749

Merged
merged 7 commits into from Dec 14, 2023
Merged

Conversation

aliemjay
Copy link
Member

@aliemjay aliemjay commented Nov 9, 2023

Canonicalize ParamEnv only once and store it. Then whenever we try to canonicalize ParamEnvAnd<'tcx, T> we only have to canonicalize T and then merge the results.

Prelimiary results show ~3-4% savings in diesel and serde benchmarks.

Best to review commits individually. Some commits have a short description.

Initial implementation had a soundness bug (#117749 (comment)) due to cache invalidation:

  • When canonicalizing Ty<'?0> we first try to resolve region variables in the current InferCtxt which may have a constraint ?0 == 'static. This means that we register Ty<'?0> => Canonical<Ty<'static>> in the cache, which is obviously incorrect in another inference context.
  • This is fixed by not doing region resolution when canonicalizing the query input (vs. response), which is the only place where ParamEnv is used, and then in a later commit we statically guard against any form of inference variable resolution of the cached canonical ParamEnv's.

r? @ghost

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Nov 9, 2023
@aliemjay
Copy link
Member Author

aliemjay commented Nov 9, 2023

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Nov 9, 2023
@bors

This comment was marked as outdated.

bors added a commit to rust-lang-ci/rust that referenced this pull request Nov 9, 2023
[perf] canonicalize param env only once

Canonicalize ParamEnv only once and store it. Then whenever we try to canonicalize `ParamEnvAnd<'tcx, T>` we only have to canonicalize `T` and then merge the results.

Prelimiary results show ~3-4% savings in diesel and serde benchmarks.

r? `@ghost`
@bors
Copy link
Contributor

bors commented Nov 9, 2023

☀️ Try build successful - checks-actions
Build commit: 557c12e (557c12e39628232629cbe904f927f16144010a47)

1 similar comment
@bors
Copy link
Contributor

bors commented Nov 9, 2023

☀️ Try build successful - checks-actions
Build commit: 557c12e (557c12e39628232629cbe904f927f16144010a47)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (557c12e): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.3% [0.3%, 0.3%] 1
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-1.9% [-3.3%, -0.4%] 17
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -1.7% [-3.3%, 0.3%] 18

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.7% [0.6%, 0.8%] 2
Regressions ❌
(secondary)
4.7% [4.3%, 5.2%] 2
Improvements ✅
(primary)
-1.2% [-4.2%, -0.4%] 11
Improvements ✅
(secondary)
-2.0% [-2.0%, -2.0%] 1
All ❌✅ (primary) -1.0% [-4.2%, 0.8%] 13

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.5% [0.5%, 0.5%] 1
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.9% [-1.3%, -0.5%] 6
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -0.7% [-1.3%, 0.5%] 7

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 661.067s -> 661.537s (0.07%)
Artifact size: 308.78 MiB -> 308.75 MiB (-0.01%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Nov 10, 2023
@aliemjay
Copy link
Member Author

aliemjay commented Dec 4, 2023

another approach with a global cache
@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Dec 4, 2023
@bors

This comment was marked as outdated.

bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 4, 2023
[perf] canonicalize param env only once

Canonicalize ParamEnv only once and store it. Then whenever we try to canonicalize `ParamEnvAnd<'tcx, T>` we only have to canonicalize `T` and then merge the results.

Prelimiary results show ~3-4% savings in diesel and serde benchmarks.

r? `@ghost`
@rust-log-analyzer

This comment has been minimized.

@bors

This comment was marked as outdated.

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 4, 2023
@rust-log-analyzer

This comment has been minimized.

@aliemjay
Copy link
Member Author

aliemjay commented Dec 5, 2023

The job x86_64-gnu-llvm-16 failed! Check out the build log: (web) (plain)

MCVE:

struct VarZeroVec<'a, F>(&'a (), F);

trait Yokeable<'a> {
    type Output;
}

impl<'a> Yokeable<'a> for () {
    type Output = VarZeroVec<'a, ()>;
}

fn transform_mut<'a, F>(f: F)
where
    F: FnOnce(VarZeroVec<'a, ()>),
{
    f(absurd::<<() as Yokeable<'a>>::Output>())
}

fn absurd<T>() -> T{ todo!() }

@aliemjay
Copy link
Member Author

aliemjay commented Dec 5, 2023

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@bors

This comment was marked as outdated.

bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 5, 2023
[perf] canonicalize param env only once

Canonicalize ParamEnv only once and store it. Then whenever we try to canonicalize `ParamEnvAnd<'tcx, T>` we only have to canonicalize `T` and then merge the results.

Prelimiary results show ~3-4% savings in diesel and serde benchmarks.

r? `@ghost`
@bors
Copy link
Contributor

bors commented Dec 5, 2023

☀️ Try build successful - checks-actions
Build commit: fd72347 (fd72347c63ef586b13bce5e2c07d442a0b5ec7ef)

@rust-timer

This comment has been minimized.

@lcnr
Copy link
Contributor

lcnr commented Dec 8, 2023

r? @lcnr

will look at this soon (hopefully)

@rustbot rustbot assigned lcnr and unassigned BoxyUwU Dec 8, 2023
Copy link
Contributor

@lcnr lcnr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some nits, then r=me, afaict CI was an unrelated issue

param_env,
self,
self.tcx,
&CanonicalizeFreeRegionsOtherThanStatic,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not spark joy 😁 Does anything break if you change this to CanonicalizeFreeRegions? I want to get away from canonicalizing static completely. iirc the last time I tried after #102472 there was still some failure in project or sth 🤔 would prefer for you (or me) to first try to also canonicalize 'static again to check why it matters.

Copy link
Member Author

@aliemjay aliemjay Dec 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Borrowck error in tests/ui/trivial-bounds/trivial-bounds-object.rs

ICE in tests/ui/nll/issue-61311-normalize.rs and tests/ui/nll/issue-61320-normalize.rs. MCVE:

pub trait HasProj {
    type Proj;
}

impl<T: ?Sized> HasProj for T {
    type Proj = ();
}

fn test()
where
    dyn Send + 'static: HasProj,
    <dyn Send + 'static as HasProj>::Proj: Clone,
{
    // ICE: cannot prove `<dyn Send + 'static as HasProj>::Proj: Clone`
    let _ = test;
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc #118965. It is also the root cause behind the ICE.

ParamEnv is canonicalized in *queries input* rather than query response.
In such case we don't "preserve universes" of canonical variable.
This means that `universe_map` always has the default value, which is
wasteful to store in the cache.
fixes a soundness regression described in the PR description.
This doesn't change behavior.
It should prevent unintentional resolution of inference variables
during canonicalization, which previously caused a soundness bug.
See PR description for more.
ParamEnv's with inference variabels are invalid.
@aliemjay
Copy link
Member Author

some nits, then r=me, afaict CI was an unrelated issue

I turns out it is related. Rustdoc incorrectly uses ParamEnv's with inference variables in them. This causes an ICE during canonicalization in one of the unwraps in the commit "make infcx optional in canonicalizer". This was fixed by the last commit by using an empty param env. I assume it does not need a re-approval.

@bors r=lcnr rollup=never

@bors
Copy link
Contributor

bors commented Dec 14, 2023

📌 Commit aa36c35 has been approved by lcnr

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 14, 2023
@bors
Copy link
Contributor

bors commented Dec 14, 2023

⌛ Testing commit aa36c35 with merge d23e1a6...

@bors
Copy link
Contributor

bors commented Dec 14, 2023

☀️ Test successful - checks-actions
Approved by: lcnr
Pushing d23e1a6 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Dec 14, 2023
@bors bors merged commit d23e1a6 into rust-lang:master Dec 14, 2023
12 checks passed
@rustbot rustbot added this to the 1.76.0 milestone Dec 14, 2023
@aliemjay aliemjay deleted the perf-canon-cache branch December 14, 2023 08:44
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (d23e1a6): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-2.0% [-4.8%, -0.2%] 32
Improvements ✅
(secondary)
-1.3% [-1.4%, -1.3%] 2
All ❌✅ (primary) -2.0% [-4.8%, -0.2%] 32

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.9% [0.7%, 1.2%] 2
Regressions ❌
(secondary)
3.5% [2.5%, 4.4%] 2
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.9% [0.7%, 1.2%] 2

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-1.8% [-3.0%, -0.7%] 16
Improvements ✅
(secondary)
-1.5% [-1.5%, -1.5%] 1
All ❌✅ (primary) -1.8% [-3.0%, -0.7%] 16

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 673.665s -> 670.811s (-0.42%)
Artifact size: 312.46 MiB -> 312.42 MiB (-0.01%)

github-actions bot pushed a commit to rust-lang/miri that referenced this pull request Dec 15, 2023
cache param env canonicalization

Canonicalize ParamEnv only once and store it. Then whenever we try to canonicalize `ParamEnvAnd<'tcx, T>` we only have to canonicalize `T` and then merge the results.

Prelimiary results show ~3-4% savings in diesel and serde benchmarks.

Best to review commits individually. Some commits have a short description.

Initial implementation had a soundness bug (rust-lang/rust#117749 (comment)) due to cache invalidation:
- When canonicalizing `Ty<'?0>` we first try to resolve region variables in the current InferCtxt which may have a constraint `?0 == 'static`. This means that we register `Ty<'?0> => Canonical<Ty<'static>>` in the cache, which is obviously incorrect in another inference context.
- This is fixed by not doing region resolution when canonicalizing the query *input* (vs. response), which is the only place where ParamEnv is used, and then in a later commit we *statically* guard against any form of inference variable resolution of the cached canonical ParamEnv's.

r? `@ghost`
lnicola pushed a commit to lnicola/rust-analyzer that referenced this pull request Apr 7, 2024
cache param env canonicalization

Canonicalize ParamEnv only once and store it. Then whenever we try to canonicalize `ParamEnvAnd<'tcx, T>` we only have to canonicalize `T` and then merge the results.

Prelimiary results show ~3-4% savings in diesel and serde benchmarks.

Best to review commits individually. Some commits have a short description.

Initial implementation had a soundness bug (rust-lang/rust#117749 (comment)) due to cache invalidation:
- When canonicalizing `Ty<'?0>` we first try to resolve region variables in the current InferCtxt which may have a constraint `?0 == 'static`. This means that we register `Ty<'?0> => Canonical<Ty<'static>>` in the cache, which is obviously incorrect in another inference context.
- This is fixed by not doing region resolution when canonicalizing the query *input* (vs. response), which is the only place where ParamEnv is used, and then in a later commit we *statically* guard against any form of inference variable resolution of the cached canonical ParamEnv's.

r? `@ghost`
RalfJung pushed a commit to RalfJung/rust-analyzer that referenced this pull request Apr 27, 2024
cache param env canonicalization

Canonicalize ParamEnv only once and store it. Then whenever we try to canonicalize `ParamEnvAnd<'tcx, T>` we only have to canonicalize `T` and then merge the results.

Prelimiary results show ~3-4% savings in diesel and serde benchmarks.

Best to review commits individually. Some commits have a short description.

Initial implementation had a soundness bug (rust-lang/rust#117749 (comment)) due to cache invalidation:
- When canonicalizing `Ty<'?0>` we first try to resolve region variables in the current InferCtxt which may have a constraint `?0 == 'static`. This means that we register `Ty<'?0> => Canonical<Ty<'static>>` in the cache, which is obviously incorrect in another inference context.
- This is fixed by not doing region resolution when canonicalizing the query *input* (vs. response), which is the only place where ParamEnv is used, and then in a later commit we *statically* guard against any form of inference variable resolution of the cached canonical ParamEnv's.

r? `@ghost`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-types Relevant to the types team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants