Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ParamEnv::def_id is expensive #74865

Closed
nnethercote opened this issue Jul 28, 2020 · 10 comments
Closed

ParamEnv::def_id is expensive #74865

nnethercote opened this issue Jul 28, 2020 · 10 comments
Assignees
Labels
C-enhancement Category: An issue proposing an enhancement or a PR with one. I-compiletime Issue: Problems and improvements with respect to compile times.

Comments

@nnethercote
Copy link
Contributor

nnethercote commented Jul 28, 2020

ParamEnv::def_id is only used for Chalk, which currently isn't on by default. I tried removing that field, which (among other things) reduces the size of the hot PredicateObligation type from 40 bytes to 32 bytes. Here are some local perf results on Full builds.

ctfe-stress-4-opt
        avg: -3.1%?     min: -3.1%?     max: -3.1%?
unicode_normalization-check
        avg: 3.0%       min: 3.0%       max: 3.0%
ctfe-stress-4-check
        avg: -2.6%?     min: -2.6%?     max: -2.6%?
ctfe-stress-4-debug
        avg: -2.5%?     min: -2.5%?     max: -2.5%?
unicode_normalization-debug
        avg: 2.1%       min: 2.1%       max: 2.1%
piston-image-debug
        avg: -1.6%      min: -1.6%      max: -1.6%
regression-31157-debug
        avg: -1.6%      min: -1.6%      max: -1.6%
encoding-debug
        avg: -1.6%      min: -1.6%      max: -1.6%
deeply-nested-debug
        avg: -1.5%      min: -1.5%      max: -1.5%
webrender-debug
        avg: -1.5%      min: -1.5%      max: -1.5%
cargo-debug
        avg: -1.5%      min: -1.5%      max: -1.5%
serde-check
        avg: -1.5%      min: -1.5%      max: -1.5%
tokio-webpush-simple-debug
        avg: -1.4%      min: -1.4%      max: -1.4%
unicode_normalization-opt
        avg: 1.4%       min: 1.4%       max: 1.4%
serde-debug
        avg: -1.4%      min: -1.4%      max: -1.4%
encoding-opt
        avg: -1.4%      min: -1.4%      max: -1.4%
webrender-wrench-debug
        avg: -1.4%      min: -1.4%      max: -1.4%
issue-46449-debug
        avg: -1.3%      min: -1.3%      max: -1.3%
syn-debug
        avg: -1.3%      min: -1.3%      max: -1.3%
cranelift-codegen-debug
        avg: -1.3%      min: -1.3%      max: -1.3%       
futures-debug
        avg: -1.3%      min: -1.3%      max: -1.3%
futures-check
        avg: -1.3%      min: -1.3%      max: -1.3%
serde-opt
        avg: -1.3%      min: -1.3%      max: -1.3%
clap-rs-debug
        avg: -1.3%      min: -1.3%      max: -1.3%
ripgrep-debug
        avg: -1.3%      min: -1.3%      max: -1.3%
deeply-nested-check
        avg: -1.2%      min: -1.2%      max: -1.2%
hyper-2-debug
        avg: -1.1%      min: -1.1%      max: -1.1%
regex-debug
        avg: -1.1%      min: -1.1%      max: -1.1%
deep-vector-debug
        avg: -1.1%      min: -1.1%      max: -1.1%

(The unicode_normalization regression is weird, because this change results in the code clearly doing less work overall. Maybe it's a quirk of inlining or codegen, the kind of thing that may well disappear in the face of other minor changes.)

Is this field really needed?

cc @rust-lang/wg-compiler-performance

@jonas-schievink jonas-schievink added C-enhancement Category: An issue proposing an enhancement or a PR with one. I-compiletime Issue: Problems and improvements with respect to compile times. labels Jul 28, 2020
@nikomatsakis
Copy link
Contributor

We could probably remove it. Have to think a bit about it. It shouldn't eventually be needed for chalk, but it permits us to have a crutch for now, since chalk constructs its parameter environments differently (and hence the def_id allows us to recalculate those environments. cc @jackh726 I don't remember if we had any better ideas for how to handle parameter environments, I remember discussing them some when we were working on the chalk integration.

@jackh726
Copy link
Member

I think this is that relevant discussion.

We didn't really go into it, since I think it was somewhat unrelated. Unfortunately, I haven't looked into alternatives here and don't even really know much about the implementation history for this, since this particular bit of code comes from the original integration.

I imagine we could either try to "elaborate" the environment when we actually create a ParamEnv (under -Z chalk), which would require adding a TypeFromEnv predicate; or we could somehow keep track of the ParamEnv environment def_id while trait solving, though I don't really know the rustc code well enough to comment on how difficult/plausible that would be. I think the former would be easier and probably what we want in the long run (in terms of Chalk integration) anyways.

As a side note, looking through uses of ParamEnv::new, this is the only instance of a ParamEnv being constructed with a DefId (that isn't ParamEnv being contstructed based on an existing ParamEnv), so it would be really easy to just insert the TypeFromEnv elaboration here.

Somewhat related, but I think some of the cases where a ParamEnv is constructed without a DefId are wrong or this logic is wrong.

@nikomatsakis
Copy link
Contributor

@jackh726 I think it's the case that chalk's parameter environments just have strictly more things in them, is that right? But are all of those predicates things we can represent (and faithfully interconvert) with rustc's ty::Predicate?

@jackh726
Copy link
Member

See ChalkEnvironmentClause.

I think it's the case that chalk's parameter environments just have strictly more things in them, is that right?

Yes

But are all of those predicates things we can represent (and faithfully interconvert) with rustc's ty::Predicate?

No, we need a TypeFromEnv

@nikomatsakis
Copy link
Contributor

Yeah, I see. So we could extend the set of rustc predicates and then we could just have larger parameter environments when using chalk?

@jackh726
Copy link
Member

Yes, that could work.

@nnethercote
Copy link
Contributor Author

Is there a clear path forward for this? I'm happy to help but I know nothing about Chalk...

@jackh726
Copy link
Member

jackh726 commented Aug 3, 2020

@nnethercote yes

So, the first thing to do would be to add a TypeFromEnv(Ty<'tcx>) variant to PredicateAtom, which is only used for Chalk. It would be lowered to Chalk's FromEnv(FromEnv::Type), like here.

Then, you would essentially have to add those predicates here.

From there, you should be able to remove the environment elaboration in chalk_fulfill. As cleanup, you could also remove ChalkEnvironmentAndGoal and just use the PredicateObligation in chalk_fulfill.

Let me know if that makes sense! I'll be around on Zulip if you have questions or need to discuss.

@vandenheuvel
Copy link
Contributor

@rustbot claim

bors added a commit to rust-lang-ci/rust that referenced this issue Sep 13, 2020
…, r=nikomatsakis

Removing the `def_id` field from hot `ParamEnv` to make it smaller

This PR addresses rust-lang#74865.
@vandenheuvel
Copy link
Contributor

@nnethercote or @jonas-schievink could this be closed?

bors added a commit to rust-lang-ci/rust that referenced this issue Sep 22, 2020
Fixing the performance regression of rust-lang#76244

Issue rust-lang#74865 suggested that removing the `def_id` field from `ParamEnv` would improve performance. PR rust-lang#76244 implemented this change.

Generally, [results](https://perf.rust-lang.org/compare.html?start=80fc9b0ecb29050d45b17c64af004200afd3cfc2&end=5ef250dd2ad618ee339f165e9b711a1b4746887d) were as expected: an instruction count decrease of about a percent. The instruction count for the unicode crates increased by about 3%, which `@nnethercote` speculated to be caused by a quirk of inlining or codegen. As the results were generally positive, and for chalk integration, this was also a step in the right direction, the PR was r+'d regardless.

However, [wall-time performance results](https://perf.rust-lang.org/compare.html?start=a055c5a1bd95e029e9b31891db63b6dc8258b472&end=7402a394471a6738a40fea7d4f1891666e5a80c5&stat=task-clock) show a much larger performance degradation: 25%, as [mentioned](rust-lang#76244 (comment)) by `@Mark-Simulacrum.`

This PR, for now, reverts rust-lang#76244 and attempts to find out, which change caused the regression.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Category: An issue proposing an enhancement or a PR with one. I-compiletime Issue: Problems and improvements with respect to compile times.
Projects
None yet
Development

No branches or pull requests

5 participants