Removing the def_id field from hot ParamEnv to make it smaller #76244

Merged on Sep 13, 2020 (1 commit)

vandenheuvel
Contributor

@vandenheuvel vandenheuvel commented Sep 2, 2020

This PR addresses #74865.

@rust-highfive
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @jackh726 (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 2, 2020
@@ -58,6 +58,7 @@ impl<'tcx> ExplicitPredicatesMap<'tcx> {
| ty::PredicateAtom::Subtype(..)
| ty::PredicateAtom::ConstEvaluatable(..)
| ty::PredicateAtom::ConstEquate(..) => (),
ty::PredicateAtom::TypeFromEnv(..) => unimplemented!(),
Contributor Author

What would you think about changing this list (which is also a duplicate) to a _ pattern?
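
To make the trade-off behind this question concrete, here is a minimal sketch. The `Atom` enum and both functions are hypothetical stand-ins, not the real `ty::PredicateAtom` or any compiler code. Listing every variant means that adding a new one (as this PR does with `TypeFromEnv`) produces a compile error at each match until it is handled; a `_` arm is shorter but accepts new variants silently.

```rust
// Hypothetical stand-in for a predicate enum; NOT the real `ty::PredicateAtom`.
#[derive(Debug, Clone, Copy)]
enum Atom {
    Trait,
    Subtype,
    ConstEquate,
    TypeFromEnv,
}

/// Exhaustive match: adding a variant to `Atom` stops this from compiling
/// until the new variant is listed explicitly.
fn is_handled_exhaustive(atom: Atom) -> bool {
    match atom {
        Atom::Trait => true,
        Atom::Subtype | Atom::ConstEquate | Atom::TypeFromEnv => false,
    }
}

/// Wildcard match: shorter, but a new variant silently falls into `_`.
fn is_handled_wildcard(atom: Atom) -> bool {
    match atom {
        Atom::Trait => true,
        _ => false,
    }
}

fn main() {
    // Both versions agree today; they only diverge in how they react to
    // a future change of the enum.
    for atom in [Atom::Trait, Atom::Subtype, Atom::ConstEquate, Atom::TypeFromEnv] {
        assert_eq!(is_handled_exhaustive(atom), is_handled_wildcard(atom));
        println!("{:?}: {}", atom, is_handled_wildcard(atom));
    }
}
```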

@jackh726
Member

jackh726 commented Sep 2, 2020

Also, it seems like one of the test outputs changed, so you'll have to run `./x.py test --bless`.

Contributor

@nikomatsakis nikomatsakis left a comment

This looks great! Nice work. I left two nits about naming, but otherwise I'm happy.

@@ -276,6 +281,113 @@ fn param_env(tcx: TyCtxt<'_>, def_id: DefId) -> ty::ParamEnv<'_> {
traits::normalize_param_env_or_error(tcx, def_id, unnormalized_env, cause)
}

fn environment<'tcx>(tcx: TyCtxt<'tcx>, def_id: DefId) -> &'tcx ty::List<Predicate<'tcx>> {
Contributor

can we add a doc comment to this fn? it's pretty non-obvious to me what it does from its name. Actually, I suspect the name should just be improved to something like well_formed_types_in_env

Contributor Author

/// Elaborate the environment.
///
/// Collect a list of `Predicate`s used for building the `ParamEnv`.
///
/// Used only in chalk mode.

@jackh726 anything to add?

Member

Perhaps mention that this adds TypeFromEnv (or what the eventual name ends up being) predicates. Specifically, because these types come from the environment, we assume they are well formed.

@jonas-schievink
Contributor

@bors try @rust-timer queue

@rust-timer
Collaborator

Awaiting bors try build completion

@bors
Contributor

bors commented Sep 2, 2020

⌛ Trying commit 8f39a0415a6ca789a18c9d3bbbba43b5fda542bd with merge 5ef250dd2ad618ee339f165e9b711a1b4746887d...

@bors bors added the S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. label Sep 11, 2020
@jonas-schievink
Contributor

@bors r=nikomatsakis

@bors
Contributor

bors commented Sep 13, 2020

📌 Commit 7dad29d has been approved by nikomatsakis

@bors
Contributor

bors commented Sep 13, 2020

⌛ Testing commit 7dad29d with merge 7402a39...

@bors
Contributor

bors commented Sep 13, 2020

☀️ Test successful - checks-actions, checks-azure
Approved by: nikomatsakis
Pushing 7402a39 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 13, 2020
@bors bors merged commit 7402a39 into rust-lang:master Sep 13, 2020
@rustbot rustbot added this to the 1.48.0 milestone Sep 13, 2020
@vandenheuvel vandenheuvel deleted the remove__paramenv__def_id branch September 13, 2020 18:29
@Mark-Simulacrum
Member

It looks like this was a pretty major wall-time regression (up to 25%). There were some ~1% improvements too, but this regression seems to outweigh those.

It looks like that was mostly in match checking, which is somewhat surprising -- there don't seem to be significant modifications to that code in this PR.

I suspect we'll want to revert as this is a pretty big regression, but it would be useful to try to analyze where the instruction diff is coming from.

@jackh726
Member

@Mark-Simulacrum that's really strange, not what I would expect at all. Seems those 2-3% regressions in instruction count for unicode_normalization are pretty significant.

I wonder if the additional PredicateAtom variant is the problem? It might be worth doing a revert+that change alone for a perf comparison.

@Mark-Simulacrum
Member

Ah, I missed that there's two things being done here -- would've been good to split that into separate commits perhaps. @vandenheuvel do you think you could prepare a PR that reverts this one, and then just adds the PredicateAtom variant as @jackh726 suggests?

@jackh726
Member

@Mark-Simulacrum in hindsight, yeah, splitting into two makes sense. But given that we didn't expect a perf regression (since we were more focused on the removal of DefId from ParamEnv), it didn't seem necessary. (Only adding the PredicateAtom variant is basically useless without also removing DefId from ParamEnv.)

If @vandenheuvel can't, I can try to make a PR this weekend for perf.

@Mark-Simulacrum
Member

No worries! Yeah, we'll check in on this in next week's perf triage (roughly Tuesday) but it's no big deal if it takes some time.

@ecstatic-morse
Contributor

Hi! This PR showed up in the weekly perf triage report. It resulted in a small improvement across the board in instruction counts. However, some benchmarks, notably unicode-normalization, regressed moderately.

Seems like this is already being looked into. Thanks all!

bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 22, 2020
Fixing the performance regression of rust-lang#76244

Issue rust-lang#74865 suggested that removing the `def_id` field from `ParamEnv` would improve performance. PR rust-lang#76244 implemented this change.

Generally, [results](https://perf.rust-lang.org/compare.html?start=80fc9b0ecb29050d45b17c64af004200afd3cfc2&end=5ef250dd2ad618ee339f165e9b711a1b4746887d) were as expected: an instruction count decrease of about a percent. The instruction count for the unicode crates increased by about 3%, which `@nnethercote` speculated to be caused by a quirk of inlining or codegen. As the results were generally positive, and this was also a step in the right direction for chalk integration, the PR was r+'d regardless.

However, [wall-time performance results](https://perf.rust-lang.org/compare.html?start=a055c5a1bd95e029e9b31891db63b6dc8258b472&end=7402a394471a6738a40fea7d4f1891666e5a80c5&stat=task-clock) show a much larger performance degradation: 25%, as [mentioned](rust-lang#76244 (comment)) by `@Mark-Simulacrum`.

This PR, for now, reverts rust-lang#76244 and attempts to find out which change caused the regression.
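
As a rough illustration of why dropping a field from a hot, frequently copied struct can shrink it, the sketch below compares two struct sizes with `std::mem::size_of`. All types here (`FakeDefId`, `EnvWithDefId`, `EnvWithoutDefId`) are hypothetical stand-ins; the real `ty::ParamEnv` and `DefId` layouts are different, so this only shows the mechanism, not the actual saving in rustc.

```rust
use std::mem::size_of;

// Hypothetical stand-in for an identifier type; NOT the real `DefId`.
#[allow(dead_code)]
#[derive(Clone, Copy)]
struct FakeDefId {
    krate: u32,
    index: u32,
}

// Hypothetical environment that still carries the optional id.
#[allow(dead_code)]
struct EnvWithDefId<'a> {
    caller_bounds: &'a [u64],
    reveal_all: bool,
    def_id: Option<FakeDefId>,
}

// The same environment with the id field removed.
#[allow(dead_code)]
struct EnvWithoutDefId<'a> {
    caller_bounds: &'a [u64],
    reveal_all: bool,
}

fn main() {
    let with = size_of::<EnvWithDefId<'static>>();
    let without = size_of::<EnvWithoutDefId<'static>>();
    // Exact numbers depend on the target and on layout decisions made by
    // the compiler, so only the relative order is checked here.
    println!("with def_id: {} bytes, without: {} bytes", with, without);
    assert!(without < with);
}
```

A smaller struct is cheaper to copy and hash, which is the kind of effect the instruction-count results were expected to show. As the wall-time numbers in this thread demonstrate, that does not guarantee a faster build.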
@vandenheuvel
Contributor Author

@ecstatic-morse this issue is now picked up in #77058.

bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 29, 2020
…m-pat, r=Mark-Simulacrum

Optimize `IntRange::from_pat`, then shrink `ParamEnv`

Resolves rust-lang#77058.

r? `@Mark-Simulacrum`
cc `@vandenheuvel`

Looking at the output of `perf report` for rust-lang#76244, the hot instructions seemed to be around the call to `pat_constructor` in `IntRange::from_pat`. I carried out an obvious optimization, but it actually made the instruction count higher (see rust-lang#77075). However, it seems to have mitigated whatever was causing the pipeline stalls, so when combined with rust-lang#76244, it's a net win.

As you can see below, the regression in rust-lang#76244 seems to have originated from something measured by `stalled-cycles-backend`. I'll try to collect some finer-grained stats to see if I can isolate it. I wish I had a better idea of what was going on here. I'd like to prevent the regression from reappearing in the future due to small changes in unrelated code.

<details>
<summary>Current `master`:</summary>

```
 Performance counter stats for 'cargo +baseline-stage1 check':

          2,275.67 msec task-clock:u              #    0.998 CPUs utilized
                 0      context-switches:u        #    0.000 K/sec
                 0      cpu-migrations:u          #    0.000 K/sec
            49,826      page-faults:u             #    0.022 M/sec
     5,117,221,678      cycles:u                  #    2.249 GHz
       299,655,943      stalled-cycles-frontend:u #    5.86% frontend cycles idle
     2,284,213,395      stalled-cycles-backend:u  #   44.64% backend cycles idle
     8,051,871,959      instructions:u            #    1.57  insn per cycle
                                                  #    0.28  stalled cycles per insn
     1,359,589,402      branches:u                #  597.447 M/sec
         7,359,347      branch-misses:u           #    0.54% of all branches

       2.281030026 seconds time elapsed

       2.108197000 seconds user
       0.164183000 seconds sys
```
</details>

<details>
<summary>Shrink `ParamEnv` without changing `IntRange::from_pat`:</summary>

```
 Performance counter stats for 'cargo +perf-stage1 check':

          2,751.79 msec task-clock:u              #    0.996 CPUs utilized
                 0      context-switches:u        #    0.000 K/sec
                 0      cpu-migrations:u          #    0.000 K/sec
            50,103      page-faults:u             #    0.018 M/sec
     6,260,590,019      cycles:u                  #    2.275 GHz
       317,355,920      stalled-cycles-frontend:u #    5.07% frontend cycles idle
     3,397,743,582      stalled-cycles-backend:u  #   54.27% backend cycles idle
     8,276,224,367      instructions:u            #    1.32  insn per cycle
                                                  #    0.41  stalled cycles per insn
     1,370,453,386      branches:u                #  498.023 M/sec
         7,281,031      branch-misses:u           #    0.53% of all branches

       2.763265838 seconds time elapsed

       2.544578000 seconds user
       0.204548000 seconds sys
```
</details>

<details>
<summary>Shrink `ParamEnv` and change `IntRange::from_pat`: </summary>

```
 Performance counter stats for 'cargo +perf-stage1 check':

          2,295.57 msec task-clock:u              #    0.996 CPUs utilized
                 0      context-switches:u        #    0.000 K/sec
                 0      cpu-migrations:u          #    0.000 K/sec
            49,959      page-faults:u             #    0.022 M/sec
     5,151,407,066      cycles:u                  #    2.244 GHz
       324,517,829      stalled-cycles-frontend:u #    6.30% frontend cycles idle
     2,301,671,001      stalled-cycles-backend:u  #   44.68% backend cycles idle
     8,130,868,329      instructions:u            #    1.58  insn per cycle
                                                  #    0.28  stalled cycles per insn
     1,356,618,512      branches:u                #  590.972 M/sec
         7,323,800      branch-misses:u           #    0.54% of all branches

       2.304509653 seconds time elapsed

       2.128090000 seconds user
       0.163909000 seconds sys
```
</details>
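
To make the comparison between the three runs easier to read, here is a small sketch that computes relative changes against the `master` baseline. The counter values are copied from the `perf stat` output above; the `Run` struct and the `pct_change` and `runs` helpers are names made up for this illustration.

```rust
/// One `perf stat` run; the values are copied from the output quoted above.
struct Run {
    name: &'static str,
    task_clock_ms: f64,
    stalled_backend: f64,
    instructions: f64,
}

/// Relative change from `base` to `new`, in percent.
fn pct_change(base: f64, new: f64) -> f64 {
    (new - base) / base * 100.0
}

/// The three runs, in the order they appear above.
fn runs() -> [Run; 3] {
    [
        Run {
            name: "master",
            task_clock_ms: 2275.67,
            stalled_backend: 2_284_213_395.0,
            instructions: 8_051_871_959.0,
        },
        Run {
            name: "shrink ParamEnv only",
            task_clock_ms: 2751.79,
            stalled_backend: 3_397_743_582.0,
            instructions: 8_276_224_367.0,
        },
        Run {
            name: "shrink ParamEnv + from_pat change",
            task_clock_ms: 2295.57,
            stalled_backend: 2_301_671_001.0,
            instructions: 8_130_868_329.0,
        },
    ]
}

fn main() {
    let [base, shrink, both] = runs();
    for run in [&shrink, &both] {
        println!(
            "{}: task-clock {:+.1}%, stalled-cycles-backend {:+.1}%, instructions {:+.1}%",
            run.name,
            pct_change(base.task_clock_ms, run.task_clock_ms),
            pct_change(base.stalled_backend, run.stalled_backend),
            pct_change(base.instructions, run.instructions),
        );
    }
}
```

For the run that only shrinks `ParamEnv`, this works out to roughly 21% more task-clock time and roughly 49% more stalled backend cycles, against an instruction count increase of under 3%. That gap is why the regression looks like a pipeline-stall effect and not extra work.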
Xanewok added a commit to Xanewok/rust-semverver that referenced this pull request Nov 18, 2020
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.