New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize `Substs::super_fold_with`. #37108

Merged
merged 2 commits into from Oct 19, 2016

Conversation

Projects
None yet
8 participants
@nnethercote
Contributor

nnethercote commented Oct 12, 2016

This speeds up some of the rustc-benchmarks by up to ~4%.

futures-rs-test  4.467s vs  4.387s --> 1.018x faster (variance: 1.001x, 1.006x)
helloworld       0.242s vs  0.246s --> 0.980x faster (variance: 1.007x, 1.013x)
html5ever-2016-  7.664s vs  7.630s --> 1.004x faster (variance: 1.008x, 1.006x)
hyper.0.5.0      5.218s vs  5.133s --> 1.016x faster (variance: 1.013x, 1.008x)
inflate-0.1.0    5.040s vs  5.103s --> 0.988x faster (variance: 1.005x, 1.008x)
issue-32062-equ  0.361s vs  0.345s --> 1.047x faster (variance: 1.008x, 1.019x)
issue-32278-big  1.874s vs  1.850s --> 1.013x faster (variance: 1.020x, 1.018x)
jld-day15-parse  1.569s vs  1.508s --> 1.040x faster (variance: 1.009x, 1.003x)
piston-image-0. 12.210s vs 11.903s --> 1.026x faster (variance: 1.045x, 1.010x)
regex.0.1.30     2.568s vs  2.555s --> 1.005x faster (variance: 1.018x, 1.044x)
rust-encoding-0  2.139s vs  2.135s --> 1.001x faster (variance: 1.012x, 1.005x)
syntex-0.42.2   33.099s vs 32.353s --> 1.023x faster (variance: 1.003x, 1.028x)
syntex-0.42.2-i 17.989s vs 17.431s --> 1.032x faster (variance: 1.009x, 1.018x)

r? @eddyb. I don't know how this interacts with the changes that dikaiosune has been working on.

@eddyb

This comment has been minimized.

Show comment
Hide comment
@eddyb

eddyb Oct 12, 2016

Member

So this is an interesting take. I'd like to do it after moving to on-stack arrays and slice arena allocations instead of always allocating a Vec. Then it could either be an extension of that algorithm or maybe it wouldn't even have a measurable impact etc

Member

eddyb commented Oct 12, 2016

So this is an interesting take. I'd like to do it after moving to on-stack arrays and slice arena allocations instead of always allocating a Vec. Then it could either be an extension of that algorithm or maybe it wouldn't even have a measurable impact etc

@eddyb

This comment has been minimized.

Show comment
Hide comment
@eddyb

eddyb Oct 12, 2016

Member

Oh and this doesn't handle ty::relate which may have a larger impact.

Member

eddyb commented Oct 12, 2016

Oh and this doesn't handle ty::relate which may have a larger impact.

@nnethercote

This comment has been minimized.

Show comment
Hide comment
@nnethercote

nnethercote Oct 12, 2016

Contributor

I'd like to do it after moving to on-stack arrays and slice arena allocations instead of always allocating a Vec.

My original version of the patch just checked if params == self.params and skipped the mk_substs call in that case, and that alone was a decent improvement because it avoids the interning, which is dominated by hash operations.

Then I added the skip-Vec-creation-in-the-length-1 case and that improved things a bit more.

Contributor

nnethercote commented Oct 12, 2016

I'd like to do it after moving to on-stack arrays and slice arena allocations instead of always allocating a Vec.

My original version of the patch just checked if params == self.params and skipped the mk_substs call in that case, and that alone was a decent improvement because it avoids the interning, which is dominated by hash operations.

Then I added the skip-Vec-creation-in-the-length-1 case and that improved things a bit more.

@nnethercote

This comment has been minimized.

Show comment
Hide comment
@nnethercote

nnethercote Oct 12, 2016

Contributor

Oh and this doesn't handle ty::relate which may have a larger impact.

I haven't seen it in any of the profiles I've looked at. Can you describe in more detail how you think it might have an impact?

Contributor

nnethercote commented Oct 12, 2016

Oh and this doesn't handle ty::relate which may have a larger impact.

I haven't seen it in any of the profiles I've looked at. Can you describe in more detail how you think it might have an impact?

@bors

This comment has been minimized.

Show comment
Hide comment
@bors

bors Oct 15, 2016

Contributor

☔️ The latest upstream changes (presumably #37100) made this pull request unmergeable. Please resolve the merge conflicts.

Contributor

bors commented Oct 15, 2016

☔️ The latest upstream changes (presumably #37100) made this pull request unmergeable. Please resolve the merge conflicts.

Optimize `Substs::super_fold_with`.
This speeds up several rustc-benchmarks by 1--4%.
@nnethercote

This comment has been minimized.

Show comment
Hide comment
@nnethercote

nnethercote Oct 18, 2016

Contributor

I have changed the code to just do the "skip mk_substs if the fold was a no-op" optimisation. That should avoid any conflicts with @Mark-Simulacrum's changes to Substs. It still gives some nice speed-ups.

futures-rs-test  4.189s vs  4.157s --> 1.008x faster (variance: 1.004x, 1.007x)
helloworld       0.223s vs  0.223s --> 1.000x faster (variance: 1.013x, 1.012x)
html5ever-2016-  5.192s vs  5.147s --> 1.009x faster (variance: 1.007x, 1.004x)
hyper.0.5.0      4.788s vs  4.763s --> 1.005x faster (variance: 1.008x, 1.008x)
inflate-0.1.0    4.553s vs  4.494s --> 1.013x faster (variance: 1.010x, 1.044x)
issue-32062-equ  0.324s vs  0.314s --> 1.032x faster (variance: 1.008x, 1.020x)
issue-32278-big  1.661s vs  1.654s --> 1.005x faster (variance: 1.008x, 1.002x)
jld-day15-parse  1.475s vs  1.422s --> 1.038x faster (variance: 1.007x, 1.008x)
piston-image-0. 11.386s vs 11.227s --> 1.014x faster (variance: 1.005x, 1.063x)
reddit-stress    2.359s vs  2.308s --> 1.022x faster (variance: 1.017x, 1.024x)
regex.0.1.30     2.421s vs  2.388s --> 1.014x faster (variance: 1.036x, 1.008x)
rust-encoding-0  1.983s vs  1.973s --> 1.005x faster (variance: 1.008x, 1.016x)
syntex-0.42.2   29.582s vs 29.610s --> 0.999x faster (variance: 1.014x, 1.045x)
syntex-0.42.2-i 15.203s vs 15.175s --> 1.002x faster (variance: 1.011x, 1.006x)

r? @eddyb

Contributor

nnethercote commented Oct 18, 2016

I have changed the code to just do the "skip mk_substs if the fold was a no-op" optimisation. That should avoid any conflicts with @Mark-Simulacrum's changes to Substs. It still gives some nice speed-ups.

futures-rs-test  4.189s vs  4.157s --> 1.008x faster (variance: 1.004x, 1.007x)
helloworld       0.223s vs  0.223s --> 1.000x faster (variance: 1.013x, 1.012x)
html5ever-2016-  5.192s vs  5.147s --> 1.009x faster (variance: 1.007x, 1.004x)
hyper.0.5.0      4.788s vs  4.763s --> 1.005x faster (variance: 1.008x, 1.008x)
inflate-0.1.0    4.553s vs  4.494s --> 1.013x faster (variance: 1.010x, 1.044x)
issue-32062-equ  0.324s vs  0.314s --> 1.032x faster (variance: 1.008x, 1.020x)
issue-32278-big  1.661s vs  1.654s --> 1.005x faster (variance: 1.008x, 1.002x)
jld-day15-parse  1.475s vs  1.422s --> 1.038x faster (variance: 1.007x, 1.008x)
piston-image-0. 11.386s vs 11.227s --> 1.014x faster (variance: 1.005x, 1.063x)
reddit-stress    2.359s vs  2.308s --> 1.022x faster (variance: 1.017x, 1.024x)
regex.0.1.30     2.421s vs  2.388s --> 1.014x faster (variance: 1.036x, 1.008x)
rust-encoding-0  1.983s vs  1.973s --> 1.005x faster (variance: 1.008x, 1.016x)
syntex-0.42.2   29.582s vs 29.610s --> 0.999x faster (variance: 1.014x, 1.045x)
syntex-0.42.2-i 15.203s vs 15.175s --> 1.002x faster (variance: 1.011x, 1.006x)

r? @eddyb

@Mark-Simulacrum

This comment has been minimized.

Show comment
Hide comment
@Mark-Simulacrum

Mark-Simulacrum Oct 18, 2016

Member

Note that it will be at least a few days, more likely a week until I can resume work on the Substs improvements, so I'm not sure removing things now is a good idea. I haven't followed this PR closely though, not sure how large/significant the changes here are.

Member

Mark-Simulacrum commented Oct 18, 2016

Note that it will be at least a few days, more likely a week until I can resume work on the Substs improvements, so I'm not sure removing things now is a good idea. I haven't followed this PR closely though, not sure how large/significant the changes here are.

@eddyb

eddyb approved these changes Oct 18, 2016

Show outdated Hide outdated src/librustc/ty/subst.rs
@eddyb

This comment has been minimized.

Show comment
Hide comment
@eddyb
Member

eddyb commented Oct 18, 2016

@bors r+

@bors

This comment has been minimized.

Show comment
Hide comment
@bors

bors Oct 18, 2016

Contributor

📌 Commit 1e4241a has been approved by eddyb

Contributor

bors commented Oct 18, 2016

📌 Commit 1e4241a has been approved by eddyb

@bors

This comment has been minimized.

Show comment
Hide comment
@bors

bors Oct 18, 2016

Contributor

⌛️ Testing commit 1e4241a with merge eeb6379...

Contributor

bors commented Oct 18, 2016

⌛️ Testing commit 1e4241a with merge eeb6379...

bors added a commit that referenced this pull request Oct 18, 2016

Auto merge of #37108 - nnethercote:substs-experimentation, r=eddyb
Optimize `Substs::super_fold_with`.

This speeds up some of the rustc-benchmarks by up to ~4%.
```
futures-rs-test  4.467s vs  4.387s --> 1.018x faster (variance: 1.001x, 1.006x)
helloworld       0.242s vs  0.246s --> 0.980x faster (variance: 1.007x, 1.013x)
html5ever-2016-  7.664s vs  7.630s --> 1.004x faster (variance: 1.008x, 1.006x)
hyper.0.5.0      5.218s vs  5.133s --> 1.016x faster (variance: 1.013x, 1.008x)
inflate-0.1.0    5.040s vs  5.103s --> 0.988x faster (variance: 1.005x, 1.008x)
issue-32062-equ  0.361s vs  0.345s --> 1.047x faster (variance: 1.008x, 1.019x)
issue-32278-big  1.874s vs  1.850s --> 1.013x faster (variance: 1.020x, 1.018x)
jld-day15-parse  1.569s vs  1.508s --> 1.040x faster (variance: 1.009x, 1.003x)
piston-image-0. 12.210s vs 11.903s --> 1.026x faster (variance: 1.045x, 1.010x)
regex.0.1.30     2.568s vs  2.555s --> 1.005x faster (variance: 1.018x, 1.044x)
rust-encoding-0  2.139s vs  2.135s --> 1.001x faster (variance: 1.012x, 1.005x)
syntex-0.42.2   33.099s vs 32.353s --> 1.023x faster (variance: 1.003x, 1.028x)
syntex-0.42.2-i 17.989s vs 17.431s --> 1.032x faster (variance: 1.009x, 1.018x)
```
r? @eddyb. I don't know how this interacts with the changes that dikaiosune has been working on.
@alexcrichton

This comment has been minimized.

Show comment
Hide comment
@alexcrichton

alexcrichton Oct 18, 2016

Member

@bors: retry force clean

  • restarted buildbot
Member

alexcrichton commented Oct 18, 2016

@bors: retry force clean

  • restarted buildbot
Format comment in a nicer way.
Just kidding I'm doing this only to unstuck @bors/homu/buildbot.
@eddyb

This comment has been minimized.

Show comment
Hide comment
@eddyb
Member

eddyb commented Oct 19, 2016

@bors r+

@bors

This comment has been minimized.

Show comment
Hide comment
@bors

bors Oct 19, 2016

Contributor

📌 Commit ab5dcff has been approved by eddyb

Contributor

bors commented Oct 19, 2016

📌 Commit ab5dcff has been approved by eddyb

eddyb added a commit to eddyb/rust that referenced this pull request Oct 19, 2016

Rollup merge of #37108 - nnethercote:substs-experimentation, r=eddyb
Optimize `Substs::super_fold_with`.

This speeds up some of the rustc-benchmarks by up to ~4%.
```
futures-rs-test  4.467s vs  4.387s --> 1.018x faster (variance: 1.001x, 1.006x)
helloworld       0.242s vs  0.246s --> 0.980x faster (variance: 1.007x, 1.013x)
html5ever-2016-  7.664s vs  7.630s --> 1.004x faster (variance: 1.008x, 1.006x)
hyper.0.5.0      5.218s vs  5.133s --> 1.016x faster (variance: 1.013x, 1.008x)
inflate-0.1.0    5.040s vs  5.103s --> 0.988x faster (variance: 1.005x, 1.008x)
issue-32062-equ  0.361s vs  0.345s --> 1.047x faster (variance: 1.008x, 1.019x)
issue-32278-big  1.874s vs  1.850s --> 1.013x faster (variance: 1.020x, 1.018x)
jld-day15-parse  1.569s vs  1.508s --> 1.040x faster (variance: 1.009x, 1.003x)
piston-image-0. 12.210s vs 11.903s --> 1.026x faster (variance: 1.045x, 1.010x)
regex.0.1.30     2.568s vs  2.555s --> 1.005x faster (variance: 1.018x, 1.044x)
rust-encoding-0  2.139s vs  2.135s --> 1.001x faster (variance: 1.012x, 1.005x)
syntex-0.42.2   33.099s vs 32.353s --> 1.023x faster (variance: 1.003x, 1.028x)
syntex-0.42.2-i 17.989s vs 17.431s --> 1.032x faster (variance: 1.009x, 1.018x)
```
r? @eddyb. I don't know how this interacts with the changes that dikaiosune has been working on.

@eddyb eddyb referenced this pull request Oct 19, 2016

Merged

Rollup of 23 pull requests #37269

bors added a commit that referenced this pull request Oct 19, 2016

@bors bors merged commit ab5dcff into rust-lang:master Oct 19, 2016

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@nnethercote nnethercote deleted the nnethercote:substs-experimentation branch Oct 19, 2016

@brson brson added the relnotes label Oct 24, 2016

bors added a commit that referenced this pull request Nov 25, 2016

Auto merge of #37979 - nnethercote:Ty-super_fold_with, r=eddyb
Avoid more unnecessary mk_ty calls in Ty::super_fold_with.

This speeds up several rustc-benchmarks by 1--5%.

This PR is the lovechild of #37108 and #37705.
```
futures-rs-test  4.059s vs  4.011s --> 1.012x faster (variance: 1.016x, 1.026x)
helloworld       0.236s vs  0.239s --> 0.986x faster (variance: 1.051x, 1.014x)
html5ever-2016-  3.831s vs  3.824s --> 1.002x faster (variance: 1.020x, 1.019x)
hyper.0.5.0      4.928s vs  4.936s --> 0.998x faster (variance: 1.003x, 1.012x)
inflate-0.1.0    4.135s vs  4.104s --> 1.007x faster (variance: 1.026x, 1.028x)
issue-32062-equ  0.309s vs  0.303s --> 1.017x faster (variance: 1.019x, 1.084x)
issue-32278-big  1.818s vs  1.797s --> 1.011x faster (variance: 1.011x, 1.008x)
jld-day15-parse  1.304s vs  1.271s --> 1.026x faster (variance: 1.018x, 1.012x)
piston-image-0. 10.938s vs 10.921s --> 1.002x faster (variance: 1.025x, 1.016x)
reddit-stress    2.327s vs  2.208s --> 1.054x faster (variance: 1.016x, 1.006x)
regex-0.1.80     8.796s vs  8.727s --> 1.008x faster (variance: 1.012x, 1.019x)
regex.0.1.30     2.294s vs  2.249s --> 1.020x faster (variance: 1.013x, 1.026x)
rust-encoding-0  1.914s vs  1.886s --> 1.015x faster (variance: 1.027x, 1.026x)
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment