Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid `chain()` in `find_constraint_paths_between_regions()`. #64801

Conversation

@nnethercote
Copy link
Contributor

commented Sep 26, 2019

This iterator can be hot, and chained iterators are slow. The second
half of the chain is almost always empty, so this commit specializes the
code to avoid the chained iteration.

This change reduces instruction counts for the wg-grammar benchmark by
up to 1.5%.

@rust-highfive

This comment has been minimized.

Copy link
Collaborator

commented Sep 26, 2019

r? @estebank

(rust_highfive has picked a reviewer for you, use r? to override)

@nnethercote

This comment has been minimized.

Copy link
Contributor Author

commented Sep 26, 2019

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

Copy link

commented Sep 26, 2019

Awaiting bors try build completion

bors added a commit that referenced this pull request Sep 26, 2019
…een_regions, r=<try>

Avoid `chain()` in `find_constraint_paths_between_regions()`.

This iterator can be hot, and chained iterators are slow. The second
half of the chain is almost always empty, so this commit specializes the
code to avoid the chained iteration.

This change reduces instruction counts for the `wg-grammar` benchmark by
up to 1.5%.
@bors

This comment has been minimized.

Copy link
Contributor

commented Sep 26, 2019

⌛️ Trying commit 60fa535 with merge c30cab8...

@PlasmaPower

This comment has been minimized.

Copy link
Contributor

commented Sep 26, 2019

I'd get rid of the is_empty check, and move the code currently in .map( into the for loop. I don't think checking is_empty before the loop will be any faster, even if it is indeed empty, but I'm not an expert on optimization.

Also, you're currently calling self.applied_member_constraints(r) again if it does have elements when you could just reuse the previous call's result.

@bors

This comment has been minimized.

Copy link
Contributor

commented Sep 26, 2019

☀️ Try build successful - checks-azure
Build commit: c30cab8 (c30cab87d99b93dc2a80cda3096d1c8640ae9d44)

@rust-timer

This comment has been minimized.

Copy link

commented Sep 26, 2019

Queued c30cab8 with parent a5bc0f0, future comparison URL.

@matklad

This comment has been minimized.

Copy link
Member

commented Sep 26, 2019

Drive by comment, but wouldn't just swapping

for constraint in outgoing_edges_from_graph.chain(outgoing_edges_from_picks)

for

outgoing_edges_from_graph.chain(outgoing_edges_from_picks).for_each(|constraint|

achieve the same effect?

@nnethercote

This comment has been minimized.

Copy link
Contributor Author

commented Sep 26, 2019

@matklad: How would that help? chain is still involved. Every next() call requires two extra checks compared to a vanilla iterator. Because of this, looping over two iterators will always be faster than looping over a single chained iterator.

@matklad

This comment has been minimized.

Copy link
Member

commented Sep 26, 2019

I'll tag @scottmcm who actually knows this stuff, but my understanding is that .for_each uses internal iteration for chain. That is, it doesn't call next repeteadly and compiles more like

lhs.for_each(f);
rhs.for_each(f);

I think, in the actual coude, this bottoms out in this fold override.

Because of this, .for_each is faster for chain-like iterators (and not slower for other iterator), and should be preferred in pref-sensitive code. Not that I like this state of affairs...

EDIT: cc #59740
EDIT: chain is slower in this case.

@rust-timer

This comment has been minimized.

Copy link

commented Sep 26, 2019

Finished benchmarking try commit c30cab8, comparison URL.

Copy link
Contributor

left a comment

Other than my and @jakubadamw's nitpick, you can consider it an r+ from me.

This iterator can be hot, and chained iterators are slow. The second
half of the chain is almost always empty, so this commit changes the
code to avoid the chained iteration.

This change reduces instruction counts for the `wg-grammar` benchmark by
up to 1.5%.
@nnethercote nnethercote force-pushed the nnethercote:improve-find_constraint_paths_between_regions branch from 60fa535 to 5ca99b7 Sep 29, 2019
@nnethercote

This comment has been minimized.

Copy link
Contributor Author

commented Sep 29, 2019

Thanks for the various comments, I have addressed them all. @estebank, the new code is sufficiently different that you might want to take another look.

@nnethercote

This comment has been minimized.

Copy link
Contributor Author

commented Sep 29, 2019

@matklad: I tried using for_each as you suggested and it slightly increased the instruction count.

@nnethercote

This comment has been minimized.

Copy link
Contributor Author

commented Oct 2, 2019

It's been a few days, I'm going to carry over @estebank's earlier r+.

@bors r=estebank

@bors

This comment has been minimized.

Copy link
Contributor

commented Oct 2, 2019

📌 Commit 5ca99b7 has been approved by estebank

tmandry added a commit to tmandry/rust that referenced this pull request Oct 2, 2019
…_paths_between_regions, r=estebank

Avoid `chain()` in `find_constraint_paths_between_regions()`.

This iterator can be hot, and chained iterators are slow. The second
half of the chain is almost always empty, so this commit specializes the
code to avoid the chained iteration.

This change reduces instruction counts for the `wg-grammar` benchmark by
up to 1.5%.
bors added a commit that referenced this pull request Oct 2, 2019
Rollup of 11 pull requests

Successful merges:

 - #64649 (Avoid ICE on return outside of fn with literal array)
 - #64722 (Make all alt builders produce parallel-enabled compilers)
 - #64801 (Avoid `chain()` in `find_constraint_paths_between_regions()`.)
 - #64805 (Still more `ObligationForest` improvements.)
 - #64840 (SelfProfiler API refactoring and part one of event review)
 - #64885 (use try_fold instead of try_for_each to reduce compile time)
 - #64942 (Fix clippy warnings)
 - #64952 (Update cargo.)
 - #64974 (Fix zebra-striping in generic dataflow visualization)
 - #64978 (Fully clear `HandlerInner` in `Handler::reset_err_count`)
 - #64979 (Update books)

Failed merges:

 - #64959 (syntax: improve parameter without type suggestions)

r? @ghost
@bors bors merged commit 5ca99b7 into rust-lang:master Oct 2, 2019
4 checks passed
4 checks passed
pr Build #20190929.48 succeeded
Details
pr (Linux mingw-check) Linux mingw-check succeeded
Details
pr (Linux x86_64-gnu-llvm-6.0) Linux x86_64-gnu-llvm-6.0 succeeded
Details
pr (LinuxTools) LinuxTools succeeded
Details
@nnethercote nnethercote deleted the nnethercote:improve-find_constraint_paths_between_regions branch Oct 9, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
8 participants
You can’t perform that action at this time.