Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ParallelExtend with map-reduce #479

Merged
merged 3 commits into from Nov 21, 2017

Conversation

Projects
None yet
2 participants
@cuviper
Copy link
Member

cuviper commented Nov 21, 2017

We were using a LinkedList<Vec<_>> intermediary for general ParallelExtend implementations, gathered using a fold to a vector and collect the list -- effectively a fold-fold-reduce. Now we explicitly fold-map-reduce it for better performance.

Thanks to @bluss for ridiculing the use of LinkedList, which got me to look at this again. At first I tried Vec<Vec<_>> instead, which did perform better, but I had to do this using fold-map-reduce to avoid infinite collect recursion. Then I tried that same tweak on the original code, finding it slightly better yet. So I stand by the use of LinkedList. 😄

test map_collect::i_mod_10_to_i::with_linked_list_collect_vec_sized     ... bench:   5,112,790 ns/iter (+/- 78,988)
test map_collect::i_mod_10_to_i::with_linked_list_map_reduce_vec_sized  ... bench:   4,954,336 ns/iter (+/- 103,285)
test map_collect::i_mod_10_to_i::with_vec_vec_sized                     ... bench:   4,965,241 ns/iter (+/- 80,797)
test map_collect::i_to_i::with_linked_list_collect_vec_sized            ... bench:   9,837,088 ns/iter (+/- 283,665)
test map_collect::i_to_i::with_linked_list_map_reduce_vec_sized         ... bench:   9,687,170 ns/iter (+/- 328,236)
test map_collect::i_to_i::with_vec_vec_sized                            ... bench:   9,690,253 ns/iter (+/- 242,379)
test vec_collect::vec_i::with_linked_list_collect_vec_sized             ... bench:  12,639,762 ns/iter (+/- 260,212)
test vec_collect::vec_i::with_linked_list_map_reduce_vec_sized          ... bench:   9,805,660 ns/iter (+/- 170,434)
test vec_collect::vec_i::with_vec_vec_sized                             ... bench:   9,904,740 ns/iter (+/- 226,382)
test vec_collect::vec_i_filtered::with_linked_list_collect_vec_sized    ... bench:  12,486,357 ns/iter (+/- 239,281)
test vec_collect::vec_i_filtered::with_linked_list_map_reduce_vec_sized ... bench:   9,794,605 ns/iter (+/- 306,361)
test vec_collect::vec_i_filtered::with_vec_vec_sized                    ... bench:   9,986,733 ns/iter (+/- 1,142,244)

cuviper added some commits Nov 20, 2017

Benchmark collecting to a Vec<Vec<_>> intermediate
... for comparing to the current `LinkedList<Vec<_>>` intermediate.
Benchmark LinkedList<Vec<_>> with collect vs map-reduce
`LinkedList::from_par_iter()` is a fold-reduce, so collecting a
`LinkedList<Vec<_>>` was essentially a fold-fold-reduce.  The new way
is instead a fold-map-reduce, and this wins!
@cuviper

This comment has been minimized.

Copy link
Member Author

cuviper commented Nov 21, 2017

bors r+

bors bot added a commit that referenced this pull request Nov 21, 2017

Merge #479
479:  ParallelExtend with map-reduce  r=cuviper a=cuviper

We were using a `LinkedList<Vec<_>>` intermediary for general `ParallelExtend` implementations, gathered using a `fold` to a vector and `collect` the list -- effectively a `fold`-`fold`-`reduce`.  Now we explicitly `fold`-`map`-`reduce` it for better performance.

Thanks to @bluss for [ridiculing](bluss/indexmap#45 (comment)) the use of `LinkedList`, which got me to look at this again.  At first I tried `Vec<Vec<_>>` instead, which did perform better, but I had to do this using `fold`-`map`-`reduce` to avoid infinite `collect` recursion.  Then I tried that same tweak on the original code, finding it slightly better yet.  So I stand by the use of `LinkedList`. 😄

```
test map_collect::i_mod_10_to_i::with_linked_list_collect_vec_sized     ... bench:   5,112,790 ns/iter (+/- 78,988)
test map_collect::i_mod_10_to_i::with_linked_list_map_reduce_vec_sized  ... bench:   4,954,336 ns/iter (+/- 103,285)
test map_collect::i_mod_10_to_i::with_vec_vec_sized                     ... bench:   4,965,241 ns/iter (+/- 80,797)
test map_collect::i_to_i::with_linked_list_collect_vec_sized            ... bench:   9,837,088 ns/iter (+/- 283,665)
test map_collect::i_to_i::with_linked_list_map_reduce_vec_sized         ... bench:   9,687,170 ns/iter (+/- 328,236)
test map_collect::i_to_i::with_vec_vec_sized                            ... bench:   9,690,253 ns/iter (+/- 242,379)
test vec_collect::vec_i::with_linked_list_collect_vec_sized             ... bench:  12,639,762 ns/iter (+/- 260,212)
test vec_collect::vec_i::with_linked_list_map_reduce_vec_sized          ... bench:   9,805,660 ns/iter (+/- 170,434)
test vec_collect::vec_i::with_vec_vec_sized                             ... bench:   9,904,740 ns/iter (+/- 226,382)
test vec_collect::vec_i_filtered::with_linked_list_collect_vec_sized    ... bench:  12,486,357 ns/iter (+/- 239,281)
test vec_collect::vec_i_filtered::with_linked_list_map_reduce_vec_sized ... bench:   9,794,605 ns/iter (+/- 306,361)
test vec_collect::vec_i_filtered::with_vec_vec_sized                    ... bench:   9,986,733 ns/iter (+/- 1,142,244)
```
@bors

This comment has been minimized.

Copy link
Contributor

bors bot commented Nov 21, 2017

@bors bors bot merged commit 8b7c0e7 into rayon-rs:master Nov 21, 2017

2 checks passed

bors Build succeeded
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@bluss

This comment has been minimized.

Copy link
Contributor

bluss commented Nov 21, 2017

That's awesome. Nice write-up.

@cuviper cuviper deleted the cuviper:extend_map_reduce branch Oct 18, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.