Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use fold instead of for_each? #781

Closed
Philippe-Cholet opened this issue Oct 12, 2023 · 3 comments
Closed

Use fold instead of for_each? #781

Philippe-Cholet opened this issue Oct 12, 2023 · 3 comments

Comments

@Philippe-Cholet
Copy link
Member

Philippe-Cholet commented Oct 12, 2023

#780 (comment) by jswrenn

In the Rust standard library, it's quite common for fold to be implemented in terms of try_fold. In your original code you construct a TakeWhile and call fold on it. But TakeWhile's fold implementation delegates to try_fold!

The general rule to follow is something like this:

  • If you need to completely consume an iterator, use for_each or fold.

    • Use for_each if you don't need to maintain an accumulator.
    • Use fold if you do need to maintain an accumulator.
  • If you need to partly consume an iterator, use try_for_each or try_fold.

    • Same rules about accumulators apply.

Bold is mine as I wonder if we are using for_each (convenient) at multiple places instead of fold.

Current for_each uses:

  • lib.rs: Itertools::{counts, partition_map, join}
  • extrema_set.rs: min_set_impl
  • grouping_map.rs: GroupingMap::{aggregate, collect}
  • group_map.rs: into_group_map
  • k_smallest.rs: k_smallest

(There is also Itertools::foreach but this one is okay.)

Why do I care? While benchmarking a new MapForGrouping::fold, I found out that some_slice_iter::for_each does not delegate to fold (default behavior) and is not optimized * (just while let Some(x) = self.next() { f(x); }) while some_slice_iter::fold is optimized (which is useful to see benchmark improvements of specializing fold methods of our adaptors by delegating to the inner iterator).

*: The for_each has a comment: "We override the default implementation, which uses try_fold, because this simple implementation generates less LLVM IR and is faster to compile."

@Philippe-Cholet
Copy link
Member Author

Well, apart from Itertools::join that is known to be slow anyway, I benchmarked each "for_eachfold" change and got mostly severe regressions. The only "win" was -5% for into_group_map.

Before I close this, do you know why?

@scottmcm
Copy link
Contributor

  • Use fold if you do need to maintain an accumulator.

I would tweak this one slightly:

Use fold if you need owned access to an accumulator

If the folder is something like |v, x| { v.push(x); v }, then it's (very slightly) better to do that as .for_each(|x| v.push(x)), using an FnMut closure to refer to local state.

Last people looked into it, threading the passing of a Vec or similar through all the folds was harder for LLVM to optimize well than if there's nothing (well, a trivial ()) to pass along. So if the owned access to the accumulator -- fold's superpower -- isn't needed, in μoptimized code it's better to use for_each capturing a &mut instead.

@Philippe-Cholet
Copy link
Member Author

Thanks for the detailled precision.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants