Change default weight of parallel iterators to assume expensive ops #49
Currently parallel iterators assume cheap operations. I am thinking that should be changed to assume expensive operations (i.e., fine-grained parallel splits), and have people opt in to the current behavior by manually adjusting the weights or calling `weight_min`. My reasoning is:

Thoughts?
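For illustration only, here is a rough sketch of how that opt-in might read, using the `weight_min` adapter named above from the pre-1.0 weight API; the adapter signature and surrounding calls are assumptions, not a confirmed API:

```rust
use rayon::prelude::*;

// Hypothetical sketch based on the pre-1.0 weight API discussed in this
// issue; exact signatures may differ from what rayon actually shipped.
fn increment_sum(v: &[i32]) -> i32 {
    v.par_iter()
        // The per-item work is trivially cheap, so under the proposed
        // expensive-by-default splitting we would explicitly opt back
        // into coarse splits:
        .weight_min()
        .map(|&x| x + 1)
        .sum()
}
```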
I know this is not directly related to the question, but I feel it is linked. In refactoring for parallelism, code often gets cleaner: it forces/allows the programmer to reason. This is my quandary at the moment, to be honest. I love the idea of automating parallelism, but I have found it handy in the past to be told by the compiler that I need to refactor. In essence I feel opt-in is the way forward here, possibly later giving hints for optimising algorithms for parallelism at compile time. Hope that helps a little anyway.
@dirvine thanks for the input. A few thoughts:
Hmm. I have been experimenting with this in a branch. One interesting result I found was that, when I ported the nbody demo, the par-reduce variant (which can generate quite a lot of inexpensive tasks...) ran ridiculously slowly until I raised the sequential threshold. This isn't really surprising, I guess -- the defaults are very wrong for this case -- but it does point out the danger of changing our weights.
I guess if we did more work on making task spawning cheap (work that would be very profitable in any case), that might help out here. (For that matter, par-reduce is still always slower than the more coarse-grained version.)
The branch (for the record) is
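For illustration with today's rayon, where the weight API was eventually replaced by explicit length bounds, a sequential threshold like the one described above can be expressed with `with_min_len`. A minimal sketch; the `1024` cutoff is an arbitrary placeholder, not a measured value:

```rust
use rayon::prelude::*;

// Minimal sketch: require at least 1024 elements per sequential chunk so a
// reduction over cheap per-element work is not swamped by task overhead.
fn dot(xs: &[f64], ys: &[f64]) -> f64 {
    xs.par_iter()
        .zip(ys.par_iter())
        .with_min_len(1024)
        .map(|(x, y)| x * y)
        .sum()
}
```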
Definitely significant progress here with @cuviper's https://github.com/nikomatsakis/rayon/pull/106. I still think we want to remove the existing weight stuff before 1.0 -- and maybe add it back with some other APIs.
Maybe rayon could sample how long leaf nodes take to run and dynamically adjust? Of course, some elements may require much more processing than others, but starting with fine-grained splitting and dynamically increasing split sizes may get the best of both worlds.
@edre we already do this, effectively, via the mechanism of work stealing as well as the adaptive changes. What we are talking about is tuning that mechanism. |
In particular, I think the current mechanism should work pretty well except when tasks are both highly variable and the bigger ones are clumped together.
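To make the mechanism being tuned concrete, here is an illustrative model of the split-budget-plus-stealing idea; this is a simplified sketch, not rayon's actual source, and the refresh policy is a stand-in:

```rust
// Illustrative model of adaptive splitting; not rayon's actual code.
// Each piece of work carries a budget of further splits, and being stolen
// (evidence that other threads are idle) refreshes that budget, so clumps
// of expensive work keep subdividing while uncontended work stops early.
struct Splitter {
    splits: usize, // remaining split budget for this piece of work
}

impl Splitter {
    fn new(num_threads: usize) -> Self {
        // Enough initial splits to hand every thread a piece of the work.
        Splitter { splits: num_threads }
    }

    /// Decide whether to split this piece again. `stolen` is true when an
    /// idle thread took this piece from its original owner's deque.
    fn try_split(&mut self, stolen: bool, num_threads: usize) -> bool {
        if stolen {
            // Stand-in refresh policy: a steal restores the full budget.
            self.splits = num_threads;
        }
        if self.splits > 0 {
            self.splits /= 2; // each half of the split inherits half the budget
            true
        } else {
            false
        }
    }
}
```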