use parallel reductions to compute sum, maximum, etc #126
Comments
https://github.com/JuliaFolds/FLoops.jl might be better for replacing `@threads` on loops and in some cases reducing memory use; note that the package is unmaintained.
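For illustration, a minimal sketch of how FLoops.jl expresses a parallel reduction loop; this is not code from this repository, and the array and loop body are placeholders:

```julia
# Illustrative FLoops.jl sketch; not code from this repository.
# The array and loop body are placeholders.
using FLoops

xs = rand(1_000_000)

# `@floop` runs the loop in parallel (threaded by default);
# `@reduce(s += x)` marks `s` as a reduction variable so each
# thread keeps a private partial sum that is combined at the end.
@floop for x in xs
    @reduce(s += x)
end

@show s  # parallel sum of xs
```

The loop body stays serial-looking, so this style can replace a manual `Threads.@threads` loop with per-thread accumulators.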
musoke added a commit that referenced this issue on Mar 15, 2023:
`maximum` and `sum` as defined in Base aren't parallel, so when more than one thread is available, some threads sit idle every time they are computed. This isn't a huge part of the simulation time, but it does happen every time step, especially if certain summary statistics are extracted. Folds.jl has nearly drop-in replacements for these and other reductions. Benchmarks suggest that with 8 threads, moving to Folds.jl gives a ~30% speedup for each call to `maximum` and an ~80% speedup for `sum`.

Fixes: #126
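As a rough illustration of the swap the commit describes; the array here is a placeholder, not one of the simulation's actual fields:

```julia
# Illustrative sketch of the swap; the array is a placeholder,
# not one of the simulation's actual fields.
using Folds

ρ = rand(256, 256, 256)

total = sum(ρ)      # Base: runs on a single thread
peak  = maximum(ρ)  # Base: runs on a single thread

total_folds = Folds.sum(ρ)      # threaded reduction; same result up to
                                # floating-point reassociation
peak_folds  = Folds.maximum(ρ)  # threaded reduction; identical result

total_folds ≈ total && peak_folds == peak  # true
```

Because `Folds.sum` and `Folds.maximum` take the same arguments as the Base functions, the change is mostly a matter of qualifying the call sites.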
musoke added a commit that referenced this issue on Mar 15, 2023:
`maximum` and `sum` as defined in Base aren't parallel, so when more than one thread is available, some threads sit idle every time they are computed. This isn't a huge part of the simulation time, but it does happen every time step, especially if certain summary statistics are extracted. Folds.jl has nearly drop-in replacements for these and other reductions. Benchmarks suggest that with 8 threads, moving to Folds.jl gives a ~25-35% speedup for each call to `maximum` and a ~60-80% speedup for `sum`:

    julia> include("benchmarks/folds.jl")
    res_min = minimum(results) = 2-element BenchmarkTools.BenchmarkGroup:
      tags: []
      "Base" => 2-element BenchmarkTools.BenchmarkGroup:
          tags: []
          "sum" => TrialEstimate(322.835 μs)
          "maximum" => TrialEstimate(825.208 μs)
      "Folds.jl" => 2-element BenchmarkTools.BenchmarkGroup:
          tags: []
          "sum" => TrialEstimate(44.668 μs)
          "maximum" => TrialEstimate(573.368 μs)
    res_med = median(results) = 2-element BenchmarkTools.BenchmarkGroup:
      tags: []
      "Base" => 2-element BenchmarkTools.BenchmarkGroup:
          tags: []
          "sum" => TrialEstimate(332.623 μs)
          "maximum" => TrialEstimate(865.565 μs)
      "Folds.jl" => 2-element BenchmarkTools.BenchmarkGroup:
          tags: []
          "sum" => TrialEstimate(62.110 μs)
          "maximum" => TrialEstimate(665.724 μs)
    2-element BenchmarkTools.BenchmarkGroup:
      tags: []
      "sum" => TrialJudgement(-81.33% => improvement)
      "maximum" => TrialJudgement(-23.09% => improvement)

Fixes: #126
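The contents of `benchmarks/folds.jl` aren't reproduced above. A minimal sketch of a BenchmarkTools.jl suite that would produce output of this shape follows; it is a guess at the structure, using a placeholder array rather than the simulation's real fields, and the actual script in the repository may differ:

```julia
# Hypothetical sketch of a benchmark script such as benchmarks/folds.jl;
# the actual file in the repository may differ. The array is a placeholder.
using BenchmarkTools
using Folds

A = rand(512, 512, 64)

suite = BenchmarkGroup()
suite["Base"] = BenchmarkGroup()
suite["Folds.jl"] = BenchmarkGroup()

suite["Base"]["sum"] = @benchmarkable sum($A)
suite["Base"]["maximum"] = @benchmarkable maximum($A)
suite["Folds.jl"]["sum"] = @benchmarkable Folds.sum($A)
suite["Folds.jl"]["maximum"] = @benchmarkable Folds.maximum($A)

tune!(suite)
results = run(suite; verbose = true)

res_min = minimum(results)
res_med = median(results)

# Compare medians; negative time ratios are reported as improvements,
# matching the TrialJudgement lines in the output above.
judge(res_med["Folds.jl"], res_med["Base"])
```

`judge` compares the Folds.jl estimates against the Base ones, so a negative percentage means the Folds.jl call is faster.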