
Multi-threading particle swarm #735

Closed · wants to merge 3 commits

Conversation

@tbeason commented Aug 13, 2019

I added Threads.@threads in two spots: compute_cost! and limit_X!. The speedup on my particular problem was approximately 3x going from 1 thread to 6, the bulk of which comes from parallel execution of the cost function.

The user does not have to do anything additional; they just need to start Julia with multiple threads.

I have not run any further tests, and I am not sure how well threading would work on every problem. If there is IO or something else funky inside the cost function (calls to an RNG?), there could be an issue with multi-threading (from what I read in the recent blog post on the homepage).

I am on julia v1.3.0-alpha.
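For reference, the change is essentially a threaded loop over independent per-particle cost evaluations. A minimal sketch (the real `compute_cost!` signature in Optim may differ; the names here are illustrative):

```julia
# Hypothetical sketch of the change: parallelize the per-particle cost
# evaluation with Threads.@threads. Each particle is a column of X and
# writes only to its own slot of `score`, so no locking is needed as
# long as `f` itself is thread-safe.
function compute_cost!(f, n_particles::Int, X::AbstractMatrix, score::AbstractVector)
    Threads.@threads for i in 1:n_particles
        score[i] = f(view(X, :, i))
    end
    return score
end
```

Started with, e.g., `JULIA_NUM_THREADS=6 julia`, the loop iterations are distributed across the available threads.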

@ChrisRackauckas (Contributor)
I think this is the wrong way to go. Instead, I think the interface should be changed so that the user gives a function f!(dx,x) which computes the loss at multiple points. That way if they want it multithreaded, GPU'd, etc. they can do it. Building it all into the package will never make everything work.
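As a rough sketch of that interface, the user would hand Optim a batched objective and choose the parallelism themselves (names here are illustrative, not an actual Optim API):

```julia
# The solver passes a matrix X whose columns are the points to evaluate;
# the user fills F with the loss at each point, parallelizing however
# they like (threads here, but pmap or a GPU kernel would fit the same
# signature). The loss shown is just an example (squared norm).
function batch_objective!(F::AbstractVector, X::AbstractMatrix)
    Threads.@threads for j in axes(X, 2)
        F[j] = sum(abs2, view(X, :, j))
    end
    return F
end
```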

@tbeason (Author) commented Aug 13, 2019
That sounds nice, but it also sounds like a major change. I just wanted to implement something here that at least brings this more on par with solvers in languages like MATLAB, where this is already possible.

Like I said, I haven't tested it on any problems other than mine, but it should work fine as long as the cost function is relatively standalone.

@antoine-levitt (Contributor)
Also tricky because most optimizers don't support multiple evaluations at the same time, which would make it sort of weird to change the API for this single particular case. In general there's an explosion in the properties of the objective functions (differentiable, in place, multiple evaluations) that's pretty nasty to handle...

Short term, I think this feature could be useful, provided it's toggled by a flag. (Also, objective functions that have threaded BLAS calls inside them should not be threaded, at least for now.)

codecov bot commented Aug 13, 2019

Codecov Report

Merging #735 into master will decrease coverage by 0.04%.
The diff coverage is 100%.


@@            Coverage Diff             @@
##           master     #735      +/-   ##
==========================================
- Coverage   81.69%   81.64%   -0.05%     
==========================================
  Files          43       43              
  Lines        2414     2414              
==========================================
- Hits         1972     1971       -1     
- Misses        442      443       +1
| Impacted Files | Coverage Δ |
|---|---|
| ...ultivariate/solvers/zeroth_order/particle_swarm.jl | 98.21% <100%> (ø) ⬆️ |
| src/multivariate/solvers/constrained/samin.jl | 75.55% <0%> (-0.75%) ⬇️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

codecov bot commented Aug 13, 2019

Codecov Report

Merging #735 into master will decrease coverage by 0.16%.
The diff coverage is 42.85%.


@@            Coverage Diff             @@
##           master     #735      +/-   ##
==========================================
- Coverage   81.69%   81.52%   -0.17%     
==========================================
  Files          43       43              
  Lines        2414     2419       +5     
==========================================
  Hits         1972     1972              
- Misses        442      447       +5
| Impacted Files | Coverage Δ |
|---|---|
| ...ultivariate/solvers/zeroth_order/particle_swarm.jl | 96.5% <42.85%> (-1.71%) ⬇️ |
| src/multivariate/solvers/constrained/samin.jl | 75.55% <0%> (-0.75%) ⬇️ |


@tbeason (Author) commented Aug 13, 2019
I can add a user supplied threaded::Bool option to the ParticleSwarm type/constructor and a note in the docs tomorrow morning.

@tbeason (Author) commented Aug 14, 2019
I added a user-supplied flag to enable the multi-threading, which, when enabled, just calls the threaded version of the compute_cost! function. I updated the documentation. I removed the threading in the limit_X! function because it would likely only provide a noticeable benefit if both the number of parameters and the number of particles were very large.
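A minimal sketch of how such a flag could route between the serial and threaded evaluation paths (hypothetical names, not the PR's code verbatim):

```julia
# Toy stand-in for the solver options carrying the user-supplied flag.
struct ParticleSwarmOpts
    n_particles::Int
    threaded::Bool
end

# Dispatch on the flag: the threaded branch mirrors the serial one but
# splits the independent per-particle evaluations across threads.
function evaluate_swarm!(f, X, score, opts::ParticleSwarmOpts)
    if opts.threaded
        Threads.@threads for i in 1:opts.n_particles
            score[i] = f(view(X, :, i))
        end
    else
        for i in 1:opts.n_particles
            score[i] = f(view(X, :, i))
        end
    end
    return score
end
```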

There is one outstanding issue I've noticed: the tracking of the total number of function calls is off. When run single-threaded, the number of function calls appears to be (N+1)*P + 1, where N is the number of iterations and P is the number of particles. With threading enabled, it always reports slightly fewer. I snooped around a bit, and it seems this comes from some of Optim's internals, which are a bit beyond me. If someone else could take a look, that would be helpful.

@antoine-levitt (Contributor)
Hm, that's right. The easy way out is to make the counter increment atomic. That's in NLSolversBase, in the objective function wrappers.
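A standalone illustration of the suggested fix, using an atomic counter so concurrent f-calls don't lose increments (NLSolversBase's actual wrappers store a plain Int; this is not its code):

```julia
using Base.Threads: Atomic, atomic_add!, @threads

# Atomic call counter: atomic_add! guarantees each increment lands,
# unlike a plain `calls += 1`, which can drop updates under contention.
calls = Atomic{Int}(0)
f_counted(x) = (atomic_add!(calls, 1); sum(abs2, x))

@threads for i in 1:1000
    f_counted([float(i)])
end

@assert calls[] == 1000  # no lost updates even with many threads
```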

@pkofod (Member) commented Aug 15, 2019
We discussed this on Slack, so I just want to note that I've seen it. I'm still digesting the change and the comments.

For some things @ChrisRackauckas's approach can be nice, but in other cases what's in here is what you need. I think we may just have to offer different modes if we want to support this. As @antoine-levitt mentions, if it's threading you're after, then many times you can take advantage of all your available threads in the objective function yourself. Sometimes the objective function is inherently serial, so it may only be possible to thread at the f-call level. Other times, @ChrisRackauckas is exactly right: I know his use case is that it can be beneficial to collect N different x's, where N is big, and then send them off to a compute node that efficiently handles big batches of solves.

So we're really after some ParallelMode subtypes (or symbols, or whatever) to control the various modes of parallelization. Merging this PR doesn't exclude us from experimenting in the future, but it's totally correct that Chris's need is the hardest to accommodate in terms of rewriting Optim.

@pkofod (Member) commented Aug 16, 2019
> need is the hardest to accommodate in terms of rewriting Optim.

So I have a prototype of this, and it wasn't hard to do at the PSO level, but it won't really play nicely with the NDifferentiable types. This is the "hardest part" I was talking about.

@pkofod (Member) commented Aug 16, 2019
What about something like

`@enum ParallelMode Serial Threading Distributed Batch`

?

Serial is what it is now, Threading is a `@threads` for loop, Distributed is a `pmap` over an `X` which contains the points to evaluate as elements, and Batch is what Chris described.
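A sketch of how such an enum could dispatch the evaluation loop (only the Serial and Threading branches are shown; Distributed would wrap `pmap` and Batch the user-supplied `f!(F, X)` — all names illustrative):

```julia
@enum ParallelMode Serial Threading Distributed Batch

# Evaluate f at every column of X, choosing the loop by mode.
function evaluate!(F, f, X, mode::ParallelMode)
    if mode == Threading
        Threads.@threads for j in axes(X, 2)
            F[j] = f(view(X, :, j))
        end
    else  # Serial fallback; Distributed/Batch would branch similarly
        for j in axes(X, 2)
            F[j] = f(view(X, :, j))
        end
    end
    return F
end
```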

@pkofod (Member) commented Mar 10, 2020
The new PSO will support this. But thanks for bringing this up! :)

@pkofod pkofod closed this Mar 10, 2020