Thoughts on Parallelism #47
Comments
I think we should. If there is a way to easily make But we should be careful. It is not difficult to come up with mutate functions which will break any parallelism (e.g. any I think the biggest win can be achieved when working with islands. Then you can make basically everything parallel. So although it certainly doesn't hurt to work on this before we've implemented the islands, we need to make sure the two can work together. |
FWIW, as a user of this library (great work BTW, thank you), I'd be fine with a set of "lower level" concurrency modules, provided with the caveat that mutate and breed should be implemented with care. |
glad to hear you like it!
we’re slowly considering how we might want to do things towards
parallelism/concurrency but we’re not super sure on what the best method is.
we’re certainly open to suggestions. most likely we’ll offer a type of
population that is able to distribute its workload. what is the use-case?
|
Yes, I'd like to speed up the evolution by distributing the work among
CPU/GPU cores on one machine. Parallelism on several nodes would be great,
but not an immediate need for me personally. It seems like the fitness
evaluation is done serially across the population? I would think fitness
evaluation could be spawned as a thread without much risk of concurrent
access to shared resources.
|
Note that the fitness function is something that we currently evaluate lazily. Suppose that we do two mutate steps and then a survive step: we only need to evaluate an individual at the survive step, not at either mutation step. The evaluation can be expensive, which is why the main tactic we deploy is to delay it. Assuming that the functions that you supply to Note that a |
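To make the lazy evaluation described above concrete, here is a minimal sketch. The `Individual` class, its `fitness` attribute, and `expensive_fitness` are hypothetical names chosen for illustration and are not taken from the library's actual code; the only point is that mutation resets the cached fitness and evaluation happens once, at the survive step.

```python
from typing import Callable, Optional


class Individual:
    """Hypothetical individual whose fitness is computed only on demand."""

    def __init__(self, chromosome, fitness: Optional[float] = None):
        self.chromosome = chromosome
        self.fitness = fitness  # None means "not evaluated yet"

    def evaluate(self, eval_function: Callable) -> None:
        # Skip the (possibly expensive) call if a fitness is already cached.
        if self.fitness is None:
            self.fitness = eval_function(self.chromosome)

    def mutate(self, mutate_function: Callable) -> "Individual":
        # A mutated individual has a new chromosome, so its fitness is unknown.
        return Individual(mutate_function(self.chromosome))


calls = []  # records every fitness call so we can count them


def expensive_fitness(x):
    calls.append(x)
    return -abs(x - 10)


population = [Individual(x) for x in range(5)]

# Two mutate steps: no fitness evaluation happens here.
population = [ind.mutate(lambda x: x + 1) for ind in population]
population = [ind.mutate(lambda x: x * 2) for ind in population]
evaluations_before_survive = len(calls)

# Survive step: fitness is needed now, so each individual is evaluated once.
for ind in population:
    ind.evaluate(expensive_fitness)
survivors = sorted(population, key=lambda ind: ind.fitness, reverse=True)[:2]
```

In this sketch the fitness function runs five times in total rather than fifteen, which is the saving that delaying evaluation is meant to buy. Any parallel scheme would need to keep that property.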
Thanks for the link. Coming from mostly a C++/Java background, I was interested to learn that Python implements threading much differently than I'd expect. But yes, the solution you suggest sounds perfect for my use case. I'll be glad to help in whatever way I can. |
I've just submitted a PR with a very simple, arg-driven impl. using multiproc (the pathos port that uses dill in place of pickle). At one point I updated the population unit test to compare execution times. On my machine, evaluating a population with 3 concurrent workers was 3 times faster, as expected. |
Interesting. I'll have a look, I've never had any experience with @rogiervandergeer opinions? |
The main reason is that, unlike pickle, dill is capable of serializing instance methods and lambdas so they can be piped to the new process. It's possible to drop that dependency, but (if I understand correctly) it would mean detaching all functions needed from their instances and making them module-scoped. |
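A small standard-library sketch of the lambda limitation mentioned above: `pickle` stores functions by their importable name, so a module-scoped function serializes while a lambda does not. The helper `can_pickle` and the function `double` are hypothetical names used only for this illustration; dill itself is not used here.

```python
import pickle


def double(x):
    """Module-scoped function: pickle stores it by its qualified name."""
    return 2 * x


def can_pickle(obj) -> bool:
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


module_function_ok = can_pickle(double)   # found by name in its module
lambda_ok = can_pickle(lambda x: 2 * x)   # a lambda has no importable name
```

This is why dropping the dependency would push users toward module-scoped functions for anything that has to cross a process boundary.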
There's great value in being able to support lambdas. |
@rogiervandergeer close this? |
We already apply some performance tricks with the `.evaluate()` mechanic, but we may be able to add some form of parallelism/queuing to make things even more performant.

In terms of easy wins: it seems like `.map` (and thus mutate) can be run in parallel in general. The same would hold for `.evaluate()` in the BasePopulation.

Do we want to explore this?
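As a rough idea of what a parallel map or evaluate could look like, here is a sketch built on the standard library's `concurrent.futures`. The helper `parallel_map`, its `executor` parameter, and the `fitness` function are assumptions made for illustration, not the library's API.

```python
from concurrent.futures import Executor, ThreadPoolExecutor
from typing import Callable, List, Optional


def parallel_map(func: Callable, items: List,
                 executor: Optional[Executor] = None) -> List:
    """Apply func to every item in order; use the executor when one is given."""
    if executor is None:
        return [func(item) for item in items]
    # Executor.map returns results in the same order as the inputs.
    return list(executor.map(func, items))


def fitness(chromosome: float) -> float:
    # Stand-in fitness function for the example.
    return -(chromosome - 3.0) ** 2


chromosomes = [0.0, 1.0, 2.0, 3.0, 4.0]

serial_scores = parallel_map(fitness, chromosomes)
with ThreadPoolExecutor(max_workers=3) as pool:
    parallel_scores = parallel_map(fitness, chromosomes, executor=pool)
```

Threads mostly help when the function waits on I/O or releases the GIL. For CPU-bound pure-Python functions a `ProcessPoolExecutor` could be passed instead, at the cost of requiring the function and its arguments to be picklable. In both cases the functions must not depend on shared mutable state.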