light-weight, param-driven implementation of parallelism using multiproc #99
…lelism vs non-parallelism
First of all - happy new year! 🎆 So I've had a brief look and I like the approach. Previously I've tried using There's also some small things that I like:
There's a few concerns:
@rogiervandergeer thoughts? |
The pathos dependency is how I got around the issue of pickling functions. It's basically an exact port of the mp module that uses dill as a drop-in replacement for pickle. It can serialize functions that pickle can't. |
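The pickle limitation mentioned above is easy to reproduce with the standard library alone. This is a minimal sketch, not code from the PR; the helper name `can_pickle` is hypothetical and is not part of evol, pathos, or dill.

```python
import pickle


def can_pickle(obj):
    """Return True if the standard-library pickle can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# A builtin is pickled by reference to its importable name, so this works.
print(can_pickle(len))

# A lambda has no importable name, so the standard pickle refuses it.
print(can_pickle(lambda x: x * 2))
```

Because the stdlib multiprocessing module sends work to child processes via pickle, anything that fails this check cannot be shipped to a worker without a different serializer such as dill.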
As for testing, I took a stab at unit testing the performance here. This test is far from complete coverage and it adds significant time to the test run, so I ended up taking it out. But there's no reason you couldn't reduce the sleep to something much smaller, then compare the wall-clock time of this run to the test that doesn't use concurrency. I'll try adding this back in a sane way. |
Timing added to unit tests. Tried making the latency large enough to maintain consistency on any number of cores while keeping it small enough that the test time isn't dramatically impacted. The times aren't compared at all if running on a single core, even though the functionality is still tested for multi-proc. |
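For readers who want to see the shape of such a timing comparison, here is a rough sketch using only the standard library. It is not the test from this PR: `slow_eval`, `timed_map`, and `compare_serial_and_parallel` are hypothetical names, and the latency and worker counts are arbitrary assumptions.

```python
import time
from multiprocessing import Pool


def slow_eval(x, latency=0.05):
    """Stand-in for an expensive evaluation function (hypothetical)."""
    time.sleep(latency)
    return x * x


def timed_map(map_fn, func, items):
    """Apply map_fn(func, items) and return (results, elapsed_seconds)."""
    start = time.perf_counter()
    results = list(map_fn(func, items))
    return results, time.perf_counter() - start


def compare_serial_and_parallel(n_items=8, workers=2):
    """Return (serial_seconds, parallel_seconds) for the same workload."""
    items = list(range(n_items))
    serial_results, serial_time = timed_map(map, slow_eval, items)
    with Pool(processes=workers) as pool:
        parallel_results, parallel_time = timed_map(pool.map, slow_eval, items)
    # Both paths must agree on the results, whatever the timing turns out to be.
    assert serial_results == parallel_results
    return serial_time, parallel_time
```

Call `compare_serial_and_parallel()` from under an `if __name__ == "__main__":` guard so it also works on platforms that spawn rather than fork. A test built on this would only assert that the parallel time is smaller when more than one core is available, which matches the single-core caveat above.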
Thank you @jasondemorrow for this PR! We've been meaning to implement something like this for a while, and now you beat us to it! 😊 I agree with @koaning though that we should prefer to do without introducing any new dependencies. I think the fact that we do not have any dependencies so far is one of the strong points of |
@jasondemorrow could you add
I haven't decided fully yet on |
Will work on the setup.py tonight; my initial guess (online documentation on the latest format is sparse for some reason) on how to add this seems wrong. As for pathos, it's for more than just lambdas. Pickle can't even serialize instance methods, meaning that my implementation doesn't work at all with the regular multiprocessing/pickle library. The fact that evaluation_function is assigned to a member of Population is what pickle can't handle. Using the python lib would require de-coupling the eval function from any instance. I'm not even immediately sure how you do that, but it would probably break your OO design. |
At this point I'm a little stumped as to why the build fails. The package multiprocess is public on PyPI. I notice your build is using an old version of pip; is there any way to upgrade it? |
Build passes. I reduced the dependency footprint by using just the multiprocess module of the pathos library, which can be installed as a standalone library. |
@rogiervandergeer right now i'll admit that i like the implementation, another implementation with If you're still doubting @rogiervandergeer; there might also be some middle ground. I can imagine that advanced users who need the parallelism can install evol via |
Thank you for your efforts to make evol better!
@koaning I also considered making the dependency optional, for example by not listing it as a requirement but only allowing In the end I think it is too much of a hassle. In any case I do not think that having multiprocessing functionality available by default is a bad idea. Also |
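One common way to keep such a dependency optional is to fall back at import time. The sketch below is not what this PR does; it assumes only that the third-party `multiprocess` package mirrors the standard `multiprocessing` API, and `make_pool` and `HAS_MULTIPROCESS` are hypothetical names.

```python
try:
    # Optional third-party package; uses dill for serialization.
    import multiprocess as mp
    HAS_MULTIPROCESS = True
except ImportError:
    # Fall back to the standard library, which uses pickle.
    import multiprocessing as mp
    HAS_MULTIPROCESS = False


def make_pool(workers):
    """Return a process pool, or None when only one worker is requested."""
    if workers is None or workers <= 1:
        return None
    return mp.Pool(processes=workers)
```

The cost of this pattern is that behaviour differs depending on what happens to be installed, which is part of the hassle weighed above.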
The problem is that setup.py imports evol, which in turn transitively imports multiprocess. So running "pip install ." fails because it attempts to import deps before installing them. I guess you guys never ran into this before because you were using only native libraries. The good news is, you're only using the package to get version, which could be defined someplace else. If I were to move that, any suggestions on where it should go? Examples I've seen simply defined it directly in setup.py. |
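A widely used alternative to importing the package from setup.py is to read the version string out of a source file as plain text. This is only a sketch of that pattern and not necessarily how #100 resolved it; `read_version` is a hypothetical helper, and the assumption is that the version is declared as a `__version__ = "..."` line.

```python
import re
from pathlib import Path


def read_version(init_path):
    """Extract __version__ from a Python source file without importing it."""
    text = Path(init_path).read_text(encoding="utf-8")
    match = re.search(
        r'^__version__\s*=\s*["\']([^"\']+)["\']', text, re.MULTILINE
    )
    if match is None:
        raise RuntimeError(f"no __version__ found in {init_path}")
    return match.group(1)
```

Since nothing is imported, the package's dependencies do not need to be installed when setup.py runs.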
That explains. I never liked that construction anyway. I've fixed that problem in #100. Please rebase, either after we've merged that, or on that branch. |
I've just merged #100. Good find on that odd bug. |
Perfect, thanks! |
i have just done a review and i have only one comment. @rogiervandergeer do we also want to implement parallelism for the if we don't mind implementing it in another PR then i am fine with merging. |
I went ahead and took a stab at adding concurrency to ContestPopulation. I'm not 100% confident in the change, since this code uses a lot of itertools stuff I'm not terribly familiar with. |
LGTM. @rogiervandergeer if you don't have any comments i'll gladly merge. |
Good work! I have a few more small comments but overall I think this is good to merge. What I do think is that we need more testing (e.g. parametrize all tests for Population with one and multiple cpus); but perhaps it is better to do that after we merge -- it doesn't make sense to delay the merge.
evol/population.py
@@ -327,6 +340,36 @@ def _update_documented_best(self):
        self.documented_best = copy(current_best)


class Contest(object):
In Python 3 we no longer need to inherit from object; i.e. class Contest: should suffice.
evol/population.py
        self.eval_function = eval_function
        self.pool = process_pool

    def post_evaluate(self, scores):
This should be private, in my opinion.
Just mentioning the issue here: #47. |
Thanks for the prompt and thorough reviews! |
Thank you for the contribution! We’ll try to publish a new version of evol next week.
|
For a version of population unit testing that compares execution time between evaluation with and without multiproc, see here.