LeafletFinder issues #76
Comments
These things came up in the context of me trying out various tests in #66. We need to test these problems separately here, just to make sure it's not something from me running the tests incorrectly. |
Okay, let me check them and I will get back to you. |
So MDAnalysis technically supports pickling and unpickling. We never documented how they should be used, though. @richardjgowers @jbarnoud |
Hello @orbeckst @kain88-de, concerning the first error: the reason is that the number of atoms present is not divisible by the number of processes (see leaflet.py#L192). There are two things I can think of doing here:
|
AtomGroups
If we can use pickling of AGs then that would be great. Otherwise the approach in the standard serial version should work, whereby you
However, come to think of it, that will be awful for performance because you would be doing this for every frame. So scratch that idea. Can you write it such that only coordinates are communicated to the dask workers? Numpy arrays are not problematic. Perhaps @VOD555 and @kain88-de have some better ideas.
n_jobs
Do the partitions have to be the same size? Is it not possible to have some with different sizes? Changing
If possible, unequal partition sizes would be my preferred solution, followed by dummies. Alternatively, oversubscribing workers might also help but I'd be interested in seeing performance data. |
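To make the "only communicate coordinates" suggestion concrete, here is a minimal sketch, not the LeafletFinder code. It assumes the coordinates are available as a plain `(n_atoms, 3)` NumPy array (the `positions` array below is made-up stand-in data, and `split_coordinates` is a hypothetical helper name). It splits the array into blocks that are allowed to differ in size and checks that each block survives a pickle round trip, which is what would happen when the blocks are shipped to workers.

```python
import pickle

import numpy as np


def split_coordinates(positions, n_blocks):
    """Split an (n_atoms, 3) array into n_blocks row blocks.

    np.array_split allows blocks of unequal size, so n_atoms does not
    have to be divisible by n_blocks.
    """
    return np.array_split(positions, n_blocks, axis=0)


# Stand-in data: 10 "atoms" with 3 coordinates each (hypothetical values).
positions = np.arange(30, dtype=np.float32).reshape(10, 3)

blocks = split_coordinates(positions, 3)

# Plain NumPy arrays can be pickled and unpickled without any Universe.
restored = [pickle.loads(pickle.dumps(block)) for block in blocks]
block_sizes = [len(block) for block in restored]
```

With 10 atoms and 3 blocks, the blocks have 4, 3 and 3 rows, so no dummy atoms are needed to pad the partitions.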
I'll have a look at the pickling, see if I can recall how it works. But I never really needed to use it. @mnmelo is probably one who knows the most about it, though. Ping? |
For your n_jobs problem you can also use make_balanced_slices
(https://github.com/MDAnalysis/pmda/blob/master/pmda/util.py#L62). It solves
the same problem for our standard classes.
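The idea of balanced, unequal partitions can be sketched in a few lines of plain Python. This is only an illustration of the technique, not the pmda implementation linked above; the function name `balanced_slices` and its signature are hypothetical.

```python
def balanced_slices(n_items, n_blocks):
    """Return n_blocks contiguous slices covering range(n_items).

    Block sizes differ by at most one: the first (n_items % n_blocks)
    blocks get one extra item.
    """
    if n_blocks < 1:
        raise ValueError("n_blocks must be at least 1")
    base, extra = divmod(n_items, n_blocks)
    slices = []
    start = 0
    for i in range(n_blocks):
        size = base + (1 if i < extra else 0)
        slices.append(slice(start, start + size))
        start += size
    return slices


# 10 items over 4 workers: sizes 3, 3, 2, 2
example = balanced_slices(10, 4)
sizes = [s.stop - s.start for s in example]
```

Each worker then gets `atoms[s]` for its slice `s`, and the atom count no longer has to be a multiple of the number of processes.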
|
Pickling should work like:
import pickle
import MDAnalysis as mda

u = mda.Universe(...., anchor_name='this')
# make a pickle of each atomgroup
pickles = [pickle.dumps(ag) for ag in atomgroups]
# In parallel processes
# make a Universe with the same anchor_name
# this only has to happen once per worker, so could be done using the `initializer` argument of multiprocessing.Pool
u = mda.Universe(....., anchor_name='this')
ags = [pickle.loads(s) for s in pickles] |
@iparask could you have a look at this issue again? It would be good to have this fixed for the SciPy paper. |
Very quick issue with various things that came up during PR #66; apologies for the messy report. See PR #81 for initial (failing) tests.
n_jobs
LeafletFinder with n_jobs == 2 does not pass tests, see #66 (comment)
distributed
LeafletFinder with scheduler as distributed.client fails, see also started PR #81. (complete error message from pytest)