Accelerate sampling #8
Maybe merge your current pull request and I can look from there.
FYI, not blaming pymbar here--I think it's just that there are lots of molecules to reweight.
The whole dataset is a random subset of FreeSolv for the training set, yes---not all of FreeSolv?
Might help to do a bit of profiling to see what the slow step actually is.
I believe the current code creates a new Force object for each molecule
every time. Some caching could speed things up.
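The caching idea can be sketched in pure Python; `build_force` and the molecule keys below are hypothetical stand-ins for the real per-molecule OpenMM Force construction, just to show that the expensive build happens once per molecule rather than once per evaluation:

```python
from functools import lru_cache

# Counts how many times the expensive build actually runs.
BUILD_CALLS = {"count": 0}

def build_force(molecule_key):
    """Hypothetical stand-in for constructing an OpenMM Force object."""
    BUILD_CALLS["count"] += 1
    return {"molecule": molecule_key}  # placeholder for a Force

@lru_cache(maxsize=None)
def get_force(molecule_key):
    """Return a cached Force for this molecule, building it only once."""
    return build_force(molecule_key)

# The reweighting loop touches each molecule many times, but each
# Force is built exactly once.
for _ in range(3):
    for mol in ("methanol", "ethanol", "benzene"):
        force = get_force(mol)
```

Keying the cache on the molecule identity (rather than rebuilding per iteration) is the whole trick; the same pattern works with a plain dict if the keys aren't hashable-friendly.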
Unlikely that pymbar is actually the slow part.
I'm profiling now. Also, I agree that pymbar is not the slow part. I was just saying that the step where everything is reweighted is likely the slow step, since there are a lot of molecules.
Ok, so I did some profiling, here is the truncated output, sorted by cumulative time:
Also, here was the command that I used:
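The actual command didn't survive this extract; purely as a generic illustration (not the thread's command), a cumulative-time-sorted report like the one above can be produced in-process with the standard-library `cProfile`/`pstats` modules:

```python
import cProfile
import io
import pstats

def slow_step():
    """Hypothetical stand-in for the per-molecule reweighting work."""
    return sum(i * i for i in range(10000))

profiler = cProfile.Profile()
profiler.enable()
for _ in range(50):
    slow_step()
profiler.disable()

# Sort by cumulative time, matching the output format pasted above.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(10)  # show only the top 10 entries
report = stream.getvalue()
```

The same thing from the shell would be `python -m cProfile -s cumulative script.py`.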
As @pgrinaway noted, the `getState` calls could indicate that energy calculation is the rate-limiting step here.
Ok, so I profiled again using Instruments in Xcode to take a closer look at what is going on. As suspected, the biggest consumer of instructions is …. Alternatively:
So if the energy calculation is rate limiting, it's possible that we could do something like this:
There is an example of this in pytraj:
We're not currently using Amber's …. It might be interesting to use this strategy to allow either the OpenMM or … backend to be used.
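The strategy of swapping energy engines can be sketched as a small backend interface, so the reweighting code never cares whether OpenMM, Amber (e.g. via pytraj), or anything else computes the energies. All class and function names here are hypothetical, and the "energy" is a dummy formula standing in for a real force-field evaluation:

```python
class EnergyBackend:
    """Minimal interface the reweighting code would depend on."""
    def energies(self, coordinates, parameters):
        raise NotImplementedError

class DummyOpenMMBackend(EnergyBackend):
    # In reality this would call context.getState(getEnergy=True)
    # once per frame; here we fake it with arithmetic.
    def energies(self, coordinates, parameters):
        return [parameters["scale"] * sum(xyz) for xyz in coordinates]

def reweight(backend, coordinates, parameters):
    """Reweighting sees only the EnergyBackend interface."""
    return backend.energies(coordinates, parameters)

frames = [(0.0, 1.0, 2.0), (1.0, 1.0, 1.0)]
result = reweight(DummyOpenMMBackend(), frames, {"scale": 2.0})
```

A pytraj- or Amber-based backend would then just be another subclass, selected at run time.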
I seem to recall that the …. The OpenMM ….
New updates:
It seems nearly 31% of the CPU's time was spent compiling the CustomGBForce kernels--I hadn't realized (though the reason is fairly obvious now that I think of it) that the CustomForces are compiled at context initialization time. I'd imagine we could recover that 31% by modifying the PyMC code to use arrays, and then using a distributed computing framework to hold on to contexts and prevent recompilation. That scheme would also allow us to distribute the remaining expense as well.
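The "hold on to contexts" idea can be sketched as a per-molecule context cache: pay the CustomGBForce kernel-compilation cost once per molecule, then on later parameter proposals update parameters in place (cf. OpenMM's `Context.setParameter`) instead of recreating the Context. Everything below is a hypothetical stand-in, with a counter in place of actual kernel compilation:

```python
COMPILES = {"count": 0}

class FakeContext:
    """Stand-in for an OpenMM Context wrapping a CustomGBForce."""
    def __init__(self, molecule):
        COMPILES["count"] += 1  # stands in for kernel compilation
        self.molecule = molecule
        self.params = {}

    def set_parameter(self, name, value):
        # Cheap per-proposal update, analogous to Context.setParameter.
        self.params[name] = value

_context_cache = {}

def get_context(molecule):
    """Build each molecule's context once; reuse it afterwards."""
    if molecule not in _context_cache:
        _context_cache[molecule] = FakeContext(molecule)
    return _context_cache[molecule]

# Two MCMC iterations over the same molecules: kernels compile only once
# per molecule, not once per iteration.
for radius in (0.15, 0.17):
    for mol in ("methanol", "benzene"):
        ctx = get_context(mol)
        ctx.set_parameter("gb_radius", radius)
```

This is also the shape that distributes naturally: each worker in a distributed framework owns its cache and never recompiles.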
What about just caching the …? Actually, are you able to profile the …?
Yeah. I had previously imagined there would be some roadblock to this (I had imagined refactoring the model to pass the …).
Yep! The stuff above from earlier is ….
Ok, profiled the ….
I'll try caching + ….
Note that I'm not sure if ….
This may yet be the best idea. Let's chat about this tomorrow?
Yeah, that sounds like a good plan. I think that is probably the best in the long run too, because it will let us distribute the energy computations. Trying to cache the ….
Sounds good!
Looks like from my runs last night that this is very slow on the whole dataset. I'll profile and figure out what could be done (perhaps parallelize the reweighting step?). Not sure what priority this should be, though.