-
Notifications
You must be signed in to change notification settings - Fork 638
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature (un)wrap enhancement #3169
base: develop
Are you sure you want to change the base?
Conversation
Hello @mnmelo! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
Comment last updated at 2022-06-08 15:16:52 UTC |
c73420b
to
d5b05d0
Compare
Codecov ReportBase: 93.50% // Head: 93.51% // Increases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## develop #3169 +/- ##
===========================================
+ Coverage 93.50% 93.51% +0.01%
===========================================
Files 190 190
Lines 24943 24989 +46
Branches 3523 3535 +12
===========================================
+ Hits 23322 23368 +46
Misses 1100 1100
Partials 521 521
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
Sadly, this speedup is still well behind
Even with the ~10x speedup of this PR we're still some ~50x slower than
It seems the bulk of the work is now down to |
48e0bc9
to
58f57b3
Compare
62b3736
to
9cccbe3
Compare
box vectors. Related to the above, made universe cache validation more streamlined, moved universe cache validation to core/groups.py, and its tests to test_groups.py. Further optimized unwrap by skipping the rather expensive bonds checking. Clarified and made more uniform wrap/unwrap docs.
9cccbe3
to
38ef8f0
Compare
Made a final optimization, and I think this is ready to merge. An initial check for bonds was being made that was costing about 40% of the entire unwrap call (see above the call to Updated timings for MDA (now only ~27x slower than
With this last optimization, 88.5% of the I crawled a bit deeper into the rabbit whole, and was a bit surprised to realize that inside |
... and the results are in: @richardjgowers' tweak to
Comparing with
Line profiling of
|
12316d4
to
3022707
Compare
3022707
to
d6b5e42
Compare
This PR would make OTF transformations not only useable but competitive... Obviously, the PR desperately needs someone to merge with current develop and the, most importantly, a few eyes that know the part of the code. |
I love it if this nice feature gets into the code. I can try to rebase it and review the changes and test it, even though I am not an expert on this part of the code. |
That would be fantastic, @PicoCentauri ! |
46e0df5
to
72db762
Compare
Hello and as usual apologies for real life keeping me away from the cool stuff I'd like to make happen. If you're up to the task, @PicoCentauri, I'll try to keep an eye out for github pings and assist where possible. |
No problem, an eye would be great. I did some tests by my own and it seems that everything is working great. If there are no major objections by the other @MDAnalysis/coredevs we can soon merge this and look and more speedups after this PR. |
I’m going to want to review this. Performance is great, caches and
complexity aren’t :)
…On Sat, Jun 11, 2022 at 06:17, Philip Loche ***@***.***> wrote:
No problem, an eye would be great. I did some tests by my own and it seems
that everything is working great. If there are no major objections by the
other @MDAnalysis/coredevs
<https://github.com/orgs/MDAnalysis/teams/coredevs> we can soon merge
this and look and more speedups after this PR.
—
Reply to this email directly, view it on GitHub
<#3169 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACGSGBYFXYWMQXN4QYPVS5LVOQOPDANCNFSM4ZGX7EHQ>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we keep the ptp stuff (that's really cool!) and remove the cache stuff since original benchmarks show its only a minimal extra improvement?
positions = unique_atoms.positions | ||
spread = positions.ptp(axis=0) | ||
spread = distances.transform_RtoS(spread, self.dimensions) | ||
if np.any(spread > .5): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Im trying to remember if this is still true for triclinic boxes or if you need a trig correction...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is fine as is, since you already do the trig transform on the axis system. I don't have any formal proof, though (one may exist in some textbook, I guess?). I just couldn't come up with any breaking examples. (Note that this assumes that bonds cannot be larger than half the box vectors, which should be mostly ok).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I've been trying to prove this a bit more formally for triclinic cells.
The current ptp->RtoS strategy essentially means
Take each compound's bounding box diagonal vector, decompose it into the triclinic vectors, and see if any resulting component is larger than half the corresponding box vector.
So the question here is: are the projected diagonals always at least as large as they can be in every component using this method? I can already find cases where they aren't, so I'll dig some more. I have a hunch that I should be able to just compare the untransformed ptp difference to the orthorhombic vectors.
A direct alternative is to RtoS transform the entire system prior to unwrapping, but that seems costly...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you write it down in equations, I am happy to put on my "reviewer 2" hat and look over it ;-).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @orbeckst, I'm definitely taking you up on that offer! Please review the reasoning here while I clean up code/testing for pushing.
After much head scratching, and some performance measurements, my conclusions are as follows regarding the shortcuts in non-rectangular boxes:
Here I'm defining as 'safe' either a compound that is wholly within half of each vector and therefore assumed not to need unwrapping, or the box volume defined by the half box vectors
- Transforming the diagonals of compound bounding boxes to S space may wrongly decide a compound is safe when it isn't. The most obvious counter-example is to imagine
- the safe parallelepiped defined by half of each box vector (in essence a scaled down version of the box, corresponding to an eighth of its volume), which limits the safe space a compound can occupy, and
- the smallest rectangular parallelepiped that circumscribes the one in i). This rectangular parallelepiped, which is equivalent to its own bounding box, will have one diagonal that is entirely contained in the safe volume. Yet, parts of the parallelepiped will be outside, and it will span more than half of one or more box vectors.
- The lazy way out is to transform all atoms (not just diagonal vectors) to S space and compare compound spans there against 0.5. In my tests this incurs in a penalty between 15 and 20% of performance. Let's try to avoid that.
Strategy
The idea is to find, for each compound, the parallelepiped with same angles as the box and varying
Inscribed rectangular parallelepipeds
The ASCII figures below represents top views of box parallelepipeds, down along the z-axis. In each, the two parallelograms represent the bottom ( #
. Depending on the
Case I:
_________________
/z=cz /
__l___/__________ /
/ /|r3 r4| / /
/ /_|_______|/_____/
/ r1 /r2
#_z=0____________/
Case II:
_________________
/z=cz /
/ ______________/__
/ /|r4 r3| / /
/__/_|__________|/ /
/ r1 r2 /
#_z=0____________/
Case III:
_________________
/z=0 /
__l___/__________ /
/ /|r3 r4| / /
/ #_|_______|/_____/
/ r1 /r2
/_z=cz___________/
Case IV:
_________________
/z=0 /
/ ______________/__
/ /|r4 r3| / /
#__/_|__________|/ /
/ r1 r2 /
/_z=cz___________/
Note that, at least regarding inscribed volumes, cases I and III are equivalent because they correspond to parallelepipeds symmetrized through the z-axis (conversion of one into the other requires inverting the signs of
For case I the coordinates of
where
For case II (simpler, because the inscribed rectangular parallelepiped is now defined by parallelogram vertices, rather than by their intersection)
and the
Circumscribed parallelepiped
The above reasoning can be reversed to find the triclinc parallelepiped that circumscribes a given rectangular one (in our case, the latter will be the compounds' bounding boxes). We'll want the circumscribing parallelepiped to have the same angles as the box, for direct comparison with the reference safe volume. We find it by finding scaling factors
If we have a rectangular bounding box that we want circumscribed, with edge lengths
for case I
for case II
The y and z dimensions are the same in both cases
In matrix form:
case I
case II
With
After obtaining
Case I and II simplification
We can also see by comparing
For code simplification, all box types will be converted to case II before solving for
Negative $b_x$ and other oddities
The above cases cover all possible box types with
There remains the (rare?) case when
Because of all the symmetry/rotation equivalences it seems that this method would be valid for any boxes that have
Further optimization
There may still be molecules within the safe volume that are not identified as such with the above method (it's fine to unwrap them even if unneeded; we just lose a bit of performance). We can directly exclude those with bounding box diagonals outside the rectangular parallelepiped that circumscribes the safe volume. The remaining, which by now should be a small fraction, can be refined by RtoS transformation of all atoms, although the bookkeeping cost may offset the possible gains.
Alternative approaches
I also looked into an apparently simpler approach: the problem with the RtoS transformation of the bounding box diagonal is that the result might not be the S-space diagonal with largest spread. Initially, I was RtoS transforming only the all-positive diagonal of a compound's bounding box; yet there are three other bounding-box diagonals that will transform to vectors in S-space with different
This points to a conceptually simpler solution than the lengthy one above: to find if any of the four diagonals of a molecule's bounding box has a span greater than 0.5 in S-space. I tried finding if this could be simplified as finding a specific offending diagonal among the four, but couldn't come up with anything simpler than just RtoS transforming all four diagonals.
Now, RtoS transforming all four diagonals seems easy, but we can see it's one more transformation than transforming all atoms of a three-particle molecule abundantly used in simulation. So I left it at the solution above, which does only one matrix multiplication operation per molecule. Perhaps someone else wants to take a gander at this approach, which would simplify code somewhat.
Conclusions
In the end I was able to get back the 10-20% lost to RtoS transforming the entire system (yay!)
The code to do this isn't terribly complicated. I implemented it as a simple function that checks box type, converts to case II, and returns
Regarding testing, I'm thinking of synthetically creating all four cases (eight cases, with negative
Cheers!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll have some time Friday to look at it. Thanks for the ASCII art and using math notation.
@richardjgowers I agree with leaving the index-splitting caching out. Too complex, little gain. The fragment cache stays, right? There was quite some gain there. |
@mnmelo would be able to remove the index caching? For you it is probably quick while for me it would took a while to remove the correct lines. |
@richardjgowers, @PicoCentauri: I finally had some time to go over this during the holidays. Other than having to get into the spirit of this code again, excising the cache was simpler than I thought. I am also trying to prove better the correctness of the ptp -> RtoS sequence @richardjgowers pointed out in a review comment. I'm trying to work out some proof either way. Of course the time-saving idea here is not to have to do RtoS on all the atoms, but only on the diagonals of bounding boxes of compounds. I'll continue the discussion under that review comment. |
I let @IAlibay speak to the darker_linting stuff — I think we take it mostly as suggestions as it is a stop-gap measure in the absence of PEP8speaks. At least that's my recollection. |
Pretty much, all PEP8 things are really advisory anyways. The only thing I would specifically look out for are flake8 messages, since those specifically highlight PEP8 violations rather than Here we have one such violation, a indentation issue for core/groups.py:1964:47 See: https://github.com/MDAnalysis/mdanalysis/actions/runs/3827118314/jobs/6511441089#step:4:448 |
Still to-do, possibly in other PRs: - Make the unwrap transformation use AtomGroup.unwrap so it can benefit from speedups (must settle the conflicting default behaviors of using either None or com as the reference). - Consider moving the unwrapping pre-selection to inside make whole (it already does something like that)
5fa9c45
to
7b587a5
Compare
Changes made in this Pull Request:
Performance gains are already of ~10x for large systems. For the AdK test system (from the .gro file) I got the following:
(To be comparable with
develop
, this test ignores the TIP4P massless MW particles)Note that this is already including fragment caching (in develop). The caching that is compared here is just the cache of the operation of index splitting per fragment. That caching seems to only help marginally, but I'm not sure we should just leave it out because I imagine some systems may be heavier in this aspect.
PR Checklist