Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implements compact wrapping and nojump, and updates associated transforms. #2982

Open
wants to merge 4 commits into
base: develop
Choose a base branch
from

Conversation

mnmelo
Copy link
Member

@mnmelo mnmelo commented Oct 13, 2020

Implemented function distances.apply_compact_PBC. With it I could easily adapt AtomGroup.wrap() to also do compact wrapping, depending on a keyword. Backward compatibility is maintained.

Also optimized the way compound_indices are handled in AtomGroup.wrap(), with major speedup:
For an AtomGroup of 146k atoms:

Before

%timeit ag.wrap(compound='fragments', inplace=False)
1.08 s ± 1.02 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

After

%timeit ag.wrap(compound='fragments', inplace=False)
72.7 ms ± 1.38 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In transformations:

  • Made transformations.wrap and transformations.unwrap transparently expose AtomGroup.wrap() and AtomGroup.unwrap(), to take advantage of the new functionality.

  • Added transformations.nojump (should this be a separate PR?)

Also added the center keyword to distances.apply_PBC (with care to minimally impact performance), which allows the PBC wrapping to an arbitrarily centered box. This helps with vector treatment in transformations.nojump, and can be useful in other cases (say, to wrap atoms to a box centered on the origin or on a dynamic point). Should AtomGroup.wrap() expose this capability too?

Specific tests still to be written, but would like feedback in the meantime. Existing tests pass, except for the seemingly unrelated testsuite/MDAnalysisTests/analysis/test_pca.py:test_given_mean and testsuite/MDAnalysisTests/analysis/test_encore.py:TestEncoreClustering.test_clustering_three_ensembles_two_identical (which also fail for the unmodified develop branch).

Feedback much appreciated!

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

(There's no issue for these enhancements. Should I open one?)

@pep8speaks
Copy link

pep8speaks commented Oct 13, 2020

Hello @mnmelo! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 1488:80: E501 line too long (80 > 79 characters)

Line 211:80: E501 line too long (80 > 79 characters)

Comment last updated at 2020-10-14 13:44:37 UTC

@mnmelo mnmelo force-pushed the feature-compact-nojump-wrap_perf branch from 69bc40d to 110bd91 Compare October 13, 2020 14:41
Optimized AtomGroup.wrap() for compounds.

Added transformations.nojump; transformations.wrap and transformations.unwrap
now transparently expose AtomGroup.wrap() and AtomGroup.unwrap().

Added function apply_compact_PBC and added 'center' keyword to
apply_PBC.

Some PBC-related doc cleaning and clarification.
@mnmelo mnmelo force-pushed the feature-compact-nojump-wrap_perf branch from 110bd91 to 05c0fd9 Compare October 13, 2020 14:45
@codecov
Copy link

codecov bot commented Oct 13, 2020

Codecov Report

Merging #2982 into develop will decrease coverage by 0.24%.
The diff coverage is 31.18%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #2982      +/-   ##
===========================================
- Coverage    93.05%   92.80%   -0.25%     
===========================================
  Files          186      186              
  Lines        24609    24679      +70     
  Branches      3187     3197      +10     
===========================================
+ Hits         22900    22904       +4     
- Misses        1661     1727      +66     
  Partials        48       48              
Impacted Files Coverage Δ
package/MDAnalysis/lib/mdamath.py 100.00% <ø> (ø)
package/MDAnalysis/lib/distances.py 90.77% <21.21%> (-7.64%) ⬇️
package/MDAnalysis/transformations/wrap.py 33.89% <24.00%> (-66.11%) ⬇️
package/MDAnalysis/core/groups.py 98.57% <100.00%> (-0.01%) ⬇️
package/MDAnalysis/transformations/__init__.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a9543ba...85d566b. Read the comment docs.

@orbeckst
Copy link
Member

I didn't have the time to review, sorry. But: >10 times speedup for wrapping/unwrapping — that will finally make these transformations usable 🚀 . Also, getting compact and nojump will be super-neat.

I would break up the PR – one for the performance, one or more PRs for new features. The smaller the better, really.

If the performance one can be separate from everything else then we can easily backport it.

@mnmelo
Copy link
Member Author

mnmelo commented Oct 14, 2020

Perfect, I'll split it up, then. Just to be clear, I only optimized the wrapping. Was actually right now looking at the unwrapping and think we can look at some 5x speedup for a 100k atom system. Anyway, we can discuss better in the respective PRs.

Also, the atom-based wrapping is already quite fast (milliseconds for 100k atoms). What I optimized was the fragment based wrapping.

@orbeckst
Copy link
Member

The fragment-based wrapping is the crucial one for making sane trajectories.... so any improvements there are very welcome!

@mnmelo
Copy link
Member Author

mnmelo commented Nov 3, 2020

Thanks for the positive outlook, @orbeckst. To make this work I propose some refactoring to the compound and cache code in #3000 and #3005, and further unwrap-specific optimizations in #2376. I really need some approval of those ideas moving forward, because they involve some work and touch some central aspects that others might want to chip in.

@orbeckst
Copy link
Member

nojump would be really good for MSD (@hmacdope recently told someone on the mailing list that they had to use GROMACS for that part) and compact has always been one my favorites in trjconv. Having both would boost transformations.

I notice that all the prerequisite PRs have been merged so in principle we "just" need (1) merging of develop into this PR and (2) a few 👀 on the code. It would be a shame to let it rot.

@hmacdope
Copy link
Member

hmacdope commented Nov 8, 2023

@mnmelo there is now an nojump method available, and this PR is quite stale. Is it okay for me to close here?

@mnmelo
Copy link
Member Author

mnmelo commented Nov 8, 2023

Sure!

@orbeckst
Copy link
Member

orbeckst commented Nov 8, 2023

We have the nojump but IIRC a major contribution of this PR to speed up unwrapping ("make molecules whole across PBC") which is still veeeeeeeeeeeeeeeeeery slow.

It's become very clear to me that I don't have the bandwidth to move this PR forward (apologies) but I am also not sure that it makes sense to close it if there's code in there that we should really be working on. @hmacdope have you had a deeper look?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Wrapping / unwrapping
  
Awaiting triage
Development

Successfully merging this pull request may close these issues.

None yet

5 participants