Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature (un)wrap enhancement #3169

Open
wants to merge 21 commits into
base: develop
Choose a base branch
from
Open

Conversation

mnmelo
Copy link
Member

@mnmelo mnmelo commented Mar 15, 2021

Changes made in this Pull Request:

  • Optimizes unwrapping by pre-screening AG fragments that span more than half the box vectors;
  • Extends the universe-validated cache mechanism to avoid re-calculating the compound splits involved in fragment unwrapping.
  • Corner case of massless, single-particle compounds -- virtual-sites, or TIP4P dummy particles -- now handled gracefully by unwrap (addresses Unwrap with default settings fails for virtual sites #3168). Does not ensure that those virtual sites are unwrapped with the rest of the molecule, though.

Performance gains are already of ~10x for large systems. For the AdK test system (from the .gro file) I got the following:

 ===========   =============   =============   =============
                                  this PR         this PR
  num_atoms       develop       (no caching)     (caching)
 -----------   -------------   -------------   -------------
      10         561±0.9μs        431±4μs         389±1μs
     100          1.13±0ms        640±3μs         596±4μs
     1000       17.2±0.04ms     17.3±0.02ms     17.3±0.05ms
    10000         558±3ms        170±0.6ms       171±0.8ms
  All (47k)      2.17±0.01s      239±0.4ms       239±0.6ms
 ===========   =============   =============   =============

(To be comparable with develop, this test ignores the TIP4P massless MW particles)

Note that this is already including fragment caching (in develop). The caching that is compared here is just the cache of the operation of index splitting per fragment. That caching seems to only help marginally, but I'm not sure we should just leave it out because I imagine some systems may be heavier in this aspect.

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

@pep8speaks
Copy link

pep8speaks commented Mar 15, 2021

Hello @mnmelo! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 1566:80: E501 line too long (80 > 79 characters)
Line 1571:1: E302 expected 2 blank lines, found 1

Comment last updated at 2022-06-08 15:16:52 UTC

@mnmelo mnmelo force-pushed the feature-(un)wrap-enhancement branch from c73420b to d5b05d0 Compare March 15, 2021 15:33
@codecov
Copy link

codecov bot commented Mar 15, 2021

Codecov Report

Base: 93.50% // Head: 93.51% // Increases project coverage by +0.01% 🎉

Coverage data is based on head (7a5a0df) compared to base (130e862).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #3169      +/-   ##
===========================================
+ Coverage    93.50%   93.51%   +0.01%     
===========================================
  Files          190      190              
  Lines        24943    24989      +46     
  Branches      3523     3535      +12     
===========================================
+ Hits         23322    23368      +46     
  Misses        1100     1100              
  Partials       521      521              
Impacted Files Coverage Δ
package/MDAnalysis/core/groups.py 97.64% <100.00%> (+0.12%) ⬆️
package/MDAnalysis/core/topologyattrs.py 95.81% <100.00%> (+0.01%) ⬆️
package/MDAnalysis/core/universe.py 97.26% <100.00%> (ø)
package/MDAnalysis/lib/_cutil.pyx 100.00% <100.00%> (ø)
package/MDAnalysis/lib/util.py 89.50% <100.00%> (-0.15%) ⬇️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@mnmelo
Copy link
Member Author

mnmelo commented Mar 15, 2021

Sadly, this speedup is still well behind gmx trjconv speeds :(
For a synthetic test with an adk.xtc consisting of 1000 frames of the same structure (adk_oplsaa.gro) I get the following timings:

echo 0 | gmx trjconv -f adk.xtc -s adk_oplsaa.tpr -pbc mol -o pbc.xtc
real	0m6.313s
user	0m4.830s
sys	0m0.114s
...
for _frame in u.trajectory:
    u.atoms.unwrap()
    wrt.write(u.atoms)
...

real	4m8.090s
user	5m25.783s
sys	1m25.677s

Even with the ~10x speedup of this PR we're still some ~50x slower than gmx trjconv. Profiling the same 1000-frame unwrapping with MDA gives this:

         53456901 function calls (53383208 primitive calls) in 252.512 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1000  138.932    0.139  147.632    0.148 {MDAnalysis.lib._cutil.make_whole}
   115538   36.884    0.000   36.884    0.000 {built-in method numpy.array}
     2001   28.912    0.014   28.912    0.014 {built-in method builtins.sorted}
     2001   18.361    0.009   22.284    0.011 topologyattrs.py:2310(<listcomp>)
     2001    7.138    0.004   97.336    0.049 topologyattrs.py:2307(get_atoms)
    25002    6.028    0.000    6.028    0.000 {method 'reduce' of 'numpy.ufunc' objects}
51071684/51024001    3.889    0.000    4.617    0.000 util.py:1542(wrapper)
     1001    1.802    0.002    1.805    0.002 {method 'read' of 'MDAnalysis.lib.formats.libmdaxdr.XTCFile' objects}
     1000    1.682    0.002  248.248    0.248 groups.py:1683(unwrap)
     1000    1.295    0.001    1.299    0.001 {method 'write' of 'MDAnalysis.lib.formats.libmdaxdr.XTCFile' objects}
     6001    0.725    0.000   98.494    0.016 topologyattrs.py:425(__getitem__)
     ...

It seems the bulk of the work is now down to lib._cutil.make_whole (ping @zemanj, who has a branch on this optimization from the Cython side). There are other possibly excessive calls to np.array and to sort. Because I bundled reading and writing I can't be sure what those refer to.

@mnmelo mnmelo force-pushed the feature-(un)wrap-enhancement branch 2 times, most recently from 48e0bc9 to 58f57b3 Compare March 15, 2021 16:47
@mnmelo mnmelo force-pushed the feature-(un)wrap-enhancement branch 3 times, most recently from 62b3736 to 9cccbe3 Compare April 4, 2021 05:19
box vectors.

Related to the above, made universe cache validation more streamlined, moved
universe cache validation to core/groups.py, and its tests to test_groups.py.

Further optimized unwrap by skipping the rather expensive bonds
checking.

Clarified and made more uniform wrap/unwrap docs.
@mnmelo mnmelo force-pushed the feature-(un)wrap-enhancement branch from 9cccbe3 to 38ef8f0 Compare April 4, 2021 05:51
@mnmelo
Copy link
Member Author

mnmelo commented Apr 4, 2021

Made a final optimization, and I think this is ready to merge.

An initial check for bonds was being made that was costing about 40% of the entire unwrap call (see above the call to topologyattrs.py:425(__getitem__)). This is unneeded, because _split_by_compound_indices and make_whole do that check on their own, and also because no really heavy work is done before those later checks.

Updated timings for MDA (now only ~27x slower than gmx trjconv!):

real	2m42.869s
user	4m1.193s
sys	1m24.911s

With this last optimization, 88.5% of the unwrap call is now spent running the cythonized make_whole function, which seems to better reflect Python's role as a wrapper in this case.

I crawled a bit deeper into the rabbit whole, and was a bit surprised to realize that inside make_whole, 83% of that time is spent on this line, doing set differences to determine which atoms have been unwrapped, and which are still to-unwrap. This strikes me as excessive, and probably there's easy gains to make there, perhaps bringing performance at least into the same order of magnitude as gmx trjconv. Stuff for a future PR.

@mnmelo
Copy link
Member Author

mnmelo commented Apr 4, 2021

... and the results are in: @richardjgowers' tweak to make_whole brings the time of that synthetic trajectory test way down! We're now only 5x slower than gmx trjconv!

real	0m30.821s
user	0m55.682s
sys	0m29.721s

Comparing with develop (using an adaptation of the included asv benchmark that also does full universe unwrapping) this PR can be already over 100x (!!) faster:

 ===========   =============   =============
  num_atoms       develop         this PR
 -----------   -------------   -------------
       10         561±0.9μs        249±1μs
      100          1.13±0ms        265±1μs
     1000       17.2±0.04ms     3.66±0.01ms
    10000         558±3ms        14.7±0.2ms
  All (47k)      2.17±0.01s      21.3±0.2ms
 ===========   =============   =============

Line profiling of make_whole shows that load is much better distributed over vector tasks. Still, 24.4% of the time goes to the getting of bonds (line 247: bonds = atomgroup.bonds.to_indices()), which is likely tied to the slow bond accession I had already seen above. 22.5% of the time goes to the subsequent selection of indices to work on. Here's the current line timings for make_whole:

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   194                                               # map of global indices to local indices
   195      1000      10156.0     10.2      0.0      ix_view = atomgroup.ix[:]
   196      1000       1856.0      1.9      0.0      natoms = atomgroup.ix.shape[0]
   197
   198      1000      41394.0     41.4      0.1      oldpos = atomgroup.positions
   199
   200                                               # Nothing to do for less than 2 atoms
   201      1000        442.0      0.4      0.0      if natoms < 2:
   202                                                   return np.array(oldpos)
   203
   204      1000        307.0      0.3      0.0      for i in range(natoms):
   205   3341000    1389558.0      0.4      3.5          ix_to_rel[ix_view[i]] = i
   206
   207      1000        403.0      0.4      0.0      if reference_atom is None:
   208      1000        287.0      0.3      0.0          ref = 0
   209                                               else:
   210                                                   # Sanity check
   211                                                   if not reference_atom in atomgroup:
   212                                                       raise ValueError("Reference atom not in atomgroup")
   213                                                   ref = ix_to_rel[reference_atom.ix]
   214
   215      1000       6565.0      6.6      0.0      box = atomgroup.dimensions
   216      1000        491.0      0.5      0.0      for i in range(3):
   217      3000        950.0      0.3      0.0          half_box[i] = 0.5 * box[i]
   218      3000        980.0      0.3      0.0          if box[i] == 0.0:
   219                                                       raise ValueError("One or more dimensions was zero.  "
   220                                                                        "You can set dimensions using 'atomgroup.dimensions='")
   221
   222      1000        330.0      0.3      0.0      ortho = True
   223      1000        368.0      0.4      0.0      for i in range(3, 6):
   224      3000        973.0      0.3      0.0          if box[i] != 90.0:
   225      2000        665.0      0.3      0.0              ortho = False
   226
   227      1000        412.0      0.4      0.0      if ortho:
   228                                                   # If atomgroup is already unwrapped, bail out
   229                                                   is_unwrapped = True
   230                                                   for i in range(1, natoms):
   231                                                       for j in range(3):
   232                                                           if fabs(oldpos[i, j] - oldpos[0, j]) >= half_box[j]:
   233                                                               is_unwrapped = False
   234                                                               break
   235                                                       if not is_unwrapped:
   236                                                           break
   237                                                   if is_unwrapped:
   238                                                       return np.array(oldpos)
   239                                                   for i in range(3):
   240                                                       inverse_box[i] = 1.0 / box[i]
   241                                               else:
   242      1000      15963.0     16.0      0.0          from .mdamath import triclinic_vectors
   243      1000      60991.0     61.0      0.2          tri_box = triclinic_vectors(box)
   244
   245                                               # C++ dict of bonds
   246      1000        628.0      0.6      0.0      try:
   247      1000    9645472.0   9645.5     24.4          bonds = atomgroup.bonds.to_indices()
   248                                               except (AttributeError, NoDataError):
   249                                                   raise NoDataError("The atomgroup is required to have bonds")
   250      1000       1317.0      1.3      0.0      for i in range(bonds.shape[0]):
   251   3365000     933715.0      0.3      2.4          atom = bonds[i, 0]
   252   3365000     915652.0      0.3      2.3          other = bonds[i, 1]
   253                                                   # only add bonds if both atoms are in atoms set
   254   3365000    1367004.0      0.4      3.5          if ix_to_rel.count(atom) and ix_to_rel.count(other):
   255   3365000    1047097.0      0.3      2.6              atom = ix_to_rel[atom]
   256   3365000    1088477.0      0.3      2.8              other = ix_to_rel[other]
   257
   258   3365000    1704317.0      0.5      4.3              bonding[atom].insert(other)
   259   3365000    1807016.0      0.5      4.6              bonding[other].insert(atom)
   260
   261      1000      12398.0     12.4      0.0      newpos = np.zeros((oldpos.shape[0], 3), dtype=np.float32)
   262
   263      1000        820.0      0.8      0.0      todo = intset()
   264      1000        373.0      0.4      0.0      done = intset()  # Who have I already searched around?
   265                                               # initially we have one starting atom whose position is in correct image
   266      1000        616.0      0.6      0.0      todo.insert(ref)
   267      1000        320.0      0.3      0.0      for i in range(3):
   268      3000        902.0      0.3      0.0          newpos[ref, i] = oldpos[ref, i]
   269
   270      1000        379.0      0.4      0.0      while not todo.empty():
   271   3341000     911768.0      0.3      2.3          atom = deref(todo.begin())
   272   3341000     997489.0      0.3      2.5          todo.erase(todo.begin())
   273
   274   6681000    2478349.0      0.4      6.3          for other in bonding[atom]:
   275                                                       # If other is already a refpoint, leave alone
   276   6730000    2288131.0      0.3      5.8              if done.count(other) or todo.count(other):
   277   3390000     986825.0      0.3      2.5                  continue
   278                                                       # Draw vector from atom to other
   279   3340000     909444.0      0.3      2.3              for i in range(3):
   280  10020000    2754291.0      0.3      7.0                  vec[i] = oldpos[other, i] - newpos[atom, i]
   281                                                       # Apply periodic boundary conditions to this vector
   282   3340000     907822.0      0.3      2.3              if ortho:
   283                                                           minimum_image(&vec[0], &box[0], &inverse_box[0])
   284                                                       else:
   285   3340000    1121333.0      0.3      2.8                  minimum_image_triclinic(&vec[0], &tri_box[0, 0])
   286                                                       # Then define position of other based on this vector
   287   3340000     908447.0      0.3      2.3              for i in range(3):
   288  10020000    2738295.0      0.3      6.9                  newpos[other, i] = newpos[atom, i] + vec[i]
   289
   290                                                       # This other atom can now be used as a reference point
   291   3340000    1185546.0      0.4      3.0              todo.insert(other)
   292   3341000    1293332.0      0.4      3.3          done.insert(atom)
   293
   294      1000        304.0      0.3      0.0      if <np.intp_t> done.size() != natoms:
   295                                                   raise ValueError("AtomGroup was not contiguous from bonds, process failed")
   296      1000        410.0      0.4      0.0      if inplace:
   297                                                   atomgroup.positions = newpos
   298      1000      29068.0     29.1      0.1      return np.array(newpos)

@mnmelo mnmelo force-pushed the feature-(un)wrap-enhancement branch 3 times, most recently from 12316d4 to 3022707 Compare April 5, 2021 02:24
@mnmelo mnmelo force-pushed the feature-(un)wrap-enhancement branch from 3022707 to d6b5e42 Compare April 5, 2021 11:38
@orbeckst
Copy link
Member

This PR would make OTF transformations not only useable but competitive...

Obviously, the PR desperately needs someone to merge with current develop and the, most importantly, a few eyes that know the part of the code.

@PicoCentauri
Copy link
Contributor

I love it if this nice feature gets into the code. I can try to rebase it and review the changes and test it, even though I am not an expert on this part of the code.

@orbeckst
Copy link
Member

orbeckst commented Jun 3, 2022

That would be fantastic, @PicoCentauri !

@mnmelo
Copy link
Member Author

mnmelo commented Jun 10, 2022

Hello and as usual apologies for real life keeping me away from the cool stuff I'd like to make happen. If you're up to the task, @PicoCentauri, I'll try to keep an eye out for github pings and assist where possible.

@PicoCentauri
Copy link
Contributor

No problem, an eye would be great. I did some tests by my own and it seems that everything is working great. If there are no major objections by the other @MDAnalysis/coredevs we can soon merge this and look and more speedups after this PR.

@richardjgowers
Copy link
Member

richardjgowers commented Jun 11, 2022 via email

Copy link
Member

@richardjgowers richardjgowers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we keep the ptp stuff (that's really cool!) and remove the cache stuff since original benchmarks show its only a minimal extra improvement?

positions = unique_atoms.positions
spread = positions.ptp(axis=0)
spread = distances.transform_RtoS(spread, self.dimensions)
if np.any(spread > .5):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Im trying to remember if this is still true for triclinic boxes or if you need a trig correction...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine as is, since you already do the trig transform on the axis system. I don't have any formal proof, though (one may exist in some textbook, I guess?). I just couldn't come up with any breaking examples. (Note that this assumes that bonds cannot be larger than half the box vectors, which should be mostly ok).

Copy link
Member Author

@mnmelo mnmelo Jan 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I've been trying to prove this a bit more formally for triclinic cells.
The current ptp->RtoS strategy essentially means

Take each compound's bounding box diagonal vector, decompose it into the triclinic vectors, and see if any resulting component is larger than half the corresponding box vector.

So the question here is: are the projected diagonals always at least as large as they can be in every component using this method? I can already find cases where they aren't, so I'll dig some more. I have a hunch that I should be able to just compare the untransformed ptp difference to the orthorhombic vectors.

A direct alternative is to RtoS transform the entire system prior to unwrapping, but that seems costly...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you write it down in equations, I am happy to put on my "reviewer 2" hat and look over it ;-).

Copy link
Member Author

@mnmelo mnmelo Jan 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @orbeckst, I'm definitely taking you up on that offer! Please review the reasoning here while I clean up code/testing for pushing.

After much head scratching, and some performance measurements, my conclusions are as follows regarding the shortcuts in non-rectangular boxes:
Here I'm defining as 'safe' either a compound that is wholly within half of each vector and therefore assumed not to need unwrapping, or the box volume defined by the half box vectors

  • Transforming the diagonals of compound bounding boxes to S space may wrongly decide a compound is safe when it isn't. The most obvious counter-example is to imagine
    1. the safe parallelepiped defined by half of each box vector (in essence a scaled down version of the box, corresponding to an eighth of its volume), which limits the safe space a compound can occupy, and
    2. the smallest rectangular parallelepiped that circumscribes the one in i). This rectangular parallelepiped, which is equivalent to its own bounding box, will have one diagonal that is entirely contained in the safe volume. Yet, parts of the parallelepiped will be outside, and it will span more than half of one or more box vectors.
  • The lazy way out is to transform all atoms (not just diagonal vectors) to S space and compare compound spans there against 0.5. In my tests this incurs in a penalty between 15 and 20% of performance. Let's try to avoid that.

Strategy

The idea is to find, for each compound, the parallelepiped with same angles as the box and varying $l_a$, $l_b$ and $l_c$ so that it circumscribes the compound's rectangular bounding box. We then compare those lengths to half the box vector lengths to decide whether a compound is safe.

Inscribed rectangular parallelepipeds

The ASCII figures below represents top views of box parallelepipeds, down along the z-axis. In each, the two parallelograms represent the bottom ( $z=0$ ) and top ( $z=c_z$ ) faces. The inscribed rectangular parallelepiped, with faces parallel to the $xy$, $xz$ and $yz$ planes, is the one that fits inside the projected intersection of the $z=0$ and $z=c_z$ parallelograms. Its face on the $xy$ plane is marked by points $\mathbf{r_1},\mathbf{r_2},\mathbf{r_4},\mathbf{r_3}$. The origin is marked by #. Depending on the $c$ vector, there are four possible cases:

Case I: $c_y \ge 0$ and $\frac{c_x}{c_y} &gt; \frac{b_x}{b_y}$

               _________________
              /z=cz            /
       __l___/__________      /
      /     /|r3   r4| /     /
     /     /_|_______|/_____/
    /        r1      /r2
   #_z=0____________/

Case II: $c_y \ge 0$ and $\frac{c_x}{c_y} \le \frac{b_x}{b_y}$

      _________________
     /z=cz            / 
    /  ______________/__
   /  /|r4      r3| /  /
  /__/_|__________|/  /
    /  r1        r2  /
   #_z=0____________/

Case III: $c_y &lt; 0$ and $\frac{c_x}{c_y} &gt; \frac{b_x}{b_y}$ (equivalent to case I, but with symmetric $c_x$, $c_y$)

            _________________
           /z=0             /
    __l___/__________      /
   /     /|r3   r4| /     /
  /     #_|_______|/_____/
 /        r1      /r2
/_z=cz___________/

Case IV: $c_y &lt; 0$ and $\frac{c_x}{c_y} \le \frac{b_x}{b_y}$ (equivalent to case II, but with symmetric $c_x$, $c_y$)

       _________________
      /z=0             /
     /  ______________/__
    /  /|r4      r3| /  /
   #__/_|__________|/  /
     /  r1        r2  /
    /_z=cz___________/

Note that, at least regarding inscribed volumes, cases I and III are equivalent because they correspond to parallelepipeds symmetrized through the z-axis (conversion of one into the other requires inverting the signs of $c_x$ and $c_y$). Likewise for cases II and IV. We shall look only at cases I and II (later, we'll see how to interconvert between cases I and II, which will simplify the algorithm since we'll only have to deal with one case).

For case I the coordinates of $\mathbf{r_1}$, $\mathbf{r_2}$, $\mathbf{r_3}$ and $\mathbf{r_4}$ (all in the same $xy$ plane) are given by:
$\mathbf{r_1} = (b_x + l, c_y)$
$\mathbf{r_2} = (a_x + c_x - l, c_y)$
$\mathbf{r_3} = (b_x + l, b_y)$
$\mathbf{r_4} = (a_x + c_x - l, b_y)$
where $a_x$, $b_x$, etc. are the respective components of the triclinic volume vectors, and $l$ is the length of the segment indicated in the figure:
$l = c_x - c_y \frac{b_x}{b_y}$

$r_x$ and $r_y$ are the edges of this inscribed rectangle along $x$ and $y$, resp., with lengths:
$r_x = a_x + c_x - b_x - 2l = a_x - c_x - b_x + 2c_y\frac{b_x}{b_y}$
$r_y = b_y - c_y$

For case II (simpler, because the inscribed rectangular parallelepiped is now defined by parallelogram vertices, rather than by their intersection) $\mathbf{r_1}$, $\mathbf{r_2}$, $\mathbf{r_3}$ and $\mathbf{r_4}$ are:
$\mathbf{r_1} = (b_x, c_y)$
$\mathbf{r_2} = (a_x + c_x, c_y)$
$\mathbf{r_3} = (b_x, b_y)$
$\mathbf{r_4} = (a_x + c_x, b_y)$

and the $r_x$ and $r_y$ side lengths:
$r_x = a_x + c_x - b_x$
$r_y = b_y - c_y$

Circumscribed parallelepiped

The above reasoning can be reversed to find the triclinc parallelepiped that circumscribes a given rectangular one (in our case, the latter will be the compounds' bounding boxes). We'll want the circumscribing parallelepiped to have the same angles as the box, for direct comparison with the reference safe volume. We find it by finding scaling factors $f_a$, $f_b$ and $f_c$ that, when multiplied by the box vectors, yield an inscribed rectangular parallelepiped equal to our bounding box.

If we have a rectangular bounding box that we want circumscribed, with edge lengths $r_x$, $r_y$ and $r_z$, the equations above become:

for case I
$f_a a_x - f_b b_x + f_c (2c_y\frac{b_x}{b_y} - c_x) = r_x$ (note the cancelling out of $f_b$ in the $\frac{b_x}{b_y}$ ratio)

for case II
$f_a a_x - f_b b_x + f_c c_x = r_x$

The y and z dimensions are the same in both cases
$f_b b_y - f_c c_y = r_y$
$f_c c_z = r_z$

In matrix form:
case I

$$\begin{bmatrix} a_x & -b_x & 2c_y\frac{b_x}{b_y} - c_x \\ 0 & b_y & -c_y \\ 0 & 0 & c_z \end{bmatrix} \cdot \begin{bmatrix} f_a \\ f_b \\ f_c \end{bmatrix} = \begin{bmatrix} r_x \\ r_y \\ r_z \end{bmatrix}$$

case II

$$\begin{bmatrix} a_x & -b_x & c_x \\ 0 & b_y & -c_y \\ 0 & 0 & c_z \end{bmatrix} \cdot \begin{bmatrix} f_a \\ f_b \\ f_c \end{bmatrix} = \begin{bmatrix} r_x \\ r_y \\ r_z \end{bmatrix}$$

With $\mathbf{P}$ the coefficients' matrix, $\mathbf{f}$ the factors' vector and $\mathbf{r}$ the rectangular parallelepiped's dimensions vector (the latter corresponding to the all-positive bounding box diagonal vector):
$\mathbf{P} \cdot \mathbf{f} = \mathbf{r}$
$\mathbf{f} = \mathbf{P}^{-1} \cdot \mathbf{r}$

After obtaining $\mathbf{f}$, from only the compound's bounding box's diagonal (the all-positive diagonal is the $\left [ r_x, r_y, r_z \right ]$ vector), deciding which molecules are safe is a direct comparison with 0.5.

Case I and II simplification

We can also see by comparing $\mathbf{P}$ for either case, which differ only in their topmost-rightmost element, that conversion from case I to II is simply a matter of subtracting $2l$ from $c_x$, where $l = c_x - b_x c_y / b_y$ (see the case I figure above). A geometric visualization of this equivalence is to picture the case I above, but with the $z=0$ parallelogram shifted by $l$ to the right, and the $z=c_z$ parallelogram shifted by $l$ to the left. The inscribed rectangular parallelogram will be exactly the same, but now circumscribed by a parallelogram of case II.

For code simplification, all box types will be converted to case II before solving for $\mathbf{f}$. Note that this conversion and the determination of $\mathbf{P}$ is rather cheap, and need only be done once per frame.

Negative $b_x$ and other oddities

The above cases cover all possible box types with $a$ along the $x$ axis, $b$ in the $xy$ plane, and positive $a_x$, $b_y$, and $c_z$ (note that the common assumption that $|b_x| \leq \frac{a_x}{2}$, $|c_x| \leq \frac{a_x}{2}$ and $|c_y| \leq \frac{b_y}{2}$ needed not be invoked).

There remains the (rare?) case when $b_x &lt; 0$, corresponding to boxes with $\measuredangle ab &gt; 90^{\circ}$. It can be easily seen (sorry, tired of ASCII drawings) that mirroring of $b$ and $c$ through the $yz$ plane yields equivalent (rotated) parallelograms of one of the cases above.

Because of all the symmetry/rotation equivalences it seems that this method would be valid for any boxes that have $a$ and $b$ in the $xy$ plane (proof left as an exercise to the reader), but the ones discussed until here should be the only valid MD ones.

Further optimization

There may still be molecules within the safe volume that are not identified as such with the above method (it's fine to unwrap them even if unneeded; we just lose a bit of performance). We can directly exclude those with bounding box diagonals outside the rectangular parallelepiped that circumscribes the safe volume. The remaining, which by now should be a small fraction, can be refined by RtoS transformation of all atoms, although the bookkeeping cost may offset the possible gains.

Alternative approaches

I also looked into an apparently simpler approach: the problem with the RtoS transformation of the bounding box diagonal is that the result might not be the S-space diagonal with largest spread. Initially, I was RtoS transforming only the all-positive diagonal of a compound's bounding box; yet there are three other bounding-box diagonals that will transform to vectors in S-space with different $x$ and $y$ spans (the span in $z$ is always the same due to $a$ and $b$ lying in the $xy$ plane).

This points to a conceptually simpler solution than the lengthy one above: to find if any of the four diagonals of a molecule's bounding box has a span greater than 0.5 in S-space. I tried finding if this could be simplified as finding a specific offending diagonal among the four, but couldn't come up with anything simpler than just RtoS transforming all four diagonals.

Now, RtoS transforming all four diagonals seems easy, but we can see it's one more transformation than transforming all atoms of a three-particle molecule abundantly used in simulation. So I left it at the solution above, which does only one matrix multiplication operation per molecule. Perhaps someone else wants to take a gander at this approach, which would simplify code somewhat.

Conclusions

In the end I was able to get back the 10-20% lost to RtoS transforming the entire system (yay!)

The code to do this isn't terribly complicated. I implemented it as a simple function that checks box type, converts to case II, and returns $\mathbf{P}^{-1}$ for multiplication later. It would be important, however, to somehow link all this explanation to the code. Do you think a link to a comment in a PR-review in github is solid enough?

Regarding testing, I'm thinking of synthetically creating all four cases (eight cases, with negative $b_x$), but maybe you have other sugestions?

Cheers!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll have some time Friday to look at it. Thanks for the ASCII art and using math notation.

@mnmelo
Copy link
Member Author

mnmelo commented Jun 20, 2022

@richardjgowers I agree with leaving the index-splitting caching out. Too complex, little gain. The fragment cache stays, right? There was quite some gain there.

@PicoCentauri
Copy link
Contributor

@mnmelo would be able to remove the index caching? For you it is probably quick while for me it would took a while to remove the correct lines.

@mnmelo
Copy link
Member Author

mnmelo commented Jan 3, 2023

@richardjgowers, @PicoCentauri: I finally had some time to go over this during the holidays. Other than having to get into the spirit of this code again, excising the cache was simpler than I thought.
A question: I'm getting some darker_lint nagging on stuff like double quotes instead of single, and some really questionable suggestions on how to break lines. This is new since I last submitted PRs. How much of those suggestions should I follow? (Or where can I read up on style guide updates? The Wiki page hasn't been updated since 2020).

I am also trying to prove better the correctness of the ptp -> RtoS sequence @richardjgowers pointed out in a review comment. I'm trying to work out some proof either way. Of course the time-saving idea here is not to have to do RtoS on all the atoms, but only on the diagonals of bounding boxes of compounds. I'll continue the discussion under that review comment.

@orbeckst
Copy link
Member

orbeckst commented Jan 6, 2023

I let @IAlibay speak to the darker_linting stuff — I think we take it mostly as suggestions as it is a stop-gap measure in the absence of PEP8speaks. At least that's my recollection.

@IAlibay
Copy link
Member

IAlibay commented Jan 6, 2023

I let @IAlibay speak to the darker_linting stuff — I think we take it mostly as suggestions as it is a stop-gap measure in the absence of PEP8speaks. At least that's my recollection.

Pretty much, all PEP8 things are really advisory anyways.

The only thing I would specifically look out for are flake8 messages, since those specifically highlight PEP8 violations rather than black formatting preferences.

Here we have one such violation, a indentation issue for core/groups.py:1964:47

See: https://github.com/MDAnalysis/mdanalysis/actions/runs/3827118314/jobs/6511441089#step:4:448

Still to-do, possibly in other PRs:

- Make the unwrap transformation use AtomGroup.unwrap so it can
benefit from speedups (must settle the conflicting default behaviors of
using either None or com as the reference).

- Consider moving the unwrapping pre-selection to inside make whole (it already
does something like that)
@mnmelo mnmelo force-pushed the feature-(un)wrap-enhancement branch from 5fa9c45 to 7b587a5 Compare January 17, 2023 11:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Wrapping / unwrapping
  
Awaiting triage
Development

Successfully merging this pull request may close these issues.

None yet

7 participants