New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove lib.parallel.distances #530

Closed
orbeckst opened this Issue Nov 11, 2015 · 4 comments

Comments

Projects
None yet
2 participants
@orbeckst
Member

orbeckst commented Nov 11, 2015

I had a look at our parallel implementations of distance and geometry calculations. With PR #529 I propose a backward-compatible change that enables selecting between the serial (lib._distances) and OpenMP-enabled (lib._distances_openmp) C/Cython code that @richardjgowers wrote. I don't think that the OpenMP-enabled code was actually accessible in a convenient manner so far but with PR #529 that changes (see there).

  • I did some benchmarking (on an Intel Core 2 Duo 2.66 GHz, i.e. only 2 cores) and found that the OpenMP C/Cython code performs much better than the Cython parallel code in lib.parallel.distances (which is also based on OpenMP but within cython instead of C) as shown below.
  • Additionally, lib.parallel.distances can only take orthorhombic boxes into account and breaks/produces garbage with triclinic boxes. The C/Cython code can take PBC with all boxes into account.

Therefore, I propose to remove lib.parallel.distances.

Benchmarks

import MDAnalysis as mda
from MDAnalysis.tests.datafiles import TPR, XTC
import MDAnalysis.lib.parallel.distances as pdist
import MDAnalysis.lib.distances as dist

u = mda.Universe(TPR, XTC)
heavy = u.select_atoms("protein and not name H*")
ow = u.select_atoms("name OW")

# with result array
D = dist.distance_array(heavy.positions, ow.positions, box=u.dimensions)
D32 = D.astype(heavy.positions.dtype)

%timeit dist.distance_array(heavy.positions, ow.positions, box=u.dimensions, result=D, mode="serial")
1 loops, best of 3: 372 ms per loop

%timeit dist.distance_array(heavy.positions, ow.positions, box=u.dimensions, result=D, mode="OpenMP")
1 loops, best of 3: 203 ms per loop

%timeit pdist.distance_array(heavy.positions, ow.positions, box=u.dimensions, result=D32)
1 loops, best of 3: 412 ms per loop

# without results array
%timeit dist.distance_array(heavy.positions, ow.positions, box=u.dimensions, mode="serial")
1 loops, best of 3: 527 ms per loop

 %timeit dist.distance_array(heavy.positions, ow.positions, box=u.dimensions, mode="OpenMP")
1 loops, best of 3: 310 ms per loop

%timeit pdist.distance_array(heavy.positions, ow.positions, box=u.dimensions)
1 loops, best of 3: 435 ms per loop
@orbeckst

This comment has been minimized.

Show comment
Hide comment
@orbeckst

orbeckst Nov 12, 2015

Member

Now that #535 is merged and full-feature OpenMP distance routines are available, I will remove lib.parallel.distances.

I would like to simply remove it and declare it prominently in the CHANGELOG (I don't think it will break a huge amount of code). Alternatively, I can leave a stub and deprecate it but even then the new function will be a bit different from the old one.

Comments?

Member

orbeckst commented Nov 12, 2015

Now that #535 is merged and full-feature OpenMP distance routines are available, I will remove lib.parallel.distances.

I would like to simply remove it and declare it prominently in the CHANGELOG (I don't think it will break a huge amount of code). Alternatively, I can leave a stub and deprecate it but even then the new function will be a bit different from the old one.

Comments?

@richardjgowers

This comment has been minimized.

Show comment
Hide comment
@richardjgowers

richardjgowers Nov 12, 2015

Member

I'd just remove it and make sure that the new stuff is properly documented

Member

richardjgowers commented Nov 12, 2015

I'd just remove it and make sure that the new stuff is properly documented

@orbeckst

This comment has been minimized.

Show comment
Hide comment
@orbeckst

orbeckst Nov 12, 2015

Member

Ok, will do.

Member

orbeckst commented Nov 12, 2015

Ok, will do.

orbeckst added a commit that referenced this issue Nov 12, 2015

removed lib.parallel.distances
- the cython/prange-based/OpenMP versions of some of the distance calculations were
  slower and had fewer features than the C/OpenMP based ones that are now available
  with backend="OpenMP" as implemented in PR #529 so these ones (and there tests)
  are fully removed
- closes issue #530
- The whole lib.parallel module was also removed because at the moment there is nothing
  in it; we can revisit the discussion about organization of lib (issue #287) when
  necessary.

orbeckst added a commit that referenced this issue Nov 12, 2015

orbeckst added a commit that referenced this issue Nov 12, 2015

orbeckst added a commit that referenced this issue Nov 12, 2015

documented selection of backend for distance calculations
- closes issue #530.
- improved general docs for MDAnalysis.lib

@orbeckst orbeckst modified the milestone: 0.13 Nov 12, 2015

richardjgowers added a commit that referenced this issue Nov 13, 2015

Merge pull request #538 from MDAnalysis/issue/530
Issue #530: remove lib.parallel.distances
@orbeckst

This comment has been minimized.

Show comment
Hide comment
@orbeckst

orbeckst Nov 13, 2015

Member

Closed by #538.

Member

orbeckst commented Nov 13, 2015

Closed by #538.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment