This notebook benchmarks the `around` selection in MDAnalysis with three neighbor-search methods: the augmented periodic KDTree, naive brute force (distance matrix), and the recently implemented cell-list (NSgrid) algorithm.

In [1]:
from MDAnalysisTests.datafiles import PSF, DCD, GRO
import MDAnalysis as mda
import numpy as np
from MDAnalysis.core.selection import Parser

In [2]:
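# Each helper overrides sel.apply with one specific backend, then runs the
# selection on all atoms. The result is an AtomGroup of the selected atoms.

def fast_naive(u, sel):
    # Brute force: full distance matrix
    sel.apply = sel._apply_distmat
    selected = sel.apply(u.atoms)
    return selected

def fast_ns(u, sel):
    # Cell-list (NSgrid) search
    sel.apply = sel._apply_nsgrid
    selected = sel.apply(u.atoms)
    return selected

def fast_pkd(u, sel):
    # Periodic KDTree search
    sel.apply = sel._apply_KDTree
    selected = sel.apply(u.atoms)
    return selected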

# Check

First, we check a case without periodic boundary conditions (PBC): all three methods should select the same atoms.

In [3]:
# check
u = mda.Universe(PSF, DCD)
sel = Parser.parse('around 4.0 bynum 1943', u.atoms)

In [4]:
r_naive = fast_naive(u, sel)

In [5]:
r_kd = fast_pkd(u, sel)

In [6]:
r_ns = fast_ns(u, sel)

In [7]:
r_naive, r_kd, r_ns

(<AtomGroup with 32 atoms>,
 <AtomGroup with 32 atoms>,
 <AtomGroup with 32 atoms>)

In [8]:
np.testing.assert_equal(r_naive.indices, r_kd.indices)

In [9]:
np.testing.assert_equal(r_naive.indices, r_ns.indices)

Similarly, let's check a case with PBC.

In [10]:
u = mda.Universe(GRO)
sel = Parser.parse('around 5.0 resid 1', u.atoms)

In [11]:
r_naive = fast_naive(u, sel)
r_kd = fast_pkd(u, sel)
r_ns = fast_ns(u, sel)

In [12]:
r_naive, r_kd, r_ns

(<AtomGroup with 178 atoms>,
 <AtomGroup with 178 atoms>,
 <AtomGroup with 178 atoms>)

In [13]:
np.testing.assert_equal(r_naive.indices, r_kd.indices)

In [14]:
np.testing.assert_equal(r_naive.indices, r_ns.indices)
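
With PBC, distances are measured to the nearest periodic image of each atom. The snippet below is a minimal sketch of that minimum-image convention, assuming an orthorhombic box given by its three edge lengths. The function name `minimum_image_distance` is made up for illustration and is not part of MDAnalysis, which has its own distance routines.

```python
import math

def minimum_image_distance(a, b, box):
    """Distance between points a and b in an orthorhombic periodic box.

    `box` holds the three edge lengths. Each displacement component is
    wrapped to the nearest periodic image before the norm is taken.
    """
    total = 0.0
    for ai, bi, length in zip(a, b, box):
        d = ai - bi
        d -= length * round(d / length)  # wrap into [-length/2, length/2]
        total += d * d
    return math.sqrt(total)

box = (10.0, 10.0, 10.0)
# Two points near opposite faces of the box: 9.0 apart without PBC,
# but only 1.0 apart through the periodic boundary.
print(minimum_image_distance((0.5, 5.0, 5.0), (9.5, 5.0, 5.0), box))  # 1.0
```

This is why an `around` selection near a box face can pick up atoms from the opposite side of the box, and why the PBC case is checked separately from the non-periodic one.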

# Benchmark Case

In [15]:
u = mda.Universe(GRO)
sel = Parser.parse('around 5.0 resid 1', u.atoms)

In [16]:
%timeit fast_naive(u, sel)

44.1 ms ± 514 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [17]:
%timeit fast_pkd(u, sel)

37 ms ± 661 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [18]:
%timeit fast_ns(u, sel)

4.92 ms ± 112 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
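
The `%timeit` magic is only available in IPython and Jupyter. To reproduce timings like these from a plain Python script, something along the following lines could be used. It is a rough sketch built on the standard `timeit` module; `time_call` is a hypothetical helper, and the toy workload stands in for a call such as `fast_ns(u, sel)`.

```python
import statistics
import timeit

def time_call(func, *args, number=10, repeat=7):
    """Return (mean, stdev) in seconds per call, similar to %timeit's summary."""
    # Each entry in `totals` is the total time for `number` consecutive calls.
    totals = timeit.repeat(lambda: func(*args), number=number, repeat=repeat)
    per_call = [t / number for t in totals]
    return statistics.mean(per_call), statistics.stdev(per_call)

# Toy stand-in workload; replace with the function and arguments to benchmark.
mean, std = time_call(sum, range(1000))
print(f"{mean * 1e6:.2f} us +/- {std * 1e6:.2f} us per call")
```

The defaults (`number=10`, `repeat=7`) mirror the "7 runs, 10 loops each" reported above; `%timeit` picks the loop count automatically, so the two are not guaranteed to match.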


# Note

NSgrid is the fastest method in this benchmark: about 9 times faster than brute force and about 7.5 times faster than the periodic KDTree. It is not, however, the fastest in all circumstances. For single queries, for instance, brute force is still the fastest method. For most practical applications the cell-list approach performs best; the specific cases where it is slower are not covered in this notebook.
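
To illustrate why a cell list tends to win for this kind of query, here is a simplified pure-Python sketch that compares it with brute force. It is not the MDAnalysis implementation: it ignores PBC, uses a cell edge equal to the cutoff, and the function names are invented for this example. Unlike the `around` selection, it also does not exclude the reference points from the result.

```python
import math
import random
from collections import defaultdict
from itertools import product

def brute_force_around(points, queries, cutoff):
    """Indices of points within `cutoff` of any query point (all pairs checked)."""
    return sorted(
        i for i, p in enumerate(points)
        if any(math.dist(p, q) <= cutoff for q in queries)
    )

def cell_list_around(points, queries, cutoff):
    """Same result, but only the 27 cells around each query are searched."""
    def cell_of(p):
        # Cell edge equals the cutoff, so neighbors lie at most one cell away.
        return tuple(math.floor(c / cutoff) for c in p)

    # Bin every point index by the cell it falls in.
    cells = defaultdict(list)
    for i, p in enumerate(points):
        cells[cell_of(p)].append(i)

    found = set()
    for q in queries:
        cx, cy, cz = cell_of(q)
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            for i in cells.get((cx + dx, cy + dy, cz + dz), ()):
                if math.dist(points[i], q) <= cutoff:
                    found.add(i)
    return sorted(found)

random.seed(0)
points = [tuple(random.uniform(0.0, 50.0) for _ in range(3)) for _ in range(2000)]
queries = points[:5]
assert brute_force_around(points, queries, 5.0) == cell_list_around(points, queries, 5.0)
```

Building the grid has an up-front cost proportional to the number of points, which is one plausible reason brute force can still win when there is only a single query.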