# Speeding things up: Neighborlists and Dangerous Builds


### Cutoffs
Recall that we typically apply a cutoff to pair potentials once the numerical value of the pair potential becomes very small. E.g., in our prior example of the Lennard-Jones potential:


$U(r) = 4\epsilon\left[ \left(\frac{\sigma}{r}\right)^{12}- \left(\frac{\sigma}{r}\right)^6 \right], \quad r < r_c$

$U(r) = 0, \quad r \ge r_c$

Where $r_c$ is often $2.5\sigma$. 
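The truncated potential above can be sketched in a few lines of Python. This is only an illustration in reduced units; the function name `lj_truncated` and its default arguments are chosen here for convenience and do not come from any particular simulation code.

```python
def lj_truncated(r, epsilon=1.0, sigma=1.0, r_cut=2.5):
    """Lennard-Jones energy with a plain (unshifted) cutoff at r_cut."""
    if r >= r_cut:
        return 0.0
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

# at the potential minimum, r = 2^(1/6) sigma, the energy is -epsilon
print(lj_truncated(2.0 ** (1.0 / 6.0)))
# just inside the cutoff the energy is already small compared to epsilon
print(lj_truncated(2.49))
# at and beyond the cutoff the energy is exactly zero
print(lj_truncated(2.5))
```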

By cutting off the interaction, we avoid having to calculate the energy/force between pairs of atoms beyond the cutoff.  However, even though we avoid calculating the energy for these pairs, we still have to determine the separation between them to know whether they are beyond the cutoff.  This distance calculation is itself quite costly (since we calculate $r$ from the Cartesian coordinates) and, applied to an entire system, scales as $O(N^2)$, where $N$ is the number of particles.

As an example, the Python code below compares the number of interactions that would be calculated with and without a cutoff.

In [8]:
import numpy
import random
import math 

n_particles = 1000
cutoff = 2.5
cutoff2 = cutoff*cutoff

xyz = numpy.zeros((n_particles,3))
L = numpy.array([10.0,10.0,10.0])
invL = numpy.array([1.0/L[0],1.0/L[1],1.0/L[2]])


def init_particles():
    #set a seed so every run uses the same particle array for consistency
    random.seed(12345)
    
    for i in range(0,n_particles):
        #generate coordinates in a box of dimension L[0]*L[1]*L[2]
        #note particles are free to overlap, this is simply for comparison
        xyz[i][0] = L[0]*random.random()
        xyz[i][1] = L[1]*random.random()
        xyz[i][2] = L[2]*random.random()

#very naive loop over particles
def simple_loop():
    sum_energy = 0
    counter_total = 0
    counter_cutoff = 0

    for i in range (0, n_particles):
        #start at i+1 so each pair is visited once and self-pairs (i == j) are skipped
        for j in range(i+1, n_particles):
            dx = xyz[i][0]-xyz[j][0]
            dy = xyz[i][1]-xyz[j][1]
            dz = xyz[i][2]-xyz[j][2]

            #apply pbc (minimum image convention)
            dx = dx-L[0]*round(dx*invL[0])
            dy = dy-L[1]*round(dy*invL[1])
            dz = dz-L[2]*round(dz*invL[2])
            
            #compare squared distances to avoid taking a square root
            r2 = dx*dx+dy*dy+dz*dz
            counter_total = counter_total+1
            if r2 < cutoff2:
                counter_cutoff = counter_cutoff +1
                
    print("number of energy calculations with no cutoff: ", counter_total)
    print("number of energy calculations with cutoff: ", counter_cutoff)
      
init_particles()
simple_loop()

number of energy calculations with no cutoff:  499500
number of energy calculations with cutoff:  32879


### Simple, brute force neighborlist
Recall that, to avoid having to perform this distance calculation every timestep, a neighborlist can instead be constructed that includes all particles within the cutoff plus a buffer (the buffer is often called the "skin"). The general idea is that the local environment of a particle changes rather slowly, so the same neighborlist can be reused for multiple timesteps before it needs to be reconstructed, reducing the computation. Constructing the neighborlist this way is still $O(N^2)$ (which is why it is often called "brute force"), but it is done less frequently, which provides the speed improvement.
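As a rough sketch of the brute-force construction described above, the function below stores, for each particle `i`, every particle `j > i` that lies within `r_cut + r_buff` under the minimum image convention. The name `build_neighborlist`, the NumPy-based layout, and the test system are assumptions made for illustration; production codes use far more optimized data structures.

```python
import numpy as np

def build_neighborlist(xyz, box, r_cut, r_buff):
    """Brute-force O(N^2) half neighborlist: for each i, list j > i within r_cut + r_buff."""
    r_list2 = (r_cut + r_buff) ** 2
    n = len(xyz)
    neighbors = [[] for _ in range(n)]
    for i in range(n - 1):
        # minimum-image separation between particle i and all particles j > i
        d = xyz[i + 1:] - xyz[i]
        d -= box * np.round(d / box)
        r2 = np.sum(d * d, axis=1)
        # indices from np.nonzero are relative to the slice, so shift by i + 1
        neighbors[i] = (np.nonzero(r2 < r_list2)[0] + i + 1).tolist()
    return neighbors

# small random test system (particles may overlap; this is only for counting)
rng = np.random.default_rng(12345)
box = np.array([10.0, 10.0, 10.0])
xyz = rng.random((200, 3)) * box

nlist = build_neighborlist(xyz, box, r_cut=2.5, r_buff=0.5)
n_pairs = sum(len(nbrs) for nbrs in nlist)
print("pairs stored in the list:", n_pairs, "of", 200 * 199 // 2, "possible")
```

Between rebuilds, the force loop only needs to visit the pairs stored in `nlist`, checking each against the true cutoff `r_cut`.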


### Cell list-based neighborlist
To improve upon the brute force neighborlist, a cell list can first be constructed, and this cell list used to generate the neighborlist.  A cell list is constructed by gridding the system up into smaller boxes (cells), where each edge length is typically $\ge r_c + \text{buffer}$.  Cell lists can be constructed relatively inexpensively by binning particles in each direction. The neighborlist is then constructed for a given particle by looping over the cell it is in and its neighbor cells (note, since pair potentials are pair-wise additive, we only need to consider half of the neighbor cells; that is, if A is a neighbor of B, then B is a neighbor of A).
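The binning and neighbor-cell search can be sketched as follows. This is a simplified illustration, not how any specific code implements it: the function names are hypothetical, the cells are stored in a Python dictionary, and for clarity the search visits all 27 surrounding cells and returns the full neighbor set of one particle rather than using the half-stencil mentioned above.

```python
import itertools
import numpy as np

def build_cell_list(xyz, box, r_list):
    """Bin particles into a grid whose cell edges are all >= r_list."""
    n_cells = np.maximum(np.floor(box / r_list).astype(int), 1)
    cell_size = box / n_cells
    # integer cell coordinates; the modulo guards against a coordinate equal to L
    idx = np.floor(xyz / cell_size).astype(int) % n_cells
    cells = {}
    for p, key in enumerate(map(tuple, idx.tolist())):
        cells.setdefault(key, []).append(p)
    return cells, n_cells

def neighbors_of(p, xyz, box, r_list, cells, n_cells):
    """Neighbors of particle p found by searching its own cell and the adjacent cells."""
    cell_size = box / n_cells
    home = np.floor(xyz[p] / cell_size).astype(int) % n_cells
    # the set removes duplicate cells when a direction has fewer than 3 cells
    nearby = {tuple(((home + np.array(off)) % n_cells).tolist())
              for off in itertools.product((-1, 0, 1), repeat=3)}
    found = []
    for key in nearby:
        for q in cells.get(key, []):
            if q == p:
                continue
            d = xyz[q] - xyz[p]
            d -= box * np.round(d / box)  # minimum image convention
            if np.dot(d, d) < r_list * r_list:
                found.append(q)
    return sorted(found)

rng = np.random.default_rng(12345)
box = np.array([15.0, 15.0, 15.0])
xyz = rng.random((500, 3)) * box
r_list = 2.5 + 0.5  # cutoff + buffer

cells, n_cells = build_cell_list(xyz, box, r_list)
print("cells per direction:", n_cells)
print("neighbors of particle 0:", neighbors_of(0, xyz, box, r_list, cells, n_cells))
```

Note that the savings only appear when the box holds more than three cells per direction; with exactly three, the 27 searched cells already cover the whole system.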

For efficiency, some codes allow multiple cell lists in a system. This can be especially useful if your system has interactions/particle sizes that are very different: if a single neighborlist were used, it would need to be based on the largest interaction cutoff, negating the speed increase for particles with shorter range interactions.


Recall that the frequency of reconstruction (and hence the speed up) will depend on numerous factors:

How fast  the system is changing. 
> This is often a consequence of the phase, where, for example gas phases will change much more rapidly than dense fluids.  Similarly, temperature will play a role, where high temperature systems will have particles moving faster than low temperature. 

The size of the buffer/skin. 
> If this is set too small, the neighborlist will need to be rebuilt very frequently; if it is too large, the computational savings will be minimal because the list will contain a very large number of particles.


The neighborlist has two basic parameters: (1) the skin size (i.e., buffer) and (2) the frequency of updating/checking for updates. These two parameters are coupled; for example, a system with high mobility may require a larger skin to avoid needing to update the neighborlist every timestep and to ensure particle interactions are not missed.

### The Skin
The skin is used to create a buffer of particles that are close by, but just outside of the interaction cutoff. This increases the number of calculations that must be done, since we still need to calculate the distance to every particle in the neighborlist, even if a particle ultimately lies outside of the interaction range. However, by including this buffer of particles, neighborlists need to be updated less frequently, since we are basically keeping track of the location of nearby particles that we may interact with in the near future.
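To get a feel for the extra cost of the skin, one can estimate how many more particles end up in the list than actually interact. The sketch below assumes a roughly uniform particle density, so that the number of particles within a distance scales with the volume of the corresponding sphere; the helper name `list_size_ratio` is made up for this illustration.

```python
def list_size_ratio(r_cut, r_buff):
    """Ratio of listed neighbors to interacting neighbors at uniform density."""
    return ((r_cut + r_buff) / r_cut) ** 3

r_cut = 2.5
for r_buff in (0.1, 0.5, 1.0):
    ratio = list_size_ratio(r_cut, r_buff)
    print(f"r_buff = {r_buff}: list holds ~{ratio:.2f}x the interacting neighbors")
```

Under this assumption, a skin of 0.5 with a cutoff of 2.5 means the list holds roughly 1.7 times as many particles as are actually within the cutoff, so each force evaluation performs that many more distance checks in exchange for fewer rebuilds.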



### Updating Frequency

In most codes, you have two ways of enforcing an update of the neighborlist.

One approach is to specify that the neighborlist be updated every N steps. This approach tends to work well for systems that are dense or slowly moving, as particle motion should be relatively uniform throughout the system. However, it can be challenging to balance reducing the number of updates against ensuring that particles don't move too far between updates, which would result in missed interactions (i.e., dangerous builds). This approach can speed things up, but should be considered "less safe".

Another approach is to check the displacement of particles. The general rule is that a neighborlist needs to be rebuilt if any particle has moved more than skin/2.0 since the last build, which ensures that particle interactions are not missed: two particles that each move less than skin/2.0 toward one another cannot close a gap of more than the skin. This helps to ensure that "dangerous builds" aren't present in the system (see below). In many codes this "checking" can also be coupled with a time modulation, e.g., only check every N timesteps.
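The displacement criterion can be written as a short check. The sketch below is an assumption about how such a test might look, not the implementation used by any particular code; `needs_rebuild` and its arguments are hypothetical names, and the positions saved at the last build are passed in explicitly.

```python
import numpy as np

def needs_rebuild(xyz, xyz_at_last_build, box, r_buff):
    """True if any particle has moved more than r_buff / 2 since the last build."""
    d = xyz - xyz_at_last_build
    # particles may have wrapped across a periodic boundary since the last build
    d -= box * np.round(d / box)
    max_disp2 = np.max(np.sum(d * d, axis=1))
    return bool(max_disp2 > (0.5 * r_buff) ** 2)

box = np.array([10.0, 10.0, 10.0])
old = np.array([[1.0, 1.0, 1.0], [9.9, 5.0, 5.0]])

new = old.copy()
new[1, 0] = 0.1  # crossed the boundary: the true displacement is only 0.2
print(needs_rebuild(new, old, box, r_buff=0.5))  # False, since 0.2 < 0.25

new[0, 2] += 0.3  # this particle has now moved 0.3 > 0.25
print(needs_rebuild(new, old, box, r_buff=0.5))  # True
```

Running this check only every N timesteps reduces its cost, at the risk that a particle exceeds the threshold between checks.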



### Dangerous Builds

Most codes will output a summary of the total number of dangerous builds, that is, the total number of times the neighborlist was rebuilt after particles had already moved more than skin/2.0. In such cases, interactions may have been missed, resulting in misleading and incorrect behavior.  

For example, the output from a simulation code may look something similar to:

> -- Neighborlist stats:
6392 normal updates / 167 forced updates / 0 dangerous updates <br>
n_neigh_min: 16 / n_neigh_max: 100 / n_neigh_avg: 64.63 <br>
shortest rebuild period: 6<br>

where it is clear that we did not have any dangerous builds. This also provides information regarding the minimum time between neighborlist updates. In more extreme cases of missed interactions, systems may crash due to particles moving outside the bounds of the simulation cell (usually due to extremely high forces on the particles). 
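When running many simulations, it can be convenient to check for dangerous builds automatically instead of reading the summary by eye. The snippet below is a minimal sketch that parses the example summary shown above with a regular expression; it assumes the log text has exactly that layout, which may differ between codes and versions.

```python
import re

# the example summary from above, stored as plain text
log_text = """-- Neighborlist stats:
6392 normal updates / 167 forced updates / 0 dangerous updates
n_neigh_min: 16 / n_neigh_max: 100 / n_neigh_avg: 64.63
shortest rebuild period: 6"""

pattern = re.compile(
    r"(\d+) normal updates / (\d+) forced updates / (\d+) dangerous updates"
)
match = pattern.search(log_text)
normal, forced, dangerous = (int(g) for g in match.groups())

if dangerous > 0:
    print("warning:", dangerous, "dangerous builds; increase the skin or check more often")
else:
    print("no dangerous builds in", normal + forced, "updates")
```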

Often more detailed information can be acquired.  For example, the timing breakdown below shows that 50% of the computational time is actually spent on the neighborlist.

> `Simulation:     3.1s | 100.0%  `<br>
  `      Integrate:     0.5s | 16.5% `<br>
  `              NVT step 1:     0.1s | 4.2% `<br>
  `              NVT step 2:     0.1s | 3.0% `<br>
  `              Net force:      0.1s | 2.9% `<br>
  `              Thermo:         0.1s | 3.7% `<br>
  `              Self:           0.1s | 2.8% `<br> 
  `      Neighbor:      1.6s | 50.0% `<br>
  `              Cell:           0.0s | 0.8% `<br>
  `                      compute:     0.0s | 0.5% `<br>
  `                      init:        0.0s | 0.2% `<br>
  `              compute:        1.5s | 48.2% `<br>
  `              dist-check:     0.0s | 1.0% `<br> 
  `      Pair lj:       0.9s | 30.2% `<br>
  `      SFCPack:       0.1s | 3.1% `<br>
  `      Self:          0.0s | 0.2% `<br>

### Exercises

1) Using the simple monoatomic LJ system (shown below and similar to the Anatomy of a Script file), adjust the nlist parameters.
- Change the buffer (r_buff) to examine the performance as a function of increasing/decreasing the skin.  How small can you make the skin before dangerous builds are detected? How does this impact the speed of the code (e.g., TPS)?

 See: http://hoomd-blue.readthedocs.io/en/stable/module-md-nlist.html#hoomd.md.nlist.cell
 
2) What happens if you set the check period to 1? 10? 100? 1000? Can you speed up the simulation and still avoid dangerous builds?

3) What does the neighborlist auto-tuning function give you (note, you must define this after the integrate call)?

4) How does temperature (kT) influence the parameters you need and number of rebuilds?

In [5]:
import hoomd
import hoomd.md
import hoomd.deprecated

hoomd.context.initialize("");
hoomd.init.create_lattice(unitcell=hoomd.lattice.sc(a=2), n=5)

nl = hoomd.md.nlist.cell(r_buff=0.5, dist_check=True, check_period=10)
lj = hoomd.md.pair.lj(r_cut=2.5, nlist=nl)
lj.pair_coeff.set('A', 'A', epsilon=1.0, sigma=1.0)

hoomd.md.integrate.mode_standard(dt=0.005)

all = hoomd.group.all();
hoomd.md.integrate.langevin(group=all, kT=1.0, seed=42)


hoomd.analyze.log(filename="log-output.log",
                  quantities=['potential_energy', 'temperature'],
                  period=100,
                  overwrite=True)
hoomd.dump.gsd("trajectory.gsd", period=2e3, group=all, overwrite=True)
hoomd.dump.dcd("trajectory.dcd", period=2e3, group=all, overwrite=True)

hoomd.run(1e4)
hoomd.deprecated.dump.xml(group=all, filename="snapshot.xml", vis=True)

notice(2): Group "all" created containing 125 particles
notice(2): integrate.langevin/bd is using specified gamma values
notice(2): -- Neighborlist exclusion statistics -- :
notice(2): Particles with 0 exclusions             : 125
notice(2): Neighbors included by diameter          : no
notice(2): Neighbors excluded when in the same body: no
** starting run **
Time 00:00:08 | Step 200000 / 200000 | TPS 23759.3 | ETA 00:00:00
Average TPS: 23757.7
---------
-- Neighborlist stats:
12615 normal updates / 2000 forced updates / 0 dangerous updates
n_neigh_min: 0 / n_neigh_max: 19 / n_neigh_avg: 7.928
shortest rebuild period: 10
-- Cell list stats:
Dimension: 3, 3, 3
n_min    : 0 / n_max: 9 / n_avg: 4.62963
** run complete **
** starting run **
Time 00:00:09 | Step 205000 / 205000 | TPS 8029.79 | ETA 00:00:00
Average TPS: 8017.33
---------
-- Neighborlist stats:
2450 normal updates / 50 forced updates / 0 dangerous updates
n_neigh_min: 0 / n_neigh_max: 16 / n_neigh_avg: 4.544
shortest rebuild 

n_neigh_min: 0 / n_neigh_max: 20 / n_neigh_avg: 7.152
shortest rebuild period: 8
-- Cell list stats:
Dimension: 3, 3, 3
n_min    : 0 / n_max: 13 / n_avg: 4.62963
** run complete **
** starting run **
Time 00:00:19 | Step 305000 / 305000 | TPS 23027.4 | ETA 00:00:00
Average TPS: 22910.9
---------
-- Neighborlist stats:
450 normal updates / 50 forced updates / 0 dangerous updates
n_neigh_min: 0 / n_neigh_max: 17 / n_neigh_avg: 6.304
shortest rebuild period: 8
-- Cell list stats:
Dimension: 3, 3, 3
n_min    : 2 / n_max: 9 / n_avg: 4.62963
** run complete **
** starting run **
Time 00:00:19 | Step 310000 / 310000 | TPS 21988.9 | ETA 00:00:00
Average TPS: 21906.3
---------
-- Neighborlist stats:
397 normal updates / 50 forced updates / 0 dangerous updates
n_neigh_min: 0 / n_neigh_max: 21 / n_neigh_avg: 7.28
shortest rebuild period: 9
-- Cell list stats:
Dimension: 3, 3, 3
n_min    : 0 / n_max: 15 / n_avg: 4.62963
** run complete **
** starting run **
Time 00:00:19 | Step 315000 / 315000 | T

Time 00:00:28 | Step 415000 / 415000 | TPS 18328.3 | ETA 00:00:00
Average TPS: 18216.7
---------
-- Neighborlist stats:
205 normal updates / 50 forced updates / 0 dangerous updates
n_neigh_min: 0 / n_neigh_max: 27 / n_neigh_avg: 10.056
shortest rebuild period: 17
-- Cell list stats:
Dimension: 3, 3, 3
n_min    : 1 / n_max: 13 / n_avg: 4.62963
** run complete **
** starting run **
Time 00:00:28 | Step 420000 / 420000 | TPS 17880.4 | ETA 00:00:00
Average TPS: 6061.29
---------
-- Neighborlist stats:
203 normal updates / 50 forced updates / 0 dangerous updates
n_neigh_min: 0 / n_neigh_max: 23 / n_neigh_avg: 9.568
shortest rebuild period: 15
-- Cell list stats:
Dimension: 3, 3, 3
n_min    : 0 / n_max: 10 / n_avg: 4.62963
** run complete **
** starting run **
Time 00:00:29 | Step 425000 / 425000 | TPS 25799.7 | ETA 00:00:00
Average TPS: 25715.4
---------
-- Neighborlist stats:
199 normal updates / 50 forced updates / 0 dangerous updates
n_neigh_min: 0 / n_neigh_max: 26 / n_neigh_avg: 10.168

<hoomd.deprecated.dump.xml at 0x107be1b70>