In [1]:
%load_ext autoreload
%autoreload 2

In [39]:
import molsysmt as msm
import numpy as np
from simtk import unit

# Getting distances, neighbor lists and contact maps.

MolSysMT includes a versatile method to calculate distances between points in space, atoms and/or groups of atoms. Like many other methods of this multitool, `molsysmt.distance()` has an input argument to choose the engine in charge of computing the result. Currently `molsysmt.distance()` offers two engines: `MolSysMT` and `MDTraj`. Only `MolSysMT` is covered in this guide for now.

The options of `molsysmt.distance()` are introduced step by step in the following examples.

## The XYZ molecular system form

The first and simplest case is getting distances from a distribution of points in space. MolSysMT accepts a molecular system form where only spatial coordinates are described, without topological information: the `XYZ` form.

In [3]:
molecular_system = np.zeros([6,3], dtype='float64') * unit.nanometers

In [4]:
msm.get(molecular_system, target='system', form=True)

'XYZ'

The `XYZ` form accepts numpy arrays with length units and shape $[n\_frames, n\_atoms, 3]$ or $[n\_atoms, 3]$. Given an array of rank 2, MolSysMT always assumes $n\_frames=1$ and takes the first axis as the number of atoms:

In [5]:
msm.get(molecular_system, n_frames=True, n_atoms=True)

[1, 6]
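
The shape rule above can be illustrated with plain NumPy. The helper `as_frames` below is hypothetical and is not part of MolSysMT; it is only a minimal sketch that mimics the described convention on unitless arrays:

```python
import numpy as np

def as_frames(coordinates):
    """Return coordinates with shape [n_frames, n_atoms, 3] (hypothetical helper)."""
    coordinates = np.asarray(coordinates, dtype='float64')
    if coordinates.ndim == 2:
        # A rank-2 array is read as a single frame of n_atoms points.
        coordinates = coordinates[np.newaxis, :, :]
    if coordinates.ndim != 3 or coordinates.shape[-1] != 3:
        raise ValueError('expected shape [n_atoms, 3] or [n_frames, n_atoms, 3]')
    return coordinates

n_frames, n_atoms, _ = as_frames(np.zeros([6, 3])).shape
print([n_frames, n_atoms])
```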

Let's create a couple of `XYZ` molecular systems with more than one frame. These two systems will help us illustrate the first distance calculations:

In [6]:
# Molecular system A with three atoms and three frames.

molecular_system_A = np.zeros([3,3,3], dtype='float64') * unit.nanometers

## First atom
molecular_system_A[0,0,:] = [0, 2, -1] * unit.nanometers
molecular_system_A[1,0,:] = [1, 2, -1] * unit.nanometers
molecular_system_A[2,0,:] = [0, 2, -1] * unit.nanometers

## Second atom
molecular_system_A[0,1,:] = [-1, 1, 1] * unit.nanometers
molecular_system_A[1,1,:] = [-1, 0, 1] * unit.nanometers
molecular_system_A[2,1,:] = [0, 0, 1] * unit.nanometers

## Third atom
molecular_system_A[0,2,:] = [-2, 0, 1] * unit.nanometers
molecular_system_A[1,2,:] = [-2, 0, 0] * unit.nanometers
molecular_system_A[2,2,:] = [-1, 1, 0] * unit.nanometers



# Molecular system B with two atoms and three frames.

molecular_system_B = np.zeros([3,2,3], dtype='float64') * unit.nanometers

## First atom of B
molecular_system_B[0,0,:] = [4, -2, 0] * unit.nanometers
molecular_system_B[1,0,:] = [5, -2, -1] * unit.nanometers
molecular_system_B[2,0,:] = [5, -2, 0] * unit.nanometers

## Second atom of B
molecular_system_B[0,1,:] = [3, 0, -1] * unit.nanometers
molecular_system_B[1,1,:] = [3, 1, 0] * unit.nanometers
molecular_system_B[2,1,:] = [4, 1, 1] * unit.nanometers

## Distance between atoms in space

### Distance between atoms of a system

The first case shows how to get the distance between all points of a system at every frame:

In [7]:
distances = msm.distance(molecular_system_A)

The result is an array of rank 3. The first axis corresponds to the frames, and the second and third axes to the point or atom indices:

In [8]:
distances.shape

(3, 3, 3)

This way every distance between atoms at each frame is stored. Let's print out the distance between atoms 0 and 2 at frame 1:

In [9]:
print('Distance at frame 1-th between atoms 0-th and 2-th: {}'.format(distances[1,0,2]))

Distance at frame 1-th between atoms 0-th and 2-th: 3.7416573867739413 nm
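
As a cross-check that does not depend on MolSysMT, the same all-pairs matrix can be sketched with NumPy broadcasting. This is not how the `MolSysMT` engine is implemented; it only assumes the coordinates of `molecular_system_A` written as plain floats in nanometers:

```python
import numpy as np

# Coordinates of molecular system A in nanometers, shape [n_frames, n_atoms, 3].
coordinates_A = np.array([
    [[0, 2, -1], [-1, 1, 1], [-2, 0, 1]],   # frame 0
    [[1, 2, -1], [-1, 0, 1], [-2, 0, 0]],   # frame 1
    [[0, 2, -1], [0, 0, 1], [-1, 1, 0]],    # frame 2
], dtype='float64')

# Broadcasting builds every displacement vector r_i - r_j at each frame.
displacements = coordinates_A[:, :, np.newaxis, :] - coordinates_A[:, np.newaxis, :, :]

# The Euclidean norm over the last axis gives shape [n_frames, n_atoms, n_atoms].
check = np.linalg.norm(displacements, axis=-1)

print(check.shape)
print(check[1, 0, 2])
```

The value at `check[1, 0, 2]` is $\sqrt{14}$ nm, in agreement with the output printed above.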


If only the distance between atoms 0 and 2 at every frame is required, there is no need to compute $n\_atoms \times n\_atoms$ distances. The input arguments `selection_1` and `selection_2` restrict which atoms define the rows and columns of the output distance matrix:

In [10]:
distances = msm.distance(molecular_system_A, selection_1=0, selection_2=2)

This time the output is an array of rank 3 with shape $[3,1,1]$, since the distance of a single pair of atoms was computed for three frames:

In [11]:
distances.shape

(3, 1, 1)

In [12]:
for ii in range(3):
    print('Distance at frame {}-th between atoms 0-th and 2-th: {}'.format(ii,distances[ii,0,0]))

Distance at frame 0-th between atoms 0-th and 2-th: 3.4641016151377544 nm
Distance at frame 1-th between atoms 0-th and 2-th: 3.7416573867739413 nm
Distance at frame 2-th between atoms 0-th and 2-th: 1.7320508075688772 nm


Let's now get the distance between atom 1 and atoms 0 and 2 at every frame:

In [13]:
distances = msm.distance(molecular_system_A, selection_1=1, selection_2=[0,2])

As you might guess, the output is again an array of rank 3, this time with shape $[3,1,2]$:

In [14]:
distances.shape

(3, 1, 2)

To print out the distance between atoms 1 and 2 at frame 0:

In [15]:
print('Distance at frame 0-th between atoms 1-th and 2-th: {}'.format(distances[0,0,1]))

Distance at frame 0-th between atoms 1-th and 2-th: 1.4142135623730951 nm
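
The relation between the selections and the output shape can be reproduced with a small NumPy sketch. It is an illustration only, with the coordinates of `molecular_system_A` as plain floats in nanometers, and it makes no claim about MolSysMT internals:

```python
import numpy as np

# Coordinates of molecular system A in nanometers, shape [n_frames, n_atoms, 3].
coordinates_A = np.array([
    [[0, 2, -1], [-1, 1, 1], [-2, 0, 1]],
    [[1, 2, -1], [-1, 0, 1], [-2, 0, 0]],
    [[0, 2, -1], [0, 0, 1], [-1, 1, 0]],
], dtype='float64')

selection_1 = [1]
selection_2 = [0, 2]

# Keep only the selected atoms: rows come from selection_1, columns from selection_2.
group_1 = coordinates_A[:, selection_1, :]   # shape (3, 1, 3)
group_2 = coordinates_A[:, selection_2, :]   # shape (3, 2, 3)

displacements = group_1[:, :, np.newaxis, :] - group_2[:, np.newaxis, :, :]
check = np.linalg.norm(displacements, axis=-1)

print(check.shape)
# Column 1 is the position of atom 2 inside selection_2, not the atom index.
print(check[0, 0, 1])
```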


The position of each atom in the lists `selection_1` and `selection_2` is used to locate the corresponding distance in the output array. If you want to use the original atom indices instead, the input argument `output_form='dict'` can help:

In [16]:
distances = msm.distance(molecular_system_A, selection_1=1, selection_2=[0,2], output_form='dict')

This way the output is no longer a numpy array of rank 3 but a dictionary of dictionaries of dictionaries. The keys of the first dictionary are the atom indices of `selection_1`:

In [17]:
distances.keys()

dict_keys([1])

The second nested dictionary has the atom indices of `selection_2` as keys:

In [18]:
distances[1].keys()

dict_keys([0, 2])

And the third and last nested dictionary is defined with the frame indices as keys:

In [19]:
distances[1][2].keys()

dict_keys([0, 1, 2])

Thus, the distance between atoms 1 and 2 at frame 0 is now:

In [20]:
print('Distance at frame 0-th between atoms 1-th and 2-th: {}'.format(distances[1][2][0]))

Distance at frame 0-th between atoms 1-th and 2-th: 1.4142135623730951 nm
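
The nested layout can be mimicked with the standard library alone. The sketch below only illustrates the key order (atom of `selection_1`, then atom of `selection_2`, then frame); the name `as_dict` is hypothetical and the coordinates are plain floats in nanometers:

```python
import math

# Coordinates of molecular system A in nanometers: frames[frame][atom] -> (x, y, z).
frames = [
    [(0, 2, -1), (-1, 1, 1), (-2, 0, 1)],
    [(1, 2, -1), (-1, 0, 1), (-2, 0, 0)],
    [(0, 2, -1), (0, 0, 1), (-1, 1, 0)],
]

selection_1 = [1]
selection_2 = [0, 2]

# Nested layout: atom of selection_1 -> atom of selection_2 -> frame -> distance.
as_dict = {
    ii: {
        jj: {kk: math.dist(frame[ii], frame[jj]) for kk, frame in enumerate(frames)}
        for jj in selection_2
    }
    for ii in selection_1
}

# Original atom and frame indices are used directly as keys.
print(as_dict[1][2][0])
```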


Just as `selection_1` and `selection_2` limit the atom indices of the calculation, `frame_indices_1` defines the list of frames where the method applies:

In [21]:
distances = msm.distance(molecular_system_A, selection_1=1, selection_2=[0,2], frame_indices_1=[1,2])

In [22]:
print('Distance at frame 2-th between atoms 1-th and 2-th: {}'.format(distances[1,0,1]))

Distance at frame 2-th between atoms 1-th and 2-th: 1.7320508075688772 nm


You can check again that with `output_form='dict'` the original indices of atoms and frames locate a distance:

In [23]:
distances = msm.distance(molecular_system_A, selection_1=1, selection_2=[0,2], frame_indices_1=[1,2], output_form='dict')

In [24]:
print('Distance at frame 2-th between atoms 1-th and 2-th: {}'.format(distances[1][2][2]))

Distance at frame 2-th between atoms 1-th and 2-th: 1.7320508075688772 nm
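
The difference between the position in the output array and the original frame index can be made explicit with a short NumPy sketch (an illustration on plain floats in nanometers, independent of MolSysMT):

```python
import numpy as np

# Coordinates of molecular system A in nanometers, shape [n_frames, n_atoms, 3].
coordinates_A = np.array([
    [[0, 2, -1], [-1, 1, 1], [-2, 0, 1]],
    [[1, 2, -1], [-1, 0, 1], [-2, 0, 0]],
    [[0, 2, -1], [0, 0, 1], [-1, 1, 0]],
], dtype='float64')

frame_indices = [1, 2]

# Keep only the requested frames; the first axis now has length 2.
subset = coordinates_A[frame_indices]

# Distance between atoms 1 and 2 at each selected frame.
check = np.linalg.norm(subset[:, 1, :] - subset[:, 2, :], axis=-1)

# Position 0 holds frame 1 and position 1 holds frame 2.
for position, frame in enumerate(frame_indices):
    print('position {} -> frame {}: {} nm'.format(position, frame, check[position]))
```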


### Distance between atoms of two systems

The second case shows how to get the distance between atoms of two systems at every frame:

In [25]:
distances = msm.distance(item_1=molecular_system_A, item_2=molecular_system_B)

As shown previously, the result is an array of rank 3. Again, the first axis corresponds to the frames, and the second and third axes to the atom indices, this time of each system:

In [26]:
distances.shape

(3, 3, 2)

Let's print out the distance between atom 1 of `molecular_system_A` and atom 0 of `molecular_system_B` at every frame:

In [27]:
for ii in range(3):
    print('Distance at frame {}-th between atom 1-th of A and atom 0-th of B: {}'.format(ii,distances[ii,1,0]))

Distance at frame 0-th between atom 1-th of A and atom 0-th of B: 5.916079783099616 nm
Distance at frame 1-th between atom 1-th of A and atom 0-th of B: 6.6332495807108 nm
Distance at frame 2-th between atom 1-th of A and atom 0-th of B: 5.477225575051661 nm
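
The two-system case follows the same pattern as before. A minimal NumPy sketch, with both systems written as plain floats in nanometers and without any claim about the engine's implementation, reproduces the shape and the values above:

```python
import numpy as np

# Coordinates in nanometers, shape [n_frames, n_atoms, 3].
coordinates_A = np.array([
    [[0, 2, -1], [-1, 1, 1], [-2, 0, 1]],
    [[1, 2, -1], [-1, 0, 1], [-2, 0, 0]],
    [[0, 2, -1], [0, 0, 1], [-1, 1, 0]],
], dtype='float64')

coordinates_B = np.array([
    [[4, -2, 0], [3, 0, -1]],
    [[5, -2, -1], [3, 1, 0]],
    [[5, -2, 0], [4, 1, 1]],
], dtype='float64')

# Rows index atoms of A, columns index atoms of B, frame by frame.
displacements = coordinates_A[:, :, np.newaxis, :] - coordinates_B[:, np.newaxis, :, :]
check = np.linalg.norm(displacements, axis=-1)

print(check.shape)
print(check[:, 1, 0])
```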


Now that `item_1` and `item_2` contain different systems, `selection_1` and `selection_2` no longer act on the same system as in the previous subsection, but on each molecular system separately (`selection_1` on `item_1` and `selection_2` on `item_2`). Let's get the distance only between atom 1 of `molecular_system_A` and atom 0 of `molecular_system_B` for every frame:

In [28]:
distances = msm.distance(item_1=molecular_system_A, selection_1=1, item_2=molecular_system_B, selection_2=0)

In [29]:
distances.shape

(3, 1, 1)

In [30]:
for ii in range(3):
    print('Distance at frame {}-th between atom 1-th of A and atom 0-th of B: {}'.format(ii,distances[ii,0,0]))

Distance at frame 0-th between atom 1-th of A and atom 0-th of B: 5.916079783099616 nm
Distance at frame 1-th between atom 1-th of A and atom 0-th of B: 6.6332495807108 nm
Distance at frame 2-th between atom 1-th of A and atom 0-th of B: 5.477225575051661 nm


Again, the input argument `output_form='dict'` lets us work with the original indices in the output object. As described in the previous subsection, this dictionary of dictionaries of dictionaries has three levels of keys: the first corresponds to the atom indices of `item_1`, the second to the atom indices of `item_2`, and the third to the frame indices. The distance between atom 1 of `molecular_system_A` and atom 0 of `molecular_system_B` at frame 1 is then obtained as follows:

In [31]:
distances = msm.distance(item_1=molecular_system_A, selection_1=1, frame_indices_1=1,
                         item_2=molecular_system_B, selection_2=0, output_form='dict')

In [32]:
print('Distance at frame 1-th between atom 1-th of A and atom 0-th of B: {}'.format(distances[1][0][1]))

Distance at frame 1-th between atom 1-th of A and atom 0-th of B: 6.6332495807108 nm


Notice that `frame_indices_1` was used to define the frame indices where the distance is computed. If only `frame_indices_1` is given, the same list of indices is used, in order, for both systems `item_1` and `item_2`:

In [33]:
distances_1 = msm.distance(item_1=molecular_system_A, selection_1=[1,2], frame_indices_1=[0,2],
                           item_2=molecular_system_B, selection_2=[0,1])

In [34]:
distances_2 = msm.distance(item_1=molecular_system_A, selection_1=[1,2], frame_indices_1=[0,2],
                           item_2=molecular_system_B, selection_2=[0,1], frame_indices_2=[0,2])

In [35]:
distances_1 == distances_2

array([[[ True,  True],
        [ True,  True]],

       [[ True,  True],
        [ True,  True]]])

In [38]:
print('Distance at frame 2-th between atom 2-th of A and atom 1-th of B: {}'.format(distances_1[1,1,1]))

Distance at frame 2-th between atom 2-th of A and atom 1-th of B: 5.0990195135927845 nm
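
The default described above can be sketched as follows. The helper `paired_distances` is hypothetical and not part of MolSysMT; it assumes that frames of both systems are paired element by element, and that an omitted second frame list falls back to the first one:

```python
import numpy as np

def paired_distances(coords_1, coords_2, frames_1, frames_2=None):
    """Distances between all atoms of two systems, frame pair by frame pair (hypothetical helper)."""
    if frames_2 is None:
        # Assumption: reuse the frame list of the first system.
        frames_2 = frames_1
    first = coords_1[frames_1]
    second = coords_2[frames_2]
    return np.linalg.norm(first[:, :, np.newaxis, :] - second[:, np.newaxis, :, :], axis=-1)

# Atoms 1 and 2 of system A and atoms 0 and 1 of system B, in nanometers.
atoms_A = np.array([
    [[-1, 1, 1], [-2, 0, 1]],
    [[-1, 0, 1], [-2, 0, 0]],
    [[0, 0, 1], [-1, 1, 0]],
], dtype='float64')

atoms_B = np.array([
    [[4, -2, 0], [3, 0, -1]],
    [[5, -2, -1], [3, 1, 0]],
    [[5, -2, 0], [4, 1, 1]],
], dtype='float64')

check_1 = paired_distances(atoms_A, atoms_B, [0, 2])
check_2 = paired_distances(atoms_A, atoms_B, [0, 2], [0, 2])

print(np.array_equal(check_1, check_2))
print(check_1[1, 1, 1])
```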


### Displacement distances

### Distances between atoms positions in different frames