# **<center> Part 4: Advanced Features of MDAnalysis </center>**


MDAnalysis has plenty of other features for users and developers! You can find these by exploring our [User Guide](https://userguide.mdanalysis.org/stable/index.html) and [Documentation](https://docs.mdanalysis.org/stable/index.html). We're going to quickly run through some of them now:

1. Distance calculations with `lib.distances`
2. Universe creation and adding topology attributes
3. Working with "Auxiliary data"
4. On-the-fly transformations

## 1. The `lib.distances` module

Distance calculations come up frequently in analyses. Particle positions are given as numpy arrays, so most work can be done using numpy (and numpy-derived) libraries. To save writing your own distance calculations, use those in the included `lib.distances` module!

- `lib.distances` is particularly handy for considering periodic boundary conditions (which numpy cannot handle). Box information is passed in as `box=ag.dimensions` 

In [4]:
import MDAnalysis as mda
from MDAnalysis.lib import distances
 
print(distances)

<module 'MDAnalysis.lib.distances' from '/home/fiona/miniforge3/envs/mda_workshop/lib/python3.11/site-packages/MDAnalysis/lib/distances.py'>


In [30]:
from MDAnalysis.tests.datafiles import TPR, TRR
u = mda.Universe(TPR, TRR)

import warnings
warnings.filterwarnings("ignore") 

- `distance_array`: All **pairwise distances** between **two** arrays of coordinates

In [6]:
ag1 = u.atoms[:10]
ag2 = u.atoms[10:30]

da = distances.distance_array(ag1.positions, 
                              ag2.positions,
                              box=u.dimensions)

print(f'Our input AtomGroups have sizes {len(ag1)} '
      f'and {len(ag2)}, the output has shape: '
      f'{da.shape}')
print()

# The output of distance_array is a matrix of the 
# distance between each position in the first 
# coordinate array and each position in the second 
# coordinate array.
print(f'The distance between {ag1[3]} and {ag2[5]} '
      f'is: {da[3, 5]} A')

Our input AtomGroups have sizes 10 and 20, the output has shape: (10, 20)

The distance between <Atom 4: H3 of type opls_290 of resname MET, resid 1 and segid seg_0_AKeco> and <Atom 16: HE2 of type opls_140 of resname MET, resid 1 and segid seg_0_AKeco> is: 6.19140688959118 A
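
To make the role of the `box` argument more concrete, here is a minimal numpy-only sketch of pairwise distances under the minimum image convention. It assumes an orthorhombic box given by its three edge lengths (the first three entries of `u.dimensions`); the helper name `pairwise_distances_pbc` is ours, and this is an illustration of the idea rather than how `lib.distances` is implemented.

```python
import numpy as np

def pairwise_distances_pbc(a, b, box_lengths=None):
    """All pairwise distances between rows of a (n, 3) and b (m, 3).

    box_lengths: optional (3,) edge lengths of an orthorhombic box.
    Illustrative only; triclinic boxes are not handled here.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Broadcasting gives an (n, m, 3) array of displacement vectors
    delta = a[:, None, :] - b[None, :, :]
    if box_lengths is not None:
        box_lengths = np.asarray(box_lengths, dtype=float)
        # Minimum image: shift each component by a whole number of box lengths
        delta -= box_lengths * np.round(delta / box_lengths)
    return np.linalg.norm(delta, axis=-1)

a = np.array([[0.5, 5.0, 5.0]])
b = np.array([[9.5, 5.0, 5.0], [2.5, 5.0, 5.0]])
box = [10.0, 10.0, 10.0]

no_pbc = pairwise_distances_pbc(a, b)          # plain Euclidean distances
with_pbc = pairwise_distances_pbc(a, b, box)   # nearest periodic image
print(no_pbc)
print(with_pbc)
```

The first pair sits on opposite sides of the box, so its distance shrinks once the periodic image is taken into account, while the second pair is unaffected.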


- `self_distance_array`: All **pairwise distances** within **a single** coordinate array

In [7]:
sda = distances.self_distance_array(ag1.positions,
                                    box=None)

print(f'Our input AtomGroup has size {len(ag1)} '
      f'and the output has shape {sda.shape}')


Our input AtomGroup has size 10 and the output has shape (45,)
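
The output has shape `(45,)` because 10 points have 10 * 9 / 2 unique pairs, stored as a flat array. As far as we understand the documentation, the pairs are ordered row by row through the upper triangle of the full distance matrix, which is the same ordering that `np.triu_indices(n, k=1)` produces. The numpy-only sketch below, with random stand-in coordinates, shows how a flat index maps back to a pair and how to rebuild the square matrix.

```python
import numpy as np

n = 10
rng = np.random.default_rng(0)
coords = rng.random((n, 3))  # stand-in coordinates, not from a trajectory

# Full symmetric (n, n) matrix of distances, computed with plain numpy
full = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)

# Row and column indices of the upper triangle, diagonal excluded
i, j = np.triu_indices(n, k=1)
flat = full[i, j]  # shape (n * (n - 1) // 2,)

# Entry k of the flat array belongs to the pair (i[k], j[k])
k = 12
print(f'flat[{k}] is the distance between points {i[k]} and {j[k]}')

# Rebuild the square matrix from the flat array
square = np.zeros((n, n))
square[i, j] = flat
square += square.T
```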


- `calc_bonds`: The distance between pairs from **two arrays** of coordinates of **equal length**.

In [9]:
coords1 = u.atoms[:10].positions
coords2 = u.atoms[10:20].positions
dist = distances.calc_bonds(coords1, 
                            coords2, 
                            box=None)

print(f'The inputs have length {len(coords1)} and '
      f'{len(coords2)} and the output has shape '
      f'{dist.shape}')
print()
print(f'The distance between the first coordinate '
      f'in each array is: {dist[0]} A')

The inputs have length 10 and 10 and the output has shape (10,)

The distance between the first coordinate in each array is: 4.759779866326316 A
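
The difference between "paired" and "all-against-all" distances is easy to mix up, so here is a small numpy-only sketch with random stand-in coordinates. Without a box, the paired calculation is just a row-wise norm, and its result is the diagonal of the all-against-all matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
coords_a = rng.random((10, 3)) * 20.0  # stand-in coordinates
coords_b = rng.random((10, 3)) * 20.0

# Paired: row i of coords_a against row i of coords_b -> shape (10,)
paired = np.linalg.norm(coords_a - coords_b, axis=1)

# All-against-all: every row of coords_a against every row of coords_b
all_pairs = np.linalg.norm(
    coords_a[:, None, :] - coords_b[None, :, :], axis=-1)

# The paired distances sit on the diagonal of the all-against-all matrix
print(paired.shape, all_pairs.shape)
print(np.allclose(paired, np.diag(all_pairs)))
```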


- `calc_angles` and `calc_dihedrals`: The angle/dihedral angle between 3/4 arrays of coordinates of equal length.

In [13]:
import numpy as np
coords1 = u.atoms[:10].positions
coords2 = u.atoms[10:20].positions
coords3 = u.atoms[20:30].positions
coords4 = u.atoms[30:40].positions

# The middle array is the apex of the angle
angles = distances.calc_angles(
            coords1, coords2, coords3)
print(f'The inputs have length {len(coords1)}, '
      f'{len(coords2)} and {len(coords3)}; the '
      f'output has shape {angles.shape}. The first '
      f'angle is {np.rad2deg(angles)[0]}.')

# The dihedral angle is between the plane of the first 
# three coordinates and the plane of the last three
dihedrals = distances.calc_dihedrals(
               coords1, coords2, coords3, coords4)
print(f'The inputs have length {len(coords1)}, '
      f'{len(coords2)}, {len(coords3)} and '
      f'{len(coords4)}; the output has shape '
      f'{dihedrals.shape}. The first angle is '
      f'{np.rad2deg(dihedrals)[0]}')

The inputs have length 10, 10 and 10; the output has shape (10,). The first angle is 46.07622285213596.
The inputs have length 10, 10, 10 and 10; the output has shape (10,). The first angle is -50.1028437342708
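
For a feel of the geometry behind these two functions, here is a numpy-only sketch for single points. The helper names `angle_deg` and `dihedral_deg` are ours, and the sign convention of the dihedral is an assumption of this sketch that may differ from the one `calc_dihedrals` uses, so only magnitudes are compared.

```python
import numpy as np

def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b at the apex."""
    v1 = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    v2 = np.asarray(c, dtype=float) - np.asarray(b, dtype=float)
    cos_theta = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    # Clip guards against rounding just outside [-1, 1]
    return np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

def dihedral_deg(p0, p1, p2, p3):
    """Signed angle in degrees between planes (p0, p1, p2) and (p1, p2, p3)."""
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))
    b1, b2, b3 = p1 - p0, p2 - p1, p3 - p2
    n1 = np.cross(b1, b2)  # normal of the first plane
    n2 = np.cross(b2, b3)  # normal of the second plane
    m1 = np.cross(n1, b2 / np.linalg.norm(b2))
    return np.degrees(np.arctan2(np.dot(m1, n2), np.dot(n1, n2)))

right_angle = angle_deg([1, 0, 0], [0, 0, 0], [0, 1, 0])
trans = dihedral_deg([1, 0, 0], [0, 0, 0], [0, 0, 1], [-1, 0, 1])
perpendicular = dihedral_deg([1, 0, 0], [0, 0, 0], [0, 0, 1], [0, 1, 1])
print(right_angle, abs(trans), abs(perpendicular))
```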


- `capped_distance`: distances between coordinates from **two arrays** that are **within a specified cut-off**
- `self_capped_distance`: distances between coordinates from a **single array** that are **within a specified cut-off**

In [14]:
ag1 = u.atoms[:10]
ag2 = u.atoms[10:30]

# returns an array of indices and an array of distances
ix, dist = distances.capped_distance(
                          ag1.positions, 
                          ag2.positions, 
                          min_cutoff=1.0,
                          max_cutoff=4.0,
                          box=u.dimensions)

print(f'We found {len(ix)} contacts less than 4.0 A')
print()
print(f'The first three are {ix[:3]} at '
      f'distances {dist[:3]}')


We found 70 contacts less than 4.0 A

The first three are [[0 9]
 [0 7]
 [0 8]] at distances [3.51211077 2.46426988 2.90528184]
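
The pair-index output can be surprising at first: each row holds an index into the first array and an index into the second. The brute-force numpy sketch below returns results in the same two-part layout. The helper `capped_pairs` is hypothetical, ignores periodic boundaries, and its treatment of distances exactly at the cut-offs is our choice, not a statement about `capped_distance`.

```python
import numpy as np

def capped_pairs(a, b, max_cutoff, min_cutoff=None):
    """Brute-force search for pairs (i, j) with min_cutoff < d <= max_cutoff.

    Returns an (n_pairs, 2) index array and an (n_pairs,) distance array.
    No periodic boundaries are considered in this sketch.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    mask = d <= max_cutoff
    if min_cutoff is not None:
        mask &= d > min_cutoff
    pairs = np.argwhere(mask)  # row k is (index into a, index into b)
    return pairs, d[mask]

a = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
b = np.array([[0.0, 0.0, 0.5], [0.0, 0.0, 3.0],
              [10.0, 0.0, 2.0], [0.0, 0.0, 8.0]])

pairs, dists = capped_pairs(a, b, max_cutoff=4.0, min_cutoff=1.0)
print(pairs)
print(dists)
```

Computing every distance and then filtering scales poorly for large systems, which is one reason to prefer the library function over a hand-rolled version like this one.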


### Summary
- `distance_array`: All **pairwise distances** between **two** arrays of coordinates
- `self_distance_array`: All **pairwise distances** within **a single** coordinate array
- `calc_bonds`: The distance between pairs from **two arrays** of coordinates of **equal length**.
- `calc_angles` and `calc_dihedrals`: The angle/dihedral angle between 3/4 arrays of coordinates of equal length.
- `capped_distance`: distances between coordinates from **two arrays** that are **within a specified cut-off**
- `self_capped_distance`: distances between coordinates from a **single array** that are **within a specified cut-off**

## 2. Creating and modifying Universes and Topology Attributes


### New Universes using `Universe.empty`

Whilst `MDAnalysis` is designed for reading pre-existing simulation files, there are also some features which allow the construction of systems.

A `Universe` object can also be constructed from the `Universe.empty` method, which is similar to `np.zeros`.

Initially, atoms will have no topology attributes and positions will be 0.

In [31]:
u_new = mda.Universe.empty(n_atoms=21, n_residues=7,
                       trajectory=True)

print(u_new.atoms)
print(u_new.residues)
print(u_new.atoms.positions)


<AtomGroup [<Atom 1:>, <Atom 2:>, <Atom 3:>, ..., <Atom 19:>, <Atom 20:>, <Atom 21:>]>
<ResidueGroup [<Residue>, <Residue>, <Residue>, <Residue>, <Residue>, <Residue>, <Residue>]>
[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]


### Adding topology attributes

We can then add topology attributes to any universe using `add_TopologyAttr`.
Only 'established' topology attributes can be added. Existing topology attributes are listed in the [User Guide](https://userguide.mdanalysis.org/stable/topology_system.html).

- You can also use `add_TopologyAttr` to add a topology attribute to a Universe created from a file where that attribute is not supported.

Positions can be assigned directly.

In [16]:
# adding topology attributes
u_new.add_TopologyAttr('mass', values=[10.0] * 21)
u_new.add_TopologyAttr('name', values=['A'] * 21)
u_new.add_TopologyAttr('type', values=['CA'] * 21)
u_new.add_TopologyAttr('resid', values=range(1,8))

# assigning atoms to residues
for i, res in enumerate(u_new.residues):
    u_new.atoms[i*3].residue = res
    u_new.atoms[i*3+1].residue = res
    u_new.atoms[i*3+2].residue = res
    
# adding positions
u_new.atoms.positions = [[1,1,1]]*21

print(u_new.atoms)
print(u_new.residues)
print(u_new.atoms.positions)

<AtomGroup [<Atom 1: A of type CA resid 1>, <Atom 2: A of type CA resid 1>, <Atom 3: A of type CA resid 1>, ..., <Atom 19: A of type CA resid 7>, <Atom 20: A of type CA resid 7>, <Atom 21: A of type CA resid 7>]>
<ResidueGroup [<Residue 1>, <Residue 2>, <Residue 3>, <Residue 4>, <Residue 5>, <Residue 6>, <Residue 7>]>
[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
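
The loop that assigns atoms to residues can also be expressed as a single index array with one entry per atom. `Universe.empty` accepts such an array through its `atom_resindex` argument, so the assignment can be made at creation time. The sketch below builds the array with numpy only; the variable names are ours, and the MDAnalysis call is left as a comment so the snippet runs on its own.

```python
import numpy as np

n_residues = 7
atoms_per_residue = 3

# Entry i is the index of the residue that atom i belongs to
atom_resindex = np.repeat(np.arange(n_residues), atoms_per_residue)
print(atom_resindex)

# The same positions as above, built as a (21, 3) float array
positions = np.ones((n_residues * atoms_per_residue, 3), dtype=np.float32)
print(positions.shape)

# With MDAnalysis imported as mda, the array could be passed like this:
# u_new = mda.Universe.empty(21, n_residues=7,
#                            atom_resindex=atom_resindex,
#                            trajectory=True)
```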


### Adding custom topology attributes


You can also add custom attributes by defining a Topology Attribute class!

These attributes will then also be available for use in selection strings.

In [17]:
from MDAnalysis.core.topologyattrs import AtomAttr

class Bounciness(AtomAttr):
    dtype=bool
    attrname='bounciness'
    singular='bouncy'
    per_object='atom'

# add the attribute, with default value False (0)
u_new.add_TopologyAttr(Bounciness([0]*21))
print('Initial bounciness:', u_new.atoms.bounciness)
bouncy_atoms = u_new.select_atoms("bouncy")
print('Total bouncy atoms:', len(bouncy_atoms))
print()

u_new.atoms[::2].bounciness = True
print('Reassigned bounciness:', u_new.atoms.bounciness)
bouncy_atoms = u_new.select_atoms("bouncy")
print('Total bouncy atoms:', len(bouncy_atoms))

Initial bounciness: [False False False False False False False False False False False False
 False False False False False False False False False]
Total bouncy atoms: 0

Reassigned bounciness: [ True False  True False  True False  True False  True False  True False
  True False  True False  True False  True False  True]
Total bouncy atoms: 11


### New Universes using `Merge`

We can use `MDAnalysis.Merge` to create new Universes from `AtomGroups`.

This could be a single atom group, or multiple atom groups from different Universes.

In [18]:
print("Our original Universe:", u_new)

# even residues only
residue_even = mda.Merge(u_new.select_atoms(
                                    "resid 2 4 6"))
print("Universe with only even residues:", 
      residue_even)

# odd residues only 
residue_odd = mda.Merge(u_new.select_atoms(
                                  "resid 1 3 5 7"))
print("Universe with only odd residues:", 
      residue_odd)

# Now let's combine them!
residue_all = mda.Merge(residue_even.atoms, residue_odd.atoms)
print("Back together again:", residue_all)

Our original Universe: <Universe with 21 atoms>
Universe with only even residues: <Universe with 9 atoms>
Universe with only odd residues: <Universe with 12 atoms>
Back together again: <Universe with 21 atoms>


## 3. Auxiliary Data

Auxiliary readers allow you to read in timeseries data accompanying a trajectory that is not stored in the regular trajectory file.

Auxiliary data may be added to a trajectory Reader through the `add_auxiliary()` method. Values from the auxiliary file will be read in, matching the timestep as the trajectory is iterated through, and are accessed as `ts.aux.<auxiliary_name>`.

In [19]:
from MDAnalysis.tests.datafiles import PDB_xvf, TRR_xvf, XVG_BZ2

u_aux = mda.Universe(PDB_xvf, TRR_xvf)

# The XVG_BZ2 file contains force information not
# present in the trajectory/topology file.
# Let's add this with the name 'forces'
u_aux.trajectory.add_auxiliary('forces', XVG_BZ2)

# We can now access the force at each timestep
# as we iterate through the trajectory
for ts in u_aux.trajectory:
    print(ts.aux.forces)
    
# The first element of each array is the time; the 
# remainder are values from the XVG file.

[    0.        200.71288 -1552.2849  ...   128.4072   1386.0378
 -2699.3118 ]
[   50.      -1082.6454   -658.32166 ...  -493.02954   589.8844
  -739.2124 ]
[  100.       -246.27269   146.52911 ...   484.32501  2332.3767
 -1801.6234 ]
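
One way to picture "matching the timestep" is to pick, for each trajectory frame, the auxiliary step that lies closest in time. The numpy-only sketch below does exactly that with made-up times and values. It is a simplified illustration under that assumption, not the auxiliary reader's actual implementation, which offers more options.

```python
import numpy as np

# Hypothetical auxiliary series: one row per step, first column is the time
aux_times = np.arange(0.0, 101.0, 25.0)  # 0, 25, 50, 75, 100
aux_values = np.column_stack([aux_times, aux_times * 2.0])

# Hypothetical trajectory frame times
frame_times = np.array([0.0, 50.0, 100.0])

# For each frame, find the auxiliary step closest in time
nearest = np.abs(aux_times[None, :] - frame_times[:, None]).argmin(axis=1)

for t, row in zip(frame_times, aux_values[nearest]):
    print(t, row)
```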




## 4. Transformations

We often wish to apply transformations to each frame of a trajectory in order to e.g. remove effects of periodic boundary conditions or align to a reference structure. 

In [20]:
import nglview as nv
view = nv.show_mdanalysis(u)
view



NGLWidget(max_frame=9)

### Wrapping/Unwrapping from the periodic box 

You can remove the effects of the periodic box in the current frame using `unwrap`. You can also use `wrap` to translate atoms back into the unit cell.

- For wrapping/unwrapping to be persistent, you need to load the Universe in memory
- Unwrapping relies on `Bonds`, so it won't work if `Bonds` are missing from your universe. `guess_bonds` (specified when loading a Universe or applied to an `AtomGroup`) can be used to guess bonds based on distances.

To apply the transformation to every frame, you could loop through the trajectory - or...

In [21]:
u_mem = mda.Universe(TPR, TRR, in_memory=True)
u_mem.atoms.unwrap(reference='cog')

array([[ 52.017067 ,  43.56005  ,  31.554958 ],
       [ 51.18792  ,  44.112053 ,  31.722015 ],
       [ 51.550823 ,  42.827724 ,  31.038803 ],
       ...,
       [105.341995 ,  74.072    ,  40.988003 ],
       [ 57.684002 ,  35.323997 ,  14.804    ],
       [ 62.961002 ,  47.239    ,   3.7529998]], dtype=float32)

In [22]:
view_unwrap = nv.show_mdanalysis(u_mem)
view_unwrap

NGLWidget(max_frame=9)

### On-the-fly Transformations

Transformations can be added to a trajectory using `u.trajectory.add_transformations`. These will be performed 'on the fly' as the trajectory is iterated through - no need to load the trajectory to memory. **Note** - you can only add transformations once!

A number of common transformations are available in `MDAnalysis.transformations` - see the [documentation](https://docs.mdanalysis.org/stable/documentation_pages/trajectory_transformations.html#currently-implemented-transformations). These include:
 - translation and rotation
 - center in unit cell
 - fit to a reference
 - wrap/unwrap over unit cell



In [28]:
from MDAnalysis import transformations as trans
u_wrap = mda.Universe(TPR, TRR)
protein = u_wrap.select_atoms('protein')
water = u_wrap.select_atoms('resname SOL')

workflow = [trans.unwrap(u_wrap.atoms),
            trans.center_in_box(protein, 
                                center='geometry'),
            trans.wrap(water, compound='residues')]

u_wrap.trajectory.add_transformations(*workflow)

view_trans = nv.show_mdanalysis(u_wrap)
view_trans


NGLWidget(max_frame=9)
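
To see what "on the fly" means in practice, here is a toy model in plain Python and numpy. `ToyReader` is a stand-in written for this sketch, not an MDAnalysis class: it applies the transformations in the order given every time a frame is read, leaves the stored data untouched, and refuses a second call to `add_transformations` to mimic the note above.

```python
import numpy as np
from types import SimpleNamespace

class ToyReader:
    """Stand-in for a trajectory reader; not an MDAnalysis class."""

    def __init__(self, frames):
        self._frames = [np.asarray(f, dtype=float) for f in frames]
        self._transformations = ()

    def add_transformations(self, *transformations):
        if self._transformations:
            raise ValueError('transformations were already added')
        self._transformations = transformations

    def __iter__(self):
        for frame in self._frames:
            # A fresh copy is made for every read, so stored data is untouched
            ts = SimpleNamespace(positions=frame.copy())
            for transform in self._transformations:
                ts = transform(ts)
            yield ts

def shift_x(ts):
    ts.positions[:, 0] += 1.0
    return ts

def double(ts):
    ts.positions *= 2.0
    return ts

reader = ToyReader([[[0.0, 0.0, 0.0]], [[1.0, 1.0, 1.0]]])
reader.add_transformations(shift_x, double)  # shift first, then double
result = [ts.positions.copy() for ts in reader]
print(result)
```

Swapping the order of `shift_x` and `double` would give different positions, which is why the order of the workflow list matters.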

### Custom Transformations


Custom transformations can be defined and applied in the same way as the built-in transformations.

At its core, a transformation function must only take a `Timestep` object as its input, and return the `Timestep` as the output.

- If your transformation needs extra input (e.g. atom selections), you can use a wrapped function - see the Tutorial for more!

In [29]:
def up_by_2(ts):
    """Translates atoms up by 2 angstrom along z"""
    ts.positions += np.array([0.0, 0.0, 2.0])
    return ts

# If a transformation doesn't require extra 
# information, it can be added directly when 
# creating the universe
u_up = mda.Universe(TPR, TRR, 
                    transformations=[up_by_2])
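
The "wrapped function" mentioned above can be sketched as a closure: an outer function takes the extra input and returns an inner function that accepts only the `Timestep`. The name `up_by_x` and the `SimpleNamespace` stand-in for a `Timestep` are choices made for this sketch so that it runs without a trajectory file.

```python
import numpy as np
from types import SimpleNamespace

def up_by_x(distance):
    """Build a transformation that shifts all atoms along z by `distance`."""
    def wrapped(ts):
        ts.positions += np.array([0.0, 0.0, distance])
        return ts
    return wrapped

# Stand-in for a Timestep: any object with a `positions` array will do here
fake_ts = SimpleNamespace(positions=np.zeros((3, 3)))

up_by_5 = up_by_x(5.0)
fake_ts = up_by_5(fake_ts)
print(fake_ts.positions)
```

With a real Universe, `up_by_x(5.0)` could be passed wherever `up_by_2` is used above.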

## Summary

We took a brief look at:

1. Measuring distances with `lib.distances`
2. Making Universes and adding atoms/attributes to them
3. Reading in other timestep data with Auxiliary readers
4. Applying on-the-fly transformations

Tutorial 4 has more examples of each of these to play around with. You don't have to go through all of them: pick the ones that interest you (or continue working on the earlier notebooks). Also remember that the User Guide and Documentation cover plenty that we haven't touched on here, and if you get stuck, just ask!