# Calculating polymer properties, using built-in analysis methods, and iterating over trajectories

### In this notebook we learn how to use the methods of the `AtomGroup` class to calculate bond lengths, distances between atoms, angles and dihedrals. We will also calculate typical polymer descriptors such as the radius of gyration, end-to-end distance and persistence length.
### We then see how to iterate over a trajectory, and plot these quantities for each timestep. 

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import MDAnalysis as mda
from MDAnalysis.lib import distances
import MDAnalysisData as data
import nglview as nv
mda.__version__

### 1. Load the dataset

Create a Universe by loading the coordinates of a PEG, poly(ethylene glycol), chain $H(OCH_2CH_2)_{20}OH$:

In [None]:
polymer = data.datasets.fetch_PEG_1chain()

u = mda.Universe(polymer['topology'], polymer['trajectory'])

In [None]:
u.trajectory

We loaded a trajectory with 50 frames, but for the first part of this tutorial we'll limit ourselves to looking at a single frame.

Select the polymer (excluding water molecules):

### 2. Visualize the system

We can visualize the trajectory using `nglview`, which takes an `AtomGroup` as input:

### 3. Polymer descriptors

Calculate the center of mass of the polymer chain:

Now, calculate the radius of gyration:
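
For reference, both descriptors are mass-weighted sums over the atomic positions. The sketch below spells them out with NumPy on a made-up four-atom molecule (the coordinates and masses are hypothetical, not taken from the PEG dataset). On a real `AtomGroup`, the built-in `center_of_mass()` and `radius_of_gyration()` methods should return the same quantities.

```python
import numpy as np

# Hypothetical toy molecule: coordinates in Angstrom, masses in amu.
positions = np.array([
    [0.0, 0.0, 0.0],
    [2.0, 0.0, 0.0],
    [2.0, 2.0, 0.0],
    [0.0, 2.0, 0.0],
])
masses = np.array([12.0, 12.0, 16.0, 16.0])


def center_of_mass(positions, masses):
    """Mass-weighted mean position."""
    return np.average(positions, axis=0, weights=masses)


def radius_of_gyration(positions, masses):
    """Square root of the mass-weighted mean squared distance from the center of mass."""
    com = center_of_mass(positions, masses)
    sq_dist = np.sum((positions - com) ** 2, axis=1)
    return np.sqrt(np.average(sq_dist, weights=masses))


com = center_of_mass(positions, masses)
rg = radius_of_gyration(positions, masses)
print("center of mass:", com)
print("radius of gyration:", rg)
```

Because the two heavier atoms sit at `y = 2`, the center of mass is pulled towards them along `y` while staying at `x = 1`.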

### 4. End-to-end distance of the PEG chain: calc_bonds

We can use the hydrogens in the capping -OH groups (`type ho`) as reference points.
First, select the two hydrogen atoms:

Then, calculate the distance between their coordinates:

The numpy function works, but if your system has periodic boundary conditions you need to make sure that distances are calculated using the minimum image convention. We can take care of this by using `distances.calc_bonds`:
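
To see why the minimum image convention matters, here is a minimal NumPy sketch that assumes an orthorhombic box; the box size and the two positions are hypothetical. `distances.calc_bonds(coords1, coords2, box=u.dimensions)` performs this kind of correction for you and also handles triclinic boxes, which this sketch does not.

```python
import numpy as np


def minimum_image_distance(a, b, box_lengths):
    """Distance between points a and b in an orthorhombic periodic box."""
    delta = b - a
    # Shift every component by a whole number of box lengths so that it
    # ends up within half a box length of zero.
    delta -= box_lengths * np.round(delta / box_lengths)
    return np.linalg.norm(delta)


# Hypothetical cubic box of 30 Angstrom with two atoms near opposite faces.
box_lengths = np.array([30.0, 30.0, 30.0])
h1 = np.array([1.0, 15.0, 15.0])
h2 = np.array([29.0, 15.0, 15.0])

naive = np.linalg.norm(h2 - h1)                     # ignores periodicity
pbc = minimum_image_distance(h1, h2, box_lengths)   # uses the nearest image
print(naive, pbc)
```

The naive distance spans almost the whole box, whereas the nearest periodic image of `h2` is only 2 Å away from `h1`.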

### 5. Calculate (dihedral) angles: calc_angles & calc_dihedrals
You might need these to check some conformational parameters in your system, or whether your force field yields the expected dihedral angle distribution.

Select the C and O atoms in the PEG chain:

Now with some creative index slicing we can look at the C-O-C angles and O-C-C-O dihedrals in our polymer chain.
Since the $CH_{2}$ groups have identical atom types, the simplest way to select only every other carbon is to slice the arrays.

Use `distances.calc_angles` and `distances.calc_dihedrals` to calculate the angles. We will need the positions of the atom groups involved:
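
The slicing pattern is easier to follow on a toy example. The sketch below builds a hypothetical all-trans O-C-C-O-... backbone as a planar zig-zag, splits it into oxygen and carbon arrays, and pairs them up by slicing. The `angles` and `dihedrals` helpers are plain NumPy stand-ins written for this illustration; in the notebook, the same sliced position arrays would be passed to `distances.calc_angles` and `distances.calc_dihedrals`, which return radians rather than degrees.

```python
import numpy as np


def angles(a, b, c):
    """Angle a-b-c in degrees for each row, with b at the apex."""
    v1, v2 = a - b, c - b
    cos = np.sum(v1 * v2, axis=1) / (
        np.linalg.norm(v1, axis=1) * np.linalg.norm(v2, axis=1)
    )
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))


def dihedrals(a, b, c, d):
    """Dihedral a-b-c-d in degrees for each row (0 = cis, +/-180 = trans)."""
    b0, b1, b2 = a - b, c - b, d - c
    b1 = b1 / np.linalg.norm(b1, axis=1, keepdims=True)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = b0 - np.sum(b0 * b1, axis=1, keepdims=True) * b1
    w = b2 - np.sum(b2 * b1, axis=1, keepdims=True) * b1
    x = np.sum(v * w, axis=1)
    y = np.sum(np.cross(b1, v) * w, axis=1)
    return np.degrees(np.arctan2(y, x))


# Hypothetical backbone with 3 repeat units: O C C O C C O C C O
n_units = 3
n_atoms = 3 * n_units + 1
idx = np.arange(n_atoms)
backbone = np.column_stack([1.2 * idx, 0.5 * (-1.0) ** idx, np.zeros(n_atoms)])

is_oxygen = idx % 3 == 0
oxygens = backbone[is_oxygen]    # n_units + 1 atoms
carbons = backbone[~is_oxygen]   # 2 * n_units atoms

# C-O-C: second carbon of unit k, the ether oxygen, first carbon of unit k + 1
coc = angles(carbons[1:-1:2], oxygens[1:-1], carbons[2::2])
# O-C-C-O: oxygen k, both carbons of unit k, oxygen k + 1
occo = dihedrals(oxygens[:-1], carbons[::2], carbons[1::2], oxygens[1:])
print(coc)
print(occo)
```

Every sliced array in a call has the same length, so row `k` of each argument describes the same angle or dihedral. For this flat zig-zag, all O-C-C-O dihedrals come out as trans.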

### 6. Now we do it all again, iterating over the entire trajectory

By default we start at frame 0:

`u.trajectory` can be sliced like a numpy array. 
We can move to a different frame by indexing the `u.trajectory` object with the frame number:

We can then iterate through it using a loop. 
Let's calculate $R_g$ and $r_e$ for each frame:

Let's calculate the average values $\langle R_g^2 \rangle$ and $\langle r_e \rangle$:
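
The per-frame loop and the averages can be sketched together as follows. The "trajectory" here is a hypothetical NumPy array of random-walk chains, not the PEG data. With MDAnalysis, the loop would instead read `for ts in u.trajectory:`, and the positions of existing atom groups are updated on each iteration, so the descriptors have to be evaluated and stored inside the loop.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a trajectory: 50 frames of a 10-bead random walk.
n_frames, n_beads = 50, 10
steps = rng.normal(size=(n_frames, n_beads - 1, 3))
frames = np.concatenate(
    [np.zeros((n_frames, 1, 3)), np.cumsum(steps, axis=1)], axis=1
)
masses = np.ones(n_beads)

rg_per_frame = []
re_per_frame = []
for positions in frames:
    # Radius of gyration of this frame
    com = np.average(positions, axis=0, weights=masses)
    sq_dist = np.sum((positions - com) ** 2, axis=1)
    rg_per_frame.append(np.sqrt(np.average(sq_dist, weights=masses)))
    # End-to-end distance of this frame
    re_per_frame.append(np.linalg.norm(positions[-1] - positions[0]))

rg_per_frame = np.array(rg_per_frame)
re_per_frame = np.array(re_per_frame)

# Averages over the trajectory
mean_sq_rg = np.mean(rg_per_frame ** 2)
mean_re = np.mean(re_per_frame)
print(mean_sq_rg, mean_re)
```

Note that $\langle R_g^2 \rangle$ is the mean of the squared values, which is not the same as the square of the mean $R_g$.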

Now calculate the $O-C-C-O$ dihedrals, like previously, but iterating over the trajectory:

Each element of the dihedrals list is an array, so before looking at the distribution we need to "flatten" it. We can do this in one line using `numpy.concatenate`:
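
As a small illustration of the flattening step, assuming a hypothetical list with one array of dihedrals per frame (the values are made up):

```python
import numpy as np

# Hypothetical per-frame results: 3 frames with 4 dihedrals each, in degrees.
dihedrals_per_frame = [
    np.array([60.0, -65.0, 178.0, 70.0]),
    np.array([58.0, -172.0, 66.0, -61.0]),
    np.array([-70.0, 63.0, 175.0, 59.0]),
]

# Join the per-frame arrays end to end into a single 1D array.
dih = np.concatenate(dihedrals_per_frame)
print(dih.shape)  # (12,)
```

The resulting 1D array can be passed directly to a histogram function.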

Now plot the histogram:

In [None]:
fig, ax = plt.subplots()
ax.hist(dih, bins=20, density=True)
ax.set_xlabel("O-C-C-O torsion")
ax.set_ylabel("Norm. Frequency")

plt.show()

### 7. Persistence length

This is also a built-in method. In this case, we don't have to iterate over the trajectory because the function already does that for us.

When analysing polymers, the persistence length is a measure of a chain's stiffness. It is the length along the chain over which the directions of the bond vectors become decorrelated. A high persistence length indicates that the polymer chain is rigid and keeps its direction over long stretches; a low persistence length indicates that the chain has little memory of its orientation.

The bond autocorrelation function $C(n)$ measures the average cosine of the angle between a normalized bond vector $\mathbf{a}_i$ and the bond vector $n$ bonds further along the chain.

$$C(n) = \langle \cos\theta_{i, i+n} \rangle = \langle \mathbf{a}_i \cdot \mathbf{a}_{i+n} \rangle$$

This is then fitted to an exponential decay, where $l_B$ is the average bond length, and $l_P$ is the persistence length.


$$C(n) \approx \exp\left(-\frac{n l_B}{l_P}\right)$$
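
To make the two formulas concrete, here is a minimal NumPy sketch for a single chain conformation. It is not the MDAnalysis implementation: `bond_autocorrelation` and `fit_persistence_length` are hypothetical helpers, the fit is a simple log-linear least-squares fit through the origin (it requires $C(n) > 0$ and may differ from the fitting procedure used by the library), and the input data are synthetic.

```python
import numpy as np


def bond_autocorrelation(positions):
    """C(n) for one ordered chain conformation, n = 0 .. n_bonds - 1."""
    bonds = positions[1:] - positions[:-1]
    unit = bonds / np.linalg.norm(bonds, axis=1, keepdims=True)
    n_bonds = len(unit)
    c = np.empty(n_bonds)
    for n in range(n_bonds):
        # Average a_i . a_{i+n} over all pairs of bonds separated by n.
        c[n] = np.mean(np.sum(unit[: n_bonds - n] * unit[n:], axis=1))
    return c


def fit_persistence_length(c, bond_length):
    """Least-squares fit of ln C(n) = -n * l_B / l_P through the origin."""
    x = bond_length * np.arange(len(c))
    slope = np.sum(x * np.log(c)) / np.sum(x * x)
    return -1.0 / slope


# A perfectly straight rod never decorrelates: C(n) = 1 for all n.
rod = np.column_stack([np.arange(6.0), np.zeros(6), np.zeros(6)])
c_rod = bond_autocorrelation(rod)

# Synthetic exponential decay with a known persistence length.
l_b, l_p_true = 1.5, 6.0
c_synthetic = np.exp(-np.arange(10) * l_b / l_p_true)
l_p_fit = fit_persistence_length(c_synthetic, l_b)
print(c_rod, l_p_fit)
```

For real data, $C(n)$ should additionally be averaged over all frames (and over all chains, if there are several) before fitting.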


In [None]:
from MDAnalysis.analysis.polymer import PersistenceLength

Select the backbone of the polymer. It's easy in this case since we only need to exclude hydrogens:

It is important that the contents of the polymer `atomgroup` are in order. 
Selections done using `select_atoms` will always be sorted.
This can be checked by listing the `atomgroup`.

Run the `PersistenceLength` analysis:

We can plot the autocorrelation stored in `pl.results`:

The tool can then perform the exponential decay fit for us, which populates the `.lp` attribute.

We can check the validity of the fit by plotting the results, using the `.plot()` function:

... or just look at the value of $l_{P}$: