# Day 1, Lecture 2
## Handling simulation trajectory data

### Global imports

In [None]:
import MDAnalysis as mda
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

## 1. Reading a trajectory

MDAnalysis can read a wide variety of simulation coordinate formats. A full list is given in the [coordinates documentation](https://docs.mdanalysis.org/stable/documentation_pages/coordinates/init.html#supported-coordinate-formats). The large majority of these formats store time-dependent coordinates (and sometimes also forces and velocities), which MDAnalysis allows users to explore.

#### 1.1 Creating a universe from a trajectory file

< Here we essentially say that, as in session 1, we load trajectories by creating a Universe >

In [None]:
# Here we add an example showing how a simple trajectory can be read
# We also show an example of how one could read a trajectory after universe construction

#### 1.2 Creating a universe by loading multiple trajectory files

< Here we describe the ChainReader case - loading a series of trajectories by passing a list of trajectories >

In [None]:
# We also add an example of chain reading

## 2. Traversing through a trajectory

< Here we describe universe.trajectory. We mention that MDAnalysis uses an iterative I/O model: as we traverse the trajectory object, in most cases data is loaded from disk one frame at a time and used to update the time-dependent data in the universe and its associated groups (i.e. AtomGroups) >

#### 2.1 Random seeking

< Here we describe random seeking: any frame of the trajectory can be accessed directly by indexing the trajectory with the frame number we want >

In [None]:
# Show off random seeking
# First create a universe and then an atomgroup from it, show the current time and the positions of the atomgroup
# Then seek to a random part of the trajectory and show how the time and the positions have been updated

#### 2.2 Iterating through the trajectory

< Here we describe how iterating through the trajectory is the most common way of looping over it and capturing how parts of the system have changed >

In [None]:
# Here we show off iterating through a trajectory
# We show how information such as the box volume can be obtained and plotted by iterating through the trajectory

#### 2.3 Trajectory slicing

< We briefly mention that the trajectory object can be sliced, which allows frames to be dropped from the start and end of the trajectory and also allows frames to be skipped at a regular interval >

In [None]:
# Quickly show off trajectory slicing

#### 2.4 Transfer to memory

< Here we explain that it is possible to bypass the iterative I/O model by transferring all the trajectory data to memory. This can be quite useful for reducing overheads, but it does mean that any changes made to an in-memory trajectory are persistent >

In [None]:
# Give an example of transfer to memory

## 3. Visualizing a trajectory

< Here we mention that, in the same way as we did for single frames in session 1, we can visualize trajectories using NGLView >

In [None]:
# NGLView code to observe a simulation trajectory

## 4. Updating AtomGroups

< Here we mention that whilst AtomGroups (see session 1) are static (i.e. they are based on an initial selection and will not change as you traverse a trajectory), we also offer the UpdatingAtomGroup class. This allows users to select atoms from a Universe based on a given criterion and have this selection update as they traverse a trajectory >

In [None]:
# Add an example of an UpdatingAtomGroup

## 5. Writing trajectory files

< here we discuss the writers we have and how they are usually accessed >

In [None]:
# Some examples of the different ways users can write out a trajectory;
## a) Use the Writer class
## b) Write from the atom group
## c) etc...

## 6. Exercises

Exercises we could make the students do:
1) Use trajectory traversing - FRET stuff below?

2) Get them to write out a trajectory in a different file format
  - Just a few select frames, etc...

3) Use the UpdatingAtomGroup to explore solvation shell (as below)

4) Using the chainreader?

In [None]:
import MDAnalysis as mda
import nglview as nv
from MDAnalysis.tests.datafiles import PSF, DCD

As in the earlier session, we can use NGLView to traverse through the trajectory visually.

In [None]:
closed_to_open = mda.Universe(PSF, DCD)
nv.show_mdanalysis(closed_to_open)

### Working with AtomGroups: FRET distances

Experimental FRET labels: distances

<div>
<img src="figures/fret_distances_adk.png" alt="FRET distances" width="250"/>
</div>


* I52 - K145
* A55 - V169
* A127 - A194

Calculate the C$_\beta$ distances as proxies for the spin-label distances.

Sampling large conformational changes is challenging with standard equilibrium MD. Therefore we used an enhanced sampling method ("dynamic importance sampling", DIMS) to generate transitions between closed and open apo AdK [2, 3], in addition to "brute force" equilibrium MD (on PSC Anton).


In [None]:
beta = closed_to_open.select_atoms("name CB")

donors = beta.select_atoms("resname ILE and resid 52",
                           "resname ALA and resid 55",
                           "resname ALA and resid 127")
acceptors = beta.select_atoms("resname LYS and resid 145",
                              "resname VAL and resid 169",
                              "resname ALA and resid 194")

Indexing the trajectory sets the active frame to that index.

In [None]:
closed_to_open.trajectory[0]
print(f"Frame: {closed_to_open.trajectory.frame}")
print(f"Time: {closed_to_open.trajectory.time}")

In [None]:
closed_to_open.trajectory[-1]
print(f"Frame: {closed_to_open.trajectory.frame}")
print(f"Time: {closed_to_open.trajectory.time}")

Setting the frame updates dynamic data such as positions. Note, however, that a positions array you have already extracted from an AtomGroup is a copy and does not update.

In [None]:
closed_to_open.trajectory[0]
print(closed_to_open.trajectory.frame)
donor_positions = donors.positions
donor_positions

In [None]:
closed_to_open.trajectory[-1]
print(closed_to_open.trajectory.frame)
donor_positions

Rather, it's the AtomGroup that updates.

In [None]:
donors.positions

The more common way to traverse through a trajectory (e.g. for analysis) is to iterate through it.

In [None]:
for ts in closed_to_open.trajectory:
    print(f"Frame: {ts.frame}, time: {ts.time}")

You can also easily slice the trajectory.

In [None]:
for ts in closed_to_open.trajectory[2:92:8]:
    print(f"Frame: {ts.frame}, time: {ts.time}")

Let's apply this to the FRET analysis we did earlier. First, for convenience, let's codify the analysis into a function. The arguments (`donors`, `acceptors`) are `AtomGroup`s so that we can work with the updated positions arrays for each frame.

In [None]:
def calculate_fret_distances(donors, acceptors):
    return np.linalg.norm(donors.positions - acceptors.positions, axis=1)
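
To see what this function computes, here is a small check of the same expression on plain NumPy arrays. The coordinates are made-up values chosen so that the distances are easy to verify by hand; row `i` of one array is paired with row `i` of the other.

```python
import numpy as np

# Two toy "donor" and "acceptor" positions (made-up values, in Å)
donor_xyz = np.array([[0.0, 0.0, 0.0],
                      [1.0, 1.0, 1.0]])
acceptor_xyz = np.array([[3.0, 4.0, 0.0],
                         [1.0, 1.0, 3.0]])

# axis=1 takes the norm of each row, giving one distance per pair
pair_distances = np.linalg.norm(donor_xyz - acceptor_xyz, axis=1)
print(pair_distances)  # [5. 2.]
```

Because the pairing is row by row, the donor and acceptor groups must contain the same number of atoms in matching order.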

In [None]:
distances = []
times = []
for ts in closed_to_open.trajectory:
    d = calculate_fret_distances(donors, acceptors)
    distances.append(d)
    times.append(ts.time)
print(distances[:3])

In [None]:
import matplotlib.pyplot as plt

plt.plot(times, distances)
plt.legend(("I52-K145", "A55-V169", "A127-A194"))
plt.xlabel("Time (ps)")
plt.ylabel(r"Distance (Å)");

### Working with UpdatingAtomGroups: solvent shells

In [None]:
from MDAnalysisData import datasets
ifabp_data = datasets.fetch_ifabp_water()
ifabp = mda.Universe(ifabp_data.topology, ifabp_data.trajectory)

In [None]:
solvshell_static = ifabp.select_atoms("resname TIP3 and around 5.0 protein")
solvshell_static

In [None]:
ifabp.trajectory[-1]
solvshell_static

In [None]:
solvshell_updating = ifabp.select_atoms("resname TIP3 and around 5.0 protein", updating=True)
solvshell_updating

In [None]:
ifabp.trajectory[0]
solvshell_updating

In [None]:
times = []
n_waters = []
for ts in ifabp.trajectory:
    times.append(ts.time)
    n_waters.append(len(solvshell_updating.residues))
print(n_waters[:3])

# TODO: work out why the reported times are negative

In [None]:
plt.plot(times, n_waters)
plt.xlabel("Time (ps)")
plt.ylabel(r"# waters within 5 $\AA$ of protein");