# GROMOS Trajectory Evaluation with PyGromos and Pandas

## Example file for the evaluation of GROMOS trajectory files in PyGromos

1. Analysis of a GROMOS trc file (position trajectory)
    1. Import
    2. Common Functions
2. Analysis of a GROMOS tre file (energy trajectory)
    1. Import
    2. Common Functions

In [None]:
# general imports for manual data manipulations. Not needed if only provided functions are used
import numpy as np
import pandas as pd

In [None]:

# specific imports from pygromos for trc and tre file support
import pygromos.files.trajectory.trc as traj_trc
import pygromos.files.trajectory.tre as traj_tre



## 1) TRC

### 1.1) TRC import

In [None]:
# import the trajectory file into a Trc class
trc = traj_trc.Trc(input_value="example_files/Traj_files/test_CHE_vacuum_sd.trc")

The Trc class offers the usual GROMOS block structure and, in addition, a pandas DataFrame called database in which all timesteps are stored.
For typical trc files the only classic block is the TITLE block; all other blocks are stored inside the database.

In addition, many common functions are provided to evaluate the data. If a function you need is missing, normal pandas syntax can be used to write a custom one.

If you have a function that's generally useful, please contact the developers to possibly add it to the pygromos code to help other people :)
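
As a minimal sketch of what such a custom function could look like, the example below builds a small toy DataFrame instead of loading a real trc file. The layout (one row per frame, one column per atom holding an xyz numpy array) and the column names POS_1 and POS_2 are assumptions made for illustration only; inspect trc.database to see the actual columns before adapting it.

```python
import numpy as np
import pandas as pd

# Toy stand-in for a position trajectory table: one row per frame, one column
# per atom, each cell holding an xyz position as a numpy array.
# The column names "POS_1" and "POS_2" are hypothetical.
frames = pd.DataFrame(
    {
        "time": [0.0, 0.5, 1.0],
        "POS_1": [np.array([0.0, 0.0, 0.0]), np.array([0.0, 0.0, 0.1]), np.array([0.0, 0.1, 0.1])],
        "POS_2": [np.array([1.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.1]), np.array([2.0, 0.1, 0.1])],
    }
)


def pair_distance_series(df: pd.DataFrame, col_i: str, col_j: str) -> pd.Series:
    """Return the Euclidean distance between two position columns for every frame."""
    return df.apply(lambda row: float(np.linalg.norm(row[col_i] - row[col_j])), axis=1)


# One distance per frame; the usual pandas reductions (mean, std, ...) apply.
distances = pair_distance_series(frames, "POS_1", "POS_2")
mean_distance = distances.mean()
```

Because the result is an ordinary pandas Series, it can be reduced, plotted, or joined back onto the frame table like any other column.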

In [None]:
[x for x in dir(trc) if not x.startswith("_")]

### 1.2) Common trc functions

In [None]:
# Get the average movement length between two frames
trc.get_atom_movement_length_mean(atomI=1)

In [None]:
# Or get the center of geometry (cog) movement for a whole group of atoms. The atoms are given as a list of atom numbers.
trc.get_cog_movement_total_series_for_atom_group(atoms=[1,2,5]).mean()

In [None]:
# Get the average distance between two atoms over all time frames
trc.get_atom_pair_distance_mean(atomI=1, atomJ=2)

#### RMSD

In [None]:
# Calculate the rmsd to the initial frame (0th frame).
# Alternatively, a different reference can be provided as argument to the rmsd function.
# The accepted arguments are an integer or a single trajectory frame.
rmsd = trc.rmsd(0)

In [None]:
# This returns the rmsd of every time frame relative to the initial frame.
# The rmsd slowly grows as the simulation moves farther away from the initial setup.
rmsd

In [None]:
# The mean over all frames can be easily taken with the pandas function mean()
rmsd.mean()
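
For readers who want to see the quantity itself, here is a minimal numpy sketch of a per-frame RMSD against a reference frame. It is not taken from the pygromos source: the function name rmsd_to_reference and the array layout are assumptions, and the sketch does no rotational or translational fitting, which trc.rmsd may or may not do.

```python
import numpy as np


def rmsd_to_reference(positions: np.ndarray, reference: np.ndarray) -> np.ndarray:
    """Per-frame RMSD without any structural fitting.

    positions: array of shape (n_frames, n_atoms, 3)
    reference: array of shape (n_atoms, 3)
    """
    # Squared displacement of every atom in every frame -> (n_frames, n_atoms)
    squared = ((positions - reference) ** 2).sum(axis=2)
    # Average over atoms, then take the root -> (n_frames,)
    return np.sqrt(squared.mean(axis=1))


# Toy trajectory: two atoms, three frames, drifting away from the first frame.
reference = np.zeros((2, 3))
trajectory = np.stack(
    [
        reference,
        reference + np.array([0.3, 0.0, 0.4]),
        reference + np.array([0.6, 0.0, 0.8]),
    ]
)
rmsd_values = rmsd_to_reference(trajectory, trajectory[0])
```

In this toy case every atom moves by the same vector, so the RMSD of a frame equals the length of that vector.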

#### RDF

In [None]:
# This functionality is still under development

## 2) TRE

### 2.1) Tre import and structure

In [None]:
# import the trajectory file into a Tre class
from pygromos.files.trajectory.tre_field_libs.ene_fields import gromos_2015_tre_block_names_table

tre = traj_tre.Tre(input_value="example_files/Traj_files/test_CHE_H2O_bilayer.tre", _ene_ana_names=gromos_2015_tre_block_names_table)

In [None]:
tre.database

In [None]:
[x for x in dir(tre) if not x.startswith("_")]

Tre files contain all energy-related data (split-up energy terms, temperature, pressure, etc.). In PyGromos they share the same general block structure as other files, but the data of the individual timesteps is stored efficiently inside a pandas DataFrame, here called tre.database. This database can be manipulated with all pandas functions. Alternatively, many common functions are provided by the Tre class.

This class should in principle replace further use of the gromos++ ene_ana program, since all these operations can be done efficiently on the pandas DataFrame.

We are currently working on adding more common functions to the Tre class. If you find a useful function please contact the developers so the function can be added for general usage :)
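
To illustrate the kind of statistics that plain pandas provides on an energy time series, the sketch below works on a small hand-made table. The column name total_energy and the values are made up for the example and are not read from a tre file.

```python
import pandas as pd

# Toy energy time series; "total_energy" is a hypothetical column name.
energies = pd.DataFrame(
    {
        "time": [0.0, 1.0, 2.0, 3.0],
        "total_energy": [-100.0, -102.0, -98.0, -100.0],
    }
)

# Mean and sample standard deviation over all frames.
summary = energies["total_energy"].agg(["mean", "std"])

# Running average: the mean of all frames up to and including the current one.
running_mean = energies["total_energy"].expanding().mean()
```

The same calls work on any scalar column of a DataFrame, so they can be applied once the relevant column of tre.database has been identified.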

### 2.2) Common Tre functions

In [None]:
# calculate the average density over all timesteps
tre.get_density().mean()

In [None]:
# calculate the mean temperature over all frames for all baths in the system. This example has two baths with slightly different temperatures.
tre.get_temperature().mean()

Tables and lists inside the database are stored as numpy arrays. For example, the two temperatures from the previous example are stored in a numpy array of size 2, since the system has two temperature baths.

Specific values inside a tre file can also be accessed directly with numpy and pandas syntax.
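
A minimal sketch of working with such array-valued cells, using a hand-made Series rather than a real tre file (the temperatures are invented): the arrays can be stacked into one 2D array for per-bath statistics, or a single component can be picked out per frame.

```python
import numpy as np
import pandas as pd

# Toy column: one numpy array of two bath temperatures per frame.
temperatures = pd.Series(
    [
        np.array([298.0, 300.0]),
        np.array([300.0, 302.0]),
        np.array([302.0, 304.0]),
    ]
)

# Stack the per-frame arrays into a single (n_frames, n_baths) array.
stacked = np.stack(temperatures.tolist())

# Mean temperature of each bath over all frames.
per_bath_mean = stacked.mean(axis=0)

# Alternatively, pull out only the first bath as a plain numeric Series.
first_bath = temperatures.apply(lambda values: values[0])
```

Stacking is convenient when all frames have arrays of the same shape; the apply variant stays within pandas and keeps the frame index.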

In [None]:
tre.database.iloc[2]

In [None]:
# select the first nonbonded energy value for the first force group over all time frames
tre.database["nonbonded"].apply(lambda x: x[0][0])

In [None]:
tre.get_totals()

### $\lambda$-Sampling & TREs

In [None]:
# import the trajectory file into a Tre class
tre = traj_tre.Tre(input_value="example_files/Traj_files/RAFE_TI_l0_5.tre")
tre.get_precalclam()

### EDS in TREs

In [None]:
# import the trajectory file into a Tre class
tre = traj_tre.Tre(input_value="example_files/Traj_files/RAFE_eds.tre")
tre.get_eds()

## Concatenate and Copy Multiple Trajectories

Trajectories offer a wide range of additional file manipulations. Trajectory objects can be deep-copied and added to each other to concatenate several short simulation pieces into one long trajectory.

In [None]:
tre_copy = traj_tre.Tre(input_value=tre)

In [None]:
tre_copy.database.shape

In [None]:
tre_combined = tre + tre_copy

In [None]:
tre_combined.database.shape

The new combined trajectory is one long trajectory made from the two smaller ones. It is one element shorter than the sum of both lengths, since the last element of the first trajectory and the first element of the second trajectory are normally the same element. This can be controlled via the option "skip_new_0=True" in the add_traj() function, which is the core of the "+" operator for trajectories. In the following line the default behavior can be seen as a smooth numbering in the TIMESTEPs.

In [None]:
tre_combined.database.time

In [None]:
print(len(tre_combined.database), len(tre.database))
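
The length bookkeeping described above can be reproduced with plain pandas. The following is only a sketch of the idea and not the implementation of add_traj(): the helper concat_frames, the time column layout, and the way the time axis is shifted are all assumptions made for illustration.

```python
import pandas as pd


def concat_frames(first: pd.DataFrame, second: pd.DataFrame, skip_new_0: bool = True) -> pd.DataFrame:
    """Append `second` to `first`, continuing the time axis of `first`.

    Hypothetical helper: if skip_new_0 is True, the first row of `second`
    is dropped because it duplicates the last row of `first`.
    """
    shifted = second.copy()
    # Shift the time axis so the second piece starts where the first one ends.
    shifted["time"] = shifted["time"] + (first["time"].iloc[-1] - second["time"].iloc[0])
    if skip_new_0:
        shifted = shifted.iloc[1:]
    return pd.concat([first, shifted], ignore_index=True)


# Two identical toy pieces with three frames each.
piece = pd.DataFrame({"time": [0.0, 1.0, 2.0], "energy": [-1.0, -2.0, -3.0]})
combined = concat_frames(piece, piece.copy())
```

With skip_new_0=True the combined table has 2 * 3 - 1 = 5 rows and a continuous time column; with skip_new_0=False the shared frame would appear twice.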