# 10. Writing mapped trajectories from GROMACS all-atom (AA) trajectories

### Prerequisites
* MDAnalysis
* gsd

This example maps a system of 4 EKEK (GLU-LYS-GLU-LYS) peptides. Each CG bead is placed at the center of mass of one amino acid residue. The all-atom topology and trajectory files can be found in the `CG_tutorial` folder. Note that solvent molecules and hydrogen atoms have been removed from the AA system.
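
The center-of-mass mapping described above can be sketched with plain NumPy. This is a minimal illustration on toy data, not the code used later in this notebook: the function name `com_mapping_matrix` and the toy arrays are hypothetical, and periodic boundary effects (residues split across the box) are ignored.

```python
import numpy as np

def com_mapping_matrix(bead_index, atom_masses, n_beads):
    """Build an (n_beads, n_atoms) matrix whose rows are mass-weighted averages.

    bead_index[j] is the (zero-based) bead that atom j belongs to.
    """
    bead_index = np.asarray(bead_index)
    atom_masses = np.asarray(atom_masses, dtype=float)
    M = np.zeros((n_beads, atom_masses.size))
    # Put each atom's mass in the row of its bead
    M[bead_index, np.arange(atom_masses.size)] = atom_masses
    # Normalize each row by the total bead mass
    M /= M.sum(axis=1, keepdims=True)
    return M

# Toy system (hypothetical): 4 atoms grouped into 2 beads
bead_index = [0, 0, 1, 1]
atom_masses = [12.0, 16.0, 14.0, 14.0]
positions = np.array([[0.0, 0.0, 0.0],
                      [1.0, 0.0, 0.0],
                      [0.0, 2.0, 0.0],
                      [0.0, 4.0, 0.0]])

M = com_mapping_matrix(bead_index, atom_masses, n_beads=2)
cg_positions = M @ positions                              # shape (2, 3)
cg_masses = np.bincount(bead_index, weights=atom_masses)  # total mass per bead

# The same matrix maps a whole trajectory of shape (frames, atoms, 3)
trajectory = np.stack([positions, positions + 1.0])
cg_trajectory = M @ trajectory                            # shape (2, 2, 3)
```

Because the mapping is a single matrix product, applying it to every frame of a trajectory gives frame-by-frame bead positions.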

In [40]:
import hoomd
import numpy as np 
import hoomd.md
import gsd,gsd.hoomd
import gsd.pygsd
import hoomd.htf as htf
import tensorflow as tf
import MDAnalysis as mda

# disable GPU
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'

#disable warnings
import warnings
warnings.filterwarnings('ignore')

The function below creates a snapshot of a single system state. It is called once for every frame of the trajectory.

In [30]:
def create_frame(frame_number, N, types, type_array, positions, masses, box):
    ''' Create a snapshot of a system state.
    :param frame_number: Frame number in a trajectory
    :type frame_number: int
    :param N: Number of CG beads (not used; the count is taken from len(type_array))
    :type N: int
    :param types: Names of particle types
    :type types: List of strings (len N)
    :param type_array: CG bead type ids
    :type type_array: Numpy array (N,)
    :param positions: CG bead positions
    :type positions: Numpy array (N,3)
    :param masses: CG bead masses
    :type masses: Numpy array (N,)
    :param box: System box dimensions
    :type box: Numpy array (6,)
    :return: Snapshot of a system state
    '''
    s = gsd.hoomd.Snapshot()
    s.configuration.step = frame_number
    s.configuration.box = box
    s.particles.N = len(type_array)
    s.particles.types = types
    s.particles.typeid = type_array
    s.particles.position = positions
    s.particles.mass = masses
    return s
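
The `box` array stored in the snapshot has six entries. GSD reads them as `[lx, ly, lz, xy, xz, yz]` (box lengths followed by tilt factors), whereas MDAnalysis reports `Universe.dimensions` as `[a, b, c, alpha, beta, gamma]` with angles in degrees. A possible conversion, following the lattice-vector relations given in the HOOMD-blue box documentation, is sketched below. The function name `lengths_angles_to_gsd_box` is hypothetical and is not part of either library.

```python
import numpy as np

def lengths_angles_to_gsd_box(dimensions):
    """Convert [a, b, c, alpha, beta, gamma] (degrees) to [lx, ly, lz, xy, xz, yz]."""
    a, b, c, alpha, beta, gamma = np.asarray(dimensions, dtype=float)
    alpha, beta, gamma = np.radians([alpha, beta, gamma])

    lx = a
    a2x = b * np.cos(gamma)                # x component of the second box vector
    ly = np.sqrt(b**2 - a2x**2)
    xy = a2x / ly

    a3x = c * np.cos(beta)                 # x component of the third box vector
    a3y = (b * c * np.cos(alpha) - a2x * a3x) / ly
    lz = np.sqrt(c**2 - a3x**2 - a3y**2)
    xz = a3x / lz
    yz = a3y / lz
    return np.array([lx, ly, lz, xy, xz, yz])

# Orthorhombic box: all tilt factors vanish
ortho_box = lengths_angles_to_gsd_box([10.0, 20.0, 30.0, 90.0, 90.0, 90.0])

# Box with a 60 degree angle between the first two box vectors
tilted_box = lengths_angles_to_gsd_box([10.0, 10.0, 10.0, 90.0, 90.0, 60.0])
```

For an orthorhombic box the result is simply the three box lengths followed by zeros.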

In [38]:
def CGmap_trajectory(
        N,
        CGtypes,
        CGids,
        mapped_positions,
        CGmasses,
        outfile,
        mda_universe=None,
        splice_traj=False,
        traj_frames=None):
    
    ''' Writes a mapped trajectory.
    
    :param N: Number of CG beads
    :type N: int
    :param CGtypes: Names of particle types
    :type CGtypes: List of strings (len N)
    :param CGids: CG bead type id
    :type CGids: Numpy array (N,)
    :param mapped_positions: CG beads positions
    :type mapped_positions: Numpy array (N,3)
    :param CGmasses: CG beads masses
    :type CGmasses: Numpy array (N,)
    :param outfile: Name of the output file with .gsd extension
    :type outfile: str
    :param mda_universe: Universe with topology and trajectory
    :type mda_universe: MDAnalysis Universe
    :param splice_traj: Flag to map only a slice of the trajectory
    :type splice_traj: boolean
    :param traj_frames: First and last frame numbers of the slice
    :type traj_frames: Numpy array (2,)
    :return: None; the mapped trajectory is written to outfile in gsd format
    '''

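    if mda_universe is not None:

        # The user can map either all frames or a selected range of frames
        if splice_traj is False:
            frames = mda_universe.trajectory
        else:
            frames = mda_universe.trajectory[traj_frames[0]:traj_frames[-1]]

        # Runtime error with hoomd: not all particles found inside the box
        # MDAnalysis reports [lx, ly, lz, alpha, beta, gamma] (angles in
        # degrees), whereas GSD expects [lx, ly, lz, xy, xz, yz]. For an
        # orthorhombic box (all angles 90 degrees) the tilt factors are zero.
        box = np.array(mda_universe.dimensions, dtype=np.float32)
        if np.allclose(box[3:], 90.0):
            box[3:] = 0.0

        t = gsd.hoomd.open(name=outfile, mode='wb')
        # Note: the same mapped_positions array is written to every frame
        for i, ts in enumerate(frames):
            t.append(create_frame(i, N, CGtypes,
                                  CGids, mapped_positions, CGmasses, box))
        t.close()
        print('GSD file written')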
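
The "not all particles found inside the box" error noted in the code comment may come from a difference in coordinate conventions: GSD and HOOMD-blue boxes are centered on the origin, while GROMACS coordinates are typically measured from a box corner. A minimal sketch of one way to recenter and wrap the mapped positions is given below. It assumes an orthorhombic box, and the helper name `center_and_wrap` is hypothetical.

```python
import numpy as np

def center_and_wrap(positions, box_lengths):
    """Shift corner-origin coordinates to a centered box and wrap into [-L/2, L/2)."""
    positions = np.asarray(positions, dtype=float)
    box_lengths = np.asarray(box_lengths, dtype=float)
    # Move the origin from the box corner to the box center
    shifted = positions - 0.5 * box_lengths
    # Wrap anything still outside the box back through the periodic boundary
    return shifted - box_lengths * np.floor(shifted / box_lengths + 0.5)

# Toy data (hypothetical): one bead at the corner, one slightly outside the box in z
box_lengths = [10.0, 10.0, 10.0]
positions = [[0.0, 0.0, 0.0],
             [9.9, 5.0, 10.5]]
wrapped = center_and_wrap(positions, box_lengths)
```

Whether this matches the cause of the error for a particular trajectory should be checked against the box and coordinates stored in that trajectory.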

Now let's read the trajectory and generate the CG bead types, indices, COMs and total masses. There are 16 residues in the system, so each bead is indexed by an integer from 1 to 16. CG bead names take the form `amino acid type + index` (Glu = E, Lys = K).

In [9]:
u = mda.Universe('CG_tutorial/EKEKxsolxH.pdb','CG_tutorial/EKEKxsolxH_last.trr')

In [41]:
# Coarse-grain the atoms by residue: one CG bead per residue
N = len(u.atoms.residues)

typeids = np.arange(1,N+1)

#creating bead names
types = []
for i in typeids:
    if i%2 != 0:
        types.append('E'+str(i))
    else:
        types.append('K'+str(i))
print('CG bead ids: ', typeids)
print('CG bead names: ', types)
# Now let's calculate the COMs and total masses of the CG groups
masses = []
mapped_pos = []

for ids in typeids:
    ag = u.select_atoms('resid '+str(ids))
    coms = ag.center_of_mass()
    masses.append(ag.total_mass())
    mapped_pos.append(coms)
    

CG bead ids:  [ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16]
CG bead names:  ['E1', 'K2', 'E3', 'K4', 'E5', 'K6', 'E7', 'K8', 'E9', 'K10', 'E11', 'K12', 'E13', 'K14', 'E15', 'K16']
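
The naming loop above relies on the residues alternating strictly between GLU and LYS. A slightly more general sketch builds the same names from the residue names themselves. The lookup table `ONE_LETTER` and the helper `bead_names` are hypothetical; with MDAnalysis, the list of residue names could be taken from `u.residues.resnames`, but a hard-coded list is used here so the example stands alone.

```python
# One-letter codes for the two residue types in this system
ONE_LETTER = {'GLU': 'E', 'LYS': 'K'}

def bead_names(resnames, first_index=1):
    """Return names such as 'E1', 'K2', ... from three-letter residue names."""
    return [ONE_LETTER[name] + str(i)
            for i, name in enumerate(resnames, start=first_index)]

# Four EKEK peptides give 16 residues in total
resnames = ['GLU', 'LYS', 'GLU', 'LYS'] * 4
names = bead_names(resnames)
```

A residue name missing from the table raises a `KeyError`, which makes unexpected residues visible instead of silently mislabeling them.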


In [39]:
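# Generate the GSD file
# GSD type ids are zero-based indices into the list of type names,
# so the 1-based bead ids are shifted by one
CGmap_trajectory(N=N, CGtypes=types, CGids=typeids - 1, mapped_positions=mapped_pos, 
                     CGmasses=masses, outfile='CG_tutorial/EKEKmapped_traj.gsd', mda_universe=u, splice_traj=False)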

GSD file written
