reimaging and interfacing with mdraj #260

evanfeinberg · 2015-05-06T08:08:33Z

Hi,

On suggestion of @swails , I'm moving mdtraj/mdtraj#814 to here. Thanks to @hainm , I have much cleaner code now for reimaging a single pdb. load_mdtraj was very helpful.

Since I want to add pairwise distances as features to my trajectory, I need to reimage all trajectories. However, I realize that while it's easy to go from mdtraj --> pytraj, it's nontrivial to go from pytraj --> mdtraj , either directly or via HDF5 format. So I had to develop a little workaround. It seems to work, but it would be nice if it were cleaner. The gist is that I load a .h5 trajectory in mdtraj, use pytraj's load_mdtraj feature to transfer it over, autoimage() it, save it as a dcd trajectory, separately save a frame as a pdb to act as the topology, re-load that dcd/pdb combo into mdtraj, and then save it once again as a .h5 to complete the cycle. It would be great if I could circumvent the middle steps and just dump a pytraj object straight to h5, or convert it back somehow to an mdtraj object. Any thoughts?

Cheers,
Evan

import mdtraj as md
import os
import h5py
import multiprocessing as mp
from functools import partial 
from pytraj import adict 

def get_trajectory_files(traj_dir, ext = ".h5"):
    traj_files = []
    for traj in os.listdir(traj_dir):
            if traj.endswith(ext):
                traj_files.append("%s/%s" %(traj_dir,traj))
    return sorted(traj_files)

def reimage_traj(traj_file, save_dir):
    traj = md.load(traj_file)
    topology = md.load_frame(traj_file,index=0)

    traj_pytraj = mdio.load_mdtraj(traj)
    traj_pytraj.autoimage()

    filename = traj_file.split("/")[len(traj_file.split("/"))-1]
    filename = filename.split(".")[0]
    dcd_filename = "%s.dcd" %filename
    top_filename = "%s.pdb" %filename
    h5_filename = "%s.h5" %filename
    new_dcd_file = "%s/%s" %(save_dir, dcd_filename)
    new_top_file = "%s/%s" %(save_dir, top_filename)
    new_h5_file = "%s/%s" %(save_dir, h5_filename)
    print new_dcd_file
    print new_top_file
    traj_pytraj.save(new_dcd_file)

    topology.save(new_top_file)

    new_traj = md.load(new_dcd_file, top = new_top_file)
    new_traj.save(new_h5_file)
    return


def reimage_trajs(traj_dir):
    new_dir = "%s_reimaged" %traj_dir

    if not os.path.exists(new_dir): os.makedirs(new_dir)

    trajs = get_trajectory_files(traj_dir)

    reimage = partial(reimage_traj, save_dir = new_dir)

    num_workers = mp.cpu_count()
    pool = mp.Pool(num_workers)
    pool.map(reimage, trajs)
    pool.terminate()
    return

traj_dir = "/scratch/users/enf/b2ar_analysis/subsampled_allprot_combined"
reimage_trajs(traj_dir)

evanfeinberg · 2015-05-06T08:12:17Z

Sorry the formatting got kind of messed up. I made a gist version here. In the mean time, while pytraj --> mdtraj is nontrivial, I hope someone finds this useful:

https://gist.github.com/evanfeinberg/6d1379b1556cef92d0ec#file-reimage-py

hainm · 2015-05-06T08:21:35Z

hi,
regarding the format, you can use markdown syntax for python code. I just updated yours.

https://help.github.com/articles/github-flavored-markdown/

Per loading back from pytraj to mdtraj, you can try

traj = md.load(traj_file)
traj_pytraj = mdio.load_mdtraj(traj)
traj_pytraj.autoimage()
traj.xyz[:] = traj_pytraj.xyz / 10. # need to convert from `Angstrom` to `nm`

# that's it. and you just use xyz-updated`traj` as `mdtraj` object.

Note: calling xyz in traj_pytraj will return a copy of coords. So you can not update xyz easily like mdtraj (due to different implementation)

Hai

evanfeinberg · 2015-05-06T08:40:23Z

@hainm , that's awesome!! I just tested it and it worked. so much better. Thank you!!!

hainm · 2015-05-06T08:50:41Z

great.

@evanfeinberg: I am just curious what kind of pairwise distance you want do calculate? pytraj has Action_Matrix to calculate distance for a given mask and return a distance matrix too.

import pytraj.common_actions as pyca
from pytraj import load_sample_data
traj  = load_sample_data()
d = pyca.calc_matrix(traj[0], '@CA', traj.top)
print (d.to_ndarray()) # 1D array, need to reshape to n_atoms * n_atoms

evanfeinberg · 2015-05-11T23:12:53Z

Hi for some reason I only just saw this now -- will do this asap and report back.

swails mentioned this issue May 6, 2015

Why FrameArray and TrajReadOnly? #262

Closed

hainm added the iterface label Jun 3, 2015

hainm closed this as completed Jun 25, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

reimaging and interfacing with mdraj #260

reimaging and interfacing with mdraj #260

evanfeinberg commented May 6, 2015

evanfeinberg commented May 6, 2015

hainm commented May 6, 2015

evanfeinberg commented May 6, 2015

hainm commented May 6, 2015

evanfeinberg commented May 11, 2015

reimaging and interfacing with mdraj #260

reimaging and interfacing with mdraj #260

Comments

evanfeinberg commented May 6, 2015

evanfeinberg commented May 6, 2015

hainm commented May 6, 2015

evanfeinberg commented May 6, 2015

hainm commented May 6, 2015

evanfeinberg commented May 11, 2015