Molecular dynamics analysis routines implemented in TensorFlow.
See this blog post for motivations and an overview of the RMSD computation.
RMSD is implemented via composition of existing TensorFlow ops in `tftraj.rmsd`.
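import mdtraj as md
import numpy as np
import tensorflow as tf
import tftraj.rmsd

traj = md.load('trajectory.xtc', top='topology.pdb')
inds = [5, 19, 234]  # indices of the frames used as targets
target = np.array(traj.xyz[inds])
frames = tf.constant(traj.xyz)
target = tf.constant(target)
prmsd = tftraj.rmsd.pairwise_msd(frames, target)
sess = tf.Session()  # assumes the TensorFlow 1.x session API
result = sess.run(prmsd)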
Composing existing ops this way can be slow, so there is also a custom native CPU op (`tf-cpu` in the benchmark below) that is much faster.
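import mdtraj as md
import tensorflow as tf
import tftraj.rmsd_op
traj = md.load('trajectory.xtc', top='topology.pdb')
rmsd = tftraj.rmsd_op.load()
prmsd, _ = rmsd.pairwise_msd(traj.xyz, traj.xyz)  # second output is unused here
sess = tf.Session()  # assumes the TensorFlow 1.x session API
result = sess.run(prmsd)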
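As a point of reference, the quantity in both examples, a pairwise mean squared deviation after optimal superposition, can be sketched in plain NumPy. This is an assumption about what `pairwise_msd` computes conceptually (translation removed, then the best proper rotation found from the singular values of a 3x3 covariance matrix, as in the Kabsch method). It is not tftraj's code, and the name `pairwise_msd_numpy` is hypothetical.

```python
import numpy as np


def pairwise_msd_numpy(frames, targets):
    """Mean squared deviation after optimal superposition (illustrative sketch).

    frames:  array of shape (n_frames, n_atoms, 3)
    targets: array of shape (n_targets, n_atoms, 3)
    returns: array of shape (n_frames, n_targets)
    """
    frames = np.asarray(frames, dtype=np.float64)
    targets = np.asarray(targets, dtype=np.float64)
    n_atoms = frames.shape[1]

    # Remove translation by centering every conformation on its centroid.
    frames = frames - frames.mean(axis=1, keepdims=True)
    targets = targets - targets.mean(axis=1, keepdims=True)

    # Sum of squared coordinates for each centered conformation.
    g_frames = np.einsum('fai,fai->f', frames, frames)
    g_targets = np.einsum('tai,tai->t', targets, targets)

    # One 3x3 covariance matrix per (frame, target) pair.
    cov = np.einsum('fai,taj->ftij', frames, targets)

    # Singular values come back in descending order. Flipping the sign of the
    # smallest one when det(cov) < 0 restricts the fit to proper rotations.
    sing = np.linalg.svd(cov, compute_uv=False)
    sing[..., -1] *= np.sign(np.linalg.det(cov))

    msd = (g_frames[:, None] + g_targets[None, :]
           - 2.0 * sing.sum(axis=-1)) / n_atoms
    # Clip tiny negative values caused by floating-point cancellation.
    return np.maximum(msd, 0.0)
```

Taking `np.sqrt` of the result gives the RMSD. The sketch builds an array of shape `(n_frames, n_targets, 3, 3)`, so it is only practical for modest sizes.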
The benchmark runs a pairwise RMSD calculation among Fs peptide trajectories: 2800 frames (stride = 100) against 28 targets (stride = 100 * 100).
Algorithm | Time / ms |
---|---|
mdtraj | 33.3 |
tf-cpu | 1.6 |
tf-native | 22843.6 |
The `tf-cpu` custom op and mdtraj use largely the same code, but there is much less Python overhead in the `tf-cpu` case because all of our looping is done in C++ (with OpenMP parallelization) and we deal with the `xyz` numpy arrays directly instead of the convenience `md.Trajectory` objects.
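The frame and target counts in the benchmark follow from strided slicing. Below is a minimal sketch with a placeholder array. The total of 280,000 frames is an assumption inferred from the two strides and counts quoted above, the atom count is deliberately tiny to save memory, and `time_ms` is a hypothetical helper rather than the script that produced the table.

```python
import timeit

import numpy as np

N_TOTAL = 280_000  # assumed total frame count, inferred from the strides
N_ATOMS = 4        # placeholder, not the real number of atoms

xyz = np.zeros((N_TOTAL, N_ATOMS, 3), dtype=np.float32)

frames = xyz[::100]         # every 100th frame
targets = xyz[::100 * 100]  # every 10,000th frame
print(frames.shape[0], targets.shape[0])  # prints: 2800 28


def time_ms(fn, repeat=5):
    """Best-of-`repeat` wall time of a zero-argument callable, in milliseconds."""
    return min(timeit.repeat(fn, number=1, repeat=repeat)) * 1000.0
```

A call such as `time_ms(lambda: sess.run(prmsd))` would then time a single evaluation of whichever implementation is being compared.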
The custom op requires a working C++ compiler and CMake. Running `python setup.py develop` should automatically invoke CMake. It will use `cmake-build-release/` as a scratch (build) directory. To change CMake options, change into that directory and use `cmake ..` or `ccmake ..`.
The final shared library is copied back into the source tree, much to CMake's chagrin. This is so we can package it up in Python. The `rmsd_op.py` file is in charge of finding the shared library and calling the TensorFlow function to load it.
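The find-and-load step can be sketched roughly as follows. This is not the contents of `rmsd_op.py`: the file name pattern and both function names are hypothetical. The only real API used is `tf.load_op_library`, TensorFlow's function for loading a shared library that contains custom ops.

```python
import glob
import os


def find_op_library(search_dir, pattern="*rmsd*.so"):
    """Return the first shared library in `search_dir` matching `pattern`.

    The default pattern is a guess for illustration only.
    """
    matches = sorted(glob.glob(os.path.join(search_dir, pattern)))
    if not matches:
        raise FileNotFoundError(
            "no shared library matching {!r} in {!r}".format(pattern, search_dir))
    return matches[0]


def load_op_sketch(search_dir=None):
    """Locate the compiled op next to this file and load it into TensorFlow."""
    import tensorflow as tf  # imported lazily so the search works without it

    if search_dir is None:
        search_dir = os.path.dirname(os.path.abspath(__file__))
    return tf.load_op_library(find_op_library(search_dir))
```

Keeping the search separate from the load makes a missing build product show up as a clear `FileNotFoundError` instead of an error from inside TensorFlow.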
The build script needs to know where the TensorFlow headers are installed. It uses `python -c "import tensorflow as tf; print(tf.sysconfig.get_include())"` automatically. Make sure `python` is the right one when you run CMake.
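To check which headers a given interpreter would report before running CMake, the same query can be issued by hand. A minimal sketch follows. The helper name `tensorflow_include_dir` is hypothetical, and it returns `None` when the interpreter is missing or cannot import TensorFlow.

```python
import subprocess
import sys

# The same one-liner the build script runs.
QUERY = "import tensorflow as tf; print(tf.sysconfig.get_include())"


def tensorflow_include_dir(python=sys.executable):
    """Return the TensorFlow include directory reported by `python`, or None."""
    try:
        proc = subprocess.run(
            [python, "-c", QUERY],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
    except OSError:
        # The interpreter could not be started at all.
        return None
    if proc.returncode != 0:
        # Most likely TensorFlow is not importable from this interpreter.
        return None
    lines = proc.stdout.strip().splitlines()
    return lines[-1] if lines else None


if __name__ == "__main__":
    print(tensorflow_include_dir())
```

Passing the path of a different interpreter, for example the one inside a conda environment, shows whether CMake would pick up the intended TensorFlow installation.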
The package requires compiler and CPU support for SSE instructions and OpenMP.
I've tested this package on Linux and Python 3.5. Other configurations may work.
The tests require the following molecular dynamics dataset:
https://figshare.com/articles/Fs_MD_Trajectories/1030363
Download and extract the `fs_peptide` directory, and make sure it's at the top level of the source directory.
To run the tests, `cd` to the source directory and use:
pytest
You can also use `conda-build` to build and run tests in a pristine environment. Make sure you have the `conda-forge` channel added to your `.condarc` and run
conda build conda-recipe/
There are a couple of examples in the `examples/` folder.