Molecular dynamics trajectory analysis in Tensorflow

TFTraj

Molecular dynamics analysis routines implemented in Tensorflow.

See this blog post for motivations and an overview of the RMSD computation.

RMSD

RMSD is implemented via composition of existing Tensorflow Ops in tftraj.rmsd.

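import mdtraj as md
import numpy as np
import tensorflow as tf
import tftraj.rmsd

traj = md.load('trajectory.xtc', top='topology.pdb')
inds = [5, 19, 234]  # frames to use as targets
target = np.array(traj.xyz[inds])

frames = tf.constant(traj.xyz)
target = tf.constant(target)
prmsd = tftraj.rmsd.pairwise_msd(frames, target)
with tf.Session() as sess:
    result = sess.run(prmsd)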
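
For orientation, here is a plain NumPy sketch of what a pairwise RMSD with optimal superposition computes. It is not the tftraj implementation, which is built from Tensorflow Ops. The function names kabsch_rmsd and pairwise_rmsd are hypothetical, and the (n_frames, n_targets) output layout is an assumption.

```python
import numpy as np


def kabsch_rmsd(a, b):
    """RMSD between two (n_atoms, 3) conformations after optimal superposition."""
    # Remove translation by centering both conformations on their centroids.
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    # The singular values of the 3x3 covariance matrix give the best
    # overlap that any rotation can achieve.
    u, s, vt = np.linalg.svd(a.T @ b)
    # If the optimal orthogonal transform would be a reflection, flip the
    # smallest singular value so that only proper rotations are allowed.
    if np.linalg.det(u @ vt) < 0:
        s[-1] = -s[-1]
    msd = (np.sum(a * a) + np.sum(b * b) - 2.0 * s.sum()) / a.shape[0]
    # Guard against tiny negative values caused by floating point round-off.
    return np.sqrt(max(msd, 0.0))


def pairwise_rmsd(frames, targets):
    """Return an (n_frames, n_targets) array of RMSD values."""
    return np.array([[kabsch_rmsd(f, t) for t in targets] for f in frames])
```

With an mdtraj trajectory, pairwise_rmsd(traj.xyz, traj.xyz[inds]) would give one row per frame and one column per target. The Python loops make this far too slow for real trajectories; it is only meant to show the quantity being computed.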

Composing existing ops can be slow, so there is also a custom CPU op, loaded through tftraj.rmsd_op, that is much faster (see Benchmarks).

import mdtraj as md
import tensorflow as tf
import tftraj.rmsd_op
traj = md.load('trajectory.xtc', top='topology.pdb')
rmsd = tftraj.rmsd_op.load()
prmsd, _ = rmsd.pairwise_msd(traj.xyz, traj.xyz)
with tf.Session() as sess:
    result = sess.run(prmsd)

Benchmarks

The benchmark runs a pairwise RMSD calculation among Fs peptide trajectories: 2800 frames (stride = 100) against 28 targets (stride = 100 * 100).

| Algorithm | Time / ms |
|---|---|
| mdtraj | 33.3 |
| tf-cpu (custom op) | 1.6 |
| tf-native (composed Tensorflow ops) | 22843.6 |
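
The following sketch works through the arithmetic behind the benchmark setup and the timings. The total of 280,000 frames is an assumption: it is simply the count that reproduces the 2800 frames and 28 targets quoted above, and it is not stated in this README.

```python
# Assumed total number of frames (chosen to match the quoted 2800 and 28).
n_total = 280000

n_frames = len(range(0, n_total, 100))         # stride = 100
n_targets = len(range(0, n_total, 100 * 100))  # stride = 100 * 100
n_pairs = n_frames * n_targets                 # RMSD values per benchmark run

# Timings copied from the benchmark table, in milliseconds.
timings_ms = {"mdtraj": 33.3, "tf-cpu": 1.6, "tf-native": 22843.6}
fastest = min(timings_ms.values())
relative = {name: t / fastest for name, t in timings_ms.items()}

print("{} frames x {} targets = {} pairs".format(n_frames, n_targets, n_pairs))
for name in sorted(relative, key=relative.get):
    print("{:10s} {:10.1f}x".format(name, relative[name]))
```

Under that assumption each run produces 78,400 RMSD values. Relative to tf-cpu, mdtraj takes about 21 times as long and tf-native about 14,000 times as long.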

The tf-cpu custom op and mdtraj use largely the same code. The tf version has much less Python overhead, though, because all of the looping is done in C++ (with OpenMP parallelization) and it works on the xyz numpy arrays directly instead of the convenience md.Trajectory objects.

Building

The custom op requires a working C++ compiler and CMake. Running python setup.py develop should automatically invoke CMake, which uses cmake-build-release/ as a scratch (build) directory. To change CMake options, change into that directory and run cmake .. or ccmake .. from there. The final shared library is copied back into the source tree, much to CMake's chagrin, so that it can be packaged up in Python. The rmsd_op.py file is in charge of finding the shared library and calling the Tensorflow function to load it.
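
As a rough illustration of the "find the shared library, then load it" step, here is a minimal sketch. It is not the contents of rmsd_op.py: the helper names and the `*rmsd*.so` filename pattern are hypothetical. tf.load_op_library is the Tensorflow function for loading a custom op library.

```python
import glob
import os


def find_op_library(package_dir, pattern="*rmsd*.so"):
    """Return the first shared library in package_dir matching pattern, or None.

    The default pattern is a guess for illustration, not the real filename.
    """
    matches = sorted(glob.glob(os.path.join(package_dir, pattern)))
    return matches[0] if matches else None


def load_op(package_dir):
    """Load the custom op module from the shared library in package_dir."""
    path = find_op_library(package_dir)
    if path is None:
        raise ImportError(
            "custom op library not found in {}; "
            "run `python setup.py develop` first".format(package_dir))
    # Imported lazily so the file lookup can be used without Tensorflow.
    import tensorflow as tf
    return tf.load_op_library(path)
```

Raising a clear error when the library is missing points the user at a build problem instead of a cryptic loader failure.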

The build script needs to know where the Tensorflow headers are installed. It finds them automatically by running

python -c "import tensorflow as tf; print(tf.sysconfig.get_include())"

so make sure python resolves to the right interpreter when you run CMake.
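
To check which include directory a given interpreter reports before running CMake, something like the following sketch may help. The helper name tensorflow_include_dir is hypothetical; it runs the same one-liner shown above in a subprocess.

```python
import subprocess
import sys


def tensorflow_include_dir(python=sys.executable):
    """Return the Tensorflow include directory seen by `python`, or None."""
    cmd = [python, "-c",
           "import tensorflow as tf; print(tf.sysconfig.get_include())"]
    try:
        result = subprocess.run(cmd, stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE,
                                universal_newlines=True)
    except OSError:
        return None  # the interpreter itself could not be started
    if result.returncode != 0:
        return None  # Tensorflow is not importable from this interpreter
    # Use the last line in case anything else was printed to stdout first.
    lines = result.stdout.strip().splitlines()
    return lines[-1] if lines else None


if __name__ == "__main__":
    print(sys.executable, "->", tensorflow_include_dir())
```

A result of None means the interpreter could not import Tensorflow, which usually indicates that the wrong environment is active.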

The package requires compiler and CPU support for SSE instructions and OpenMP.
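
On Linux, the CPU side of this requirement can be checked by reading the flags in /proc/cpuinfo. The sketch below is only an illustration with hypothetical helper names. It does not check compiler or OpenMP support, and the README does not say which SSE level is needed.

```python
def cpu_flags(cpuinfo_text):
    """Collect the feature flags listed in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


def sse_flags(cpuinfo_text):
    """Return the sorted SSE-family flags (sse, sse2, ssse3, ...)."""
    return sorted(f for f in cpu_flags(cpuinfo_text)
                  if f.startswith(("sse", "ssse")))
```

Passing the contents of /proc/cpuinfo to sse_flags lists the SSE variants the CPU advertises. An empty list suggests the custom op will not run on that machine.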

Running tests

I've tested this package on Linux with Python 3.5. Other configurations may work.

The tests require the following molecular dynamics dataset:

https://figshare.com/articles/Fs_MD_Trajectories/1030363

Download and extract the fs_peptide directory, and make sure it's at the top level of the source directory.

To run the tests, cd to the source directory and use:

pytest

You can also use conda-build to build and run tests in a pristine environment. Make sure you have the conda-forge channel added to your .condarc and run

conda build conda-recipe/

Examples

There are a couple of examples in the examples/ folder.