MDSynthesis: a persistence engine for molecular dynamics data
As computing power increases, it has become possible to produce hundreds of molecular dynamics simulation trajectories that vary widely in length, system size, composition, starting conditions, and other parameters. Managing this complexity so that the data can actually be used to answer scientific questions has itself become a bottleneck. MDSynthesis is a response to this problem.
Built on top of datreant, MDSynthesis provides a Pythonic interface to molecular dynamics trajectories through MDAnalysis, making it easy to work with data from many simulations scattered throughout the filesystem. It makes it possible to write analysis code that works across many varieties of simulation and, more importantly, to work interactively with the results from hundreds of simulations at once with little effort.
Efficiently store intermediate data from individual simulations for easy recall
The MDSynthesis Sim object gives an interface to raw simulation data through MDAnalysis. Data structures generated from raw trajectories (pandas objects, NumPy arrays, or any pure Python structure) can then be stored and easily recalled later. Under the hood, datasets are stored in the efficient HDF5 format when possible.
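The store-and-recall pattern described above can be illustrated with a small standard-library sketch. This is not the MDSynthesis or datreant API: the `SimStore` class, its methods, and the one-pickle-file-per-key layout are all hypothetical stand-ins, chosen only to show how intermediate results might be kept next to a simulation and read back in a later session. The HDF5 storage mentioned above is not reproduced here.

```python
import pickle
import tempfile
from pathlib import Path


class SimStore:
    """Hypothetical per-simulation data store (NOT the MDSynthesis API).

    Each stored object becomes one pickle file in the simulation's
    directory, so results persist between Python sessions.
    """

    def __init__(self, root):
        self.root = Path(root)
        # Create the simulation directory if it does not exist yet.
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, key):
        # One file per dataset name (assumed layout, for illustration).
        return self.root / f"{key}.pkl"

    def __setitem__(self, key, value):
        with open(self._path(key), "wb") as fh:
            pickle.dump(value, fh, protocol=pickle.HIGHEST_PROTOCOL)

    def __getitem__(self, key):
        try:
            with open(self._path(key), "rb") as fh:
                return pickle.load(fh)
        except FileNotFoundError:
            # Behave like a mapping when the dataset was never stored.
            raise KeyError(key) from None

    def keys(self):
        # Names of all datasets stored for this simulation.
        return sorted(p.stem for p in self.root.glob("*.pkl"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sim_dir = Path(tmp) / "example_sim"

        # First "session": compute something and store it.
        store = SimStore(sim_dir)
        store["rmsd"] = [0.0, 1.2, 1.9]

        # Later "session": a fresh object pointing at the same directory.
        reopened = SimStore(sim_dir)
        print(reopened.keys(), reopened["rmsd"])
```

Keeping the data in the simulation's own directory means the results travel with the simulation when it is moved or copied, which is the property the sketch is meant to highlight.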
datreant under the hood
MDSynthesis is built on top of the general-purpose datreant library. The Sim is a Treant with special features for working with molecular dynamics data, but every feature of datreant applies to MDSynthesis.
A brief user guide is available on Read the Docs.
This project is still under heavy development, and there are certainly rough edges and bugs. Issues and pull requests are welcome!