NOTE: This repository is hosted on sourcehut and recently on GitHub as well.
A reader and writer for the Gromacs xtc file format implemented in pure Rust.
molly tries to decompress and read the minimal number of bytes from disk. To this end, the library features extensive selection methods for frames within a trajectory and atoms within each frame. This selection allows for some exciting optimizations: only the necessary positions are decompressed. Similarly, there are cases where only a limited number of compressed bytes are read in the first place. This is particularly powerful in applications where a subset of positions at the top of each frame is selected in a large trajectory. Such buffered reading can be very beneficial when disk read speed is particularly poor, such as over networked file storage.
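The idea of reading only the bytes that a selection needs can be illustrated with a toy sketch. This is not molly's implementation and not the XTC layout: the "frame" below is a made-up stream of fixed-width big-endian integers, and `CountingReader` and `read_first_values` are hypothetical names introduced only for this example. Real XTC positions are bit-packed and variable in width, so the actual logic is more involved.

```rust
use std::io::{self, Cursor, Read};

/// Wraps a reader and counts how many bytes were actually pulled from it.
struct CountingReader<R> {
    inner: R,
    bytes_read: usize,
}

impl<R: Read> Read for CountingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.inner.read(buf)?;
        self.bytes_read += n;
        Ok(n)
    }
}

/// Decode only the first `n` values from a toy stream of big-endian u32s.
/// Toy format for illustration only; it stands in for "the first n atoms".
fn read_first_values<R: Read>(reader: &mut R, n: usize) -> io::Result<Vec<u32>> {
    let mut values = Vec::with_capacity(n);
    let mut word = [0u8; 4];
    for _ in 0..n {
        // Only four more bytes are requested per value; the tail is never touched.
        reader.read_exact(&mut word)?;
        values.push(u32::from_be_bytes(word));
    }
    Ok(values)
}

fn main() -> io::Result<()> {
    // A toy "frame" of 1000 values, 4000 bytes in total.
    let frame: Vec<u8> = (0u32..1000).flat_map(|v| v.to_be_bytes()).collect();
    let mut reader = CountingReader { inner: Cursor::new(frame), bytes_read: 0 };

    let first = read_first_values(&mut reader, 3)?;
    println!("values: {:?}, bytes read: {}", first, reader.bytes_read);
    Ok(())
}
```

In this toy case, selecting the first three values pulls 12 of the 4000 bytes from the underlying reader. The saving grows with the size of the frame relative to the selection, which is the situation described above.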
molly also supports writing XTC files, enabling roundtrip workflows where trajectories can be read, processed, and written back out.
For convenient use in existing analysis tools, molly exposes a set of bindings that allow access to its functions from Python.
molly can also be installed as a command line tool for shortening and filtering xtc files. It supports the 1995 and 2023 magic numbers.
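To make the magic numbers concrete, here is a minimal sketch of checking one using only the Rust standard library. It assumes the magic number is stored as a 4-byte big-endian integer at the start of the data, as is usual for XDR-encoded files. The function `read_magic` is a hypothetical helper for this example and is not part of molly's API.

```rust
use std::io::{self, Read};

/// The two magic numbers mentioned above.
const MAGIC_1995: i32 = 1995;
const MAGIC_2023: i32 = 2023;

/// Read a leading 4-byte big-endian integer and check that it is a known magic number.
/// Assumption: the magic number sits at the very start of `reader`.
fn read_magic<R: Read>(reader: &mut R) -> io::Result<i32> {
    let mut buf = [0u8; 4];
    reader.read_exact(&mut buf)?;
    let magic = i32::from_be_bytes(buf);
    match magic {
        MAGIC_1995 | MAGIC_2023 => Ok(magic),
        other => Err(io::Error::new(
            io::ErrorKind::InvalidData,
            format!("unsupported xtc magic number: {}", other),
        )),
    }
}

fn main() -> io::Result<()> {
    // Stand-in for the first bytes of a file; a real caller would pass a `File`.
    let header = 1995i32.to_be_bytes();
    let mut bytes: &[u8] = &header;
    let magic = read_magic(&mut bytes)?;
    println!("magic = {}", magic);
    Ok(())
}
```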
NOTE: molly is in a pretty stable state and is used in the wild. Please do take care and verify the results. Blind trust in any tool is irresponsible.
For any questions, feel free to get in contact with me.
cargo install molly -F cli

With the molly command, xtc files can be filtered and shortened. Selections can be made on frames as well as the atoms within the frames.
- Frames can be selected with the -f/--frame-selection option, using start:stop:step ranges, which operate much like ranges in Python.
- The first n atoms can be selected with the -a/--atom-selection option.
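As a rough illustration of what a start:stop:step range selects, the sketch below expands such a range into frame indices. It assumes Python-like semantics (inclusive start, exclusive stop, omitted parts falling back to defaults) and does not handle negative indices. `select_frames` is a hypothetical function written for this example; it does not show how molly parses or applies selections.

```rust
/// Expand a Python-like `start:stop:step` range into frame indices.
/// Omitted parts default to start = 0, stop = n_frames, step = 1.
/// Assumption: stop is exclusive and is clamped to the number of frames.
fn select_frames(
    start: Option<usize>,
    stop: Option<usize>,
    step: Option<usize>,
    n_frames: usize,
) -> Vec<usize> {
    let start = start.unwrap_or(0);
    let stop = stop.unwrap_or(n_frames).min(n_frames);
    // `step_by` panics on zero, so treat a zero step as one.
    let step = step.unwrap_or(1).max(1);
    (start..stop).step_by(step).collect()
}

fn main() {
    // Corresponds to a selection written as 100:600:2 on a 1000-frame trajectory.
    let every_other = select_frames(Some(100), Some(600), Some(2), 1000);
    println!("{} frames, first {:?}, last {:?}",
        every_other.len(), every_other.first(), every_other.last());

    // Corresponds to a selection written as :1, which keeps only the first frame.
    println!("{:?}", select_frames(None, Some(1), None, 1000));
}
```

Under these assumptions, 100:600:2 yields 250 frames, from index 100 up to and including index 598.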
Here is a short showcase of possible uses.
# List all options.
molly --help
# Print a summary of a trajectory to standard out.
molly --info big.xtc
# Trajectories can be filtered in a number of ways. Here are a few combinations.
# Select the 100th to the 600th frame in steps of two. From those, store only the first 161 atoms.
molly big.xtc out.xtc --frame-selection 100:600:2 --atom-selection 161
molly big.xtc out.xtc -f 100:600:2 -a 161 # With shorter arguments.
# Reverse a selection. Here we use it to select the last frame.
molly big.xtc last.xtc --reverse-frame-selection --frame-selection :1
molly big.xtc last.xtc -Rf :1 # With shorter arguments.
# Reverse a trajectory.
molly big.xtc rev.xtc --reverse
# For any of these filtering commands, the frame times and steps can be written to standard out.
molly big.xtc rev_last_ten.xtc -rRf :10 --steps --times

To use molly in a Rust project, add this repository to the dependencies in
your Cargo.toml.
Find molly on crates.io.
use molly::{XTCReader, XTCWriter, Frame};
// Read frames from a trajectory.
let mut reader = XTCReader::open("input.xtc")?;
let frames = reader.read_all_frames()?;
// Write frames to a new file.
let mut writer = XTCWriter::create("output.xtc")?;
for frame in frames.iter() {
    writer.write_frame(frame)?;
}

cargo (part of the Rust toolchain, which also provides the compiler) is required
for building the Python bindings. (The stable toolchain is sufficient.)
To install the module, run the following command. It will automatically clone the repository and install the Python library from the correct directory.
pip3 install 'git+https://git.sr.ht/~ma3ke/molly#egg=molly&subdirectory=bindings/python'

Alternatively, clone the repository, go into the bindings directory, and
install it from there using pip.
git clone https://git.sr.ht/~ma3ke/molly
cd molly/bindings/python
# Perhaps you want to use/create a virtual environment first.
python3 -m venv venv
source venv/bin/activate
pip3 install .

A number of useful example programs can be found in the examples directory.
Some of these can be used to benchmark against other xtc reader
implementations, or to create test files.
NOTE: I'm leaving these here for the moment, but ultimately, I will remove or fundamentally change many of these examples.
In order to access these, clone the repository and build them.
git clone https://git.sr.ht/~ma3ke/molly
cd molly
cargo build --release --examples
target/release/examples/<name> [options]

Or directly run them using

cargo run --release --example <name>

The library includes a number of unit tests of internal mechanisms and
integration tests (including comparisons against the values produced by other
libraries). Note that it is desirable to run the tests with the --release
flag, since the debug builds run rather slowly.
cargo test --release

Go ahead and run the provided benchmarks if you're interested!

cargo bench

Additionally, there are a couple of benchmark scripts lying around the repo. I may place them into a neat table at a later point. For now, some things are still subject to change. Though we can see the broad shape of the performance story for molly, this is not yet the time for hard promises.
It looks like molly is around 2× faster than xdrf (the widely-used Gromacs implementation), and around 4× faster than the chemfiles implementation.
For the buffered implementations this gap is slightly less pronounced. When disk I/O is factored out, buffered reading is around 20% slower than unbuffered reading. But over very large trajectories where only a subset of positions from the top of each frame is selected, the advantage is considerable.
- I want to thank Ladislav Bartos for his contributions and feedback, including his fix of the large-frame-number bug.
- Thanks to Mikael Lund for adding writing support.
Marieke Westendorp, 2024