Mastic: The stuff that keeps stuff together.

Latest DOI:

Mastic is currently in beta.


Discussion takes place on a Slack-like app that runs on the Matrix protocol.


Currently there is no package available for Mastic, but you can clone one of the releases and install it manually.

git clone
cd mastic

# install it as editable so you can make your own interaction classes!
pip install --user -e .


  • numpy


  • rdkit - highly recommended as this is used for file-parsing and feature detection
  • pandas - for output of data to DataFrames and other file formats
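
Keeping pandas optional usually comes down to a guarded import. The following is a minimal sketch of that pattern, not code taken from Mastic: the function `records_to_table` and the example records are hypothetical and exist only for illustration.

```python
# Hypothetical sketch: none of these names come from Mastic itself.
try:
    import pandas as pd  # optional: only needed for DataFrame output
except ImportError:
    pd = None


def records_to_table(records):
    """Return a DataFrame if pandas is installed, else a plain list of dicts."""
    rows = list(records)
    if pd is None:
        return rows
    return pd.DataFrame(rows)


# Made-up interaction records, used only to exercise the function.
hits = [
    {"donor": "LIG:N1", "acceptor": "ASP:OD1", "distance": 2.9},
    {"donor": "SER:OG", "acceptor": "LIG:O2", "distance": 3.1},
]
table = records_to_table(hits)
print(len(table))  # number of rows, with or without pandas
```

With this approach the core code path never requires pandas; callers who want DataFrames install it themselves.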


The original impetus for writing Mastic was the need for a general and flexible framework for interacting with macromolecules (particularly biomolecules), in order to implement complex selections and queries for profiling intermolecular interactions across large data sets.

While there are a few other high-quality projects that provide APIs for macromolecules, the goals and fundamental designs of these libraries did not meet my needs. Primarily, the underlying representations are biased towards protein structures and other polymers due to the historical development of force fields for proteins. As such, amino acid “residues” are usually part of the fundamental data structures, even in modern libraries. Design decisions such as this one are essentially isomorphic to the way textbooks write about these molecules. My experience was that while this is convenient for simple systems, more complicated systems with many proteins, small molecules, etc. quickly become difficult to query and manipulate. Hence, Mastic was conceived to deliver a proper separation of molecular data from domain-specific uses.


Overarching Goals

  1. An abstract layer for representing multi-atom structures and selections that is extensible via object-oriented programming in Python.
  2. An applied layer of advanced subclasses representing objects (Features) that map more directly onto textbook knowledge, including:
    • Organic chemistry features e.g. functional groups
    • Intermolecular interactions, e.g. hydrogen-bonding, pi-pi stacking etc.
    • Protein secondary and tertiary features
    • Proteins
    • Amino acid residues
    • Receptor-ligand relationships
    • Multimeric protein complexes
    • Molecular dynamics systems
    • Dummy atoms
    • Typed pseudo-atoms
    • Pseudo-atom constructs, e.g. pharmacophores
  3. An extensible command-line interface “porcelain” for performing common workflows, including:
    • profiling of intraprotein interactions, e.g. for RCSB data, molecular dynamics trajectories, etc.
    • profiling of receptor-ligand interactions
    • profiling of protein-protein interactions
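
To make the first two goals concrete, here is a minimal sketch of how an abstract selection class could be extended into an applied-layer feature through subclassing. It is not Mastic's API: `Atom`, `Selection`, `HydroxylGroup`, and the rough methanol coordinates are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class Atom:
    element: str
    coords: Tuple[float, float, float]


class Selection:
    """Abstract layer: an ordered subset of a parent atom container, kept as indices."""

    def __init__(self, atoms: Sequence[Atom], indices: Sequence[int]):
        self.atoms = atoms
        self.indices = tuple(indices)

    def __iter__(self):
        return (self.atoms[i] for i in self.indices)

    def __len__(self):
        return len(self.indices)

    @classmethod
    def where(cls, atoms: Sequence[Atom], predicate: Callable[[Atom], bool]):
        """Build a selection from every atom that satisfies the predicate."""
        return cls(atoms, [i for i, atom in enumerate(atoms) if predicate(atom)])


class HydroxylGroup(Selection):
    """Applied layer: a selection constrained to exactly one O and one H."""

    def __init__(self, atoms: Sequence[Atom], indices: Sequence[int]):
        super().__init__(atoms, indices)
        if sorted(atom.element for atom in self) != ["H", "O"]:
            raise ValueError("a hydroxyl group needs exactly one O and one H")


# Rough, illustrative coordinates for part of a methanol molecule.
methanol = [
    Atom("C", (0.00, 0.00, 0.00)),
    Atom("O", (1.43, 0.00, 0.00)),
    Atom("H", (1.75, 0.90, 0.00)),
    Atom("H", (-0.36, 1.03, 0.00)),
]

oxygens = Selection.where(methanol, lambda atom: atom.element == "O")
hydroxyl = HydroxylGroup(methanol, [1, 2])
print(len(oxygens), [atom.element for atom in hydroxyl])
```

Storing indices rather than copies keeps every selection tied to its parent container, so nested or overlapping selections stay consistent with the underlying data.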

Design Goals

  1. Separation of types (i.e. classes; e.g. protein topology) and instances (i.e. type + coordinates).
  2. Separation and development of both expressive and complete patterns.
  3. Ability to incorporate data from other libraries' representations easily via an extensible interface.
  4. Ability to use portions of this library as a stable and fairly future-proof solution across the Python ecosystem.
  5. Minimal core dependencies, with optional features provided by other libraries.
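
The first design goal, separating a type (topology) from its instances (type plus coordinates), can be sketched as below. This is an illustration under assumed names, not the library's implementation: `MoleculeType`, `Molecule`, `to_instance`, and the approximate water geometry are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

Coord = Tuple[float, float, float]


@dataclass(frozen=True)
class MoleculeType:
    """Topology only: element symbols and bonds, no coordinates."""
    name: str
    elements: Tuple[str, ...]
    bonds: Tuple[Tuple[int, int], ...]

    def to_instance(self, coords):
        """Pair this type with one set of coordinates."""
        coords = tuple(tuple(c) for c in coords)
        if len(coords) != len(self.elements):
            raise ValueError("need exactly one coordinate per atom")
        return Molecule(self, coords)


@dataclass(frozen=True)
class Molecule:
    """An instance: a shared type plus one set of coordinates."""
    mol_type: MoleculeType
    coords: Tuple[Coord, ...]


# One type, defined once (approximate geometry, for illustration only).
WATER = MoleculeType("water", ("O", "H", "H"), ((0, 1), (0, 2)))

# Many instances can share that single type object.
frame_a = WATER.to_instance([(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)])
frame_b = WATER.to_instance([(5.0, 0.0, 0.0), (5.96, 0.0, 0.0), (4.76, 0.93, 0.0)])
print(frame_a.mol_type is frame_b.mol_type)
```

Because the topology lives in one place, thousands of conformations only need to store coordinates, and any query written against the type applies to every instance.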

Other goals

  1. Language-agnostic file format for storing objects and preserving complex relations within them, i.e. HDF5.
  2. Ability to export representations to many other commonly used formats (PDB dialects perhaps?).
  3. Optimization and parallelization built in.
  4. Optional type system (via Python type hints and mypy) for developing and debugging the construction of molecules and systems.
  5. API for accessing databases of Features.
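
As a small illustration of the optional type system goal, the sketch below annotates a helper so that a static checker such as mypy can catch construction mistakes before the code runs. `AtomType` and `count_element` are hypothetical names, not part of Mastic.

```python
from typing import NamedTuple, Sequence


class AtomType(NamedTuple):
    element: str
    charge: int = 0


def count_element(atom_types: Sequence[AtomType], element: str) -> int:
    """Count how many atom types have the given element symbol."""
    return sum(1 for atom_type in atom_types if atom_type.element == element)


water = [AtomType("O"), AtomType("H"), AtomType("H")]
n_hydrogens = count_element(water, "H")
print(n_hydrogens)

# A static checker would reject the call below, because a list of str is not a
# sequence of AtomType. At runtime it would fail only when .element is accessed.
# count_element(["O", "H", "H"], "H")
```

The annotations are ignored at runtime, so users who do not run a type checker pay no cost for them.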


The immediate “killer feature” will be easy profiling of intermolecular interactions, but there is also clear potential for use in setting up complex multi-molecular systems, structural analyses, and complex distance metric calculations in enhanced sampling simulations.

I have so far used it to profile hydrogen bonds between a protein and ligand for 4,000 conformations.
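
For readers unfamiliar with this kind of profiling, the sketch below counts how often a donor-hydrogen-acceptor triple satisfies a simple geometric hydrogen-bond test across a set of conformations. It does not use Mastic's API. The function names, the 3.5 Å donor-acceptor distance cutoff, the 120° angle cutoff, and the three toy frames are all assumptions made for illustration.

```python
import math
from collections import Counter


def angle_deg(a, b, c):
    """Angle at point b formed by a-b-c, in degrees."""
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    dot = sum(p * q for p, q in zip(v1, v2))
    cos = dot / (math.hypot(*v1) * math.hypot(*v2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


def is_hbond(donor, hydrogen, acceptor, max_dist=3.5, min_angle=120.0):
    """Assumed criterion: short donor-acceptor distance and a wide D-H...A angle."""
    close = math.dist(donor, acceptor) <= max_dist
    aligned = angle_deg(donor, hydrogen, acceptor) >= min_angle
    return close and aligned


def profile(frames, triples):
    """Count, per (donor, hydrogen, acceptor) index triple, the frames that pass."""
    counts = Counter()
    for coords in frames:
        for d, h, a in triples:
            if is_hbond(coords[d], coords[h], coords[a]):
                counts[(d, h, a)] += 1
    return counts


# Three toy conformations of the same three atoms: donor, hydrogen, acceptor.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.9, 0.0, 0.0)],  # close and linear
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (6.0, 0.0, 0.0)],  # too far apart
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.9, 0.0)],  # close but badly angled
]
counts = profile(frames, [(0, 1, 2)])
print(counts[(0, 1, 2)])
```

The same loop scales to thousands of conformations, because the index triples are fixed by the topology and only the coordinates change from frame to frame.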



See for version number meanings.

Version 1.0.0 will be released once the abstract layer API is stable. Subsequent 1.X.y releases will be made as applied and porcelain layer features are added.