Chris Iacovella edited this page Dec 21, 2023 · 9 revisions

chiron: reimagining biomolecular simulation for a fully differentiable future

Mission statement & overview

We aim to establish a comprehensive and interoperable ecosystem tailored for differentiable physical modeling. The initial focus is on streamlining neural network potentials (NNPs) for efficient and accurate free energy calculations. We understand the critical importance of usability in software development; hence, we have crafted our architecture with clean and straightforward APIs. These APIs are designed to offer scientists the ease of customizing routines and integrating workflows with minimal overhead.

Key components

  • Chiron Package: Specializes in Markov Chain Monte Carlo (MCMC) state sampling and advanced Monte Carlo-based methods.
  • Auditorium Package: A suite for automated NNP testing and benchmarking, ensuring performance validation with rigor and transparency.
  • Modelforge: Dedicated to training, optimizing, and managing neural network potentials (NNPs).
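
To make the MCMC state sampling mentioned above concrete, here is a minimal sketch of a Metropolis random-walk sampler on a one-dimensional harmonic potential. It is illustrative only and does not use or reflect chiron's API; the names `harmonic_energy` and `metropolis_sample` are hypothetical, and energies are assumed to be reduced (in units of kT).

```python
import math
import random


def harmonic_energy(x, k=1.0):
    """Toy reduced potential u(x) = 0.5 * k * x^2 (units of kT)."""
    return 0.5 * k * x * x


def metropolis_sample(energy_fn, x0, n_steps, step_size=1.0, seed=0):
    """Random-walk Metropolis sampling of exp(-energy_fn(x)).

    Hypothetical helper for illustration; not part of chiron.
    Returns the list of visited states and the acceptance rate.
    """
    rng = random.Random(seed)
    x, u = x0, energy_fn(x0)
    samples, n_accepted = [], 0
    for _ in range(n_steps):
        # Symmetric proposal: uniform displacement around the current state.
        x_new = x + rng.uniform(-step_size, step_size)
        u_new = energy_fn(x_new)
        # Metropolis criterion: accept with probability min(1, exp(-(u_new - u))).
        if u_new <= u or rng.random() < math.exp(-(u_new - u)):
            x, u = x_new, u_new
            n_accepted += 1
        # A rejected move repeats the current state in the chain.
        samples.append(x)
    return samples, n_accepted / n_steps


samples, acceptance = metropolis_sample(harmonic_energy, x0=0.0, n_steps=20000)
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
```

For this potential the target distribution is a unit Gaussian, so the sample mean should be close to 0 and the variance close to 1. Because the sampler only needs an energy function, the same loop would work with any potential, including an NNP.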

Goals

Minimum viable product (MVP)

The MVP will calculate alchemical free energies using chiron and a neural network potential trained with modelforge, improving on the baseline molecular mechanics binding free energy estimate for protein-ligand systems.
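
As a generic illustration of how a free energy difference can be estimated from sampled energy differences, the sketch below implements one-sided exponential averaging (the Zwanzig relation), ΔF = -ln⟨exp(-Δu)⟩₀, in reduced units. This is an assumption chosen for brevity, not a statement of which estimator chiron uses; multistate estimators may well be preferred in practice. The function name `zwanzig_free_energy` is hypothetical.

```python
import math


def zwanzig_free_energy(delta_u):
    """Estimate dF (in kT) from reduced energy differences.

    delta_u holds u_1(x) - u_0(x) evaluated on samples x drawn from state 0.
    Uses a log-sum-exp shift for numerical stability.
    """
    if not delta_u:
        raise ValueError("delta_u must contain at least one sample")
    shift = max(-du for du in delta_u)
    total = sum(math.exp(-du - shift) for du in delta_u)
    log_mean = shift + math.log(total / len(delta_u))
    return -log_mean


# If every sample sees the same energy difference, dF equals that difference.
constant_case = zwanzig_free_energy([1.5] * 10)

# With fluctuating differences, dF is never larger than the mean difference.
fluctuating = [0.2, 1.1, 0.7, 2.3, 0.9]
estimate = zwanzig_free_energy(fluctuating)
mean_difference = sum(fluctuating) / len(fluctuating)
```

The estimator converges poorly when the two states overlap weakly, which is one reason alchemical calculations are usually split across intermediate states.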

Fully developed product

The comprehensive version will incorporate:

  • a variety of Monte Carlo (MC) moves to construct Markov chains that sample slow degrees of freedom (e.g., MC moves mixed with Langevin dynamics to sample selected rotamers or dihedrals, different binding modes/sites, and protonation and tautomeric states)
  • multistate sampling methods to sample different thermodynamic states using multiple replicas
  • optimization of potentials and/or protocols based on thermodynamic observables (when experimental data are available)
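
For the multistate sampling item above, the following is a minimal sketch of the standard replica-exchange swap criterion between two thermodynamic states, written in terms of reduced potentials. It is not chiron's implementation; `swap_log_acceptance`, `attempt_swap`, and the two toy harmonic states are hypothetical names introduced for illustration.

```python
import math
import random


def swap_log_acceptance(u_i, u_j, x_i, x_j):
    """Log of the Metropolis ratio for exchanging configurations.

    u_i and u_j are reduced potential functions of the two states;
    x_i and x_j are the configurations currently held by each replica.
    """
    return -(u_i(x_j) + u_j(x_i) - u_i(x_i) - u_j(x_j))


def attempt_swap(u_i, u_j, x_i, x_j, rng):
    """Return the (possibly exchanged) configurations and an accepted flag."""
    log_acc = swap_log_acceptance(u_i, u_j, x_i, x_j)
    if log_acc >= 0.0 or rng.random() < math.exp(log_acc):
        return x_j, x_i, True
    return x_i, x_j, False


def u_stiff(x):
    """Toy state with a stiff harmonic well (reduced units)."""
    return 0.5 * 2.0 * x * x


def u_soft(x):
    """Toy state with a soft harmonic well (reduced units)."""
    return 0.5 * 0.5 * x * x


# The stiff state holds a high-energy configuration, so the swap is favorable.
log_acc = swap_log_acceptance(u_stiff, u_soft, 2.0, 0.1)
new_stiff, new_soft, accepted = attempt_swap(
    u_stiff, u_soft, 2.0, 0.1, random.Random(0)
)
```

Writing the criterion with reduced potentials covers temperature exchange and alchemical (Hamiltonian) exchange with the same function.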

Milestones for achieving the MVP

To achieve the MVP, the following milestones need to be reached:

  • NNP training: use modelforge to train a fast and expressive neural network potential on a training set appropriate for molecular dynamics simulations.
    • Dataset: the SPICE dataset contains the most appropriate selection of chemical space, but pre-training might still be necessary to ensure that the potential energy landscape can be sampled.
    • Alternatively, instead of training directly on SPICE, train an ML potential on MM energies computed for the molecules (conformers) defined in SPICE, to allow a more direct comparison with MM.
    • Simulation stability: ensure the stability of molecular simulations within the applicability domain, leveraging insights gained in the development and application of StableNetGuardOwl and Auditorium
  • Alchemical Protocol: implement an NNP-compatible alchemical protocol.
    • Methodology: focus on methods that do not rely on bonded terms to achieve overlap in conformational space (as single/dual topology approaches with dummy atoms do). We will use the alchemical transfer method (ATM) for absolute/relative binding free energy calculations.
    • Use a single-replica method, such as Times Square Sampling, on host-guest datasets.
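
The appeal of ATM for NNPs can be sketched as follows: the alchemical potential is built only from two evaluations of an unmodified energy function, one at the original coordinates and one with the ligand rigidly displaced, so no bonded terms or dummy atoms are needed. The sketch assumes the simplest linear form, U_λ = U(x) + λ·[U(x displaced) − U(x)]; practical ATM protocols use smoother alchemical functions and soft-core treatment of the perturbation energy. All names here (`displace`, `atm_energy`, `toy_energy`) are hypothetical, and `toy_energy` is a stand-in for an NNP.

```python
import math


def displace(positions, ligand_indices, displacement):
    """Return a copy of positions with the ligand atoms rigidly translated."""
    dx, dy, dz = displacement
    shifted = list(positions)
    for idx in ligand_indices:
        x, y, z = positions[idx]
        shifted[idx] = (x + dx, y + dy, z + dz)
    return shifted


def atm_energy(energy_fn, positions, ligand_indices, displacement, lam):
    """Linear alchemical potential: U(x) + lam * (U(x_displaced) - U(x)).

    energy_fn is treated as a black box, so any potential (MM or NNP) fits.
    """
    u0 = energy_fn(positions)
    u1 = energy_fn(displace(positions, ligand_indices, displacement))
    return u0 + lam * (u1 - u0)


def toy_energy(positions):
    """Hypothetical stand-in for an NNP: pairwise Gaussian attraction."""
    total = 0.0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            r2 = sum((a - b) ** 2 for a, b in zip(positions[i], positions[j]))
            total -= math.exp(-r2)
    return total


positions = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0)]  # atom 0: host, atom 1: ligand
ligand = [1]
shift = (10.0, 0.0, 0.0)  # moves the ligand far from the host

bound = atm_energy(toy_energy, positions, ligand, shift, lam=0.0)
unbound = atm_energy(toy_energy, positions, ligand, shift, lam=1.0)
halfway = atm_energy(toy_energy, positions, ligand, shift, lam=0.5)
```

At λ = 0 the system feels the bound-state energy, and at λ = 1 it feels the energy of the displaced (unbound) arrangement; intermediate λ values interpolate between the two.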
