# Running molecular dynamics simulations with `polymerist`
In addition to useful polymer building and parameterization tools, `polymerist` also provides functionality which \
simplifies running molecular dynamics (MD) simulations by integrating with the OpenFF stack.

Here we will assume you have a parameterized structure for a single polymer available, and will show how to:
* Solvate the system (with Packmol backend)
* Export the system to GROMACS, LAMMPS, and OpenMM (with OpenFF Interchange backend)
* Reproducibly define and serialize parameters describing an MD simulation
* Run a series of simulations defined by parameter sets (OpenMM only)

## Logging and utilities setup

In [1]:
# Suppressing annoying warnings (!must be done first!)
import warnings
warnings.catch_warnings(record=True)
warnings.filterwarnings('ignore', category=UserWarning)
warnings.filterwarnings('ignore', category=DeprecationWarning)

# Logging
import logging
logging.basicConfig(level=logging.INFO)

In [2]:
from pathlib import Path

def fetch_file(filename : str, extension : str, dir_to_search : Path=Path.cwd()) -> Path:
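    '''
    Recursively search a directory for a file with a given name (stem) and extension
    Return the path to the file if found; otherwise raise a ValueError whose message
    lists the files with a compatible extension found under the same directory
    (if several files share the same stem, the last one encountered is returned)
    '''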
    available_files : dict[str, Path] = {
        path.stem : path
            for path in dir_to_search.glob(f'**/*.{extension}')
    }
    choice_str = ',\n'.join(available_files.keys())

    if filename in available_files:
        return available_files[filename]
    else:
        raise ValueError(
            f'No file called "{filename}.{extension}" found in {dir_to_search};' \
            f'\nThe following (potentially) compatible files were found while searching:\n{choice_str}'
        )

## Loading a molecule
For compactness, this demo uses pre-parameterized polymer structures (i.e. with topologies, coordinates, and partial charges already assigned). \
If you are interested in learning how to obtain such structures, see one of the other demos contained herein, such as [polymer_build_demo.ipynb](polymer_build_demo.ipynb).

In [3]:
STRUCT_DIR = Path('cleaned_structures') # this already should exist on import
MD_DEMO_DIR = Path('MD_demo_files')
MD_DEMO_DIR.mkdir(exist_ok=True)

In [None]:
from polymerist.mdtools.openfftools.topology import topology_to_sdf, topology_from_sdf, get_largest_offmol
from polymerist.genutils.fileutils.pathutils import assemble_path


polymer_name = 'naturalrubber'

polymer_sdf = fetch_file(polymer_name, extension='sdf', dir_to_search=STRUCT_DIR)
polymer_outdir = MD_DEMO_DIR / polymer_name # only initialize this once an SDF has been found
polymer_outdir.mkdir(exist_ok=True)

polymer_topology = topology_from_sdf(polymer_sdf)
polymer = get_largest_offmol(polymer_topology)
polymer.visualize(backend='nglview')

### Define periodic box and pack that box with solvent

In [None]:
from openmm.unit import gram, centimeter, nanometer

from polymerist.mdtools.openfftools import boxvectors
from polymerist.mdtools.openfftools.solvation import solvents 
from polymerist.mdtools.openfftools.solvation.packing import pack_topology_with_solvent


box_padding = 1*nanometer # how far beyond the tight bounding box of the polymer to extend the periodic box
solvent = solvents.water_TIP3P
rho = 0.997 * gram / centimeter**3

# calculate periodic box vectors
box_dims    = boxvectors.get_topology_bbox(polymer_topology)
box_vectors = boxvectors.box_vectors_flexible(box_dims)
box_vectors = boxvectors.pad_box_vectors_uniform(box_vectors, box_padding)

# pack the box with solvent at the target density, then save the solvated topology
solvated_topology = pack_topology_with_solvent(polymer_topology, solvent=solvent, box_vecs=box_vectors, density=rho, exclusion=box_padding)
solv_path = assemble_path(polymer_outdir, polymer_name, postfix=f'solv_{solvent.name}', extension='sdf')
topology_to_sdf(solv_path, solvated_topology)
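
As a rough cross-check on the packing step, the number of solvent molecules needed to fill a box at a target density can be estimated by hand. The sketch below is a stand-alone illustration using plain floats, not the routine `polymerist` or Packmol actually uses; the helper names are hypothetical, and it assumes the whole box volume is available to solvent (i.e. it ignores the volume taken up by the polymer).

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)
NM3_PER_CM3 = 1e21        # 1 cm = 1e7 nm, so 1 cm^3 = 1e21 nm^3

def box_volume_nm3(box_vectors):
    '''Volume of a (possibly triclinic) box from its three row vectors, in nm^3'''
    (ax, ay, az), (bx, by, bz), (cx, cy, cz) = box_vectors
    # scalar triple product a . (b x c)
    return abs(ax*(by*cz - bz*cy) - ay*(bx*cz - bz*cx) + az*(bx*cy - by*cx))

def num_solvent_molecules(box_vectors, density_g_per_cm3, molar_mass_g_per_mol):
    '''Estimate how many solvent molecules fill the box at the given mass density'''
    volume_cm3 = box_volume_nm3(box_vectors) / NM3_PER_CM3
    moles = density_g_per_cm3 * volume_cm3 / molar_mass_g_per_mol
    return round(moles * AVOGADRO)

# hypothetical example: a 5 nm cubic box of water (molar mass ~18.015 g/mol)
cubic_box = [
    (5.0, 0.0, 0.0),
    (0.0, 5.0, 0.0),
    (0.0, 0.0, 5.0),
]
n_waters = num_solvent_molecules(cubic_box, density_g_per_cm3=0.997, molar_mass_g_per_mol=18.015)
print(n_waters)
```

A 5 nm cube of water at this density comes out at roughly four thousand molecules, which gives a feel for how quickly system size grows with box padding.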

### Selecting force field and generating Interchange object
See [OpenFF Interchange documentation](https://docs.openforcefield.org/projects/interchange/en/stable/using/output.html) for details on how to output to various MD engine formats

In [6]:
from openff.toolkit import ForceField


forcefield_names : list[str] = ['openff-2.2.0.offxml', 'tip3p.offxml']

forcefield = ForceField(*forcefield_names)
interchange = forcefield.create_interchange(solvated_topology, charge_from_molecules=[polymer])
interchange.box = box_vectors

## Running simulations using OpenMM
`polymerist` and the OpenFF toolkit provide extensive support for running OpenMM simulations,\
due to the relative simplicity of automating these simulations thanks to OpenMM's Python API.

### Defining reproducible and serializable simulation parameters
Simulation parameter sets provide a means to exchange and cache information about how you set up a simulation. \
They can be readily serialized to disk and are stored at the start of each simulation, if run through `polymerist`.
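
To make the idea of a "serializable parameter set" concrete, here is a minimal stand-alone sketch of a save/load round trip using plain dataclasses and JSON. The class `ToyThermoParams` and the helpers `save_params` and `load_params` are hypothetical and only illustrate the concept; they are not the `polymerist` API, which handles unit-bearing quantities and nested parameter groups for you.

```python
import json
from dataclasses import dataclass, asdict
from pathlib import Path

@dataclass
class ToyThermoParams:
    '''Hypothetical stand-in for a thermodynamic parameter set (units encoded in field names)'''
    ensemble: str
    temperature_K: float
    friction_coeff_per_ps: float

def save_params(params: ToyThermoParams, path: Path) -> None:
    '''Write the parameters to a human-readable JSON file'''
    Path(path).write_text(json.dumps(asdict(params), indent=4))

def load_params(path: Path) -> ToyThermoParams:
    '''Rebuild an identical parameter set from a JSON file'''
    return ToyThermoParams(**json.loads(Path(path).read_text()))
```

Because the file on disk fully determines the reloaded object, the same simulation settings can be shared with collaborators or reused later without re-reading the script that first defined them.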

In [7]:
from openmm.unit import femtosecond, picosecond, nanosecond
from openmm.unit import kelvin, atmosphere
from polymerist.mdtools.openmmtools.parameters import (
    SimulationParameters,
    ThermoParameters,
    IntegratorParameters,
    ReporterParameters,
)
from polymerist.mdtools.openmmtools.reporters import DEFAULT_STATE_DATA_PROPS


# define how data should be periodically reported during a simulation
reporter_params = ReporterParameters( # these will be shared between both sets of parameters in this example
        report_trajectory=True,
        traj_ext='dcd', # output to binary DCD trajectory files (recommended)
        report_state_data=True,
        state_data=DEFAULT_STATE_DATA_PROPS, # can tune these to taste
        report_checkpoint=True, # also keep checkpoints of OpenMM objects (specific to Context and machine)
        report_state=True,      # saving State is a bit redundant with checkpoints, but is machine-transferrable
    )

# define reproducible simulation parameter sets
## 1) HIGH-TEMPERATURE ANNEAL FOR RELAXATION
equil_params = SimulationParameters( 
    integ_params=IntegratorParameters(
        time_step=1*femtosecond,
        total_time=100*picosecond, # just a short simulation to demonstrate
        num_samples=10, # don't want to take too many samples
    ),
    thermo_params=ThermoParameters(
        ensemble='NPT', # constant pressure: lets the box volume relax
        temperature=600*kelvin,
        friction_coeff=1*picosecond**-1, # required for Langevin Thermostat
        barostat_freq=25, # number of steps between barostat move attempts
    ),
    reporter_params=reporter_params,
)

## 2) LOW-TEMPERATURE PRODUCTION
prod_params = SimulationParameters( 
    integ_params=IntegratorParameters(
        time_step=2*femtosecond,  # NOTE that a 2 fs step doesn't work well for unconstrained FFs
        total_time=0.5*nanosecond, # just a short simulation to demonstrate
        num_samples=50, # don't want to take too many samples
    ),
    thermo_params=ThermoParameters(
        ensemble='NVT', # clamp volume
        temperature=300*kelvin,
        friction_coeff=1*picosecond**-1, 
    ),
    reporter_params=reporter_params,
)
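
It can help to translate these settings into integration steps. The sketch below is plain arithmetic, assuming that samples are spaced evenly over the run; the helper name `steps_and_interval` is hypothetical, and `polymerist` may round or compute these values differently internally.

```python
def steps_and_interval(time_step_fs: float, total_time_fs: float, num_samples: int) -> tuple[int, int]:
    '''Return (number of integration steps, steps between samples) for evenly spaced sampling'''
    num_steps = round(total_time_fs / time_step_fs)
    record_interval = max(1, num_steps // num_samples)
    return num_steps, record_interval

# anneal: 1 fs steps for 100 ps (= 100,000 fs), 10 samples
anneal_steps, anneal_interval = steps_and_interval(1.0, 100_000.0, 10)
# production: 2 fs steps for 0.5 ns (= 500,000 fs), 50 samples
prod_steps, prod_interval = steps_and_interval(2.0, 500_000.0, 50)

print(anneal_steps, anneal_interval)  # 100000 10000
print(prod_steps, prod_interval)      # 250000 5000
```

So the anneal writes a frame every 10,000 steps (10 ps), and the production run writes one every 5,000 steps (also 10 ps).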

### Running a simulation "schedule"
A "schedule" here denotes a serial sequence of simulations, each defined by its own `SimulationParameters`.
Note that each parameter set in the schedule will generate a new directory containing the serialized parameters and all files output during the simulation.

In [None]:
from openff.interchange.interop.openmm._positions import to_openmm_positions

from polymerist.mdtools.openmmtools.forcegroups import impose_unique_force_groups
from polymerist.mdtools.openmmtools.execution import run_simulation_schedule


# initialize core OpenMM objects
omm_topology = interchange.to_openmm_topology()
omm_system  = interchange.to_openmm(combine_nonbonded_forces=False)
omm_positions = to_openmm_positions(interchange, include_virtual_sites=True)
impose_unique_force_groups(omm_system) # ensure each Force is separate to enable mapping of energy contributions

# run schedule
schedule = { # simulations will be run in the order they appear here
    'anneal'     : equil_params, 
    'production' : prod_params,
}

history = run_simulation_schedule(
    polymer_outdir, 
    schedule, 
    init_top=omm_topology,
    init_sys=omm_system,
    init_pos=omm_positions,
    return_history=True
)
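
Conceptually, running a schedule means looping over the named parameter sets in order, giving each its own output directory, and feeding the final state of one step into the next. The toy runner below sketches that pattern only; `run_toy_schedule` and its `run_step` callback are hypothetical and are not how `run_simulation_schedule` is implemented.

```python
import json
from pathlib import Path

def run_toy_schedule(outdir, schedule, init_state, run_step):
    '''
    Run each named step of a schedule in order (hypothetical illustration)
    "run_step" is any callable taking (state, params) and returning the new state
    '''
    outdir = Path(outdir)
    state = init_state
    history = {}
    for name, params in schedule.items():  # dicts preserve insertion order
        step_dir = outdir / name
        step_dir.mkdir(parents=True, exist_ok=True)

        # record the parameters before the step starts, so every run is documented
        (step_dir / f'{name}_parameters.json').write_text(json.dumps(params, indent=4))

        state = run_step(state, params)  # the output of one step seeds the next
        history[name] = state

    return history
```

Storing the parameters alongside the outputs of each step means any directory produced by the schedule is self-describing, which is the main benefit of the serializable parameter sets defined earlier.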