# MD Simulation with OpenMM

Python has become increasingly popular amongst the Biomolecular Simulation community over the last 15 years. Initially it was primarily used for simulation data analysis work, but with the advent of [OpenMM](https://openmm.org/), it has become possible, and popular, to actually **run** simulations "in Python".

Amongst the attractions are:
 -  flexibility: if OpenMM doesn't do it already, you can often write additional code
 -  transparency: it's not a "black box"
 -  speed (especially on GPUs)

The possible downsides are:
 - You have to "build" your MD code before you can use it
 - Only fairly mainstream types of simulation are easy to do
 - You have to be comfortable with Python

A web search will quickly identify good OpenMM tutorials and guides. This workshop focusses on a quick introduction, and is designed to allow, as far as possible, a comparison with the **Amber MD introduction** covered previously.

Because OpenMM is Python-based, you will most often work within a Jupyter notebook or similar, rather than from the command line in a terminal session, so that is what we will do here.

## 1. Build your MD Engine

### 1.1 Import components

For this you first need to import the necessary components from different sections of the OpenMM package:

 - The base `openmm` package holds the heavy number-crunching components.
 - The `openmm.app` sub-package holds components that organise the data, both internally and for input/output.
 - The `openmm.unit` sub-package provides a way to specify parameters for the simulation in terms of *quantities* - that is, amounts of "stuff" in specified units.



In [None]:
from openmm import LangevinMiddleIntegrator, MonteCarloBarostat
from openmm.app import AmberInpcrdFile, AmberPrmtopFile, Simulation, HBonds, PME
from openmm.app import XTCReporter, StateDataReporter
from openmm.unit import nanometer, picosecond, kelvin, bar, angstrom, kilojoule, mole

from sys import stdout

### 1.2 Load simulation data

Now we load data from the two files that describe our simulation system (the Mcl-1 protein prepared earlier):

In [None]:
inpcrd = AmberInpcrdFile('5fdr_A.inpcrd')
prmtop = AmberPrmtopFile('5fdr_A.prmtop', periodicBoxVectors=inpcrd.boxVectors)

### 1.3 Create the simulation "system"

A key OpenMM object is the `system`: a complete description of what will be simulated, and how. We create this from the `prmtop` object, specifying in addition extra details like how non-bonded interactions will be handled, and then we supplement it with a *barostat* so we can run constant pressure simulations:

In [None]:
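# Use PME for long-range electrostatics with a 1 nm cutoff,
# and constrain the lengths of bonds involving hydrogen.
system = prmtop.createSystem(nonbondedMethod=PME, nonbondedCutoff=1*nanometer,
        constraints=HBonds)
# The barostat takes the target pressure and the simulation temperature.
system.addForce(MonteCarloBarostat(1*bar, 310*kelvin))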

### 1.4 Create a "simulation"

The next key OpenMM object is the `simulation`, which connects the `system` with the computational infrastructure to actually do MD (or energy minimisation) - a key part of this is the `integrator` that calculates the dynamics:

In [None]:
# Arguments: temperature, friction coefficient, time step.
integrator = LangevinMiddleIntegrator(310*kelvin, 1/picosecond, 0.002*picosecond)
simulation = Simulation(prmtop.topology, system, integrator)

## 2. Use your MD engine

### 2.1 Energy minimise the system from the starting coordinates

Now the `simulation` exists, you can start to interact with it. Things you can do include:

1. Set or get the current coordinates.
2. Get the current simulation energy.
3. Perform energy minimisations or MD simulations on it.

OpenMM's `simulation` objects are tightly connected to your compute infrastructure, so most often you do not interact with a `simulation` directly, but via an intermediary - the `context`. Here you see the `context` being used to initialise the coordinates, and query the energy of the system before and after the minimisation process:

In [None]:
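# Load the starting coordinates into the context.
simulation.context.setPositions(inpcrd.positions)
# Query the potential energy before minimisation.
state = simulation.context.getState(getEnergy=True)
print(f"Initial energy: {state.getPotentialEnergy().format('%8.2f')}")
# Minimise until the tolerance is met or 1000 iterations have been done.
simulation.minimizeEnergy(tolerance=0.5*kilojoule/mole/angstrom,
                          maxIterations=1000)
# Query the potential energy again after minimisation.
state = simulation.context.getState(getEnergy=True)
print(f"After energy minimization: {state.getPotentialEnergy().format('%8.2f')}")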

### 2.2 Run a short MD simulation

Now the system has been energy minimised (very roughly - notice we are not using the sophisticated relaxation/equilibration workflow used previously, so there is some risk that the simulation will be unstable), we can run some MD.

By default, OpenMM works very quietly - it will tell you almost nothing about the simulation progress unless you add some `reporters` to the `simulation`, so that's what we will do.

We add an `XTCReporter` so a *trajectory* file gets generated, and a `StateDataReporter`, so that basic info about the simulation progress is printed to the screen. With these added, we instruct the `simulation` to run for a number of steps:

In [None]:
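# Write a trajectory frame to an XTC file every 500 steps.
simulation.reporters.append(XTCReporter('5fdr_A_md.xtc', 500))
# Print progress information to the screen every 500 steps.
simulation.reporters.append(StateDataReporter(stdout, 500, step=True, time=True,
        potentialEnergy=True, temperature=True, density=True))
# Run 10000 steps of MD.
simulation.step(10000)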
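
To put the numbers in that cell in context, here is a small piece of plain-Python arithmetic (no OpenMM required). It uses the time step given to the integrator earlier, and assumes the run starts from step 0. The variable names are illustrative only:

```python
# Values taken from the cells above.
timestep_ps = 0.002       # integrator time step, in picoseconds
n_steps = 10000           # argument passed to simulation.step()
report_interval = 500     # steps between reports, for both reporters

simulated_time_ps = n_steps * timestep_ps
n_reports = n_steps // report_interval
time_between_reports_ps = report_interval * timestep_ps

print(f"Simulated time: {simulated_time_ps:.1f} ps")
print(f"Trajectory frames written / lines printed: {n_reports}")
print(f"Time between reports: {time_between_reports_ps:.1f} ps")
```

So this is a very short simulation: 20 ps in total, with a trajectory frame and a line of progress output every 1 ps.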

## 3. Analyse the results

From what the `StateDataReporter` produces, you should be able to see the simulation temperature rising quickly to about 310 K, and the system density increasing to about 1 g/mL.
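
If you want to work with these numbers rather than just read them, you can parse the comma-separated text that the `StateDataReporter` writes. The sketch below uses only the standard library. It assumes the output starts with a header line beginning with `#` and containing quoted column names, as in the sample. The sample text is synthetic: its numbers are invented placeholders that show the layout and are not results from a real run. The function name `read_state_data` is hypothetical.

```python
import csv

# Synthetic text in the layout StateDataReporter is assumed to produce.
# The numbers are invented placeholders, not results from a real run.
sample_log = '''#"Step","Time (ps)","Potential Energy (kJ/mole)","Temperature (K)","Density (g/mL)"
500,1.0,-100.0,200.0,0.90
1000,2.0,-110.0,300.0,0.95
1500,3.0,-120.0,310.0,1.00
'''

def read_state_data(text):
    """Return a list of dicts (column name -> float), one per report line."""
    lines = text.splitlines()
    # Drop the leading '#' so the csv module sees properly quoted names.
    lines[0] = lines[0].lstrip('#')
    reader = csv.reader(lines)
    header = next(reader)
    return [dict(zip(header, map(float, row))) for row in reader if row]

records = read_state_data(sample_log)
final = records[-1]
print(f"Final temperature: {final['Temperature (K)']:.1f} K")
print(f"Final density: {final['Density (g/mL)']:.2f} g/mL")
```

To use something like this on a real run, you would give the `StateDataReporter` a file name in place of `stdout` and read that file's contents into `read_state_data`.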

You should also see the trajectory file `5fdr_A_md.xtc` has been created. If you download this to your laptop (along with a copy of `5fdr_A.prmtop`, if you haven't got this already), as you did for the **Amber MD** workshop, you can visualize the trajectory using VMD or Chimera.

# Summary

If you are familiar with Python, OpenMM provides a very convenient approach to running MD simulations. The Notebook environment allows you to interleave code and comments, documenting your research for the benefit of both yourself and others.

But you don't have to use a Notebook - put the same code into a .py file and you can run it on an HPC service, for example.

The main issue with OpenMM is that if you want to run anything other than a "vanilla" MD simulation (e.g. some form of enhanced sampling method) you will probably have to write a lot of extra code yourself, while many of the older command-line oriented tools (e.g. AMBER or GROMACS) offer many such options "out of the box".