# Creating and importing molecular systems in OpenMM

## Preliminaries

First, we import OpenMM. It's recommended that you always import this way:

In [1]:
from simtk import openmm, unit
import numpy as np


## The OpenMM `System` object

The OpenMM [`System`](http://docs.openmm.org/7.1.0/api-python/library.html#system) object is a container that completely specifies how to compute the forces and energies for all particles in the molecular system. 

Because it is part of the C++ OpenMM API, `System` (and the related `Force` classes) are not true Python objects but instead SWIG-wrapped C++ objects, so you will find their naming and accessor conventions conform to C++ (rather than Pythonic) standards.

There are many ways to create or import OpenMM `System` objects:
* We can assemble all the particles and define their interactions ourselves programmatically using the OpenMM C++ API
* We could load in a serialized version of the `System` object via the built-in XML serializer
* We can use the various importers in the [`simtk.openmm.app`](http://docs.openmm.org/7.1.0/api-python/app.html) layer to read in a system definition in AMBER, CHARMM, or gromacs format
* We can use [`parmed`](http://github.com/parmed/parmed) to create an OpenMM `System` object form a wide variety of additional formats
* We can use a large variety of pre-built test systems provided by the [`openmmtools.testsystems`](http://openmmtools.readthedocs.io) module for testing code on a battery of different system types with different kinds of forces

Remember that if you want more information on `openmm.System`, you can use `help(openmm.System)`:

In [2]:
help(openmm.System)

Help on class System in module simtk.openmm.openmm:

class System(builtins.object)
 |  This class represents a molecular system. The definition of a System involves four elements:
 |  
 |  
 |  
 |  
 |  The particles and constraints are defined directly by the System object, while forces are defined by objects that extend the Force class. After creating a System, call addParticle() once for each particle, addConstraint() for each constraint, and addForce() for each Force.
 |  
 |  In addition, particles may be designated as "virtual sites". These are particles whose positions are computed automatically based on the positions of other particles. To define a virtual site, call setVirtualSite(), passing in a VirtualSite object that defines the rules for computing its position.
 |  
 |  Methods defined here:
 |  
 |  __copy__(self)
 |      __copy__(self) -> System
 |  
 |  __deepcopy__(self, memo)
 |  
 |  __del__ lambda self
 |  
 |  __getattr__ lambda self, name
 |  
 |  __getstate__(se

## Creating a periodic Lennard-Jones system

We'll start by building a simple periodic Lennard-Jones system using the Python wrappers for the main OpenMM C++ API.

In [3]:
# Define the parameters of the Lennard-Jones periodic system
nparticles = 512
reduced_density = 0.05
mass = 39.9 * unit.amu
charge = 0.0 * unit.elementary_charge
sigma = 3.4 * unit.angstroms
epsilon = 0.238 * unit.kilocalories_per_mole

# Create a system and add particles to it
system = openmm.System()
for index in range(nparticles):
    system.addParticle(mass)
    
# Set the periodic box vectors
number_density = reduced_density / sigma**3
volume = nparticles * (number_density ** -1)
box_edge = volume ** (1. / 3.)
box_vectors = np.diag([box_edge/unit.angstrom for i in range(3)]) * unit.angstroms
system.setDefaultPeriodicBoxVectors(*box_vectors)

# Add Lennard-Jones interactions using a NonbondedForce
force = openmm.NonbondedForce()
force.setNonbondedMethod(openmm.NonbondedForce.CutoffPeriodic)
for index in range(nparticles): # all particles must have parameters assigned
    force.addParticle(charge, sigma, epsilon)
force.setCutoffDistance(3.0 * sigma) # set cutoff (truncation) distance at 3*sigma
force.setUseSwitchingFunction(True) # use a smooth switching function to avoid force discontinuities at cutoff
force.setSwitchingDistance(2.5 * sigma) # turn on switch at 2.5*sigma
force.setUseDispersionCorrection(True) # use long-range isotropic dispersion correction
force_index = system.addForce(force) # system takes ownership of the NonbondedForce object

### The OpenMM Python API uses units throughout

Note the use of unit-bearing quantities via [`simtk.unit`](). OpenMM's Python API uses a powerful units system that prevents many common kinds of mistakes with unit conversion, such as those that caused the loss of the [Mars Polar Lander](). We'll examine the units system in more detail later---for now, we simply recognize that best practices dictate we feed OpenMM unit-bearing quantities (though this is not strictly necessary), and that it will return unit-bearing quantities.

## Accesssors for System attributes and Forces

The [`System`](http://docs.openmm.org/7.1.0/api-python/generated/simtk.openmm.openmm.System.html#simtk.openmm.openmm.System) object has a number of useful accessors for examining the contents of the system and 

In [4]:
# Get the number of particles in the System
print('The system has %d particles' % system.getNumParticles())
# Print a few particle masses
for index in range(5):
    print('Particle %5d has mass %12.3f amu' % (index, system.getParticleMass(index) / unit.amu))
# Print number of constraints
print('The system has %d constraints' % system.getNumConstraints())
# Get the number of forces and iterate through them
print('There are %d forces' % system.getNumForces())
for (index, force) in enumerate(system.getForces()):
    print('Force %5d : %s' % (index, force.__class__.__name__))

The system has 512 particles
Particle     0 has mass       39.900 amu
Particle     1 has mass       39.900 amu
Particle     2 has mass       39.900 amu
Particle     3 has mass       39.900 amu
Particle     4 has mass       39.900 amu
The system has 0 constraints
There are 1 forces
Force     0 : NonbondedForce


### Define positions for the Lennard-Jones particles

To compute energies or forces, we will need to define some positions for the system before we can compute energies or forces.
We'll generate some randomly, but as we will see later, there are many more ways to load molecular positions.

In [5]:
positions = box_edge * np.random.rand(nparticles,3) * unit.angstroms

## Computing energies and forces

To compute energies and forces efficiently, OpenMM requires you create a [`Context`](http://docs.openmm.org/7.1.0/api-python/generated/simtk.openmm.openmm.Context.html#simtk.openmm.openmm.Context) that handles the computation on a particular piece of hardware. This could be a GPU or CPU using a specific set of fast computation kernels.

OpenMM forces us to explicitly move data (positions, velocities, forces) into and out of the `Context` through specific calls so that we remember that each of these operations carries overhead of moving data across the bus to the GPU, for example. To achieve maximum speed, OpenMM can perform a number of operations---such as integrating many molecular dynamics steps---fully on the GPU without slowing things down by moving data back and forth over the bus. We'll see examples of this soon.

First, we just want to compute energies and forces and then minimize the energy of our system. To do that, we first need to create an [`Integrator`](http://docs.openmm.org/7.1.0/userguide/application.html#integrators) that will be bound to the [`Context`](http://docs.openmm.org/7.1.0/api-python/generated/simtk.openmm.openmm.Context.html#simtk.openmm.openmm.Context):

In [6]:
# Create an integrator
timestep = 1.0 * unit.femtoseconds
integrator = openmm.VerletIntegrator(timestep)
# Create a Context using the default platform (the fastest abailable one is picked automatically)
# NOTE: The integrator is bound irrevocably to the context, so we can't reuse it
context = openmm.Context(system, integrator)

We can also specify which platform we want. OpenMM comes with the following platforms:
* [`Reference`](http://docs.openmm.org/7.1.0/userguide/library.html#platforms) - A slow double-precision single-threaded CPU-only implementation intended for comparing fast implementations with a "verifiably correct" one
* [`CPU`](http://docs.openmm.org/7.1.0/userguide/library.html#cpu-platform) - A fast multithreaded mixed-precision CPU-only implementation
* [`OpenCL`](http://docs.openmm.org/7.1.0/userguide/library.html#opencl-platform) - A highly portable OpenCL implementation that can be used with GPU or CPU OpenCL drivers; supports multiple precision models (single, mixed, double)
* [`CUDA`](http://docs.openmm.org/7.1.0/userguide/library.html#cuda-platform) - The fastest implementation that uses [CUDA](https://developer.nvidia.com/cuda-downloads) for NVIDIA GPUs; supports multiple precision models (single, mixed, double)

In [7]:
# We have to create a new integrator for every Context since it takes ownership of the integrator we pass it
integrator = openmm.VerletIntegrator(timestep)
# Create a Context using the multithreaded mixed-precision CPU platform
platform = openmm.Platform.getPlatformByName('CPU')
context = openmm.Context(system, integrator, platform)

Once we have a [`Context`](http://docs.openmm.org/7.1.0/api-python/generated/simtk.openmm.openmm.Context.html#simtk.openmm.openmm.Context), we need to tell OpenMM we want to retrieve the energy and forces for a specific set of particle positions. This is done by retrieving a [`State`](http://docs.openmm.org/7.1.0/api-python/generated/simtk.openmm.openmm.State.html#simtk.openmm.openmm.State) object from the [`Context`](http://docs.openmm.org/7.1.0/api-python/generated/simtk.openmm.openmm.Context.html#simtk.openmm.openmm.Context) that contains *only* the information we want to retrieve so as to minimize the amount of data that needs to be sent over the bus from a GPU:

In [8]:
# Set the positions
context.setPositions(positions)
# Retrieve the energy and forces
state = context.getState(getEnergy=True, getForces=True)
potential_energy = state.getPotentialEnergy()
print('Potential energy: %s' % potential_energy)
forces = state.getForces()
print('Forces: %s' % forces)

Potential energy: 1.3230821105716127e+21 kJ/mol
Forces: [(-62014832640.0, -182438395904.0, -116295802880.0), (387894401302528.0, 518838386688.0, 899752564621312.0), (1.458334133649408e+16, 8644700256862208.0, -3743985537384448.0), (9491452928.0, 36256874496.0, 176904781824.0), (9553244585984.0, 11258361282560.0, -1011142492160.0), (-577658552320.0, -312803885056.0, -502928539648.0), (323272189149184.0, 1915071627264.0, -26337838366720.0), (-415543820288.0, 876672974848.0, 297246556160.0), (-198527867158528.0, 230344364130304.0, 576063360991232.0), (-137934422016.0, -58734673920.0, 268009668608.0), (298707360.0, -1418730240.0, -1884944000.0), (-214101519237120.0, -112381451567104.0, 900316614623232.0), (1125768776122368.0, -1035784245215232.0, -452165600542720.0), (-3181190840320.0, 1467850948608.0, -357198528512.0), (8.620596535324836e+17, -4.1962716963169894e+17, -1.0198845146561249e+18), (-64556752896.0, -82982928384.0, 11443345408.0), (-768330726113280.0, 1.0258579208116634e+17, -3.

Note that the forces are returned as a `list` of tuples with units attached. If we want a numpy array instead, we could convert it, but it's much easier just use the `asNumpy=True` argument to [`state.getForces`](http://docs.openmm.org/7.1.0/api-python/generated/simtk.openmm.openmm.State.html#simtk.openmm.openmm.State.getForces)

In [9]:
# Get forces as numpy array
forces = state.getForces(asNumpy=True)
print('Forces: %s' % forces)

Forces: [[ -6.20148326e+10  -1.82438396e+11  -1.16295803e+11]
 [  3.87894401e+14   5.18838387e+11   8.99752565e+14]
 [  1.45833413e+16   8.64470026e+15  -3.74398554e+15]
 ..., 
 [ -1.59298919e+12  -1.95165985e+12  -2.88140591e+11]
 [ -3.18801707e+11   1.74586384e+12   7.57386379e+11]
 [  8.47980304e+16   1.56856040e+18   3.75194084e+17]] kJ/(nm mol)


## Minimizing the potential energy

You'll notice the energy and force magnitude is pretty large:

In [10]:
# Compute the energy
state = context.getState(getEnergy=True, getForces=True)
potential_energy = state.getPotentialEnergy()
print('Potential energy: %s' % potential_energy)
# Compute the force magnitude
forces = state.getForces(asNumpy=True)
force_magnitude = (forces**2).sum().sqrt()
print('Force magnitude: %s' % force_magnitude)

Potential energy: 1.3230821105716167e+21 kJ/mol
Force magnitude: 2.3200985830192305e+24 kJ/(nm mol)


If we tried to do dynamics at this point, the system will likely explode!

OpenMM provides a [`LocalEnergyMinimizer`](http://docs.openmm.org/7.1.0/api-python/generated/simtk.openmm.openmm.LocalEnergyMinimizer.html#simtk.openmm.openmm.LocalEnergyMinimizer) that can be used to bring the force magnitude to be low enough to be able to run a simulation that won't explode.

In [11]:
# Minimize the potential energy
openmm.LocalEnergyMinimizer.minimize(context)

# Compute the energy and force magnitude after minimization
state = context.getState(getEnergy=True, getForces=True)
potential_energy = state.getPotentialEnergy()
print('Potential energy: %s' % potential_energy)
forces = state.getForces(asNumpy=True)
force_magnitude = (forces**2).sum().sqrt()
print('Force magnitude: %s' % force_magnitude)

Potential energy: -2609.628306312464 kJ/mol
Force magnitude: 333.35341916761604 kJ/(nm mol)


## Serializing the System object to disk or for transport over the network

You can serialize and restore `System` objects as a string (which can be written to disk) via the [`XmlSerializer`](http://docs.openmm.org/7.1.0/api-python/generated/simtk.openmm.openmm.XmlSerializer.html#simtk.openmm.openmm.XmlSerializer) class:

In [12]:
# Serialize to a string
system_xml = openmm.XmlSerializer.serialize(system)
# Restore from a string
restored_system = openmm.XmlSerializer.deserialize(system_xml)
assert(system.getNumParticles() == restored_system.getNumParticles())

## Load a molecular system defined by AMBER, CHARMM, or gromacs

The OpenMM [`app` layer](http://docs.openmm.org/7.1.0/userguide/application.html#using-amber-files) provides convenience classes that help you load in systems defined in AMBER, CHARMM, or gromacs. 

### Loading an AMBER system

We can use [`AmberPrmtopFile`](http://docs.openmm.org/7.1.0/api-python/generated/simtk.openmm.app.amberprmtopfile.AmberPrmtopFile.html#simtk.openmm.app.amberprmtopfile.AmberPrmtopFile) to load an Amber `prmtop` file and [`InpcrdFile`](http://docs.openmm.org/7.1.0/api-python/generated/simtk.openmm.app.amberinpcrdfile.AmberInpcrdFile.html#simtk.openmm.app.amberinpcrdfile.AmberInpcrdFile) to load an `inpcrd` file:

In [13]:
# Load in an AMBER system
from simtk.openmm import app
prmtop = app.AmberPrmtopFile('resources/alanine-dipeptide-implicit.prmtop')
inpcrd = app.AmberInpcrdFile('resources/alanine-dipeptide-implicit.inpcrd')
system = prmtop.createSystem(nonbondedMethod=app.NoCutoff, implicitSolvent=app.OBC2, constraints=app.HBonds)
positions = inpcrd.getPositions(asNumpy=True)

# Compute the potential energy
def compute_potential(system, positions):
    """Print the potential energy given a System and positions."""
    integrator = openmm.VerletIntegrator(1.0 * unit.femtoseconds)
    context = openmm.Context(system, integrator)
    context.setPositions(positions)
    print('Potential energy: %s' % context.getState(getEnergy=True).getPotentialEnergy())
    # Clean up
    del context, integrator
    
compute_potential(system, positions)

Potential energy: -137.43955993652344 kJ/mol


## Loading a CHARMM system

We can also [load CHARMM files](http://docs.openmm.org/7.1.0/userguide/application.html#using-charmm-files):

In [14]:
# Load in a CHARMM system
from simtk.openmm import app
psf = app.CharmmPsfFile('resources/ala_ala_ala.psf')
pdbfile = app.PDBFile('resources/ala_ala_ala.pdb')
params = app.CharmmParameterSet('resources/charmm22.rtf', 'resources/charmm22.par')
system = psf.createSystem(params, nonbondedMethod=app.NoCutoff, constraints=app.HBonds)
positions = pdbfile.getPositions(asNumpy=True)
compute_potential(system, positions)

Potential energy: 187615.40625 kJ/mol


## Loading a gromacs system

We can also [load gromacs files](http://docs.openmm.org/7.1.0/userguide/application.html#using-gromacs-files), though you'll want to have the gromacs parameter files installed first.

You can install a version of gromacs via `conda` with:
```
conda install --yes -c bioconda gromacs
```

In [15]:
# Load in a gromacs system
from simtk.openmm import app
gro = app.GromacsGroFile('resources/input.gro')
gromacs_include_filepath = '/Users/choderaj/miniconda3/share/gromacs/top/'
top = app.GromacsTopFile('resources/input.top', periodicBoxVectors=gro.getPeriodicBoxVectors(), includeDir=gromacs_include_filepath)
system = top.createSystem(nonbondedMethod=app.PME, nonbondedCutoff=1*unit.nanometer, constraints=app.HBonds)
positions = gro.getPositions(asNumpy=True)
compute_potential(system, positions)

Potential energy: -115290.92637980473 kJ/mol


## Visualizing the system and using `Topology` objects

While the `System` objects don't contain any information on the *identities* of the particles or how these are organized into residues or molecules, we can extract a [`Topology`](http://docs.openmm.org/7.1.0/api-python/generated/simtk.openmm.app.topology.Topology.html#simtk.openmm.app.topology.Topology) object that provides this information from a PDB file:

In [16]:
# Extract the topology from a PDB file
pdbfile = app.PDBFile('resources/alanine-dipeptide-implicit.pdb')
positions = pdbfile.getPositions()
topology = pdbfile.getTopology()

OpenMM provides a way to write a frame to a PDB file should you want to visualize your initial configuration:

In [17]:
# Write out a PDB file
with open('output.pdb', 'w') as outfile:
    app.PDBFile.writeFile(topology, positions, outfile)

If working in a Jupyter notebook, we can use [`mdtraj`](http://mdtraj.org) and [`nglview`](https://github.com/arose/nglview) to visualize the system:

In [18]:
# Create an MDTraj Trajectory object
import mdtraj
mdtraj_topology = mdtraj.Topology.from_openmm(topology)
traj = mdtraj.Trajectory(positions/unit.nanometers, mdtraj_topology)

# View it in nglview
import nglview
view = nglview.show_mdtraj(traj)
view.add_ball_and_stick('all')
view.center_view(zoom=True)
view

## Creating systems via ParmEd

## Prebuilt systems with `openmmtools.testsystems`

The [`openmmtools.testsystems`](http://openmmtools.readthedocs.io/en/latest/testsystems.html) module provides a number 
of pre-built test systems that are very useful for testing your code or benchmarking your algorithms. The full list of test systems can be found [here](http://openmmtools.readthedocs.io/en/latest/testsystems.html), but we'll highlight a few below.

In [19]:
from openmmtools import testsystems

t = testsystems.HarmonicOscillator()
system, positions, topology = t.system, t.positions, t.topology

print(positions)


[[ 0.  0.  0.]] A


In [20]:
# Create a Lennard-Jones cluster with a harmonic spherical restraint:
t = testsystems.LennardJonesCluster()

def visualize(t):
    import mdtraj
    mdtraj_topology = mdtraj.Topology.from_openmm(t.topology)
    traj = mdtraj.Trajectory([t.positions/unit.nanometers], mdtraj_topology)

    # View it in nglview
    import nglview
    view = nglview.show_mdtraj(traj)
    if t.system.getNumParticles() < 2000:
        view.add_ball_and_stick('all')
    view.center_view(zoom=True)
    return view

visualize(t)

In [21]:
t = testsystems.LennardJonesFluid(reduced_density=0.90)
visualize(t)

In [22]:
t = testsystems.WaterBox()
visualize(t)

In [23]:
t = testsystems.AlanineDipeptideVacuum()
visualize(t)

In [24]:
t = testsystems.DHFRExplicit()
visualize(t)

In [25]:
t = testsystems.SrcExplicit()
visualize(t)

There are lots and lots of systems defined! Try some!

In [26]:
help(testsystems)

Help on module openmmtools.testsystems in openmmtools:

NAME
    openmmtools.testsystems - Module to generate Systems and positions for simple reference molecular systems for testing.

DESCRIPTION
    DESCRIPTION
    
    This module provides functions for building a number of test systems of varying complexity,
    useful for testing both OpenMM and various codes based on pyopenmm.
    
    Note that the PYOPENMM_SOURCE_DIR must be set to point to where the PyOpenMM package is unpacked.
    
    EXAMPLES
    
    Create a 3D harmonic oscillator.
    
    >>> from openmmtools import testsystems
    >>> ho = testsystems.HarmonicOscillator()
    >>> system, positions = ho.system, ho.positions
    
    See list of methods for a complete list of provided test systems.
    
    COPYRIGHT
    
    @author John D. Chodera <john.chodera@choderalab.org>
    @author Randall J. Radmer <radmer@stanford.edu>
    
    All code in this repository is released under the MIT License.
    
    This progr