# Workflow 1: Prepare and run a protein-ligand simulation

<br />
<details>
    <summary><small>▼ Click here for dependency installation instructions</small></summary>

    The simplest way to install dependencies is to install the examples package:
    
    conda install -c conda-forge openff-toolkit-examples
    
    This example will need access to a GROMACS install in addition to the above dependencies. Your existing GROMACS installation can be used, or you can install it from Bioconda:
    
    conda install -c bioconda gromacs
    
    You can also install all the dependencies using the provided environment.yaml:
    
    conda env update --file ../environment.yaml 
    
    You may also need to restart this notebook's kernel after you make these changes (Kernel -> Restart)
</details>


In [1]:
# Imports from the Python standard library
import sys
from pathlib import Path
from tempfile import NamedTemporaryFile

# Imports from dependencies
from simtk import openmm, unit
import parmed as pmd
import numpy as np
import mdtraj as mdt
import nglview

# Imports from the toolkit
import openff.toolkit
from openff.toolkit.typing.engines.smirnoff import ForceField
from openff.toolkit.topology import Molecule, Topology

# Imports from local files
from utils import find_clashing_water, minimize_and_visualize





_(The OpenEye loading warning is expected -- The toolkit is informing us that OETK is unavailable, but it will safely fall back to using RDKit and AmberTools for the same functionality)_

## Introducing the main cast

Merck [provides data](https://github.com/MCompChem/fep-benchmark) to benchmark Free Energy Perturbation (FEP) procedures. We'll use structures from this dataset for this showcase:


This example is pre-packaged with one protein-ligand complex from the above repository, but you should be able to download other complexes and run them with this workflow as well. CHEMBL1078774, our ligand of choice, is an inhibitor of the mitotic functions of kinesin-5, a motor protein involved in cell division.

The ligand and protein structures are already prepared for simulation:

- Their co-ordinates are superimposable
- Hydrogens have been added to the protein and crystallographic waters
- N-methyl and acetyl terminal caps have been added to the protein to prevent unphysical charges
- Missing atoms have been replaced

<br />
<details>
    <summary><small>▼ Click here for the shell commands we used to download the protein-ligand complex</small></summary>

    # Clone the repository
    git clone https://github.com/MCompChem/fep-benchmark.git
    # Take the first ligand from the eg5 benchmark
    head -n119 fep-benchmark/eg5/ligands.sdf > chembl_1078774.sdf
    # Take the prepared protein structure
    cp fep-benchmark/eg5/3l9h_prepared.pdb .
    
</details>

In [2]:
receptor_path: str = "3l9h_prepared.pdb"
ligand_path: str = "chembl_1078774.sdf"

In [3]:
view: nglview.NGLWidget = nglview.show_file(ligand_path)
view

NGLWidget()

In [4]:
view: nglview.NGLWidget = nglview.show_file(receptor_path)
view

NGLWidget()

# The plan:

| Action | Software |
|--|--|
| Parameterize the ligand | OpenFF Toolkit |
| Solvate the protein | OpenMM |
| Parameterize the protein | OpenMM |
| Combine the ligand and protein into a complex | ParmEd |
| Remove waters that clash with the ligand | ParmEd/MDTraj |
| Simulate the complex | OpenMM |
| Visualize the simulation | NGLView |

<div class="alert alert-info" style="max-width: 700px; margin-left: auto; margin-right: auto;">
<!-- TODO: Explain in more detail what OpenMMForceFields, and decide whether to use this in the showcase -->
    ℹ️ Note that there's a new package `OpenMMForceFields` to replace much of this!
    <ul>
        <li> Home: <a href=https://github.com/openmm/openmmforcefields>https://github.com/openmm/openmmforcefields</a> </li>
        <li><code>conda install -c conda-forge -c omnia openmmforcefields</code></li>
        <li><a href=https://github.com/openforcefield/openforcefield/blob/master/examples/swap_amber_parameters/swap_existing_ligand_parameters_with_openmmforcefields.ipynb>Example notebook available</a></li>
    </ul>
</div>

## Parameterize the ligand (OFF Toolkit)

In this step, we'll produce parameters for our ligand from the unconstrained Parsley 1.3.0 force field. [Parsley](https://openforcefield.org/force-fields/force-fields/#parsley) is the first-generation force field produced by the Open Force Field Initiative. Rather than using atom types like traditional biomolecular force fields, Parsley assigns parameters to a molecule with [fancy subgraph matching](https://www.daylight.com/dayhtml/doc/theory/theory.smarts.html). "Unconstrained" denotes that bonds involving hydrogen are treated as harmonic, like other bonds, rather than being constrained to a fixed length. This is appropriate when single-point energies must be as accurate as possible, though constraints allow a larger stable time step.

The Open Force Field Toolkit takes a molecular topology (the atoms in a molecule and their bonds) and the force field specification, and produces a `System` object that can be simulated with OpenMM or converted to inputs for other simulation packages. Note that, to produce a molecular topology, you need more than just the co-ordinates of the atoms; you also need their bonds, bond orders, and formal charges. As a result, `.sdf` files are used in this example; other file types are possible, but they must include this information.
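
To make the point about file formats concrete, here is a minimal sketch of what an SDF record carries beyond co-ordinates. It reads the atom and bond blocks of a V2000 record using only the standard library. The formaldehyde record and the `parse_molblock` helper are hypothetical illustrations, not part of the Toolkit (which delegates file parsing to RDKit or OpenEye), and the sketch ignores property lines such as formal charges.

```python
# Hypothetical V2000 record for formaldehyde (co-ordinates are approximate).
SDF_RECORD = "\n".join(
    [
        "formaldehyde",
        "  hypothetical example record",
        "",
        "  4  3  0  0  0  0  0  0  0  0999 V2000",
        "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
        "    1.2000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
        "   -0.5500    0.9500    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0",
        "   -0.5500   -0.9500    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0",
        "  1  2  2  0",
        "  1  3  1  0",
        "  1  4  1  0",
        "M  END",
        "$$$$",
    ]
)


def parse_molblock(text):
    """Return (atoms, bonds) from a single V2000 record."""
    lines = text.split("\n")
    counts = lines[3]  # the fourth line holds the atom and bond counts
    n_atoms, n_bonds = int(counts[0:3]), int(counts[3:6])
    atoms = []
    for line in lines[4 : 4 + n_atoms]:
        x, y, z, element = line.split()[:4]
        atoms.append((element, float(x), float(y), float(z)))
    bonds = []
    for line in lines[4 + n_atoms : 4 + n_atoms + n_bonds]:
        # Fixed-width columns: first atom, second atom, bond order
        bonds.append((int(line[0:3]), int(line[3:6]), int(line[6:9])))
    return atoms, bonds


atoms, bonds = parse_molblock(SDF_RECORD)
print([element for element, *_ in atoms])
print(bonds)
```

The third number in each bond tuple is the bond order. The PDB format has no standard field for bond order, so the C=O double bond would have to be inferred from geometry or supplied separately.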

In [5]:
# Load a molecule from a SDF file
ligand: Molecule = Molecule.from_file(ligand_path)
# Molecule loads both the co-ordinates of the atoms and their bond graph
ligand_positions: unit.Quantity = ligand.conformers[0]
ligand_topology: Topology = ligand.to_topology()

# Load the force field specification
off_forcefield: ForceField = ForceField("openff_unconstrained-1.3.0.offxml")

# Use the force field to produce an OpenMM system for the given topology
ligand_system: openmm.System = off_forcefield.create_openmm_system(ligand_topology)

_(Takes ~100 seconds)_

### This is the only block in the first workflow that uses the Open Force Field Toolkit

In this workflow, the Toolkit is just responsible for combining a force field with a molecular topology. It is designed to check the input and give the user useful feedback if there seem to be errors or problems in the provided files. It also computes electric charges without user intervention. Computing charges is a process that can be confusing and error-prone, so we try to specify and simplify it as much as possible. Charges are computed efficiently with [OpenEye](https://www.eyesopen.com/) if it is available; if it is not, the free toolkits [RDKit](https://www.rdkit.org/) and [AmberTools](https://ambermd.org/AmberTools.php) are used instead.


## Solvate and parameterize the protein (OpenMM)

Parsley is designed for small molecule parameterization. Parameters for proteins and other polymers are coming, but for now we'll use Amber 99sb as our protein force field.

In [6]:
# Load protein and water force field parameters
omm_forcefield: openmm.app.ForceField = openmm.app.ForceField(
    "amber99sb.xml", "tip3p.xml"
)
# Load the kinesin-5 receptor structure and solvate it in 0.15 M NaCl solution
pdb: openmm.app.PDBFile = openmm.app.PDBFile(receptor_path)
modeller: openmm.app.Modeller = openmm.app.Modeller(pdb.topology, pdb.positions)
modeller.addSolvent(
    omm_forcefield,
    model="tip3p",
    padding=4.0 * unit.angstrom,
    ionicStrength=0.15 * unit.molar,
)
# Construct the OpenMM System from the solvated structure and the protein force field
protein_system: openmm.System = omm_forcefield.createSystem(
    modeller.topology, nonbondedMethod=openmm.app.PME, rigidWater=False
)

* Magic: 
    * The protein was already prepared
    * AMBER-compatible residue names
    * `rigidWater=False` is necessary at this step due to design differences between OFF/OMM and ParmEd
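
As a rough sanity check on the `ionicStrength=0.15 * unit.molar` setting, the number of salt ion pairs can be estimated from the number of waters, taking bulk water to be about 55.4 mol/L. This is a back-of-the-envelope sketch, not a description of how OpenMM places ions; the `estimate_ion_pairs` helper is hypothetical, and the water count is the one GROMACS reports further down in this notebook (after clashing waters are removed). Any counter-ions added to neutralize the protein are ignored.

```python
BULK_WATER_MOLARITY = 55.4  # mol/L, approximate concentration of pure water


def estimate_ion_pairs(n_waters, ionic_strength_molar):
    """Estimate salt ion pairs as the ion-to-water concentration ratio times the water count."""
    return round(n_waters * ionic_strength_molar / BULK_WATER_MOLARITY)


pairs = estimate_ion_pairs(11503, 0.15)
print(pairs, "ion pairs,", 2 * pairs, "ions")
```

The estimate can be compared with the number of ion residues listed in the GROMACS output later in the notebook.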

<div class="alert alert-warning" style="max-width: 700px; margin-left: auto; margin-right: auto;">
⚠️ Note that OpenMM and the Open Force Field Toolkit both have classes called `Topology` and `ForceField` that serve similar functions. Don't get them confused!
</div>


## Combine the parameterized ligand and the parameterized protein

In [7]:
# Load the protein into a ParmEd Structure
protein_struct: pmd.Structure = pmd.openmm.load_topology(
    modeller.topology, protein_system, modeller.positions
)
# Load the ligand into a ParmEd Structure
ligand_struct: pmd.Structure = pmd.openmm.load_topology(
    ligand_topology.to_openmm(), ligand_system, ligand_positions
)

# ParmEd Structures override the "+" operator with a method that combines systems (!)
pmd_complex_struct: pmd.Structure = protein_struct + ligand_struct

# Assign periodic box vectors from the solvated receptor structure
pmd_complex_struct.box_vectors = modeller.topology.getPeriodicBoxVectors()

## Visualize the complex

In [8]:
view: nglview.NGLWidget = nglview.show_parmed(pmd_complex_struct)
view.add_licorice(selection="(not protein)")
view.add_surface(selection=":.NA or :.CL")

view

NGLWidget()

* Note waters clashing with ligand, since protein was solvated alone

## Remove waters that clash with the ligand (ParmEd/MDTraj)

* Magic:
    * Uses `find_clashing_water`, imported from the local `utils` module at the top of the notebook, to find clashes
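
The helper in `utils` is not shown here, so the following is only a sketch of the general idea of a distance-based clash search, written with the standard library. It is not the actual implementation: the `find_clashes` name, the oxygen-only test, the example co-ordinates, and the assumption that the `0.15` cutoff is in nanometres are all illustrative.

```python
import math


def find_clashes(ligand_xyz, water_oxygens, cutoff_nm=0.15):
    """Return sorted residue indices of waters whose oxygen is within the cutoff of any ligand atom."""
    clashing = []
    for res_index, oxygen_xyz in water_oxygens.items():
        if any(math.dist(oxygen_xyz, atom_xyz) < cutoff_nm for atom_xyz in ligand_xyz):
            clashing.append(res_index)
    return sorted(clashing)


# Hypothetical co-ordinates in nanometres
ligand_xyz = [(0.0, 0.0, 0.0), (0.15, 0.0, 0.0)]
water_oxygens = {
    101: (0.10, 0.05, 0.0),  # close to the first ligand atom
    102: (0.20, 0.10, 0.0),  # close to the second ligand atom
    103: (1.00, 1.00, 1.0),  # far from the ligand
}

clashes = find_clashes(ligand_xyz, water_oxygens)
print(clashes)
```

The returned indices play the same role as the residue numbers joined into the ParmEd `strip` mask in the cell below.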

In [9]:
clashes = find_clashing_water(pmd_complex_struct, "CHE", 0.15)

if len(clashes) != 0:
    clash_residues_str = ",".join([str(i) for i in clashes])
    print(f"Removing ligand-clashing water residues {clash_residues_str}")
    pmd_complex_struct.strip(f":{clash_residues_str}")
else:
    print("No ligand-water clashes to resolve")

view: nglview.NGLWidget = nglview.show_parmed(pmd_complex_struct)

view.add_licorice(selection="(not protein)")
view.add_surface(selection=":.NA or :.CL")
view

Removing ligand-clashing water residues 7481,7535


NGLWidget()

## Convert the combined system from ParmEd back to OpenMM

In [10]:
system: openmm.System = pmd_complex_struct.createSystem(
    nonbondedMethod=openmm.app.PME,
    nonbondedCutoff=9 * unit.angstrom,
    constraints=openmm.app.HBonds,
    rigidWater=True,
)
integrator: openmm.LangevinIntegrator = openmm.LangevinIntegrator(
    300 * unit.kelvin, 1 / unit.picosecond, 0.002 * unit.picoseconds
)

simulation: openmm.app.Simulation = openmm.app.Simulation(
    pmd_complex_struct.topology, system, integrator
)

# The box is about 75 angstroms per side, so add (30, 30, 30) to center the protein
simulation.context.setPositions(
    pmd_complex_struct.positions + np.array([30, 30, 30]) * unit.angstrom
)

nc_reporter: pmd.openmm.NetCDFReporter = pmd.openmm.NetCDFReporter("trajectory.nc", 10)
simulation.reporters.append(nc_reporter)

## Simulate the complex (OpenMM)
### Minimize the combined system
_(Takes 110 seconds)_

In [11]:
simulation.minimizeEnergy()
minimized_coords: unit.Quantity = simulation.context.getState(
    getPositions=True
).getPositions()

### Run a short simulation

If this were anything more than a demonstration of the Toolkit, this example would need to include additional steps like equilibration.

_(Takes 85 seconds, largely due to trajectory writing frequency)_

In [12]:
simulation.context.setVelocitiesToTemperature(300 * unit.kelvin)
simulation.step(1000)
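
The numbers used above determine how much data this run produces. A small sketch of the arithmetic, using the time step from the integrator, the reporting interval passed to `NetCDFReporter`, and the step count from this cell:

```python
n_steps = 1000          # simulation.step(1000)
timestep_ps = 0.002     # integrator time step in picoseconds
report_interval = 10    # NetCDFReporter writes every 10 steps

simulated_ps = n_steps * timestep_ps
n_frames = n_steps // report_interval
print(f"{simulated_ps:.1f} ps simulated, {n_frames} frames written")
```

Writing a frame every 10 steps is frequent for a system of this size, which is why trajectory output dominates the run time; a larger interval would trade time resolution for speed.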



## While we wait, a few asides...

### Force Fields
* Reproducibility - User *must* see the name of what they're using
* Conda data packages - "Plugin" support for additional force fields (anybody can add!)
* Evolving together - Toolkit will support all functional forms in [OpenForceFields repo](https://github.com/openforcefield/openforcefields/)

<img src="img/openforcefields.png" alt="drawing" width="800"/>

<hr/>
    
### Charge generation
* Released FFs only use AM1-BCC, though different semiempirical methods and charge corrections are now available
* "Graph based" charges are coming in the near future -- Consistency and speed!
* Library charge support is available
    
    
<img src="img/xkcd_charge.png" alt="drawing" width="400"/>

<hr/>

### Current cheminformatics toolkit differences
* File formats
* Slight differences in partial charge
* Speed
* SMILES canonicalization
* Behavior stability
* Stereochemistry definition (Edge cases)

## Visualize the simulation (nglview)

In [13]:
# Write the topology to a PDB file so MDTraj can use it to read the trajectory
with open("system.pdb", "w") as pdb_file:
    openmm.app.PDBFile.writeFile(
        pmd_complex_struct.topology, pmd_complex_struct.positions, pdb_file
    )
mdt_traj = mdt.load("trajectory.nc", top="system.pdb")
print(mdt_traj)

view = nglview.show_mdtraj(mdt_traj)
view

<mdtraj.Trajectory with 100 frames, 40097 atoms, 11915 residues, and unitcells>


NGLWidget(max_frame=99)

# What about GROMACS?

OpenMM makes it easy to run molecular simulations without leaving Python. The OpenFF toolkit currently exports directly to OpenMM, but no part of the parameterization process is exclusively supported by OpenMM. With ParmEd and other tools, the same systems can be run in other engines. Here we show how to use ParmEd to prepare and run the same workflow in GROMACS (_Thanks, Dennis Della Corte!_).

Feel free to skip to "Workflow 1 Conclusions" below.

In [17]:
pmd_complex_struct.coordinates = minimized_coords

# Export GROMACS files.
pmd_complex_struct.save("system.top", overwrite=True)
pmd_complex_struct.save("system.gro", overwrite=True)

In [18]:
# TODO: Work out why net charge is  not zero - Seems too big for rounding errors
# TODO: Check if PARMED can produce position restraints, and return define = POSRES to equilibration MDPs
# TODO: Otherwise clean up the notes produced by GROMACS

! gmx -quiet grompp -f minim.mdp -c system.gro -p system.top -o em.tpr -maxwarn 1
! gmx -quiet mdrun -deffnm em

! gmx -quiet grompp -f nvt.mdp -c em.gro -r em.gro -p system.top -o nvt.tpr -maxwarn 1
! gmx -quiet mdrun -deffnm nvt

! gmx -quiet grompp -f npt.mdp -c nvt.gro -r nvt.gro -t nvt.cpt -p system.top -o npt.tpr -maxwarn 1
! gmx -quiet mdrun -deffnm npt

! gmx grompp -f md.mdp -c npt.gro -t npt.cpt -p system.top -o md_0_1.tpr -maxwarn 1
! gmx mdrun -deffnm md_0_1

Ignoring obsolete mdp entry 'ns_type'

NOTE 1 [file minim.mdp]:
  With Verlet lists the optimal nstlist is >= 10, with GPUs >= 20. Note
  that with the Verlet scheme, nstlist has no effect on the accuracy of
  your simulation.

Setting the LD random seed to -134499474

Generated 190 of the 190 non-bonded parameter combinations

Excluding 3 bonded neighbours molecule type 'system1'

Excluding 3 bonded neighbours molecule type 'HOH'

Excluding 3 bonded neighbours molecule type 'NA'

Excluding 3 bonded neighbours molecule type 'CL'

Excluding 3 bonded neighbours molecule type 'CHEMBL1078774'

NOTE 2 [file system.top, line 52558]:
  System has non-zero total charge: 0.004995
  Total charge should normally be an integer. See
  http://www.gromacs.org/Documentation/Floating_Point_Arithmetic
  for discussion on how close it should be to an integer.
  



  You are using Ewald electrostatics in a system with net charge. This can
  lead to severe artifacts, such as ions moving into regions with 

Analysing residue names:
There are:   349    Protein residues
There are: 11503      Water residues
There are:    62        Ion residues
There are:     1      Other residues
Analysing Protein...
Analysing residues not classified as Protein/DNA/RNA/Water and splitting into groups...
Analysing residues not classified as Protein/DNA/RNA/Water and splitting into groups...
Number of degrees of freedom in T-Coupling group Protein is 13666.51
Number of degrees of freedom in T-Coupling group non-Protein is 69331.49

Determining Verlet buffer for a tolerance of 0.005 kJ/mol/ps at 300 K

Calculated rlist for 1x1 atom pair-list as 1.032 nm, buffer size 0.032 nm

Set rlist, assuming 4x4 atom pair-list, to 1.000 nm, buffer size 0.000 nm

Note that mdrun will redetermine rlist based on the actual pair-list setup

Reading Coordinates, Velocities and Box size from old trajectory

Will read whole trajectory
Last frame         -1 time    2.000   

Using frame at t = 2 ps

Starting time for run is 0 ps
Ca


Back Off! I just backed up md_0_1.log to ./#md_0_1.log.3#
Reading file md_0_1.tpr, VERSION 2021.1-MODIFIED (single precision)
Changing nstlist from 10 to 100, rlist from 1 to 1.151

1 GPU selected for this run.
Mapping of GPU IDs to the 2 GPU tasks in the 1 rank on this node:
  PP:0,PME:0
PP tasks will do (non-perturbed) short-ranged interactions on the GPU
PP task will update and constrain coordinates on the CPU
PME tasks will do all aspects on the GPU
Using 1 MPI thread
Using 4 OpenMP threads 


Back Off! I just backed up md_0_1.xtc to ./#md_0_1.xtc.3#

Back Off! I just backed up md_0_1.edr to ./#md_0_1.edr.3#
starting mdrun 'Generic title'
1000 steps,      2.0 ps.

Writing final coordinates.

Back Off! I just backed up md_0_1.gro to ./#md_0_1.gro.3#

               Core t (s)   Wall t (s)        (%)
       Time:       12.285        3.071      400.0
                 (ns/day)    (hour/ns)
Performance:       56.321        0.426

GROMACS reminds you: "I think it would be a good idea." 

* Magic:
    * MDP files already prepared
    * `-maxwarn 1` because of rounding errors with charges
* ParmEd *is* great, but *isn't* perfect, and we're actively working on bugfixes.
* We have philosophical differences about what constitutes "parameterization" 
    * Hbond constraints?
    * Electrostatics cutoffs?
* ParmEd is unable to process several OpenMM GBSA models

_(Takes 120 seconds)_


In [19]:
# TODO: PBC treatment?
mdt_traj = mdt.load("md_0_1.xtc", top="system.gro", stride=1000000)
print(mdt_traj)

<mdtraj.Trajectory with 1 frames, 40097 atoms, 11915 residues, and unitcells>


In [20]:
import nglview

view = nglview.show_mdtraj(mdt_traj)
view

NGLWidget()

### Workflow 1 Conclusions
* Toolkit parameterization requires *8 lines*, three of which are cheap hacks 
* Conda-installable, open source tools performed everything from basic system prep to simulation and visualization
* Using OpenMM, we never had to leave Python
* Using ParmEd, there was little additional work to running with GROMACS


<img src="img/dog_food.jpg" alt="drawing" width="350"/>


## Workflow 2: Changing force field parameters and energy-minimizing the resulting molecule


### Note the recent change to the SMIRNOFF 0.3 specification

```
<Angles version="0.3" potential="harmonic">
		<Angle smirks="[*:1]~[#6X4:2]-[*:3]" angle="109.5*degree" k="100.0*mole**-1*radian**-2*kilocalorie"/>
		<Angle smirks="[#1:1]-[#6X4:2]-[#1:3]" angle="109.5*degree" k="70.0*mole**-1*radian**-2*kilocalorie"/>
</Angles>
```
<hr/>

### Getting started

Let's reload the ligand, in case the live demo had a hiccup above.

Magic:
* To avoid spending time running AM1-BCC again, I'm providing explicitly-defined charges

In [21]:
ligand_path = "fep-benchmark/eg5/chembl_1078774.sdf"
ligand = Molecule.from_file(ligand_path)
ligand.partial_charges = (
    np.array(
        [
            -0.085767,
            -0.0027,
            -0.085767,
            -0.085767,
            -0.1043,
            -0.092,
            -0.174,
            0.1506,
            -0.1383,
            -0.073,
            0.2004,
            -0.4076,
            0.1254,
            -0.1114,
            -0.0684,
            -0.1077,
            0.2508,
            -0.1043,
            -0.092,
            -0.138,
            0.1021,
            -0.4871,
            0.0369,
            -0.1449,
            -0.124,
            -0.7206,
            0.036144,
            0.036144,
            0.036144,
            0.036144,
            0.036144,
            0.036144,
            0.036144,
            0.036144,
            0.036144,
            0.131,
            0.135,
            0.138,
            0.0437,
            0.0442,
            0.0442,
            0.0497,
            0.0497,
            0.0497,
            0.0497,
            0.0567,
            0.0837,
            0.147,
            0.159,
            0.432,
            0.15,
            0.3978,
        ]
    )
    * unit.elementary_charge
)
ligand_positions = ligand.conformers[0]
ligand_topology = ligand.to_topology()
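
When charges are supplied by hand, it is worth checking that they sum to the intended net charge. The sketch below copies the values from the cell above into a plain list and adds them up with the standard library; no Toolkit objects are involved.

```python
import math

# Partial charges copied from the cell above, in units of the elementary charge
charges = [
    -0.085767, -0.0027, -0.085767, -0.085767, -0.1043, -0.092, -0.174,
    0.1506, -0.1383, -0.073, 0.2004, -0.4076, 0.1254, -0.1114, -0.0684,
    -0.1077, 0.2508, -0.1043, -0.092, -0.138, 0.1021, -0.4871, 0.0369,
    -0.1449, -0.124, -0.7206,
    *[0.036144] * 9,
    0.131, 0.135, 0.138, 0.0437, 0.0442, 0.0442,
    *[0.0497] * 4,
    0.0567, 0.0837, 0.147, 0.159, 0.432, 0.15, 0.3978,
]

net_charge = math.fsum(charges)
print(len(charges), "atoms, net charge", round(net_charge, 6))
```

The sum is about 0.005 elementary charges rather than an integer, which is the same size as the total charge GROMACS reports in its `NOTE 2` for Workflow 1. That is suggestive rather than conclusive, since the Workflow 1 charges were generated by the Toolkit and not taken from this list.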

First, we use `ForceField.label_molecules` to identify which torsion parameters were assigned to the hydroxyl.

In [22]:
openff_forcefield = ForceField("openff-1.2.0.offxml")
ff_applied_parameters = openff_forcefield.label_molecules(ligand_topology)[0]
for atoms, parameter in ff_applied_parameters["ProperTorsions"].items():
    ele_1 = ligand.atoms[atoms[0]].element.symbol
    ele_2 = ligand.atoms[atoms[1]].element.symbol
    ele_3 = ligand.atoms[atoms[2]].element.symbol
    ele_4 = ligand.atoms[atoms[3]].element.symbol
    if (ele_1 == "H" and ele_2 == "O") or (ele_3 == "O" and ele_4 == "H"):
        print(atoms, parameter)

(19, 20, 21, 49) <ProperTorsionType with smirks: [*:1]~[#6X3:2]-[#8X2:3]-[#1:4]  periodicity1: 2  phase1: 180.0 deg  id: t97  k1: 0.8722932201352 kcal/mol  idivf1: 1.0  >
(22, 20, 21, 49) <ProperTorsionType with smirks: [*:1]~[#6X3:2]-[#8X2:3]-[#1:4]  periodicity1: 2  phase1: 180.0 deg  id: t97  k1: 0.8722932201352 kcal/mol  idivf1: 1.0  >


In [23]:
hydroxyl_torsion = openff_forcefield.get_parameter_handler("ProperTorsions").parameters[
    "[*:1]~[#6X3:2]-[#8X2:3]-[#1:4]"
]
hydroxyl_torsion.periodicity1 = 2
hydroxyl_torsion.phase1 = 180 * unit.degree
hydroxyl_torsion.k1 = -10 * unit.kilocalorie / unit.mole

In [24]:
type(view)

nglview.widget.NGLWidget

In [25]:
view = minimize_and_visualize(ligand, openff_forcefield)
view

NGLWidget()

## But we didn't need the OFF toolkit to change the parameters for a _single term_
## So, how about changing FF parameters for all H-X-H angles?

In [26]:
ff_applied_parameters = openff_forcefield.label_molecules(ligand_topology)[0]
for atoms, parameter in ff_applied_parameters["Angles"].items():
    ele_1 = ligand.atoms[atoms[0]].element.symbol
    ele_2 = ligand.atoms[atoms[1]].element.symbol
    ele_3 = ligand.atoms[atoms[2]].element.symbol
    if ele_1 == "H" and ele_3 == "H":
        print(atoms, parameter)

(26, 0, 27) <AngleType with smirks: [#1:1]-[#6X4:2]-[#1:3]  angle: 110.2468561538 deg  k: 67.57751269282 kcal/(mol rad**2)  id: a2  >
(26, 0, 28) <AngleType with smirks: [#1:1]-[#6X4:2]-[#1:3]  angle: 110.2468561538 deg  k: 67.57751269282 kcal/(mol rad**2)  id: a2  >
(27, 0, 28) <AngleType with smirks: [#1:1]-[#6X4:2]-[#1:3]  angle: 110.2468561538 deg  k: 67.57751269282 kcal/(mol rad**2)  id: a2  >
(29, 2, 30) <AngleType with smirks: [#1:1]-[#6X4:2]-[#1:3]  angle: 110.2468561538 deg  k: 67.57751269282 kcal/(mol rad**2)  id: a2  >
(29, 2, 31) <AngleType with smirks: [#1:1]-[#6X4:2]-[#1:3]  angle: 110.2468561538 deg  k: 67.57751269282 kcal/(mol rad**2)  id: a2  >
(30, 2, 31) <AngleType with smirks: [#1:1]-[#6X4:2]-[#1:3]  angle: 110.2468561538 deg  k: 67.57751269282 kcal/(mol rad**2)  id: a2  >
(32, 3, 33) <AngleType with smirks: [#1:1]-[#6X4:2]-[#1:3]  angle: 110.2468561538 deg  k: 67.57751269282 kcal/(mol rad**2)  id: a2  >
(32, 3, 34) <AngleType with smirks: [#1:1]-[#6X4:2]-[#1:3]  an

In [27]:
hxh_angle = openff_forcefield.get_parameter_handler("Angles").parameters[
    "[#1:1]-[#6X4:2]-[#1:3]"
]
hxh_angle.angle = 50 * unit.degree

view = minimize_and_visualize(ligand, openff_forcefield)
view

NGLWidget()
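
To get a feel for how strongly this edit distorts the molecule, the harmonic angle term can be evaluated by hand. The sketch assumes the SMIRNOFF harmonic form `E = k/2 * (theta - theta0)**2` and uses the force constant printed for parameter `a2` above; the `harmonic_angle_energy` helper and the choice of 109.5 degrees as a reference geometry are illustrative.

```python
import math


def harmonic_angle_energy(theta_deg, theta0_deg, k):
    """Harmonic angle energy in kcal/mol, with k in kcal/(mol rad**2)."""
    delta = math.radians(theta_deg - theta0_deg)
    return 0.5 * k * delta**2


K_HCH = 67.57751269282  # kcal/(mol rad**2), from the a2 parameter above

# Energy of a roughly tetrahedral H-C-H angle before and after the edit
original = harmonic_angle_energy(109.5, 110.2468561538, K_HCH)
modified = harmonic_angle_energy(109.5, 50.0, K_HCH)
print(f"original: {original:.4f} kcal/mol, modified: {modified:.1f} kcal/mol")
```

With the equilibrium angle moved to 50 degrees, every tetrahedral H-C-H angle starts tens of kcal/mol above its minimum, so the minimizer pulls the hydrogens together.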

![title](img/aperture.jpg)


## Now, let's mess with some torsion parameters
### Load a molecule with more interesting torsion from PDB, supplying complete topological information using SMILES

In [28]:
view = nglview.show_file("CID_15513.pdb")
view

NGLWidget()

In [29]:
ligand = Molecule.from_smiles('COC(=O)C1=CC=C(C=C1)C(=O)O')

In [30]:
omm_pdbfile = openmm.app.PDBFile('CID_15513.pdb')
ligand_topology = Topology.from_openmm(omm_pdbfile.topology, unique_molecules=[ligand])

In [31]:
openff_forcefield = ForceField('openff-1.2.0.offxml')
ligand_system = openff_forcefield.create_openmm_system(ligand_topology)

integrator = openmm.LangevinIntegrator(300*unit.kelvin, 1/unit.picosecond, 0.002*unit.picoseconds)
simulation = openmm.app.Simulation(ligand_topology.to_openmm(), ligand_system, integrator)
simulation.context.setPositions(omm_pdbfile.positions)
simulation.minimizeEnergy()

lig_struct = pmd.openmm.load_topology(simulation.topology, ligand_system, simulation.context.getState(getPositions=True).getPositions())
with NamedTemporaryFile(suffix='.pdb') as tf:
    openmm.app.PDBFile.writeModel(simulation.topology, simulation.context.getState(getPositions=True).getPositions(), open(tf.name, 'w'))
    view = nglview.show_file(tf.name)
view


NGLWidget()

### Let's make the substituent groups perpendicular to the ring.

In [32]:
torsion_smirkses = set()
ff_term_labels = openff_forcefield.label_molecules(ligand_topology)[0]
for atoms, parameter in ff_term_labels['ProperTorsions'].items():
    ele_1 = ligand.atoms[atoms[0]].element.symbol
    ele_2 = ligand.atoms[atoms[1]].element.symbol
    ele_3 = ligand.atoms[atoms[2]].element.symbol
    ele_4 = ligand.atoms[atoms[3]].element.symbol
    if (ele_1 == 'O') or (ele_4 == 'O'):
        print(atoms, parameter)
        torsion_smirkses.add(parameter.smirks)

(0, 1, 2, 3) <ProperTorsionType with smirks: [#8,#16,#7:1]=[#6X3:2]-[#8X2H0:3]-[#6X4:4]  periodicity1: 2  periodicity2: 1  phase1: 180.0 deg  phase2: 180.0 deg  id: t101  k1: 2.458649012284 kcal/mol  k2: 0.2721741554653 kcal/mol  idivf1: 1.0  idivf2: 1.0  >
(1, 2, 4, 5) <ProperTorsionType with smirks: [*:1]~[#6X3:2]-[#6X3$(*=[#8,#16,#7]):3]~[*:4]  periodicity1: 2  phase1: 180.0 deg  id: t47  k1: 0.9350453896311 kcal/mol  idivf1: 1.0  >
(1, 2, 4, 9) <ProperTorsionType with smirks: [*:1]~[#6X3:2]-[#6X3$(*=[#8,#16,#7]):3]~[*:4]  periodicity1: 2  phase1: 180.0 deg  id: t47  k1: 0.9350453896311 kcal/mol  idivf1: 1.0  >
(3, 2, 4, 5) <ProperTorsionType with smirks: [*:1]~[#6X3:2]-[#6X3$(*=[#8,#16,#7]):3]~[*:4]  periodicity1: 2  phase1: 180.0 deg  id: t47  k1: 0.9350453896311 kcal/mol  idivf1: 1.0  >
(3, 2, 4, 9) <ProperTorsionType with smirks: [*:1]~[#6X3:2]-[#6X3$(*=[#8,#16,#7]):3]~[*:4]  periodicity1: 2  phase1: 180.0 deg  id: t47  k1: 0.9350453896311 kcal/mol  idivf1: 1.0  >
(6, 7, 10, 11)

_You can go back to the [original FF file](https://github.com/openforcefield/openforcefields/blob/master/openforcefields/offxml/openff-1.2.0.offxml) to see where these are defined._

This returns three _unique_ parameters, so I use a Python `set` to record all of their SMIRKSes.

Now let's change the underlying FF to prefer those torsions being perpendicular.

In [33]:
for smarts in torsion_smirkses:
    oxygen_torsion = openff_forcefield.get_parameter_handler('ProperTorsions').parameters[smarts]
    oxygen_torsion.periodicity1 = 2
    oxygen_torsion.phase1 = 180 * unit.degree
    oxygen_torsion.k1 = -10 * unit.kilocalorie / unit.mole
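
Why does a negative `k1` favor the perpendicular geometry? Assuming the SMIRNOFF periodic torsion form `E = k * (1 + cos(periodicity * phi - phase))`, the sketch below scans the dihedral angle and reports where the energy is lowest. The `torsion_energy` and `minimum_angle` helpers are hypothetical; the positive force constant is the `t47` value printed above.

```python
import math


def torsion_energy(phi_deg, k, periodicity=2, phase_deg=180.0):
    """Periodic torsion energy in kcal/mol for a dihedral angle in degrees."""
    return k * (1.0 + math.cos(math.radians(periodicity * phi_deg - phase_deg)))


def minimum_angle(k):
    """Return the dihedral angle (0 to 180 degrees, 5 degree steps) with the lowest energy."""
    return min(range(0, 181, 5), key=lambda phi: torsion_energy(phi, k))


print("original k1:", minimum_angle(0.9350453896311))
print("modified k1:", minimum_angle(-10.0))
```

With a positive force constant the minima sit at 0 and 180 degrees (planar); flipping the sign turns those minima into maxima and moves the minimum to 90 degrees, with a well depth set by the magnitude of `k1`.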

In [34]:
ligand_system = openff_forcefield.create_openmm_system(ligand_topology)

integrator = openmm.LangevinIntegrator(300*unit.kelvin, 1/unit.picosecond, 0.002*unit.picoseconds)
simulation = openmm.app.Simulation(ligand_topology.to_openmm(), ligand_system, integrator)
simulation.context.setPositions(omm_pdbfile.positions)
simulation.minimizeEnergy()

lig_struct = pmd.openmm.load_topology(simulation.topology, ligand_system, simulation.context.getState(getPositions=True).getPositions())
with NamedTemporaryFile(suffix='.pdb') as tf:
    openmm.app.PDBFile.writeModel(simulation.topology, simulation.context.getState(getPositions=True).getPositions(), open(tf.name, 'w'))
    view = nglview.show_file(tf.name)
view

NGLWidget()

## Workflow 2 Conclusions:

* The 0.3 update of the SMIRNOFF specification has brought the object model more closely in line with the XML representation
* The ForceField object model exposes a way to inspect the parameters assigned to molecules
* The SMARTS-based parameters themselves can be modified prior to system creation
* The resulting systems are *immediately* ready for calculation
* This API enables fully automated cycles of parameter optimization
* Generally, this creates opportunities to bridge cheminformatics and FF science

Yet to come - An OpenFF `System` class
* Could use a layer of indirection to make parameter optimization more efficient
* Will require resolving questions in the SMIRNOFF spec
    * How will the hierarchy of charge models be resolved?
    * How will `GBSA` and `Electrostatics` forces know to inherit the same charges?
    * Where will VirtualSites, which have both charge and vdW parameters, be defined?
