# Solvating and equilibrating a sulfonamide ligand in a box of water

We want to check that sulfonamide molecules keep a correct geometry in solvent. This notebook solvates a sulfonamide ligand in water, runs a short equilibration, and writes out a trajectory for that assessment.

In [11]:
import time

import mdtraj
import numpy
import openmm
import openmm.app
import openmm.unit
from openff.toolkit import ForceField, Molecule, Topology
from openff.units import unit

from openff.interchange import Interchange
from openff.interchange.components._packmol import RHOMBIC_DODECAHEDRON, pack_box
from openff.interchange.interop.openmm import to_openmm_positions
import MDAnalysis as mda

## Construct the topology

In this example we’ll construct a topology consisting of one ligand in a rhombic dodecahedral box with 1000 water molecules. The ligand is loaded from an SDF file, and water is created from a mapped SMILES so that its atom ordering is well defined. (Atom ordering is not strictly a part of SMILES and is therefore liable to change with updates to RDKit.)

This can be extended or modified by, for example:

   * Replacing this sample ligand with a different ligand of interest - substitute the ligand SDF file
   * Using a different number of water molecules - substitute the 1000 used below
   * Adding ions or co-solvents into the box - add more Molecule objects as desired

In [2]:
ligand = Molecule.from_file('../geometries/36973709/qcaid_36973709.sdf', allow_undefined_stereo=True)
water = Molecule.from_mapped_smiles("[H:2][O:1][H:3]")

There are a few ways to build an OpenFF Topology object from these molecules. Since we already know how many of which molecules we want, we’ll use a PACKMOL wrapper shipped with Interchange. The Topology object returned by pack_box contains the ligand, 1000 copies of water, the box vectors we asked for, and the positions generated by PACKMOL.

In [3]:
topology = pack_box(
    molecules=[ligand, water],
    number_of_copies=[1, 1000],
    box_vectors=3.5 * RHOMBIC_DODECAHEDRON * unit.nanometer,
)
topology.n_molecules, topology.box_vectors, topology.get_positions().shape

(1001,
 array([[3.5       , 0.        , 0.        ],
        [0.        , 3.5       , 0.        ],
        [1.75      , 1.75      , 2.47487373]]) <Unit('nanometer')>,
 (3043, 3))
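
The numbers in this output can be cross-checked by hand. The following is a minimal sketch using only the standard library and the values printed above. The water molar mass and Avogadro constant are standard values supplied here, not taken from the notebook, and the density estimate ignores the ligand's mass, so it is only a rough sanity check.

```python
# Box vectors (nm) copied from the pack_box output above.
box = [
    [3.5, 0.0, 0.0],
    [0.0, 3.5, 0.0],
    [1.75, 1.75, 2.47487373],
]


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


# The volume of a triclinic box is the determinant of its box vectors.
volume_nm3 = det3(box)

# Atom bookkeeping: total atoms minus three atoms per water leaves the ligand.
n_atoms_total = 3043
n_waters = 1000
n_ligand_atoms = n_atoms_total - 3 * n_waters

# Rough water-only mass density (assumed constants, ligand mass ignored).
AVOGADRO = 6.02214076e23  # 1/mol
WATER_MOLAR_MASS = 18.015  # g/mol
NM3_TO_CM3 = 1e-21

water_mass_g = n_waters * WATER_MOLAR_MASS / AVOGADRO
approx_density = water_mass_g / (volume_nm3 * NM3_TO_CM3)

print(f"volume = {volume_nm3:.3f} nm^3")
print(f"ligand atoms = {n_ligand_atoms}")
print(f"approximate water density = {approx_density:.2f} g/mL")
```

The computed volume, about 30.317 nm^3, is the same value the simulation reports at step 0 further down, and the atom count implies a 43-atom ligand.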

The ["Sage"](https://openforcefield.org/community/news/general/sage2.0.0-release/) force field line (version 2.x.x) includes TIP3P parameters for water, so we don't need to use multiple force fields to parametrize this topology as long as we're okay using TIP3P.

Note that the "Parsley" (version 1.x.x) line did *not* include TIP3P parameters, so loading in an extra force field was required.

In [5]:
sage = ForceField("../../../openff_unconstrained-2.2.1-rc1.offxml", allow_cosmetic_attributes=True)

From here, we can create an Interchange object, which stores the results of applying the force field to the topology. Since the Topology object contained positions and box vectors, we don’t need to set them again - they’re already set on the Interchange object!

In [6]:
interchange: Interchange = Interchange.from_smirnoff(
    force_field=sage, topology=topology
)
interchange.topology.n_atoms, interchange.box, interchange.positions.shape

(3043,
 array([[3.5       , 0.        , 0.        ],
        [0.        , 3.5       , 0.        ],
        [1.75      , 1.75      , 2.47487373]]) <Unit('nanometer')>,
 (3043, 3))

Now, we can prepare everything else that OpenMM needs to run and report a brief equilibration simulation:
* A barostat, since we want to use NPT dynamics to relax the box size toward equilibrium
* An integrator
* A [`Simulation`](http://docs.openmm.org/latest/api-python/generated/openmm.app.simulation.Simulation.html#openmm.app.simulation.Simulation) object, putting it together
* Reporters for the trajectory and simulation data

For convenience, let's wrap some boilerplate code into a function that can be called again later with different inputs.

In [7]:
def create_simulation(
    interchange: Interchange,
    pdb_stride: int = 500,
    trajectory_name: str = "trajectory.pdb",
) -> openmm.app.Simulation:
    integrator = openmm.LangevinIntegrator(
        300 * openmm.unit.kelvin,
        1 / openmm.unit.picosecond,
        1 * openmm.unit.femtoseconds,
    )

    # The barostat temperature should match the thermostat temperature (300 K)
    barostat = openmm.MonteCarloBarostat(
        1.0 * openmm.unit.bar, 300 * openmm.unit.kelvin, 25
    )

    simulation = interchange.to_openmm_simulation(
        combine_nonbonded_forces=True,
        integrator=integrator,
    )

    simulation.system.addForce(barostat)

    # https://github.com/openmm/openmm/wiki/Frequently-Asked-Questions#why-does-it-ignore-changes-i-make-to-a-system-or-force
    simulation.context.reinitialize(preserveState=True)

    # https://github.com/openmm/openmm/issues/3736#issuecomment-1217250635
    simulation.minimizeEnergy()

    simulation.context.setVelocitiesToTemperature(300 * openmm.unit.kelvin)
    simulation.context.computeVirtualSites()

    pdb_reporter = openmm.app.PDBReporter(trajectory_name, pdb_stride)
    state_data_reporter = openmm.app.StateDataReporter(
        "data.csv",
        10,
        step=True,
        potentialEnergy=True,
        temperature=True,
        density=True,
    )
    simulation.reporters.append(pdb_reporter)
    simulation.reporters.append(state_data_reporter)

    return simulation

In [8]:
simulation = create_simulation(interchange, trajectory_name="sulfonamide_trajectory.pdb")

Finally, we can run this simulation. A short run (the default `n_steps=5000`) should take approximately 10-20 seconds on a laptop or small workstation; longer runs scale roughly linearly with the number of steps.

Again, let's wrap this up into a function to avoid copy-pasting code.

In [9]:
def run_simulation(simulation: openmm.app.Simulation, n_steps: int = 5000):
    print("Starting simulation")
    # perf_counter measures wall-clock time; process_time counts only CPU time
    start_time = time.perf_counter()

    print("Step, volume (nm^3)")

    for step in range(n_steps):
        simulation.step(1)
        if step % 500 == 0:
            # The box volume is the determinant of the box vector matrix
            state = simulation.context.getState()
            box_vectors = state.getPeriodicBoxVectors(asNumpy=True)
            volume = numpy.linalg.det(box_vectors.value_in_unit(openmm.unit.nanometer))
            print(step, volume.round(3))

    end_time = time.perf_counter()
    print(f"Elapsed time: {(end_time - start_time):.2f} seconds")
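
Before launching a long run, it can help to work out how much simulated time and output it will produce. The sketch below is plain arithmetic based on the settings used in this notebook (1 fs time step, PDB frame every 500 steps, a data row every 10 steps, a printed line every 500 steps). The function name `run_summary` and its parameters are hypothetical, and it assumes the reporters start counting from step 0.

```python
def run_summary(
    n_steps,
    timestep_fs=1.0,
    pdb_stride=500,
    data_stride=10,
    print_stride=500,
):
    """Estimate simulated time and output sizes for a run of n_steps.

    Hypothetical helper: assumes reporters fire whenever the step count
    is a multiple of their stride, starting from step 0.
    """
    return {
        # femtoseconds -> picoseconds
        "simulated_ps": n_steps * timestep_fs / 1000.0,
        # frames written to the PDB trajectory
        "pdb_frames": n_steps // pdb_stride,
        # rows written to data.csv (excluding the header)
        "data_rows": n_steps // data_stride,
        # lines printed by the loop, which tests `step % print_stride == 0`
        # for step = 0 .. n_steps - 1
        "printed_lines": (n_steps + print_stride - 1) // print_stride,
    }


summary = run_summary(100_000)
print(summary)
```

For `n_steps=100000` this gives 100 ps of simulated time, 200 trajectory frames, and 10000 rows of state data.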

In [10]:
run_simulation(simulation, n_steps=100000)

Starting simulation
Step, volume (nm^3)
0 30.317
500 30.863
1000 30.659
1500 30.54
2000 30.812
2500 31.169
3000 30.776
3500 30.756
4000 30.766
4500 30.769
5000 30.646
5500 30.556
6000 31.138
6500 30.975
7000 30.966
7500 30.805
8000 31.139
8500 30.858
9000 31.077
9500 30.881
10000 30.901
10500 30.863
11000 31.226
11500 31.237
12000 30.914
12500 30.383
13000 30.495
13500 30.506
14000 30.416
14500 30.548
15000 30.348
15500 30.33
16000 30.196
16500 30.125
17000 30.708
17500 30.717
18000 30.879
18500 30.65
19000 30.617
19500 30.632
20000 31.153
20500 31.068
21000 31.318
21500 31.188
22000 31.22
22500 30.825
23000 30.991
23500 30.98
24000 31.114
24500 30.752
25000 30.746
25500 30.666
26000 30.737
26500 30.928
27000 30.514
27500 30.664
28000 30.835
28500 31.238
29000 30.692
29500 31.258
30000 30.551
30500 30.379
31000 30.402
31500 30.46
32000 30.69
32500 30.589
33000 30.338
33500 30.576
34000 30.575
34500 30.862
35000 30.314
35500 30.224
36000 30.847
36500 31.052
37000 31.409
37500 31.336
380
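
The volume fluctuates from step to step, so a single printed value says little about whether the box has relaxed. One option is to average the density column that the `StateDataReporter` writes to `data.csv`, discarding the early part of the run. The following is a minimal standard-library sketch; the helper name `column_mean` is hypothetical, and it assumes the CSV header is a single comma-separated line starting with `#` and containing a column named `Density (g/mL)`, which should be checked against the actual file.

```python
import csv
from statistics import mean


def column_mean(path, column, discard_fraction=0.5):
    """Average one column of a reporter CSV, skipping the early rows.

    Assumes the first line is a header such as
    #"Step","Potential Energy (kJ/mole)","Temperature (K)","Density (g/mL)"
    """
    with open(path, newline="") as handle:
        rows = list(csv.reader(handle))

    # Remove the leading '#' and any leftover quotes from the header names.
    header = [name.lstrip("#").strip('"') for name in rows[0]]
    index = header.index(column)

    values = [float(row[index]) for row in rows[1:] if row]

    # Drop the first part of the run, treated here as equilibration.
    start = int(len(values) * discard_fraction)
    return mean(values[start:])


# Example usage once the simulation has written its data file:
# column_mean("data.csv", "Density (g/mL)")
```

The choice to discard the first half of the rows is arbitrary; plotting the column against the step count is a better guide to where the drift stops.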

In [13]:
# Strip water: keep only the first residue, which is assumed to be the ligand
u = mda.Universe("sulfonamide_trajectory.pdb")
with mda.Writer("sulfonamide_trajectory_noh2o.pdb", len(u.residues[0].atoms)) as writer:
    for _ in u.trajectory:
        writer.write(u.residues[0].atoms)
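
The cell above relies on the ligand being the first residue in the file. As a cross-check that does not depend on residue order, the water can also be filtered out by residue name. This is a dependency-free sketch, not the method used above: the function name `strip_water_lines` is hypothetical, and the residue names `HOH` and `WAT` are assumptions that should be verified against the actual trajectory file.

```python
WATER_NAMES = frozenset({"HOH", "WAT"})  # assumed water residue names


def strip_water_lines(lines, water_names=WATER_NAMES):
    """Return PDB lines with water ATOM/HETATM/TER records removed.

    In the fixed-width PDB format the residue name occupies columns
    18-20, i.e. the slice [17:20] of each line. All other records
    (CRYST1, MODEL, ENDMDL, CONECT, END) are kept unchanged, and atom
    serial numbers are not renumbered.
    """
    kept = []
    for line in lines:
        record = line[:6].strip()
        is_coordinate_record = record in {"ATOM", "HETATM", "TER"}
        if is_coordinate_record and line[17:20].strip() in water_names:
            continue
        kept.append(line)
    return kept


# Example usage on the trajectory written by the simulation:
# with open("sulfonamide_trajectory.pdb") as source:
#     ligand_only = strip_water_lines(source)
# with open("sulfonamide_trajectory_noh2o.pdb", "w") as target:
#     target.writelines(ligand_only)
```

If the two approaches give different atom counts per frame, the ligand is probably not residue 0 or the water residue name differs from the assumed ones.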

