# Solvating and equilibrating a sulfonamide ligand in a box of water

We want to check that sulfonamide molecules keep the correct geometry in solvent. This test solvates a sulfonamide ligand in water and runs a short equilibration so that its geometry can be assessed.

In [2]:
import time

import mdtraj
import numpy
import openmm
import openmm.app
import openmm.unit
from openff.toolkit import ForceField, Molecule, Topology
from openff.units import unit

from openff.interchange import Interchange
from openff.interchange.components._packmol import RHOMBIC_DODECAHEDRON, pack_box
from openff.interchange.interop.openmm import to_openmm_positions

## Construct the topology

In this example we’ll construct a topology consisting of one ligand in a rhombic dodecahedral box with 1000 water molecules. The ligand is loaded from an SDF file, and water is created from a mapped SMILES so that its atom ordering is fixed. (Atom ordering is not strictly a part of SMILES and is therefore liable to change with updates to RDKit.)

This can be extended or modified by, e.g.,

   * Replacing this sample ligand with a different ligand of interest - substitute out the ligand SDF file
   * Using a different number of water molecules - substitute out the 1000 used below
   * Adding ions or co-solvents into the box - add more Molecule objects as desired

In [3]:
ligand = Molecule.from_file('../geometries/qcaid_36972425.sdf', allow_undefined_stereo=True)
water = Molecule.from_mapped_smiles("[H:2][O:1][H:3]")

There are a few ways to combine these molecules into an OpenFF Topology object. Since we already know how many of which molecules we want, we’ll use a PACKMOL wrapper shipped with Interchange. The Topology object returned by pack_box contains the ligand, 1000 copies of water, the box vectors we asked for, and the positions generated by PACKMOL.

In [4]:
topology = pack_box(
    molecules=[ligand, water],
    number_of_copies=[1, 1000],
    box_vectors=3.5 * RHOMBIC_DODECAHEDRON * unit.nanometer,
)
topology.n_molecules, topology.box_vectors, topology.get_positions().shape

(1001,
 array([[3.5       , 0.        , 0.        ],
        [0.        , 3.5       , 0.        ],
        [1.75      , 1.75      , 2.47487373]]) <Unit('nanometer')>,
 (3031, 3))
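As a quick sanity check on the box that was just packed, the sketch below recomputes the edge lengths and volume from the box vectors printed above, then estimates the mass density of the water alone. It uses only the standard library. The water molar mass and the choice to ignore the ligand mass are assumptions made for illustration, so the density is a rough lower bound, not a reported result.

```python
import math

# Box vectors in nm, copied from the pack_box output above.
box_vectors = [
    (3.5, 0.0, 0.0),
    (0.0, 3.5, 0.0),
    (1.75, 1.75, 2.47487373),
]


def triple_product(a, b, c):
    """Return a . (b x c), the signed volume spanned by three vectors."""
    cross = (
        b[1] * c[2] - b[2] * c[1],
        b[2] * c[0] - b[0] * c[2],
        b[0] * c[1] - b[1] * c[0],
    )
    return sum(x * y for x, y in zip(a, cross))


# All three edges of this box should have the same length.
edge_lengths = [math.sqrt(sum(x * x for x in v)) for v in box_vectors]
volume_nm3 = abs(triple_product(*box_vectors))

# Water-only density; the ligand is ignored (assumption for illustration).
AVOGADRO = 6.02214076e23  # 1/mol
WATER_MOLAR_MASS = 18.015  # g/mol, assumed value
NM3_TO_CM3 = 1e-21
n_waters = 1000
density = n_waters * WATER_MOLAR_MASS / AVOGADRO / (volume_nm3 * NM3_TO_CM3)

print("edge lengths (nm):", [round(length, 3) for length in edge_lengths])
print("volume (nm^3):", round(volume_nm3, 3))
print("water-only density (g/cm^3):", round(density, 3))
```

The volume computed this way is the starting volume of the box, before the barostat is allowed to change it.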

The ["Sage"](https://openforcefield.org/community/news/general/sage2.0.0-release/) force field line (version 2.x.x) includes TIP3P parameters for water, so we don't need to use multiple force fields to parametrize this topology as long as we're okay using TIP3P.

Note that the "Parsley" (version 1.x.x) line did *not* include TIP3P parameters, so loading in an extra force field was required.

In [5]:
sage = ForceField("../../../openff_unconstrained-2.2.0-rc1.offxml", allow_cosmetic_attributes=True)

From here, we can create an Interchange object, which stores the results of applying the force field to the topology. Since the Topology object contained positions and box vectors, we don’t need to set them again - they’re already set on the Interchange object!

In [6]:
interchange: Interchange = Interchange.from_smirnoff(
    force_field=sage, topology=topology
)
interchange.topology.n_atoms, interchange.box, interchange.positions.shape

(3031,
 array([[3.5       , 0.        , 0.        ],
        [0.        , 3.5       , 0.        ],
        [1.75      , 1.75      , 2.47487373]]) <Unit('nanometer')>,
 (3031, 3))

Now, we can prepare everything else that OpenMM needs to run and report a brief equilibration simulation:
* A barostat, since we want to use NPT dynamics to relax the box size toward equilibrium
* An integrator
* A [`Simulation`](http://docs.openmm.org/latest/api-python/generated/openmm.app.simulation.Simulation.html#openmm.app.simulation.Simulation) object, putting it together
* Reporters for the trajectory and simulation data

For convenience, let's wrap some boilerplate code into a function that can be called again later with different inputs.

In [7]:
def create_simulation(
    interchange: Interchange,
    pdb_stride: int = 500,
    trajectory_name: str = "trajectory.pdb",
) -> openmm.app.Simulation:
    integrator = openmm.LangevinIntegrator(
        300 * openmm.unit.kelvin,
        1 / openmm.unit.picosecond,
        1 * openmm.unit.femtoseconds,
    )

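    # Arguments: pressure, temperature, and frequency (in steps) of volume moves.
    # Note: the barostat temperature (293.15 K) differs from the integrator's
    # 300 K; these are normally set to the same value.
    barostat = openmm.MonteCarloBarostat(
        1.0 * openmm.unit.bar, 293.15 * openmm.unit.kelvin, 25
    )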

    simulation = interchange.to_openmm_simulation(
        combine_nonbonded_forces=True,
        integrator=integrator,
    )

    simulation.system.addForce(barostat)

    # https://github.com/openmm/openmm/wiki/Frequently-Asked-Questions#why-does-it-ignore-changes-i-make-to-a-system-or-force
    simulation.context.reinitialize(preserveState=True)

    # https://github.com/openmm/openmm/issues/3736#issuecomment-1217250635
    simulation.minimizeEnergy()

    simulation.context.setVelocitiesToTemperature(300 * openmm.unit.kelvin)
    simulation.context.computeVirtualSites()

    pdb_reporter = openmm.app.PDBReporter(trajectory_name, pdb_stride)
    state_data_reporter = openmm.app.StateDataReporter(
        "data.csv",
        10,
        step=True,
        potentialEnergy=True,
        temperature=True,
        density=True,
    )
    simulation.reporters.append(pdb_reporter)
    simulation.reporters.append(state_data_reporter)

    return simulation
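The `StateDataReporter` configured in `create_simulation` writes `data.csv`, which is handy for checking whether the density has levelled off. Below is a minimal sketch of reading such a file with the standard library. The header layout (a leading `#` followed by quoted column names) and the column names are assumptions about what the reporter writes for these options, and the numbers in `sample` are made up for illustration, not simulation output.

```python
import csv
import statistics

# Made-up sample in the layout StateDataReporter is assumed to write for
# step=True, potentialEnergy=True, temperature=True, density=True.
sample = '''#"Step","Potential Energy (kJ/mole)","Temperature (K)","Density (g/mL)"
10,-41250.3,287.1,0.991
20,-41198.7,293.4,0.993
30,-41102.5,301.2,0.990
40,-41177.9,298.8,0.992
'''


def read_state_data(text):
    """Parse reporter output into a list of {column name: float} rows."""
    lines = text.strip().splitlines()
    # Drop the leading '#' so the header parses as ordinary quoted CSV.
    lines[0] = lines[0].lstrip("#")
    return [
        {name: float(value) for name, value in row.items()}
        for row in csv.DictReader(lines)
    ]


rows = read_state_data(sample)
mean_density = statistics.mean(row["Density (g/mL)"] for row in rows)
print(f"{len(rows)} rows, mean density {mean_density:.4f} g/mL")
```

For a real run, the same helper could be called on the file contents, e.g. `read_state_data(open("data.csv").read())`.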

In [8]:
simulation = create_simulation(interchange, trajectory_name="sulfamide_trajectory.pdb")

Finally, we can run this simulation. A short run, such as the default of 5000 steps, should take approximately 10-20 seconds on a laptop or small workstation; longer runs take proportionally more time.

Again, let's wrap this up into a function to avoid copy-pasting code.

In [9]:
def run_simulation(simulation: openmm.app.Simulation, n_steps: int = 5000):
    print("Starting simulation")
    # Wall-clock timer; time.process_time() would count only this process's CPU time
    start_time = time.perf_counter()

    print("Step, volume (nm^3)")

    for step in range(n_steps):
        simulation.step(1)
        if step % 500 == 0:
            # The box volume is the determinant of the 3x3 matrix of box vectors
            state = simulation.context.getState()
            box_vectors = state.getPeriodicBoxVectors(asNumpy=True)
            volume = numpy.linalg.det(box_vectors.value_in_unit(openmm.unit.nanometer))
            print(step, volume.round(3))

    end_time = time.perf_counter()
    print(f"Elapsed time: {(end_time - start_time):.2f} seconds")
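To get a feel for how much simulated time and output a given `n_steps` produces, the sketch below does the arithmetic for the 1 fs timestep and the reporter strides used in `create_simulation` (500 steps for the PDB trajectory, 10 steps for `data.csv`). The helper `run_summary` is hypothetical and exists only for this illustration. It assumes each reporter writes once every `stride` steps, starting from step 0.

```python
def run_summary(n_steps, timestep_fs=1.0, pdb_stride=500, data_stride=10):
    """Estimate simulated time and reporter output for a run of n_steps."""
    return {
        "simulated_ps": n_steps * timestep_fs / 1000.0,  # 1000 fs per ps
        "pdb_frames": n_steps // pdb_stride,
        "data_rows": n_steps // data_stride,
    }


for n_steps in (5000, 100000):
    print(n_steps, run_summary(n_steps))
```

Under these assumptions, a 100000-step run covers 100 ps and writes 200 trajectory frames, which is worth keeping in mind because PDB trajectories of solvated systems grow large quickly.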

In [10]:
run_simulation(simulation, n_steps=100000)

Starting simulation
Step, volume (nm^3)
0 30.317
500 30.603
1000 30.603
1500 30.569
2000 30.455
2500 30.876
3000 30.691
3500 30.174
4000 30.159
4500 30.742
5000 30.401
5500 30.743
6000 30.243
6500 29.975
7000 30.429
7500 30.161
8000 30.41
8500 30.248
9000 30.485
9500 30.605
10000 30.981
10500 30.485
11000 30.49
11500 30.655
12000 31.03
12500 30.669
13000 30.267
13500 30.854
14000 30.633
14500 30.654
15000 30.588
15500 30.875
16000 30.628
16500 29.965
17000 30.775
17500 30.345
18000 30.187
18500 30.692
19000 30.565
19500 30.433
20000 30.907
20500 30.624
21000 30.919
21500 31.048
22000 30.773
22500 30.493
23000 30.579
23500 30.824
24000 30.347
24500 30.195
25000 30.438
25500 30.793
26000 30.516
26500 30.587
27000 31.088
27500 30.364
28000 30.256
28500 30.193
29000 30.646
29500 30.663
30000 30.591
30500 31.37
31000 30.831
31500 30.505
32000 30.465
32500 30.69
33000 30.905
33500 31.185
34000 30.916
34500 30.69
35000 30.758
35500 30.694
36000 30.722
36500 30.662
37000 30.397
37500 30.923
38