# Solvating and equilibrating a ligand in a box of water

<details>
    <summary><small>▼ Click here for dependency installation instructions</small></summary>
    The simplest way to install dependencies is to use the Interchange examples environment. From the root of the cloned openff-interchange repository:
    
    conda env create --name interchange-examples --file devtools/conda-envs/examples_env.yaml 
    conda activate interchange-examples
    pip install -e .
    cd examples
    jupyter notebook ligand_in_water.ipynb
    
</details>

In [None]:
import time

import mdtraj
import openmm
import openmm.app
from openff.toolkit import ForceField, Molecule, Topology
from openff.units import unit
from openff.units.openmm import to_openmm

from openff.interchange import Interchange

In [None]:
ligand = Molecule.from_mapped_smiles(
    "[H:7][C@:6]1([C:13](=[C:11]([C:9](=[O:10])[O:8]1)[O:12][H:19])[O:14][H:20])[C@:3]([H:4])([C:2]([H:16])([H:17])[O:1][H:15])[O:5][H:18]"
)
water = Molecule.from_mapped_smiles("[H:2][O:1][H:3]")

There are a few ways to build an OpenFF [`Topology`](https://docs.openforcefield.org/projects/toolkit/en/stable/api/generated/openff.toolkit.topology.Topology.html#openff.toolkit.topology.Topology) object for this system. In this case, since we already know how many of each molecule we want, we'll use [`Topology.from_molecules`](https://docs.openforcefield.org/projects/toolkit/en/stable/api/generated/openff.toolkit.topology.Topology.html#openff.toolkit.topology.Topology.from_molecules), which takes a list of `Molecule` objects and assembles them into a `Topology`. The order of the molecules (and of the atoms within them) must match the prepared PDB file, since positions are later assigned by index.

In [None]:
topology = Topology.from_molecules([ligand, *2100 * [water]])
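
The `*2100 * [water]` expression repeats the water molecule 2100 times and unpacks the result into the list. As a quick sanity check on the size of the resulting system, the sketch below counts atoms from the mapped SMILES strings alone. It uses only the standard library, and `count_mapped_atoms` is a hypothetical helper introduced for illustration, not part of the OpenFF Toolkit.

```python
import re

LIGAND_SMILES = (
    "[H:7][C@:6]1([C:13](=[C:11]([C:9](=[O:10])[O:8]1)[O:12][H:19])[O:14][H:20])"
    "[C@:3]([H:4])([C:2]([H:16])([H:17])[O:1][H:15])[O:5][H:18]"
)
WATER_SMILES = "[H:2][O:1][H:3]"
N_WATERS = 2100


def count_mapped_atoms(mapped_smiles: str) -> int:
    """Count atoms in a fully mapped SMILES by counting its ':<index>]' atom maps."""
    return len(re.findall(r":\d+\]", mapped_smiles))


# Plain strings stand in for Molecule objects; the list is built the same way.
smiles_list = [LIGAND_SMILES, *N_WATERS * [WATER_SMILES]]

n_molecules = len(smiles_list)
n_atoms = sum(count_mapped_atoms(smiles) for smiles in smiles_list)
print(n_molecules, n_atoms)
```

The atom total should equal the number of atoms in `solvated.pdb`; a mismatch would mean the list does not describe the same system as the file.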

We'll also set the box vectors to match the prepared PDB file.

In [None]:
topology.box_vectors = unit.Quantity(
    mdtraj.load("solvated.pdb").unitcell_lengths[0],
    unit.nanometer,
)
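
MDTraj stores `unitcell_lengths` as three edge lengths per frame, whereas periodic box vectors are in general three 3-component vectors. For a rectangular box they form a diagonal matrix. A minimal sketch of that relationship, using made-up edge lengths rather than the values in `solvated.pdb`:

```python
# Hypothetical edge lengths in nanometers; the real values come from the PDB file.
edge_lengths_nm = [4.0, 4.0, 4.0]

# Each box vector points along one Cartesian axis with the matching edge length.
box_vectors_nm = [
    [length if row == col else 0.0 for col in range(3)]
    for row, length in enumerate(edge_lengths_nm)
]

# For a rectangular box the volume is simply the product of the edge lengths.
volume_nm3 = edge_lengths_nm[0] * edge_lengths_nm[1] * edge_lengths_nm[2]

for vector in box_vectors_nm:
    print(vector)
print(f"volume: {volume_nm3:.1f} nm^3")
```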

The positions of each atom are set further below, once the `Interchange` object has been created.

The ["Sage"](https://openforcefield.org/community/news/general/sage2.0.0-release/) force field line (version 2.x.x) includes TIP3P parameters for water, so we don't need to use multiple force fields to parametrize this topology as long as we're okay using TIP3P.

Note that the "Parsley" (version 1.x.x) line did *not* include TIP3P parameters, so loading in an extra force field was required.

In [None]:
sage = ForceField("openff-2.0.0.offxml")
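
The release line can be read directly off the file name. The helpers below are hypothetical conveniences (they are not part of the OpenFF Toolkit) that assume file names of the form `openff-X.Y.Z.offxml` and map the major version to the line names mentioned above:

```python
import re

# Major version -> release line, as described in the text above.
RELEASE_LINES = {1: "Parsley", 2: "Sage"}


def release_line(offxml_name: str) -> str:
    """Return the release line for a name like 'openff-2.0.0.offxml'."""
    match = re.fullmatch(r"openff-(\d+)\.(\d+)\.(\d+)\.offxml", offxml_name)
    if match is None:
        raise ValueError(f"unrecognized force field file name: {offxml_name!r}")
    major = int(match.group(1))
    return RELEASE_LINES.get(major, "unknown")


def bundles_tip3p(offxml_name: str) -> bool:
    """True if the release line ships TIP3P water (Sage does, Parsley does not)."""
    return release_line(offxml_name) == "Sage"


print(release_line("openff-2.0.0.offxml"), bundles_tip3p("openff-2.0.0.offxml"))
print(release_line("openff-1.3.0.offxml"), bundles_tip3p("openff-1.3.0.offxml"))
```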

From here, we can create an ``Interchange`` object and promptly export it to an [``openmm.System``](http://docs.openmm.org/latest/api-python/generated/openmm.openmm.System.html#openmm.openmm.System):

In [None]:
interchange = Interchange.from_smirnoff(force_field=sage, topology=topology)
system = interchange.to_openmm(combine_nonbonded_forces=True)

Note that these two steps (creating an `Interchange` object and exporting it to an [``openmm.System``](http://docs.openmm.org/latest/api-python/generated/openmm.openmm.System.html#openmm.openmm.System)) do essentially the same thing as calling [``ForceField.create_openmm_system``](https://docs.openforcefield.org/projects/toolkit/en/stable/api/generated/openff.toolkit.typing.engines.smirnoff.forcefield.ForceField.html#openff.toolkit.typing.engines.smirnoff.forcefield.ForceField.create_openmm_system), i.e. `sage.create_openmm_system(topology)`, which gives the same result if you are only interested in using OpenMM. Here we explicitly keep the intermediate `Interchange` object for later steps.

Next, we need to set the positions according to the PDB file. There are plenty of ways to extract positions from a PDB file; here we'll use MDTraj.

In [None]:
interchange.positions = unit.Quantity(
    mdtraj.load("solvated.pdb").xyz[0],
    unit.nanometer,
)

Now, we can prepare everything else that OpenMM needs to run and report a brief equilibration simulation:
* A barostat, since we want to use NPT dynamics to relax the box size toward equilibrium
* An integrator
* A [`Simulation`](http://docs.openmm.org/latest/api-python/generated/openmm.app.simulation.Simulation.html#openmm.app.simulation.Simulation) object, putting it together
* Reporters for the trajectory and simulation data

In [None]:
# The barostat temperature should match the integrator temperature (300 K)
barostat = openmm.MonteCarloBarostat(
    1.00 * openmm.unit.bar, 300 * openmm.unit.kelvin, 25
)
system.addForce(barostat)

integrator = openmm.LangevinIntegrator(
    300 * openmm.unit.kelvin, 1 / openmm.unit.picosecond, 2 * openmm.unit.femtoseconds
)

simulation = openmm.app.Simulation(topology.to_openmm(), system, integrator)
simulation.context.setPositions(to_openmm(interchange.positions))
simulation.context.setVelocitiesToTemperature(300 * openmm.unit.kelvin)

pdb_reporter = openmm.app.PDBReporter("trajectory.pdb", 100)
state_data_reporter = openmm.app.StateDataReporter(
    "data.csv", 10, step=True, potentialEnergy=True, temperature=True, density=True
)
simulation.reporters.append(pdb_reporter)
simulation.reporters.append(state_data_reporter)
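
It can help to work out in advance how much output these settings produce. The sketch below is plain arithmetic and assumes the 5000-step run used in the next cell; the variable names are illustrative.

```python
n_steps = 5000             # length of the run in the next cell
timestep_fs = 2            # integrator time step, in femtoseconds
barostat_interval = 25     # steps between Monte Carlo volume-move attempts
pdb_interval = 100         # steps between trajectory frames
state_data_interval = 10   # steps between rows in data.csv

simulated_time_ps = n_steps * timestep_fs / 1000
n_volume_moves = n_steps // barostat_interval
n_frames = n_steps // pdb_interval
n_data_rows = n_steps // state_data_interval

print(f"{simulated_time_ps} ps simulated")
print(f"about {n_volume_moves} volume moves attempted")
print(f"{n_frames} PDB frames, {n_data_rows} rows of state data")
```

Ten picoseconds is enough to watch the box begin to relax, but it is far shorter than a production equilibration.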

Finally, we can run this simulation. This should take approximately 10-20 seconds on a laptop or small workstation.

In [None]:
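print("Starting simulation")
# perf_counter measures wall-clock time; process_time would report CPU time instead
start_time = time.perf_counter()

print("Step, box lengths (nm)")

for step in range(5000):
    simulation.step(1)
    if step % 500 == 0:
        box_vectors = simulation.context.getState().getPeriodicBoxVectors()
        # The diagonal elements are the edge lengths of a rectangular box
        lengths = [
            box_vectors[dim][dim].value_in_unit(openmm.unit.nanometer)
            for dim in range(3)
        ]
        print(step, [round(length, 3) for length in lengths])

end_time = time.perf_counter()
print(f"Elapsed time: {(end_time - start_time):.2f} seconds")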
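
To judge whether the printed box lengths are plausible, they can be compared with a back-of-the-envelope estimate. The sketch below assumes a cubic box and a density of about 1 g/cm³, and it neglects the volume of the single ligand; the real box need not be cubic, and the water model's equilibrium density will differ somewhat.

```python
AVOGADRO = 6.02214076e23        # molecules per mole
WATER_MOLAR_MASS = 18.015       # g/mol
ASSUMED_DENSITY = 1.0           # g/cm^3, a rough value for liquid water
NM3_PER_CM3 = 1e21              # 1 cm = 1e7 nm, so 1 cm^3 = 1e21 nm^3

n_waters = 2100
mass_g = n_waters * WATER_MOLAR_MASS / AVOGADRO
volume_nm3 = mass_g / ASSUMED_DENSITY * NM3_PER_CM3
edge_nm = volume_nm3 ** (1 / 3)

print(f"volume ~ {volume_nm3:.1f} nm^3, cubic edge ~ {edge_nm:.2f} nm")
```

Box lengths that drift far from this estimate during the run would be a hint that something is wrong with the setup.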

## Appendix A: visualizing the trajectory

If [NGLView](http://nglviewer.org/nglview/latest/) is installed, we can use it and MDTraj to load and visualize the PDB trajectory:

In [None]:
# NBVAL_SKIP
try:
    import nglview
except ImportError:
    # NGLView is optional; without it there is nothing to display
    view = None
else:
    view = nglview.show_mdtraj(mdtraj.load("trajectory.pdb"))

view

## Appendix B: using the TIP5P water model

If desired, we can use a different force field for the water, even one that uses virtual sites! To start, we'll load up Sage alongside a TIP5P file in this directory. When passed multiple force field sources, the `ForceField` class makes an effort to combine them into a single object.

Note that Sage was parametrized alongside TIP3P; this example only shows how to substitute water models and does not delve into the intricacies of water model selection.

In [None]:
force_field = ForceField("openff-2.0.0.offxml", "tip5p.offxml")

The `Molecule` object in the OpenFF Toolkit does not store information about virtual sites since they do not exist in chemical representations of molecules. This is convenient here, since we can re-use the same `Topology` object we created before and the same atomic positions we loaded in from the PDB file.
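
One consequence is that the OpenMM system ends up with more particles than the `Topology` has atoms, because each water gains massless virtual-site particles. The count below is simple bookkeeping; it assumes the standard site counts of each water model (three for TIP3P, four for TIP4P, five for TIP5P) and the 20-atom ligand defined earlier.

```python
N_LIGAND_ATOMS = 20
N_WATERS = 2100
ATOMS_PER_WATER = 3

# Total interaction sites per water molecule in each model.
SITES_PER_WATER = {"TIP3P": 3, "TIP4P": 4, "TIP5P": 5}

n_atoms = N_LIGAND_ATOMS + N_WATERS * ATOMS_PER_WATER

particle_counts = {}
for model, n_sites in SITES_PER_WATER.items():
    n_virtual_sites = N_WATERS * (n_sites - ATOMS_PER_WATER)
    particle_counts[model] = n_atoms + n_virtual_sites
    print(
        f"{model}: {n_atoms} atoms + {n_virtual_sites} virtual sites "
        f"= {particle_counts[model]} particles"
    )
```

This is presumably why the cells below pass positions through `to_openmm_positions` rather than using `interchange.positions` directly: OpenMM expects one position per particle, virtual sites included.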

In [None]:
interchange: Interchange = Interchange.from_smirnoff(
    force_field=force_field, topology=topology
)
system: openmm.System = interchange.to_openmm(combine_nonbonded_forces=True)

interchange.positions = unit.Quantity(
    mdtraj.load("solvated.pdb").xyz[0],
    unit.nanometer,
)

In [None]:
from openff.interchange.interop.openmm import to_openmm_positions

# The barostat temperature should match the integrator temperature (300 K)
barostat = openmm.MonteCarloBarostat(
    1.00 * openmm.unit.bar, 300 * openmm.unit.kelvin, 25
)
system.addForce(barostat)

integrator = openmm.LangevinIntegrator(
    300 * openmm.unit.kelvin, 1 / openmm.unit.picosecond, 2 * openmm.unit.femtoseconds
)

simulation = openmm.app.Simulation(interchange.to_openmm_topology(), system, integrator)
simulation.context.setPositions(to_openmm(to_openmm_positions(interchange)))
simulation.context.setVelocitiesToTemperature(300 * openmm.unit.kelvin)
simulation.context.computeVirtualSites()

pdb_reporter = openmm.app.PDBReporter("trajectory.pdb", 100)
state_data_reporter = openmm.app.StateDataReporter(
    "data.csv", 10, step=True, potentialEnergy=True, temperature=True, density=True
)
simulation.reporters.append(pdb_reporter)
simulation.reporters.append(state_data_reporter)

In [None]:
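print("Starting simulation")
# perf_counter measures wall-clock time; process_time would report CPU time instead
start_time = time.perf_counter()

print("Step, box lengths (nm)")

for step in range(5000):
    simulation.step(1)
    # Report every 500 steps rather than on every step
    if step % 500 == 0:
        box_vectors = simulation.context.getState().getPeriodicBoxVectors()
        # The diagonal elements are the edge lengths of a rectangular box
        lengths = [
            box_vectors[dim][dim].value_in_unit(openmm.unit.nanometer)
            for dim in range(3)
        ]
        print(step, [round(length, 3) for length in lengths])

end_time = time.perf_counter()
print(f"Elapsed time: {(end_time - start_time):.2f} seconds")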

In [None]:
# NBVAL_SKIP
import nglview  # optional dependency; skip this cell if NGLView is not installed
view = nglview.show_mdtraj(mdtraj.load("trajectory.pdb"))
view

## Appendix C: using the TIP4P water model

In [None]:
from openff.interchange.tests import get_test_file_path

force_field = ForceField("openff-2.0.0.offxml", get_test_file_path("tip4p.offxml"))

In [None]:
interchange: Interchange = Interchange.from_smirnoff(
    force_field=force_field, topology=topology
)
system: openmm.System = interchange.to_openmm(combine_nonbonded_forces=True)

interchange.positions = unit.Quantity(
    mdtraj.load("solvated.pdb").xyz[0],
    unit.nanometer,
)

In [None]:
from openff.interchange.interop.openmm import to_openmm_positions

# The barostat temperature should match the integrator temperature (300 K)
barostat = openmm.MonteCarloBarostat(
    1.00 * openmm.unit.bar, 300 * openmm.unit.kelvin, 25
)
system.addForce(barostat)

integrator = openmm.LangevinIntegrator(
    300 * openmm.unit.kelvin, 1 / openmm.unit.picosecond, 2 * openmm.unit.femtoseconds
)

simulation = openmm.app.Simulation(interchange.to_openmm_topology(), system, integrator)
simulation.context.setPositions(to_openmm(to_openmm_positions(interchange)))
simulation.context.setVelocitiesToTemperature(300 * openmm.unit.kelvin)
simulation.context.computeVirtualSites()

pdb_reporter = openmm.app.PDBReporter("trajectory.pdb", 100)
state_data_reporter = openmm.app.StateDataReporter(
    "data.csv", 10, step=True, potentialEnergy=True, temperature=True, density=True
)
simulation.reporters.append(pdb_reporter)
simulation.reporters.append(state_data_reporter)

In [None]:
print("Starting simulation")
# perf_counter measures wall-clock time; process_time would report CPU time instead
start_time = time.perf_counter()

for step in range(5000):
    simulation.step(1)
    if step % 500 == 0:
        print(f"{step}\t{simulation.context.getState().getPeriodicBoxVectors()}")

end_time = time.perf_counter()
print(f"Elapsed time: {(end_time - start_time):.2f} seconds")