# Solvated Ibuprofen

This notebook shows how to set up a simulation cell of two solvated ibuprofen molecules, using the same [Open Force Field (OFF)](https://openforcefield.org/) toolkit as in `01_ibuprofen_gas_phase.ipynb`.

The imports below will generate a warning that you can safely ignore.

In [None]:
# Python built-in modules
import os
from sys import stdout

# Popular scientific packages for Python
import matplotlib.pyplot as plt
import numpy as np
import pandas
import requests

# MD related packages
import mdtraj
import nglview

# OpenFF packages.
# Do not use from openff.xxx import ... to avoid name collisions.
import openff.toolkit.topology
import openff.toolkit.typing.engines.smirnoff
import openmmforcefields.generators

# Other setup utilities (packmol wrapper)
import openmoltools

# Custom functions defined in the current directory
from ligands import *

# OpenMM
from openmm import *
from openmm.app import *
from openmm.unit import *

## 1. Download and convert molecules from PubChem

In [None]:
# Provide the PubChem compound IDs as strings:
cids = [
    "3672",  # ibuprofen
    "962",   # water
]
# Residue names to be used in the PDB files
resnames = [
    "IBU",
    "HOH",
]
# Number of times each molecule is added to the simulation cell.
num_molecules = [
    2,
    500
]

In [None]:
fns_sdf = []
fns_pdb = []
for cid, resname in zip(cids, resnames):
    # Download the SDF file if not present yet.
    fn_sdf = f"CID_{cid}.sdf"
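    if not os.path.isfile(fn_sdf):
        url = f"https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/CID/{cid}/record/SDF/?record_type=3d&response_type=save"
        response = requests.get(url)
        # Fail early instead of writing an error page to the SDF file.
        response.raise_for_status()
        with open(fn_sdf, "w") as f:
            f.write(response.text)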
    fn_pdb = f"CID_{cid}.pdb"
    convert_sdf_to_pdb(fn_sdf, fn_pdb, resname)
    fns_sdf.append(fn_sdf)
    fns_pdb.append(fn_pdb)
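
Because the loop above skips the download when the file already exists, a cached file is only useful if it really contains a structure. The following is a minimal sketch of a sanity check, assuming a molfile-based SDF record. The helper names `pubchem_sdf_url` and `looks_like_sdf` are hypothetical and not part of `ligands.py`, and the two test strings are made up for illustration.

```python
def pubchem_sdf_url(cid: str) -> str:
    """Build the PubChem URL used in this notebook for a 3D SDF record."""
    return (
        "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/CID/"
        f"{cid}/record/SDF/?record_type=3d&response_type=save"
    )


def looks_like_sdf(text: str) -> bool:
    """Return True if the text contains the terminators of an SDF record."""
    lines = [line.rstrip() for line in text.splitlines()]
    # "M  END" closes the molfile block, "$$$$" closes the SDF record.
    return "M  END" in lines and "$$$$" in lines


# Made-up examples: an empty but well-formed record and an error message.
good = "name\n  source\n\n  0  0  0  0  0  0  0  0  0  0999 V2000\nM  END\n$$$$\n"
bad = "Status: 404\nMessage: No records found\n"
print(looks_like_sdf(good), looks_like_sdf(bad))
```

If such a check fails for a cached file, deleting the file and rerunning the cell is usually the simplest fix.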

## 2. Build the explicit solvent model with packmol

Packmol is a tool to place molecules at random positions into a simulation cell, without any overlap. We use it here to generate random mixtures, but it can also be used for more advanced setups. See http://m3g.iqm.unicamp.br/packmol/home.shtml

In [None]:
# Approximate molecular volumes of solute and solvent,
# needed to estimate the edge length of the initial cubic box.
volumes = [estimate_volume(fn_pdb) for fn_pdb in fns_pdb]
print("Volumes [Å^3]:", volumes)
total_volume = np.dot(volumes, num_molecules)
box_size = total_volume ** (1.0 / 3.0)
print("Box size [Å]:", box_size)

# Run packmol through the openmoltools wrapper.
print("--- Packmol input ---")
traj_packmol = openmoltools.packmol.pack_box(
    fns_pdb, num_molecules, box_size=box_size)
traj_packmol.save_pdb("packmol_02.pdb")
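
The box-size estimate is simple arithmetic: sum the per-molecule volumes weighted by the molecule counts, then take the cube root to get the edge of a cubic cell. The sketch below repeats it in plain Python with assumed volumes. The ibuprofen value is a placeholder, not the output of `estimate_volume`, and the water value is roughly the volume per molecule in the liquid.

```python
# Assumed per-molecule volumes in Å^3 (illustrative placeholders).
example_volumes = [300.0, 30.0]   # ibuprofen (assumed), water (liquid, approx.)
example_counts = [2, 500]         # same composition as in this notebook

# Same arithmetic as np.dot(volumes, num_molecules), written out explicitly.
example_total = sum(v * n for v, n in zip(example_volumes, example_counts))

# Edge length of a cubic box with that volume.
example_box = example_total ** (1.0 / 3.0)

print(f"Total volume [Å^3]: {example_total:.0f}")
print(f"Box size [Å]: {example_box:.2f}")
```

With these numbers the edge length comes out at about 25 Å. The water molecules dominate the total, so the estimate is not very sensitive to the solute volume.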

As you can see in the visualization of the initial structure below, the simulation cell contains a vacuum edge, which is due to a limitation of packmol. This is not a problem since the system will be equilibrated with NpT molecular dynamics, which will gradually adjust the cell size.

In [None]:
# Visualize the initial solvated system
traj_init = mdtraj.load("packmol_02.pdb")
view = nglview.show_mdtraj(traj_init)
view.clear_representations()
view.add_licorice()
view.add_unitcell()
view

## 3. Assign Sage parameters with the SMIRNOFF engine

In this section, the force field parameters are assigned using the methodology developed in the OpenFF community.
The SDF files are used as input because they contain bond-order data, which are needed to assign the correct parameters.
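
To see what "bond-order data" means in practice, the sketch below reads the bond block of a small V2000 molfile, which is the format stored inside an SDF record. The third field of each bond line is the bond order, which the PDB format does not record explicitly. The formaldehyde record is hand-written for illustration (not downloaded from PubChem), and the hypothetical `read_bonds` helper only handles this simple case.

```python
# Hand-written V2000 molfile for formaldehyde (illustrative only).
MOLFILE = "\n".join([
    "formaldehyde",
    "  hand-written example",
    "",
    "  4  3  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    0.0000    1.2000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "    0.9400   -0.5400    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "   -0.9400   -0.5400    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  2  0  0  0  0",
    "  1  3  1  0  0  0  0",
    "  1  4  1  0  0  0  0",
    "M  END",
])


def read_bonds(molfile: str) -> list[tuple[str, str, int]]:
    """Return (element, element, bond order) for each bond in a V2000 molfile."""
    lines = molfile.splitlines()
    # Line 4 is the counts line: atom and bond counts in 3-character fields.
    counts = lines[3]
    natoms, nbonds = int(counts[0:3]), int(counts[3:6])
    # Atom block: the element symbol sits in columns 32-34 (1-based).
    symbols = [line[31:34].strip() for line in lines[4:4 + natoms]]
    bonds = []
    # Bond block: first atom, second atom, bond type, each 3 characters wide.
    for line in lines[4 + natoms:4 + natoms + nbonds]:
        i, j, order = int(line[0:3]), int(line[3:6]), int(line[6:9])
        bonds.append((symbols[i - 1], symbols[j - 1], order))
    return bonds


for bond in read_bonds(MOLFILE):
    print(bond)
```

The C=O bond is reported with order 2 and the two C-H bonds with order 1.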

In [None]:
# The generator can create an OpenMM system object.
# It uses standard force fields when applicable and
# falls back to OpenFF when needed.
generator = openmmforcefields.generators.SystemGenerator(
    ["tip3p.xml"],
    small_molecule_forcefield="openff-2.0.0",
    molecules=[openff.toolkit.topology.Molecule.from_file(
        fn_sdf) for fn_sdf in fns_sdf]
)
# Create the OpenMM system using the output of packmol.
pdb = PDBFile("packmol_02.pdb")
system = generator.create_system(pdb.topology)

The following cell prints the nonbonded parameters (charge, sigma, epsilon) of the last atom, which you can use to check that the right parameters were loaded.

For example, when running this notebook without modifications, the last molecule is water and the last atom is a hydrogen. Because the generator above loads `tip3p.xml`, you should observe the TIP3P hydrogen charge of 0.417 e. The tip3p-fb parameters in https://github.com/openmm/openmm/blob/master/wrappers/python/openmm/app/data/amber14/tip3pfb.xml#L249 apply only if you load `amber14/tip3pfb.xml` instead.

In [None]:
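for force in system.getForces():
    # NonbondedForce is available through "from openmm import *".
    if isinstance(force, NonbondedForce):
        # Parameters are returned as (charge, sigma, epsilon).
        npart = force.getNumParticles()
        print(force.getParticleParameters(npart - 1))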

## 4. Short NpT molecular dynamics simulation

In [None]:
# Set up the MD
temperature = 300 * kelvin
pressure = 1 * bar
integrator = LangevinIntegrator(temperature, 1/picosecond, 2*femtoseconds)
system.addForce(MonteCarloBarostat(pressure, temperature))
simulation = Simulation(pdb.topology, system, integrator)
simulation.context.setPositions(pdb.positions)
simulation.minimizeEnergy()

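# Write the topology and the initial (pre-minimization) positions
# to a PDB file. It serves as the topology when loading the
# trajectory below and can be useful for debugging.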
with open("init_02.pdb", "w") as f:
    PDBFile.writeFile(simulation.topology, pdb.positions, f)

# Set the reporters collecting the MD output.
simulation.reporters = []
simulation.reporters.append(DCDReporter('traj_02.dcd', 100))
simulation.reporters.append(StateDataReporter(
    stdout, 1000,
    step=True,
    temperature=True,
    elapsedTime=True
))
simulation.reporters.append(StateDataReporter(
    "scalars_02.csv", 100,
    time=True,
    potentialEnergy=True,
    totalEnergy=True,
    temperature=True,
    volume=True
))
simulation.step(10000)

# The following line is only needed for Windows users:
# it closes the DCD file so that it can be opened for visualization.
del simulation
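
The reporter intervals and the number of steps determine how much output the run produces. The following plain-Python sketch does the bookkeeping with the settings used above (2 fs time step, 10000 steps, reporter intervals of 100 and 1000 steps). The variable names are chosen for this illustration only.

```python
n_steps = 10000          # argument of simulation.step(...)
timestep_fs = 2.0        # integrator time step in femtoseconds
dcd_interval = 100       # steps between DCD frames
csv_interval = 100       # steps between rows in scalars_02.csv
stdout_interval = 1000   # steps between progress lines on stdout

simulated_ps = n_steps * timestep_fs / 1000.0          # total simulated time
n_frames = n_steps // dcd_interval                     # frames in traj_02.dcd
n_csv_rows = n_steps // csv_interval                   # data rows in the CSV
n_stdout_lines = n_steps // stdout_interval            # progress lines
frame_spacing_ps = dcd_interval * timestep_fs / 1000.0 # time between frames

print(f"Simulated time: {simulated_ps} ps")
print(f"DCD frames: {n_frames}, spaced by {frame_spacing_ps} ps")
print(f"CSV rows: {n_csv_rows}, stdout lines: {n_stdout_lines}")
```

A run of 20 ps is short, so the plots below only give a first impression of how temperature and volume relax.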

In [None]:
# Visualize the trajectory.
view = nglview.show_mdtraj(mdtraj.load("traj_02.dcd", top="init_02.pdb"))
view.clear_representations()
view.add_licorice()
view.add_unitcell()
view

In [None]:
# Plot temperature and volume as an initial verification of convergence.
df = pandas.read_csv("scalars_02.csv")
df.plot(kind='line', x='#"Time (ps)"', y='Temperature (K)')
df.plot(kind='line', x='#"Time (ps)"', y='Box Volume (nm^3)')
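
A box volume in nm^3 is hard to judge by eye, so it can help to convert it into a mass density; for a dilute aqueous solution one would expect a value in the vicinity of 1 g/cm^3. The sketch below assumes the default composition of this notebook (2 ibuprofen, 500 water). The function name `density_g_per_cm3` is hypothetical and the example volume is made up rather than taken from a simulation.

```python
AVOGADRO = 6.02214076e23                        # particles per mole
MOLAR_MASSES = {"IBU": 206.28, "HOH": 18.015}   # g/mol
COUNTS = {"IBU": 2, "HOH": 500}                 # default composition


def density_g_per_cm3(volume_nm3: float) -> float:
    """Convert a box volume in nm^3 to a mass density in g/cm^3."""
    # Total mass of all molecules in the cell, in grams.
    mass_g = sum(COUNTS[name] * MOLAR_MASSES[name] for name in COUNTS) / AVOGADRO
    # 1 nm = 1e-7 cm, hence 1 nm^3 = 1e-21 cm^3.
    volume_cm3 = volume_nm3 * 1e-21
    return mass_g / volume_cm3


# Made-up example volume, not a simulation result.
print(f"{density_g_per_cm3(15.6):.3f} g/cm^3")
```

Applied to the `Box Volume (nm^3)` column of `scalars_02.csv`, the same conversion would give a density trace whose plateau is easier to compare with reference values than the raw volume.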

## 5. Changing compositions

**<span style="color:#A03;font-size:14pt">
&#x270B; HANDS-ON! &#x1F528;
</span>**

> Try a few different compositions of the simulation cell (i) to verify that you can include practically any molecule you like and (ii) to see how the cost of setting up the force field can vary:
>
> - Remove the ibuprofen molecules.
> - Replace the second ibuprofen by another drug molecule (e.g. aspirin).
> - Replace the two ibuprofens by one large drug molecule (e.g. amoxicillin).
> - Replace the solvent (e.g. dimethyl ether).
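
When changing the composition, the three lists from Section 1 must stay in sync: one CID, one residue name and one count per molecule type. The following sketch shows one possible variation (one ibuprofen, one aspirin, 500 waters) together with a small consistency check. The residue name `ASA` is an arbitrary choice for this example, and `check_composition` is a hypothetical helper, not part of `ligands.py`.

```python
# One possible variation: replace the second ibuprofen by aspirin.
example_cids = ["3672", "2244", "962"]      # ibuprofen, aspirin, water
example_resnames = ["IBU", "ASA", "HOH"]    # "ASA" is an arbitrary label
example_counts = [1, 1, 500]


def check_composition(cids, resnames, counts):
    """Raise ValueError if the composition lists are inconsistent."""
    if not (len(cids) == len(resnames) == len(counts)):
        raise ValueError("cids, resnames and counts must have equal length")
    if len(set(resnames)) != len(resnames):
        raise ValueError("residue names must be unique")
    if any(len(name) > 3 for name in resnames):
        raise ValueError("PDB residue names hold at most three characters")
    if any(n < 1 for n in counts):
        raise ValueError("each molecule count must be at least 1")


check_composition(example_cids, example_resnames, example_counts)
print("Composition is consistent.")
```

To remove a molecule type entirely, drop its entry from all three lists rather than setting its count to zero.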