# Ibuprofen in gas phase

This notebook demonstrates a simple use case of the [Open Force Field (OFF)](https://openforcefield.org/) toolkit. [OFF](https://openforcefield.org/) is a relatively new initiative and aims to bring more transparency into the process of generating force fields (for small molecules) through open science and open source principles.

We will generate force field parameters for a small molecule for which no standard force field is available, using version 2.0.0 of the [Sage force field](https://openforcefield.org/force-fields/force-fields/). Sage covers a very broad set of small organic drug-like molecules. This parameterization is comparable to older initiatives like GAFF, OPLS or CGenFF. Unlike these older models, Sage makes extensive use of modern cheminformatics tools, which makes the force field easier to extend and improve. More details can be found here: [10.1021/acs.jctc.8b00640](https://pubs.acs.org/doi/10.1021/acs.jctc.8b00640).

This notebook uses ibuprofen as a simple test case. In principle, any small drug-like molecule could be used instead, and you are encouraged to try your favorite compound from [PubChem](https://pubchem.ncbi.nlm.nih.gov/).

The imports below will generate some warnings, which you can safely ignore.

In [None]:
# Python built-in modules
import os
from sys import stdout

# Popular scientific packages for Python
import matplotlib.pyplot as plt
import numpy as np
import pandas
import requests

# OpenFF package.
# Do not use from openff.xxx import ... to avoid name collisions.
import openff.toolkit.topology
import openff.toolkit.typing.engines.smirnoff

# Custom functions defined in the current directory
from ligands import *

# MD related packages
import mdtraj
import nglview
from openmm import *
from openmm.app import *
from openmm.unit import *

## 1. Download a molecule from PubChem

In [None]:
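cid = '3672'  # This is the PubChem compound ID (CID) for ibuprofen
fn_sdf = f"CID_{cid}.sdf"
# Download the 3D structure only if it is not already on disk.
if not os.path.isfile(fn_sdf):
    url = f"https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/CID/{cid}/record/SDF/?record_type=3d&response_type=save"
    response = requests.get(url)
    response.raise_for_status()  # Stop here if the download failed.
    with open(fn_sdf, "w") as f:
        f.write(response.text)
# Convert the SDF file to PDB, which is used to set up the simulation below.
fn_pdb = f"CID_{cid}.pdb"
convert_sdf_to_pdb(fn_sdf, fn_pdb)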
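
If the request does not return a structure, for instance because PubChem has no 3D conformer for the chosen compound, the file on disk may contain an error message instead of a molecule. The sketch below is one possible sanity check and not part of the original workflow: `looks_like_sdf` is a hypothetical helper that only tests for the `M  END` and `$$$$` lines that close a record in the SDF format. It does not validate the chemistry.

```python
def looks_like_sdf(text: str) -> bool:
    """Return True if the text contains the lines that terminate an SDF record.

    This is a shallow check: it only looks for the 'M  END' line that closes
    the connection table and the '$$$$' line that closes the record.
    """
    lines = [line.rstrip() for line in text.splitlines()]
    return "M  END" in lines and "$$$$" in lines
```

With the variables defined above, it could be called as `looks_like_sdf(open(fn_sdf).read())` before converting the file.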

## 2. Assign Sage 2.0.0 parameters with the SMIRNOFF engine

The Open Force Field toolkit is used only in the code of this section.

The first step is to make an OpenFF Molecule object, here by loading the SDF file.
OpenFF Molecule objects can be made in [many different ways](https://open-forcefield-toolkit.readthedocs.io/en/stable/users/molecule_cookbook.html).

In [None]:
# Load the topology from the SDF file.
molecule = openff.toolkit.topology.Molecule(fn_sdf)
# Quick and dirty visualization of the molecule.
molecule.visualize()

Now, we will derive an OpenMM system object, using the Sage force field.
This looks simple, yet a lot is going on under the hood to derive force field parameters for this specific molecule.

In [None]:
force_field = openff.toolkit.typing.engines.smirnoff.ForceField(
    "openff-2.0.0.offxml")
system = force_field.create_openmm_system(molecule.to_topology())
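
To get a first impression of what was generated, one can count the force objects in the resulting system. The helper below is a small sketch (the name `summarize_forces` is ours, not part of OpenFF or OpenMM). It relies only on the `getForces()` method of an OpenMM `System`. Which force classes appear depends on the force field and the toolkit version, so no output is shown here.

```python
from collections import Counter


def summarize_forces(system):
    """Count the force objects in an OpenMM System, grouped by class name.

    The argument only needs a getForces() method that returns an iterable,
    as an openmm.System does.
    """
    return Counter(type(force).__name__ for force in system.getForces())
```

In this notebook, `summarize_forces(system)` returns a `Counter` mapping each force class name to the number of force objects of that class.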

## 3. Perform a short molecular dynamics simulation

The code below is essentially a copy of the molecular dynamics simulation of alanine dipeptide in section 2.

You should receive warnings about duplicate atoms, which can be ignored. They are caused by a few overly simple choices in the PDB file generated by OpenBabel.

In [None]:
# Setup the MD
pdb = PDBFile(fn_pdb)

integrator = LangevinIntegrator(300*kelvin, 1/picosecond, 2*femtoseconds)
simulation = Simulation(pdb.topology, system, integrator)
simulation.context.setPositions(pdb.positions)
simulation.minimizeEnergy()

# Write the initial state back to a PDB, could be useful
# for debugging.
with open("init_01.pdb", "w") as f:
    PDBFile.writeFile(simulation.topology, pdb.positions, f)

# Set the reporters collecting the MD output.
simulation.reporters = []
simulation.reporters.append(DCDReporter('traj_01.dcd', 100))
simulation.reporters.append(StateDataReporter(
    stdout, 1000, step=True,
    temperature=True, elapsedTime=True
))
simulation.reporters.append(StateDataReporter(
    "scalars_01.csv", 100, time=True,
    potentialEnergy=True, totalEnergy=True, temperature=True
))
simulation.step(10000)

# The last line is only needed for Windows users,
# to close the DCD file before it can be opened by nglview.
del simulation
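
The settings in the cell above determine how much output is produced. As a quick check of the arithmetic, the sketch below computes the simulated time and the number of reports per reporter, assuming each reporter writes once at every step count that is a multiple of its interval. The function name `run_summary` is only for illustration.

```python
def run_summary(n_steps, timestep_fs, report_intervals):
    """Return the simulated time in ps and the number of reports per reporter."""
    total_time_ps = n_steps * timestep_fs / 1000.0
    n_reports = {
        name: n_steps // interval
        for name, interval in report_intervals.items()
    }
    return total_time_ps, n_reports


# Values taken from the simulation cell above.
total_time_ps, n_reports = run_summary(
    n_steps=10_000,
    timestep_fs=2.0,
    report_intervals={"traj_01.dcd": 100, "stdout": 1000, "scalars_01.csv": 100},
)
print(f"Simulated time: {total_time_ps} ps")
for name, count in n_reports.items():
    print(f"{name}: {count} reports")
```

For this run, that amounts to 20 ps of dynamics, 100 trajectory frames, 100 rows in the CSV file and 10 progress lines on screen.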

In [None]:
# Visualize the trajectory.
view = nglview.show_mdtraj(mdtraj.load("traj_01.dcd", top="init_01.pdb"))
view.clear_representations()
view.add_licorice()
view.add_unitcell()
view

In [None]:
# Plot temperature as an initial verification of convergence.
df = pandas.read_csv("scalars_01.csv")
df.plot(kind='line', x='#"Time (ps)"', y='Temperature (K)')
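
For a single small molecule, the instantaneous temperature fluctuates strongly, so the plot will look noisy even when the thermostat works as intended. A numerical summary can help. The sketch below uses only the Python standard library and hypothetical helper names. `temperature_stats` reads the CSV file written by the reporter and assumes the column names used in the plot above. `expected_relative_fluctuation` evaluates the canonical-ensemble estimate sqrt(2 / N_dof) for the relative standard deviation of the instantaneous temperature. The number of degrees of freedom must be supplied by the user, since it depends on the number of atoms and on any constraints in the system.

```python
import csv
import math
import statistics


def temperature_stats(csv_path, skip_ps=0.0):
    """Return (mean, standard deviation) of the temperature in kelvin.

    Rows with a time below skip_ps are discarded as equilibration.
    """
    time_key = '#"Time (ps)"'      # Header of the time column in the CSV file.
    temp_key = "Temperature (K)"
    temps = []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            if float(row[time_key]) >= skip_ps:
                temps.append(float(row[temp_key]))
    return statistics.mean(temps), statistics.pstdev(temps)


def expected_relative_fluctuation(n_dof):
    """Relative standard deviation of the instantaneous temperature."""
    return math.sqrt(2.0 / n_dof)
```

For example, `temperature_stats("scalars_01.csv", skip_ps=2.0)` would discard the first 2 ps (an arbitrary choice) and summarize the rest. With on the order of 100 degrees of freedom, the expected fluctuations at 300 K are tens of kelvin.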