> # OpenForceField Tutorial 1
>> ## How to parametrize a small substrate with OpenFF for Simulation in CHARMM/GROMACS
>>> ### Dr. Dennis Della Corte
>>> *Department of Physics and Astrophysics,*
>>> *Brigham Young University*

<img src="figs/title.png">

**After completing this tutorial, you will know how to:**
-	Parametrize substrates in OpenFF
-	Export structure and topology files for simulation with CHARMM or GROMACS

**Preliminaries:**
-	Install Anaconda or Miniconda as described [here](https://docs.anaconda.com/anaconda/install/).
-	Create a conda environment for OpenFF and install the smirnoff99Frosst force field in a Linux or macOS terminal:

    ```
    conda create -n openff python=3.7 anaconda
    conda config --add channels omnia --add channels conda-forge
    conda update --all
    conda install -n openff -c omnia smirnoff99frosst
    conda activate openff
    ```
      
-	Install the Open Force Field toolkit:

    ```conda install -n openff -c omnia openforcefield```
-	Install OpenMM:

    ```conda install -n openff -c omnia openmm```
-	Optional: For MD with GROMACS or CHARMM you will need a working installation of the respective software; installing it is out of scope for this tutorial.
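Before moving on, it can help to confirm that the packages used later in this tutorial are visible from the active environment. The following is a minimal sketch using only the Python standard library; the module list is an assumption based on the imports in the script further below (`simtk`, `openforcefield`, `parmed`), and the helper name `missing_modules` is hypothetical.

```python
import importlib.util

# Top-level modules imported by the tutorial script (assumed from its import lines).
REQUIRED_MODULES = ["simtk", "openforcefield", "parmed"]


def missing_modules(names):
    """Return the module names that cannot be located in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Run it with the `openff` environment activated; any module reported as missing points to an installation step that did not complete.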


# Step by Step Guide:

**Case Study: Simulate Lysine in water in GROMACS.**

1.	Check whether someone has already done the work for you: your substrate may already have been parametrized and uploaded here. If not, follow the next steps.

2.	Make sure that your installation of OpenFF is up to date by opening a terminal and entering:

    `conda activate openff`

    `conda update openforcefield`

3.	To parametrize your substrate, you will need its full chemical identity, including elements, connectivity, bond orders, and formal charges. The recommended representation of your substrate is stereo SMILES (you can learn more about this representation [here](http://opensmiles.org/opensmiles.html)). A PDB file is typically not sufficient; see the OpenFF FAQ [here](https://open-forcefield-toolkit.readthedocs.io/en/latest/faq.html) for more details. If you do not know the SMILES string and cannot obtain it from a collaborator, you can use the following procedure to obtain it for substrates in the Protein Data Bank.

    1.	Go to the [Ligand Expo](http://ligand-expo.rcsb.org/ld-search.html)

    2.	Enter a component identifier (3-letter code) as the search term: <img src="figs/fig1.png">
    3.	Inspect the query results [here](http://ligand-expo.rcsb.org/pyapps/ldHandler.py?formid=cc-index-search&target=LYS&operation=ccid).  Under the tab Chemical Details you can find the stereo SMILES representation of lysine as obtained from the OpenEye toolkit: <img src="figs/fig2.png">

    4.	Under Downloads you should also download the ideal coordinates of your substrate; the PDB format will suffice for the next steps.
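Because the SMILES string and the PDB file must describe the same molecule, including all hydrogens, a quick element count of the downloaded file can catch mismatches early. The sketch below uses only the standard library and reads the element symbol from columns 77-78 of each ATOM/HETATM record, as defined by the PDB format; the function name `count_elements` is hypothetical.

```python
from collections import Counter


def count_elements(pdb_text):
    """Count atoms per element from the ATOM/HETATM records of a PDB file."""
    counts = Counter()
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            # Columns 77-78 (0-based slice 76:78) hold the element symbol.
            element = line[76:78].strip().capitalize()
            if element:
                counts[element] += 1
    return counts
```

A possible use is `count_elements(open("LYS_ideal.pdb").read())`. The lysine SMILES used in this tutorial, `C(CC[NH3+])C[C@@H](C(=O)O)N`, corresponds to 6 C, 15 H, 2 N and 2 O atoms, so a file with fewer hydrogens would not match the molecule.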

4.	With the SMILES string and PDB file in place, we can now produce the topology and structure files for simulations with GROMACS or CHARMM. The workflow involves an initial conversion to an OpenMM system, followed by a conversion with ParmEd to GROMACS or CHARMM topology and structure files. To work in an IPython shell, change to the directory where you downloaded LYS_ideal.pdb, activate the environment with "conda activate openff", and type "ipython" into the terminal; a notebook works as well. Expect a warning that the OpenEye Toolkit cannot be imported. It can safely be ignored, as you do not need a license for the rest of this tutorial. Then copy the following Python script into the shell or notebook:

```python
from simtk.openmm.app import PDBFile
from openforcefield.typing.engines.smirnoff import ForceField
from openforcefield.topology import Molecule, Topology
import parmed

# Define the molecule from its stereo SMILES string (full chemical identity).
lysine = Molecule.from_smiles("C(CC[NH3+])C[C@@H](C(=O)O)N")

# Load the ideal coordinates downloaded from Ligand Expo.
pdbfile = PDBFile('LYS_ideal.pdb')
omm_topology = pdbfile.topology

# Create the Open Force Field Topology.
off_topology = Topology.from_openmm(omm_topology, unique_molecules=[lysine])

# Load the smirnoff99Frosst force field and create the OpenMM System.
forcefield = ForceField('test_forcefields/smirnoff99Frosst.offxml')
omm_system = forcefield.create_openmm_system(off_topology)

# Convert the OpenMM System to a ParmEd structure.
parmed_structure = parmed.openmm.load_topology(omm_topology, omm_system, pdbfile.positions)

# Export AMBER files.
parmed_structure.save('system.prmtop', overwrite=True)
parmed_structure.save('system.inpcrd', overwrite=True)

# Export GROMACS files.
parmed_structure.save('system.top', overwrite=True)
parmed_structure.save('system.gro', overwrite=True)
```
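After the export, a quick look at system.gro can confirm that the structure file is complete. The following is a minimal sketch that relies only on the fixed-width .gro layout (title line, atom count, one line per atom, box line); the helper name `summarize_gro` is hypothetical and not part of ParmEd or GROMACS.

```python
def summarize_gro(gro_text):
    """Return title, atom count, residue names and box vectors of a .gro file."""
    lines = gro_text.splitlines()
    if len(lines) < 2:
        raise ValueError("a .gro file needs at least a title and an atom-count line")
    title = lines[0].strip()
    n_atoms = int(lines[1])
    atom_lines = lines[2:2 + n_atoms]
    if len(atom_lines) != n_atoms:
        raise ValueError(f"expected {n_atoms} atom lines, found {len(atom_lines)}")
    # Characters 6-10 of each atom line (0-based slice 5:10) hold the residue name.
    residues = sorted({line[5:10].strip() for line in atom_lines})
    # The line after the atoms holds the box vectors in nm (may be absent in a truncated file).
    box_line = lines[2 + n_atoms] if len(lines) > 2 + n_atoms else ""
    box = [float(value) for value in box_line.split()]
    return {"title": title, "n_atoms": n_atoms, "residues": residues, "box": box}
```

Calling `summarize_gro(open("system.gro").read())` should report a single residue name and an atom count that matches the PDB file you started from.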

5.	The files system.top and system.gro can now be used to start a simulation of lysine in water with GROMACS.  Our topology does not yet include any parameters for water or ions.  To add them, we have to modify system.top manually.  Open the file in a text editor and add the force field you want to use by including it at the very top, above the [ defaults ] directive of your system.top.  The included force field comes with its own [ defaults ] directive. To avoid errors later, delete the lines under the [ defaults ] directive from system.top.  Finally, add the water and ion models toward the end of your system.top by including them right above the [ molecules ] directive.
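The manual edits to system.top can also be scripted. The sketch below is one possible way to do it with plain string handling, not the exact lines used by the author. The force field directory `amber99sb-ildn.ff` and the water model `tip3p` are assumptions chosen for illustration (the tutorial only states that an AMBER force field is included), and `patch_topology` is a hypothetical helper. It drops the whole [ defaults ] section, including its header, and places the water and ion includes before [ system ], which is the conventional position and still above [ molecules ].

```python
def patch_topology(top_text, ff_dir="amber99sb-ildn.ff", water="tip3p"):
    """Add force field, water and ion includes to a GROMACS topology string.

    ff_dir and water are illustrative defaults, not values from the tutorial.
    """
    solvent_includes = [
        f'#include "{ff_dir}/{water}.itp"',
        f'#include "{ff_dir}/ions.itp"',
        "",
    ]
    # The force field include goes at the very top of the file.
    out = [f'#include "{ff_dir}/forcefield.itp"', ""]
    in_defaults = False
    inserted = False
    for line in top_text.splitlines():
        # Ignore trailing comments when looking for directives such as "[ defaults ]".
        code = line.split(";", 1)[0].strip()
        if code.startswith("[") and code.endswith("]"):
            name = code.strip("[] ").lower()
            in_defaults = name == "defaults"
            if name in ("system", "molecules") and not inserted:
                out.extend(solvent_includes)
                inserted = True
        if in_defaults:
            # Skip the [ defaults ] section; the included force field provides its own.
            continue
        out.append(line)
    return "\n".join(out) + "\n"
```

A possible use is to read system.top, pass its text through `patch_topology`, and write the result to a new file so that the original export stays untouched.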

6.	An example bash script to run the lysine-in-water simulation is posted below.  Please make sure that the param variable contains the correct path to the run input files (an example set is provided [here](https://www.dropbox.com/sh/xuc8nn6gi7b6j11/AAC30hL1mX6pGI1ffcLTW8Rxa?dl=0)).  For the setup of a basic simulation in GROMACS and an explanation of the various run input parameters, I recommend the tutorials posted [here](http://www.mdtutorials.com/gmx/). Open a terminal and enter the following commands:

    You might notice a few odd things about this script. Because we are overwriting atom parameters of the AMBER force field in our system.top, every call of grompp produces 6 warnings. In addition, it is not possible to balance the charges perfectly for such a small simulation box, so we allow for one additional warning due to electrostatics. This is why every grompp call carries the additional -maxwarn 7 flag.
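To make the warning budget explicit, the following sketch assembles a grompp command line with the -maxwarn value derived from the two sources of warnings named above (6 from overwritten atom parameters plus 1 from electrostatics). It only builds and prints the command and does not run GROMACS. The function name `grompp_command` and the file names in the usage example are hypothetical, while `-f`, `-c`, `-p`, `-o` and `-maxwarn` are standard `gmx grompp` options.

```python
import shlex

# Warning budget as explained in the text: 6 parameter overrides + 1 net-charge warning.
PARAMETER_WARNINGS = 6
CHARGE_WARNINGS = 1


def grompp_command(mdp, structure, topology, tpr,
                   maxwarn=PARAMETER_WARNINGS + CHARGE_WARNINGS):
    """Build the argument list for a gmx grompp call with an explicit -maxwarn."""
    return [
        "gmx", "grompp",
        "-f", mdp,
        "-c", structure,
        "-p", topology,
        "-o", tpr,
        "-maxwarn", str(maxwarn),
    ]


if __name__ == "__main__":
    # Hypothetical file names for an energy minimization step.
    cmd = grompp_command("em.mdp", "system.gro", "system.top", "em.tpr")
    print(" ".join(shlex.quote(part) for part in cmd))
```

Keeping the count in one place makes it easier to lower -maxwarn again if the parameter overrides are ever removed, since a high value can otherwise hide unrelated warnings.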

7.	After running the script above, you should have produced 1 ns of MD of lysine in water using the smirnoff99Frosst force field. **Congratulations!**