# 01 Alchemical free energy setup


<a rel="license" href="http://creativecommons.org/licenses/by/4.0/"><img alt="Creative Commons Licence" style="border-width:0" src="https://i.creativecommons.org/l/by/4.0/88x31.png" title='This work is licensed under a Creative Commons Attribution 4.0 International License.' align="right"/></a>

Authors:   
[Antonia Mey -- @ppxasjsm](https://github.com/ppxasjsm)   
[Lester Hedges -- @lohedges](https://github.com/lohedges)

## Learning objectives:
- Set up an alchemical solvation free energy simulation using BioSimSpace and SOMD
- Set up an alchemical solvation free energy simulation using BioSimSpace and Gromacs
- Set up an alchemical binding free energy simulation for Gromacs and SOMD using BioSimSpace

You will be using the following functions in BioSimSpace:

- `BSS.IO.readMolecules()` to load your molecules
- `BSS.Parameters.parameterise()` will be used to parametrise your molecules
- `BSS.Align.matchAtoms()` MCS matches atoms for the morphing
- `BSS.Align.rmsdAlign()` Aligns the molecules to be morphed
- `BSS.Align.merge()` Creates a merged molecule used for alchemical simulations
- `BSS.Solvent.tip3p()` Solvates a molecule in a TIP3P water box
- `BSS.Protocol.FreeEnergy()` Defines the free energy protocol
- `BSS.FreeEnergy.Solvation()` Sets up a solvation free energy process
- `BSS.FreeEnergy.Binding()` Sets up a binding free energy process


**Reading time**:
~ 30 mins

**Jupyter cheat sheet**:
- to run the currently highlighted cell, hold <kbd>&#x21E7; Shift</kbd> and press <kbd>&#x23ce; Enter</kbd>;
- to get help for a specific function, place the cursor within the function's brackets, hold <kbd>&#x21E7; Shift</kbd>, and press <kbd>&#x21E5; Tab</kbd>;
- you can find the full documentation at [biosimspace.org](https://biosimspace.org).

## Table of Contents
1. [Working with ligands](#lig)    
   1.1 [Loading ligands](#load)   
   1.2 [Parametrising ligands](#param)   
2. [Morphing ligands](#merge)   
   2.1 [MCS](#MCS)   
   2.2 [Morphed ligand](#morph)   
3. [Running solvation free energy simulation](#free)
4. [Binding free energy simulation](#bind)   
5. [Exercises](#exerc2)   

#### Let's get all the necessary imports out of the way

In [None]:
%pylab inline
import BioSimSpace as BSS

## 1. Free energy of solvation of ethane and methanol
<a id="lig"></a>

We want to compute the relative free energy of hydration between ethane and methanol, that is, the difference between the free energy of hydrating an ethane molecule and that of hydrating a methanol molecule. This section assumes that you have looked at the slides for lecture 1 or attended the lecture.

Below you can see a thermodynamic cycle for the relative hydration free energy of ethane and methanol:
![therm_cycle](images/Therm_cycle.png)
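
The cycle can also be written as an equation. The labels below are an assumption and may differ from those in the figure: $\Delta G_{\mathrm{free}}$ denotes the ethane to methanol perturbation in water and $\Delta G_{\mathrm{vacuum}}$ the same perturbation in the gas phase, matching the `free` and `vacuum` directories that BioSimSpace creates for a solvation calculation.

$$
\Delta\Delta G_{\mathrm{hyd}}(\mathrm{ethane} \to \mathrm{methanol})
= \Delta G_{\mathrm{hyd}}(\mathrm{methanol}) - \Delta G_{\mathrm{hyd}}(\mathrm{ethane})
= \Delta G_{\mathrm{free}} - \Delta G_{\mathrm{vacuum}}
$$

The second equality holds because free energy is a state function, so the free energy changes around the closed cycle sum to zero.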

### 1.1 Loading ligands
<a id="load"></a>
Next, we read in the two molecules we want to perturb. In this case, this is ethane changing to methanol. You can use the BioSimSpace function `BSS.IO.readMolecules()` for this task.

In [None]:
# We assume the molecules to perturb are the first molecules in each system
ethane = BSS.IO.readMolecules('data/ethane.pdb').getMolecules()[0]
methanol = BSS.IO.readMolecules('data/methanol.pdb').getMolecules()[0]

It is worth quickly checking that we have read in the right molecules, so we visualise them using the `viewMolecules()` function.

In [None]:
BSS.viewMolecules('data/ethane.pdb')

In [None]:
BSS.viewMolecules('data/methanol.pdb')

### 1.2 Parametrising molecules
<a id="param"></a>

Currently we only have coordinate information for the loaded molecules, so the first thing we need to do is generate some forcefield parameters. In this case, we will use the `gaff` [forcefield](http://ambermd.org/antechamber/gaff.html).

In [None]:
ethane = BSS.Parameters.gaff(ethane).getMolecule()
methanol = BSS.Parameters.gaff(methanol).getMolecule()

## 2. Creating merged system
<a id="merge"></a>
Ethane and methanol now have all the properties required to run an MD simulation of each molecule individually. Here, however, we are interested in creating a morphed system, or `single topology`, for running an alchemical free energy calculation. In this case, two of the ethane hydrogens will turn into dummy atoms, and the second carbon and a third hydrogen will turn into the `OH` group of methanol.

### 2.1 MCS
<a id="MCS"></a>
To work out automatically which atoms are common to ethane and methanol, we can use the `matchAtoms()` function of BioSimSpace. This computes a maximum common substructure (MCS) match. An example of what an MCS match might look like is shown here:
![MCS](images/MCS.png)


In [None]:
# matchAtoms returns a dictionary that maps atom indices in ethane
# to the indices of the matched atoms in methanol.
mapping = BSS.Align.matchAtoms(ethane, methanol)

print(mapping)
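
To see how such a dictionary can be read, here is a small pure-Python sketch. The atom indices are hypothetical and chosen only for illustration; the real values depend on the atom ordering in the input files. It assumes, as described above, that six of the eight ethane atoms find a partner in methanol.

```python
# Hypothetical atom indices, for illustration only. The real mapping
# depends on the atom ordering in the input files.
N_ETHANE_ATOMS = 8    # C2H6
N_METHANOL_ATOMS = 6  # CH3OH

# Keys index atoms in ethane, values index the matched atoms in methanol.
example_mapping = {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5}

# Ethane atoms without a partner have nothing to turn into,
# so they are the candidates for becoming dummy atoms.
unmatched_ethane = sorted(set(range(N_ETHANE_ATOMS)) - set(example_mapping))

# Methanol atoms without a partner would have to appear from nothing.
unmatched_methanol = sorted(
    set(range(N_METHANOL_ATOMS)) - set(example_mapping.values())
)

# Swapping keys and values gives the mapping for the reverse direction.
reverse_mapping = {v: k for k, v in example_mapping.items()}

print("unmatched ethane atoms:", unmatched_ethane)
print("unmatched methanol atoms:", unmatched_methanol)
```

With these made-up indices, two ethane atoms are left without a partner, which corresponds to the two hydrogens that become dummy atoms.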

Once we have the mapping, we align the molecules to each other using an RMSD metric. From the alignment we can then create a merged molecule, which contains the `single topology` information needed.

In [None]:
# Align ethane to methanol based on the mapping.
ethane = BSS.Align.rmsdAlign(ethane, methanol, mapping)

# Merge the two ligands based on the mapping.
merged = BSS.Align.merge(ethane, methanol, mapping)

### 2.2 Creating a morph
<a id="morph"></a>

Different software tools have different ways of running alchemical free energy calculations. If you use `SOMD` for the underlying free energy calculations, BioSimSpace will automatically generate something called a `pert` file. This file contains information on how, for example, the charges change with $\lambda$.
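
To make the idea of a parameter changing with $\lambda$ concrete, here is a minimal sketch of how a single partial charge could vary between the two end states. It assumes a simple linear interpolation, which is a common choice but not necessarily what every engine does for every parameter. The function name `interpolate` and the charge values are made up for illustration and are not taken from a `pert` file.

```python
def interpolate(initial, final, lam):
    """Linearly interpolate a parameter between the two end states.

    lam = 0 returns the initial-state value and lam = 1 the final-state value.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lambda must lie between 0 and 1")
    return (1.0 - lam) * initial + lam * final


# Hypothetical partial charges (in units of e) of one atom in the two end states.
q_initial = -0.10
q_final = 0.40

for lam in (0.0, 0.5, 1.0):
    print(f"lambda = {lam:.1f}  charge = {interpolate(q_initial, q_final, lam):+.3f}")
```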

Let's have a closer look at this merged molecule.

In [None]:
# Looking at merged molecule
prop_map = {}

for prop in merged._sire_molecule.propertyKeys():
    if prop[-1] == "1":
        prop_map[prop[:-1]] = prop

BSS.IO.saveMolecules('test',merged, 'pdb', property_map=prop_map)

In [None]:
BSS.viewMolecules('test.pdb')

In [None]:
!cat 'test.pdb'

In [None]:
merged._toPertFile('ethane_methanol.pert')

In [None]:
!head -n 20 ethane_methanol.pert

### 2.3 Solvation
<a id="solv"></a>

Before we can run a free energy simulation we will have to solvate the system. In this case, rather than passing ethane and methanol separately we will solvate the whole merged system. 

In [None]:
solvated = BSS.Solvent.tip3p(molecule=merged, box=3*[40*BSS.Units.Length.angstrom])

## 3. Solvation free energy
<a id="free"></a>
As before, we need to define a protocol and then run it using a solvation free energy process. 
A simple protocol consists of a 2 fs timestep, a runtime of 4 ns, and 9 equally spaced $\lambda$ windows.

In [None]:
# Create the free energy protocol.
protocol = BSS.Protocol.FreeEnergy(timestep=2*BSS.Units.Time.femtosecond, runtime=4*BSS.Units.Time.nanosecond, num_lam=9)
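
The numbers in this protocol can be unpacked with a little arithmetic. The sketch below is plain Python and does not call BioSimSpace. It assumes that the 9 windows span $\lambda = 0$ to $\lambda = 1$ inclusive and that the 4 ns runtime applies to each window.

```python
NUM_LAM = 9
RUNTIME_FS = 4 * 1_000_000  # 4 ns expressed in femtoseconds
TIMESTEP_FS = 2             # 2 fs

# Equally spaced lambda values from 0 to 1, end points included.
lambdas = [i / (NUM_LAM - 1) for i in range(NUM_LAM)]

# Number of MD steps needed to cover the runtime of one window.
steps_per_window = RUNTIME_FS // TIMESTEP_FS

print("lambda values:", lambdas)
print("MD steps per window:", steps_per_window)
```

Changing `NUM_LAM` changes the spacing between neighbouring windows but not the number of steps in each one.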


In [None]:
freenrg = BSS.FreeEnergy.Solvation(solvated, protocol, work_dir="ethane_methanol")

Next you would run the alchemical free energy simulation in the following way:

`freenrg.run()`   

This only makes sense on a workstation with GPUs, on GPU cloud resources, or on a GPU cluster. On the notebook server these simulations would take too long to run.

In [None]:
# freenrg.run()

Let's have a look at the `ethane_methanol` directory. It now contains all the files, set up and ready for simulation using Gromacs as the simulation engine.
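
One way to inspect the directory from within the notebook is a small helper that lists everything below it. This is a sketch using only the Python standard library; `list_tree` is a hypothetical helper name, not part of BioSimSpace.

```python
from pathlib import Path


def list_tree(root):
    """Return the sorted relative paths of all files and directories below root."""
    root = Path(root)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))


work_dir = Path("ethane_methanol")
if work_dir.is_dir():
    for entry in list_tree(work_dir):
        print(entry)
else:
    print(f"{work_dir} not found; run the setup cells above first")
```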

### 3.1. Exercises
<a id="exerc1"></a>

The exercises are announced by the keyword **Exercise** and followed by an incomplete cell.
Missing parts are indicated by
```python
#FIXME
```

### 3.1.1. Exercise on selecting lambda windows
Above we defined a protocol with 9 $\lambda$ windows. Suppose we have worked out that this is not an optimal protocol and would like to use 12 $\lambda$ windows instead. Can you write down a protocol that would allow you to run 12 rather than 9 $\lambda$ windows?

In [None]:
# Create the free energy protocol.
protocol = BSS.Protocol.FreeEnergy(#FIXME)

### Solution

In [None]:
# Create the free energy protocol.
protocol = BSS.Protocol.FreeEnergy(timestep=2*BSS.Units.Time.femtosecond, runtime=4*BSS.Units.Time.nanosecond, num_lam=12)

### 3.1.2. Exercise on merged molecules
Previously we set up an ethane to methanol alchemical free energy simulation. One way of assessing the quality of a free energy estimate from an alchemical simulation is to run the simulation in the opposite direction, i.e. methanol to ethane. Can you set up a new merged molecule and run the necessary steps for the free energy setup?

In [None]:
mapping = {}
mapping = #FIXME

# Align methanol to ethane based on the mapping.
methanol = BSS.Align.rmsdAlign(#FIXME)

# Merge the two ligands based on the mapping.
merged_methanol = #FIXME
solvated_methanol = #FIXME
freenrg_methanol = #FIXME

### Solution

In [None]:
mapping = {}
mapping = BSS.Align.matchAtoms(methanol, ethane)

# Align methanol to ethane based on the mapping.
methanol = BSS.Align.rmsdAlign(methanol,ethane,mapping)

# Merge the two ligands based on the mapping.
merged_methanol = BSS.Align.merge(methanol, ethane, mapping)
solvated_methanol = BSS.Solvent.tip3p(molecule=merged_methanol, box=3*[40*BSS.Units.Length.angstrom])
freenrg_methanol = BSS.FreeEnergy.Solvation(solvated_methanol, protocol, work_dir="methanol_ethane")
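
Once both directions have been simulated and analysed, the two estimates should be equal in magnitude and opposite in sign. A minimal sketch of that consistency check is shown below. The function name `hysteresis` and the two free energy values are hypothetical and are not results from this notebook.

```python
def hysteresis(dg_forward, dg_reverse):
    """Absolute deviation of forward and reverse estimates from perfect antisymmetry.

    For a perfectly converged pair, dg_forward == -dg_reverse and the result is 0.
    """
    return abs(dg_forward + dg_reverse)


# Hypothetical free energy estimates in kcal/mol, NOT results from this notebook.
dg_ethane_to_methanol = -6.0
dg_methanol_to_ethane = 6.5

print("hysteresis (kcal/mol):", hysteresis(dg_ethane_to_methanol, dg_methanol_to_ethane))
```

A hysteresis that is large compared with the statistical error of the individual estimates suggests that at least one direction has not converged.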

### 3.1.3. Exercise on using different simulation engines
Currently, alchemical free energy simulations with SOMD and Gromacs are supported. Can you figure out how to set up and run the simulations using SOMD rather than the default of Gromacs? **Hint**: look at the `engine` keyword of `FreeEnergy.Solvation()`. You might also want to change the working directory.

In [None]:
freenrg_somd = BSS.FreeEnergy.Solvation(#FIXME)

### Solution

In [None]:
freenrg_somd = BSS.FreeEnergy.Solvation(solvated_methanol, protocol, work_dir="methanol_ethane_somd", engine='somd')

You will notice that, again, two directories were created in your `work_dir`: `free` and `vacuum`. Have a look at the contents of these directories. The input files that were generated are now for SOMD rather than Gromacs. Take a moment to look at the config files etc.

In [None]:
! ls methanol_ethane_somd/free/lambda_0.000

## 4. Free energy of binding
<a id="bind"></a>

So far we have set up free energies of hydration. Next, we will use BioSimSpace to set up an alchemical free energy simulation for computing free energies of binding. The thermodynamic cycle for the free energy of binding looks like this:

![reltherm](images/thermo_cycle_rel_eq.png)
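
In equation form, the relative binding free energy of two ligands A and B is the difference between two alchemical legs. The labels are an assumption and may differ from those in the figure: $\Delta G_{\mathrm{bound}}$ is the A to B perturbation with the ligand in the protein binding site, and $\Delta G_{\mathrm{free}}$ is the same perturbation with the ligand alone in water.

$$
\Delta\Delta G_{\mathrm{bind}}(\mathrm{A} \to \mathrm{B})
= \Delta G_{\mathrm{bind}}(\mathrm{B}) - \Delta G_{\mathrm{bind}}(\mathrm{A})
= \Delta G_{\mathrm{bound}} - \Delta G_{\mathrm{free}}
$$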

In our case the host is lysozyme, an antimicrobial protein, which has been studied extensively using alchemical free energy calculations in the past.

### 4.1 Loading the protein and ligands
Loading the protein is done in the same way as loading the small molecules. However, in order to compute free energies of binding, we have to make sure that the ligand is aligned with the protein and sits in an appropriate binding site. BioSimSpace is not a docking program, so the ligands have to be placed in the binding site by other means before the alchemical free energy calculation is set up. First, we will load the protein and ligands to check that they are aligned correctly for the calculation.

In [None]:
lysozyme = BSS.IO.readMolecules("data/protein.pdb")

In [None]:
benzene = BSS.IO.readMolecules('data/benzene.mol2').getMolecules()[0]
o_xylene = BSS.IO.readMolecules('data/o-xylene.mol2').getMolecules()[0]

In [None]:
system = lysozyme+benzene+o_xylene

In [None]:
BSS.IO.saveMolecules('prot_lig', system, 'pdb')

In [None]:
BSS.viewMolecules('prot_lig.pdb')

### 4.2 Parametrisation
<a id="param"></a>

For the protein, we can use a standard Amber forcefield such as `ff14SB`.

In [None]:
lysozyme = BSS.Parameters.ff14SB(lysozyme.getMolecules()[0]).getMolecule()

For the two ligands we can choose to parametrise them using `gaff2`.

In [None]:
benzene = BSS.Parameters.gaff2(benzene).getMolecule()
o_xylene = BSS.Parameters.gaff2(o_xylene).getMolecule()

### 4.3 Morphing again

Now all we have to do is go back through the morphing process and then combine the system. 

In [None]:
mapping = BSS.Align.matchAtoms(o_xylene, benzene)

In [None]:
# Align o_xylene to benzene based on the mapping.
o_xylene = BSS.Align.rmsdAlign(o_xylene, benzene, mapping)

# Merge the two ligands based on the mapping.
merged = BSS.Align.merge(o_xylene, benzene, mapping)

In [None]:
# Merge the two ligands again, this time allowing rings to be opened or closed.
merged = BSS.Align.merge(o_xylene, benzene, mapping, allow_ring_breaking=True)

In [None]:
# This creates the protein-ligand system with the merged molecule.
system = merged + lysozyme

### 4.4 Binding free energy simulation
Now we can run the binding free energy simulation. It looks very similar to the solvation one. 

In [None]:
# Solvate in a 60 angstrom box of TIP3P water.
solvated = BSS.Solvent.tip3p(molecule=system, box=3*[60*BSS.Units.Length.angstrom])

# Create the free energy protocol.
protocol = BSS.Protocol.FreeEnergy(runtime=4*BSS.Units.Time.nanosecond, num_lam=9)

# Initialise the binding free energy object.
freenrg = BSS.FreeEnergy.Binding(solvated, protocol, work_dir="Binding_benzene_o_xylene" )

`freenrg.run()` would again run the simulation.

## 5. Exercises
<a id="exerc2"></a>
Exercises for binding free energies. 

### 5.1 Looking at the directories
Take a look at the directory that was generated using `BSS.FreeEnergy.Binding()`.
What differences and similarities can you observe between the solvation free energy setup and the binding free energy setup?

### 5.2 Box sizes of the solvated leg
One thing you may notice is that the box sizes of the bound and free legs are the same. This is wasteful, because you don't need such a large box when simulating just the ligand in water. There is a handy way of adjusting this.

In [None]:
# Solvate in a 60 angstrom box of TIP3P water.
solvated = BSS.Solvent.tip3p(molecule=system, box=3*[60*BSS.Units.Length.angstrom])
# Create the free energy protocol.
protocol = BSS.Protocol.FreeEnergy(runtime=4*BSS.Units.Time.nanosecond, num_lam=9)
# Try using a working directory called `exercise_5` and a box size of 30 angstrom for the free leg of the simulation.
freenrg = BSS.FreeEnergy.Binding(#FIXME )

### Solution

In [None]:
# Solvate in a 60 angstrom box of TIP3P water.
solvated = BSS.Solvent.tip3p(molecule=system, box=3*[60*BSS.Units.Length.angstrom])
# Create the free energy protocol.
protocol = BSS.Protocol.FreeEnergy(runtime=4*BSS.Units.Time.nanosecond, num_lam=9)
# Use a working directory called `exercise_5` and a box size of 30 angstrom for the free leg of the simulation.
freenrg = BSS.FreeEnergy.Binding(solvated, protocol, work_dir="exercise_5", box=3*[30*BSS.Units.Length.angstrom])
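
To get a feeling for how much a smaller box saves, the sketch below estimates the number of water molecules in a cubic box from the bulk number density of liquid water. This is a rough upper bound that ignores the volume taken up by the solute, and `approx_waters` is a hypothetical helper, not a BioSimSpace function.

```python
# Approximate number density of liquid water at ambient conditions,
# in molecules per cubic angstrom (about 1 g/cm^3).
WATERS_PER_CUBIC_ANGSTROM = 0.0334


def approx_waters(box_length_angstrom):
    """Rough number of water molecules in a cubic box, ignoring the solute."""
    return round(WATERS_PER_CUBIC_ANGSTROM * box_length_angstrom ** 3)


bound_leg_waters = approx_waters(60)
free_leg_waters = approx_waters(30)
volume_ratio = (30 / 60) ** 3

print("approximate waters in a 60 angstrom box:", bound_leg_waters)
print("approximate waters in a 30 angstrom box:", free_leg_waters)
print("volume ratio (30 vs 60 angstrom):", volume_ratio)
```

Halving the box length reduces the volume, and with it the number of water molecules to simulate, by a factor of eight.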

### 5.3 Running a minimisation and equilibration before the production
While `SOMD` will automatically minimise and equilibrate the system, Gromacs will not. You can use BioSimSpace to do this. **Hint**: use the [documentation](https://biosimspace.org/) and this morning's workshop to do this. Also, when returning the system, use `getSystem(block=True)`.

In [None]:
# Solvate in a 60 angstrom box of TIP3P water.
solvated = BSS.Solvent.tip3p(molecule=system, box=3*[60*BSS.Units.Length.angstrom])

minimised = #FIXME
equilibrated = #FIXME

# Create the free energy protocol.
protocol = BSS.Protocol.FreeEnergy(runtime=4*BSS.Units.Time.nanosecond, num_lam=9)
# Initialise the binding free energy object.
freenrg = BSS.FreeEnergy.Binding(equilibrated, protocol, work_dir="exercise_5_3")

### Solution

In [None]:
# Solvate in a 60 angstrom box of TIP3P water.
solvated = BSS.Solvent.tip3p(molecule=system, box=3*[60*BSS.Units.Length.angstrom])

# Run the minimisation and wait for it to finish before fetching the system.
minimised = BSS.Process.Gromacs(solvated, BSS.Protocol.Minimisation()) \
               .start().getSystem(block=True)

# Run the equilibration and wait for it to finish before fetching the system.
equilibrated = BSS.Process.Gromacs(minimised, BSS.Protocol.Equilibration()) \
                  .start().getSystem(block=True)

# Create the free energy protocol.
protocol = BSS.Protocol.FreeEnergy(runtime=4*BSS.Units.Time.nanosecond, num_lam=9)
# Initialise the binding free energy object.
freenrg = BSS.FreeEnergy.Binding(equilibrated, protocol, work_dir="exercise_5_3")