## Setting up a Hydration Free Energy Calculation
This notebook will guide you through how to use FESetup in order to run an alchemical free energy calculation.    


The notebook forms part of the CCPBio-Sim workshop **Tackling Alchemistry with FESetup and Sire SOMD** run on the 10th of April 2018 at the University of Bristol.

*Author: Antonia Mey   
Email: antonia.mey@ed.ac.uk*

**Reading time of the document: xx mins**

### Setting up the notebook with all necessary imports

In [1]:
%pylab inline
import nglview as nv

Populating the interactive namespace from numpy and matplotlib


### Setting the stage
We want to compute the relative free energy of hydration between ethane and methanol, that is, the difference between the free energy of hydrating an ethane molecule and that of hydrating a methanol molecule in water. This notebook assumes that you have looked at the slides of lecture 1 or attended lecture 1 (maybe youtube link here).

Below you can see a thermodynamic cycle for the relative hydration free energy of ethane and methanol:
![therm_cycle](images/Therm_cycle.png)
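
Reading the cycle as an equation may help. Because free energy is a state function, the difference between the two physical hydration legs must equal the difference between the two alchemical legs. The symbols below are labels chosen here for illustration and are not taken from the figure:

$$
\Delta\Delta G_{\mathrm{hyd}}
= \Delta G_{\mathrm{hyd}}^{\mathrm{methanol}} - \Delta G_{\mathrm{hyd}}^{\mathrm{ethane}}
= \Delta G_{\mathrm{solv}}^{\mathrm{ethane}\to\mathrm{methanol}} - \Delta G_{\mathrm{vac}}^{\mathrm{ethane}\to\mathrm{methanol}}
$$

The two terms on the far right are what the solvated and vacuum alchemical simulations are meant to estimate.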

We use FESetup to automatically generate the perturbation information needed for running the alchemical free energy calculations.

### Visualise molecule

Have a look in the `FESetup_01` directory. You will find a file called `morph.in` and a directory called `solutes`. `morph.in` is the main input file and we will look at it more closely in a second. `solutes` contains two further directories, `ethane` and `methanol`, each containing a PDB structure file called `solute.pdb`. These files hold the 3D structures of the two molecules whose relative hydration free energy we want to compute.   

Let's use nglview to look at the ethane molecule:

In [2]:
view = nv.show_file('FESetup_01/solutes/ethane/solute.pdb')
view

NGLWidget()

**1. Task: load methanol into nglview as well**

In [None]:
#Insert code for methanol here



## FESetup
Now, rather than just looking at the structure of these two molecules, we want to set them up with FESetup so that they are ready for running an alchemical free energy simulation. You can open a terminal and run FESetup from the command line by simply typing:   
    ```FESetup```

Or you can run things within the notebook. To run command line programs from a notebook you can use the exclamation mark as an escape, and the magic function `%%capture` to capture all output. 

In [6]:
%%capture info
!/home/ppxasjsm/Software/FESetup1.2.1/FESetup64/bin/FESetup 

Capturing output is useful because sometimes quite a lot of output is generated. This is also the case for FESetup. If you run FESetup without providing an input file, it will list all possible config file options you might want to use. We can access the captured output by printing it in the following way:

In [8]:
print (info.stdout)


=== FESetup release 1.2.1, SUI version: 0.8.3 ===

Please cite: HH Loeffler, J Michel, C Woods, J Chem Inf Mod, 55, 2485
             DOI 10.1021/acs.jcim.5b00368
For more information please visit http://www.ccpbiosim.ac.uk/fesetup/

[protein]
align_axes = False
basedir = (empty)
box.length = 10.0
box.type = (empty)
file.name = protein.pdb
ions.conc = 0.0
ions.dens = 1.0
md.constT.T = 300.0
md.constT.nsteps = 0
md.constT.p = 1.0
md.constT.restr_force = 10.0
md.constT.restraint = notsolvent
md.heat.T = 300.0
md.heat.nsteps = 0
md.heat.p = 1.0
md.heat.restr_force = 10.0
md.heat.restraint = notsolvent
md.press.T = 300.0
md.press.nsteps = 0
md.press.p = 1.0
md.press.restr_force = 10.0
md.press.restraint = notsolvent
md.relax.T = 300.0
md.relax.nrestr = 0
md.relax.nsteps = 0
md.relax.p = 1.0
md.relax.restraint = notsolvent
min.ncyc = 10
min.nsteps = 0
min.restr_force = 10.0
min.restraint = notsolvent
molecules = (empty)
neutralize = False
propka = F
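
Outside a notebook, the same capture-then-print pattern can be sketched in plain Python with the standard `subprocess` module. The helper name `run_and_capture` is made up for this illustration, and a harmless stand-in command is used so that the sketch runs anywhere; swapping in `["FESetup"]` assumes that FESetup is on your `PATH`.

```python
import subprocess
import sys


def run_and_capture(command):
    """Run a command (given as a list of strings) and return its stdout as text."""
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    return result.stdout


# Stand-in command: a Python one-liner that prints a single line.
# Replace with ["FESetup"] (assumption: FESetup is on your PATH).
info = run_and_capture([sys.executable, "-c", "print('captured text')"])

# Nothing is shown until we decide to print it, just like with %%capture.
print(info)
```

As with `%%capture`, the output is held in a variable, so you can print only part of it or search it for a particular option.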

FESetup is a command line tool and you can pass it an input file defining what you would like to set up. Let's change to the directory `FESetup_01` in order to run our first simulation setup. 

In [10]:
cd FESetup_01/

/home/ppxasjsm/Projects/CCP-BioSim-Workshop/Exercises/01_FESetup_ethane_methanol/FESetup_01


In [11]:
ls

morph.in  solutes/


Here we are working with ethane and methanol. Let's have a closer look at the input file `morph.in` provided.

In [12]:
!head -n 7 morph.in

[globals]
logfile = eth-meth.log
forcefield = amber, ff14SB, tip3p, hfe
gaff = gaff2
mdengine = amber, sander 
AFE.type = Sire
AFE.separate_vdw_elec = false


The first section consists of global directives. The two important directives here are `AFE.type = Sire` and `AFE.separate_vdw_elec = false`. The first one means that Sire-compatible output is generated, and the second one means that the simulation will be carried out as a single-step perturbation, perturbing charges and van der Waals interactions at the same time. Later on we will also look at how to generate output for different simulation software.

In [13]:
!tail -n 19 morph.in


[ligand]
basedir = solutes 
file.name = solute.pdb
molecules = ethane, methanol

# the following are required to create the morph in solution
box.type = rectangular
box.length = 12.0
neutralize = yes
min.nsteps = 100

min.ncyc = 100
min.restr_force = 10.0
min.restraint = notsolvent

#defining the morphing:
morph_pairs = ethane > methanol



For FESetup to understand which files to work on, a strict directory structure is used. `basedir` tells FESetup where it can find all molecule files; here the basedir is called `solutes`. Inside `solutes`, each molecule (ethane and methanol) has its own directory, and the structure file must have the same name in every directory, in this case `file.name = solute.pdb`. `molecules` lists the molecules that should actually be set up, given by their directory names.
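
The directory convention above can be sketched in a few lines of Python. This is only an illustration for inspecting an input file: it uses the standard `configparser` module, which happens to read the INI-like excerpt shown here, and it is not how FESetup itself parses its input. The function name `expected_input_files` and the shortened `MORPH_IN` text are assumptions made for this example.

```python
import configparser
from pathlib import PurePosixPath

# Shortened copy of the directives shown above (not the full morph.in).
MORPH_IN = """\
[globals]
logfile = eth-meth.log
AFE.type = Sire

[ligand]
basedir = solutes
file.name = solute.pdb
molecules = ethane, methanol
morph_pairs = ethane > methanol
"""


def expected_input_files(config_text, section="ligand"):
    """Return the structure files implied by basedir/molecule/file.name."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names such as AFE.type case-sensitive
    parser.read_string(config_text)
    options = parser[section]
    basedir = options["basedir"]
    file_name = options["file.name"]
    molecules = [name.strip() for name in options["molecules"].split(",")]
    return [str(PurePosixPath(basedir) / name / file_name) for name in molecules]


for path in expected_input_files(MORPH_IN):
    print(path)
```

Comparing this list with what is actually on disk is a quick way to spot a misspelt molecule name before running the setup.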

The `box.type` and `box.length` directives indicate that a solvated version of the molecule should be generated. FESetup also allows a minimisation to be run straight away using the different `min` directives. In the above example, a minimisation is run for 100 steps with a restraint force of 10.0 applied to all non-solvent molecules. 

To execute FESetup you can either type `FESetup morph.in` into your terminal or execute the next cell.

In [14]:
!/home/ppxasjsm/Software/FESetup1.2.1/FESetup64/bin/FESetup morph.in


=== FESetup release 1.2.1, SUI version: 0.8.3 ===

Please cite: HH Loeffler, J Michel, C Woods, J Chem Inf Mod, 55, 2485
             DOI 10.1021/acs.jcim.5b00368
For more information please visit http://www.ccpbiosim.ac.uk/fesetup/

Making ligand ethane...
Making ligand methanol...
Morphs will be generated for Sire
Morphing ethane to methanol...

=== All molecules built successfully ===



Let's have a look at the directory again. Some additional files have been generated!

In [15]:
ls

eth-meth.log  _ligands/  morph.in  _perturbations/  solutes/


Directories starting with an underscore `_` are output directories that contain the relevant information for the simulation input. The most useful file is the generated logfile `eth-meth.log`. It contains all commands that were used to generate the output, and it also records whether a setup has failed. If you are familiar with the AMBER setup tools you will find a lot of familiar-looking commands. If you want, you can use `nano` on the command line to take a closer look. 

### Solvated molecules
The generated directory `_ligands` contains the original pdb files in a box of water, found in `_ligands/ethane` and `_ligands/methanol`. Let's have a more detailed look at one of the directories. 

In [21]:
ls _ligands/methanol/

corr.ch      ligand_conv.mol2  min00001.info   solvated.pdb   vacuum.parm7
gaff.mol2    ligand.frcmod     min00001.out    solvated.rst7  vacuum.pdb
leap.log     ligand_tmp.mol2   min00001.rst7   sqm.in         vacuum.rst7
ligand.ac    min00001.en       solute.pdb      sqm.out
ligand.ac.0  min00001.in       solvated.parm7  sqm.pdb


If you wanted to run a simple MD simulation of methanol in water, you could use the `solvated.rst7` (coordinate file) and `solvated.parm7` (topology file) files. The coordinate file contains the energy-minimised coordinates as generated by the minimisation protocol in the `morph.in` file. But since we are interested in running an alchemical free energy perturbation from ethane to methanol, `_perturbations` is actually the directory we are interested in. 

**Task: visualise the solvated molecule `solvated.pdb` using nglview below**

### Perturbation files
Let's take a closer look at `_perturbations`. It has a subdirectory called `sire`, because we chose Sire as the alchemical free energy simulation engine, which in turn has a subdirectory called `ethane~methanol` indicating the perturbation we want to simulate. 

In [22]:
ls _perturbations/sire/ethane~methanol/

leap.log     mcs.mol2      MORPH.onestep.pert  vacuum.pdb
ligand.flex  MORPH.frcmod  solvated/           vacuum.rst7
mcs_map.pkl  MORPH.mol2    vacuum.parm7


The files of interest are the `MORPH.onestep.pert` file, `vacuum.parm7` (topology), `vacuum.rst7` (coordinates) and, in the directory `solvated`, the solution equivalents `solvated.parm7` and `solvated.rst7`. The vacuum and solution files will be used to simulate the vacuum and solution legs of the thermodynamic cycle. Let's take a closer look at `MORPH.onestep.pert`:

In [25]:
!head -n11 _perturbations/sire/ethane~methanol/MORPH.onestep.pert

version 1
molecule LIG
	atom
		name C1
		initial_type    c3
		final_type      c3
		initial_LJ      3.39771  0.10780
		final_LJ        3.39771  0.10780
		initial_charge -0.09435
		final_charge    0.11670
	endatom


The file defines, for each atom, the initial and final atom types, the initial and final Lennard-Jones parameters, and the initial and final partial charges. In the perturbation from ethane to methanol the first C atom (C1) stays a carbon atom, but its partial charge changes because its environment changes: initially it has a methyl group attached, and after the transformation an OH group. The second C atom becomes an oxygen atom after the perturbation. So let's have a look at that atom in the `.pert` file. 

In [27]:
!sed -n '12,20p' _perturbations/sire/ethane~methanol/MORPH.onestep.pert

	atom
		name C2
		initial_type    c3
		final_type      oh
		initial_LJ      3.39771  0.10780
		final_LJ        3.24287  0.09300
		initial_charge -0.09435
		final_charge   -0.59880
	endatom


You can now see that the initial atom type is `c3` and the final atom type is `oh`, meaning that we are going from an sp3-hybridised carbon atom to an oxygen atom that is bonded to a hydrogen. The Lennard-Jones terms change a little and the partial charges do too. The alchemical free energy software can then create a linear perturbation between the initial and final values to slowly morph from one to the other.  
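
To make the idea of a linear perturbation concrete, here is a minimal sketch that reads the two atom records shown above and mixes their initial and final values with a coupling parameter `lam` between 0 and 1. The parser only understands the fields visible in this excerpt and will fail on anything else inside an `atom` block; the names `parse_pert_atoms` and `interpolate` are made up for this example. How Sire actually scales parameters along the perturbation may well differ, so treat the simple linear mixing as an assumption.

```python
# The two atom records shown above, copied from MORPH.onestep.pert.
PERT_TEXT = """\
version 1
molecule LIG
    atom
        name C1
        initial_type    c3
        final_type      c3
        initial_LJ      3.39771  0.10780
        final_LJ        3.39771  0.10780
        initial_charge -0.09435
        final_charge    0.11670
    endatom
    atom
        name C2
        initial_type    c3
        final_type      oh
        initial_LJ      3.39771  0.10780
        final_LJ        3.24287  0.09300
        initial_charge -0.09435
        final_charge   -0.59880
    endatom
"""

TEXT_FIELDS = ("name", "initial_type", "final_type")


def parse_pert_atoms(text):
    """Collect each atom ... endatom block into a dictionary."""
    atoms, current = [], None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        key, values = fields[0], fields[1:]
        if key == "atom":
            current = {}
        elif key == "endatom":
            atoms.append(current)
            current = None
        elif current is not None:
            if key in TEXT_FIELDS:
                current[key] = values[0]
            else:  # numeric fields: LJ parameters and charges
                current[key] = tuple(float(v) for v in values)
    return atoms


def interpolate(initial, final, lam):
    """Linear mixing: (1 - lam) * initial + lam * final, element by element."""
    return tuple((1.0 - lam) * a + lam * b for a, b in zip(initial, final))


c2 = parse_pert_atoms(PERT_TEXT)[1]
for lam in (0.0, 0.5, 1.0):
    charge = interpolate(c2["initial_charge"], c2["final_charge"], lam)[0]
    print(c2["name"], lam, round(charge, 5))
```

At `lam = 0` the atom carries the ethane carbon charge, at `lam = 1` the methanol oxygen charge, and intermediate values describe the unphysical states in between.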

### Adding more perturbations
Now that we understand the output generated by FESetup, let's see how we can change the input to get a different output. 

**Task: Can you modify the `morph.in` file so that we generate not only an ethane to methanol, but also a methanol to ethane perturbation file?**

Using nano, open `morph.in` and identify what you need to change in order to generate not only ethane > methanol, but also methanol > ethane. 

In [None]:
## Now run the modified file through FESetup; add the required code below


We should now have an additional perturbation in the `_perturbations/sire` directory. 

In [28]:
ls _perturbations/sire

[0m[01;34methane~methanol[0m/  [01;34mmethanol~ethane[0m/


**Task: visualise the `vacuum.pdb` file of the new methanol~ethane perturbation using nglview**

In [None]:
## add code to load vacuum.pdb into nglview




Questions:
- Why does the molecule look more like ethane than methanol?
- Can you identify the atom types that are connected to the oxygen atom? Think of what was covered in the lecture. 

### Equilibration in FESetup
FESetup can also be used to equilibrate the solvation box with the solute in it straight away. This is done by adding equilibration information to the input file. For this purpose a second FESetup directory, `FESetup_02`, has been prepared. Change to that directory and have a look at the `equilibrate.in` input file. 

In [30]:
cd ../FESetup_02/

/home/ppxasjsm/Projects/CCP-BioSim-Workshop/Exercises/01_FESetup_ethane_methanol/FESetup_02


In [39]:
!sed -n '10,50p' equilibrate.in

[ligand]
basedir = solutes 
file.name = solute.pdb
molecules = ethane, methanol

# the following are required to create the morph in solution
box.type = rectangular
box.length = 12.0
neutralize = yes
min.nsteps = 100

min.ncyc = 100
min.restr_force = 10.0
min.restraint = notsolvent

# heat the system to the final temperature running NVT
md.heat.nsteps = 500
md.heat.T = 300.0
md.heat.restraint = notsolvent
md.heat.restr_force = 5.0

# fix the density of the system running NpT
md.press.nsteps = 1000
md.press.T = 300.0
md.press.p = 1.0
md.press.restraint = notsolvent
md.press.restr_force = 4.0

# restraints release in 4 steps, this is a NpT protocol
md.relax.nrestr = 4
md.relax.nsteps = 500
md.relax.T = 300.0
md.relax.p = 1.0
md.relax.restraint = notsolvent


The file now contains a set of directives that will run an NVT simulation to heat the system to 300 K, followed by an NPT simulation to equilibrate the density, and then a protocol that slowly releases the restraint on the solute molecule in the NPT ensemble. Note, however, that we have not specified any kind of morphing. It is often useful not to have the morph and the equilibration happen in one go, because you often want to run the equilibrations individually in parallel on a cluster and only run the morphing once they have completed. 
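
To get a quick overview of such a protocol, the `md.<stage>.<parameter>` directives can be grouped by stage. The sketch below does this with Python's standard `configparser` on a copy of the `md` directives shown above. It is only an inspection aid under the assumption that the file stays INI-like; the function name `md_stages` is made up here, and the order it reports is simply the order of appearance in the file, not something checked against FESetup.

```python
import configparser

# Copy of the md.* directives from equilibrate.in shown above.
EQUILIBRATE_IN = """\
[ligand]
md.heat.nsteps = 500
md.heat.T = 300.0
md.heat.restraint = notsolvent
md.heat.restr_force = 5.0
md.press.nsteps = 1000
md.press.T = 300.0
md.press.p = 1.0
md.press.restraint = notsolvent
md.press.restr_force = 4.0
md.relax.nrestr = 4
md.relax.nsteps = 500
md.relax.T = 300.0
md.relax.p = 1.0
md.relax.restraint = notsolvent
"""


def md_stages(config_text, section="ligand"):
    """Group md.<stage>.<parameter> directives into one dict per stage."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the upper-case T in md.heat.T
    parser.read_string(config_text)
    stages = {}
    for option, value in parser[section].items():
        parts = option.split(".")
        if len(parts) == 3 and parts[0] == "md":
            stages.setdefault(parts[1], {})[parts[2]] = value
    return stages


for stage, params in md_stages(EQUILIBRATE_IN).items():
    print(stage, params)
```

Each stage's temperature, pressure and restraint settings then sit side by side, which makes it easier to see, for example, that only the `press` and `relax` stages set a pressure.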

In [None]:
# Insert code to run the equilibration 



With the equilibration successfully run, we can now run the morphing. You should notice that no `_perturbations` directory has been created by FESetup yet. 

In [None]:
ls

This step will be very quick. For this purpose we have a `morph.in` file again. Have a look at it. It is much shorter and mostly defines the type of simulation engine and the fact that we want to morph the solution phase as well, and not just the vacuum phase, as indicated by the `box.type = rectangular` and `box.length = 12.0` directives. 

In [None]:
# Insert the code to then run the morphing



Make sure that the perturbations were actually successfully generated. 

In [None]:
ls _perturbations/sire

### Running simulation in vacuum
This bit may be cut out, if the server we are running on is too slow, so don't worry about this bit yet. 

Well done! You have now started running a short simulation with a free energy perturbation in vacuum, let's go and have a tea/coffee while we wait for this to finish!