# <span style="color:teal">Alchemical free energy setup</span>

This Jupyter notebook is an introduction to alchemical free energy methods with BioSimSpace for the September 2022 CCPBioSim Workshop.
It includes core as well as <span style="color:purple">extra</span> options.

<a rel="license" href="http://creativecommons.org/licenses/by/4.0/"><img alt="Creative Commons Licence" style="border-width:0" src="https://i.creativecommons.org/l/by/4.0/88x31.png" title='This work is licensed under a Creative Commons Attribution 4.0 International License.' align="right"/></a>

**<span style="color:teal">Authors</span>**
- [Antonia Mey -- @ppxasjsm](https://github.com/ppxasjsm)   
- [Lester Hedges -- @lohedges](https://github.com/lohedges)
- edited and expanded by [Finlay Clark -- @fjclark](https://github.com/fjclark) and [Anna Herz -- @annamherz](https://github.com/annamherz)

**<span style="color:teal">Reading Time:</span>**
~ 30 mins

##### <span style="color:teal">Required knowledge</span> 
 - Basic Python
 - Presentation: [<span style="color:pink">Introduction to alchemical free energy methods</span>](slides/CCPBioSim-FEP-tutorial.pdf)

##### <span style="color:teal">Learning objectives</span> 
- Set up an alchemical solvation free energy simulation using BioSimSpace and SOMD
- Set up an alchemical solvation free energy simulation using BioSimSpace and GROMACS
- Set up an alchemical binding free energy simulation for GROMACS and SOMD using BioSimSpace

You will be using the following functions in BioSimSpace:

- `BSS.IO.readMolecules()` To load the molecules
- `BSS.Parameters.gaff` To parameterise molecules using the Generalised Amber Force Field (GAFF)
- `BSS.Parameters.ff14SB` To parameterise a protein using FF14SB
- `BSS.Align.matchAtoms()` Matches atoms between molecules for the morphing, using a Maximum Common Substructure (MCS) search
- `BSS.Align.rmsdAlign()` Aligns the molecules to be morphed
- `BSS.Align.merge()` Creates a merged molecule used for alchemical simulations
- `BSS.Solvent.tip3p()` Solvates a molecule in a TIP3P water box
- `BSS.Protocol.FreeEnergy()` Defines the free energy protocol
- `BSS.FreeEnergy.Relative()` Sets up a leg for a relative free-energy simulation
- `BSS.FreeEnergy.Relative.analyse()` To analyse the results of the perturbation

### <span style="color:teal">Table of Contents</span>  
1. [Working with ligands](#lig)    
   1.1 [Loading ligands](#load)   
   1.2 [Parametrising ligands](#param)   
2. [Morphing ligands](#merge)   
   2.1 [MCS](#MCS)   
   2.2 [Creating a _merged_ molecule](#merged)   
   2.3 [Solvation](#solv)  
   2.4 [Reading and writing perturbable systems](#readwrite)   
3. [Running a solvation free energy simulation](#free)   
   3.1 [Exercises for solvation free energies](#exerc1)   
4. [Binding free energy simulation](#bind)   
   4.1 [Loading the protein and ligands](#prot)   
   4.2 [Parameterising a protein](#paramp)   
   4.3 [Morphing again](#morphp)   
   4.4 [Running the binding free energy simulation](#runbind)   
   4.5 [Exercises](#exerc2)    
5. [Analysis](#ana)   

 <span style="color:pink">Further reading </span> references some sections of the [LiveComs Best Practices for Alchemical Free Energy Calculations](https://livecomsjournal.org/index.php/livecoms/article/view/v2i1e18378).

**<span style="color:teal">Jupyter Cheat Sheet</span>**
- To run the currently highlighted cell and move focus to the next cell, hold <kbd>&#x21E7; Shift</kbd> and press <kbd>&#x23ce; Enter</kbd>;
- To run the currently highlighted cell and keep focus in the same cell, hold <kbd>Ctrl</kbd> and press <kbd>&#x23ce; Enter</kbd>;
- To get help for a specific function, place the cursor within the function's brackets, hold <kbd>&#x21E7; Shift</kbd>, and press <kbd>&#x21E5; Tab</kbd>;
- You can find the full documentation at [biosimspace.org](https://biosimspace.org).


First, let's import BioSimSpace!

In [2]:
import BioSimSpace as BSS

### <span style="color:teal">1. Free energy of solvation of ethane and methanol</span>
<a id="lig"></a>

We want to compute the relative free energy of hydration between ethane and methanol, $\Delta \Delta G_{\mathrm{hyd,\: ethane-methanol}}$. This is the difference between the hydration free energy of methanol and that of ethane, where each hydration free energy is the free energy change of transferring the molecule from the gas phase into water.

Below you can see a thermodynamic cycle for the relative hydration free energy of ethane and methanol:

![therm_cycle](images/Therm_cycle.png)

Because free energy is a state function, the total free energy change around the cycle is 0:

$\Delta G^{\mathrm{ethane}}_{\mathrm{hyd}} + \Delta G_{\mathrm{solv}} - \Delta G^{\mathrm{methanol}}_{\mathrm{hyd}} -  \Delta G_{\mathrm{vac}} = 0$

This allows us to obtain $\Delta \Delta G_{\mathrm{hyd,\: ethane-methanol}}$ in terms of $\Delta G_{\mathrm{solv}}$ and $\Delta G_{\mathrm{vac}}$:

$\Delta G^{\mathrm{methanol}}_{\mathrm{hyd}}  - \Delta G^{\mathrm{ethane}}_{\mathrm{hyd}} = \Delta G_{\mathrm{solv}} - \Delta G_{\mathrm{vac}}$

$\Delta \Delta G_{\mathrm{hyd,\: ethane-methanol}} = \Delta G_{\mathrm{solv}} - \Delta G_{\mathrm{vac}}$

Now we just need to compute these quantities using alchemical simulations.
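
As a minimal numerical sketch of the last equation, the helper below combines the two alchemical legs into a relative hydration free energy. The function name and the input values are hypothetical placeholders chosen only to illustrate the arithmetic; they are not simulation results.

```python
def relative_hydration_free_energy(dg_solv, dg_vac):
    """Return ddG_hyd = dG_solv - dG_vac (both legs in the same units)."""
    return dg_solv - dg_vac


# Hypothetical placeholder values in kcal/mol, not real results.
dg_solv = -3.0  # ethane -> methanol transformation in water
dg_vac = 3.0    # ethane -> methanol transformation in vacuum

ddg_hyd = relative_hydration_free_energy(dg_solv, dg_vac)
print(f"ddG_hyd (ethane -> methanol) = {ddg_hyd:.2f} kcal/mol")
```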

#### <span style="color:teal">1.1 Loading ligands</span>
<a id="load"></a>
Next, we read in the two molecules we want to perturb. In this case, this is ethane changing to methanol. You can use the BioSimSpace function `BSS.IO.readMolecules()` for this task.

In [34]:
# We assume the molecules to perturb are the first molecules in each system. (Each file contains a single molecule.)
# we use [0] to select this first molecule.
ethane = BSS.IO.readMolecules("input/ethane.pdb")[0]
methanol = BSS.IO.readMolecules("input/methanol.pdb")[0]

A quick way to check whether we have selected an atom, a molecule, or a system is to print the object, which tells us its BSS type.

In [12]:
print(ethane)
print(methanol)

<BioSimSpace.Molecule: nAtoms=8, nResidues=1>
<BioSimSpace.Molecule: nAtoms=6, nResidues=1>


The cell below illustrates the automatic instantiation of BSS objects further. If we do not select the first molecule when loading, the result is a system by default, even if the file contains only one molecule. Similarly, combining molecules automatically changes the BSS type of the resulting object.

In [13]:
test_ethane_system = BSS.IO.readMolecules("input/ethane.pdb")
print(test_ethane_system)
test_combined = (ethane + methanol)
print(test_combined)

<BioSimSpace.System: nMolecules=1>
<BioSimSpace.Molecules: nMolecules=2>


Beyond checking the type of the BSS object, it is also useful to check that we are reading in the right molecules. We can visualise them using the `View` class from `BSS.Notebook`.

In [14]:
BSS.Notebook.View("input/ethane.pdb").system()

NGLWidget(gui_style='ngl')

In [15]:
BSS.Notebook.View("input/methanol.pdb").system()

NGLWidget(gui_style='ngl')

#### <span style="color:teal">1.2 Parameterising molecules</span>
<a id="param"></a>

Currently we only have the coordinate information for the loaded molecules, so the first thing we need to do is generate some force field parameters. In this case, we will use the Generalised Amber Force Field [(GAFF)](http://ambermd.org/antechamber/gaff.html).

In [35]:
ethane = BSS.Parameters.gaff(ethane).getMolecule()
methanol = BSS.Parameters.gaff(methanol).getMolecule()

### <span style="color:teal">2. Creating merged system</span>
<a id="merge"></a>
Ethane and methanol now have all the properties required to run an MD simulation of each individually. Here, however, we want to create a morphed system, or `single topology`, for running an alchemical free energy calculation. In this case, two of the ethane hydrogens will turn into dummy atoms, and the second carbon and the third hydrogen will turn into the `OH` group of methanol.

There are different topology approaches (single or dual), and which one is used depends largely on the software. For more information, see the <span style="color:pink">further reading</span>: 7.1.1 (Topologies).

#### <span style="color:teal">2.1 Maximum Common Substructure (MCS)</span>
<a id="MCS"></a>
In order to automatically figure out which atoms are common between ethane and methanol we can use the `matchAtoms()` function of BioSimSpace. This will compute a Maximum Common Substructure (MCS) match and return a dictionary that maps the indices of atoms in the ethane molecule to the indices of the atoms in the methanol molecule to which they match. An example of what an MCS match might look like is shown here:

![MCS](images/MCS.png)


In [17]:
mapping = BSS.Align.matchAtoms(ethane, methanol)

# Mapping is a dictionary mapping atom indices in ethane to those in methanol.
print(mapping)

{0: 0, 2: 1, 1: 2, 6: 5, 3: 4, 4: 3}
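
The dictionary inversion used later in this notebook (`{v: k for k, v in mapping.items()}`) is only safe when no two atoms map to the same partner, because duplicate values would be silently overwritten. A small sketch of a stricter inversion, using the mapping printed above; `invert_mapping` is a hypothetical helper and not part of BioSimSpace:

```python
def invert_mapping(mapping):
    """Invert an atom mapping, raising if it is not one-to-one."""
    inverse = {}
    for idx0, idx1 in mapping.items():
        if idx1 in inverse:
            raise ValueError(f"Atom {idx1} is matched more than once")
        inverse[idx1] = idx0
    return inverse


# The ethane -> methanol mapping printed above.
mapping = {0: 0, 2: 1, 1: 2, 6: 5, 3: 4, 4: 3}
inv_mapping = invert_mapping(mapping)
print(inv_mapping)
```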


Once we have the mapping, we can align the molecules to each other using a root-mean-square deviation (RMSD) metric. From the alignment we can then create a merged molecule which contains all of the `single topology` information needed for the alchemical perturbation.

To visualise the mapping we can use:

In [18]:
BSS.Align.viewMapping(ethane, methanol, mapping)

<py3Dmol.view at 0x7f4881ff28e0>

This shows ethane, with the atoms that map to those in methanol highlighted in green. The numbers next to the atoms are their indices within the molecule (and mapping dictionary). To instead use methanol as the reference, we can swap the order of the molecules that are passed to the function and invert the mapping dictionary:

In [19]:
# get the inverse mapping
inv_mapping = {v: k for k, v in mapping.items()}
# view
BSS.Align.viewMapping(methanol, ethane, inv_mapping)

<py3Dmol.view at 0x7f49a831f1f0>

#### <span style="color:teal">2.2 Creating a _merged_ molecule</span>
<a id="merged"></a>

In order to perform an alchemical simulation we need to create a _merged_ molecule that combines the properties of the two molecules. To do so we first need to align one molecule to the other, based on the mapping. This can be achieved using the `rmsdAlign` function.    
As the mapping matches the atoms of ligand 0 (ethane) to those of ligand 1 (methanol), and we want to align ligand 1 to ligand 0 (i.e. align the methanol to the ethane), we need to use the inverse mapping for this:

In [20]:
# Align methanol to ethane based on the inverse mapping.
methanol_aligned = BSS.Align.rmsdAlign(methanol, ethane, inv_mapping)

We can now _merge_ the two molecules. This will create a composite molecule containing all of the molecular properties at both end states. If the molecules differ in size, then the smaller one will contain dummy atoms to represent the atoms that will _appear_ or _disappear_ during the perturbation. In this case, the merged methanol end state will contain two dummy atoms corresponding to the extra hydrogen atoms in the ethane molecule.

In [21]:
# Merge the ethane and methanol based on the mapping.
merged = BSS.Align.merge(ethane, methanol_aligned, mapping)
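
To see where the two dummy atoms come from, we can compare the mapping with the atom counts printed earlier (8 atoms in ethane, 6 in methanol). This is a plain-Python sketch that does not inspect the merged molecule itself; `unmapped_atoms` is a hypothetical helper.

```python
def unmapped_atoms(n_atoms, mapped_indices):
    """Return the sorted atom indices that have no partner in the mapping."""
    return sorted(set(range(n_atoms)) - set(mapped_indices))


# Mapping and atom counts as printed earlier in this notebook.
mapping = {0: 0, 2: 1, 1: 2, 6: 5, 3: 4, 4: 3}
n_ethane, n_methanol = 8, 6

# Ethane atoms with no methanol partner become dummies in the methanol end state.
ethane_only = unmapped_atoms(n_ethane, mapping.keys())
# Methanol atoms with no ethane partner would be dummies in the ethane end state.
methanol_only = unmapped_atoms(n_methanol, mapping.values())

print(ethane_only)    # [5, 7]
print(methanol_only)  # []
```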

#### <span style="color:teal">2.3 Solvation</span>
<a id="solv"></a>

Before we can run a free energy simulation we will have to solvate the system. In this case, rather than passing ethane and methanol separately we will solvate the whole merged system. Here we use a cubic box with a base length of 40 Angstroms.

In [22]:
solvated = BSS.Solvent.tip3p(molecule=merged, box=3*[40*BSS.Units.Length.angstrom])

You can see which solvation models are available in BSS by running `print(BSS.Solvent.waterModels())`.

#### <span style="color:teal">2.4 Reading and writing perturbable systems</span>
<a id="readwrite"></a>

You might wish to save a perturbable system to file for use in a future simulation, or to share with a colleague. To do so you can use the `BioSimSpace.IO.savePerturbableSystem` function. This writes the topology and coordinate files for the two end states, which can be read back to reconstruct the system. For example:

In [23]:
BSS.IO.savePerturbableSystem("pert", solvated)

There should now be four new AMBER format files in your working directory:

In [24]:
! ls pert*

pert0.prm7  pert0.rst7	pert1.prm7  pert1.rst7


Here `pert0.prm7` and `pert1.prm7` are the topology files for the two end states and `pert0.rst7` and `pert1.rst7` are the coordinates. To re-load the files we can use:

In [25]:
solvated = BSS.IO.readPerturbableSystem("pert0.prm7", "pert0.rst7", "pert1.prm7", "pert1.rst7")

### <span style="color:teal">3. Solvation free energy</span>
<a id="free"></a>
We now need to define a protocol to describe the parameters used for the free energy perturbation. A simple protocol consists of a 2 fs timestep, a runtime of 4 ns, and 9 equally spaced $\lambda$ windows:

In [26]:
protocol = BSS.Protocol.FreeEnergy(timestep=2*BSS.Units.Time.femtosecond, runtime=4*BSS.Units.Time.nanosecond, num_lam=9)
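
For a sense of scale, the protocol above implies the following amount of sampling. This is simple arithmetic on the stated timestep, runtime, and number of windows, assuming every window is run for the full runtime:

```python
timestep_fs = 2  # timestep in femtoseconds
runtime_ns = 4   # runtime per lambda window in nanoseconds
num_lam = 9      # number of lambda windows

fs_per_ns = 1_000_000
steps_per_window = runtime_ns * fs_per_ns // timestep_fs
total_sampling_ns = runtime_ns * num_lam

print(f"{steps_per_window:,} MD steps per window")
print(f"{total_sampling_ns} ns of sampling per leg")
```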

Next we want to create objects to configure and run the two legs associated with the relative free-energy perturbation calculation:

In [28]:
fep_free = BSS.FreeEnergy.Relative(solvated, protocol, work_dir="ethane_methanol_somd/free")
fep_vac  = BSS.FreeEnergy.Relative(merged.toSystem(), protocol, work_dir="ethane_methanol_somd/vacuum")

Decoupling the two legs means that we can use a different protocol for each, e.g. fewer lambda windows for the vacuum leg. It also lets us re-use data from a leg in an entirely different calculation, e.g. a binding free-energy simulation, or combine results for legs that were run with entirely different simulation engines, e.g. SOMD or GROMACS.

To run simulations for all of the individual lambda windows for the free leg, you can use:

`fep_free.run()`   

If you want to wait for the simulations to finish before the next part of the code is executed (i.e. the analysis), you can then call:

`fep_free.wait()`   

This only makes sense on a workstation with GPUs, on GPU cloud resources, or on a GPU cluster. Otherwise the simulations will take too long to run on the notebook server.

Let's have a look at the `ethane_methanol_somd/free` directory. It now contains all the files, set up and ready for simulation using SOMD, which is the default simulation engine.

In [29]:
! ls ethane_methanol_somd/free

lambda_0.0000  lambda_0.2500  lambda_0.5000  lambda_0.7500  lambda_1.0000
lambda_0.1250  lambda_0.3750  lambda_0.6250  lambda_0.8750
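
The directory names correspond to 9 equally spaced $\lambda$ values between 0 and 1. A short sketch that reproduces them; the `lambda_{value:.4f}` naming pattern is inferred from the listing above rather than taken from the BioSimSpace source:

```python
num_lam = 9

# Equally spaced lambda values from 0 to 1 inclusive.
lambdas = [i / (num_lam - 1) for i in range(num_lam)]
dir_names = [f"lambda_{lam:.4f}" for lam in lambdas]

print(dir_names)
```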


#### <span style="color:teal">3.1. Exercises</span>
<a id="exerc1"></a>

The exercises are announced by the keyword <span style="color:teal">Exercise</span> and followed by an incomplete cell.
Missing parts are indicated by:

```python
#FIXME
```

#### <span style="color:teal">3.1.1. Exercise on selecting lambda windows</span>

Above we defined a protocol with 9 $\lambda$ windows. For this system this isn't an optimal protocol and we would like to use 12 lambda windows instead. Can you write down a protocol that would allow you to run 12 rather than 9 lambda windows?

In [None]:
protocol = BSS.Protocol.FreeEnergy(#FIXME)


In [30]:
# answer
protocol = BSS.Protocol.FreeEnergy(timestep=2*BSS.Units.Time.femtosecond, runtime=4*BSS.Units.Time.nanosecond, num_lam=12)

#### <span style="color:teal">3.1.2. Exercise on merged molecules</span>

Previously we set up an ethane to methanol alchemical free energy simulation. One way of assessing the quality of a free energy estimate from an alchemical simulation is to run the simulation in the opposite direction, i.e. methanol to ethane. Can you set up a new merged molecule and run the necessary steps for the free energy setup?

In [None]:
mapping = #FIXME

# Align ethane to methanol (hint: this needs the inverse mapping).
ethane_aligned = BSS.Align.rmsdAlign(#FIXME)

# Merge the two molecules based on the mapping.
merged_methanol = #FIXME
solvated_methanol = #FIXME
fep_methanol_free = #FIXME
fep_methanol_vac = #FIXME

In [50]:
# answer
# map methanol to ethane
mapping = BSS.Align.matchAtoms(methanol, ethane)
# get the inverse mapping
inv_mapping = {v: k for k, v in mapping.items()}

# Align ethane to methanol based on the inverse mapping.
ethane_aligned = BSS.Align.rmsdAlign(ethane, methanol, inv_mapping)

# Merge the two molecules based on the mapping.
merged_methanol = BSS.Align.merge(methanol, ethane_aligned, mapping)
# solvate
solvated_methanol = BSS.Solvent.tip3p(molecule=merged_methanol, box=3*[40*BSS.Units.Length.angstrom])
# create the directories
fep_methanol_free = BSS.FreeEnergy.Relative(solvated_methanol, protocol, work_dir="methanol_ethane_somd/free")
fep_methanol_vac = BSS.FreeEnergy.Relative(merged_methanol.toSystem(), protocol, work_dir="methanol_ethane_somd/vacuum")
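
Once both directions have been run and analysed, the two estimates should be equal in magnitude and opposite in sign. A minimal sketch of that comparison is shown below, with hypothetical placeholder numbers (no simulations were analysed to produce them):

```python
def hysteresis(ddg_forward, ddg_reverse):
    """Return |ddG(A->B) + ddG(B->A)|, which is zero for a perfectly consistent pair."""
    return abs(ddg_forward + ddg_reverse)


# Hypothetical values in kcal/mol.
ddg_ethane_to_methanol = -6.1
ddg_methanol_to_ethane = 5.8

value = hysteresis(ddg_ethane_to_methanol, ddg_methanol_to_ethane)
print(f"hysteresis = {value:.2f} kcal/mol")
```

A large hysteresis relative to the statistical error suggests that at least one direction has not converged.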

#### <span style="color:teal">3.1.3. Exercise on using different simulation engines</span>

Currently, alchemical free energy simulations with SOMD and GROMACS are supported. Can you figure out how to set up and run the simulations for the free leg using GROMACS rather than the default of SOMD?

**Hint**: look at the `engine` keyword of `FreeEnergy.Relative()`. You might also want to change the working directory.

In [None]:
fep_gromacs_free = BSS.FreeEnergy.Relative(#FIXME)

In [42]:
# answer
fep_gromacs_free = BSS.FreeEnergy.Relative(solvated_methanol, protocol, work_dir="methanol_ethane_gromacs/free", engine="gromacs")

You will notice that a `free` directory was again created in your `work_dir`. Have a look at its contents. The input files that were generated are now for GROMACS rather than SOMD. Take a moment to look at the config files etc., e.g.:

In [52]:
! ls methanol_ethane_gromacs/free/lambda_0.0000
! ls methanol_ethane_somd/free/lambda_0.0000

gromacs.err  gromacs.mdp  gromacs.out.mdp  gromacs.tpr
gromacs.gro  gromacs.out  gromacs.top
somd.cfg  somd.err  somd.out  somd.pert  somd.prm7  somd.rst7


### <span style="color:teal">4. Free energy of binding</span>
<a id="bind"></a>

So far we have set up calculations for free energies of hydration. Next we'll learn how to use BioSimSpace to set up alchemical free energy simulations that can be used to compute free energies of binding. The thermodynamic cycle for the free energy of binding looks like this:

![reltherm](images/thermo_cycle_rel_eq.png)

In our case the host is lysozyme, an antimicrobial protein, which has been studied extensively using alchemical free energy calculations in the past.

#### <span style="color:teal">4.1 Loading the protein and ligands</span>
<a id="prot"></a>
Loading the protein is done in the same way as loading the small molecules. However, in order to compute free energies of binding, we have to make sure that the ligand is aligned with the protein and in an appropriate binding site. BioSimSpace is not a docking program. Therefore ligands will have to be aligned for alchemical free energy calculations in a different way. First, we will load the protein and ligands to check if they are aligned correctly for the calculation. 

In [68]:
# Load the protein and two ligands.
lysozyme = BSS.IO.readMolecules("input/protein.pdb")[0]
benzene = BSS.IO.readMolecules("input/benzene.mol2")[0]
o_xylene = BSS.IO.readMolecules("input/o-xylene.mol2")[0]

In [69]:
# Combine the molecules into a single container.
molecules = lysozyme + benzene + o_xylene

In [70]:
# Create a view to visualise the molecules.
view = BSS.Notebook.View(molecules)
# View the entire system.
view.system()

NGLWidget(gui_style='ngl')

#### <span style="color:teal">4.2 Parametrisation</span>
<a id="paramp"></a>

For the protein, we can use a standard Amber forcefield such as `Amber 14 SB`:

In [71]:
lysozyme = BSS.Parameters.ff14SB(lysozyme).getMolecule()

For the two ligands we can choose to parameterise them using `gaff2`.

In [72]:
o_xylene = BSS.Parameters.gaff2(o_xylene).getMolecule()
benzene = BSS.Parameters.gaff2(benzene).getMolecule()

#### <span style="color:teal">4.3 Morphing again</span>
<a id="morphp"></a>

Now all we have to do is go back through the morphing process and then combine the system. 

In [73]:
mapping = BSS.Align.matchAtoms(o_xylene, benzene)
inv_mapping = {v: k for k, v in mapping.items()}

# we can visualise this mapping again as before
BSS.Align.viewMapping(o_xylene, benzene, mapping)

<py3Dmol.view at 0x7f491e2a2c10>

In [74]:
# Align benzene to o_xylene based on the inverse mapping.
benzene_aligned = BSS.Align.rmsdAlign(benzene, o_xylene, inv_mapping)

# Merge the two ligands based on the mapping.
merged = BSS.Align.merge(o_xylene, benzene_aligned, mapping)

Next we need to create a composite system containing the merged molecule and the protein:

In [76]:
complx = merged + lysozyme

#### <span style="color:teal">4.4 Binding free energy simulation</span>
<a id="runbind"></a>
Now we can solvate and set up the binding free energy simulation. It looks very similar to the solvation one.

In [80]:
# Solvate the protein ligand complex in a 60 angstrom box of TIP3P water.
complex_sol = BSS.Solvent.tip3p(molecule=complx, box=3*[60*BSS.Units.Length.angstrom])

# Solvate the merged ligand in a 60 angstrom box of TIP3P water.
merged_sol = BSS.Solvent.tip3p(molecule=merged, box=3*[60*BSS.Units.Length.angstrom])

# Create the free energy protocol.
protocol = BSS.Protocol.FreeEnergy(runtime=4*BSS.Units.Time.nanosecond, num_lam=9)

# Initialise relative free energy objects for each leg.
# (Once again, this defaults to the SOMD engine.)
fep_bound = BSS.FreeEnergy.Relative(complex_sol, protocol, work_dir="o_xylene_benzene/bound")
fep_free  = BSS.FreeEnergy.Relative(merged_sol, protocol, work_dir="o_xylene_benzene/free")

`fep_bound.run()` would run the simulations for the _bound_ leg.

#### <span style="color:teal">4.5 Exercises</span>
<a id="exerc2"></a>
Exercises for binding free energies. 

#### <span style="color:teal">4.5.1 Box sizes of the solvated leg</span>
<span style="color:purple">Extra:</span>
One thing you may notice is that the box sizes of the bound and free legs are the same. This is wasteful, because you don't need such a large box just to run the ligand in water. Since the two legs are decoupled, we can simply solvate the ligand in a smaller box for the free leg.
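
To get a feel for the saving, we can estimate how many water molecules fill a cubic box from the number density of liquid water at ambient conditions, about 0.0334 molecules per cubic angstrom. This rough estimate ignores the volume taken up by the solute and is not how BioSimSpace places waters; `estimate_waters` is a hypothetical helper.

```python
WATER_NUMBER_DENSITY = 0.0334  # molecules per cubic angstrom (approximate, ambient conditions)


def estimate_waters(box_length_angstrom):
    """Roughly estimate the number of waters in a cubic box of the given edge length."""
    return round(WATER_NUMBER_DENSITY * box_length_angstrom**3)


for length in (60, 30):
    print(f"{length} angstrom box: ~{estimate_waters(length)} waters")
```

Halving the box length reduces the volume, and therefore the approximate number of waters, by a factor of eight.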

In [None]:
# Solvate the ligand in a smaller 30 angstrom box of TIP3P water.
merged_sol = #FIXME

# Recreate the object using the smaller system.
fep_free = BSS.FreeEnergy.Relative(merged_sol, protocol, work_dir="o_xylene_benzene/free")

In [78]:
# answer
# Solvate the ligand in a smaller 30 angstrom box of TIP3P water.
# Try using a work_dir called "exercise_4" and a box size for the free leg of the simulations of 30 Angstrom. 
merged_sol = BSS.Solvent.tip3p(molecule=merged, work_dir="exercise_4", box=3*[30*BSS.Units.Length.angstrom])

# Recreate the object using the smaller system.
fep_free = BSS.FreeEnergy.Relative(merged_sol, protocol, work_dir="o_xylene_benzene/free")

#### <span style="color:teal">4.5.2. Running a minimisation and equilibration before the production</span>

It is good practice to minimise and equilibrate the molecular system _before_ setting up the free energy simulations. Thankfully, BioSimSpace can handle systems containing perturbable molecules for simulation protocols other than `BioSimSpace.Protocol.FreeEnergy`. This means, for example, that you can create a process to minimise a specified end state (lambda = 0 by default) of a perturbable system. In the box below, write some code to run a minimisation and equilibration on the solvated complex (`complex_sol`).

**Hint**: use the [documentation](https://biosimspace.org/). Also when returning the system use `getSystem(block=True)` so that we wait for the minimisation and equilibration simulations to finish before returning the system.

In [None]:
minimised = #FIXME
equilibrated = #FIXME

# Create the free energy protocol.
protocol = BSS.Protocol.FreeEnergy(runtime=4*BSS.Units.Time.nanosecond, num_lam=9)

# Initialise relative free energy objects for each leg.
fep_bound = BSS.FreeEnergy.Relative(equilibrated, protocol, work_dir="exercise_4_5/bound")

In [None]:
# answer
# Minimise the solvated protein-ligand complex (not the ethane/methanol system from earlier).
minimised = BSS.Process.Gromacs(complex_sol, BSS.Protocol.Minimisation()).start().getSystem(block=True)

# Equilibrate the system.
equilibrated = BSS.Process.Gromacs(minimised, BSS.Protocol.Equilibration()).start().getSystem(block=True)

# Create the free energy protocol.
protocol = BSS.Protocol.FreeEnergy(runtime=4*BSS.Units.Time.nanosecond, num_lam=9)

# Initialise relative free energy objects for each leg.
fep_bound = BSS.FreeEnergy.Relative(equilibrated, protocol, work_dir="exercise_4_5/bound")

We can look at the folders created by this setup. Each lambda directory should contain the input files for the simulation engine (SOMD by default).

In [None]:
! ls exercise_4_5/bound
! ls exercise_4_5/bound/lambda_0.0000

#### <span style="color:teal">4.5.3. Running the production run</span>

As mentioned earlier, these free energy simulations can in general be run using:

`process.run()` and `process.wait()`.

This is not ideal for production runs, however, as it runs the lambda windows sequentially. It is best to run them via a bash script using a job scheduler such as Slurm. An example script for this is in the `o_xylene_benzene_for_analysis` folder. This is also discussed in more detail in the RBFE tutorial.

It is also important to note that for engines other than SOMD (such as GROMACS), additional equilibration is usually required at each lambda window, not just at lambda 0.0 before setting up the free energy simulations. This helps to ensure that the simulations do not fail. This setup is currently not supported in the main/devel branch of BSS, but is under development.

### <span style="color:teal">5. Analysis</span>
<a id="ana"></a>

A run with SOMD for o-xylene to benzene has already been carried out. The results are in the 'o_xylene_benzene_for_analysis' folder. In this section, we will look at how we can analyse this run.

 <span style="color:pink">Further reading </span>: 8.1, 8.2, 8.3


#### <span style="color:teal">5.1 Calculating the RBFE between a pair of ligands</span>

We will first calculate the RBFE. This is achieved relatively simply in BSS by using `analyse`. This uses the automated equilibration detection and statistical inefficiency calculation from the alchemlyb Python package to obtain uncorrelated samples. MBAR is then used as the default analysis method.

Once we have obtained the energy for each leg, we can then calculate the difference to obtain the RBFE.


In [None]:
pmf_free, overlap_matrix_free = BSS.FreeEnergy.Relative.analyse('o_xylene_benzene_for_analysis/free')
pmf_bound, overlap_matrix_bound = BSS.FreeEnergy.Relative.analyse('o_xylene_benzene_for_analysis/bound')
freenrg_rel = BSS.FreeEnergy.Relative.difference(pmf_bound, pmf_free)
print(freenrg_rel)
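
Conceptually, the relative binding free energy is the free energy change of the bound leg minus that of the free leg. A plain-Python sketch of that arithmetic is given below. Combining the two uncertainties in quadrature is an assumption of this sketch (it treats the legs as independent) and not necessarily what `difference` does internally, and the numbers are hypothetical placeholders.

```python
import math


def relative_binding_free_energy(dg_bound, err_bound, dg_free, err_free):
    """Return (ddG, error) with ddG = dG_bound - dG_free and errors added in quadrature."""
    ddg = dg_bound - dg_free
    err = math.sqrt(err_bound**2 + err_free**2)
    return ddg, err


# Hypothetical values in kcal/mol.
ddg, err = relative_binding_free_energy(2.5, 0.3, 1.0, 0.4)
print(f"ddG_bind = {ddg:.2f} +/- {err:.2f} kcal/mol")
```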

#### <span style="color:teal">5.2 Checking the overlap</span>
For MBAR, we can assess the reliability of the calculations by checking the phase space overlap between lambda windows. The off-diagonals should be at least 0.03 for the obtained free energy estimate to be reliable.

The `check_overlap` function should return a warning if the value for any off-diagonal is less than 0.03. Otherwise, there will be no output.

 <span style="color:pink">Further reading </span>: 8.5 (overlap matrix)

In [None]:
# Use the overlap matrices returned by analyse() above.
BSS.FreeEnergy.Relative.check_overlap(overlap_matrix_bound)
BSS.FreeEnergy.Relative.check_overlap(overlap_matrix_free)
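
The idea behind the check can be sketched without BioSimSpace. The helper below scans the elements directly above and below the diagonal, i.e. the overlap between neighbouring lambda windows, and reports those under the threshold. Restricting the check to neighbouring windows is an assumption of this sketch, `low_overlap_pairs` is a hypothetical helper, and the 3x3 matrix is made up for illustration:

```python
def low_overlap_pairs(overlap_matrix, threshold=0.03):
    """Return (i, j, value) for neighbouring windows whose overlap is below threshold."""
    flagged = []
    n = len(overlap_matrix)
    for i in range(n - 1):
        # Check both the upper and the lower first off-diagonal element.
        for a, b in ((i, i + 1), (i + 1, i)):
            value = overlap_matrix[a][b]
            if value < threshold:
                flagged.append((a, b, value))
    return flagged


# Made-up overlap matrix for three lambda windows (each row sums to 1).
example = [
    [0.70, 0.29, 0.01],
    [0.29, 0.69, 0.02],
    [0.01, 0.02, 0.97],
]
flagged = low_overlap_pairs(example)
print(flagged)
```

In this made-up example the last two windows overlap poorly, which in practice would suggest adding a lambda window between them.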

We can also plot the overlap for a chosen leg of a perturbation to visualise it.

In [None]:
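# Assumption: the plotting helper is alchemlyb's plot_mbar_overlap_matrix, which returns a matplotlib Axes.
import matplotlib.pyplot as plt
from alchemlyb.visualisation import plot_mbar_overlap_matrix

ax = plot_mbar_overlap_matrix(overlap_matrix_bound) # pick either the bound or the free overlap
ax.set_title("Overlap matrix")
plt.show()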

#### <span style="color:teal">5.3 Exercises</span>
<span style="color:purple">Extra:</span>
Instead of using MBAR, try using TI (Thermodynamic Integration) to analyse the results.
This can be done by setting the estimator to TI when analysing.

```python
pmf, overlap_matrix = BSS.FreeEnergy.Relative.analyse(f'{folder}', estimator="TI")
```

Are the results different?
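
For reference, TI estimates the free energy change by numerically integrating the average gradient $\langle \partial U / \partial \lambda \rangle$ over $\lambda$. The sketch below uses the trapezoidal rule with made-up gradient values; it illustrates the principle only and is not the estimator implementation used by the analysis function. `trapezoid_ti` is a hypothetical helper.

```python
def trapezoid_ti(lambdas, mean_gradients):
    """Integrate <dU/dlambda> over lambda with the trapezoidal rule."""
    total = 0.0
    for k in range(len(lambdas) - 1):
        width = lambdas[k + 1] - lambdas[k]
        total += 0.5 * width * (mean_gradients[k] + mean_gradients[k + 1])
    return total


# Made-up mean gradients (kcal/mol) at five equally spaced lambda values.
lambdas = [0.0, 0.25, 0.5, 0.75, 1.0]
mean_gradients = [4.0, 3.0, 2.0, 1.0, 0.0]

dg_ti = trapezoid_ti(lambdas, mean_gradients)
print(f"dG(TI) = {dg_ti:.2f} kcal/mol")
```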

Now that we know how to set up, run, and analyse a free energy perturbation in BSS, we will next look at how to carry this out for an entire network of perturbations in the RBFE tutorial.