# Calculating PMFs with AWH in GROMACS <a id='another_cell'></a>
Here we learn how to calculate the potential of mean force
(PMF) along a reaction coordinate (RC) using the accelerated weight
histogram method (AWH) in GROMACS.  We will go through how to set
up the required input files and how to extract and analyze the
output after running the simulation. For more information about
the AWH method itself and how it can be used we refer to \cite{}[TODO how cite?].
What you need to know right now is that AWH applies a time-dependent bias
potential along the chosen RC, which is tuned during the simulation
such that it "flattens" the barriers in the PMF to improve sampling along the RC.
With better sampling, the PMF can be calculated more accurately than with unbiased MD. [TODO Movie/figure?]
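
To make the "flattening" idea concrete, here is a minimal toy sketch in Python. It is not the AWH algorithm: the double-well `pmf` function, the temperature and the grid are made up for illustration, and the bias is simply assumed to have already converged to the negative of the PMF. It compares how often the barrier top is visited relative to a well, with and without such a bias.

```python
import math

kT = 2.494  # kJ/mol at roughly 300 K (assumed for this toy example)

def pmf(x):
    """Made-up double-well PMF in kJ/mol: wells at x = +/-1, 20 kJ/mol barrier at x = 0."""
    return 20.0 * (x * x - 1.0) ** 2

def normalized_distribution(energy, xs):
    """Boltzmann weights exp(-E/kT), normalized to sum to 1 over the grid."""
    weights = [math.exp(-energy(x) / kT) for x in xs]
    total = sum(weights)
    return [w / total for w in weights]

xs = [-1.5 + 0.05 * i for i in range(61)]  # grid from -1.5 to 1.5

# Unbiased sampling: the barrier top is visited very rarely.
unbiased = normalized_distribution(pmf, xs)

def biased_energy(x):
    """PMF plus an idealized, fully converged bias U(x) = -PMF(x)."""
    return pmf(x) + (-pmf(x))

# Biased sampling: the effective free energy landscape is flat.
biased = normalized_distribution(biased_energy, xs)

barrier_index = 30  # grid point at x = 0
well_index = 50     # grid point at x = 1
ratio_unbiased = unbiased[barrier_index] / unbiased[well_index]
ratio_biased = biased[barrier_index] / biased[well_index]
print(f"P(barrier)/P(well), unbiased: {ratio_unbiased:.2e}")
print(f"P(barrier)/P(well), biased:   {ratio_biased:.2f}")
```

In a real AWH run the bias is not known in advance; it is built up gradually during the simulation, which is why the bias is time-dependent.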

*Author: Viveca Lindahl   
Email: vivecal@kth.se*

## The case study: DNA base pair opening
We will calculate a PMF for opening a DNA base
pair. The DNA double helix is a very stable structure. Opening a base pair
requires breaking hydrogen bonds between the bases and crossing a high free energy
barrier.  That's why we need to enhance the sampling by applying a bias!
<img src="figs/dna-helix.png" alt="dna" style="height: 300px;"/>
As our RC we use the distance between the two atoms forming the central hydrogen bond between the two bases in a pair. Let's have a look at the system and the reaction coordinate using VMD. The `-e` flag below tells VMD to execute the commands in the Tcl script that follows it. These commands change how the system is visually represented. For instance, we have hidden all the water to better see the DNA itself, and we have put punchy colors on the atoms defining the RC of our target base pair. Now run this and VMD should pop up:

In [1]:
!vmd visualization/dna-centered.gro -e visualization/representation.tcl

/usr/local/lib/vmd/vmd_LINUXAMD64: /usr/lib/nvidia-384/libGL.so.1: no version information available (required by /usr/local/lib/vmd/vmd_LINUXAMD64)
Info) VMD for LINUXAMD64, version 1.9.3 (November 30, 2016)
Info) http://www.ks.uiuc.edu/Research/vmd/                         
Info) Email questions and bug reports to vmd@ks.uiuc.edu           
Info) Please include this reference in published work using VMD:   
Info)    Humphrey, W., Dalke, A. and Schulten, K., `VMD - Visual   
Info)    Molecular Dynamics', J. Molec. Graphics 1996, 14.1, 33-38.
Info) -------------------------------------------------------------
Info) Multithreading available, 8 CPUs detected.
Info)   CPU features: SSE2 AVX AVX2 FMA 
Info) Free system memory: 11GB (71%)
Info) Creating CUDA device pool and initializing hardware...
Info) Detected 1 available CUDA accelerator:
Info) [0] GeForce GTX 760     6 SM_3.0 @ 1.07 GHz, 1.9GB RAM, KTO, AE1, ZCP
Info) OpenGL renderer: GeForce GTX 760/PCIe/SSE2
Info)   Features: STENCIL 

Rotate the structure and look for the two (nitrogen) atoms in green. The distance between them will serve as our RC for that base pair. Now quit VMD (in the *VMD Main* window: *File* > *Quit*).
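
As a rough illustration of what "the distance between two atoms" means numerically, the sketch below reads atom positions from text in the fixed-width `.gro` format and computes a distance in nm. The three-atom structure is made up (it is not the tutorial system), the helper names are hypothetical, and periodic boundary conditions are ignored for simplicity.

```python
import math

# Fixed-width layout of an atom line in a .gro file:
# residue number, residue name, atom name, atom number, x, y, z (nm).
GRO_ATOM_FORMAT = "%5d%-5s%5s%5d%8.3f%8.3f%8.3f"

# Made-up toy structure, only for demonstrating the parsing.
toy_atoms = [
    (1, "DA", "N1", 1, 1.000, 1.000, 1.000),
    (2, "DT", "N3", 2, 1.300, 1.400, 1.000),
    (3, "SOL", "OW", 3, 2.000, 2.000, 2.000),
]
gro_lines = ["Toy structure", "%5d" % len(toy_atoms)]
gro_lines += [GRO_ATOM_FORMAT % atom for atom in toy_atoms]
gro_lines.append("   3.00000   3.00000   3.00000")  # box vectors

def read_gro_positions(lines):
    """Map atom number -> (x, y, z) using the fixed .gro columns."""
    n_atoms = int(lines[1])
    positions = {}
    for line in lines[2:2 + n_atoms]:
        atom_number = int(line[15:20])
        positions[atom_number] = tuple(
            float(line[20 + 8 * k:28 + 8 * k]) for k in range(3)
        )
    return positions

def distance(a, b):
    """Plain Euclidean distance; no periodic boundary handling."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

positions = read_gro_positions(gro_lines)
rc = distance(positions[1], positions[2])
print(f"Distance between atoms 1 and 2: {rc:.3f} nm")
```

During the actual simulation GROMACS computes this quantity itself through the pull code, so a script like this is only useful as a sanity check.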

## The MD parameter (.mdp) file 
Here we assume we have already built and equilibrated the system, so we are almost ready to go. We only need to add some extra parameters to the mdp file to use AWH. Go to the directory that holds all the run files of our first AWH example and have a look at its contents:

In [14]:
%cd awh-1d
!ls -l
!pwd

[Errno 2] No such file or directory: 'awh-1d'
/home/viveca/awh-tutorial/dna-base-pair-opening/awh-1d
total 744
lrwxrwxrwx 1 viveca ipausers     31 May 25 14:40 amber99bsc1.ff -> ../template-npt/amber99bsc1.ff/
lrwxrwxrwx 1 viveca ipausers     24 May 25 14:40 conf.gro -> ../template-npt/conf.gro
-rw-r--r-- 1 viveca ipausers   3005 May 25 14:42 grompp.mdp
-rw-r--r-- 1 viveca ipausers 181632 May 22 13:39 index.ndx
-rw-r--r-- 1 viveca ipausers  14025 May 25 14:43 mdout.mdp
lrwxrwxrwx 1 viveca ipausers     25 May 25 14:40 topol.top -> ../template-npt/topol.top
-rw-r--r-- 1 viveca ipausers 554072 May 25 14:43 topol.tpr
/home/viveca/awh-tutorial/dna-base-pair-opening/awh-1d


Find the differences between this mdp file for AWH and an mdp file for a vanilla MD simulation:

In [8]:
!diff ../template-npt/grompp.mdp grompp.mdp

3c3
< nsteps                   = 25000000
---
> nsteps                   = 100000000
35a36,59
> 
> pull                     = yes                 # The reaction coordinate (RC) is defined using pull coordinates.
> pull-ngroups             = 2                   # The number of atom groups needed to define the pull coordinate.
> pull-ncoords             = 1                   # Number of pull coordinates.
> pull-nstxout             = 5000                # Step interval to output the coordinate values to the pullx.xvg file.
> pull-nstfout             = 0                   # Step interval to output the applied force, skip this output here.
> 					       
> pull-group1-name         = base_N1orN3         # Name of pull group 1 corresponding to an entry in an index file.
> pull-group2-name         = partner_N1orN3      # Same, but for group 2.
> 					       
> pull-coord1-groups       = 1 2                 # Which groups define coordinate 1? Here, groups 1 and 2.
> pull-coord1-

Here '<' refers to the content of the first argument to `diff` (the NPT mdp file) and '>' to the second (the AWH mdp file). So, for example, we increased the number of steps (`nsteps`) for AWH. The more relevant parameters are the ones prefixed with `pull` and `awh`. What do these parameters mean? Click here to Google for ["gromacs documentation"](http://www.google.com/search?q=gromacs+documentation). Hint: we are using GROMACS 2018 and we are interested in "Molecular dynamics parameters (.mdp options)". The comments I put in the mdp file should be enough here, though.
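
If you prefer to inspect the parameters programmatically, a minimal sketch like the one below collects the `pull` and `awh` entries of an mdp file into a dictionary. The `parse_mdp` helper is hypothetical, the sample text only repeats a few lines from the listing above, and treating `#` as a comment character (in addition to the usual `;`) is an assumption based on that listing.

```python
def parse_mdp(text):
    """Return {parameter: value} from mdp-style 'key = value' lines."""
    params = {}
    for raw_line in text.splitlines():
        # Strip comments. ';' is the usual mdp comment character; '#' is also
        # stripped because the listing above uses it (assumption of this sketch).
        line = raw_line.split(";", 1)[0].split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        # GROMACS treats '-' and '_' in parameter names as interchangeable.
        params[key.strip().replace("_", "-")] = value.strip()
    return params

# A few lines copied from the diff output above.
MDP_TEXT = """
nsteps                   = 100000000
pull                     = yes                 # The reaction coordinate (RC) is defined using pull coordinates.
pull-ngroups             = 2                   # The number of atom groups needed to define the pull coordinate.
pull-ncoords             = 1                   # Number of pull coordinates.
pull-group1-name         = base_N1orN3         # Name of pull group 1.
pull-group2-name         = partner_N1orN3      # Same, but for group 2.
pull-coord1-groups       = 1 2                 # Which groups define coordinate 1?
"""

params = parse_mdp(MDP_TEXT)
bias_params = {
    key: value for key, value in params.items()
    if key.startswith(("pull", "awh"))
}
for key, value in bias_params.items():
    print(f"{key:20s} {value}")
```

To run it on the real file, replace `MDP_TEXT` with the contents of `grompp.mdp`, for instance read with `open("grompp.mdp").read()`.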

## The index (.ndx) file
We saw that the .mdp file now depends on some definitions of atom groups; we need to have an index file for these. Here our groups are as simple as they get: each group contains a single nitrogen atom. But don't get tempted to edit an index file manually! The traditional tool to use is `gmx make_ndx` and a more general and powerful tool is `gmx select`. We focus on AWH here and provide the index file, and leave the index file generation as an [exercise](#sec:make-index). Let's double-check that the groups in the .mdp file are actually defined:


In [10]:
!grep -A 1  N1orN3  index.ndx  # '-A 1' to show also 1 line after the match 

[ base_N1orN3 ]
 338 
--
[ partner_N1orN3 ]
 936 


One atom per group looks right. In a real study, a better check would be to visualize these atom indices (e.g. with VMD).
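
The same check can be sketched in Python, which scales better when there are many groups. The `parse_ndx` helper below is hypothetical and assumes the simple index file layout seen in the `grep` output: a `[ name ]` header followed by whitespace-separated atom numbers. The sample text is copied from that output.

```python
import re

def parse_ndx(text):
    """Map group name -> list of atom numbers from index-file text."""
    groups = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"\s*\[\s*(.+?)\s*\]\s*$", line)
        if header:
            current = header.group(1)
            groups[current] = []
        elif current is not None:
            groups[current].extend(int(token) for token in line.split())
    return groups

# Group entries as shown by the grep command above.
NDX_TEXT = """[ base_N1orN3 ]
 338
[ partner_N1orN3 ]
 936
"""

groups = parse_ndx(NDX_TEXT)

# Each pull group named in the mdp file should exist and hold exactly one atom.
problems = []
for name in ("base_N1orN3", "partner_N1orN3"):
    if name not in groups:
        problems.append(f"group {name} is missing")
    elif len(groups[name]) != 1:
        problems.append(f"group {name} has {len(groups[name])} atoms, expected 1")

print(problems if problems else "All pull groups look fine.")
```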

## Starting and analyzing the simulation
Now generate the tpr file as usual with `grompp` (assuming the default names of the input files: grompp.mdp, conf.gro, topol.top, index.ndx):

In [18]:
!gmx grompp -n -quiet

Replacing old mdp entry 'nstxtcout' by 'nstxout-compressed'
Setting the AWH bias MC random seed to -310591473
Setting the LD random seed to -745771594
Generated 2485 of the 2485 non-bonded parameter combinations
Generating 1-4 interactions: fudge = 0.5
Generated 2485 of the 2485 1-4 parameter combinations
Excluding 3 bonded neighbours molecule type 'DNA_chain_A'
turning H bonds into constraints...
Excluding 3 bonded neighbours molecule type 'DNA_chain_B'
turning H bonds into constraints...
Excluding 2 bonded neighbours molecule type 'SOL'
turning H bonds into constraints...
Excluding 1 bonded neighbours molecule type 'NA'
turning H bonds into constraints...
Removing all charge groups because cutoff-scheme=Verlet
Pull group 1 'base_N1orN3' has 1 atoms
Pull group 2 'partner_N1orN3' has 1 atoms
Number of degrees of freedom in T-Coupling group System is 19587.00
Determining Verlet buffer for a tolerance of 0.005 kJ/mol/ps at 300 K
Calculated rlist for 1x1 atom pair-list as 1.034 nm, buffer

Note that grompp prints the values of the pull coordinates in the provided .gro file. Do they look reasonable? Now assume we have run a simulation. Some information related to the initial convergence of AWH can be found in the `mdrun` log file.

In [24]:
!grep 'covering' data/md.log
!grep 'out of the' data/md.log

awh1: covering at t = 1243.8 ps. Decreased the update size.
awh1: covering at t = 1530 ps. Decreased the update size.
awh1: out of the initial stage at t = 1530.


After exiting the initial stage, the free energy update size will decrease steadily with time.
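
For longer runs or several AWH biases it can be handy to pull these times out of the log automatically. The sketch below is one way to do it, assuming the log lines have exactly the form shown in the output above; the regular expressions and variable names are my own and are not part of GROMACS.

```python
import re

# Log lines copied from the grep output above.
LOG_TEXT = """awh1: covering at t = 1243.8 ps. Decreased the update size.
awh1: covering at t = 1530 ps. Decreased the update size.
awh1: out of the initial stage at t = 1530.
"""

NUMBER = r"(\d+(?:\.\d+)?)"
COVERING = re.compile(r"^(awh\d+): covering at t = " + NUMBER + r" ps")
EXIT = re.compile(r"^(awh\d+): out of the initial stage at t = " + NUMBER)

covering_times = {}  # bias name -> list of covering times (ps)
exit_times = {}      # bias name -> time of leaving the initial stage (ps)

for line in LOG_TEXT.splitlines():
    match = COVERING.match(line)
    if match:
        covering_times.setdefault(match.group(1), []).append(float(match.group(2)))
        continue
    match = EXIT.match(line)
    if match:
        exit_times[match.group(1)] = float(match.group(2))

for bias, times in covering_times.items():
    print(f"{bias}: {len(times)} coverings at {times} ps")
for bias, time in exit_times.items():
    print(f"{bias}: left the initial stage at t = {time} ps")
```

To use it on a real run, read the text from the log file instead, for instance with `open("data/md.log").read()`.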



## Extracurricular: effect of changing the force constant

## Extracurricular: effect of changing the diffusion parameter

## Extracurricular: making an index file  with `gmx select`
<a id='sec:make-index'></a>

The traditional tool for making index files is `gmx make_ndx`, but here I'll instead show how to use the flashier and more general tool `gmx select`. Learn about it using `gmx select -h`, as for any gmx command. Alternatively, the same information can be found in the online GROMACS docs that you hopefully found above. To make an index file we need to provide either a selection file (flag `-sf`) or a selection string (`-select`).
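
As a starting point, the sketch below only assembles a possible `gmx select` command line in Python and prints it without running anything. Several things here are assumptions: the output file name `index-new.ndx` is made up, the atoms are picked by number (338 and 936, taken from the provided index.ndx) purely for illustration, and the `"name" expression` form for naming a selection should be checked against `gmx help selections`. In a real study you would rather select by residue and atom name.

```python
import shlex

def build_gmx_select_command(structure, output_ndx, groups):
    """Assemble (but do not run) a gmx select command as an argument list.

    groups maps the desired index group name to a single atom number.
    """
    # Selections are separated by ';'. Each one is given a name in double quotes.
    selection = "; ".join(
        f'"{name}" atomnr {atom_number}' for name, atom_number in groups.items()
    )
    return [
        "gmx", "select",
        "-s", structure,       # structure file providing the atoms
        "-on", output_ndx,     # index file to write
        "-select", selection,  # selection string
    ]

command = build_gmx_select_command(
    "conf.gro",
    "index-new.ndx",  # hypothetical output name, to avoid overwriting index.ndx
    {"base_N1orN3": 338, "partner_N1orN3": 936},
)
print(shlex.join(command))
```

The printed line can be pasted into a terminal or into a notebook cell prefixed with `!`. Afterwards, check the resulting groups with the same `grep` command that was used on index.ndx.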


In [7]:
%pwd
%ls -l

total 32
drwxr-xr-x 2 viveca ipausers  4096 May 25 14:43 awh-1d/
-rw-r--r-- 1 viveca ipausers 14273 May 25 14:52 awh-tutorial.ipynb
drwxr-xr-x 2 viveca ipausers  4096 May 22 17:58 figs/
drwxr-xr-x 3 viveca ipausers  4096 May 23 10:21 template-npt/
drwxr-xr-x 2 viveca ipausers  4096 May 23 17:33 visualization/
