# Calculating PMFs with AWH in GROMACS
Here we learn how to calculate the potential of mean force
(PMF) along a reaction coordinate (RC) using the accelerated weight
histogram method (AWH) in GROMACS. We will go through how to set
up the required input files and how to extract and analyze the
output once the simulation has run. For more information about
the AWH method itself and how it can be used we refer to \cite{}[TODO how cite?].
What you need to know right now is that AWH applies a time-dependent bias
potential along the chosen RC. The bias is tuned during the simulation
so that it "flattens" the barriers in the PMF, which improves sampling along the RC.
With better sampling, the PMF can be calculated more accurately than with unbiased MD. [TODO Movie/figure?]
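
To make the "flattening" idea concrete, here is a minimal Python sketch. It is not the AWH algorithm. The double-well PMF, the barrier height and the value of kT are made-up illustrative numbers, and the bias is simply set to the negative of the already known PMF, which is roughly what a fully converged flattening bias would aim for. AWH, in contrast, has to learn the bias during the simulation.

```python
import math

KT = 2.494  # kJ/mol, roughly 300 K (illustrative value)

def toy_pmf(x, barrier=30.0):
    """Made-up double-well PMF in kJ/mol with minima at x = -1 and x = +1."""
    return barrier * (x**2 - 1.0) ** 2

def boltzmann_weights(energies, kt=KT):
    """Normalized Boltzmann probabilities for a list of energies."""
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / kt) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]

# Grid along a hypothetical reaction coordinate.
xs = [-1.5 + 0.1 * i for i in range(31)]
pmf = [toy_pmf(x) for x in xs]

# Idealized bias that exactly cancels the PMF (an assumption of this toy model).
bias = [-f for f in pmf]
biased = [f + b for f, b in zip(pmf, bias)]

p_unbiased = boltzmann_weights(pmf)
p_biased = boltzmann_weights(biased)

# Grid point closest to the barrier top at x = 0.
i_top = min(range(len(xs)), key=lambda i: abs(xs[i]))
print(f"P(barrier top), unbiased: {p_unbiased[i_top]:.2e}")
print(f"P(barrier top), biased:   {p_biased[i_top]:.2e}")
```

In this toy model the barrier top is visited very rarely without the bias, while with the bias every grid point is equally likely.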

*Author: Viveca Lindahl   
Email: vivecal@kth.se*

## The case study: DNA base pair opening
We will calculate a PMF for opening a DNA base
pair. The DNA double helix is a very stable structure. Opening a base pair
requires breaking hydrogen bonds between the bases and crossing a high free energy
barrier.  That's why we need to enhance the sampling by applying a bias!
<img src="figs/dna-helix.png" alt="dna" style="height: 300px;"/>
As our RC we use the distance between the two atoms forming the central hydrogen bond between the two bases in a pair. Let's have a look at the system and the reaction coordinate using VMD. The `-e` flag below tells VMD to execute the commands in the given Tcl script. These commands change how the system is visually represented. For instance, we have hidden all the water to better see the DNA itself, and we have put punchy colors on the atoms defining the RC of our target base pair. Now run this and VMD should pop up:

In [6]:
!vmd visualization/dna-centered.gro -e visualization/representation.tcl

/usr/local/lib/vmd/vmd_LINUXAMD64: /usr/lib/nvidia-384/libGL.so.1: no version information available (required by /usr/local/lib/vmd/vmd_LINUXAMD64)
Info) VMD for LINUXAMD64, version 1.9.3 (November 30, 2016)
Info) http://www.ks.uiuc.edu/Research/vmd/                         
Info) Email questions and bug reports to vmd@ks.uiuc.edu           
Info) Please include this reference in published work using VMD:   
Info)    Humphrey, W., Dalke, A. and Schulten, K., `VMD - Visual   
Info)    Molecular Dynamics', J. Molec. Graphics 1996, 14.1, 33-38.
Info) -------------------------------------------------------------
Info) Multithreading available, 8 CPUs detected.
Info)   CPU features: SSE2 AVX AVX2 FMA 
Info) Free system memory: 10GB (67%)
Info) Creating CUDA device pool and initializing hardware...
Info) Detected 1 available CUDA accelerator:
Info) [0] GeForce GTX 760     6 SM_3.0 @ 1.07 GHz, 1.9GB RAM, KTO, AE1, ZCP
Info) OpenGL renderer: GeForce GTX 760/PCIe/SSE2
Info)   Features: STENCIL 

Rotate the structure and look for the two (nitrogen) atoms in green. The distance between these will serve as our RC for that base pair. Now quit VMD through the "VMD Main" window ("Quit" under the "File" menu).
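
As a minimal sketch of what this RC is numerically, the Python snippet below computes the distance between two atom positions. The coordinates are made up for illustration and are not taken from `dna-centered.gro`. Periodic boundary conditions are ignored, which is an assumption that only holds when both atoms are in the same periodic image.

```python
import math

def rc_distance(pos_a, pos_b):
    """Euclidean distance (nm) between two atom positions given as (x, y, z)."""
    return math.dist(pos_a, pos_b)

# Hypothetical positions (nm) of the two hydrogen-bonded nitrogen atoms.
nitrogen_a = (3.10, 2.95, 4.02)
nitrogen_b = (3.28, 3.19, 4.02)

print(f"RC = {rc_distance(nitrogen_a, nitrogen_b):.3f} nm")
```

In the actual simulation GROMACS evaluates this distance itself, through the pull coordinate defined in the mdp file.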

## The MD parameter (.mdp) file 
Here we assume that we have already built and equilibrated the system, so we are almost ready to go. We just need to add some extra parameters to the mdp file to use AWH. Find the differences between an mdp file for a vanilla NPT simulation and an mdp file applying AWH:

In [16]:
!diff template-npt/grompp.mdp awh-1d/grompp.mdp

3c3
< nsteps                   = 25000000
---
> nsteps                   = 100000000
35a36,59
> 
> pull                     = yes                 # The reaction coordinate (RC) is defined using pull coordinates.
> pull-ngroups             = 2                   # The number of atom groups needed to define the pull coordinate.
> pull-ncoords             = 1                   # Number of pull coordinates.
> pull-nstxout             = 5000                # Step interval to output the coordinate values to the pullx.xvg file.
> pull-nstfout             = 0                   # Step interval to output the applied force, skip this output here.
> 					       
> pull-group1-name         = base_N1orN3         # Name of pull group 1 corresponding to an entry in an index file.
> pull-group2-name         = partner_N1orN3      # Same, but for group 2.
> 					       
> pull-coord1-groups       = 1 2                 # Which groups define coordinate 1? Here, groups 1 and 2.
> pull-coord1-

Here '<' marks lines from the first argument to `diff` (the NPT mdp file) and '>' marks lines from the second (the AWH mdp file). For example, we increased the number of steps (`nsteps`) for AWH. The more relevant parameters are the ones prefixed with `pull` and `awh`. What do these parameters mean? Click here to Google for ["gromacs documentation"](http://www.google.com/search?q=gromacs+documentation). Hint: we are using GROMACS 2018 and we are interested in "Molecular dynamics parameters (.mdp options)". The comments I put in the mdp file should be enough here, though.
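
If you prefer a parameter-level comparison over a line-level `diff`, the following Python sketch parses two mdp-style texts into dictionaries and reports what changed and what was added. The helper `parse_mdp` is hypothetical and not part of GROMACS. The two inline texts contain only a few of the parameters visible in the diff above. Stripping `#` comments in addition to the standard `;` comments is an assumption based on how the comments appear in that diff.

```python
def parse_mdp(text):
    """Parse mdp-style 'key = value' lines into a dict, dropping comments."""
    params = {}
    for line in text.splitlines():
        # ';' starts a comment in mdp files; '#' is stripped too because the
        # diff output shows comments in that style (an assumption).
        for marker in (";", "#"):
            line = line.split(marker, 1)[0]
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        # GROMACS ignores the difference between '-' and '_' in option names.
        params[key.strip().replace("_", "-")] = value.strip()
    return params

# Small excerpts standing in for template-npt/grompp.mdp and awh-1d/grompp.mdp.
npt_text = "nsteps = 25000000\n"
awh_text = """\
nsteps       = 100000000
pull         = yes   ; the RC is defined using pull coordinates
pull-ngroups = 2
pull-ncoords = 1
"""

npt = parse_mdp(npt_text)
awh = parse_mdp(awh_text)

changed = {k: (npt[k], awh[k]) for k in npt if k in awh and npt[k] != awh[k]}
added = sorted(k for k in awh if k not in npt)

print("changed:", changed)
print("added:  ", added)
```

To compare the real files, read them with `open(path).read()` and pass the contents to `parse_mdp`.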

## The index (.ndx) file
We saw that the .mdp file now depends on some atom group definitions. This is exactly what an index file contains! Here our groups are as simple as they get: each group contains a single nitrogen atom. But we should not be tempted to edit an index file manually, no! The simple, traditional tool to use is `gmx make_ndx`; check out the documentation if you want:

In [23]:
!gmx make_ndx -h -quiet

SYNOPSIS

gmx make_ndx [-f [<.gro/.g96/...>]] [-n [<.ndx> [...]]] [-o [<.ndx>]]
             [-natoms <int>] [-[no]twin]

DESCRIPTION

Index groups are necessary for almost every GROMACS program. All these
programs can generate default index groups. You ONLY have to use gmx make_ndx
when you need SPECIAL index groups. There is a default index group for the
whole system, 9 default index groups for proteins, and a default index group
is generated for every other residue name.

When no index file is supplied, also gmx make_ndx will generate the default
groups. With the index editor you can select on atom, residue and chain names
and numbers. When a run input file is supplied you can also select on atom
type. You can use boolean operations, you can split groups into chains,
residues or atoms. You can delete and rename groups. Type 'h' in the editor
for more details.

The atom numbering in the editor and the index file starts at 1.

The -twin switch duplicates all inde

But here I'll instead show how to use the flashier and more general tool `gmx select`. Learn about it using `gmx select -h`, as for any gmx command. Alternatively, the same information can be found in the online GROMACS docs that you hopefully found above. To make an index file we need to provide either a selection file (flag `-sf`) or a selection string (`-select`):

In [27]:
!gmx select -h -quiet | head -n 30

SYNOPSIS

gmx select [-f [<.xtc/.trr/...>]] [-s [<.tpr/.gro/...>]] [-n [<.ndx>]]
           [-os [<.xvg>]] [-oc [<.xvg>]] [-oi [<.dat>]] [-on [<.ndx>]]
           [-om [<.xvg>]] [-of [<.xvg>]] [-ofpdb [<.pdb>]] [-olt [<.xvg>]]
           [-b <time>] [-e <time>] [-dt <time>] [-tu <enum>]
           [-fgroup <selection>] [-xvg <enum>] [-[no]rmpbc] [-[no]pbc]
           [-sf <file>] [-selrpos <enum>] [-seltype <enum>]
           [-select <selection>] [-[no]norm] [-[no]cfnorm] [-resnr <enum>]
           [-pdbatoms <enum>] [-[no]cumlt]

DESCRIPTION

gmx select writes out basic data about dynamic selections. It can be used for
some simple analyses, or the output can be combined with output from other
programs and/or external analysis programs to calculate more complex things.
For detailed help on the selection syntax, please use gmx help selections.

Any combination of the output options is possible, but note that -om only
operates on the first selection. Also note that if
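
As a hedged sketch of how a selection string could be put together, the Python snippet below builds a `gmx select` command line using only flags that appear in the synopsis above (`-s`, `-on`, `-select`). The group names are the ones the mdp file expects. The residue numbers, atom names and the file names `conf.gro` and `index.ndx` are hypothetical placeholders and need to be adapted to the actual system. The snippet only prints the command and does not run GROMACS.

```python
import shlex

def build_select_command(structure, index_out, groups):
    """Return a `gmx select` argument list that writes one index group per entry.

    `groups` maps a group name to a selection expression.
    """
    # A quoted string in front of a selection names it; ';' separates selections.
    selection = "; ".join(f'"{name}" {expr}' for name, expr in groups.items())
    return ["gmx", "select", "-s", structure, "-on", index_out, "-select", selection]

# Hypothetical residue numbers and atom names for the target base pair.
groups = {
    "base_N1orN3": "resnr 5 and name N1",
    "partner_N1orN3": "resnr 20 and name N3",
}

cmd = build_select_command("conf.gro", "index.ndx", groups)
print(shlex.join(cmd))
```

Where GROMACS is installed, the list can be passed to `subprocess.run(cmd, check=True)`. The resulting index file should then contain one group per named selection, which is worth checking by opening the file.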

## Extra-curricular 1: effect of changing the force constant

## Extra-curricular 2: effect of changing the diffusion parameter