# Metadynamics Exercise


In [None]:
import os
import numpy as np
import matplotlib.pyplot as plt
import glob
import subprocess

In [None]:
%%bash
module list

**This notebook has been tested on PSC OnDemand.**

**You should use the default `ipykernel` for this run, not the `icomse_cpu` one.**

**To run this on other resources, you need to change the `plumed_bin` variable below.**

**Furthermore, you might need to change the `GMX_BIN` and `GMX_MDRUN_BIN` variables in all the places where GROMACS is run (you also need to comment out the `module load` commands).**

## Define the PLUMED binary used

In the `plumed_bin` variable below, you should define the name of the PLUMED binary. Normally this is just `plumed`, but on PSC, we need to employ a version from the singularity container for this workshop.

In [None]:
# define the PLUMED binary

# This is the PLUMED binary in most cases
#plumed_bin="plumed"

# PLUMED 2.8 on PSC via the icomse container
plumed_bin="singularity exec /ocean/projects/see220002p/shared/icomse_cpu.sif plumed"

In [None]:
# Define a function that helps us run PLUMED commands
def run_plumed_cmd(cmd,verbose=False):
    cmd_str="{} ".format(plumed_bin)+cmd
    print("PLUMED command: {}".format(cmd_str))
    if verbose:
        subprocess.run(cmd_str.split())
    else:
        subprocess.run(cmd_str.split(),stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)    

In [None]:
# Test that the PLUMED commands work 
run_plumed_cmd("sum_hills -h",verbose=True)

## Define the GROMACS Binaries 

In the `gmx_bin` and `gmx_mdrun_bin` variables below, you should define the names of the `gmx` and `mdrun` binaries of the GROMACS installation that you are using. You should also include any `mpirun` command and its options in the `gmx_mdrun_bin` variable.

In [None]:
# Default value on PSC
gmx_bin="gmx_mpi"
gmx_mdrun_bin="mpirun -np 1 gmx_mpi mdrun -ntomp 1"

In [None]:
%%bash -s "$gmx_bin" "$gmx_mdrun_bin"

if [[ `hostname` =~ "bridges2" ]]; then
  source /ocean/projects/see220002p/shared/gromacs+plumed/gromacs-2022.6_plumed-2.9.0/load-gromacs-plumed.sh
fi

GMX_BIN=${1}
GMX_MDRUN_BIN=${2}

echo "## Trying ${GMX_BIN}"
${GMX_BIN} -h 
echo "#################"
echo ""

echo "## Trying ${GMX_MDRUN_BIN}"
${GMX_MDRUN_BIN} -h 
echo "#################"


In [None]:
# for larger figures
plt.rcParams['figure.dpi'] = 500

In [None]:
home_dir=os.getcwd()
print(home_dir)

We back up any previous exercise files by moving them to a separate folder.

In [None]:
%%bash
# back up any previous exercise files into a timestamped folder
numdir=`ls | grep ^Exercise- | wc -l`

if [ ${numdir} -gt 0 ]; then

  timestamp=`date "+%Y-%m-%d-%H%M"`
  backup_folder="Exercises-Backup-${timestamp}"

  mkdir ${backup_folder}

  mv Exercise-1 ${backup_folder}
  mv Exercise-2 ${backup_folder}
  mv Exercise-3 ${backup_folder}
  mv Exercise-4 ${backup_folder}
fi

## System

In this tutorial, we will consider the association/dissociation of NaCl in aqueous solution. The system consists of 1 Na atom, 1 Cl atom, and 107 water molecules for a total of 323 atoms. In an effort to speed up the simulations, we employ a rather small water box, and thus need to employ smaller cutoffs than usually used. Therefore, this simulation setup should not be used in production runs. 

We will run the simulations at 300 K in the NPT ensemble. 

In [None]:
# kB*T at 300 K in kJ/mol
kBT=2.494353
beta=1/kBT
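As a quick check of the value above, kBT can be recomputed from the molar gas constant. This is only an illustrative sketch; the names `R_KJ_PER_MOL_K`, `temperature`, `kbt_check`, and `beta_check` are not part of the original notebook, and the result should agree with the hard-coded value to within about 1e-4 kJ/mol.

```python
# Molar gas constant R = N_A * k_B in kJ/(mol K) (CODATA 2018 value)
R_KJ_PER_MOL_K = 8.314462618e-3
temperature = 300.0  # K

# Thermal energy in kJ/mol and the corresponding inverse temperature
kbt_check = R_KJ_PER_MOL_K * temperature
beta_check = 1.0 / kbt_check

print("kBT  = {:.6f} kJ/mol".format(kbt_check))
print("beta = {:.6f} mol/kJ".format(beta_check))
```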

We will consider the following exercises
- Exercise 1 - Unbiased Simulation
- Exercise 2 - Biasing with distance CV 
- Exercise 2B - Reweighting 
- Exercise 3 - Biasing with distance and solvation CVs
- Exercise 3B - Restarting metadynamics runs
- Exercise 4 - Varying Metadynamics Parameters (Left as an optional exercise)


# Exercise 1 - Unbiased Simulation

In this exercise we will perform an unbiased simulation of the system and monitor the relevant CVs.

We start by copying the files from `Template-Exercise-X` to a new folder `Exercise-1`, where we will perform the simulation.

In [None]:
%cp -r Template-Exercise-X Exercise-1 

In [None]:
%cd Exercise-1/

In [None]:
%ls

In this folder we have several files needed to perform GROMACS simulations, including the topology file `NaCl.top` and the initial structures `NaCl_StartingStructure-1.gro` and `NaCl_StartingStructure-2.gro`. The other files that are needed will be generated by the Jupyter cells below.

Let's look at these files.

In [None]:
%cat NaCl.top

In [None]:
%cat NaCl_StartingStructure-1.gro

In particular, note the structure file. In this file, the index of each atom is given in the 3rd column.

We can see that the Na atom is number 322 and the Cl atom is number 323.

We can also see that the oxygen atoms of the water molecules are every 3rd atom starting from 1, that is 1,4,7,10,...,319.

We will use this information to define the CVs for this system.

### Collective Variables (CVs)

#### Na-Cl Distance

The most relevant CV for this system is the distance between the Na and Cl atoms. As we saw above, the Na atom is number 322 while the Cl atom is number 323. Thus, we can define the distance CV in PLUMED as

`dist: DISTANCE ATOMS=322,323`

#### Na Solvation 

Furthermore, the NaCl association/dissociation is coupled to the collective motion of the solvent. To measure that, we will use a CV that measures the solvation of the Na atom. For this, we employ the coordination number of the Na atom with respect to the oxygens of the water molecules that we define in PLUMED as

<code>
coord: COORDINATION ...
   GROUPA=322   
   GROUPB=1-321:3 
   SWITCH={RATIONAL R_0=0.315 D_MAX=0.5 NN=12 MM=24}  
   NLIST 
   NL_CUTOFF=0.55 
   NL_STRIDE=10    
...
</code>

Here we are using a rational switching function that measures the coordination number in a smooth way. The atom defined in `GROUPA=322` is the Na atom, while `GROUPB=1-321:3` is the syntax for selecting every 3rd atom from atom 1 up to 321, that is the oxygen atoms 1,4,7,10,...,319, as we want. See the manual for __[COORDINATION](https://www.plumed.org/doc-v2.8/user-doc/html/_c_o_o_r_d_i_n_a_t_i_o_n.html)__ for further information.

Before we can start an unbiased simulation, we need to set the MD parameters for GROMACS, which are given in the `MD-NPT.mdp` file. 
We will generate this file below by using `cat` with a here-document (the `<<EOF ... EOF` construct). 

We are using a MD timestep of 0.002 ps.

We will run a simulation for 1000 ps, or (1000 ps)/(0.002 ps) = 500000 MD steps. This is set by the `nsteps` parameter.
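The same arithmetic is used for every run in this tutorial, so it can be convenient to wrap it in a small helper. The function `md_steps` below is a hypothetical convenience, not part of the original notebook.

```python
def md_steps(sim_time_ps, dt_ps=0.002):
    """Number of MD steps needed to cover sim_time_ps with time step dt_ps."""
    # round() guards against floating-point errors such as 499999.99999
    return int(round(sim_time_ps / dt_ps))

for sim_time in (1000, 5000, 10000):
    print("{:6d} ps -> nsteps = {}".format(sim_time, md_steps(sim_time)))
```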

We are running NPT at 300 K and 1.01325 bar.

Note that due to the small system size, the cutoff parameters (rlist, rcoulomb, rvdw) are too low for production MD runs. 

For further information see the GROMACS manual [here](https://manual.gromacs.org/documentation/current/user-guide/mdp-options.html)

In [None]:
%%bash
cat <<EOF > MD-NPT.mdp
integrator = md
dt = 0.002
nsteps = 500000
cutoff-scheme = Verlet
coulombtype = PME
rlist = 0.6
rcoulomb = 0.6
rvdw = 0.6
constraints = h-bonds
tcoupl =  V-rescale
ref_t = 300
tau-t = 1.0
tc-grps = System
gen-vel = no
gen-temp = 300
gen-seed = -1
DispCorr = AllEnerPres
pcoupl = Parrinello-Rahman
pcoupltype = isotropic
ref-p = 1.01325
compressibility = 4.5e-5
nstxout-compressed = 250
nstxout = 0
nstvout = 0
EOF

In [None]:
# to check that the MD-NPT.mdp file was successfully generated
%ls

We then need to generate a GROMACS tpr file that we then run using `mdrun`. For this we need the mdp parameter file `MD-NPT.mdp`, a topology file `NaCl.top`, and an initial geometry `NaCl_StartingStructure-1.gro`. 

The following command will generate the TPR file.

In [None]:
%%bash -s "$gmx_bin" "$gmx_mdrun_bin"

if [[ `hostname` =~ "bridges2" ]]; then
  source /ocean/projects/see220002p/shared/gromacs+plumed/gromacs-2022.6_plumed-2.9.0/load-gromacs-plumed.sh
fi

GMX_BIN=${1}
GMX_MDRUN_BIN=${2}

StartingGeometry=NaCl_StartingStructure-1.gro
RunFilename=NaCl_NPT-300K
${GMX_BIN}  grompp -f MD-NPT.mdp -c ${StartingGeometry} -p NaCl.top -o ${RunFilename}.tpr  -maxwarn 1

In [None]:
# To check that the tpr file was successfully generated 
%ls

We then need to generate the PLUMED input file. 

In this file, we define the CVs discussed above and print them to a file called `colvar.data` by using the `PRINT` action. The `STRIDE=250` keyword specifies that we print the CV values every 250 MD steps, that is every 0.5 ps. 

We also employ a harmonic wall at a distance of 0.6 nm to reduce the fluctuations in the dissociated state. This is set using the `UPPER_WALLS` action. 
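To see how strongly the wall acts, the sketch below evaluates the wall energy for a few distances. It assumes the default `UPPER_WALLS` settings (`EXP=2`, `EPS=1`, `OFFSET=0`), for which the bias is $\kappa\,(s-a)^2$ when $s>a$ and zero otherwise. The function name `upper_wall_energy` is a hypothetical helper.

```python
import numpy as np

def upper_wall_energy(s, at=0.6, kappa=4000.0):
    """Upper-wall bias in kJ/mol for CV value s (nm), assuming EXP=2, EPS=1, OFFSET=0."""
    # The wall only acts when the CV exceeds the wall position `at`
    excess = np.maximum(np.asarray(s, dtype=float) - at, 0.0)
    return kappa * excess**2

for d in (0.50, 0.60, 0.62, 0.65):
    print("dist = {:.2f} nm -> wall energy = {:6.2f} kJ/mol".format(d, float(upper_wall_energy(d))))
```

With these parameters, exceeding the wall by 0.05 nm already costs about 10 kJ/mol, roughly 4 kBT at 300 K, so the ions rarely move far beyond 0.6 nm.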

In [None]:
%%bash
cat <<EOF > plumed.dat
# vim:ft=plumed

# Distance between Na and Cl atoms
dist: DISTANCE ATOMS=322,323

# Solvation of Na atom
COORDINATION ...
  LABEL=coord
  GROUPA=322
  GROUPB=1-321:3
  SWITCH={RATIONAL R_0=0.315 D_MAX=0.5 NN=12 MM=24}
  NLIST
  NL_CUTOFF=0.55
  NL_STRIDE=10
... COORDINATION

uwall: UPPER_WALLS ...
   ARG=dist 
   AT=0.6
   KAPPA=4000.0 
...

PRINT ARG=dist,coord,uwall.* FILE=colvar.data STRIDE=250
EOF

In [None]:
%ls

We now have all the files needed to run the simulation. 

We will run using the following commands. 

This run should take a few minutes.

**Note: On PSC Bridges-2, there is no output to the cell below until the job has finished, but you can log on via ssh and look at the files during the run.**

In [None]:
%%bash -s "$gmx_bin" "$gmx_mdrun_bin"

if [[ `hostname` =~ "bridges2" ]]; then
  source /ocean/projects/see220002p/shared/gromacs+plumed/gromacs-2022.6_plumed-2.9.0/load-gromacs-plumed.sh
fi

GMX_BIN=${1}
GMX_MDRUN_BIN=${2}

RunFilename=NaCl_NPT-300K
${GMX_MDRUN_BIN}  -deffnm ${RunFilename}   -plumed plumed.dat 
echo "0" | ${GMX_BIN} trjconv -f ${RunFilename}.xtc -s ${RunFilename}.tpr -pbc whole -o ${RunFilename}.pbc-whole.xtc

As we can see, this simulation has generated various files

In [None]:
%ls

The main file is the colvar file `colvar.data`. Let's look at this file. We can see that the first line is a header line that shows what variables are in which column.

In [None]:
%%bash
head colvar.data

The colvar file has the time series of the CVs that we monitored. 

The easiest way to look at them is to read them using `numpy.loadtxt`. We can then visualize the time series using matplotlib, calculate averages, and so on.

**Note that python indexing starts from 0 so the first column is 0 in python, the second column is 1, etc.**

In [None]:
colvar_tmp = np.loadtxt("./colvar.data")

# time is column 1
time = colvar_tmp[:,0]
# distance is column 2
dist = colvar_tmp[:,1]

plt.plot(time,dist,".")

In [None]:
print(np.average(dist))

We define two Python functions, one to read the data from the colvar file and one to plot it. We will use them in the following, as we need to read from colvar files and plot CVs repeatedly throughout the tutorial. 

In [None]:
def get_colvardata(filename_colvar):
    with open(filename_colvar,'r') as colvar_file:
        colvar_labels = colvar_file.readline().split()[2:]
    colvar_data=np.loadtxt(filename_colvar)
    return (colvar_data,colvar_labels)

In [None]:
def plot_colvardata(filename_colvar,column):
    (colvar_data,colvar_labels)=get_colvardata(filename_colvar)
    plt.plot(colvar_data[:,0],colvar_data[:,column-1],".")
    plt.title(colvar_labels[column-1])
    plt.xlabel("Time [ps]")
    plt.ylabel(colvar_labels[column-1])
    plt.show()
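PLUMED colvar files start with a header line of the form `#! FIELDS time dist coord ...`, which is why the labels are taken from the third token onwards. As a possible extension, one can map label names to column indices so that columns are selected by name rather than by number. The sketch below uses a small made-up colvar excerpt, and `colvar_columns` is a hypothetical helper.

```python
import io
import numpy as np

def colvar_columns(header_line):
    """Map field names from a '#! FIELDS ...' header to 0-based column indices."""
    tokens = header_line.split()
    if tokens[:2] != ["#!", "FIELDS"]:
        raise ValueError("Not a PLUMED FIELDS header: {!r}".format(header_line))
    return {name: index for index, name in enumerate(tokens[2:])}

# Made-up excerpt of a colvar file, for illustration only
sample_text = "#! FIELDS time dist coord\n0.0 0.30 5.1\n0.5 0.31 5.0\n"

columns = colvar_columns(sample_text.splitlines()[0])
# np.loadtxt skips lines starting with '#', so the header is ignored here
sample_data = np.loadtxt(io.StringIO(sample_text))

print(columns)
print(sample_data[:, columns["dist"]])
```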

Let us plot the time series of the distance, which is column 2 in the colvar file, using this function. 

In [None]:
plot_colvardata("colvar.data",2)

We can see that there are some transitions between the associated and dissociated state.  

We want to estimate the fluctuations (i.e., standard deviations) of the distance CV to select the sigma value for the metadynamics simulations. However, we need to do that separately for the associated and dissociated states. We use a distance of 0.36 nm to separate the two states. We can do that with the following Python code. 

In [None]:
(colvar_data,labels)=get_colvardata("colvar.data")
# distance is 2nd column, 1 in python indexing 
time = colvar_data[:,0]
dist = colvar_data[:,1]

dist_separate=0.36
dist_assoc=   dist[dist<dist_separate]
dist_dissoc=  dist[dist>dist_separate]
time_assoc=   time[dist<dist_separate]
time_dissoc=  time[dist>dist_separate]

plt.plot(time_assoc,dist_assoc,'.',label="Associated")
plt.plot(time_dissoc,dist_dissoc,'.',label="Dissociated")
plt.xlabel("Time [ps]")
plt.ylabel("Distance [nm]")
plt.legend()
plt.show()

print("Distance CV")
print(" Associated State")
print("   Average: {:.3f}".format(np.average(dist_assoc)))
print("   Standard Deviation: {:.3f}".format(np.std(dist_assoc)))
print(" Dissociated State")
print("   Average: {:.3f}".format(np.average(dist_dissoc)))
print("   Standard Deviation: {:.3f}".format(np.std(dist_dissoc)))


When determining the appropriate width of the Gaussian (i.e., the SIGMA parameter) for a CV, we need to consider the smallest fluctuations, which in this case are found in the more ordered associated state. 

Thus, from these results, an appropriate value for the width of the Gaussian (i.e., the SIGMA parameter) for the distance CV is around 0.01 nm. If we instead used the value for the dissociated state of around 0.2 nm, we would likely smear out the free energy surface. 

**Thus, an appropriate value is around 0.01 nm.**

Before continuing, we should look at the behavior of the solvation CV (the 3rd column)

In [None]:
plot_colvardata("colvar.data",3)

We can also look at the correlation of the distance and solvation CVs

In [None]:
(colvar_data,labels)=get_colvardata("colvar.data")
plt.plot(colvar_data[:,1],colvar_data[:,2],'.')
plt.xlabel(labels[1])
plt.ylabel(labels[2])
plt.show()

The two CVs are tightly correlated, as expected.
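The visual impression from the scatter plot can be quantified with the Pearson correlation coefficient from `np.corrcoef`. The sketch below uses synthetic, made-up data in place of the real colvar columns (the linear relation and noise level are arbitrary assumptions); with the actual data one would pass `colvar_data[:,1]` and `colvar_data[:,2]` instead.

```python
import numpy as np

# Synthetic stand-ins for the distance and solvation CVs (illustration only)
rng = np.random.default_rng(0)
dist_demo = rng.uniform(0.25, 0.60, size=500)
coord_demo = 4.5 + 4.0 * dist_demo + rng.normal(0.0, 0.05, size=500)

# Off-diagonal element of the 2x2 correlation matrix
pearson_r = np.corrcoef(dist_demo, coord_demo)[0, 1]
print("Pearson correlation coefficient: {:.3f}".format(pearson_r))
```

A value close to +1 or -1 indicates a nearly linear relation, while a value near 0 only rules out linear correlation, not a nonlinear dependence.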

In [None]:
%cd ..

# Exercise 2 - Biasing Distance CV

We now turn to Exercise 2, where we bias the distance CV. 

We start by copying the template files to a new folder for the runs and generating all the GROMACS files. 

This time we run the simulation for 10000 ps (5000000 MD steps). 

In [None]:
os.chdir(home_dir)

In [None]:
%%bash
cp -r Template-Exercise-X Exercise-2

In [None]:
%cd Exercise-2/

In [None]:
%%bash
cat <<EOF > MD-NPT.mdp
integrator = md
dt = 0.002
nsteps = 5000000
cutoff-scheme = Verlet
coulombtype = PME
rlist = 0.6
rcoulomb = 0.6
rvdw = 0.6
constraints = h-bonds
tcoupl =  V-rescale
ref_t = 300
tau-t = 1.0
tc-grps = System
gen-vel = no
gen-temp = 300
gen-seed = -1
DispCorr = AllEnerPres
pcoupl = Parrinello-Rahman
pcoupltype = isotropic
ref-p = 1.01325
compressibility = 4.5e-5
nstxout-compressed = 250
nstxout = 0
nstvout = 0
EOF

In [None]:
%%bash -s "$gmx_bin" "$gmx_mdrun_bin"

if [[ `hostname` =~ "bridges2" ]]; then
  source /ocean/projects/see220002p/shared/gromacs+plumed/gromacs-2022.6_plumed-2.9.0/load-gromacs-plumed.sh
fi

GMX_BIN=${1}
GMX_MDRUN_BIN=${2}

StartingGeometry=NaCl_StartingStructure-1.gro
RunFilename=NaCl_NPT-300K
${GMX_BIN}  grompp -f MD-NPT.mdp -c ${StartingGeometry} -p NaCl.top -o ${RunFilename}.tpr  -maxwarn 1


Now we need to set up the PLUMED input file for the metadynamics runs. For that we add a `METAD` action to the input file with the following keywords

__Note that all units in PLUMED are given in nm, kJ/mol, and ps__

- `ARG=dist`: this defines the CV to be biased, the distance in this case
- `PACE=500`: we add Gaussians every 500 MD steps. This means every 1 ps, which is a typical value
- `SIGMA=0.01`: we employ a width of 0.01 nm for the distance CV as determined above
- `HEIGHT=1.25`: the initial height of the Gaussians is set to 1.25 kJ/mol, which is around 0.5 kBT at 300 K
- `BIASFACTOR=5`: we employ a bias factor of 5
- `GRID_MIN=0.0` and `GRID_MAX=1.0`: we employ the grid to represent the bias potential as this is better for performance. These keywords define the range of this grid. The other parameters are set automatically by PLUMED based on the sigma value. 
- `CALC_RCT`: this enables the calculation of the c(t) factor and rbias=bias-rct that we use for reweighting
- `FILE=hills.data`: the Gaussian hills added will be written to the file `hills.data`

See the PLUMED manual entry for [METAD](https://www.plumed.org/doc-v2.8/user-doc/html/_m_e_t_a_d.html) for further information about the keywords. 
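In well-tempered metadynamics, the bias factor $\gamma$ controls how quickly the Gaussian heights decay: a hill deposited where the bias is already $V$ gets the height $h_0 \exp[-V/((\gamma-1) k_\mathrm{B}T)]$. The sketch below evaluates this relation for the parameters listed above. The function name `well_tempered_height` is a hypothetical helper, and the heights written to `hills.data` by PLUMED may be stored with a different normalization.

```python
import numpy as np

def well_tempered_height(bias_at_center, initial_height=1.25, biasfactor=5.0, kbt=2.494353):
    """Height of a new Gaussian when the current bias at its center is bias_at_center (kJ/mol)."""
    # (biasfactor - 1) * kBT plays the role of kB * DeltaT in the well-tempered scheme
    delta_kbt = (biasfactor - 1.0) * kbt
    return initial_height * np.exp(-np.asarray(bias_at_center, dtype=float) / delta_kbt)

for bias in (0.0, 5.0, 10.0, 20.0):
    print("bias = {:5.1f} kJ/mol -> height = {:.3f} kJ/mol".format(bias, float(well_tempered_height(bias))))
```

A larger bias factor makes the heights decay more slowly, so higher barriers can be filled, at the cost of slower convergence of the bias.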

We also add the variables related to the metadynamics (`mtd.*`) to the colvar file by adding it to the `PRINT` action.

In [None]:
%%bash
cat <<EOF > plumed.dat
# vim:ft=plumed

# Distance between Na and Cl atoms
dist: DISTANCE ATOMS=322,323

# Solvation of Na atom
COORDINATION ...
  LABEL=coord
  GROUPA=322
  GROUPB=1-321:3
  SWITCH={RATIONAL R_0=0.315 D_MAX=0.5 NN=12 MM=24}
  NLIST
  NL_CUTOFF=0.55
  NL_STRIDE=10
... COORDINATION

uwall: UPPER_WALLS ...
   ARG=dist 
   AT=0.6
   KAPPA=4000.0 
...

METAD ...
  LABEL=mtd
  ARG=dist
  PACE=500
  SIGMA=0.01
  HEIGHT=1.25
  BIASFACTOR=5
  GRID_MIN=0.0
  GRID_MAX=1.0
  CALC_RCT
  FILE=hills.data
... METAD

PRINT ARG=dist,coord,mtd.*,uwall.* FILE=colvar.data STRIDE=250
EOF

In [None]:
%ls

In [None]:
%%bash -s "$gmx_bin" "$gmx_mdrun_bin"

if [[ `hostname` =~ "bridges2" ]]; then
  source /ocean/projects/see220002p/shared/gromacs+plumed/gromacs-2022.6_plumed-2.9.0/load-gromacs-plumed.sh
fi

GMX_BIN=${1}
GMX_MDRUN_BIN=${2}

RunFilename=NaCl_NPT-300K
${GMX_MDRUN_BIN}  -deffnm ${RunFilename}   -plumed plumed.dat 
echo "0" | ${GMX_BIN} trjconv -f ${RunFilename}.xtc -s ${RunFilename}.tpr -pbc whole -o ${RunFilename}.pbc-whole.xtc

This run should take around 10-15 minutes. 

__Note: On PSC Bridges-2, there is no output to the cell above until the job has finished, but you can log on via ssh and look at the files during the run.__

In [None]:
%ls

As we can see, there are now a number of files. Let's look at the variables written out in the `colvar.data` and `hills.data` files

In [None]:
%%bash
head colvar.data

In [None]:
%%bash
head hills.data

Let's start by looking at the time series of the distance CV that we are biasing.

In [None]:
plot_colvardata("./colvar.data",2)

We can see that there are now much more frequent transitions between the two states. 

We now also have the `hills.data` file that includes the added Gaussians. 

In [None]:
%%bash
head hills.data

We should also look at the height of the added Gaussians, which should decrease as the simulation progresses and go to zero in the long time limit. This is in the 4th column in the `hills.data` file when we have 1 CV. 

In [None]:
plot_colvardata("./hills.data",4)

We can obtain an estimate of the FES from the added Gaussians by using the `sum_hills` command line tool of PLUMED, which sums up the Gaussians. This is done with the command below. 

We also get another FES where the minimum has been set to zero by using the `--mintozero` flag, which is convenient when comparing results (by default this is not done). 
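Conceptually, this amounts to evaluating the sum of all deposited Gaussians on a grid and taking the negative. The sketch below shows this for one CV with a few made-up hills. It is not the PLUMED implementation: `fes_from_hills_1d` is a hypothetical helper, and it assumes that the heights passed in already include the well-tempered factor $\gamma/(\gamma-1)$ that relates the bias to the free energy, $F(s) = -\frac{\gamma}{\gamma-1}V(s)$ up to a constant.

```python
import numpy as np

def fes_from_hills_1d(grid, centers, sigmas, heights, mintozero=True):
    """Negative sum of 1D Gaussians on a grid, optionally shifted so the minimum is zero."""
    grid = np.asarray(grid, dtype=float)
    centers = np.asarray(centers, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    heights = np.asarray(heights, dtype=float)
    # Rows: grid points, columns: hills
    z = (grid[:, None] - centers[None, :]) / sigmas[None, :]
    bias = np.sum(heights[None, :] * np.exp(-0.5 * z**2), axis=1)
    fes = -bias
    if mintozero:
        fes = fes - np.min(fes)
    return fes

# Made-up hills: two at 0.30 nm and one at 0.50 nm
demo_grid = np.linspace(0.0, 1.0, 101)
demo_fes = fes_from_hills_1d(demo_grid,
                             centers=[0.30, 0.30, 0.50],
                             sigmas=[0.01, 0.01, 0.01],
                             heights=[1.25, 1.00, 0.80])
print("FES minimum at {:.2f} nm".format(demo_grid[np.argmin(demo_fes)]))
```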

In [None]:
%%bash
mkdir fes

In [None]:
run_plumed_cmd("sum_hills --hills hills.data --outfile fes/fes.data",verbose=True)
run_plumed_cmd("sum_hills --hills hills.data  --mintozero --outfile fes/fes.mintozero.data ",verbose=False)

The `sum_hills` command line tool has various other options that are sometimes needed, as we can see by running `plumed sum_hills -h`.

In [None]:
run_plumed_cmd("sum_hills -h",verbose=True)

Let's plot the FES

In [None]:
# Plot FES 

fes=np.loadtxt("fes/fes.data")
plt.plot(fes[:,0],fes[:,1]/kBT,'-')
plt.xlabel("Na-Cl Distance [nm]")
plt.ylabel("Free Energy [kBT]")

To understand the convergence of the FES, we should look at how the FES behaves over time. For this we can use the `--stride 1000` flag, which tells the `sum_hills` tool to output the FES every 1000 added Gaussians, that is, every 1000 ps. Here we need to use the `--mintozero` flag, as otherwise it is not possible to compare the FESs. 

In [None]:
%%bash
rm -rf fes-stride-1000
mkdir fes-stride-1000

In [None]:
run_plumed_cmd("sum_hills --hills hills.data --mintozero --stride 1000 --outfile fes-stride-1000/fes.stride-")

In [None]:
fes_stride=1000
num_fesfile=len(glob.glob("fes-stride-1000/fes.stride-*"))
for i in range(num_fesfile):  
    if i == num_fesfile-1: continue
    time=(i+1)*fes_stride
    fes=np.loadtxt("fes-stride-1000/fes.stride-{}.dat".format(i))
    plt.plot(fes[:,0],fes[:,1]/kBT,'-',label="{:4d} ps".format(time))    
plt.xlabel("Na-Cl Distance [nm]")
plt.ylabel("Free Energy [kBT]")
plt.legend()
plt.ylim([0,20])
plt.show()

As we can see, after some time the FESs no longer change much and converge. 

## Free Energy Difference 

One way to gauge the convergence is to look at the free energy difference between the two states, the associated state and the dissociated state. To define the regions for the two states, we need to select a distance that separates them. The best way to do that is to look at the probability distribution, $P(\mathbf{s}) \propto \exp[-\beta F(\mathbf{s})]$.

In [None]:
# Plot PDF

fes=np.loadtxt("fes/fes.data")
pdf=np.exp(-beta*(fes[:,1]-np.min(fes[:,1])))
plt.plot(fes[:,0],pdf,'-')
plt.xlabel("Na-Cl Distance [nm]")
plt.ylabel("Probability")

plt.ylim([0,1.1])

barrier_location=0.36
plt.axvspan(0.2,barrier_location,alpha=0.2,color='orange')
plt.axvspan(barrier_location,np.max(fes[:,0]),alpha=0.2,color='green')

As we can see, a value of 0.36 nm seems to be a good choice for separating the two states. We will then use the following function, which performs a simple free energy difference calculation for the 1D case. 

**Note that this function will give the results in units of kBT**
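For reference, the quantity being computed can be written as follows, where $A$ denotes the associated state ($s < s^*$), $B$ the dissociated state ($s > s^*$), and $s^* = 0.36$ nm. The notation $s^*$, $A$, and $B$ is introduced here for illustration only.

$$
\Delta F = F_A - F_B = -k_\mathrm{B}T \ln \frac{\int_{s<s^*} \mathrm{d}s \, e^{-\beta F(s)}}{\int_{s>s^*} \mathrm{d}s \, e^{-\beta F(s)}}
$$

On a uniform grid the integrals reduce to sums over grid points, and expressing the result in units of kBT simply drops the prefactor $k_\mathrm{B}T$. A negative value means that the associated state is the more stable one.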

In [None]:
def calc_free_energy_difference(fes_file):
    barrier_location=0.36
    fes_data = np.loadtxt(fes_file)
    distance = fes_data[:,0]
    fes = fes_data[:,1]-np.min(fes_data[:,1])
    prob = np.exp(-beta*fes)
    prob = prob/np.sum(prob)        
    prob_A = 0.0
    prob_B = 0.0
    for i in range(prob.size):
        if(distance[i]<barrier_location): 
            prob_A += prob[i]
        if(distance[i]>barrier_location): 
            prob_B += prob[i]
    free_energy_difference = -np.log(prob_A/prob_B)
    return free_energy_difference

To get a more frequent time series of the free energy difference, we will get the FES every 10 added Gaussians by using the `--stride 10` flag to `sum_hills`

In [None]:
%%bash
rm -rf fes-stride-10
mkdir fes-stride-10

In [None]:
run_plumed_cmd("sum_hills --hills hills.data --mintozero --stride 10 --outfile fes-stride-10/fes.stride-",verbose=False)

Now we loop over the different FES files and plot the results, by using the following code. 

In [None]:
fe_time=[]
fe_diff=[]
fes_stride=10
num_fesfile=len(glob.glob("fes-stride-10/fes.stride-*"))
for i in range(num_fesfile):  
    if i == num_fesfile-1: continue
    time=(i+1)*fes_stride
    fes_file="fes-stride-10/fes.stride-{}.dat".format(i)
    fe_time.append(time)
    fe_diff.append(calc_free_energy_difference(fes_file))
plt.plot(fe_time,fe_diff)
plt.xlabel("Time [ps]")
plt.ylabel("Free Energy Difference [kBT]")
plt.title("Free Energy Difference")
#plt.ylim([-1.0,1.0])



You should hopefully see that the free energy difference starts to converge. 

# Exercise 2B - Reweighting

We can also obtain the FES through reweighting by using the $c(t)$ reweighting procedure. Here each frame gets a weight proportional to $\exp(\beta\tilde{V})$, calculated from the re-normalized bias $\tilde{V}=V-c(t)$. As we used the `CALC_RCT` keyword during the simulation, the $c(t)$ factor is calculated as the variable `mtd.rct`, along with the re-normalized bias $\tilde{V}$ as the variable `mtd.rbias`. The direct bias is the variable `mtd.bias`. We can use these weights to obtain the FES for both the biased CV and any other CV. 

## Reweighting on distance CV

Let us start by reweighting on the biased distance CV.

In [None]:
%%bash
head colvar.data

We can see that the relevant variables are in columns 4 to 6. Let's plot them.

In [None]:
plot_colvardata("colvar.data",4)
plot_colvardata("colvar.data",5)
plot_colvardata("colvar.data",6)

We can see that in the initial phase of the simulation, the re-normalized bias (`mtd.rbias`) fluctuates. Therefore, we normally ignore the initial part of the simulation, where the weights might be unreliable. 

In [None]:
%%bash
mkdir fes-reweight


Here we define the PLUMED input file that we use to do the reweighting. In this input, we read in the colvar file using `READ` actions, and then calculate a weighted histogram by using the weights calculated from the `mtd.rbias` values. 

This PLUMED file can then be used with the `plumed driver` tool, see below. 

Here we are using a discrete histogram. In PLUMED it is also possible to employ kernel density estimation to get a smoother profile, but one needs to be careful not to oversmooth the surface. So, it is normally best to compare with a discrete histogram when looking for an optimal bandwidth for the kernels. 

In principle, it is also possible to use the data from the colvar file with any Python tool that can calculate weighted histograms. 
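As an illustration of the Python route, the following sketch builds a weighted histogram with `np.histogram`, using weights $\exp(\beta\tilde{V})$, and converts it to a free energy. The function `reweighted_fes` is a hypothetical helper and the input values are made up; with real data one would pass the `dist` and `mtd.rbias` columns of the colvar file.

```python
import numpy as np

def reweighted_fes(cv, rbias, kbt=2.494353, bins=100, cv_range=(0.2, 0.9)):
    """Free energy (kJ/mol) from a histogram of cv weighted by exp(rbias/kBT)."""
    cv = np.asarray(cv, dtype=float)
    rbias = np.asarray(rbias, dtype=float)
    # Subtracting the maximum avoids overflow; it only shifts the FES by a constant
    weights = np.exp((rbias - np.max(rbias)) / kbt)
    hist, edges = np.histogram(cv, bins=bins, range=cv_range, weights=weights)
    centers = 0.5 * (edges[:-1] + edges[1:])
    with np.errstate(divide="ignore"):
        fes = -kbt * np.log(hist)  # empty bins become +inf
    fes = fes - np.min(fes[np.isfinite(fes)])
    return centers, fes

# Made-up samples: two frames near 0.25 nm and one near 0.35 nm, all with equal bias
demo_centers, demo_fes_rw = reweighted_fes([0.25, 0.25, 0.35], [0.0, 0.0, 0.0],
                                           bins=2, cv_range=(0.2, 0.4))
print(demo_centers, demo_fes_rw)
```

Bins that were never visited get an infinite free energy, so they should be masked before plotting.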

In [None]:
%%bash
cat <<EOF > fes-reweight/plumed_reweight.dat
# vim:ft=plumed

dist:   READ FILE=fes-reweight/colvar_reweight.data IGNORE_TIME VALUES=dist
coord:  READ FILE=fes-reweight/colvar_reweight.data IGNORE_TIME VALUES=coord
mtd:    READ FILE=fes-reweight/colvar_reweight.data IGNORE_TIME VALUES=mtd.rbias

weights: REWEIGHT_BIAS TEMP=300 ARG=mtd.rbias

HISTOGRAM ...
  ARG=dist
  GRID_MIN=0.2
  GRID_MAX=0.9
  GRID_BIN=100
  KERNEL=DISCRETE
  LOGWEIGHTS=weights
  LABEL=hg_dist
... HISTOGRAM

fes_dist: CONVERT_TO_FES GRID=hg_dist TEMP=300 MINTOZERO
DUMPGRID GRID=fes_dist FILE=fes-reweight/fes-reweight.dist.data FMT=%24.16e
EOF

To find out how much of the initial part of the simulation we should ignore, we use the code below. It ignores different time ranges from the beginning of the colvar file (this is done with the `./trim-colvar-file.py` script that is provided in the folder) and calculates the FES for each. We then plot the results. 

In [None]:
for i in range(0,1001,100):
    subprocess.run("rm -f fes-reweight/fes-reweight.dist.data".split(),stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    subprocess.run("rm -f fes-reweight/colvar_reweight.data".split(),stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    subprocess.run("./trim-colvar-file.py --colvar-file colvar.data --output-file fes-reweight/colvar_reweight.data --time-min {0}".format(i).split(),stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    run_plumed_cmd("driver --plumed fes-reweight/plumed_reweight.dat --noatoms")
    fes_data=np.loadtxt("fes-reweight/fes-reweight.dist.data")
    plt.plot(fes_data[:,0],(fes_data[:,1]-np.min(fes_data[:,1]))/kBT,label="Trim {} ps".format(i))
    plt.xlabel("Na-Cl distance [nm]")  
    plt.ylabel("Free Energy [kBT]")
    plt.title("Reweighted FES")
    plt.legend()
    subprocess.run("rm -f fes-reweight/fes-reweight.dist.data".split())
    subprocess.run("rm -f fes-reweight/colvar_reweight.data".split())
    plt.ylim([0,10])

As we can see, there is not much difference. Nevertheless, let's ignore the first 500 ps of the simulation, which is 5% of the total simulation time. We then calculate the reweighted FES. 

In [None]:
time_trim=500
subprocess.run("./trim-colvar-file.py --colvar-file colvar.data --output-file fes-reweight/colvar_reweight.data --time-min {0}".format(time_trim).split(),stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
run_plumed_cmd("driver --plumed fes-reweight/plumed_reweight.dat --noatoms")



To compare the FES obtained directly from the bias potential (i.e., `sum_hills`) with the one obtained from reweighting, we define the following function to wrap the plotting of the FESs.

In [None]:
def plot_fes(fes_files,labels=None): 
    for i, f in enumerate(fes_files):
        fes_data=np.loadtxt(f)
        cv=fes_data[:,0]
        fes=fes_data[:,1]
        fes = fes - np.min(fes)
        fes = fes/kBT
        if labels:
            label=labels[i]
        else:
            label=f
        plt.plot(cv,fes,'-',label=label)
    plt.xlabel("Na-Cl Distance [nm]")
    plt.ylabel("Free Energy [kBT]")
    plt.legend()
    plt.ylim([0,10])    
    plt.show()
        
    

We use this function to compare the two FES

In [None]:
fes_files=["fes/fes.mintozero.data",
           "fes-reweight/fes-reweight.dist.data"]
labels=["FES from bias",
        "FES from reweighting"]
plot_fes(fes_files,labels)

As we can see, the two FES are in a good agreement

## Reweighting on both distance and solvation CV

We can also reweight on other CVs that are not biased during the simulation. We use this to obtain the 2D FES as a function of distance and solvation CV. 

We define the PLUMED input for the reweighting in the following way. 

Here we are using kernel density estimation with a bandwidth of 0.004 nm for the distance and 0.04 for the solvation CV (`BANDWIDTH=0.004,0.04`). These values were obtained previously by trial and error. 

In [None]:
%%bash
cat << EOF > fes-reweight/plumed_reweight.2D.dat 
# vim:ft=plumed

dist:   READ FILE=fes-reweight/colvar_reweight.data IGNORE_TIME VALUES=dist
coord:  READ FILE=fes-reweight/colvar_reweight.data IGNORE_TIME VALUES=coord
mtd:    READ FILE=fes-reweight/colvar_reweight.data IGNORE_TIME VALUES=mtd.rbias

weights: REWEIGHT_BIAS TEMP=300 ARG=mtd.rbias

HISTOGRAM ...
  ARG=dist,coord
  GRID_MIN=0.2,2.5
  GRID_MAX=0.9,7.5
  GRID_BIN=200,200
  BANDWIDTH=0.004,0.04
  LOGWEIGHTS=weights
  LABEL=hg_dist_coord
... HISTOGRAM

fes_dist_coord: CONVERT_TO_FES GRID=hg_dist_coord TEMP=300 MINTOZERO
DUMPGRID GRID=fes_dist_coord FILE=fes-reweight/fes-reweight.dist-coord.data FMT=%24.16e
EOF

In [None]:
run_plumed_cmd("driver --plumed fes-reweight/plumed_reweight.2D.dat --noatoms")


We can use the following code to plot the 2D FES

In [None]:
fes_data = np.loadtxt("./fes-reweight/fes-reweight.dist-coord.data")
distance = fes_data[:,0].reshape(201,201)
coord =    fes_data[:,1].reshape(201,201)
fes_2d =   fes_data[:,2].reshape(201,201)-np.min(fes_data[:,2].reshape(201,201))
from matplotlib import cm
plt.contourf(distance,coord,fes_2d/kBT, levels=np.linspace(0,10,101), cmap=cm.jet)
plt.xlabel("Na-Cl distance [nm]")
plt.ylabel("Solvation of Na")
plt.title("FES - Reweighting")
plt.colorbar(label="Free Energy [kBT]", ticks=range(0,11))
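The hard-coded shape `(201,201)` follows from `GRID_BIN=200,200`, since a non-periodic grid with 200 bins has 201 points per dimension. As a possible alternative, the shape can be inferred from the data so that the plotting code keeps working if the number of bins changes. The sketch below uses a tiny made-up grid and assumes that the first CV varies fastest in the grid file; `reshape_grid_2d` is a hypothetical helper.

```python
import numpy as np

def reshape_grid_2d(grid_data):
    """Reshape flat (x, y, value) grid columns to 2D arrays, inferring the grid size."""
    grid_data = np.asarray(grid_data, dtype=float)
    nx = np.unique(grid_data[:, 0]).size
    ny = np.unique(grid_data[:, 1]).size
    # Assumption: x (first column) varies fastest, so each row of the 2D array has fixed y
    shape = (ny, nx)
    x2d = grid_data[:, 0].reshape(shape)
    y2d = grid_data[:, 1].reshape(shape)
    v2d = grid_data[:, 2].reshape(shape)
    return x2d, y2d, v2d

# Made-up 3x2 grid: x in {0, 1, 2}, y in {0, 1}, value = x + 10*y
demo_rows = [[x, y, x + 10.0 * y] for y in (0.0, 1.0) for x in (0.0, 1.0, 2.0)]
demo_x, demo_y, demo_v = reshape_grid_2d(demo_rows)
print(demo_v)
```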

# Exercise 3 - Biasing both distance and solvation CV

We now turn to Exercise 3, where we bias both the distance CV and the solvation CV. 

We start by copying the template files to a new folder for the runs and generating all the GROMACS files. 

This time we run the simulation for 5000 ps (2500000 MD steps). 

In [None]:
os.chdir(home_dir)

In [None]:
%%bash
cp -r Template-Exercise-X Exercise-3


In [None]:
%cd Exercise-3/

In [None]:
%%bash
cat <<EOF > MD-NPT.mdp
integrator = md
dt = 0.002
nsteps = 2500000
cutoff-scheme = Verlet
coulombtype = PME
rlist = 0.6
rcoulomb = 0.6
rvdw = 0.6
constraints = h-bonds
tcoupl =  V-rescale
ref_t = 300
tau-t = 1.0
tc-grps = System
gen-vel = no
gen-temp = 300
gen-seed = -1
DispCorr = AllEnerPres
pcoupl = Parrinello-Rahman
pcoupltype = isotropic
ref-p = 1.01325
compressibility = 4.5e-5
nstxout-compressed = 250
nstxout = 0
nstvout = 0
EOF

In [None]:
%%bash -s "$gmx_bin" "$gmx_mdrun_bin"

if [[ `hostname` =~ "bridges2" ]]; then
  source /ocean/projects/see220002p/shared/gromacs+plumed/gromacs-2022.6_plumed-2.9.0/load-gromacs-plumed.sh
fi

GMX_BIN=${1}
GMX_MDRUN_BIN=${2}

StartingGeometry=NaCl_StartingStructure-1.gro
RunFilename=NaCl_NPT-300K
${GMX_BIN}  grompp -f MD-NPT.mdp -c ${StartingGeometry} -p NaCl.top -o ${RunFilename}.tpr  -maxwarn 1

Now we need to add the solvation CV to the `METAD` action and we get the following PLUMED input

In [None]:
%%bash
cat <<EOF > plumed.dat
# vim:ft=plumed

# Distance between Na and Cl atoms
dist: DISTANCE ATOMS=322,323

# Solvation of Na atom
COORDINATION ...
  LABEL=coord
  GROUPA=322
  GROUPB=1-321:3
  SWITCH={RATIONAL R_0=0.315 D_MAX=0.5 NN=12 MM=24}
  NLIST
  NL_CUTOFF=0.55
  NL_STRIDE=10
... COORDINATION

uwall: UPPER_WALLS ...
   ARG=dist 
   AT=0.6
   KAPPA=4000.0 
...

METAD ...
  LABEL=mtd
  ARG=dist,coord
  PACE=500
  SIGMA=0.01,0.1 
  HEIGHT=1.25
  BIASFACTOR=5
  GRID_MIN=0.0,1.0
  GRID_MAX=1.0,10.0
  CALC_RCT
  FILE=hills.data
... METAD

PRINT ARG=dist,coord,mtd.*,uwall.* FILE=colvar.data STRIDE=250
EOF

Then we run the simulation

In [None]:
%%bash -s "$gmx_bin" "$gmx_mdrun_bin"

if [[ `hostname` =~ "bridges2" ]]; then
  source /ocean/projects/see220002p/shared/gromacs+plumed/gromacs-2022.6_plumed-2.9.0/load-gromacs-plumed.sh
fi

GMX_BIN=${1}
GMX_MDRUN_BIN=${2}

RunFilename=NaCl_NPT-300K
${GMX_MDRUN_BIN}  -deffnm ${RunFilename}   -plumed plumed.dat 
echo "0" | ${GMX_BIN} trjconv -f ${RunFilename}.xtc -s ${RunFilename}.tpr -pbc whole -o ${RunFilename}.pbc-whole.xtc

Let's look at the time series of the biased CVs and the hill height. With two biased CVs, the Gaussian height is in the 6th column of the `hills.data` file (after the time, the two CV values, and the two Gaussian widths).
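Rather than counting columns by eye, the column number can be read from the `#! FIELDS` header line that PLUMED writes at the top of its output files. The helper below is a small sketch with a made-up name, and the example header is typed by hand to mimic a hills file with two biased CVs; compare it with the actual first line of your `hills.data`.

```python
def plumed_column_index(header_line, field):
    """Return the 1-based column of `field` given a PLUMED '#! FIELDS' header line."""
    tokens = header_line.split()
    if tokens[:2] != ["#!", "FIELDS"]:
        raise ValueError("not a PLUMED FIELDS header: {!r}".format(header_line))
    names = tokens[2:]
    return names.index(field) + 1

# Example header, written by hand to mimic a hills file with two biased CVs
example_header = "#! FIELDS time dist coord sigma_dist sigma_coord height biasf"
print(plumed_column_index(example_header, "height"))  # prints 6
```

The returned number counts columns from 1, which is how columns are numbered in the text above.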

In [None]:
%%bash
head hills.data 

In [None]:
plot_colvardata("./colvar.data",2)
plot_colvardata("./colvar.data",3)
plot_colvardata("./hills.data",6)

When biasing two CVs, it is difficult to gauge the convergence of the FES from the 2D surface, so it is convenient to look at the 1D projections onto each of the two CVs. These can be calculated with the `--idw` flag of the `sum_hills` tool (for example, `--idw dist`). In this case we also need to give the kBT value of the simulation through `--kt`, which is 2.494353 kJ/mol at 300 K.
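The reason kBT is needed is that projecting onto one CV means integrating out the other with Boltzmann weights, $F(s_1) = -k_BT \ln \int e^{-F(s_1,s_2)/k_BT}\, ds_2$. The sketch below does this on a toy 2D surface with NumPy. The function `project_fes` and the toy surface are assumptions made for illustration and are not how `sum_hills` is implemented internally.

```python
import numpy as np

kBT = 2.494353  # kJ/mol at 300 K, the value used in this notebook

def project_fes(fes_2d, spacing, axis, kbt=kBT):
    """Integrate out one CV: F(s1) = -kBT * ln( sum_s2 exp(-F(s1,s2)/kBT) * ds2 )."""
    fes_2d = np.asarray(fes_2d, dtype=float)
    shifted = fes_2d - fes_2d.min()          # keeps the exponentials well behaved
    prob = np.exp(-shifted / kbt).sum(axis=axis) * spacing
    fes_1d = -kbt * np.log(prob)
    return fes_1d - fes_1d.min()             # same convention as --mintozero

# Toy surface F(x, y) = 20*x^2 + 20*y^2; projecting out y should leave 20*x^2
x = np.linspace(-1.0, 1.0, 101)
y = np.linspace(-3.0, 3.0, 301)
X, Y = np.meshgrid(x, y, indexing="ij")
toy_fes = 20.0 * X**2 + 20.0 * Y**2
fes_x = project_fes(toy_fes, spacing=y[1] - y[0], axis=1)
print("Largest deviation from 20*x^2:", np.max(np.abs(fes_x - 20.0 * x**2)))
```

For a separable surface like this one the projection returns the 1D profile unchanged; for a coupled surface the projected profile includes the entropic contribution of the CV that was integrated out.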

We then look at the 1D projected FESs every 1000 ps. 
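Note that `--stride` in `sum_hills` counts deposited Gaussians, not picoseconds. With the `PACE` and `dt` values used in this exercise, one Gaussian is deposited every 1 ps, so the two happen to coincide. A short check (variable names are illustrative):

```python
dt_ps = 0.002         # MD time step from MD-NPT.mdp, in ps
pace = 500            # PACE in the METAD action (MD steps between Gaussians)
stride_hills = 1000   # --stride value given to sum_hills (number of Gaussians)

time_per_hill_ps = pace * dt_ps                     # time between two Gaussians
time_per_fes_ps = stride_hills * time_per_hill_ps   # time between two FES files
print("One Gaussian every {:.1f} ps".format(time_per_hill_ps))
print("One FES file every {:.0f} ps".format(time_per_fes_ps))
```

If you change `PACE` or the time step, the `time=(i+1)*fes_stride` labels in the plotting cells need to be rescaled accordingly.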

In [None]:
%%bash
rm -rf fes-stride-1000 
mkdir fes-stride-1000

In [None]:
run_plumed_cmd("sum_hills --hills hills.data --mintozero --stride 1000 --kt 2.494353 --idw dist --outfile fes-stride-1000/fes.dist.stride-")
run_plumed_cmd("sum_hills --hills hills.data --mintozero --stride 1000 --kt 2.494353 --idw coord --outfile fes-stride-1000/fes.coord.stride-")

In [None]:
fes_stride=1000
num_fesfile=len(glob.glob("fes-stride-1000/fes.dist.stride-*"))
for i in range(num_fesfile):  
    if i == num_fesfile-1: continue
    fesfile="fes-stride-1000/fes.dist.stride-{}.dat".format(i)
    time=(i+1)*fes_stride
    fes=np.loadtxt(fesfile)
    plt.plot(fes[:,0],fes[:,1]/kBT,'-',label="{:4d} ps".format(time))    
plt.xlabel("Na-Cl Distance [nm]")
plt.ylabel("Free Energy [kBT]")
plt.legend()
plt.ylim([0,10])
plt.show()


In [None]:
fes_stride=1000
num_fesfile=len(glob.glob("fes-stride-1000/fes.coord.stride-*"))
for i in range(num_fesfile):  
    if i == num_fesfile-1: continue
    fesfile="fes-stride-1000/fes.coord.stride-{}.dat".format(i)
    time=(i+1)*fes_stride
    fes=np.loadtxt(fesfile)
    plt.plot(fes[:,0],fes[:,1]/kBT,'-',label="{:4d} ps".format(time))    
plt.xlabel("Na Solvation")
plt.ylabel("Free Energy [kBT]")
plt.legend()
plt.ylim([0,10])
plt.show()

In [None]:
%%bash
rm -rf fes
mkdir fes

In [None]:
run_plumed_cmd("sum_hills --hills hills.data --mintozero --kt 2.494353 --idw dist --outfile fes/fes.dist.data")
run_plumed_cmd("sum_hills --hills hills.data --mintozero --kt 2.494353 --idw coord --outfile fes/fes.coord.data")


Let's also calculate the reweighted FES for the distance. First, let's look at how the `mtd.rbias` values behave.

In [None]:
plot_colvardata("./colvar.data",5)

We then use the same input for the reweighting as above
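Conceptually, the reweighting input for this exercise combines three steps: assign each frame a weight $e^{\text{rbias}/k_BT}$, build a weighted histogram of the CV, and convert it to a free energy with $F = -k_BT \ln P$. The NumPy sketch below mirrors these steps on hand-made data. The function name and the binning details are assumptions for illustration and will not reproduce the PLUMED grid exactly.

```python
import numpy as np

def reweighted_fes(cv, rbias, kbt, grid_min, grid_max, nbins):
    """FES along `cv` from samples weighted by exp(rbias / kBT)."""
    cv = np.asarray(cv, dtype=float)
    rbias = np.asarray(rbias, dtype=float)
    # Subtracting the maximum only rescales all weights by a constant,
    # which shifts the FES and is removed when the minimum is set to zero
    weights = np.exp((rbias - rbias.max()) / kbt)
    hist, edges = np.histogram(cv, bins=nbins, range=(grid_min, grid_max),
                               weights=weights)
    centers = 0.5 * (edges[:-1] + edges[1:])
    with np.errstate(divide="ignore"):
        fes = -kbt * np.log(hist)            # empty bins become +inf
    fes -= fes[np.isfinite(fes)].min()       # minimum set to zero
    return centers, fes

# Hand-made example: two frames in one bin, one frame with a larger bias in the other
kbt = 2.494353
cv = [0.25, 0.25, 0.35]
rbias = [0.0, 0.0, kbt * np.log(2.0)]
centers, fes = reweighted_fes(cv, rbias, kbt, grid_min=0.2, grid_max=0.4, nbins=2)
print(centers, fes)
```

In the example the second bin is visited half as often but each visit carries twice the weight, so both bins end up with the same free energy.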

In [None]:
%%bash
mkdir -p fes-reweight

cat <<EOF > fes-reweight/plumed_reweight.dat
# vim:ft=plumed

dist:   READ FILE=fes-reweight/colvar_reweight.data IGNORE_TIME VALUES=dist
coord:  READ FILE=fes-reweight/colvar_reweight.data IGNORE_TIME VALUES=coord
mtd:    READ FILE=fes-reweight/colvar_reweight.data IGNORE_TIME VALUES=mtd.rbias

weights: REWEIGHT_BIAS TEMP=300 ARG=mtd.rbias

HISTOGRAM ...
  ARG=dist
  GRID_MIN=0.2
  GRID_MAX=0.9
  GRID_BIN=100
  KERNEL=DISCRETE
  LOGWEIGHTS=weights
  LABEL=hg_dist
... HISTOGRAM

fes_dist: CONVERT_TO_FES GRID=hg_dist TEMP=300 MINTOZERO
DUMPGRID GRID=fes_dist FILE=fes-reweight/fes-reweight.dist.data FMT=%24.16e
EOF

We ignore the first 500 ps, that is, the first 10% of the simulation.
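The trimming itself is a simple row filter on the time column. The sketch below shows the idea on a toy array. It is not the `trim-colvar-file.py` script used in the next cell, and whether that script keeps the row exactly at the cutoff time is an assumption here.

```python
import numpy as np

def trim_colvar_rows(data, time_min):
    """Keep only rows whose time (first column) is at or after `time_min`."""
    data = np.asarray(data, dtype=float)
    return data[data[:, 0] >= time_min]

# Toy colvar data: time in the first column, a dummy CV in the second
toy = np.column_stack([np.arange(0.0, 1000.0, 250.0), np.zeros(4)])
trimmed = trim_colvar_rows(toy, 500.0)
print(trimmed[:, 0])
```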

In [None]:
time_trim=500
subprocess.run("./trim-colvar-file.py --colvar-file colvar.data --output-file fes-reweight/colvar_reweight.data --time-min {0}".format(time_trim).split(),stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
run_plumed_cmd("driver --plumed fes-reweight/plumed_reweight.dat --noatoms")



Let's then compare the different FESs for the distance, obtained by biasing either 1 or 2 CVs, and calculated both directly from the bias potential and from reweighting.

In [None]:
fes_files=["../Exercise-2/fes/fes.mintozero.data",
           "../Exercise-2/fes-reweight/fes-reweight.dist.data",
           "fes/fes.dist.data",
           "fes-reweight/fes-reweight.dist.data"]
labels=["1 CV biased: FES from bias",
        "1 CV biased: FES from reweighting",
        "2 CV biased: FES from bias",
        "2 CV biased: FES from reweighting"]
plot_fes(fes_files,labels)


Finally let's calculate and plot the 2D FES obtained directly from the bias potential
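For plotting, the flat three-column `sum_hills` output has to be reshaped into 2D arrays. Instead of hard-coding the grid size, it can be inferred from the data, as in the sketch below. The function name is made up, and it assumes that the first CV varies fastest in the file, which you should confirm by looking at the first lines of the output.

```python
import numpy as np

def reshape_fes_grid(fes_data):
    """Reshape flat (x, y, F) rows into 2D arrays, inferring the grid size."""
    fes_data = np.asarray(fes_data, dtype=float)
    n_x = np.unique(fes_data[:, 0]).size   # grid points along the first CV
    n_y = np.unique(fes_data[:, 1]).size   # grid points along the second CV
    if n_x * n_y != fes_data.shape[0]:
        raise ValueError("rows do not form a complete n_x * n_y grid")
    # Assumes the first CV varies fastest: each block of n_x rows shares one y value
    shape = (n_y, n_x)
    x = fes_data[:, 0].reshape(shape)
    y = fes_data[:, 1].reshape(shape)
    fes = fes_data[:, 2].reshape(shape)
    return x, y, fes - fes.min()

# Toy grid written in the same order (first column fastest)
xs = np.linspace(0.0, 1.0, 5)
ys = np.linspace(1.0, 10.0, 4)
rows = np.array([(xv, yv, xv + yv) for yv in ys for xv in xs])
x_grid, y_grid, fes_grid = reshape_fes_grid(rows)
print(x_grid.shape)
```

The three returned arrays can be passed directly to `plt.contourf`.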

In [None]:
run_plumed_cmd("sum_hills --hills hills.data --mintozero  --bin 200,200  --outfile fes/fes.dist-coord.data")

In [None]:
%%bash
head ./fes/fes.dist-coord.data

In [None]:
fes_data = np.loadtxt("./fes/fes.dist-coord.data")
# The values in reshape must match the grid written by sum_hills (--bin 200,200 gives 201 points per CV)
distance = fes_data[:,0].reshape(201,201)
coord =    fes_data[:,1].reshape(201,201)
fes_2d =   fes_data[:,2].reshape(201,201)-np.min(fes_data[:,2].reshape(201,201))
from matplotlib import cm
plt.contourf(distance,coord,fes_2d/kBT, levels=np.linspace(0,10,101), cmap=cm.jet)
plt.xlabel("Na-Cl distance [nm]")
plt.ylabel("Solvation of Na")
plt.title("FES")
plt.colorbar(label="Free Energy [kBT]", ticks=range(0,11))

# Exercise 3B - Restarting Metadynamics Runs

Often we need to split simulations into different parts, or we want to run a finished simulation for a longer time to obtain better convergence. In this case, we need to restart the METAD runs, as we do not want to start from scratch.

This is quite easy in PLUMED. We just need to add the `RESTART` action to the top of the PLUMED input and restart the MD simulation (how this is done depends on the MD code). PLUMED will then read in the previously deposited Gaussians and append to the colvar and hills files.
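For MD codes where the `RESTART` line has to be added by hand, this can be scripted. The helper below is a minimal sketch with a made-up name: it prepends `RESTART` to the input text unless a line starting with that keyword is already present.

```python
def with_restart(plumed_input):
    """Return the PLUMED input text with a RESTART line at the top (added once)."""
    lines = plumed_input.splitlines()
    already_there = any(line.strip().split()[:1] == ["RESTART"] for line in lines)
    if already_there:
        return plumed_input
    return "RESTART\n" + plumed_input

# Example with a one-line input taken from this exercise
original = "dist: DISTANCE ATOMS=322,323\n"
print(with_restart(original))
```

Applying the function twice leaves the input unchanged after the first call, so it is safe to use in a cell that may be re-run.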

For GROMACS, restarting is even easier: we do not need to add a `RESTART` action, as PLUMED automatically detects that GROMACS is restarting. GROMACS writes out a checkpoint file that we can restart from. See the GROMACS manual for further information on restarting GROMACS jobs:

[https://manual.gromacs.org/current/user-guide/managing-simulations.html](https://manual.gromacs.org/current/user-guide/managing-simulations.html)

In [None]:
%ls

We first need to extend the run time in the TPR file, which is done with the `gmx convert-tpr` tool. Here we extend the time by 5000 ps, so the total time is 10000 ps.

In [None]:
%%bash -s "$gmx_bin" "$gmx_mdrun_bin"

if [[ `hostname` =~ "bridges2" ]]; then
  source /ocean/projects/see220002p/shared/gromacs+plumed/gromacs-2022.6_plumed-2.9.0/load-gromacs-plumed.sh
fi

GMX_BIN=${1}
GMX_MDRUN_BIN=${2}

RunFilename=NaCl_NPT-300K
mv ${RunFilename}.tpr ${RunFilename}.old.tpr

# Extend the time of the TPR file
${GMX_BIN} convert-tpr -s ${RunFilename}.old.tpr -o ${RunFilename}.tpr -extend 5000.0

In [None]:
%%bash -s "$gmx_bin" "$gmx_mdrun_bin"

if [[ `hostname` =~ "bridges2" ]]; then
  source /ocean/projects/see220002p/shared/gromacs+plumed/gromacs-2022.6_plumed-2.9.0/load-gromacs-plumed.sh
fi

GMX_BIN=${1}
GMX_MDRUN_BIN=${2}

RunFilename=NaCl_NPT-300K
${GMX_MDRUN_BIN}  -deffnm ${RunFilename}   -plumed plumed.dat -cpi ${RunFilename}.cpt
echo "0" | ${GMX_BIN} trjconv -f ${RunFilename}.xtc -s ${RunFilename}.tpr -pbc whole -o ${RunFilename}.pbc-whole.xtc

By grepping the GROMACS log file, we can see that PLUMED read in the Gaussians correctly.

In [None]:
%%bash
grep "Restarting from hills.data" NaCl_NPT-300K.log

By plotting the colvar and the Gaussian height, we can see that the behavior across the restart is exactly as we wanted.
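In addition to the visual check, one can verify that the deposition times in the hills file increase in regular steps across the restart, with no repeated or missing Gaussians. The sketch below uses hand-made time series, and the function name is hypothetical. For the real file, the times would be the first column of `hills.data`, and the expected spacing is 1 ps for the settings used here.

```python
import numpy as np

def hills_are_continuous(times, expected_spacing, tol=1e-6):
    """True if consecutive hill times all differ by `expected_spacing`."""
    times = np.asarray(times, dtype=float)
    return bool(np.allclose(np.diff(times), expected_spacing, rtol=0.0, atol=tol))

# A clean series, and one where the Gaussian at 5 ps appears twice
ok_times = np.arange(1.0, 11.0, 1.0)
dup_times = np.concatenate([np.arange(1.0, 6.0), np.arange(5.0, 11.0)])
print(hills_are_continuous(ok_times, 1.0), hills_are_continuous(dup_times, 1.0))
```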

In [None]:
plot_colvardata("colvar.data",2)
plot_colvardata("hills.data",6)

We can calculate the FESs obtained after 10000 ps and look at the time behavior 

In [None]:
run_plumed_cmd("sum_hills --hills hills.data --mintozero --stride 1000 --kt 2.494353 --idw dist --outfile fes-stride-1000/fes.dist.stride-")
run_plumed_cmd("sum_hills --hills hills.data --mintozero --stride 1000 --kt 2.494353 --idw coord --outfile fes-stride-1000/fes.coord.stride-")

In [None]:
fes_stride=1000
num_fesfile=len(glob.glob("fes-stride-1000/fes.dist.stride-*"))
for i in range(num_fesfile):  
    if i == num_fesfile-1: continue
    fesfile="fes-stride-1000/fes.dist.stride-{}.dat".format(i)
    time=(i+1)*fes_stride
    fes=np.loadtxt(fesfile)
    plt.plot(fes[:,0],fes[:,1]/kBT,'-',label="{:4d} ps".format(time))    
plt.xlabel("Na-Cl Distance [nm]")
plt.ylabel("Free Energy [kBT]")
plt.legend()
plt.ylim([0,10])
plt.show()


In [None]:
fes_stride=1000
num_fesfile=len(glob.glob("fes-stride-1000/fes.coord.stride-*"))
for i in range(num_fesfile):  
    if i == num_fesfile-1: continue
    fesfile="fes-stride-1000/fes.coord.stride-{}.dat".format(i)
    time=(i+1)*fes_stride
    fes=np.loadtxt(fesfile)
    plt.plot(fes[:,0],fes[:,1]/kBT,'-',label="{:4d} ps".format(time))    
plt.xlabel("Na Solvation")
plt.ylabel("Free Energy [kBT]")
plt.legend()
plt.ylim([0,10])
plt.show()

In [None]:
run_plumed_cmd("sum_hills --hills hills.data --mintozero --kt 2.494353 --idw dist --outfile fes/fes.dist.10000ps.data")
run_plumed_cmd("sum_hills --hills hills.data --mintozero --kt 2.494353 --idw coord --outfile fes/fes.coord.10000ps.data")


In [None]:
fes_files=["../Exercise-2/fes/fes.mintozero.data",
           "../Exercise-2/fes-reweight/fes-reweight.dist.data",
           "fes/fes.dist.data",
           "fes/fes.dist.10000ps.data"]

labels=["1 CV biased: FES from bias",
        "1 CV biased: FES from reweighting",
        "2 CV biased: FES from bias after 5000 ps",
        "2 CV biased: FES from bias after 10000 ps"]

plot_fes(fes_files,labels)

In [None]:
run_plumed_cmd("sum_hills --hills hills.data --mintozero  --bin 200,200  --outfile fes/fes.dist-coord.10000ps.data")

In [None]:
fes_data = np.loadtxt("./fes/fes.dist-coord.10000ps.data")
# The values in reshape must match the grid written by sum_hills (--bin 200,200 gives 201 points per CV)
distance = fes_data[:,0].reshape(201,201)
coord =    fes_data[:,1].reshape(201,201)
fes_2d =   fes_data[:,2].reshape(201,201)-np.min(fes_data[:,2].reshape(201,201))
from matplotlib import cm
plt.contourf(distance,coord,fes_2d/kBT, levels=np.linspace(0,10,101), cmap=cm.jet)
plt.xlabel("Na-Cl distance [nm]")
plt.ylabel("Solvation of Na")
plt.title("FES")
plt.colorbar(label="Free Energy [kBT]", ticks=range(0,11))

# Exercise 4 - Varying Metadynamics Parameters 

You can try to vary some of the metadynamics parameters, such as the bias factor. For example, you can use a value of 2 or 10. I would recommend biasing just the distance CV for this.
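To get a feeling for what the bias factor does, recall that in well-tempered metadynamics the Gaussian height is scaled down as $W = W_0\, e^{-V(s)/k_B\Delta T}$ with $k_B\Delta T = (\gamma - 1) k_BT$, and that the FES is estimated from the bias as $F(s) = -\frac{\gamma}{\gamma-1} V(s)$ up to a constant. The sketch below evaluates both expressions for a few bias factors. The function names and the example bias value of 10 kJ/mol are chosen only for illustration.

```python
import math

kBT = 2.494353  # kJ/mol at 300 K, the value used in this notebook

def hill_height(initial_height, current_bias, biasfactor, kbt=kBT):
    """Well-tempered height W0 * exp(-V / (kB*dT)), with kB*dT = (gamma - 1) * kBT."""
    return initial_height * math.exp(-current_bias / ((biasfactor - 1.0) * kbt))

def fes_from_bias(bias, biasfactor):
    """Long-time estimate F = -(gamma / (gamma - 1)) * V, up to a constant."""
    return -(biasfactor / (biasfactor - 1.0)) * bias

# Height of the next Gaussian and FES estimate where the bias has reached 10 kJ/mol
for gamma in [2, 5, 10]:
    print("gamma = {:2d}: height = {:.4f} kJ/mol, F = {:.2f} kJ/mol".format(
        gamma, hill_height(1.25, 10.0, gamma), fes_from_bias(10.0, gamma)))
```

A small bias factor makes the Gaussians shrink quickly, giving a smoother but more slowly exploring bias, while a large bias factor keeps the Gaussians large for longer.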

This is left as an exercise. You can use the analysis tools from above.

Here you should obtain the FES for the distance CV both directly from the bias potential and from reweighting, and compare them to the results from the earlier exercises.

In [None]:
os.chdir(home_dir)

In [None]:
%%bash
cp -r Template-Exercise-X Exercise-4


In [None]:
%cd Exercise-4/

In [None]:
%%bash
cat <<EOF > MD-NPT.mdp
integrator = md
dt = 0.002
nsteps = 2500000
cutoff-scheme = Verlet
coulombtype = PME
rlist = 0.6
rcoulomb = 0.6
rvdw = 0.6
constraints = h-bonds
tcoupl =  V-rescale
ref_t = 300
tau-t = 1.0
tc-grps = System
gen-vel = no
gen-temp = 300
gen-seed = -1
DispCorr = AllEnerPres
pcoupl = Parrinello-Rahman
pcoupltype = isotropic
ref-p = 1.01325
compressibility = 4.5e-5
nstxout-compressed = 250
nstxout = 0
nstvout = 0
EOF

In [None]:
%%bash -s "$gmx_bin" "$gmx_mdrun_bin"

if [[ `hostname` =~ "bridges2" ]]; then
  source /ocean/projects/see220002p/shared/gromacs+plumed/gromacs-2022.6_plumed-2.9.0/load-gromacs-plumed.sh
fi

GMX_BIN=${1}
GMX_MDRUN_BIN=${2}

StartingGeometry=NaCl_StartingStructure-1.gro
RunFilename=NaCl_NPT-300K
${GMX_BIN}  grompp -f MD-NPT.mdp -c ${StartingGeometry} -p NaCl.top -o ${RunFilename}.tpr  -maxwarn 1

In [None]:
%%bash
cat <<EOF > plumed.dat
# vim:ft=plumed

# Distance between Na and Cl atoms
dist: DISTANCE ATOMS=322,323

# Solvation of Na atom
COORDINATION ...
  LABEL=coord
  GROUPA=322
  GROUPB=1-321:3
  SWITCH={RATIONAL R_0=0.315 D_MAX=0.5 NN=12 MM=24}
  NLIST
  NL_CUTOFF=0.55
  NL_STRIDE=10
... COORDINATION

uwall: UPPER_WALLS ...
   ARG=dist 
   AT=0.6
   KAPPA=4000.0 
...

METAD ...
  LABEL=mtd
  ARG=dist
  PACE=500
  SIGMA=0.01
  HEIGHT=1.25
  BIASFACTOR=__FILL__
  GRID_MIN=0.0
  GRID_MAX=1.0
  CALC_RCT
  FILE=hills.data
... METAD

PRINT ARG=dist,coord,mtd.*,uwall.* FILE=colvar.data STRIDE=250
EOF

In [None]:
%%bash -s "$gmx_bin" "$gmx_mdrun_bin"

if [[ `hostname` =~ "bridges2" ]]; then
  source /ocean/projects/see220002p/shared/gromacs+plumed/gromacs-2022.6_plumed-2.9.0/load-gromacs-plumed.sh
fi

GMX_BIN=${1}
GMX_MDRUN_BIN=${2}

RunFilename=NaCl_NPT-300K
${GMX_MDRUN_BIN}  -deffnm ${RunFilename}   -plumed plumed.dat 
echo "0" | ${GMX_BIN} trjconv -f ${RunFilename}.xtc -s ${RunFilename}.tpr -pbc whole -o ${RunFilename}.pbc-whole.xtc

In [None]:
plot_colvardata("./colvar.data",3)

In [None]:
# FILL IN THE REST WITH ANALYSIS SCRIPTS

In [None]:
# Compare the different FES for the distance 
fes_files=["../Exercise-2/fes/fes.mintozero.data",
           "../Exercise-2/fes-reweight/fes-reweight.dist.data",
           "../Exercise-3/fes/fes.dist.data",
           "../Exercise-3/fes-reweight/fes-reweight.dist.data",
           "FILL_IN_PATH_TO_FES_FROM_BIAS",
           "FILL_IN_PATH_TO_FES_FROM_REWEIGHTING"]

labels=["Exercise 2: FES from bias",
        "Exercise 2: FES from reweighting",
        "Exercise 3: FES from bias",
        "Exercise 3: FES from reweighting",
        "Exercise 4: FES from bias",
        "Exercise 4: FES from reweighting"]

plot_fes(fes_files,labels)
