Skip to content

Commit

Permalink
Updated Problem 2
Browse files Browse the repository at this point in the history
  • Loading branch information
Wei-Tse Hsu authored and Wei-Tse Hsu committed Jul 21, 2021
1 parent 6319a48 commit 545046e
Show file tree
Hide file tree
Showing 7 changed files with 62 additions and 64 deletions.
Binary file removed Problem_2/CV_hist.png
Binary file not shown.
Binary file added Problem_2/CV_hist_no_wall.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added Problem_2/CV_hist_with_wall.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
114 changes: 51 additions & 63 deletions Problem_2/Problem_2.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -11,12 +11,12 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"To examine the efficacy of alchemical metadynamics, I ran a 2D alchemical metadynamics on the CB8-G3 host-guest binding complex, where the configurational CV was the number of water molecules in the binding cavity. After a lot of attempts, below is the content of the PLUMED input file I adopted to ensure flat sampling in both the alchemical and configurational space."
"In one of the tests I am planning to present in the paper, I ran a 2D alchemical metadynamics on the CB8-G3 host-guest binding complex, where the configurational CV was the number of water molecules in the binding cavity. After a lot of attempts, below is the content of the PLUMED input file I finally adopted to ensure flat sampling in both the alchemical and configurational space."
]
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 3,
"metadata": {},
"outputs": [
{
Expand All @@ -42,7 +42,17 @@
"FILE=HILLS_2D\n",
"... METAD\n",
"\n",
"PRINT STRIDE=10 ARG=n,lambda,metad.bias FILE=COLVAR\n"
"UPPER_WALLS ...\n",
" ARG=n\n",
" AT=12\n",
" KAPPA=200.0\n",
" EXP=2\n",
" EPS=1\n",
" OFFSET=0\n",
" LABEL=uwall\n",
"... UPPER_WALLS\n",
"\n",
"PRINT STRIDE=10 ARG=* FILE=COLVAR"
]
}
],
Expand All @@ -55,7 +65,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"I am aware that I'm using peculiar values for a lot of parameters and below is the corresponding explanation for each of them:\n",
"## Section 1: Appropriate metadynamics parameters"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"I am aware that I'm using peculiar values for metadynamics parameters and below are the corresponding explanations:\n",
"- **Large height**: I expected the free energy barrier of removing the water molecules from the binding cavity to be pretty large, so I set the initial Gaussian height as 5 kT (around 12.394 kJ/mol). \n",
"- **Large bias factor**: Before deciding the value of the bias factor, I conducted some preliminary tests, which are metadynamics simulations with different bias factors only biasing the number of water molecules. As a result, due to the high free energy barrier, a bias factor between 60 to 150 would lead to a flatter distribution in the CV space. With the bias factor being 60, the sampling in the alchemical space was slow, so I decided to use 150, which did lead to a relatively flat distribution in both the alchemical space and the configurational CV space (the number of water molecules). In addition, from the 1D metadynamics simulations, it could be estimated that the largest free energy barrier in the configurational CV space was about 170 kT, so I thought a bias factor around 150 would probably reasonable. (In the paper [Well-Tempered Metadynamics: A Smoothly Converging and Tunable Free-Energy Method](https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.100.020603), it was experimentally found that the error tends to be the lowest given that the bias factor was set such that $k(T+\\Delta T)$ is of the order of magnitude of the barrier height.)\n",
"- **Small pace**: Given a large bias factor, the Gaussian height would decrease pretty slowly in the simulation. To accelerate the sampling, instead of using a large pace (e.g. 500 simulation steps), I used PACE=10."
Expand All @@ -72,131 +89,102 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=CV_hist.png width=800>"
"<img src=CV_hist_with_wall.png width=800>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Up to this point, I have my **first question**: Is this strategy for deciding the metadynamics parameters appropriate? I thought it would be reasonable since it reaches the goal of getting a relatively flat distribution, but I still wanted to make sure the parameters make sense."
"Up to this point, I have my the following question:\n",
"\n",
"**Question 1**: Is this strategy for deciding the metadynamics parameters appropriate? I thought it would be reasonable since it reaches the goal of getting a relatively flat distribution, but I still wanted to make sure if the parameters make sense."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The distribution of the number of water molecules shown above, however, still shows a potential problem in the sampling in the configurational space. Specifically, given the small radius of the binding cavity (0.35 nm), having over 12 water molecules in the cavity is likely unphysical. In the figure, it can be seen that the system spent a fair amount of time sampling these unphysical states, which could potentially slow down the convergence of the free energy calculations. Additionally, the simulation crashed due to the following PLUMED error after about 90 ns:\n",
"```\n",
"PLUMED:\n",
"PLUMED:\n",
"PLUMED: ################################################################################\n",
"PLUMED:\n",
"PLUMED:\n",
"PLUMED: +++ PLUMED error\n",
"PLUMED: +++ at Grid.cpp:170, function PLMD::GridBase::index_t PLMD::GridBase::getIndex(const std::vector<unsigned int>&) const\n",
"PLUMED: +++ message follows +++\n",
"PLUMED: ERROR: the system is looking for a value outside the grid along the 1 (n) index!\n",
"PLUMED:\n",
"PLUMED: ################################################################################\n",
"PLUMED\n",
"```"
"## Section 2: Appropriate parameters for the wall potential"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The upper bound for the configurational CV was set as 20 and it seems that the system was trying to sample the configuration with more than 20 water molecules inside the binding cavity. I've also noticed that the Gaussian height is high in the unphysical region (as shown in the figures below). Although this is natural since the region is less sampled, I'm wondering if I can regard the bias in the end of the simulation to be quasi-stationary, as it seems that the Gaussian height for the physical regions are not changing a lot in the end. My understanding is that we could state the the bias is stationary. The large Gaussian height in the unphysical region would just cause large errors, but it is fine since we are not interested in unphysical configurations anyway. (Note: The `COLVAR` and `HILLS` files can be downloaded via [this link](https://drive.google.com/drive/folders/19mCLDtWa1L9jtyh13_DHYnLnhJqpXaN8?usp=sharing).)"
"As can be checked above, in the PLUMED input file, I have also set up a wall potential to prevent the system from sampling configurations that are very likely to be unphysical, which are the ones with a lot of water molecules (roughly > 12) regardless of the value of $\\lambda$. The wall potential is centered at $n=12$ because in a 20 ns standard MD simulation previously done as the preliminary test, the maximum number of water molecules in the simulation was around 10. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=time_series.png width=1000>"
"As a reference, below are the CV histograms of another simulation where the wall potential was absent (with other parameters remained the same). "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To prevent the system from sampling unphysical configurations or having this kind of error, I'm considering modifying the PLUMED input file as below to add a wall potential."
"<img src=CV_hist_no_wall.png width=800>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```\n",
"center: CENTER ATOMS=1-144 # geometric center of the host molecule\n",
"water_group: GROUP ATOMS=207-6656:3 # oxygen atom of the water molecules\n",
"n: COORDINATION GROUPA=center GROUPB=water_group R_0=0.35 # radius: 0.6 nm\n",
"lambda: EXTRACV NAME=lambda\n",
"\n",
"METAD ...\n",
"ARG=lambda,n\n",
"SIGMA=0.01,0.05 # small SIGMA ensure that the Gaussian approaximate a delta function\n",
"HEIGHT=12.394781044629076\n",
"PACE=10\n",
"GRID_MIN=0,0 # index of alchemical states starts from 0\n",
"GRID_MAX=39,20 # we have 40 states in total\n",
"GRID_BIN=39,100\n",
"TEMP=298\n",
"BIASFACTOR=150\n",
"LABEL=metad\n",
"FILE=HILLS_2D\n",
"CALC_RCT\n",
"... METAD\n",
"\n",
"UPPER_WALLS ...\n",
" ARG=n\n",
" AT=12\n",
" KAPPA=200\n",
" EXP=2\n",
" EPS=1\n",
" OFFSET=0\n",
" LABEL=uwall\n",
"... UPPER_WALLS\n",
"As can be compared above, with the wall potential, the maximum number of water molecules in the binding cavity occurred in the simulaiton decreased. However, the system still spent a fair amount of time sampling the configurations that had a lot of water molecules. This can also be seen in the plot below, which is the Gaussian height as a function of time."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=hills.png width=400>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As shown above, we can see that overall the Gaussian height was converging. The spikes we see in the plot correspond to the times that the system was sampling the \"unphysical\" configurations, as for these less-sampled regions, the Gaussian height in a well-temepered metadynamics would not decay too much. Currently, the simulation is being extended to the expected length (100 ns). The `COLVAR` and `HILLS` files can be downloaded via [this link](https://drive.google.com/drive/folders/19mCLDtWa1L9jtyh13_DHYnLnhJqpXaN8?usp=sharing).\n",
"\n",
"PRINT STRIDE=10 ARG=* FILE=COLVAR\n",
"```"
"Overall, regarding this system with the wall potential, I have the following quesitons:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a prelminary test, previously I've tried running a 20 ns standard MD with the system where the ligand is absent. As a result, the maximum number of water molecules was around 10, so I set `AT=12` in the section of `UPPER_WALLS`. Originally, I set `KAPPA` as 2000, but the simulation crashed with a LINC error after around 5 ns. I therefore changed the value of `KAPPA` to 200 and set `EPS` as 10. As such, if 15 water molecules are forced into the binding cavity, the energy penalty would be \n",
"$$U_{\\text{wall}} = 200 \\left( \\frac{15-12}{10}\\right)^2=18 \\; \\text{kJ}/\\text{mol}$$"
"- **Question 2**: I'm wondering if the parameters related to the wall potential are reasonable. Specifically, an excessivly large force constant might lead to a LINCS error, while a too small force constant could not prevent unphysical sampling effectively. Given the energy penalty set up for now (200 kJ/mol if there are 13 water molecules in the binding cavity), I'm not sure if the parameters require further adjustment."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"However, since I've just started the simulation, I'm not entirely sure if this energy penalty would be too large or too small such that another error would be caused. I'm wondering if you have general suggestions about deciding the parameters for a wall potential. "
"- **Question 3**: As a wall potential is applied in alchemical metadynamics, I am also wondering about the best practices of reweighting the data. In [this tutorial](https://www.plumed.org/doc-v2.6/user-doc/html/ves-lugano2017-metad.html), actions or keywords such as `HISTOGRAM` and `LOGWEIGHT` were used. However, this is pretty different from the protocol I adopted in the notebook `Problem_1.ipynb` in the same repo. Specifically, previously what I did was use `RESTART=YES` to sum up the metadynamics bias, but I'm not entirely sure if the same protocol can be applied if there is a fixed potential. I'm wondering if you could offer some guidance about how I should modify the protocol I previously used to consider the wall potential when calculating free energy differences and their uncertainties. \n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a wall potential is applied in alchemical metadynamics, I was also wondering about the best practices of reweighting the data. In [this tutorial](https://www.plumed.org/doc-v2.6/user-doc/html/ves-lugano2017-metad.html), actions or keywords such as `HISTOGRAM` and `LOGWEIGHT` were used. However, this is pretty different from the protocol I adopted in the notebook `Problem_1.ipynb` in the same repo. Specifically, previously what I did was use `RESTART=YES` to sum up the metadynamics bias, but I'm not entirely sure if the same protocol can be applied if there is a fixed potential. I'm wondering if you could offer some guidance about how I should modify the protocol I previously used to consider the wall potential when calculating free energy differences and their uncertainties. "
"- **Question 4**: Given the plot of the Gaussian height above, I expect that after 100 ns, the Guassian height of the physical region should be almost stationary, which meets the requirement of the method I used to calculate free energy differences in Problem 1. However, for the unphysical region, the height would probably still be hight since they are less sampled. Since we are only interested in the physical region, I'm wondering if it is still okay to use the same protocol as the one used in `Problem_1.ipynb`. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To further restrict the sampling region to accelerate the convergence, I have another question. Setting a wall potential as above would prevent the system from exploring any region with `n` (the number of water molecules) over 12, regardless of the $\\lambda$ values. Since 12 water molecules being in the binding cavity is not physical with any $\\lambda$ values, prohibiting exploration of such regions with a wall potential meets my needs. However, I'm wondering if there is a way to prohibit the exploration of regions such as $\\lambda=40$ (the uncoupled) and $n=0$ at the same time, which is also likely to be unphysical. Using a simple wall potential does not seem to be able to deal with this, since setting a wall potential at $n=0$ is going to prohibit some physical configurations as well (like $\\lambda=0$ and $n=0$)."
"- **Question 5**: Setting a wall potential as above would prevent the system from exploring any region with `n` (the number of water molecules) over 12, regardless of the $\\lambda$ values. Since 12 water molecules being in the binding cavity is not physical with any $\\lambda$ values, prohibiting exploration of such regions with a wall potential meets my needs. However, to further restrict the sampling region, I'm wondering if there is a way to prohibit the exploration of regions such as $\\lambda=40$ (the uncoupled) and $n=0$ at the same time, which is also likely to be unphysical. Using a simple wall potential does not seem to be able to deal with this, since setting a wall potential at $n=0$ is going to prohibit some physical configurations as well (like $\\lambda=0$ and $n=0$)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To summarize, my main questions proposed in this notebook are as follows:\n",
"Briefly, the questions above can be summarized as follows:\n",
"- Are the strategies I used for deciding metadynamics parameters appropriate?\n",
"- Can the bias be regarded as quasi-stationary if the Gaussian height in the unphysical region is still very high? \n",
"- What are the general suggestions you would give about deciding the parameters for a wall potential? How large should the energy penalty typically be?\n",
Expand Down Expand Up @@ -228,7 +216,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
"version": "3.9.5"
}
},
"nbformat": 4,
Expand Down
Binary file added Problem_2/hills.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
12 changes: 11 additions & 1 deletion Problem_2/plumed.dat
Original file line number Diff line number Diff line change
Expand Up @@ -17,4 +17,14 @@ LABEL=metad
FILE=HILLS_2D
... METAD

PRINT STRIDE=10 ARG=n,lambda,metad.bias FILE=COLVAR
UPPER_WALLS ...
ARG=n
AT=12
KAPPA=200.0
EXP=2
EPS=1
OFFSET=0
LABEL=uwall
... UPPER_WALLS

PRINT STRIDE=10 ARG=* FILE=COLVAR
Binary file removed Problem_2/time_series.png
Binary file not shown.

0 comments on commit 545046e

Please sign in to comment.