Comments follow-up #8

Open
ramess101 opened this issue Dec 11, 2018 · 13 comments

Comments

@ramess101
Owner

@jpotoff @msoroush

I just wanted to ask some questions about a few of your comments.

One issue with HS-GCMC is that it is unclear how much efficiency is gained by actually sampling the different states within one simulation. That was never quantified by Errington et al. MBAR has the benefit of not requiring any additional CPU time to sample different Hamiltonians.

Do we want to raise this point? I wasn't sure if I want to speculate on the efficiency gain of HS-GCMC. However, it is nice to point out that MBAR does not require any additional CPU time.

Would it be better to state that we found a typographical error in the table (missed negative sign)? What we have here sounds a bit ominous when this was really just a transcription error.

Is this any better?

image

BTW, this is how our reweighting code works. It's summing over snapshots. Nigel Wilding explained the method to me back around 1997. Originally we did it to get around hardware memory limitations.

So are the equations in the histogram reweighting section not actually representative of how you compute VLE?

How is the 3X array generated? Are you taking the snapshot lists produced by GOMC and reprocessing them with the basis functions to add the third column of data?

I plan on explaining this more in the supporting information. Do you think an explanation needs to be in the main text?

Seems a little strange that TraPPE to MiPPE is quite a bit better than MiPPE to TraPPE for 2,2,4-trimethylpentane (and most other compounds). Any ideas why?

I would say they are equally accurate in the vapor phase, but your observation appears to apply to the liquid phase. My only idea is that because TraPPE has a softer potential (lam = 12 and smaller sigma) it samples a more diverse group of short range distances while MiPPE samples a much narrower group of short range distances.

@ramess101
Owner Author

@jpotoff @msoroush

My NIST reviewers had a few comments I was not sure about. I was hoping you could provide some insight. I admit that some of these questions are probably not that important, so let me know if you think we actually need to answer them in the manuscript.

  1. In Towhee, GCMC requires a range of molecules. That is not the case in GOMC, right?
  2. How were the box sizes determined for each compound? I tried to explain that we don't really need to use an experimental density for the initial box, but they still thought we needed to explain how we chose 3 nm, 3.5 nm, and 4 nm.
  3. Are 20 and 150 molecules enough for the initial configuration of the vapor and liquid phases, respectively? I tried to explain that finite size effects are not significant here and that these are just the initial configurations.
  4. What were the minimum distances and other parameters supplied to Packmol?
  5. Why do we not need torsion moves? My understanding is that we use the torsion sampling in the particle swap move, is that correct?
  6. Do we need to report the alkyne MiPPE parameters? We do not actually simulate alkynes in this study, we just reprocess the data. However, we do perform the epsilon scaling for alkynes, so it probably would be important to report the epsilon/sigma parameters. Should we also report the bond-lengths, angles, and torsions for alkynes then?
  7. Should we report the optimal cyclohexane MiPPE parameters in the methods section even though we don't optimize them until the Case study?

@msoroush
Collaborator

msoroush commented Dec 11, 2018 via email

@ramess101
Owner Author

@msoroush

Thanks for answering the NIST reviewer questions

@ramess101
Owner Author

@jpotoff @msoroush

Can you verify that the notation I have used for different atom types is clear and correct? And that I have included all the sites you need to simulate the branched alkanes and alkynes without including any that are not needed?

Also, note that I did include the MiPPE cyclohexane parameters, although right now they are just the pseudo-optimal 16-6 parameters.

image

@msoroush
Collaborator

msoroush commented Dec 11, 2018 via email

@ramess101
Owner Author

@msoroush

Right, I forgot about that...

@jpotoff
Collaborator

jpotoff commented Dec 11, 2018

@ramess101 just to add to what @msoroush wrote:

In Towhee, GCMC requires a range of molecules. That is not the case in GOMC, right?

GOMC is like Towhee in that in GCMC it still stores data in two "boxes", one "real" box with full interactions and one "ideal gas" box where molecules are non-interacting. We just need to make sure the ideal gas phase (reservoir) has a sufficient number of molecules in it. Normally we pack it with significantly more molecules than will be needed in the simulation. The simulation also terminates with an error if the reservoir is ever completely emptied.

How were the box sizes determined for each compound? I tried to explain that we don't really need to use an experimental density for the initial box, but they still thought we needed to explain how we chose 3 nm, 3.5 nm, and 4 nm.
Are 20 and 150 molecules enough for the initial configuration of the vapor and liquid phases, respectively? I tried to explain that finite size effects are not significant here and that these are just the initial configurations.

Additionally, we calculated the compressibility factor for each compound and have shown it approaches the limit of Z=1 at low temperatures. We have found that the compressibility factor is sensitive to system size effects, and if the system is too small, we will not obtain the expected convergence of Z, even though the liquid densities show little change.
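For reference, the Z check described above can be sketched in a few lines. This is only an illustration, assuming Z = P / (rho k_B T) with the saturated vapor pressure and vapor number density taken from the simulation output; the function name and the numbers are hypothetical and are not values from our runs.

```python
# Boltzmann constant in J/K (exact SI value).
K_B = 1.380649e-23


def compressibility_factor(pressure_pa, number_density_m3, temperature_k):
    """Return Z = P / (rho * k_B * T) for a single-phase state point.

    pressure_pa       -- pressure in Pa
    number_density_m3 -- number density in molecules per cubic meter
    temperature_k     -- temperature in K
    """
    return pressure_pa / (number_density_m3 * K_B * temperature_k)


# Hypothetical low-temperature vapor state point (illustrative numbers only).
temperature = 300.0
rho_vapor = 2.0e24  # molecules per m^3

# A pressure equal to the ideal-gas value gives Z = 1 by construction.
p_ideal = rho_vapor * K_B * temperature
print(compressibility_factor(p_ideal, rho_vapor, temperature))

# A pressure 10% below the ideal-gas value gives Z = 0.9.
print(compressibility_factor(0.9 * p_ideal, rho_vapor, temperature))
```

If Z computed this way does not trend toward 1 as the temperature drops, that would be the symptom of a too-small system mentioned above.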

Why do we not need torsion moves? My understanding is that we use the torsion sampling in the particle swap move, is that correct?

To add to what Mohammad said, we are doing about 50-60% swap moves, and each time the molecule is completely regrown. During the calculations for the previous paper, we calculated the dihedral angle distributions for a few of the compounds to verify we were sampling the correct distribution of dihedral angles.
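A check of this kind could look roughly like the sketch below. It is not the script used for the previous paper; it assumes the four site coordinates of each torsion are available per snapshot, and the helper names (`dihedral_deg`, `dihedral_histogram`) are made up for illustration.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for sites p0-p1-p2-p3 (trans = +/-180)."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = tuple(x - _dot(b0, b1) * y for x, y in zip(b0, b1))
    w = tuple(x - _dot(b2, b1) * y for x, y in zip(b2, b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def dihedral_histogram(angles_deg, n_bins=36):
    """Normalized probability density over [-180, 180] degrees."""
    width = 360.0 / n_bins
    counts = [0] * n_bins
    for a in angles_deg:
        idx = min(int((a + 180.0) // width), n_bins - 1)
        counts[idx] += 1
    total = len(angles_deg)
    return [c / (total * width) for c in counts]


# Hypothetical coordinates: one trans and one cis arrangement.
trans = dihedral_deg((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (-1.0, 0.0, 1.0))
cis = dihedral_deg((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (1.0, 0.0, 1.0))
density = dihedral_histogram([trans, cis, 60.0, -60.0])
```

The resulting density can then be compared against the Boltzmann distribution of the torsional potential at the simulation temperature.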

Do we need to report the alkyne MiPPE parameters? We do not actually simulate alkynes in this study, we just reprocess the data. However, we do perform the epsilon scaling for alkynes, so it probably would be important to report the epsilon/sigma parameters. Should we also report the bond-lengths, angles, and torsions for alkynes then?

Yes, I think it would be helpful to include all parameters in the table to prevent confusion among readers.

Should we report the optimal cyclohexane MiPPE parameters in the methods section even though we don't optimize them until the Case study?

I think we should. We should include a footnote indicating that those parameters were optimized as part of this work.

@ramess101
Owner Author

@jpotoff

Thanks for the additional explanation. That was very helpful.

@jpotoff
Collaborator

jpotoff commented Dec 11, 2018

One issue with HS-GCMC is that it is unclear how much efficiency is gained by actually sampling the different states within one simulation. That was never quantified by Errington et al. MBAR has the benefit of not requiring any additional CPU time to sample different Hamiltonians.

Do we want to raise this point? I wasn't sure if I want to speculate on the efficiency gain of HS-GCMC. However, it is nice to point out that MBAR does not require any additional CPU time.

I think we should point out that MBAR is basically "free". Unfortunately, we don't have enough data in the Errington paper to quantify the efficiency of Hamiltonian scaling vs. MBAR. Perhaps that could be done as part of a future paper.

BTW, this is how our reweighting code works. It's summing over snapshots. Nigel Wilding explained the method to me back around 1997. Originally we did it to get around hardware memory limitations.
So are the equations in the histogram reweighting section not actually representative of how you compute VLE?

Leave the text as is, otherwise people are going to get really confused. It was just a comment that what we are doing in practice is actually a lot more like the "histogram-free" method here.
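To make the "summing over snapshots" idea concrete, here is a minimal sketch for a single GCMC run. It is not the actual reweighting code; it assumes reduced units, a list of (N, U) snapshots from one run at (beta0, mu0), and the standard weight exp[-(beta - beta0) U + (beta mu - beta0 mu0) N]. Combining several runs also requires solving for their relative weights, which is left out here. The function name and the data are hypothetical.

```python
import math


def reweight_mean_n(snapshots, beta0, mu0, beta, mu):
    """Estimate <N> at (beta, mu) from (N, U) snapshots sampled at (beta0, mu0).

    No histogram is built: each snapshot contributes directly with its own weight.
    """
    log_w = [-(beta - beta0) * u + (beta * mu - beta0 * mu0) * n
             for n, u in snapshots]
    shift = max(log_w)  # subtract the maximum to avoid overflow in exp
    weights = [math.exp(lw - shift) for lw in log_w]
    numerator = sum(w * n for w, (n, _) in zip(weights, snapshots))
    return numerator / sum(weights)


# Hypothetical snapshot list in reduced units: (number of molecules, energy).
snapshots = [(10, -50.0), (12, -61.0), (15, -80.0), (11, -55.0)]

# Reweighting to the sampled state recovers the plain average.
same_state = reweight_mean_n(snapshots, beta0=1.0, mu0=-5.0, beta=1.0, mu=-5.0)

# A less negative chemical potential shifts weight toward larger N.
higher_mu = reweight_mean_n(snapshots, beta0=1.0, mu0=-5.0, beta=1.0, mu=-4.5)
```

Because every snapshot is kept, memory scales with the number of snapshots rather than with the number of histogram bins.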

How is the 3X array generated? Are you taking the snapshot lists produced by GOMC and reprocessing them with the basis functions to add the third column of data?
I plan on explaining this more in the supporting information. Do you think an explanation needs to be in the main text?

If you can do it in one or two sentences, it would be best to put in the main body of the paper. If it's more complicated, supporting information would be fine.

Seems a little strange that TraPPE to MiPPE is quite a bit better than MiPPE to TraPPE for 2,2,4-trimethylpentane (and most other compounds). Any ideas why?
I would say they are equally accurate in the vapor phase, but your observation appears to apply to the liquid phase. My only idea is that because TraPPE has a softer potential (lam = 12 and smaller sigma) it samples a more diverse group of short range distances while MiPPE samples a much more narrow group of short range distances.

Makes sense. We should include a sentence or two on this in the paper. I'm sure it will come up in review otherwise.

@ramess101
Owner Author

ramess101 commented Dec 11, 2018

@jpotoff @msoroush

I think we should point out that MBAR is basically "free". Unfortunately, we don't have enough data in the Errington paper to quantify the efficiency of Hamiltonian scaling vs. MBAR. Perhaps that could be done as part of a future paper.

I am hesitant to claim that MBAR is basically "free", especially compared to Hamiltonian scaling. It is true that MBAR for constant theta or simple epsilon scaling (where energies are multiplied by U) is essentially free. However, for the scenario where we are computing VLE for force field j from snapshots of force field i, there is always some additional cost. Either the cost is at run time by generating basis functions or at rerun time by recomputing the energies. Since this is the scenario that is a direct parallel with Hamiltonian scaling (i.e., where we want to estimate properties of different force fields) it would seem misleading to say that MBAR is free. Although basis functions and rerun calculations are fast, I don't really know if they are significantly more efficient than Hamiltonian scaling.
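As a rough illustration of where the basis function cost comes from, the sketch below stores two sums per snapshot and then evaluates the energy for any epsilon and sigma without revisiting the distances. It assumes a single site type, a Mie lambda-6 potential, and no cutoff or tail correction; the function names and the distances are hypothetical and this is not the production implementation.

```python
import math


def mie_prefactor(lam):
    """Mie lambda-6 prefactor; equals 4 for lam = 12 (Lennard-Jones)."""
    return (lam / (lam - 6.0)) * (lam / 6.0) ** (6.0 / (lam - 6.0))


def basis_sums(distances, lam):
    """Per-snapshot basis functions: sum of r^-lam and sum of r^-6."""
    rep = sum(r ** -lam for r in distances)
    att = sum(r ** -6.0 for r in distances)
    return rep, att


def energy_from_basis(rep, att, epsilon, sigma, lam):
    """Recompute the snapshot energy for new epsilon and sigma from stored sums."""
    return mie_prefactor(lam) * epsilon * (sigma ** lam * rep - sigma ** 6.0 * att)


def energy_direct(distances, epsilon, sigma, lam):
    """Reference calculation that loops over all pair distances again."""
    c = mie_prefactor(lam)
    return sum(c * epsilon * ((sigma / r) ** lam - (sigma / r) ** 6.0)
               for r in distances)


# Hypothetical pair distances for one snapshot (arbitrary length units).
distances = [3.9, 4.2, 5.0, 6.1]
rep16, att = basis_sums(distances, lam=16.0)

u_basis = energy_from_basis(rep16, att, epsilon=120.0, sigma=3.8, lam=16.0)
u_direct = energy_direct(distances, epsilon=120.0, sigma=3.8, lam=16.0)

# Pure epsilon scaling is a single multiplication of the stored energy.
u_scaled = energy_from_basis(rep16, att, epsilon=240.0, sigma=3.8, lam=16.0)
```

Note that the repulsive sum depends on lambda, so a separate sum has to be stored for each exponent of interest (for example 12 and 16). That storage and the extra work at run time are the cost I am referring to.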

Leave the text as is, otherwise people are going to get really confused. It was just a comment that what we are doing in practice is actually a lot more like the "histogram-free" method here.

That is very interesting. So it sounds like you might already be using something quite similar to MBAR. Have you looked into whether your approach could estimate properties for different force fields?

If you can do it in one or two sentences, it would be best to put in the main body of the paper. If it's more complicated, supporting information would be fine.

I will try both. A brief explanation in the text but also the more complicated derivation in SI.

Makes sense. We should include a sentence or two on this in the paper. I'm sure it will come up in review otherwise.

You are right that it will likely come up in the review. There are other possible explanations though. For example, the state points we sample are different. Specifically, the bridge temperature is different (reflecting the difference in Tc) and the chemical potentials are different.

It would seem to me that having a bridge temperature for force field i that is actually supercritical for force field j could impact performance. However, from Table S3 of Mick et al. it does not appear that there is a clear trend when comparing Tc of MiPPE and TraPPE. In other words, the MiPPE Tc (and thereby, bridge temperature) is not consistently higher or lower than the TraPPE bridge. Interestingly, the 2,2,4-trimethylpentane Tc is essentially the same for MiPPE and TraPPE, but the bridge temperatures are different. Specifically, the bridge temperature is 550 K and 560 K for TraPPE and MiPPE, respectively. 560 K appears to be supercritical for both TraPPE and MiPPE (unless there is a typo in Table S3) so that could explain why TraPPE to MiPPE worked so well in this case. What do you think?

The TraPPE liquid phase simulations have chemical potentials that are consistently more negative than their MiPPE counterparts. Since we simulate at chemical potentials that put us well into the liquid phase (i.e., less negative) we have some wiggle room between the simulated chemical potentials and the saturation line. So TraPPE would appear to be sampling somewhat closer to MiPPE's saturation chemical potentials (although still not metastable) while MiPPE is even farther away from TraPPE's saturation chemical potentials. Does that make any sense?

@ramess101
Owner Author

@msoroush

Did you also include bond, angle, and dihedral parameters?

I have now included the bonds, angles, and dihedrals. Please verify that the values are accurate and that the notation is consistent and clear. Thanks

image

image

image

@msoroush
Collaborator

@ramess101 It looks good.
In the angle parameter table:

  • maybe it's better to use CHx-CH2-C(sp) rather than CHx-CHy-C(sp)

In the torsion parameter table:

  • I see a different font for CHx-CH2-CH2-C(sp)

@ramess101
Owner Author

@msoroush

Thanks. I have fixed those two issues.

This was referenced Dec 13, 2018