I have been experiencing instabilities which, after much difficulty, I was able to track down to the minimizer. The observed behavior is that the minimizer finishes 'successfully', but some time later, during equilibration and/or while propagating replicas, the simulation crashes at random with NaNs or even throws CUDA_ERROR_ILLEGAL_ADDRESS (700).
Nearly identical behavior is reported in openmm/openmm#3414. In that ticket, the issue was that the box vectors were not being updated correctly. For stability, the box vectors should not be allowed to change at all during minimization. A subsequent NPT equilibration should be used to adjust the box vectors.
In my case(s), the box vectors were changing only at the fourth significant figure, and yet that was sufficient to (usually) crash the simulations. Inserting Gradient Descent before FIRE seems to greatly reduce, but not completely eliminate, the instabilities. Replacing FIRE with L-BFGS (with or without Gradient Descent) also seems to fix the instabilities.
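To check whether a given setup is hitting the same problem, one option is to compare the box vectors before and after minimization. The snippet below is only a minimal sketch of such a check and is not part of this PR. The helper names (`max_relative_box_change`, `assert_box_unchanged`) and the `1e-6` tolerance are illustrative assumptions. It works on plain nested lists of floats, so the values have to be stripped of units first (for example, taken from `context.getState().getPeriodicBoxVectors()` in OpenMM and converted to nanometers).

```python
import math


def max_relative_box_change(before, after):
    """Largest component-wise change between two sets of box vectors,
    relative to the longest box-vector length in `before`.

    `before` and `after` are 3x3 nested sequences of unitless floats.
    (Hypothetical helper, for illustration only.)
    """
    # Use the longest edge as the length scale so that zero components
    # (e.g. off-diagonal entries of a cubic box) do not divide by zero.
    scale = max(math.sqrt(sum(c * c for c in vec)) for vec in before)
    return max(
        abs(a - b) / scale
        for vec_b, vec_a in zip(before, after)
        for b, a in zip(vec_b, vec_a)
    )


def assert_box_unchanged(before, after, rel_tol=1e-6):
    """Raise ValueError if the box vectors moved by more than `rel_tol`.

    The default tolerance is an assumption, chosen to be well below the
    fourth-significant-figure drift described above.
    """
    change = max_relative_box_change(before, after)
    if change > rel_tol:
        raise ValueError(
            f"Box vectors changed during minimization "
            f"(max relative change {change:.3e} > {rel_tol:.1e})"
        )
```

Calling `assert_box_unchanged(box_before, box_after)` right after minimization should surface the drift immediately, rather than as a NaN or CUDA error many steps later.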
This PR adds three changes to improve the stability of minimization: