Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sum_hills truncates gaussians #420

Closed
mnmelo opened this issue Dec 9, 2018 · 11 comments · Fixed by #757
Closed

sum_hills truncates gaussians #420

mnmelo opened this issue Dec 9, 2018 · 11 comments · Fixed by #757

Comments

@mnmelo
Copy link

mnmelo commented Dec 9, 2018

Dear devs,

I ran into this issue, which can be exemplified by a simple MWE:

I create a HILLS file of a single CV, with a single deposed gaussian at 0.0. (SIGMA=0.05 and HEIGHT=0.25):

#! FIELDS time cv.val sigma_cv.val height biasf
#! SET multivariate false
#! SET kerneltype gaussian
      0.0    0.0                   0.05                   0.25                     -1

I then ask sum_hills to give me the FES from 0 to 1, over 1000 bins:

plumed sum_hills --min 0 --max 1 --bin 1000 --hills HILLS

which yields a fes.dat with the following values (abridged above and below):

    ...
    0.175000000   -0.000546873    0.038281095
    0.176000000   -0.000509799    0.035889838
    0.177000000   -0.000475048    0.033633410
    0.178000000   -0.000000000   -0.000000000
    0.179000000   -0.000000000   -0.000000000
    0.180000000   -0.000000000   -0.000000000
    ...

As can be seen, the single gaussian does not extend all the way till the end of the [0,1] range, and is instead truncated around 3.5*σ. Granted, the truncation error is small, but still well above the machine precision limit. I couldn't find any documentation for sum_hills describing it.

Tied to this is the summation handling during METAD: are gaussians similarly truncated when added to a GRID?

Thanks!
(using v2.5, at commit b1c3c74, but had the issue at least since 2c38c46)

@carlocamilloni
Copy link
Member

carlocamilloni commented Dec 10, 2018 via email

@maxbonomi
Copy link
Member

maxbonomi commented Dec 10, 2018 via email

@GiovanniBussi
Copy link
Member

Truncation is required in order to make the addition of a Gaussian to a grid computationally efficient. Notice that Gaussians are consistently truncated in sum_hills and when applied by the METAD keyword.

We can expect a small error coming from this, in particular due to the fact that there is a discontinuity in the potential. I suspect that the problem will be very small and likely smaller than the typical statistical error.

One way to fix it would be to "stretch" the potential as we do for switching functions since PLUMED 2.2. The main problem is that this breaks backward compatibility. My guess is that the discontinuity on switching functions is most critical (especially because people can play with D_MAX). Still I see that it would be nicer to have a continuous energy function..

Implementing the stretching of Gaussians would be very easy (actually, it would affect all Gaussian kernels that we use). We could make it optional to allow people to keep results consistent. Instead of using a NOSTRETCH flag (as in switching functions) I would use an environment variable, basically because we use Gaussian kernels in many places and it would be difficult to add a NOSTRETCH flag to all of them. Something like export PLUMED_GAUSSIAN_STRETCH=no.

An additional issue would be that if we stretch the Gaussians used in histograms we should also
scale them to be correctly normalized.

What do you think?

Giovanni

@GiovanniBussi
Copy link
Member

Actually the last problem is not a problem. For histograms we already use truncated Gaussians with correct normalization:

} else if( ktype==truncatedgaussian ) {

So it would be easy to add a "stretched Gaussian" to the list of kernels and use that in METAD and SumHills by default, perhaps with a env var to select the new/old behavior.

@carlocamilloni
Copy link
Member

carlocamilloni commented Dec 10, 2018 via email

@GiovanniBussi
Copy link
Member

Actually the code that computes Gaussians is repeated in multiple places. I think we can find all of them using git grep DP2CUTOFF

@spiwokv
Copy link
Contributor

spiwokv commented Dec 10, 2018

You can use moving pre-calculated Gaussians as we did in metadynview and metadynminer.
Vojtech

@spiwokv
Copy link
Contributor

spiwokv commented Dec 10, 2018

Unless you have variable hill widths.

@GiovanniBussi
Copy link
Member

@spiwokv I see this is making the calculation of the exponential functions faster. However, it will mean that the Gaussian's centers are discretized on the grid, right?

In case we have some evidence that this can make the calculation faster, I think we could add it as an optional feature. I would be careful however in using this as the only choice.

@spiwokv
Copy link
Contributor

spiwokv commented Dec 10, 2018

@GiovanniBussi yes, it is discretized on a grid (i don't see any simple interpolation). I did mean it for sum_hills, not for metadynamics. If you use for example periodic cv from -pi to pi, you can make a hill centered at 0,0 and calculate potential for all points. Next you can role it to center it to CV values. This does all periodicity. For a non-periodic cv you have to calculate it for -2pi to +2pi.

@spiwokv
Copy link
Contributor

spiwokv commented Dec 11, 2018

In metadynminer by fast algorithm at 256x256 grid 2.304877 secs
fast
by slow algorithm 54.28688 secs
slow
difference
difference
histogram of differences
histo

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants