Remaining issues in the GeometryEngine stuff #49

pgrinaway · 2015-11-30T21:02:05Z

Currently, there are a couple issues with the GeometryEngine, in particular relating to the implementation that uses all angle and torsion potentials when choosing a torsion.

Sometimes, the proposal potential results in some nans somehow making their way to the rejection sampler. When a uniform random number is drawn with nan as the upper limit, numpy throws an exception. Looking into why this is still happening despite replacing nans with inf.
The speed is a bit of an issue. I've discovered through profiling that this is a result of the conversion between internal and cartesian coordinates. This can probably be accelerated a lot via vectorization or something of the sort.

The text was updated successfully, but these errors were encountered:

jchodera · 2015-12-01T01:19:24Z

Why use a rejection sampler here? If we have computed the probability vector p, why not just use numpy.random.choice(numpy.arange(phis, p=p)?

jchodera · 2015-12-01T01:20:06Z

p must be normalized for numpy.random.choice, but that is easy to do with p / p.sum() if that hasn't been done already.

jchodera · 2015-12-01T01:24:50Z

For speeding things up, we have a few possibilities:

Compile a list of all torsions and angles and vectorize computations with numpy

This is a bit tricky, but should be straightforward given that all the other positions are fixed.

Code the energy terms in Cython

This might be the most straightforward way to recode the existing code.

Construct a minimum OpenMM system containing only the relevant atoms, bonds, and torsions

The trick here is to keep track of the atom mapping and compute the Cartesian positions for rotation around the single torsion, but it could also allow us to do more sophisticated things in the future.

jchodera · 2015-12-01T01:25:40Z

You might want to return phis in this function to make it easier to do numpy.random.choice (and to make sure you don't get out of sync between the probability vector and torsions).

jchodera · 2015-12-01T01:31:55Z

We can also refine your log_p computations a bit, and vectorize them.

For example, the _bond_logp currently looks like this:

        k_eq = bond.type.k*units.kilocalories_per_mole/(units.angstrom**2)
        r0 = bond.type.req*units.nanometers
        sigma = beta*2.0/np.sqrt(2.0*k_eq/k_eq.unit)
        logp = stats.distributions.norm.logpdf(r/r.unit, r0/r0.unit, sigma)
        return logp

We can use the fact that the reduced potential is u(x) = \beta (K/2) (r - r0)^2 to compute this directly in a vectorized manner:

k_eq = bond.type.k*units.kilocalories_per_mole/(units.angstrom**2)
r0 = bond.type.req*units.nanometers
return -beta*(k_eq/2.0)*(r-r0)**2

Does ParmEd store k and req as simtk.unit units natively, or do you need to attach units? It is weird that they would be listed in kcal/mol/A^2 and nm, since that seems like an inconsistent choice; OpenMM certainly uses a different unit system, and AMBER I believe uses kcal/mol and A consistently.

If you need log p(r), you would also need to compute sigma as you note:

k_eq = bond.type.k*units.kilocalories_per_mole/(units.angstrom**2)
r0 = bond.type.req*units.nanometers
sigma = unit.sqrt(1.0 / (beta * k_eq))
return -beta*(k_eq/2.0)*(r-r0)**2 - np.log(np.sqrt(2*pi)*sigma / base_distance_unit)

though note that you need this with reference to an arbitrary base_distance_unit, so it may not be useful to include this (since it is an irrelevant additive constant).

jchodera · 2015-12-01T01:34:38Z

Also, there is a bit of confusion between whether you want the log normalized probability density log p(r), log of the unnormalized probability density log q(r), or the reduced potential (negative log of unnormalized density) u(r) = - log q(r).

Here, you seem to be accumulating reduced potentials as ub_angles, but are actually accumulating log p(theta), which has the opposite sign and an irrelevant constant added.

jchodera · 2015-12-01T01:41:45Z

I would carefully check all your _log_p functions for:

unit errors and consistency of units
sign issues regarding whether you want u(x) = - log q(x) or log q(x)
eliminate use of states.distributions.norm.logpdf when the additive constants are irrelevant
factors of 2 seem to be in the wrong place; remember that harmonic bonds and angles are defined by the reduced potentials

u(r) = beta * (K/2) (r - r0)**2
u(theta) = beta * (K/2) (theta - theta0)**2

the beta seems to be on the wrong side of fractions in computing [sigma], so check that

sigma = unit.sqrt(1 / (beta * k))

jchodera · 2015-12-01T01:42:56Z

You are returning q(phi) instead of log q(phi) in _torsion_logp here:

        q = np.exp(-beta*(V/2.0)*(1+np.cos(n*internal_coordinates[2]-gamma)))
        return q

pgrinaway · 2015-12-01T15:56:47Z

Why use a rejection sampler here?

Good point. I had originally done that because I was imagining sampling from a continuous distribution, but it works just as well and is simpler to just treat it as a categorical variable.

Does ParmEd store k and req as simtk.unit units natively, or do you need to attach units?

I needed to attach units, and I found those units in the documentation.

Thanks for the rest of the pointers--you're right, those are errors.

I think this code is fairly disorganized and difficult to test, as you've pointed out. I think from your suggestions, the idea of using OpenMM to calculate these potentials is my favorite in the end (it enables a lot of extra stuff, and outsources calculation of the potential).

jchodera · 2015-12-01T16:32:50Z

Let's not do this quite yet, but thinking ahead to what we would do if we wanted to do this in OpenMM:

I bet the easiest way to do this is to use ParmEd where we keep the Topology object (and hence atom numbers) the same, but only include a subset of the force term definitions:

use ParmEd to make a deep copy of the Structure object
eliminate all the force terms
add back only the forcefield terms involving bonds, angles, and torsions for the atom to be placed and atoms that already have their positions defined
use createSystem() to create an OpenMM system
crank the new atom around its torsion orbit to evaluate energies with the Reference platform

jchodera · 2015-12-01T16:41:37Z

I needed to attach units, and I found those units in the documentation.

It's a bit scattered in terms of what units are in play, but looking at the omm_* family of functions will help us figure out the correct units to attach, e.g. omm_bond_force():

length_conv = u.angstroms.conversion_factor_to(u.nanometers)
        _ambfrc = u.kilocalorie_per_mole/u.angstrom**2
        _ommfrc = u.kilojoule_per_mole/u.nanometer**2
        frc_conv = _ambfrc.conversion_factor_to(_ommfrc)

More generally, I believe AMBER uses akma_unit_system: http://parmed.github.io/ParmEd/html/dimensional_analysis.html#available-unitsystem-options

lengths are in angstroms
energies are in kcal/mol

I wonder if there is a way to retrieve parameters with units attached automatically...

pgrinaway · 2015-12-16T20:04:45Z

The refactors that are pending are covered in other issues.

pgrinaway added the bug 🐛 label Nov 30, 2015

pgrinaway self-assigned this Nov 30, 2015

pgrinaway mentioned this issue Dec 2, 2015

Fix remaining GeometryEngine issues #51

Merged

pgrinaway closed this as completed Dec 16, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remaining issues in the GeometryEngine stuff #49

Remaining issues in the GeometryEngine stuff #49

pgrinaway commented Nov 30, 2015

jchodera commented Dec 1, 2015

jchodera commented Dec 1, 2015

jchodera commented Dec 1, 2015

jchodera commented Dec 1, 2015

jchodera commented Dec 1, 2015

jchodera commented Dec 1, 2015

jchodera commented Dec 1, 2015

jchodera commented Dec 1, 2015

pgrinaway commented Dec 1, 2015

jchodera commented Dec 1, 2015

jchodera commented Dec 1, 2015

pgrinaway commented Dec 16, 2015

Remaining issues in the GeometryEngine stuff #49

Remaining issues in the GeometryEngine stuff #49

Comments

pgrinaway commented Nov 30, 2015

jchodera commented Dec 1, 2015

jchodera commented Dec 1, 2015

jchodera commented Dec 1, 2015

Compile a list of all torsions and angles and vectorize computations with numpy

Code the energy terms in Cython

Construct a minimum OpenMM system containing only the relevant atoms, bonds, and torsions

jchodera commented Dec 1, 2015

jchodera commented Dec 1, 2015

jchodera commented Dec 1, 2015

jchodera commented Dec 1, 2015

jchodera commented Dec 1, 2015

pgrinaway commented Dec 1, 2015

jchodera commented Dec 1, 2015

jchodera commented Dec 1, 2015

pgrinaway commented Dec 16, 2015