Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] How to correctly calculate the center of mass for wrapped molecules #63

Closed
lohedges opened this issue May 31, 2023 · 3 comments · Fixed by #71
Closed

[BUG] How to correctly calculate the center of mass for wrapped molecules #63

lohedges opened this issue May 31, 2023 · 3 comments · Fixed by #71
Labels
bug Something isn't working
Milestone

Comments

@lohedges
Copy link
Contributor

In trying to fix this issue I'd like to translate a molecule (or molecules) back to the box center based on their current center-of-mass. Doing this via the Sire object evaluator doesn't work if the molecule is wrapped across the boundary of any periodic box since the calculation doesn't take account of the space.

It is possible to fix this in BioSimSpace by manually computing the center of mass as shown below, but this is slow for large systems. Is it possible to get the evaluator to respect the space, or should I just add a C++ utility function to do this for me, e.g. like the other BIoSimSpace ones.

Here is a unit test that reproduces the issue:

from math import isclose as _isclose
from sire.legacy.Maths import Vector as _SireVector

import BioSimSpace as BSS


def test_com_wrapping():
    # Load the single-molecule system.
    system = BSS.IO.readMolecules("wrapped.*7")

    # Compute the centre-of-mass using Sire's evaluator.
    com_sire = system[0]._sire_object.evaluate().centerOfMass()

    # Now compute it manually, taking into account periodic boundaries.
    # We want the centre-of-mass of the molecule as a connected object.

    # Store the space.
    space = system._sire_object.property("space")

    # Get the first atom in the molecule..
    atom = system[0].getAtoms()[0]

    total_mass = atom._sire_object.property("mass").value()

    # Reference coordinate.
    ref_coord = atom._sire_object.property("coordinates")

    # Initialise the centre-of-mass.
    com_bss = total_mass * atom._sire_object.property("coordinates")

    # Sum over the rest of the atoms.
    for atom in system[0].getAtoms()[1:]:
        # Get the mass and update the total.
        mass = atom._sire_object.property("mass").value()
        total_mass += mass

        # Get the coordinate and add to the reference coord using the
        # its distance from the reference in the minimum image convention.
        coord = atom._sire_object.property("coordinates")
        coord = ref_coord + _SireVector(space.calcDistVector(ref_coord, coord))

        # Update the centre of mass.
        com_bss += mass * coord

    # Normalise.
    com_bss /= total_mass

    # Make sure the centre-of-masses are approximately the same.
    for x, y in zip(com_sire, com_bss):
        assert _isclose(x.value(), y.value())

The input files are here.

As you can see, the molecule is wrapped across the periodic boundary in the input file.

wrapped

The two center of masses that are computed are:

# Sire
( 18.4199 Å, 28.3353 Å, 3.9244 Å )

# Manual
( 31.0734 Å, 31.8911 Å, 0.871955 Å )

Note that I've needed to use the manual center-of-mass calculation when working out the correct location to place a funnel during the funnel metadynamics setup, i.e. locating the binding site of the protein, which is performed here.

I am also in the process of converting the existing ABFE restraint generation code, which uses MDAnalysis searches, over to a native Sire implementation. This performs some distance measurements along a trajectory so will also need to deal with wrapping.

Perhaps this needs to be solved by having a way to unwrap the molecules. (I think there might already be a way to do this with the new trajectory code, and I know that MDTraj has a simple way to do this.)

Let me know what you think. I'm happy to add my own function within biosimspace.cpp to achieve what I want, but was wondering if it was already possible with the existing code.

For example, to do this with mdtraj we can do:

In [1]: import mdtraj

In [2]: traj = mdtraj.load("wrapped.rst7", top="wrapped.prm7")

In [3]: 10*mdtraj.compute_center_of_mass(traj)
Out[3]: array([[18.41984188, 28.33505072,  3.92459827]])

In [4]: traj.make_molecules_whole()
Out[4]: <mdtraj.Trajectory with 1 frames, 22 atoms, 3 residues, and unitcells at 0x7f36bd391090>

In [5]: 10*mdtraj.compute_center_of_mass(traj)
Out[5]: array([[31.07343482, 31.89108144,  0.87194654]])

Cheers.

@lohedges lohedges added the bug Something isn't working label May 31, 2023
@lohedges
Copy link
Contributor Author

Oh, and I don't think this a bug, but forgot to remove the label. As shown, other packages don't get the centre-of-mass correct unless you make the molecules whole.

@lohedges
Copy link
Contributor Author

lohedges commented Jun 1, 2023

Just to say that it looks like the space is respected when performing searches. (This think this might be a recent addition.) For example:

In [1]: import sire as sr

In [2]: s = sr.load("wrapped.*7")

In [3]: s.num_atoms()
Out[3]: 22

In [4]: len(s["atoms within 6 of atomidx 11"])
Out[4]: 22

@chryswoods
Copy link
Contributor

I think you are right that sire needs a make_molecule_whole function (or equivalent). Sire's wrapping works at the CutGroup level. It won't break a CutGroup across periodic boundaries, but can break a multi-CutGroup molecule between CutGroups (hence their name). Normally molecules are loaded so that each residue is in its own CutGroup.

One challenge comes when an input file is loaded where the atoms are already split across boundaries. In this case, it could be fixed using a function as you've written in BioSimSpace that re-wraps on the atomic level, rather than the CutGroup level.

I think this may be worth adding as a check in sire on load, as the code does assume that CutGroups aren't split across periodic boundaries.

I'll take a look at your input files and will think about how to add it. It will likely be something that will be default on load, with a flag that can be passed to switch it off.

Yes, this doesn't impact distance calculations (i.e. as used in the search string) as distance calculations always use the minimum distance, so don't see that the molecule is wrapped across the boundary. I think the only place it crops up is centre of mass (or similar) where the code has assumed that all the atoms in a CutGroup are on the same side of the boundary.

lohedges added a commit to OpenBioSim/biosimspace that referenced this issue Jun 2, 2023
@chryswoods chryswoods added this to the 2023.3.0 milestone Jun 2, 2023
@chryswoods chryswoods linked a pull request Jun 8, 2023 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants