New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MCS Ca alignment #2146

Open
pslacerda opened this Issue Nov 23, 2018 · 5 comments

Comments

Projects
None yet
2 participants
@pslacerda

pslacerda commented Nov 23, 2018

Is your feature request related to a problem? Please describe.
I need to align by the maximum common substructure.

Describe the solution you'd like
Corrected code:

def mcs_ca_alignto(mobile, reference, weights=None, subselection=None,
        tol_mass=0.1, strict=False, n_substructures=1,
        min_substructure_ratio=0.0):
    """Same as alignto selecting the protein maximum common substructures.
    
    Parameters
    ----------
    n_substructures : int
        Up to ``n_substructure`` common substructures are mapped between
        structures.
    min_substructure_ratio : float
        Minimum substructure size at least ``min_substructure_size``% of the
        length of the smaller structure.
    
    Returns
    -------
    old_rmsd : float
        RMSD before spatial alignment
    new_rmsd : float
        RMSD after spatial alignment
    
    """
    mobile = mobile.select_atoms('name CA')
    ref = reference.select_atoms('name CA')
    
    matcher = SequenceMatcher(None, mobile.resnames, ref.resnames)
    matches = sorted(matcher.get_matching_blocks(), key=lambda m: m.size)
    match = list(matches)[:n_substructures]
    
    min_length = min(len(mobile.atoms), len(ref_atoms.atoms))
    min_legnth_size = int(min_length * min_substructure_ratio)

    weights = get_weights(ref_atoms, weights

    mobile_atoms = []
    ref_atoms = []
    for match in matches[:n_substructures]:
        if len(match.size) < min_legnth_ratio:
            break
        mobile_atoms.extend(mobile.atoms[match.a:match.size])
        ref_atoms.extend(ref.atoms[match.b:match.size])
    if len(mobile_atoms) == 0:
        raise ValueError("Maximum common substructure smaller than "
                         "min_substructure_ratio")
    mobile_atoms = AtomGroup(mobile_atoms)
    ref_atoms = AtomGroup(ref_atoms)

    mobile_com = mobile_atoms.center(weights)
    ref_com = ref_atoms.center(weights)
    
    mobile_coordinates = mobile_atoms.positions - mobile_com
    ref_coordinates = ref_atoms.positions - ref_com

    old_rmsd = rms.rmsd(mobile_coordinates, ref_coordinates, weights)
    
    if subselection is None:
        # mobile_atoms is Universe
        mobile_atoms = mobile.universe.atoms
    elif isinstance(subselection, string_types):
        # select mobile_atoms from string
        mobile_atoms = mobile.select_atoms(subselection)
    else:
        try:
            # treat subselection as AtomGroup
            mobile_atoms = subselection.atoms
        except AttributeError:
            raise TypeError("subselection must be a selection string, an"
                            " AtomGroup or Universe or None")

    ref_coordinates = ref_atoms.positions - ref_com
    mobile_coordinates = mobile_atoms.positions - mobile_com
    
    # _fit_to DOES subtract center of mass, will provide proper min_rmsd
    mobile_atoms, new_rmsd = _fit_to(mobile_coordinates, ref_coordinates,
                                     mobile_atoms, mobile_com, ref_com,
                                     weights=weights)
    return old_rmsd, new_rmsd

Describe alternatives you've considered
Implemented a MCS aligner.

@orbeckst

This comment has been minimized.

Member

orbeckst commented Nov 27, 2018

Is SequenceMatcher existing code?

If you already have the code then you could submit a PR. It seems to be a natural fit for the align module.

@pslacerda

This comment has been minimized.

pslacerda commented Nov 27, 2018

Hi @orbeckst

SequenceMatcher is a useful and built in Python class. The idea here was to align mutiple identical blocks of length above certain percentage threshold. There is lots of alignment approachs and this one is simple and objective.

@orbeckst

This comment has been minimized.

Member

orbeckst commented Nov 27, 2018

Oh, cool. Does it exist in Python 2.7?

Please create a PR!

@pslacerda

This comment has been minimized.

pslacerda commented Nov 27, 2018

Yes, it is in 2.7. I found searching for "search maximum lengh substring" or somewhat.

Sorry my touchscreen is crapware, I closed by mistake!

@pslacerda pslacerda closed this Nov 27, 2018

@pslacerda pslacerda reopened this Nov 27, 2018

@pslacerda

This comment has been minimized.

pslacerda commented Nov 27, 2018

@orbeckst you belong here too, nice. That issue at gromacswrapper is still opened? I'm coming back to school and want (and need to give back) to contribute. That issue is on my list.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment