Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MCS Ca alignment #2146

Open
pslacerda opened this issue Nov 23, 2018 · 5 comments
Open

MCS Ca alignment #2146

pslacerda opened this issue Nov 23, 2018 · 5 comments

Comments

@pslacerda
Copy link

pslacerda commented Nov 23, 2018

Is your feature request related to a problem? Please describe.
I need to align by the maximum common substructure.

Describe the solution you'd like
Corrected code:

def mcs_ca_alignto(mobile, reference, weights=None, subselection=None,
        tol_mass=0.1, strict=False, n_substructures=1,
        min_substructure_ratio=0.0):
    """Same as alignto selecting the protein maximum common substructures.
    
    Parameters
    ----------
    n_substructures : int
        Up to ``n_substructure`` common substructures are mapped between
        structures.
    min_substructure_ratio : float
        Minimum substructure size at least ``min_substructure_size``% of the
        length of the smaller structure.
    
    Returns
    -------
    old_rmsd : float
        RMSD before spatial alignment
    new_rmsd : float
        RMSD after spatial alignment
    
    """
    mobile = mobile.select_atoms('name CA')
    ref = reference.select_atoms('name CA')
    
    matcher = SequenceMatcher(None, mobile.resnames, ref.resnames)
    matches = sorted(matcher.get_matching_blocks(), key=lambda m: m.size)
    match = list(matches)[:n_substructures]
    
    min_length = min(len(mobile.atoms), len(ref_atoms.atoms))
    min_legnth_size = int(min_length * min_substructure_ratio)

    weights = get_weights(ref_atoms, weights

    mobile_atoms = []
    ref_atoms = []
    for match in matches[:n_substructures]:
        if len(match.size) < min_legnth_ratio:
            break
        mobile_atoms.extend(mobile.atoms[match.a:match.size])
        ref_atoms.extend(ref.atoms[match.b:match.size])
    if len(mobile_atoms) == 0:
        raise ValueError("Maximum common substructure smaller than "
                         "min_substructure_ratio")
    mobile_atoms = AtomGroup(mobile_atoms)
    ref_atoms = AtomGroup(ref_atoms)

    mobile_com = mobile_atoms.center(weights)
    ref_com = ref_atoms.center(weights)
    
    mobile_coordinates = mobile_atoms.positions - mobile_com
    ref_coordinates = ref_atoms.positions - ref_com

    old_rmsd = rms.rmsd(mobile_coordinates, ref_coordinates, weights)
    
    if subselection is None:
        # mobile_atoms is Universe
        mobile_atoms = mobile.universe.atoms
    elif isinstance(subselection, string_types):
        # select mobile_atoms from string
        mobile_atoms = mobile.select_atoms(subselection)
    else:
        try:
            # treat subselection as AtomGroup
            mobile_atoms = subselection.atoms
        except AttributeError:
            raise TypeError("subselection must be a selection string, an"
                            " AtomGroup or Universe or None")

    ref_coordinates = ref_atoms.positions - ref_com
    mobile_coordinates = mobile_atoms.positions - mobile_com
    
    # _fit_to DOES subtract center of mass, will provide proper min_rmsd
    mobile_atoms, new_rmsd = _fit_to(mobile_coordinates, ref_coordinates,
                                     mobile_atoms, mobile_com, ref_com,
                                     weights=weights)
    return old_rmsd, new_rmsd

Describe alternatives you've considered
Implemented a MCS aligner.

@orbeckst
Copy link
Member

Is SequenceMatcher existing code?

If you already have the code then you could submit a PR. It seems to be a natural fit for the align module.

@pslacerda
Copy link
Author

Hi @orbeckst

SequenceMatcher is a useful and built in Python class. The idea here was to align mutiple identical blocks of length above certain percentage threshold. There is lots of alignment approachs and this one is simple and objective.

@orbeckst
Copy link
Member

Oh, cool. Does it exist in Python 2.7?

Please create a PR!

@pslacerda
Copy link
Author

pslacerda commented Nov 27, 2018

Yes, it is in 2.7. I found searching for "search maximum lengh substring" or somewhat.

Sorry my touchscreen is crapware, I closed by mistake!

@pslacerda pslacerda reopened this Nov 27, 2018
@pslacerda
Copy link
Author

@orbeckst you belong here too, nice. That issue at gromacswrapper is still opened? I'm coming back to school and want (and need to give back) to contribute. That issue is on my list.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants