# Template for the User Tutorial


#### Developed at Volkamer Lab, Charité/FU Berlin 

by Annie Pham

# Introduction

## Matchmaker
*MatchMaker superimposes proteins by first creating pairwise sequence alignments, then fitting the aligned residue pairs.
The standard Needleman-Wunsch and Smith-Waterman algorithms are available for producing global and local sequence alignments,
and MatchMaker can identify the best-matching chains based on alignment scores.
Alignment scores can include residue similarity, secondary structure information, and gap penalties.
MatchMaker performs a spatial superposition by minimizing the RMSD.* [(MatchMaker)](https://www.cgl.ucsf.edu/chimera/docs/ContributedSoftware/matchmaker/matchmaker.html)
        

### Which structures do you choose?
This tutorial uses the structures 6HG4 and 6HG9.


# Preparation

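### How to get the structures from the CLI

To fetch the structures directly from the RCSB, pass their names (such as the IDs used in this tutorial) as arguments. Replace `YOUR_METHOD` with one of `theseus`, `mmligner`, or `matchmaker`; run with the placeholder unchanged, as in the cell below, the command only prints its usage message: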

In [1]:
!superposer --method=YOUR_METHOD NAME_OF_STRUCTURE_1 NAME_OF_STRUCTURE_2

usage: superposer [-h] [--version] [-v]
                           [--method {theseus,mmligner,matchmaker}]
                           [--method-options METHOD_OPTIONS]
                           structures [structures ...]
superposer: error: argument --method: invalid choice: 'YOUR_METHOD' (choose from 'theseus', 'mmligner', 'matchmaker')


To use locally saved structures, pass their file paths instead:

In [2]:
!superposer --method=YOUR_METHOD PATH_OF_STRUCTURE_1 PATH_OF_STRUCTURE_2

usage: superposer [-h] [--version] [-v]
                           [--method {theseus,mmligner,matchmaker}]
                           [--method-options METHOD_OPTIONS]
                           structures [structures ...]
superposer: error: argument --method: invalid choice: 'YOUR_METHOD' (choose from 'theseus', 'mmligner', 'matchmaker')
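
If you want to script the CLI call from Python, the command can be assembled as an argument list first. The helper below is only a sketch: `build_superposer_command` is a hypothetical name and not part of superposer, and passing `6HG4` and `6HG9` as structure names is an assumption based on this tutorial. It relies solely on the `--method` choices and positional `structures` arguments shown in the usage message above, and it does not run the tool.

```python
# Hypothetical helper, not part of superposer: assembles the command line
# from the options shown in the usage message above.
VALID_METHODS = ("theseus", "mmligner", "matchmaker")


def build_superposer_command(method, structures):
    """Return the argument list for a superposer call without running it."""
    if method not in VALID_METHODS:
        raise ValueError(f"invalid method {method!r}; choose from {VALID_METHODS}")
    if len(structures) < 1:
        raise ValueError("at least one structure (name or file path) is required")
    return ["superposer", f"--method={method}", *structures]


command = build_superposer_command("matchmaker", ["6HG4", "6HG9"])
print(" ".join(command))
```

The resulting list can be handed to `subprocess.run(command, check=True)` once the tool is installed.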


### Getting the structures in Python

The method takes atomium models as input (the `.model` attribute of a structure loaded with atomium).

If you want to get the structures from the RCSB, you can do the following:

In [1]:
%load_ext autoreload

In [2]:
%autoreload 2

In [3]:
import atomium

structure1 = atomium.fetch("6HG4").model
structure2 = atomium.fetch("6HG9").model

# Alignment 

`MatchMakerAligner` is a factory that configures an aligner based on UCSF Chimera's MatchMaker algorithms.
It generates pairwise sequence alignments and then matches, i.e., superimposes, the structures according to those alignments.




The algorithm follows these steps:

1. Sequence alignment -- using "biotite" and "MDAnalysis"
2. Structural superposition -- using "MDAnalysis"
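
To make the first step concrete, here is a minimal pure-Python sketch of a global (Needleman-Wunsch) alignment. It is not the biotite implementation the aligner uses: the match/mismatch scores stand in for a real substitution matrix such as BLOSUM62, the gap penalty is linear, and the function name and default values are chosen for illustration only.

```python
def needleman_wunsch(seq1, seq2, match=1, mismatch=-1, gap=-2):
    """Global alignment with a linear gap penalty; returns (score, pairs)."""
    n, m = len(seq1), len(seq2)
    # score[i][j] = best score for aligning seq1[:i] with seq2[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if seq1[i - 1] == seq2[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align the two residues
                score[i - 1][j] + gap,      # gap in seq2
                score[i][j - 1] + gap,      # gap in seq1
            )
    # Trace back from the bottom-right cell to collect aligned index pairs
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        sub = match if seq1[i - 1] == seq2[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


total, residue_pairs = needleman_wunsch("ACGT", "AGT")
print(total, residue_pairs)
```

The returned index pairs correspond to the aligned residue pairs that the second step, the structural superposition, would then fit.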
    

### Parameters for MatchMakerAligner

```alignment_strategy : str, optional, default=global```<br> 
The type of algorithm used to calculate the sequence alignment.
Choose between:

- "global" (Needleman-Wunsch)
- "local" (Smith-Waterman)
        
```alignment_matrix : str, optional, default=BLOSUM62``` <br> 
The substitution matrix used for scoring
        
```alignment_gap : int or (tuple, dtype=int), optional```<br> 
If an int is provided, the value is interpreted as a general gap penalty. If a tuple is provided, an affine gap penalty is used: the first integer in the tuple is the gap opening penalty, the second integer is the gap extension penalty. The values need to be negative. (Default: -10)
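
The difference between the two forms of `alignment_gap` can be illustrated with a small sketch. The function `gap_cost` is hypothetical and written for this tutorial only; in particular, charging the opening penalty for the first gap position and the extension penalty for each further position is an assumed convention, so check the biotite documentation for the exact behaviour.

```python
def gap_cost(length, alignment_gap=-10):
    """Total penalty for one gap of `length` consecutive positions."""
    if length < 1:
        raise ValueError("gap length must be at least 1")
    if isinstance(alignment_gap, tuple):
        opening, extension = alignment_gap
        # Assumed convention: the first gap position costs the opening
        # penalty, every further position costs the extension penalty.
        return opening + (length - 1) * extension
    # A single int is charged once per gap position (general penalty).
    return length * alignment_gap


print(gap_cost(3, -10))        # -30
print(gap_cost(3, (-10, -1)))  # -12
```

With an affine penalty, one long gap is cheaper than several short ones, which tends to keep insertions and deletions contiguous.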
       
```strict_superposition: bool, optional, default=False```<br> 
*True:* Will raise SelectionError if a single atom does not match between the two selections. <br> 
*False:* Will try to prepare a matching selection by dropping residues with non-matching atoms.
    
```superposition_selection: str or AtomGroup or None, optional,  default=None```<br> 
*None:* Apply to mobile.universe.atoms (i.e., all atoms in the context of the selection from mobile, such as the rest of a protein, ligands, and the surrounding water) <br> 
*str:* Apply to mobile.select_atoms(selection-string), e.g., "protein and name CA"<br> 
*AtomGroup:* Apply to the given arbitrary group of atoms
        
```superposition_weights: {"mass", None} or array_like, optional```<br> 
The weights used for the superposition. <br> 
*"mass":* Use the atomic masses as weights <br> 
*None:* Weigh each atom equally<br> 
*array_like:* If a float array of the same length as mobile is provided, each element is used as the weight for the corresponding atom in mobile.
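
To show what the weights do, the following sketch computes a weighted RMSD in plain Python. It is an illustration under simple assumptions (coordinates as lists of `(x, y, z)` tuples, weights normalised by their sum) and not the MDAnalysis code path; `weighted_rmsd` and the example coordinates are made up for this purpose.

```python
import math


def weighted_rmsd(coords_a, coords_b, weights=None):
    """RMSD between two equally long lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if weights is None:
        weights = [1.0] * len(coords_a)  # weigh each atom equally
    if len(weights) != len(coords_a):
        raise ValueError("need one weight per atom")
    weighted_sum = 0.0
    for a, b, w in zip(coords_a, coords_b, weights):
        # Squared distance between the paired atoms, scaled by the weight
        weighted_sum += w * sum((x - y) ** 2 for x, y in zip(a, b))
    return math.sqrt(weighted_sum / sum(weights))


mobile = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
print(weighted_rmsd(mobile, reference))
print(weighted_rmsd(mobile, reference, weights=[12.0, 1.0]))
```

In the second call the well-matched first atom carries most of the weight, so the deviation of the second atom contributes less and the reported RMSD is smaller.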

```superposition_delta_mass_tolerance: float, optional, default=0.1```<br> 
Reject a match if the atomic masses of matched atoms differ by more than this tolerance (`tol_mass` in MDAnalysis).
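
As a rough illustration of the mass check, the sketch below lists the positions at which two matched atom lists disagree by more than the tolerance. The function name and the example masses are assumptions for this tutorial; what happens with such mismatches (raising an error or dropping residues) is governed by `strict_superposition` as described above and is not modelled here.

```python
def mass_mismatches(masses_mobile, masses_reference, tolerance=0.1):
    """Return indices where paired atomic masses differ by more than tolerance."""
    if len(masses_mobile) != len(masses_reference):
        raise ValueError("both selections must contain the same number of atoms")
    return [
        index
        for index, (m1, m2) in enumerate(zip(masses_mobile, masses_reference))
        if abs(m1 - m2) > tolerance
    ]


# Carbon paired with carbon matches; nitrogen paired with oxygen does not.
mismatched = mass_mismatches([12.011, 14.007], [12.011, 15.999])
print(mismatched)  # [1]
```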

[References](<https://www.mdanalysis.org/docs/documentation_pages/analysis/align.html#MDAnalysis.analysis.align.alignto>)

*calculate:* superimposes protein or nucleic acid structures by first creating a pairwise sequence alignment, then fitting the aligned residue pairs.


In [4]:
%%time
from superposer.superposition.matchmaker import MatchMakerAligner, mda_align
from MDAnalysis.analysis.rms import rmsd
aligner = MatchMakerAligner(alignment_strategy="global", superposition_selection="name CA or name CB")
res = aligner.calculate([structure1, structure2])

CPU times: user 1.83 s, sys: 124 ms, total: 1.95 s
Wall time: 2.28 s


In [5]:
print(f'From RMSD = {res["metadata"]["initial_rmsd"]:.3f}A to optimized RMSD of {res["scores"]["rmsd"]:.3f}A')

From RMSD = 187.820A to optimized RMSD of 1.597A


# Analysis

### NGLview

If you have trouble with NGLview, follow this [troubleshooting guide](https://github.com/SBRG/ssbio/wiki/Troubleshooting#tips-for-nglview).

In [6]:
import nglview as nv
print("nglview version = {}".format(nv.__version__))
# your nglview version should be 1.1.7 or later

view = nv.show_mdanalysis(res["superposed"][0].atoms)
view.add_component(res["superposed"][1].atoms)

view

_ColormakerRegistry()

nglview version = 2.7.5


NGLWidget()
