# Molecule substructural alignment

Sometimes we want to align molecules that share only part of their 
structure, such as ligands in a congeneric series, where a common scaffold 
(core) is present in every ligand of the series.
We can perform this alignment with the HTMD function maximalSubstructureAlignment.

In this example, we use two ligands that bind to the beta-adrenergic receptor.
They share part of their scaffold and bind in the same pocket.
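To make the idea of a shared scaffold concrete, here is a minimal brute-force sketch of finding the largest common substructure between two tiny, hypothetical molecules described as element lists plus bond lists. It is only an illustration under simplifying assumptions (exhaustive search, no requirement that the matched atoms are connected); it is not the algorithm HTMD uses, and the function name `max_common_substructure` is invented for this example.

```python
def max_common_substructure(elems_a, bonds_a, elems_b, bonds_b):
    """Return the largest atom mapping {index_in_a: index_in_b} such that
    mapped atoms have the same element and the same bonding pattern."""
    adj_a = {frozenset(b) for b in bonds_a}
    adj_b = {frozenset(b) for b in bonds_b}
    best = {}

    def compatible(mapping, i, j):
        # Elements must match ...
        if elems_a[i] != elems_b[j]:
            return False
        # ... and every already-mapped pair must agree on bonded / not bonded.
        for p, q in mapping.items():
            if (frozenset((i, p)) in adj_a) != (frozenset((j, q)) in adj_b):
                return False
        return True

    def extend(i, mapping):
        nonlocal best
        # Prune: mapping all remaining atoms still could not beat the best.
        if len(mapping) + (len(elems_a) - i) <= len(best):
            return
        if i == len(elems_a):
            best = dict(mapping)  # only reached when strictly larger
            return
        used = set(mapping.values())
        for j in range(len(elems_b)):
            if j not in used and compatible(mapping, i, j):
                mapping[i] = j
                extend(i + 1, mapping)
                del mapping[i]
        extend(i + 1, mapping)  # also try leaving atom i unmatched

    extend(0, {})
    return best


# Hypothetical toy molecules: N-C-C-O and C-C-C-O (heavy atoms only).
elems_a, bonds_a = ['N', 'C', 'C', 'O'], [(0, 1), (1, 2), (2, 3)]
elems_b, bonds_b = ['C', 'C', 'C', 'O'], [(0, 1), (1, 2), (2, 3)]

core = max_common_substructure(elems_a, bonds_a, elems_b, bonds_b)
print(core)  # shared C-C-O fragment
```

The resulting mapping pairs up the atoms of the common core; those pairs are what a substructure alignment then superimposes.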

## Quick Example

In [1]:
from htmd.ui import *
from htmd.molecule.graphalignment import maximalSubstructureAlignment
config(viewer='ngl')

# Load the structure that provides the reference ligand: PDB 2Y02, resname WHJ
reference_crystal = Molecule('2Y02')
reference_crystal.filter('protein or resname WHJ')
reference_ligand = reference_crystal.copy()
# Extract the ligand from the protein
reference_ligand.filter('resname WHJ')
# There are two ligands with the same resname, so we select one of them
reference_ligand.filter('residue 1')

# Extract the other ligand from this congeneric series: PDB 2Y03, resname 5FW
ligand_to_align = Molecule('2Y03')
ligand_to_align.filter('resname 5FW')
# Pick one of the two copies
ligand_to_align.filter('residue 0')

# Align the ligand onto the reference ligand
lig_aligned = maximalSubstructureAlignment(reference_ligand, ligand_to_align)

# Show the result, with the reference ligand in Lines style (narrower bonds and atoms)
show_result = reference_ligand.copy()
show_result.append(lig_aligned)
show_result.append(reference_crystal)
show_result.view(sel='resname WHJ and residue 0', style='Lines', hold=True)
show_result.view(sel='resname 5FW', style='Licorice')


Please cite HTMD: Doerr et al.(2016)JCTC,12,1845. 
https://dx.doi.org/10.1021/acs.jctc.6b00049
Documentation: http://software.acellera.com/
To update: conda update htmd -c acellera -c psi4

You are on the latest HTMD version (1.13.6).



2018-07-25 13:27:31,979 - htmd.molecule.readers - INFO - Attempting PDB query for 2Y02
2018-07-25 13:27:33,537 - htmd.molecule.molecule - INFO - Removed 370 atoms. 4694 atoms remaining in the molecule.
2018-07-25 13:27:33,641 - htmd.molecule.molecule - INFO - Removed 4640 atoms. 54 atoms remaining in the molecule.
2018-07-25 13:27:33,645 - htmd.molecule.molecule - INFO - Removed 27 atoms. 27 atoms remaining in the molecule.
2018-07-25 13:27:33,647 - htmd.molecule.readers - INFO - Attempting PDB query for 2Y03
2018-07-25 13:27:34,757 - htmd.molecule.molecule - INFO - Removed 4832 atoms. 30 atoms remaining in the molecule.
2018-07-25 13:27:34,760 - htmd.molecule.molecule - INFO - Removed 15 atoms. 15 atoms remaining in the molecule.


TypeError: No matching definition for argument type(s) array(float32, 3d, C), array(float32, 3d, C), reflected list(int64), reflected list(int64), array(int64, 1d, C), int64

## Detailed explanation

First, we load the molecule that will be used as the reference: the ligand with resname WHJ, co-crystallized in PDB entry 2Y02.

In [2]:
from htmd.ui import *
from htmd.molecule.graphalignment import maximalSubstructureAlignment
config(viewer='ngl')
reference_crystal = Molecule('2Y02')
reference_crystal.filter('protein or resname WHJ')
reference_ligand = reference_crystal.copy()
# Extract the ligand from the protein
reference_ligand.filter('resname WHJ')
# There are two ligands with the same resname, so we select one of them
reference_ligand.filter('residue 1')

2018-07-25 12:26:07,260 - htmd.molecule.readers - INFO - Attempting PDB query for 2Y02
2018-07-25 12:26:08,586 - htmd.molecule.molecule - INFO - Removed 370 atoms. 4694 atoms remaining in the molecule.
2018-07-25 12:26:08,689 - htmd.molecule.molecule - INFO - Removed 4640 atoms. 54 atoms remaining in the molecule.
2018-07-25 12:26:08,693 - htmd.molecule.molecule - INFO - Removed 27 atoms. 27 atoms remaining in the molecule.


array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26], dtype=int32)

Now, let's extract the other ligand from this congeneric series: the ligand with resname 5FW, co-crystallized in PDB entry 2Y03.

In [3]:
ligand_to_align = Molecule('2Y03')
ligand_to_align.filter('resname 5FW')
# Again, there are two residues with the same name. Residue 1 happens to be
# already aligned with the reference ligand, so we use residue 0
ligand_to_align.filter('residue 0')

2018-07-25 12:26:10,732 - htmd.molecule.readers - INFO - Attempting PDB query for 2Y03
2018-07-25 12:26:11,979 - htmd.molecule.molecule - INFO - Removed 4832 atoms. 30 atoms remaining in the molecule.
2018-07-25 12:26:11,985 - htmd.molecule.molecule - INFO - Removed 15 atoms. 15 atoms remaining in the molecule.


array([15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
      dtype=int32)
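The array printed after each filter call is consistent with the indices of the atoms that were removed: here atoms 15 to 29 are dropped, leaving the 15 atoms of residue 0. The plain NumPy sketch below reproduces that bookkeeping on made-up data. It is not HTMD code, and the variable names are hypothetical.

```python
import numpy as np

# Hypothetical stand-in for the 30 atoms left after selecting resname 5FW:
# two 15-atom copies of the ligand, labelled residue 0 and residue 1.
n_atoms = 30
residue = np.repeat([0, 1], 15)

# Keep only residue 0, and record which atom indices were discarded.
keep = residue == 0
removed = np.where(~keep)[0]
remaining = n_atoms - removed.size

print(f"Removed {removed.size} atoms. {remaining} atoms remaining.")
print(removed)
```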

Let's take a look at both molecules as they are now. The reference ligand is displayed with narrower bonds.

In [4]:
template_and_ligand = reference_ligand.copy()
template_and_ligand.append(ligand_to_align)
template_and_ligand.view(sel='resname WHJ', style='Lines', hold=True)
template_and_ligand.view(sel='resname 5FW',style='Licorice')


Now we align the extracted ligand to the reference ligand taken from the crystal structure, and then view the result. Again, the reference ligand is displayed with narrower bonds.

In [6]:
# Align the ligand onto the reference ligand
lig_aligned = maximalSubstructureAlignment(reference_ligand, ligand_to_align)
# Combine reference ligand, aligned ligand and crystal structure for viewing
show_result = reference_ligand.copy()
show_result.append(lig_aligned)
show_result.append(reference_crystal)
show_result.view(sel='resname WHJ and residue 0', style='Lines', hold=True)
show_result.view(sel='resname 5FW', style='Licorice')

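Once the shared atoms are known, the alignment itself amounts to a rigid-body fit computed on those atoms only and then applied to the whole ligand. The sketch below illustrates this with the Kabsch algorithm in NumPy on made-up coordinates. It is an assumption about how such a fit can be done, not a description of HTMD internals; `kabsch_fit`, the coordinates and the index lists are all hypothetical.

```python
import numpy as np


def kabsch_fit(mobile, reference):
    """Rotation R and translation t that best superimpose `mobile` onto
    `reference` (both N x 3, rows paired atom by atom), in the least-squares sense."""
    mob_center = mobile.mean(axis=0)
    ref_center = reference.mean(axis=0)
    H = (mobile - mob_center).T @ (reference - ref_center)  # 3 x 3 covariance
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = ref_center - R @ mob_center
    return R, t


# Made-up reference ligand: a 4-atom core plus one substituent atom.
reference = np.array([[0.0, 0.0, 0.0],
                      [1.5, 0.0, 0.0],
                      [2.2, 1.3, 0.0],
                      [1.4, 2.4, 0.5],
                      [3.6, 1.5, -0.4]])

# Made-up ligand to align: same core in a different pose, different substituent.
angle = np.pi / 3
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle),  np.cos(angle), 0.0],
                  [0.0,            0.0,           1.0]])
core_moved = reference[:4] @ rot_z.T + np.array([5.0, -2.0, 1.0])
mobile = np.vstack([core_moved, [[9.0, 0.0, 2.0]]])

# Atom pairs that make up the common substructure (assumed known here).
ref_idx = [0, 1, 2, 3]
mob_idx = [0, 1, 2, 3]

# Fit on the shared atoms only, then move every atom of the ligand.
R, t = kabsch_fit(mobile[mob_idx], reference[ref_idx])
aligned = mobile @ R.T + t

core_rmsd = np.sqrt(((aligned[mob_idx] - reference[ref_idx]) ** 2).sum(axis=1).mean())
print(f"RMSD over the shared core: {core_rmsd:.6f}")
```

Fitting on the common atoms and transforming all atoms means the differing substituents do not distort the superposition of the core.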