Pocket matching is the process of pairing up pockets across different structures of the same or a similar protein, for example the apo and holo structures of a receptor. Finding corresponding pockets lets us compare the binding region between structures from a pocket-centric view.

Pocket matching uses average-linkage clustering on a distance matrix built from the Jaccard distance between the sets of pocket lining atoms. Atoms are compared by their index in the protein structure, so if the two proteins differ in atom composition you need to pre-align them so that the same index refers to the same atom in both.
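
To make the distance concrete, here is a minimal sketch of the Jaccard distance between two sets of lining atom indices. The function name `jaccard_distance` and the toy index lists are made up for illustration and are not part of alphaspace.

```python
def jaccard_distance(atoms_a, atoms_b):
    """Return 1 - |A & B| / |A | B| for two collections of atom indices."""
    set_a, set_b = set(atoms_a), set(atoms_b)
    union = set_a | set_b
    if not union:
        # Assumption: two empty pockets are treated as identical
        return 0.0
    return 1.0 - len(set_a & set_b) / len(union)


# Hypothetical lining atom indices of one apo pocket and one PPI pocket
pocket_apo = [10, 11, 12, 40, 41]
pocket_ppi = [11, 12, 40, 41, 42, 43]

# 4 shared atoms out of 7 distinct atoms gives a distance of 1 - 4/7
print(jaccard_distance(pocket_apo, pocket_ppi))
```

A distance of 0 means the two pockets are lined by exactly the same atoms, and a distance of 1 means they share none. This only makes sense if index 11 refers to the same atom in both structures, which is why the atom sets must be aligned first.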

Here is an example of how to compute the difference in pocket lining atoms, use it to cluster the pockets, and generate a matching table of pockets between two snapshots.

First we import the alphaspace module and load the structures from the match directory. We are using the mdm5 apo and PPI structures.

In [1]:
import alphaspace,mdtraj
import numpy as np


apo_universe = alphaspace.AS_Universe(mdtraj.load('examples/match/A.pdb'),
                                   guess_receptor_binder=False,guess_by_order=False,tag='Apo')
ppi_universe = alphaspace.AS_Universe(mdtraj.load('examples/match/AB.pdb'),
                                   guess_receptor_binder=True,guess_by_order=True,tag="PPI")

print(apo_universe.receptor)
print(ppi_universe.receptor)
print(ppi_universe.binder)

Receptor Structure with 1 frames, 468 residues, 3689 atoms
Receptor Structure with 1 frames, 467 residues, 3670 atoms
Binder Structure with 1 frames, 12 residues, 99 atoms


The two receptors differ slightly in atom composition (3689 versus 3670 atoms), so we keep only the atoms that are present in both:

In [2]:
# Build the set of (atom name, residue index) pairs shared by both receptors,
# then slice each trajectory down to those atoms
atom_set1 = {(atom.name, atom.residue.index) for atom in apo_universe.receptor.atoms}
atom_set2 = {(atom.name, atom.residue.index) for atom in ppi_universe.receptor.atoms}

shared_atoms = atom_set1 & atom_set2

for universe in (apo_universe, ppi_universe):
    # Indices of the atoms in this receptor that also exist in the other one
    kept_atom_indices = [atom.index for atom in universe.receptor.atoms
                         if (atom.name, atom.residue.index) in shared_atoms]
    universe.receptor.traj.atom_slice(atom_indices=kept_atom_indices, inplace=True)
print(apo_universe.receptor)
print(ppi_universe.receptor)

Receptor Structure with 1 frames, 467 residues, 3667 atoms
Receptor Structure with 1 frames, 467 residues, 3667 atoms


Now we run AlphaSpace on both proteins. Note that because the apo structure has no binder, you cannot use the *screen_by_lig_cntct* option.

In [3]:
apo_universe.config.screen_by_face = False
apo_universe.config.screen_by_lig_cntct = False

ppi_universe.config.screen_by_face = False
ppi_universe.config.screen_by_lig_cntct = False

apo_universe.run_alphaspace()
ppi_universe.run_alphaspace()

0 snapshot processed


0 snapshot processed


As before, you can access the pockets by iterating over the result of the universe object's .pockets() method. The lining atom indices of each pocket are available as follows:

In [5]:
for pocket in apo_universe.pockets():
    print(pocket.lining_atoms_idx)
    print([atom.name for atom in pocket.lining_atoms])
    break

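The Jaccard similarity of two pockets is the number of lining atoms they share divided by the total number of distinct lining atoms in both. We use each pocket's jaccard_similarity method to fill the distance matrix and then cluster the pockets: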

In [6]:
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform


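# Pool the pockets of both structures (snapshot 0, including inactive pockets) into one list
pocket_list = (list(apo_universe.pockets(snapshot_idx=0, active_only=False))
               + list(ppi_universe.pockets(snapshot_idx=0, active_only=False)))

# Default every pair to a large distance (10); pairs from the same structure keep
# this value, which keeps them apart at the 0.75 cutoff used below
pocket_diff_matrix = np.ones((len(pocket_list), len(pocket_list))) * 10

for i, pocket1 in enumerate(pocket_list):
    for j, pocket2 in enumerate(pocket_list):
        # Only pairs from different structures get a Jaccard distance (1 - similarity)
        if pocket1.parent_structure.universe.tag != pocket2.parent_structure.universe.tag and i > j:
            pocket_diff_matrix[i][j] = pocket_diff_matrix[j][i] = 1 - pocket1.jaccard_similarity(pocket2)
        if i == j:
            pocket_diff_matrix[i][i] = 0

# Average-linkage clustering on the condensed distance matrix, cut at a distance of 0.75
zmat = linkage(squareform(pocket_diff_matrix), method='average')
pocket_cluster_indices = fcluster(zmat, 0.75, criterion='distance') - 1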


matched_pockets = [[] for _ in range(max(pocket_cluster_indices)+1)]
for pocket_index, pocket_cluster_index in enumerate(pocket_cluster_indices):
    matched_pockets[pocket_cluster_index].append(pocket_list[pocket_index])
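
To see how the large same-structure distance and the 0.75 cutoff interact, here is a small self-contained sketch on toy data. The pocket names and cross-structure distances are invented for illustration; only the SciPy calls are the same ones used in the cell above.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# Toy data (hypothetical): two apo pockets followed by two PPI pockets
names = ["A0", "A1", "P0", "P1"]
# Jaccard distances for the cross-structure pairs only, keyed by (row, column)
cross_distance = {(0, 2): 0.2, (0, 3): 1.0, (1, 2): 1.0, (1, 3): 0.9}

n = len(names)
dist = np.full((n, n), 10.0)   # same-structure pairs stay at the penalty value
np.fill_diagonal(dist, 0.0)    # squareform requires a zero diagonal
for (i, j), d in cross_distance.items():
    dist[i, j] = dist[j, i] = d

zmat = linkage(squareform(dist), method="average")
labels = fcluster(zmat, 0.75, criterion="distance") - 1

# Group pocket names by cluster label
clusters = [[] for _ in range(labels.max() + 1)]
for idx, label in enumerate(labels):
    clusters[label].append(names[idx])
print(clusters)
```

Here A0 and P0 merge at 0.2 and end up in one cluster, while A1 and P1 (distance 0.9) stay unmatched. Adding a third pocket to an existing pair always brings in at least one same-structure distance of 10, so the average linkage distance of that merge is at least 5. With this cutoff a cluster should therefore contain either a single pocket or one pocket from each structure.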


From here you can access the pairs of pockets that have been matched:

In [8]:

print(len(matched_pockets))
for pockets in matched_pockets:
    if len(pockets) == 2:
        print(pockets[0].parent_structure.universe.tag,pockets[1].parent_structure.universe.tag)
    else:
        print(pockets[0].parent_structure.universe.tag)

325
Apo PPI
Apo PPI
Apo PPI
Apo
PPI
Apo
PPI
Apo PPI
Apo PPI
Apo
PPI
Apo PPI
Apo PPI
Apo PPI
Apo
PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo
PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo
PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo
PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo
PPI
Apo PPI
Apo
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo
PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo
PPI
Apo PPI
Apo PPI
Apo PPI
Apo
PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo PPI
Apo
PPI
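
The clusters can also be arranged as a matching table with one column per structure. The sketch below is one possible way to do that, using plain `(tag, pocket_id)` tuples as stand-ins for pocket objects. The helper `build_match_table` and the toy pocket ids are hypothetical and not part of alphaspace.

```python
def build_match_table(clusters, tags=("Apo", "PPI")):
    """Turn clusters of (tag, pocket_id) tuples into rows with one column per tag.

    A column is None when the cluster holds no pocket from that structure.
    """
    rows = []
    for cluster in clusters:
        row = {tag: None for tag in tags}
        for tag, pocket_id in cluster:
            row[tag] = pocket_id
        rows.append(tuple(row[tag] for tag in tags))
    return rows


# Hypothetical clusters: one matched pair and two unmatched pockets
toy_clusters = [[("Apo", 0), ("PPI", 3)], [("Apo", 1)], [("PPI", 5)]]
table = build_match_table(toy_clusters)
for apo_id, ppi_id in table:
    print(apo_id, ppi_id)
```

With real data, each tuple could be built from `pocket.parent_structure.universe.tag` and whatever identifier you use for a pocket, for example its position in `pocket_list`.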
