# Protein-protein interactions

This notebook shows how to compute an interaction fingerprint (IFP) for protein-protein interactions (PPI).

Here we will investigate the interactions in a G-protein coupled receptor (GPCR) between a particular helix (called TM3) and the rest of the protein.

The same approach applies to proteins that belong to different chains or segments, as long as you can write an appropriate [MDAnalysis selection](https://docs.mdanalysis.org/stable/documentation_pages/selections.html) for each partner.

The end of this tutorial also shows how to generate a protein-protein interaction fingerprint that ignores the backbone.

In [None]:
import MDAnalysis as mda
import prolif as plf

In [None]:
# load the trajectory, then split the protein into TM3 and everything else
u = mda.Universe(plf.datafiles.TOP, plf.datafiles.TRAJ)
tm3 = u.select_atoms("resid 119:152")
prot = u.select_atoms("protein and not group tm3", tm3=tm3)
tm3, prot

In [None]:
# compute protein-protein interactions on every 10th frame
fp = plf.Fingerprint(["HBDonor", "HBAcceptor", "PiStacking", "PiCation", "CationPi", "Anionic", "Cationic"])
fp.run(u.trajectory[::10], tm3, prot)

In [None]:
df = fp.to_dataframe()
df.head()

In [None]:
# show interactions for a specific TM3 residue (TM3 is stored in the "ligand" column level)
df.xs("ARG147.A", level="ligand", axis=1).head(5)

In [None]:
# same for a residue from the rest of the protein (the "protein" column level)
df.xs("GLU309.B", level="protein", axis=1).head(5)

In [None]:
# display a specific type of interaction
df.xs("Cationic", level="interaction", axis=1).head(5)
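
The three `xs` calls above all slice the same column MultiIndex, whose levels are named `ligand`, `protein` and `interaction`. As a minimal sketch of what such a slice returns, the toy table below is built by hand: the values and the `TYR140.A`/`PHE200.B` labels are made up for illustration and are not taken from the trajectory.

```python
import pandas as pd

# Hypothetical columns mimicking the (ligand, protein, interaction) layout
columns = pd.MultiIndex.from_tuples(
    [
        ("ARG147.A", "GLU309.B", "Cationic"),
        ("ARG147.A", "GLU309.B", "HBDonor"),
        ("TYR140.A", "PHE200.B", "PiStacking"),
    ],
    names=["ligand", "protein", "interaction"],
)

# One row per analysed frame, one boolean per interaction (made-up values)
toy = pd.DataFrame(
    [[True, False, False], [True, True, False], [False, True, True]],
    columns=columns,
    index=pd.Index([0, 10, 20], name="Frame"),
)

# Selecting on one level drops that level and keeps the two others
by_ligand = toy.xs("ARG147.A", level="ligand", axis=1)
cationic = toy.xs("Cationic", level="interaction", axis=1)
```

Here `by_ligand` keeps the `protein` and `interaction` levels, while `cationic` keeps `ligand` and `protein`, so each remaining column still identifies which residues are involved.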

In [None]:
# calculate the occurrence of each interaction over the trajectory
occ = df.mean()
# keep only the interactions present in more than 30% of the frames
occ.loc[occ > 0.3]

In [None]:
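# pool all interaction types for each residue pair, then compute the occurrence
# (grouping on the transposed table avoids the deprecated `axis=1` argument of groupby)
g = (df.T.groupby(level=["ligand", "protein"])
       .sum()
       .astype(bool)
       .mean(axis=1))
g.loc[g > 0.3]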
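
To make the logic of the two occurrence calculations above explicit, here is a minimal pure-Python sketch on hand-written data. It is not how ProLIF or pandas implement it: the `toy_ifp` dictionary, its values and the `TYR140.A`/`PHE200.B` labels are assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: one boolean per frame for each
# (ligand residue, protein residue, interaction type) column
toy_ifp = {
    ("ARG147.A", "GLU309.B", "Cationic"): [True, True, False, True],
    ("ARG147.A", "GLU309.B", "HBDonor"): [False, True, True, True],
    ("TYR140.A", "PHE200.B", "PiStacking"): [False, False, True, False],
}


def occurrence(frames):
    """Fraction of frames in which the interaction is present."""
    return sum(frames) / len(frames)


# Per-interaction occurrence, the analogue of df.mean()
per_interaction = {key: occurrence(frames) for key, frames in toy_ifp.items()}

# Collect the columns that belong to the same residue pair
by_pair = defaultdict(list)
for (ligand, protein, _interaction), frames in toy_ifp.items():
    by_pair[(ligand, protein)].append(frames)

# A pair interacts in a frame if at least one interaction type is present,
# which is what sum() followed by astype(bool) achieves
per_pair = {
    pair: occurrence([any(values) for values in zip(*pair_columns)])
    for pair, pair_columns in by_pair.items()
}

# Keep the residue pairs that interact in more than 30% of the frames
frequent_pairs = {pair: occ for pair, occ in per_pair.items() if occ > 0.3}
```

In this toy case each of the two ARG147.A/GLU309.B interactions is present in 3 frames out of 4, but the pair as a whole interacts in every frame, which is why the pooled occurrence can be higher than that of any single interaction type.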

## Ignoring backbone interactions

In some cases, you might want to ignore backbone interactions. It is tempting to simply change the MDAnalysis selection to `"protein and not backbone"`, but this won't work as expected: charges get added where the backbone was bonded to the sidechain.  
However, there is a temporary workaround (which will be directly included in the code in the near future):

In [None]:
from rdkit import Chem
from rdkit.Chem import AllChem
from tqdm.auto import tqdm

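# SMARTS pattern for the backbone: carbonyl C(=O), alpha carbon with its hydrogen, neutral N
backbone = Chem.MolFromSmarts("[C^2](=O)-[C;X4](-[H])-[N;+0]")
# hydrogens left without any bond once the backbone atoms are deleted
fix_h = Chem.MolFromSmarts("[H&D0]")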

def remove_backbone(atomgroup):
    mol = plf.Molecule.from_mda(atomgroup)
    mol = AllChem.DeleteSubstructs(mol, backbone)
    mol = AllChem.DeleteSubstructs(mol, fix_h)
    return plf.Molecule(mol)

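# generate the IFP frame by frame; the molecules are rebuilt inside the loop
# so that they carry the coordinates of the current frame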
ifp = []
for ts in tqdm(u.trajectory[::10]):
    tm3_mol = remove_backbone(tm3)
    prot_mol = remove_backbone(prot)
    data = fp.generate(tm3_mol, prot_mol)
    data["Frame"] = ts.frame
    ifp.append(data)
df = plf.to_dataframe(ifp, fp.interactions.keys())
df.head()