---
badges: true
author: "Samdani Ansar"
categories:
- Cheminformatics
date: '2023-09-04'
title: Descriptors calculation cheatsheet
description: Cheatsheet for different modules used for descriptor calculation
toc: true
image: images/descriptor.png

---

This notebook introduces tools that can be used to calculate descriptors for molecules.

To run this notebook in Google Colab, click the badge: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/samdani1593/samdani1593.github.io/blob/main/posts/2023-09-04-Descriptors-calculation-cheatsheet.ipynb)



In this post we will see how to use PLIP to calculate the following interactions:
* protein-ligand
* protein-peptide/protein-protein
* intra-protein
* protein-DNA

All demonstrations in this post use PLIP's Python functions.

In [None]:
#@title *Install PLIP*
!conda install -c conda-forge plip -y

In [None]:
#@title *Download Input files*
%%bash
mkdir -p data  # -p avoids an error if the directory already exists
cd data
DATA_DIR_PATH='https://raw.githubusercontent.com/samdani1593/samdani1593.github.io/main/posts/data'
for f in '2etr.pdb' '6fbx.pdb' '4x9j.pdb'
do
    wget "$DATA_DIR_PATH/$f"
done

In [1]:
from plip.basic import config
from plip.structure.preparation import PDBComplex
from plip.exchange.report import BindingSiteReport
import pandas as pd


In [2]:
# Default variables used in PLIP.
# For more information on each variable, see https://github.com/pharmai/plip/blob/master/plip/basic/config.py
# dir(config)

'''
A few variables we need to consider for the different calculations:
'INTRA' # Chain ID for intra-chain (intramolecular) interactions
'NOHYDRO' # Set to True to skip adding hydrogens (protonation) by PLIP
'PEPTIDES' # Chain IDs to treat as ligands for protein-protein or protein-peptide interactions
'NOFIXFILE' # Set to True to stop PLIP from writing the fixed PDB file
'''
print('--Default values stored in PLIP--')
print('INTRA : ',config.INTRA)
print('NOHYDRO :',config.NOHYDRO)
print('PEPTIDES :',config.PEPTIDES)
print('NOFIXFILE :',config.NOFIXFILE)


--Default values stored in PLIP--
INTRA :  None
NOHYDRO : False
PEPTIDES : []
NOFIXFILE : False
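
Because these settings live on the `config` module, a value set in one cell stays in effect for every later cell until it is reset. One way to keep such changes local is a small context manager that restores the previous values on exit. The sketch below uses a `SimpleNamespace` as a stand-in for `plip.basic.config` so that it runs without PLIP installed; `temporary_settings` and `fake_config` are hypothetical names, not part of PLIP.

```python
from contextlib import contextmanager
from types import SimpleNamespace

# Stand-in for plip.basic.config, holding the defaults printed above.
fake_config = SimpleNamespace(INTRA=None, NOHYDRO=False, PEPTIDES=[], NOFIXFILE=False)


@contextmanager
def temporary_settings(cfg, **overrides):
    """Set attributes on cfg, then restore the previous values on exit."""
    previous = {name: getattr(cfg, name) for name in overrides}
    try:
        for name, value in overrides.items():
            setattr(cfg, name, value)
        yield cfg
    finally:
        for name, value in previous.items():
            setattr(cfg, name, value)


with temporary_settings(fake_config, PEPTIDES=['B'], NOFIXFILE=True):
    inside = (fake_config.PEPTIDES, fake_config.NOFIXFILE)

after = (fake_config.PEPTIDES, fake_config.NOFIXFILE)
print(inside, after)
```

The helper only reads and writes attributes, so passing the real `config` module in place of `fake_config` should work the same way, although that has not been tested here.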


# Protein-Ligand interaction

In this section we will calculate the interactions between the protein (ROCK1) and its inhibitor ligand (Y27) from PDB entry 2ETR.

In [3]:
interaction_types = {'hbond': [], 'hydrophobic': [], 'pication': [], 'pistacking': [], 'saltbridge': [],
                         'halogen': [], 'metal': [], 'waterbridge': []}

config.NOFIXFILE = True  # Do not write pdb fix file
protonate = True  # Let PLIP add hydrogens

if protonate:  # PLIP protonates the structure using Open Babel
    config.NOHYDRO = False
else:  # Skip protonation if the input PDB is already protonated
    config.NOHYDRO = True
    

# Initialize the PLIP class
pdbfile = PDBComplex()
pdbfile.load_pdb('data/2etr.pdb')  # Load PDB file

# Collect the ligands and characterize each binding site
ligands = []
for lig in pdbfile.ligands:
    ligands.append(lig)
    # To analyze only one ligand, filter on its name, chain and position:
    # if (ligname == lig.hetid) and (ligchain == lig.chain) and (lignum == lig.position):
    #   ligands.append(lig)
    # print(lig.hetid, lig.chain, lig.position)
    # Calculate the interactions for this ligand; results are stored in pdbfile.interaction_sets
    pdbfile.characterize_complex(lig)
for key, site in pdbfile.interaction_sets.items():
    report = BindingSiteReport(site)
    for interaction in interaction_types.keys():
        for interaction_data in getattr(report,interaction+'_info'):
            interaction_types[interaction].append(dict(zip(getattr(report,interaction+'_features'), interaction_data)))
interaction_types.keys()

dict_keys(['hbond', 'hydrophobic', 'pication', 'pistacking', 'saltbridge', 'halogen', 'metal', 'waterbridge'])
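
The loop above pairs each `<name>_features` attribute (the column names) with every tuple in `<name>_info` (one row per interaction). Since the same loop is repeated in the later sections, it could be wrapped in a small function. The sketch below is one possible shape for it; `rows_from_report` is a hypothetical helper, and `stub_report` is a stand-in object used so the example runs without PLIP.

```python
from types import SimpleNamespace

INTERACTION_NAMES = ('hbond', 'hydrophobic', 'pication', 'pistacking',
                     'saltbridge', 'halogen', 'metal', 'waterbridge')


def rows_from_report(report, names=INTERACTION_NAMES):
    """Return {interaction name: [row dict, ...]} for one report object."""
    rows = {name: [] for name in names}
    for name in names:
        # Missing attributes are treated as "no interactions of this type".
        features = getattr(report, name + '_features', ())
        for values in getattr(report, name + '_info', ()):
            rows[name].append(dict(zip(features, values)))
    return rows


# Stub with two attributes shaped like those read in the loop above.
stub_report = SimpleNamespace(
    hbond_features=('RESNR', 'RESTYPE', 'DIST_D-A'),
    hbond_info=[(216, 'ASP', 2.79), (156, 'MET', 3.02)],
)

rows = rows_from_report(stub_report)
print(rows['hbond'])
```

With PLIP, the function would presumably be called once per `BindingSiteReport(site)` and the resulting lists concatenated, which is what the original loop does in place.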

In [4]:
# H-bond interaction
df = pd.DataFrame(interaction_types['hbond'])
df

| | RESNR | RESTYPE | RESCHAIN | RESNR_LIG | RESTYPE_LIG | RESCHAIN_LIG | SIDECHAIN | DIST_H-A | DIST_D-A | DON_ANGLE | PROTISDON | DONORIDX | DONORTYPE | ACCEPTORIDX | ACCEPTORTYPE | LIGCOO | PROTCOO |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 216 | ASP | A | 416 | Y27 | A | True | 1.93 | 2.79 | 151.5 | True | 1735 | O3 | 6486 | N3 | (54.587, 105.396, 33.251) | (52.968, 107.442, 32.264) |
| 1 | 156 | MET | A | 416 | Y27 | A | False | 2.06 | 3.02 | 164.44 | True | 1257 | Nam | 6491 | Nar | (50.166, 98.291, 25.262) | (50.555, 97.125, 22.5) |
| 2 | 216 | ASP | A | 416 | Y27 | A | True | 1.77 | 2.79 | 176.51 | False | 6486 | N3 | 1735 | O3 | (54.587, 105.396, 33.251) | (52.968, 107.442, 32.264) |
| 3 | 216 | ASP | B | 416 | Y27 | B | True | 2.21 | 3.05 | 148.67 | True | 4993 | O3 | 6504 | N3 | (-3.983, 132.778, 25.359) | (-1.67, 130.789, 25.321) |
| 4 | 156 | MET | B | 416 | Y27 | B | False | 2.04 | 3.0 | 164.94 | True | 4515 | Nam | 6509 | Nar | (0.582, 136.368, 15.369) | (0.281, 136.308, 12.382) |
| 5 | 216 | ASP | B | 416 | Y27 | B | True | 2.12 | 3.05 | 150.82 | False | 6504 | N3 | 4993 | O3 | (-3.983, 132.778, 25.359) | (-1.67, 130.789, 25.321) |
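
Each row of the table above carries the geometry of one hydrogen bond, so the list of dictionaries can be filtered directly before or instead of building a DataFrame. The sketch below keeps only contacts under a donor-acceptor distance cutoff and above a donor angle cutoff. The cutoffs (3.0 Å and 150°) and the helper name `filter_hbonds` are illustrative assumptions, not PLIP defaults.

```python
# Rows copied from the H-bond table above (only the columns used here).
hbonds = [
    {'RESNR': 216, 'RESTYPE': 'ASP', 'RESCHAIN': 'A', 'DIST_D-A': 2.79, 'DON_ANGLE': 151.5},
    {'RESNR': 156, 'RESTYPE': 'MET', 'RESCHAIN': 'A', 'DIST_D-A': 3.02, 'DON_ANGLE': 164.44},
    {'RESNR': 216, 'RESTYPE': 'ASP', 'RESCHAIN': 'A', 'DIST_D-A': 2.79, 'DON_ANGLE': 176.51},
    {'RESNR': 216, 'RESTYPE': 'ASP', 'RESCHAIN': 'B', 'DIST_D-A': 3.05, 'DON_ANGLE': 148.67},
    {'RESNR': 156, 'RESTYPE': 'MET', 'RESCHAIN': 'B', 'DIST_D-A': 3.0, 'DON_ANGLE': 164.94},
    {'RESNR': 216, 'RESTYPE': 'ASP', 'RESCHAIN': 'B', 'DIST_D-A': 3.05, 'DON_ANGLE': 150.82},
]


def filter_hbonds(rows, max_dist=3.0, min_angle=150.0):
    """Keep rows with DIST_D-A <= max_dist (Å) and DON_ANGLE >= min_angle (degrees)."""
    return [r for r in rows
            if r['DIST_D-A'] <= max_dist and r['DON_ANGLE'] >= min_angle]


selected = filter_hbonds(hbonds)
for r in selected:
    print(r['RESTYPE'], r['RESNR'], r['RESCHAIN'], r['DIST_D-A'], r['DON_ANGLE'])
```

The same filter can be written on the DataFrame with boolean indexing; the plain-Python form is shown only because it needs no extra imports.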


In [5]:
# Hydrophobic interaction
df = pd.DataFrame(interaction_types['hydrophobic'])
df

| | RESNR | RESTYPE | RESCHAIN | RESNR_LIG | RESTYPE_LIG | RESCHAIN_LIG | DIST | LIGCARBONIDX | PROTCARBONIDX | LIGCOO | PROTCOO |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 82 | ILE | A | 416 | Y27 | A | 3.75 | 6493 | 638 | (51.021, 98.246, 27.478) | (48.842, 95.36, 28.477) |
| 1 | 90 | VAL | A | 416 | Y27 | A | 3.58 | 6483 | 693 | (50.995, 102.439, 31.452) | (47.964, 100.548, 31.649) |
| 2 | 205 | LEU | A | 416 | Y27 | A | 3.73 | 6493 | 1652 | (51.021, 98.246, 27.478) | (54.524, 98.63, 26.253) |
| 3 | 215 | ALA | A | 416 | Y27 | A | 3.84 | 6479 | 1728 | (52.867, 103.017, 29.912) | (53.479, 103.985, 26.244) |
| 4 | 368 | PHE | A | 416 | Y27 | A | 3.88 | 6493 | 2938 | (51.021, 98.246, 27.478) | (53.352, 95.236, 28.215) |
| 5 | 216 | ASP | A | 416 | Y27 | A | 3.88 | 6480 | 1733 | (53.739, 103.422, 31.104) | (52.782, 106.993, 29.932) |
| 6 | 82 | ILE | B | 416 | Y27 | B | 3.75 | 6507 | 3892 | (-0.477, 137.181, 17.334) | (1.381, 140.433, 17.271) |
| 7 | 90 | VAL | B | 416 | Y27 | B | 3.41 | 6501 | 3947 | (-0.298, 134.71, 22.639) | (2.197, 137.037, 22.508) |
| 8 | 205 | LEU | B | 416 | Y27 | B | 3.78 | 6507 | 4910 | (-0.477, 137.181, 17.334) | (-3.852, 135.97, 16.149) |
| 9 | 216 | ASP | B | 416 | Y27 | B | 3.85 | 6499 | 4991 | (-2.261, 133.818, 23.957) | (-1.496, 130.157, 23.029) |


# Protein-peptide/ Protein-Protein interaction

In this section we will calculate the interactions between the protein (NR-13) and the peptide (BAD) from PDB entry 6FBX.

The same method can also be used to calculate protein-protein interactions by specifying the chain ID.

In [6]:
# Chain ID(s) to treat as the peptide ligand
peptides = ['B']
if peptides:
    config.PEPTIDES = peptides

interaction_types = {'hbond': [], 'hydrophobic': [], 'pication': [], 'pistacking': [], 'saltbridge': [],
                         'halogen': [], 'metal': [], 'waterbridge': []}

config.NOFIXFILE = True  # Do not write pdb fix file
protonate = True  # Let PLIP add hydrogens

if protonate:  # PLIP protonates the structure using Open Babel
    config.NOHYDRO = False
else:  # Skip protonation if the input PDB is already protonated
    config.NOHYDRO = True
    

# Initialize the PLIP class
pdbfile = PDBComplex()
pdbfile.load_pdb('data/6fbx.pdb')  # Load PDB file

# Parse and get the ligands
ligands = []
for lig in pdbfile.ligands:
    ligands.append(lig)
    pdbfile.characterize_complex(lig)
for key, site in pdbfile.interaction_sets.items():
    report = BindingSiteReport(site)
    for interaction in interaction_types.keys():
        for interaction_data in getattr(report,interaction+'_info'):
            interaction_types[interaction].append(dict(zip(getattr(report,interaction+'_features'), interaction_data)))
interaction_types.keys()


dict_keys(['hbond', 'hydrophobic', 'pication', 'pistacking', 'saltbridge', 'halogen', 'metal', 'waterbridge'])

In [7]:
# H-bond interaction
df = pd.DataFrame(interaction_types['hbond'])
df

| | RESNR | RESTYPE | RESCHAIN | RESNR_LIG | RESTYPE_LIG | RESCHAIN_LIG | SIDECHAIN | DIST_H-A | DIST_D-A | DON_ANGLE | PROTISDON | DONORIDX | DONORTYPE | ACCEPTORIDX | ACCEPTORTYPE | LIGCOO | PROTCOO |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 53 | LYS | A | 15 | ARG | B | True | 3.34 | 4.04 | 137.58 | True | 821 | N3+ | 2525 | Ng+ | (14.024, 66.582, 17.524) | (16.235, 65.645, 14.269) |
| 1 | 82 | GLY | A | 14 | ARG | B | False | 3.13 | 3.62 | 118.24 | True | 1267 | Nam | 2501 | Ng+ | (8.7, 75.748, 13.819) | (7.074, 77.599, 11.165) |
| 2 | 83 | ASP | A | 14 | ARG | B | False | 3.27 | 3.76 | 118.46 | True | 1274 | Nam | 2501 | Ng+ | (8.7, 75.748, 13.819) | (7.646, 79.313, 13.263) |
| 3 | 89 | GLY | A | 21 | ASP | B | False | 3.56 | 3.99 | 114.21 | True | 1370 | Nam | 2620 | O.co2 | (21.54, 77.535, 8.091) | (18.934, 77.756, 5.079) |
| 4 | 87 | ASN | A | 21 | ASP | B | True | 3.38 | 3.67 | 103.48 | True | 1339 | Nam | 2621 | O.co2 | (20.232, 78.011, 9.788) | (17.1, 79.87, 9.314) |
| 5 | 64 | THR | A | 9 | TYR | B | True | 2.35 | 3.16 | 163.9 | False | 2421 | O2 | 1008 | O3 | (5.023, 60.974, 10.735) | (2.446, 60.242, 9.053) |
| 6 | 60 | GLU | A | 8 | LYS | B | True | 2.38 | 3.01 | 127.43 | False | 2396 | N3 | 945 | O3 | (7.917, 60.468, 15.608) | (7.831, 58.294, 13.533) |
| 7 | 79 | GLU | A | 14 | ARG | B | False | 1.94 | 2.65 | 139.07 | False | 2501 | Ng+ | 1220 | O2 | (8.7, 75.748, 13.819) | (7.329, 74.648, 11.832) |
| 8 | 79 | GLU | A | 7 | LYS | B | True | 2.45 | 3.07 | 127.39 | False | 2374 | N3 | 1224 | O3 | (3.995, 73.777, 16.673) | (4.292, 73.495, 13.628) |
| 9 | 80 | LEU | A | 14 | ARG | B | False | 2.23 | 2.96 | 141.79 | False | 2498 | Ng+ | 1235 | O2 | (10.98, 75.465, 13.709) | (10.178, 75.88, 10.893) |


In [8]:
# Hydrophobic interaction
df = pd.DataFrame(interaction_types['hydrophobic'])
df

| | RESNR | RESTYPE | RESCHAIN | RESNR_LIG | RESTYPE_LIG | RESCHAIN_LIG | DIST | LIGCARBONIDX | PROTCARBONIDX | LIGCOO | PROTCOO |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 42 | LEU | A | 20 | PHE | B | 3.86 | 2604 | 633 | (23.955, 72.753, 3.959) | (25.519, 74.093, 0.698) |
| 1 | 54 | PHE | A | 13 | LEU | B | 3.56 | 2479 | 843 | (13.42, 67.834, 7.501) | (15.931, 66.014, 5.746) |
| 2 | 57 | LEU | A | 12 | GLN | B | 3.97 | 2459 | 896 | (12.378, 66.333, 13.006) | (13.004, 63.978, 9.878) |
| 3 | 57 | LEU | A | 9 | TYR | B | 3.86 | 2416 | 897 | (7.394, 63.709, 11.22) | (11.142, 62.813, 11.073) |
| 4 | 76 | VAL | A | 13 | LEU | B | 3.65 | 2478 | 1165 | (11.142, 68.829, 7.217) | (7.491, 68.894, 7.235) |
| 5 | 147 | PHE | A | 24 | GLN | B | 3.72 | 2660 | 2275 | (26.587, 78.443, 5.884) | (24.204, 77.653, 3.133) |
| 6 | 46 | MET | A | 20 | PHE | B | 3.86 | 2600 | 696 | (23.59, 71.657, 6.054) | (21.394, 68.894, 4.493) |


In [9]:
# Salt bridge interaction
df = pd.DataFrame(interaction_types['saltbridge'])
df

| | RESNR | RESTYPE | RESCHAIN | PROT_IDX_LIST | RESNR_LIG | RESTYPE_LIG | RESCHAIN_LIG | DIST | PROTISPOS | LIG_GROUP | LIG_IDX_LIST | LIGCOO | PROTCOO |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 50 | HIS | A | 764,767 | 19 | GLU | B | 4.74 | True | Carboxylate | 2586,2587 | (21.935499999999998, 68.5595, 14.6765) | (21.761499999999998, 66.88, 10.2525) |
| 1 | 90 | ARG | A | 1384,1386,1387 | 18 | ASP | B | 5.4 | True | Carboxylate | 2573,2574 | (17.4, 76.054, 14.0495) | (13.357, 78.39466666666665, 11.333666666666666) |
| 2 | 83 | ASP | A | 1280,1281 | 14 | ARG | B | 4.11 | False | Guanidine | 2498,2500,2501 | (9.893, 75.802, 14.395) | (10.9865, 79.66, 13.477) |
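
The `LIGCOO` and `PROTCOO` columns hold the coordinates of the two interacting groups, which makes it easy to sanity-check the reported `DIST`. The sketch below recomputes the Euclidean distance for two rows of the salt bridge table above; for these rows it agrees with `DIST` to within rounding. Whether this holds for every interaction type is an assumption that has not been checked here.

```python
import math

# Two rows copied from the salt bridge table above (only the columns used here).
salt_bridges = [
    {'RESNR': 50, 'RESTYPE': 'HIS', 'DIST': 4.74,
     'LIGCOO': (21.935499999999998, 68.5595, 14.6765),
     'PROTCOO': (21.761499999999998, 66.88, 10.2525)},
    {'RESNR': 83, 'RESTYPE': 'ASP', 'DIST': 4.11,
     'LIGCOO': (9.893, 75.802, 14.395),
     'PROTCOO': (10.9865, 79.66, 13.477)},
]

# math.dist (Python 3.8+) returns the Euclidean distance between two points.
recomputed = [math.dist(row['LIGCOO'], row['PROTCOO']) for row in salt_bridges]

for row, value in zip(salt_bridges, recomputed):
    print(row['RESTYPE'], row['RESNR'], 'reported:', row['DIST'], 'recomputed:', round(value, 2))
```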


# Intra Protein interaction

In this section we will calculate the interactions within the peptide (BAD), chain B of PDB entry 6FBX.


In [10]:
intra = 'B'  # Chain for which intra-chain interactions are calculated
config.PEPTIDES = []  # Reset the peptide setting from the previous section
if intra:
    config.INTRA = intra

interaction_types = {'hbond': [], 'hydrophobic': [], 'pication': [], 'pistacking': [], 'saltbridge': [],
                         'halogen': [], 'metal': [], 'waterbridge': []}

config.NOFIXFILE = True  # Do not write pdb fix file
protonate = True  # Let PLIP add hydrogens

if protonate:  # PLIP protonates the structure using Open Babel
    config.NOHYDRO = False
else:  # Skip protonation if the input PDB is already protonated
    config.NOHYDRO = True
    

# Initialize the PLIP class
pdbfile = PDBComplex()
pdbfile.load_pdb('data/6fbx.pdb')  # Load PDB file

# Parse and get the ligands
ligands = []
for lig in pdbfile.ligands:
    ligands.append(lig)
    pdbfile.characterize_complex(lig)
for key, site in pdbfile.interaction_sets.items():
    report = BindingSiteReport(site)
    for interaction in interaction_types.keys():
        for interaction_data in getattr(report,interaction+'_info'):
            interaction_types[interaction].append(dict(zip(getattr(report,interaction+'_features'), interaction_data)))
interaction_types.keys()


dict_keys(['hbond', 'hydrophobic', 'pication', 'pistacking', 'saltbridge', 'halogen', 'metal', 'waterbridge'])

In [11]:
# H-bond interaction
df = pd.DataFrame(interaction_types['hbond'])
df

| | RESNR | RESTYPE | RESCHAIN | RESNR_LIG | RESTYPE_LIG | RESCHAIN_LIG | SIDECHAIN | DIST_H-A | DIST_D-A | DON_ANGLE | PROTISDON | DONORIDX | DONORTYPE | ACCEPTORIDX | ACCEPTORTYPE | LIGCOO | PROTCOO |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 7 | LYS | B | 4 | TRP | B | False | 3.14 | 3.84 | 140.37 | True | 2366 | Nam | 2330 | Npl | (-0.08, 69.546, 13.561) | (3.643, 68.764, 13.009) |
| 1 | 8 | LYS | B | 4 | TRP | B | False | 2.04 | 2.88 | 164.73 | True | 2388 | Nam | 2325 | O2 | (2.966, 66.396, 14.999) | (5.598, 67.488, 14.576) |
| 2 | 9 | TYR | B | 5 | ALA | B | False | 2.01 | 2.84 | 163.31 | True | 2410 | Nam | 2349 | O2 | (4.628, 65.784, 12.352) | (7.369, 66.518, 12.532) |
| 3 | 10 | GLY | B | 6 | ALA | B | False | 2.08 | 2.92 | 165.1 | True | 2431 | Nam | 2359 | O2 | (5.161, 68.844, 11.348) | (8.074, 68.934, 11.276) |
| 4 | 11 | GLN | B | 7 | LYS | B | False | 2.11 | 2.96 | 166.77 | True | 2438 | Nam | 2369 | O2 | (6.824, 69.365, 14.378) | (9.443, 70.292, 13.364) |
| 5 | 12 | GLN | B | 8 | LYS | B | False | 2.21 | 3.04 | 163.76 | True | 2455 | Nam | 2391 | O2 | (8.983, 67.02, 14.017) | (11.605, 68.535, 13.741) |
| 6 | 13 | LEU | B | 9 | TYR | B | False | 2.13 | 2.97 | 165.13 | True | 2472 | Nam | 2413 | O2 | (10.091, 68.017, 10.883) | (12.92, 68.842, 11.238) |
| 7 | 14 | ARG | B | 10 | GLY | B | False | 2.06 | 2.86 | 153.77 | True | 2491 | Nam | 2434 | O2 | (10.685, 71.232, 11.739) | (13.511, 71.553, 11.479) |
| 8 | 15 | ARG | B | 11 | GLN | B | False | 2.17 | 2.99 | 159.98 | True | 2515 | Nam | 2441 | O2 | (12.823, 70.331, 14.342) | (15.419, 71.612, 13.581) |
| 9 | 16 | MET | B | 12 | GLN | B | False | 2.36 | 3.18 | 161.27 | True | 2539 | Nam | 2458 | O2 | (14.82, 68.541, 12.403) | (17.534, 70.203, 12.393) |
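
In the table above every row is a backbone contact (`SIDECHAIN` is False) with `PROTISDON` set to True, so `RESNR` is the donor residue and `RESNR_LIG` the acceptor residue. A quick way to summarize such intra-chain contacts is to count the sequence separation between the two residues, as sketched below. Reading a dominant separation of four as the signature of a helical backbone is an interpretation added here, not something PLIP reports.

```python
from collections import Counter

# (RESNR, RESNR_LIG) pairs copied from the intra-chain H-bond table above.
pairs = [(7, 4), (8, 4), (9, 5), (10, 6), (11, 7),
         (12, 8), (13, 9), (14, 10), (15, 11), (16, 12)]

# Sequence separation between donor residue and acceptor residue.
separation_counts = Counter(donor - acceptor for donor, acceptor in pairs)

for separation, count in separation_counts.most_common():
    print('separation', separation, 'count', count)
```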


# DNA/RNA interaction

In this section we will calculate the interactions between the protein (EGR1) and DNA from PDB entry 4X9J.

In [12]:
# DNA chain
interaction_types = {'hbond': [], 'hydrophobic': [], 'pication': [], 'pistacking': [], 'saltbridge': [],
                         'halogen': [], 'metal': [], 'waterbridge': []}

config.NOFIXFILE = True  # Do not write pdb fix file
protonate = True  # Let PLIP add hydrogens
config.INTRA = None  # Reset the intra-chain setting from the previous section

if protonate:  # PLIP protonates the structure using Open Babel
    config.NOHYDRO = False
else:  # Skip protonation if the input PDB is already protonated
    config.NOHYDRO = True
    

# Initialize the PLIP class
pdbfile = PDBComplex()
pdbfile.load_pdb('data/4x9j.pdb')  # Load PDB file

# Parse and get the ligands
ligands = []
for lig in pdbfile.ligands:
    ligands.append(lig)
    pdbfile.characterize_complex(lig)
for key, site in pdbfile.interaction_sets.items():
    report = BindingSiteReport(site)
    for interaction in interaction_types.keys():
        for interaction_data in getattr(report,interaction+'_info'):
            interaction_types[interaction].append(dict(zip(getattr(report,interaction+'_features'), interaction_data)))
interaction_types.keys()


dict_keys(['hbond', 'hydrophobic', 'pication', 'pistacking', 'saltbridge', 'halogen', 'metal', 'waterbridge'])

In [13]:
# H-bond interaction
df = pd.DataFrame(interaction_types['hbond'])
df

| | RESNR | RESTYPE | RESCHAIN | RESNR_LIG | RESTYPE_LIG | RESCHAIN_LIG | SIDECHAIN | DIST_H-A | DIST_D-A | DON_ANGLE | PROTISDON | DONORIDX | DONORTYPE | ACCEPTORIDX | ACCEPTORTYPE | LIGCOO | PROTCOO |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 180 | ARG | A | 2 | DG | B | True | 2.06 | 2.91 | 166.76 | True | 1281 | Ng+ | 1475 | O2 | (-11.001, -23.069, -17.68) | (-8.974, -24.039, -19.524) |
| 1 | 180 | ARG | A | 2 | DG | B | True | 2.04 | 2.88 | 165.54 | True | 1282 | Ng+ | 1472 | N2 | (-12.161, -22.025, -20.324) | (-10.09, -23.648, -21.489) |
| 2 | 174 | ARG | A | 4 | DG | B | True | 2.1 | 2.9 | 155.28 | True | 1174 | Ng+ | 1541 | O2 | (-7.303, -17.795, -16.424) | (-7.246, -17.135, -19.251) |
| 3 | 174 | ARG | A | 4 | DG | B | True | 2.07 | 2.92 | 168.25 | True | 1173 | Ng+ | 1538 | N2 | (-9.54, -15.731, -16.56) | (-8.799, -15.476, -19.369) |
| 4 | 145 | SER | A | 6 | DG | B | True | 3.24 | 4.03 | 157.7 | True | 720 | O3 | 1591 | O3 | (-6.279, -9.161, -8.312) | (-6.646, -6.593, -11.399) |
| 5 | 149 | HIS | A | 6 | DG | B | True | 1.87 | 2.71 | 162.35 | True | 782 | Nar | 1603 | N2 | (-3.351, -11.966, -13.471) | (-4.624, -10.555, -15.396) |
| 6 | 124 | ARG | A | 8 | DG | B | True | 2.0 | 2.85 | 166.96 | True | 356 | Ng+ | 1669 | N2 | (4.285, -9.676, -14.094) | (2.5, -8.361, -12.305) |
| 7 | 146 | ARG | A | 7 | DG | B | True | 2.21 | 3.07 | 175.51 | True | 735 | Ng+ | 1636 | N2 | (1.036, -11.51, -13.566) | (-1.109, -9.313, -13.646) |
| 8 | 146 | ARG | A | 7 | DG | B | True | 2.03 | 2.84 | 158.45 | True | 736 | Ng+ | 1639 | O2 | (0.561, -11.417, -16.582) | (-1.1, -9.201, -15.933) |
| 9 | 124 | ARG | A | 8 | DG | B | True | 2.22 | 3.05 | 162.69 | True | 355 | Ng+ | 1672 | O2 | (2.546, -8.743, -16.407) | (1.584, -7.132, -13.997) |


In [14]:
# Salt bridge interaction
df = pd.DataFrame(interaction_types['saltbridge'])
df

| | RESNR | RESTYPE | RESCHAIN | PROT_IDX_LIST | RESNR_LIG | RESTYPE_LIG | RESCHAIN_LIG | DIST | PROTISPOS | LIG_GROUP | LIG_IDX_LIST | LIGCOO | PROTCOO |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 103 | ARG | A | 22,24,25 | 8 | DG | B | 4.87 | True | Phosphate | 1656,1656,1657,1658,1659,1631 | (6.751, -14.324, -9.43) | (9.572333333333333, -13.421, -5.562666666666668) |
| 1 | 114 | ARG | A | 179,181,182 | 7 | DG | B | 4.27 | True | Phosphate | 1623,1623,1625,1626,1598,1624 | (1.013, -12.685, -7.638) | (3.4233333333333333, -12.407666666666666, -4.1... |
| 2 | 133 | LYS | A | 512 | 5 | DT | B | 5.08 | True | Phosphate | 1558,1558,1561,1533,1559,1560 | (-11.472, -10.521, -11.328) | (-12.603, -6.803, -8.063) |
| 3 | 142 | ARG | A | 664,666,667 | 5 | DT | B | 4.82 | True | Phosphate | 1558,1558,1561,1533,1559,1560 | (-11.472, -10.521, -11.328) | (-16.022000000000002, -10.019, -12.83733333333... |
| 4 | 153 | HIS | A | 843,846 | 4 | DG | B | 4.92 | True | Phosphate | 1525,1525,1506,1526,1527,1528 | (-13.357, -12.387, -17.568) | (-15.801000000000002, -8.328, -18.886) |
| 5 | 127 | ARG | A | 413,415,416 | 52 | DA | C | 5.15 | True | Phosphate | 1819,1819,1794,1820,1821,1822 | (1.209, 2.821, -13.342) | (-2.3363333333333336, 0.26133333333333336, -10... |
| 6 | 179 | LYS | A | 1258 | 57 | DC | C | 3.33 | True | Phosphate | 2037,2037,2038,2039,2040 | (4.86, -18.43, -23.48) | (2.436, -20.689, -23.165) |


With the interactions calculated and stored in a dictionary, the results can be extended and used for further analysis.
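
As one example of such an extension, the dictionary can be reduced to a per-residue contact count across all interaction types. The sketch below uses a small hand-copied subset of the 2ETR chain A results in the same shape as `interaction_types`; the function name `contacts_per_residue` is a hypothetical helper.

```python
from collections import Counter

# Same shape as interaction_types, with a few chain A rows copied from the 2ETR tables.
example_interactions = {
    'hbond': [
        {'RESNR': 216, 'RESTYPE': 'ASP', 'RESCHAIN': 'A'},
        {'RESNR': 156, 'RESTYPE': 'MET', 'RESCHAIN': 'A'},
        {'RESNR': 216, 'RESTYPE': 'ASP', 'RESCHAIN': 'A'},
    ],
    'hydrophobic': [
        {'RESNR': 82, 'RESTYPE': 'ILE', 'RESCHAIN': 'A'},
        {'RESNR': 90, 'RESTYPE': 'VAL', 'RESCHAIN': 'A'},
        {'RESNR': 216, 'RESTYPE': 'ASP', 'RESCHAIN': 'A'},
    ],
    'saltbridge': [],
}


def contacts_per_residue(interactions):
    """Count interactions per (chain, residue number, residue type)."""
    counts = Counter()
    for rows in interactions.values():
        for row in rows:
            counts[(row['RESCHAIN'], row['RESNR'], row['RESTYPE'])] += 1
    return counts


counts = contacts_per_residue(example_interactions)
for residue, n in counts.most_common():
    print(residue, n)
```

Counting per interaction type instead only requires adding the dictionary key to the tuple used as the counter key.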