# Antibody Benchmark

This dataset is the subset of Docking Benchmark 5.5 that contains only antibody-antigen complexes.

The Excel sheet with the benchmark cases was downloaded from https://piercelab.ibbr.umd.edu/antibody_benchmark/antibody_benchmark_cases.xlsx

The PDB files were downloaded from the Pierce Lab GitHub repository: https://github.com/piercelab/antibody_benchmark

In [2]:
import pandas as pd
import yaml
import os

In [4]:
# load config with filepaths
with open("../config.yaml", "r") as f:
    config = yaml.safe_load(f)

# define filepaths
benchmark_path = os.path.join(config["DATA"]["path"], config["DATA"]["AntibodyBenchmark"]["folder_path"])
ab_cases_path = os.path.join(benchmark_path, config["DATA"]["AntibodyBenchmark"]["ab_ag_cases"])
pdb_path = os.path.join(benchmark_path, config["DATA"]["AntibodyBenchmark"]["pdb_path"])
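
The key lookups above imply a nested layout for `config.yaml`. As a rough sketch, the same shape is shown here as a plain Python dict (what `yaml.safe_load` would return). Only the key names come from the notebook; every value is a hypothetical placeholder and not taken from the actual config file.

```python
import os

# Hypothetical stand-in for the parsed config.yaml.
# Key names mirror the lookups in the notebook; all values are placeholders.
example_config = {
    "DATA": {
        "path": "data",
        "AntibodyBenchmark": {
            "folder_path": "antibody_benchmark",
            "ab_ag_cases": "antibody_benchmark_cases.xlsx",
            "pdb_path": "pdbs",
        },
    }
}

# Build the three paths the same way the notebook does.
ab_cfg = example_config["DATA"]["AntibodyBenchmark"]
example_benchmark_path = os.path.join(example_config["DATA"]["path"], ab_cfg["folder_path"])
example_cases_path = os.path.join(example_benchmark_path, ab_cfg["ab_ag_cases"])
example_pdb_path = os.path.join(example_benchmark_path, ab_cfg["pdb_path"])

print(example_cases_path)
print(example_pdb_path)
```

Keeping the file and folder names in the config means the notebook code does not have to change when the data is moved.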

In [13]:
data = pd.read_excel(ab_cases_path)
print(f"There are {len(data)} ab-ag cases")
data.head()

There are 67 ab-ag cases


Unnamed: 0,Complex PDB,Antibody PDB,Antibody,Antigen PDB,Antigen,I-RMSD (Å),ΔASA (Å2),Category,New,Kd (nM),ΔG (kcal/mol)
0,1AHW_AB:C,1FGN_LH,Fab 5g9,1TFH_A,Tissue factor,0.69,1899.0,Rigid,,,-11.55
1,1DQJ_AB:C,1DQQ_CD,Fab Hyhel63,3LZT_,HEW lysozyme,0.75,1765.0,Rigid,,,-11.67
2,1E6J_HL:P,1E6O_HL,Fab,1A43_,HIV-1 capsid protein p24,1.05,1245.0,Rigid,,,-10.28
3,1JPS_HL:T,1JPT_HL,Fab D3H44,1TFH_B,Tissue factor,0.51,1852.0,Rigid,,,-13.64
4,1MLC_AB:E,1MLB_AB,Fab44.1,3LZT_,HEW lysozyme,0.6,1392.0,Rigid,,,-9.61
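
The `Complex PDB` entries appear to follow the pattern `<PDB ID>_<antibody chains>:<antigen chains>`, for example `1AHW_AB:C`. The following is a minimal sketch of splitting such an identifier. The helper name `parse_complex_id` is hypothetical, and the reading of the chains before the colon as antibody and those after it as antigen is an assumption based on the rows shown above.

```python
def parse_complex_id(complex_id):
    """Split e.g. '1AHW_AB:C' into (pdb_id, antibody_chains, antigen_chains).

    Assumes one chain per character, with antibody chains before the
    colon and antigen chains after it.
    """
    pdb_id, _, chain_spec = complex_id.partition("_")
    antibody, _, antigen = chain_spec.partition(":")
    return pdb_id, list(antibody), list(antigen)


print(parse_complex_id("1AHW_AB:C"))  # ('1AHW', ['A', 'B'], ['C'])
```

The function could be applied to the whole column with `data["Complex PDB"].apply(parse_complex_id)`. Entries in the other PDB columns use slightly different forms (such as `3LZT_` or `1J0S_A(6)`) that this sketch does not handle.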


### Remove all entries without binding affinity

In [14]:
# keep only the cases that report both ΔG and Kd
data = data[data["ΔG (kcal/mol)"].notnull() & data["Kd (nM)"].notnull()]
print(f"There are {len(data)} ab-ag cases with affinity")
data.head()

There are 53 ab-ag cases with affinity


Unnamed: 0,Complex PDB,Antibody PDB,Antibody,Antigen PDB,Antigen,I-RMSD (Å),ΔASA (Å2),Category,New,Kd (nM),ΔG (kcal/mol)
5,1S78_DC:A,1L7I_HL,pertuzumab (Perjeta),2A91_A,ErbB2,1.13,2175.1,Rigid,X,500.0,-8.45
8,2DD8_HL:S,2G75_AB,m396,2GHV_E,SARS spike,2.19,1709.7,Medium,X,20.0,-10.5
10,2FJG_HL:VW,2FJF_HL,G6,4KZN_AB,VEGF,2.51,1678.2,Difficult,X,20.0,-10.92
13,2VXT_HL:I,2VXU_HL,Murine reference antibody 125-2H FAB,1J0S_A(6),Interleukin-18,1.32,2163.0,Rigid,,0.533,-12.65
14,2W9E_HL:A,2W9D_HL,ICSM 18 FAB fragment,1QM1_A,Prion protein fragment,1.13,1677.0,Rigid,,0.13,-13.49
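
The two affinity columns are related through the standard thermodynamic relation ΔG = RT ln(Kd), with Kd expressed in mol/L. The sketch below converts a Kd in nM to ΔG in kcal/mol. The function name `delta_g_from_kd` and the default temperature of 298.15 K are assumptions. The benchmark values may have been derived at the temperature of each experiment, so not every row is reproduced exactly (for instance, 500 nM gives about -8.60 kcal/mol at this temperature, whereas the table lists -8.45).

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def delta_g_from_kd(kd_nm, temperature_k=298.15):
    """Binding free energy in kcal/mol from a Kd given in nM.

    Uses dG = R * T * ln(Kd / 1 M); the default temperature is an assumption.
    """
    kd_molar = kd_nm * 1e-9  # nM -> mol/L
    return R_KCAL * temperature_k * math.log(kd_molar)


# Kd = 20 nM, as for 2DD8 in the table above
print(round(delta_g_from_kd(20.0), 2))  # -10.5
```

A check like this can help spot rows where the two columns disagree by more than a temperature difference would explain.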


In [35]:
data

Unnamed: 0,Complex PDB,Antibody PDB,Antibody,Antigen PDB,Antigen,I-RMSD (Å),ΔASA (Å2),Category,New,Kd (nM),ΔG (kcal/mol)
5,1S78_DC:A,1L7I_HL,pertuzumab (Perjeta),2A91_A,ErbB2,1.13,2175.1,Rigid,X,500.0,-8.45
8,2DD8_HL:S,2G75_AB,m396,2GHV_E,SARS spike,2.19,1709.7,Medium,X,20.0,-10.5
10,2FJG_HL:VW,2FJF_HL,G6,4KZN_AB,VEGF,2.51,1678.2,Difficult,X,20.0,-10.92
13,2VXT_HL:I,2VXU_HL,Murine reference antibody 125-2H FAB,1J0S_A(6),Interleukin-18,1.32,2163.0,Rigid,,0.533,-12.65
14,2W9E_HL:A,2W9D_HL,ICSM 18 FAB fragment,1QM1_A,Prion protein fragment,1.13,1677.0,Rigid,,0.13,-13.49
16,3EOA_LH:I,3EO9_LH,Efalizumab FAB fragment,3F74_A,Integrin alpha-L I domain,0.39,1272.0,Rigid,,2.2,-11.81
17,3G6D_LH:A,3G6A_LH,CNTO607 FAB,1IK0_A(10),Interleukin-13,1.86,1793.0,Medium,,0.0184,-14.65
18,3HI6_XY:B,3HI5_HL,AL-57 FAB fragment,1MJN_A,Integrin alpha-L I domain,1.65,1871.0,Medium,,4700.0,-7.27
20,3L5W_LH:I,3L7E_LH,C836 FAB,1IK0_A(11),Interleukin-13,0.48,1138.0,Medium,,0.54,-14.01
21,3MJ9_HL:A,3MJ8_HL,HL4E10,3MJ6_A,JAML,1.48,2456.6,Rigid,X,8.0,-11.05


## Reading PDB Files

In [15]:
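from Bio.PDB.PDBParser import PDBParser

# PERMISSIVE=1: recoverable problems while building the structure are caught
# instead of raised as exceptions (some residues or atoms may then be missing)
parser = PDBParser(PERMISSIVE=1)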

In [17]:
structure_id = "1S78"
protein = "r"  # presumably "r" = receptor (the antibody), "l" = ligand (the antigen)
type_ = "u"    # presumably "u" = unbound, "b" = bound
filename = os.path.join(pdb_path, f"{structure_id}_{protein}_{type_}.pdb")
structure = parser.get_structure(structure_id, filename)
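
The file name built above follows the pattern `<structure_id>_<protein>_<type>.pdb`. A small helper can make that pattern reusable across cases. This is only a sketch: the name `benchmark_pdb_filename` is hypothetical, and the allowed values (`"r"`/`"l"` for receptor/ligand, `"u"`/`"b"` for unbound/bound) are an assumption about the benchmark's naming convention.

```python
import os


def benchmark_pdb_filename(pdb_dir, structure_id, protein="r", type_="u"):
    """Return the path '<pdb_dir>/<structure_id>_<protein>_<type_>.pdb'.

    Assumed convention: protein is 'r' (receptor) or 'l' (ligand),
    type_ is 'u' (unbound) or 'b' (bound).
    """
    if protein not in ("r", "l"):
        raise ValueError(f"protein must be 'r' or 'l', got {protein!r}")
    if type_ not in ("u", "b"):
        raise ValueError(f"type_ must be 'u' or 'b', got {type_!r}")
    return os.path.join(pdb_dir, f"{structure_id}_{protein}_{type_}.pdb")


# "pdbs" is a placeholder directory name
print(benchmark_pdb_filename("pdbs", "1S78"))
```

Validating the two short codes up front turns a typo into an immediate `ValueError` rather than a missing-file error from the parser.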

In [34]:
residues = list(structure.get_residues())
residues[0]

<Residue GLU het=  resseq=1 icode= >

In [31]:
atoms = list(residues[0].get_atoms())
atoms[0]

<Atom N>