# Local copy of KLIFS IDs

In the `local` module of `opencadd.databases.klifs`, we load KLIFS metadata from two KLIFS download files, namely `overview.csv` and `KLIFS_export.csv`, to create one KLIFS metadata table (which is standardized across the `local` and `remote` modules).

These KLIFS download files do not contain KLIFS kinase, ligand, and structure IDs. To make results from the `local` and `remote` modules easily comparable, we add these KLIFS IDs to the local KLIFS metadata table upon local session initialization (`local.SessionInitialization`).

Therefore, for each locally available structure (about 11,000 at most), we need to find its associated kinase, ligand, and structure ID.
Since we do not want to query the KLIFS web server for each of these structures every time we initialize a local session, we fetch a local copy of the KLIFS IDs here.

In [1]:
from opencadd.databases.klifs.api import setup_remote

In [2]:
# Work with remote KLIFS data
remote = setup_remote()

INFO:opencadd.databases.klifs.api:Set up remote session...
INFO:opencadd.databases.klifs.api:Remote session is ready!


## Get kinase and structure IDs

In [3]:
# Fetch all structures (keep only ID related columns)
structures_all = remote.structures.all_structures()
structures_all = structures_all[["structure.id", "structure.pdb", "structure.alternate_model", "structure.chain", "kinase.name", "kinase.id", "ligand.pdb"]]
# Sort by structure ID (reassign instead of sorting a column subset in place)
structures_all = structures_all.sort_values("structure.id")
# Show data
print(structures_all.shape)
structures_all.head()

(11377, 7)


Unnamed: 0,structure.id,structure.pdb,structure.alternate_model,structure.chain,kinase.name,kinase.id,ligand.pdb
7513,1,3dko,A,A,EphA7,415,IHZ
7512,2,2rei,B,A,EphA7,415,-
7514,3,3dko,B,A,EphA7,415,IHZ
7515,4,2rei,A,A,EphA7,415,-
9343,5,3v8t,B,A,ITK,474,477


In [4]:
print("Sanity check: Are there multiple KLIFS structure IDs for one KLIFS structure?")
sizes = structures_all.groupby(["structure.pdb", "structure.alternate_model", "structure.chain"]).size()
if len(sizes[sizes > 1]) > 0:
    print(sizes[sizes > 1])
else:
    print("All good!")

Sanity check: Are there multiple KLIFS structure IDs for one KLIFS structure?
All good!


In [5]:
# Save local copy of KLIFS IDs
structures_all.to_csv("klifs_ids.csv", index=False)
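
The saved table is meant to supply IDs for locally available structures. As a minimal sketch of how that lookup could work (not necessarily how `local.SessionInitialization` does it), the example below left-merges a local metadata table with the ID table on PDB ID, alternate model, and chain. The ID rows are copied from the output above; the `local_metadata` table, its third row (`0xxx`), and the variable names are made up for illustration.

```python
import pandas as pd

# Toy stand-in for klifs_ids.csv (rows copied from the output shown above)
klifs_ids = pd.DataFrame(
    {
        "structure.id": [1, 2, 5],
        "structure.pdb": ["3dko", "2rei", "3v8t"],
        "structure.alternate_model": ["A", "B", "B"],
        "structure.chain": ["A", "A", "A"],
        "kinase.id": [415, 415, 474],
    }
)

# Hypothetical local metadata without KLIFS IDs; "0xxx" is an invented
# PDB ID that has no match in the ID table
local_metadata = pd.DataFrame(
    {
        "structure.pdb": ["3dko", "3v8t", "0xxx"],
        "structure.alternate_model": ["A", "B", "-"],
        "structure.chain": ["A", "A", "A"],
    }
)

keys = ["structure.pdb", "structure.alternate_model", "structure.chain"]

# Left merge keeps every local structure; validate raises an error if the
# ID table holds more than one row per key; indicator marks unmatched rows
merged = local_metadata.merge(
    klifs_ids, on=keys, how="left", validate="many_to_one", indicator=True
)

# Local structures for which no KLIFS IDs were found
missing = merged[merged["_merge"] == "left_only"]
print(merged)
```

Rows without a match keep missing values in the ID columns, so they can be reported rather than silently dropped.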

## Get ligand IDs?

In [6]:
ligands_all = remote.ligands.all_ligands()

In [7]:
print("Sanity check: Are there multiple KLIFS ligand IDs for one ligand PDB ID?")
sizes = ligands_all.groupby(["ligand.pdb"]).size()
if len(sizes[sizes > 1]) > 0:
    print(ligands_all[ligands_all["ligand.pdb"].isin(sizes[sizes > 1].index)][["ligand.id", "ligand.pdb"]].sort_values("ligand.pdb"))
    print("These PDB IDs need to be checked manually!")
else:
    print("All good!")

Sanity check: Are there multiple KLIFS ligand IDs for one ligand PDB ID?
      ligand.id ligand.pdb
2713       3015        6VL
2759       3065        6VL
64           65        7KC
2881       2967        7KC
These PDB IDs need to be checked manually!


In [8]:
structures_all[structures_all["ligand.pdb"] == "6VL"]

Unnamed: 0,structure.id,structure.pdb,structure.alternate_model,structure.chain,kinase.name,kinase.id,ligand.pdb
4280,9564,5t31,-,B,GSK3B,238,6VL
4282,9565,5t31,-,A,GSK3B,238,6VL
4167,9728,5kpl,A,B,GSK3B,238,6VL
4168,9729,5kpl,A,A,GSK3B,238,6VL
4171,9730,5kpl,B,A,GSK3B,238,6VL
4209,9733,5kpl,B,B,GSK3B,238,6VL


The two KLIFS ligand IDs for `6VL` appear to correspond to different protonation states of the ligand; compare, for example:
- https://klifs.vu-compmedchem.nl/details.php?structure_id=9564
- https://klifs.vu-compmedchem.nl/details.php?structure_id=9729

In [9]:
structures_all[structures_all["ligand.pdb"] == "7KC"]

Unnamed: 0,structure.id,structure.pdb,structure.alternate_model,structure.chain,kinase.name,kinase.id,ligand.pdb
6206,171,4o0r,-,A,PAK1,367,7KC
6262,1387,2x4z,-,A,PAK4,370,7KC
6305,4681,4ks7,B,A,PAK6,371,7KC
6302,4682,4ks7,A,A,PAK6,371,7KC
969,8731,5mag,B,A,MELK,128,7KC
1030,8736,5mag,A,A,MELK,128,7KC


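The two KLIFS ligand IDs for `7KC` likewise appear to correspond to different protonation states of the ligand; compare, for example: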

- https://klifs.vu-compmedchem.nl/details.php?structure_id=4682
- https://klifs.vu-compmedchem.nl/details.php?structure_id=8736
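
Because a ligand PDB ID does not always map to a single KLIFS ligand ID, a lookup from ligand PDB ID to ligand ID has to treat such cases separately. The following is a minimal sketch of one way to split the mapping into an unambiguous part and a part that needs manual review. The first four rows are copied from the sanity-check output above; the `XXX` row and all variable names are invented for illustration.

```python
import pandas as pd

# Rows for 6VL and 7KC are taken from the sanity check above;
# the "XXX" row (ligand ID 0) is a made-up unambiguous example
ligands = pd.DataFrame(
    {
        "ligand.id": [3015, 3065, 65, 2967, 0],
        "ligand.pdb": ["6VL", "6VL", "7KC", "7KC", "XXX"],
    }
)

# Collect all KLIFS ligand IDs per ligand PDB ID
ids_per_pdb = ligands.groupby("ligand.pdb")["ligand.id"].agg(list)
n_ids = ids_per_pdb.map(len)

# PDB IDs with more than one ligand ID need a manual decision
ambiguous = ids_per_pdb[n_ids > 1]

# PDB IDs with exactly one ligand ID can be mapped directly
unique_lookup = ids_per_pdb[n_ids == 1].map(lambda ids: ids[0])

print(ambiguous)
print(unique_lookup)
```

Keeping the ambiguous entries in their own table makes it explicit which ligand IDs cannot be assigned from the PDB ID alone.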