# Local copy of KLIFS IDs

In the `local` module of `opencadd.databases.klifs`, we load KLIFS metadata from two KLIFS download files, namely `overview.csv` and `KLIFS_export.csv`, to create one KLIFS metadata table (which is standardized across the `local` and `remote` modules).

These KLIFS download files do not contain KLIFS kinase, ligand and structure IDs. To make results from the `local` and `remote` modules easily comparable, we add these KLIFS IDs to the local KLIFS metadata table upon local session initialization (`local.SessionInitialization`).

Therefore, for each locally available structure (about 11,000 at most), we need to find its associated kinase, ligand and structure ID.
Since we do not want to query the KLIFS webserver for each of these structures every time we initialize a local session, we fetch a local copy of the KLIFS IDs here.
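
As a rough sketch of how such a local ID copy can be used, the toy example below joins a local metadata table to an ID table on PDB ID, alternate model and chain. Both data frames are small made-up stand-ins (`local_metadata`, `klifs_ids` and the `xxxx` entry are hypothetical), and the actual merge performed in the `local` module may differ.

```python
import pandas as pd

# Toy stand-in for the local metadata table (no KLIFS IDs yet)
local_metadata = pd.DataFrame(
    {
        "structure.pdb": ["3dko", "2rei", "xxxx"],
        "structure.alternate_model": ["A", "B", "-"],
        "structure.chain": ["A", "A", "A"],
    }
)

# Toy stand-in for the local copy of KLIFS IDs
klifs_ids = pd.DataFrame(
    {
        "structure.id": [1, 2],
        "structure.pdb": ["3dko", "2rei"],
        "structure.alternate_model": ["A", "B"],
        "structure.chain": ["A", "A"],
        "kinase.id": [415, 415],
    }
)

keys = ["structure.pdb", "structure.alternate_model", "structure.chain"]

# Left join keeps every local structure; structures without a match get NaN IDs.
# validate="many_to_one" raises if a key combination is not unique in klifs_ids.
merged = local_metadata.merge(klifs_ids, on=keys, how="left", validate="many_to_one")
print(merged)
```

A left join is used here so that local structures missing from the ID table stay visible as rows with missing IDs instead of being dropped silently.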

In [1]:
from opencadd.databases.klifs_new.api import setup_remote

In [2]:
# Work with remote KLIFS data
remote = setup_remote()

INFO:opencadd.databases.klifs_new.api:Set up remote session...
INFO:opencadd.databases.klifs_new.api:Remote session is ready!


In [3]:
# Fetch all structures (keep only ID-related columns)
structures_all = remote.structures.all_structures()
id_columns = ["structure.id", "structure.pdb", "structure.alternate_model", "structure.chain", "kinase.name", "kinase.id", "ligand.pdb"]
# Sort by structure ID (returns a new DataFrame rather than modifying a column subset in place)
structures_all = structures_all[id_columns].sort_values("structure.id")
# Show data
print(structures_all.shape)
structures_all.head()

(11377, 7)


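| | structure.id | structure.pdb | structure.alternate_model | structure.chain | kinase.name | kinase.id | ligand.pdb |
|---|---|---|---|---|---|---|---|
| 7513 | 1 | 3dko | A | A | EphA7 | 415 | IHZ |
| 7512 | 2 | 2rei | B | A | EphA7 | 415 | - |
| 7514 | 3 | 3dko | B | A | EphA7 | 415 | IHZ |
| 7515 | 4 | 2rei | A | A | EphA7 | 415 | - |
| 9343 | 5 | 3v8t | B | A | ITK | 474 | 477 |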


In [4]:
# Sanity check: Are there multiple KLIFS structure IDs for one KLIFS structure?
sizes = structures_all.groupby(["structure.pdb", "structure.alternate_model", "structure.chain"]).size()
sizes[sizes > 1]

Series([], dtype: int64)
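
The empty series means that no combination of PDB ID, alternate model and chain maps to more than one structure ID. To illustrate what the check would report otherwise, here is a small sketch on invented data in which one key combination is deliberately duplicated (the `toy` frame is hypothetical and not taken from KLIFS).

```python
import pandas as pd

keys = ["structure.pdb", "structure.alternate_model", "structure.chain"]

# Invented data: structure IDs 1 and 2 share the same key combination
toy = pd.DataFrame(
    {
        "structure.id": [1, 2, 3],
        "structure.pdb": ["3dko", "3dko", "2rei"],
        "structure.alternate_model": ["A", "A", "B"],
        "structure.chain": ["A", "A", "A"],
    }
)

# Same check as above: count rows per key combination and keep counts > 1
sizes = toy.groupby(keys).size()
ambiguous = sizes[sizes > 1]
print(ambiguous)

# Alternative view: show the offending rows themselves
duplicates = toy[toy.duplicated(subset=keys, keep=False)]
print(duplicates)
```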

In [5]:
# Save local copy of KLIFS IDs
structures_all.to_csv("klifs_ids.csv", index=False)
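
When the saved file is read back, it may be worth controlling how pandas infers missing values and types. The sketch below uses an in-memory buffer and invented rows to show one possible pitfall: with default settings, `pandas.read_csv` treats strings such as `NA` as missing values. Whether such codes actually occur in the KLIFS data is an assumption made here purely for illustration.

```python
import io

import pandas as pd

# Invented rows; "NA" is a hypothetical ligand code used to show the pitfall
toy = pd.DataFrame(
    {
        "structure.id": [1, 5],
        "kinase.name": ["EphA7", "ITK"],
        "ligand.pdb": ["NA", "477"],
    }
)

# Write to an in-memory buffer instead of a file on disk
buffer = io.StringIO()
toy.to_csv(buffer, index=False)

# Default parsing: "NA" is read as a missing value
buffer.seek(0)
default_read = pd.read_csv(buffer)

# Safer parsing: keep ligand codes as strings and disable default NA strings
buffer.seek(0)
safe_read = pd.read_csv(buffer, dtype={"ligand.pdb": str}, keep_default_na=False)

print(default_read["ligand.pdb"].tolist())
print(safe_read["ligand.pdb"].tolist())
```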