# Side chain angles: Calculation based on PDB vs. mol2

Two approaches are implemented to calculate side chain angles, i.e. the angle between a residue's CA atom, CB atom and side chain centroid:

- PDB-based: Load the PDB file, extract the KLIFS pocket, get the CA, CB and side chain atoms, calculate the side chain centroid, and calculate the angle (all using Biopython).
- MOL2-based: Load the MOL2 file, get the CA, CB and side chain atoms, calculate the side chain centroid (using biopandas), and calculate the angle (using Biopython).
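
The geometry behind both approaches can be sketched without Biopython or biopandas. The snippet below is only an illustration, not the kissim implementation: the helper names `centroid` and `vertex_angle`, the coordinates, and the choice of CB as the angle's vertex are all assumptions.

```python
import math


def centroid(points):
    """Arithmetic mean of a list of (x, y, z) coordinates."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def vertex_angle(a, b, c):
    """Angle in degrees at vertex b, between the rays b->a and b->c."""
    ba = [a[i] - b[i] for i in range(3)]
    bc = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(ba, bc))
    norm = math.sqrt(sum(x * x for x in ba)) * math.sqrt(sum(x * x for x in bc))
    # Clamp to [-1, 1] to guard against floating point round-off
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))


# Hypothetical coordinates (not taken from any real structure)
ca = (0.0, 0.0, 0.0)
cb = (1.0, 0.0, 0.0)
side_chain_atoms = [(1.0, 1.0, 0.0), (1.0, 2.0, 0.0), (1.0, 3.0, 0.0)]

side_chain_centroid = centroid(side_chain_atoms)
# Assumption: the angle is measured at CB, between CA and the centroid
angle = vertex_angle(ca, cb, side_chain_centroid)
print(side_chain_centroid, round(angle, 2))
```

In this toy case the centroid lies directly "above" CB, so the CA-CB-centroid angle is a right angle.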

## Imports

In [1]:
%load_ext autoreload

In [2]:
%autoreload 2

In [1]:
from pathlib import Path
import sys

import pandas as pd

sys.path.append('../..')
from kissim.auxiliary import KlifsMoleculeLoader, PdbChainLoader
from kissim.encoding import SideChainAngleFeature, SideChainAngleFeatureMol2

## IO paths

In [2]:
path_to_kinsim = Path('.') / '..' / '..'
path_to_data = path_to_kinsim / 'examples' / 'data'
path_to_results = path_to_kinsim / 'examples' / 'results' / 'features' / 'sca_centroid_wo_backbone' 

## Load KLIFS metadata

In [3]:
klifs_metadata = pd.read_csv(path_to_data / 'postprocessed' / 'klifs_metadata_postprocessed.csv', index_col=0)

In [4]:
klifs_metadata.shape

(3878, 23)

In [7]:
klifs_metadata_entry = klifs_metadata.iloc[100]
klifs_metadata_entry

metadata_index                                                            9842
kinase                                                                     ALK
family                                                                     ALK
groups                                                                      TK
pdb_id                                                                    6e0r
chain                                                                        A
alternate_model                                                              A
species                                                                  Human
ligand_orthosteric_name      N-[(1S)-1-(5-fluoropyridin-2-yl)ethyl]-1-(5-me...
ligand_orthosteric_pdb_id                                                  HKJ
ligand_allosteric_name                                                       -
ligand_allosteric_pdb_id                                                     -
dfg                                                 

## Load molecule

In [8]:
ml = KlifsMoleculeLoader(klifs_metadata_entry=klifs_metadata_entry)
molecule = ml.molecule

In [9]:
pdb = PdbChainLoader(klifs_metadata_entry=klifs_metadata_entry)
chain = pdb.chain

Check how long each loader takes:

In [10]:
%timeit KlifsMoleculeLoader(klifs_metadata_entry=klifs_metadata_entry)

19.7 ms ± 1.04 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [11]:
%timeit PdbChainLoader(klifs_metadata_entry=klifs_metadata_entry)

207 ms ± 8.97 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


## Side chain angle calculation

### Based on PDB

In [12]:
sca_pdb = SideChainAngleFeature()
%timeit sca_pdb.from_molecule(molecule, chain)

38.4 ms ± 1.41 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


### Based on MOL2

In [13]:
sca_mol2 = SideChainAngleFeatureMol2()
%timeit sca_mol2.from_molecule(molecule)

508 ms ± 23.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


### Compare

The coordinates from the two approaches are not comparable: KLIFS aligns the structures, so the MOL2 coordinates no longer match the original PDB coordinates. The angles, however, are comparable.
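
Why angles survive the alignment can be illustrated with a small sketch. It assumes that the alignment acts as a rigid-body transformation (a rotation plus a translation); the helpers `vertex_angle` and `rigid_transform` and all coordinates are hypothetical and not part of kissim.

```python
import math


def vertex_angle(a, b, c):
    """Angle in degrees at vertex b, between the rays b->a and b->c."""
    ba = [a[i] - b[i] for i in range(3)]
    bc = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(ba, bc))
    norm = math.sqrt(sum(x * x for x in ba)) * math.sqrt(sum(x * x for x in bc))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


def rigid_transform(p, theta, shift):
    """Rotate p about the z-axis by theta (radians), then translate by shift."""
    x, y, z = p
    xr = x * math.cos(theta) - y * math.sin(theta)
    yr = x * math.sin(theta) + y * math.cos(theta)
    return (xr + shift[0], yr + shift[1], z + shift[2])


# Hypothetical CA, CB and side chain centroid in the "original" frame
original = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.0, 1.2, 0.7)]
# The same three points after an assumed rigid alignment
aligned = [rigid_transform(p, math.radians(40.0), (10.0, -3.0, 5.0)) for p in original]

angle_original = vertex_angle(*original)
angle_aligned = vertex_angle(*aligned)
coordinates_match = all(
    math.isclose(u, v) for p, q in zip(original, aligned) for u, v in zip(p, q)
)
print(coordinates_match, math.isclose(angle_original, angle_aligned, abs_tol=1e-9))
```

The coordinates differ after the transformation, while the angle agrees up to floating point round-off.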

In [15]:
pd.concat([sca_pdb.features_verbose.sca, sca_mol2.features_verbose.sca], axis=1)

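| | sca (PDB) | sca (MOL2) |
|---|---|---|
| 1 | 154.58 | 154.58 |
| 2 | 180.00 | 180.00 |
| 3 | 126.57 | 126.57 |
| 4 | 180.00 | 180.00 |
| 5 | NaN | NaN |
| 6 | 180.00 | 180.00 |
| 11 | 129.41 | 129.41 |
| 12 | 112.13 | 112.14 |
| 13 | 147.03 | 147.03 |
| 14 | 128.73 | 128.73 |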
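
To quantify the agreement instead of judging it by eye, the two columns can be compared numerically. The sketch below hard-codes only the rows shown above, so it says nothing about the rest of the pocket; the helper `max_abs_difference` is hypothetical and not part of kissim.

```python
import math

# Angles copied from the rows shown above (missing values as NaN)
sca_pdb_values = [154.58, 180.00, 126.57, 180.00, math.nan,
                  180.00, 129.41, 112.13, 147.03, 128.73]
sca_mol2_values = [154.58, 180.00, 126.57, 180.00, math.nan,
                   180.00, 129.41, 112.14, 147.03, 128.73]


def max_abs_difference(xs, ys):
    """Largest absolute difference, skipping rows that are NaN in both lists."""
    diffs = []
    for x, y in zip(xs, ys):
        if math.isnan(x) and math.isnan(y):
            continue  # angle undefined in both approaches
        if math.isnan(x) or math.isnan(y):
            raise ValueError("angle defined in only one of the two columns")
        diffs.append(abs(x - y))
    return max(diffs)


largest = max_abs_difference(sca_pdb_values, sca_mol2_values)
print(round(largest, 2))
```

For the rows shown, the two approaches differ by at most about 0.01 degrees, which is the last decimal place displayed.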
