# The `kissim` API

The `kissim` API is split into two functions, `encode` and `compare`:

- `encode`: structures -> fingerprints
- `compare`: fingerprints -> feature distances and one fingerprint distance for each fingerprint pair (all-against-all comparison)

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
from pathlib import Path

from opencadd.databases.klifs import setup_local

from kissim.api import encode, compare



In [3]:
HERE = Path(_dh[-1])  # noqa: F821
DATA = HERE / "../../kissim/tests/data/KLIFS_download/"

## Get local KLIFS structures

We use the `opencadd.databases.klifs` module to access structures in our local KLIFS download.

In [4]:
klifs_session = setup_local(DATA)

In [5]:
structures = klifs_session.structures.all_structures()

In [6]:
structure_klifs_ids = structures["structure.klifs_id"].to_list()
print(f"Number of structures: {len(structure_klifs_ids)}")
print(*structure_klifs_ids)

Number of structures: 16
109 118 110 113 111 116 112 114 115 117 12347 1641 2542 3833 5399 9122


## Encode structures into fingerprints

The `encode` function generates fingerprints in bulk from a list of structure KLIFS IDs and returns a `FingerprintGenerator` object.

In [7]:
encode?

Signature:
encode(
    structure_klifs_ids,
    fingerprints_json_filepath=None,
    n_cores=1,
    local_klifs_download_path=None,
)
Docstring:
Encode structures.

Parameters
----------
structure_klifs_ids : list of int
    Structure KLIFS IDs.
fingerprints_json_filepath : str or pathlib.Path
    Path to output json file. Default None.
n_cores : int
    Number of cores used to generate fingerprints.
local_klifs_download_path : str or None
    If path to local KLIFS download is given, set up local KLIFS session.
    If None is given, set up remote KLIFS session.

Returns
-------
kissim.encoding.FingerprintGenerator
    Fingerprints.
File:      ~/Docum

In [8]:
fingerprint_generator = encode(
    structure_klifs_ids, fingerprints_json_filepath=None, n_cores=2, local_klifs_download_path=DATA
)

117: Local complex.pdb or pocket.pdb file missing: /home/dominique/Documents/GitHub/kissim/docs/tutorials/../../kissim/tests/data/KLIFS_download/HUMAN/ABL2/3gvu_altA_chainA/complex.pdb
117: Empty fingerprint (data unaccessible).
9122: Non-standard residue MSE is set to MET.
9122: Non-standard residue MSE is set to MET.
9122: Non-standard residue MSE is set to MET.
9122: Non-standard residue MSE is set to MET.
9122: Non-standard residue MSE is set to MET.
9122: Non-standard residue MSE is set to MET.
9122: Non-standard residue MSE is set to MET.
9122: Non-standard residue MSE is set to MET.
9122: Non-standard residue MSE is set to MET.
9122: Non-standard residue MSE is set to MET.
9122: Non-standard residue MSE is set to MET.
9122: Non-standard residue MSE is set to MET.


### `FingerprintGenerator` object

In [9]:
print(f"Number of fingerprints: {len(fingerprint_generator.data.keys())}")
fingerprint_generator

Number of fingerprints: 15


<kissim.encoding.fingerprint_generator.FingerprintGenerator at 0x7f4135b4d0d0>

In [14]:
fingerprint_generator.data[109].values_dict

{'physicochemical': {'size': [2.0,
   2.0,
   2.0,
   1.0,
   1.0,
   1.0,
   2.0,
   3.0,
   1.0,
   2.0,
   1.0,
   3.0,
   1.0,
   1.0,
   1.0,
   1.0,
   2.0,
   1.0,
   2.0,
   2.0,
   3.0,
   2.0,
   2.0,
   2.0,
   1.0,
   1.0,
   1.0,
   2.0,
   2.0,
   2.0,
   2.0,
   2.0,
   1.0,
   2.0,
   2.0,
   1.0,
   2.0,
   2.0,
   2.0,
   1.0,
   1.0,
   3.0,
   2.0,
   1.0,
   1.0,
   2.0,
   3.0,
   2.0,
   1.0,
   3.0,
   1.0,
   2.0,
   2.0,
   2.0,
   2.0,
   3.0,
   2.0,
   3.0,
   2.0,
   3.0,
   2.0,
   2.0,
   2.0,
   2.0,
   2.0,
   3.0,
   2.0,
   2.0,
   3.0,
   2.0,
   2.0,
   1.0,
   1.0,
   3.0,
   2.0,
   1.0,
   2.0,
   1.0,
   1.0,
   1.0,
   2.0,
   3.0,
   1.0,
   2.0,
   1.0],
  'hbd': [1.0,
   1.0,
   0.0,
   0.0,
   0.0,
   0.0,
   1.0,
   1.0,
   0.0,
   0.0,
   0.0,
   1.0,
   0.0,
   0.0,
   0.0,
   0.0,
   1.0,
   1.0,
   0.0,
   0.0,
   0.0,
   0.0,
   1.0,
   0.0,
   0.0,
   0.0,
   0.0,
   0.0,
   1.0,
   0.0,
   0.0,
   1.0,
   0.0,
   1.0,
   0.0,
   0.
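
The `values_dict` output above is a nested dictionary whose leaves are long lists of per-residue values (the `size` list shown holds 85 entries). A quick way to get an overview without printing every value is to walk the dictionary and report only the length of each list. The sketch below does this on a small toy dictionary; `leaf_lengths` and `toy_values_dict` are hypothetical names introduced for illustration and are not part of `kissim`, and the toy data only mimics the two keys visible in the output.

```python
def leaf_lengths(values, prefix=()):
    """Map each path of nested keys to the length of the list stored there."""
    lengths = {}
    for key, value in values.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            # Descend into nested dictionaries and keep track of the key path.
            lengths.update(leaf_lengths(value, path))
        else:
            # Leaves are assumed to be lists of feature values.
            lengths[path] = len(value)
    return lengths


# Toy stand-in for `fingerprint_generator.data[109].values_dict` (hypothetical data).
toy_values_dict = {
    "physicochemical": {
        "size": [2.0, 2.0, 1.0],
        "hbd": [1.0, 0.0, 0.0],
    }
}

for path, n_values in leaf_lengths(toy_values_dict).items():
    print(".".join(path), n_values)
```

With a real fingerprint, the same function could be called on `fingerprint_generator.data[109].values_dict` directly, assuming all leaves are lists as in the output shown.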

## Compare fingerprints

In [10]:
# flake8-noqa-cell
compare?

Signature:
compare(
    fingerprint_generator,
    output_path=None,
    n_cores=1,
    feature_weights=None,
)
Docstring:
Compare fingerprints (pairwise).

Parameters
----------
fingerprint_generator : kissim.encoding.FingerprintGenerator
    Fingerprints for KLIFS dataset.
output_path : str
    Path to output folder.
n_cores : int
    Number of cores used to generate fingerprint distances.
feature_weights : None or list of float
    Feature weights of the following form:
    (i) None
        Default feature weights: All features equally distributed to 1/15
        (15 features in total).
    (ii) By feature (list of 15 floats):
        Features to be set in th

In [11]:
feature_distances_generator, fingerprint_distance_generator = compare(fingerprint_generator)
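
To get a feeling for the size of an all-against-all comparison, the pairs can be enumerated by hand. The sketch below uses the structure KLIFS IDs from this notebook, drops structure 117 (its fingerprint came back empty during encoding), and lists every unordered pair once. Whether `kissim` builds its pairs with `itertools.combinations` internally is an assumption; the resulting order is merely consistent with the outputs shown in this notebook.

```python
from itertools import combinations

# Structure KLIFS IDs as printed earlier in this notebook.
structure_klifs_ids = [
    109, 118, 110, 113, 111, 116, 112, 114, 115, 117,
    12347, 1641, 2542, 3833, 5399, 9122,
]

# Structure 117 produced an empty fingerprint, so it is excluded here.
encoded_ids = [i for i in structure_klifs_ids if i != 117]

# Every unordered pair exactly once, without self-comparisons.
pairs = list(combinations(encoded_ids, 2))

n = len(encoded_ids)
print(f"Fingerprints: {n}")
print(f"Pairs: {len(pairs)} (n * (n - 1) / 2 = {n * (n - 1) // 2})")
print(f"First pair: {pairs[0]}, last pair: {pairs[-1]}")
```

The number of pairs grows quadratically with the number of fingerprints, which is why `compare` offers the `n_cores` parameter.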

### `FingerprintDistanceGenerator` object

For final fingerprint distances, please refer to the `FingerprintDistanceGenerator` object.

In [12]:
fingerprint_distance_generator.data

| | structure1 | structure2 | kinase1 | kinase2 | distance | coverage |
|---|---|---|---|---|---|---|
| 0 | 109 | 118 | ABL2 | ABL2 | 0.074214 | 0.992000 |
| 1 | 109 | 110 | ABL2 | ABL2 | 0.061968 | 0.986667 |
| 2 | 109 | 113 | ABL2 | ABL2 | 0.064064 | 0.984000 |
| 3 | 109 | 111 | ABL2 | ABL2 | 0.064064 | 0.984000 |
| 4 | 109 | 116 | ABL2 | ABL2 | 0.058630 | 0.978000 |
| ... | ... | ... | ... | ... | ... | ... |
| 100 | 2542 | 5399 | AKT1 | ALK | 0.340086 | 0.800667 |
| 101 | 2542 | 9122 | AKT1 | ADCK3 | 0.420406 | 0.814000 |
| 102 | 3833 | 5399 | AAK1 | ALK | 0.267515 | 0.976667 |
| 103 | 3833 | 9122 | AAK1 | ADCK3 | 0.303542 | 0.990000 |
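
The `compare` docstring mentions 15 features with default weights of 1/15 each. One plausible reading is that each value in the `distance` column is a weighted sum of the 15 feature distances of that pair. The sketch below illustrates that idea only; it is an assumption, not the `kissim` implementation, and `fingerprint_distance` as well as the toy inputs are hypothetical.

```python
N_FEATURES = 15  # number of features per fingerprint, per the `compare` docstring


def fingerprint_distance(feature_distances, feature_weights=None):
    """Combine per-feature distances into one value via a weighted sum (illustrative)."""
    if feature_weights is None:
        # Default from the docstring: all features weighted equally with 1/15.
        feature_weights = [1 / N_FEATURES] * N_FEATURES
    if len(feature_distances) != N_FEATURES or len(feature_weights) != N_FEATURES:
        raise ValueError(f"Expected {N_FEATURES} feature distances and weights.")
    return sum(d * w for d, w in zip(feature_distances, feature_weights))


# Toy feature distances (hypothetical values, not taken from the data above).
toy_feature_distances = [0.3] + [0.9] * 14

# Equal weights: every feature contributes the same share.
print(fingerprint_distance(toy_feature_distances))

# All weight on the first feature: only that feature's distance counts.
print(fingerprint_distance(toy_feature_distances, [1.0] + [0.0] * 14))
```

Shifting weight between features changes which pocket properties dominate the final distance, which is the purpose of the `feature_weights` parameter.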


### `FeatureDistancesGenerator` object

For more information about feature-specific distances, please refer to the `FeatureDistancesGenerator` object.

In [13]:
feature_distances_generator.data

{(109,
  118): <kissim.comparison.feature_distances.FeatureDistances at 0x7f4135b06190>,
 (109,
  110): <kissim.comparison.feature_distances.FeatureDistances at 0x7f4135b06220>,
 (109,
  113): <kissim.comparison.feature_distances.FeatureDistances at 0x7f4135b062b0>,
 (109,
  111): <kissim.comparison.feature_distances.FeatureDistances at 0x7f4135b06310>,
 (109,
  116): <kissim.comparison.feature_distances.FeatureDistances at 0x7f4135b06370>,
 (109,
  112): <kissim.comparison.feature_distances.FeatureDistances at 0x7f4135b063d0>,
 (109,
  114): <kissim.comparison.feature_distances.FeatureDistances at 0x7f4135b06430>,
 (109,
  115): <kissim.comparison.feature_distances.FeatureDistances at 0x7f4135b06490>,
 (109,
  12347): <kissim.comparison.feature_distances.FeatureDistances at 0x7f4135b064f0>,
 (109,
  1641): <kissim.comparison.feature_distances.FeatureDistances at 0x7f4135b06550>,
 (109,
  2542): <kissim.comparison.feature_distances.FeatureDistances at 0x7f4135b065b0>,
 (109,
  3833): <