# The `kissim` CLI

The `kissim` CLI has two subcommands, `encode` and `compare`, which mirror the logic of the `kissim` API:

- `encode`: structures > fingerprints
- `compare`: fingerprints > feature distances and fingerprint distance per fingerprint pair (all-against-all comparison)
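
To make "all-against-all" concrete: every unordered pair of fingerprints is compared once, so n fingerprints yield n(n-1)/2 pairs. The sketch below only illustrates this pair enumeration with the standard library. It is not `kissim`'s implementation, and the helper name and the example IDs are placeholders.

```python
from itertools import combinations


def all_against_all_pairs(structure_ids):
    """Return every unordered pair of IDs exactly once (hypothetical helper)."""
    return list(combinations(structure_ids, 2))


# Placeholder IDs standing in for structure KLIFS IDs
example_ids = [109, 110, 113, 118]
pairs = all_against_all_pairs(example_ids)
print(pairs)
print(len(pairs))  # n * (n - 1) / 2 = 6 pairs for n = 4
```

For the 15 fingerprints used later in this tutorial, the same formula gives 15 * 14 / 2 = 105 pairs, which matches the number of feature distances reported in the `kissim compare` log.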

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
from pathlib import Path

from kissim.encoding import FingerprintGenerator
from kissim.comparison import FeatureDistancesGenerator, FingerprintDistanceGenerator



In [3]:
HERE = Path(_dh[-1])

## Encode structures into fingerprints

In [4]:
%%bash
kissim encode -h
# flake8-noqa-cell

usage: kissim encode [-h] -i INPUT [INPUT ...] -o OUTPUT [-l LOCAL]
                     [-c NCORES]

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT [INPUT ...], --input INPUT [INPUT ...]
                        List of structure KLIFS IDs or path to txt file
                        containing structure KLIFS IDs.
  -o OUTPUT, --output OUTPUT
                        Path to output json file containing fingerprint data.
  -l LOCAL, --local LOCAL
                        Path to KLIFS download folder. If set local KLIFS data
                        is used, else remote KLIFS data.
  -c NCORES, --ncores NCORES
                        Number of cores. If 1 fingerprint generation in
                        sequence, else in parallel.


### Command

In [5]:
%%bash
kissim encode -i 109 118 110 113 111 116 112 114 115 117 12347 1641 2542 3833 5399 9122 -o fingerprints.json -l ../../kissim/tests/data/KLIFS_download/ -c 2
# flake8-noqa-cell

INFO:opencadd.databases.klifs.api:Set up local session...
INFO:opencadd.databases.klifs.local:Load overview.csv...
INFO:opencadd.databases.klifs.local:Load KLIFS_export.csv...
INFO:opencadd.databases.klifs.local:Merge both csv files...
INFO:opencadd.databases.klifs.local:Add paths to coordinate folders to structures...
INFO:opencadd.databases.klifs.local:Add KLIFS IDs to structures (uses remote since not available locally!)...
INFO:opencadd.databases.klifs.api:Local session is ready!
INFO:kissim.encoding.fingerprint_generator:GENERATE FINGERPRINTS
INFO:kissim.encoding.fingerprint_generator:Number of input structures: 16
INFO:kissim.encoding.fingerprint_generator:Fingerprint generation started at: 2021-03-18 14:53:12.840491
INFO:kissim.utils:Number of cores used: 2.
INFO:kissim.encoding.fingerprint_generator:109: Generate fingerprint...
INFO:kissim.encoding.fingerprint_generator:110: Generate fingerprint...
INFO:kissim.encoding.fingerprint_generator:113: Generate fingerprint...
INFO:kis

### Output

This command generates two files:

- `fingerprints.json`
- `fingerprint.log`

You can load the content of the `fingerprints.json` file as a `FingerprintGenerator` object.

In [6]:
fingerprints_path = HERE / "fingerprints.json"
fingerprints_path

PosixPath('/home/dominique/Documents/GitHub/kissim/docs/tutorials/fingerprints.json')

In [7]:
fingerprint_generator = FingerprintGenerator.from_json(fingerprints_path)
print(f"Number of fingerprints: {len(fingerprint_generator.data.keys())}")

Number of fingerprints: 15
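
The log above reports 16 input structures, but only 15 fingerprints were loaded. One way to see which structure IDs are missing is to compare the input IDs with the keys of the loaded data. The sketch below assumes that `fingerprint_generator.data` is a dictionary keyed by structure KLIFS ID (an assumption about the data layout, so check it against your `kissim` version). `missing_structure_ids` is a hypothetical helper, demonstrated here on a plain dictionary.

```python
def missing_structure_ids(input_ids, fingerprints):
    """Return input IDs that have no entry in the fingerprints mapping (hypothetical helper)."""
    available = set(fingerprints)
    return [structure_id for structure_id in input_ids if structure_id not in available]


# Stand-in for `fingerprint_generator.data`: ID -> fingerprint (placeholder values)
example_fingerprints = {109: "fp109", 118: "fp118"}
print(missing_structure_ids([109, 118, 110], example_fingerprints))
```

In the notebook, you would pass the IDs given to `kissim encode -i` together with `fingerprint_generator.data`, provided the keys have the same type as the input IDs.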


## Compare fingerprints

In [8]:
%%bash
kissim compare -h
# flake8-noqa-cell

usage: kissim compare [-h] -i INPUT -o OUTPUT
                      [-w WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS]
                      [-c NCORES]

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT, --input INPUT
                        Path to json file containing fingerprint data.
  -o OUTPUT, --output OUTPUT
                        Path to output folder where distance json files will
                        be saved.
  -w WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS, --weights WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS WEIGHTS
                        Feature weights. Eeach feature must be set
                        individually, all weights must sum up to 1.0.
  -c NCORES, --ncores NCORES
                        Number of cor

### Command

In [9]:
%%bash
kissim compare -i fingerprints.json -o . -c 2
# flake8-noqa-cell

INFO:kissim.comparison.feature_distances_generator:GENERATE FEATURE DISTANCES
INFO:kissim.comparison.feature_distances_generator:Number of input input fingerprints: 15
INFO:kissim.comparison.feature_distances_generator:Feature distances generation started at: 2021-03-18 14:53:31.598133
INFO:kissim.utils:Number of cores used: 2.
INFO:kissim.comparison.feature_distances_generator:Number of ouput feature distances: 105
INFO:kissim.comparison.feature_distances_generator:Runtime: 0:00:00.327817
INFO:kissim.comparison.fingerprint_distance_generator:GENERATE FINGERPRINT DISTANCES
INFO:kissim.comparison.fingerprint_distance_generator:Number of input feature distances: 105
INFO:kissim.comparison.fingerprint_distance_generator:Fingerprint distance generation started at: 2021-03-18 14:53:31.927612
INFO:kissim.utils:Number of cores used: 2.
INFO:kissim.comparison.fingerprint_distance_generator:Feature weights: [0.06666667 0.06666667 0.06666667 0.06666667 0.06666667 0.06666667
 0.06666667 0.0666666

### Output

This command generates two files:

- `feature_distances.json`
- `fingerprint_distances_WEIGHT-WEIGHT-...-WEIGHT.json` (`WEIGHT` refers to the per-feature weight as per-thousand value)
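
The file name convention can be sketched as follows. This is not `kissim`'s own code: `weights_to_filename` is a hypothetical helper, and truncating each per-thousand value to an integer is an assumption chosen because it reproduces the `66-66-...-66` name produced by equal weights in this tutorial. The checks for 15 weights summing to 1.0 follow the `kissim compare -h` text.

```python
import math


def weights_to_filename(weights):
    """Build a distances file name from feature weights (hypothetical helper)."""
    if len(weights) != 15:
        raise ValueError("Expected 15 feature weights.")
    if not math.isclose(sum(weights), 1.0):
        raise ValueError("Feature weights must sum up to 1.0.")
    # Assumption: per-thousand values are truncated, e.g. 0.0667 -> 66
    per_thousand = [str(int(weight * 1000)) for weight in weights]
    return f"fingerprint_distances_{'-'.join(per_thousand)}.json"


# Equal weights, as used by default in the command above
equal_weights = [1 / 15] * 15
print(weights_to_filename(equal_weights))
```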

You can load the content of

- the `feature_distances.json` file as a `FeatureDistancesGenerator` object and
- the `fingerprint_distances_WEIGHT-WEIGHT-...-WEIGHT.json` file as a `FingerprintDistanceGenerator` object.

#### Feature distances generator

In [10]:
feature_distances_path = HERE / "feature_distances.json"
feature_distances_path

PosixPath('/home/dominique/Documents/GitHub/kissim/docs/tutorials/feature_distances.json')

In [11]:
feature_distances_generator = FeatureDistancesGenerator.from_json(feature_distances_path)

In [12]:
feature_distances_generator.data

{(109,
  118): <kissim.comparison.feature_distances.FeatureDistances at 0x7fef926ad610>,
 (109,
  110): <kissim.comparison.feature_distances.FeatureDistances at 0x7fef319cb550>,
 (109,
  113): <kissim.comparison.feature_distances.FeatureDistances at 0x7fef319cb8e0>,
 (109,
  111): <kissim.comparison.feature_distances.FeatureDistances at 0x7fef319cb8b0>,
 (109,
  116): <kissim.comparison.feature_distances.FeatureDistances at 0x7fef319cb6a0>,
 (109,
  112): <kissim.comparison.feature_distances.FeatureDistances at 0x7fef319cba30>,
 (109,
  114): <kissim.comparison.feature_distances.FeatureDistances at 0x7fef319cb640>,
 (109,
  115): <kissim.comparison.feature_distances.FeatureDistances at 0x7fef319cb0d0>,
 (109,
  12347): <kissim.comparison.feature_distances.FeatureDistances at 0x7fef926a8e80>,
 (109,
  1641): <kissim.comparison.feature_distances.FeatureDistances at 0x7fef926a8fd0>,
 (109,
  2542): <kissim.comparison.feature_distances.FeatureDistances at 0x7fef926a8490>,
 (109,
  3833): <

#### Fingerprint distance generator

In [13]:
fingerprint_distance_path = list(HERE.glob("fingerprint_distances_*.json"))[0]
fingerprint_distance_path

PosixPath('/home/dominique/Documents/GitHub/kissim/docs/tutorials/fingerprint_distances_66-66-66-66-66-66-66-66-66-66-66-66-66-66-66.json')

In [14]:
fingerprint_distance_generator = FingerprintDistanceGenerator.from_json(fingerprint_distance_path)

In [15]:
fingerprint_distance_generator.data

| | structure1 | structure2 | kinase1 | kinase2 | distance | coverage |
|---|---|---|---|---|---|---|
| 0 | 109 | 118 | ABL2 | ABL2 | 0.074214 | 0.992000 |
| 1 | 109 | 110 | ABL2 | ABL2 | 0.061968 | 0.986667 |
| 2 | 109 | 113 | ABL2 | ABL2 | 0.064064 | 0.984000 |
| 3 | 109 | 111 | ABL2 | ABL2 | 0.064064 | 0.984000 |
| 4 | 109 | 116 | ABL2 | ABL2 | 0.058630 | 0.978000 |
| ... | ... | ... | ... | ... | ... | ... |
| 100 | 2542 | 5399 | AKT1 | ALK | 0.340086 | 0.800667 |
| 101 | 2542 | 9122 | AKT1 | ADCK3 | 0.420406 | 0.814000 |
| 102 | 3833 | 5399 | AAK1 | ALK | 0.267515 | 0.976667 |
| 103 | 3833 | 9122 | AAK1 | ADCK3 | 0.303542 | 0.990000 |
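
As a small illustration of how to read this table: a lower `distance` means more similar fingerprints. The sketch below copies the first five rows shown above into plain tuples and ranks them by distance. Working with tuples instead of the actual `.data` object is a simplification to keep the example self-contained.

```python
# (structure1, structure2, distance) copied from the first five rows above
rows = [
    (109, 118, 0.074214),
    (109, 110, 0.061968),
    (109, 113, 0.064064),
    (109, 111, 0.064064),
    (109, 116, 0.058630),
]

# Sort pairs from most to least similar (smallest distance first)
ranked = sorted(rows, key=lambda row: row[2])
closest = ranked[0]
print(f"Most similar pair: {closest[0]} and {closest[1]} (distance {closest[2]})")
```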


## Clean up output files

In [16]:
# Remove the json and log files generated in this tutorial
for path in [*HERE.glob("*.json"), *HERE.glob("*.log")]:
    path.unlink()