# Loading `kissim` results

This notebook shows how to load the following `kissim` output files as Python objects:

- `fingerprints.json`: Fingerprints for all successfully encoded structures
- `fingerprints_clean.json`: Fingerprints without outlier structures
- `feature_distances.csv`: Feature distances between all fingerprint pairs
- `fingerprint_distances.csv`: Fingerprint distances between all fingerprint pairs
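
Before loading, it can help to confirm that all four files are present in the results folder. The following is a minimal sketch, not part of `kissim`: the helper `missing_outputs` and the constant `EXPECTED_FILES` are hypothetical names, and the file names are the ones passed to the loaders in this notebook.

```python
from pathlib import Path

# File names as passed to the loaders in this notebook.
EXPECTED_FILES = (
    "fingerprints.json",
    "fingerprints_clean.json",
    "feature_distances.csv",
    "fingerprint_distances.csv",
)


def missing_outputs(results_dir):
    """Return the expected output files that are absent from results_dir."""
    results_dir = Path(results_dir)
    # Keep the order of EXPECTED_FILES so the report is stable.
    return [name for name in EXPECTED_FILES if not (results_dir / name).is_file()]
```

Calling `missing_outputs(RESULTS)` once `RESULTS` is defined should return an empty list if the run produced every file.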

In [1]:
from pathlib import Path

from kissim.encoding import FingerprintGenerator
from kissim.comparison import FeatureDistancesGenerator, FingerprintDistanceGenerator

from src.paths import PATH_RESULTS



In [2]:
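# `_dh` is IPython's directory history; its last entry is the current notebook directory.
HERE = Path(_dh[-1])  # noqa: F821
# Folder containing the `kissim` output files loaded below.
RESULTS = PATH_RESULTS / "all"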

## Load fingerprints

### Without outlier filtering

In [3]:
%%time
fingerprints = FingerprintGenerator.from_json(RESULTS / "fingerprints.json")
len(fingerprints.data)

CPU times: user 1.13 s, sys: 94.9 ms, total: 1.22 s
Wall time: 1.26 s


4681

### With outlier filtering

In [4]:
%%time
fingerprints = FingerprintGenerator.from_json(RESULTS / "fingerprints_clean.json")
len(fingerprints.data)

CPU times: user 1.41 s, sys: 83.6 ms, total: 1.5 s
Wall time: 1.53 s


4681

## Load feature distances

In [5]:
%%time
feature_distances = FeatureDistancesGenerator.from_csv(RESULTS / "feature_distances.csv")
len(feature_distances.data)

CPU times: user 37.5 s, sys: 1.71 s, total: 39.2 s
Wall time: 42.4 s


10953540

## Load fingerprint distances

In [6]:
%%time
fingerprint_distances = FingerprintDistanceGenerator.from_csv(
    RESULTS / "fingerprint_distances.csv"
)
len(fingerprint_distances.data)

CPU times: user 4.87 s, sys: 72 ms, total: 4.94 s
Wall time: 4.94 s


10953540
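
The counts above can be cross-checked with a little arithmetic. If each row of a distance table corresponds to one unordered pair of fingerprints (an assumption based on the file descriptions, not verified against the `kissim` source), then n fingerprints give n(n-1)/2 rows. The helper `n_pairs` is a hypothetical name used only for this illustration.

```python
from math import comb


def n_pairs(n_fingerprints):
    """Number of unordered pairs among n_fingerprints items: n * (n - 1) / 2."""
    return comb(n_fingerprints, 2)


# 4681 fingerprints, as reported by len(fingerprints.data) above
print(n_pairs(4681))  # 10953540
```

The result equals the row count reported for both the feature distances and the fingerprint distances.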