## scipy.cluster.hierarchy integration

IDendro is built to support SciPy's hierarchical clustering data structures (the linkage matrix and flat cluster assignments). As a result, using IDendro is as simple as passing it the outputs of `scipy.cluster.hierarchy.linkage` and `scipy.cluster.hierarchy.fcluster`.
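
For readers less familiar with these two structures, here is a minimal sketch of what they look like. The toy points and the threshold `t=2.0` are illustrative assumptions and are not part of the IDendro examples.

```python
import numpy as np
import scipy.cluster.hierarchy as sch

# Toy data: two well-separated groups of 2-D points (illustrative only)
points = np.array([
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0],
])

# Linkage matrix: one row per merge (n - 1 rows), four columns:
# [index of cluster a, index of cluster b, merge distance, size of merged cluster]
linkage_matrix = sch.linkage(points, method='single', metric='euclidean')

# Flat cluster assignments: one integer label (starting at 1) per observation
flat_clusters = sch.fcluster(linkage_matrix, t=2.0, criterion='distance')

print(linkage_matrix.shape)      # (5, 4)
print(len(set(flat_clusters)))   # 2
```

The last row of the linkage matrix describes the root merge, so its fourth column equals the number of observations.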

In [None]:
import os, sys
sys.path.insert(1, os.path.join(sys.path[0], '../..'))
import altair 
altair.renderers.set_embed_options(actions=False)
pass

In [10]:
import scipy.cluster.hierarchy as sch
from sklearn.datasets import load_iris
import idendro

# do the usual scipy hierarchical clustering
data = load_iris(as_frame=True)
linkage_matrix = sch.linkage(
    data['data'], method='single', metric='euclidean'
)
flat_clusters = sch.fcluster(
    linkage_matrix, t=0.8, criterion='distance'
)

# pass it to IDendro and visualize
cl_data = idendro.ClusteringData(
    linkage_matrix=linkage_matrix,
    cluster_assignments=flat_clusters
)

idd = idendro.IDendro()
idd.set_cluster_info(cl_data)

idd.create_dendrogram().plot(
    backend='altair',
    height=200, width=550
)

### Using previously created SciPy dendrogram objects

In some situations, you may have a dendrogram object created by SciPy that you want to visualize using IDendro. That's possible, too.

In [11]:
# create a SciPy dendrogram object
D = sch.dendrogram(
    linkage_matrix, 
    p=4, truncate_mode="level",
    no_plot=True
)

If you only have the dendrogram object (and not the underlying linkage matrix), IDendro cannot compute or plot the nodes of the dendrogram, but you can still use the available backends.
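
To see why less information is available in this case, it helps to look at what SciPy actually returns. The sketch below uses random toy data (an assumption for illustration) and inspects the dictionary produced by `sch.dendrogram(..., no_plot=True)`. It holds drawing coordinates and leaf labels rather than the merge table itself, which is one plausible reading of why node information cannot be recovered from it.

```python
import numpy as np
import scipy.cluster.hierarchy as sch

# Random toy data, used only to obtain a dendrogram dictionary
rng = np.random.default_rng(0)
points = rng.normal(size=(20, 2))
Z = sch.linkage(points, method='single', metric='euclidean')

D = sch.dendrogram(Z, p=2, truncate_mode='level', no_plot=True)

# The result is a plain dict of plotting data
print(sorted(D.keys()))

# 'icoord' and 'dcoord' hold four x- and four y-coordinates per drawn link;
# 'ivl' holds the leaf labels and 'leaves' the matching leaf ids
n_links = len(D['icoord'])
print(n_links, 'links drawn;', Z.shape[0], 'merges in the linkage matrix')
```

Because the dendrogram is truncated, it draws fewer links than the linkage matrix has merges, so the original hierarchy is only partially represented.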

In [12]:
# pass it to IDendro and visualize
idd = idendro.IDendro()
idd.convert_scipy_dendrogram(D, compute_nodes=False).plot(
    backend='altair',
    height=200, width=550,
    show_nodes=False
)

Not all customization functionality is available in this mode either, so in most cases it is recommended that you generate the dendrogram with IDendro itself.

## Scikit-learn Agglomerative Clustering integration

To use [scikit-learn agglomerative clustering](https://scikit-learn.org/stable/auto_examples/cluster/plot_agglomerative_dendrogram.html) outputs, wrap the fitted model with `idendro.ScikitLearnClusteringData` before passing it to IDendro.

In [5]:
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import load_iris
import idendro

data = load_iris(as_frame=True)

# do the usual scikit-learn hierarchical clustering
model = AgglomerativeClustering(
    distance_threshold=0.8,
    linkage='single',
    n_clusters=None
).fit(data['data'])

# pass it to IDendro and visualize
idd = idendro.IDendro()
idd.set_cluster_info(idendro.ScikitLearnClusteringData(model))

idd.create_dendrogram().plot(
    backend='altair',
    height=200, width=550
)
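
For context, a fitted `AgglomerativeClustering` model stores its hierarchy in the `children_` and `distances_` attributes rather than in a SciPy linkage matrix. The sketch below follows the approach of the scikit-learn example linked above to assemble such a matrix by hand. The helper name `linkage_from_sklearn` is hypothetical, and whether `idendro.ScikitLearnClusteringData` performs the same conversion internally is an assumption, not something stated here.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import load_iris


def linkage_from_sklearn(model):
    """Assemble a SciPy-style linkage matrix from a fitted model (hypothetical helper)."""
    n_samples = len(model.labels_)
    counts = np.zeros(model.children_.shape[0])
    for i, (left, right) in enumerate(model.children_):
        for child in (left, right):
            # indices below n_samples are original observations;
            # larger ones refer to clusters formed in earlier merges
            counts[i] += 1 if child < n_samples else counts[child - n_samples]
    return np.column_stack(
        [model.children_, model.distances_, counts]
    ).astype(float)


X = load_iris().data
model = AgglomerativeClustering(
    distance_threshold=0.8, linkage='single', n_clusters=None
).fit(X)

Z = linkage_from_sklearn(model)
print(Z.shape)  # (149, 4)
```

Note that `distances_` is only populated when `distance_threshold` is set or `compute_distances=True` is passed, which is why the model above is fitted with a threshold.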

## HDBSCAN integration

IDendro can also visualize [HDBSCAN](https://hdbscan.readthedocs.io/) clustering results. Simply wrap the fitted HDBSCAN model with `idendro.HDBSCANClusteringData` before passing it as clustering information. The model object is then available via `cluster_data.get_model()` in all callback functions (see the case studies for ideas on how it can be leveraged).

In [6]:
import hdbscan
clusterer = hdbscan.HDBSCAN()
clusterer.fit(data['data'])

# pass it to IDendro and visualize
idd = idendro.IDendro()
idd.set_cluster_info(idendro.HDBSCANClusteringData(clusterer))

idd.create_dendrogram().plot(
    backend='altair',
    height=200, width=550
)