## Clustering

This example shows how to perform dimensionality reduction and visualize 
any resulting clusters. It also shows how to run common preprocessing steps 
with `preprocess.clean_data`.
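
To give a feel for what such a cleaning step typically involves, here is a minimal pandas sketch. It is not the `hcitools` implementation: the function `clean_features` and its behaviour (dropping feature columns with missing values, dropping features at or below a variance threshold, and dropping one feature from each highly correlated pair) are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd


def clean_features(data, metacols, var_thresh=0.0, corr_thresh=0.9):
    """Hypothetical stand-in for a cleaning step (not hcitools code)."""
    meta = data[metacols]
    feats = data.drop(columns=metacols).select_dtypes("number")

    # 1. Drop feature columns that contain missing values (assumed behaviour)
    feats = feats.dropna(axis=1)

    # 2. Drop features whose variance is at or below the threshold
    feats = feats.loc[:, feats.var() > var_thresh]

    # 3. For each highly correlated pair, drop the later feature
    corr = feats.corr().abs()
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    redundant = [c for c in upper.columns if (upper[c] > corr_thresh).any()]
    feats = feats.drop(columns=redundant)

    return pd.concat([meta, feats], axis=1), redundant


# Toy data, invented for this sketch
toy = pd.DataFrame({
    "Well": ["A1", "A2", "B1", "B2"],
    "f1": [1.0, 2.0, 3.0, 4.0],
    "f2": [2.0, 4.0, 6.0, 8.0],     # perfectly correlated with f1
    "f3": [5.0, 5.0, 5.0, 5.0],     # zero variance
    "f4": [1.0, np.nan, 2.0, 3.0],  # contains a missing value
    "f5": [4.0, 1.0, 3.0, 2.0],
})
cleaned, redundant = clean_features(toy, metacols=["Well"])
print(list(cleaned.columns), redundant)
```

On the toy table only `f1` and `f5` survive as features; `f2` is reported as redundant because it duplicates the information in `f1`.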

### Datasets

This example uses the `ros-mito` dataset, which contains features 
extracted from high-content images.

In [1]:
# Imports
from hcitools import datasets, plot, analysis, preprocess

# Load dataset
ros = datasets.load_dataset('ros-mito')

# Plotly renderer (keep whichever call matches your environment)
plot.set_renderer('notebook')  # Use this when running the notebook
plot.set_renderer('iframe_connected')  # Use this when rendering the docs

In [2]:
# Preprocessing
meta = ['Well', 'Row', 'Column', 'Timepoint', 'Compound', 'Conc']
df, dropped, LOG = preprocess.clean_data(
    data=ros,
    metacols=meta,
    dropna=True,
    drop_low_var=0.0,
    corr_thresh=0.9,
    verbose=True
)
df = df.set_index(meta)

# Reduce dimensionality with PCA and t-SNE using default arguments
proj, expvar = analysis.dim_reduction(data=df, method=['pca', 'tsne'])
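
For readers who want to see what a projection and its explained variance represent, the following NumPy sketch computes PCA via a singular value decomposition. It is an illustration under stated assumptions, not how `analysis.dim_reduction` is implemented: the function `pca_sketch`, the standardization step, and the `PC1`, `PC2`, ... column names are all hypothetical, and t-SNE is not covered here.

```python
import numpy as np
import pandas as pd


def pca_sketch(features, n_comps=3):
    """Hypothetical PCA via SVD (not hcitools code)."""
    # Standardize each feature to zero mean and unit variance.
    # Assumes zero-variance features were already removed.
    X = features.to_numpy(dtype=float)
    X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

    # SVD of the standardized matrix; rows of Vt are the principal axes
    U, S, Vt = np.linalg.svd(X, full_matrices=False)

    # Coordinates of each sample along the first n_comps axes
    scores = U[:, :n_comps] * S[:n_comps]
    proj = pd.DataFrame(
        scores,
        index=features.index,
        columns=[f"PC{i + 1}" for i in range(n_comps)],
    )

    # Fraction of the total variance captured by each component
    expvar = (S ** 2 / np.sum(S ** 2))[:n_comps]
    return proj, expvar


# Random toy features, invented for this sketch
rng = np.random.default_rng(0)
toy_features = pd.DataFrame(rng.normal(size=(20, 5)), columns=list("abcde"))
toy_proj, toy_expvar = pca_sketch(toy_features, n_comps=3)
print(toy_proj.shape, toy_expvar)
```

The explained-variance fractions are sorted from largest to smallest, which is why the first few components are usually the ones worth plotting.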

In [3]:
# Plot PCA components
fig = plot.pca_comps(proj, expvar, n_comps=3)
fig.update_layout(width=700, height=400)

fig.show()

In [4]:
# Compare two compounds in the t-SNE projection
fig = plot.clusters(proj, 'Sorafenib Tosylate', 'Imatinib mesylate', 'tsne')
fig.update_layout(width=750, height=450)

fig.show()
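
If you want to inspect the underlying points for a pair of compounds yourself, one option is to filter a metadata-indexed table by its `Compound` level. The sketch below uses an invented table; the index layout and the `tSNE1`/`tSNE2` column names are assumptions and may not match what `analysis.dim_reduction` returns.

```python
import pandas as pd

# Hypothetical projection table indexed by metadata (invented values)
index = pd.MultiIndex.from_tuples(
    [
        ("A1", "Sorafenib Tosylate"),
        ("A2", "Imatinib mesylate"),
        ("A3", "DMSO"),
        ("A4", "Sorafenib Tosylate"),
    ],
    names=["Well", "Compound"],
)
toy_tsne = pd.DataFrame(
    {"tSNE1": [0.1, 1.2, -0.5, 0.3], "tSNE2": [0.9, -0.4, 0.0, 1.1]},
    index=index,
)

# Keep only the rows belonging to the two compounds of interest
compounds = ["Sorafenib Tosylate", "Imatinib mesylate"]
mask = toy_tsne.index.get_level_values("Compound").isin(compounds)
subset = toy_tsne[mask]

# Number of points per compound in the filtered table
counts = subset.groupby(level="Compound").size()
print(counts)
```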