# Iris Species Analysis with interdim

In this notebook, we demonstrate how to use the `interdim` package to analyze and visualize data, using the Iris dataset as a simple example. We'll perform dimensionality reduction, clustering, and interactive visualization of the data.

The workflow has three steps: load the data, create an `InterDimAnalysis` object, and call its methods to reduce, cluster, score, and visualize the data.

In [1]:
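from sklearn.datasets import load_iris
from interdim import InterDimAnalysis

# Load the Iris dataset: 150 samples, 4 features, 3 species.
iris = load_iris()

# Pass the true species labels so the clustering can be scored against them.
analysis = InterDimAnalysis(iris.data, true_labels=iris.target, verbose=True)
# Reduce the 4 features to 3 dimensions with t-SNE.
analysis.reduce(method='tsne', n_components=3)
# Cluster with DBSCAN using its default arguments.
analysis.cluster(method='dbscan')
# Compare the clusters with the true labels using the adjusted Rand index.
analysis.score(method='adjusted_rand')
# Launch the interactive app on the reduced data, with a bar plot for individual points.
analysis.show(which_data='reduced', point_visualization='bar')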

Performing dimensionality reduction via TSNE with default arguments.
Reduced data shape: (150, 3)
Performing clustering via DBSCAN with default arguments.
Clustering complete. Number of clusters: 1
Clustering Evaluation Result (adjusted_rand):
Score: 0.0


<dash.dash.Dash at 0x7488bc16a930>
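
The score of 0.0 in the output above is consistent with DBSCAN reporting a single cluster: if every point receives the same label, the adjusted Rand index is zero no matter what the true labels are. The following is a minimal sketch in plain Python, assuming that `adjusted_rand` refers to the standard adjusted Rand index. The `adjusted_rand_index` helper and the synthetic label lists are illustrative and are not part of `interdim`.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index from the contingency counts of two labelings.

    Illustrative helper (not part of interdim); expects at least two samples.
    """
    n = len(labels_true)
    # Count pairs of samples grouped together in both labelings,
    # in the true labels only, and in the predicted labels only.
    sum_both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    # Agreement expected by chance, and the maximum attainable agreement.
    expected = sum_true * sum_pred / comb(n, 2)
    maximum = (sum_true + sum_pred) / 2
    if maximum == expected:
        # Degenerate case: both labelings are trivial, so they agree perfectly.
        return 1.0
    return (sum_both - expected) / (maximum - expected)


# Three balanced classes of 50 samples, mirroring the layout of the Iris targets.
true_labels = [0] * 50 + [1] * 50 + [2] * 50
# A clustering that assigns every sample to the same cluster.
single_cluster = [0] * 150

print(adjusted_rand_index(true_labels, single_cluster))  # 0.0
print(adjusted_rand_index(true_labels, true_labels))     # 1.0
```

A score near 1.0 would indicate that the clusters match the species labels, so a 0.0 here suggests the default DBSCAN arguments are a poor fit for this embedding rather than a problem with the data.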

To customize the bar plot, pass an `InteractionPlot` object instead of the string `'bar'`. In the example below, `trace_kwargs` labels the bars with the Iris feature names.

In [2]:
from interdim import InterDimAnalysis, InteractionPlot

point_visualization = InteractionPlot(iris.data, plot_type='bar', trace_kwargs={'x': iris.feature_names})
analysis.show(which_data='reduced', point_visualization=point_visualization)

<dash.dash.Dash at 0x7488b7130350>

Feel free to try other reduction methods (we find that UMAP tends to work best), clustering methods, and visualization types.