Merged
29 changes: 29 additions & 0 deletions plots/silhouette-basic/specification.md
@@ -0,0 +1,29 @@
# silhouette-basic: Silhouette Plot

## Description

A silhouette plot visualizes clustering quality by showing the silhouette coefficient of each sample, grouped by cluster assignment. Each horizontal bar represents one sample's score, which ranges from -1 to 1: values near 1 indicate that the sample sits well inside its cluster, values near 0 indicate that it lies close to the boundary between two clusters, and negative values suggest that it may have been assigned to the wrong cluster. The plot helps evaluate cluster cohesion (how close samples are to the other members of their own cluster) and separation (how far they are from the nearest neighboring cluster).
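
As a rough illustration of the coefficient itself, the sketch below computes it in pure Python using the standard definition `s = (b - a) / max(a, b)`, where `a` is the mean distance from a sample to the other members of its own cluster and `b` is the smallest mean distance to the members of any other cluster. The function name `silhouette_values`, the Euclidean metric, and the convention of scoring singleton clusters as 0 are assumptions made for this example, not part of the specification.

```python
import math


def silhouette_values(samples, labels):
    """Return one silhouette coefficient per sample (illustrative, O(n^2))."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        raise ValueError("the silhouette coefficient needs at least two clusters")

    scores = []
    for i, (point, own) in enumerate(zip(samples, labels)):
        # Accumulate distances from this sample to the members of each cluster.
        sums = {c: 0.0 for c in clusters}
        counts = {c: 0 for c in clusters}
        for j, (other, lab) in enumerate(zip(samples, labels)):
            if i == j:
                continue  # a sample is not compared with itself
            sums[lab] += math.dist(point, other)  # Euclidean (assumed metric)
            counts[lab] += 1

        if counts[own] == 0:
            scores.append(0.0)  # singleton cluster: scored 0 (assumed convention)
            continue

        a = sums[own] / counts[own]  # cohesion: mean intra-cluster distance
        b = min(sums[c] / counts[c] for c in clusters if c != own)  # separation
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores
```

For two tight, well-separated groups every score comes out close to 1, while a sample labeled with a distant cluster gets a negative score.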

## Applications

- Evaluating K-means, hierarchical, or other clustering algorithm results
- Comparing different numbers of clusters to find the optimal value of k
- Identifying poorly clustered or potentially misclassified samples
- Validating cluster assignments before downstream analysis

## Data

- `samples` (numeric) - feature vectors for each data point to be clustered
- `cluster_labels` (integer) - cluster assignment for each sample (0 to k-1)
- `silhouette_values` (numeric) - silhouette coefficient per sample (-1 to 1)
- Size: 50-500 samples in 2-10 clusters, so that individual bars stay readable
- Example: clustering iris dataset into 3 species groups

## Notes

- Display horizontal bars for each sample's silhouette score, sorted within each cluster
- Group samples by cluster with distinct colors per cluster
- Include a vertical line at the average silhouette score across all samples for reference
- Annotate each cluster section with its average silhouette score
- Use `sklearn.metrics.silhouette_samples` to compute the per-sample scores
- Consistently high scores (close to 1) within a cluster indicate a well-separated group
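
The layout rules in these notes can be sketched independently of any plotting library. The example below is a minimal sketch, assuming the per-sample scores have already been computed (for instance with `sklearn.metrics.silhouette_samples`); the function name `silhouette_layout` and the `gap` parameter are hypothetical and exist only for illustration.

```python
def silhouette_layout(labels, values, gap=10):
    """Return (bars, cluster_means, overall_mean) for a silhouette plot.

    bars is a list of (y_position, value, cluster) tuples: samples are
    grouped by cluster, sorted by score within each cluster, and clusters
    are separated by `gap` empty rows.
    """
    if len(labels) != len(values):
        raise ValueError("labels and values must have the same length")
    if not values:
        raise ValueError("at least one sample is required")

    # Group scores by cluster label.
    by_cluster = {}
    for label, value in zip(labels, values):
        by_cluster.setdefault(label, []).append(value)

    bars = []
    cluster_means = {}
    y = gap
    for cluster in sorted(by_cluster):
        ordered = sorted(by_cluster[cluster])  # sort within the cluster
        for value in ordered:
            bars.append((y, value, cluster))
            y += 1
        cluster_means[cluster] = sum(ordered) / len(ordered)  # per-cluster annotation
        y += gap  # blank rows between cluster sections

    overall_mean = sum(values) / len(values)  # position of the vertical line
    return bars, cluster_means, overall_mean
```

With matplotlib, for example, each `(y_position, value, cluster)` tuple could be drawn with `Axes.barh` using one color per cluster, and `overall_mean` could be passed to `Axes.axvline` for the reference line.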
28 changes: 28 additions & 0 deletions plots/silhouette-basic/specification.yaml
@@ -0,0 +1,28 @@
# Specification-level metadata for silhouette-basic
# Auto-synced to PostgreSQL on push to main

spec_id: silhouette-basic
title: Silhouette Plot

# Specification tracking
created: 2025-12-26T19:28:09Z
updated: null
issue: 2334
suggested: MarkusNeusinger

# Classification tags (applies to all library implementations)
# See docs/concepts/tagging-system.md for detailed guidelines
tags:
  plot_type:
    - silhouette
    - bar
  data_type:
    - numeric
    - categorical
  domain:
    - statistics
    - machine-learning
  features:
    - basic
    - clustering
    - evaluation