```{contents}
```

# Performance Metrics

Performance of **DBSCAN** is evaluated with cluster quality metrics, since it does not optimize an internal cost function.

---

### **1. Internal evaluation metrics (no ground truth required)**

These assess clustering quality using only the data and the predicted cluster labels.

* **Silhouette Coefficient**

  $$
  s(i) = \frac{b(i) - a(i)}{\max(a(i), b(i))}
  $$

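  * $a(i)$: mean distance from point $i$ to the other points in its own cluster.
  * $b(i)$: mean distance from point $i$ to the points of the nearest other cluster (the smallest such mean over all other clusters).
  * Range: $[-1, 1]$. Values near 1 indicate compact, well-separated clusters; values near 0 indicate overlapping clusters; negative values suggest the point may be assigned to the wrong cluster.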

* **Davies–Bouldin Index (DBI)**

  $$
  DBI = \frac{1}{k} \sum_{i=1}^k \max_{j \neq i} \frac{s_i + s_j}{d_{ij}}
  $$

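  * $k$: number of clusters.
  * $s_i$: mean distance from the points of cluster $i$ to its centroid (within-cluster scatter).
  * $d_{ij}$: distance between the centroids of clusters $i$ and $j$.
  * Lower = better (compact and well-separated clusters); the minimum is 0.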

* **Calinski–Harabasz Index**

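  $$
  CH = \frac{\text{Between-cluster dispersion} / (k-1)}{\text{Within-cluster dispersion} / (n-k)}
  $$

  * Between-cluster dispersion: size-weighted sum of squared distances from each cluster centroid to the overall centroid.
  * Within-cluster dispersion: sum of squared distances from each point to its cluster centroid.
  * $n$: number of points; $k$: number of clusters.
  * Higher = better separation. The index is unbounded above.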
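
As a concrete illustration of the three formulas above, the following is a minimal pure-Python sketch that evaluates them on a toy dataset. The function names (`silhouette_values`, `davies_bouldin`, `calinski_harabasz`), the choice of Euclidean distance, and the four sample points are assumptions made for this example, not part of any library. In practice, tested implementations such as scikit-learn's `silhouette_score`, `davies_bouldin_score`, and `calinski_harabasz_score` are the usual choice.

```python
from math import dist
from statistics import fmean


def group_by_label(points, labels):
    """Map each cluster label to the list of points carrying it."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    return clusters


def centroid(cluster):
    """Coordinate-wise mean of a list of points."""
    return tuple(fmean(coords) for coords in zip(*cluster))


def silhouette_values(points, labels):
    """Per-point silhouette s(i) = (b - a) / max(a, b), Euclidean distance."""
    clusters = group_by_label(points, labels)
    values = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            values.append(0.0)  # common convention for singleton clusters
            continue
        # a(i): mean distance to the OTHER members of the same cluster
        # (dist(p, p) = 0, so summing over the whole cluster is harmless).
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b(i): smallest mean distance to the members of any other cluster.
        b = min(
            fmean([dist(p, q) for q in other])
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


def davies_bouldin(points, labels):
    """Mean over clusters of the worst (s_i + s_j) / d_ij ratio."""
    clusters = group_by_label(points, labels)
    cents = {lab: centroid(c) for lab, c in clusters.items()}
    scatter = {
        lab: fmean([dist(p, cents[lab]) for p in c]) for lab, c in clusters.items()
    }
    worst = [
        max(
            (scatter[i] + scatter[j]) / dist(cents[i], cents[j])
            for j in clusters
            if j != i
        )
        for i in clusters
    ]
    return fmean(worst)


def calinski_harabasz(points, labels):
    """Ratio of between- to within-cluster dispersion, each scaled by its d.o.f."""
    clusters = group_by_label(points, labels)
    n, k = len(points), len(clusters)
    overall = centroid(points)
    between = sum(len(c) * dist(centroid(c), overall) ** 2 for c in clusters.values())
    within = sum(dist(p, centroid(c)) ** 2 for c in clusters.values() for p in c)
    return (between / (k - 1)) / (within / (n - k))


# Toy data: two tight pairs of points, far apart (noise already excluded).
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labels = [0, 0, 1, 1]

print("mean silhouette:", round(fmean(silhouette_values(points, labels)), 3))
print("Davies-Bouldin:", round(davies_bouldin(points, labels), 3))
print("Calinski-Harabasz:", round(calinski_harabasz(points, labels), 3))
```

All three functions assume at least two clusters; with a single cluster the `min`/`max` calls and the division by $k-1$ are undefined, which mirrors the fact that these metrics are not meaningful in that case.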

---

### **2. External evaluation metrics (ground truth available)**

* **Adjusted Rand Index (ARI)**
  Measures agreement between predicted clusters and true labels.

  $$
  ARI = \frac{\text{Index} - \text{Expected Index}}{\text{Max Index} - \text{Expected Index}}
  $$

  Maximum: 1 (perfect match). Values near 0 are expected for random labelings, and negative values indicate agreement worse than chance. The lower bound is $-0.5$, not $-1$.

* **Normalized Mutual Information (NMI)**
  Based on information theory.

  $$
  NMI = \frac{2 \cdot I(Y;C)}{H(Y) + H(C)}
  $$

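  where $Y$ denotes the true labels, $C$ the cluster assignments, $I(Y;C)$ their mutual information, and $H$ entropy. This form normalizes by the arithmetic mean of the two entropies.
  Range: $[0,1]$. Higher = better.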

* **Homogeneity, Completeness, V-measure**

  * Homogeneity: each cluster contains only members of one class.
  * Completeness: all members of a class are in the same cluster.
  * V-measure: harmonic mean of homogeneity and completeness. All three scores lie in $[0, 1]$; higher = better.
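
To make the external metrics above concrete, here is a minimal pure-Python sketch that computes them from label counts. The function names and the two example labelings are assumptions chosen for illustration; homogeneity and completeness are computed as $I(Y;C)/H(Y)$ and $I(Y;C)/H(C)$, which is one standard formulation. Library routines such as scikit-learn's `adjusted_rand_score` and `normalized_mutual_info_score` are the usual choice in practice.

```python
from collections import Counter
from math import comb, log


def adjusted_rand_index(y_true, y_pred):
    """ARI = (Index - Expected) / (Max - Expected), from pair counts."""
    n = len(y_true)
    index = sum(comb(c, 2) for c in Counter(zip(y_true, y_pred)).values())
    sum_true = sum(comb(c, 2) for c in Counter(y_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(y_pred).values())
    expected = sum_true * sum_pred / comb(n, 2)
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:  # degenerate case, e.g. a single cluster in both
        return 1.0
    return (index - expected) / (max_index - expected)


def entropy(labels):
    """Shannon entropy (natural log) of a labeling."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(y_true, y_pred):
    """I(Y;C) from the joint and marginal label counts."""
    n = len(y_true)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    joint = Counter(zip(y_true, y_pred))
    return sum(
        (c / n) * log(n * c / (true_counts[t] * pred_counts[p]))
        for (t, p), c in joint.items()
    )


def normalized_mutual_info(y_true, y_pred):
    """NMI = 2 I(Y;C) / (H(Y) + H(C))."""
    total = entropy(y_true) + entropy(y_pred)
    return 2 * mutual_information(y_true, y_pred) / total if total > 0 else 1.0


def homogeneity_completeness_v(y_true, y_pred):
    """Homogeneity, completeness, and their harmonic mean (V-measure)."""
    mi = mutual_information(y_true, y_pred)
    h_true, h_pred = entropy(y_true), entropy(y_pred)
    hom = mi / h_true if h_true > 0 else 1.0
    com = mi / h_pred if h_pred > 0 else 1.0
    v = 2 * hom * com / (hom + com) if hom + com > 0 else 0.0
    return hom, com, v


y_true = [0, 0, 0, 1, 1, 1]
relabeled = [1, 1, 1, 0, 0, 0]   # same partition, different label names
over_split = [0, 0, 1, 1, 2, 2]  # three clusters cutting across two classes

for name, y_pred in [("relabeled", relabeled), ("over_split", over_split)]:
    print(
        name,
        round(adjusted_rand_index(y_true, y_pred), 3),
        round(normalized_mutual_info(y_true, y_pred), 3),
        [round(x, 3) for x in homogeneity_completeness_v(y_true, y_pred)],
    )
```

The relabeled case shows that these scores depend only on the partition, not on the label values. With the definitions used here, the V-measure coincides with the NMI formula given above.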

---

### **3. DBSCAN-specific considerations**

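* **Noise points**: DBSCAN labels noise as $-1$, and generic metric implementations do not treat that label specially. For example, scikit-learn's `silhouette_score` handles $-1$ as one more cluster, which distorts the score. Either exclude noise points before computing internal metrics or score them deliberately, and report the noise fraction alongside the metric, since a configuration that discards most points as noise can still score well on the remainder.
* **Cluster count flexibility**: DBSCAN may return different numbers of clusters depending on $\epsilon$ and minPts. Evaluation metrics help tune these parameters; note that most internal metrics are defined only when at least two clusters are found.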
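
One way to handle the noise label before scoring is sketched below. The helper name `summarize_dbscan_labels`, its return keys, and the sample labels are hypothetical and chosen for this example; the only convention taken from DBSCAN is that noise is labeled $-1$.

```python
def summarize_dbscan_labels(points, labels, noise_label=-1):
    """Separate noise from clustered points and report basic counts."""
    kept = [(p, lab) for p, lab in zip(points, labels) if lab != noise_label]
    kept_points = [p for p, _ in kept]
    kept_labels = [lab for _, lab in kept]

    n = len(labels)
    n_clusters = len(set(kept_labels))
    noise_fraction = (n - len(kept)) / n if n else 0.0

    # Silhouette-style metrics need at least 2 clusters and
    # fewer clusters than remaining points.
    scorable = 2 <= n_clusters < len(kept_labels)

    return {
        "points": kept_points,
        "labels": kept_labels,
        "n_clusters": n_clusters,
        "noise_fraction": noise_fraction,
        "scorable": scorable,
    }


# Hypothetical DBSCAN output: two clusters and one noise point.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0), (50.0, 50.0)]
labels = [0, 0, 1, 1, -1]

summary = summarize_dbscan_labels(points, labels)
print(summary["n_clusters"], summary["noise_fraction"], summary["scorable"])
```

The filtered `points` and `labels` can then be passed to an internal metric when `scorable` is true. Reporting `noise_fraction` next to the score keeps comparisons across $\epsilon$ and minPts settings honest, because a setting that marks most of the data as noise is otherwise hidden by a good score on the few remaining points.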

