```{contents}
```

# Assumptions

## **1.1 Meaningful Distance Metric**

* **Assumption:** The distance metric you choose (Euclidean, Manhattan, Cosine, etc.) **accurately reflects similarity between points**.
* **Implication:**

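  * Every merge decision is based on these distances, so if they do not represent similarity well, the algorithm will merge or split clusters incorrectly.
  * Example: Euclidean distance assumes **continuous numerical features** and is sensitive to scale, while cosine distance compares direction only and ignores magnitude.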
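
To make the effect of the metric concrete, here is a minimal sketch using SciPy's `pdist`. The three points and the helper `nearest_to_first` are made up for illustration; the only claim is that the nearest neighbour of the same point changes with the metric.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Toy points (hypothetical): b has the same direction as a but is much longer;
# c is close to a in space but points in a different direction.
X = np.array([
    [1.0, 1.0],    # a
    [10.0, 10.0],  # b
    [1.0, 0.0],    # c
])

def nearest_to_first(X, metric):
    """Return the index of the point closest to X[0] under the given metric."""
    D = squareform(pdist(X, metric=metric))
    np.fill_diagonal(D, np.inf)  # ignore the zero self-distance
    return int(np.argmin(D[0]))

nearest_euclidean = nearest_to_first(X, "euclidean")  # 2: c is nearest in space
nearest_cosine = nearest_to_first(X, "cosine")        # 1: b has the same direction
print(nearest_euclidean, nearest_cosine)
```

Since the first merge joins the closest pair, the two metrics would already start building different trees on this data.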

---

## **1.2 Clusters Have a Hierarchical/Nested Structure**

* **Assumption:** The data can be meaningfully represented in a **nested, tree-like hierarchy**.
* **Implication:**

  * Hierarchical clustering (HC) works best if clusters are naturally nested.
  * If the data has flat clusters with no hierarchy, HC still returns a full tree, so the splits or merges above and below the true cluster level may be arbitrary.
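
One common diagnostic for how well a tree represents the data is the cophenetic correlation: the correlation between the original pairwise distances and the heights at which pairs are merged in the dendrogram. The sketch below compares synthetic nested data with structureless uniform data. The data, the seed, and the helper `cophenetic_corr` are assumptions made for illustration, and a high value is a hint rather than proof of hierarchy.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)

# Nested toy data: two far-apart super-groups, each made of two nearby sub-groups.
centers = np.array([[0.0, 0.0], [3.0, 0.0], [30.0, 0.0], [33.0, 0.0]])
nested = np.vstack([c + 0.3 * rng.standard_normal((15, 2)) for c in centers])

# Structureless data: points spread uniformly over a square.
uniform = rng.uniform(0.0, 10.0, size=(60, 2))

def cophenetic_corr(X, method="average"):
    """Correlation between pairwise distances and dendrogram merge heights."""
    d = pdist(X)                   # condensed pairwise Euclidean distances
    Z = linkage(d, method=method)  # linkage accepts a condensed distance matrix
    c, _ = cophenet(Z, d)
    return float(c)

c_nested = cophenetic_corr(nested)
c_uniform = cophenetic_corr(uniform)
print(f"nested: {c_nested:.3f}, uniform: {c_uniform:.3f}")
```

On data like this, the nested set should score noticeably closer to 1 than the uniform set.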

---

## **1.3 Homogeneity Within Clusters**

* **Assumption:** Points within a cluster are **more similar to each other than to points in other clusters**.
* **Implication:**

  * If clusters have very different densities or shapes, some linkage methods may fail: single linkage can join clusters through a few bridging points, and complete linkage can split a large or elongated cluster.
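
A rough way to check this assumption after clustering is to compare the mean distance between points that share a cluster with the mean distance between points that do not. The sketch below does this on a made-up dataset of two well-separated squares; the variable names are illustrative, and established scores such as the silhouette coefficient follow the same idea more carefully.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist, squareform

# Toy data (hypothetical): two compact squares far apart.
X = np.array([
    [0, 0], [0, 1], [1, 0], [1, 1],
    [8, 8], [8, 9], [9, 8], [9, 9],
], dtype=float)

# Cut an average-linkage tree into two flat clusters.
labels = fcluster(linkage(X, method="average"), t=2, criterion="maxclust")

D = squareform(pdist(X))                   # full pairwise distance matrix
same = labels[:, None] == labels[None, :]  # True where two points share a cluster
off_diag = ~np.eye(len(X), dtype=bool)     # exclude self-pairs

mean_within = D[same & off_diag].mean()
mean_between = D[~same].mean()
print(f"within: {mean_within:.2f}, between: {mean_between:.2f}")
```

If `mean_within` is not clearly smaller than `mean_between`, the flat clusters are not homogeneous in the sense assumed here.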

---

## **1.4 Choice of Linkage Matters**

* **Assumption:** The chosen linkage method (single, complete, average, Ward) **appropriately reflects inter-cluster distances**.
* **Implication:**

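  * Single linkage → merges on the closest pair of points, which can cause a “chaining” effect (long, snake-like clusters).
  * Complete linkage → merges on the farthest pair of points, which favors compact clusters and may break elongated clusters.
  * Average linkage → merges on the mean pairwise distance, a compromise between single and complete linkage.
  * Ward → merges the pair of clusters that least increases within-cluster variance, so it assumes Euclidean distances and that variance minimization is meaningful.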
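
The difference between single and complete linkage can be seen on a small one-dimensional example. The data below is invented for illustration: a chain of points with slowly growing gaps, followed by a separate tight pair.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy 1-D data (hypothetical): an elongated chain (gaps 1.0 to 1.4),
# then a gap of 2.0, then a tight pair at 8.0 and 8.5.
x = np.array([0.0, 1.0, 2.1, 3.3, 4.6, 6.0, 8.0, 8.5]).reshape(-1, 1)

def two_clusters(method):
    """Cut the tree built with the given linkage into two flat clusters."""
    return fcluster(linkage(x, method=method), t=2, criterion="maxclust")

labels_single = two_clusters("single")
labels_complete = two_clusters("complete")

# Cluster sizes, ignoring the arbitrary label numbering.
sizes_single = sorted(np.bincount(labels_single)[1:].tolist())      # [2, 6]
sizes_complete = sorted(np.bincount(labels_complete)[1:].tolist())  # [4, 4]
print(sizes_single, sizes_complete)
```

Single linkage keeps the whole chain together because each point is close to its neighbour, while complete linkage limits cluster diameter and therefore cuts the chain, attaching its last two points to the tight pair.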

---

## **1.5 Scale of Features**

* **Assumption:** Features are **comparable in scale** or have been standardized.
* **Implication:**

  * If one feature has a much larger numeric range than the others, it dominates the distance (especially Euclidean distance), and the clustering effectively ignores the remaining features.
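
The sketch below illustrates this on invented data: two small-scale features that separate two groups, plus a third feature in much larger units that is unrelated to the groups. Z-score standardization is used here as one possible scaling choice, and the helper `partition` is hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy data (hypothetical). Columns 0 and 1 separate rows 0-2 from rows 3-5;
# column 2 is in much larger units and does not follow that grouping.
X = np.array([
    [1.0, 2.0, 30000.0],
    [1.1, 2.1, 60000.0],
    [0.9, 1.9, 90000.0],
    [3.0, 4.0, 32000.0],
    [3.1, 4.1, 58000.0],
    [2.9, 3.9, 93000.0],
])

def two_clusters(X):
    """Average-linkage tree on Euclidean distances, cut into two clusters."""
    return fcluster(linkage(X, method="average"), t=2, criterion="maxclust")

def partition(labels):
    """Turn labels into a set of index groups, independent of label numbering."""
    return {frozenset(np.flatnonzero(labels == k).tolist()) for k in np.unique(labels)}

# Z-score each column so that all features have mean 0 and unit variance.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

groups_raw = partition(two_clusters(X))            # driven by column 2 only
groups_scaled = partition(two_clusters(X_scaled))  # rows 0-2 vs rows 3-5
print(groups_raw)
print(groups_scaled)
```

Without scaling, the clusters follow the large-unit column and mix the two groups; after standardization, the grouping carried by the first two columns is recovered.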

---

## **1.6 No Strong Noise or Outliers**

* **Assumption:** Data is relatively clean; noise and outliers are minimal.
* **Implication:**

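  * Outliers tend to be merged last, at a large height, so cutting the tree into a fixed number of clusters can spend clusters on singletons while genuine groups stay merged; they can also distort the dendrogram structure.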
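
A small sketch of the singleton effect, on made-up data with two compact groups and one distant point. Asking for two clusters with and without the outlier gives very different answers.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy data (hypothetical): two compact squares plus one far-away point.
X_clean = np.array([
    [0, 0], [0, 1], [1, 0], [1, 1],
    [5, 5], [5, 6], [6, 5], [6, 6],
], dtype=float)
outlier = np.array([[50.0, 50.0]])
X_noisy = np.vstack([X_clean, outlier])

def two_clusters(X):
    """Average-linkage tree cut into two flat clusters."""
    return fcluster(linkage(X, method="average"), t=2, criterion="maxclust")

labels_clean = two_clusters(X_clean)
labels_noisy = two_clusters(X_noisy)

sizes_clean = sorted(np.bincount(labels_clean)[1:].tolist())  # [4, 4]
sizes_noisy = sorted(np.bincount(labels_noisy)[1:].tolist())  # [1, 8]
print(sizes_clean, sizes_noisy)
```

With the outlier present, the two real groups are merged into one cluster and the outlier takes the second cluster for itself.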

---

**Summary**

| Assumption                                     | Explanation                                                        |
| ---------------------------------------------- | ------------------------------------------------------------------ |
| Distance metric is meaningful                  | Similar points are close, dissimilar points are far                |
| Hierarchical structure exists                  | Data can be represented in nested clusters                         |
| Homogeneity within clusters                    | Points in a cluster are more similar to each other than to others  |
| Linkage method reflects inter-cluster distance | Choice of single, complete, average, or Ward affects cluster shape |
| Features are scaled                            | Avoid one feature dominating the distance metric                   |
| Minimal noise/outliers                         | Outliers don’t distort hierarchy                                   |

---

**Key Intuition**

Hierarchical clustering is like **building a tree of data points**:

* If the distances, linkage method, and feature scaling are appropriate → the tree accurately reflects the nested structure of the data.
* Violating assumptions → dendrogram may be misleading, merges may be arbitrary, and resulting clusters may not be meaningful.
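
The tree itself can be inspected directly. As a minimal sketch on three made-up 1-D points, SciPy's `linkage` returns one row per merge: the two cluster indices joined, the merge distance, and the size of the new cluster.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Three toy points on a line (hypothetical): 0 and 1 are close, 5 is far away.
X = np.array([[0.0], [1.0], [5.0]])

# Each row of Z is one merge: [cluster_i, cluster_j, distance, new_cluster_size].
# Original points are indexed 0..n-1; the cluster created by row k gets index n + k.
Z = linkage(X, method="single")
print(Z)

first_merge_height = Z[0, 2]   # 1.0: points 0 and 1 are joined first
second_merge_height = Z[1, 2]  # 4.0: point 5 joins at its distance to point 1
```

Reading the merge heights from bottom to top is the numeric equivalent of reading a dendrogram: small heights indicate tight groups, and a large jump between consecutive heights suggests a natural place to cut the tree.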