
### 🔹 Introduction

* Goal: Group data points into clusters, as K-means does, but **without centroids**: the algorithm builds a hierarchy of nested clusters, so k does not have to be chosen up front.
* Two main types:

  * **Agglomerative** (bottom-up: combine small clusters into bigger ones).
  * **Divisive** (top-down: split one large cluster into smaller ones).

---

### 🔹 Steps in **Agglomerative Hierarchical Clustering**

1. **Start**: Each data point is its own cluster.
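2. **Find the nearest pair of clusters** (using a distance metric such as **Euclidean** or **Manhattan**, plus a linkage rule such as single, complete, or average for clusters that hold more than one point).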
3. **Merge nearest clusters** into a new cluster.
4. **Repeat** steps 2 and 3 until all points are merged into a single cluster.
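
The four steps above can be sketched in plain Python. This is a minimal, unoptimized illustration, assuming Euclidean distance and single linkage (closest pair of members); the function names and sample points are made up for the example, and real projects would normally use a library implementation instead.

```python
from itertools import combinations
from math import dist  # Euclidean distance (Python 3.8+)


def single_linkage(points, a, b):
    """Distance between clusters a and b: their closest pair of members."""
    return min(dist(points[i], points[j]) for i in a for j in b)


def agglomerative(points):
    """Naive bottom-up clustering; returns the merge history.

    Each history entry is (cluster_a, cluster_b, distance), where a
    cluster is a tuple of point indices.
    """
    # Step 1: every point starts as its own cluster.
    clusters = [(i,) for i in range(len(points))]
    history = []
    # Step 4: repeat until a single cluster remains.
    while len(clusters) > 1:
        # Step 2: find the closest pair of clusters.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: single_linkage(points, *pair),
        )
        d = single_linkage(points, a, b)
        # Step 3: merge the pair into a new cluster.
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(tuple(sorted(a + b)))
        history.append((a, b, d))
    return history


# Hypothetical sample data: two pairs of nearby points.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 2.0)]
history = agglomerative(points)
for a, b, d in history:
    print(f"merge {a} + {b} at distance {d}")
```

For these four sample points, the merges happen at distances 1.0, 2.0, and 5.0. Those merge distances are exactly the heights a dendrogram would show.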

---

### 🔹 Dendrogram

* A tree-like diagram that shows how clusters are formed.
* **X-axis** → data points.
* **Y-axis** → distance (e.g., Euclidean).
* Steps:

  * Merge the closest points first, then merge clusters progressively.
  * Heights in the dendrogram correspond to the distances at which clusters merge.
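
To make the tree structure concrete, here is a small text-only sketch of a dendrogram. It assumes a hypothetical merge tree written by hand as nested tuples `(height, left, right)` with string labels for the data points; it is not output from any particular library.

```python
def render(node, depth=0):
    """Return the merge tree as indented lines, root first."""
    pad = "  " * depth
    if isinstance(node, str):      # leaf: a single data point
        return [f"{pad}{node}"]
    height, left, right = node     # internal node: one merge
    lines = [f"{pad}merge at distance {height}"]
    lines += render(left, depth + 1)
    lines += render(right, depth + 1)
    return lines


# Hypothetical merge tree: A and B merge at 1.0, C and D at 2.0,
# and the two resulting clusters merge at 5.0.
tree = (5.0, (1.0, "A", "B"), (2.0, "C", "D"))
lines = render(tree)
print("\n".join(lines))
```

The root line carries the largest distance because it is the last merge, which matches the top of a plotted dendrogram.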

---

### 🔹 Choosing Number of Clusters (k)

* Decide using a **distance threshold**:

  * Draw a horizontal line across the dendrogram at a chosen distance.
  * Number of clusters = number of vertical lines the horizontal line cuts.
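* **Rule of thumb (hack)**:

  * Find the **longest vertical line** in the dendrogram that no horizontal merge line would cross if it were extended across the full width.
  * Draw a horizontal cut through it → the number of vertical lines cut is a reasonable (not guaranteed optimal) choice of k.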
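
Both ideas can be sketched from the list of merge distances alone. The snippet below is one possible reading of the rule of thumb, assuming the "longest uncrossed vertical line" corresponds to the widest gap between consecutive merge distances; the function names and the sample heights are hypothetical.

```python
def clusters_at(heights, threshold):
    """Number of clusters left when cutting the dendrogram at threshold."""
    # Every merge above the cut is undone, adding one cluster each.
    return 1 + sum(h > threshold for h in heights)


def largest_gap_cut(heights):
    """Midpoint of the widest gap between consecutive merge distances.

    Needs at least two merge distances.
    """
    hs = sorted(heights)
    _, low, high = max((b - a, a, b) for a, b in zip(hs, hs[1:]))
    return (low + high) / 2


heights = [1.0, 2.0, 5.0]  # hypothetical merge distances for 4 points
cut = largest_gap_cut(heights)
k = clusters_at(heights, cut)
print(f"cut at {cut} gives k = {k}")
```

With these heights, the widest gap lies between 2.0 and 5.0, so the cut lands at 3.5 and leaves two clusters. A lower cut such as 1.5 would leave three.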

---

### 🔹 Key Points

* **Agglomerative** = bottom-up merging.
* **Divisive** = top-down splitting.
* Dendrogram helps in deciding **k**.
* Threshold determines granularity of clusters:

  * Lower threshold → more clusters.
  * Higher threshold → fewer clusters.

---

✅ So hierarchical clustering = **iterative merging/splitting + dendrogram for visualization + threshold to pick k.**



### **Dimensionality Reduction**

* **Why do it?**

  1. Mitigate the *curse of dimensionality* (too many features relative to the number of samples hurt model performance).
  2. Improve training efficiency and, often, accuracy.
  3. Enable visualization (humans can only see up to 3D).

---

### **Feature Selection**

* Goal: Select the most important features that strongly impact the target.
* Methods:

  * Use **covariance** and **correlation** (e.g., Pearson correlation) to measure the relationship between each feature and the target.
  * Strong positive/negative correlation → feature is likely important.
  * Near-zero correlation → feature is a candidate to drop (Pearson only captures linear relationships, so a nonlinear dependence can still hide behind a low score).
* Example:

  * **House size** vs. **price** → strong correlation → keep.
  * **Fountain size** vs. **price** → weak correlation → drop.
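
A minimal sketch of correlation-based selection for the house example, using only the standard library. The numbers are made up for illustration, and the 0.5 cutoff is an assumed threshold, not a standard value.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up illustrative data, not real measurements.
price = [150, 240, 310, 350, 460]
features = {
    "house_size": [50, 80, 100, 120, 150],
    "fountain_size": [2, 1, 3, 1, 2],
}

THRESHOLD = 0.5  # assumed cutoff; tune for the problem at hand
scores = {name: pearson(values, price) for name, values in features.items()}
selected = [name for name, r in scores.items() if abs(r) >= THRESHOLD]
print(scores)
print("keep:", selected)
```

On this toy data, house size correlates strongly with price and is kept, while fountain size scores close to zero and is dropped.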

---

### **Feature Extraction**

* Goal: Create new, informative features from existing ones (instead of dropping).
* Process: Apply transformations to combine or derive features.
* Example:

  * From **room size** + **number of rooms**, derive a new feature: **house size**, which can still predict house price effectively.
* Key point: Some information is lost, but the new feature captures the essence of the originals while reducing dimensions.
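
The house example can be written as a tiny transformation. This sketch assumes that "room size" means the average room area and that house size is approximated as average room area times room count; the field names and values are hypothetical.

```python
# Hypothetical rows: average room area (m^2) and number of rooms.
houses = [
    {"room_size": 20.0, "num_rooms": 3},
    {"room_size": 15.0, "num_rooms": 4},
    {"room_size": 25.0, "num_rooms": 4},
]


def extract_house_size(row):
    """Combine two original features into one derived feature."""
    return row["room_size"] * row["num_rooms"]


# Two input dimensions are replaced by a single extracted one.
reduced = [{"house_size": extract_house_size(row)} for row in houses]
print(reduced)
```

The first two houses both map to a house size of 60.0 even though their layouts differ, which is the kind of information loss mentioned above: the derived feature keeps the overall scale but no longer distinguishes many small rooms from a few large ones.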

---

### **Key Distinction**

* **Feature Selection** → Choose from existing features (drop irrelevant ones).
* **Feature Extraction** → Transform existing features to create new ones.

---

👉 In practice: both are used in dimensionality reduction before applying models or visualization (e.g., PCA for feature extraction).



### **K-Means vs. Hierarchical Clustering**

| Aspect           | K-Means                                 | Hierarchical Clustering                                        |
| ---------------- | --------------------------------------- | -------------------------------------------------------------- |
| Input needed     | k (number of clusters)                  | No k needed up front (chosen later by cutting the dendrogram)  |
| Approach         | Partition-based                         | Tree (nested clusters)                                         |
| Shape assumption | Roughly spherical clusters              | Depends on linkage (single linkage can follow elongated shapes) |
| Initialization   | Random centroids (or K-Means++ seeding) | Deterministic (no random start)                                |
| Complexity       | O(n × k × iterations)                   | O(n²) memory; O(n²) to O(n³) time, depending on linkage        |
| Best for         | Large datasets                          | Small/medium datasets                                          |
| Output           | Flat clustering                         | Hierarchical clustering (dendrogram)                           |
