

# Unsupervised Learning: Clustering Algorithms

---

### 1. What is unsupervised learning in the context of machine learning?

Unsupervised learning is a type of machine learning where the model learns from **unlabeled data**. It aims to find hidden patterns or intrinsic structures in the input data.

**Example:** Clustering customers based on purchasing behavior.

---

### 2. How does K-Means clustering algorithm work?

K-Means works by:

1. Selecting `k` initial centroids.
2. Assigning each data point to the nearest centroid.
3. Recomputing centroids based on assigned points.
4. Repeating steps 2–3 until convergence, i.e. until the assignments stop changing or a maximum number of iterations is reached.

**Example:**
Clustering animals based on weight and height.
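The four steps can be sketched in plain Python. This is a minimal illustration and not a production implementation: the `kmeans` function, the `animals` data (weight in kg, height in cm), and the fixed `seed` are assumptions made up for this example, and the features are left unscaled for brevity.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    # Step 1: pick k distinct data points as the initial centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Step 2: assign each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Step 4: stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Hypothetical (weight, height) measurements: three small and three large animals.
animals = [(4.0, 25.0), (5.0, 30.0), (6.0, 28.0),
           (500.0, 150.0), (550.0, 160.0), (600.0, 155.0)]
labels, centroids = kmeans(animals, k=2)
print(labels)
```

The cluster ids themselves are arbitrary; what matters is which points share an id.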

---

### 3. Explain the concept of a dendrogram in hierarchical clustering.

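A **dendrogram** is a tree-like diagram that records the order in which hierarchical clustering merges (or splits) clusters, with the height of each join showing the distance at which it happened. Cutting the tree at a chosen height yields a flat clustering, so large vertical gaps between successive merges can guide the choice of the number of clusters.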

---

### 4. What is the main difference between K-Means and Hierarchical Clustering?

* **K-Means** requires the number of clusters beforehand and is efficient for large datasets.
* **Hierarchical clustering** builds a hierarchy of clusters and doesn't need the number of clusters upfront.

---

### 5. What are the advantages of DBSCAN over K-Means?

* Does **not require** the number of clusters beforehand.
* Can detect **arbitrarily shaped** clusters.
* Can **identify noise/outliers** naturally.

---

### 6. When would you use Silhouette Score in clustering?

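Silhouette Score is used to **evaluate clustering quality**, for example when comparing different values of `k` without ground-truth labels. For each point it compares the mean distance to the other points in its own cluster with the mean distance to the points of the nearest other cluster (range: -1 to 1; higher is better).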
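As a rough sketch of the computation, the per-point score is `(b - a) / max(a, b)`, where `a` is the mean distance to the other points in the same cluster and `b` is the mean distance to the points of the nearest other cluster. The function name `silhouette_samples`, the toy points, and the convention of scoring singleton clusters as 0 are assumptions for this example; it also assumes at least two clusters and no clusters made up solely of duplicate points.

```python
import math


def silhouette_samples(points, labels):
    """Per-point silhouette values; needs at least two distinct labels."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)

    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            scores.append(0.0)  # assumed convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the distance from p to itself is 0, hence the len(own) - 1).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != l
        )
        scores.append((b - a) / max(a, b))
    return scores


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_samples(points, [0, 0, 1, 1])  # sensible grouping
bad = silhouette_samples(points, [0, 1, 1, 1])   # (0, 1) put in the far cluster
print(sum(good) / len(good), bad[1])
```

The overall Silhouette Score is the mean of the per-point values; the misassigned point in `bad` gets a negative value.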

---

### 7. What are the limitations of Hierarchical Clustering?

* **Scalability:** Computationally expensive for large datasets.
* **Irreversible:** Once merged or split, operations cannot be undone.

---

### 8. Why is feature scaling important in clustering algorithms like K-Means?

K-Means uses **Euclidean distance**, so features with larger scales dominate. **Standardization or normalization** ensures fair contribution from all features.
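A small sketch of standardization (z-scoring) shows the effect. The `standardize` helper and the income/age rows are hypothetical and chosen only to make the scale mismatch visible.

```python
import math
import statistics


def standardize(rows):
    """Rescale every column to mean 0 and (population) standard deviation 1."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    return [
        tuple((v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stds))
        for row in rows
    ]


# Hypothetical customers: (annual income, age).
rows = [(30000.0, 25.0), (32000.0, 60.0), (90000.0, 27.0)]
scaled = standardize(rows)

# Before scaling, the income gap between the first two customers swamps
# their 35-year age gap; after scaling both features contribute.
print(math.dist(rows[0], rows[1]), math.dist(scaled[0], scaled[1]))
```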

---

### 9. How does DBSCAN identify noise points?

DBSCAN labels a point as **noise** if it is neither a core point nor a border point: it has fewer than `minPts` neighbors within `eps`, and it does not lie within `eps` of any core point, so it is not density-reachable from any cluster.
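The following is a minimal, quadratic-time sketch of that labeling, not an optimized implementation. The names `dbscan`, `min_pts`, and `NOISE`, the toy points, and the choice to count a point as its own neighbor are assumptions for this example.

```python
import math

NOISE = -1  # label used for points that end up in no cluster


def dbscan(points, eps, min_pts):
    n = len(points)
    # Neighborhood of each point: indices within eps (the point itself included).
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    is_core = [len(nb) >= min_pts for nb in neighbors]

    labels = [NOISE] * n
    cluster = 0
    for i in range(n):
        if labels[i] != NOISE or not is_core[i]:
            continue
        # Grow a new cluster outward from an unlabeled core point.
        labels[i] = cluster
        frontier = [i]
        while frontier:
            current = frontier.pop()
            for j in neighbors[current]:
                if labels[j] == NOISE:
                    labels[j] = cluster
                    if is_core[j]:  # border points join but do not expand
                        frontier.append(j)
        cluster += 1
    # Anything still labeled NOISE is within eps of no core point.
    return labels


points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The isolated point `(50, 50)` has no neighbor other than itself, so it is never reached from a core point and keeps the noise label.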

---

### 10. Define inertia in the context of K-Means.

Inertia is the **sum of squared distances** between each point and its cluster centroid. Lower inertia means tighter clusters.
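As a quick worked example, inertia can be computed directly from the points, their labels, and the centroids. The `inertia` helper and the toy data are assumptions for illustration.

```python
def inertia(points, labels, centroids):
    """Sum of squared Euclidean distances from each point to its assigned centroid."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(p, centroids[l]))
        for p, l in zip(points, labels)
    )


points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

# Two tight clusters: every point is at squared distance 1 from its centroid.
two_clusters = inertia(points, [0, 0, 1, 1], [(0.0, 1.0), (10.0, 1.0)])

# One cluster for everything: every point is at squared distance 26 from (5, 1).
one_cluster = inertia(points, [0, 0, 0, 0], [(5.0, 1.0)])

print(two_clusters, one_cluster)  # 4.0 104.0
```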

---

### 11. What is the elbow method in K-Means clustering?

The **elbow method** plots inertia against the number of clusters (k). Inertia keeps falling as k grows, so the "elbow", the point after which adding clusters yields only small reductions, suggests a reasonable number of clusters.

---

### 12. Describe the concept of "density" in DBSCAN.

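In DBSCAN, density is the number of points within a radius (`eps`) of a given point. A point with **at least `minPts` points** in that neighborhood is a **core point**; connected core points, together with the border points in their neighborhoods, form a **dense region**, i.e. a cluster.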

---

### 13. Can hierarchical clustering be used on categorical data?

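Yes, but it requires a suitable **distance metric for categorical variables**, such as Hamming or Jaccard distance, or an encoding (for example one-hot) before using Euclidean distance. Ward's linkage assumes Euclidean distance, so single, complete, or average linkage is the usual choice with the other metrics.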
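Both distances are simple to compute by hand. The sketch below uses the normalized form of Hamming distance (fraction of mismatching positions); the function names and sample records are hypothetical.

```python
def hamming_distance(a, b):
    """Fraction of positions at which two equal-length records differ."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def jaccard_distance(a, b):
    """1 minus (size of intersection / size of union) of two sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # two empty sets are treated as identical
    return 1 - len(a & b) / len(union)


# Records with categorical attributes (color, size, shape): one mismatch of three.
d_hamming = hamming_distance(("red", "small", "round"), ("red", "large", "round"))

# Shopping baskets as sets: one shared item out of three distinct items.
d_jaccard = jaccard_distance({"milk", "bread"}, {"milk", "eggs"})

print(d_hamming, d_jaccard)
```

A matrix of such pairwise distances can then be fed to a hierarchical clustering routine in place of Euclidean distances.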

---

### 14. What does a negative Silhouette Score indicate?

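A negative Silhouette Score indicates that a point is, on average, **closer to the points of a neighboring cluster** than to those of the cluster it was assigned to, which suggests it is in the wrong cluster. A negative average score implies **poor clustering** overall.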

---

### 15. Explain the term "linkage criteria" in hierarchical clustering.

Linkage criteria determine how the **distance between clusters** is calculated:

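* **Single:** minimum distance between any two points, one from each cluster
* **Complete:** maximum distance between any two points, one from each cluster
* **Average:** average distance over all such pairs of points
* **Ward's:** merges the pair of clusters that gives the smallest increase in total within-cluster variance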
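The four criteria can be compared on a pair of toy clusters. This is an illustrative sketch with assumed function names; the Ward value is computed as the increase in within-cluster sum of squares caused by the merge, `n_a * n_b / (n_a + n_b)` times the squared distance between the two centroids.

```python
import math


def pairwise_distances(a, b):
    """All distances between a point of cluster a and a point of cluster b."""
    return [math.dist(p, q) for p in a for q in b]


def single_linkage(a, b):
    return min(pairwise_distances(a, b))


def complete_linkage(a, b):
    return max(pairwise_distances(a, b))


def average_linkage(a, b):
    d = pairwise_distances(a, b)
    return sum(d) / len(d)


def ward_increase(a, b):
    """Increase in within-cluster sum of squares if a and b were merged."""
    centroid_a = [sum(col) / len(a) for col in zip(*a)]
    centroid_b = [sum(col) / len(b) for col in zip(*b)]
    return len(a) * len(b) / (len(a) + len(b)) * math.dist(centroid_a, centroid_b) ** 2


a = [(0.0, 0.0), (0.0, 1.0)]
b = [(3.0, 0.0), (3.0, 1.0)]
print(single_linkage(a, b), complete_linkage(a, b),
      average_linkage(a, b), ward_increase(a, b))
```

For these two clusters the single-linkage distance is 3, the complete-linkage distance is the diagonal `sqrt(10)`, and the average lies between the two.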

---

### 16. Why might K-Means clustering perform poorly on data with varying cluster sizes or densities?

K-Means assumes clusters are **spherical and of similar size**. It struggles with:

* Varying densities
* Non-convex shapes
* Unequal cluster sizes

---

### 17. What are the core parameters in DBSCAN, and how do they influence clustering?

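* **eps:** radius of the neighborhood around each point. Larger values merge nearby clusters and leave fewer noise points; smaller values fragment clusters and label more points as noise.
* **minPts:** minimum number of points within `eps` for a point to count as a core point. Larger values require denser regions, so more points end up as noise.

Together they set the density threshold that separates clusters from noise.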

---

### 18. How does K-Means++ improve upon standard K-Means initialization?

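K-Means++ picks the first centroid uniformly at random and each subsequent one with probability proportional to its **squared distance from the nearest centroid already chosen**. This spreads the initial centroids out, which typically improves: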

* Convergence speed
* Final cluster quality
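A minimal sketch of the K-Means++ seeding step, assuming distinct input points and Python's `random.choices` for the weighted draw. The function name `kmeans_pp_init`, the toy data, and the fixed `seed` are made up for this example.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_pp_init(points, k, seed=0):
    """Choose k initial centroids; assumes the points are distinct and k <= len(points)."""
    rng = random.Random(seed)
    # The first centroid is drawn uniformly at random.
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        # Weight each point by its squared distance to the nearest chosen centroid.
        # Already-chosen points get weight 0, so they are not drawn again.
        weights = [min(squared_distance(p, c) for c in centroids) for p in points]
        centroids.append(rng.choices(points, weights=weights, k=1)[0])
    return centroids


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (20.0, 20.0), (20.0, 21.0), (21.0, 20.0),
          (40.0, 0.0), (40.0, 1.0), (41.0, 0.0)]
init = kmeans_pp_init(points, k=3)
print(init)
```

Because far-away points carry much larger weights, the draws are likely, though not guaranteed, to land in different groups. The resulting centroids would then be handed to the usual assign-and-update loop.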

---

### 19. What is agglomerative clustering?

Agglomerative clustering is a **bottom-up hierarchical method**:

* Start with each point as its own cluster
* Merge closest clusters iteratively until all points are in one cluster or a stopping criterion is met
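The bottom-up procedure can be sketched as a naive loop. This illustration assumes single linkage as the notion of "closest" and a target cluster count as the stopping criterion; the `agglomerative` function and the toy points are hypothetical, and it expects `1 <= n_clusters <= len(points)`.

```python
import math


def agglomerative(points, n_clusters):
    """Naive single-linkage agglomerative clustering; returns a list of clusters."""
    # Start with each point as its own cluster.
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest single-linkage distance.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(math.dist(p, q) for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge cluster j into cluster i; j > i, so deleting j leaves i in place.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


points = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
result = agglomerative(points, n_clusters=3)
print(result)  # [[(0, 0), (0, 1)], [(5, 5), (5, 6)], [(20, 20)]]
```

Recording the distance of each merge instead of stopping early would give the data needed to draw a dendrogram.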

---

### 20. What makes Silhouette Score a better metric than just inertia for model evaluation?

* **Inertia** only measures compactness and tends to keep decreasing as `k` grows, so it cannot identify a good `k` on its own.
* **Silhouette Score** considers both **cohesion and separation**, giving a more holistic view of clustering quality.

