### **Q1. What is hierarchical clustering, and how is it different from other clustering techniques?**
Ans: \

**Hierarchical clustering** builds a hierarchy of clusters either by:
- **Merging smaller clusters** into larger ones (bottom-up / agglomerative), or
- **Dividing a big cluster** into smaller ones (top-down / divisive).

**Key Differences from Other Techniques:**
- **Does not require pre-defining the number of clusters (unlike K-Means)**.
- Produces a **dendrogram** (tree-like structure) to visualize the merging/splitting.
- Captures **nested clusters** (clusters within clusters), which flat algorithms such as K-Means cannot represent in a single partition.

---

### **Q2. What are the two main types of hierarchical clustering algorithms? Describe each in brief.**
Ans: \

1. **Agglomerative Hierarchical Clustering (Bottom-Up)**
   - Start with each point as its own cluster.
   - Merge the two **closest clusters** step-by-step.
   - Stop when all points are merged into one cluster, or earlier once a distance threshold or a target number of clusters is reached.

2. **Divisive Hierarchical Clustering (Top-Down)**
   - Start with one large cluster (all points).
   - Recursively **split** clusters until each point is its own cluster or a desired number is reached.
   - Less common than agglomerative due to higher computational cost.
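
The agglomerative procedure above can be sketched with SciPy. This is a minimal illustration, not a reference implementation: the toy points, the choice of average linkage, and the target of two clusters are all assumptions made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy data (made up for illustration): two tight groups of 2-D points.
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 5.2], [5.2, 5.1]])

# Agglomerative clustering. Each row of Z records one merge:
# (cluster i, cluster j, merge distance, size of the new cluster).
Z = linkage(X, method="average", metric="euclidean")

# n points always take n - 1 merges to end up in a single cluster.
n_merges = Z.shape[0]

# "Stop early" by asking for a fixed number of flat clusters.
labels = fcluster(Z, t=2, criterion="maxclust")
print(n_merges, labels)
```

The linkage matrix `Z` holds the full merge history, so the same fit can be cut into any number of clusters afterwards without re-running the algorithm.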

---

### **Q3. How do you determine the distance between two clusters in hierarchical clustering, and what are the common distance metrics used?**
Ans: \

The way you measure distance between **clusters** is called a **linkage criterion**:

**Common Linkage Methods:**
- **Single linkage**: Minimum distance between any pair of points, one taken from each cluster.
- **Complete linkage**: Maximum distance between any pair of points, one taken from each cluster.
- **Average linkage**: Average distance over all pairs of points, one taken from each cluster.
- **Ward’s linkage**: Minimizes the increase in total within-cluster variance.

**Common Distance Metrics:**
- **Euclidean** (most common for numerical data)
- **Manhattan**
- **Cosine distance** (1 − cosine similarity; common for text data)
- **Hamming distance** (for categorical data)
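
To make the linkage definitions concrete, here is a small sketch that computes each criterion by hand for two tiny clusters. The clusters `A` and `B` are made-up values, and the Ward line uses the standard merge-cost formula (size-weighted squared distance between centroids).

```python
import numpy as np
from scipy.spatial.distance import cdist

# Two hypothetical clusters of 2-D points.
A = np.array([[0.0, 0.0], [1.0, 0.0]])
B = np.array([[4.0, 0.0], [6.0, 0.0]])

# All cross-cluster pairwise Euclidean distances: [[4, 6], [3, 5]].
D = cdist(A, B, metric="euclidean")

single = D.min()      # closest cross-cluster pair
complete = D.max()    # farthest cross-cluster pair
average = D.mean()    # mean over all cross-cluster pairs

# Ward: increase in total within-cluster sum of squares if A and B merge.
nA, nB = len(A), len(B)
centroid_gap = A.mean(axis=0) - B.mean(axis=0)
ward_increase = nA * nB / (nA + nB) * np.sum(centroid_gap ** 2)

print(single, complete, average, ward_increase)  # 3.0 6.0 4.5 20.25
```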

---

### **Q4. How do you determine the optimal number of clusters in hierarchical clustering, and what are some common methods used for this purpose?**
Ans: \

Common methods include:

1. **Dendrogram Cut-Off**  
   - Look for the **longest vertical line** (biggest jump in distance) in the dendrogram.
   - Draw a horizontal line through that gap; the number of vertical branches it crosses is the suggested number of clusters.

2. **Silhouette Score**  
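   - Compares how close each point is to its own cluster with how close it is to the nearest other cluster.
   - Values range from -1 to 1, and higher is better; compute the score for several candidate cluster counts and keep the best one.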

3. **Elbow Method (on linkage distances)**  
   - Plot the merge (linkage) distance against the number of clusters.
   - Look for the "elbow": the point beyond which adding more clusters no longer reduces the merge distance by much.
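
The dendrogram cut-off idea can be automated as a "largest gap" heuristic on the merge distances. The sketch below is one possible reading of that idea, not a standard library routine: the toy data, the use of Ward linkage, and the variable names are assumptions for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy data (made up): three well-separated groups of three points each.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0],
              [20.0, 0.0], [20.0, 1.0], [21.0, 0.0]])

Z = linkage(X, method="ward")
heights = Z[:, 2]            # merge distances, non-decreasing for Ward linkage

gaps = np.diff(heights)      # jump between consecutive merges
i = int(np.argmax(gaps))     # biggest jump lies between merge i and merge i + 1
k = len(X) - (i + 1)         # clusters that remain after merge i

# Cut halfway through the biggest jump to get the flat clustering.
threshold = (heights[i] + heights[i + 1]) / 2
labels = fcluster(Z, t=threshold, criterion="distance")
print(k, labels)
```

This heuristic works well when the groups are clearly separated, as they are here; on noisy data the largest gap can be ambiguous, so it is worth checking the result against the silhouette score.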

---

### **Q5. What are dendrograms in hierarchical clustering, and how are they useful in analyzing the results?**
Ans: \

A **dendrogram** is a **tree-like diagram** that shows how clusters are formed by successively merging (or splitting) clusters. The height of each join is the distance at which the merge happened.

**How it's useful:**
- Helps visualize the **structure and hierarchy** of clusters.
- You can **cut the dendrogram** at different levels to choose how many clusters to form.
- Reveals **natural groupings** and **outliers**.
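
As a rough illustration of cutting a dendrogram at different levels, the sketch below builds the tree for five made-up 1-D points and cuts it at three heights. It uses `no_plot=True` so that no plotting backend is needed; drop that argument (with matplotlib installed) to draw the tree.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster

# Toy 1-D data (made up): two pairs and one far-away point.
X = np.array([[0.0], [1.0], [10.0], [11.0], [50.0]])

# Single linkage merges at distances 1, 1, 9 and 39.
Z = linkage(X, method="single")

# Compute the dendrogram layout without drawing it.
info = dendrogram(Z, no_plot=True)
leaf_order = info["leaves"]   # left-to-right order of the original points

# Cutting at different heights gives different numbers of clusters.
for height in (0.5, 5.0, 20.0):
    labels = fcluster(Z, t=height, criterion="distance")
    print(height, len(set(labels.tolist())))
```

With these points, a low cut keeps every point separate, a middle cut yields the two pairs plus the isolated point, and a high cut leaves only the isolated point on its own.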

---

### **Q6. Can hierarchical clustering be used for both numerical and categorical data? If yes, how are the distance metrics different for each type of data?**
Ans: \

**Yes**, hierarchical clustering can handle both:

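- **Numerical data**: Use **Euclidean** or **Manhattan** distance. **Ward's method** is a linkage criterion rather than a metric, and it assumes Euclidean distance.
- **Categorical data**: Use **Hamming distance**, **Jaccard distance** (1 − Jaccard index, suited to binary or set-valued attributes), or **Gower distance**.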
- **Mixed data** (numerical + categorical): Use **Gower distance**, which handles both types.

Libraries like `scipy`, `sklearn`, and `gower` in Python can help compute appropriate distance matrices.
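
For categorical data, one workable pattern is to precompute a Hamming distance matrix and pass it to the linkage routine. This is a sketch under assumptions: the records and their integer encoding are hypothetical, and average linkage is chosen because Ward's method assumes Euclidean distances.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical categorical records, integer-encoded per column
# (e.g. colour, size, shape). The codes carry no numeric meaning.
X = np.array([[0, 1, 2],
              [0, 1, 1],
              [2, 0, 0],
              [2, 0, 1]])

# Hamming distance = fraction of attributes on which two rows disagree.
D = pdist(X, metric="hamming")   # condensed distance vector

# linkage accepts a condensed distance vector in place of raw observations.
Z = linkage(D, method="average")
labels = fcluster(Z, t=2, criterion="maxclust")
print(D, labels)
```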

---

### **Q7. How can you use hierarchical clustering to identify outliers or anomalies in your data?**
Ans: \

**Outliers** can be identified by:
- Looking at **small clusters** that are far from the rest in the dendrogram.
- Observing **data points that merge late** (at a large distance threshold).
- Plotting the dendrogram and checking for **single-point merges** (these could be anomalies).

**Tip:** Outliers tend to **remain isolated** until the very end of the merging process.
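
The "merges late" idea can be turned into a simple check on the linkage matrix: record the height at which each original point first joins another cluster, and flag the points whose first merge is unusually high. The data, the single-linkage choice, and the "three times the median" cutoff below are illustrative assumptions, not an established rule.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Toy data (made up): a compact group plus one planted outlier (last row).
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
              [0.5, 0.5], [10.0, 10.0]])

Z = linkage(X, method="single")
n = len(X)

# Height at which each original point first joins another cluster.
first_merge = np.zeros(n)
for left, right, height, _ in Z:
    for idx in (int(left), int(right)):
        if idx < n:               # indices below n refer to original points
            first_merge[idx] = height

# Hypothetical rule: flag points whose first merge is far above the median.
cutoff = 3 * np.median(first_merge)
outliers = np.where(first_merge > cutoff)[0]
print(outliers)  # [5]
```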