# Q1. What is hierarchical clustering, and how is it different from other clustering techniques?
# Answer:
# Hierarchical clustering builds a nested hierarchy of clusters, either by repeatedly merging small clusters into larger ones (agglomerative) or by repeatedly splitting large clusters into smaller ones (divisive).
# Unlike K-means, it does not need the number of clusters up front: the full hierarchy is built first and can be drawn as a tree-like diagram called a dendrogram, and the number of clusters is chosen afterwards by cutting that tree.
# Depending on the linkage used, it can also recover non-spherical clusters that K-means tends to split, and the hierarchy shows how the data is structured at several levels of granularity.

from sklearn.cluster import AgglomerativeClustering
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs

# Generating sample data
X, _ = make_blobs(n_samples=300, centers=4, random_state=42)

# Agglomerative hierarchical clustering (scikit-learn uses Ward linkage by default); the tree is cut at 4 clusters
agg_clust = AgglomerativeClustering(n_clusters=4)
labels = agg_clust.fit_predict(X)

# Plotting the results
plt.scatter(X[:, 0], X[:, 1], c=labels, cmap='viridis')
plt.title("Hierarchical Clustering")
plt.show()
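
# The point that the number of clusters need not be fixed in advance can be illustrated by
# cutting on a distance threshold instead of passing n_clusters. This is a minimal sketch:
# the names X_demo and agg_thresh and the threshold of 30 are assumptions chosen for
# illustration, not recommended settings.
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

X_demo, _ = make_blobs(n_samples=300, centers=4, random_state=42)

# n_clusters must be None when distance_threshold is set
agg_thresh = AgglomerativeClustering(n_clusters=None, distance_threshold=30.0)
labels_thresh = agg_thresh.fit_predict(X_demo)

# n_clusters_ reports how many clusters remain once merging stops at the threshold
print("Clusters found:", agg_thresh.n_clusters_)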

# Q2. What are the two main types of hierarchical clustering algorithms? Describe each in brief.
# Answer:
# 1. **Agglomerative Hierarchical Clustering**: A bottom-up approach where each data point starts as its own cluster, and at every step the two closest clusters are merged until a single cluster (or the desired number of clusters) remains.
# 2. **Divisive Hierarchical Clustering**: A top-down approach where all data points start in one cluster, which is split recursively into smaller clusters until each point stands alone (or a stopping criterion is met).
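
# The bottom-up procedure can be sketched from scratch. This is a naive illustration using
# single linkage on five hand-picked 1-D points; the data and the helper name
# naive_agglomerative are assumptions made for this example, and the loop is far slower
# than library implementations.
import numpy as np

def naive_agglomerative(points, n_clusters):
    # Start with every point in its own cluster (stored as lists of row indices)
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = (None, None, np.inf)
        # Find the pair of clusters whose closest members are nearest to each other
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(np.linalg.norm(points[i] - points[j])
                        for i in clusters[a] for j in clusters[b])
                if d < best[2]:
                    best = (a, b, d)
        a, b, _ = best
        # Merge the closest pair and drop the absorbed cluster
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

toy_points = np.array([[0.0], [0.5], [5.0], [5.4], [20.0]])
toy_clusters = naive_agglomerative(toy_points, 3)
print(toy_clusters)  # prints [[0, 1], [2, 3], [4]]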

# Q3. How do you determine the distance between two clusters in hierarchical clustering, and what are the common distance metrics used?
# Answer:
# The distance between two clusters is defined by a linkage criterion, which aggregates the pairwise distances between their points. The pairwise distances themselves come from a point-level metric such as Euclidean, Manhattan, or cosine distance.
# Common linkage criteria:
# 1. **Single Linkage**: Distance between the closest pair of points in the two clusters.
# 2. **Complete Linkage**: Distance between the farthest pair of points in the two clusters.
# 3. **Average Linkage**: Average distance between all pairs of points in the two clusters.
# 4. **Ward's Linkage**: Merges the pair of clusters whose union gives the smallest increase in total within-cluster variance; it assumes Euclidean distance.
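
# A small worked example may make the four criteria concrete. The two toy clusters
# (cluster_a, cluster_b) are assumptions made up for illustration. The Ward quantity
# computed here is the increase in within-cluster sum of squares caused by the merge,
# n_a * n_b / (n_a + n_b) * ||centroid_a - centroid_b||^2, which is not the same number
# that scipy reports as its Ward merge height.
import numpy as np
from scipy.spatial.distance import cdist

cluster_a = np.array([[0.0, 0.0], [0.0, 1.0]])
cluster_b = np.array([[3.0, 0.0], [4.0, 0.0]])

# All pairwise Euclidean distances between points of A and points of B
pairwise = cdist(cluster_a, cluster_b)

single_d = pairwise.min()     # closest pair
complete_d = pairwise.max()   # farthest pair
average_d = pairwise.mean()   # mean over all pairs

n_a, n_b = len(cluster_a), len(cluster_b)
centroid_gap = cluster_a.mean(axis=0) - cluster_b.mean(axis=0)
ward_increase = (n_a * n_b) / (n_a + n_b) * np.dot(centroid_gap, centroid_gap)

print(f"single={single_d:.3f}, complete={complete_d:.3f}, "
      f"average={average_d:.3f}, ward increase={ward_increase:.3f}")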

from scipy.cluster.hierarchy import dendrogram, linkage

# Perform hierarchical clustering using ward linkage
linked = linkage(X, method='ward')

# Dendrogram plotting
plt.figure(figsize=(10, 7))
dendrogram(linked)
plt.title('Dendrogram')
plt.show()

# Q4. How do you determine the optimal number of clusters in hierarchical clustering, and what are some common methods used for this purpose?
# Answer:
# Common methods to determine the optimal number of clusters in hierarchical clustering:
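# 1. **Dendrogram**: Inspect the dendrogram and cut the tree where the vertical gap between successive merges is largest; the number of branches crossing the cut is the number of clusters.
# 2. **Gap Statistic**: Compares the within-cluster dispersion for each K with the dispersion expected under a reference distribution that has no cluster structure (for example uniform random data), and favors the K at which that gap is largest.
# 3. **Silhouette Score**: Measures how close each point is to its own cluster compared with the nearest other cluster; values range from -1 to 1, and the K with the highest average score is preferred.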
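
# The silhouette approach can be sketched as a loop over candidate values of K. The
# candidate range 2..7 and the names X_sil, scores, and best_k are assumptions for this
# example; on real data the range should reflect what is plausible for the problem.
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X_sil, _ = make_blobs(n_samples=300, centers=4, random_state=42)

scores = {}
for k in range(2, 8):
    labels_k = AgglomerativeClustering(n_clusters=k).fit_predict(X_sil)
    # Average silhouette over all points for this choice of k
    scores[k] = silhouette_score(X_sil, labels_k)

# Pick the k with the highest average silhouette
best_k = max(scores, key=scores.get)
print("Silhouette scores:", {k: round(float(s), 3) for k, s in scores.items()})
print("Best k by silhouette:", best_k)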

# Q5. What are dendrograms in hierarchical clustering, and how are they useful in analyzing the results?
# Answer:
# Dendrograms are tree diagrams that record the order in which clusters were merged (or split); the height of each join is the linkage distance between the two clusters it connects.
# They are useful because they show the whole hierarchy at once: long vertical branches indicate well-separated groups, and cutting the tree at a chosen height yields a flat clustering with the corresponding number of clusters.
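
# Cutting the tree can also be done programmatically with scipy's fcluster. This is a
# sketch under assumptions: the names X_cut and Z are illustrative, and the "largest gap"
# rule is a heuristic that often favors a small number of clusters.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.datasets import make_blobs

X_cut, _ = make_blobs(n_samples=300, centers=4, random_state=42)
Z = linkage(X_cut, method='ward')

# Option 1: cut so that at most 4 flat clusters remain
labels_by_count = fcluster(Z, t=4, criterion='maxclust')

# Option 2: cut halfway across the largest gap between successive merge heights
heights = Z[:, 2]
widest = np.diff(heights).argmax()
cut_height = (heights[widest] + heights[widest + 1]) / 2
labels_by_height = fcluster(Z, t=cut_height, criterion='distance')

print("Clusters from maxclust cut:", len(set(labels_by_count)))
print("Clusters from largest-gap cut:", len(set(labels_by_height)))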

# Q6. Can hierarchical clustering be used for both numerical and categorical data? If yes, how are the distance metrics different for each type of data?
# Answer:
# Yes, hierarchical clustering can be used for both numerical and categorical data, provided the distance metric suits the data type.
# 1. **For Numerical Data**: Common distance metrics are Euclidean distance, Manhattan distance, or other continuous metrics.
# 2. **For Categorical Data**: Metrics such as **Hamming distance** (the fraction of attributes on which two records differ) or **Jaccard distance** (1 minus the Jaccard similarity of two sets) are used.

# Ward linkage assumes Euclidean distance, so it suits numerical data only. For categorical data, compute a Hamming or Jaccard distance matrix and apply single, complete, or average linkage to it.
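
# The categorical case can be sketched with scipy by passing a condensed Hamming distance
# matrix to linkage. The records array is hypothetical, integer-encoded data invented for
# this example; average linkage is one reasonable choice, not the only one.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, fcluster

# Six hypothetical records with four categorical attributes each
records = np.array([
    [0, 1, 2, 0],
    [0, 1, 2, 1],
    [0, 1, 0, 0],
    [3, 2, 1, 2],
    [3, 2, 1, 0],
    [3, 0, 1, 2],
])

# Hamming distance: fraction of attributes on which two records differ
hamming_d = pdist(records, metric='hamming')

# linkage accepts the condensed distance vector directly
Z_cat = linkage(hamming_d, method='average')
labels_cat = fcluster(Z_cat, t=2, criterion='maxclust')
print(labels_cat)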

# Q7. How can you use hierarchical clustering to identify outliers or anomalies in your data?
# Answer:
# Hierarchical clustering can flag outliers or anomalies through the structure of the dendrogram:
# 1. Outliers typically join the tree very late, at a large merge height, either as single points or as very small clusters.
# 2. Cutting the dendrogram at a high distance threshold leaves most points in a few large clusters; any singleton or very small clusters that remain are candidate outliers.
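
# A minimal sketch of this idea, assuming single linkage, a distance threshold of 10, and a
# "small cluster" cutoff of 2 points; all three are assumptions tuned to this toy data, and
# the far-away point at (50, 50) is planted deliberately as a known outlier.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.datasets import make_blobs

X_base, _ = make_blobs(n_samples=100, centers=2, cluster_std=0.5, random_state=0)
X_out = np.vstack([X_base, [[50.0, 50.0]]])  # index 100 is the planted outlier

Z_out = linkage(X_out, method='single')
labels_out = fcluster(Z_out, t=10.0, criterion='distance')

# Cluster sizes; index 0 is unused because fcluster labels start at 1
sizes = np.bincount(labels_out)
small_clusters = np.flatnonzero((sizes > 0) & (sizes <= 2))

# Points that belong to a very small cluster are reported as candidate outliers
outlier_idx = np.flatnonzero(np.isin(labels_out, small_clusters))
print(outlier_idx)  # prints [100]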
