Hierarchical clustering is an unsupervised machine learning method that groups similar data points into clusters based on their characteristics. In simple terms, it is like organizing items into groups according to how similar they are to each other.

Here's a simplified explanation of the agglomerative (bottom-up) form of hierarchical clustering, which is the form used in this notebook:

1. **Start with Each Data Point as its Own Cluster**:
   - Initially, each data point is treated as a separate cluster.

2. **Merge Similar Clusters**:
   - At each step, the two closest clusters are merged into a single cluster.
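   - The "closeness" of two clusters is determined by a distance metric between data points (such as Euclidean distance) together with a linkage criterion (such as single, complete, average, or Ward) that extends that distance to whole clusters.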
   - This process continues until all data points belong to a single cluster, forming a hierarchical structure of clusters.

3. **Visual Representation as a Dendrogram**:
   - The result of hierarchical clustering can be visualized as a dendrogram, which is a tree-like structure that shows the order and manner in which clusters were merged.
   - The height at which two branches join in the dendrogram represents the distance or dissimilarity between the clusters being merged.

4. **Determine the Number of Clusters**:
   - The dendrogram can help in choosing a reasonable number of clusters by identifying the level at which to cut the tree, typically where the vertical gap between successive merges is largest.
   - Cutting the dendrogram at a certain height results in a particular number of clusters.

5. **Use Cases**:
   - Hierarchical clustering is used in various fields, including biology (e.g., clustering genes based on expression levels), customer segmentation in marketing, and document clustering in natural language processing.
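
Steps 1 and 2 above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: it uses single linkage (the smallest gap between any two members) on one-dimensional values, not the Ward linkage used in the code cells further down, and the helper names `single_linkage` and `agglomerate` are made up for this example.

```python
from itertools import combinations

def single_linkage(a, b):
    """Smallest absolute difference between any member of a and any member of b."""
    return min(abs(x - y) for x in a for y in b)

def agglomerate(values):
    """Merge the two closest clusters until one remains; return the merge history."""
    clusters = [[v] for v in values]  # step 1: every point starts as its own cluster
    history = []
    while len(clusters) > 1:
        # step 2: find the pair of clusters with the smallest linkage distance
        dist, i, j = min(
            (single_linkage(clusters[i], clusters[j]), i, j)
            for i, j in combinations(range(len(clusters)), 2)
        )
        history.append((clusters[i], clusters[j], dist))
        merged = sorted(clusters[i] + clusters[j])
        # replace the two merged clusters with their union
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return history

marks = [10, 7, 28, 20, 35]  # the same toy values used in the code cells
for left, right, dist in agglomerate(marks):
    print(left, "+", right, "at distance", dist)
```

The recorded distances are exactly the heights a dendrogram would draw for these merges. This brute-force loop rescans every pair at every step, so it is only suitable for tiny inputs; library implementations use more efficient update rules.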

In summary, hierarchical clustering is a method for grouping data points into clusters based on their similarity, with the result represented as a dendrogram. It's an intuitive and visual approach to clustering that helps uncover structure in the data without requiring the number of clusters to be specified in advance.
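
To make the point about not fixing the number of clusters in advance more concrete, here is a minimal sketch that cuts one tree at several heights using SciPy's `fcluster`. The cut heights 5, 10, and 20 are arbitrary values chosen for this toy data, not recommendations.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Same toy marks as in the code cells, as a column of observations
marks = np.array([[10], [7], [28], [20], [35]], dtype=float)
Z = linkage(marks, method="ward")  # build the merge tree once

counts = {}
for height in (5, 10, 20):  # illustrative cut heights only
    # criterion="distance" keeps together points that merged below the cut height
    labels = fcluster(Z, t=height, criterion="distance")
    counts[height] = int(labels.max())  # labels start at 1, so max = cluster count
    print(f"cut at {height}: {counts[height]} clusters")
```

The tree is computed once and each cut is cheap, so different numbers of clusters can be compared without re-running the clustering.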

In [None]:
#loading libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
# Toy dataset: five marks in a single column
df = pd.DataFrame([10, 7, 28, 20, 35], columns=["Marks"])

In [None]:
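import scipy.cluster.hierarchy as shc

plt.figure(figsize=(10, 7))
plt.title("Dendrogram")
# Build the merge tree with Ward linkage and draw it
dend = shc.dendrogram(shc.linkage(df, method='ward'))
plt.ylabel("Merge distance (Ward)")
# Cut between the last two merge heights (about 13.3 and 29.7) to get two clusters
plt.axhline(y=20, color='r', linestyle='--')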
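
The heights drawn in the dendrogram can also be read directly from the linkage matrix, which helps when choosing where to place a cut line. The sketch below rebuilds the same toy data as a NumPy array so that it runs on its own; the variable names are illustrative.

```python
import numpy as np
import scipy.cluster.hierarchy as shc

marks = np.array([[10], [7], [28], [20], [35]], dtype=float)
Z = shc.linkage(marks, method="ward")

# Each row of Z is one merge: [cluster index a, cluster index b, distance, new cluster size].
# Indices 0..4 are the original points; indices 5 and up are clusters formed by earlier rows.
for a, b, dist, size in Z:
    print(f"merge {int(a)} and {int(b)} at height {dist:.2f} -> {int(size)} points")

merge_heights = Z[:, 2]  # non-decreasing; a cut between two heights fixes the cluster count
```

A horizontal line placed between two consecutive values of `merge_heights` separates the tree into a well-defined number of clusters, whereas a line placed exactly on a merge height is ambiguous.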

In [None]:
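# Run agglomerative clustering with two clusters
from sklearn.cluster import AgglomerativeClustering
# Ward linkage always uses Euclidean distance; the old `affinity` argument was removed in scikit-learn 1.4
cluster = AgglomerativeClustering(n_clusters=2, linkage='ward')
cluster.fit_predict(df)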
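
The array returned by `fit_predict` is easier to interpret once it is attached to the data. A small sketch, assuming the same toy `Marks` column; the names `labeled` and `summary` are introduced here for illustration.

```python
import pandas as pd
from sklearn.cluster import AgglomerativeClustering

df = pd.DataFrame([10, 7, 28, 20, 35], columns=["Marks"])
model = AgglomerativeClustering(n_clusters=2, linkage="ward")

# Attach each row's cluster label as a new column (the original df is left unchanged)
labeled = df.assign(cluster=model.fit_predict(df))

# Summarize each cluster by its range of marks and its size
summary = labeled.groupby("cluster")["Marks"].agg(["min", "max", "count"])
print(summary)
```

The label numbers themselves (0 or 1) are arbitrary identifiers, not a ranking. In this particular dataset the value 20 sits midway between the low and high groups, so it is worth checking which cluster it lands in rather than assuming.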