Agglomerative clustering and feature agglomeration are both clustering techniques, but they cluster different things:

* **Agglomerative Clustering:** This method groups data points (observations or samples) together based on their similarity. It starts with each data point in its own cluster and iteratively merges the most similar clusters until a desired number of clusters is reached or a stopping criterion is met.

* **Feature Agglomeration:** This method groups features (variables or attributes) together based on their similarity. Conceptually it runs the same algorithm on the transposed data matrix, where each row represents a feature and each column represents a data point. Each feature is therefore treated as a point whose coordinates are its values across all samples, and features whose values are close across the samples are merged first.
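The "transposed matrix" view can be checked directly in scikit-learn. The sketch below uses synthetic random data (an assumption for illustration, not the notebook's dataset) and compares the feature clusters with those obtained by clustering the transposed matrix as if features were samples.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, FeatureAgglomeration

# Synthetic stand-in data: 50 samples (rows), 6 features (columns).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))

# Agglomerative clustering assigns one label per sample (row).
sample_labels = AgglomerativeClustering(n_clusters=3).fit(X).labels_

# Feature agglomeration assigns one label per feature (column).
feature_labels = FeatureAgglomeration(n_clusters=2).fit(X).labels_

print(sample_labels.shape)   # (50,)
print(feature_labels.shape)  # (6,)

# Clustering the rows of X.T should give the same grouping of features.
transposed_labels = AgglomerativeClustering(n_clusters=2).fit(X.T).labels_
print(np.array_equal(feature_labels, transposed_labels))
```

The only difference between the two estimators here is which axis of `X` is treated as the set of items to cluster.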

Here's a table summarizing the key differences:

| Aspect                  | Agglomerative Clustering | Feature Agglomeration |
|-------------------------|--------------------------|------------------------|
| What is clustered        | Data points (observations/samples) | Features (variables/attributes) |
| Data matrix view         | Standard data matrix (rows: data points, columns: features) | Transposed data matrix (rows: features, columns: data points) |
| Purpose                  | Group similar data points together | Group similar features together |
| Use case                 | Unsupervised learning (data exploration, segmentation) | Dimensionality reduction (merging redundant features into pooled ones) |

**When to use which:**

* Use agglomerative clustering when you want to group similar data points together to identify patterns or segments within your data.
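* Use feature agglomeration when you have a high number of features and want to reduce dimensionality by grouping redundant or highly correlated features. Each group is replaced by a single pooled feature (the mean of the group by default in scikit-learn), so this is feature merging rather than feature selection. Fewer features can shorten training time and may reduce overfitting in downstream models.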
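As a rough illustration of the second case, the sketch below builds a synthetic matrix with deliberately redundant columns (three noisy copies each of two made-up signals) and reduces it with `FeatureAgglomeration`. The data and variable names are assumptions for the example only.

```python
import numpy as np
from sklearn.cluster import FeatureAgglomeration

rng = np.random.default_rng(42)
n_samples = 200

# Two independent latent signals (synthetic, for illustration only).
signal_a = rng.normal(size=n_samples)
signal_b = rng.normal(size=n_samples)

# Six features: columns 0-2 are noisy copies of signal_a, columns 3-5 of signal_b.
X = np.column_stack(
    [signal_a + 0.05 * rng.normal(size=n_samples) for _ in range(3)]
    + [signal_b + 0.05 * rng.normal(size=n_samples) for _ in range(3)]
)

agglo = FeatureAgglomeration(n_clusters=2)
X_reduced = agglo.fit_transform(X)

print(agglo.labels_)    # one cluster id per original feature
print(X_reduced.shape)  # (200, 2)

# With the default pooling function, each reduced column is the mean
# of the original columns assigned to that cluster.
for k in range(agglo.n_clusters):
    pooled = X[:, agglo.labels_ == k].mean(axis=1)
    assert np.allclose(X_reduced[:, k], pooled)
```

The reduced matrix keeps all 200 samples but replaces the six redundant columns with two pooled ones, which is the input a downstream model would then be trained on.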

Here are some additional points to consider:

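* Both techniques rely on a distance metric (such as Euclidean or cosine distance) together with a linkage criterion that defines the distance between two clusters. In scikit-learn the default linkage is Ward, which only works with Euclidean distance.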
* Agglomerative clustering builds a hierarchy of merges that can be drawn as a dendrogram showing how clusters are formed and merged. Feature agglomeration builds the same kind of hierarchy, only over features; in practice it is usually used for its pooled output rather than for the dendrogram.
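In scikit-learn, both estimators record their merge tree in `children_` and, when `compute_distances=True` is passed, the merge distances in `distances_`. The sketch below converts those attributes into a SciPy linkage matrix, following the approach used in the scikit-learn dendrogram example. The helper name `linkage_matrix` and the random data are assumptions for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram
from sklearn.cluster import AgglomerativeClustering, FeatureAgglomeration


def linkage_matrix(model):
    """Build a SciPy-style linkage matrix from a fitted agglomerative model."""
    n_leaves = len(model.labels_)
    counts = np.zeros(model.children_.shape[0])
    for i, (left, right) in enumerate(model.children_):
        for child in (left, right):
            # Indices below n_leaves are original items; larger ones are earlier merges.
            counts[i] += 1 if child < n_leaves else counts[child - n_leaves]
    return np.column_stack([model.children_, model.distances_, counts]).astype(float)


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))  # synthetic: 50 samples, 6 features

sample_model = AgglomerativeClustering(n_clusters=3, compute_distances=True).fit(X)
feature_model = FeatureAgglomeration(n_clusters=2, compute_distances=True).fit(X)

Z_samples = linkage_matrix(sample_model)    # 49 merges over 50 samples
Z_features = linkage_matrix(feature_model)  # 5 merges over 6 features

# no_plot=True returns the dendrogram layout without drawing it;
# drop the argument to draw the tree on the current matplotlib axes.
tree = dendrogram(Z_features, no_plot=True)
print(tree["leaves"])  # feature indices in dendrogram order
```

This relies on the full tree being computed, which is the default behaviour when `n_clusters` is small relative to the number of items.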


In [20]:
from sklearn.cluster import AgglomerativeClustering
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from joblib import dump

from matplotlib.markers import MarkerStyle
from matplotlib.patches import Circle
from sklearn.metrics import (
    calinski_harabasz_score,
    silhouette_score,
    davies_bouldin_score,
)

In [21]:
df = pd.read_csv("../../../Datasets/ClusterPoints.csv")
df.head()

Unnamed: 0,X,Y
0,33,43
1,48,10
2,15,32
3,41,16
4,47,11


In [22]:
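# compute_distances=True stores merge distances in aggl.distances_ (needed for a dendrogram).
# Note: df still contains the "Unnamed: 0" index column, so it is clustered along with X and Y.
aggl = AgglomerativeClustering(n_clusters=3, compute_distances=True)
aggl.fit(df)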

In [23]:
aggl.labels_

array([1, 2, 0, 2, 2, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 2, 1, 0, 1, 1,
       0, 0, 2, 2, 1, 2, 0, 0, 1, 2, 0, 1, 2, 0, 1, 0, 2, 2, 1, 0, 0, 2,
       0, 1, 2, 1, 2, 0, 1, 1, 2, 0, 2, 1, 0, 0, 2, 1, 2, 0, 0, 1, 1, 2,
       2, 2, 0, 1, 2, 0, 2, 2, 0, 2, 1, 2, 1, 1, 1, 2, 2, 0, 0, 0, 1, 0,
       1, 1, 0, 0, 2, 1, 1, 0, 0, 2, 1, 0])