
# 📊 Clustering Method Comparison Summary

This section provides a **comparison between Hierarchical Clustering and K-Means Clustering** applied to the *Mall Customers* dataset.

---

## 🔍 Dataset Overview
- Dataset: `Mall_Customers.csv`
- Features used: `Age`, `Annual Income (k$)`, `Spending Score (1-100)`
- Preprocessing: Standardized using `StandardScaler`
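
The standardization step might look roughly like the sketch below. It is not the project's actual code: the data here is a synthetic stand-in generated with NumPy (an assumption, since the CSV is not included in this summary), with value ranges chosen only for illustration. With the real file, the three feature columns listed above would be loaded from `Mall_Customers.csv` instead.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the three feature columns (assumed ranges, not real data).
rng = np.random.default_rng(0)
X = np.column_stack([
    rng.integers(18, 71, size=200),    # Age
    rng.integers(15, 138, size=200),   # Annual Income (k$)
    rng.integers(1, 101, size=200),    # Spending Score (1-100)
]).astype(float)

# Rescale each column to zero mean and unit variance so that no single
# feature dominates the Euclidean distances used by both clustering methods.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

col_means = X_scaled.mean(axis=0)  # approximately 0 for every column
col_stds = X_scaled.std(axis=0)    # approximately 1 for every column
```

Scaling matters here because income is measured in thousands of dollars while the spending score is a 1-100 index; without it, the feature with the widest numeric range would carry the most weight in the distance computation.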

---

## 🧠 Method 1: Hierarchical Clustering
- **Linkage Method:** Ward (at each step, merges the two clusters that give the smallest increase in within-cluster variance)
- **Number of Clusters:** 5 (chosen based on dendrogram)
- **Visualization:** Dendrogram, plus 2D and 3D PCA projections
- **Evaluation:**
  - Clusters looked clear and compact on visual inspection
  - Visual separation was strong in both the 2D and 3D PCA plots
- **Advantages:**
  - Doesn’t require `k` up front: the full merge tree is built first, and `k` is chosen afterwards by cutting the dendrogram
  - Good for small datasets
- **Limitations:**
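  - Computationally expensive for large data (standard agglomerative algorithms need at least quadratic time and memory in the number of samples)
  - New data cannot be assigned to existing clusters without re-running the clustering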
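
As a rough illustration of the workflow described above, the following sketch builds a Ward merge tree with SciPy and then cuts it into 5 flat clusters. It assumes synthetic blob data from `make_blobs` in place of the mall dataset, and it uses `scipy.cluster.hierarchy` as one possible implementation; the original analysis may have used a different library.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic 3-feature data standing in for the scaled customer features.
X, _ = make_blobs(n_samples=200, centers=5, n_features=3, random_state=42)
X_scaled = StandardScaler().fit_transform(X)

# Build the full merge tree once; no cluster count is needed at this stage.
# Each row of Z records one merge: the two merged clusters, the merge
# distance, and the size of the resulting cluster.
Z = linkage(X_scaled, method="ward")

# Cut the tree so that at most 5 flat clusters remain (labels start at 1).
labels = fcluster(Z, t=5, criterion="maxclust")
cluster_sizes = np.bincount(labels)[1:]
```

The same `Z` can be passed to `scipy.cluster.hierarchy.dendrogram` (which needs matplotlib) to draw the tree, and re-cut with a different `t` without recomputing the linkage.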

---

## ⚙️ Method 2: K-Means Clustering
- **Initialization:** k-means++ (spreads the initial centroids apart, which reduces the risk of converging to a poor local optimum)
- **Number of Clusters:** 5 (chosen using Elbow Method)
- **Evaluation:**
  - **Silhouette Score:** Computed to evaluate both cluster cohesion and separation
  - Clusters were well separated in the 2D and 3D PCA plots
- **Advantages:**
  - Efficient for large datasets
  - Easy to retrain, and new points can be assigned to the nearest existing centroid
- **Limitations:**
  - Sensitive to outliers and to initial centroid placement
  - Requires `k` to be chosen before fitting
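
The Elbow Method and silhouette evaluation mentioned above could be sketched as follows. This is a minimal example on synthetic `make_blobs` data (an assumption, not the mall dataset), and the range of `k` values is arbitrary. Inertia is collected for the elbow curve and the silhouette score for a second opinion on `k`.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic 3-feature data standing in for the scaled customer features.
X, _ = make_blobs(n_samples=200, centers=5, n_features=3, random_state=42)
X_scaled = StandardScaler().fit_transform(X)

inertias = {}     # within-cluster sum of squares, used for the elbow plot
silhouettes = {}  # mean silhouette coefficient, in [-1, 1]; higher is better

for k in range(2, 9):
    model = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=42)
    labels = model.fit_predict(X_scaled)
    inertias[k] = model.inertia_
    silhouettes[k] = silhouette_score(X_scaled, labels)

# The k with the highest silhouette score is one reasonable candidate.
best_k = max(silhouettes, key=silhouettes.get)
```

Plotting `inertias` against `k` gives the elbow curve; the "elbow" is the point after which adding clusters yields only small reductions in inertia. The silhouette-based `best_k` does not always agree with the elbow, so it helps to look at both.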

---

## ✅ Final Notes
- **Both methods produced similar clustering results**: each was run with 5 clusters, and the groups were similarly well separated in the PCA plots.
- **Ward linkage in Hierarchical Clustering** produced slightly tighter-looking clusters in the plots.
- **K-Means is more scalable** and is the better fit for larger or evolving datasets.
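
One way to check the similarity of the two results numerically, rather than by eye, is the adjusted Rand index (ARI) between the two label sets. The sketch below is an illustration on synthetic `make_blobs` data and was not part of the original analysis. It uses scikit-learn's `AgglomerativeClustering` for the Ward step.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import StandardScaler

# Synthetic 3-feature data standing in for the scaled customer features.
X, _ = make_blobs(n_samples=200, centers=5, n_features=3, random_state=42)
X_scaled = StandardScaler().fit_transform(X)

ward_labels = AgglomerativeClustering(n_clusters=5, linkage="ward").fit_predict(X_scaled)
kmeans_labels = KMeans(
    n_clusters=5, init="k-means++", n_init=10, random_state=42
).fit_predict(X_scaled)

# ARI ignores how clusters are numbered: 1.0 means identical partitions,
# values near 0 mean agreement no better than chance.
ari = adjusted_rand_score(ward_labels, kmeans_labels)
```

A high ARI on the real data would support the claim that the two methods found essentially the same customer segments.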

---

## 📌 Recommendation
If you want:
- **Quick clustering and scalability → Use K-Means**
- **Detailed exploration and no need for re-training → Use Hierarchical (Ward)**
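
The scalability point is easiest to see when new customers arrive after the model has been fitted. In the hedged sketch below (synthetic data, and `X_new` is a hypothetical batch made by copying two existing rows), a fitted K-Means model assigns new rows to clusters directly with `predict`, reusing the scaler fitted on the original data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic 3-feature data standing in for the customer features.
X, _ = make_blobs(n_samples=200, centers=5, n_features=3, random_state=42)

# Fit the scaler and the model once on the original data.
scaler = StandardScaler().fit(X)
kmeans = KMeans(n_clusters=5, init="k-means++", n_init=10, random_state=42)
kmeans.fit(scaler.transform(X))

# Hypothetical new observations: copies of two existing rows, for illustration.
X_new = X[:2].copy()

# New rows must go through the same scaler before being assigned
# to the nearest learned centroid.
new_labels = kmeans.predict(scaler.transform(X_new))
```

scikit-learn's `AgglomerativeClustering` has no `predict` method, so the hierarchical approach would need the clustering to be rerun on the combined data to place new customers.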
