## Why Top 3 Vectors?

**The number of eigenvectors = number of desired clusters**

In spectral clustering theory:

1. **Graph Structure**: Our matrix M represents a graph where each data point is a node and each edge weight is the similarity between two points.

2. **Eigenvalue Theory**: The top eigenvectors (those with the largest eigenvalues) of the normalized adjacency matrix encode the cluster structure:
   - The **1st eigenvector** typically captures the overall connectivity; for a connected graph its entries simply track how strongly each point is connected, so on its own it does not split the data
   - The **2nd and 3rd eigenvectors** reveal the main divisions/separations in the data
   - For k clusters, we need the **top k eigenvectors**

3. **Cluster Information**: Each eigenvector provides a different "view" of how to separate the data:
   - Together, the top k eigenvectors define a new k-dimensional space in which clusters become more separable: each data point is represented by its k entries across those eigenvectors
   - In this new space the clusters are often more compact and "spherical", which K-means handles better
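
As a rough illustration of the steps above, here is a minimal NumPy sketch of the embedding. It assumes M is a symmetric, non-negative similarity matrix in which every point has positive total similarity, and it assumes "normalized adjacency matrix" means D^{-1/2} M D^{-1/2}. The function name `spectral_embedding`, the row-normalization step, and the toy matrix `M_example` are illustrative choices, not taken from the original code.

```python
import numpy as np

def spectral_embedding(M, k):
    """Map each data point to a k-dimensional row built from the top k eigenvectors.

    Assumes M is a symmetric similarity matrix with positive row sums.
    """
    M = np.asarray(M, dtype=float)
    degrees = M.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degrees)
    # Symmetric normalization: D^{-1/2} M D^{-1/2}
    normalized = d_inv_sqrt[:, None] * M * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order, so the top k are the last k columns
    _, eigenvectors = np.linalg.eigh(normalized)
    top_k = eigenvectors[:, -k:]
    # Scale each row to unit length (an assumed, commonly used extra step)
    norms = np.linalg.norm(top_k, axis=1, keepdims=True)
    return top_k / np.maximum(norms, 1e-12)

# Hypothetical toy data: 3 groups of 3 points, similar only within a group
M_example = np.kron(np.eye(3), np.ones((3, 3)))
embedding = spectral_embedding(M_example, k=3)
print(embedding.shape)  # (9, 3)
```

In this idealized case, points from the same group land on the same row vector and points from different groups land on orthogonal ones, which is the kind of layout K-means (for example scikit-learn's `KMeans`) separates easily. Real data will only approximate this.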

## Theoretical Justification

```python
# With k = 3 clusters we keep the top 3 eigenvectors. Roughly:
# - 1st eigenvector: captures the main connectivity pattern
# - 2nd eigenvector: captures the first major division
# - 3rd eigenvector: captures the second major division
# Together they span the "cluster subspace"; it is this span, not any
# single eigenvector, that separates the clusters.
```
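
One way to see why the count of eigenvectors matches the count of clusters is the idealized case where the clusters are completely disconnected from each other. Under the assumption that the matrix is normalized as D^{-1/2} M D^{-1/2}, each disconnected group contributes exactly one eigenvalue equal to 1, so k groups give k top eigenvectors. The sketch below checks this numerically; `count_unit_eigenvalues` and `M_ideal` are hypothetical names introduced for this example.

```python
import numpy as np

def count_unit_eigenvalues(M, tol=1e-8):
    """Count eigenvalues of D^{-1/2} M D^{-1/2} that equal 1 (within tol).

    Assumes M is symmetric with positive row sums.
    """
    M = np.asarray(M, dtype=float)
    d_inv_sqrt = 1.0 / np.sqrt(M.sum(axis=1))
    normalized = d_inv_sqrt[:, None] * M * d_inv_sqrt[None, :]
    eigenvalues = np.linalg.eigvalsh(normalized)
    return int(np.sum(np.abs(eigenvalues - 1.0) < tol))

# Hypothetical ideal case: three groups (sizes 4, 3, 5) with no similarity across groups
sizes = [4, 3, 5]
M_ideal = np.zeros((sum(sizes), sum(sizes)))
start = 0
for size in sizes:
    M_ideal[start:start + size, start:start + size] = 1.0
    start += size

print(count_unit_eigenvalues(M_ideal))  # prints 3, one per disconnected group
```

With real data the groups are weakly connected, so only the first eigenvalue stays exactly at 1 and the next k - 1 are merely close to it, but the same k eigenvectors still approximately span the cluster subspace.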

## What if we chose different numbers?

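- **Too few vectors (e.g. 1 or 2 when there are 3 clusters)**: We lose important cluster information, because clusters that are only separated along an omitted eigenvector get merged
- **Too many vectors (e.g. 5 or 10)**: We include noise and less meaningful patterns, because the extra eigenvectors have smaller eigenvalues and add dimensions that can blur the clusters for K-means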
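
To make the trade-off concrete, the sketch below looks at the eigenvalues themselves on a small synthetic example. It assumes the D^{-1/2} M D^{-1/2} normalization and uses made-up data (3 clusters of 4 points plus weak random cross-cluster similarity); `eigenvalue_spectrum` and `M_noisy` are hypothetical names for this illustration only.

```python
import numpy as np

def eigenvalue_spectrum(M):
    """Eigenvalues of D^{-1/2} M D^{-1/2}, sorted from largest to smallest.

    Assumes M is symmetric with positive row sums.
    """
    M = np.asarray(M, dtype=float)
    d_inv_sqrt = 1.0 / np.sqrt(M.sum(axis=1))
    normalized = d_inv_sqrt[:, None] * M * d_inv_sqrt[None, :]
    return np.linalg.eigvalsh(normalized)[::-1]

# Hypothetical data: strong similarity inside each of 3 clusters, weak noise everywhere
rng = np.random.default_rng(0)
noise = rng.uniform(0.0, 0.1, size=(12, 12))
noise = (noise + noise.T) / 2.0  # keep the matrix symmetric
M_noisy = np.kron(np.eye(3), np.ones((4, 4))) + noise

spectrum = eigenvalue_spectrum(M_noisy)
print(np.round(spectrum, 2))
```

In this setup the first three eigenvalues should stand well apart from the rest, which sit near zero. Keeping only one or two vectors would drop a direction with a large eigenvalue, while keeping five or ten would add directions that carry almost nothing but the noise.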

So yes, **the choice of 3 vectors is specifically because we want to find 3 clusters**. This is the standard convention in spectral clustering - the dimensionality of the embedding space is chosen to match the number of clusters you're looking for.