# K-Means Clustering

This notebook demonstrates K-Means clustering, a popular unsupervised learning algorithm that partitions data into k clusters by assigning each point to its nearest cluster center.

## 1. Import Libraries

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

## 2. Generate Example Data

In [None]:
# Generate 300 two-dimensional points around 4 blob centers; the true labels are discarded.
X, _ = make_blobs(n_samples=300, centers=4, random_state=42)
plt.scatter(X[:, 0], X[:, 1], c='gray', alpha=0.6)
plt.title('Data for K-Means')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.show()

## 3. Run K-Means Clustering

In [None]:
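# Fit K-Means with k=4 and get a cluster label for every point.
# n_init is set explicitly because its default differs across scikit-learn versions.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)
# Color points by assigned cluster and mark the fitted centers.
plt.scatter(X[:, 0], X[:, 1], c=labels, cmap='viridis', alpha=0.6)
centers = kmeans.cluster_centers_
plt.scatter(centers[:, 0], centers[:, 1], s=200, c='red', marker='X', label='Centers')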
plt.title('K-Means Clustering Results')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.legend()
plt.show()

## 4. How K-Means Works
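- Choose k, the number of clusters.
- Initialize k cluster centers, for example by picking random data points (scikit-learn's default is the k-means++ scheme).
- Assign each point to its nearest center.
- Recompute each center as the mean of its assigned points, then repeat the assignment and update steps until the centers stop moving.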
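
The steps above can be sketched directly in NumPy. This is a minimal illustration, not scikit-learn's implementation: the function name `kmeans_sketch`, its parameters, and the toy data are assumptions made for this example, and initialization simply picks k random data points.

```python
import numpy as np

def kmeans_sketch(points, k, n_iters=100, seed=0):
    """Minimal K-Means sketch (hypothetical helper, not scikit-learn's code)."""
    rng = np.random.default_rng(seed)
    # Initialization: pick k distinct data points as the starting centers.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment: squared Euclidean distance from every point to every center.
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Update: move each center to the mean of its points.
        # An empty cluster keeps its previous center.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        # Stop once the centers no longer move.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Two well-separated toy groups (made-up data for illustration).
toy = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
toy_labels, toy_centers = kmeans_sketch(toy, k=2)
print(toy_labels)
print(toy_centers)
```

On this toy data the first three points should share one label and the last three the other. Which group gets label 0 depends on the random starting centers.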

## 5. Choosing k (Elbow Method)

In [None]:
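# Fit K-Means for k = 1..9 and record the inertia of each fit.
inertia = []
ks = range(1, 10)
for k in ks:
    # Use a separate name so the fitted k=4 model from section 3 is not overwritten.
    model = KMeans(n_clusters=k, n_init=10, random_state=42)
    model.fit(X)
    inertia.append(model.inertia_)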
plt.plot(ks, inertia, 'o-')
plt.xlabel('Number of Clusters (k)')
plt.ylabel('Inertia (within-cluster sum of squares)')
plt.title('Elbow Method for Optimal k')
plt.show()
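
To make the plotted quantity concrete, the sketch below computes inertia by hand as the sum of squared distances from each point to its assigned center. The helper name `inertia_sketch` and the four-point dataset are assumptions made for illustration.

```python
import numpy as np

def inertia_sketch(points, labels, centers):
    """Sum of squared distances from each point to its assigned center (hypothetical helper)."""
    diffs = points - centers[labels]
    return float((diffs ** 2).sum())

# Made-up data: two pairs of points on a line.
pts = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 0.0], [12.0, 0.0]])

# k = 1: every point belongs to one cluster centered at the overall mean (6, 0).
one = inertia_sketch(pts, np.zeros(4, dtype=int), pts.mean(axis=0, keepdims=True))

# k = 2: the left pair and the right pair each get their own center.
two_labels = np.array([0, 0, 1, 1])
two_centers = np.array([[1.0, 0.0], [11.0, 0.0]])
two = inertia_sketch(pts, two_labels, two_centers)

print(one, two)  # 104.0 4.0
```

Inertia never increases as k grows, so the elbow method looks for the k after which further increases give only small reductions. Here the drop from 104 to 4 happens at k = 2, which matches the two visible groups.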

## 6. Summary
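- K-Means is a simple, efficient clustering algorithm built on repeated assignment and update steps.
- You must specify the number of clusters in advance; the elbow method is one heuristic for choosing it.
- Results depend on the initial centers and on the choice of k, so fix `random_state` when you need reproducible output.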
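
The sensitivity to initialization can be shown with a small NumPy sketch. The helper `lloyd_from`, which runs the assignment and update steps from explicitly given starting centers, and the four-point rectangle are assumptions made for this illustration.

```python
import numpy as np

def lloyd_from(points, centers, n_iters=50):
    """Run assignment/update steps from given starting centers (hypothetical helper)."""
    for _ in range(n_iters):
        # Assign each point to its nearest center.
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center; an empty cluster keeps its previous center.
        centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
    inertia = float(((points - centers[labels]) ** 2).sum())
    return labels, inertia

# Corners of a 4-by-1 rectangle (made-up data).
corners = np.array([[0.0, 0.0], [0.0, 1.0], [4.0, 0.0], [4.0, 1.0]])

# Start A: one center on the left side, one on the right.
labels_a, inertia_a = lloyd_from(corners, np.array([[0.0, 0.5], [4.0, 0.5]]))
# Start B: one center on the bottom edge, one on the top edge.
labels_b, inertia_b = lloyd_from(corners, np.array([[2.0, 0.0], [2.0, 1.0]]))

print(inertia_a, inertia_b)  # 1.0 16.0
```

Both runs are stable, but start B settles on a bottom/top split with sixteen times the inertia of the left/right split. Running several initializations and keeping the best result is the usual remedy for this.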