# Implementation: The "Hello World" of Unsupervised Learning

**Goal**: Rediscover the 3 species of Iris flowers **without** looking at the labels.
We will use K-Means Clustering.

In [None]:
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans

# 1. Load Data
iris = load_iris()
X = iris.data
y_true = iris.target # We will NOT use this for training, only for checking our work later.

# 2. Unsupervised Learning (K-Means)
# We tell it to find 3 clusters (because we secretly know there are 3 species)
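# n_init=10 pins the number of restarts, because the default differs across scikit-learn versions
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)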
kmeans.fit(X)

labels_pred = kmeans.labels_

# 3. Visualization
# Let's compare the "Truth" vs "What the Machine Found"
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Plot 1: Ground Truth
axes[0].scatter(X[:, 0], X[:, 1], c=y_true, cmap='viridis', edgecolor='k')
axes[0].set_title("Ground Truth (Actual Species)")
axes[0].set_xlabel("Sepal Length")
axes[0].set_ylabel("Sepal Width")

# Plot 2: K-Means Clusters
# Note: Cluster IDs are arbitrary, so the colors may not match the left plot (Cluster 0 might be Virginica). That's fine.
axes[1].scatter(X[:, 0], X[:, 1], c=labels_pred, cmap='viridis', edgecolor='k')
axes[1].set_title("K-Means Clustering (Machine Discovery)")
axes[1].set_xlabel("Sepal Length")
axes[1].set_ylabel("Sepal Width")

plt.show()

## Conclusion
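Compare the two plots: K-Means (right plot) recovers much of the real species structure (left plot) purely from the geometry of the data, without ever seeing the labels. The match is not perfect. Setosa separates cleanly, but Versicolor and Virginica overlap, so some of those flowers land in the wrong cluster. Also keep in mind that the clustering used all four features, while the plots show only sepal length and sepal width.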
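
Eyeballing two scatter plots only goes so far, so here is a minimal sketch of how the agreement could be put into numbers. It is an addition to the notebook, not part of the original walkthrough, and it assumes SciPy is installed alongside scikit-learn. It prints a species-by-cluster table, the Adjusted Rand Index (1.0 means identical groupings, values near 0 mean chance-level agreement), and the number of flowers that agree once each cluster is matched to its best-fitting species.

```python
from scipy.optimize import linear_sum_assignment
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import adjusted_rand_score
from sklearn.metrics.cluster import contingency_matrix

# Same setup as the cell above, repeated so this snippet runs on its own
iris = load_iris()
X, y_true = iris.data, iris.target

kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels_pred = kmeans.fit_predict(X)

# Rows are the true species, columns are the cluster IDs.
# A clean clustering puts one large count in each row.
table = contingency_matrix(y_true, labels_pred)
print("Species (rows) vs. clusters (columns):")
print(table)

# The Adjusted Rand Index ignores how the clusters are numbered,
# so the arbitrary cluster IDs do not affect the score.
ari = adjusted_rand_score(y_true, labels_pred)
print(f"Adjusted Rand Index: {ari:.3f}")

# Pair each species with one cluster so that the total overlap is as large
# as possible (an assignment problem), then count the flowers that agree.
species_idx, cluster_idx = linear_sum_assignment(table, maximize=True)
matched = int(table[species_idx, cluster_idx].sum())
print(f"Flowers in the matching cluster: {matched} of {len(y_true)}")
```

The labels are used here only to grade the result after the fact. `kmeans` itself never sees `y_true`, so the learning step stays unsupervised.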