# Supervised Learning vs Unsupervised Learning (Simple Comparison)
This notebook demonstrates a simple example comparing **supervised learning** and **unsupervised learning** using scikit-learn.
Both might produce a plot where:
- Points are colored by group.
- Groups are separated with boundaries or circles.

**BUT**:
- In **supervised learning**, the colors come from **known labels**.
- In **unsupervised learning**, the model **guessed** the groups from scratch.


In [None]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans
import seaborn as sns
sns.set(style="whitegrid")

In [None]:
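# Generate 90 two-dimensional points around 3 centers; y holds the true label (0, 1, or 2) of each point.
# cluster_std=3.0 makes the groups overlap, so neither model can separate them perfectly.
X, y = make_blobs(n_samples=90, centers=3, cluster_std=3.0, random_state=42)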

In [None]:
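# Train classifier (Supervised Learning): the model sees both the points and their labels
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# multi_class='ovr' is deprecated in recent scikit-learn releases, so rely on the default multiclass handling
clf = LogisticRegression(solver='lbfgs')
clf.fit(X_train, y_train)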
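
The split above sets aside `X_test` and `y_test`, but the notebook never uses them. A minimal sketch of how the held-out points could be scored, assuming the same data and split as above; the variable names `y_pred` and `test_accuracy` are illustrative and not part of the original notebook:

```python
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Same synthetic data and split as in the cells above
X, y = make_blobs(n_samples=90, centers=3, cluster_std=3.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

clf = LogisticRegression()
clf.fit(X_train, y_train)

# Score the classifier on points it never saw during training
y_pred = clf.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
print(f"Held-out accuracy: {test_accuracy:.2f}")
```

Because the blobs overlap, the accuracy is unlikely to be perfect. Such a score is only possible because the true labels are known.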

In [None]:
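# Train clusterer (Unsupervised Learning): the model sees only the points, never y
# n_init is set explicitly because its default differs between scikit-learn versions
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
y_kmeans = kmeans.fit_predict(X)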
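
The cluster ids returned by `KMeans` are arbitrary: cluster 0 need not correspond to class 0. One way to check how well the discovered groups line up with the true labels, sketched here under the assumption that the labels are available for evaluation only, is the adjusted Rand index, which ignores how the groups are numbered:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score

# Same synthetic data as in the cells above
X, y = make_blobs(n_samples=90, centers=3, cluster_std=3.0, random_state=42)

# KMeans is fitted on X alone; y is never passed to the model
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
y_kmeans = kmeans.fit_predict(X)

# 1.0 means the clusters match the true classes exactly (up to renaming);
# values near 0 mean the agreement is about what random grouping would give
ari = adjusted_rand_score(y, y_kmeans)
print(f"Adjusted Rand index: {ari:.2f}")
```

Here `y` is used only to grade the result after the fact. In a real unsupervised setting there would be no labels to compare against.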

In [None]:
# Plot classification vs clustering
fig, axes = plt.subplots(1, 2, figsize=(14, 6))

# Classification plot
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.linspace(x_min, x_max, 300),
                     np.linspace(y_min, y_max, 300))
Z = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
axes[0].contourf(xx, yy, Z, alpha=0.25, cmap='Pastel1')
markers = ['o', 's', 'X']
colors = ['r', 'g', 'b']
for i in range(3):
    axes[0].scatter(X[y == i, 0], X[y == i, 1], 
                    label=f'Class {i}', marker=markers[i], color=colors[i], s=70, edgecolor='k')
axes[0].set_title("Classification\nSupervised Learning", fontsize=14)
axes[0].legend()
axes[0].set_xticks([])
axes[0].set_yticks([])

# Clustering plot
# Cluster ids from KMeans are arbitrary, so these colors need not match the class colors on the left
for i in range(3):
    cluster_points = X[y_kmeans == i]

    # Scatter plot of the points assigned to cluster i
    axes[1].scatter(cluster_points[:, 0], cluster_points[:, 1],
                    s=70, color=colors[i], alpha=0.7, edgecolor='k')

    # Draw a circle centered on the cluster mean that encloses every point in the cluster
    center = cluster_points.mean(axis=0)
    radius = np.max(np.linalg.norm(cluster_points - center, axis=1)) * 1.1  # slight padding
    circle = plt.Circle(center, radius, color=colors[i], fill=False, linestyle='--', linewidth=2, alpha=0.6)
    axes[1].add_patch(circle)

axes[1].set_title("Clustering\nUnsupervised Learning", fontsize=14)
axes[1].set_xticks([])
axes[1].set_yticks([])

plt.tight_layout()
plt.show()

## Supervised vs. Unsupervised: The Plots Look Similar, **BUT**:
- In **supervised learning**, the colors come from **known labels**.
- In **unsupervised learning**, the model **guessed** the groups from scratch.

### Supervised Learning
> “You already know what the categories are. You’re teaching the model to recognize them.”
- You **have labeled data** (examples with correct answers).
- The goal is to **assign known categories** to new data.
- It’s like teaching with a **cheat sheet**.
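
As a rough illustration of assigning known categories to new data, the sketch below fits a classifier on labeled points and then asks it about two unseen points. The coordinates in `new_points` are made up for the example, and the data is the same synthetic set used earlier:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

# Labeled data: X holds the points, y holds the correct answers
X, y = make_blobs(n_samples=90, centers=3, cluster_std=3.0, random_state=42)
clf = LogisticRegression().fit(X, y)

# Hypothetical new observations with no label attached
new_points = np.array([[0.0, 0.0], [5.0, 5.0]])

# predict returns one of the known classes for each point
predicted = clf.predict(new_points)
# predict_proba shows how confident the model is in each known class
probabilities = clf.predict_proba(new_points)

print(predicted)
print(probabilities.round(2))
```

Every prediction is drawn from `clf.classes_`, the categories seen during training. The model cannot invent a new one.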

### Unsupervised Learning
> “You have no idea what the categories are. The model tries to discover them on its own.”
- You **don’t have labeled data**.
- The goal is to **discover structure** or **groupings**.
- It’s like organizing without a cheat sheet, using only statistical cues.
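
A small sketch of what "discovering structure" can look like in code, assuming the same synthetic data as earlier. Only `X` is passed to the model, and the groups it finds are described by their centers and sizes instead of by named categories:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# The labels returned by make_blobs are deliberately discarded
X, _ = make_blobs(n_samples=90, centers=3, cluster_std=3.0, random_state=42)

# fit receives only X; the number of groups (3) is a choice made by the user
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)

# Each discovered group is summarized by its center ...
centers = kmeans.cluster_centers_
# ... and by how many points were assigned to it
sizes = np.bincount(kmeans.labels_, minlength=3)

print(centers.round(2))
print(sizes)
```

The number of clusters still has to be supplied up front, so the model finds groupings within a structure the user proposes.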

### Summary
| Concept                   | Summary                                         |
|---------------------------|-------------------------------------------------|
| **Supervised learning**   | “Tell me which known category this belongs to.” |
| **Unsupervised learning** | “Help me find natural groupings in this mess.”  |
