# Unsupervised Learning

Unsupervised learning is a type of machine learning where the model learns patterns from unlabeled data. The goal is to discover the underlying structure, groupings, or features in the data without explicit supervision.

## Key Concepts
- **No labels**: Only input data (X), no output labels (y)
- **Clustering**: Grouping similar data points (e.g., K-Means)
- **Dimensionality Reduction**: Reducing the number of features (e.g., PCA)

In this notebook, we'll explore unsupervised learning with Python examples.

## Clustering: K-Means

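K-Means is a popular clustering algorithm that partitions data into k clusters. It alternates between assigning each point to its nearest cluster centroid and moving each centroid to the mean of its assigned points, until the assignments stop changing. The number of clusters k must be chosen in advance.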

Let's see a K-Means clustering example using synthetic data.

In [None]:
# K-Means Clustering Example
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

# Generate synthetic data; the true blob labels are discarded (unsupervised setting)
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.60, random_state=0)

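# Fit K-Means and get a cluster index for each point
# (fixed random_state and explicit n_init keep results reproducible across runs and versions)
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
y_kmeans = kmeans.fit_predict(X)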

# Plot clusters
plt.scatter(X[:, 0], X[:, 1], c=y_kmeans, cmap='viridis')
plt.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s=200, c='red', marker='X')
plt.title('K-Means Clustering')
plt.show()
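
To make the assignment and update steps behind K-Means concrete, here is a minimal NumPy sketch of the classic iteration (often called Lloyd's algorithm). It is an illustration under simple assumptions, not scikit-learn's implementation: the function name `lloyd_kmeans`, the toy data `X_demo`, and the random initialization from data points are all choices made for this example.

```python
import numpy as np

def lloyd_kmeans(X, k, n_iters=100, seed=0):
    """Minimal K-Means sketch (hypothetical helper, not scikit-learn's code)."""
    rng = np.random.default_rng(seed)
    # Initialize centroids by picking k distinct data points at random
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: squared distance from every point to every centroid
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its assigned points
        # (an empty cluster keeps its previous centroid)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        # Stop once the centroids no longer move
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Toy data (assumed for illustration): three Gaussian blobs in 2D
rng = np.random.default_rng(42)
true_centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X_demo = np.vstack([c + 0.5 * rng.standard_normal((50, 2)) for c in true_centers])

labels, centers = lloyd_kmeans(X_demo, k=3)
print(centers.round(2))
```

The result depends on the initial centroids, and an unlucky start can leave two centroids splitting one blob. This is why scikit-learn's `KMeans` uses the `k-means++` initialization by default and can rerun the algorithm several times via `n_init`.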

## Dimensionality Reduction: PCA

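Principal Component Analysis (PCA) reduces the number of features by projecting the data onto a small set of orthogonal directions (the principal components) along which the data varies the most, so that most of the variance is retained. It is useful for visualization and for speeding up downstream learning algorithms. PCA is sensitive to feature scale, so features measured in different units are usually standardized first; the Iris measurements are all in centimeters, so they are used as-is here.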

Let's see a PCA example using the Iris dataset.

In [None]:
# PCA Example on Iris Dataset
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt

# Load data
iris = load_iris()
X = iris.data
y = iris.target

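# Apply PCA, keeping the two directions of largest variance
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)

# Fraction of the total variance captured by each component
print(pca.explained_variance_ratio_)

# Plot PCA result (species labels are used only for coloring; PCA never sees them)
plt.scatter(X_pca[:, 0], X_pca[:, 1], c=y, cmap='viridis')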
plt.xlabel('Principal Component 1')
plt.ylabel('Principal Component 2')
plt.title('PCA on Iris Dataset')
plt.show()
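
To show what "retaining most of the variance" means mechanically, here is a small NumPy sketch of PCA computed from the singular value decomposition of the centered data. It is a sketch under stated assumptions, not a drop-in replacement for `sklearn.decomposition.PCA`: the helper name `pca_svd` and the synthetic data `X_demo` are invented for this example, and the signs of the components may differ from scikit-learn's output.

```python
import numpy as np

def pca_svd(X, n_components):
    """Minimal PCA sketch via SVD (hypothetical helper for illustration)."""
    # Center each feature; principal directions are defined relative to the mean
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing singular value
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Variance along each direction is S**2 / (n_samples - 1)
    explained_variance = S ** 2 / (len(X) - 1)
    explained_ratio = explained_variance[:n_components] / explained_variance.sum()
    # Project the centered data onto the kept directions
    X_projected = X_centered @ components.T
    return X_projected, components, explained_ratio

# Assumed toy data: 4 features driven by 2 latent factors plus a little noise
rng = np.random.default_rng(0)
latent = rng.standard_normal((200, 2))
mixing = np.array([[2.0, 0.0, 1.0, 0.5],
                   [0.0, 1.0, -1.0, 0.5]])
X_demo = latent @ mixing + 0.05 * rng.standard_normal((200, 4))

X_proj, components, ratio = pca_svd(X_demo, n_components=2)
print("explained variance ratio:", ratio)
```

Because the toy data really has only two underlying factors, two components capture nearly all of the variance. On real data the ratio is the usual guide for deciding how many components to keep.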

## Summary

- Unsupervised learning finds patterns in unlabeled data.
- We explored clustering (K-Means) and dimensionality reduction (PCA) with Python examples.
- These techniques are useful for data exploration, visualization, and preprocessing.

Try experimenting with other algorithms like hierarchical clustering or t-SNE!
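
As one possible starting point for that experiment, the sketch below runs agglomerative (hierarchical) clustering on the same kind of synthetic blobs used for K-Means. The variable names `X_blobs` and `agg_labels`, and the choice of Ward linkage with 4 clusters, are assumptions made for illustration; other linkages and cluster counts are worth trying.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.cluster import AgglomerativeClustering

# Same style of synthetic data as the K-Means example
X_blobs, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.60, random_state=0)

# Ward linkage merges, at each step, the pair of clusters that least increases
# the within-cluster variance, until n_clusters groups remain
agg = AgglomerativeClustering(n_clusters=4, linkage="ward")
agg_labels = agg.fit_predict(X_blobs)

# Number of points assigned to each cluster
print(np.bincount(agg_labels))
```

Unlike `KMeans`, `AgglomerativeClustering` has no `predict` method for new points, so `fit_predict` labels only the data it was fitted on.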