### Unsupervised learning
Unsupervised learning is a type of machine learning in which the model learns patterns from data that has no labels.
- Input → X only (no target y)
- Model finds hidden structure:
    - Groups
    - Patterns
    - Relationships

✔ One-line definition:
Unsupervised learning discovers hidden patterns in unlabeled data.

Clustering: Grouping similar data points.

Common Algorithms:
-   K-Means
-   Hierarchical Clustering
-   DBSCAN

Real-life Uses:
- Customer Segmentation
- Market Segmentation
- Document Clustering
- Image Segmentation

Dimensionality Reduction: Reducing the number of features while keeping important information.

Algorithms:
- PCA (Principal Component Analysis)
- t-SNE
- Autoencoders

Uses:
- Compress images
- Speed up ML models
- Visualize high-dimensional data

#### K-Means Clustering
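How K-Means works (simple steps):
1. Choose the number of clusters K
2. Select K initial centroids randomly
3. Assign each point to the nearest centroid
4. Recalculate each centroid as the mean of the points assigned to it
5. Repeat steps 3 and 4 until the assignments stop changing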
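
The steps above can be sketched in plain Python. This is a minimal illustration, not a production implementation: the function name `kmeans`, its `max_iter` and `seed` parameters, and the sample points are all assumptions made for this example. In practice a library such as scikit-learn's `KMeans` would normally be used.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Basic K-Means on a list of equal-length tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Pick K distinct data points as the initial centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assign each point to the index of its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels


# Illustrative data: two visibly separate groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
centroids, labels = kmeans(points, k=2)
print(labels)
print(centroids)
```

On this toy data the first three points end up in one cluster and the last three in the other; which group receives label 0 depends on the random starting centroids.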

Important parameters:
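- n_clusters → number of clusters (K)
- random_state → seed for the random initialization, so results are reproducible
- n_init → number of runs with different starting centroids; the run with the lowest inertia is kept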

| Term         | Meaning                                                           |
| ------------ | ----------------------------------------------------------------- |
| **Cluster**  | Group of similar data points                                      |
| **Centroid** | Center of a cluster (mean of its points)                          |
| **Inertia**  | Sum of squared distances from each point to its assigned centroid |
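
To make centroid and inertia concrete, here is a small worked sketch. It takes inertia to be the sum of squared distances from each point to the centroid of its cluster (lower means tighter clusters). The cluster names, the data, and the helper functions `centroid` and `inertia` are made up for illustration.

```python
import math

# Two hypothetical clusters of 2-D points (illustrative data only).
clusters = {
    "a": [(1.0, 1.0), (3.0, 1.0)],
    "b": [(8.0, 8.0), (8.0, 10.0)],
}


def centroid(points):
    """Mean point of a cluster, computed feature by feature."""
    return tuple(sum(dim) / len(points) for dim in zip(*points))


def inertia(clusters):
    """Sum of squared distances from every point to its own cluster's centroid."""
    total = 0.0
    for points in clusters.values():
        c = centroid(points)
        total += sum(math.dist(p, c) ** 2 for p in points)
    return total


print(centroid(clusters["a"]))  # (2.0, 1.0)
print(centroid(clusters["b"]))  # (8.0, 9.0)
print(inertia(clusters))        # 4.0
```

Every point here sits exactly 1 unit from its centroid, so the inertia is 1 + 1 + 1 + 1 = 4.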

Elbow Method (Choosing Best K):
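The Elbow Method is a heuristic for choosing a reasonable number of clusters.
Steps:
- Run K-Means for K = 1 to 10
- Calculate the inertia for each K
- Plot K vs inertia
- Inertia generally keeps falling as K grows, so pick the K where the curve bends like an elbow, i.e. where adding more clusters stops reducing inertia by much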
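
A rough sketch of the elbow procedure in plain Python is shown below. It prints the inertia per K instead of plotting it, and only goes up to K = 6 because the toy dataset is tiny. The function `best_inertia`, its parameters, and the data are assumptions for this example; the restart loop imitates the idea behind `n_init`.

```python
import math
import random


def best_inertia(points, k, n_init=10, max_iter=100, seed=0):
    """Run a basic K-Means n_init times and return the lowest inertia found."""
    rng = random.Random(seed)
    best = math.inf
    for _ in range(n_init):
        centroids = rng.sample(points, k)  # random starting centroids
        for _ in range(max_iter):
            # Assign each point to its nearest centroid.
            labels = [
                min(range(k), key=lambda c: math.dist(p, centroids[c]))
                for p in points
            ]
            # Recompute each centroid as the mean of its members.
            updated = []
            for c in range(k):
                members = [p for p, lab in zip(points, labels) if lab == c]
                if members:
                    updated.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
                else:
                    updated.append(centroids[c])  # empty cluster keeps its centroid
            if updated == centroids:  # centroids stopped moving
                break
            centroids = updated
        # Inertia: squared distance from each point to its closest centroid.
        inertia = sum(min(math.dist(p, c) ** 2 for c in centroids) for p in points)
        best = min(best, inertia)
    return best


# Illustrative data: three small groups of 2-D points.
points = [
    (1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
    (8.0, 8.0), (8.0, 9.0), (9.0, 8.0),
    (1.0, 8.0), (1.0, 9.0), (2.0, 8.0),
]
ks = range(1, 7)
inertias = [best_inertia(points, k) for k in ks]
for k, value in zip(ks, inertias):
    print(f"K={k}: inertia={value:.2f}")
```

With three well-separated groups like these, the printed inertia would typically fall steeply up to K = 3 and flatten afterwards, which is the bend to look for. To see the curve itself, the `ks` and `inertias` lists can be passed to a plotting library.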

#### Summary Table
| Concept   | Supervised                       | Unsupervised                          |
| --------- | -------------------------------- | ------------------------------------- |
| Input     | X + y                            | Only X                                |
| Goal      | Predict output                   | Find patterns                         |
| Examples  | Regression, Classification       | K-Means, PCA                          |
| Use Cases | Score prediction, Spam detection | Customer segmentation, Topic modeling |

### PCA (Principal Component Analysis)
PCA is a dimensionality reduction technique that projects high-dimensional data onto fewer dimensions while preserving as much of the variance as possible.

One-line definition:
PCA reduces the number of features by creating new features called principal components, which are linear combinations of the original features.

Why Do We Use PCA?
When datasets have too many features, ML models may become:
- Slow
- Hard to train
- Prone to overfitting
- Hard to visualize

PCA helps by:
- Reducing noise (low-variance components are dropped)
- Replacing correlated features with uncorrelated components
- Speeding up training
- Making visualization possible in 2D/3D

What PCA Actually Does (Simple Explanation)
PCA:
1. Finds the directions (axes) along which the data varies the most; these directions are called principal components (PCs)
2. Projects the data onto these new axes
3. Keeps only the components that capture the most variance
4. Drops the remaining, less important ones

Key Terms You Must Know
| Term                         | Meaning                                                        |
| ---------------------------- | -------------------------------------------------------------- |
| **Principal Component (PC)** | New feature created by PCA                                     |
| **PC1**                      | Direction with maximum variance (most important)               |
| **PC2**                      | Direction with the next-highest variance, orthogonal to PC1    |
| **Variance**                 | Spread of data (PCA treats more variance as more information)  |
| **Explained Variance Ratio** | Fraction of the total variance captured by each component      |
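
The explained variance ratio can be written as a formula. The notation is introduced here for illustration and is not from the notes: $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_d$ are the eigenvalues of the covariance matrix, one per principal component, and $d$ is the number of original features.

$$
\text{explained variance ratio of PC}_i = \frac{\lambda_i}{\sum_{j=1}^{d} \lambda_j}
$$

For instance, eigenvalues of 3, 1 and 1 would give PC1 a ratio of $3 / (3 + 1 + 1) = 0.6$, i.e. PC1 alone would capture 60% of the total variance.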


PCA Workflow (Simple Steps)
- Step 1: Standardize the data (important because PCA is sensitive to feature scale)
- Step 2: Calculate the covariance matrix, which shows how the features vary together
- Step 3: Find the eigenvalues and eigenvectors of the covariance matrix
    - Eigenvectors → directions of the principal components
    - Eigenvalues → variance captured along each component
- Step 4: Sort components by eigenvalue, largest first (PC1 > PC2 > PC3…)
- Step 5: Keep only the top K components and project the data onto them to form the reduced dataset
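
The five steps can be traced by hand on a tiny dataset. The sketch below is a simplified illustration that only handles two correlated features, because it uses the closed-form eigen-solution for a symmetric 2x2 matrix. The data, variable names, and the helper `unit_eigenvector` are assumptions for this example; real projects would usually rely on a numerical library instead.

```python
import math

# Hypothetical data: 5 samples with 2 correlated features (say, height and weight).
data = [(150.0, 50.0), (160.0, 56.0), (170.0, 65.0), (180.0, 72.0), (190.0, 82.0)]
n = len(data)

# Step 1: standardize each feature to mean 0 and standard deviation 1.
columns = list(zip(*data))
means = [sum(col) / n for col in columns]
stds = [math.sqrt(sum((v - m) ** 2 for v in col) / n) for col, m in zip(columns, means)]
z = [tuple((v - m) / s for v, m, s in zip(row, means, stds)) for row in data]

# Step 2: covariance matrix [[a, b], [b, d]] of the standardized data.
a = sum(x * x for x, _ in z) / n
d = sum(y * y for _, y in z) / n
b = sum(x * y for x, y in z) / n

# Step 3: eigenvalues of a symmetric 2x2 matrix (closed form).
half_trace = (a + d) / 2
radius = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
# Step 4: listing the larger eigenvalue first sorts the components by variance.
eigenvalues = [half_trace + radius, half_trace - radius]


def unit_eigenvector(lam):
    """Unit vector solving (a - lam) * vx + b * vy = 0; assumes b != 0."""
    vx, vy = b, lam - a
    norm = math.hypot(vx, vy)
    return (vx / norm, vy / norm)


components = [unit_eigenvector(lam) for lam in eigenvalues]

# Step 5: keep the top K = 1 component and project the data onto it.
pc1 = components[0]
reduced = [x * pc1[0] + y * pc1[1] for x, y in z]

explained = [lam / sum(eigenvalues) for lam in eigenvalues]
print("PC1 direction:", pc1)
print("Explained variance ratio:", explained)
print("Reduced data (1 feature per sample):", reduced)
```

Because the two features rise together almost perfectly, PC1 points along the diagonal (roughly equal weight on both standardized features) and captures nearly all of the variance, so one component is enough to describe this dataset.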

When to Use PCA?
Use when:
- There are too many features
- Features are highly correlated
- You need a 2D/3D visualization
- You need to speed up model training

Don’t use when:
- Features must stay interpretable (principal components mix the original features)
- The structure in the data is non-linear (consider t-SNE or UMAP instead)