
# Overview

Machine Learning (ML) is a **subset of Artificial Intelligence (AI)** that focuses on teaching computers to **learn patterns from data** and **make predictions or decisions** without being explicitly programmed with fixed rules.

👉 Instead of writing step-by-step instructions, we provide **examples (data)**, and the algorithm learns the hidden relationships.

---

## Example to Understand ML

* Traditional programming:

  * Rules (explicitly coded) + Data → Output
* Machine Learning:

  * Data + Output (examples) → Algorithm learns rules → Predict new output

✨ Example: Predicting house prices

* Input: Size, Location, Number of rooms
* Output: House Price
* ML learns the mapping function:

  $$
  \text{Price} = f(\text{Size}, \text{Location}, \text{Rooms})
  $$

---
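As a rough sketch of the house-price mapping above, the block below fits a straight line to a handful of examples with ordinary least squares. It assumes, for simplicity, that price depends on size alone; the numbers and the names `fit_line` and `predict_price` are made up for illustration.

```python
# Hypothetical training examples: size in square metres, price in thousands.
sizes = [50.0, 70.0, 90.0, 110.0]
prices = [150.0, 200.0, 250.0, 300.0]


def fit_line(xs, ys):
    """Ordinary least squares for one feature: y ≈ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# "Learning" here means estimating the two numbers that define f.
slope, intercept = fit_line(sizes, prices)


def predict_price(size):
    """Apply the learned mapping f(Size) to a new house."""
    return slope * size + intercept


print(predict_price(100.0))  # 275.0 for this toy data
```

A real model would also use location and number of rooms; the single feature only keeps the arithmetic easy to follow.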

## Types of Machine Learning

1. **Supervised Learning**

   * Learn from labeled data (input + correct output given).
   * Task: Prediction.
   * Examples:

     * Regression (predict numbers, e.g., house prices).
     * Classification (predict categories, e.g., spam vs not spam).

2. **Unsupervised Learning**

   * Learn from unlabeled data (only input, no output given).
   * Task: Discover patterns.
   * Examples:

     * Clustering (grouping customers by purchase behavior).
     * Dimensionality reduction (compressing features for visualization).

3. **Reinforcement Learning**

   * Learn by interacting with the environment (trial and error).
   * Task: Decision making.
   * Examples:

     * Teaching a robot to walk.
     * Training a game-playing agent (e.g., AlphaGo, which defeated top human players at Go).

---
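To make the unsupervised case more concrete, here is a minimal k-means sketch that groups customers by purchase behavior without any labels. The customer data (visits per month, average basket value), the fixed starting centroids, and the helper names are assumptions chosen for illustration; real uses would normally scale the features and try several random initializations.

```python
import math


def nearest_index(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, initial_centroids, iterations=10):
    """Plain k-means: alternate between assigning points and moving centroids."""
    centroids = list(initial_centroids)
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            clusters[nearest_index(point, centroids)].append(point)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    labels = [nearest_index(point, centroids) for point in points]
    return centroids, labels


# Hypothetical customers: (visits per month, average basket value).
customers = [(1.0, 20.0), (2.0, 25.0), (1.5, 22.0), (9.0, 80.0), (10.0, 90.0), (9.5, 85.0)]
centroids, labels = kmeans(customers, initial_centroids=[customers[0], customers[3]])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

No "correct" group was ever supplied; the two groups emerge purely from distances between the inputs.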



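The table below summarizes these categories and adds **semi-supervised learning**, which sits between supervised and unsupervised learning.

| **Category**                 | **Goal**                                                                        | **Common Algorithms / Methods**                                                                                                                                                                        |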
| ---------------------------- | ------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| **Supervised Learning**      | Learn mapping from inputs → known outputs (labeled data)                        | - Linear/Logistic Regression <br> - Decision Trees, Random Forests <br> - SVM, kNN <br> - Naive Bayes <br> - Gradient Boosting (XGBoost, LightGBM, CatBoost) <br> - Neural Networks                    |
| **Unsupervised Learning**    | Find hidden structure in unlabeled data                                         | - Clustering: k-Means, DBSCAN, GMM, Hierarchical <br> - Dimensionality Reduction: PCA, ICA, t-SNE, UMAP, Autoencoders <br> - Association Rules: Apriori, FP-Growth <br> - Density Estimation (KDE, EM) |
| **Semi-Supervised Learning** | Learn from a mix of small labeled + large unlabeled dataset                     | - Self-training <br> - Label Propagation/Spreading <br> - Semi-supervised SVM <br> - Graph-based methods <br> - Semi-supervised Deep Learning (e.g., consistency regularization)                       |
| **Reinforcement Learning**   | Learn a policy of actions to maximize long-term rewards through trial and error | - Q-Learning <br> - Deep Q-Networks (DQN) <br> - Policy Gradient Methods (REINFORCE) <br> - Actor–Critic Methods (A2C, A3C) <br> - Proximal Policy Optimization (PPO) <br> - Monte Carlo Tree Search   |


## Learning Approaches

### Instance-based learning

* Learns by **memorizing training examples**.
* No explicit model is built; generalization is deferred until prediction time.
* Prediction is made by comparing a new instance with stored instances.
* Uses a **similarity (distance) measure** to find closest examples.

**Examples:**

* k-Nearest Neighbors (kNN)
* Locally Weighted Regression

**Pros:**

* Simple, flexible.
* Works well when the decision boundary is irregular.

**Cons:**

* Expensive at prediction time (must compare with many stored examples).
* Sensitive to noise and irrelevant features.

---
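The instance-based idea described above can be sketched with a small k-nearest-neighbors classifier. This is a minimal illustration, assuming Euclidean distance and majority voting; the toy data, the class names, and the function `knn_predict` are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest stored examples."""
    # There is no training step: the stored examples are the "model".
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy data: two features per example, two classes.
train_points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (8.0, 8.0), (8.5, 9.0), (9.0, 8.0)]
train_labels = ["small", "small", "small", "large", "large", "large"]

prediction = knn_predict(train_points, train_labels, (2.0, 2.0))
print(prediction)  # small
```

Every prediction sorts the full training set by distance, which is why prediction cost grows with the number of stored examples.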

### Model-based learning

* Learns a **general model** from training data.
* The model captures underlying relationships, then is used for prediction.
* Parameters are estimated during training.

**Examples:**

* Linear Regression
* Logistic Regression
* Neural Networks
* Decision Trees

**Pros:**

* Fast prediction once the model is trained.
* Generalizes well if the model is appropriate for the data.

**Cons:**

* Training can be computationally heavy.
* If the model is too simple, it underfits; if it is too complex, it overfits.

---
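A model-based learner, by contrast, keeps only the fitted parameters and discards the training examples. The sketch below assumes logistic regression on a single feature, trained by batch gradient descent; the data (hours studied versus pass/fail), the learning rate, and the epoch count are illustrative choices, not recommendations.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, learning_rate=0.1, epochs=2000):
    """Estimate w and b in p(y=1 | x) = sigmoid(w * x + b) by gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the average log-loss with respect to w and b.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical data: hours studied -> passed (1) or failed (0).
hours = [1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0]
passed = [0, 0, 0, 0, 1, 1, 1, 1]

# Training is the expensive part; it runs once.
w, b = train_logistic(hours, passed)


def predict(x):
    """Prediction only evaluates the learned formula; no training data is needed."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


print(predict(2.0), predict(8.0))
```

After training, the whole model is the pair `(w, b)`, so prediction cost does not depend on how many examples were used.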

**Key Difference**

* **Instance-based**: “Remember examples, predict by similarity.”
* **Model-based**: “Learn rules (parameters), predict by applying model.”


## Key Components of ML

1. **Dataset** → Collection of examples (features, plus labels in supervised learning).
2. **Model** → Mathematical representation that makes predictions.
3. **Training** → Process of learning patterns (adjusting model parameters).
4. **Evaluation** → Measuring performance (accuracy, error, etc.).
5. **Prediction** → Using the trained model on unseen data.

---
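The five components listed above can be traced through one very small example. This is only a sketch: the dataset (number of links in a message, spam or not), the one-parameter threshold model, and all function names are invented for illustration.

```python
# 1. Dataset: hypothetical (feature, label) pairs,
#    e.g. number of links in a message -> spam (1) or not spam (0).
dataset = [(2, 0), (3, 0), (4, 0), (5, 0), (11, 1), (12, 1), (13, 1), (14, 1), (1, 0), (15, 1)]
train, test = dataset[:8], dataset[8:]  # hold out the last two examples


# 2. Model: a one-parameter rule, "predict 1 when the feature exceeds a threshold".
def predict(threshold, x):
    return 1 if x > threshold else 0


def accuracy(threshold, examples):
    correct = sum(predict(threshold, x) == y for x, y in examples)
    return correct / len(examples)


# 3. Training: pick the parameter value that fits the training set best.
def train_threshold(examples):
    candidates = sorted(x for x, _ in examples)
    return max(candidates, key=lambda t: accuracy(t, examples))


threshold = train_threshold(train)

# 4. Evaluation: measure accuracy on held-out examples the model never saw.
test_accuracy = accuracy(threshold, test)

# 5. Prediction: apply the trained model to a new, unlabeled input.
new_prediction = predict(threshold, 20)

print(threshold, test_accuracy, new_prediction)  # 5 1.0 1
```

Keeping the test examples out of training is what makes the evaluation step meaningful; a two-example test set is far too small in practice and is used here only to keep the sketch short.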

## Why is ML important?

* Handles **large, complex data** humans cannot analyze manually.
* **Automates tasks** (spam filtering, recommendation systems, fraud detection).
* Can improve over time as the model is retrained on more, and more representative, data.

---

✅ **In short:**
Machine Learning = teaching computers to learn from data and improve performance without being explicitly programmed.


````{dropdown} Click here for Sections
```{tableofcontents}
```
````

