# 🔧 Hyperparameter Tuning in Machine Learning

## 📘 Definition

**Hyperparameter tuning** is the process of searching for the hyperparameter values that give a machine learning model the best performance on unseen data, usually measured on a validation set or with cross-validation. Hyperparameters are configuration settings, such as the learning rate, the number of trees, or the maximum depth of a tree, that are set **before** training begins rather than learned from the data.
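
To make the difference between hyperparameters and learned parameters concrete, here is a minimal sketch. It assumes scikit-learn and its built-in iris dataset, neither of which these notes prescribe; any library with a similar fit-style API would show the same idea.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hyperparameters: chosen by us and passed to the constructor before fit().
model = DecisionTreeClassifier(max_depth=3, min_samples_split=4, random_state=0)
hyperparams = model.get_params()

# Learned parameters: the splits and thresholds found during fit().
model.fit(X, y)
learned_depth = model.get_depth()        # decided by the data, capped by max_depth
learned_nodes = model.tree_.node_count   # number of nodes the tree ended up with

print(hyperparams["max_depth"], learned_depth, learned_nodes)
```

Tuning means repeating this with different constructor arguments and comparing the results on held-out data.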

---

## 🧩 Types of Hyperparameter Tuning (with Examples, Pros, and Cons)

---

### 1. **Grid Search**

**Description:**  
Tries every possible combination of hyperparameter values from a predefined grid.

**Example:**  
Imagine tuning a decision tree. You try every combination of:
- max_depth: [3, 5, 7]
- min_samples_split: [2, 4]

This gives you 3 × 2 = 6 models to train and compare.
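
The grid above can be sketched in code as follows. This assumes scikit-learn's `GridSearchCV` and the iris dataset as stand-ins (both are choices made for illustration, not part of the example above), with 5-fold cross-validation used to score each combination.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# The same grid as in the example: 3 depths x 2 split sizes.
param_grid = {
    "max_depth": [3, 5, 7],
    "min_samples_split": [2, 4],
}

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    cv=5,                 # each combination is scored on 5 folds
    scoring="accuracy",
)
search.fit(X, y)

n_candidates = len(search.cv_results_["params"])
print(n_candidates)       # number of combinations that were evaluated
print(search.best_params_, round(search.best_score_, 3))
```

With 5 folds, the 6 combinations cost 30 model fits plus one final refit on all the data, which is why grids get expensive quickly.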

**Pros:**
- Simple and systematic
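- Evaluates every combination in the grid, so nothing on the grid is missed

**Cons:**
- Cost grows multiplicatively with every hyperparameter you add
- Spends as much compute on unpromising combinations as on promising ones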

---

### 2. **Random Search**

**Description:**  
Samples a fixed number of hyperparameter combinations at random from given ranges or distributions.

**Example:**  
You sample each hyperparameter from a range:
- max_depth: any integer between 1 and 10
- learning_rate: any value between 0.01 and 0.3

Instead of testing every pair, you test, say, 10 randomly drawn combinations.
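
A minimal sketch of this example, assuming scikit-learn's `RandomizedSearchCV`, SciPy distributions, and a gradient boosting classifier on the iris dataset (the model and data are picked here only because they accept both hyperparameters):

```python
from scipy.stats import randint, uniform
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = load_iris(return_X_y=True)

param_distributions = {
    "max_depth": randint(1, 11),           # integers 1..10 (upper bound is exclusive)
    "learning_rate": uniform(0.01, 0.29),  # floats from 0.01 to 0.30; args are (loc, scale)
}

search = RandomizedSearchCV(
    GradientBoostingClassifier(n_estimators=20, random_state=0),
    param_distributions,
    n_iter=10,          # the budget: only 10 sampled combinations are tried
    cv=3,
    random_state=0,     # makes the sampled combinations reproducible
)
search.fit(X, y)

sampled = search.cv_results_["params"]
print(len(sampled), search.best_params_)
```

The cost is controlled by `n_iter`, so adding a third or fourth hyperparameter does not change how many models are trained.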

**Pros:**
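- Cost is fixed by the number of trials you choose, not by the size of the search space
- Often finds good values in fewer trials than grid search, especially when only a few hyperparameters really matter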

**Cons:**
- May miss the best combination
- Still needs you to define a good range

---

### 3. **Bayesian Optimization**

**Description:**  
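Builds a probabilistic model (a "surrogate") of how the score depends on the hyperparameters, based on past trials, and uses it to pick the most promising combination to try next (smart search).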

**Example:**  
You evaluate a few combinations first. The optimizer then fits its model to those results and proposes the combination it expects to improve the most, instead of guessing randomly.
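
The loop can be sketched from scratch as follows. This is only an illustration under several assumptions: the `objective` function is a hypothetical stand-in for "train a model and return its validation loss", the surrogate is a Gaussian process from scikit-learn, and the next point is chosen by expected improvement over a fixed set of candidates. Dedicated libraries do this more carefully.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def objective(x):
    """Hypothetical stand-in for 'train a model and return validation loss'."""
    return (x - 0.3) ** 2 + 0.05 * np.sin(15 * x)

rng = np.random.default_rng(0)
candidates = np.linspace(0.0, 1.0, 201).reshape(-1, 1)  # values we may try

# Start with a few random trials.
X_obs = rng.uniform(0.0, 1.0, size=(3, 1))
y_obs = objective(X_obs).ravel()

for _ in range(12):
    # Surrogate: fit a Gaussian process to every trial seen so far.
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-6, normalize_y=True)
    gp.fit(X_obs, y_obs)
    mu, sigma = gp.predict(candidates, return_std=True)

    # Expected improvement (for minimization) over the best loss so far.
    best = y_obs.min()
    sigma = np.maximum(sigma, 1e-9)
    z = (best - mu) / sigma
    ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

    # Run the real objective only at the most promising candidate.
    x_next = candidates[np.argmax(ei)].reshape(1, 1)
    X_obs = np.vstack([X_obs, x_next])
    y_obs = np.append(y_obs, objective(x_next).ravel())

best_x = X_obs[np.argmin(y_obs), 0]
best_loss = y_obs.min()
print(best_x, best_loss)
```

Each iteration spends a little time on the cheap surrogate to save one expensive model run, which is the trade-off that makes this approach attractive when training is slow.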

**Pros:**
- Learns from past tries
- Usually needs fewer model runs than grid or random search

**Cons:**
- Harder to implement yourself; in practice it is usually done with a library
- Can get stuck in local optima

---

### 4. **Gradient-based Optimization**

**Description:**  
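Computes the gradient (slope) of the validation loss with respect to continuous hyperparameters and moves them in the direction of improvement.

**Example:**  
In deep learning, you start with a learning rate of 0.1. The tuner differentiates the validation loss with respect to the learning rate and nudges the learning rate in the direction that lowers that loss. This is different from a learning-rate schedule, which shrinks the learning rate by a fixed rule rather than by following a gradient.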
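
Differentiating through neural network training is too long to show here, so the sketch below uses a simpler case where the gradient has a closed form: tuning the regularization strength of ridge regression. The synthetic data, the choice of ridge regression, and the step size are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features = 20
w_true = rng.normal(size=n_features)
X_train = rng.normal(size=(30, n_features))
y_train = X_train @ w_true + rng.normal(scale=2.0, size=30)
X_val = rng.normal(size=(200, n_features))
y_val = X_val @ w_true + rng.normal(scale=2.0, size=200)

def val_loss_and_grad(log_lam):
    """Validation MSE of ridge regression and its derivative w.r.t. log(lambda)."""
    lam = np.exp(log_lam)
    A = X_train.T @ X_train + lam * np.eye(n_features)
    w = np.linalg.solve(A, X_train.T @ y_train)   # inner problem: fit the weights
    resid = X_val @ w - y_val
    loss = np.mean(resid ** 2)
    dw_dlam = -np.linalg.solve(A, w)              # derivative of the fitted weights
    dloss_dlam = (2.0 / len(y_val)) * resid @ (X_val @ dw_dlam)
    return loss, dloss_dlam * lam                 # chain rule: d/dlog(lam) = lam * d/dlam

log_lam = 0.0                                     # start at lambda = 1
start_loss, _ = val_loss_and_grad(log_lam)
for _ in range(200):
    loss, grad = val_loss_and_grad(log_lam)
    log_lam -= 0.1 * grad                         # gradient step on the hyperparameter itself
final_loss, _ = val_loss_and_grad(log_lam)

print(np.exp(log_lam), start_loss, final_loss)
```

The hyperparameter is optimized on a log scale so that it stays positive. Note that the approach only works because the regularization strength is continuous and the validation loss is differentiable with respect to it.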

**Pros:**
- Fast convergence
- Useful in neural networks

**Cons:**
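- Only works for continuous hyperparameters whose effect on the validation loss is differentiable
- Not applicable to discrete choices such as the number of trees or the kernel type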

---

### 5. **Evolutionary Algorithms**

**Description:**  
Inspired by natural evolution: keeps the best combinations, recombines and mutates them, and evaluates the offspring.

**Example:**  
Imagine a population of 10 models with different hyperparameters. The best ones "mate" and produce new combinations with random changes ("mutations").
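
A toy version of this loop in plain Python. The `fitness` function is a hypothetical stand-in for "train a model with these hyperparameters and return its validation score", and the mutation sizes and the 50% mutation rate are arbitrary choices for the sketch.

```python
import random

random.seed(0)

def fitness(ind):
    """Hypothetical stand-in for a validation score; peaks at max_depth=6, learning_rate=0.1."""
    return -((ind["max_depth"] - 6) ** 2) - 100 * (ind["learning_rate"] - 0.1) ** 2

def random_individual():
    return {"max_depth": random.randint(1, 10), "learning_rate": random.uniform(0.01, 0.3)}

def crossover(a, b):
    # Each hyperparameter is inherited from one of the two parents.
    return {key: random.choice([a[key], b[key]]) for key in a}

def mutate(ind):
    child = dict(ind)
    if random.random() < 0.5:
        child["max_depth"] = min(10, max(1, child["max_depth"] + random.choice([-1, 1])))
    if random.random() < 0.5:
        child["learning_rate"] = min(0.3, max(0.01, child["learning_rate"] + random.gauss(0, 0.02)))
    return child

population = [random_individual() for _ in range(10)]
initial_best = max(fitness(ind) for ind in population)

for generation in range(20):
    # Selection: the best half survives unchanged.
    population.sort(key=fitness, reverse=True)
    parents = population[:5]
    # Reproduction: refill the population with mutated offspring of random parent pairs.
    children = [mutate(crossover(*random.sample(parents, 2))) for _ in range(5)]
    population = parents + children

best = max(population, key=fitness)
print(best, fitness(best))
```

Because the best half always survives, the best score never gets worse from one generation to the next. In a real run every `fitness` call is a full model training, which is where the cost comes from.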

**Pros:**
- Good at exploring complex spaces
- Less likely to get stuck in local optima

**Cons:**
- Requires many model runs
- Complex to configure

---

### 6. **Automated Machine Learning (AutoML)**

**Description:**  
Automates the process of model selection and hyperparameter tuning.

**Example:**  
You give the data and the target column to an AutoML tool. It tries various models and hyperparameters behind the scenes and returns the best one it found.
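
To show what happens "behind the scenes", here is a hand-rolled sketch of the core idea: search over several model families, each with its own small grid, and keep the winner. It assumes scikit-learn and the iris dataset, and it is not the API of any real AutoML tool. Real tools typically also handle preprocessing, time budgets, and ensembling.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Candidate model families, each with its own small hyperparameter grid.
search_space = {
    "logistic_regression": (LogisticRegression(max_iter=1000), {"C": [0.1, 1.0, 10.0]}),
    "decision_tree": (DecisionTreeClassifier(random_state=0), {"max_depth": [3, 5, 7]}),
    "knn": (KNeighborsClassifier(), {"n_neighbors": [3, 5, 9]}),
}

results = {}
for name, (estimator, grid) in search_space.items():
    search = GridSearchCV(estimator, grid, cv=5)
    search.fit(X, y)
    results[name] = (search.best_score_, search.best_params_)

# Pick the family whose best configuration scored highest in cross-validation.
best_name = max(results, key=lambda name: results[name][0])
print(best_name, results[best_name])
```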

**Pros:**
- Saves time and effort
- Often gives a strong baseline with minimal manual tuning

**Cons:**
- Limited control and transparency
- Can be resource-intensive

---

## 📚 Summary Table

| Method                | Pros                                | Cons                                      | Example Idea                                                |
|-----------------------|-------------------------------------|-------------------------------------------|-------------------------------------------------------------|
| Grid Search           | Tests every grid combination        | Slow for many parameters                  | Trying all learning rates and depths                        |
| Random Search         | Fixed budget, simple                | May miss best combo                       | Randomly trying values in a range                           |
| Bayesian Optimization | Smart, learns from results          | Harder to set up                          | Tries best next guess based on past runs                    |
| Gradient-based        | Fast convergence                    | Continuous, differentiable settings only  | Follow the slope of validation loss w.r.t. a hyperparameter |
| Evolutionary          | Good for complex tuning             | Slow, needs setup                         | Like genetic evolution of models                            |
| AutoML                | Automated                           | Opaque, heavy                             | Tool picks model + tuning for you                           |

---

## 🧮 Common Hyperparameters (Examples)

| Algorithm        | Hyperparameters                               |
|------------------|-----------------------------------------------|
| Regularized Linear Regression (Ridge, Lasso) | Penalty type (L1, L2), regularization strength alpha |
| SVM              | Kernel type, C, gamma                         |
| Random Forest     | n_estimators, max_depth, min_samples_split   |
| Neural Networks   | Learning rate, batch size, epochs, number and size of layers |

---

## 🔢 Formula-like Representation

There is no single formula for hyperparameter tuning, but it is often expressed as an optimization problem:

$$
\theta^* = \arg\min_{\theta \in \Theta} \mathcal{L}_{val}\big(f_{\theta}(X_{val}),\, y_{val}\big)
$$

Where:
- $ \theta $: set of hyperparameters  
- $ \Theta $: hyperparameter search space  
- $ \mathcal{L}_{val} $: validation loss  
- $ f_{\theta} $: model trained on $ (X_{train}, y_{train}) $ with hyperparameters $ \theta $  
- $ X_{val}, y_{val} $: validation inputs and targets, kept out of training  
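
The optimization problem can be worked through by hand on a tiny case. In the sketch below, which uses NumPy and synthetic data (both assumptions for illustration), the hyperparameter is the degree of a polynomial, the search space is the degrees 1 to 8, and the loss is mean squared error on the validation split.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=60)
y = np.sin(3 * x) + rng.normal(scale=0.1, size=60)

# Hold out part of the data for validation.
x_train, y_train = x[:40], y[:40]
x_val, y_val = x[40:], y[40:]

search_space = [1, 2, 3, 4, 5, 6, 7, 8]          # Theta: candidate polynomial degrees
val_loss = {}
for degree in search_space:                       # theta
    coeffs = np.polyfit(x_train, y_train, degree) # the model, fitted on training data only
    preds = np.polyval(coeffs, x_val)             # predictions on the validation inputs
    val_loss[degree] = float(np.mean((preds - y_val) ** 2))

best_degree = min(val_loss, key=val_loss.get)     # theta*: the argmin over Theta
print(best_degree, val_loss[best_degree])
```

Every tuning method in these notes solves this same problem; they differ only in how they decide which values of the hyperparameter to evaluate.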

---

## ✅ Best Practices

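- Use **cross-validation** so the chosen hyperparameters do not overfit a single validation split, and keep a separate test set for the final evaluation.
- **Start with random search** over wide ranges, then refine around the best region with **Bayesian optimization or a fine grid search**.
- Avoid searching over too many combinations; focus on the **hyperparameters that matter most**.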
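
A coarse-to-fine search with cross-validation might look like the sketch below. It assumes scikit-learn, SciPy, an RBF-kernel SVM, and the iris dataset, and it tunes only `C` to keep the example short. The "factor of 3" window in the second stage is an arbitrary choice.

```python
import numpy as np
from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Stage 1: coarse random search over a wide, log-scaled range of C.
coarse = RandomizedSearchCV(
    SVC(kernel="rbf"),
    {"C": loguniform(1e-3, 1e3)},
    n_iter=15,
    cv=5,                 # 5-fold cross-validation for every candidate
    random_state=0,
)
coarse.fit(X, y)
best_c = coarse.best_params_["C"]

# Stage 2: fine grid within a factor of 3 on either side of the stage-1 winner.
fine_grid = {"C": list(np.geomspace(best_c / 3, best_c * 3, num=7))}
fine = GridSearchCV(SVC(kernel="rbf"), fine_grid, cv=5)
fine.fit(X, y)

print(best_c, fine.best_params_["C"], round(fine.best_score_, 3))
```

The first stage finds the right order of magnitude cheaply, and the second stage spends its budget only where it is likely to pay off.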

