<a href="https://colab.research.google.com/github/samiha-mahin/A-Machine-Learning-Models-Repo/blob/main/Hyperparameter_Tuning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>



## 🌟 What is Hyperparameter Tuning?

**Hyperparameters** are settings **you choose before training** a machine learning model. They are not learned from the data; you set them yourself, either by hand or with an automated search.

**Hyperparameter tuning** means trying different combinations of these settings to find the one that gives the **best performance** on data the model was not trained on.
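
To make the distinction concrete, here is a minimal sketch that contrasts a hyperparameter (set when the model is created) with learned state (available only after `fit`). The Iris dataset and the specific values are assumptions chosen for illustration; they do not come from the original notebook.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Illustrative dataset (an assumption, any labelled data would do).
X, y = load_iris(return_X_y=True)

# Hyperparameters: chosen by you when the model is created.
model = RandomForestClassifier(n_estimators=10, max_depth=3, random_state=0)
print("n_estimators:", model.n_estimators, "| max_depth:", model.max_depth)

# Learned state: the fitted trees only exist after fit() has seen the data.
fitted_before = hasattr(model, "estimators_")
model.fit(X, y)
fitted_after = hasattr(model, "estimators_")

print("Has fitted trees before fit():", fitted_before)
print("Has fitted trees after fit():", fitted_after)
print("Number of fitted trees:", len(model.estimators_))
```

The number of trees was fixed up front by `n_estimators`; the trees themselves were built from the data.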

---

## ✅ Example: Random Forest

Let’s say you're using `RandomForestClassifier`. Some common hyperparameters are:

| Hyperparameter      | Meaning                                   |
| ------------------- | ----------------------------------------- |
| `n_estimators`      | Number of trees in the forest             |
| `max_depth`         | Maximum depth of each tree                |
| `min_samples_split` | Minimum samples to split an internal node |

You need to **tune** these values to get the best results.
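
As a quick way to see which hyperparameters a model exposes and what their current values are, scikit-learn estimators provide `get_params()` and `set_params()`. The sketch below only inspects the three hyperparameters from the table; the new values passed to `set_params` are arbitrary examples, not recommendations.

```python
from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier()

# get_params() returns a dict of every hyperparameter and its current value.
params = model.get_params()
for name in ("n_estimators", "max_depth", "min_samples_split"):
    print(name, "=", params[name])

# set_params() changes hyperparameters on an existing (unfitted) model.
model.set_params(n_estimators=200, max_depth=10)
print("After set_params:", model.n_estimators, model.max_depth)
```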

---

## 🎯 Why Tune Them?

Poorly chosen values can cause:

* **Underfitting** (the model is too simple, e.g. very shallow trees)
* **Overfitting** (the model is too complex, e.g. very deep trees that memorize noise)
* **Slow training** (too many trees or very deep trees)
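
The effect of the points above can be sketched by comparing training and test accuracy for different `max_depth` values. This example uses a small synthetic dataset with label noise (`make_classification` with `flip_y=0.1`), which is an assumption made purely for illustration; the exact numbers will differ on real data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic, noisy data (hypothetical; stands in for a real dataset).
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5, flip_y=0.1, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

scores = {}
for depth in (1, 5, None):  # None lets each tree grow until its leaves are pure
    model = RandomForestClassifier(n_estimators=50, max_depth=depth, random_state=0)
    model.fit(X_train, y_train)
    train_acc = model.score(X_train, y_train)
    test_acc = model.score(X_test, y_test)
    scores[depth] = (train_acc, test_acc)
    print(f"max_depth={depth}: train={train_acc:.2f}, test={test_acc:.2f}")
```

A low score on both sets suggests underfitting, while a large gap between training and test accuracy suggests overfitting.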

---

## ⚙️ How Do We Tune?

### 1. **Manual Search**

You pick values by hand, train the model, and compare the results one run at a time.

```python
from sklearn.ensemble import RandomForestClassifier

# X_train and y_train are assumed to be your already-prepared training data.
model = RandomForestClassifier(n_estimators=100, max_depth=5)
model.fit(X_train, y_train)
```

➡️ Try `n_estimators = 50`, `100`, `150`... and compare the scores on held-out validation data to see which one works best.
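
A minimal sketch of that manual loop might look like the following. The Iris dataset, the 70/30 split, and the `random_state` values are assumptions added so the example runs on its own.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Illustrative data and split (assumptions, not from the original notebook).
X, y = load_iris(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

results = {}
for n in (50, 100, 150):
    model = RandomForestClassifier(n_estimators=n, max_depth=5, random_state=42)
    model.fit(X_train, y_train)
    # Score on data the model has not seen during training.
    results[n] = model.score(X_val, y_val)
    print(f"n_estimators={n}: validation accuracy={results[n]:.3f}")

# Pick the setting with the highest validation accuracy.
best_n = max(results, key=results.get)
print("Best n_estimators:", best_n)
```

This works for one or two hyperparameters, but the number of runs grows quickly once you vary several at the same time.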

---

### 2. **Grid Search (Exhaustive)**

It automatically tries **all combinations** of the hyperparameter values you list and scores each one with cross-validation.

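```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Every combination of these values is evaluated (3 x 3 = 9 candidates).
param_grid = {
    'n_estimators': [50, 100, 150],
    'max_depth': [3, 5, 10]
}

# cv=5 scores each candidate with 5-fold cross-validation on the training data.
grid = GridSearchCV(RandomForestClassifier(random_state=42), param_grid, cv=5)
grid.fit(X_train, y_train)

print("Best Parameters:", grid.best_params_)
print("Best CV Score:", grid.best_score_)
```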
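
To see why grid search gets expensive, it can help to count the candidates before running it. The sketch below uses scikit-learn's `ParameterGrid` to enumerate the same grid as above; the `cv = 5` value mirrors the `cv=5` argument passed to `GridSearchCV`.

```python
from sklearn.model_selection import ParameterGrid

param_grid = {
    'n_estimators': [50, 100, 150],
    'max_depth': [3, 5, 10]
}

# ParameterGrid expands the dict into every combination of values.
combos = list(ParameterGrid(param_grid))
cv = 5
total_fits = len(combos) * cv  # one fit per candidate per fold

print("Candidates:", len(combos))            # 9
print("Cross-validation fits:", total_fits)  # 45
print("First candidate:", combos[0])
```

Adding a third hyperparameter with four values would multiply the candidate count by four, which is where randomized search becomes attractive.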

---

### 3. **Randomized Search (Faster)**

Instead of trying all combinations, it tries a fixed number (`n_iter`) of **random combinations**.

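```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# 4 x 4 = 16 possible combinations, but only n_iter of them are sampled.
param_dist = {
    'n_estimators': [50, 100, 150, 200],
    'max_depth': [3, 5, 10, 20]
}

# random_state makes the sampled combinations reproducible.
random_search = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions=param_dist,
    n_iter=5,
    cv=5,
    random_state=42,
)
random_search.fit(X_train, y_train)

print("Best Parameters:", random_search.best_params_)
```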
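
`RandomizedSearchCV` also accepts distributions instead of fixed lists, so it can sample values you did not enumerate in advance. The sketch below uses `scipy.stats.randint` for that; the ranges, the Iris dataset, and `cv=3` are assumptions picked to keep the example small and fast.

```python
from scipy.stats import randint
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Illustrative dataset (an assumption, not from the original notebook).
X, y = load_iris(return_X_y=True)

# randint(low, high) samples integers from low up to high - 1.
param_dist = {
    "n_estimators": randint(10, 51),  # 10..50
    "max_depth": randint(2, 11),      # 2..10
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_dist,
    n_iter=5,        # only 5 sampled candidates are evaluated
    cv=3,
    random_state=0,  # reproducible sampling
)
search.fit(X, y)

print("Best Parameters:", search.best_params_)
print("Best CV Score:", round(search.best_score_, 3))
```

The cost is controlled by `n_iter` rather than by the size of the search space, so widening the ranges does not add more fits.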

---

## 🔁 Summary

| Method            | Tries All | Faster | Best For         |
| ----------------- | --------- | ------ | ---------------- |
| Manual Search     | ❌         | ✅      | Small tests      |
| Grid Search       | ✅         | ❌      | Full exploration |
| Randomized Search | ❌         | ✅      | Large grids      |


