# 🛠️ Model Tuning in Machine Learning

Model tuning (also called **hyperparameter optimization**) is the process of improving a machine learning model’s performance by systematically choosing the best set of hyperparameters.

---

## 🎛️ What Are Hyperparameters?

- **Hyperparameters** are external configurations **set before training** (e.g., learning rate, number of trees, max depth).
- They are different from **parameters** (like weights), which are learned by the model during training.
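
A minimal sketch of this distinction, assuming scikit-learn's `Ridge` and a small synthetic dataset (both are illustrative choices, not part of the notes above): `alpha` is a hyperparameter we pick before training, while `coef_` holds parameters learned by `fit()`.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic data (illustrative assumption): y depends linearly on 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=100)

# Hyperparameter: chosen by us before training.
model = Ridge(alpha=1.0)
hyperparams = model.get_params()

# Parameters: learned from the data during fit().
model.fit(X, y)
learned_weights = model.coef_

print("hyperparameter alpha:", hyperparams["alpha"])
print("learned weights:", learned_weights)
```

Changing `alpha` and refitting produces different learned weights, which is why hyperparameters are worth tuning.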

---

## 🎯 Why Is Tuning Important?

- Helps balance **underfitting** (model too simple) against **overfitting** (model too complex).
- Helps achieve optimal performance on **unseen/test data**.
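
To make the two failure modes concrete, here is a small sketch that varies a single hyperparameter, `max_depth` of a decision tree, on synthetic data. The dataset, split sizes, and depth values are assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, slightly noisy classification data (illustrative assumption).
X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           flip_y=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

scores = {}
for depth in (1, 4, None):  # None lets the tree grow until every leaf is pure
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    scores[depth] = (tree.score(X_train, y_train), tree.score(X_test, y_test))

for depth, (train_acc, test_acc) in scores.items():
    print(f"max_depth={depth}: train={train_acc:.2f}, test={test_acc:.2f}")
```

A very shallow tree tends to score poorly on both sets (underfitting), while an unrestricted tree fits the training set perfectly and usually does worse on the test set (overfitting). Exact numbers depend on the data.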

---

## 🔍 Common Tuning Techniques

### 1. **Grid Search**

**Definition:** Exhaustively tries all possible combinations of hyperparameters from a specified grid.

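```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Every combination in the grid is evaluated: 2 x 3 = 6 candidates.
params = {'n_estimators': [100, 200], 'max_depth': [3, 5, 7]}
grid = GridSearchCV(RandomForestClassifier(), param_grid=params, cv=5)
grid.fit(X_train, y_train)  # X_train, y_train: your training features and labels
```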

- ✅ Pros: Simple to implement
- ⚠️ Cons: Can be computationally expensive
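
The cost mentioned above can be estimated before running anything. This sketch uses scikit-learn's `ParameterGrid` to count the candidates in the grid from the example and multiplies by the number of folds. The variable names are illustrative.

```python
from sklearn.model_selection import ParameterGrid

params = {'n_estimators': [100, 200], 'max_depth': [3, 5, 7]}
cv_folds = 5

# Number of hyperparameter combinations in the grid: 2 * 3.
n_candidates = len(ParameterGrid(params))

# Each candidate is trained once per fold (GridSearchCV also refits the
# best candidate once on the full training set by default).
n_fits = n_candidates * cv_folds

print(f"{n_candidates} candidates x {cv_folds} folds = {n_fits} fits")
```

Adding one more hyperparameter with four values would multiply the total by four, which is why grid search scales poorly.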

### 2. **Randomized Search**

**Definition:** Randomly samples a given number of combinations from a hyperparameter space.

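```python
from scipy.stats import randint
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# randint(low, high) samples integers from low up to high - 1.
params = {'n_estimators': randint(100, 500), 'max_depth': randint(3, 10)}
# n_iter fixes the budget: only 10 sampled combinations are evaluated.
random_search = RandomizedSearchCV(RandomForestClassifier(), param_distributions=params, n_iter=10, cv=5)
random_search.fit(X_train, y_train)
```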


- ✅ Pros: Faster than grid search
- ⚠️ Cons: Might miss optimal parameters
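
To see what "randomly samples" means in practice, the sketch below draws candidates from the same distributions with scikit-learn's `ParameterSampler`, the sampler that `RandomizedSearchCV` uses for this purpose. The `random_state` value is an arbitrary choice for reproducibility.

```python
from scipy.stats import randint
from sklearn.model_selection import ParameterSampler

params = {'n_estimators': randint(100, 500), 'max_depth': randint(3, 10)}

# randint(low, high) excludes `high`, so the full space has
# 400 values x 7 values = 2800 possible combinations.
space_size = (500 - 100) * (10 - 3)

# Draw as many candidates as n_iter=10 would evaluate.
samples = list(ParameterSampler(params, n_iter=10, random_state=0))

for candidate in samples[:3]:
    print(candidate)
print(f"evaluating {len(samples)} of {space_size} combinations")
```

Only a small fraction of the space is ever tried, which explains both the speed and the risk of missing the best combination.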

### 3. **Bayesian Optimization (Advanced)**

**Definition:** Uses a probabilistic model of the results seen so far to choose the next set of hyperparameters to try.

- ✅ Pros: Efficient for complex models or large hyperparameter spaces

**Examples:** Optuna, Hyperopt, Scikit-Optimize
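
The idea can be sketched without any of those libraries. The toy loop below fits a Gaussian-process surrogate to the scores observed so far and picks the next point with an upper-confidence-bound rule. This is one simple variant chosen for illustration, not how Optuna, Hyperopt, or Scikit-Optimize are implemented, and `objective` is a hypothetical stand-in for a real validation score.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def objective(x):
    # Hypothetical stand-in for "validation score as a function of one
    # hyperparameter"; in practice this would train and score a model.
    return -(x - 0.3) ** 2

rng = np.random.default_rng(0)
candidates = np.linspace(0.0, 1.0, 201).reshape(-1, 1)

# Start from a few random evaluations.
X_obs = rng.uniform(0.0, 1.0, size=(3, 1))
y_obs = objective(X_obs).ravel()

for _ in range(10):
    # 1. Fit the probabilistic surrogate to everything observed so far.
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-6,
                                  normalize_y=True)
    gp.fit(X_obs, y_obs)

    # 2. Score candidates by predicted mean plus an uncertainty bonus.
    mean, std = gp.predict(candidates, return_std=True)
    ucb = mean + 2.0 * std

    # 3. Evaluate the most promising candidate and record the result.
    x_next = candidates[np.argmax(ucb)].reshape(1, -1)
    X_obs = np.vstack([X_obs, x_next])
    y_obs = np.append(y_obs, objective(x_next).ravel())

best_x = float(X_obs[np.argmax(y_obs), 0])
best_y = float(y_obs.max())
print(f"best x so far: {best_x:.3f}, score: {best_y:.4f}")
```

Unlike grid or randomized search, each new trial here depends on the outcomes of earlier trials, which is what makes the approach sample-efficient when a single evaluation is expensive.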

## 📋 Common Hyperparameters by Algorithm

| Algorithm         | Hyperparameters                                  |
| ----------------- | ------------------------------------------------ |
| Linear Regression | `alpha` (in Ridge/Lasso), `fit_intercept`        |
| Decision Trees    | `max_depth`, `min_samples_split`, `criterion`    |
| Random Forest     | `n_estimators`, `max_depth`, `max_features`      |
| SVM               | `C`, `kernel`, `gamma`                           |
| XGBoost/LightGBM  | `learning_rate`, `n_estimators`, `max_depth`     |
| KNN               | `n_neighbors`, `weights`, `metric`               |
| Neural Networks   | `learning_rate`, `batch_size`, `epochs`, `optimizer` |


## 🧠 Best Practices
- Start with domain knowledge to narrow the search space.
- Use cross-validation (e.g., k-fold) during tuning so the chosen hyperparameters are not overfit to a single validation split.
- Monitor training time: more tuning means more compute.
- Use early stopping for iterative models like XGBoost or neural networks.
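
As a sketch of the early-stopping practice, the example below uses scikit-learn's `GradientBoostingClassifier` as a stand-in for XGBoost (an assumption made to keep the example dependency-free). The dataset and the specific values are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic data (illustrative assumption).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

model = GradientBoostingClassifier(
    n_estimators=500,         # upper bound on boosting rounds
    validation_fraction=0.1,  # share of training data held out internally
    n_iter_no_change=5,       # stop after 5 rounds without improvement
    random_state=0,
)
model.fit(X, y)

# Number of boosting rounds actually used before stopping.
rounds_used = model.n_estimators_
print(f"stopped after {rounds_used} of 500 possible rounds")
```

With early stopping, `n_estimators` becomes an upper limit instead of a value that must be tuned precisely, which removes one dimension from the search space.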

📌 Tip: After tuning, always evaluate your model on a hold-out test set to measure final generalization performance.
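
One way to follow this tip is sketched below: split off a test set first, tune with cross-validation on the training portion only, then score once on the held-out data. The synthetic dataset and the small grid values are assumptions chosen so the example runs quickly.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data (illustrative assumption).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# The test set is set aside before any tuning happens.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Small grid to keep the example fast.
params = {'n_estimators': [25, 50], 'max_depth': [3, 5]}
grid = GridSearchCV(RandomForestClassifier(random_state=0),
                    param_grid=params, cv=5)
grid.fit(X_train, y_train)  # tuning sees only the training data

cv_score = grid.best_score_              # used to pick hyperparameters
test_score = grid.score(X_test, y_test)  # final, untouched estimate

print(f"best params: {grid.best_params_}")
print(f"cross-validation accuracy: {cv_score:.3f}")
print(f"hold-out test accuracy:    {test_score:.3f}")
```

The cross-validation score was used to choose the hyperparameters, so it is an optimistic estimate. The test score is the number to report.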

# ⚙️ Hyperparameter Tuning in Machine Learning

Hyperparameter tuning is the process of **finding the best set of hyperparameters** for a machine learning model to improve its **accuracy and generalization** on unseen data.

---

## 🔑 What Are Hyperparameters?

- **Hyperparameters** are values set **before training** begins.
- Unlike model **parameters** (like weights), which are learned during training, hyperparameters **control** the learning process.

**Examples:**
- `learning_rate` in gradient boosting
- `C` and `kernel` in SVM
- `max_depth`, `n_estimators` in tree-based models
- `batch_size`, `epochs` in neural networks
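
In scikit-learn, the hyperparameters listed above are passed as constructor arguments and can be read back or changed later. The sketch below covers the first three bullets; the specific values are arbitrary, and the neural-network settings are left out because they depend on the deep learning library in use.

```python
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.svm import SVC

# Hyperparameters are set when the estimator is created (values are arbitrary).
models = {
    "gradient boosting": GradientBoostingClassifier(learning_rate=0.05),
    "SVM": SVC(C=10.0, kernel="rbf"),
    "random forest": RandomForestClassifier(max_depth=5, n_estimators=200),
}

# get_params() returns every hyperparameter, including defaults.
for name, model in models.items():
    print(name, "->", len(model.get_params()), "hyperparameters")

# set_params() changes a value before (re)training; this is the same
# mechanism that search tools use to try different candidates.
svm = models["SVM"]
svm.set_params(C=1.0)
print("SVM C is now:", svm.get_params()["C"])
```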

---

## 🎯 Why Tune Hyperparameters?

- Prevents **overfitting** or **underfitting**
- Improves **model accuracy and robustness**
- Optimizes model **efficiency and performance**

---

## 🧰 Hyperparameter Tuning Techniques

### 🔸 1. Grid Search

- Tests **all combinations** of a predefined hyperparameter grid.
- Suitable for **small search spaces**.

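```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# 2 x 2 = 4 combinations, each evaluated with 5-fold cross-validation.
param_grid = {'max_depth': [3, 5], 'n_estimators': [100, 200]}
grid = GridSearchCV(estimator=RandomForestClassifier(), param_grid=param_grid, cv=5)
grid.fit(X_train, y_train)  # X_train, y_train: your training features and labels
```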
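
After fitting, the search object exposes what it found. The sketch below reads the per-candidate results from `cv_results_`, using synthetic data and a reduced grid as assumptions so that it runs in a few seconds.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic data and a reduced grid (illustrative assumptions).
X_train, y_train = make_classification(n_samples=200, n_features=10,
                                       random_state=0)
param_grid = {'max_depth': [3, 5], 'n_estimators': [25, 50]}

grid = GridSearchCV(estimator=RandomForestClassifier(random_state=0),
                    param_grid=param_grid, cv=5)
grid.fit(X_train, y_train)

# One entry per combination, with its mean cross-validated accuracy.
results = list(zip(grid.cv_results_['params'],
                   grid.cv_results_['mean_test_score']))
for candidate, mean_score in results:
    print(candidate, f"{mean_score:.3f}")

print("best combination:", grid.best_params_)
print("best mean CV accuracy:", round(grid.best_score_, 3))

# best_estimator_ is already refit on all of X_train with the best settings.
best_model = grid.best_estimator_
```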


### 🔸 2. Randomized Search

- Randomly samples combinations from a hyperparameter space.
- More efficient than grid search for large parameter spaces.

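```python
from scipy.stats import randint
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# randint(low, high) samples integers from low up to high - 1.
param_dist = {'n_estimators': randint(100, 500), 'max_depth': randint(3, 10)}
# Only n_iter=10 sampled combinations are evaluated.
random_search = RandomizedSearchCV(RandomForestClassifier(), param_distributions=param_dist, n_iter=10, cv=5)
random_search.fit(X_train, y_train)
```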

- ✅ Less computation, faster
- ❌ Might miss the optimal combination



### 🔸 3. Bayesian Optimization

- Uses probabilistic models to choose promising hyperparameters.
- Efficient for complex and expensive models.

**Popular libraries:** Optuna, Hyperopt, Scikit-Optimize