# **6️⃣ Hyperparameters in Machine Learning: Definition & Tuning 🎛️🤖**

## **💡 Real-Life Analogy: Tuning an NBA Player’s Shooting Technique 🏀**

Imagine you're **coaching an NBA player** to improve their shooting. You adjust:

- **Shot Arc 🎯** (Higher arc = more accuracy, but too high = misses).  
- **Release Speed ⏳** (Fast release = harder to block, but may reduce accuracy).  
- **Leg Position 🏀** (More stability = better balance, but slower motion).

📌 **These settings are like hyperparameters in ML—they control how the model learns but aren’t learned from data!**

## **📌 What Are Hyperparameters?**

✅ **Hyperparameters are settings that control the learning process of a machine learning model.**  
✅ They **must be set before training** and are **not learned from the data**.  
✅ Examples: **Learning rate, number of layers in a neural network, number of trees in a random forest.**

📌 **Two Types of Parameters in ML:**

| Type               | What It Does                    | Example                                                    |
|--------------------|---------------------------------|------------------------------------------------------------|
| **Model Parameters**   | Learned from data 📊             | Weights in Linear Regression, Splits in Decision Trees      |
| **Hyperparameters**    | Set before training 🎛️          | Learning Rate, Number of Hidden Layers, K in KNN            |

✅ **Hyperparameters control how the model trains, while parameters are learned from data!**
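
📌 **A minimal sketch of the difference (toy data invented for illustration):** the snippet below fits `y = w * x` with plain gradient descent. The names `learning_rate` and `n_epochs` are hyperparameters we pick by hand, while `w` is the model parameter that training updates. The data and values are assumptions chosen only to keep the example small.

```python
# Hypothetical toy data: y = 2 * x, so the ideal weight is 2.0
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

# Hyperparameters: chosen by us before training starts
learning_rate = 0.01
n_epochs = 500

# Model parameter: starts at an arbitrary value and is learned from the data
w = 0.0

for _ in range(n_epochs):
    # Gradient of the mean squared error for the model y_hat = w * x
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= learning_rate * grad

print(f"learned weight: {w:.4f}")
```

Changing `learning_rate` or `n_epochs` changes how (and whether) training converges, but neither value is ever updated by the training loop itself.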

## **📊 Examples of Hyperparameters in Different ML Models**

| Model                         | Key Hyperparameters                    |
|-------------------------------|----------------------------------------|
| **Linear Regression (Lasso / Ridge)** | Regularization strength of the L1 / L2 penalty |
| **Decision Tree**             | Tree depth, min samples per leaf         |
| **Random Forest**             | Number of trees, max depth               |
| **K-Nearest Neighbors (KNN)**   | Number of neighbors (K)                  |
| **Neural Networks**           | Learning rate, number of layers, batch size|
| **Support Vector Machines (SVM)** | Kernel type, C (penalty), Gamma          |

✅ **Choosing the right hyperparameters is critical for model performance!**
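
📌 **How these look in code (a sketch, assuming scikit-learn):** in scikit-learn, hyperparameters are passed to the model's constructor and can be read back with `get_params()`. The specific values below are illustrative choices, not recommendations.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Hyperparameters are set when the model object is created (before any training)
models = {
    "Decision Tree": DecisionTreeClassifier(max_depth=5, min_samples_leaf=10),
    "Random Forest": RandomForestClassifier(n_estimators=100, max_depth=10),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "SVM": SVC(kernel="rbf", C=1.0, gamma="scale"),
}

# scikit-learn names for the hyperparameters listed in the table above
wanted = {
    "Decision Tree": ["max_depth", "min_samples_leaf"],
    "Random Forest": ["n_estimators", "max_depth"],
    "KNN": ["n_neighbors"],
    "SVM": ["kernel", "C", "gamma"],
}

# Read the chosen values back from each model
selected = {
    name: {key: models[name].get_params()[key] for key in keys}
    for name, keys in wanted.items()
}

for name, params in selected.items():
    print(name, params)
```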

## **📊 Example: Hyperparameters in Football (Soccer) ⚽**

You’re building a **machine learning model** to predict **if a player will score in a match**.  
- **Model Type:** Decision Tree  
- **Training Data:** Shots on target, xG, minutes played.

📌 **Key Hyperparameters for Decision Trees:**

| Hyperparameter            | Meaning                           | Impact                   |
|---------------------------|-----------------------------------|--------------------------|
| **Max Depth**             | Limits how deep the tree grows.   | Prevents overfitting.    |
| **Min Samples per Leaf**  | Minimum number of training samples in each leaf node. | Reduces model complexity.|
| **Criterion**             | Measure of split quality (Gini vs. Entropy). | Changes which splits are chosen. |

✅ **Example Values:**  
- **Max Depth = 5** → Limits tree growth.  
- **Min Samples per Leaf = 10** → Every leaf must contain at least 10 training samples.

📌 **Why is tuning important?**  
- If the tree is **too deep**, it **memorizes training data** (overfitting).  
- If the tree is **too shallow**, it **misses important patterns** (underfitting).
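
📌 **Seeing this in code (a sketch on synthetic data):** the snippet below trains decision trees of different depths and compares training accuracy with accuracy on held-out data. The dataset comes from `make_classification` and is only a stand-in for real shot data; the depth values are arbitrary choices for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the shot data (not real football statistics)
X, y = make_classification(
    n_samples=1000, n_features=5, n_informative=3, n_redundant=0,
    flip_y=0.2, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

scores = {}
for depth in [1, 3, 5, 10, None]:  # None = grow until leaves are pure
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    train_acc = tree.score(X_train, y_train)
    test_acc = tree.score(X_test, y_test)
    scores[depth] = (train_acc, test_acc)
    print(f"max_depth={str(depth):>4}  train={train_acc:.3f}  test={test_acc:.3f}")
```

A large gap between training and test accuracy at high depths is the usual sign of overfitting; low accuracy on both at very small depths suggests underfitting.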

## **📊 Example: Hyperparameters in NBA Analytics 🏀**

You’re predicting whether an NBA team will **win or lose** based on:  
- **3PT Shooting % 🎯**  
- **Turnovers per game 🔄**  
- **Defensive efficiency 🏀**

📌 **Using a Random Forest Model:**

| Hyperparameter                      | Meaning                                       | Impact                                                |
|-------------------------------------|-----------------------------------------------|-------------------------------------------------------|
| **Number of Trees (n_estimators)**  | How many trees in the forest?                 | More trees = more stable predictions (with diminishing returns), but slower training. |
| **Max Features (max_features)**     | How many features to consider at each split?  | Fewer = faster, more varied trees; more = stronger but more similar trees. |
| **Max Depth**                       | Maximum depth of trees.                       | Higher depth = risk of overfitting.                   |

✅ **Example Values:**  
- **n_estimators = 100** → Uses 100 trees.  
- **Max Depth = 10** → Limits tree complexity.  
- **Max Features = "sqrt"** → Considers the square root of the number of features at each split.

📌 **Why does tuning matter?**  
- Too many trees = **slow model, little extra accuracy**.  
- Too few trees = **low accuracy**.
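
📌 **A quick way to check this yourself (a sketch on synthetic data):** the snippet below measures cross-validated accuracy and wall-clock time for several forest sizes. The data is generated with `make_classification` as a placeholder for real NBA features, and the tree counts are arbitrary.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic placeholder for the NBA features (not real game data)
X, y = make_classification(
    n_samples=500, n_features=5, n_informative=3, n_redundant=0, random_state=0
)

results = {}
for n_trees in [1, 10, 50, 200]:
    forest = RandomForestClassifier(
        n_estimators=n_trees, max_depth=10, max_features="sqrt", random_state=0
    )
    start = time.perf_counter()
    accuracy = cross_val_score(forest, X, y, cv=3, scoring="accuracy").mean()
    elapsed = time.perf_counter() - start
    results[n_trees] = (accuracy, elapsed)
    print(f"n_estimators={n_trees:>3}  accuracy={accuracy:.3f}  time={elapsed:.2f}s")
```

Accuracy typically levels off as trees are added, while training time keeps growing roughly in proportion to the number of trees.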

## **🛠️ How Do We Tune Hyperparameters? (Hyperparameter Optimization)**

1️⃣ **Grid Search 🔍**  
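   - Tries **every combination** of the candidate values you list for each hyperparameter.  
   - Works well for small search spaces, but slow for large ones because the number of combinations multiplies with every added hyperparameter.  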
2️⃣ **Random Search 🎲**  
   - Randomly picks hyperparameters instead of trying every combination.  
   - Faster than Grid Search, works well with many parameters.  
3️⃣ **Bayesian Optimization 🤖**  
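   - Builds a probabilistic model of how hyperparameters affect the score and uses it to choose the next settings to try.  
   - Usually needs fewer trials than Grid Search, which helps when each training run is expensive (e.g., **deep learning models**).  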
4️⃣ **Genetic Algorithms (Evolutionary Search) 🧬**  
   - Mimics natural selection to evolve the best hyperparameters.  
   - Used in **complex ML models like Neural Networks**.
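
📌 **Random Search in code (a sketch, assuming scikit-learn and SciPy):** the snippet below uses `RandomizedSearchCV` to sample a fixed number of settings from ranges instead of trying every combination. The ranges, `n_iter=8`, and the synthetic data are illustrative assumptions. Bayesian optimization and genetic algorithms usually rely on separate third-party libraries and are not shown here.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Synthetic placeholder data (not real sports statistics)
X, y = make_classification(
    n_samples=300, n_features=5, n_informative=3, n_redundant=0, random_state=0
)

# Ranges to sample from; randint(low, high) excludes the upper bound
param_distributions = {
    "n_estimators": randint(20, 101),  # integers 20..100
    "max_depth": randint(2, 11),       # integers 2..10
    "max_features": ["sqrt", "log2"],
}

random_search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions,
    n_iter=8,          # only 8 sampled settings are evaluated
    cv=3,
    scoring="accuracy",
    random_state=0,
)
random_search.fit(X, y)

print("Best Hyperparameters:", random_search.best_params_)
print("Settings tried:", len(random_search.cv_results_["params"]))
```

The cost is controlled by `n_iter` rather than by the size of the search space, which is why random search scales better when many hyperparameters are involved.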

## **🛠️ Python Code: Hyperparameter Tuning with Grid Search**

```python
from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier
import numpy as np

# Sample dataset (NBA Wins Prediction), filled with random placeholder values
X = np.random.rand(1000, 5)  # 1000 games, 5 features (e.g., 3PT %, Turnovers)
y = np.random.randint(0, 2, size=1000)  # 0 = Loss, 1 = Win

# Define Random Forest model
model = RandomForestClassifier()

# Define hyperparameters to tune (3 x 3 x 2 = 18 combinations)
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [5, 10, 20],
    'max_features': ['sqrt', 'log2']
}

# Perform Grid Search: each combination is scored with 5-fold cross-validation
grid_search = GridSearchCV(model, param_grid, cv=5, scoring='accuracy')
grid_search.fit(X, y)

# Best hyperparameters
print("Best Hyperparameters:", grid_search.best_params_)
```

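✅ **Output Example** (values vary between runs; the labels here are random, so the winning combination reflects noise rather than real signal):  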
```
Best Hyperparameters: {'max_depth': 20, 'max_features': 'log2', 'n_estimators': 50}
```

📌 **This means the highest cross-validated accuracy came from:**  
- **Max Depth = 20**
- **Max Features = "log2"**
- **Number of Trees = 50**
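
📌 **Checking the tuned model on unseen data (a sketch, not part of the original example):** cross-validated accuracy is measured on the same data used to pick the hyperparameters, so it is common to hold out a test set as well. The snippet below assumes synthetic data from `make_classification` and a deliberately small grid to keep it fast.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic placeholder data (not real NBA statistics)
X, y = make_classification(
    n_samples=400, n_features=5, n_informative=3, n_redundant=0, random_state=0
)

# Keep a test set that the search never sees
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

param_grid = {"n_estimators": [25, 50], "max_depth": [3, 10]}
grid_search = GridSearchCV(
    RandomForestClassifier(random_state=0), param_grid, cv=3, scoring="accuracy"
)
grid_search.fit(X_train, y_train)

# best_estimator_ is the model refit on all training data with the best settings
cv_accuracy = grid_search.best_score_
test_accuracy = grid_search.best_estimator_.score(X_test, y_test)

print("Best Hyperparameters:", grid_search.best_params_)
print(f"Cross-validated accuracy: {cv_accuracy:.3f}")
print(f"Held-out test accuracy:   {test_accuracy:.3f}")
```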

## **🚀 Why Are Hyperparameters Important?**

✅ **1️⃣ Improve Model Accuracy** → The right settings can **boost prediction performance**.  
✅ **2️⃣ Prevent Overfitting** → Tuning **regularization and depth** avoids memorizing noise.  
✅ **3️⃣ Optimize Training Speed** → **Too many layers in a neural network = slow model**.  
✅ **4️⃣ Balance Bias & Variance** → Prevent **underfitting or overfitting**.

## **🔥 Summary**

1️⃣ **Hyperparameters are settings that control the learning process (not learned from data).**  
2️⃣ **Examples include learning rate, number of trees, max depth, and regularization.**  
3️⃣ **Careful hyperparameter tuning can improve accuracy and reduce overfitting.**  
4️⃣ **Common tuning methods: Grid Search, Random Search, Bayesian Optimization.**  
5️⃣ **Used in Football (xG models), NBA (win prediction), stock markets, deep learning, and more!**