# Hyperparameter Tuning with Grid Search

Hyperparameters are settings fixed **outside the training process**: they are chosen before fitting rather than learned from the data (e.g., learning rate, number of trees, regularization strength). Choosing them well is crucial for good model performance.

## Why Hyperparameter Tuning?
- Different hyperparameter values can drastically change model accuracy.
- Manual trial-and-error is inefficient.
- Automated search strategies like **Grid Search** and **Random Search** are commonly used.

## Grid Search
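- Systematically tries every combination of the candidate values you list for each hyperparameter.
- Uses **cross-validation** to evaluate each combination.
- Computationally expensive, because the number of combinations is the product of the number of values per hyperparameter, but it guarantees that every combination in the grid is tested. Values not listed in the grid are never tried.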
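To make the cost concrete, the enumeration that Grid Search performs can be sketched with the standard library alone. The grid values and fold count below are illustrative assumptions, not a recommendation.

```python
from itertools import product

# Illustrative grid (assumed values): 5 choices for C, 2 penalty types
param_grid = {
    "C": [0.01, 0.1, 1, 10, 100],
    "penalty": ["l1", "l2"],
}
cv_folds = 5  # assumed number of cross-validation folds

# Cartesian product of all candidate values, one dict per combination
names = list(param_grid)
combinations = [dict(zip(names, values)) for values in product(*param_grid.values())]

# Each combination is trained and scored once per fold
n_fits = len(combinations) * cv_folds
print(len(combinations), "combinations x", cv_folds, "folds =", n_fits, "model fits")
```

Adding a third hyperparameter with four candidate values would multiply the count again, to 40 combinations and 200 fits, which is why grids are usually kept small.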

### Example Parameters:
- Logistic Regression: regularization parameter `C`, penalty type.
- Random Forest: number of trees (`n_estimators`), tree depth (`max_depth`).


```python
# Example: Grid Search with Logistic Regression
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.linear_model import LogisticRegression

# Load dataset
data = load_breast_cancer()
X, y = data.data, data.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define model and parameters
model = LogisticRegression(max_iter=500, solver='liblinear')
param_grid = {
    'C': [0.01, 0.1, 1, 10, 100],
    'penalty': ['l1', 'l2']
}

# Grid Search with 5-fold cross-validation
grid = GridSearchCV(model, param_grid, cv=5, scoring='accuracy')
grid.fit(X_train, y_train)

print("Best Parameters:", grid.best_params_)
print("Best Cross-Validation Accuracy:", grid.best_score_)
print("Test Accuracy:", grid.score(X_test, y_test))
```
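
The same workflow carries over to the Random Forest parameters mentioned earlier, and the fitted search object keeps a score for every combination, not only the best one. The following is a minimal sketch under assumed settings: the candidate values, the 3-fold split, and the variable names `rf_param_grid` and `rf_grid` are chosen here only to keep the run short.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Assumed candidate values, kept small so the search finishes quickly
rf_param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [None, 5],
}

rf_grid = GridSearchCV(
    RandomForestClassifier(random_state=42),
    rf_param_grid,
    cv=3,
    scoring="accuracy",
)
rf_grid.fit(X_train, y_train)

# cv_results_ holds one entry per combination; list them from best to worst rank
results = rf_grid.cv_results_
for i in results["rank_test_score"].argsort():
    print(results["params"][i], "mean CV accuracy:", round(results["mean_test_score"][i], 4))

print("Test Accuracy:", rf_grid.score(X_test, y_test))
```

Looking at the full ranking rather than only `best_params_` helps show whether the top combinations are clearly separated or effectively tied.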

## Key Takeaways
- Grid Search evaluates every combination in the grid you specify, and nothing outside it.
- Best for **small parameter spaces**.
- For larger parameter spaces, consider **Random Search** or **Bayesian Optimization**.
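
As a rough illustration of the Random Search alternative, scikit-learn's `RandomizedSearchCV` samples a fixed number of combinations instead of trying them all. This is a sketch under assumptions: the log-uniform range for `C`, the budget of 10 samples, and the name `search` are illustrative choices, not tuned values.

```python
from scipy.stats import loguniform
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV, train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# C is drawn from a continuous log-uniform range (assumed bounds);
# penalty is sampled from a discrete list
param_distributions = {
    "C": loguniform(1e-2, 1e2),
    "penalty": ["l1", "l2"],
}

search = RandomizedSearchCV(
    LogisticRegression(max_iter=500, solver="liblinear"),
    param_distributions,
    n_iter=10,        # number of sampled combinations, independent of range size
    cv=5,
    scoring="accuracy",
    random_state=42,  # makes the sampled combinations reproducible
)
search.fit(X_train, y_train)

print("Best Parameters:", search.best_params_)
print("Best Cross-Validation Accuracy:", search.best_score_)
print("Test Accuracy:", search.score(X_test, y_test))
```

The cost here is set by `n_iter` rather than by the size of the search space, so widening the range for `C` does not increase the number of model fits.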