# Day 05 — Hyperparameter Tuning

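Hyperparameters, such as the regularization strength `C` in logistic regression, control model behavior but are set before training rather than learned from the data.
We choose them by comparing candidate settings on validation data that the model was not fit on.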

We will cover:
- Train/validation splits
- Cross-validation
- Grid search vs random search
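
As a quick illustration of the cross-validation idea, the sketch below scores one fixed pipeline with 5-fold cross-validation. It runs on the full dataset only to show the mechanics; the name `cv_pipeline` is introduced here for illustration and is not used elsewhere in this notebook.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Scaling lives inside the pipeline, so the scaler is re-fit on the
# training portion of each fold and never sees that fold's validation rows.
cv_pipeline = Pipeline(
    [
        ("scaler", StandardScaler()),
        ("model", LogisticRegression(max_iter=1000)),
    ]
)

# With an integer cv and a classifier, scikit-learn uses stratified k-fold splits.
cv_scores = cross_val_score(cv_pipeline, X, y, cv=5)
print(cv_scores.mean(), cv_scores.std())
```

Each entry of `cv_scores` is the accuracy on one held-out fold; the mean is the number a search procedure would compare across hyperparameter settings.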


## 1) Load dataset
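We reuse scikit-learn's breast cancer dataset for consistency and hold out 20% of it as a test set. Passing `stratify=y` keeps the class proportions the same in both splits.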


In [None]:
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split, GridSearchCV, RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

cancer = load_breast_cancer()
X, y = cancer.data, cancer.target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
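
To see what `stratify=y` does, one option is to compare class proportions across the splits. This is a small sanity check, not part of the tuning workflow; the `*_frac` names are introduced here for illustration.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Fraction of each class in the full data, the training split, and the test split.
full_frac = np.bincount(y) / len(y)
train_frac = np.bincount(y_train) / len(y_train)
test_frac = np.bincount(y_test) / len(y_test)
print(full_frac, train_frac, test_frac)
```

With stratification the three arrays should be nearly identical; without it, a small test set can end up with a noticeably different class balance.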


## 2) Baseline pipeline
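We establish a baseline before tuning: a scaled logistic regression with default hyperparameters (`C=1.0`). Any tuned model should be judged against this score.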


In [None]:
pipeline = Pipeline(
    [
        ("scaler", StandardScaler()),
        ("model", LogisticRegression(max_iter=1000)),
    ]
)

pipeline.fit(X_train, y_train)
baseline_acc = accuracy_score(y_test, pipeline.predict(X_test))
baseline_acc


## 3) Grid search
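Grid search tries every combination in the grid and scores each one with cross-validation on the training set (`cv=5` here). Only `C` varies in this grid, so there are four candidates.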


In [None]:
param_grid = {
    "model__C": [0.01, 0.1, 1.0, 10.0],
    "model__penalty": ["l2"],
    "model__solver": ["lbfgs"],
}

grid = GridSearchCV(pipeline, param_grid, cv=5, n_jobs=-1)
grid.fit(X_train, y_train)

grid.best_params_, grid.best_score_
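
Looking only at `best_params_` hides how close the candidates were. The sketch below reruns an equivalent search and prints the mean and standard deviation of the validation score for every candidate from `cv_results_`. It omits `penalty` and `solver` because `"l2"` and `"lbfgs"` are the defaults; the names `demo_pipeline`, `demo_grid`, and `results` are introduced here for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

demo_pipeline = Pipeline(
    [
        ("scaler", StandardScaler()),
        ("model", LogisticRegression(max_iter=1000)),
    ]
)

demo_grid = GridSearchCV(demo_pipeline, {"model__C": [0.01, 0.1, 1.0, 10.0]}, cv=5)
demo_grid.fit(X_train, y_train)

# cv_results_ holds one entry per candidate, in the order they were tried.
results = demo_grid.cv_results_
for params, mean, std in zip(
    results["params"], results["mean_test_score"], results["std_test_score"]
):
    print(f"C={params['model__C']:<6} mean={mean:.4f} std={std:.4f}")
```

If the means differ by less than the fold-to-fold standard deviation, the choice between those settings is not well supported by the data.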


## 4) Randomized search
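Randomized search evaluates a fixed number of sampled settings (`n_iter`) instead of every combination. Here it draws 4 of the 6 candidate values of `C`, which keeps the cost fixed no matter how large the search space is.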


In [None]:
param_dist = {
    "model__C": [0.001, 0.01, 0.1, 1, 10, 100],
    "model__solver": ["lbfgs"],
    "model__penalty": ["l2"],
}

random_search = RandomizedSearchCV(
    pipeline,
    param_dist,
    n_iter=4,
    cv=5,
    random_state=42,
    n_jobs=-1,
)
random_search.fit(X_train, y_train)

random_search.best_params_, random_search.best_score_
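
Randomized search can also sample from a continuous distribution rather than a list. The sketch below is one possible variation, assuming SciPy is installed: it draws `C` from `scipy.stats.loguniform` over the same range as the list above. The range, `n_iter=10`, and the names `demo_pipeline` and `loguniform_search` are choices made for this illustration.

```python
from scipy.stats import loguniform
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

demo_pipeline = Pipeline(
    [
        ("scaler", StandardScaler()),
        ("model", LogisticRegression(max_iter=1000)),
    ]
)

# loguniform spreads samples evenly across orders of magnitude,
# which suits scale-type parameters such as C.
loguniform_search = RandomizedSearchCV(
    demo_pipeline,
    {"model__C": loguniform(1e-3, 1e2)},
    n_iter=10,
    cv=5,
    random_state=42,
)
loguniform_search.fit(X_train, y_train)
print(loguniform_search.best_params_, loguniform_search.best_score_)
```

Unlike a fixed list, a distribution lets the search land on values between the grid points, at the same cost per sampled setting.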


## 5) Evaluate the tuned model
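We compare tuned and baseline performance on the held-out test set, which played no part in tuning. `grid.best_estimator_` is the pipeline refit on the full training set with the best parameters found. Since the default `C=1.0` is one of the grid values, the tuned model may match the baseline exactly.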


In [None]:
best_model = grid.best_estimator_

best_acc = accuracy_score(y_test, best_model.predict(X_test))
baseline_acc, best_acc


## 6) What to do next
After tuning, you should explain the model’s behavior. That’s the focus of Day 06.
