# Hyperparameter Tuning with GridSearchCV

This notebook tunes the `C` and `penalty` hyperparameters of the `glmpynet` `LogisticRegression` estimator with scikit-learn's `GridSearchCV`.

## Setup
We use a synthetic binary classification dataset of 200 samples and 20 features, 10 of which are informative, and hold out 20% of the samples (40) for testing.

In [1]:
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.metrics import accuracy_score
from glmpynet import LogisticRegression

# Generate synthetic dataset
X, y = make_classification(n_samples=200, n_features=20, n_informative=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define grid search
param_grid = {
    'C': [0.1, 1.0, 10.0],
    'penalty': ['l1', 'l2']
}
model = LogisticRegression()
grid_search = GridSearchCV(model, param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train, y_train)

# Evaluate best model
print(f"Best Parameters: {grid_search.best_params_}")
print(f"Best Cross-Validation Score: {grid_search.best_score_:.2f}")
y_pred = grid_search.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Test Accuracy: {accuracy:.2f}")

Best Parameters: {'C': 0.1, 'penalty': 'l1'}
Best Cross-Validation Score: 0.82
Test Accuracy: 0.72
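
The cross-validation score (0.82) and the test accuracy (0.72) above differ by 0.10, but the test set holds only 40 samples (20% of 200). As a rough check, the sketch below estimates the sampling uncertainty of the test accuracy with the binomial standard error. It hard-codes the values printed above and treats test predictions as independent trials, which is an assumption rather than something the notebook verifies.

```python
import math

# Values copied from the printed output above
n_test = 40           # 20% of the 200 generated samples
test_accuracy = 0.72  # reported test accuracy
cv_score = 0.82       # reported best cross-validation score

# Binomial standard error of an accuracy estimated from n_test samples
standard_error = math.sqrt(test_accuracy * (1 - test_accuracy) / n_test)

# Gap between the two scores, expressed in standard errors
gap = cv_score - test_accuracy
gap_in_se = gap / standard_error

print(f"Standard error of test accuracy: {standard_error:.3f}")
print(f"CV/test gap: {gap:.2f} ({gap_in_se:.1f} standard errors)")
```

The standard error comes out at roughly 0.07, so a gap of 0.10 is not strong evidence of overfitting on its own. A larger test set or repeated splits would give a firmer estimate.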


## Explanation
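- `GridSearchCV` evaluates every combination of `C` and `penalty` (3 × 2 = 6 candidates) with 5-fold cross-validation, keeps the combination with the highest mean accuracy, and refits it on the full training set.
- Only 10 of the 20 features are informative, so the choice of penalty and regularization strength affects how much weight the remaining features receive.
- In this run the search selected `C=0.1` with the `l1` penalty (cross-validation accuracy 0.82, test accuracy 0.72). The selected values depend on the data and the split; other datasets may favor `C=1.0` or `10.0` with the `l2` penalty.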
- With `glmnet`, the optimal parameters may differ because of its elastic-net regularization.
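
To see how close the six candidates are to one another, it helps to look at the full `cv_results_` table rather than only `best_params_`. The sketch below repeats the search with scikit-learn's own `LogisticRegression` as a stand-in for the `glmpynet` estimator, which is an assumption made so the example runs without `glmpynet` installed. The `liblinear` solver is chosen because it supports both the `l1` and `l2` penalties. Scores from this stand-in are not guaranteed to match the output shown earlier.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression  # stand-in, not glmpynet
from sklearn.model_selection import GridSearchCV, train_test_split

# Same data and split as in the notebook cell
X, y = make_classification(n_samples=200, n_features=20, n_informative=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

param_grid = {"C": [0.1, 1.0, 10.0], "penalty": ["l1", "l2"]}

# liblinear handles both l1 and l2 penalties
search = GridSearchCV(
    LogisticRegression(solver="liblinear"),
    param_grid,
    cv=5,
    scoring="accuracy",
)
search.fit(X_train, y_train)

# Collect one row per candidate and sort by rank (1 = best)
results = search.cv_results_
rows = sorted(
    zip(
        results["rank_test_score"],
        results["params"],
        results["mean_test_score"],
        results["std_test_score"],
    ),
    key=lambda row: row[0],
)

for rank, params, mean_score, std_score in rows:
    print(f"rank {rank}: {params} mean={mean_score:.3f} std={std_score:.3f}")
```

If the top few candidates differ by less than the fold-to-fold standard deviation, the "best" parameters are effectively tied and should not be over-interpreted.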