In [55]:
import numpy as np
from scipy.stats import uniform
from sklearn import linear_model, datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV , RandomizedSearchCV

Choosing the best hyperparameter values for a learning algorithm is known as hyperparameter tuning, hyperparameter optimization, or model selection.

# **Selecting Best Models Using Exhaustive Search**

Set ```n_jobs=-1``` to use all the cores on your machine and speed up the search.
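As a minimal sketch of this setting, the snippet below passes ```n_jobs=-1``` to ```GridSearchCV``` on the same iris data and logistic regression used in this notebook. The names ```param_grid``` and ```parallel_search``` are illustrative, and the actual speedup depends on the number of cores and the size of the search.

```python
import numpy as np
from sklearn import datasets, linear_model
from sklearn.model_selection import GridSearchCV

# Load data
iris = datasets.load_iris()
features, target = iris.data, iris.target

# Same estimator and candidate values as the grid search in this notebook
logistic = linear_model.LogisticRegression(solver="liblinear", max_iter=1000)
param_grid = {"C": np.logspace(0, 4, 10), "penalty": ["l1", "l2"]}

# n_jobs=-1 asks scikit-learn to run the cross-validation fits on all available cores
parallel_search = GridSearchCV(logistic, param_grid, cv=5, n_jobs=-1)
parallel_search.fit(features, target)

print(parallel_search.best_params_)
```

On a dataset as small as iris the gain is likely negligible; the setting matters more as the grid or the data grows.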

In [56]:
iris = datasets.load_iris()
features = iris.data
target = iris.target

# Create logistic regression
logistic = linear_model.LogisticRegression(solver='liblinear' , max_iter=1000)

# Create range of candidate penalty hyperparameter values
penalty = ['l1', 'l2']

# Create range of candidate regularization hyperparameter values
C = np.logspace(0, 4, 10)

# Create dictionary hyperparameter candidates
hyperparameters = dict(C=C, penalty=penalty)

# Create grid search
gridsearch = GridSearchCV(logistic, hyperparameters, cv=5, verbose=0)  # set verbose to 1, 2, or 3 for more detail

# Fit grid search
best_model = gridsearch.fit(features, target)

**We had 10 possible values of C, 2 possible values of the regularization penalty, and 5 folds.
That gives 10 × 2 = 20 candidate hyperparameter combinations, each fit once per fold, for
10 × 2 × 5 = 100 model fits from which the best combination was selected.**
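The count above can be checked without fitting anything. This sketch uses ```ParameterGrid``` from scikit-learn to enumerate the combinations; the variable names are illustrative.

```python
import numpy as np
from sklearn.model_selection import ParameterGrid

# Same candidate values as the grid search above
param_grid = {"C": np.logspace(0, 4, 10), "penalty": ["l1", "l2"]}
n_folds = 5

# ParameterGrid enumerates every combination of the listed values
n_candidates = len(ParameterGrid(param_grid))
n_fits = n_candidates * n_folds

print(n_candidates, n_fits)  # 20 100
```

With the default ```refit=True```, ```GridSearchCV``` also fits the winning combination once more on the full dataset.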


---



Once ```GridSearchCV``` is complete, we can see the hyperparameters of the best model:

In [57]:
# View best hyperparameters
print('Best Penalty:', best_model.best_estimator_.get_params()['penalty'])
print('Best C:', best_model.best_estimator_.get_params()['C'])

Best Penalty: l1
Best C: 7.742636826811269


In [58]:
# View best hyperparameters
best_model.best_params_

{'C': np.float64(7.742636826811269), 'penalty': 'l1'}
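A fitted search stores more than the winning combination. The sketch below reruns the same grid search and reads its ```best_score_``` and ```cv_results_``` attributes to list the top-ranked candidates; the variable names are illustrative and no particular scores are assumed.

```python
import numpy as np
from sklearn import datasets, linear_model
from sklearn.model_selection import GridSearchCV

iris = datasets.load_iris()
features, target = iris.data, iris.target

logistic = linear_model.LogisticRegression(solver="liblinear", max_iter=1000)
param_grid = {"C": np.logspace(0, 4, 10), "penalty": ["l1", "l2"]}

search = GridSearchCV(logistic, param_grid, cv=5)
search.fit(features, target)

# Mean cross-validated accuracy of the best combination
print("Best mean CV accuracy:", search.best_score_)

# cv_results_ holds one entry per candidate; show the three best-ranked ones
results = search.cv_results_
top = np.argsort(results["rank_test_score"])[:3]
for i in top:
    print(results["params"][i], round(float(results["mean_test_score"][i]), 4))
```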

In [59]:
# Predict target vector
best_model.predict(features)

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])

# **Selecting Best Models Using Randomized Search**

In [66]:
iris = datasets.load_iris()
features = iris.data
target = iris.target

# Create logistic regression
logistic = linear_model.LogisticRegression(solver='liblinear' , max_iter=1000)

# Create range of candidate regularization penalty hyperparameter values
penalty = ['l1', 'l2']

# Create distribution of candidate regularization hyperparameter values
C = uniform(loc=0, scale=4)

# Create hyperparameter options
hyperparameters = dict(C=C, penalty=penalty)

# Create randomized search
randomizedsearch = RandomizedSearchCV(
    logistic, hyperparameters, random_state=1, n_iter=100, cv=5, verbose=0,
    n_jobs=-1)

# Fit randomized search
best_model = randomizedsearch.fit(features, target)

In [67]:
# View best hyperparameters
print('Best Penalty:', best_model.best_estimator_.get_params()['penalty'])
print('Best C:', best_model.best_estimator_.get_params()['C'])

Best Penalty: l1
Best C: 1.668088018810296


In [68]:
# View best hyperparameters
best_model.best_params_

{'C': np.float64(1.668088018810296), 'penalty': 'l1'}

In [69]:
# Predict target vector
best_model.predict(features)

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2,
       2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])

**The number of sampled hyperparameter combinations (i.e., the number of candidate
models trained) is set with the ```n_iter``` (number of iterations) parameter.**
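To see what the sampling looks like, this sketch draws candidates directly with scikit-learn's ```ParameterSampler```. It uses the same distribution and penalty list as the randomized search above, but with an assumed ```n_iter=5``` to keep the output short.

```python
from scipy.stats import uniform
from sklearn.model_selection import ParameterSampler

# C is drawn from a uniform distribution on [0, 4]; penalty is drawn from a list
param_distributions = {"C": uniform(loc=0, scale=4), "penalty": ["l1", "l2"]}

# n_iter controls how many combinations are sampled
samples = list(ParameterSampler(param_distributions, n_iter=5, random_state=1))

for params in samples:
    print(params)
```

In the search above, ```n_iter=100``` with ```cv=5``` means 100 sampled candidates and 500 cross-validation fits.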

# **Speeding Up Model Selection Using Algorithm-Specific Methods**

In [64]:
# Load data
iris = datasets.load_iris()
features = iris.data
target = iris.target

# Create cross-validated logistic regression
logit = linear_model.LogisticRegressionCV(Cs=100)

# Train model
logit.fit(features, target)

With ```Cs=100```, ```LogisticRegressionCV``` automatically searches over 100 candidate values of the regularization parameter ```C```, spaced logarithmically between 1e-4 and 1e4.
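The candidate values are available on the fitted model as ```Cs_```. The sketch below checks the size of that grid and compares it with ```np.logspace(-4, 4, 100)```; ```max_iter=1000``` is an assumption added here to reduce convergence warnings and is not part of the cell above.

```python
import numpy as np
from sklearn import datasets, linear_model

iris = datasets.load_iris()

# Cs=100 requests 100 candidate values of C; max_iter=1000 is an added assumption
logit = linear_model.LogisticRegressionCV(Cs=100, max_iter=1000)
logit.fit(iris.data, iris.target)

# Cs_ holds the candidate values that were cross-validated
print("Number of candidates:", len(logit.Cs_))
print("Smallest and largest C:", logit.Cs_.min(), logit.Cs_.max())
print("Matches a log-spaced grid:", np.allclose(logit.Cs_, np.logspace(-4, 4, 100)))
```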



In [65]:
print("Best C value:", logit.C_)
print("Scores across folds:", logit.scores_[0].mean(axis=0))  # class 0: mean over folds, one score per candidate C
print("Final accuracy:", logit.score(features, target))

Best C value: [5.85702082 5.85702082 5.85702082]
Scores across folds: [0.69333333 0.69333333 0.69333333 0.69333333 0.69333333 0.70666667
 0.71333333 0.71333333 0.72       0.72666667 0.73333333 0.73333333
 0.73333333 0.75333333 0.76666667 0.78666667 0.78       0.78666667
 0.79333333 0.79333333 0.8        0.82       0.84       0.85333333
 0.86       0.86666667 0.88       0.88666667 0.88666667 0.9
 0.90666667 0.91333333 0.92666667 0.93333333 0.94       0.94666667
 0.94666667 0.94666667 0.94666667 0.94666667 0.96       0.96
 0.96       0.96       0.96       0.96       0.96666667 0.96666667
 0.96666667 0.96666667 0.97333333 0.97333333 0.97333333 0.97333333
 0.97333333 0.97333333 0.97333333 0.97333333 0.97333333 0.98
 0.98       0.97333333 0.97333333 0.98       0.97333333 0.98
 0.98       0.98       0.98       0.98       0.98       0.98
 0.98       0.98       0.98       0.98       0.97333333 0.98
 0.98       0.98       0.98       0.98       0.98       0.98
 0.98       0.98       0.98       0

- Built-in cross-validation for hyperparameter tuning.

- Cleaner than manually using GridSearchCV.

- Automatically handles the search and fitting process.

# Notes

| Technique              | When to Use                                                                          | Best For                                                             |
| ---------------------- | ------------------------------------------------------------------------------------ | -------------------------------------------------------------------- |
| **Grid Search**        | You have **few hyperparameters** to tune and **time is not a constraint**            | Finding the **best combination on the grid** in a small search space |
| **Randomized Search**  | You have a **large hyperparameter space** and want results **faster**                | **Quick performance boost** without full search                      |
| **Parallelization**    | Your machine has **multiple CPU cores** and you want to **speed up** model selection | **Reducing computation time** for Grid or Randomized Search          |
| **Built-in CV Models** | You want the model to **automatically tune itself** (e.g., regularization strength)  | Models like **LogisticRegressionCV, LassoCV, RidgeCV** — easy tuning |
