# Model Selection

In machine learning, training learns the parameters of a model by minimizing some loss function.

Many learning algorithms also have hyperparameters that must be defined outside of the learning process. For example, random forests are collections of decision trees (hence the word forest), and the number of trees in the forest is a hyperparameter that must be set prior to fitting. Choosing these values is referred to as hyperparameter tuning, hyperparameter optimization, or model selection.
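
The difference between hyperparameters and learned parameters can be sketched in code. This minimal example assumes the iris data and arbitrary illustrative values (10 trees, `max_iter=1000`): `n_estimators` is chosen by us before fitting, while the logistic regression coefficients are learned during fitting.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Hyperparameter: the number of trees is fixed by us before training.
forest = RandomForestClassifier(n_estimators=10, random_state=0)
forest.fit(X, y)
n_trees = len(forest.estimators_)  # one fitted tree per requested estimator

# Parameters: the coefficients are learned from the data during training.
logit = LogisticRegression(max_iter=1000)
logit.fit(X, y)
coef_shape = logit.coef_.shape  # (n_classes, n_features) for this problem

print(n_trees, coef_shape)
```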

We can also compare learning algorithms, for instance checking whether a Support Vector Machine classifier or a random forest performs better.
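
One possible way to make such a comparison is to score each algorithm with the same cross-validation scheme. The sketch below is an illustration rather than a prescribed method; the iris data, 5 folds, default `SVC` settings, and 100 trees are all assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Candidate learning algorithms, both with mostly default hyperparameters.
candidates = {
    "svc": SVC(),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Mean accuracy over 5 cross-validation folds for each candidate.
mean_scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in candidates.items()
}

best_name = max(mean_scores, key=mean_scores.get)
print(mean_scores, best_name)
```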

Model selection == selecting the best learning algorithm and its best hyperparameters.

C == a hyperparameter (in logistic regression, the inverse of the regularization strength)

# Selecting Best Models Using Exhaustive Search

You want to select the best model by searching over a range of hyperparameters using GridSearchCV.

### GridSearchCV

It's a brute force approach to model selection using cross-validation.



In [88]:
# Load libraries

import numpy as np
from sklearn import linear_model, datasets
from sklearn.model_selection import GridSearchCV

In [89]:
# Load data

iris = datasets.load_iris()

# Create Feature matrix and target vector

features = iris.data
target = iris.target

In [90]:
# Create Logistic Regression as Learning Algorithm

logistic = linear_model.LogisticRegression(max_iter=1000, solver = "liblinear")
logistic

LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
                   intercept_scaling=1, l1_ratio=None, max_iter=1000,
                   multi_class='auto', n_jobs=None, penalty='l2',
                   random_state=None, solver='liblinear', tol=0.0001, verbose=0,
                   warm_start=False)

In [91]:
# Create range of candidate penalty hyperparameter values.
# Two Possible values for regularization penalty.

penalty = ["l1", "l2"]

In [92]:
# Create Range of candidate regularization hyperparameter values 
# We define 10 possible values of C.

C = np.logspace(0,4,10)
C

array([1.00000000e+00, 2.78255940e+00, 7.74263683e+00, 2.15443469e+01,
       5.99484250e+01, 1.66810054e+02, 4.64158883e+02, 1.29154967e+03,
       3.59381366e+03, 1.00000000e+04])

In [93]:
# Create dictionary hyperparameter candidates. 2 Hyperparameters == C & Penalty

hyperparameters = dict(C=C, penalty = penalty)
hyperparameters

{'C': array([1.00000000e+00, 2.78255940e+00, 7.74263683e+00, 2.15443469e+01,
        5.99484250e+01, 1.66810054e+02, 4.64158883e+02, 1.29154967e+03,
        3.59381366e+03, 1.00000000e+04]),
 'penalty': ['l1', 'l2']}

### Selecting best model

For each combination of C and regularization penalty values, we train a model and evaluate it using k-fold cross-validation.

10 possible values of C * 2 penalty values = 20 candidate models, each trained on 5 folds = 100 model fits, from which the best candidate is selected.
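
The arithmetic can be checked with `ParameterGrid`, which enumerates the combinations a grid search would try. This small sketch rebuilds the same search space as above; the variable names `n_candidates`, `n_folds`, and `n_fits` are only illustrative.

```python
import numpy as np
from sklearn.model_selection import ParameterGrid

# Same search space as above: 10 values of C and 2 penalties.
hyperparameters = dict(C=np.logspace(0, 4, 10), penalty=["l1", "l2"])

n_candidates = len(ParameterGrid(hyperparameters))  # distinct hyperparameter combinations
n_folds = 5
n_fits = n_candidates * n_folds  # models trained during cross-validation

print(n_candidates, n_fits)  # prints: 20 100
```

With the default `refit=True`, one further fit on the full dataset follows the search.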

In [94]:
# Create GRIDSEARCH

# cv=5 sets the number of folds; verbose controls how much progress output is printed (0 is silent, higher values print more detail).
gridsearch = GridSearchCV(logistic, hyperparameters, cv=5, verbose=0)


In [95]:
# Fit grid search

best_model = gridsearch.fit(features,target)
best_model

GridSearchCV(cv=5, error_score=nan,
             estimator=LogisticRegression(C=1.0, class_weight=None, dual=False,
                                          fit_intercept=True,
                                          intercept_scaling=1, l1_ratio=None,
                                          max_iter=1000, multi_class='auto',
                                          n_jobs=None, penalty='l2',
                                          random_state=None, solver='liblinear',
                                          tol=0.0001, verbose=0,
                                          warm_start=False),
             iid='deprecated', n_jobs=None,
             param_grid={'C': array([1.00000000e+00, 2.78255940e+00, 7.74263683e+00, 2.15443469e+01,
       5.99484250e+01, 1.66810054e+02, 4.64158883e+02, 1.29154967e+03,
       3.59381366e+03, 1.00000000e+04]),
                         'penalty': ['l1', 'l2']},
             pre_dispatch='2*n_jobs', refit=True, return_train_score=False,
       

In [96]:
# See the hyperparameters of the best model

print("Best Penalty: ", best_model.best_estimator_.get_params()["penalty"])
print("Best C: ", best_model.best_estimator_.get_params()["C"])

Best Penalty:  l1
Best C:  7.742636826811269
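
Beyond the single best combination, a fitted search also exposes `best_params_`, `best_score_`, and per-candidate results in `cv_results_`. The sketch below is self-contained and, as a simplifying assumption, searches only over `C` with scikit-learn's default solver and `max_iter=5000` rather than reproducing the exact grid above.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

# Assumption: a reduced grid over C only, using the default solver.
param_grid = {"C": np.logspace(0, 4, 10)}
search = GridSearchCV(LogisticRegression(max_iter=5000), param_grid, cv=5)
search.fit(X, y)

print("Best hyperparameters:", search.best_params_)
print("Best mean CV accuracy:", search.best_score_)

# Mean cross-validated score for every candidate, in grid order.
results = list(zip(search.cv_results_["params"], search.cv_results_["mean_test_score"]))
for params, score in results:
    print(params, round(score, 3))
```

Looking at every candidate's score, not only the winner, shows how sensitive the model is to the hyperparameter.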


After identifying the best hyperparameters, GridSearchCV by default (refit=True) retrains a model using the best hyperparameters on the entire dataset. We can use this model to predict values just like any other scikit-learn model.

In [97]:
# Predict Target Vector


best_model.predict(features)


array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])
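
The predictions above are made on the same rows the search was fitted on, so they tend to look better than predictions on unseen data. One common way to check this is to hold out a test set before searching. The sketch below is illustrative only: the 75/25 stratified split, the grid over `C` alone, and the default solver are assumptions, not part of the original example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_iris(return_X_y=True)

# Assumption: hold out 25% of the rows, stratified by class.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

search = GridSearchCV(LogisticRegression(max_iter=5000), {"C": np.logspace(0, 4, 10)}, cv=5)
search.fit(X_train, y_train)  # the search never sees the test rows

predictions = search.predict(X_test)
test_accuracy = search.score(X_test, y_test)  # accuracy on held-out data
print(test_accuracy)
```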