## Hyperparameter optimization

A machine learning hypothesis is determined not only by the learning algorithm but also by its hyperparameters (the parameters of the algorithm that must be fixed before training and cannot be learned during the training process) and by the selection of variables used to learn the best parameters.
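
To make the distinction concrete, the following minimal sketch contrasts hyperparameters with learned parameters for an SVC on the digits data. The values `C=10` and `gamma=0.001` are arbitrary choices for illustration, not tuned results.

```python
from sklearn.datasets import load_digits
from sklearn import svm

X, y = load_digits(return_X_y=True)

# Hyperparameters: chosen before training and passed to the constructor.
# (C=10 and gamma=0.001 are illustrative values only.)
model = svm.SVC(C=10, gamma=0.001, kernel="rbf")
hyperparams = model.get_params()

# Learned parameters: estimated from the data, available only after fit().
model.fit(X, y)
n_support_vectors = model.support_vectors_.shape[0]

print("C:", hyperparams["C"], "gamma:", hyperparams["gamma"])
print("support vectors learned:", n_support_vectors)
```

Attributes with a trailing underscore, such as `support_vectors_`, are scikit-learn's convention for quantities estimated during fitting.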

In [1]:
from sklearn.datasets import load_digits
from sklearn import svm
digits = load_digits()
X, y = digits.data, digits.target
h = svm.SVC()  # plain SVC
hp = svm.SVC(probability=True, random_state=1)  # SVC that can also output class probabilities

In [5]:
# GridSearchCV evaluates every parameter combination in the grid with cross-validation
# and scores each one using a predefined or custom scoring function
from sklearn import model_selection
search_grid = [{'C': [1, 10, 100, 1000], 'kernel': ['linear']},
               {'C': [1, 10, 100, 1000], 'gamma': [0.001, 0.0001],
                'kernel': ['rbf']},
          ]
scorer = 'accuracy'
search_func = model_selection.GridSearchCV(estimator=h,  
                            param_grid=search_grid, scoring=scorer, 
                            n_jobs=-1, refit=True, cv=10)
# n_jobs=-1 uses all the processors available on the computer
# refit=True refits the best configuration on the whole training set
%timeit search_func.fit(X, y)  # times the entire search (note: %timeit repeats the fit several times)
print(search_func.best_estimator_)
print(search_func.best_params_)
print(search_func.best_score_)
# apply search_func.predict() to fresh data to obtain new predictions

4.02 s ± 762 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
SVC(C=10, gamma=0.001)
{'C': 10, 'gamma': 0.001, 'kernel': 'rbf'}
0.9810738671632526
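
The cross-validated score above is computed on the same data used to pick the hyperparameters. A minimal sketch of checking the tuned model on fresh data follows, assuming a held-out split; the 25% test size, `random_state=0`, and the reduced grid `small_grid` are arbitrary choices made to keep the example quick.

```python
from sklearn.datasets import load_digits
from sklearn import svm, model_selection

X, y = load_digits(return_X_y=True)

# Hold out 25% of the samples; the search never sees them.
X_train, X_test, y_train, y_test = model_selection.train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

# A smaller grid than the one above, only to keep this sketch fast.
small_grid = {'C': [1, 10], 'gamma': [0.001, 0.0001], 'kernel': ['rbf']}
search = model_selection.GridSearchCV(estimator=svm.SVC(),
                                      param_grid=small_grid,
                                      scoring='accuracy', cv=5, refit=True)
search.fit(X_train, y_train)

# With refit=True, predict() and score() delegate to the best estimator,
# which was refit on all of X_train.
y_pred = search.predict(X_test)
test_accuracy = search.score(X_test, y_test)
print(search.best_params_)
print(test_accuracy)
```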


## Building custom scoring functions

In [7]:
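# Commonly used scoring measures include accuracy, AUC, precision, recall, and F1 for
# classification, and R2, MAE, and MSE for regression; make_scorer wraps any other metric.
from sklearn.metrics import log_loss, make_scorer
# greater_is_better=False: log loss is a loss, so the scorer returns its negation.
# needs_proba=True: the metric needs predict_proba output; scikit-learn 1.4+ deprecates
# this argument in favor of response_method='predict_proba'.
Log_Loss = make_scorer(log_loss,
                       greater_is_better=False,
                       needs_proba=True)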
search_func = model_selection.GridSearchCV(estimator=hp, 
                         param_grid=search_grid, scoring=Log_Loss, 
                           n_jobs=-1, refit=True, cv=3)
search_func.fit(X,y)
# hyperparameters are now optimized for log loss, not for accuracy
print(search_func.best_score_)  # negated log loss, so values closer to 0 are better
print(search_func.best_params_)

-0.15958263353486224
{'C': 1, 'gamma': 0.001, 'kernel': 'rbf'}
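
The negative best score above follows from how `make_scorer` handles losses. The sketch below wraps a hand-written metric to show the sign flip; `error_rate` is a hypothetical helper defined here for illustration, and the 1000-sample split and the SVC settings are arbitrary.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn import svm
from sklearn.metrics import make_scorer

def error_rate(y_true, y_pred):
    """Fraction of misclassified samples (a loss: lower is better)."""
    return float(np.mean(np.asarray(y_true) != np.asarray(y_pred)))

# greater_is_better=False makes the scorer return the negated loss, so that
# "higher is better" still holds when a search compares candidates.
error_scorer = make_scorer(error_rate, greater_is_better=False)

X, y = load_digits(return_X_y=True)
clf = svm.SVC(C=10, gamma=0.001).fit(X[:1000], y[:1000])

raw_loss = error_rate(y[1000:], clf.predict(X[1000:]))
score = error_scorer(clf, X[1000:], y[1000:])
print(raw_loss, score)
```

A scorer built this way can be passed to the `scoring` argument of `GridSearchCV` in the same way as `Log_Loss`.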


## Feature selection

Feature selection simplifies high-dimensional structures by choosing the most predictive set of variables.

Feature selection methods:

- Selection based on the variance
- Univariate selection
- Recursive elimination
- Randomized logistic regression/stability selection
- L1-based feature selection
- Tree-based feature selection
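
Several of the methods listed above can be sketched with `sklearn.feature_selection` on the digits data. This is a rough illustration under assumed settings: the number of kept features (20), `C=0.01`, `step=8`, and `n_estimators=50` are arbitrary, and the choice of estimators is only one possibility. Stability selection is omitted because recent scikit-learn releases no longer ship an estimator for it.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import (RFE, SelectFromModel, SelectKBest,
                                       VarianceThreshold, chi2)
from sklearn.svm import SVC, LinearSVC

X, y = load_digits(return_X_y=True)

# Variance-based: drop features that are constant across all samples.
X_var = VarianceThreshold(threshold=0.0).fit_transform(X)

# Univariate: keep the 20 features with the highest chi-squared statistic
# (chi2 requires non-negative features, which pixel intensities are).
X_uni = SelectKBest(chi2, k=20).fit_transform(X_var, y)

# Recursive elimination: repeatedly fit a linear SVC and discard the
# weakest features (up to 8 per round) until 20 remain.
rfe = RFE(SVC(kernel='linear'), n_features_to_select=20, step=8)
X_rfe = rfe.fit_transform(X, y)

# L1-based: the L1 penalty drives some coefficients to zero;
# SelectFromModel keeps the features with non-negligible weights.
l1_model = LinearSVC(C=0.01, penalty='l1', dual=False)
X_l1 = SelectFromModel(l1_model).fit_transform(X, y)

# Tree-based: keep features whose importance exceeds the mean importance.
trees = ExtraTreesClassifier(n_estimators=50, random_state=0)
X_tree = SelectFromModel(trees).fit_transform(X, y)

for name, data in [('variance', X_var), ('univariate', X_uni),
                   ('RFE', X_rfe), ('L1', X_l1), ('trees', X_tree)]:
    print(name, data.shape[1], 'features kept')
```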