<font color="#CC3D3D"><p>
# Model Tuning (Hyperparameter Optimization)

<img align="left" src="https://i1.wp.com/hugrypiggykim.com/wp-content/uploads/2017/09/hyper-parameter-search.jpg?resize=698%2C242" width=800 height=600 alt="Hyperparameter search">

In [1]:
from sklearn.datasets import load_digits

digits = load_digits()

In [2]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, random_state=0)

In [3]:
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

model = KNeighborsClassifier()
#model = LogisticRegression()
#model = DecisionTreeClassifier()

<br><font color = "darkgreen">
### Grid Search CV 

< Example of the Grid Search process >   
<img align='left' src='http://drive.google.com/uc?export=view&id=1uoEGcdqMVfHsibXrjII-bwgbQi1mWDH0' width=500>   

##### Set the parameters for grid search #####

In [4]:
# param_grid: dictionary with parameters names as keys and
# lists of parameter settings to try as values

param_grid = {'n_neighbors': range(1,10),
              'weights': ['uniform','distance']}
param_grid

{'n_neighbors': range(1, 10), 'weights': ['uniform', 'distance']}

##### Grid search with cross-validation ####

In [5]:
from sklearn.model_selection import GridSearchCV

grid_search = GridSearchCV(KNeighborsClassifier(), param_grid, scoring='accuracy', cv=5, n_jobs=-1)

In [6]:
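# grid search can be time-consuming: it fits every combination in param_grid
# (here 9 x 2 = 18 candidates x 5 folds = 90 fits, plus one final refit)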

grid_search.fit(X_train, y_train)

GridSearchCV(cv=5, estimator=KNeighborsClassifier(), n_jobs=-1,
             param_grid={'n_neighbors': range(1, 10),
                         'weights': ['uniform', 'distance']},
             scoring='accuracy')

##### Evaluate the model with best parameters ####

In [7]:
grid_search.score(X_test, y_test), KNeighborsClassifier().fit(X_train, y_train).score(X_test, y_test)

(0.9888888888888889, 0.98)

In [8]:
print("Best parameters: {}".format(grid_search.best_params_))
print("Best CV score: {:.2f}".format(grid_search.best_score_))

Best parameters: {'n_neighbors': 3, 'weights': 'distance'}
Best CV score: 0.99


In [9]:
print("Best estimator:\n{}".format(grid_search.best_estimator_))

Best estimator:
KNeighborsClassifier(n_neighbors=3, weights='distance')


##### When the parameter grid is asymmetric (conditional parameters) #####

In [10]:
# In the case of SVM: 'gamma' is not used by the linear kernel,
# so pass a list of grids, one per kernel

param_grid = [{'kernel': ['rbf'],
               'C': [0.001, 0.01, 0.1, 1, 10, 100],
               'gamma': [0.001, 0.01, 0.1, 1, 10, 100]},
              {'kernel': ['linear'],
               'C': [0.001, 0.01, 0.1, 1, 10, 100]}]
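
The list-of-dictionaries grid above can be passed to `GridSearchCV` directly. The following is a minimal sketch, assuming the same digits split as earlier and using `cv=3` only to keep the run short; the name `svm_search` is introduced here for illustration. `ParameterGrid` is used to count the candidates, which shows that `gamma` is only expanded for the RBF kernel.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV, ParameterGrid, train_test_split
from sklearn.svm import SVC

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, random_state=0)

# One dictionary per kernel: 'gamma' is searched only together with 'rbf'
param_grid = [{'kernel': ['rbf'],
               'C': [0.001, 0.01, 0.1, 1, 10, 100],
               'gamma': [0.001, 0.01, 0.1, 1, 10, 100]},
              {'kernel': ['linear'],
               'C': [0.001, 0.01, 0.1, 1, 10, 100]}]

# 6 x 6 RBF candidates + 6 linear candidates
n_candidates = len(ParameterGrid(param_grid))
print("Number of candidates:", n_candidates)

# cv=3 is an assumption made here to shorten the run
svm_search = GridSearchCV(SVC(), param_grid, scoring='accuracy', cv=3)
svm_search.fit(X_train, y_train)

print("Best parameters: {}".format(svm_search.best_params_))
print("Test accuracy: {:.2f}".format(svm_search.score(X_test, y_test)))
```

A single dictionary with all three keys would instead evaluate every `gamma` value for the linear kernel as well, repeating identical fits.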

<br><font color = "darkgreen">
### Random Search CV

< Example of the Random Search process >   
<img align='left' src='http://drive.google.com/uc?export=view&id=19m7DTbD1ltuuydM-uAfjTK5X5qP0eO-C' width=500>

##### Set the parameters for random search #####

In [11]:
#from scipy.stats import randint as sp_randint
param_grid = {'n_neighbors': range(1, 10),  # sp_randint(1, 10)
              'p': range(1,5),
              'weights': ['uniform','distance']}
param_grid

{'n_neighbors': range(1, 10),
 'p': range(1, 5),
 'weights': ['uniform', 'distance']}

##### Random search with cross-validation ####

In [12]:
from sklearn.model_selection import RandomizedSearchCV

rand_search = RandomizedSearchCV(KNeighborsClassifier(), param_distributions=param_grid, 
                                 scoring='accuracy', n_iter=8, random_state=1)

In [13]:
rand_search.fit(X_train, y_train)

RandomizedSearchCV(estimator=KNeighborsClassifier(), n_iter=8,
                   param_distributions={'n_neighbors': range(1, 10),
                                        'p': range(1, 5),
                                        'weights': ['uniform', 'distance']},
                   random_state=1, scoring='accuracy')

##### Evaluate the model with best parameters ####

In [14]:
rand_search.score(X_test, y_test)

0.9888888888888889

In [15]:
print("Best estimator:\n{}".format(rand_search.best_estimator_))

Best estimator:
KNeighborsClassifier(n_neighbors=3, weights='distance')
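
The commented-out `sp_randint` hint in the parameter cell suggests sampling from distributions rather than fixed ranges. A minimal sketch of that variant follows, assuming the same digits split; `param_distributions` and `dist_search` are names introduced here for illustration. Note that `scipy.stats.randint(low, high)` excludes `high`, so `sp_randint(1, 10)` draws integers from 1 to 9.

```python
from scipy.stats import randint as sp_randint
from sklearn.datasets import load_digits
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, random_state=0)

# Distributions are sampled n_iter times; plain lists are sampled uniformly
param_distributions = {'n_neighbors': sp_randint(1, 10),  # integers 1..9
                       'p': sp_randint(1, 5),             # integers 1..4
                       'weights': ['uniform', 'distance']}

dist_search = RandomizedSearchCV(KNeighborsClassifier(),
                                 param_distributions=param_distributions,
                                 scoring='accuracy', n_iter=8, cv=5,
                                 random_state=1)
dist_search.fit(X_train, y_train)

print("Best parameters: {}".format(dist_search.best_params_))
print("Test accuracy: {:.2f}".format(dist_search.score(X_test, y_test)))
```

For integer parameters the two forms cover the same values; distributions matter more for continuous hyperparameters, where a fixed list would restrict the search to a few preset points.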


<br><font color = "darkgreen">
#### Example comparison of Grid Search and Random Search results [Bergstra and Bengio (2012)]

<img align='left' src='http://drive.google.com/uc?export=view&id=1qa6Ouqx31a14N8Y5fE48P1aDgUJnNciC' width=600>

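Compared with Grid Search, Random Search greatly reduces the number of unnecessary repeated evaluations and can also explore, probabilistically, values that lie between the fixed grid intervals. For these reasons it is known to find good hyperparameter values faster.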
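
The coverage argument can be illustrated with a small sketch, assuming a budget of 9 trials over two hyperparameters scaled to [0, 1] (all names here are illustrative). A 3 x 3 grid tries only 3 distinct values of each hyperparameter, while 9 random draws try 9 distinct values of each, which helps when only one of the two hyperparameters really affects the score.

```python
import numpy as np

budget = 9  # same number of trials for both strategies

# Grid search: 3 values per axis, every combination
grid_axis = np.linspace(0.0, 1.0, 3)
grid_points = np.array([(x, y) for x in grid_axis for y in grid_axis])

# Random search: each trial draws both coordinates independently
rng = np.random.default_rng(0)
random_points = rng.uniform(0.0, 1.0, size=(budget, 2))

# Count how many distinct values of the first hyperparameter were tried
grid_distinct_x = len(np.unique(grid_points[:, 0]))
random_distinct_x = len(np.unique(random_points[:, 0]))

print("Trials (grid, random):", len(grid_points), len(random_points))
print("Distinct values of hyperparameter 1 (grid):", grid_distinct_x)
print("Distinct values of hyperparameter 1 (random):", random_distinct_x)
```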

<br><font color = "darkgreen">
### Bayesian Optimization

*If you were picking the next number of trees to evaluate in RF, where would you concentrate?*      
<img align="left" src="https://cdn-images-1.medium.com/max/1600/1*2qDZxQkRoP28CidZtoT-gQ.png" width=400 height=300 alt="Bayesian optimization">

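- Grid Search and Random Search are inefficient in one respect: they make no use of the 'prior knowledge' gained so far, that is, the performance results of the hyperparameter values already evaluated.
- Bayesian Optimization is a method that fully incorporates this 'prior knowledge' each time a new hyperparameter value is chosen for evaluation, while still carrying out the overall search systematically.   
(http://research.sualab.com/introduction/practice/2019/02/19/bayesian-optimization-overview-1.html)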
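
The idea of reusing earlier results can be sketched with a hand-rolled loop. This is only an illustration under stated assumptions, not the method of any particular library: a Gaussian process surrogate from scikit-learn, an expected-improvement acquisition function, and a single hyperparameter (`log10(C)` of an RBF `SVC`, searched in [-3, 2]). The helper names `objective` and `expected_improvement` are introduced here for illustration.

```python
import numpy as np
from scipy.stats import norm
from sklearn.datasets import load_digits
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.svm import SVC

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, random_state=0)

def objective(log_c):
    """Mean 3-fold CV accuracy of an RBF SVC with C = 10**log_c."""
    return cross_val_score(SVC(C=10.0 ** log_c), X_train, y_train, cv=3).mean()

def expected_improvement(candidates, gp, best_score, xi=0.01):
    """Expected improvement over best_score (maximization)."""
    mu, sigma = gp.predict(candidates, return_std=True)
    sigma = np.maximum(sigma, 1e-9)  # avoid division by zero
    improvement = mu - best_score - xi
    z = improvement / sigma
    return improvement * norm.cdf(z) + sigma * norm.pdf(z)

low, high = -3.0, 2.0
candidates = np.linspace(low, high, 200).reshape(-1, 1)

# Start from a few random evaluations (no prior knowledge yet)
rng = np.random.RandomState(0)
tried = [float(x) for x in rng.uniform(low, high, size=3)]
scores = [objective(x) for x in tried]

for _ in range(7):
    # Fit the surrogate to all results so far, then pick the most promising point
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-6,
                                  normalize_y=True, random_state=0)
    gp.fit(np.array(tried).reshape(-1, 1), scores)
    ei = expected_improvement(candidates, gp, max(scores))
    next_x = float(candidates[np.argmax(ei), 0])
    tried.append(next_x)
    scores.append(objective(next_x))

best_log_c = tried[int(np.argmax(scores))]
print("Best log10(C): {:.2f}".format(best_log_c))
print("Best CV score: {:.3f}".format(max(scores)))
```

Each new point is chosen from the surrogate fitted to every previous evaluation, which is the 'prior knowledge' that Grid Search and Random Search ignore.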

<font color="#CC3D3D"><p>
# End