<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Introduction" data-toc-modified-id="Introduction-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Introduction</a></span></li><li><span><a href="#Import-modules" data-toc-modified-id="Import-modules-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Import modules</a></span></li><li><span><a href="#Load-the-dataset" data-toc-modified-id="Load-the-dataset-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Load the dataset</a></span></li><li><span><a href="#Build-a-reference-model" data-toc-modified-id="Build-a-reference-model-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Build a reference model</a></span></li><li><span><a href="#Option-#1---gp_minimize" data-toc-modified-id="Option-#1---gp_minimize-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Option #1 - gp_minimize</a></span></li><li><span><a href="#Option-#2---BayesSearchCV" data-toc-modified-id="Option-#2---BayesSearchCV-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Option #2 - BayesSearchCV</a></span></li><li><span><a href="#References" data-toc-modified-id="References-7"><span class="toc-item-num">7&nbsp;&nbsp;</span>References</a></span></li></ul></div>

# Introduction
<hr style="border:2px solid black"> </hr>

<div class="alert alert-warning">
<font color=black>

**What?** Using Scikit-optimize for hyperparameter tuning

</font>
</div>

# Import modules
<hr style="border:2px solid black"> </hr>

<div class="alert alert-info">
<font color=black>

- With newer scikit-learn releases, `BayesSearchCV` may fail with `TypeError: __init__() got an unexpected keyword argument 'iid'`.
- The problem is discussed in [this scikit-optimize issue](https://github.com/scikit-optimize/scikit-optimize/issues/978).
- One workaround is to pin scikit-learn to an older release (use a virtual environment to keep things tidy):
- `pip uninstall scikit-learn`
- `pip install scikit-learn==0.23.2`

</font>
</div>
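
The compatibility note above can be turned into a quick check. The sketch below uses two hypothetical helpers, `parse_major_minor` and `accepts_iid`, and assumes that the `iid` argument was removed from scikit-learn's search classes in release 0.24. It only inspects a version string and imports neither library.

```python
def parse_major_minor(version):
    """Return (major, minor) from a plain numeric version string such as '0.23.2'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def accepts_iid(sklearn_version):
    """Hypothetical helper: True if this scikit-learn still accepts `iid`.

    Assumption: `iid` was removed from the search classes in 0.24.
    """
    return parse_major_minor(sklearn_version) < (0, 24)


print(accepts_iid("0.23.2"))  # True, the version pinned above
print(accepts_iid("1.2.0"))   # False
```

In the notebook, the same check could be fed with `sklearn.__version__` once the modules are imported.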

In [None]:
import skopt
import sklearn
from pandas import read_csv
from numpy import mean
from numpy import std
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.svm import SVC
from skopt.space import Integer
from skopt.space import Real
from skopt.space import Categorical
from skopt.utils import use_named_args
from skopt import gp_minimize
from skopt import BayesSearchCV

# Getting rid of the warning messages
import warnings
warnings.filterwarnings("ignore")

In [None]:
print('skopt %s' % skopt.__version__)
print('sklearn %s' % sklearn.__version__)

# Load the dataset
<hr style="border:2px solid black"> </hr>

<div class="alert alert-block alert-info">
<font color=black>

- We will use the ionosphere machine learning dataset. 
- This is a standard ML dataset comprising 351 rows of data with 34 numerical input variables and a target variable with two class values, i.e. a binary classification problem.
</font>
</div>

In [None]:
dataframe = read_csv("../AI_learning_GitHub/DATASETS/ionosphere.csv", header=None)
dataframe.head(5)

In [None]:
# split into input and output elements
data = dataframe.values
X, y = data[:, :-1], data[:, -1]
print(X.shape, y.shape)

# Build a reference model
<hr style="border:2px solid black"> </hr>

<div class="alert alert-block alert-info">
<font color=black>

- We'll first build a reference model with the default settings, then tune the following hyperparameters:
- C, the regularization parameter.
- kernel, the type of kernel used in the model.
- degree, used only by the polynomial kernel.
- gamma, the kernel coefficient used by the rbf, poly and sigmoid kernels.

</font>
</div>
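
Not every hyperparameter in the list above matters for every kernel, which is worth keeping in mind when reading the tuning results. The sketch below is a plain-Python illustration; the table `RELEVANT_PARAMS` and the helper `effective_config` are hypothetical names, and the mapping is an assumption based on the scikit-learn `SVC` documentation.

```python
# Which of the tuned SVC hyperparameters each kernel actually uses.
# Assumption: `degree` applies only to 'poly', and `gamma` applies to
# 'rbf', 'poly' and 'sigmoid', as described in the SVC documentation.
RELEVANT_PARAMS = {
    "linear": {"C"},
    "poly": {"C", "degree", "gamma"},
    "rbf": {"C", "gamma"},
    "sigmoid": {"C", "gamma"},
}


def effective_config(params):
    """Drop hyperparameters that the chosen kernel ignores."""
    used = RELEVANT_PARAMS[params["kernel"]] | {"kernel"}
    return {name: value for name, value in params.items() if name in used}


# hypothetical configuration, for illustration only
config = {"C": 1.0, "kernel": "rbf", "degree": 3, "gamma": 0.1}
print(effective_config(config))  # {'C': 1.0, 'kernel': 'rbf', 'gamma': 0.1}
```

Two configurations that differ only in an ignored parameter describe the same model, so the optimizer may spend some evaluations on effectively duplicate points.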

In [None]:
# define the model
model = SVC()
# define test harness
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
# evaluate model
m_scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv,
                           n_jobs=-1, error_score='raise')
print('Accuracy: %.3f (%.3f)' % (mean(m_scores), std(m_scores)))
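
The summary line printed by the cell above aggregates one score per fold and repeat. The sketch below reproduces that aggregation with the standard library only; the scores are made up for illustration, and `pstdev` is used on the assumption that it matches the default behaviour of `numpy.std` (population standard deviation).

```python
from statistics import fmean, pstdev

n_splits, n_repeats = 10, 3
# hypothetical per-fold accuracies, one per (repeat, fold) pair
scores = [0.94, 0.91, 0.97] * (n_splits * n_repeats // 3)

print(len(scores))  # 30 scores: 10 folds x 3 repeats
# pstdev is the population standard deviation (no Bessel correction)
summary = "Accuracy: %.3f (%.3f)" % (fmean(scores), pstdev(scores))
print(summary)  # Accuracy: 0.940 (0.024)
```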

# Option #1 - gp_minimize
<hr style="border:2px solid black"> </hr>

In [None]:
# define the space of hyperparameters to search
search_space = list()
search_space.append(Real(1e-6, 100.0, 'log-uniform', name='C'))
search_space.append(Categorical(['linear', 'poly', 'rbf', 'sigmoid'], name='kernel'))
search_space.append(Integer(1, 5, name='degree'))
search_space.append(Real(1e-6, 100.0, 'log-uniform', name='gamma'))
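
Both `C` and `gamma` use a `'log-uniform'` prior because their useful values span several orders of magnitude. The sketch below illustrates the idea with the standard library only: the helper `sample_log_uniform` is hypothetical and is not scikit-optimize's own sampler.

```python
import math
import random


def sample_log_uniform(low, high, rng):
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(1)  # fixed seed so the sketch is repeatable
samples = [sample_log_uniform(1e-6, 100.0, rng) for _ in range(10_000)]

# The geometric midpoint of [1e-6, 100] is 1e-2; about half the draws fall below it.
below_midpoint = sum(s < 1e-2 for s in samples) / len(samples)
# Six of the eight decades lie below 1.0, so about 75% of the draws do too.
# A plain uniform prior would put only about 1% of its draws there.
below_one = sum(s < 1.0 for s in samples) / len(samples)
print(round(below_midpoint, 2), round(below_one, 2))
```

With a uniform prior, almost all candidate values of `C` would land between 1 and 100 and the small values would rarely be tried.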

In [None]:
"""
In our case, we want to evaluate the model using repeated stratified 
10-fold cross-validation on our ionosphere dataset.
"""

In [None]:
# define the function used to evaluate a given configuration
@use_named_args(search_space)
def evaluate_model(**params):
    # configure the model with specific hyperparameters
    model = SVC()
    model.set_params(**params)
    # define test harness
    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
    # run repeated stratified 10-fold cross-validation
    result = cross_val_score(model, X, y, cv=cv, n_jobs=-1, scoring='accuracy')
    # calculate the mean of the scores
    estimate = mean(result)
    # convert from a maximizing score to a minimizing score
    return 1.0 - estimate
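
Two ideas are packed into the function above: the optimizer passes a plain list of values that must be mapped to named keyword arguments, and an accuracy must be turned into a loss. The sketch below shows both with a simplified stand-in decorator, `named_args`, which is hypothetical and is not the real `use_named_args` implementation; the accuracy values are made up.

```python
import functools


def named_args(names):
    """Simplified stand-in for skopt's `use_named_args` (not its real code)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(values):
            # pair each positional value with its dimension name
            return func(**dict(zip(names, values)))
        return wrapper
    return decorator


@named_args(["C", "kernel", "degree", "gamma"])
def toy_objective(C, kernel, degree, gamma):
    # hypothetical accuracy, made up purely for illustration
    accuracy = 0.9 if kernel == "rbf" else 0.8
    # a minimizer needs a loss, so return 1 - accuracy
    return 1.0 - accuracy


loss = toy_objective([1.0, "rbf", 3, 0.1])
print(round(loss, 3))  # 0.1
```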

In [None]:
# perform optimization
result = gp_minimize(evaluate_model, search_space)
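
The call above can take a while, because every objective evaluation runs a full repeated cross-validation. The arithmetic below is a rough cost estimate; `total_model_fits` is a hypothetical helper, and the figure of 100 assumes the documented default of the `n_calls` argument.

```python
def total_model_fits(n_calls, n_splits, n_repeats):
    """Each objective call runs one full repeated k-fold evaluation."""
    return n_calls * n_splits * n_repeats


# Assumption: n_calls=100 is gp_minimize's documented default.
print(total_model_fits(n_calls=100, n_splits=10, n_repeats=3))  # 3000
# A smaller, hypothetical budget for a quick trial run.
print(total_model_fits(n_calls=20, n_splits=10, n_repeats=3))   # 600
```

Passing a smaller `n_calls`, together with a fixed `random_state`, is one way to get a faster and repeatable trial run before committing to the full budget.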

In [None]:
dir(result)

In [None]:
# summarize the findings
print('Best Accuracy: %.3f' % (1.0 - result.fun))
print('Best Parameters: %s' % (result.x))
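
`result.x` is a plain list whose order follows the order in which the dimensions were added to the search space, so the printed values are easier to read once paired with their names. A minimal sketch, using a made-up stand-in for `result.x`:

```python
# Names in the same order as the dimensions were appended to the search space.
names = ["C", "kernel", "degree", "gamma"]
# Hypothetical stand-in for `result.x`; real values depend on the run.
best_x = [1.29, "rbf", 2, 0.18]

best_params = dict(zip(names, best_x))
print(best_params)  # {'C': 1.29, 'kernel': 'rbf', 'degree': 2, 'gamma': 0.18}
```

In the notebook, the names could be read from the search space itself, for example with `[dim.name for dim in search_space]`, assuming each dimension was created with a `name`.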

# Option #2 - BayesSearchCV
<hr style="border:2px solid black"> </hr>

In [None]:
# define search space
params = dict()
params['C'] = (1e-6, 100.0, 'log-uniform')
params['gamma'] = (1e-6, 100.0, 'log-uniform')
params['degree'] = (1,5)
params['kernel'] = ['linear', 'poly', 'rbf', 'sigmoid']
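
The dictionary above uses a shorthand instead of explicit `Real`, `Integer` and `Categorical` objects. The sketch below shows how such specs can be read; `describe_dimension` is a hypothetical helper and only a simplified paraphrase of the inference rules, not scikit-optimize's actual code.

```python
from numbers import Integral, Real


def describe_dimension(spec):
    """Hypothetical helper: name the dimension type a shorthand spec suggests."""
    if isinstance(spec, tuple) and len(spec) in (2, 3):
        low, high = spec[0], spec[1]
        prior = spec[2] if len(spec) == 3 else "uniform"
        # two integer bounds suggest an integer range
        if isinstance(low, Integral) and isinstance(high, Integral):
            return "Integer(%r, %r, prior=%r)" % (low, high, prior)
        # any other pair of numeric bounds suggests a continuous range
        if isinstance(low, Real) and isinstance(high, Real):
            return "Real(%r, %r, prior=%r)" % (low, high, prior)
    # everything else is treated as a list of discrete choices
    return "Categorical(%r)" % (list(spec),)


params = {
    "C": (1e-6, 100.0, "log-uniform"),
    "gamma": (1e-6, 100.0, "log-uniform"),
    "degree": (1, 5),
    "kernel": ["linear", "poly", "rbf", "sigmoid"],
}
for name, spec in params.items():
    print(name, "->", describe_dimension(spec))
```

Read this way, the dictionary describes the same four dimensions as the explicit search space used for `gp_minimize`.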

In [None]:
# define evaluation
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)

In [None]:
# define the search
search = BayesSearchCV(estimator=SVC(), search_spaces=params, n_jobs=-1, cv=cv, iid=False)

In [None]:
# perform the search
search.fit(X, y)

In [None]:
dir(search)

In [None]:
# report the best result
print(search.best_score_)
print(search.best_params_)

# References
<hr style="border:2px solid black"> </hr>

<div class="alert alert-warning">
<font color=black>

- Reference: https://machinelearningmastery.com/scikit-optimize-for-hyperparameter-tuning-in-machine-learning/ 

</font>
</div>