In this example we will use Optunity to optimize hyperparameters for a
support vector machine classifier (SVC) in scikit-learn. We will learn a
model to distinguish digits 8 and 9 in the handwritten digits data set
bundled with scikit-learn (an MNIST-like collection of 8x8 images) in two
settings:

- `tune SVM with RBF kernel <#rbf>`__
- `tune SVM with RBF, polynomial or linear kernel <#all>`__, that is
  choose the kernel function and its hyperparameters at once

.. code:: python

    import optunity
    import optunity.metrics

    # comment out this line if you are not running in a notebook
    %matplotlib inline
    import matplotlib.pyplot as plt

    import sklearn.svm
    import numpy as np

Create the data set: we use scikit-learn's handwritten digits data set
and will build models to distinguish digits 8 and 9.

.. code:: python

    from sklearn.datasets import load_digits

    digits = load_digits()
    n = digits.data.shape[0]

    positive_digit = 8
    negative_digit = 9

    positive_idx = [i for i in range(n) if digits.target[i] == positive_digit]
    negative_idx = [i for i in range(n) if digits.target[i] == negative_digit]

    # add some noise to the data to make it a little challenging
    original_data = digits.data[positive_idx + negative_idx, ...]
    data = original_data + 5 * np.random.randn(original_data.shape[0], original_data.shape[1])
    labels = [True] * len(positive_idx) + [False] * len(negative_idx)

First, let's see the performance of an SVC with default hyperparameters.

.. code:: python

    # compute area under ROC curve of default parameters
    @optunity.cross_validated(x=data, y=labels, num_folds=5)
    def svm_default_auroc(x_train, y_train, x_test, y_test):
        model = sklearn.svm.SVC().fit(x_train, y_train)
        decision_values = model.decision_function(x_test)
        auc = optunity.metrics.roc_auc(y_test, decision_values)
        return auc

    svm_default_auroc()

.. parsed-literal::

    0.7369062164842962

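To make the score above easier to interpret, here is a minimal pure-Python
sketch of what an area under the ROC curve measures: the fraction of
(positive, negative) pairs in which the positive example receives the higher
decision value, with ties counted as one half. The function name
``roc_auc_sketch`` is hypothetical and this is not Optunity's implementation
of ``optunity.metrics.roc_auc``; it only illustrates the quantity.

.. code:: python

    def roc_auc_sketch(labels, decision_values):
        """Fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
        positives = [s for s, y in zip(decision_values, labels) if y]
        negatives = [s for s, y in zip(decision_values, labels) if not y]
        if not positives or not negatives:
            raise ValueError("need at least one positive and one negative example")
        wins = 0.0
        for p in positives:
            for q in negatives:
                if p > q:
                    wins += 1.0
                elif p == q:
                    wins += 0.5
        return wins / (len(positives) * len(negatives))

    y = [True, True, False, False]
    print(roc_auc_sketch(y, [2.0, 0.5, 1.0, -1.0]))  # 3 of 4 pairs ordered correctly: 0.75
    print(roc_auc_sketch(y, [1.0, 1.0, 1.0, 1.0]))   # constant scores carry no information: 0.5

A value of 0.5 therefore corresponds to a ranking no better than chance, and
1.0 to a perfect separation of the two digits.
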
Tune SVC with RBF kernel
=========================

In order to use Optunity to optimize hyperparameters, we start by
defining the objective function. We will use 5-fold cross-validated area
under the ROC curve. For now, let's restrict ourselves to the RBF kernel
and optimize :math:`C` and :math:`\gamma`.

The objective function ``svm_rbf_tuned_auroc()`` accepts :math:`C` and
:math:`\gamma` as arguments.

.. code:: python

    # we will make the cross-validation decorator once, so we can reuse it later for the other tuning task
    # by reusing the decorator, we get the same folds etc.
    cv_decorator = optunity.cross_validated(x=data, y=labels, num_folds=5)

    def svm_rbf_tuned_auroc(x_train, y_train, x_test, y_test, C, gamma):
        model = sklearn.svm.SVC(C=C, gamma=gamma).fit(x_train, y_train)
        decision_values = model.decision_function(x_test)
        auc = optunity.metrics.roc_auc(y_test, decision_values)
        return auc

    svm_rbf_tuned_auroc = cv_decorator(svm_rbf_tuned_auroc)
    # this is equivalent to the more common syntax below
    # @optunity.cross_validated(x=data, y=labels, num_folds=5)
    # def svm_rbf_tuned_auroc...

    svm_rbf_tuned_auroc(C=1.0, gamma=1.0)

.. parsed-literal::

    0.5

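The idea of fixing one partition and reusing it for several objective
functions can be sketched in plain Python. The helpers ``make_folds`` and
``cross_validate`` below are hypothetical names chosen for illustration and
do not reflect how Optunity builds its folds internally; the sketch only
shows why sharing a partition makes scores from different objectives
directly comparable.

.. code:: python

    import random

    def make_folds(num_instances, num_folds, seed=0):
        """Randomly partition indices 0..num_instances-1 into near-equal folds."""
        indices = list(range(num_instances))
        random.Random(seed).shuffle(indices)
        return [indices[i::num_folds] for i in range(num_folds)]

    def cross_validate(score_fold, folds):
        """Average score_fold(train_idx, test_idx), holding out each fold once."""
        scores = []
        for k, test_idx in enumerate(folds):
            train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
            scores.append(score_fold(train_idx, test_idx))
        return sum(scores) / len(scores)

    # build the partition once ...
    folds = make_folds(num_instances=10, num_folds=5)

    # ... and evaluate two different toy "objectives" on exactly the same splits
    mean_test_size = cross_validate(lambda train, test: len(test), folds)
    mean_train_size = cross_validate(lambda train, test: len(train), folds)
    print(mean_test_size, mean_train_size)  # 2.0 8.0

In the tutorial, ``cv_decorator`` plays the role of the shared ``folds``
object: every function it wraps is scored on the same train/test splits.
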
Now we can use Optunity to find the hyperparameters that maximize AUROC.

.. code:: python

    optimal_rbf_pars, info, _ = optunity.maximize(svm_rbf_tuned_auroc, num_evals=150, C=[0, 10], gamma=[0, 0.1])
    # when running this outside of IPython we can parallelize via optunity.pmap
    # optimal_rbf_pars, _, _ = optunity.maximize(svm_rbf_tuned_auroc, 150, C=[0, 10], gamma=[0, 0.1], pmap=optunity.pmap)

    print("Optimal parameters: " + str(optimal_rbf_pars))
    print("AUROC of tuned SVM with RBF kernel: %1.3f" % info.optimum)

.. parsed-literal::

    Optimal parameters: {'C': 2.4115966784371956, 'gamma': 0.0008425236216627763}
    AUROC of tuned SVM with RBF kernel: 0.982

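A plausible reading of the gap between :math:`\gamma = 1` (AUROC 0.5) and the
tuned :math:`\gamma \approx 0.0008` follows from the RBF kernel
:math:`k(x, x') = \exp(-\gamma \lVert x - x' \rVert^2)`. The sketch below is
an illustration under stated assumptions, not part of the original analysis:
it considers only the Gaussian noise added to the data (standard deviation 5
on each of the 64 pixel features of the 8x8 images) and ignores differences
between the underlying digits.

.. code:: python

    import math

    num_features = 64   # the digit images are 8x8 pixels
    noise_std = 5.0     # scale of the noise added when building the data set

    # The difference of two independent noise vectors has variance
    # 2 * noise_std**2 per feature, so noise alone contributes an expected
    # squared distance of:
    expected_sq_dist = num_features * 2 * noise_std ** 2

    def rbf_kernel_value(gamma, squared_distance):
        """RBF kernel value exp(-gamma * ||x - x'||^2)."""
        return math.exp(-gamma * squared_distance)

    for gamma in (1.0, 0.0008425236216627763):
        print(gamma, rbf_kernel_value(gamma, expected_sq_dist))

With :math:`\gamma = 1` the kernel value between two distinct points
underflows to zero, so every test point looks unrelated to every training
point and the decision values carry little ranking information. With the
tuned value the kernel stays on the order of 0.07 at the same distance, which
leaves room for the classes to be told apart.
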
Tune SVC without deciding the kernel in advance
================================================

In the previous part we chose to use an RBF kernel. Even though the RBF
kernel is known to work well for a large variety of problems (and
yielded a good AUROC here), our choice was somewhat arbitrary.

We will now use Optunity's conditional hyperparameter optimization
feature to optimize over all kernel functions and their associated
hyperparameters at once. This requires us to define the search space.

.. code:: python

    space = {'kernel': {'linear': {'C': [0, 2]},
                        'rbf': {'gamma': [0, 0.1], 'C': [0, 10]},
                        'poly': {'degree': [2, 5], 'C': [0, 5], 'coef0': [0, 2]}
                        }
             }

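The nested dictionary reads as: the top-level key names a choice
(``kernel``), each option of that choice maps to the box constraints that
only apply when the option is selected. The helper names below are
hypothetical and only walk the dictionary; they are not part of Optunity's
API.

.. code:: python

    space = {'kernel': {'linear': {'C': [0, 2]},
                        'rbf': {'gamma': [0, 0.1], 'C': [0, 10]},
                        'poly': {'degree': [2, 5], 'C': [0, 5], 'coef0': [0, 2]}
                        }
             }

    def active_hyperparameters(space):
        """Map each kernel option to the box constraints active for that option."""
        return {kernel: dict(boxes) for kernel, boxes in space['kernel'].items()}

    def all_hyperparameter_names(space):
        """Every hyperparameter that appears under at least one kernel."""
        names = set()
        for boxes in space['kernel'].values():
            names.update(boxes)
        return sorted(names)

    for kernel, boxes in active_hyperparameters(space).items():
        print(kernel, sorted(boxes))
    print(all_hyperparameter_names(space))

Note that ``C`` appears under every kernel but with a different range each
time, while ``gamma``, ``degree`` and ``coef0`` are each specific to a single
kernel.
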
We will also have to modify the objective function to cope with
conditional hyperparameters. We need to do this explicitly because
scikit-learn does not accept ``None`` values for irrelevant
hyperparameters (e.g. ``degree`` when using an RBF kernel), and Optunity
sets every hyperparameter that is irrelevant for the chosen kernel to
``None``.

.. code:: python

    def train_model(x_train, y_train, kernel, C, gamma, degree, coef0):
        """A generic SVM training function, with arguments based on the chosen kernel."""
        if kernel == 'linear':
            model = sklearn.svm.SVC(kernel=kernel, C=C)
        elif kernel == 'poly':
            model = sklearn.svm.SVC(kernel=kernel, C=C, degree=degree, coef0=coef0)
        elif kernel == 'rbf':
            model = sklearn.svm.SVC(kernel=kernel, C=C, gamma=gamma)
        else:
            # ValueError is the built-in exception for an invalid argument value
            raise ValueError("Unknown kernel function: %s" % kernel)
        model.fit(x_train, y_train)
        return model

    def svm_tuned_auroc(x_train, y_train, x_test, y_test, kernel='linear', C=0, gamma=0, degree=0, coef0=0):
        model = train_model(x_train, y_train, kernel, C, gamma, degree, coef0)
        decision_values = model.decision_function(x_test)
        return optunity.metrics.roc_auc(y_test, decision_values)

    svm_tuned_auroc = cv_decorator(svm_tuned_auroc)

Now we are ready to go and optimize both kernel function and associated
hyperparameters!

.. code:: python

    optimal_svm_pars, info, _ = optunity.maximize_structured(svm_tuned_auroc, space, num_evals=150)
    print("Optimal parameters: " + str(optimal_svm_pars))
    print("AUROC of tuned SVM: %1.3f" % info.optimum)

.. parsed-literal::

    Optimal parameters: {'kernel': 'poly', 'C': 0.94482421875, 'coef0': 1.2373046875, 'gamma': None, 'degree': 3.99072265625}
    AUROC of tuned SVM: 0.980

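The reported optimum still carries ``'gamma': None`` because ``gamma`` is not
active for the polynomial kernel. A minimal sketch of stripping such inactive
entries before reusing the result elsewhere is shown below; ``drop_inactive``
is a hypothetical helper, not an Optunity function.

.. code:: python

    # the optimum reported above, copied verbatim
    optimal_svm_pars = {'kernel': 'poly', 'C': 0.94482421875, 'coef0': 1.2373046875,
                        'gamma': None, 'degree': 3.99072265625}

    def drop_inactive(params):
        """Keep only hyperparameters that were active, i.e. not set to None."""
        return {name: value for name, value in params.items() if value is not None}

    active_pars = drop_inactive(optimal_svm_pars)
    print(sorted(active_pars))  # ['C', 'coef0', 'degree', 'kernel']

Note also that ``degree`` comes back as a real number, since the search runs
over the continuous box ``[2, 5]``. scikit-learn documents ``degree`` as an
integer, so rounding it before refitting a final model may be necessary
depending on the installed version.
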