---
title: "mlrHyperopt"
author: "Jakob Richter"
output:
  html_document:
    toc: true
    toc_float:
      collapsed: false
      smooth_scroll: false
vignette: >
  %\VignetteIndexEntry{mlrHyperopt}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r setup, include = FALSE, cache = FALSE}
set.seed(123)
knitr::opts_chunk$set(cache = TRUE, collapse = FALSE)
library(mlrHyperopt)
configureMlr(show.learner.output = FALSE)
```
This vignette gives you a short introduction to the key features of `mlrHyperopt`.
For up-to-date information make sure to check the GitHub project page:
- [Project Page](https://github.com/jakob-r/mlrHyperopt/)
- [Overview](http://mlrhyperopt.jakob-r.de/parconfigs) of the online database.
## Purpose
The main goal of `mlrHyperopt` is to break boundaries and make hyperparameter optimization super easy.
Beginners in machine learning, and even experts, often don't know which parameters have to be tuned for a given machine learning method.
Experts also don't necessarily agree on the tuning parameters and their ranges.
This package tries to tackle these problems by offering:
- Recommended parameter space configurations for the most common learners.
- A fully automatic, *zero conf*, one-line hyperparameter configuration.
- Option to **upload** a good parameter space configuration for a specific learner to share with colleagues and researchers.
- Possibility to **download** publicly available parameter space configurations for the machine learning method of your choice.
- An extensible interface to use the full variety of [mlr](http://mlr-org.github.com/mlr) learners and tuning options.
- Namely: *grid search*, *CMA-ES*, *[model-based optimization](http://mlr-org.github.com/mlr/mlrMBO)* and *random search* (see the sketch after this list).
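These strategies map onto `mlr`'s tune control objects. As a point of reference, the following sketch shows the corresponding `mlr` constructors (the budget arguments are illustrative; CMA-ES and MBO additionally require the `cmaes` and `mlrMBO` packages):
```{r tuneControls, eval=FALSE}
library(mlr)
# mlr's tune control objects behind the four strategies (illustrative budgets)
ctrl.grid   = makeTuneControlGrid(resolution = 10)  # grid search
ctrl.random = makeTuneControlRandom(maxit = 50)     # random search
ctrl.cmaes  = makeTuneControlCMAES(budget = 50)     # CMA-ES
ctrl.mbo    = makeTuneControlMBO()                  # model-based optimization
```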
## Requirements
As the name indicates, `mlrHyperopt` relies heavily on `mlr`.
Additionally, `mlrMBO` is used automatically for purely numeric parameter spaces of dimension 2 or higher.
Most of the objects used here are documented in `mlr`.
The mlr-tutorial explains how to create your own [Learning Tasks](http://mlr-org.github.io/mlr-tutorial/release/html/task/index.html), [Learners](http://mlr-org.github.io/mlr-tutorial/release/html/learner/index.html), [Tuning Parameter Sets for Learners](http://mlr-org.github.io/mlr-tutorial/release/html/tune/index.html), as well as [custom resampling strategies](http://mlr-org.github.io/mlr-tutorial/release/html/resample/index.html).
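For example, a classification task for your own data can be constructed directly with `mlr`'s `makeClassifTask` (a minimal sketch using the built-in `iris` data; the `id` is an arbitrary label):
```{r ownTask, eval=FALSE}
library(mlr)
# A classification task from a data.frame and the name of the target column
my.task = makeClassifTask(id = "my_iris", data = iris, target = "Species")
my.task
```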
## Getting started
Hyperparameter tuning with `mlrHyperopt` can be done in one line:
```{r oneLineExample, message=FALSE, warning=FALSE}
library(mlrHyperopt)
res = hyperopt(iris.task, learner = "classif.randomForest")
res
```
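Assuming the result is a regular `mlr` `TuneResult` (which the printed output above suggests), the best settings and their resampled performance can be accessed directly (a sketch):
```{r inspectResult, eval=FALSE}
res$x  # best hyperparameter settings found
res$y  # resampled performance of those settings
```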
To obtain full control over what is happening, you can define every argument yourself or rely only partially on the automatic defaults.
```{r detailedExample, warning=FALSE}
pc = generateParConfig(learner = "classif.randomForest")
# The tuning parameter set:
getParConfigParSet(pc)
# Setting constant values:
pc = setParConfigParVals(pc, par.vals = list(mtry = 3))
hc = generateHyperControl(task = iris.task, par.config = pc)
# Inspecting the resampling strategy used for tuning
getHyperControlResampling(hc)
# Changing the resampling strategy
hc = setHyperControlResampling(hc, makeResampleDesc("Bootstrap", iters = 3))
# Starting the hyperparameter tuning
res = hyperopt(iris.task, par.config = pc, hyper.control = hc, show.info = FALSE)
res
```
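After tuning you will usually want to fit a final model with the best settings. Assuming `res` carries the usual `mlr` `TuneResult` fields, this can be done with `mlr` directly (a sketch):
```{r finalModel, eval=FALSE}
# Apply the best settings to the learner and train a final model
lrn = makeLearner("classif.randomForest")
lrn = setHyperPars(lrn, par.vals = res$x)
mod = train(lrn, iris.task)
```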
## Sharing Search Spaces
The predefined parameter search spaces in this package do not cover all _learners_ available in [`mlr`](https://github.com/mlr-org/mlr#-machine-learning-in-r), and they don't claim to be the best search spaces either.
You might therefore want to share your _ParConfig_, which bundles the search space and constant parameter settings for a certain _mlr learner_, as it will help other people to improve their results.
This way you can also share the _ParConfig_ with colleagues and your future self.
At the same time you can benefit from the [online database](http://mlrhyperopt.jakob-r.de/status.php) when you want to use an _mlr learner_ whose tunable parameters you are not aware of.
### Uploading a Parameter Configuration
A Parameter Configuration consists of a _ParamSet_, an associated _learner_ (given either by the general `learner.name` or a concrete `learner`) and optionally some specific parameter settings (`par.vals`).
Using just the `learner.name` indicates that you might want to use this _ParConfig_ for the regression as well as the classification version of the learner.
```{r upload, eval=FALSE}
# Search space for mtry, defined relative to the number of features p
par.set = makeParamSet(
  makeIntegerParam(
    id = "mtry",
    lower = expression(floor(p^0.25)),
    upper = expression(ceiling(p^0.75)),
    default = expression(round(p^0.5))),
  keys = "p")
# Bundle the search space, constant settings and the learner name
par.config = makeParConfig(
  par.set = par.set,
  par.vals = list(ntree = 200),
  learner.name = "randomForest"
)
# Returns the id under which the ParConfig is stored in the database
uploadParConfig(par.config, "jon.doe@example.com")
```
`uploadParConfig` returns the id under which the configuration is stored.
With this id you can later download this specific _ParConfig_.
### Downloading Parameter Configurations
You can download a specific parameter configuration using the id:
```{r downloadParConfig}
downloadParConfig("1")
```
If you are looking for parameter configurations for your learner, you can simply run:
```{r surveyServer}
my.learner = makeLearner("classif.svm")
# only configurations for the classification SVM
svm.configs = downloadParConfigs(learner.class = getLearnerClass(my.learner))
svm.configs
# all SVM configurations, regardless of task type
svm.configs = downloadParConfigs(learner.name = getLearnerName(my.learner))
```
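Assuming `downloadParConfigs` returns a list of _ParConfigs_ and that the first entry fits your task type, a downloaded configuration can be passed straight to `hyperopt` (a sketch):
```{r useDownloaded, eval=FALSE}
# Tune with a downloaded configuration instead of the package default
pc = svm.configs[[1]]
res = hyperopt(iris.task, par.config = pc)
```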
You can also query for custom key-value pairs:
```{r surveyServerCustom}
user.configs = downloadParConfigs(custom.query = list("user_email" = "jon.doe@example.com"))
```