Change default sampling algorithm #2964

Closed
3 of 6 tasks
not522 opened this issue Sep 28, 2021 · 4 comments
Labels
feature Change that does not break compatibility, but affects the public interfaces. no-stale Exempt from stale bot optuna.samplers Related to the `optuna.samplers` submodule. This is automatically labeled by github-actions.

Comments

@not522
Member

not522 commented Sep 28, 2021

Motivation

The default sampler, TPESampler, has options that are expected to improve optimization performance in many situations. For Optuna v3, we are considering changing the default options so that users benefit from these more effective algorithms without extra configuration.

Description

@not522 not522 added feature Change that does not break compatibility, but affects the public interfaces. no-stale Exempt from stale bot v3 Issue/PR for Optuna version 3. labels Sep 28, 2021
@Crissman Crissman added Hacktoberfest Good first issues to participate in Hacktoberfest contribution-welcome Issue that welcomes contribution. labels Sep 30, 2021
@HideakiImamura
Member

Benchmark Task Candidates and Scenarios.

We currently have the following candidate benchmark tasks.

  • kurobako: HPO bench, NAS bench, and SigOpt functions in it.
  • Bayesmark: hyperparameter tuning tasks of sklearn models in it.

We would like to consider the following scenarios for each benchmark task.

  • Run as many trials as possible, and create tables of best_param, best_value, etc. at each step (e.g., in increments of 100 trials).
  • Run the optimization not once but multiple times, e.g. 50 times. The optimization history plot should be given with error bars. The parallel coordinate plot and parameter importance plot will be shown for the runs that achieved the maximum, minimum, and median best_value.
  • Experiment with several sampler parameters, such as with or without constant liar, with or without multivariate, with or without distributed optimization, and with different numbers of parallel workers.

@toshihikoyanase toshihikoyanase removed the Hacktoberfest Good first issues to participate in Hacktoberfest label Nov 11, 2021
@HideakiImamura HideakiImamura added the optuna.samplers Related to the `optuna.samplers` submodule. This is automatically labeled by github-actions. label Dec 20, 2021
@g-votte g-votte removed the contribution-welcome Issue that welcomes contribution. label May 20, 2022
@HideakiImamura
Member

HideakiImamura commented Aug 9, 2022

I have collected exhaustive benchmark results to review Optuna's sampler defaults. The results and discussion are described here. Note that the experiments targeted single-objective optimization only. The benchmark environment for multi-objective optimization is still under development, and collecting exhaustive benchmark results for it is left as future work.

The following three types of experiments were performed:

  • Determination of the default argument for each sampler and determination of the default sampler for Optuna from several samplers. Verification of the multivariate argument, especially for TPESampler.
  • Checking that there is no speed reduction in the default arguments.
  • Benchmarking during distributed optimization. In particular, verification of the constant_liar argument of TPESampler.

TL;DR

  • Benchmark experiments showed that the current TPESampler performs well in many settings. In particular, we found that setting the multivariate option to True improved performance in many cases. However, we also found that different samplers are good at different types of problems and number of trials.
  • The default value of the multivariate option in TPESampler is currently False, but setting it to True sometimes resulted in worse performance on high-dimensional problems. Since changing the default argument has a significant impact on users, we decided to be conservative and not change the default value at this stage.
  • We found that setting the multivariate argument to True speeds up sampling rather than slowing it down.
  • In the case of distributed optimization, we found that setting the constant_liar option of TPESampler to True does not necessarily improve performance. For some types of problems, BoTorchSampler was found to perform better.
  • The default value of the constant_liar option of TPESampler is currently False, and we decided not to change it to True.

Summary of Benchmark Environment

During the development of Optuna V3 we have prepared several backend benchmark scripts in optuna/benchmarks. Thank you!!! @kei-mo, @xadrianzetx, @drumehiron

These benchmark scripts can be run locally or on GitHub Actions. Since the experiments we conducted this time dealt with a very large number of studies, trials, problems, and samplers, it was difficult to complete all the experiments on GitHub Actions, so we conducted the experiments on a computational cluster owned by Preferred Networks, Inc. However, you can easily reproduce our experiments using our benchmark scripts and the GitHub Actions runtime environment. (Of course, due to GitHub Actions resource limitations, this will be a partial reproduction.) After reading the details of the experiment described below, if you have any doubts about its reproducibility, you can always try to reproduce the experiment. Also, if you are interested in settings not covered in the following experiments, we encourage you to conduct your own experiments. For actual instructions on how to perform them, please click here.

Summary of Benchmark Experiment

Study and Trial

For each combination of problem and sampler, a study consists of 1000 trials. The only exception is BoTorchSampler, for which each study has only 100 trials. We ran 100 studies for each configuration.

Problems

We have prepared a total of 178 problems. They are divided into four main types: hpobench, nasbench, bayesmark, and sigopt. The backend of hpobench, nasbench's NASBench 101, and sigopt is software called kurobako, which is part of the Optuna organization. The backend for nasbench's NASBench 201 is software called naslib. The backend for bayesmark is software called bayesmark. The breakdown is 4 hpobench, 4 nasbench, 35 bayesmark, and 135 sigopt problems. Among the higher-dimensional problems (10 or more dimensions), sigopt contains 14 problems each at 10, 30, 50, and 100 dimensions, plus one problem each at 11 and 12 dimensions.
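
The breakdown above can be checked with a few lines of arithmetic. This is only a sanity check of the quoted numbers; the variable names are chosen for illustration.

```python
# Number of problems per type, as quoted in the text above.
problems_per_type = {"hpobench": 4, "nasbench": 4, "bayesmark": 35, "sigopt": 135}
total = sum(problems_per_type.values())

# High-dimensional sigopt problems: 14 function families at each of four
# dimensions, plus one 11-dimensional and one 12-dimensional problem.
high_dim = 14 * len([10, 30, 50, 100]) + 1 + 1

print(total)     # 178
print(high_dim)  # 58
```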

The table below summarizes the characteristics of each problem.

Problem Table
Name Type Dimension of search space Type of search space Is pruning possible? Is distributed optimization possible?
HPOBench (naval) hpobench 9 int, categorical yes yes
HPOBench (parkinsons) hpobench 9 int, categorical yes yes
HPOBench (protein) hpobench 9 int, categorical yes yes
HPOBench (slice) hpobench 9 int, categorical yes yes
NASBench 101 nasbench 7 float, int, categorical yes yes
NASBench 201 (cifar10) nasbench 6 categorical yes yes
NASBench 201 (cifar100) nasbench 6 categorical yes yes
NASBench 201 (ImageNet16-120) nasbench 6 categorical yes yes
HPO bayesmark (breast, kNN) bayesmark 2 int no no
HPO bayesmark (breast, SVM) bayesmark 3 float no no
HPO bayesmark (breast, DT) bayesmark 6 float, int no no
HPO bayesmark (breast, RF) bayesmark 6 float, int no no
HPO bayesmark (breast, MLP-sgd) bayesmark 8 float, int no no
HPO bayesmark (breast, ada) bayesmark 2 float, int no no
HPO bayesmark (breast, linear) bayesmark 2 float no no
HPO bayesmark (digits, kNN) bayesmark 2 int no no
HPO bayesmark (digits, SVM) bayesmark 3 float no no
HPO bayesmark (digits, DT) bayesmark 6 float, int no no
HPO bayesmark (digits, RF) bayesmark 6 float, int no no
HPO bayesmark (digits, MLP-sgd) bayesmark 8 float, int no no
HPO bayesmark (digits, ada) bayesmark 2 float, int no no
HPO bayesmark (digits, linear) bayesmark 2 float no no
HPO bayesmark (iris, kNN) bayesmark 2 int no no
HPO bayesmark (iris, SVM) bayesmark 3 float no no
HPO bayesmark (iris, DT) bayesmark 6 float, int no no
HPO bayesmark (iris, RF) bayesmark 6 float, int no no
HPO bayesmark (iris, MLP-sgd) bayesmark 8 float, int no no
HPO bayesmark (iris, ada) bayesmark 2 float, int no no
HPO bayesmark (iris, linear) bayesmark 2 float no no
HPO bayesmark (wine, kNN) bayesmark 2 int no no
HPO bayesmark (wine, SVM) bayesmark 3 float no no
HPO bayesmark (wine, DT) bayesmark 6 float, int no no
HPO bayesmark (wine, RF) bayesmark 6 float, int no no
HPO bayesmark (wine, MLP-sgd) bayesmark 8 float, int no no
HPO bayesmark (wine, ada) bayesmark 2 float, int no no
HPO bayesmark (wine, linear) bayesmark 2 float no no
HPO bayesmark (diabetes, kNN) bayesmark 2 int no no
HPO bayesmark (diabetes, SVM) bayesmark 3 float no no
HPO bayesmark (diabetes, DT) bayesmark 6 float, int no no
HPO bayesmark (diabetes, RF) bayesmark 6 float, int no no
HPO bayesmark (diabetes, MLP-sgd) bayesmark 8 float, int no no
HPO bayesmark (diabetes, ada) bayesmark 2 float, int no no
HPO bayesmark (diabetes, linear) bayesmark 2 float no no
ackley (2dim) sigopt 2 float no yes
ackley (5dim) sigopt 5 float no yes
ackley (10dim) sigopt 10 float no yes
ackley (30dim) sigopt 30 float no yes
ackley (50dim) sigopt 50 float no yes
ackley (100dim) sigopt 100 float no yes
adjiman sigopt 2 float no yes
alpine02 sigopt 2 float no yes
branin02 sigopt 2 float no yes
bukin06 sigopt 2 float no yes
carrom-table sigopt 2 float no yes
csendes (2dim) sigopt 2 float no yes
csendes (5dim) sigopt 5 float no yes
csendes (10dim) sigopt 10 float no yes
csendes (30dim) sigopt 30 float no yes
csendes (50dim) sigopt 50 float no yes
csendes (100dim) sigopt 100 float no yes
deb02 (2dim) sigopt 2 float no yes
deb02 (5dim) sigopt 5 float no yes
deb02 (10dim) sigopt 10 float no yes
deb02 (30dim) sigopt 30 float no yes
deb02 (50dim) sigopt 50 float no yes
deb02 (100dim) sigopt 100 float no yes
deflected-corrugated-spring (2dim) sigopt 2 float no yes
deflected-corrugated-spring (5dim) sigopt 5 float no yes
deflected-corrugated-spring (10dim) sigopt 10 float no yes
deflected-corrugated-spring (30dim) sigopt 30 float no yes
deflected-corrugated-spring (50dim) sigopt 50 float no yes
deflected-corrugated-spring (100dim) sigopt 100 float no yes
easom (2dim) sigopt 2 float no yes
easom (5dim) sigopt 5 float no yes
easom (10dim) sigopt 10 float no yes
easom (30dim) sigopt 30 float no yes
easom (50dim) sigopt 50 float no yes
easom (100dim) sigopt 100 float no yes
exponential (2dim) sigopt 2 float no yes
exponential (5dim) sigopt 5 float no yes
exponential (10dim) sigopt 10 float no yes
exponential (30dim) sigopt 30 float no yes
exponential (50dim) sigopt 50 float no yes
exponential (100dim) sigopt 100 float no yes
hartmann3 sigopt 3 float no yes
hartmann6 sigopt 6 float no yes
helical-valley sigopt 3 float no yes
himmel-blau sigopt 2 float no yes
lennard-jones6 sigopt 6 float no yes
mc-court01 sigopt 7 float no yes
mc-court02 sigopt 7 float no yes
mc-court03 sigopt 9 float no yes
mc-court06 sigopt 5 float no yes
mc-court07 sigopt 6 float no yes
mc-court08 sigopt 4 float no yes
mc-court09 sigopt 3 float no yes
mc-court10 sigopt 8 float no yes
mc-court11 sigopt 8 float no yes
mc-court12 sigopt 7 float no yes
mc-court13 sigopt 3 float no yes
mc-court14 sigopt 3 float no yes
mc-court16 sigopt 4 float no yes
mc-court17 sigopt 7 float no yes
mc-court18 sigopt 8 float no yes
mc-court19 sigopt 2 float no yes
mc-court20 sigopt 2 float no yes
mc-court22 sigopt 5 float no yes
mc-court23 sigopt 6 float no yes
mc-court26 sigopt 3 float no yes
mc-court27 sigopt 3 float no yes
mc-court28 sigopt 4 float no yes
michalewicz (2dim) sigopt 2 float no yes
michalewicz (4dim) sigopt 4 float no yes
michalewicz (8dim) sigopt 8 float no yes
michalewicz (12dim) sigopt 12 float no yes
mishra06 sigopt 2 float no yes
ned01 sigopt 2 float no yes
odd-square sigopt 2 float no yes
parsopoulos sigopt 2 float no yes
pinter (2dim) sigopt 2 float no yes
pinter (5dim) sigopt 5 float no yes
pinter (10dim) sigopt 10 float no yes
pinter (30dim) sigopt 30 float no yes
pinter (50dim) sigopt 50 float no yes
pinter (100dim) sigopt 100 float no yes
plateau (2dim) sigopt 2 float no yes
plateau (5dim) sigopt 5 float no yes
plateau (10dim) sigopt 10 float no yes
plateau (30dim) sigopt 30 float no yes
plateau (50dim) sigopt 50 float no yes
plateau (100dim) sigopt 100 float no yes
powell sigopt 4 float no yes
problem03 sigopt 1 float no yes
rastrigin sigopt 8 float no yes
rosenbrock-log sigopt 11 float no yes
sargan (2dim) sigopt 2 float no yes
sargan (5dim) sigopt 5 float no yes
sargan (10dim) sigopt 10 float no yes
sargan (30dim) sigopt 30 float no yes
sargan (50dim) sigopt 50 float no yes
sargan (100dim) sigopt 100 float no yes
schwefel20 (2dim) sigopt 2 float no yes
schwefel20 (5dim) sigopt 5 float no yes
schwefel20 (10dim) sigopt 10 float no yes
schwefel20 (30dim) sigopt 30 float no yes
schwefel20 (50dim) sigopt 50 float no yes
schwefel20 (100dim) sigopt 100 float no yes
schwefel36 sigopt 2 float no yes
shekel05 sigopt 4 float no yes
shekel07 sigopt 4 float no yes
six-hump-camel sigopt 2 float no yes
sphere (2dim) sigopt 2 float no yes
sphere (5dim) sigopt 5 float no yes
sphere (10dim) sigopt 10 float no yes
sphere (30dim) sigopt 30 float no yes
sphere (50dim) sigopt 50 float no yes
sphere (100dim) sigopt 100 float no yes
styblinski-tang (2dim) sigopt 2 float no yes
styblinski-tang (5dim) sigopt 5 float no yes
styblinski-tang (10dim) sigopt 10 float no yes
styblinski-tang (30dim) sigopt 30 float no yes
styblinski-tang (50dim) sigopt 50 float no yes
styblinski-tang (100dim) sigopt 100 float no yes
trid sigopt 6 float no yes
tripod sigopt 2 float no yes
weierstrass (2dim) sigopt 2 float no yes
weierstrass (5dim) sigopt 5 float no yes
weierstrass (10dim) sigopt 10 float no yes
weierstrass (30dim) sigopt 30 float no yes
weierstrass (50dim) sigopt 50 float no yes
weierstrass (100dim) sigopt 100 float no yes
xor sigopt 9 float no yes
yao-liu (2dim) sigopt 2 float no yes
yao-liu (5dim) sigopt 5 float no yes
yao-liu (10dim) sigopt 10 float no yes
yao-liu (30dim) sigopt 30 float no yes
yao-liu (50dim) sigopt 50 float no yes
yao-liu (100dim) sigopt 100 float no yes

Samplers

Optuna has a variety of samplers implemented, but here we focused our experiments on five of the most promising ones for practical use: TPESampler, CmaEsSampler, QMCSampler, BoTorchSampler, and RandomSampler. GridSampler was excluded from the experiments for implementation reasons. NSGAIISampler was excluded from the experiment because it is a sampler mainly used for multi-objective optimization.

Each sampler has a variety of arguments, but in this experiment we focused on those that are most likely to affect performance, or whose effect on the algorithm's behavior is not well understood.

For each sampler, the table below summarizes the arguments that were varied in this experiment. For more detailed information, please refer to the Optuna documentation. https://optuna.readthedocs.io/en/latest/reference/samplers/index.html

TPESampler

Name of argument Type Default value List of tried values
n_startup_trials int 10 10, 100, 1000
n_ei_candidates int 24 24, 100, 1000
multivariate bool False False, True
constant_liar bool False False, True

CmaEsSampler

Name of argument Type Default value List of tried values
n_startup_trials int 1 1, 10, 100, 1000
restart_strategy Optional[str] None None, "ipop"

QMCSampler

Name of argument Type Default value List of tried values
qmc_type str "sobol" "sobol", "halton"

BoTorchSampler

Name of argument Type Default value List of tried values
n_startup_trials int 10 10, 100

RandomSampler

No arguments were varied for this sampler.

Metrics

For each of n_trials = 25, 50, 75, 100, and 1000, the evaluation is performed after that many trials have been run. The winning matrix is an n_problems x n_samplers matrix whose (i, j) entry is a non-negative integer that indicates how many other samplers sampler j beat on problem i. "Sampler j beats sampler k (j != k) on problem i" is defined as follows.

  • For each pair (i, j) and for each of the 100 studies, we calculate the following.
    • The best objective function value up to n_trials
    • The AUC of the objective function values up to n_trials
      • This is the quantity defined below with n = n_trials (approximately the area under the optimization history plot).
      • (1/n) sum_{t=1}^n (f_best(t) - LB), where f_best(t) is the best objective function value over the first t trials and LB is the lower bound of the objective function values defined for each problem.
  • For each (i, j), this gives a 100-dimensional vector v_best(i,j) of best values and a 100-dimensional vector v_auc(i,j) of AUCs.
  • For sampler j and sampler k, we first apply the Mann–Whitney U test to v_best(i,j) and v_best(i,k) to test the hypothesis "on problem i, the best value of j is worse than the best value of k". If the p-value is smaller than the predetermined threshold (0.05), the hypothesis is rejected, the best value of j is judged better, and j wins.
  • Otherwise, we apply the Mann–Whitney U test to v_auc(i,j) and v_auc(i,k) to test the hypothesis "on problem i, the AUC of j is worse than the AUC of k". If the p-value is smaller than the threshold, the hypothesis is rejected and j wins because its AUC is better.
  • If the hypothesis is still not rejected, the comparison is declared a tie.
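
The procedure above can be sketched in plain Python. This is not the benchmark scripts' implementation: the function names are hypothetical, minimization is assumed, and the Mann–Whitney U p-value uses a simple normal approximation without tie or continuity correction, so it will differ slightly from a statistics library such as SciPy.

```python
import math
from itertools import accumulate


def best_so_far(values):
    """Running minimum f_best of a minimization history."""
    return list(accumulate(values, min))


def auc(values, lower_bound):
    """(1/n) * sum over trials of (f_best - LB), with n = len(values)."""
    f_best = best_so_far(values)
    return sum(v - lower_bound for v in f_best) / len(f_best)


def average_ranks(pooled):
    """1-based ranks of the pooled sample; tied values share their mean rank."""
    order = sorted(range(len(pooled)), key=lambda idx: pooled[idx])
    ranks = [0.0] * len(pooled)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and pooled[order[end + 1]] == pooled[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def p_value_less(x, y):
    """One-sided p-value for 'x tends to be smaller than y' (normal approximation)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    std = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean) / std
    return 0.5 * math.erfc(-z / math.sqrt(2))


def beats(best_j, best_k, auc_j, auc_k, alpha=0.05):
    """True if sampler j beats sampler k on one problem: best values first, then AUC."""
    if p_value_less(best_j, best_k) < alpha:
        return True
    return p_value_less(auc_j, auc_k) < alpha


print(best_so_far([3.0, 1.0, 2.0]))  # [3.0, 1.0, 1.0]
```

Here best_j and auc_j would be the per-study vectors for one (problem, sampler) pair, with one entry per study. When beats is false in both directions, the pair is a tie.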

Using the constructed winning matrix, the score of each sampler j is evaluated in the following three ways.
(1) Borda score: For each problem, count the samplers with fewer wins than sampler j, then sum these counts over all problems.
(2) First score: The number of problems for which sampler j has the largest number of wins.
(3) Top3 score: The number of problems for which sampler j is within the top 3 winners.
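
As a rough illustration, the three scores can be computed from a winning matrix as follows. The tie handling is an assumption, since the text does not spell it out: here samplers tied for the largest number of wins all count as first, and a sampler is in the top 3 if fewer than three samplers have strictly more wins. The example matrix is made up.

```python
def scores(winning):
    """winning[i][j] = number of samplers that sampler j beat on problem i."""
    n_samplers = len(winning[0])
    borda = [0] * n_samplers
    first = [0] * n_samplers
    top3 = [0] * n_samplers
    for wins in winning:
        for j, w in enumerate(wins):
            fewer = sum(1 for other in wins if other < w)
            more = sum(1 for other in wins if other > w)
            borda[j] += fewer          # samplers with fewer wins than j
            first[j] += int(more == 0)  # nobody has more wins than j
            top3[j] += int(more < 3)    # at most two samplers rank above j
    return borda, first, top3


# Two problems, four samplers (made-up numbers).
winning = [
    [3, 1, 0, 2],
    [0, 0, 3, 1],
]
borda, first, top3 = scores(winning)
print(borda)  # [3, 1, 3, 4]
print(first)  # [1, 0, 1, 0]
print(top3)   # [2, 2, 1, 2]
```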

Result of First Experiment

Summary

Roughly, we conclude that TPE is stronger for HPO&NAS-based tasks, BoTorch and CMA-ES are stronger for sigopt function-based tasks, and QMC is stronger for high-dimensional problems (d>=10) among sigopt function-based tasks. The results do not change significantly with changes in n_trials.

The results differ by problem type as follows.

  • HPO/NAS tasks (kurobako and naslib): TPE is strong. Univariate TPE is stronger at n=25, although the difference from multivariate TPE is small. For the other values of n_trials, multivariate TPE is stronger.
  • bayesmark HPO tasks: TPE is superior, and QMC (sobol) also performs well at n=25. There is almost no performance difference between univariate and multivariate TPE, and which one wins depends on n_trials.
  • sigopt function tasks: BoTorch is the strongest, followed by CMA-ES. At n=1000, where there are no results for BoTorch, CMA-ES is the strongest.
  • High-dimensional sigopt functions (10 or more dimensions): QMC (sobol) is the strongest, followed by CMA-ES.

Whether univariate or multivariate TPE is better depends on n_trials and the type of problem, so it is necessary to consider which type of problem you want to prioritize. The table below lists the scores for each setting. Each score cell is given as the pair (univariate, multivariate).

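| Type of problem | n_trials | Borda score | First score | Top3 score |
|---|---|---|---|---|
| kurobako/naslib | 25 | 18, 17 | 4, 3 | 8, 8 |
| kurobako/naslib | 50 | 37, 49 | 1, 5 | 8, 8 |
| kurobako/naslib | 75 | 38, 50 | 2, 6 | 8, 8 |
| kurobako/naslib | 100 | 38, 53 | 0, 6 | 8, 8 |
| kurobako/naslib | 1000 | 40, 63 | 1, 8 | 8, 8 |
| bayesmark | 25 | 129, 124 | 8, 7 | 33, 32 |
| bayesmark | 50 | 162, 159 | 14, 18 | 34, 33 |
| bayesmark | 75 | 163, 165 | 17, 22 | 33, 34 |
| bayesmark | 100 | 164, 174 | 17, 20 | 34, 34 |
| bayesmark | 1000 | 210, 203 | 20, 16 | 34, 34 |
| sigopt | 25 | 412, 446 | 12, 11 | 80, 79 |
| sigopt | 50 | 486, 532 | 12, 15 | 81, 91 |
| sigopt | 75 | 546, 545 | 15, 12 | 93, 92 |
| sigopt | 100 | 531, 504 | 16, 11 | 108, 108 |
| sigopt | 1000 | 594, 527 | 47, 28 | 127, 120 |
| sigopt(d>=10) | 25 | 121, 131 | 3, 1 | 23, 22 |
| sigopt(d>=10) | 50 | 154, 162 | 3, 4 | 25, 29 |
| sigopt(d>=10) | 75 | 174, 166 | 4, 3 | 32, 30 |
| sigopt(d>=10) | 100 | 189, 164 | 5, 3 | 38, 38 |
| sigopt(d>=10) | 1000 | 217, 193 | 9, 2 | 51, 47 |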

Details

First, as an initial experiment, we fixed only the constant_liar argument of TPESampler and varied the other arguments as described in the Samplers section. In all, 31 different sampler configurations were compared. Only kurobako and bayesmark problems were included in this experiment to get a rough idea of trends. The scores obtained for each value of n_trials are listed in the "Benchmark results" tables that follow.
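
The count of 31 follows from the tried values listed in the Samplers section. A small sketch that enumerates the grid, with constant_liar held at False as in this experiment (the dictionary layout is only for illustration):

```python
from itertools import product

# Tried values, copied from the argument tables in the Samplers section.
grids = {
    "TPESampler": {
        "n_startup_trials": [10, 100, 1000],
        "n_ei_candidates": [24, 100, 1000],
        "multivariate": [False, True],
        "constant_liar": [False],
    },
    "CmaEsSampler": {
        "n_startup_trials": [1, 10, 100, 1000],
        "restart_strategy": [None, "ipop"],
    },
    "QMCSampler": {"qmc_type": ["sobol", "halton"]},
    "BoTorchSampler": {"n_startup_trials": [10, 100]},
    "RandomSampler": {},  # no arguments varied: a single configuration
}

# One (sampler name, keyword arguments) pair per point of each grid.
configs = [
    (name, dict(zip(grid, values)))
    for name, grid in grids.items()
    for values in product(*grid.values())
]
print(len(configs))  # 31
```

That is 18 TPESampler, 8 CmaEsSampler, 2 QMCSampler, 2 BoTorchSampler, and 1 RandomSampler configuration, matching the rows of each result table.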

Benchmark results: n_trials = 25
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 798 3 18
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 839 9 27
TPESampler {"n_startup_trials":10,"n_ei_candidates":100,"multivariate":false,"constant_liar":false} NopPruner {} 747 7 17
TPESampler {"n_startup_trials":10,"n_ei_candidates":100,"multivariate":true,"constant_liar":false} NopPruner {} 827 10 26
TPESampler {"n_startup_trials":10,"n_ei_candidates":1000,"multivariate":false,"constant_liar":false} NopPruner {} 713 2 17
TPESampler {"n_startup_trials":10,"n_ei_candidates":1000,"multivariate":true,"constant_liar":false} NopPruner {} 780 11 25
TPESampler {"n_startup_trials":100,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 298 2 6
TPESampler {"n_startup_trials":100,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 300 1 5
TPESampler {"n_startup_trials":100,"n_ei_candidates":100,"multivariate":false,"constant_liar":false} NopPruner {} 283 1 7
TPESampler {"n_startup_trials":100,"n_ei_candidates":100,"multivariate":true,"constant_liar":false} NopPruner {} 343 1 5
TPESampler {"n_startup_trials":100,"n_ei_candidates":1000,"multivariate":false,"constant_liar":false} NopPruner {} 224 1 5
TPESampler {"n_startup_trials":100,"n_ei_candidates":1000,"multivariate":true,"constant_liar":false} NopPruner {} 267 1 7
TPESampler {"n_startup_trials":1000,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 258 1 4
TPESampler {"n_startup_trials":1000,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 272 1 4
TPESampler {"n_startup_trials":1000,"n_ei_candidates":100,"multivariate":false,"constant_liar":false} NopPruner {} 193 1 3
TPESampler {"n_startup_trials":1000,"n_ei_candidates":100,"multivariate":true,"constant_liar":false} NopPruner {} 347 1 4
TPESampler {"n_startup_trials":1000,"n_ei_candidates":1000,"multivariate":false,"constant_liar":false} NopPruner {} 329 1 8
TPESampler {"n_startup_trials":1000,"n_ei_candidates":1000,"multivariate":true,"constant_liar":false} NopPruner {} 254 1 3
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 253 3 8
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 306 4 8
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 240 1 3
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 299 2 4
CmaEsSampler {"n_startup_trials":100,"restart_strategy":null} NopPruner {} 306 1 4
CmaEsSampler {"n_startup_trials":100,"restart_strategy":"ipop"} NopPruner {} 283 1 7
CmaEsSampler {"n_startup_trials":1000,"restart_strategy":null} NopPruner {} 266 1 6
CmaEsSampler {"n_startup_trials":1000,"restart_strategy":"ipop"} NopPruner {} 260 1 7
QMCSampler {"qmc_type":"sobol"} NopPruner {} 593 5 14
QMCSampler {"qmc_type":"halton"} NopPruner {} 578 8 14
BoTorchSampler {"n_startup_trials":10} NopPruner {} 658 6 15
BoTorchSampler {"n_startup_trials":100} NopPruner {} 321 1 6
RandomSampler {} NopPruner {} 311 1 6

Benchmark results: n_trials = 50
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 796 6 18
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 908 15 28
TPESampler {"n_startup_trials":10,"n_ei_candidates":100,"multivariate":false,"constant_liar":false} NopPruner {} 857 6 20
TPESampler {"n_startup_trials":10,"n_ei_candidates":100,"multivariate":true,"constant_liar":false} NopPruner {} 900 13 32
TPESampler {"n_startup_trials":10,"n_ei_candidates":1000,"multivariate":false,"constant_liar":false} NopPruner {} 760 4 21
TPESampler {"n_startup_trials":10,"n_ei_candidates":1000,"multivariate":true,"constant_liar":false} NopPruner {} 940 20 30
TPESampler {"n_startup_trials":100,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 168 1 7
TPESampler {"n_startup_trials":100,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 221 2 9
TPESampler {"n_startup_trials":100,"n_ei_candidates":100,"multivariate":false,"constant_liar":false} NopPruner {} 189 2 7
TPESampler {"n_startup_trials":100,"n_ei_candidates":100,"multivariate":true,"constant_liar":false} NopPruner {} 213 2 7
TPESampler {"n_startup_trials":100,"n_ei_candidates":1000,"multivariate":false,"constant_liar":false} NopPruner {} 193 2 8
TPESampler {"n_startup_trials":100,"n_ei_candidates":1000,"multivariate":true,"constant_liar":false} NopPruner {} 205 2 8
TPESampler {"n_startup_trials":1000,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 177 2 7
TPESampler {"n_startup_trials":1000,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 181 2 6
TPESampler {"n_startup_trials":1000,"n_ei_candidates":100,"multivariate":false,"constant_liar":false} NopPruner {} 175 2 8
TPESampler {"n_startup_trials":1000,"n_ei_candidates":100,"multivariate":true,"constant_liar":false} NopPruner {} 226 2 7
TPESampler {"n_startup_trials":1000,"n_ei_candidates":1000,"multivariate":false,"constant_liar":false} NopPruner {} 190 1 7
TPESampler {"n_startup_trials":1000,"n_ei_candidates":1000,"multivariate":true,"constant_liar":false} NopPruner {} 203 1 8
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 391 3 7
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 415 4 8
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 308 1 5
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 314 1 6
CmaEsSampler {"n_startup_trials":100,"restart_strategy":null} NopPruner {} 170 2 7
CmaEsSampler {"n_startup_trials":100,"restart_strategy":"ipop"} NopPruner {} 179 2 7
CmaEsSampler {"n_startup_trials":1000,"restart_strategy":null} NopPruner {} 176 2 7
CmaEsSampler {"n_startup_trials":1000,"restart_strategy":"ipop"} NopPruner {} 213 2 9
QMCSampler {"qmc_type":"sobol"} NopPruner {} 585 3 12
QMCSampler {"qmc_type":"halton"} NopPruner {} 513 3 14
BoTorchSampler {"n_startup_trials":10} NopPruner {} 518 9 16
BoTorchSampler {"n_startup_trials":100} NopPruner {} 231 1 9
RandomSampler {} NopPruner {} 165 2 7

Benchmark results: n_trials = 75
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 821 4 20
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 918 13 29
TPESampler {"n_startup_trials":10,"n_ei_candidates":100,"multivariate":false,"constant_liar":false} NopPruner {} 824 5 21
TPESampler {"n_startup_trials":10,"n_ei_candidates":100,"multivariate":true,"constant_liar":false} NopPruner {} 889 11 33
TPESampler {"n_startup_trials":10,"n_ei_candidates":1000,"multivariate":false,"constant_liar":false} NopPruner {} 749 4 19
TPESampler {"n_startup_trials":10,"n_ei_candidates":1000,"multivariate":true,"constant_liar":false} NopPruner {} 972 21 33
TPESampler {"n_startup_trials":100,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 170 1 10
TPESampler {"n_startup_trials":100,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 214 1 10
TPESampler {"n_startup_trials":100,"n_ei_candidates":100,"multivariate":false,"constant_liar":false} NopPruner {} 199 1 8
TPESampler {"n_startup_trials":100,"n_ei_candidates":100,"multivariate":true,"constant_liar":false} NopPruner {} 248 1 10
TPESampler {"n_startup_trials":100,"n_ei_candidates":1000,"multivariate":false,"constant_liar":false} NopPruner {} 259 1 9
TPESampler {"n_startup_trials":100,"n_ei_candidates":1000,"multivariate":true,"constant_liar":false} NopPruner {} 198 1 10
TPESampler {"n_startup_trials":1000,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 231 1 9
TPESampler {"n_startup_trials":1000,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 187 2 9
TPESampler {"n_startup_trials":1000,"n_ei_candidates":100,"multivariate":false,"constant_liar":false} NopPruner {} 236 1 10
TPESampler {"n_startup_trials":1000,"n_ei_candidates":100,"multivariate":true,"constant_liar":false} NopPruner {} 209 1 9
TPESampler {"n_startup_trials":1000,"n_ei_candidates":1000,"multivariate":false,"constant_liar":false} NopPruner {} 280 1 9
TPESampler {"n_startup_trials":1000,"n_ei_candidates":1000,"multivariate":true,"constant_liar":false} NopPruner {} 162 1 7
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 465 4 12
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 482 3 10
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 365 2 8
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 413 2 9
CmaEsSampler {"n_startup_trials":100,"restart_strategy":null} NopPruner {} 213 1 7
CmaEsSampler {"n_startup_trials":100,"restart_strategy":"ipop"} NopPruner {} 212 1 10
CmaEsSampler {"n_startup_trials":1000,"restart_strategy":null} NopPruner {} 202 1 9
CmaEsSampler {"n_startup_trials":1000,"restart_strategy":"ipop"} NopPruner {} 254 1 9
QMCSampler {"qmc_type":"sobol"} NopPruner {} 558 6 15
QMCSampler {"qmc_type":"halton"} NopPruner {} 474 3 9
BoTorchSampler {"n_startup_trials":10} NopPruner {} 551 10 17
BoTorchSampler {"n_startup_trials":100} NopPruner {} 219 1 9
RandomSampler {} NopPruner {} 261 1 9

Benchmark results: n_trials = 100
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 757 6 22
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 907 17 30
TPESampler {"n_startup_trials":10,"n_ei_candidates":100,"multivariate":false,"constant_liar":false} NopPruner {} 778 5 25
TPESampler {"n_startup_trials":10,"n_ei_candidates":100,"multivariate":true,"constant_liar":false} NopPruner {} 938 16 34
TPESampler {"n_startup_trials":10,"n_ei_candidates":1000,"multivariate":false,"constant_liar":false} NopPruner {} 745 5 22
TPESampler {"n_startup_trials":10,"n_ei_candidates":1000,"multivariate":true,"constant_liar":false} NopPruner {} 955 20 36
TPESampler {"n_startup_trials":100,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 240 1 12
TPESampler {"n_startup_trials":100,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 266 2 11
TPESampler {"n_startup_trials":100,"n_ei_candidates":100,"multivariate":false,"constant_liar":false} NopPruner {} 205 2 12
TPESampler {"n_startup_trials":100,"n_ei_candidates":100,"multivariate":true,"constant_liar":false} NopPruner {} 189 1 12
TPESampler {"n_startup_trials":100,"n_ei_candidates":1000,"multivariate":false,"constant_liar":false} NopPruner {} 185 2 12
TPESampler {"n_startup_trials":100,"n_ei_candidates":1000,"multivariate":true,"constant_liar":false} NopPruner {} 225 2 12
TPESampler {"n_startup_trials":1000,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 222 1 13
TPESampler {"n_startup_trials":1000,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 236 2 12
TPESampler {"n_startup_trials":1000,"n_ei_candidates":100,"multivariate":false,"constant_liar":false} NopPruner {} 171 1 12
TPESampler {"n_startup_trials":1000,"n_ei_candidates":100,"multivariate":true,"constant_liar":false} NopPruner {} 243 2 12
TPESampler {"n_startup_trials":1000,"n_ei_candidates":1000,"multivariate":false,"constant_liar":false} NopPruner {} 237 1 11
TPESampler {"n_startup_trials":1000,"n_ei_candidates":1000,"multivariate":true,"constant_liar":false} NopPruner {} 225 2 12
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 474 3 14
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 475 4 13
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 407 2 11
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 439 1 11
CmaEsSampler {"n_startup_trials":100,"restart_strategy":null} NopPruner {} 275 1 12
CmaEsSampler {"n_startup_trials":100,"restart_strategy":"ipop"} NopPruner {} 237 1 12
CmaEsSampler {"n_startup_trials":1000,"restart_strategy":null} NopPruner {} 187 1 12
CmaEsSampler {"n_startup_trials":1000,"restart_strategy":"ipop"} NopPruner {} 209 1 12
QMCSampler {"qmc_type":"sobol"} NopPruner {} 563 5 17
QMCSampler {"qmc_type":"halton"} NopPruner {} 509 4 15
BoTorchSampler {"n_startup_trials":10} NopPruner {} 553 10 19
BoTorchSampler {"n_startup_trials":100} NopPruner {} 197 2 12
RandomSampler {} NopPruner {} 216 1 12

Benchmark results: n_trials = 1000
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 734 7 16
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 879 13 25
TPESampler {"n_startup_trials":10,"n_ei_candidates":100,"multivariate":false,"constant_liar":false} NopPruner {} 730 4 14
TPESampler {"n_startup_trials":10,"n_ei_candidates":100,"multivariate":true,"constant_liar":false} NopPruner {} 870 13 26
TPESampler {"n_startup_trials":10,"n_ei_candidates":1000,"multivariate":false,"constant_liar":false} NopPruner {} 695 7 18
TPESampler {"n_startup_trials":10,"n_ei_candidates":1000,"multivariate":true,"constant_liar":false} NopPruner {} 871 18 29
TPESampler {"n_startup_trials":100,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 689 4 11
TPESampler {"n_startup_trials":100,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 812 5 20
TPESampler {"n_startup_trials":100,"n_ei_candidates":100,"multivariate":false,"constant_liar":false} NopPruner {} 672 1 10
TPESampler {"n_startup_trials":100,"n_ei_candidates":100,"multivariate":true,"constant_liar":false} NopPruner {} 785 4 19
TPESampler {"n_startup_trials":100,"n_ei_candidates":1000,"multivariate":false,"constant_liar":false} NopPruner {} 606 2 9
TPESampler {"n_startup_trials":100,"n_ei_candidates":1000,"multivariate":true,"constant_liar":false} NopPruner {} 771 4 22
TPESampler {"n_startup_trials":1000,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 326 2 10
TPESampler {"n_startup_trials":1000,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 350 2 10
TPESampler {"n_startup_trials":1000,"n_ei_candidates":100,"multivariate":false,"constant_liar":false} NopPruner {} 321 3 9
TPESampler {"n_startup_trials":1000,"n_ei_candidates":100,"multivariate":true,"constant_liar":false} NopPruner {} 354 2 10
TPESampler {"n_startup_trials":1000,"n_ei_candidates":1000,"multivariate":false,"constant_liar":false} NopPruner {} 308 2 10
TPESampler {"n_startup_trials":1000,"n_ei_candidates":1000,"multivariate":true,"constant_liar":false} NopPruner {} 366 3 11
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 247 1 4
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 273 1 3
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 260 1 3
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 279 1 2
CmaEsSampler {"n_startup_trials":100,"restart_strategy":null} NopPruner {} 369 1 3
CmaEsSampler {"n_startup_trials":100,"restart_strategy":"ipop"} NopPruner {} 384 1 3
CmaEsSampler {"n_startup_trials":1000,"restart_strategy":null} NopPruner {} 352 3 10
CmaEsSampler {"n_startup_trials":1000,"restart_strategy":"ipop"} NopPruner {} 328 2 11
QMCSampler {"qmc_type":"sobol"} NopPruner {} 467 7 12
QMCSampler {"qmc_type":"halton"} NopPruner {} 483 3 11
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 1 1
BoTorchSampler {"n_startup_trials":100} NopPruner {} 0 1 1
RandomSampler {} NopPruner {} 335 2 12

The results of this experiment showed that changing some of the arguments had no significant effect on performance, so we fixed those arguments at their defaults and reexamined the results. In all, 10 sampler configurations were compared. This comparison includes all problems from kurobako, bayesmark, and sigopt. To make the results easier to read, the scores are tabulated per problem set below.

Benchmark results for kurobako: n_trials = 25
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 17 3 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 18 4 8
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 8 3 8
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 8 3 8
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 0 2 8
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 0 2 8
QMCSampler {"qmc_type":"sobol"} NopPruner {} 18 4 8
QMCSampler {"qmc_type":"halton"} NopPruner {} 0 2 8
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 2 8
RandomSampler {} NopPruner {} 0 2 8

Benchmark results for bayesmark: n_trials = 25
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 124 7 32
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 129 8 33
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 40 7 21
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 48 7 21
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 53 3 26
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 42 3 27
QMCSampler {"qmc_type":"sobol"} NopPruner {} 122 11 30
QMCSampler {"qmc_type":"halton"} NopPruner {} 135 15 30
BoTorchSampler {"n_startup_trials":10} NopPruner {} 121 10 30
RandomSampler {} NopPruner {} 91 5 29

Benchmark results for sigopt: n_trials = 25
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 446 11 79
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 412 12 80
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 634 36 106
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 638 40 105
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 304 4 77
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 319 4 78
QMCSampler {"qmc_type":"sobol"} NopPruner {} 562 49 87
QMCSampler {"qmc_type":"halton"} NopPruner {} 299 15 55
BoTorchSampler {"n_startup_trials":10} NopPruner {} 675 50 104
RandomSampler {} NopPruner {} 130 4 40

Benchmark results for sigopt >= 10d: n_trials = 25
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 131 1 22
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 121 3 23
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 304 13 49
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 308 16 48
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 196 0 40
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 199 0 40
QMCSampler {"qmc_type":"sobol"} NopPruner {} 375 37 46
QMCSampler {"qmc_type":"halton"} NopPruner {} 126 5 19
BoTorchSampler {"n_startup_trials":10} NopPruner {} 184 3 29
RandomSampler {} NopPruner {} 62 0 11

Benchmark results for all: n_trials = 25
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 587 21 119
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 559 24 121
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 682 46 135
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 694 50 134
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 357 9 111
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 361 9 113
QMCSampler {"qmc_type":"sobol"} NopPruner {} 702 64 125
QMCSampler {"qmc_type":"halton"} NopPruner {} 434 32 93
BoTorchSampler {"n_startup_trials":10} NopPruner {} 796 62 142
RandomSampler {} NopPruner {} 221 11 77

Benchmark results for kurobako: n_trials = 50
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 49 5 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 37 1 8
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 16 0 8
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 11 1 6
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 4 0 6
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 9 0 7
QMCSampler {"qmc_type":"sobol"} NopPruner {} 33 2 8
QMCSampler {"qmc_type":"halton"} NopPruner {} 12 1 6
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 0 4
RandomSampler {} NopPruner {} 2 0 5

Benchmark results for bayesmark: n_trials = 50
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 159 18 33
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 162 14 34
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 24 7 18
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 38 7 19
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 31 5 21
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 33 5 22
QMCSampler {"qmc_type":"sobol"} NopPruner {} 109 11 30
QMCSampler {"qmc_type":"halton"} NopPruner {} 111 12 27
BoTorchSampler {"n_startup_trials":10} NopPruner {} 125 13 31
RandomSampler {} NopPruner {} 90 5 31

Benchmark results for sigopt: n_trials = 50
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 532 15 91
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 486 12 81
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 624 33 103
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 631 34 104
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 375 12 82
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 358 9 81
QMCSampler {"qmc_type":"sobol"} NopPruner {} 482 41 71
QMCSampler {"qmc_type":"halton"} NopPruner {} 279 10 46
BoTorchSampler {"n_startup_trials":10} NopPruner {} 744 66 106
RandomSampler {} NopPruner {} 123 5 29

Benchmark results for sigopt >= 10d: n_trials = 50
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 162 4 29
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 154 3 25
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 282 14 49
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 283 15 49
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 200 3 44
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 198 3 43
QMCSampler {"qmc_type":"sobol"} NopPruner {} 348 30 44
QMCSampler {"qmc_type":"halton"} NopPruner {} 135 3 20
BoTorchSampler {"n_startup_trials":10} NopPruner {} 188 7 29
RandomSampler {} NopPruner {} 69 0 11

Benchmark results for all: n_trials = 50
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 740 38 132
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 685 27 123
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 664 40 129
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 680 42 129
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 410 17 109
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 400 14 110
QMCSampler {"qmc_type":"sobol"} NopPruner {} 624 54 109
QMCSampler {"qmc_type":"halton"} NopPruner {} 402 23 79
BoTorchSampler {"n_startup_trials":10} NopPruner {} 869 79 141
RandomSampler {} NopPruner {} 215 10 65

Benchmark results for kurobako: n_trials = 75
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 50 6 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 38 2 8
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 9 1 8
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 9 1 8
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 6 0 8
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 6 0 8
QMCSampler {"qmc_type":"sobol"} NopPruner {} 25 2 8
QMCSampler {"qmc_type":"halton"} NopPruner {} 11 1 7
BoTorchSampler {"n_startup_trials":10} NopPruner {} 8 0 4
RandomSampler {} NopPruner {} 3 0 7

Benchmark results for bayesmark: n_trials = 75
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 165 22 34
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 163 17 33
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 13 7 19
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 15 7 19
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 33 6 21
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 34 6 21
QMCSampler {"qmc_type":"sobol"} NopPruner {} 118 12 32
QMCSampler {"qmc_type":"halton"} NopPruner {} 76 9 27
BoTorchSampler {"n_startup_trials":10} NopPruner {} 146 16 34
RandomSampler {} NopPruner {} 91 9 31

Benchmark results for sigopt: n_trials = 75
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 545 12 92
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 546 15 93
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 624 32 101
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 591 30 101
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 411 14 86
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 413 14 84
QMCSampler {"qmc_type":"sobol"} NopPruner {} 437 34 68
QMCSampler {"qmc_type":"halton"} NopPruner {} 240 7 42
BoTorchSampler {"n_startup_trials":10} NopPruner {} 754 68 105
RandomSampler {} NopPruner {} 112 4 25

Benchmark results for sigopt >= 10d: n_trials = 75
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 166 3 30
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 174 4 32
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 274 15 46
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 274 16 46
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 211 5 42
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 212 6 42
QMCSampler {"qmc_type":"sobol"} NopPruner {} 308 28 39
QMCSampler {"qmc_type":"halton"} NopPruner {} 130 2 20
BoTorchSampler {"n_startup_trials":10} NopPruner {} 204 9 30
RandomSampler {} NopPruner {} 62 0 9

Benchmark results for all: n_trials = 75
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 760 40 134
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 747 34 134
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 646 40 128
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 615 38 128
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 450 20 115
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 453 20 113
QMCSampler {"qmc_type":"sobol"} NopPruner {} 580 48 108
QMCSampler {"qmc_type":"halton"} NopPruner {} 327 17 76
BoTorchSampler {"n_startup_trials":10} NopPruner {} 908 84 143
RandomSampler {} NopPruner {} 206 13 63

Benchmark results for kurobako: n_trials = 100
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 53 6 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 38 0 8
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 9 0 7
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 11 1 7
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 5 0 6
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 11 0 7
QMCSampler {"qmc_type":"sobol"} NopPruner {} 25 2 7
QMCSampler {"qmc_type":"halton"} NopPruner {} 19 1 7
BoTorchSampler {"n_startup_trials":10} NopPruner {} 8 0 3
RandomSampler {} NopPruner {} 3 0 6

Benchmark results for bayesmark: n_trials = 100
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 174 20 34
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 164 17 34
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 25 5 23
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 10 5 22
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 33 5 22
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 26 5 22
QMCSampler {"qmc_type":"sobol"} NopPruner {} 117 11 31
QMCSampler {"qmc_type":"halton"} NopPruner {} 106 9 28
BoTorchSampler {"n_startup_trials":10} NopPruner {} 148 15 33
RandomSampler {} NopPruner {} 82 8 29

Benchmark results for sigopt: n_trials = 100
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 504 11 108
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 531 16 108
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 604 38 112
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 612 39 112
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 419 20 95
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 424 20 97
QMCSampler {"qmc_type":"sobol"} NopPruner {} 409 35 69
QMCSampler {"qmc_type":"halton"} NopPruner {} 222 7 51
BoTorchSampler {"n_startup_trials":10} NopPruner {} 736 68 107
RandomSampler {} NopPruner {} 93 4 33

Benchmark results for sigopt >= 10d: n_trials = 100
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 164 3 38
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 189 5 38
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 273 18 49
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 260 19 47
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 213 9 45
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 211 9 45
QMCSampler {"qmc_type":"sobol"} NopPruner {} 301 27 39
QMCSampler {"qmc_type":"halton"} NopPruner {} 131 2 21
BoTorchSampler {"n_startup_trials":10} NopPruner {} 198 10 31
RandomSampler {} NopPruner {} 47 0 10

Benchmark results for all: n_trials = 100
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 731 37 150
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 733 33 150
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 638 43 142
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 633 45 141
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 457 25 123
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 461 25 126
QMCSampler {"qmc_type":"sobol"} NopPruner {} 551 48 107
QMCSampler {"qmc_type":"halton"} NopPruner {} 347 17 86
BoTorchSampler {"n_startup_trials":10} NopPruner {} 892 83 143
RandomSampler {} NopPruner {} 178 12 68

Benchmark results for kurobako: n_trials = 1000
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 63 8 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 40 1 8
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 16 1 8
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 11 1 7
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 9 1 8
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 9 1 8
QMCSampler {"qmc_type":"sobol"} NopPruner {} 3 1 6
QMCSampler {"qmc_type":"halton"} NopPruner {} 0 1 6
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 1 6
RandomSampler {} NopPruner {} 0 1 6

Benchmark results for bayesmark: n_trials = 1000
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 203 16 34
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 210 20 34
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 31 4 19
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 50 3 22
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 54 4 21
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 70 4 24
QMCSampler {"qmc_type":"sobol"} NopPruner {} 111 10 28
QMCSampler {"qmc_type":"halton"} NopPruner {} 148 8 31
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 1 15
RandomSampler {} NopPruner {} 127 9 27

Benchmark results for sigopt: n_trials = 1000
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 527 28 120
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 594 47 127
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 699 111 125
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 685 95 128
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 604 85 125
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 615 85 125
QMCSampler {"qmc_type":"sobol"} NopPruner {} 342 19 81
QMCSampler {"qmc_type":"halton"} NopPruner {} 237 6 65
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 3 34
RandomSampler {} NopPruner {} 122 3 50

Benchmark results for sigopt >= 10d: n_trials = 1000
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 193 2 47
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 217 9 51
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 286 47 53
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 283 46 53
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 276 45 53
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 276 45 53
QMCSampler {"qmc_type":"sobol"} NopPruner {} 214 14 37
QMCSampler {"qmc_type":"halton"} NopPruner {} 110 0 20
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 0 5
RandomSampler {} NopPruner {} 63 0 14

Benchmark results for all: n_trials = 1000
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 793 52 162
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 844 68 169
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 746 116 152
CmaEsSampler {"n_startup_trials":1,"restart_strategy":"ipop"} NopPruner {} 746 99 157
CmaEsSampler {"n_startup_trials":10,"restart_strategy":null} NopPruner {} 667 90 154
CmaEsSampler {"n_startup_trials":10,"restart_strategy":"ipop"} NopPruner {} 694 90 157
QMCSampler {"qmc_type":"sobol"} NopPruner {} 456 30 115
QMCSampler {"qmc_type":"halton"} NopPruner {} 385 15 102
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 5 55
RandomSampler {} NopPruner {} 249 13 83

@HideakiImamura
Member

Result of Second Experiment

Summary

The TPESampler algorithm differs greatly depending on whether the multivariate option is used. In this experiment, we compared the two settings in terms of speed. The results show that setting multivariate=True reduces the time TPESampler needs: on a 100-parameter objective with 1000 trials, the benchmark took about 3.34 minutes with multivariate=True versus about 6.29 minutes without it.

Details

We used asv. After installing asv, you can run the speed benchmark by running asv run --strict in the optuna directory.

Speed benchmark script
from typing import cast
from typing import List
from typing import Union

import optuna
from optuna.samplers import BaseSampler
from optuna.samplers import CmaEsSampler
from optuna.samplers import RandomSampler
from optuna.samplers import TPESampler
from optuna.testing.storages import StorageSupplier


def parse_args(args: str) -> List[Union[int, str]]:
    ret: List[Union[int, str]] = []
    for arg in map(lambda s: s.strip(), args.split(",")):
        try:
            ret.append(int(arg))
        except ValueError:
            ret.append(arg)
    return ret


SAMPLER_MODES = [
    "random",
    "tpe",
    "multivariate_tpe",
    "cmaes",
]


def create_sampler(sampler_mode: str) -> BaseSampler:
    if sampler_mode == "random":
        return RandomSampler()
    elif sampler_mode == "tpe":
        return TPESampler()
    elif sampler_mode == "multivariate_tpe":
        return TPESampler(multivariate=True)
    elif sampler_mode == "cmaes":
        return CmaEsSampler()
    else:
        assert False


class OptimizeSuite:
    def objective(self, trial: optuna.Trial) -> float:
        # x = trial.suggest_float("x", -100, 100)
        # y = trial.suggest_int("y", -100, 100)
        # return x**2 + y**2
        ret = 0
        for i in range(100):
            ret += trial.suggest_float(f"x{i}", -100, 100) ** 2
        return ret

    def optimize(self, storage_mode: str, sampler_mode: str, n_trials: int) -> None:
        with StorageSupplier(storage_mode) as storage:
            sampler = create_sampler(sampler_mode)
            study = optuna.create_study(storage=storage, sampler=sampler)
            study.optimize(self.objective, n_trials=n_trials)

    def time_optimize(self, args: str) -> None:
        storage_mode, sampler_mode, n_trials = parse_args(args)
        storage_mode = cast(str, storage_mode)
        sampler_mode = cast(str, sampler_mode)
        n_trials = cast(int, n_trials)
        self.optimize(storage_mode, sampler_mode, n_trials)

    params = (
        # "inmemory, random, 1000",
        # "inmemory, random, 10000",
        "inmemory, tpe, 1000",
        "inmemory, multivariate_tpe, 1000",
        # "inmemory, cmaes, 1000",
        # "sqlite, random, 1000",
        # "cached_sqlite, random, 1000",
        # # Following benchmarks use fakeredis instead of Redis.
        # "redis, random, 1000",
        # "cached_redis, random, 1000",
    )
    param_names = ["storage, sampler, n_trials"]
    timeout = 600

The following is the output.

% asv run --strict
· Creating environments
· Discovering benchmarks
· Running 1 total benchmarks (1 commits * 1 environments * 1 benchmarks)
[  0.00%] · For Optuna commit d467b003 <master>:
[  0.00%] ·· Benchmarking virtualenv-py3.9-fakeredis
[ 50.00%] ··· Running (optimize.OptimizeSuite.time_optimize--).
[100.00%] ··· optimize.OptimizeSuite.time_optimize                                                                                                                                                              ok
[100.00%] ··· ================================== ===========
                  storage, sampler, n_trials                
              ---------------------------------- -----------
                     inmemory, tpe, 1000          6.29±0.4m 
               inmemory, multivariate_tpe, 1000   3.34±0.7m 
              ================================== ===========
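
To put the two timings side by side, here is a small sketch that parses the asv cells above and takes their ratio. The helper name `parse_asv_minutes` is hypothetical (it is not part of asv), and the ratio ignores the ± spread, which is fairly large for both runs.

```python
import re


def parse_asv_minutes(cell: str) -> float:
    """Return the central value of an asv cell such as '6.29±0.4m' (minutes).

    Hypothetical helper for illustration; it only handles the 'm' unit
    that appears in the output above.
    """
    match = re.fullmatch(r"\s*([0-9.]+)±([0-9.]+)m\s*", cell)
    if match is None:
        raise ValueError(f"unexpected asv cell: {cell!r}")
    return float(match.group(1))


tpe_minutes = parse_asv_minutes("6.29±0.4m")
multivariate_tpe_minutes = parse_asv_minutes("3.34±0.7m")

# Ratio of wall-clock time: values above 1 mean multivariate TPE was faster.
speedup = tpe_minutes / multivariate_tpe_minutes
print(f"multivariate TPE speedup over TPE: {speedup:.2f}x")
```

On this 100-parameter objective with 1000 trials, multivariate TPE took a little over half the time of the default TPE, so the ratio comes out at roughly 1.9.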

Result of Third Experiment

The comparison was limited to 8 samplers: four TPE variants (with and without multivariate, crossed with and without constant_liar), CMA-ES, QMC, BoTorch, and Random.
For the problems, I used kurobako and naslib (4 hpobench, 4 nasbench, and 135 sigopt functions). bayesmark has a concurrency option, but it did not work well, so I excluded it.
Three levels of parallelism (n_concurrency) were tried: 5, 10, and 50.
Performance was evaluated for n_trials of 25, 50, 75, 100, and 1000 in the same three ways as in the previous experiment: Borda score, First score, and Top3 score.
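
As a rough illustration of how such rank-based scores can be tallied, here is a minimal sketch. The definitions are assumptions, not taken from the experiment scripts: per problem, Borda counts how many other samplers a sampler strictly beats, First counts problems where no sampler is strictly better, and Top3 counts problems where fewer than three samplers are strictly better. Ties therefore share a rank, which would explain why several samplers can have the same First score in the tables below. The actual experiment may compare repeated runs with a statistical test rather than single values.

```python
from typing import Dict


def rank_scores(results: Dict[str, Dict[str, float]]) -> Dict[str, Dict[str, int]]:
    """Tally Borda, First, and Top3 scores.

    results[problem][sampler] is the best objective value (lower is better).
    The scoring rules here are assumed for illustration only.
    """
    samplers = sorted({s for per_problem in results.values() for s in per_problem})
    scores = {s: {"borda": 0, "first": 0, "top3": 0} for s in samplers}
    for per_problem in results.values():
        for sampler, value in per_problem.items():
            others = [v for s, v in per_problem.items() if s != sampler]
            n_better = sum(1 for v in others if v < value)
            n_worse = sum(1 for v in others if v > value)
            scores[sampler]["borda"] += n_worse  # samplers strictly beaten
            if n_better == 0:
                scores[sampler]["first"] += 1  # tied for best counts as first
            if n_better < 3:
                scores[sampler]["top3"] += 1
    return scores


# Toy data with hypothetical sampler names and values.
example = {
    "p1": {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0},
    "p2": {"a": 2.0, "b": 2.0, "c": 5.0, "d": 1.0},
}
example_scores = rank_scores(example)
print(example_scores)
```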

Summary

For kurobako, TPE was found to be consistently strong.
For sigopt, QMC and BoTorch were strong when n_trials was small, whereas TPE and CMA-ES were strong when n_trials was large.
As for how the combination of multivariate and constant_liar affects TPE, splitting the results by n_trials and n_concurrency revealed the following trends.
The effect depends on the problem and on n_trials together: when n_trials is too small or too large for a given problem, the methods perform about the same.
As n_concurrency becomes larger, the effect of exploration becomes stronger.
multivariate=True speeds up the process of reaching the (local) optimal solution.
constant_liar=True makes it difficult to perform a careful search after reaching near-optimal solutions, so performance after that will be poor.
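
For reference, the four TPE variants compared here can be enumerated as below. This is a sketch rather than the script used for the experiment: the keyword arguments are copied from the "Arguments of the sampler" column in the tables that follow, the function names are hypothetical, and `optuna` is imported lazily so the grid itself can be inspected without it installed. `constant_liar` requires an Optuna version that supports it on `TPESampler`.

```python
import itertools
from typing import Any, Dict, List

# Shared settings, copied from the sampler arguments in the result tables.
BASE_KWARGS: Dict[str, Any] = {"n_startup_trials": 10, "n_ei_candidates": 24}


def tpe_variant_kwargs() -> List[Dict[str, Any]]:
    """Return kwargs for every multivariate x constant_liar combination."""
    return [
        dict(BASE_KWARGS, multivariate=multivariate, constant_liar=constant_liar)
        for multivariate, constant_liar in itertools.product([True, False], repeat=2)
    ]


def build_tpe_samplers() -> list:
    """Instantiate the four samplers (needs optuna to be installed)."""
    import optuna  # imported lazily; only required when building samplers

    return [optuna.samplers.TPESampler(**kwargs) for kwargs in tpe_variant_kwargs()]


variants = tpe_variant_kwargs()
for kwargs in variants:
    print(kwargs)
```

Each sampler returned by `build_tpe_samplers()` can then be passed to `optuna.create_study(sampler=...)` in the same way as in the benchmark code earlier in this thread.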

Details

Benchmark results for kurobako: n_trials = 25, n_concurrency = 5
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 0 2 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 26 6 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 13 4 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 6 2 8
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 6 3 8
QMCSampler {"qmc_type":"sobol"} NopPruner {} 7 3 8
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 2 8
RandomSampler {} NopPruner {} 0 2 8
Benchmark results for sigopt: n_trials = 25, n_concurrency = 5
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 119 59 129
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 109 52 132
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 191 63 133
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 103 56 129
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 153 67 130
QMCSampler {"qmc_type":"sobol"} NopPruner {} 189 72 131
BoTorchSampler {"n_startup_trials":10} NopPruner {} 159 60 133
RandomSampler {} NopPruner {} 22 48 127
Benchmark results for sigopt >=10d: n_trials = 25, n_concurrency = 5
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 34 17 53
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 21 15 53
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 99 23 56
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 32 18 53
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 89 27 54
QMCSampler {"qmc_type":"sobol"} NopPruner {} 123 31 55
BoTorchSampler {"n_startup_trials":10} NopPruner {} 57 18 55
RandomSampler {} NopPruner {} 0 15 52
Benchmark results for all: n_trials = 25, n_concurrency = 5
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 119 61 137
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 135 58 140
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 204 67 141
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 109 58 137
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 159 70 138
QMCSampler {"qmc_type":"sobol"} NopPruner {} 196 75 139
BoTorchSampler {"n_startup_trials":10} NopPruner {} 159 62 141
RandomSampler {} NopPruner {} 22 50 135
Benchmark results for kurobako: n_trials = 25, n_concurrency = 10
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 20 7 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 7 6 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 0 5 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 0 5 8
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 0 5 8
QMCSampler {"qmc_type":"sobol"} NopPruner {} 0 5 8
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 5 8
RandomSampler {} NopPruner {} 0 5 8
Benchmark results for sigopt: n_trials = 25, n_concurrency = 10
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 138 70 133
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 96 66 132
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 128 71 133
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 77 65 129
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 151 79 132
QMCSampler {"qmc_type":"sobol"} NopPruner {} 124 73 131
BoTorchSampler {"n_startup_trials":10} NopPruner {} 141 72 131
RandomSampler {} NopPruner {} 31 58 130
Benchmark results for sigopt >=10d: n_trials = 25, n_concurrency = 10
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 45 26 55
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 43 27 54
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 42 27 54
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 28 25 53
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 84 36 54
QMCSampler {"qmc_type":"sobol"} NopPruner {} 80 34 55
BoTorchSampler {"n_startup_trials":10} NopPruner {} 52 28 54
RandomSampler {} NopPruner {} 11 24 52
Benchmark results for all: n_trials = 25, n_concurrency = 10
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 158 77 141
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 103 72 140
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 128 76 141
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 77 70 137
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 151 84 140
QMCSampler {"qmc_type":"sobol"} NopPruner {} 124 78 139
BoTorchSampler {"n_startup_trials":10} NopPruner {} 141 77 139
RandomSampler {} NopPruner {} 31 63 138
Benchmark results for kurobako: n_trials = 25, n_concurrency = 50
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 7 6 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 0 5 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 0 5 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 0 5 8
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 0 5 8
QMCSampler {"qmc_type":"sobol"} NopPruner {} 6 6 8
BoTorchSampler {"n_startup_trials":10} NopPruner {} 12 6 8
RandomSampler {} NopPruner {} 7 6 8
Benchmark results for sigopt: n_trials = 25, n_concurrency = 50
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 113 71 131
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 100 66 132
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 119 66 134
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 60 63 129
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 109 72 131
QMCSampler {"qmc_type":"sobol"} NopPruner {} 208 82 132
BoTorchSampler {"n_startup_trials":10} NopPruner {} 200 83 133
RandomSampler {} NopPruner {} 17 57 130
Benchmark results for sigopt >=10d: n_trials = 25, n_concurrency = 50
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 41 26 54
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 16 22 54
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 70 25 55
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 23 23 53
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 80 31 55
QMCSampler {"qmc_type":"sobol"} NopPruner {} 135 38 55
BoTorchSampler {"n_startup_trials":10} NopPruner {} 31 23 54
RandomSampler {} NopPruner {} 6 21 53
Benchmark results for all: n_trials = 25, n_concurrency = 50
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 120 77 139
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 100 71 140
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 119 71 142
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 60 68 137
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 109 77 139
QMCSampler {"qmc_type":"sobol"} NopPruner {} 214 88 140
BoTorchSampler {"n_startup_trials":10} NopPruner {} 212 89 141
RandomSampler {} NopPruner {} 24 63 138
Benchmark results for kurobako: n_trials = 50, n_concurrency = 5
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 0 1 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 42 7 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 19 2 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 12 1 8
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 4 1 8
QMCSampler {"qmc_type":"sobol"} NopPruner {} 4 1 8
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 1 8
RandomSampler {} NopPruner {} 4 1 8
Benchmark results for sigopt: n_trials = 50, n_concurrency = 5
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 134 56 126
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 166 57 130
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 191 56 132
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 150 54 128
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 161 59 126
QMCSampler {"qmc_type":"sobol"} NopPruner {} 183 63 130
BoTorchSampler {"n_startup_trials":10} NopPruner {} 180 62 131
RandomSampler {} NopPruner {} 16 42 123
Benchmark results for sigopt >=10d: n_trials = 50, n_concurrency = 5
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 25 20 52
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 27 19 53
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 94 26 56
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 30 20 53
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 88 29 54
QMCSampler {"qmc_type":"sobol"} NopPruner {} 115 33 55
BoTorchSampler {"n_startup_trials":10} NopPruner {} 66 23 55
RandomSampler {} NopPruner {} 0 18 52
Benchmark results for all: n_trials = 50, n_concurrency = 5
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 134 57 134
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 208 64 138
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 210 58 140
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 162 55 136
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 165 60 134
QMCSampler {"qmc_type":"sobol"} NopPruner {} 187 64 138
BoTorchSampler {"n_startup_trials":10} NopPruner {} 180 63 139
RandomSampler {} NopPruner {} 20 43 131
Benchmark results for kurobako: n_trials = 50, n_concurrency = 10
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 20 7 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 13 6 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 0 4 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 0 4 8
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 6 4 8
QMCSampler {"qmc_type":"sobol"} NopPruner {} 0 4 8
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 4 8
RandomSampler {} NopPruner {} 0 4 8
Benchmark results for sigopt: n_trials = 50, n_concurrency = 10
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 157 61 130
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 166 64 131
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 132 57 132
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 135 59 127
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 139 63 129
QMCSampler {"qmc_type":"sobol"} NopPruner {} 108 58 130
BoTorchSampler {"n_startup_trials":10} NopPruner {} 157 65 130
RandomSampler {} NopPruner {} 25 46 127
Benchmark results for sigopt >=10d: n_trials = 50, n_concurrency = 10
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 50 21 53
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 79 26 53
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 42 21 54
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 65 23 51
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 70 28 53
QMCSampler {"qmc_type":"sobol"} NopPruner {} 84 28 54
BoTorchSampler {"n_startup_trials":10} NopPruner {} 50 22 52
RandomSampler {} NopPruner {} 15 18 52
Benchmark results for all: n_trials = 50, n_concurrency = 10
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 177 68 138
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 179 70 139
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 132 61 140
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 135 63 135
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 145 67 137
QMCSampler {"qmc_type":"sobol"} NopPruner {} 108 62 138
BoTorchSampler {"n_startup_trials":10} NopPruner {} 157 69 138
RandomSampler {} NopPruner {} 25 50 135
Benchmark results for kurobako: n_trials = 50, n_concurrency = 50
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 26 5 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 0 1 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 7 2 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 11 2 8
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 5 2 8
QMCSampler {"qmc_type":"sobol"} NopPruner {} 7 2 8
BoTorchSampler {"n_startup_trials":10} NopPruner {} 6 1 8
RandomSampler {} NopPruner {} 7 2 8
Benchmark results for sigopt: n_trials = 50, n_concurrency = 50
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 128 61 128
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 121 55 128
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 143 56 125
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 120 56 126
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 136 61 126
QMCSampler {"qmc_type":"sobol"} NopPruner {} 179 69 126
BoTorchSampler {"n_startup_trials":10} NopPruner {} 239 78 132
RandomSampler {} NopPruner {} 11 47 122
Benchmark results for sigopt >=10d: n_trials = 50, n_concurrency = 50
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 47 25 53
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 33 21 53
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 64 23 54
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 47 23 54
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 81 29 55
QMCSampler {"qmc_type":"sobol"} NopPruner {} 123 35 54
BoTorchSampler {"n_startup_trials":10} NopPruner {} 42 22 54
RandomSampler {} NopPruner {} 0 19 52
Benchmark results for all: n_trials = 50, n_concurrency = 50
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 154 66 136
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 121 56 136
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 150 58 133
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 131 58 134
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 141 63 134
QMCSampler {"qmc_type":"sobol"} NopPruner {} 186 71 134
BoTorchSampler {"n_startup_trials":10} NopPruner {} 245 79 140
RandomSampler {} NopPruner {} 18 49 130
Benchmark results for kurobako: n_trials = 75, n_concurrency = 5
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 0 1 7
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 42 7 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 19 2 7
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 18 1 8
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 9 1 8
QMCSampler {"qmc_type":"sobol"} NopPruner {} 4 1 7
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 1 7
RandomSampler {} NopPruner {} 4 1 7
Benchmark results for sigopt: n_trials = 75, n_concurrency = 5
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 134 52 119
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 179 56 123
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 190 52 127
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 170 50 123
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 161 55 119
QMCSampler {"qmc_type":"sobol"} NopPruner {} 167 55 121
BoTorchSampler {"n_startup_trials":10} NopPruner {} 202 62 127
RandomSampler {} NopPruner {} 35 39 116
Benchmark results for sigopt >=10d: n_trials = 75, n_concurrency = 5
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 25 18 48
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 37 20 48
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 93 24 54
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 42 19 50
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 88 28 50
QMCSampler {"qmc_type":"sobol"} NopPruner {} 108 29 51
BoTorchSampler {"n_startup_trials":10} NopPruner {} 73 24 52
RandomSampler {} NopPruner {} 5 17 47
Benchmark results for all: n_trials = 75, n_concurrency = 5
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 134 53 126
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 221 63 131
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 209 54 134
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 188 51 131
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 170 56 127
QMCSampler {"qmc_type":"sobol"} NopPruner {} 171 56 128
BoTorchSampler {"n_startup_trials":10} NopPruner {} 202 63 134
RandomSampler {} NopPruner {} 39 40 123
Benchmark results for kurobako: n_trials = 75, n_concurrency = 10
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 20 7 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 13 6 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 0 4 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 0 4 8
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 6 4 8
QMCSampler {"qmc_type":"sobol"} NopPruner {} 0 4 8
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 4 8
RandomSampler {} NopPruner {} 0 4 8
Benchmark results for sigopt: n_trials = 75, n_concurrency = 10
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 169 59 131
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 182 63 128
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 127 56 130
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 141 57 124
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 155 61 127
QMCSampler {"qmc_type":"sobol"} NopPruner {} 99 56 126
BoTorchSampler {"n_startup_trials":10} NopPruner {} 184 66 127
RandomSampler {} NopPruner {} 26 45 124
Benchmark results for sigopt >=10d: n_trials = 75, n_concurrency = 10
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 57 20 54
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 88 26 51
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 43 21 52
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 65 23 49
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 70 28 51
QMCSampler {"qmc_type":"sobol"} NopPruner {} 71 27 50
BoTorchSampler {"n_startup_trials":10} NopPruner {} 67 23 50
RandomSampler {} NopPruner {} 20 18 50
Benchmark results for all: n_trials = 75, n_concurrency = 10
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 189 66 139
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 195 69 136
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 127 60 138
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 141 61 132
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 161 65 135
QMCSampler {"qmc_type":"sobol"} NopPruner {} 99 60 134
BoTorchSampler {"n_startup_trials":10} NopPruner {} 184 70 135
RandomSampler {} NopPruner {} 26 49 132
Benchmark results for kurobako: n_trials = 75, n_concurrency = 50
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 28 5 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 0 1 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 14 3 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 11 1 8
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 11 2 8
QMCSampler {"qmc_type":"sobol"} NopPruner {} 6 1 8
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 1 8
RandomSampler {} NopPruner {} 6 2 8
Benchmark results for sigopt: n_trials = 75, n_concurrency = 50
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 134 57 126
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 141 54 126
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 153 53 125
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 137 54 126
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 156 59 124
QMCSampler {"qmc_type":"sobol"} NopPruner {} 177 64 125
BoTorchSampler {"n_startup_trials":10} NopPruner {} 240 75 128
RandomSampler {} NopPruner {} 16 44 120
Benchmark results for sigopt >=10d: n_trials = 75, n_concurrency = 50
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 48 21 52
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 40 19 52
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 68 22 54
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 65 22 53
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 85 27 53
QMCSampler {"qmc_type":"sobol"} NopPruner {} 120 32 54
BoTorchSampler {"n_startup_trials":10} NopPruner {} 49 21 51
RandomSampler {} NopPruner {} 0 17 50
Benchmark results for all: n_trials = 75, n_concurrency = 50
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 162 62 134
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 141 55 134
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 167 56 133
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 148 55 134
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 167 61 132
QMCSampler {"qmc_type":"sobol"} NopPruner {} 183 65 133
BoTorchSampler {"n_startup_trials":10} NopPruner {} 240 76 136
RandomSampler {} NopPruner {} 22 46 128
Benchmark results for kurobako: n_trials = 100, n_concurrency = 5
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 0 1 7
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 42 7 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 19 2 7
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 18 1 8
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 9 1 8
QMCSampler {"qmc_type":"sobol"} NopPruner {} 4 1 7
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 1 7
RandomSampler {} NopPruner {} 4 1 7
Benchmark results for sigopt: n_trials = 100, n_concurrency = 5
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 122 48 119
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 180 55 122
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 192 53 127
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 183 48 123
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 167 55 117
QMCSampler {"qmc_type":"sobol"} NopPruner {} 176 56 121
BoTorchSampler {"n_startup_trials":10} NopPruner {} 207 63 125
RandomSampler {} NopPruner {} 33 39 114
Benchmark results for sigopt >=10d: n_trials = 100, n_concurrency = 5
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 25 17 48
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 36 19 48
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 95 24 54
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 48 18 50
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 88 27 50
QMCSampler {"qmc_type":"sobol"} NopPruner {} 107 28 51
BoTorchSampler {"n_startup_trials":10} NopPruner {} 80 24 52
RandomSampler {} NopPruner {} 5 16 47
Benchmark results for all: n_trials = 100, n_concurrency = 5
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 122 49 126
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 222 62 130
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 211 55 134
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 201 49 131
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 176 56 125
QMCSampler {"qmc_type":"sobol"} NopPruner {} 180 57 128
BoTorchSampler {"n_startup_trials":10} NopPruner {} 207 64 132
RandomSampler {} NopPruner {} 37 40 121
Benchmark results for kurobako: n_trials = 100, n_concurrency = 10
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 20 7 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 13 6 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 0 4 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 0 4 8
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 6 4 8
QMCSampler {"qmc_type":"sobol"} NopPruner {} 0 4 8
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 4 8
RandomSampler {} NopPruner {} 0 4 8
Benchmark results for sigopt: n_trials = 100, n_concurrency = 10
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 167 60 130
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 183 60 128
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 133 55 130
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 167 57 126
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 150 59 128
QMCSampler {"qmc_type":"sobol"} NopPruner {} 111 55 127
BoTorchSampler {"n_startup_trials":10} NopPruner {} 197 66 127
RandomSampler {} NopPruner {} 21 43 125
Benchmark results for sigopt >=10d: n_trials = 100, n_concurrency = 10
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 57 19 53
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 82 25 50
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 44 21 51
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 85 23 49
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 79 28 50
QMCSampler {"qmc_type":"sobol"} NopPruner {} 65 26 50
BoTorchSampler {"n_startup_trials":10} NopPruner {} 68 23 49
RandomSampler {} NopPruner {} 15 18 49
Benchmark results for all: n_trials = 100, n_concurrency = 10
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 187 67 138
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 196 66 136
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 133 59 138
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 167 61 134
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 156 63 136
QMCSampler {"qmc_type":"sobol"} NopPruner {} 111 59 135
BoTorchSampler {"n_startup_trials":10} NopPruner {} 197 70 135
RandomSampler {} NopPruner {} 21 47 133
Benchmark results for kurobako: n_trials = 100, n_concurrency = 50
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 28 5 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 0 1 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 14 3 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 11 1 8
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 11 2 8
QMCSampler {"qmc_type":"sobol"} NopPruner {} 6 1 8
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 1 8
RandomSampler {} NopPruner {} 6 2 8
Benchmark results for sigopt: n_trials = 100, n_concurrency = 50
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 134 52 120
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 154 51 119
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 159 53 123
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 164 55 124
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 162 60 119
QMCSampler {"qmc_type":"sobol"} NopPruner {} 166 58 122
BoTorchSampler {"n_startup_trials":10} NopPruner {} 247 75 125
RandomSampler {} NopPruner {} 13 43 113
Benchmark results for sigopt >=10d: n_trials = 100, n_concurrency = 50
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 42 17 47
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 54 18 45
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 75 21 52
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 86 23 50
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 92 27 50
QMCSampler {"qmc_type":"sobol"} NopPruner {} 109 27 52
BoTorchSampler {"n_startup_trials":10} NopPruner {} 52 21 48
RandomSampler {} NopPruner {} 4 16 44
Benchmark results for all: n_trials = 100, n_concurrency = 50
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 162 57 128
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 154 52 127
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 173 56 131
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 175 56 132
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 173 62 127
QMCSampler {"qmc_type":"sobol"} NopPruner {} 172 59 130
BoTorchSampler {"n_startup_trials":10} NopPruner {} 247 76 133
RandomSampler {} NopPruner {} 19 45 121
Benchmark results for kurobako: n_trials = 1000, n_concurrency = 5
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 0 0 7
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 42 6 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 19 1 7
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 25 1 8
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 5 0 8
QMCSampler {"qmc_type":"sobol"} NopPruner {} 0 0 7
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 0 7
RandomSampler {} NopPruner {} 0 0 7
Benchmark results for sigopt: n_trials = 1000, n_concurrency = 5
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 133 44 123
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 170 44 128
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 280 70 130
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 233 53 130
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 236 69 122
QMCSampler {"qmc_type":"sobol"} NopPruner {} 113 46 122
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 37 117
RandomSampler {} NopPruner {} 33 39 119
Benchmark results for sigopt >=10d: n_trials = 1000, n_concurrency = 5
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 32 17 51
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 50 20 52
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 133 30 56
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 59 19 53
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 97 29 52
QMCSampler {"qmc_type":"sobol"} NopPruner {} 68 22 53
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 16 50
RandomSampler {} NopPruner {} 12 17 51
Benchmark results for all: n_trials = 1000, n_concurrency = 5
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 133 44 130
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 212 50 136
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 299 71 137
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 258 54 138
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 241 69 130
QMCSampler {"qmc_type":"sobol"} NopPruner {} 113 46 129
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 37 124
RandomSampler {} NopPruner {} 33 39 126
Benchmark results for kurobako: n_trials = 1000, n_concurrency = 10
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 20 7 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 13 6 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 0 4 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 0 4 8
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 0 4 8
QMCSampler {"qmc_type":"sobol"} NopPruner {} 0 4 8
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 4 8
RandomSampler {} NopPruner {} 0 4 8
Benchmark results for sigopt: n_trials = 1000, n_concurrency = 10
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 135 53 130
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 124 54 127
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 219 70 132
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 201 62 126
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 226 75 130
QMCSampler {"qmc_type":"sobol"} NopPruner {} 65 54 125
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 46 122
RandomSampler {} NopPruner {} 44 48 125
Benchmark results for sigopt >=10d: n_trials = 1000, n_concurrency = 10
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 58 22 53
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 43 24 50
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 66 27 53
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 84 28 49
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 96 33 52
QMCSampler {"qmc_type":"sobol"} NopPruner {} 52 27 50
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 21 47
RandomSampler {} NopPruner {} 15 21 49
Benchmark results for all: n_trials = 1000, n_concurrency = 10
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 155 60 138
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 137 60 135
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 219 74 140
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 201 66 134
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 226 79 138
QMCSampler {"qmc_type":"sobol"} NopPruner {} 65 58 133
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 50 130
RandomSampler {} NopPruner {} 44 52 133
Benchmark results for kurobako: n_trials = 1000, n_concurrency = 50
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 28 5 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 0 1 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 14 3 8
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 12 1 8
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 7 2 8
QMCSampler {"qmc_type":"sobol"} NopPruner {} 0 1 8
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 1 8
RandomSampler {} NopPruner {} 0 1 8
Benchmark results for sigopt: n_trials = 1000, n_concurrency = 50
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 140 49 128
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 116 50 128
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 203 62 130
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 217 62 128
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 251 72 130
QMCSampler {"qmc_type":"sobol"} NopPruner {} 137 52 127
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 41 121
RandomSampler {} NopPruner {} 15 42 122
Benchmark results for sigopt >=10d: n_trials = 1000, n_concurrency = 50
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 60 15 50
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 44 17 49
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 98 21 54
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 95 22 51
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 108 26 52
QMCSampler {"qmc_type":"sobol"} NopPruner {} 84 22 52
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 13 47
RandomSampler {} NopPruner {} 5 13 48
Benchmark results for all: n_trials = 1000, n_concurrency = 50
Sampler Arguments of the sampler Pruner Arguments of the pruner Borda score First score Top3 score
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true} NopPruner {} 168 54 136
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":false} NopPruner {} 116 51 136
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":true} NopPruner {} 217 65 138
TPESampler {"n_startup_trials":10,"n_ei_candidates":24,"multivariate":false,"constant_liar":false} NopPruner {} 229 63 136
CmaEsSampler {"n_startup_trials":1,"restart_strategy":null} NopPruner {} 258 74 138
QMCSampler {"qmc_type":"sobol"} NopPruner {} 137 53 135
BoTorchSampler {"n_startup_trials":10} NopPruner {} 0 42 129
RandomSampler {} NopPruner {} 15 43 130
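
The tables above do not define the three score columns. As a reading aid, here is a minimal sketch under one common interpretation, which is an assumption and not taken from the benchmark tooling: for each problem the samplers are ranked (ties share a rank), the Borda score counts how many samplers were strictly worse, the First score counts problems where a sampler was ranked first, and the Top3 score counts problems where it was ranked third or better. The function name `score_rankings` and the example ranks are hypothetical.

```python
from typing import Dict, List


def score_rankings(rankings: List[Dict[str, int]]) -> Dict[str, Dict[str, int]]:
    """Aggregate per-problem ranks (1 = best, ties share a rank) into three scores.

    Assumed definitions (not confirmed by the source):
      borda: total number of samplers that ranked strictly worse, over all problems
      first: number of problems where the sampler has rank 1
      top3:  number of problems where the sampler has rank 3 or better
    """
    totals: Dict[str, Dict[str, int]] = {}
    for ranks in rankings:
        for name, rank in ranks.items():
            row = totals.setdefault(name, {"borda": 0, "first": 0, "top3": 0})
            row["borda"] += sum(1 for other in ranks.values() if other > rank)
            row["first"] += int(rank == 1)
            row["top3"] += int(rank <= 3)
    return totals


# Hypothetical ranks on two problems; the second problem has a tie for first place.
problems = [
    {"TPESampler": 1, "CmaEsSampler": 2, "RandomSampler": 3},
    {"TPESampler": 1, "CmaEsSampler": 1, "RandomSampler": 2},
]
scores = score_rankings(problems)
for name, row in scores.items():
    print(name, row)
```

Under this reading, ties explain why the First scores in a table can add up to more than the number of problems, and why a sampler can have a Borda score of 0 while still having a nonzero First score.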

Conclusion

We conducted an exhaustive benchmark experiment to investigate changing Optuna's default sampler as one of the development items for Optuna v3. The results were not conclusive enough to justify changing the default, but they were a great help in understanding how the existing algorithms behave. We hope they will also help users choose a sampler.

In the future, we will keep improving the benchmarking environment and use it to improve the behavior of the algorithms. In particular, we plan to improve TPESampler as soon as possible, so please stay tuned!

@HideakiImamura HideakiImamura removed the v3 Issue/PR for Optuna version 3. label Aug 9, 2022
@HideakiImamura
Member

Let me close this issue since we have stopped considering changing the default sampling algorithm. Note that we still have a lot of TODOs for Optuna's algorithms. The following are examples.

  • TPE performs poorly compared to the TPE in BOHB (#2871). We need to verify this with benchmarks.
  • Performance improvement of the constant_liar option of TPESampler. We need another round of benchmarking.
  • Verification of the group option of TPESampler. We need a benchmark problem for this option.
  • Consideration of the evidence required to change the default for the multivariate option. We may need more benchmark problems.
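
For readers who want to try the non-default options mentioned above on their own problem, here is a minimal sketch. It parses one "Arguments of the sampler" cell from the benchmark tables and passes it to `optuna.samplers.TPESampler`. The toy quadratic objective and the trial count are placeholders, not part of the benchmark, and the import is guarded so the snippet still runs where Optuna is not installed.

```python
import json

# Argument string copied from the TPESampler rows of the benchmark tables.
ARGS = '{"n_startup_trials":10,"n_ei_candidates":24,"multivariate":true,"constant_liar":true}'
kwargs = json.loads(ARGS)  # JSON true/false become Python True/False

try:
    import optuna
except ImportError:  # Optuna is optional for this sketch
    optuna = None


def objective(trial):
    # Placeholder objective: minimize (x - 2)^2 on [-5, 5].
    x = trial.suggest_float("x", -5.0, 5.0)
    return (x - 2.0) ** 2


if optuna is not None:
    # multivariate and constant_liar are opt-in; Optuna may emit an
    # ExperimentalWarning when they are enabled.
    sampler = optuna.samplers.TPESampler(**kwargs)
    study = optuna.create_study(sampler=sampler)
    study.optimize(objective, n_trials=20)
    print(study.best_params, study.best_value)
```

Since the default was left unchanged, these options stay opt-in, so it is worth comparing them against the default settings on your own task before relying on them.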
