Add QMC sampler #2423
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2423 +/- ##
==========================================
- Coverage 91.46% 90.97% -0.50%
==========================================
Files 134 135 +1
Lines 11277 11368 +91
==========================================
+ Hits 10315 10342 +27
- Misses 962 1026 +64
Continue to review full report at Codecov.
|
@hvy Could you review this PR? |
Thanks for the great PR. I'm excited to introduce this feature into Optuna. I have some early comments. Could you take a look?
IMO, the current design is fine. Looks reasonable as you explained.
And, could you merge the master branch to pass the CI? |
Co-authored-by: Hideaki Imamura <38826298+HideakiImamura@users.noreply.github.com>
Nice 👍 Was this with scrambled Halton? |
I used the default parameters (Halton without scrambling) in the above experiment. I retried the benchmark with and without the scramble parameter. Magenta refers to QMC (Halton, without scramble) and green refers to QMC (Halton, with scramble). 👀 💭
(1) Problem: HPO-Bench-Parkinson
(2) Problem: HPO-Bench-Protein
(3) Problem: HPO-Bench-Slice
(5) Problem: HPO-Bench-Naval
Solvers

ID: A
recipe: {
"name": "_TPESampler_NopPruner",
"optuna": {
"loglevel": "debug",
"sampler": "TPESampler",
"sampler_kwargs": "{\"multivariate\":true,\"constant_liar\":true}",
"pruner": "NopPruner",
"pruner_kwargs": "{}"
}
}

ID: B
recipe: {
"name": "_QMCSampler_NopPruner",
"optuna": {
"loglevel": "debug",
"sampler": "QMCSampler",
"sampler_kwargs": "{}",
"pruner": "NopPruner",
"pruner_kwargs": "{}"
}
}

ID: C
recipe: {
"name": "_RandomSampler_NopPruner",
"optuna": {
"loglevel": "debug",
"sampler": "RandomSampler",
"sampler_kwargs": "{}",
"pruner": "NopPruner",
"pruner_kwargs": "{}"
}
}

ID: D
recipe: {
"name": "_QMCSampler_NopPruner",
"optuna": {
"loglevel": "debug",
"sampler": "QMCSampler",
"sampler_kwargs": "{\"scramble\":true}",
"pruner": "NopPruner",
"pruner_kwargs": "{}"
}
} |
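As a side note on what "Halton without scramble" means in the recipes above: the unscrambled Halton sequence is built from van der Corput sequences in coprime prime bases, which is easy to sketch in pure Python. This is only an illustration of the construction, not the PR's implementation (which delegates QMC generation to scipy).

```python
def van_der_corput(n: int, base: int) -> float:
    """n-th element of the van der Corput sequence in the given base:
    reverse the base-`base` digits of n behind the radix point."""
    q, denom = 0.0, 1.0
    while n:
        n, rem = divmod(n, base)
        denom *= base
        q += rem / denom
    return q


def halton(n: int, dim: int) -> list:
    """First n points of the unscrambled Halton sequence in `dim` dimensions,
    using the first `dim` primes as bases (dim <= 6 in this sketch)."""
    bases = [2, 3, 5, 7, 11, 13][:dim]
    return [[van_der_corput(i, b) for b in bases] for i in range(n)]


points = halton(4, 2)
# The low-discrepancy structure is visible already in the first points:
# (0, 0), (1/2, 1/3), (1/4, 2/3), (3/4, 1/9)
```

Scrambling, as toggled by the scramble parameter in the recipes, randomizes these digits per run while preserving the low-discrepancy property.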
Co-authored-by: himkt <himkt@klis.tsukuba.ac.jp>
@himkt thanks for the details. I am a bit surprised that Halton is not performing better than a pure random sampler. TPE's results make sense, since it takes the objective values into account, but for random sampling that should not be the case. Or it might just be luck here, or a class of problems that does not depend much on the quality of the sampler. |
@himkt Thank you so much for the quick reviews and for taking the benchmark!! |
I thought that the experiment showed that the QMC sampler performed better than the Random sampler. 👀 If you're interested in benchmarking on your own, you can do it on GitHub Actions. More details are described here: |
I expected a clearer separation between Random and QMC.
Thanks. |
@himkt @tupui @HideakiImamura The benchmark results do not quite agree with the results I got in #1797, so I ran the benchmark on my own and got the same results as @himkt . The performance of
So I would like to take time to investigate if there is any issue with the current implementation before this PR is merged. |
Benchmark results:

Benchmark Result Report
Please refer to "A Strategy for Ranking Optimizers using Multiple Criteria" for the ranking strategy used in this report.

Overall Results
Individual Results

(1) Problem: HPO-Bench-Naval
(2) Problem: HPO-Bench-Protein
(3) Problem: HPO-Bench-Slice
(4) Problem: HPO-Bench-Parkinson
Solvers

ID: a0a8ebdb9195ab1a4a61a89a0e638d5531cbd42efff393357fb2876a67c0d4a9
recipe: {
"name": "_HaltonSampler_NopPruner",
"optuna": {
"loglevel": "debug",
"sampler": "QMCSampler",
"sampler_kwargs": "{}",
"pruner": "NopPruner",
"pruner_kwargs": "{}"
}
}
specification: {
"name": "_HaltonSampler_NopPruner",
"attrs": {
"github": "https://github.com/optuna/optuna",
"paper": "Akiba, Takuya, et al. \"Optuna: A next-generation hyperparameter optimization framework.\" Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 2019.",
"version": "optuna=2.9.0.dev0, kurobako-py=0.1.12"
},
"capabilities": [
"UNIFORM_CONTINUOUS",
"UNIFORM_DISCRETE",
"LOG_UNIFORM_CONTINUOUS",
"LOG_UNIFORM_DISCRETE",
"CATEGORICAL",
"CONDITIONAL",
"MULTI_OBJECTIVE",
"CONCURRENT"
]
}

ID: aaf9fdc7071a2414a4fd00be14adac59ea045675a2f51a90a7a9e24598660880
recipe: {
"name": "_HaltonSampler_Scrambled_NopPruner",
"optuna": {
"loglevel": "debug",
"sampler": "QMCSampler",
"sampler_kwargs": "{\"scramble\":true}",
"pruner": "NopPruner",
"pruner_kwargs": "{}"
}
}
specification: {
"name": "_HaltonSampler_Scrambled_NopPruner",
"attrs": {
"github": "https://github.com/optuna/optuna",
"paper": "Akiba, Takuya, et al. \"Optuna: A next-generation hyperparameter optimization framework.\" Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 2019.",
"version": "optuna=2.9.0.dev0, kurobako-py=0.1.12"
},
"capabilities": [
"UNIFORM_CONTINUOUS",
"UNIFORM_DISCRETE",
"LOG_UNIFORM_CONTINUOUS",
"LOG_UNIFORM_DISCRETE",
"CATEGORICAL",
"CONDITIONAL",
"MULTI_OBJECTIVE",
"CONCURRENT"
]
}

ID: a6cf01ef67fb72912cc9949362e3eb221eb6d9aebd9c3fbf0b1795e1d86ff4ac
recipe: {
"name": "_RandomSampler_NopPruner",
"optuna": {
"loglevel": "debug",
"sampler": "RandomSampler",
"sampler_kwargs": "{}",
"pruner": "NopPruner",
"pruner_kwargs": "{}"
}
}
specification: {
"name": "_RandomSampler_NopPruner",
"attrs": {
"github": "https://github.com/optuna/optuna",
"paper": "Akiba, Takuya, et al. \"Optuna: A next-generation hyperparameter optimization framework.\" Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 2019.",
"version": "optuna=2.9.0.dev0, kurobako-py=0.1.12"
},
"capabilities": [
"UNIFORM_CONTINUOUS",
"UNIFORM_DISCRETE",
"LOG_UNIFORM_CONTINUOUS",
"LOG_UNIFORM_DISCRETE",
"CATEGORICAL",
"CONDITIONAL",
"MULTI_OBJECTIVE",
"CONCURRENT"
]
}

ID: 784b72151fc1cb2abcc1e10ac3850a367016c94c7235b7dc7d00dfda3657ab70
recipe: {
"name": "_SobolSampler_NopPruner",
"optuna": {
"loglevel": "debug",
"sampler": "QMCSampler",
"sampler_kwargs": "{\"qmc_type\":\"sobol\"}",
"pruner": "NopPruner",
"pruner_kwargs": "{}"
}
}
specification: {
"name": "_SobolSampler_NopPruner",
"attrs": {
"github": "https://github.com/optuna/optuna",
"paper": "Akiba, Takuya, et al. \"Optuna: A next-generation hyperparameter optimization framework.\" Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 2019.",
"version": "optuna=2.9.0.dev0, kurobako-py=0.1.12"
},
"capabilities": [
"UNIFORM_CONTINUOUS",
"UNIFORM_DISCRETE",
"LOG_UNIFORM_CONTINUOUS",
"LOG_UNIFORM_DISCRETE",
"CATEGORICAL",
"CONDITIONAL",
"MULTI_OBJECTIVE",
"CONCURRENT"
]
}

ID: c91be229b42e50d37a0f711c038496f61e47ef3af2cf0216b0647475b6c7ac46
recipe: {
"name": "_SobolSampler_Scrambled_NopPruner",
"optuna": {
"loglevel": "debug",
"sampler": "QMCSampler",
"sampler_kwargs": "{\"scramble\":true,\"qmc_type\":\"sobol\"}",
"pruner": "NopPruner",
"pruner_kwargs": "{}"
}
}
specification: {
"name": "_SobolSampler_Scrambled_NopPruner",
"attrs": {
"github": "https://github.com/optuna/optuna",
"paper": "Akiba, Takuya, et al. \"Optuna: A next-generation hyperparameter optimization framework.\" Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 2019.",
"version": "optuna=2.9.0.dev0, kurobako-py=0.1.12"
},
"capabilities": [
"UNIFORM_CONTINUOUS",
"UNIFORM_DISCRETE",
"LOG_UNIFORM_CONTINUOUS",
"LOG_UNIFORM_DISCRETE",
"CATEGORICAL",
"CONDITIONAL",
"MULTI_OBJECTIVE",
"CONCURRENT"
]
}

ID: 3371bdcd6d1fc903733bc5d034ce94193d1e781126837c1067593e573ae6a246
recipe: {
"name": "_TPESampler_NopPruner",
"optuna": {
"loglevel": "debug",
"sampler": "TPESampler",
"sampler_kwargs": "{\"multivariate\":true,\"constant_liar\":true}",
"pruner": "NopPruner",
"pruner_kwargs": "{}"
}
}
specification: {
"name": "_TPESampler_NopPruner",
"attrs": {
"github": "https://github.com/optuna/optuna",
"paper": "Akiba, Takuya, et al. \"Optuna: A next-generation hyperparameter optimization framework.\" Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 2019.",
"version": "optuna=2.9.0.dev0, kurobako-py=0.1.12"
},
"capabilities": [
"UNIFORM_CONTINUOUS",
"UNIFORM_DISCRETE",
"LOG_UNIFORM_CONTINUOUS",
"LOG_UNIFORM_DISCRETE",
"CATEGORICAL",
"CONDITIONAL",
"MULTI_OBJECTIVE",
"CONCURRENT"
]
}

ID: 9036c1b3f7c0515da7c7c40331769664179e82d27fc497759034938ffa244075
recipe: {
"name": "_UnivariateTPESampler_NopPruner",
"optuna": {
"loglevel": "debug",
"sampler": "TPESampler",
"sampler_kwargs": "{}",
"pruner": "NopPruner",
"pruner_kwargs": "{}"
}
}
specification: {
"name": "_UnivariateTPESampler_NopPruner",
"attrs": {
"github": "https://github.com/optuna/optuna",
"paper": "Akiba, Takuya, et al. \"Optuna: A next-generation hyperparameter optimization framework.\" Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 2019.",
"version": "optuna=2.9.0.dev0, kurobako-py=0.1.12"
},
"capabilities": [
"UNIFORM_CONTINUOUS",
"UNIFORM_DISCRETE",
"LOG_UNIFORM_CONTINUOUS",
"LOG_UNIFORM_DISCRETE",
"CATEGORICAL",
"CONDITIONAL",
"MULTI_OBJECTIVE",
"CONCURRENT"
]
} |
Thanks @kstoneriv3. It's indeed strange for scrambling. For Sobol' as default: if in practice it does help, fine on my side, since we have the note in the doc (it should maybe become a warning if Sobol' were the default). For merging, if I may: if the user-facing API and functionalities are there, I would proceed and tackle the rest in another PR. It's already a large PR with a lot of comments, so it's hard to keep up. (Also, I would like to be more flexible on the sampler choice. It could allow an instance of |
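To illustrate the Sobol' caveat mentioned here (assuming scipy >= 1.7, whose qmc module the PR builds on): Sobol' points keep their balance properties only when the sample size is a power of two, which is why you need to know the number of points up front; scipy exposes random_base2 for exactly this. A minimal sketch:

```python
from scipy.stats import qmc

# Sobol' engine in 2 dimensions with scrambling enabled.
engine = qmc.Sobol(d=2, scramble=True, seed=0)

# random_base2(m) draws 2**m points at once, preserving the
# power-of-two balance property of the Sobol' sequence.
points = engine.random_base2(m=3)  # 8 points in [0, 1)^2
print(points.shape)  # (8, 2)
```

Drawing an arbitrary, non-power-of-two number of points (as an open-ended optimization loop does) degrades these balance properties, which is the trade-off behind the note in the doc.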
@tupui Thank you for the reply!
I believe
|
Yes it's less practical for that, although similar to Sobol' if you are strict (need to know |
Thank you @tupui for the insight, and sorry for my lack of algorithmic knowledge. @kstoneriv3 Thank you for your continuous effort.
I think the basic interface looks solid so it is reasonable to me. |
@tupui @himkt OK, I changed the default. As for the difference in the benchmark results, I found that it comes from calling the samplers via CLI arguments versus from a Python script (for details, see https://gist.github.com/kstoneriv3/496f8b2fa864193e538fe644202eca66). I will create an issue about this at optuna/kurobako-py#15. |
Please take a look @HideakiImamura. 👍 |
Thanks for the long-running PR @kstoneriv3, and thanks for the careful reviews @tupui @keisuke-umezawa @himkt. LGTM.
Hi @kstoneriv3, let me say again thank you so much for the great contribution! |
Sorry for the late reply! I summarized these points in #2625. |
Motivation

As in #1797, a Quasi-Monte Carlo (QMC) sampler should be supported as a good alternative to RandomSampler. I want to discuss the implementation details of the QMCSampler, as there exist several design choices for ensuring that it works in a distributed environment.

Description of the changes

QMCSampler is added at optuna/sampler/_qmc.py. Since scipy will introduce support for QMC in its 1.7.0 release, we use its implementation to generate several kinds of QMC sequences.

Design Choices around a Distributed Environment
To suggest QMC samples using distributed workers, we have to synchronize them, because QMC sequences are strictly ordered: points cannot be sampled independently, and each worker must know exactly how many QMC samples have been suggested so far. Since the storage of Optuna does not support atomic transactions (as far as I understand), we have to compromise somewhere, and thus there are several design choices. The possible designs I considered are as follows:
1. Store how many QMC points have been suggested so far as a single study-level attribute and update it every time a sample is suggested, so that each trial accesses only O(1) information.
2. Record in each trial which QMC sample (qmc_id) it is. Then, every time we suggest a new sample, check all past trials to see how many QMC points have been suggested so far. (The strategy above should be faster than this one, because it accesses O(1) information while this one accesses O(n_trials) information in each trial.)
3. Take Trial._trial_id or Trial.number as qmc_id + const. (This fails when different types of samplers are running simultaneously.)
4. Pre-sample points in QMCSampler. In each trial, randomly pick a point that has not yet been used from the pool of pre-sampled points. This is the same strategy as GridSampler. (This can be inefficient if we fail to sample the early points of the QMC sequence, since the earlier points of a QMC sequence are more important than the later ones; so it seems better to sample QMC points in order.)

TODO:
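The O(1)-counter strategy discussed above can be sketched as follows. This is only a toy sketch: a plain dict stands in for Optuna's study-level storage, and the key name "qmc:id" is made up for illustration. The unsynchronized read-modify-write is exactly where the lack of atomic transactions bites in a real distributed setting.

```python
def next_qmc_id(system_attrs: dict) -> int:
    """Return the index of the next QMC point to suggest and advance the counter.

    With multiple workers this read-modify-write is NOT atomic: two workers
    can read the same counter value and be handed the same qmc_id, which is
    the compromise discussed above.
    """
    qmc_id = system_attrs.get("qmc:id", 0)  # read the shared counter
    system_attrs["qmc:id"] = qmc_id + 1     # write it back (racy without transactions)
    return qmc_id


system_attrs = {}  # stand-in for study-level system attributes
ids = [next_qmc_id(system_attrs) for _ in range(3)]
print(ids)  # [0, 1, 2] -- each worker fast-forwards its QMC engine to its id
```

Given such an id, each worker would fast-forward its QMC engine past the first qmc_id points and emit the next one, keeping the sequence's strict order across workers up to the races noted above.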