add docs for automatic algorithm selection
jhj0411jhj committed Feb 29, 2024
1 parent 7c6ba44 commit 8fd8d1a
Showing 12 changed files with 276 additions and 19 deletions.
1 change: 1 addition & 0 deletions docs/en/advanced_usage/advanced_usage.rst
@@ -7,5 +7,6 @@ Advanced Usage

Problem Definition with Complex Search Space <complex_space>
Parallel Evaluation <parallel_evaluation>
Automatic Algorithm Selection <auto_algorithm_selection>
Early Stopping <early_stop>
Transfer Learning <transfer_learning>
117 changes: 117 additions & 0 deletions docs/en/advanced_usage/auto_algorithm_selection.md
@@ -0,0 +1,117 @@
# Automatic Algorithm Selection

Since a large number of Bayesian optimization algorithms have been proposed,
users may find it difficult to choose the proper algorithm for their tasks.
**OpenBox** provides an automatic algorithm selection mechanism that chooses a suitable
optimization algorithm for a given optimization task.

This document briefly introduces the algorithm selection mechanism,
including how to use it and the criteria it applies.


## Usage

To use the automatic algorithm selection mechanism,
set the following options to `'auto'` in `Advisor` or `Optimizer`:
+ `surrogate_type='auto'`
+ `acq_type='auto'`
+ `acq_optimizer_type='auto'`

\*The algorithm selection mechanism is enabled by default.

For example:
```python
from openbox import Advisor
advisor = Advisor(
    ...,
    surrogate_type='auto',
    acq_type='auto',
    acq_optimizer_type='auto',
)
```
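
The same `'auto'` options can be passed to `Optimizer`. Below is a minimal sketch, assuming a simple continuous space and the dict-style objective return used in the OpenBox quick start; the objective function and `task_id` here are illustrative:
```python
from openbox import Optimizer, space as sp

def sphere(config):
    # toy objective for illustration only
    x, y = config['x'], config['y']
    return {'objectives': [x ** 2 + y ** 2]}

space = sp.Space()
space.add_variables([sp.Real('x', -5.0, 5.0), sp.Real('y', -5.0, 5.0)])

opt = Optimizer(
    sphere,
    space,
    max_runs=50,
    surrogate_type='auto',      # selected automatically, e.g. 'gp'
    acq_type='auto',            # selected automatically, e.g. 'ei'
    acq_optimizer_type='auto',  # selected automatically, e.g. 'random_scipy'
    task_id='auto_selection_demo',
)
history = opt.run()
```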

After initialization, a log message will be printed to indicate the selected algorithms:
```
[BO auto selection] surrogate_type: gp. acq_type: ei. acq_optimizer_type: random_scipy.
```


## Algorithm Selection Criteria

The algorithm selection mechanism chooses algorithms based on the characteristics of the problem,
such as the dimensionality of the search space, the types of hyperparameters, and the number of objectives.
It is designed to provide consistently good performance across different problems.

The criteria for algorithm selection are derived from practical experience and experimental results.

### For Surrogate Model

Gaussian Process (GP, `'gp'`) vs. Probabilistic Random Forest (PRF, `'prf'`):
+ GP performs very well on mathematical functions.
+ GP performs well on spaces with continuous hyperparameters.
+ PRF is better when the space consists mostly of categorical hyperparameters.
+ GP is not suitable for high-dimensional problems.
+ PRF can be used for high-dimensional problems.
+ Computational cost: GP is $O(n^3)$ while PRF is $O(n \log n)$, where $n$ is the number of observations.

Currently, the algorithm selection mechanism selects the proper surrogate model based on the following criteria:
+ If there are 10 or more hyperparameters in the search space, `'prf'` is selected.
(If there are 100 or more hyperparameters, random search is used instead of BO.)
+ If there are more categorical hyperparameters than continuous hyperparameters, `'prf'` is selected.
+ Otherwise, `'gp'` is selected.
+ If `'gp'` is selected automatically and the number of observations later exceeds 300,
the model is switched to `'prf'`.
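
These rules can be condensed into a short sketch. This is an illustrative helper written for this document, not OpenBox's actual implementation; the function name and arguments are assumptions:
```python
def select_surrogate(n_categorical: int, n_continuous: int, n_obs: int) -> str:
    """Illustrative restatement of the surrogate selection rules above."""
    n_dim = n_categorical + n_continuous
    if n_dim >= 100:
        return 'random_search'  # too high-dimensional for BO
    if n_dim >= 10 or n_categorical > n_continuous:
        return 'prf'
    if n_obs > 300:
        return 'prf'  # GP's O(n^3) cost becomes prohibitive
    return 'gp'
```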

### For Acquisition Function

The acquisition function is chosen based on the type of the optimization task:

+ For single-objective optimization (SO), the widely used Expected Improvement (EI, `'ei'`) is selected.
+ For single-objective optimization with constraints (SOC), the Expected Improvement with Constraints (EIC, `'eic'`)
is selected.
+ For multi-objective optimization (MO):
+ If `num_objectives <= 4`, the Expected Hypervolume Improvement (EHVI, `'ehvi'`) is selected.
(The computational cost of EHVI grows exponentially with the number of objectives,
so it is not suitable for problems with too many objectives. Typically, the threshold is set to 4.)
+ Otherwise, the Max-value Entropy Search for Multi-Objective (MESMO, `'mesmo'`) is selected.
+ For multi-objective optimization with constraints (MOC):
+ If `num_objectives <= 4`, the Expected Hypervolume Improvement with Constraints (EHVIC, `'ehvic'`) is selected.
+ Otherwise, the Max-value Entropy Search for Multi-Objective with Constraints (MESMOC, `'mesmoc'`) is selected.
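
The mapping from task type to acquisition function can likewise be sketched as a small helper (illustrative only; OpenBox's internal code may differ):
```python
def select_acq(num_objectives: int, num_constraints: int) -> str:
    """Illustrative mapping from task type to acquisition function."""
    if num_objectives == 1:  # SO / SOC
        return 'ei' if num_constraints == 0 else 'eic'
    if num_constraints == 0:  # MO
        return 'ehvi' if num_objectives <= 4 else 'mesmo'
    return 'ehvic' if num_objectives <= 4 else 'mesmoc'  # MOC
```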

### For Acquisition Function Optimizer

Currently supported acquisition function optimizers:
+ `'local_random'`: Interleaved Local and Random Search.
+ `'random_scipy'`: Random Search and L-BFGS-B optimizer from SciPy.

The `'random_scipy'` optimizer currently requires all hyperparameters to be continuous.
It takes more time than `'local_random'` but is generally more effective.

The `'local_random'` optimizer is available for all scenarios.

Currently, the algorithm selection mechanism selects the proper acquisition function optimizer
based on the following criteria:
+ If categorical hyperparameters exist in the search space, `'local_random'` is selected.
+ Otherwise, `'random_scipy'` is selected.
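
In sketch form (again, an illustrative helper rather than the library's code):
```python
def select_acq_optimizer(n_categorical: int) -> str:
    """'random_scipy' currently requires an all-continuous space."""
    return 'local_random' if n_categorical > 0 else 'random_scipy'
```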


## Extending the Automatic Algorithm Selection Mechanism

To extend or customize the algorithm selection mechanism,
override the `algo_auto_selection` method of `Advisor`.

For example, to use Probability of Improvement (PI, `'pi'`) as the acquisition function
for single-objective optimization when there are more than 10 hyperparameters in the search space:

```python
from openbox import Advisor, logger

class MyAdvisor(Advisor):
    def algo_auto_selection(self):
        if self.acq_type == 'auto':
            n_dim = len(self.config_space.get_hyperparameters())
            if self.num_objectives == 1 and self.num_constraints == 0 and n_dim > 10:
                self.acq_type = 'pi'
                logger.info(f'[auto selection] acq_type: {self.acq_type}')
        # let the parent class resolve any remaining 'auto' options
        super().algo_auto_selection()
```
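
The subclass is then constructed exactly like `Advisor`. A hypothetical usage sketch, where `space` is assumed to be a search space with more than 10 hyperparameters:
```python
advisor = MyAdvisor(
    space,
    num_objectives=1,
    num_constraints=0,
    acq_type='auto',  # resolved to 'pi' by the custom rule above
)
```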
14 changes: 10 additions & 4 deletions docs/en/examples/multi_objective.md
@@ -52,9 +52,9 @@ opt = Optimizer(
    num_objectives=prob.num_objectives,
    num_constraints=0,
    max_runs=50,
    surrogate_type='gp',  # try using 'auto'!
    acq_type='ehvi',  # try using 'auto'!
    acq_optimizer_type='random_scipy',  # try using 'auto'!
    initial_runs=2*(dim+1),
    init_strategy='sobol',
    ref_point=prob.ref_point,
@@ -77,14 +77,20 @@ In this example, `num_objectives=2`.
+ `max_runs=50` means the optimization will take 50 rounds (optimizing the objective function 50 times).

+ `surrogate_type='gp'`. For mathematical problems, we suggest using Gaussian Process (`'gp'`) as the Bayesian surrogate
model. For practical problems such as hyperparameter optimization (HPO), we suggest using Random Forest (`'prf'`).
Set to `'auto'` to enable
{ref}`automatic algorithm selection <advanced_usage/auto_algorithm_selection:Automatic Algorithm Selection>`.

+ `acq_type='ehvi'`. Use **EHVI (Expected Hypervolume Improvement)** as the Bayesian acquisition function.
For problems with more than 3 objectives, please use **MESMO** (`'mesmo'`) or **USEMO** (`'usemo'`).
Set to `'auto'` to enable
{ref}`automatic algorithm selection <advanced_usage/auto_algorithm_selection:Automatic Algorithm Selection>`.

+ `acq_optimizer_type='random_scipy'`. For mathematical problems, we suggest using `'random_scipy'` as the
acquisition function optimizer. For practical problems such as hyperparameter optimization (HPO), we suggest
using `'local_random'`.
Set to `'auto'` to enable
{ref}`automatic algorithm selection <advanced_usage/auto_algorithm_selection:Automatic Algorithm Selection>`.

+ `initial_runs` sets how many configurations are suggested by `init_strategy` before the optimization loop.

12 changes: 9 additions & 3 deletions docs/en/examples/multi_objective_with_constraint.md
@@ -57,9 +57,9 @@ opt = Optimizer(
    num_objectives=prob.num_objectives,
    num_constraints=prob.num_constraints,
    max_runs=100,
    surrogate_type='gp',  # try using 'auto'!
    acq_type='ehvic',  # try using 'auto'!
    acq_optimizer_type='random_scipy',  # try using 'auto'!
    initial_runs=initial_runs,
    init_strategy='sobol',
    ref_point=prob.ref_point,
@@ -83,13 +83,19 @@ In this example, `num_objectives=2` and `num_constraints=2`.

+ `surrogate_type='gp'`. For mathematical problems, we suggest using Gaussian Process (`'gp'`) as the Bayesian surrogate
model. For practical problems such as hyperparameter optimization (HPO), we suggest using Random Forest (`'prf'`).
Set to `'auto'` to enable
{ref}`automatic algorithm selection <advanced_usage/auto_algorithm_selection:Automatic Algorithm Selection>`.

+ `acq_type='ehvic'`. Use **EHVIC (Expected Hypervolume Improvement with Constraints)**
as the Bayesian acquisition function.
Set to `'auto'` to enable
{ref}`automatic algorithm selection <advanced_usage/auto_algorithm_selection:Automatic Algorithm Selection>`.

+ `acq_optimizer_type='random_scipy'`. For mathematical problems, we suggest using `'random_scipy'` as the
acquisition function optimizer. For practical problems such as hyperparameter optimization (HPO), we suggest
using `'local_random'`.
Set to `'auto'` to enable
{ref}`automatic algorithm selection <advanced_usage/auto_algorithm_selection:Automatic Algorithm Selection>`.

+ `initial_runs` sets how many configurations are suggested by `init_strategy` before the optimization loop.

4 changes: 2 additions & 2 deletions docs/en/examples/single_objective_with_constraint.md
@@ -56,8 +56,8 @@ opt = Optimizer(
    space,
    num_constraints=1,
    num_objectives=1,
    surrogate_type='gp',  # try using 'auto'!
    acq_optimizer_type='random_scipy',  # try using 'auto'!
    max_runs=50,
    task_id='soc',
    # Have a try on the new HTML visualization feature!
4 changes: 3 additions & 1 deletion docs/en/quick_start/quick_start.md
@@ -66,7 +66,7 @@ opt = Optimizer(
    branin,
    space,
    max_runs=50,
    surrogate_type='gp',  # try using 'auto'!
    task_id='quick_start',
    # Have a try on the new HTML visualization feature!
    # visualization='advanced', # or 'basic'. For 'advanced', run 'pip install "openbox[extra]"' first
@@ -85,6 +85,8 @@ constraint.

+ `surrogate_type='gp'`. For mathematical problems, we suggest using Gaussian Process (`'gp'`) as the Bayesian surrogate
model. For practical problems such as hyperparameter optimization (HPO), we suggest using Random Forest (`'prf'`).
Set to `'auto'` to enable
{ref}`automatic algorithm selection <advanced_usage/auto_algorithm_selection:Automatic Algorithm Selection>`.

+ `task_id` is set to identify the optimization process.

1 change: 1 addition & 0 deletions docs/zh_CN/advanced_usage/advanced_usage.rst
@@ -7,5 +7,6 @@

Problem Definition with Complex Search Space <complex_space>
Parallel and Distributed Evaluation <parallel_evaluation>
Automatic Algorithm Selection <auto_algorithm_selection>
Early Stopping <early_stop>
Transfer Learning <transfer_learning>
117 changes: 117 additions & 0 deletions docs/zh_CN/advanced_usage/auto_algorithm_selection.md
@@ -0,0 +1,117 @@
# Automatic Algorithm Selection

Since a large number of Bayesian optimization algorithms have been proposed,
users may find it difficult to choose the proper algorithm for their tasks.
**OpenBox** provides an automatic algorithm selection mechanism that chooses a suitable
optimization algorithm for a given optimization task.

This document briefly introduces the algorithm selection mechanism,
including how to use it and the criteria it applies.


## Usage

To use the automatic algorithm selection mechanism,
set the following options to `'auto'` in `Advisor` or `Optimizer`:
+ `surrogate_type='auto'`
+ `acq_type='auto'`
+ `acq_optimizer_type='auto'`

\*The algorithm selection mechanism is enabled by default.

For example:
```python
from openbox import Advisor
advisor = Advisor(
    ...,
    surrogate_type='auto',
    acq_type='auto',
    acq_optimizer_type='auto',
)
```

After initialization, a log message will be printed to indicate the selected algorithms:
```
[BO auto selection] surrogate_type: gp. acq_type: ei. acq_optimizer_type: random_scipy.
```


## Algorithm Selection Criteria

The algorithm selection mechanism chooses algorithms based on the characteristics of the problem,
such as the dimensionality of the search space, the types of hyperparameters, and the number of objectives.
It is designed to provide consistently good performance across different problems.

The criteria for algorithm selection are derived from practical experience and experimental results.

### For Surrogate Model

Gaussian Process (GP, `'gp'`) vs. Probabilistic Random Forest (PRF, `'prf'`):
+ GP performs very well on mathematical functions.
+ GP performs well on spaces with continuous hyperparameters.
+ PRF is better when the space consists mostly of categorical hyperparameters.
+ GP is not suitable for high-dimensional problems.
+ PRF can be used for high-dimensional problems.
+ Computational cost: GP is $O(n^3)$ while PRF is $O(n \log n)$, where $n$ is the number of observations.

Currently, the algorithm selection mechanism selects the proper surrogate model based on the following criteria:
+ If there are 10 or more hyperparameters in the search space, `'prf'` is selected.
(If there are 100 or more hyperparameters, random search is used instead of BO.)
+ If there are more categorical hyperparameters than continuous hyperparameters, `'prf'` is selected.
+ Otherwise, `'gp'` is selected.
+ If `'gp'` is selected automatically and the number of observations later exceeds 300,
the model is switched to `'prf'`.

### For Acquisition Function

The acquisition function is chosen based on the type of the optimization task:

+ For single-objective optimization (SO), the widely used Expected Improvement (EI, `'ei'`) is selected.
+ For single-objective optimization with constraints (SOC), the Expected Improvement with Constraints (EIC, `'eic'`)
is selected.
+ For multi-objective optimization (MO):
+ If `num_objectives <= 4`, the Expected Hypervolume Improvement (EHVI, `'ehvi'`) is selected.
(The computational cost of EHVI grows exponentially with the number of objectives,
so it is not suitable for problems with too many objectives. Typically, the threshold is set to 4.)
+ Otherwise, the Max-value Entropy Search for Multi-Objective (MESMO, `'mesmo'`) is selected.
+ For multi-objective optimization with constraints (MOC):
+ If `num_objectives <= 4`, the Expected Hypervolume Improvement with Constraints (EHVIC, `'ehvic'`) is selected.
+ Otherwise, the Max-value Entropy Search for Multi-Objective with Constraints (MESMOC, `'mesmoc'`) is selected.

### For Acquisition Function Optimizer

Currently supported acquisition function optimizers:
+ `'local_random'`: Interleaved Local and Random Search.
+ `'random_scipy'`: Random Search and L-BFGS-B optimizer from SciPy.

The `'random_scipy'` optimizer currently requires all hyperparameters to be continuous.
It takes more time than `'local_random'` but is generally more effective.

The `'local_random'` optimizer is available for all scenarios.

Currently, the algorithm selection mechanism selects the proper acquisition function optimizer
based on the following criteria:
+ If categorical hyperparameters exist in the search space, `'local_random'` is selected.
+ Otherwise, `'random_scipy'` is selected.


## Extending the Automatic Algorithm Selection Mechanism

To extend or customize the algorithm selection mechanism,
override the `algo_auto_selection` method of `Advisor`.

For example, to use Probability of Improvement (PI, `'pi'`) as the acquisition function
for single-objective optimization when there are more than 10 hyperparameters in the search space:

```python
from openbox import Advisor, logger

class MyAdvisor(Advisor):
    def algo_auto_selection(self):
        if self.acq_type == 'auto':
            n_dim = len(self.config_space.get_hyperparameters())
            if self.num_objectives == 1 and self.num_constraints == 0 and n_dim > 10:
                self.acq_type = 'pi'
                logger.info(f'[auto selection] acq_type: {self.acq_type}')
        # let the parent class resolve any remaining 'auto' options
        super().algo_auto_selection()
```
9 changes: 6 additions & 3 deletions docs/zh_CN/examples/multi_objective.md
@@ -51,9 +51,9 @@ opt = Optimizer(
    num_objectives=prob.num_objectives,
    num_constraints=0,
    max_runs=50,
    surrogate_type='gp',  # try using 'auto'!
    acq_type='ehvi',  # try using 'auto'!
    acq_optimizer_type='random_scipy',  # try using 'auto'!
    initial_runs=2*(dim+1),
    init_strategy='sobol',
    ref_point=prob.ref_point,
@@ -75,12 +75,15 @@ opt.run()

+ `surrogate_type='gp'`. For mathematical problems, we recommend Gaussian Process (`'gp'`) as the Bayesian surrogate
model. For practical problems such as hyperparameter optimization (HPO), we recommend Random Forest (`'prf'`).
Set to `'auto'` to enable
{ref}`automatic algorithm selection <advanced_usage/auto_algorithm_selection:Automatic Algorithm Selection>`.

+ `acq_type='ehvi'`. Use **EHVI (Expected Hypervolume Improvement)** as the Bayesian acquisition function.
For problems with more than 3 objectives, please use **MESMO** (`'mesmo'`) or **USEMO** (`'usemo'`).
Set to `'auto'` to enable
{ref}`automatic algorithm selection <advanced_usage/auto_algorithm_selection:Automatic Algorithm Selection>`.

+ `acq_optimizer_type='random_scipy'`. For mathematical problems, we recommend `'random_scipy'` as the
acquisition function optimizer. For practical problems such as hyperparameter optimization (HPO), we recommend `'local_random'`.
Set to `'auto'` to enable
{ref}`automatic algorithm selection <advanced_usage/auto_algorithm_selection:Automatic Algorithm Selection>`.

+ `initial_runs` sets how many configurations are suggested by `init_strategy` before the optimization loop.
