Avoid numerical errors and overflow. #34

HideakiImamura
Contributor

Motivation

In the current implementation of the CMA class,

  • we compute the eigenvalues and eigenvectors with numpy.linalg.eigh and then take the square root of the eigenvalues, and
  • we call numpy.exp each time the tell function is called.

For the first, numerical error in numpy.linalg.eigh can make an eigenvalue negative, in which case its square root is NaN. For the second, numpy.exp may overflow when the variance of the objective function is very large. This PR aims to resolve both problems.

Description of the changes

  • If an eigenvalue becomes negative due to numerical error, it is corrected to a small positive number (CMA._epsilon).
  • If the argument to numpy.exp is large enough to cause an overflow, it is replaced with a value that does not overflow.
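
A minimal sketch of the two guards described above, not the actual diff. The function names, the EPS value standing in for CMA._epsilon, and the cap on the exponent are assumptions made for illustration only.

```python
import numpy as np

EPS = 1e-8  # assumed stand-in for CMA._epsilon; the real value is not stated here
MAX_LOG = np.log(np.finfo(np.float64).max)  # about 709.78, largest safe argument to exp


def safe_sqrt_eigen(C, eps=EPS):
    """Eigendecompose a symmetric matrix and return (B, D) with D = sqrt(eigenvalues)."""
    D2, B = np.linalg.eigh(C)
    # Round-off can make an eigenvalue negative; replace it before taking the root.
    D2 = np.where(D2 < 0, eps, D2)
    return B, np.sqrt(D2)


def safe_exp(x, max_log=MAX_LOG):
    """exp(x) with the argument capped so the result stays finite."""
    return float(np.exp(min(x, max_log)))


# A matrix constructed by hand to have one slightly negative eigenvalue.
C = np.diag([-1e-12, 0.5, 2.0])
B, D = safe_sqrt_eigen(C)
print(D)              # contains no NaN entries
print(safe_exp(1e6))  # finite instead of inf
```

Capping the argument before calling exp avoids the overflow warning entirely, which is why the sketch clips the input and not the output.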

@HideakiImamura
Contributor Author

Hello! I would like to get this PR merged so that this library and Optuna work together more smoothly.

Can I ask a question about CI? The kurobako test seems to be failing on the step Run gcloud config set project. I am sorry, I don't know much about gcloud. Can I get some advice on what is wrong? Maybe the kurobako version is too old.

@HideakiImamura
Contributor Author

HideakiImamura commented Jun 9, 2020

Here is a concrete example in which the current cmaes fails. I used kurobako 0.1.14, cmaes 0.5.0, and Optuna as updated in a recent PR.

# You can skip following 2 lines, if you already installed the hpobench dataset.
$ wget http://ml4aad.org/wp-content/uploads/2019/01/fcnet_tabular_benchmarks.tar.gz
$ tar xf fcnet_tabular_benchmarks.tar.gz
$ dataset=fcnet_tabular_benchmarks/fcnet_protein_structure_data.hdf5
$ MIN_RESOURCE=1
$ N_BRACKETS=4
$ REDUCTION_FACTOR=3
$ N_RUN=1
$ BUDGET=80
$ echo -n >| ./solvers.json
$ echo -n >| ./problems.json
$ echo -n >| ./studies.json
$ kurobako problem hpobench "${dataset}" | tee -a ./problems.json
$ kurobako solver --name cma-es-median optuna \
    --loglevel debug \
    --sampler cma-es \
    --pruner median \
    --hyperband-min-resource ${MIN_RESOURCE} \
    --hyperband-n-brackets ${N_BRACKETS} \
    --hyperband-reduction-factor ${REDUCTION_FACTOR} \
| tee -a ./solvers.json
$ kurobako studies \
  --solvers $(cat ./solvers.json) \
  --problems $(cat ./problems.json) \
  --repeats ${N_RUN} \
  --budget ${BUDGET} \
| tee  -a ./studies.json
$ cat ./studies.json | kurobako run --parallelism 10 > ./results/result.json

Running these commands in the shell produces the following error, along with some log messages.

(omitted because it's so long)
[D 2020-06-09 17:50:36,285] [CmaEsSampler Log] CmaEsSampler log for params end
[D 2020-06-09 17:50:36,285] [CmaEsSampler Log] mean   = [2.79463414 0.56219366 1.43780634 0.28109683 5.         4.72617885]
[D 2020-06-09 17:50:36,285] [CmaEsSampler Log] sigma  = 15.843962929446162
[D 2020-06-09 17:50:36,285] [CmaEsSampler Log] bounds = [[0. 3.]
 [0. 2.]
 [0. 2.]
 [0. 5.]
 [0. 5.]
 [0. 5.]]
[D 2020-06-09 17:50:36,286] [CmaEsSampler Log] _B     = [[ 0.44822352 -0.05596111 -0.57741063 -0.66806337  0.12619833 -0.01806424]
 [-0.0398787  -0.72305129  0.08343666 -0.14301021 -0.59751381 -0.30194399]
 [-0.20304854 -0.57099786  0.17406707 -0.09243633  0.76974364 -0.03719726]
 [-0.75900461 -0.05524707 -0.53656125 -0.07119865 -0.11199979  0.33963007]
 [-0.42367335  0.37994111  0.29051905 -0.55031919  0.01233164 -0.53734631]
 [-0.02625931  0.02488266 -0.50698782  0.46559047  0.14781722 -0.70924624]]
[D 2020-06-09 17:50:36,286] [CmaEsSampler Log] _D     = [       nan 0.5233484  0.81321789 0.99240755 1.32084933 1.95539179]
[D 2020-06-09 17:50:36,287] [CmaEsSampler Log] _C     = [[nan nan nan nan nan nan]
 [nan nan nan nan nan nan]
 [nan nan nan nan nan nan]
 [nan nan nan nan nan nan]
 [nan nan nan nan nan nan]
 [nan nan nan nan nan nan]]
Traceback (most recent call last):
  File "/var/folders/ft/zkf9p6x9289108nvdcz8rr_r0000gn/T/.tmpl9sKli", line 89, in <module>
    runner.run()
  File "/Users/mamu/Documents/Work/pfn/venv/lib/python3.6/site-packages/kurobako/solver/__init__.py", line 172, in run
    while self._run_once():
  File "/Users/mamu/Documents/Work/pfn/venv/lib/python3.6/site-packages/kurobako/solver/__init__.py", line 186, in _run_once
    self._handle_ask_call(message)
  File "/Users/mamu/Documents/Work/pfn/venv/lib/python3.6/site-packages/kurobako/solver/__init__.py", line 214, in _handle_ask_call
    trial = solver.ask(idg)
  File "/Users/mamu/Documents/Work/pfn/venv/lib/python3.6/site-packages/kurobako/solver/optuna.py", line 86, in ask
    trial = self._create_new_trial()
  File "/Users/mamu/Documents/Work/pfn/venv/lib/python3.6/site-packages/kurobako/solver/optuna.py", line 174, in _create_new_trial
    return optuna.trial.Trial(self._study, trial_id)
  File "/Users/mamu/Documents/Work/pfn/optuna/optuna/trial/_trial.py", line 65, in __init__
    self._init_relative_params()
  File "/Users/mamu/Documents/Work/pfn/optuna/optuna/trial/_trial.py", line 76, in _init_relative_params
    study, trial, self.relative_search_space
  File "/Users/mamu/Documents/Work/pfn/optuna/optuna/samplers/cmaes.py", line 218, in sample_relative
(ALL) [00:01:26] [STUDIES      1/1 100%] [ETA  0s] canceled

Error: InvalidInput (cause; EOF while parsing a value at line 1 column 0)
HISTORY:
  [0] at /Users/mamu/.cargo/registry/src/github.com-1ecc6299db9ec823/kurobako_core-0.1.7/src/epi/channel.rs:62 -- line=""
  [1] at /Users/mamu/.cargo/registry/src/github.com-1ecc6299db9ec823/kurobako_core-0.1.7/src/epi/solver/external_program.rs:164
  [2] at /Users/mamu/.cargo/registry/src/github.com-1ecc6299db9ec823/kurobako_core-0.1.7/src/epi/solver/embedded_script.rs:94
  [3] at /Users/mamu/.cargo/registry/src/github.com-1ecc6299db9ec823/kurobako_solvers-0.1.8/src/optuna.rs:276
  [4] at /Users/mamu/.cargo/registry/src/github.com-1ecc6299db9ec823/kurobako_core-0.1.7/src/solver.rs:176
  [5] at /Users/mamu/.cargo/registry/src/github.com-1ecc6299db9ec823/kurobako_core-0.1.7/src/solver.rs:176
  [6] at /Users/mamu/.cargo/registry/src/github.com-1ecc6299db9ec823/kurobako-0.1.14/src/runner.rs:314
  [7] at /Users/mamu/.cargo/registry/src/github.com-1ecc6299db9ec823/kurobako-0.1.14/src/runner.rs:271
  [8] at /Users/mamu/.cargo/registry/src/github.com-1ecc6299db9ec823/kurobako-0.1.14/src/runner.rs:357
  [9] at /Users/mamu/.cargo/registry/src/github.com-1ecc6299db9ec823/kurobako-0.1.14/src/runner.rs:136
  [10] at /Users/mamu/.cargo/registry/src/github.com-1ecc6299db9ec823/kurobako-0.1.14/src/runner.rs:145
  [11] at /Users/mamu/.cargo/registry/src/github.com-1ecc6299db9ec823/kurobako-0.1.14/src/main.rs:85

@c-bata
Collaborator

c-bata commented Jun 9, 2020

Thank you! I'll reproduce the problem.

Can I ask a question about CI? The kurobako test seems to be failing on the step Run gcloud config set project. I am sorry, I don't know much about gcloud. Can I get some advice on what is wrong? Maybe the kurobako version is too old.

GitHub Actions uploads the kurobako image to my Google Cloud Storage via the gsutil CLI, so a Google service account is required for authentication. However, due to a limitation of GitHub Actions, forked repositories do not seem to be able to access GitHub secrets (refs: https://github.community/t/allow-secrets-to-be-shared-with-forks-from-trusted-actions/16525).

Sorry, I have no idea how to resolve this issue for now. Please ignore the failures of the kurobako benchmark. I'll run kurobako on my laptop and paste the results here.

@c-bata
Collaborator

c-bata commented Jun 18, 2020

memo: I tried to reproduce the error with the following seed numbers (cmaes revision: 751a9bf).

seed number   success to reproduce
0             x
1             x
2             x
3             x
4             x
5             x
6             x
7
8
9

script to reproduce:
#!/bin/sh

set -ex

DATASET=fcnet_tabular_benchmarks/fcnet_protein_structure_data.hdf5
MIN_RESOURCE=1
N_BRACKETS=4
REDUCTION_FACTOR=3
N_RUN=1
BUDGET=80
SEED=${SEED:-0}

kurobako problem hpobench "${DATASET}" | tee -a ./problems.json
kurobako solver --name cma-es-median optuna \
    --loglevel error \
    --sampler cma-es \
    --pruner median \
    --hyperband-min-resource ${MIN_RESOURCE} \
    --hyperband-n-brackets ${N_BRACKETS} \
    --hyperband-reduction-factor ${REDUCTION_FACTOR} \
| tee -a ./solvers.json

kurobako studies \
  --solvers $(cat ./solvers.json) \
  --problems $(cat ./problems.json) \
  --repeats ${N_RUN} \
  --budget ${BUDGET} \
  --seed ${SEED} \
| tee  -a ./studies.json

cat ./studies.json | kurobako run --parallelism 10 > ./results.json

@HideakiImamura
Contributor Author

Hi @c-bata! Here is an example in which the current cmaes fails with Optuna. To show it, I created another PR that contains logging code and non-default CmaEsSampler parameters. Please use the following Optuna: optuna/optuna#1396

Also, kurobako-py v0.1.7 has been released to match the latest Optuna. Please use this version of kurobako.

script:

#!/bin/sh

echo -n >| ./solvers.json
echo -n >| ./problems.json
echo -n >| ./studies.json

set -ex

DATASET=fcnet_tabular_benchmarks/fcnet_protein_structure_data.hdf5
MIN_RESOURCE=1
N_BRACKETS=4
REDUCTION_FACTOR=3
N_RUN=1
BUDGET=80
SEED=${SEED:-0}

kurobako problem hpobench "${DATASET}" | tee -a ./problems.json
kurobako solver --name cma-es-median optuna \
    --loglevel error \
    --sampler cma-es \
    --pruner median \
    --hyperband-min-resource ${MIN_RESOURCE} \
    --hyperband-n-brackets ${N_BRACKETS} \
    --hyperband-reduction-factor ${REDUCTION_FACTOR} \
| tee -a ./solvers.json

kurobako studies \
  --solvers $(cat ./solvers.json) \
  --problems $(cat ./problems.json) \
  --repeats ${N_RUN} \
  --budget ${BUDGET} \
  --seed ${SEED} \
| tee  -a ./studies.json

cat ./studies.json | kurobako run --parallelism 10 > ./results.json

@c-bata
Collaborator

c-bata commented Jun 20, 2020

Thank you! I can reproduce the error now.

error log
$ pip install -U git+https://github.com/HideakiImamura/optuna.git@debug/cmaes-for-debug
$ pip install -U kurobako
$ ./reproduce-error.sh
...
[I 2020-06-20 17:23:35,613] Trial 352 finished with value: 0.2609153687953949 and parameters: {'activation_fn_1': 'relu', 'activation_fn_2': 'relu', 'batch_size': 2, 'dropout_1': 0, 'dropout_2': 0, 'init_lr': 0, 'lr_schedule': 'const',(ALL) [00:00:00] [STUDIES      0/819   0%] [ETA  0s] 
(STUDY) [00:01:25] [STEPS   4030/8000  50%] [ETA  1m] "cma-es-median" "HPO-Bench-Protein"(STUDY) [00:01:25] [STEPS   4031/8000  50%] [ETA  1m] "cma-es-median" "HPO-Bench-Protein"
(STUDY) [00:01:25] [STEPS   4031/8000  50%] [ETA  1m] "cma-es-median" "HPO-Bench-Protein"(STUDY) [00:01:25] [STEPS   4031/8000  50%] [ETA  1m] "cma-es-median" "HPO-Bench-Protein"
(STUDY) [00:01:25] [STEPS   4031/8000  50%] [ETA  1m] "cma-es-median" "HPO-Bench-Protein"(STUDY) [00:01:25] [STEPS   4031/8000  50%] [ETA  1m] "cma-es-median" "HPO-Bench-Protein"
(STUDY) [00:01:25] [STEPS   4031/8000  50%] [ETA  1m] "cma-es-median" "HPO-Bench-Protein"(STUDY) [00:01:25] [STEPS   4031/8000  50%] [ETA  1m] "cma-es-median" "HPO-Bench-Protein"
(STUDY) [00:01:25] [STEPS   3463/8000  43%] [ETA  1m] "cma-es-median" "HPO-Bench-Protein"(STUDY) [00:01:24] [STEPS   4030/8000  50%] [ETA  1m] "cma-es-median" "HPO-Bench-Protein"
/Users/a14737/src/github.com/CyberAgent/cmaes/cmaes/cma.py:222: RuntimeWarning: invalid value encountered in sqrt
  D = np.sqrt(D2)
/Users/a14737/src/github.com/CyberAgent/cmaes/cmaes/cma.py:222: RuntimeWarning: invalid value encountered in sqrt
  D = np.sqrt(D2)
/Users/a14737/src/github.com/CyberAgent/cmaes/cmaes/cma.py:222: RuntimeWarning: invalid value encountered in sqrt
  D = np.sqrt(D2)
[E 2020-06-20 17:23:59,822] Nan is detected: [nan nan nan nan nan nan]
[E 2020-06-20 17:23:59,822] Nan is detected: [nan nan nan nan nan nan]
Traceback (most recent call last):
  File "/var/folders/9q/c1wp98sd4110kvnb89ycs7vj6xh7ks/T/.tmporwfue", line 89, in <module>
Traceback (most recent call last):
  File "/var/folders/9q/c1wp98sd4110kvnb89ycs7vj6xh7ks/T/.tmporwfue", line 89, in <module>
/Users/a14737/src/github.com/CyberAgent/cmaes/cmaes/cma.py:222: RuntimeWarning: invalid value encountered in sqrt
  D = np.sqrt(D2)
    runner.run()
    runner.run()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 172, in run
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 172, in run
    while self._run_once():
    while self._run_once():
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 186, in _run_once
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 186, in    self._handle_ask_call(message)
[E 2020-06-20 17:23:59,832] Nan is detected: [nan nan nan nan nan nan]
 _run_once
/Users/a14737/src/github.com/CyberAgent/cmaes/cmaes/cma.py:222: RuntimeWarning: invalid value encountered in sqrt
  D = np.sqrt(D2)
[E 2020-06-20 17:23:59,828] Nan is detected: [nan nan nan nan nan nan]
/Users/a14737/src/github.com/CyberAgent/cmaes/cmaes/cma.py:222: RuntimeWarning: invalid value encountered in sqrt
  D = np.sqrt(D2)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 214, in _handle_ask_call
    trial = solver.ask(idg)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/optuna.py", line 86, in ask
    self._handle_ask_call(message)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 214, in _handle_ask_call
Traceback (most recent call last):
  File "/    trial = solver.ask(idg)
var/folders/9q/c1wp98sd4110kvnb89ycs7vj6xh7ks/T/.tmporwfue", line 89, in <module>
Traceback (most recent call last):
/Users/a14737/src/github.com/CyberAgent/cmaes/cmaes/cma.py:222: RuntimeWarning: invalid value encountered in sqrt
  D = np.sqrt(D2)
    trial = self._create_new_trial()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/optuna.py", line 86, in ask
  File "/var/folders/9q/c1wp98sd4110kvnb89ycs7vj6xh7ks/T/.tmporwfue", line 89, in <module>
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/optuna.py", line 172, in _create_new_trial
    trial = self._create_new_trial()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/optuna.py", line 172, in _create_new_trial
    return optuna.trial.Trial(self._study, trial_id)
    runner.run()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/trial/_trial.py", line 67, in __init__
    runner.run()
    return optuna.trial.Trial(self._study, trial_id)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 172, in run
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/trial/_trial.py", line 67, in __init__
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 172, in run
    while self._run_once():
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 186, in _run_once
    while self._run_once():
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 186, in _run_once
/Users/a14737/src/github.com/CyberAgent/cmaes/cmaes/cma.py:222: RuntimeWarning: invalid value encountered in sqrt
  D = np.sqrt(D2)
    self._handle_ask_call(message)
    self._handle_ask_call(message)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 214, in _handle_ask_call
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 214, in _handle_ask_call
    self._init_relative_params()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/trial/_trial.py", line 77, in _init_relative_params
    self._init_relative_params()
    trial = solver.ask(idg)
    trial = solver.ask(idg)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/optuna.py", line 86, in ask
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/optuna.py", line 86, in ask
    self.relative_params = self.study.sampler.sample_relative(
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/samplers/_cmaes.py", line 245, in sample_relative
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/trial/_trial.py", line 77, in _init_relative_params
    trial = self._create_new_trial()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/optuna.py", line 172, in _create_new_trial
    trial = self._create_new_trial()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/optuna.py", line 172, in _create_new_trial
    self.relative_params = self.study.sampler.sample_relative(
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/samplers/_cmaes.py", line 245, in sample_relative
    return optuna.trial.Trial(self._study, trial_id)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/trial/_trial.py", line 67, in __init__
    return optuna.trial.Trial(self._study, trial_id)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/trial/_trial.py", line 67, in __init__
    self._init_relative_params()
    raise ValueError("Nan is detected: {}".format(params))
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/trial/_trial.py", line 77, in _init_relative_params
ValueError: Nan is detected: [nan nan nan nan nan nan]
    raise ValueError("Nan is detected: {}".format(params))
    self._init_relative_params()
ValueError: Nan is detected: [nan nan nan nan nan nan]
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/trial/_trial.py", line 77, in _init_relative_params
    self.relative_params = self.study.sampler.sample_relative(
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/samplers/_cmaes.py", line 245, in sample_relative
    self.relative_params = self.study.sampler.sample_relative(
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/samplers/_cmaes.py", line 245, in sample_relative
    raise ValueError("Nan is detected: {}".format(params))
ValueError: Nan is detected: [nan nan nan nan nan nan]
    raise ValueError("Nan is detected: {}".format(params))
ValueError: Nan is detected: [nan nan nan nan nan nan]
[E 2020-06-20 17:23:59,847] Nan is detected: [nan nan nan nan nan nan]
[E 2020-06-20 17:23:59,847] Nan is detected: [nan nan nan nan nan nan]
Traceback (most recent call last):
  File "/var/folders/9q/c1wp98sd4110kvnb89ycs7vj6xh7ks/T/.tmporwfue", line 89, in <module>
Traceback (most recent call last):
  File "/var/folders/9q/c1wp98sd4110kvnb89ycs7vj6xh7ks/T/.tmporwfue", line 89, in <module>
[E 2020-06-20 17:23:59,849] Nan is detected: [nan nan nan nan nan nan]
    runner.run()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 172, in run
    runner.run()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 172, in run
    while self._run_once():
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 186, in _run_once
    while self._run_once():
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 186, in _run_once
Traceback (most recent call last):
  File "/var/folders/9q/c1wp98sd4110kvnb89ycs7vj6xh7ks/T/.tmporwfue", line 89, in <module>
    self._handle_ask_call(message)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 214, in _handle_ask_call
    self._handle_ask_call(message)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 214, in _handle_ask_call
    runner.run()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 172, in run
    trial = solver.ask(idg)
    trial = solver.ask(idg)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/optuna.py", line 86, in ask
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/optuna.py", line 86, in ask
    while self._run_once():
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 186, in _run_once
    trial = self._create_new_trial()
    trial = self._create_new_trial()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/optuna.py", line 172, in _create_new_trial
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/optuna.py", line 172, in _create_new_trial
    return optuna.trial.Trial(self._study, trial_id)
    self._handle_ask_call(message)
    return optuna.trial.Trial(self._study, trial_id)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/trial/_trial.py", line 67, in __init__
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 214, in _handle_ask_call
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/trial/_trial.py", line 67, in __init__
    self._init_relative_params()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/trial/_trial.py", line 77, in _init_relative_params
    self._init_relative_params()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/trial/_trial.py", line 77, in _init_relative_params
    trial = solver.ask(idg)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/optuna.py", line 86, in ask
    self.relative_params = self.study.sampler.sample_relative(
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/samplers/_cmaes.py", line 245, in sample_relative
    self.relative_params = self.study.sampler.sample_relative(
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/samplers/_cmaes.py", line 245, in sample_relative
    trial = self._create_new_trial()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/optuna.py", line 172, in _create_new_trial
    raise ValueError("Nan is detected: {}".format(params))
ValueError: Nan is detected: [nan nan nan nan nan nan]
    raise ValueError("Nan is detected: {}".format(params))
[E 2020-06-20 17:23:59,853] Nan is detected: [nan nan nan nan nan nan]
ValueError: Nan is detected: [nan nan nan nan nan nan]
    return optuna.trial.Trial(self._study, trial_id)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/trial/_trial.py", line 67, in __init__
    self._init_relative_params()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/trial/_trial.py", line 77, in _init_relative_params
    self.relative_params = self.study.sampler.sample_relative(
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/samplers/_cmaes.py", line 245, in sample_relative
Traceback (most recent call last):
  File "/var/folders/9q/c1wp98sd4110kvnb89ycs7vj6xh7ks/T/.tmporwfue", line 89, in <module>
    raise ValueError("Nan is detected: {}".format(params))
ValueError: Nan is detected: [nan nan nan nan nan nan]
    runner.run()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 172, in run
    while self._run_once():
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 186, in _run_once
    self._handle_ask_call(message)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 214, in _handle_ask_call
    trial = solver.ask(idg)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/optuna.py", line 86, in ask
    trial = self._create_new_trial()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/optuna.py", line 172, in _create_new_trial
    return optuna.trial.Trial(self._study, trial_id)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/trial/_trial.py", line 67, in __init__
/Users/a14737/src/github.com/CyberAgent/cmaes/cmaes/cma.py:222: RuntimeWarning: invalid value encountered in sqrt
  D = np.sqrt(D2)
    self._init_relative_params()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/trial/_trial.py", line 77, in _init_relative_params
    self.relative_params = self.study.sampler.sample_relative(
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/samplers/_cmaes.py", line 245, in sample_relative
    raise ValueError("Nan is detected: {}".format(params))
ValueError: Nan is detected: [nan nan nan nan nan nan]
[E 2020-06-20 17:23:59,865] Nan is detected: [nan nan nan nan nan nan]
Traceback (most recent call last):
  File "/var/folders/9q/c1wp98sd4110kvnb89ycs7vj6xh7ks/T/.tmporwfue", line 89, in <module>
    runner.run()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 172, in run
    while self._run_once():
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 186, in _run_once
    self._handle_ask_call(message)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 214, in _handle_ask_call
    trial = solver.ask(idg)
(ALL) [00:01:28] [STUDIES    825/819 100%] [ETA  0s] canceled
(STUDY) [00:01:35] [STEPS   3522/8000  44%] [ETA  2m] "cma-es-median" "HPO-Bench-Protein"
(ALL) [00:01:28] [STUDIES    825/819 100%] [ETA  0s] canceled(STUDY) [00:01:44] [STEPS   3714/8000  46%] [ETA  2m] "cma-es-median" "HPO-Bench-Protein"
(ALL) [00:01:28] [STUDIES    825/819 100%] [ETA  0s] canceled(STUDY) [00:01:46] [STEPS   3799/8000  47%] [ETA  1m] "cma-es-median" "HPO-Bench-Protein"
/Users/a14737/src/github.com/CyberAgent/cmaes/cmaes/cma.py:222: RuntimeWarning: invalid value encountered in sqrt
  D = np.sqrt(D2)
[E 2020-06-20 17:24:20,025] Nan is detected: [nan nan nan nan nan nan]
Traceback (most recent call last):
  File "/var/folders/9q/c1wp98sd4110kvnb89ycs7vj6xh7ks/T/.tmporwfue", line 89, in <module>
    runner.run()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 172, in run
    while self._run_once():
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 186, in _run_once
    self._handle_ask_call(message)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 214, in _handle_ask_call
    trial = solver.ask(idg)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/optuna.py", line 86, in ask
    trial = self._create_new_trial()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/optuna.py", line 172, in _create_new_trial
    return optuna.trial.Trial(self._study, trial_id)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/trial/_trial.py", line 67, in __init__
    self._init_relative_params()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/trial/_trial.py", line 77, in _init_relative_params
    self.relative_params = self.study.sampler.sample_relative(
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/samplers/_cmaes.py", line 245, in sample_relative
(ALL) [00:01:28] [STUDIES    825/819 100%] [ETA  0s] canceled

Error: InvalidInput (cause; EOF while parsing a value at line 1 column 0)
HISTORY:
  [0] at kurobako_core/src/epi/channel.rs:62 -- line=""
  [1] at kurobako_core/src/epi/solver/external_program.rs:164
  [2] at kurobako_core/src/epi/solver/embedded_script.rs:94
  [3] at kurobako_solvers/src/optuna.rs:276
  [4] at kurobako_core/src/solver.rs:176
  [5] at kurobako_core/src/solver.rs:176
  [6] at src/runner.rs:314
  [7] at src/runner.rs:271
  [8] at src/runner.rs:357
  [9] at src/runner.rs:136
  [10] at src/runner.rs:145
  [11] at src/main.rs:85

@c-bata
Collaborator

c-bata commented Jun 20, 2020

Benchmark results

[Figures: benchmark plots for the Himmelblau, Rosenbrock, and six-hump camel functions]

Collaborator

@c-bata c-bata left a comment

Hi @HideakiImamura! I successfully reproduced the error, and this PR does not affect optimization efficiency. Basically LGTM, but I have a question about a magic number.

cmaes/cma.py Outdated
Comment on lines 287 to 288
if _log_sigma > 10 ** 2.8:
    self._sigma = np.exp(10 ** 2.8)
Collaborator

Could you tell me the meaning of this magic number?

Contributor Author

Thanks for the comment. I just picked this magic number to avoid the overflow. It would be more reasonable to use the maximum float value instead. What do you think?
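
For reference, the arithmetic behind the threshold can be checked in a few lines. This is only an illustration, and the variable names are made up here.

```python
import sys

import numpy as np

magic = 10 ** 2.8                   # threshold used in the reviewed lines
limit = np.log(sys.float_info.max)  # largest argument for which exp stays finite in float64

print(magic)          # about 630.96
print(limit)          # about 709.78
print(np.exp(magic))  # finite, on the order of 1e274
```

So the threshold sits below the overflow point with some margin, but the margin itself is arbitrary.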

Collaborator

@c-bata c-bata Jun 21, 2020

Agree 👍

And I guess you can refactor these changes like self._sigma = np.exp(min(_log_sigma, ...)).

Collaborator

To avoid overflow, it seems this needs to be refactored like:

self._sigma = min(np.exp(_log_sigma), sys.float_info.max)
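
A small sketch comparing the two placements of the clip discussed in this thread. The helper names and the cap value are hypothetical, and this is not meant as a diagnosis of any specific failure. Both forms keep a large input finite, but the second still evaluates exp on the unclipped value (NumPy warns about the overflow, suppressed below), and neither form changes a NaN input.

```python
import math
import sys
import warnings

import numpy as np

CAP = np.log(sys.float_info.max)  # assumed cap on the exponent, about 709.78


def clip_before_exp(log_sigma, cap=CAP):
    """Clip the exponent first, so exp never sees an overflowing argument."""
    return float(np.exp(min(log_sigma, cap)))


def clip_after_exp(log_sigma):
    """Evaluate exp first, then clip the (possibly infinite) result."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", RuntimeWarning)  # hide the overflow warning
        return float(min(np.exp(log_sigma), sys.float_info.max))


print(math.isfinite(clip_before_exp(1000.0)))
print(clip_after_exp(1000.0) == sys.float_info.max)
print(math.isnan(clip_before_exp(float("nan"))))
print(math.isnan(clip_after_exp(float("nan"))))
```

If NaN can reach this point, a separate check would be needed, since the built-in min returns its first argument when the comparison with NaN is false.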

Contributor Author

Thanks for the suggestion! I've addressed it.

Collaborator

@c-bata Jun 21, 2020

I'm sorry. With my suggestion, the error is raised again.

...
ValueError: Nan is detected: [nan nan nan nan nan nan]
/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/cmaes/cma.py:222: RuntimeWarning: invalid value encountered in sqrt
  D = np.sqrt(D2)
[E 2020-06-22 05:06:10,985] Nan is detected: [nan nan nan nan nan nan]
Traceback (most recent call last):
  File "/var/folders/9q/c1wp98sd4110kvnb89ycs7vj6xh7ks/T/.tmpLv0wtD", line 89, in <module>
    runner.run()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 172, in run
    while self._run_once():
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 186, in _run_once
    self._handle_ask_call(message)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 214, in _handle_ask_call
    trial = solver.ask(idg)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/optuna.py", line 86, in ask
    trial = self._create_new_trial()
(ALL) [00:01:37] [STUDIES   1787/1785 100%] [ETA  0s] canceled
(STUDY) [00:01:35] [STEPS   4033/8000  50%] [ETA  1m] "cma-es-median" "HPO-Bench-Protein"
(ALL) [00:01:37] [STUDIES   1788/1785 100%] [ETA  0s] canceled
...

Hmm...

Collaborator

Hmm... I checked out revision 58e67ec:

$ git checkout 58e67ec
$ git status
HEAD detached at 58e67ec
Untracked files:
  (use "git add <file>..." to include in what will be committed)
        fcnet_tabular_benchmarks/
        problems.json
        reproduce.sh
        results.json
        solvers.json
        studies.json

nothing added to commit but untracked files present (use "git add" to track)
$ python -m pip freeze | grep kurobako
kurobako==0.1.7
$ ./reproduce-error.sh
...
ValueError: Nan is detected: [nan nan nan nan nan nan]
/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/cmaes/cma.py:222: RuntimeWarning: invalid value encountered in sqrt
  D = np.sqrt(D2)
[E 2020-06-22 05:11:06,425] Nan is detected: [nan nan nan nan nan nan]
Traceback (most recent call last):
  File "/var/folders/9q/c1wp98sd4110kvnb89ycs7vj6xh7ks/T/.tmpMl1HVg", line 89, in <module>
    runner.run()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 172, in run
    while self._run_once():
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 186, in _run_once
    self._handle_ask_call(message)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 214, in _handle_ask_call
    trial = solver.ask(idg)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/optuna.py", line 86, in ask
    trial = self._create_new_trial()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/optuna.py", line 172, in _create_new_trial
    return optuna.trial.Trial(self._study, trial_id)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/trial/_trial.py", line 67, in __init__
    self._init_relative_params()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/trial/_trial.py", line 77, in _init_relative_params
    self.relative_params = self.study.sampler.sample_relative(
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/samplers/_cmaes.py", line 245, in sample_relative
    raise ValueError("Nan is detected: {}".format(params))
ValueError: Nan is detected: [nan nan nan nan nan nan]
/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/cmaes/cma.py:222: RuntimeWarning: invalid value encountered in sqrt
  D = np.sqrt(D2)
[E 2020-06-22 05:11:06,613] Nan is detected: [nan nan nan nan nan nan]
Traceback (most recent call last):
  File "/var/folders/9q/c1wp98sd4110kvnb89ycs7vj6xh7ks/T/.tmpMl1HVg", line 89, in <module>
    runner.run()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 172, in run
    while self._run_once():
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 186, in _run_once
    self._handle_ask_call(message)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 214, in _handle_ask_call
    trial = solver.ask(idg)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/optuna.py", line 86, in ask
(ALL) [00:01:43] [STUDIES   2117/2109 100%] [ETA  0s] canceled
(STUDY) [00:01:41] [STEPS   4097/8000  51%] [ETA  1m] "cma-es-median" "HPO-Bench-Protein"
/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/cmaes/cma.py:222: RuntimeWarning: invalid value encountered in sqrt
  D = np.sqrt(D2)
[E 2020-06-22 05:11:07,187] Nan is detected: [nan nan nan nan nan nan]
Traceback (most recent call last):
  File "/var/folders/9q/c1wp98sd4110kvnb89ycs7vj6xh7ks/T/.tmpMl1HVg", line 89, in <module>
    runner.run()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 172, in run
    while self._run_once():
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 186, in _run_once
    self._handle_ask_call(message)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/__init__.py", line 214, in _handle_ask_call
    trial = solver.ask(idg)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/optuna.py", line 86, in ask
    trial = self._create_new_trial()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/kurobako/solver/optuna.py", line 172, in _create_new_trial
    return optuna.trial.Trial(self._study, trial_id)
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/trial/_trial.py", line 67, in __init__
    self._init_relative_params()
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/trial/_trial.py", line 77, in _init_relative_params
    self.relative_params = self.study.sampler.sample_relative(
  File "/Users/a14737/src/github.com/CyberAgent/cmaes/venv/lib/python3.8/site-packages/optuna/samplers/_cmaes.py", line 245, in sample_relative
(ALL) [00:01:43] [STUDIES   2117/2109 100%] [ETA  0s] canceled

Error: InvalidInput (cause; EOF while parsing a value at line 1 column 0)
HISTORY:
  [0] at kurobako_core/src/epi/channel.rs:62 -- line=""
  [1] at kurobako_core/src/epi/solver/external_program.rs:164
  [2] at kurobako_core/src/epi/solver/embedded_script.rs:94
  [3] at kurobako_solvers/src/optuna.rs:276
  [4] at kurobako_core/src/solver.rs:176
  [5] at kurobako_core/src/solver.rs:176
  [6] at src/runner.rs:314
  [7] at src/runner.rs:271
  [8] at src/runner.rs:357
  [9] at src/runner.rs:136
  [10] at src/runner.rs:145
  [11] at src/main.rs:85


Contributor Author

Hmm... It seems the installed version of cmaes is wrong: the cmaes in this PR does not contain D = np.sqrt(D2) but instead contains D = np.sqrt(np.where(D2 < 0, 0, D2)). Could you re-install cmaes?
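The difference between the two expressions can be seen on a small example. This is only a sketch: the eigenvalues below are hypothetical hand-picked values standing in for what an eigendecomposition might return for a nearly singular covariance matrix, not output taken from the failing run.

```python
import numpy as np

# Hypothetical eigenvalues: the first is a tiny negative caused by round-off.
D2 = np.array([-1e-17, 0.25, 4.0])

# Unpatched expression: the negative entry produces nan
# (the warning is silenced here only to keep the output clean).
with np.errstate(invalid="ignore"):
    naive = np.sqrt(D2)

# Expression quoted above: negative eigenvalues are clamped to zero first.
clamped = np.sqrt(np.where(D2 < 0, 0, D2))

print(naive)    # first entry is nan
print(clamped)  # first entry is 0.0
```

A single `nan` in `D` is enough to propagate into every sampled parameter, which matches the all-`nan` vectors in the traceback above.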

Collaborator

Oh, thanks. I'll re-try it.

Collaborator

Thank you. The error no longer seems to be raised.

Now I'm checking a more simplified patch at #41.

c-bata added a commit that referenced this pull request Jun 22, 2020
Fix numerical overflow errors (simplified version of #34).
@c-bata c-bata merged commit 0ccc62b into CyberAgentAILab:master Jun 22, 2020