Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Timeout parameter to limit time (in seconds) of the search process (fmin) #533

Merged
merged 17 commits into from
Dec 12, 2019
16 changes: 16 additions & 0 deletions hyperopt/base.py
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,8 @@
"""
from __future__ import print_function
from __future__ import absolute_import

import numbers
from builtins import str
from builtins import map
from builtins import zip
Expand Down Expand Up @@ -226,6 +228,18 @@ def spec_from_misc(misc):
return spec


def validate_timeout(timeout):
if timeout is not None and (
not isinstance(timeout, numbers.Number)
or timeout <= 0
or isinstance(timeout, bool)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is redundant. If it's not a number, you don't need to check if it's a bool

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, if the docstring says int, then I think better to check here is int, rather than any number

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not my code segment, I used check from spark implementation, but since I use the same parameter I just moved this check to separate method to base file. To not break something by accident I did not change the logic of this validation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. if you can reuse code from somwhere else, do it. Do not duplicate
  2. Either the docstring or check need to be changed, to match. The check allows for floats and has a check for bool that seems useless to me, while the docstring states int. I would change this code as I don't think we need float precision for a timeout in seconds

Copy link
Contributor Author

@vladkar vladkar Dec 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. It was reused, not duplicated. As I mention, this check was just moved to base.
  2. I think this should have been discussed when this check was introduced to spark implementation. Again, I prefer to reuse it than make (almost) the same new check. And I do not want to influence existing logic, where this check is already used.

):
raise Exception(
"The timeout argument should be None or a positive value. "
"Given value: {timeout}".format(timeout=timeout)
)


class Trials(object):
"""Database interface supporting data-driven model-based optimization.

Expand Down Expand Up @@ -619,6 +633,7 @@ def fmin(
space,
algo,
max_evals,
timeout=None,
max_queue_len=1,
rstate=None,
verbose=0,
Expand Down Expand Up @@ -655,6 +670,7 @@ def fmin(
space,
algo,
max_evals,
timeout=timeout,
trials=self,
rstate=rstate,
verbose=verbose,
Expand Down
17 changes: 16 additions & 1 deletion hyperopt/fmin.py
Original file line number Diff line number Diff line change
Expand Up @@ -9,10 +9,12 @@
import os
import sys
import time
from timeit import default_timer as timer
from tqdm import tqdm

import numpy as np

from hyperopt.base import validate_timeout
from . import pyll
from .utils import coarse_utcnow
from . import base
Expand Down Expand Up @@ -117,6 +119,7 @@ def __init__(
max_queue_len=1,
poll_interval_secs=1.0,
max_evals=sys.maxsize,
timeout=None,
verbose=False,
show_progressbar=True,
):
Expand All @@ -136,6 +139,8 @@ def __init__(
self.poll_interval_secs = poll_interval_secs
self.max_queue_len = max_queue_len
self.max_evals = max_evals
self.timeout = timeout
self.start_time = timer()
self.rstate = rstate
self.verbose = verbose

Expand Down Expand Up @@ -234,7 +239,8 @@ def get_n_unfinished():
) as progress_ctx:

all_trials_complete = False
while n_queued < N or (block_until_done and not all_trials_complete):
while (n_queued < N or (block_until_done and not all_trials_complete)) and \
(self.timeout is None or (timer() - self.start_time) < self.timeout):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can be timer() - self.start_time < (self.timeout or float('inf')). Also, please use black to format code

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can, but version I used (a) subjectively more readable, (b) objectively faster

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Black was applied

qlen = get_queue_len()
while (
qlen < self.max_queue_len and n_queued < N and not self.is_cancelled
Expand Down Expand Up @@ -322,6 +328,7 @@ def fmin(
space,
algo,
max_evals,
timeout=None,
trials=None,
rstate=None,
allow_trials_fmin=True,
Expand Down Expand Up @@ -371,6 +378,10 @@ def fmin(
max_evals : int
Allow up to this many function evaluations before returning.

timeout : int
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use style from other params :

timeout: None or int.
Limits ...
If None, then the search process has no time constraint.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Limits search time by parametrized number of seconds.
None is default, it means there is no time constraint in the search process.

trials : None or base.Trials (or subclass)
Storage for completed, ongoing, and scheduled evaluation points. If
None, then a temporary `base.Trials` instance will be created. If
Expand Down Expand Up @@ -436,12 +447,15 @@ def fmin(
else:
rstate = np.random.RandomState()

validate_timeout(timeout)

if allow_trials_fmin and hasattr(trials, "fmin"):
return trials.fmin(
fn,
space,
algo=algo,
max_evals=max_evals,
timeout=timeout,
max_queue_len=max_queue_len,
rstate=rstate,
pass_expr_memo_ctrl=pass_expr_memo_ctrl,
Expand All @@ -465,6 +479,7 @@ def fmin(
domain,
trials,
max_evals=max_evals,
timeout=timeout,
rstate=rstate,
verbose=verbose,
max_queue_len=max_queue_len,
Expand Down
17 changes: 7 additions & 10 deletions hyperopt/spark.py
Original file line number Diff line number Diff line change
@@ -1,12 +1,12 @@
from __future__ import print_function

import copy
import numbers
import threading
import time
import timeit

from hyperopt import base, fmin, Trials
from hyperopt.base import validate_timeout
from hyperopt.utils import coarse_utcnow, _get_logger, _get_random_id

try:
Expand Down Expand Up @@ -72,15 +72,7 @@ def __init__(self, parallelism=None, timeout=None, spark_session=None):
"SparkTrials cannot import pyspark classes. Make sure that PySpark "
"is available in your environment. E.g., try running 'import pyspark'"
)
if timeout is not None and (
not isinstance(timeout, numbers.Number)
or timeout <= 0
or isinstance(timeout, bool)
):
raise Exception(
"The timeout argument should be None or a positive value. "
"Given value: {timeout}".format(timeout=timeout)
)
validate_timeout(timeout)
self._spark = (
SparkSession.builder.getOrCreate()
if spark_session is None
Expand Down Expand Up @@ -198,6 +190,7 @@ def fmin(
space,
algo,
max_evals,
timeout,
max_queue_len,
rstate,
verbose,
Expand All @@ -211,6 +204,9 @@ def fmin(
Refer to :func:`hyperopt.fmin` for docs on each argument
"""

validate_timeout(timeout)
self.timeout = timeout

assert (
not pass_expr_memo_ctrl
), "SparkTrials does not support `pass_expr_memo_ctrl`"
Expand All @@ -229,6 +225,7 @@ def fmin(
space,
algo,
max_evals,
timeout=timeout,
max_queue_len=max_queue_len,
trials=self,
allow_trials_fmin=False, # -- prevent recursion
Expand Down
35 changes: 35 additions & 0 deletions hyperopt/tests/test_fmin.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,8 @@
import unittest
import numpy as np
import nose.tools
from timeit import default_timer as timer
import time

from hyperopt import (
fmin,
Expand Down Expand Up @@ -235,3 +237,36 @@ def test_generate_trials_to_calculate(self):
)
assert best["x"] == 0.0
assert best["y"] == 0.0


def test_timeout():
fn = lambda x: [time.sleep(1), x][1]
space = hp.choice('x', range(20))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use black

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


start_time_1 = timer()
fmin(
fn=fn,
space=space,
max_evals=10,
timeout=1,
algo=rand.suggest,
return_argmin=False,
rstate=np.random.RandomState(0)
)
end_time_1 = timer()
assert (end_time_1 - start_time_1) < 2
assert (end_time_1 - start_time_1) > 0.9

start_time_5 = timer()
fmin(
fn=fn,
space=space,
max_evals=10,
timeout=5,
algo=rand.suggest,
return_argmin=False,
rstate=np.random.RandomState(0)
)
end_time_5 = timer()
assert (end_time_5 - start_time_5) < 6
assert (end_time_5 - start_time_5) > 4.9