Revert to default inf max_trials and pool-size of 1 #659

Merged 7 commits on Sep 14, 2021
73 changes: 0 additions & 73 deletions docs/src/tutorials/pytorch-mnist.rst
@@ -162,76 +162,3 @@ don't use ``--debug`` you will likely quickly fill your database with broken exp
.. code-block:: bash

$ orion --debug hunt -n orion-tutorial python main.py --lr~'loguniform(1e-5, 1.0)'

Hunting Options
---------------

.. code-block:: console

$ orion hunt --help

Oríon arguments (optional):
These arguments determine orion's behaviour

-n stringID, --name stringID
experiment's unique name; (default: None - specified
either here or in a config)
-u USER, --user USER user associated to experiment's unique name; (default:
$USER - can be overridden either here or in a config)
-c path-to-config, --config path-to-config
user provided orion configuration file
--max-trials # number of trials to be completed for the experiment.
This value will be saved within the experiment
configuration and reused across all workers to
determine experiment's completion. (default: inf/until
preempted)
--worker-trials # number of trials to be completed for this worker. If
the experiment is completed, the worker will die even
if it did not reach its maximum number of trials
(default: inf/until preempted)
--working-dir WORKING_DIR
Set working directory for running experiment.
--pool-size # number of simultaneous trials the algorithm should
suggest. This is useful if many workers are executed
in parallel and the algorithm has a strategy to sample
non-independent trials simultaneously. Otherwise, it
is better to leave `pool_size` at 1 and set a Strategy
for Oríon's producer. Note that this option is not useful unless you
know the algorithm has a strategy to produce multiple trials
simultaneously. If you have any doubt, leave it at 1.
(default: 1)

``name``

The unique name of the experiment.

``user``

Username used to identify the experiments of a user. The default value is the system's username
$USER.

``config``

Configuration file for Oríon which may define the database, the algorithm and all options of the
command hunt, including ``name``, ``pool-size`` and ``max-trials``.

``max-trials``

The maximum number of trials tried during an experiment.

``worker-trials``

The maximum number of trials to be executed by a worker (a single call to ``orion hunt [...]``).

``working-dir``

The directory where configuration files are created. If not specified, Oríon will create a
temporary directory that will be removed at the end of the trial's execution.

``pool-size``

The number of trials generated by the algorithm each time it is interrogated. This is
useful if many workers are executed in parallel and the algorithm has a strategy to sample
non-independent trials simultaneously. Otherwise, it is better to leave ``pool_size`` at its default
value of 1. Note that this option is not useful unless you know the algorithm has a strategy
to produce multiple trials simultaneously. If you have any doubt, leave it at 1. :)
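
The removed help text above distinguishes an experiment-wide budget (``--max-trials``) from a per-worker budget (``--worker-trials``), and says a worker exits once the experiment is complete even if its own budget is not used up. A minimal sketch of that rule follows; the function ``worker_should_stop`` and its arguments are hypothetical names introduced for illustration and are not part of Oríon.

```python
import math


def worker_should_stop(experiment_completed, worker_completed,
                       max_trials=math.inf, worker_trials=math.inf):
    """Return True when a worker should exit (illustrative only)."""
    # --max-trials is an experiment-wide budget shared by every worker.
    if experiment_completed >= max_trials:
        return True
    # --worker-trials only counts trials executed by this one worker.
    return worker_completed >= worker_trials


# Experiment budget reached: the worker stops after only 3 of its 5 trials.
print(worker_should_stop(10, 3, max_trials=10, worker_trials=5))  # True
# Worker budget reached while the experiment still has room.
print(worker_should_stop(7, 5, max_trials=10, worker_trials=5))   # True
# Neither budget reached (both defaults are unbounded).
print(worker_should_stop(7, 5))                                    # False
```
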
34 changes: 16 additions & 18 deletions docs/src/user/config.rst
@@ -97,14 +97,14 @@ Full Example of Global Configuration
seed: None
max_broken: 3
max_trials: 1000000000
pool_size: 1
strategy:
MaxParallelStrategy
worker_trials: 1000000000
working_dir:

worker:
n_workers: 1
pool_size: 0
executor: joblib
executor_configuration: {}
heartbeat: 120
@@ -211,7 +211,6 @@ Experiment
seed: None
max_broken: 3
max_trials: 1000000000
pool_size: 1
strategy:
MaxParallelStrategy
worker_trials: 1000000000
@@ -322,22 +321,6 @@ working_dir



.. _config_experiment_pool_size:

pool_size
~~~~~~~~~

.. warning::

**DEPRECATED.** This argument will be removed in v0.3.

:Type: int
:Default: 1
:Env var:
:Description:
(DEPRECATED) This argument will be removed in v0.3.


.. _config_experiment_algorithms:

algorithms
@@ -376,6 +359,7 @@ Worker

worker:
n_workers: 1
pool_size: 0
executor: joblib
executor_configuration: {}
heartbeat: 120
@@ -400,6 +384,20 @@ n_workers
It is possible to run many `orion hunt` in parallel, and each will spawn
``n_workers``.

.. _config_worker_pool_size:

pool_size
~~~~~~~~~

:Type: int
:Default: 0
:Env var:
:Description:
Number of trials to sample at a time. If 0, defaults to the number of workers.
Increase it to improve the sampling speed if workers spend too much time
waiting for algorithms to sample points. An algorithm will try sampling `pool_size`
trials but may return fewer.


.. _config_worker_executor:

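
The new ``pool_size`` description says an algorithm tries to sample ``pool_size`` trials but may return fewer. A minimal sketch of why that can happen, assuming a toy algorithm that draws from a finite list of candidate points. ``ToyAlgorithm`` and its ``suggest`` method are hypothetical and do not reflect Oríon's algorithm interface.

```python
class ToyAlgorithm:
    """Hypothetical algorithm with a finite supply of candidate points."""

    def __init__(self, candidates):
        self._remaining = list(candidates)

    def suggest(self, num):
        # Hand out up to `num` points; fewer if the supply is running out.
        batch = self._remaining[:num]
        self._remaining = self._remaining[num:]
        return batch


algo = ToyAlgorithm([0.1, 0.01, 0.001])
pool_size = 2

first = algo.suggest(pool_size)   # full batch of 2 points
second = algo.suggest(pool_size)  # only 1 point is left
third = algo.suggest(pool_size)   # nothing left

print(len(first), len(second), len(third))  # 2 1 0
```

A caller should therefore treat ``pool_size`` as an upper bound on the batch size rather than a guarantee.
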
43 changes: 36 additions & 7 deletions src/orion/client/experiment.py
@@ -32,7 +32,7 @@
log = logging.getLogger(__name__)


def reserve_trial(experiment, producer, _depth=1):
def reserve_trial(experiment, producer, pool_size, _depth=1):
"""Reserve a new trial, or produce and reserve a trial if none are available."""
log.debug("Trying to reserve a new trial to evaluate.")
trial = experiment.reserve_trial()
@@ -51,9 +51,9 @@ def reserve_trial(experiment, producer, _depth=1):
producer.update()

log.debug("#### Produce new trials.")
producer.produce()
producer.produce(pool_size)

return reserve_trial(experiment, producer, _depth=_depth + 1)
return reserve_trial(experiment, producer, pool_size, _depth=_depth + 1)

return trial

@@ -500,7 +500,7 @@ def release(self, trial, status="interrupted"):
finally:
self._release_reservation(trial, raise_if_unreserved=raise_if_unreserved)

def suggest(self):
def suggest(self, pool_size=0):
"""Suggest a trial to execute.

Experiment must be in executable ('x') mode.
@@ -509,6 +509,16 @@ def suggest(self):
Otherwise, the algorithm is used to generate a new trial that is registered in storage and
reserved.

Parameters
----------
pool_size: int, optional
Number of trials to sample at a time. If 0, default to global config if defined,
else 1. Increase it to improve the sampling speed if workers spend too much time
waiting for algorithms to sample points. An algorithm will try sampling `pool_size`
trials but may return fewer. Note: the method will still return only 1 trial even
if the pool size is larger than 1. This is because atomic reservation of trials
can only be done one at a time.

Returns
-------
`orion.core.worker.trial.Trial`
@@ -535,6 +545,10 @@ def suggest(self):

"""
self._check_if_executable()
if not pool_size:
pool_size = orion.core.config.worker.pool_size
if not pool_size:
pool_size = 1

if self.is_broken:
raise BrokenExperiment("Trials failed too many times")
@@ -543,7 +557,7 @@ def suggest(self):
raise CompletedExperiment("Experiment is done, cannot sample more trials.")

try:
trial = reserve_trial(self._experiment, self._producer)
trial = reserve_trial(self._experiment, self._producer, pool_size)

except (WaitingForTrials, SampleTimeout) as e:
if self.is_broken:
@@ -629,10 +643,12 @@ def tmp_executor(self, executor, **config):
yield self
self.executor = old_executor

# pylint:disable=too-many-arguments
def workon(
self,
fct,
n_workers=None,
pool_size=0,
max_trials=None,
max_trials_per_worker=None,
max_broken=None,
@@ -652,6 +668,11 @@ def workon(
objective.
n_workers: int, optional
Number of workers to run in parallel. Defaults to value of global config.
pool_size: int, optional
Number of trials to sample at a time. If 0, defaults to `n_workers` or value of global
config if defined. Increase it to improve the sampling speed if workers spend too much
time waiting for algorithms to sample points. An algorithm will try sampling
`pool_size` trials but may return less.
max_trials: int, optional
Maximum number of trials to execute within ``workon``. If the experiment or algorithm
reach status is_done before, the execution of ``workon`` terminates.
@@ -712,6 +733,11 @@ def workon(
str(self.executor.n_workers),
)

if not pool_size:
pool_size = orion.core.config.worker.pool_size
if not pool_size:
pool_size = n_workers

if max_trials is None:
max_trials = self.max_trials

@@ -731,6 +757,7 @@ def workon(
self.executor.submit(
self._optimize,
fct,
pool_size,
max_trials_per_worker,
max_broken,
trial_arg,
@@ -742,14 +769,16 @@ def workon(

return sum(trials)

def _optimize(self, fct, max_trials, max_broken, trial_arg, on_error, **kwargs):
def _optimize(
self, fct, pool_size, max_trials, max_broken, trial_arg, on_error, **kwargs
):
worker_broken_trials = 0
trials = 0
kwargs = flatten(kwargs)
max_trials = min(max_trials, self.max_trials)
while not self.is_done and trials - worker_broken_trials < max_trials:
try:
with self.suggest() as trial:
with self.suggest(pool_size=pool_size) as trial:

kwargs.update(flatten(trial.params))

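
The two ``if not pool_size`` chains added to ``suggest`` and ``workon`` above share their first fallback (the global worker config) but end differently: ``suggest`` falls back to 1, ``workon`` to ``n_workers``. The standalone sketch below mirrors that logic so the difference is easy to see. The function names are hypothetical, and ``config_pool_size`` stands in for ``orion.core.config.worker.pool_size``.

```python
def resolve_suggest_pool_size(pool_size=0, config_pool_size=0):
    """Mirror of the fallback in ``suggest``: argument, global config, then 1."""
    if not pool_size:
        pool_size = config_pool_size
    if not pool_size:
        pool_size = 1
    return pool_size


def resolve_workon_pool_size(pool_size=0, config_pool_size=0, n_workers=1):
    """Mirror of the fallback in ``workon``: argument, global config, then n_workers."""
    if not pool_size:
        pool_size = config_pool_size
    if not pool_size:
        pool_size = n_workers
    return pool_size


print(resolve_suggest_pool_size())                       # 1
print(resolve_workon_pool_size(n_workers=4))             # 4
print(resolve_workon_pool_size(0, config_pool_size=8))   # 8
print(resolve_workon_pool_size(2, 8, n_workers=4))       # 2
# `not pool_size` is true for both 0 and None, so None behaves like "unset".
print(resolve_suggest_pool_size(None, 0))                # 1
```
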
26 changes: 16 additions & 10 deletions src/orion/core/__init__.py
@@ -134,7 +134,7 @@ def define_experiment_config(config):
experiment_config.add_option(
"max_trials",
option_type=int,
default=1000,
default=int(10e8),
env_var="ORION_EXP_MAX_TRIALS",
help="number of trials to be completed for the experiment. This value "
"will be saved within the experiment configuration and reused "
@@ -144,7 +144,7 @@ def define_experiment_config(config):
experiment_config.add_option(
"worker_trials",
option_type=int,
default=1000,
default=int(10e8),
deprecate=dict(
version="v0.3",
alternative="worker.max_trials",
@@ -169,14 +169,6 @@ def define_experiment_config(config):
help="Set working directory for running experiment.",
)

experiment_config.add_option(
"pool_size",
option_type=int,
default=1,
deprecate=dict(version="v0.3", alternative=None, name="experiment.pool_size"),
help="This argument will be removed in v0.3.",
)

experiment_config.add_option(
"algorithms",
option_type=dict,
@@ -210,6 +202,20 @@ def define_worker_config(config):
),
)

worker_config.add_option(
"pool_size",
option_type=int,
default=0,
env_var="ORION_POOL_SIZE",
help=(
"Number of trials to sample at a time. "
"If 0, will default to number of executor workers. "
"Increase it to improve the sampling speed if workers spend too much time "
"waiting for algorithms to sample points. An algorithm will try sampling "
"`pool-size` trials but may return less."
),
)

worker_config.add_option(
"executor",
option_type=str,
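
The new default ``int(10e8)`` is easy to misread as 10^8. In Python, ``10e8`` means 10 × 10^8, so the value is one billion, which matches the ``1000000000`` shown in the ``config.rst`` example. It is a large finite integer, presumably a practical stand-in for the "inf" in the PR title rather than a true infinity. A quick check:

```python
import math

default_max_trials = int(10e8)

print(default_max_trials)              # 1000000000
print(default_max_trials == 10**9)     # True
print(math.isinf(default_max_trials))  # False
```
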
2 changes: 2 additions & 0 deletions src/orion/core/cli/hunt.py
@@ -130,6 +130,7 @@ def on_error(client, trial, error, worker_broken_trials):
def workon(
experiment,
n_workers=None,
pool_size=None,
max_trials=None,
max_broken=None,
max_idle_time=None,
@@ -163,6 +164,7 @@ def workon(
client.workon(
consumer,
n_workers=n_workers,
pool_size=pool_size,
max_trials_per_worker=max_trials,
max_broken=max_broken,
trial_arg="trial",
5 changes: 0 additions & 5 deletions src/orion/core/io/experiment_builder.py
@@ -379,11 +379,6 @@ def create_experiment(name, version, mode, space, **kwargs):
"""
experiment = Experiment(name=name, version=version, mode=mode)
experiment._id = kwargs.get("_id", None) # pylint:disable=protected-access
experiment.pool_size = kwargs.get("pool_size")
if experiment.pool_size is None:
experiment.pool_size = orion.core.config.experiment.get(
"pool_size", deprecated="ignore"
)
experiment.max_trials = kwargs.get(
"max_trials", orion.core.config.experiment.max_trials
)
1 change: 0 additions & 1 deletion src/orion/core/utils/format_terminal.py
@@ -257,7 +257,6 @@ def format_commandline(experiment):

CONFIG_TEMPLATE = """\
{title}
pool size: {experiment.pool_size}
max trials: {experiment.max_trials}
max broken: {experiment.max_broken}
working dir: {experiment.working_dir}