[tune] refactor search spaces (old) #10401

krfricke · 2020-08-28T18:37:10Z

After a failed rebase the discussion has been moved here: #10444

Why are these changes needed?

This introduces a new search space representation that makes it possible to convert a Tune search space to other search algorithm definitions.

This also introduces new sampling methods, such as quantized variants of uniform and loguniform, called quniform and qloguniform, respectively.

With these abstractions we get a natural way to distinguish between allowed parameter values (called Domains) and the sampling methods (e.g. uniform, loguniform, normal). In principle, users can introduce their own domains and custom samplers (for example, sampling from a Beta distribution). The underlying API is quite flexible, e.g. Float(1e-4, 1e-2).loguniform().quantized(5e-3). This API is currently hidden behind the tune sampler functions, like tune.qloguniform(1e-4, 1e-2, 5e-3).
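To illustrate the split between a domain and its sampling method, here is a minimal, self-contained sketch. It is a toy model, not the code in this PR: the `Float` class, its attributes, and the `sample` method are assumptions made only for this example, and only the chained call style mirrors the description above.

```python
import math
import random
from copy import copy


class Float:
    """Toy domain (hypothetical): allowed values are the interval [lower, upper]."""

    def __init__(self, lower, upper):
        self.lower = lower
        self.upper = upper
        self.sampler = "uniform"  # how values are drawn from the interval
        self.q = None             # optional quantization step

    def loguniform(self):
        # Return a modified copy so the original domain stays untouched.
        new = copy(self)
        new.sampler = "loguniform"
        return new

    def quantized(self, q):
        new = copy(self)
        new.q = q
        return new

    def sample(self, rng=random):
        if self.sampler == "loguniform":
            # Draw uniformly in log space, then map back.
            value = math.exp(
                rng.uniform(math.log(self.lower), math.log(self.upper)))
        else:
            value = rng.uniform(self.lower, self.upper)
        if self.q is not None:
            # Round to the nearest multiple of q. Note that this naive rounding
            # can land slightly outside [lower, upper].
            value = round(value / self.q) * self.q
        return value


base = Float(1e-4, 1e-2)
space = base.loguniform().quantized(5e-3)
value = space.sample(random.Random(0))
```

Because each method returns a copy, `base` keeps its default sampler while `space` carries both the log-uniform sampler and the quantization step.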

Converting Tune search space definitions to search spaces for external search algorithms, like AxSearch, HyperOpt, BayesOpt, etc., is straightforward. If a search algorithm doesn't support a specific sampling method, the sampler can be dropped with a warning, or an error can be raised. For instance, BayesOpt doesn't support custom sampling methods and is only interested in parameter bounds. If someone passes Float(1e-4, 1e-2).qloguniform(5e-3) to BayesOpt, it will be converted to the parameter bounds (1e-4, 1e-2) and a warning will be issued stating that the custom sampler has been dropped.
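The "drop the sampler with a warning" behaviour described above could look roughly like the following sketch. The `Float` stand-in and the `to_bounds` helper are hypothetical names used only for illustration; they are not the converter implemented in this PR.

```python
import warnings


class Float:
    """Hypothetical stand-in for a float domain with an optional sampler."""

    def __init__(self, lower, upper, sampler=None):
        self.lower = lower
        self.upper = upper
        self.sampler = sampler  # e.g. "qloguniform"; None means plain bounds


def to_bounds(space):
    """Convert a flat {name: Float} dict into {name: (lower, upper)}."""
    bounds = {}
    for name, domain in space.items():
        if domain.sampler is not None:
            # The target algorithm only understands bounds, so the sampler
            # is discarded and the user is told about it.
            warnings.warn(
                f"Dropping sampler {domain.sampler!r} for parameter {name!r}: "
                "only the bounds are used.")
        bounds[name] = (domain.lower, domain.upper)
    return bounds


# Capture warnings so the example can inspect what was emitted.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    bounds = to_bounds({"lr": Float(1e-4, 1e-2, sampler="qloguniform")})
```

A stricter converter could raise an exception instead of calling `warnings.warn`, which is the other option mentioned above.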

Generally, this refactoring will introduce flexibility in defining and converting search spaces, while keeping full backwards compatibility.

Example usage:

External API:

config = {
    "a": tune.choice([2, 3, 4]),
    "b": {
        "x": tune.qrandint(0, 5, 2),
        "y": 4,
        "z": tune.loguniform(1e-4, 1e-2)
    }
}
converted_config = HyperOptSearch.convert_search_space(config)

Lower-level API equivalent:

config = {
    "a": tune.sample.Categorical([2, 3, 4]).uniform(),
    "b": {
        "x": tune.sample.Integer(0, 5).quantized(2),
        "y": 4,
        "z": tune.sample.Float(1e-4, 1e-2).loguniform()
    }
}
converted_config = HyperOptSearch.convert_search_space(config)

Related issue number

Concerns #9969

Checks

- I've run scripts/format.sh to lint the changes in this PR.
- I've included any doc changes needed for https://docs.ray.io/en/latest/.
- I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failure rates at https://ray-travis-tracker.herokuapp.com/.
- Testing Strategy
    - Unit tests
    - Release tests
    - This PR is not tested (please justify below)


sumanthratna

this is amazing! I left a bunch of comments about list/dict comprehensions which might improve perf but aren't really necessary and might decrease code readability

also, I wonder if we should create a tune.set_seed function to set the seed for numpy and python, similar to pytorch_lightning.seed_everything
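For reference, a helper along these lines might look like the sketch below. `set_seed` here is a hypothetical standalone function, not an existing `tune` API, and it only covers Python's `random` module plus NumPy's global RNG when NumPy is installed.

```python
import random


def set_seed(seed):
    """Hypothetical helper: seed Python's RNG and, if installed, NumPy's."""
    random.seed(seed)
    try:
        import numpy as np
    except ImportError:
        return  # NumPy is optional in this sketch
    np.random.seed(seed)


# Seeding twice with the same value reproduces the same draw.
set_seed(42)
first = random.random()
set_seed(42)
second = random.random()
```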

sumanthratna · 2020-08-30T14:31:48Z

python/ray/tune/sample.py

+class Iterative(Domain):
+    def __init__(self, iterator: Iterator):
+        self.iterator = iterator
+
+    def uniform(self):
+        new = copy(self)
+        new.set_sampler(Uniform())
+        return new


I don't remember reading this in the design doc—what's the use-case for Iterative? I think this warrants an explanation in the docs too

python/ray/tune/suggest/ax.py

python/ray/tune/suggest/bayesopt.py

sumanthratna · 2020-08-30T14:58:41Z

python/ray/tune/suggest/optuna.py

@@ -91,6 +102,8 @@ def __init__(

        self._space = space

+        self._config = config or {}


why not set the default of the config kwarg in the initialization function to {} instead of None?

Throws a PEP8 error for me (default argument is mutable)
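The pitfall behind that lint warning can be shown with a small sketch. The `buggy` function and the `Searcher` class are illustrative names only; the `config or {}` line mirrors the diff above.

```python
def buggy(config={}):
    # The default dict is created once, when the function is defined,
    # so every call without an argument shares the same object.
    config["calls"] = config.get("calls", 0) + 1
    return config


class Searcher:
    def __init__(self, space=None, config=None):
        self._space = space
        # A fresh dict per instance when the caller passes nothing.
        self._config = config or {}


first = buggy()
second = buggy()          # mutates and returns the very same dict

a, b = Searcher(), Searcher()
a._config["x"] = 1        # does not leak into b
```

With the `None` default, each instance gets its own dictionary, whereas the mutable default in `buggy` accumulates state across calls.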

python/ray/tune/suggest/optuna.py

sumanthratna · 2020-08-30T15:12:47Z

python/ray/tune/utils/util.py

+        item = out
+        for k in path[:-1]:
+            item = item[k]


can't this be shortened to item = item[path[-2]]? I think I'm misunderstanding something here

Path here is usually a tuple like (a, b, c). We would like to set dt[a][b][c]=x and do this by setting item=dt, then item = item[a], then item=item[b] and finally item[c]=x. Does that make sense?
See also assign_value() in variant_generator.py
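A standalone sketch of that traversal, with `set_by_path` as a hypothetical helper name chosen for this example:

```python
def set_by_path(dt, path, value):
    """Set dt[path[0]][path[1]]...[path[-1]] = value in place."""
    item = dt
    for k in path[:-1]:
        item = item[k]      # walk down one level per key
    item[path[-1]] = value  # assign at the final key


dt = {"a": {"b": {"c": 0}}}
set_by_path(dt, ("a", "b", "c"), 42)
```

Replacing the loop with `item = item[path[-2]]` would look up `"b"` directly in the top-level dict and raise a `KeyError`, because `"b"` only exists one level down, under `"a"`.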

krfricke · 2020-08-31T13:57:50Z

Thanks for the benchmarks and suggestions, I applied them.

krfricke · 2020-08-31T14:00:05Z

Oh well, something went wrong with the latest merge. I'll try to fix that

krfricke · 2020-08-31T14:04:18Z

Weird, when I create a PR from the exact same branch it doesn't blow up. Let's continue the discussion here: #10444

Kai Fricke added 10 commits August 26, 2020 17:37

Added basic functionality and tests

a16c807

Feature parity with old tune search space config

6750060

Merge branch 'master' into tune-search-space

47c8f23

Convert Optuna search spaces

aaaab73

Introduced quantized values

546e9ba

Merge branch 'master' of https://github.com/ray-project/ray into tune…

4d75d91

…-search-space

Updated Optuna resolving

18535a6

Added HyperOpt search space conversion

3ab491c

Convert search spaces to AxSearch

cba10d8

Convert search spaces to BayesOpt

f83ed33

krfricke added the tune Tune-related issues label Aug 28, 2020

edoakes and others added 19 commits August 28, 2020 13:51

Fix docs - atexit is not called when you ray.kill() an actor (ray-pro…

94a069e

…ject#10367)

[hotfix] Fix test_cli.py (ray-project#10403)

9c25ca6

[api] Initial API deprecations for Ray 1.0 (ray-project#10325)

519354a

Validate non-integral args to ray.remote (ray-project#10221)

2afb54c

[Autoscaler] Fix resource passing bug fix (ray-project#10397)

b1f3c9e

[api] Second round of 1.0 API changes: exceptions, num_return_vals (r…

2a20426

…ay-project#10377)

[autoscaler] Add documentation for multi node type autoscaling (ray-p…

f6a1698

…roject#10405)

[api] Remove legacy memory management docs (ray-project#10406)

c14b44a

[docker] Run profiling without sudo (ray-project#10388)

d6f2b0d

* fix profiling for docker * small fixes * use name * do not import pwd on windows

[Autoscaler] Move Resource Demand Scheduler Test to Small (ray-projec…

bd92cef

…t#10399)

[hotfix] Bad merge with num_return_vals (ray-project#10418)

910d5d2

Option to disable profiling and task timeline (ray-project#10414)

9a31166

[core] Move log_to_driver back to public (ray-project#10422)

cb438be

[Serialization] Update CloudPickle to 1.6.0 (ray-project#9694)

f0c3910

* update cloudpickle to 1.6.0 * fix CI timeout

update-scripts (ray-project#10425)

8c75381

[Tests] Enable large test (ray-project#10391)

c8b14fd

* Enable large tests. * Lint. * Fix issue. * Skip all tests.

[Tests] Fix Broken GCS restart test. (ray-project#10417)

3e5cac8

sumanthratna reviewed Aug 30, 2020

richardliaw mentioned this pull request Aug 31, 2020

[docs/feature] Show how to tune categorical variables + integers ray-project/tune-sklearn#87

Open

amogkam and others added 12 commits August 31, 2020 00:00

[Tune] Synchronous Mode for PBT (ray-project#10283)

afde3db

Added basic functionality and tests

c93fb76

Feature parity with old tune search space config

2ffbdca

Convert Optuna search spaces

13a153c

Introduced quantized values

6abffaa

Updated Optuna resolving

2cbd77b

Added HyperOpt search space conversion

c68b9c4

Convert search spaces to AxSearch

1167c75

Convert search spaces to BayesOpt

a4f2d2d

Re-factored samplers into domain classes

bd7ed77

Re-added base classes

e1ba45c

Re-factored into list comprehensions

9d32499

Merge remote-tracking branch 'origin/tune-search-space' into tune-sea…

24ff642

…rch-space # Conflicts: # python/ray/tune/sample.py # python/ray/tune/suggest/ax.py # python/ray/tune/suggest/bayesopt.py # python/ray/tune/suggest/optuna.py # python/ray/tune/tests/test_sample.py

krfricke closed this Aug 31, 2020

krfricke changed the title ~~[tune] refactor search spaces~~ [tune] refactor search spaces (old) Aug 31, 2020

krfricke mentioned this pull request Aug 31, 2020

[tune] refactor tune search space #10444

Merged

6 tasks

This was referenced Aug 31, 2020

[tune] shim instantiation of search algorithms #10451

Closed

[tune] implement shim instantiation #10456

Merged
