WIP: Introduce TrialRunner Abstraction #720 (Draft)

bpkroth wants to merge 106 commits into microsoft:main from bpkroth:trial-runner-abstraction
Commits (106):

616f44e  motus    do not pass the optimizer into _run()
33e332a  motus    mypy fixes
0247259  motus    start splitting the optimization loop into two
483e378  motus    first complete version of the optimization loop (not tested yet)
addd5a4  motus    Merge branch 'main' into sergiym/run/2loops
e97266f  motus    allow running mlos_bench.run._main directly from unit tests + add a u…
64771fd  motus    move in-process launch to a separate unit test file
bd7c55e  motus    add is_warm_up flag to the optimization step
387722a  motus    Merge branch 'main' of github.com:microsoft/MLOS into sergiym/run/2loops
9f15aee  motus    in-process optimizaiton loop invocation works!
65cd072  motus    add multi-iteration optimization to in-process test; fix the mlos_cor…
c010d95  motus    make in-process launcerh tests pass
7cfef3a  motus    remove unnecessary local variables to make pylint happy
7233180  motus    move trial_config_repeat_count checks to the launcher
be7dcec  motus    make experiment.load() return trial_ids and use them in the optimizat…
3c52e03  motus    use proper last_trial_id in the main loop; fix the unit tests
0d9dc97  motus    update launcher tests with the new output patterns
4e171e0  motus    remove unused variable
ab69fa0  motus    Merge branch 'main' into sergiym/run/2loops
52adab8  motus    better naming for functions in the optimization loop
df893d9  motus    start implementing the scheduler class
5aca764  motus    change the default value for is_warm_up parameter to False
4d183df  motus    Merge branch 'sergiym/run/2loops' into sergiym/run/scheduler
309e10c  motus    started to implement teh start() method of the sync scheduler
9a72b40  motus    Merge branch 'main' of github.com:microsoft/MLOS into sergiym/run/sch…
ffe23e1  motus    implement proper Scheduler constructor
cb863e0  motus    more clean-ups to the base scheduler
990b019  motus    minor pylint fixes
2ac0520  motus    add _add_trial_to_queue() method
b95100a  motus    better handling of warm-up phase (no redundant code)
e15033d  motus    split the sccheduler implementation into the base class and the sync …
6eab1b0  motus    use the new scheduler in _main()
9c7f2cc  motus    add scheduler config parameters that can be overridden from global co…
479a5ed  motus    add todo comments
50dad9f  motus    update the scores for launcher unit tests + fix teh regexps
220ece1  motus    add logging to the sync optimization loop
29cec19  motus    add more logging to the scheduler class
6f8bb2c  motus    move (sync) implementation of the run_trial() to SyncScheduler; other…
6adb2d0  bpkroth  wip
41a0c37  bpkroth  start tracking which trial runner a trial is assigned to
2453427  motus    Merge branch 'main' of github.com:microsoft/MLOS into sergiym/run/sch…
d8d8dfb  motus    Merge branch 'sergiym/run/scheduler' of github.com:motus/MLOS into se…
8a32e5a  bpkroth  wip: adding trial runner
57dc4c3  bpkroth  Merge remote-tracking branch 'sergiy/sergiym/run/scheduler' into para…
e55f33e  bpkroth  wip: integrating trial runner to merged branch
7df0770  motus    Roll back forceful assignment of PATH when invoking a local process
da55c5e  motus    instantiate Scheduler from JSON config in the launcher (no JSON schem…
f6eb5ef  motus    fix unit tests
97438e7  motus    add test for Launcher scheduler load in test_load_cli_config_examples…
715fab9  motus    Merge branch 'sergiym/local_exec/env' into sergiym/run/scheduler_load
034aef9  motus    fix the way launcher handles trial_config_repeat_count
629236f  motus    minor type fixes
049fdb6  motus    add required_keys for base Scheduler
094155c  motus    remove unnecessary type annotation
a6a7283  motus    typo in pylint exception
0a94a37  motus    make all unit tests run
cf42730  motus    add a missing import
6f31a2d  motus    add ConfigSchema.SCHEDULER (not defined yet)
e6ceb5c  motus    fix the teardown property propagation issue
3121fb0  motus    proper ordering of launcher properties initialization
5951544  motus    fix last unit tests
e3f515c  motus    more unit test fixes
86f155e  motus    add Scheduler JSON config schema
928ceff  motus    validate scheduler JSON schema
1511c6e  motus    add an example config for sync scheduler
38ab457  motus    fix the instantiation of scheduler config from JSON file
9323a1c  motus    minor logging improvements in the Scheduler
6b35444  motus    fix the trial_config_repeat_count default values for CLI
b242f23  motus    roll back some unnecessary test fixes
208c393  motus    temporarily rollback the --max_iterations 9 setting in unit test
303c25f  motus    roll back another small fix to minimize the diff
16ea2cb  motus    undo a fix to LocalExecService that is in a separate PR
5ad4b74  motus    keep minimizing the diff
e0845ea  motus    minimize diff
ed95295  motus    Merge branch 'main' into sergiym/run/scheduler_load
45a9293  motus    Merge branch 'main' of github.com:microsoft/MLOS into sergiym/run/sch…
be0106a  motus    Merge branch 'sergiym/run/scheduler_load' of github.com:motus/MLOS in…
71e3ced  motus    Merge branch 'main' into sergiym/run/scheduler_load
ca9b3a1  motus    Merge branch 'main' into sergiym/run/scheduler_load
52352dc  bpkroth  Merge remote-tracking branch 'upstream/main' into parallel-async-tria…
1aa0e4b  motus    Merge branch 'main' of github.com:motus/MLOS into sergiym/run/schedul…
bbf7922  motus    Merge branch 'sergiym/run/scheduler_load' of github.com:motus/MLOS in…
b204ebc  bpkroth  Fix some storage schema related tests
63da0e0  bpkroth  make local edits scheduler schema aware
ba59035  bpkroth  include the scheduler schema in the global config
2ca34cd  bpkroth  fixup relative paths
946b0c4  bpkroth  basic schema testing
58e8609  bpkroth  Merge branch 'main' into sergiym/run/scheduler_load
7985a3e  bpkroth  add another test case
8070c30  motus    Update mlos_bench/mlos_bench/launcher.py
f395531  bpkroth  pylint
678f4c5  bpkroth  Merge remote-tracking branch 'sergiy/sergiym/run/scheduler_load' into…
969e496  bpkroth  remove async status changes for now - future PR
6f4928f  bpkroth  wip
92382dc  bpkroth  wip: refactor running of a trial to a separate class so we can do the…
f00c975  bpkroth  Merge branch 'main' into trial-runner-abstraction
e91b744  bpkroth  comments
64e7575  bpkroth  consistency
8a9e29e  motus    Merge branch 'main' into trial-runner-abstraction
32c01c0  bpkroth  fixup
0e89e25  bpkroth  schema tests
5549925  bpkroth  spelling
7feba3a  bpkroth  make sure trial_runner_id shows up by default
cc7ed4d  bpkroth  wip: fixups
8d794f1  bpkroth  fixme comments
967b6e2  bpkroth  Launcher args fixups
Changed file: mlos_bench/mlos_bench/launcher.py

@@ -23,6 +23,7 @@
 from mlos_bench.tunables.tunable import TunableValue
 from mlos_bench.tunables.tunable_groups import TunableGroups
 from mlos_bench.environments.base_environment import Environment
+from mlos_bench.schedulers.trial_runner import TrialRunner

 from mlos_bench.optimizers.base_optimizer import Optimizer
 from mlos_bench.optimizers.mock_optimizer import MockOptimizer
@@ -54,6 +55,8 @@ class Launcher:

     def __init__(self, description: str, long_text: str = "", argv: Optional[List[str]] = None):
         # pylint: disable=too-many-statements
+        # pylint: disable=too-complex
+        # pylint: disable=too-many-locals
         _LOG.info("Launch: %s", description)
         epilog = """
 Additional --key=value pairs can be specified to augment or override values listed in --globals.
@@ -95,11 +98,13 @@ def __init__(self, description: str, long_text: str = "", argv: Optional[List[str]] = None):

         self._parent_service: Service = LocalExecService(parent=self._config_loader)

+        args_dict = vars(args)
         self.global_config = self._load_config(
             config.get("globals", []) + (args.globals or []),
             (args.config_path or []) + config.get("config_path", []),
             args_rest,
-            {key: val for (key, val) in config.items() if key not in vars(args)},
+            # Prime the global config with the command line args and the config file.
+            {key: val for (key, val) in config.items() if key not in args_dict or args_dict[key] is None},
         )
         # experiment_id is generally taken from --globals files, but we also allow overriding it on the CLI.
         # It's useful to keep it there explicitly mostly for the --help output.
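The revised comprehension changes which config-file values survive the merge: a file value is now kept only when the corresponding CLI argument is absent or unset (None), so explicit CLI values take precedence. A minimal standalone sketch of that precedence rule (all keys and values below are made up for illustration):

    # Hypothetical inputs: values from a config file vs. parsed CLI args.
    config = {"experiment_id": "exp_from_file", "num_trial_runners": 4}
    args_dict = {"experiment_id": "exp_from_cli", "num_trial_runners": None}  # None means "not given on the CLI"

    # Keep a file value only if the CLI did not supply one.
    kept = {key: val for (key, val) in config.items()
            if key not in args_dict or args_dict[key] is None}
    assert kept == {"num_trial_runners": 4}  # experiment_id is dropped in favor of the CLI value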
@@ -108,6 +113,11 @@ def __init__(self, description: str, long_text: str = "", argv: Optional[List[str]] = None):
         # trial_config_repeat_count is a scheduler property but it's convenient to set it via command line
         if args.trial_config_repeat_count:
             self.global_config["trial_config_repeat_count"] = args.trial_config_repeat_count
+        self.global_config.setdefault("num_trial_runners", 1)
+        if args.num_trial_runners:
+            self.global_config["num_trial_runners"] = args.num_trial_runners
+        if self.global_config["num_trial_runners"] <= 0:
+            raise ValueError(f"Invalid num_trial_runners: {self.global_config['num_trial_runners']}")
         # Ensure that the trial_id is present since it gets used by some other
         # configs but is typically controlled by the run optimize loop.
         self.global_config.setdefault('trial_id', 1)
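Taken together, these lines give num_trial_runners a three-step resolution order: a built-in default of 1, then any value from the merged global config, then a CLI override, followed by a sanity check. A small sketch of that order under assumed values:

    # Assume nothing set num_trial_runners via --globals files.
    global_config = {}
    global_config.setdefault("num_trial_runners", 1)   # built-in default
    cli_value = 4                                      # e.g. from --num-trial-runners 4; None when not given
    if cli_value:
        global_config["num_trial_runners"] = cli_value  # CLI wins when present
    if global_config["num_trial_runners"] <= 0:
        raise ValueError(f"Invalid num_trial_runners: {global_config['num_trial_runners']}")
    assert global_config["num_trial_runners"] == 4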
@@ -127,12 +137,21 @@ def __init__(self, description: str, long_text: str = "", argv: Optional[List[str]] = None):
                              " Run `mlos_bench --help` and consult `README.md` for more info.")
         self.root_env_config = self._config_loader.resolve_path(env_path)

-        self.environment: Environment = self._config_loader.load_environment(
-            self.root_env_config, TunableGroups(), self.global_config, service=self._parent_service)
-        _LOG.info("Init environment: %s", self.environment)
-
-        # NOTE: Init tunable values *after* the Environment, but *before* the Optimizer
+        self.trial_runners: List[TrialRunner] = []
+        for trial_runner_id in range(0, self.global_config["num_trial_runners"]):
+            # Create a new global config for each Environment with a unique trial_runner_id for it.
+            env_global_config = self.global_config.copy()
+            env_global_config["trial_runner_id"] = trial_runner_id
+            env = self._config_loader.load_environment(
+                self.root_env_config, TunableGroups(), env_global_config, service=self._parent_service)
+            self.trial_runners.append(TrialRunner(trial_runner_id, env))
+        _LOG.info("Init %d trial runners for environments: %s",
+                  len(self.trial_runners), list(trial_runner.environment for trial_runner in self.trial_runners))
+
+        # NOTE: Init tunable values *after* the Environment(s), but *before* the Optimizer
+        # TODO: should we assign the same or different tunables for all TrialRunner Environments?
         self.tunables = self._init_tunable_values(
+            self.trial_runners[0].environment,
             args.random_init or config.get("random_init", False),
             config.get("random_seed") if args.random_seed is None else args.random_seed,
             config.get("tunable_values", []) + (args.tunable_values or [])
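The TrialRunner class itself lives in mlos_bench.schedulers.trial_runner and its body is not part of this diff. A hypothetical minimal sketch consistent with the TrialRunner(trial_runner_id, env) constructor call and the .environment accesses above (assuming mlos_bench is installed; this is not the PR's actual implementation) might look like:

    from mlos_bench.environments.base_environment import Environment

    class TrialRunner:
        # Hypothetical sketch only: pairs a trial_runner_id with its own copy of
        # the benchmark Environment so runners can later execute trials in parallel.

        def __init__(self, trial_runner_id: int, env: Environment):
            self._trial_runner_id = trial_runner_id
            self._env = env

        @property
        def trial_runner_id(self) -> int:
            return self._trial_runner_id

        @property
        def environment(self) -> Environment:
            return self._env

Loading a fresh Environment per runner, rather than sharing one instance, is what lets each runner substitute its own $trial_runner_id into its configs.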
@@ -208,6 +227,11 @@ def _parse_args(parser: argparse.ArgumentParser, argv: Optional[List[str]]) -> T
             '--trial_config_repeat_count', '--trial-config-repeat-count', required=False, type=int,
             help='Number of times to repeat each config. Default is 1 trial per config, though more may be advised.')

+        parser.add_argument(
+            '--num_trial_runners', '--num-trial-runners', required=False, type=int,
+            help='Number of TrialRunners to use for executing benchmark Environments. '
+            + 'Individual TrialRunners can be identified in configs with $trial_runner_id and optionally run in parallel.')
+
         parser.add_argument(
             '--scheduler', required=False,
             help='Path to the scheduler configuration file. By default, use' +
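With the new flag registered, a run that spreads trials across several runners might be invoked like this (the environment config path and values here are hypothetical):

    mlos_bench --environment ./environments/my-env.jsonc --num-trial-runners 4 --trial-config-repeat-count 3

Both the snake_case and kebab-case spellings are accepted, per the add_argument aliases above.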
@@ -314,13 +338,13 @@ def _load_config(self,
             global_config["config_path"] = config_path
         return global_config

-    def _init_tunable_values(self, random_init: bool, seed: Optional[int],
+    def _init_tunable_values(self, env: Environment, random_init: bool, seed: Optional[int],
                              args_tunables: Optional[str]) -> TunableGroups:
         """
         Initialize the tunables and load key/value pairs of the tunable values
         from given JSON files, if specified.
         """
-        tunables = self.environment.tunable_params
+        tunables = env.tunable_params
         _LOG.debug("Init tunables: default = %s", tunables)

         if random_init:
@@ -329,6 +353,8 @@ def _init_tunable_values(self, random_init: bool, seed: Optional[int],
                 config={"start_with_defaults": False, "seed": seed}).suggest()
             _LOG.debug("Init tunables: random = %s", tunables)

+        # TODO: should we assign the same or different tunables for all TrialRunner Environments?
+
         if args_tunables is not None:
             for data_file in args_tunables:
                 values = self._config_loader.load_config(data_file, ConfigSchema.TUNABLE_VALUES)
@@ -402,7 +428,7 @@ def _load_scheduler(self, args_scheduler: Optional[str]) -> Scheduler:
                     "teardown": self.teardown,
                 },
                 global_config=self.global_config,
-                environment=self.environment,
+                trial_runners=self.trial_runners,
                 optimizer=self.optimizer,
                 storage=self.storage,
                 root_env_config=self.root_env_config,
@@ -412,7 +438,7 @@ def _load_scheduler(self, args_scheduler: Optional[str]) -> Scheduler:
         return self._config_loader.build_scheduler(
             config=class_config,
             global_config=self.global_config,
-            environment=self.environment,
+            trial_runners=self.trial_runners,
             optimizer=self.optimizer,
             storage=self.storage,
             root_env_config=self.root_env_config,
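In both scheduler construction paths the Scheduler now receives the full list of TrialRunners instead of a single Environment; how trials map onto runners is left to the Scheduler implementation (the commit history notes that each trial's assigned runner is tracked). One plausible assignment policy, sketched with a hypothetical helper name that is not from this PR, is a simple round-robin over the runners:

    from typing import List

    def assign_trial_runner(trial_id: int, trial_runners: List["TrialRunner"]) -> "TrialRunner":
        # Hypothetical helper: rotate trials across the available runners
        # (reuses the TrialRunner sketch shown earlier).
        return trial_runners[trial_id % len(trial_runners)]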