WIP: Introduce TrialRunner Abstraction #720
base: main
…nit test for bench (not tested)
…e bulk registration (check for is_warm_up)
Co-authored-by: Brian Kroth <bpkroth@users.noreply.github.com>
… parallel-async-trial-runners
…m in parallel eventually
@@ -123,6 +137,7 @@ def get_results_df(
'tunable_config_id',
'tunable_config_trial_group_id',
'status',
'trial_runner_id',
Needs tests
"""Get the running state of the current TrialRunner."""
return self._is_running

def run_trial(self,
Needs tests
mlos_bench/mlos_bench/launcher.py
Outdated
self.root_env_config, TunableGroups(), env_global_config, service=self._parent_service)
self.trial_runners[trial_runner_id] = TrialRunner(trial_runner_id, env)
_LOG.info("Init %d trial runners for environments: %s",
self.trial_runners, list(trial_runner.environment for trial_runner in self.trial_runners))
Needs tests.
})
# Rotate which TrialRunner the Trial is assigned to.
self._current_trial_runner_idx = (self._current_trial_runner_idx + 1) % len(self._trial_runners)
Needs tests.
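One way such a test could look is sketched below. RoundRobinAssigner is a hypothetical stand-in written only for this illustration, not a class from this PR; it just reproduces the modulo rotation from the snippet above so that the wrap-around behavior can be asserted in isolation.

```python
from typing import List


class RoundRobinAssigner:
    """Hypothetical stand-in for the scheduler's rotation logic (not from the PR)."""

    def __init__(self, trial_runner_ids: List[int]) -> None:
        if not trial_runner_ids:
            raise ValueError("At least one trial runner is required")
        self._trial_runner_ids = trial_runner_ids
        self._current_trial_runner_idx = 0

    def next_trial_runner_id(self) -> int:
        """Return the runner id for the next trial, then rotate the index."""
        trial_runner_id = self._trial_runner_ids[self._current_trial_runner_idx]
        # Same rotation expression as in the diff: advance by one and wrap around.
        self._current_trial_runner_idx = (
            (self._current_trial_runner_idx + 1) % len(self._trial_runner_ids)
        )
        return trial_runner_id


def test_round_robin_wraps_around() -> None:
    """Seven trials over three runners should cycle 0, 1, 2 and wrap twice."""
    assigner = RoundRobinAssigner([0, 1, 2])
    assigned = [assigner.next_trial_runner_id() for _ in range(7)]
    assert assigned == [0, 1, 2, 0, 1, 2, 0]


test_round_robin_wraps_around()
```

A real test would exercise the actual scheduler class instead of the stand-in and also cover the single-runner case, where every trial should land on the same runner.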
mlos_bench/mlos_bench/launcher.py
Outdated
parser.add_argument(
'--trial_runners', '--trial-runners', required=False, type=int, default=1,
help='Number of trial runners to run in parallel. '
+ 'Individual TrialRunners can be identified in configs with $trial_runner_id.')
Needs tests.
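A minimal sketch of a test for this argument follows. The build_parser helper is hypothetical and only mirrors the add_argument call quoted above; the real Launcher builds its parser with many more options. It checks the default value and that both spellings of the flag store into the same attribute.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical helper that mirrors only the argument shown in the diff."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        '--trial_runners', '--trial-runners', required=False, type=int, default=1,
        help='Number of trial runners to run in parallel. '
        + 'Individual TrialRunners can be identified in configs with $trial_runner_id.')
    return parser


def test_trial_runners_arg() -> None:
    """Both flag spellings should populate args.trial_runners as an int."""
    parser = build_parser()
    # argparse derives the attribute name from the first long option string.
    assert parser.parse_args([]).trial_runners == 1
    assert parser.parse_args(['--trial_runners', '2']).trial_runners == 2
    assert parser.parse_args(['--trial-runners', '4']).trial_runners == 4


test_trial_runners_arg()
```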
TrialRunner
"""
if trial.trial_runner_id is None:
    raise ValueError(f"Trial {trial} has no trial_runner_id")
- Add fallback support for assigning a Trial to a TrialRunner if one is missing.
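A rough sketch of what such a fallback might look like is below. FakeTrial, FakeTrialRunner, and get_trial_runner are hypothetical names invented for this illustration, and picking the runner by trial_id modulo the number of runners is an assumption, not a policy decided in this PR.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class FakeTrial:
    """Hypothetical minimal Trial: only the fields the fallback needs."""
    trial_id: int
    trial_runner_id: Optional[int] = None


@dataclass
class FakeTrialRunner:
    """Hypothetical minimal TrialRunner."""
    trial_runner_id: int


def get_trial_runner(trial: FakeTrial,
                     trial_runners: Dict[int, FakeTrialRunner]) -> FakeTrialRunner:
    """Look up the runner for a trial, assigning one first if it is missing."""
    if not trial_runners:
        raise ValueError("No trial runners available")
    if trial.trial_runner_id is None:
        # Assumed fallback policy: spread trials deterministically by trial_id.
        runner_ids = sorted(trial_runners)
        trial.trial_runner_id = runner_ids[trial.trial_id % len(runner_ids)]
    try:
        return trial_runners[trial.trial_runner_id]
    except KeyError as ex:
        raise ValueError(
            f"Trial {trial} refers to unknown trial_runner_id {trial.trial_runner_id}"
        ) from ex


runners = {i: FakeTrialRunner(i) for i in range(3)}
unassigned = FakeTrial(trial_id=4)
runner = get_trial_runner(unassigned, runners)
```

Whether the assignment should be persisted back to storage at that point is left open here.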
# pylint: disable=too-many-statements
# pylint: disable=too-complex
Suggested change
- # pylint: disable=too-many-statements
- # pylint: disable=too-complex
+ # pylint: disable=too-many-statements,too-complex

# NOTE: Init tunable values *after* the Environment, but *before* the Optimizer
self.trial_runners: List[TrialRunner] = []
for trial_runner_id in range(0, self.global_config["num_trial_runners"]):
Suggested change
- for trial_runner_id in range(0, self.global_config["num_trial_runners"]):
+ for trial_runner_id in range(self.global_config["num_trial_runners"]):

This is another step in adding support for parallel trial execution #380.
Here we separate out the running of an individual trial into a single class, TrialRunner.
Multiple TrialRunners are instantiated at CLI invocation with the --num-trial-runners argument. Each TrialRunner is associated with a single copy of the root Environment and is made unique by a trial_runner_id value that is included in that Environment's global_config.

TODO:
In future PRs we will add: