Added ability to stop and resume hyperopt / automl runs #2108

Merged Jun 8, 2022 (8 commits)

@tgaddair (Collaborator) commented Jun 7, 2022

This PR adds two new API features that work with both hyperopt and automl:

  1. should_stop_hyperopt callback function
  2. resume parameter to automl and hyperopt entrypoint functions

The idea is that the user can configure a callback that stops the entire job once some criterion is met. Later, the user can re-run the job and it will pick up from where it left off.
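As a rough illustration of (1), here is a minimal sketch of a stop criterion based on a wall-clock budget. Only the hook name should_stop_hyperopt comes from this PR; the class name `TimeBudgetStopper`, its constructor arguments, and the way it would be registered with the job are assumptions made for the example, and the class deliberately does not import Ludwig so it stays runnable on its own.

```python
import time
from typing import Callable


class TimeBudgetStopper:
    """Hypothetical callback that asks for the whole hyperopt job to stop
    once a wall-clock budget has been used up."""

    def __init__(self, budget_s: float, clock: Callable[[], float] = time.monotonic):
        # `clock` is injectable so the behavior can be checked without sleeping.
        self.budget_s = budget_s
        self._clock = clock
        self._start = clock()

    def should_stop_hyperopt(self) -> bool:
        # Assumption: the job polls this hook periodically and stops all
        # trials the first time it returns True.
        return self._clock() - self._start >= self.budget_s
```

In real use a class like this would presumably be passed along with the other callbacks when launching hyperopt or automl; a later re-run would then continue from the saved state.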

This PR also makes a small change to the output directory name for hyperopt results. Previously the directory name was randomized, like trainable_func_12345. Now it matches the experiment_name param, so that if the user calls the same hyperopt function twice, the default behavior is to resume the previous job from where it left off. This can be disabled by setting resume=False, which creates a new job and writes trial results to the same directory.
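The default described above can be sketched as a small decision function. This is an illustration of the intended behavior, not the code in this PR: `plan_run` is a hypothetical helper, and treating "directory exists and is non-empty" as "there is state to resume" is an assumption.

```python
import os
import tempfile
from typing import Optional, Tuple


def plan_run(output_dir: str, experiment_name: str,
             resume: Optional[bool] = None) -> Tuple[str, bool]:
    """Return (run_dir, should_resume) for a hyperopt job (hypothetical helper)."""
    # The results directory is named after the experiment, not randomized.
    run_dir = os.path.join(output_dir, experiment_name)
    # Assumption: any existing content counts as resumable state.
    has_state = os.path.isdir(run_dir) and bool(os.listdir(run_dir))
    if resume is None:
        # Default: resume if a previous job with the same name left state behind.
        should_resume = has_state
    else:
        # Explicit flag; resume=False starts a new job in the same directory.
        # Assumption: resume=True with no prior state just starts fresh here.
        should_resume = resume and has_state
    return run_dir, should_resume
```

For example, `plan_run(tempfile.mkdtemp(), "my_experiment")` reports a fresh run the first time, and reports a resume once that directory contains trial output.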

@tgaddair tgaddair requested a review from justinxzhao June 7, 2022 16:39

github-actions bot commented Jun 7, 2022

Unit Test Results

       6 files  ±0         6 suites  ±0   2h 13m 28s ⏱️ - 14m 21s
2 805 tests ±0  2 773 ✔️ ±0    32 💤 ±0  0 ±0 
8 415 runs  ±0  8 315 ✔️ ±0  100 💤 ±0  0 ±0 

Results for commit 9ab3a80. ± Comparison against base commit ec902e2.

♻️ This comment has been updated with latest results.

@justinxzhao (Collaborator) left a comment

Nice! Thanks for the changes.

@@ -835,29 +835,25 @@ def test_frequency_vs_f1_vis_api(experiment_to_use):
     assert 2 == len(figure_cnt)


-def test_hyperopt_report_vis_api(hyperopt_results):
+def test_hyperopt_report_vis_api(hyperopt_results, tmpdir):

Thanks for cleaning up these tests!

ludwig/hyperopt/execution.py: two review threads, both outdated and resolved
@justinxzhao (Collaborator) left a comment

Thanks!

@tgaddair tgaddair merged commit 6c26ee9 into master Jun 8, 2022
@tgaddair tgaddair deleted the stopper branch June 8, 2022 14:24