Update automl to use default of max_batches=1 #1452

Merged
dsherry merged 13 commits into main from ds_update_max_batches_default
Nov 24, 2020

Contributor

@dsherry dsherry commented Nov 21, 2020

We have 8 classification estimators and 7 regression estimators now! We should use all of them by default.

Docs changes visible here.

@dsherry dsherry force-pushed the ds_update_max_batches_default branch from 2379738 to 276defe Compare November 24, 2020 01:02

codecov bot commented Nov 24, 2020

Codecov Report

Merging #1452 (752f469) into main (ccfd9eb) will increase coverage by 0.1%.
The diff coverage is 100.0%.

@@            Coverage Diff            @@
##             main    #1452     +/-   ##
=========================================
+ Coverage   100.0%   100.0%   +0.1%     
=========================================
  Files         223      223             
  Lines       15001    15013     +12     
=========================================
+ Hits        14994    15006     +12     
  Misses          7        7             
Impacted Files | Coverage Δ
evalml/automl/automl_search.py | 99.7% <100.0%> (+0.1%) ⬆️
evalml/tests/automl_tests/test_automl.py | 100.0% <100.0%> (ø)
.../automl_tests/test_automl_search_classification.py | 100.0% <100.0%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@dsherry dsherry marked this pull request as ready for review November 24, 2020 01:15
@dsherry dsherry added the documentation and enhancement labels and removed the documentation label Nov 24, 2020
@dsherry dsherry self-assigned this Nov 24, 2020
@dsherry dsherry force-pushed the ds_update_max_batches_default branch from 9b41693 to aac9988 Compare November 24, 2020 16:34
@@ -74,7 +74,7 @@
  "source": [
  "The AutoML search will log its progress, reporting each pipeline and parameter set evaluated during the search.\n",
  "\n",
- "There are a number of mechanisms to control the AutoML search time. One way is to set the maximum number of candidate models to be evaluated during AutoML using `max_iterations`. By default, AutoML will search a fixed number of iterations and parameter pairs (`max_iterations=5`). The first pipeline to be evaluated will always be a baseline model representing a trivial solution. "
+ "There are a number of mechanisms to control the AutoML search time. One way is to set the `max_batches` parameter which controls the maximum number of rounds of AutoML to evaluate, where each round may train and score a variable number of pipeline. Another way is to set the `max_iterations` parameter which controls the maximum number of candidate models to be evaluated during AutoML. By default, AutoML will search for a single batch. The first pipeline to be evaluated will always be a baseline model representing a trivial solution. "
Contributor

Not related to this PR but when are we making _pipelines_per_batch public?

Contributor Author

@freddyaboulton yeah good q, no plans to do so currently, let's chat at standup
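To make the two budgets in the docs change above concrete, here is a toy model of how `max_batches` and `max_iterations` could bound a search. It is only a sketch, not evalml's implementation: `planned_evaluations` and `batch_sizes` are hypothetical names, and it assumes the baseline pipeline counts as one evaluation outside of any batch, which this thread does not confirm.

```python
from itertools import islice


def planned_evaluations(batch_sizes, max_batches=None, max_iterations=None):
    """Count the pipelines a budgeted search would evaluate in this toy model.

    batch_sizes: pipelines trained in each successive batch (hypothetical input).
    max_batches: limit on the number of batches, or None for no batch limit.
    max_iterations: limit on the total number of pipelines, or None for no limit.
    """
    # Keep only the batches allowed by max_batches (islice takes all when None).
    allowed = list(islice(batch_sizes, max_batches))
    # Assumption: the baseline pipeline adds one evaluation before the first batch.
    total = 1 + sum(allowed)
    # max_iterations caps the total regardless of how many batches were allowed.
    if max_iterations is not None:
        total = min(total, max_iterations)
    return total


# With the 8 classification estimators mentioned in the PR description,
# a single batch of 8 plus the baseline gives 9 evaluations in this model.
print(planned_evaluations([8, 5, 5], max_batches=1))  # 9
```

Under these assumptions, the old default of `max_iterations=5` would stop before every estimator had been tried once, which is the motivation given in the PR description for switching the default to a single batch.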

Contributor

@bchen1116 bchen1116 left a comment

LGTM! Left a few comments and nitpicks, but nothing blocking 🦃

docs/source/start.ipynb (outdated, resolved)
evalml/automl/automl_search.py (outdated, resolved)
docs/source/user_guide/automl.ipynb (outdated, resolved)
evalml/tests/automl_tests/test_automl.py (resolved)
evalml/automl/automl_search.py (outdated, resolved)
Contributor

@angela97lin angela97lin left a comment

LGTM!! 🚢 ⚓

if not isinstance(max_time, (int, float, str, type(None))):
    raise TypeError(f"Parameter max_time must be a float, int, string or None. Received {str(max_time)}.")
if isinstance(max_time, (int, float)) and max_time <= 0:
    raise ValueError(f"Parameter max_time must be None or non-negative. Received {max_time}.")
Contributor

@angela97lin angela97lin Nov 24, 2020

Hmm, the ValueError message doesn't align with the check (max_time <= 0 vs non-negative)

(Same with max_batches and max_iterations)

Contributor

Maybe 'strictly positive' rather than 'non-negative'?

Contributor Author

Good point. I'll change the boundary conditions here to be non-negative and if we wanna change this in the future we can!
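For illustration, a minimal sketch of what aligning the check with the message might look like, assuming the non-negative boundary proposed in the reply above (zero accepted, negatives rejected). `validate_max_time` is a hypothetical helper written for this example, not a function from evalml, and the code that was actually merged may differ.

```python
def validate_max_time(max_time):
    """Hypothetical helper: validate max_time with a check that matches its message."""
    # Accept numbers, strings (e.g. a duration description) and None, as in the snippet under review.
    if not isinstance(max_time, (int, float, str, type(None))):
        raise TypeError(
            f"Parameter max_time must be a float, int, string or None. Received {str(max_time)}."
        )
    # Assumption: "non-negative" means zero is allowed, so only values below zero are rejected.
    if isinstance(max_time, (int, float)) and max_time < 0:
        raise ValueError(
            f"Parameter max_time must be None or non-negative. Received {max_time}."
        )
    return max_time
```

The alternative raised in the thread, keeping `max_time <= 0` and rewording the message to "strictly positive", would be equally consistent; the same pattern would apply to `max_batches` and `max_iterations`.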

@dsherry dsherry force-pushed the ds_update_max_batches_default branch from afae52e to 06f826e Compare November 24, 2020 18:01
Contributor

@freddyaboulton freddyaboulton left a comment

Looks good to me @dsherry !

@dsherry dsherry merged commit e10cb02 into main Nov 24, 2020
@dsherry dsherry mentioned this pull request Nov 24, 2020
@freddyaboulton freddyaboulton deleted the ds_update_max_batches_default branch May 13, 2022 14:58