Update automl to use default of max_batches=1 #1452
Conversation
Force-pushed from 2379738 to 276defe
Codecov Report
@@ Coverage Diff @@
## main #1452 +/- ##
=========================================
+ Coverage 100.0% 100.0% +0.1%
=========================================
Files 223 223
Lines 15001 15013 +12
=========================================
+ Hits 14994 15006 +12
Misses 7 7
Force-pushed from 9b41693 to aac9988
docs/source/user_guide/automl.ipynb
Outdated
@@ -74,7 +74,7 @@
     "source": [
     "The AutoML search will log its progress, reporting each pipeline and parameter set evaluated during the search.\n",
     "\n",
-    "There are a number of mechanisms to control the AutoML search time. One way is to set the maximum number of candidate models to be evaluated during AutoML using `max_iterations`. By default, AutoML will search a fixed number of iterations and parameter pairs (`max_iterations=5`). The first pipeline to be evaluated will always be a baseline model representing a trivial solution. "
+    "There are a number of mechanisms to control the AutoML search time. One way is to set the `max_batches` parameter, which controls the maximum number of rounds of AutoML to evaluate, where each round may train and score a variable number of pipelines. Another way is to set the `max_iterations` parameter, which controls the maximum number of candidate models to be evaluated during AutoML. By default, AutoML will search for a single batch. The first pipeline to be evaluated will always be a baseline model representing a trivial solution. "
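The interaction between the two limits can be sketched with a toy stand-in loop. Everything here is hypothetical for illustration (`run_search` and `batch_sizes` are not EvalML's API): `max_batches` caps how many rounds run, while `max_iterations` caps the total number of pipelines evaluated, even mid-batch.

```python
def run_search(batch_sizes, max_batches=1, max_iterations=None):
    """Toy stand-in for AutoML search control.

    batch_sizes[i] is the number of pipelines evaluated in batch i
    (the real search decides this dynamically). Returns how many
    pipelines actually get evaluated under the two limits.
    """
    evaluated = 0
    for batch_num, size in enumerate(batch_sizes, start=1):
        if batch_num > max_batches:  # batch limit reached
            break
        for _ in range(size):
            if max_iterations is not None and evaluated >= max_iterations:
                return evaluated  # iteration limit reached mid-batch
            evaluated += 1
    return evaluated

# With the new default (max_batches=1), only the first batch runs.
print(run_search([1, 5, 5]))                                    # 1
print(run_search([1, 5, 5], max_batches=3))                     # 11
print(run_search([1, 5, 5], max_batches=3, max_iterations=4))   # 4
```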
Not related to this PR, but when are we making `_pipelines_per_batch` public?
@freddyaboulton yeah good q, no plans to do so currently, let's chat at standup
Force-pushed from f74ab35 to afae52e
LGTM! Left a few comments and nitpicks, but nothing blocking 🦃
LGTM!! 🚢 ⚓
if not isinstance(max_time, (int, float, str, type(None))):
    raise TypeError(f"Parameter max_time must be a float, int, string or None. Received {str(max_time)}.")
if isinstance(max_time, (int, float)) and max_time <= 0:
    raise ValueError(f"Parameter max_time must be None or non-negative. Received {max_time}.")
Hmm, the `ValueError` message doesn't align with the check (`max_time <= 0` vs "non-negative"). (Same with `max_batches` and `max_iterations`.)
Maybe 'strictly positive' rather than 'non-negative'?
Good point. I'll change the boundary conditions here to be non-negative and if we wanna change this in the future we can!
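For illustration, here is a standalone sketch of the "strictly positive" wording suggested above. The `validate_limit` helper and its signature are hypothetical, not the project's actual implementation; it just shows a check whose error message matches its boundary condition:

```python
def validate_limit(name, value):
    """Validate a search-limit parameter: must be None or strictly positive.

    Hypothetical helper sketching the review suggestion; the real code
    validates each parameter (max_time, max_batches, max_iterations) inline.
    """
    if not isinstance(value, (int, float, type(None))):
        raise TypeError(f"Parameter {name} must be a float, int or None. Received {value}.")
    if isinstance(value, (int, float)) and value <= 0:
        # Message now matches the `<= 0` check exactly.
        raise ValueError(f"Parameter {name} must be None or strictly positive. Received {value}.")
    return value
```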
Force-pushed from afae52e to 06f826e
Looks good to me @dsherry !
We have 8 classification estimators and 7 regression estimators now! We should use all of them by default.
Docs changes visible here.