Update automl to use default of max_batches=1#1452
Conversation
2379738 to
276defe
Compare
Codecov Report
@@ Coverage Diff @@
## main #1452 +/- ##
=========================================
+ Coverage 100.0% 100.0% +0.1%
=========================================
Files 223 223
Lines 15001 15013 +12
=========================================
+ Hits 14994 15006 +12
Misses 7 7
Continue to review full report at Codecov.
|
9b41693 to
aac9988
Compare
docs/source/user_guide/automl.ipynb
Outdated
| "The AutoML search will log its progress, reporting each pipeline and parameter set evaluated during the search.\n", | ||
| "\n", | ||
| "There are a number of mechanisms to control the AutoML search time. One way is to set the maximum number of candidate models to be evaluated during AutoML using `max_iterations`. By default, AutoML will search a fixed number of iterations and parameter pairs (`max_iterations=5`). The first pipeline to be evaluated will always be a baseline model representing a trivial solution. " | ||
| "There are a number of mechanisms to control the AutoML search time. One way is to set the `max_batches` parameter which controls the maximum number of rounds of AutoML to evaluate, where each round may train and score a variable number of pipeline. Another way is to set the `max_iterations` parameter which controls the maximum number of candidate models to be evaluated during AutoML. By default, AutoML will search for a single batch. The first pipeline to be evaluated will always be a baseline model representing a trivial solution. " |
There was a problem hiding this comment.
Not related to this PR but when are we making _pipelines_per_batch public?
There was a problem hiding this comment.
@freddyaboulton yeah good q, no plans to do so currently, let's chat at standup
f74ab35 to
afae52e
Compare
bchen1116
left a comment
There was a problem hiding this comment.
LGTM! Left a few comments and nitpicks, but nothing blocking 🦃
| if not isinstance(max_time, (int, float, str, type(None))): | ||
| raise TypeError(f"Parameter max_time must be a float, int, string or None. Received {str(max_time)}.") | ||
| if isinstance(max_time, (int, float)) and max_time <= 0: | ||
| raise ValueError(f"Parameter max_time must be None or non-negative. Received {max_time}.") |
There was a problem hiding this comment.
Hmm, the ValueError message doesn't align with the check (max_time <= 0 vs non-negative)
(Same with max_batches and max_iterations)
There was a problem hiding this comment.
Maybe 'strictly positive' rather than 'non-negative'?
There was a problem hiding this comment.
Good point. I'll change the boundary conditions here to be non-negative and if we wanna change this in the future we can!
afae52e to
06f826e
Compare
freddyaboulton
left a comment
There was a problem hiding this comment.
Looks good to me @dsherry !
We have 8 classification estimators and 7 regression estimators now! We should use all of them by default.
Docs changes visible here.