-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rename max_pipelines
to max_iterations
#1169
Conversation
Codecov Report
@@ Coverage Diff @@
## main #1169 +/- ##
=======================================
Coverage 99.92% 99.92%
=======================================
Files 196 196
Lines 11987 11995 +8
=======================================
+ Hits 11978 11986 +8
Misses 9 9
Continue to review full report at Codecov.
|
2c19526
to
ca4c003
Compare
def next_batch(self): | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Codecov isn't passing because of this line. Should I call it in the line below? Since it's an abstract method I need to have next_batch()
but it seem unnecessary to call next_batch()
without needing to explicitly test it...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if you add a docstring instead of pass
it will count as being "run" - but if we just rename max_pipelines
to max_iterations
in AutoMLAlgorithm then we don't have to raise a deprecation warning and we can delete this code hehe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@christopherbunn This looks good! I have a comment about using the logger instead of warnings
. I also think we don't need to accept both arguments in IterativeAlgorithm
and AutoMLAlgorithm
.
@@ -12,6 +13,7 @@ class IterativeAlgorithm(AutoMLAlgorithm): | |||
def __init__(self, | |||
allowed_pipelines=None, | |||
max_pipelines=None, | |||
max_iterations=None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to accept both max_iterations and max_pipelines. Users don't interact with this class as far as I know - they are called within AutoMLSearch.
@@ -23,13 +25,18 @@ def __init__(self, | |||
|
|||
Arguments: | |||
allowed_pipelines (list(class)): A list of PipelineBase subclasses indicating the pipelines allowed in the search. The default of None indicates all pipelines for this problem type are allowed. | |||
max_pipelines (int): The maximum number of pipelines to be evaluated. | |||
max_pipelines (int): Will be deprecated in the next release. The maximum number of pipelines to be evaluated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment here - I think we can just have a max_iterations
parameter in this class since users don't interact with this class.
def next_batch(self): | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if you add a docstring instead of pass
it will count as being "run" - but if we just rename max_pipelines
to max_iterations
in AutoMLAlgorithm then we don't have to raise a deprecation warning and we can delete this code hehe
if max_pipelines: | ||
if not max_iterations: | ||
max_iterations = max_pipelines | ||
warnings.warn("`max_pipelines will be deprecated in the next release. Use `max_iterations` instead.", DeprecationWarning) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should use logger.warning
so that the warning persists in the log for now - although we may change how our logger works in the future (#1174).
Even if we keep warnings.warn
we should not use a DeprecationWarning
because they are ignored by default unless triggered in __main__
so they would not show up in a jupyter notebook.
c2b8d44
to
8f34cf9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@christopherbunn looks great, thanks! I left a few small things
@@ -105,9 +106,12 @@ def __init__(self, | |||
LogLossMulticlass for multiclass classification problems, and | |||
R2 for regression problems. | |||
|
|||
max_pipelines (int): Maximum number of pipelines to search. If max_pipelines and | |||
max_pipelines (int): Will be deprecated in the next release. Maximum number of pipelines to search. If max_pipelines and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 thanks. Did we file another ticket yet to track doing the deprecation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just opened one, it's #1201
self.max_pipelines = 5 | ||
logger.info("Using default limit of max_pipelines=5.\n") | ||
if max_pipelines: | ||
if not max_iterations: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit-pick: I think this if
is unnecessary:
if max_pipelines:
max_iterations = max_pipelines
logger.warning("`max_pipelines` will be deprecated in the next release. Use `max_iterations` instead.")
Right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use case I thought of was if max_iterations
and max_pipelines
were both set, I would want to show something saying to not set max_pipelines
but also respect what was in max_iterations
. I'll admit that it's a pretty narrow use case though.
evalml/automl/automl_search.py
Outdated
logger.warning("`max_pipelines` will be deprecated in the next release. Use `max_iterations` instead.") | ||
|
||
self.max_iterations = max_iterations | ||
if self.max_iterations is None and self.max_time is None and _max_batches is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style nit-pick: if not self.max_iterations and not self.max_time and not _max_batches
works too
|
||
def test_max_pipelines_deprecation(caplog): | ||
AutoMLSearch(problem_type='binary', max_pipelines=5) | ||
assert "`max_pipelines` will be deprecated in the next release. Use `max_iterations` instead." in caplog.text |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a test where you set both, to different values? Would be good to have that for completeness, even thought we'll delete it soon
Oh @christopherbunn one more thing: our current automl user guide doesn't explain what I think its ok if you just say it's one of the parameters which controls the stopping criterion for the automl search, and that it tells automl the maximum number of pipelines to train and evaluate. Something along those lines. You don't need to describe the automl algorithm in further detail, even though we don't do that right now. |
I filed the new issue here: #1203. It makes sense to add some more context to this in the user guides. I'll grab it soon, it shouldn't be too difficult to accomplish. |
3d1d9d3
to
f4caa14
Compare
…DeprecationWarning test
Fixed lint errors Fixed import order
f4caa14
to
4e584bd
Compare
Changed the
max_pipelines
parameter tomax_iterations
inAutoMLSearch
,AutoMLAlgorithm
, andIterativeAlgorithm
. Added a warning to be raised saying that it will be deprecated in the next release.Resolves #1149