Update tuner API to work with pipeline parameter dicts #779
Conversation
Codecov Report
@@            Coverage Diff             @@
##           master     #779      +/-   ##
==========================================
+ Coverage   99.38%   99.40%   +0.02%
==========================================
  Files         151      150       -1
  Lines        5545     5567      +22
==========================================
+ Hits         5511     5534      +23
+ Misses         34       33       -1
==========================================
Continue to review full report at Codecov.
Force-pushed from 7351e6e to 2086885
I like how this API cleans up AutoSearchBase!
evalml/tuners/tuner.py (Outdated)

 class Tuner(ABC):
     """Defines API for Tuners

     Tuners implement different strategies for sampling from a search space. They're used in EvalML to search the space of pipeline hyperparameters.
     """

     @abstractmethod
-    def __init__(self, space, random_state=0):
+    def __init__(self, pipeline_class, random_state=0):
what if the Tuner init took in the hyperparameter range dictionary rather than the pipeline object? that has the advantage of making the Tuner API more general/reusable without sacrificing usability.
I think it would also make it easier to test
I like it. I'll make it happen.
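For illustration, a minimal sketch of what that ranges-dict constructor could look like (the attribute and parameter names below are placeholders, not necessarily what landed in evalml):

from abc import ABC, abstractmethod

class Tuner(ABC):
    """Defines API for Tuners.

    Tuners implement different strategies for sampling from a search space.
    """

    @abstractmethod
    def __init__(self, pipeline_hyperparameter_ranges, random_state=0):
        # e.g. {'Mock Classifier': {'param a': Integer(0, 10), 'param b': ['a', 'b']}}
        self._pipeline_hyperparameter_ranges = pipeline_hyperparameter_ranges
        self.random_state = random_state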
hyperparameter_ranges.update(cls.custom_hyperparameters)
component_hyperparameters = copy.copy(component.hyperparameter_ranges)
if cls.custom_hyperparameters and component.name in cls.custom_hyperparameters:
    component_hyperparameters.update(cls.custom_hyperparameters.get(component.name, {}))
I think .get(component.name, {}) is redundant with the component.name in cls.custom_hyperparameters check in the if statement on the line above.
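A tiny runnable illustration of the redundancy (toy dicts, not the pipeline code): once the in check has passed, plain indexing is enough.

custom_hyperparameters = {'Mock Classifier': {'param a': [0, 5]}}
component_name = 'Mock Classifier'
component_hyperparameters = {'param a': [0, 10], 'param b': ['a', 'b']}

if custom_hyperparameters and component_name in custom_hyperparameters:
    # .get(component_name, {}) can never hit its default here, so indexing is equivalent
    component_hyperparameters.update(custom_hyperparameters[component_name])

print(component_hyperparameters)  # {'param a': [0, 5], 'param b': ['a', 'b']}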
evalml/tests/conftest.py
Outdated
return [Integer(0, 10), Real(0, 10), ['option_a', 'option_b', 'option_c'], ['option_a', 'option_b', 100, np.inf]]

def dummy_component_hyperparameters():
    return {
        'column a': Integer(0, 10),
minor, but column seems a bit confusing here. perhaps it should be "hyperparameter a"?
Yeah, good point
evalml/tests/conftest.py
Outdated
estimator = dummy_estimator()
component_graph = [estimator]

def dummy_binary_pipeline(dummy_binary_pipeline_class):
    MockBinaryClassificationPipeline = dummy_binary_pipeline_class(_estimator_hyperparameter_ranges=None)
is _estimator_hyperparameter_ranges=None used at all? also, I wonder: if we make the tuner API accept the hyperparameter range dictionary instead, we don't need to worry about all these mocking changes.
Yep, true, that's convincing. And yeah, I don't think I had written a test yet which used it.
evalml/tuners/tuner.py
Outdated
def _convert_to_pipeline_parameters(self, flat_parameters):
    """Convert from a flat list of values to a dict of pipeline parameters"""
    pipeline_parameters = {component_name: dict() for component_name in self._component_names}
    for i, parameter_value in enumerate(flat_parameters):
i think you could do for flat_parameter_name, parameter_value in zip(self._search_space_names, flat_parameters): here. I find that easier to read, and zip is a commonly used built-in Python function.
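A toy, runnable version of that zip-based rewrite (the bookkeeping lists and names here are hypothetical stand-ins for the real attributes):

# Pair each flat value with its (component name, parameter name) entry,
# which must be stored in the same order the flat values were produced.
search_space_names = [('Mock Classifier', 'param a'), ('Mock Classifier', 'param b')]
component_names = ['Mock Classifier']
flat_parameters = [5, 'option_a']

pipeline_parameters = {name: dict() for name in component_names}
for (component_name, parameter_name), parameter_value in zip(search_space_names, flat_parameters):
    pipeline_parameters[component_name][parameter_name] = parameter_value

print(pipeline_parameters)  # {'Mock Classifier': {'param a': 5, 'param b': 'option_a'}}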
evalml/tuners/tuner.py
Outdated
self._search_space_names = []
self._search_space_ranges = []
hyperparameter_ranges = pipeline_class.hyperparameters
for component_name, component_ranges in hyperparameter_ranges.items():
I wonder if the parameter transformation could be done more cleanly if you converted the dictionary into an OrderedDict; then throughout this class you could assume things were iterated over in the same order each time.
I've been looking at this and I haven't been able to decide on it conclusively, but figured I'd share in case it sparks any ideas.
I agree the implementation of the transformation could be simplified. Right now, I'm more interested in getting the interface solid. I'll take another peek at it though before we merge this.
And yeah, I think using OrderedDict would be helpful. It would simplify things. I'm just wary of making too many changes in what is already a big PR.
I'm going to file a follow-on ticket for it.
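For reference when filing that ticket, a rough sketch of the OrderedDict idea (toy code, assuming ranges are keyed by component and iterated in insertion order):

from collections import OrderedDict

# If the ranges are an OrderedDict, every pass over them (building the search
# space, converting a flat proposal back) sees the same order, so separate
# name/range bookkeeping lists aren't needed.
hyperparameter_ranges = OrderedDict([
    ('Mock Classifier', OrderedDict([('param a', (0, 10)), ('param b', ['a', 'b'])])),
])

flat_proposal = [3, 'b']  # values in the same iteration order as the ranges
values = iter(flat_proposal)
pipeline_parameters = {
    component: {param: next(values) for param in ranges}
    for component, ranges in hyperparameter_ranges.items()
}
print(pipeline_parameters)  # {'Mock Classifier': {'param a': 3, 'param b': 'b'}}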
…t hierarchy. Update pipeline.hyperparameters to return dict as well
Force-pushed from 2086885 to 2a17b86
@kmax12 @angela97lin @jeremyliweishih I've addressed Max's comments and I feel this is ready to merge. Please get me a re-review!
@@ -43,6 +43,8 @@ def test_init(X_y):

     assert isinstance(automl.best_pipeline, PipelineBase)
+    assert isinstance(automl.get_pipeline(0), PipelineBase)
+    with pytest.raises(RuntimeError, match="Pipeline not found"):
Is this more codecov testing 😅
Yep 😂
Just took a brief glance and left a few comments/questions but this is really neat stuff!
if 'number_features' in inspect.signature(component_class.__init__).parameters:
    component_parameters['number_features'] = number_features

# Inspects each component and checks the parameters list for the right parameters
cool that we can clean this up! neat :D
     proposed_params = tuner.propose()
-    assert_params_almost_equal(proposed_params, [5.928446182250184])
+    assert proposed_params == {'Mock Classifier': {'param a': 0}}
Hmmm maybe I haven't read far enough yet but I get that the test is changing because of our format, but why is the value proposed changing too? :o
The previous value was incorrect! This is a grid search, and I'd expect a grid search to start at one side of the space and work its way to the other side, and "0" here is all the way at one side, haha. I'm not 100% sure why this was incorrect before, would have to do some detective work.
Also, yeah, good eye, this is confusing :)
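As a generic illustration of that expectation (toy code, not the GridSearchTuner implementation): a grid built per dimension and walked in order yields a corner of the space first, so 0 is a natural first proposal for a range starting at 0.

import itertools

param_a_grid = [0, 2.5, 5, 7.5, 10]          # evenly spaced points over (0, 10)
param_b_grid = ['option_a', 'option_b']
grid = itertools.product(param_a_grid, param_b_grid)
print(next(grid))  # (0, 'option_a') -- one corner of the search space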
-    with pytest.raises(TypeError, match=type_error_text):
-        GridSearchTuner(((0, 1)))
Is there a reason why we removed this test :o
It's a valid range! I changed the way the search space is constructed and validated in the tuners, and I believe that made this a valid input. I think this is the key change... but again, I'd have to go back and do some detective work in order to explain why this didn't work in the old implementation. I think it deferred to skopt. 🤷‍♂️ Good eye though. I'll circle back on this.
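One possibly relevant detail for that detective work (my observation, not verified against the old tuner code): the doubled parentheses in ((0, 1)) don't actually nest, so the tuner only ever saw a plain two-element tuple.

# Redundant parentheses don't create nesting in Python:
assert ((0, 1)) == (0, 1)
assert len(((0, 1))) == 2
# A genuinely nested, one-element space would need a trailing comma:
assert ((0, 1),) != (0, 1)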
-    parameters: Hyperparameters used
-    score: Associated score
+    pipeline_parameters (dict): a dict of the parameters used to evaluate a pipeline
+    score (float): the score obtained by evaluating the pipeline with the provided parameters
👍
looks good. just a few minor comments
estimator = dummy_estimator()
component_graph = [estimator]
return MockBinaryClassificationPipeline(parameters={})
estimator = MockEstimator
i think it may be incorrect to set this here. we do this automatically on init.
Ah, this may be left over from a while back. I'll make a note to circle back to it. I don't think it's breaking any tests / it was in place before this PR.
(1.0, 'B')
>>> print(tuner.propose())
(3.25, 'A')
>>> tuner = GridSearchTuner({'My Component': {'param a': [0.0, 10.0], 'param b': ['a', 'b', 'c']}}, n_points=5)
why did you switch to the asserts here?
if we write it like
>>> proposal.keys()
{'My Component'}
>>> proposal['My Component']
{'param a': 0.0, 'param b': 'a'}
the tests should run it and validate everything worked correctly.
Because the dicts aren't ordered. The previous output was a list, which was ordered. Here's a stackoverflow about it.
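A generic illustration of the concern (not evalml code): doctest compares printed output character-for-character, so any repr whose ordering isn't guaranteed can fail intermittently, while an equality assert only cares about the values.

# A multi-element set of strings can print in a different order across runs
# (string hashing is randomized), so a doctest that prints such a set, e.g.
#     >>> print({'param a', 'param b'})
# is fragile. Assert-style checks compare values rather than printed order:
params = {'param b': 'a', 'param a': 0.0}
assert params == {'param a': 0.0, 'param b': 'a'}     # key order irrelevant
assert set(params) == {'param a', 'param b'}          # order-free membership check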
ah, that being said, it should work for proposal.keys(). all good though
Oh, I understand what you mean. Simply because we only have one component. Sure.
it'd work because it's a set
i think...
But it's not an ordered set! So if it has more than one element, it'll print in a different order depending on the RNG. I ran into this issue during testing, before we simplified the Tuner interface to accept hyperparams instead of a pipeline class.
isn't an ordered set just a list? i see, so it compares the string representation.
Changes
Tuner init to accept a pipeline class, and propose/add to use pipeline parameter dicts

Description
Before this PR, the Tuner classes worked with flat parameter lists. They accepted a "space" description as input, which was a dict with parameter names as the keys and ranges as the values. Tuner.propose and Tuner.add worked with flat lists of parameters. The problem with this is that in order to use those parameters in pipelines, we had to convert from the flat format to the format which pipelines expect, where each component's parameters are in a separate dict.
This PR updates Tuners to work with the pipeline parameter format instead of the flat list format, meaning Tuner.propose output can be converted directly into a pipeline, and a set of pipeline parameters can be registered directly in Tuner.add.

Why do this now
This is necessary for the automl algorithm work in #754. Without it, we'd need some ugly, buggy code to convert between the old flat Tuner format and the pipeline dict format. The work to add that conversion appeared to be comparable in duration to the work in this PR, so I went with the more sustainable option :) It's also a good clarifying change to make, because our Tuner API is specific to pipelines, and this makes that quite clear.

Future Work
This PR doesn't address the case when we have two components with the same name. In the short-term, this isn't a pressing problem because we don't yet have pipelines with multiple instances of the same component. In the long-term, we should update this when we upgrade pipelines to use graphs of components (#273)
Also, if we use OrderedDict for pipeline parameters, that will simplify the tuner code, simplify some test cases, and also potentially be easier to understand from a user perspective. TODO: file that as a follow-on issue.