Remove pipeline_parameters and custom_hyperparameters and replace with search_parameters#3373
Remove pipeline_parameters and custom_hyperparameters and replace with search_parameters#3373
pipeline_parameters and custom_hyperparameters and replace with search_parameters#3373Conversation
Codecov Report
@@ Coverage Diff @@
## main #3373 +/- ##
=======================================
- Coverage 99.7% 99.6% -0.0%
=======================================
Files 329 329
Lines 32405 32380 -25
=======================================
- Hits 32276 32249 -27
- Misses 129 131 +2
Continue to review full report at Codecov.
|
eccabay
left a comment
There was a problem hiding this comment.
Just have some nitpicky doc comments for now, I'll come back and do a full review later!
| for ( | ||
| name, | ||
| component_instance, | ||
| ) in pipeline.component_graph.component_instances.items(): |
There was a problem hiding this comment.
@bchen1116 This code block is doing two things:
- Getting random values from the skopt spaces so that the parameters used in the first batch are in the space the tuner is tuning over
- Making sure the the _pipeline_parameters are correctly added to the parameters so that Drop Columns etc get the right parameters
I think this would be simpler if 1 was a tuner method, like get_starting_parameters ?
… bc_search_parameters
freddyaboulton
left a comment
There was a problem hiding this comment.
@bchen1116 Thank you for your work on this! I left some suggestions for testing improvements. This is looking pretty good though.
| ) | ||
| if self._sampler_name not in parameters and self._sampler_name is not None: | ||
| parameters[self._sampler_name] = { | ||
| if ( |
There was a problem hiding this comment.
Can this be moved to the AutoMLAlgorithm? It's kind of awkward that there parameters are set in AutoMLSearch while the rest are set in the AutoMLAlgorithm.
There was a problem hiding this comment.
@freddyaboulton I think this would be a weird move. We use a lot of information that isn't massed to the AutoMLAlgorithm to determine whether we use a sampler and which sampler to use. We would need to pass all of this relevant data to the AutoMLAlgorithm in order to move this logic, and I'm not sure if that's worth it.
There was a problem hiding this comment.
I'm on the fence on this:
On one side, I made the decision to move pipeline building into the algorithms and this certainly falls under that category. On the other side, I do understand @bchen1116's concern about bloat in AutoMLAlgorithm. @bchen1116 can you file an issue and use this discussion as context for it?
There was a problem hiding this comment.
Yea If the long term plan is to move pipeline building logic to the algorithms then I think the logic for determining whether or not to add a sampler should move to the algorithms. I think there are some unused parameters in the automl algos right now that can be cleaned up too, e.g. number_features. We can do that in a separate issue.
| assert aml._tuners.keys() == aml_add_pipelines._tuners.keys() | ||
| assert aml._tuner_class == aml_add_pipelines._tuner_class | ||
| aml.next_batch() | ||
| aml._transform_parameters(None, None) |
There was a problem hiding this comment.
What does this line do?
There was a problem hiding this comment.
Codecov would raise errors if I didn't have calls to the next_batch and _transform_parameters methods. This was to satisfy that
jeremyliweishih
left a comment
There was a problem hiding this comment.
Really great work @bchen1116, a big value add in cleaning up the internal API as well as the external parameters API. Appreciate the cleanup in DefaultAlgo as well! Just left some general comments.
| ) | ||
| if self._sampler_name not in parameters and self._sampler_name is not None: | ||
| parameters[self._sampler_name] = { | ||
| if ( |
There was a problem hiding this comment.
I'm on the fence on this:
On one side, I made the decision to move pipeline building into the algorithms and this certainly falls under that category. On the other side, I do understand @bchen1116's concern about bloat in AutoMLAlgorithm. @bchen1116 can you file an issue and use this discussion as context for it?
fix #3153 and fix #3150
Design doc in confluence