Conversation
… bc_search_parameters
freddyaboulton left a comment:
@bchen1116 Almost there! Let's make sure to run perf tests prior to merging so we can be extra confident as well.
```python
select_parameters = self._create_select_parameters()
proposed_parameters = self._tuners[pipeline.name].propose()
parameters = self._transform_parameters(pipeline, proposed_parameters)
parameters = (
```
Fixed the parameters we use when this is called through _create_fast_final versus _create_n_pipelines with a larger number of pipelines.
```python
pipeline_parameters[component_name][parameter_name] = parameter_value
return pipeline_parameters

def get_starting_parameters(self, hyperparameter_ranges, random_seed=0):
```
Added the hyperparameter_ranges and random_seed args to limit the starting parameters to only the custom hyperparameters that users pass in.
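For illustration, a minimal sketch of what a tuner's `get_starting_parameters` could do under this design. `SimpleTuner` and its sampling logic are hypothetical stand-ins, not EvalML's actual tuner implementation:

```python
import random


class SimpleTuner:
    """Illustrative stand-in for a tuner; not EvalML's actual class."""

    def __init__(self, pipeline_hyperparameter_ranges):
        self._ranges = pipeline_hyperparameter_ranges

    def get_starting_parameters(self, hyperparameter_ranges, random_seed=0):
        # Only sample values for hyperparameters the user explicitly passed
        # in, so every other component keeps its default parameters.
        rng = random.Random(random_seed)
        starting = {}
        for component, params in (hyperparameter_ranges or {}).items():
            starting[component] = {
                name: rng.choice(values) if isinstance(values, list) else values
                for name, values in params.items()
            }
        return starting
```

Seeding a fresh `random.Random(random_seed)` per call keeps the starting batch deterministic for a given seed.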
freddyaboulton left a comment:
@bchen1116 Looks good to me. Thank you for sticking with this! The perf tests look good and I was able to verify the previous bug in how the default algorithm created the starting batch is no longer there.
I feel confident about the merge but I left a suggestion for a unit test that I think will nail it home for us now and in the future.
Would be good to get two approvals on this one since it's a big refactor. Maybe wait for @jeremyliweishih!
```python
self._tuners[pipeline.name].get_starting_parameters(
    self._hyperparameters, self.random_seed
)
if n == 1
```
I think we want _batch_number <= 2 here, right? This works right now because we only call _create_n_pipelines with n=1 in the _create_fast_final batch, but I think it would break if the code were refactored.
A good alternative here would be to refactor this logic to be under a flag instead of basing it off the number of pipelines generated.
@jeremyliweishih great suggestion! I'll change that.
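The flag-based refactor suggested above could look roughly like this toy sketch. The class and method bodies are illustrative only, not the real algorithm code; the point is that the caller opts in explicitly instead of the behavior hinging on `n == 1`:

```python
class ToyAlgorithm:
    """Toy illustration of gating starting-parameter logic behind a flag
    instead of inferring it from the number of pipelines requested."""

    def __init__(self, starting_params, default_params):
        self._starting = starting_params
        self._default = default_params

    def _create_n_pipelines(self, n, use_starting_parameters=False):
        # The flag makes the caller's intent explicit, so the behavior no
        # longer silently depends on n == 1 (or on the batch number).
        params = self._starting if use_starting_parameters else self._default
        return [params for _ in range(n)]

    def _create_fast_final(self):
        # The fast-final batch opts in to the starting parameters explicitly.
        return self._create_n_pipelines(1, use_starting_parameters=True)
```

With this shape, a future refactor that calls `_create_n_pipelines` with `n=1` elsewhere can no longer accidentally pick up the starting parameters.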
```python
search_parameters={},
)
automl.automl_algorithm.allowed_pipelines = pipelines
automl.automl_algorithm._set_allowed_pipelines(pipelines)
```
Is the tuner not created otherwise? Wondering why we're making this change
Yep! With the changes we made to the iterative algorithm, just setting allowed_pipelines doesn't automatically create tuners for them.
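A toy model of the behavior being described, showing why the setter is needed. `ToyIterativeAlgorithm` is a hypothetical stand-in, not EvalML's class:

```python
class ToyIterativeAlgorithm:
    """Assigning allowed_pipelines directly skips tuner creation; the
    _set_allowed_pipelines method creates one tuner per pipeline."""

    def __init__(self):
        self.allowed_pipelines = []
        self._tuners = {}

    def _set_allowed_pipelines(self, pipelines):
        self.allowed_pipelines = pipelines
        for pipeline in pipelines:
            # object() stands in for a real tuner instance.
            self._tuners[pipeline] = object()
```

A plain attribute assignment leaves `_tuners` empty, which is why the test in the diff above has to call the setter.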
```python
if "n_jobs" in init_params:
    component_parameters["n_jobs"] = self.n_jobs
try:
    if "number_features" in init_params:
```
Will file an issue to delete this. I don't know what it's used for or why it's only used by the iterative algorithm.
```python
],
)
@pytest.mark.parametrize("problem_type", ["binary", "time series binary"])
def test_search_parameters_held_automl(
```
Thoughts on adding a test to check whether the estimator parameters used match the default values? I think we can use inspect to make sure the test does not get stale. We chose the default values precisely because they're fast and get good enough performance, so it would be good to add that coverage. I think that unit test would have caught the original problem with the first version of this PR, right?
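The suggested inspect-based test could be sketched like this. `ToyEstimator`, `declared_defaults`, and the hard-coded batch parameters are all hypothetical placeholders for the real estimators and the parameters the algorithm actually produces:

```python
import inspect


class ToyEstimator:
    """Hypothetical estimator standing in for a real EvalML estimator."""

    def __init__(self, n_estimators=100, max_depth=6):
        self.n_estimators = n_estimators
        self.max_depth = max_depth


def declared_defaults(cls):
    # Pull defaults from the __init__ signature so the test can't go
    # stale when the estimator's defaults change.
    return {
        name: p.default
        for name, p in inspect.signature(cls.__init__).parameters.items()
        if name != "self" and p.default is not inspect.Parameter.empty
    }


def test_first_batch_uses_estimator_defaults():
    # In the real test, these would come from the algorithm's first batch;
    # hard-coded here purely for illustration.
    first_batch_parameters = {"n_estimators": 100, "max_depth": 6}
    assert first_batch_parameters == declared_defaults(ToyEstimator)
```

Because `declared_defaults` reads the signature at test time, a change to an estimator's defaults would fail this test immediately instead of silently diverging.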
jeremyliweishih left a comment:
LGTM! Should be good to go once Freddy's comments are addressed. Great work @bchen1116!
jeremyliweishih left a comment:
Final revisions are great. Thanks!
Fix #3414
Performance tests here:
report_param.html.zip
Running main locally: (screenshot omitted)

Running this branch locally: (screenshot omitted)
The performance is more or less the same for both the iterative and default algorithms (iterative was only tested for 1 batch locally).