Add thresholding_objective argument to AutoMLSearch #2320
Conversation
Codecov Report
@@           Coverage Diff           @@
##            main   #2320     +/-   ##
=======================================
+ Coverage   99.9%   99.9%   +0.1%
=======================================
  Files        281     281
  Lines      24858   24907     +49
=======================================
+ Hits       24825   24874     +49
  Misses        33      33
Continue to review full report at Codecov.
  mock_score_regression.return_value = {'R2': 1.0}
- automl = AutoMLSearch(X_train=X, y_train=y, problem_type=automl_type, max_iterations=1)
+ automl = AutoMLSearch(X_train=X, y_train=y, problem_type=automl_type, optimize_thresholds=False, max_iterations=1)
For most of these tests, I chose to set optimize_thresholds=False rather than patch predict_proba, optimize_thresholds, and _encode_targets.
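The alternative approach mentioned here (patching the optimizer rather than disabling it) can be sketched in a self-contained way. The class and function names below are illustrative stand-ins, not evalml's actual API; the point is only the `patch.object` pattern.

```python
from unittest.mock import patch

# Hypothetical stand-in for a binary objective whose optimize_threshold
# would normally do an expensive search over candidate thresholds.
class DummyBinaryObjective:
    def optimize_threshold(self, ypred_proba, y_true):
        raise RuntimeError("expensive optimizer should not run in tests")

def tune_pipeline_threshold(objective, ypred_proba, y_true):
    # Code under test: delegates threshold tuning to the objective.
    return objective.optimize_threshold(ypred_proba, y_true)

# Patch the optimizer so the test stays fast and deterministic,
# mirroring how pipeline fit/score are mocked in other tests.
with patch.object(DummyBinaryObjective, "optimize_threshold", return_value=0.5) as mock_opt:
    threshold = tune_pipeline_threshold(DummyBinaryObjective(), [0.1, 0.9], [0, 1])

assert threshold == 0.5
mock_opt.assert_called_once()
```

This keeps the threshold-tuning code path exercised (the call happens) without paying the runtime cost, whereas `optimize_thresholds=False` skips the path entirely.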
  mock_optimize_threshold.assert_not_called()
  assert automl.best_pipeline.threshold is None
  mock_split_data.assert_not_called()
else:
Check that we optimize the threshold even if the main objective isn't optimizable
evalml/automl/automl_search.py
Outdated
self.threshold_automl_config = self.automl_config
if is_binary(self.problem_type) and self.optimize_thresholds and self.objective.score_needs_proba:
    # use the thresholding_objective
    self.threshold_automl_config = AutoMLConfig(self.data_splitter, self.problem_type,
This allows us to use the thresholding_objective as the objective to train the best pipeline with when threshold optimization is enabled.
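The fallback rule encoded in the diff above can be sketched as a small pure function. All names below are illustrative stand-ins for evalml's internals, under the assumption stated in the diff: for binary problems with threshold optimization on, an objective that scores on probabilities can't be thresholded directly, so the thresholding objective is used instead.

```python
from dataclasses import dataclass

@dataclass
class Objective:
    name: str
    # True means the objective scores on predicted probabilities,
    # so there is no decision threshold to tune against it directly.
    score_needs_proba: bool

def objective_for_thresholding(problem_type, optimize_thresholds,
                               main_objective, thresholding_objective):
    """Pick which objective the decision threshold is tuned against."""
    if (problem_type == "binary"
            and optimize_thresholds
            and main_objective.score_needs_proba):
        # e.g. Log Loss or AUC as the main objective: fall back.
        return thresholding_objective
    return main_objective

log_loss = Objective("Log Loss Binary", score_needs_proba=True)
f1 = Objective("F1", score_needs_proba=False)

assert objective_for_thresholding("binary", True, log_loss, f1) is f1
assert objective_for_thresholding("binary", True, f1, f1) is f1
assert objective_for_thresholding("binary", False, log_loss, f1) is log_loss
```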
@patch('evalml.pipelines.BinaryClassificationPipeline.score')
@patch('evalml.pipelines.BinaryClassificationPipeline.fit')
@patch('evalml.pipelines.BinaryClassificationPipeline.predict_proba')
def test_tuning_threshold_objective(mock_predict, mock_fit, mock_score, mock_encode_targets, mock_optimize_threshold, objective, X_y_binary):
The other tests cover this condition already, so I'm removing this test.
@freddyaboulton updated this with the perf test results!
ParthivNaresh
left a comment
Excellent work on this!
dsherry
left a comment
@bchen1116 looking good! I had one code change request, some comments on naming/docs and a couple points on the tests.
evalml/automl/automl_search.py
Outdated
if (
    is_binary(self.problem_type)
    and self.optimize_thresholds
    and self.objective.score_needs_proba
):
    # use the thresholding_objective
    self.threshold_automl_config = AutoMLConfig(
        self.data_splitter,
        self.problem_type,
        self.thresholding_objective,
        self.additional_objectives,
        self.thresholding_objective,
        self.optimize_thresholds,
        self.error_callback,
        self.random_seed,
        self.X_train.ww.schema,
        self.y_train.ww.schema,
    )
@bchen1116 hang on, I think @chukarsten 's point still stands. The logic you added in engine_base.py looks great. And I see you've added the alternate threshold tuning objective as an additional argument to AutoMLConfig above this code block, which looks great. So, why can't you delete this block entirely?
if objective == "Log Loss Binary":
    assert automl.best_pipeline.threshold is None
else:
    assert automl.best_pipeline.threshold == 0.5
Ah, so when "optimize" is true, that means automl will have run in this test with optimize_thresholds true, in which case the mock optimization value is expected, regardless of what the objective was. 👍
If you wanna go above and beyond haha, I bet we could move most of this test to cover engine_base.py code directly instead of being an automl-level test. We wrote this test before the engine concept existed.
Sorry lol. First part was me explaining to myself what this test does so that I understand it. Second part was a suggested improvement which you can certainly ignore if you'd like.
My point was that technically this test is checking the behavior of EngineBase.train_and_score_pipelines and doesn't have anything to do with AutoMLSearch itself, right? I guess it's also checking that the threshold values get saved and attached to AutoMLSearch.best_pipeline, but we could write a simpler test to check that using mocking.
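The "simpler test" suggested here might look something like the sketch below: use mocks to verify only that a tuned threshold ends up attached to the best pipeline, without running a real search. The helper function is hypothetical, standing in for whatever automl code saves the trained pipeline.

```python
from unittest.mock import MagicMock

# Hypothetical stand-in for the automl code path that records the
# trained pipeline and its tuned threshold; names are illustrative.
def save_best_pipeline(automl, trained_pipeline, tuned_threshold):
    trained_pipeline.threshold = tuned_threshold
    automl.best_pipeline = trained_pipeline
    return automl

automl = MagicMock()
pipeline = MagicMock()
save_best_pipeline(automl, pipeline, 0.42)

# The property under test: the threshold survives onto best_pipeline.
assert automl.best_pipeline is pipeline
assert automl.best_pipeline.threshold == 0.42
```

A test at this level isolates the "threshold gets saved" behavior from the engine-level threshold-tuning behavior, so each can fail independently.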
@dsherry I might just leave this test as is since it is particular to time series as well, and it seems to be a thorough enough test that shouldn't take much time. Seems like this test is just making sure that AutoMLSearch can threshold time series problems properly when needed.
X_train=X,
y_train=y,
problem_type="binary",
optimize_thresholds=False,
Following up on my outdated comment on another one of these similar lines:
Why set these to False?
If you did it to avoid having the tests waste time calling the threshold optimizer, then, lol I agree we shouldn't run the optimizer in every test. I do wonder if there's a different way to accomplish this though. Could we mock BinaryClassificationObjective.optimize_threshold instead, in the same way we mock pipeline fit and score in many tests?
I wasn't just referring to that specific test though heh, I was referring to every automl test where you've added optimize_thresholds=False in this PR.
Ah, I set optimize_thresholds=False to avoid tuning the thresholds. The other option would've been to patch predict_proba, optimize_thresholds, and encode_targets, since so many of the tests patch the pipeline fit.
Oh damn yeah I follow. Basically, now that we default to enabling threshold optimization, even if we mock pipeline fit/predict/score, optimization will still run and consume a bunch of test runtime.
@freddyaboulton FYI this could be relevant for #1815 #2298
Thanks for explaining @bchen1116 SGTM
dsherry
left a comment
@bchen1116 yep, well done! I left some comments about one last refactor, to minimize duplicated code.
It's great that our binary classification models will now stay tuned

Fixes #2301
Adds an additional thresholding_objective argument to AutoMLSearch. We use this to threshold binary classification problems when the original objective isn't thresholdable.
Original design doc here
Perf test results HERE