Default optimize_thresholds to True in AutoMLSearch #1943
Conversation
Codecov Report
@@            Coverage Diff            @@
##              main    #1943    +/-   ##
=========================================
+ Coverage    100.0%   100.0%    +0.1%
=========================================
  Files          273      273
  Lines        22279    22356      +77
=========================================
+ Hits         22273    22350      +77
  Misses           6        6
Continue to review full report at Codecov.
@@ -83,7 +83,7 @@ def test_search_results(X_y_regression, X_y_binary, X_y_multi, automl_type, objective):
         expected_pipeline_class = MulticlassClassificationPipeline
         X, y = X_y_multi

-    automl = AutoMLSearch(X_train=X, y_train=y, problem_type=automl_type, objective=objective, max_iterations=2, n_jobs=1)
+    automl = AutoMLSearch(X_train=X, y_train=y, problem_type=automl_type, objective=objective, optimize_thresholds=False, max_iterations=2, n_jobs=1)
Needed to set `optimize_thresholds` to `False` to avoid issues with `predict_proba` when training `best_pipeline`. Otherwise I could have mocked `predict_proba` for all of the tests, but this was a faster approach with similar results.
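For reference, a minimal sketch (not from this PR) of what the mocking alternative mentioned above might look like; the patch target, test name, and fixed probabilities are illustrative assumptions:

```python
from unittest.mock import patch

import numpy as np
import pandas as pd

from evalml.automl import AutoMLSearch


# Hypothetical alternative to passing optimize_thresholds=False: patch
# predict_proba so training best_pipeline never calls the real estimator.
# The patch target and the fixed 50/50 probabilities are assumptions for
# illustration, not code from this PR.
@patch('evalml.pipelines.BinaryClassificationPipeline.predict_proba')
def test_search_results_mocked_proba(mock_predict_proba, X_y_binary):
    X, y = X_y_binary
    mock_predict_proba.return_value = pd.DataFrame(np.full((len(y), 2), 0.5))
    automl = AutoMLSearch(X_train=X, y_train=y, problem_type='binary',
                          max_iterations=2, n_jobs=1)
    automl.search()
    assert automl.best_pipeline is not None
```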
@bchen1116 I have a couple design questions before taking a closer look!
@@ -227,20 +227,21 @@ def test_additional_objectives(X_y_binary):

@patch('evalml.objectives.BinaryClassificationObjective.optimize_threshold')
@patch('evalml.pipelines.BinaryClassificationPipeline._encode_targets', side_effect=lambda y: y)
Need to patch `_encode_targets` in order to optimize the threshold.
Yep, otherwise sklearn will throw an exception. The encoder is fit during `pipeline.fit`, but since we're mocking `fit`, that doesn't happen.
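To make the discussion concrete, here is a condensed sketch of the patching pattern being reviewed; the test name, mocked score value, and assertion are placeholders rather than the PR's actual test:

```python
from unittest.mock import patch

from evalml.automl import AutoMLSearch


# Sketch of the pattern discussed above. Because fit is mocked, the pipeline's
# label encoder is never fit, so _encode_targets is patched to pass targets
# through unchanged; without that, sklearn raises a NotFittedError during
# threshold tuning. Test name, score value, and threshold are placeholders.
@patch('evalml.objectives.BinaryClassificationObjective.optimize_threshold')
@patch('evalml.pipelines.BinaryClassificationPipeline._encode_targets', side_effect=lambda y: y)
@patch('evalml.pipelines.BinaryClassificationPipeline.predict_proba')
@patch('evalml.pipelines.BinaryClassificationPipeline.score')
@patch('evalml.pipelines.BinaryClassificationPipeline.fit')
def test_threshold_tuning_sketch(mock_fit, mock_score, mock_predict_proba,
                                 mock_encode_targets, mock_optimize_threshold,
                                 X_y_binary):
    X, y = X_y_binary
    mock_score.return_value = {'F1': 1.0}
    mock_optimize_threshold.return_value = 0.62
    automl = AutoMLSearch(X_train=X, y_train=y, problem_type='binary',
                          objective='F1', max_iterations=1, n_jobs=1)
    automl.search()
    # With optimize_thresholds defaulting to True, threshold tuning should run
    # for the best pipeline and call the (mocked) objective's optimizer.
    mock_optimize_threshold.assert_called()
```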
@bchen1116 This looks great!
Looks good! Mostly left some nit-picky comments heh
Fixes #648
Perf test here
In-depth dive added here
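For context, a minimal usage sketch of the behavior this PR changes, assuming the evalml API of this release; the demo dataset and objective are just illustrative:

```python
from evalml.automl import AutoMLSearch
from evalml.demos import load_breast_cancer

X, y = load_breast_cancer()

# With this change, optimize_thresholds defaults to True, so for binary
# problems the best pipeline's decision threshold is tuned after search.
automl = AutoMLSearch(X_train=X, y_train=y, problem_type='binary',
                      objective='F1', max_iterations=3)
automl.search()
print(automl.best_pipeline.threshold)  # expected to be set by threshold tuning

# The previous behavior is still available by opting out explicitly.
automl_untuned = AutoMLSearch(X_train=X, y_train=y, problem_type='binary',
                              objective='F1', max_iterations=3,
                              optimize_thresholds=False)
automl_untuned.search()
```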