
Make utils to handle the logic for threshold tuning objective and resplitting data #3888

Merged: 8 commits into main from make-resplit-utils, Dec 15, 2022

Conversation

tamargrey (Contributor):

closes #3885
closes #3863

Allows us to determine, for pipelines produced by automl search, whether the full training data was used to train them.
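
As a rough sketch of what the new util boils down to (the attribute names, problem-type check, and return shape below are assumptions for illustration, not evalml's actual implementation): tune the threshold only for binary pipelines when threshold optimization is enabled and the objective supports it, and resplit the training data exactly when a threshold will be tuned.

```python
# Illustrative sketch only: the real helper lives in evalml/automl/utils.py, and its
# exact signature, checks, and return values may differ from what is shown here.


def get_threshold_tuning_info(automl_config, pipeline):
    """Decide whether this pipeline's decision threshold should be tuned, and
    therefore whether the training data needs to be resplit into a training set
    plus a threshold-tuning holdout."""
    objective = automl_config.objective
    can_tune_threshold = (
        getattr(automl_config, "optimize_thresholds", False)
        and str(pipeline.problem_type) == "binary"  # simplified problem-type check
        and getattr(objective, "can_optimize_threshold", False)
    )
    threshold_tuning_objective = objective if can_tune_threshold else None
    # If the threshold gets tuned, part of the data is held back for tuning, so the
    # pipeline is *not* fitted on the full training data.
    data_needs_resplitting = can_tune_threshold
    return threshold_tuning_objective, data_needs_resplitting
```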

@@ -261,3 +262,62 @@ def get_pipelines_from_component_graphs(
),
)
return created_pipelines


def get_threshold_tuning_info(automl_config, pipeline):
tamargrey (Contributor, Author):

I stuck these in automl utils instead of putting them in the engine_base file/making a new engine utils file, because part of their use is in determining after automl search whether pipelines were trained on the full data. So since they can be used in the wider context of automl, I thought it made sense to put them there. If anyone has any strong feelings to the contrary, definitely let me know!
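
As a hedged usage sketch of that wider context (reusing the illustrative helper above; `automl_config` and `best_pipeline` are hypothetical names standing in for whatever the caller has on hand after search):

```python
# Hypothetical post-search check, built on the illustrative helper sketched above.
# If no threshold-tuning objective applied, no holdout was carved off, so the
# pipeline was fitted on the full training data handed to the engine.
tuning_objective, needed_resplit = get_threshold_tuning_info(automl_config, best_pipeline)
trained_on_full_data = not needed_resplit
if not trained_on_full_data:
    print("Pipeline was trained on a reduced split; consider refitting on all of X/y.")
```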

@@ -103,7 +103,7 @@ def test_train_and_score_pipelines_error(
assert "yeet" in caplog.text


@patch("evalml.automl.engine.engine_base.split_data")
@patch("evalml.automl.utils.split_data")
tamargrey (Contributor, Author):

The logic isn't changing, so I didn't add any tests that specifically test these utils on their own, but I'd be open to adding some if that's a necessary part of pulling this logic out into public utils.
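
One side effect of the move, visible in the patched target above, is the usual `unittest.mock` rule of patching a name where it is looked up: with `split_data` now exposed from `evalml.automl.utils`, tests patch it there instead of at `evalml.automl.engine.engine_base`. A minimal sketch (the test name and assertion flow are assumptions):

```python
from unittest.mock import patch


@patch("evalml.automl.utils.split_data")  # new lookup location after the refactor
def test_threshold_tuning_uses_shared_split(mock_split_data):
    # Hypothetical flow: exercise the engine's training path here, then confirm
    # that the shared util (not a private engine_base copy) performed the split.
    ...
    # mock_split_data.assert_called_once()
```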

codecov bot commented Dec 13, 2022

Codecov Report

Merging #3888 (9f7d393) into main (1ab688d) will increase coverage by 0.1%.
The diff coverage is 100.0%.

@@           Coverage Diff           @@
##            main   #3888     +/-   ##
=======================================
+ Coverage   99.7%   99.7%   +0.1%     
=======================================
  Files        346     346             
  Lines      36352   36358      +6     
=======================================
+ Hits       36221   36227      +6     
  Misses       131     131             
Impacted Files                                          Coverage Δ
evalml/automl/__init__.py                               100.0% <ø> (ø)
evalml/automl/engine/engine_base.py                     100.0% <100.0%> (ø)
evalml/automl/utils.py                                   97.3% <100.0%> (+0.5%) ⬆️
evalml/tests/automl_tests/test_automl.py                 99.5% <100.0%> (ø)
.../automl_tests/test_automl_search_classification.py    96.4% <100.0%> (ø)
evalml/tests/automl_tests/test_engine_base.py           100.0% <100.0%> (ø)


@tamargrey tamargrey marked this pull request as ready for review December 14, 2022 14:03
@christopherbunn christopherbunn (Contributor) left a comment:


LGTM! RE: additional tests for the new functions, I also agree that it doesn't seem necessary given that the logic was covered before.

@eccabay eccabay (Contributor) left a comment:


Looks great, thanks for taking care of this!

@@ -2,6 +2,7 @@ Release Notes
-------------
**Future Releases**
* Enhancements
* Make utils to handle the logic for threshold tuning objective and resplitting data :pr:`3888`
Contributor:

I have decided past tense release notes are the hill I'm going to die on. Can we do make --> made? Or even really "added" instead.

tamargrey (Contributor, Author):

ah damn! I had noticed that and had been doing past tense to fit in. I'll make the change!

For what it's worth, my commit message (and release note, since I mostly treat them the same) writing style comes from this article that I was ~~forced~~ asked to read at my first internship. The thing that made the biggest impression on me was this section:
[image: screenshot of the referenced section of the article]

tamargrey (Contributor, Author):

Okay, I see the internet is kind of split on what tense release notes should be in, but more sources say past tense than present tense.

Contributor:

Yeah, my personal ethos has always been past tense for release notes and present tense for commits, and that seems to be what we generally follow. Super interesting article though, lots to think about there!

@jeremyliweishih jeremyliweishih (Collaborator) left a comment:


This LGTM as well - thanks for handling this!

@tamargrey tamargrey merged commit a37d089 into main Dec 15, 2022
@tamargrey tamargrey deleted the make-resplit-utils branch December 15, 2022 17:11
@christopherbunn christopherbunn mentioned this pull request Jan 3, 2023