Remove ensemble split and indices in AutoMLSearch #2260
Conversation
Codecov Report
@@            Coverage Diff            @@
##             main    #2260     +/-  ##
=========================================
- Coverage   100.0%   100.0%    -0.0%
=========================================
  Files         280      280
  Lines       24382    24274     -108
=========================================
- Hits        24360    24252     -108
  Misses         22       22
Continue to review full report at Codecov.
Hm. Most recent attempt passed, but noting these random Windows failures:
LGTM!
@angela97lin Good job chasing this issue down! I'm glad we're removing our ensemble split for the time being, as it cuts down the complexity of our automl and engine code.
It seems like the root cause of #2093 was that we weren't fairly comparing the performance of our ensemble pipeline to the other pipelines. I agree with your analysis that removing the ensemble split moves the rankings in the direction we want, but I feel like we're not really fixing the root cause, because the cv score for ensembles only considers the first fold as opposed to all folds. I imagine this could produce an unrepresentative score on datasets with variance across the folds, i.e., datasets that would fail the high-variance-cv check.
I think as long as the ensemble pipeline is not scored on the same data as the other pipelines, we leave the door open for people to question the validity of our leaderboard, and for similar tricky open-ended questions in the future, e.g., "The ensemble shows up first in the leaderboard but does not outperform xgboost on holdout data. Why?"
I'll file a separate issue to see if we can compute the score for ensembles the same way as the other pipelines.
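To make that concrete, here is a minimal sketch of scoring a pipeline on every CV fold rather than only the first. This is an illustration following sklearn conventions, not evalml's actual API; `pipeline`, `scorer`, and `cross_val_score_all_folds` are all hypothetical stand-ins.

```python
import numpy as np
from sklearn.model_selection import KFold

def cross_val_score_all_folds(pipeline, X, y, scorer, n_splits=3):
    """Mean validation score across all folds, not just the first.

    `scorer` follows the sklearn scorer(estimator, X, y) convention.
    """
    scores = []
    for train_idx, valid_idx in KFold(n_splits=n_splits).split(X):
        pipeline.fit(X[train_idx], y[train_idx])
        scores.append(scorer(pipeline, X[valid_idx], y[valid_idx]))
    # Averaging over every fold keeps high-variance datasets from being
    # summarized by a single, possibly unrepresentative, fold.
    return np.mean(scores)
```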
@freddyaboulton Yes, I definitely agree that even here, we're not doing a fair apples-to-apples comparison. I've talked to @dsherry about this before, and he mentioned that he and @kmax12 have discussed a separate "model-selection" split: we'd hold out some data that is then used to validate the models and determine their actual ranking on the leaderboard, rather than relying on the training cv score we currently use. I know work is being done on the automl algo right now, and I'm not sure whether this would step on those toes, but I've filed #2284 to track this. Feel free to add more there! :D
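For reference, a minimal sketch of what that "model-selection" split could look like. `run_search` is a hypothetical stand-in for the CV search and ensembling step, and every name here is a placeholder, not evalml's API.

```python
from sklearn.model_selection import train_test_split

def search_then_rank(X, y, run_search, scorer, holdout_size=0.2):
    """Hold out data before the search, then rank all pipelines on it.

    `run_search(X, y)` is assumed to return a list of fitted pipelines;
    `scorer` follows the sklearn scorer(estimator, X, y) convention.
    """
    X_search, X_rank, y_search, y_rank = train_test_split(
        X, y, test_size=holdout_size, random_state=0
    )
    pipelines = run_search(X_search, y_search)
    # Every pipeline, ensemble included, is scored on the same unseen
    # data, so the leaderboard ordering is directly comparable.
    return sorted(
        pipelines,
        key=lambda p: scorer(p, X_rank, y_rank),
        reverse=True,  # assumes a greater-is-better metric
    )
```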
Looks good! 🥳
Closes #2093