Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove ensemble split and indices in AutoMLSearch #2260

Merged
merged 11 commits into from
May 17, 2021

Conversation

angela97lin
Copy link
Contributor

@angela97lin angela97lin commented May 12, 2021

Closes #2093

@codecov
Copy link

codecov bot commented May 13, 2021

Codecov Report

Merging #2260 (a2ce6e1) into main (93e043c) will decrease coverage by 0.1%.
The diff coverage is 100.0%.

Impacted file tree graph

@@            Coverage Diff            @@
##             main    #2260     +/-   ##
=========================================
- Coverage   100.0%   100.0%   -0.0%     
=========================================
  Files         280      280             
  Lines       24382    24274    -108     
=========================================
- Hits        24360    24252    -108     
  Misses         22       22             
Impacted Files Coverage Δ
evalml/tests/automl_tests/test_engine_base.py 100.0% <ø> (ø)
evalml/automl/automl_search.py 100.0% <100.0%> (ø)
evalml/automl/engine/engine_base.py 100.0% <100.0%> (ø)
evalml/automl/utils.py 100.0% <100.0%> (ø)
evalml/tests/automl_tests/dask_test_utils.py 100.0% <100.0%> (ø)
evalml/tests/automl_tests/test_automl.py 99.7% <100.0%> (-<0.1%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 93e043c...a2ce6e1. Read the comment docs.

@angela97lin
Copy link
Contributor Author

angela97lin commented May 13, 2021

@angela97lin angela97lin marked this pull request as ready for review May 15, 2021 18:41
Copy link
Contributor

@bchen1116 bchen1116 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Contributor

@freddyaboulton freddyaboulton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@angela97lin Good job chasing this issue down! I'm glad we're removing our ensemble split for the time being as it cuts down the complexity of our automl and engine code.

It seems like the root case of #2093 was that we weren't fairly comparing the performance of our ensemble pipeline to the other pipelines. I agree with your analysis that removing the ensemble split moves the rankings in the direction we want but
I feel like we're not really fixing the root cause because the cv score for ensembles only considers the first fold as opposed to all folds. I imagine this could be unrealistic in datasets with variance across the folds, i.e. fail the high-variance-cv check.

I think as long as the ensemble pipeline is not scored on the same data as the other pipelines, we leave the door open for people to question the validity of our leaderboard and for similar tricky open-ended questions in the future, e.g "ensemble shows up first in leaderboard but does not outperform xgboost in holdout data. why?"

I'll file a separate issue to see if we can compute the score for ensembles the same way as the other pipelines.

@angela97lin
Copy link
Contributor Author

@freddyaboulton Yes, I definitely agree with you that even here, we're not doing a fair apples-to-apples comparison. I've talked to @dsherry about this before, and he mentioned that he and @kmax12 have had some discussions about a separate "model-selection" split where we hold out some data that is then used to validate the model and determine the actual ranking of the models as they should appear on the leaderboard, rather than using the training cv score we're currently relying on.

I know work is being done for the automl algo right now, not sure if this would step on those toes but I've filed #2284 to track this, feel free to add more there! :D

@angela97lin angela97lin merged commit b929044 into main May 17, 2021
@angela97lin angela97lin deleted the test_remove_ensemble_split branch May 17, 2021 21:02
Copy link
Contributor

@dsherry dsherry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! 🥳

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Stacked ensemble performing poorly
4 participants