Implement Early Stopping for AutoML #241

Merged
merged 51 commits into from
Dec 12, 2019

Collaborator

@jeremyliweishih jeremyliweishih commented Nov 25, 2019

  • test with max_time
  • fix last iteration popping up twice on tqdm
  • early_stopping default value
  • update docs
  • tolerance / min_delta
  • use new results dict for early stopping
  • edit docs to show patience
  • create accessor methods for new results dict
  • file tickets on early stopping validation set, adaptive tolerance

codecov bot commented Nov 25, 2019

Codecov Report

Merging #241 into master will decrease coverage by 0.11%.
The diff coverage is 93.42%.

@@            Coverage Diff             @@
##           master     #241      +/-   ##
==========================================
- Coverage   97.14%   97.03%   -0.12%     
==========================================
  Files          95       95              
  Lines        2872     2932      +60     
==========================================
+ Hits         2790     2845      +55     
- Misses         82       87       +5
| Impacted Files | Coverage Δ |
|---|---|
| evalml/models/auto_regressor.py | 100% <ø> (ø) ⬆️ |
| evalml/models/auto_classifier.py | 100% <ø> (ø) ⬆️ |
| evalml/tests/automl_tests/test_autoregressor.py | 100% <100%> (ø) ⬆️ |
| evalml/tests/automl_tests/test_autoclassifier.py | 100% <100%> (ø) ⬆️ |
| evalml/models/auto_base.py | 92.72% <89.13%> (-1.22%) ⬇️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update ea32c4a...173cdea.

Collaborator Author

jeremyliweishih commented Dec 2, 2019

Default Value Options:

| Value | Pros | Cons |
|---|---|---|
| None | Easiest, less confusion for user, user has entire control | More parameters to think of for user |
| Defined Value (something like 3 from H2O) | One less parameter for users to worry about, easy to change if we believe the default isn't good | One defined default may not be best for all cases (max time and max pipelines or certain edge cases) |
| Smarter Handling (ratio of max_pipelines or max_time) | Could potentially have the best performance | Hardest to implement and change, lowest ROI |
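
To make the three options above concrete, here is a minimal sketch of how a default could be resolved under each one. Everything in it is hypothetical: the function name `resolve_patience`, the `strategy` argument, and the 0.2 ratio are illustration-only assumptions and do not come from this PR; only the value 3 and the idea of a ratio of `max_pipelines` are taken from the table.

```python
from typing import Optional


def resolve_patience(
    early_stopping: Optional[int] = None,
    max_pipelines: Optional[int] = None,
    strategy: str = "none",
    fixed_default: int = 3,
    ratio: float = 0.2,
) -> Optional[int]:
    """Pick a patience value under one of the three default strategies.

    Hypothetical helper: the names and the 0.2 ratio are assumptions.
    """
    if early_stopping is not None:
        return early_stopping  # an explicit user value always wins
    if strategy == "none":
        return None  # option 1: early stopping stays off unless requested
    if strategy == "fixed":
        return fixed_default  # option 2: one constant for every search
    if strategy == "ratio":
        if max_pipelines is None:
            return None  # nothing to scale against (e.g. only max_time is set)
        # option 3: scale with the search budget, but never below 1
        return max(1, round(ratio * max_pipelines))
    raise ValueError(f"unknown strategy: {strategy!r}")
```

The ratio branch shows where the extra implementation cost comes from: it needs a fallback whenever the budget is given as `max_time` instead of `max_pipelines`.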

Contributor

@angela97lin angela97lin left a comment

LGTM, just a small thing about early_stopping = 0 :)

@jeremyliweishih
Collaborator Author

@kmax12 @dsherry, this is what pipeline search looks like now. I could also revert to 0%, 10%, etc., but I think this makes more sense until we can revamp the progress bar. The downside is that the bar would hit 100% while the last pipeline trains, not after it finishes. However, the timing would make sense when compared to master.

[Screenshot: Screen Shot 2019-12-11 at 12 34 00 PM]

'pipeline_results': {}
}

scores = [0.95, 0.84, 0.91]
Collaborator Author

@jeremyliweishih jeremyliweishih Dec 11, 2019

@dsherry instead of mocking the AutoML process, here I mock the results and check whether they trigger the early stopping conditions. I think this is an improvement on the previous tests, where I would check using clf.fit.
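
The approach described in this comment (build the results by hand, then assert on the stopping condition) can be sketched roughly as follows. This is not the test from the PR: `make_mock_results`, `stopping_condition_met`, the inner `{"score": ...}` layout, and the absolute `tolerance` are all assumptions made for illustration. Only the `'pipeline_results'` key and the scores `[0.95, 0.84, 0.91]` come from the diff snippet above.

```python
from typing import Any, Dict, List, Optional


def make_mock_results(scores: List[float]) -> Dict[str, Any]:
    """Hand-build a results dict; the inner {'score': ...} layout is assumed."""
    return {
        "pipeline_results": {
            idx: {"score": score} for idx, score in enumerate(scores)
        }
    }


def stopping_condition_met(
    results: Dict[str, Any], patience: Optional[int], tolerance: float = 0.0
) -> bool:
    """True when none of the last `patience` scores beats the best earlier
    score by more than `tolerance` (assumes higher scores are better)."""
    if patience is None or patience <= 0:
        return False  # treated as "disabled" in this sketch (assumption)
    entries = results["pipeline_results"]
    scores = [entries[idx]["score"] for idx in sorted(entries)]
    if len(scores) <= patience:
        return False  # not enough history to judge yet
    best_before = max(scores[:-patience])
    return all(s - best_before <= tolerance for s in scores[-patience:])


def test_early_stopping_on_mock_results() -> None:
    # No model is fitted: the scores are written straight into the dict.
    results = make_mock_results([0.95, 0.84, 0.91])
    assert stopping_condition_met(results, patience=2)
    assert not stopping_condition_met(results, patience=3)
    assert not stopping_condition_met(results, patience=None)


test_early_stopping_on_mock_results()
```

Because the check only reads the results dict, the test runs in microseconds and does not depend on any estimator.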

Contributor

Cool! This is great. Not fitting models in unit tests makes me super happy :)

My opinion is that it's best practice to only call public methods in tests whenever possible. I have certainly violated this convention in many of my own PRs 😂 but that was part of why I suggested mocking the pipeline objects themselves, because then you could call automl fit/search here, have the mock pipelines do nothing in their fit methods, and have them return your custom mock scores in the scoring code. But I do like what you have here now.

Collaborator Author

@dsherry you bring up some great points! I left those ideas out of this PR because I thought they could be applied to many other parts of our unit tests and I thought it would be best to have a more comprehensive overhaul and design a new testing framework accordingly. I don't have any concrete thoughts on how we can do so but maybe we can file an issue and work through it there.

Contributor

@dsherry dsherry Dec 12, 2019

Awesome. I just saw your ticket about mock testing, #275. Thanks for filing that.
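
For reference, the alternative raised earlier in this thread (mock the pipeline objects, then drive everything through a public entry point) might look roughly like the sketch below. The `search` function is a toy stand-in written for this example and is not evalml's API, and `make_mock_pipeline` is a hypothetical helper. Only `unittest.mock.MagicMock` is a real library call.

```python
from typing import List, Optional
from unittest.mock import MagicMock


def search(pipelines, X, y, patience: Optional[int] = None) -> List[float]:
    """Toy stand-in for a public search/fit entry point (not evalml's)."""
    scores: List[float] = []
    best = float("-inf")
    stale = 0  # consecutive pipelines that did not improve on `best`
    for pipeline in pipelines:
        pipeline.fit(X, y)
        score = pipeline.score(X, y)
        scores.append(score)
        if score > best:
            best, stale = score, 0
        else:
            stale += 1
        if patience is not None and stale >= patience:
            break  # early stopping: skip the remaining pipelines
    return scores


def make_mock_pipeline(score: float) -> MagicMock:
    """A pipeline whose fit does no work and whose score is fixed."""
    pipeline = MagicMock()
    pipeline.fit.return_value = pipeline
    pipeline.score.return_value = score
    return pipeline


pipelines = [make_mock_pipeline(s) for s in [0.95, 0.84, 0.91, 0.99]]
evaluated = search(pipelines, X=None, y=None, patience=2)

# Two non-improving scores in a row stop the search before the fourth mock.
assert evaluated == [0.95, 0.84, 0.91]
assert not pipelines[3].fit.called
```

The trade-off matches the discussion: this style exercises only public methods, at the cost of needing mock objects that mimic the pipeline interface.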

@jeremyliweishih changed the title from "WIP: Implement Early Stopping for AutoML" to "Implement Early Stopping for AutoML" on Dec 12, 2019
dsherry previously approved these changes Dec 12, 2019
Contributor

@dsherry dsherry left a comment

I think this is looking solid! Nice going @jeremyliweishih

I left a doc typo comment

kmax12 previously approved these changes Dec 12, 2019
Contributor

@kmax12 kmax12 left a comment

LGTM. Nice work
