Implement Early Stopping for AutoML #241
Conversation
Codecov Report
@@            Coverage Diff             @@
##           master     #241      +/-   ##
==========================================
- Coverage   97.14%   97.03%   -0.12%
==========================================
  Files          95       95
  Lines        2872     2932      +60
==========================================
+ Hits         2790     2845      +55
- Misses         82       87       +5
Continue to review full report at Codecov.
Default Value Options:
LGTM, just a small thing about early_stopping = 0 :)
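For context, here is a rough sketch of a patience-based stopping rule, assuming higher scores are better; the function and parameter names below are illustrative, not the API added by this PR. One possible reading of the early_stopping = 0 nit is the edge case it exposes: with a patience of 0 the rule fires on the very first round, so zero either needs to be rejected or documented explicitly as "disabled".

```python
# Illustrative sketch only: names and semantics are assumptions, not the API
# added by this PR.
def should_stop_early(scores, patience, tolerance=0.0):
    """Return True once `patience` consecutive rounds fail to beat the best
    score seen so far by more than `tolerance` (higher scores are better)."""
    best = None
    rounds_without_improvement = 0
    for score in scores:
        if best is None or score > best + tolerance:
            best = score
            rounds_without_improvement = 0
        else:
            rounds_without_improvement += 1
        # Edge case: with patience == 0 this condition is met on the very
        # first round, so a zero value stops the search immediately.
        if rounds_without_improvement >= patience:
            return True
    return False
```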
@kmax12 @dsherry, this is what pipeline search looks like now. I could also revert to 0%, 10%, etc., but I think this makes more sense until we can revamp the progress bar. The downside is that it would hit 100% when the last pipeline trains rather than after. However, the timing would make sense when compared to master.
'pipeline_results': {}
}

scores = [0.95, 0.84, 0.91]
@dsherry instead of mocking the AutoML process, here I mock the results and check whether they trigger the early stopping conditions. I think this is an improvement on the previous tests, where I would check using clf.fit.
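As a rough illustration of that testing style (not the PR's actual test), the stopping check can be exercised directly with mocked scores and no model fitting at all. The helper below repeats the sketch from the earlier comment so the snippet stands alone; every name here is hypothetical.

```python
# Illustration of the mocked-results approach described above; not the PR's
# actual test. `should_stop_early` repeats the earlier sketch so this snippet
# runs on its own, and all names are hypothetical.
def should_stop_early(scores, patience, tolerance=0.0):
    best, stalled = None, 0
    for score in scores:
        if best is None or score > best + tolerance:
            best, stalled = score, 0
        else:
            stalled += 1
        if stalled >= patience:
            return True
    return False

def test_early_stopping_from_mocked_scores():
    scores = [0.95, 0.84, 0.91]  # one strong round, then two non-improving rounds
    assert should_stop_early(scores, patience=2)      # two stale rounds: stop
    assert not should_stop_early(scores, patience=3)  # not enough stale rounds yet
```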
Cool! This is great. Not fitting models in unit tests makes me super happy :)
My opinion is that it's best practice to only call public methods in tests whenever possible. I have certainly violated this convention in many of my own PRs 😂 but that was part of why I suggested mocking the pipeline objects themselves, because then you could call automl fit/search here, have the mock pipelines do nothing in their fit methods, and have them return your custom mock scores in the scoring code. But I do like what you have here now.
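To make that suggestion concrete, here is a self-contained sketch of the mock-the-pipelines approach. The AutoSearch class is a local stand-in for the real search object (so the snippet runs on its own), and the test only touches its public search method; none of these names come from the PR.

```python
from unittest.mock import MagicMock

# Self-contained sketch of the "mock the pipeline objects" idea. AutoSearch is
# a local stand-in for the real search object; only its public search() method
# is called from the test. None of these names come from the PR.
class AutoSearch:
    def __init__(self, pipelines, patience):
        self.pipelines = pipelines
        self.patience = patience
        self.scores = []

    def search(self, X, y):
        best, stalled = None, 0
        for pipeline in self.pipelines:
            pipeline.fit(X, y)            # mocked: does nothing
            score = pipeline.score(X, y)  # mocked: returns a canned value
            self.scores.append(score)
            if best is None or score > best:
                best, stalled = score, 0
            else:
                stalled += 1
            if stalled >= self.patience:  # early stopping: quit the search loop
                break

def test_search_stops_early_using_public_api_only():
    pipelines = []
    for canned_score in [0.95, 0.84, 0.91, 0.99]:
        pipeline = MagicMock()
        pipeline.score.return_value = canned_score
        pipelines.append(pipeline)

    automl = AutoSearch(pipelines, patience=2)
    automl.search(X=None, y=None)

    # Two consecutive non-improving rounds trigger early stopping, so the
    # fourth (better) pipeline is never fitted or scored.
    assert automl.scores == [0.95, 0.84, 0.91]
    pipelines[3].fit.assert_not_called()
```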
@dsherry you bring up some great points! I left those ideas out of this PR because they could be applied to many other parts of our unit tests, and I thought it would be best to do a more comprehensive overhaul and design a new testing framework accordingly. I don't have any concrete thoughts on how to do that yet, but maybe we can file an issue and work through it there.
Awesome. I just saw your ticket about mock testing, #275. Thanks for filing that.
I think this is looking solid! Nice going @jeremyliweishih
I left a doc typo comment
LGTM. Nice work
create accessor methods for new results dict