Make it easy to customize automl search data splitter #1568
Codecov Report
@@            Coverage Diff            @@
##             main    #1568    +/-   ##
=========================================
+ Coverage   100.0%   100.0%   +0.1%
=========================================
  Files         239      240      +1
  Lines       17593    17677     +84
=========================================
+ Hits        17585    17669     +84
  Misses          8        8
bchen1116
left a comment
LGTM!
This change means we should file a separate issue to update our implementation for class_imbalance_data_check, specifically the cv_folds arg here. It defaults to 3 since that's the default data split n_folds, but with this change, I believe we should update AutoMLSearch to pass in the required param. Not blocking this PR though!
@bchen1116 excellent point RE the class imbalance data check! I just filed it as #1570.
freddyaboulton
left a comment
@dsherry Looks great!
Fix #1567
Usage: configure automl search to use a different number of folds (splits) in CV
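A minimal sketch of what this usage might look like, assuming the splitter is a scikit-learn cross-validator handed to the search through the `data_split` argument named in this description. The `AutoMLSearch` call is left as a comment because its exact signature is an assumption here; the runnable part only demonstrates the fold count on toy data.

```python
from sklearn.model_selection import StratifiedKFold

# Toy binary-classification data: 10 rows, 5 per class.
X = [[i] for i in range(10)]
y = [0, 1] * 5

# Ask for five folds instead of the default of three mentioned in the review.
splitter = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

folds = list(splitter.split(X, y))
n_folds = len(folds)
test_sizes = [len(test_idx) for _, test_idx in folds]

# Assumed usage (parameter name taken from this PR, everything else hypothetical):
# automl = AutoMLSearch(problem_type="binary", data_split=splitter)
```

With 10 rows and 5 folds, each test fold holds 2 rows, one from each class.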
Usage: disable shuffling
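A minimal sketch of disabling shuffling, assuming any scikit-learn cross-validator can be passed as the splitter. `KFold` is used purely for illustration, and the `AutoMLSearch` call stays in a comment because its signature is an assumption here.

```python
from sklearn.model_selection import KFold

# Six rows of toy data; only the row order matters for this example.
X = [[i] for i in range(6)]

# shuffle=False keeps rows in their original order,
# so each test fold is a contiguous block of row indices.
splitter = KFold(n_splits=3, shuffle=False)
test_folds = [test_idx.tolist() for _, test_idx in splitter.split(X)]

# Assumed usage (parameter name taken from this PR, everything else hypothetical):
# automl = AutoMLSearch(problem_type="regression", data_split=splitter)
```

Here `test_folds` comes out as `[[0, 1], [2, 3], [4, 5]]`, which is useful when row order carries meaning and should not be randomized.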
I suppose we could simply add n_splits and shuffle to AutoMLSearch.__init__, but I wanted to keep us flexible on this a while longer. We'll be adding more heuristics to the data splitting logic here at some point, and it's nice to have the splitter exposed.

Hmm, perhaps we need to rename "data_split" to "data_splitter" as well.