Added threshold for NullDataCheck #3507
Codecov Report
@@           Coverage Diff           @@
##            main   #3507   +/-   ##
=======================================
+ Coverage   99.7%   99.7%   +0.1%
=======================================
  Files        336     336
  Lines      33420   33436     +16
=======================================
+ Hits       33294   33310     +16
  Misses       126     126
…evalml into add_thresholding_nulldc
         else:
             ww_payload = infer_frequency(
-                X[self.time_index],
+                X_ww[self.time_index],
Sneaking this in here
chukarsten left a comment
This looks solid to me. Left a few nits and some things that aren't blocking that you can feel free to reject.
-    def __init__(self, pct_null_col_threshold=0.95, pct_null_row_threshold=0.95):
+    def __init__(
+        self,
+        pct_null_col_threshold=0.95,
Do we want to change the name of this to reflect that this is "more null" than "moderately null"? Like "heavily null" or something. I dunno, I'm not a thesaurus.
The only reason I chose not to was to prevent this from becoming a breaking change!
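For readers following this thread, here is a minimal sketch of how two column thresholds could split columns into bands. It is not the evalml implementation: `classify_null_columns` and `moderate_threshold` are hypothetical names, the 0.20 default is inferred from the "between 20.0% and 95.0%" message in the updated test, and the inclusive boundaries are an assumption.

```python
def classify_null_columns(columns, moderate_threshold=0.20, high_threshold=0.95):
    """Bucket columns by their fraction of null values (illustrative only).

    `columns` maps a column name to a list of values, with None marking nulls.
    Boundary handling (>= rather than >) is an assumption, not taken from the PR.
    """
    moderate, high = [], []
    for name, values in columns.items():
        if not values:
            continue  # skip empty columns to avoid dividing by zero
        pct_null = sum(v is None for v in values) / len(values)
        if pct_null >= high_threshold:
            high.append(name)
        elif pct_null >= moderate_threshold:
            moderate.append(name)
    return {"moderately_null": moderate, "highly_null": high}


example = {
    "all_null": [None, None, None, None],
    "lots_of_null": [None, None, None, 1],
    "no_null": [1, 2, 3, 4],
}
result = classify_null_columns(example)
```

Keeping the existing `pct_null_col_threshold` name for the upper band, as argued above, means callers that already pass it would see no change in this kind of scheme.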
             ).to_dict(),
             DataCheckWarning(
-                message="Column(s) 'lots_of_null', 'nullable_integer', 'nullable_bool' have null values",
+                message="Column(s) 'lots_of_null', 'nullable_integer', 'nullable_bool' have between 20.0% and 95.0% null values",
This is a nit, but is there any value in making these values a module-level variable for the data check and importing it here to build this string? Or do we just want to accept that we might change the data check's default values and then have to update this test?
Hmm, I think we could try something like what woodwork has. We have enough instances throughout our data checks that, if we go down the path of creating a config file with the default parameters for all data checks, we can call upon it in all our tests. Should I file an issue for this?
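The suggestion in this thread could look roughly like the sketch below: keep the defaults in module-level constants and build the expected message from them in tests. All names here (`DEFAULT_MODERATE_NULL_COL_THRESHOLD`, `DEFAULT_NULL_COL_THRESHOLD`, `moderately_null_message`) are hypothetical and are not part of evalml; the 0.20 value is inferred from the test message above.

```python
# Hypothetical module-level defaults; in practice these would live next to
# the data check (or in a shared defaults/config module) and be imported by tests.
DEFAULT_MODERATE_NULL_COL_THRESHOLD = 0.20
DEFAULT_NULL_COL_THRESHOLD = 0.95


def moderately_null_message(
    column_names,
    low=DEFAULT_MODERATE_NULL_COL_THRESHOLD,
    high=DEFAULT_NULL_COL_THRESHOLD,
):
    """Build the warning text from thresholds instead of hard-coding percentages."""
    cols = ", ".join(f"'{name}'" for name in column_names)
    # The ".1%" format multiplies by 100 and keeps one decimal place, e.g. 0.2 -> "20.0%".
    return f"Column(s) {cols} have between {low:.1%} and {high:.1%} null values"


expected = moderately_null_message(["lots_of_null", "nullable_integer", "nullable_bool"])
```

With something like this, changing a default would update both the data check and the test expectations in one place, at the cost of tests no longer pinning the literal message text.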
Fixes #3286