Separate thresholds for pct null rows and columns in HighlyNullDataCheck#2562
Separate thresholds for pct null rows and columns in HighlyNullDataCheck#2562
HighlyNullDataCheck#2562Conversation
Codecov Report
@@ Coverage Diff @@
## main #2562 +/- ##
=======================================
+ Coverage 99.9% 99.9% +0.1%
=======================================
Files 287 287
Lines 26377 26398 +21
=======================================
+ Hits 26338 26359 +21
Misses 39 39
Continue to review full report at Codecov.
|
bchen1116
left a comment
There was a problem hiding this comment.
Looks good! I left a nit and a comment about potentially separating the check for valid thresholds on rows vs columns, but nothing blocking!
| ): | ||
| raise ValueError( | ||
| "pct_null_threshold must be a float between 0 and 1, inclusive." | ||
| "pct null thresholds must be a float between 0 and 1, inclusive." |
There was a problem hiding this comment.
I think it'd be nice to separate the checks for cols vs rows to raise error specifically on the value needed
There was a problem hiding this comment.
I think you can also do a compound inequality
angela97lin
left a comment
There was a problem hiding this comment.
Left a nitpicky comment but LGTM, thanks for this! 😁
chukarsten
left a comment
There was a problem hiding this comment.
Awesome, this looks like a solid extension of the highlynulldatacheck. I think we need to clean up the added test a little bit. I think it carries over some copy pasta. We might want to consider a test_null_rows that just goes through and tests the rows similarly to the columns, with the all null, some null, all full and shifting the threshold around to play with that.
| ): | ||
| raise ValueError( | ||
| "pct_null_threshold must be a float between 0 and 1, inclusive." | ||
| "pct null thresholds must be a float between 0 and 1, inclusive." |
There was a problem hiding this comment.
I think you can also do a compound inequality
Closes #2270