Standardize data check messages by adding default "rows" and "columns" metadata#2869
Standardize data check messages by adding default "rows" and "columns" metadata#2869angela97lin merged 32 commits intomainfrom
Conversation
…ml into 2792_standardize_dc_cleanup
bchen1116
left a comment
There was a problem hiding this comment.
LGTM! Left some nits, but I like that we have now simplified the returns and combined the column/rows when needed!
| >>> target_leakage_check = TargetLeakageDataCheck(pct_corr_threshold=0.95) | ||
| >>> assert target_leakage_check.validate(X, y) == { | ||
| ... "warnings": [{"message": "Column 'leak' is 95.0% or more correlated with the target", | ||
| ... "warnings": [{"message": "Columns 'leak' are 95.0% or more correlated with the target", |
There was a problem hiding this comment.
It feels weird that this is plural when there's only one column. Not a nit, but would be nice to fix this
…ml into 2792_standardize_dc_cleanup
ParthivNaresh
left a comment
There was a problem hiding this comment.
Looks great, it's definitely odd that the docstrings aren't failing in HighlyNullDataCheck and elsewhere. Getting this compatible with tempo health shouldn't be a heavy lift from here for me, thanks for the changes!
| ).to_dict(), | ||
| DataCheckWarning( | ||
| message="Column 'd' is 80.0% or more correlated with the target", | ||
| message="Columns 'a', 'b', 'c', 'd' are 80.0% or more correlated with the target", |
There was a problem hiding this comment.
The beauty of consolidation
| DataCheckAction( | ||
| DataCheckActionCode.DROP_ROWS, | ||
| metadata={"indices": all_rows_with_indices}, | ||
| metadata={"rows": all_rows_with_indices}, |
There was a problem hiding this comment.
I guess a follow up comment, do we want to repeat the all_rows_with_indices over here? Maybe it should exist only in the DataCheckAction and not the Warning?
There was a problem hiding this comment.
Oof I missed this comment before merging, but I'm pretty indifferent about this either way!
Closes #2792.
I still need to update docstring tests, will wait until #2933 is merged to avoid unnecessary changes and conflicts.