npatki changed the title from "Range constraint does not produce cases of missing values & may create inaccurate data" to "Range constraint does not produce cases of missing values & may create violative data" on Apr 26, 2023
Error Description
In my dataset, I have a `Range` constraint such that column `A < B < C`. Any of the values may be null in different scenarios.
Expected:
- `A < B`, `B < C` and `A < C`
Observed:
- As in "`Inequality` constraint does not produce all possibilities of missing values" (#1392), not all combinations of missing values are created
- `C > A` may not hold (if `B` is missing)
Fix
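The fix is also similar to #1392:
- Keep the lowest column `A` and then replace each other column with the diffs (replace `B` with `(B-A)` and replace `C` with `(C-B)`)
- Create a new categorical column that stores which columns are missing
- Replace any originally missing values with the average
- On the reversal, we can compute the values in a cascading way using the diffs (`B = A + diff1`, `C = B + diff2`), and then set the missing values using the categorical column.
Steps to reproduce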
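The transform and reversal proposed under Fix, together with a check for the expected orderings, could be sketched in pandas roughly as below. This is only an illustration under stated assumptions: the helper names (`transform`, `reverse_transform`, `count_violations`), the `diff1`/`diff2`/`missing` column names and the sample data are all hypothetical. It is not SDV's implementation, and it does not model how a synthesizer would learn or sample the diffs.

```python
import numpy as np
import pandas as pd

COLUMNS = ["A", "B", "C"]  # low, middle, high (names assumed for illustration)


def transform(data):
    """Forward step of the proposed fix (hypothetical helper, not SDV's API)."""
    values = data[COLUMNS]
    # Categorical column recording which columns were missing, e.g. "B" or "A,C".
    missing = values.isna().apply(
        lambda row: ",".join(col for col in COLUMNS if row[col]) or "None", axis=1
    )
    # Replace any originally missing values with the column average.
    filled = values.fillna(values.mean())
    out = pd.DataFrame({"A": filled["A"]})
    out["diff1"] = filled["B"] - filled["A"]  # B replaced by (B - A)
    out["diff2"] = filled["C"] - filled["B"]  # C replaced by (C - B)
    out["missing"] = missing
    return out


def reverse_transform(transformed):
    """Rebuild A, B, C in a cascading way, then restore the missing values."""
    out = pd.DataFrame(index=transformed.index)
    out["A"] = transformed["A"]
    out["B"] = out["A"] + transformed["diff1"]
    out["C"] = out["B"] + transformed["diff2"]
    missing_names = transformed["missing"].str.split(",")
    for col in COLUMNS:
        was_missing = missing_names.apply(lambda names: col in names)
        out.loc[was_missing, col] = np.nan
    return out


def count_violations(data):
    """Count rows that break A < B, B < C or A < C among non-missing values."""
    bad = pd.Series(False, index=data.index)
    for low, high in [("A", "B"), ("B", "C"), ("A", "C")]:
        both_present = data[low].notna() & data[high].notna()
        bad |= both_present & ~(data[low] < data[high])
    return int(bad.sum())


# Hypothetical sample data: each row has a different missing-value pattern.
real = pd.DataFrame({
    "A": [1.0, 2.0, np.nan, 4.0],
    "B": [2.0, np.nan, 5.0, 6.0],
    "C": [3.0, 7.0, 8.0, np.nan],
})
transformed = transform(real)
restored = reverse_transform(transformed)
```

Running `count_violations` on synthetic output gives a quick way to spot rows where, for example, `A < C` fails while `B` is missing. Comparing the values of the `missing` column between real and synthetic data shows which combinations of missing values were produced.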