The expectation expect_column_values_to_match_regex checks that the elements of a column match the regex passed as an argument. Elements that are not strings are converted to strings before the expectation runs.
However, if the column contains missing values, they are not counted against the expectation, so the validation result can still report 100% matching. It is understandable that data which is not there cannot conflict with the regex. From a validity standpoint, though, that result implies that all rows conform to the regex, when only the share of data actually present (not missing) satisfies the expectation.
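To make the gap concrete, here is a minimal sketch in plain Python. It does not call Great Expectations; the helper name `match_rates` is hypothetical, and the use of `re.search` and of `None` as the missing marker are assumptions made only for illustration.

```python
import re


def match_rates(values, pattern):
    """Return (match rate over non-missing values, match rate over all rows).

    Hypothetical helper for illustration only; not part of Great Expectations.
    """
    regex = re.compile(pattern)
    # Assumption: None marks a missing value.
    non_missing = [v for v in values if v is not None]
    # Non-string values are converted to strings before matching.
    matched = sum(1 for v in non_missing if regex.search(str(v)))
    # Empty inputs are treated as vacuously matching (an arbitrary choice here).
    rate_non_missing = matched / len(non_missing) if non_missing else 1.0
    rate_all_rows = matched / len(values) if values else 1.0
    return rate_non_missing, rate_all_rows


values = ["A1", "B2", None, None]
rate_non_missing, rate_all_rows = match_rates(values, r"^[A-Z]\d$")
print(rate_non_missing, rate_all_rows)
```

The first rate corresponds to the behaviour described in this issue (missing values excluded from the denominator); the second is the rate a user would get if missing values counted as unexpected.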
The expectation could be extended with an additional argument that makes it treat missing values as unexpected.
The alternative is to do the calculation manually after running the checkpoint, subtracting the number of unexpected values and the number of missing values from the number of rows, but repeating this across many batches costs a lot of time and resources.
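A rough sketch of that manual calculation, assuming each validation result exposes a `result` dictionary with the keys `element_count`, `missing_count` and `unexpected_count` (check these key names against the version in use). The function name `percent_matching_all_rows` is hypothetical.

```python
def percent_matching_all_rows(result):
    """Percentage of ALL rows that match, counting missing values as unexpected.

    `result` is assumed to be a dict with the keys element_count,
    missing_count and unexpected_count. Hypothetical helper, not a
    Great Expectations API.
    """
    element_count = result["element_count"]
    if element_count == 0:
        return None  # no rows, so no meaningful percentage
    missing = result.get("missing_count") or 0
    unexpected = result.get("unexpected_count") or 0
    # Rows that are present and match = rows - unexpected - missing.
    matching = element_count - unexpected - missing
    return 100.0 * matching / element_count


# Made-up numbers for illustration: 200 rows, 50 missing, 10 non-matching.
example_result = {"element_count": 200, "missing_count": 50, "unexpected_count": 10}
print(percent_matching_all_rows(example_result))
```

This has to be applied to every result of every batch, which is the overhead the proposed argument would avoid.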
Is this issue still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.