Further tests for the InlineVerifier #137
Merged
+192
−5
This is a fairly straightforward PR, although I do fix an issue with `NULL` vs `'NULL'`.

Appendix: Technical analysis of the `NULL` vs `'NULL'` problem

One thing I uncovered is that `NULL` is being treated the same as `'NULL'` by the InlineVerifier. Thus, we're prone to false negatives involving NULL. The cause of the issue is as follows:

1. Row checksums are computed with the `MD5` function. However, `MD5(NULL) -> NULL`, and `CONCAT(NULL, 'data') -> NULL`. This means every row that has a NULL column will have a NULL checksum, which is unacceptable.
2. To work around this, we used the `COALESCE` function, which returns the first non-NULL entry. Specifically, we used `MD5(COALESCE(column, 'NULL'))`. If the column is NULL, we would thus compute `MD5('NULL')`. This causes the collision with the string `'NULL'`.

The current workaround uses a magic string that's unlikely to be found in nature, although it may be possible to use `CONCAT_WS` and `NULLIF` to accomplish this without the magic string. There are some reverted commits for reference. The commit at the end also adds a test case that will be triggered if the naive `CONCAT_WS` method is implemented in the future.
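For anyone who wants to see the collision without a database, here is a minimal sketch that models the relevant MySQL NULL semantics in plain Python. It is not the InlineVerifier's actual code. The helper names, the `MAGIC_NULL` value, and the way columns are combined in `naive_concat_ws_fingerprint` are all hypothetical and chosen only for illustration. The `CONCAT_WS` part shows one way a naive use could go wrong (MySQL's `CONCAT_WS` skips NULL arguments after the separator); whether that is exactly the case the new test covers is an assumption.

```python
import hashlib
from typing import Optional


def md5(value: Optional[str]) -> Optional[str]:
    """Model of MySQL MD5(): a NULL (None) input gives a NULL result."""
    if value is None:
        return None
    return hashlib.md5(value.encode("utf-8")).hexdigest()


def coalesce(*values: Optional[str]) -> Optional[str]:
    """Model of COALESCE(): return the first non-NULL argument."""
    for value in values:
        if value is not None:
            return value
    return None


def concat(*values: Optional[str]) -> Optional[str]:
    """Model of CONCAT(): any NULL argument makes the result NULL."""
    if any(value is None for value in values):
        return None
    return "".join(values)


def concat_ws(sep: Optional[str], *values: Optional[str]) -> Optional[str]:
    """Model of CONCAT_WS(): NULL arguments after the separator are skipped."""
    if sep is None:
        return None
    return sep.join(value for value in values if value is not None)


# Hypothetical placeholder, NOT the magic string used in the PR.
MAGIC_NULL = "hypothetical-null-placeholder-7f3a9c"


def column_fingerprint(value: Optional[str], placeholder: str) -> Optional[str]:
    """MD5(COALESCE(column, placeholder)) for a single column."""
    return md5(coalesce(value, placeholder))


def naive_concat_ws_fingerprint(row: tuple) -> Optional[str]:
    """Hypothetical naive approach: MD5(CONCAT_WS(',', col1, col2, ...))."""
    return md5(concat_ws(",", *row))


if __name__ == "__main__":
    # Without COALESCE, a single NULL column nulls out the whole checksum.
    print(md5(concat(None, "data")))  # None

    # Old behaviour: NULL and the string 'NULL' collide.
    print(column_fingerprint(None, "NULL") == column_fingerprint("NULL", "NULL"))  # True

    # With an unlikely placeholder, the two values no longer collide.
    print(column_fingerprint(None, MAGIC_NULL) == column_fingerprint("NULL", MAGIC_NULL))  # False

    # Naive CONCAT_WS: rows that differ only in where the NULL sits collide.
    print(naive_concat_ws_fingerprint((None, "a")) == naive_concat_ws_fingerprint(("a", None)))  # True
```

The placeholder approach only moves the collision to a less likely value: a column that actually contains the placeholder string would still collide with NULL, which is why a placeholder-free variant is worth considering.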