Alternator: wrong ordering of bytes attributes with "negative" bytes #6573
Comments
Happily, KeyConditions / KeyConditionExpression do not have this bug: the sort keys are already sorted in the correct unsigned order, range queries on them work as needed, and the tests I wrote for this pass on Alternator. Apparently Cassandra's sort order for blobs is also unsigned, which is why we already did this correctly, but I can't find where this is documented. I confirmed that we do have the bug with Expected, ConditionExpression, QueryFilter and ScanFilter, though, and I'm preparing a patch.
@nyh please evaluate for backport
We implemented the order operators (LT, GT, LE, GE, BETWEEN) incorrectly for binary attributes: DynamoDB requires that the bytes be treated as unsigned for the purpose of order (so byte 128 is higher than 127), but our implementation uses Scylla's "bytes" type which has signed bytes.

The solution is simple - we can continue to use the "bytes" type, but we need to use its compare_unsigned() function, not its "<" operator.

This bug affected conditional operations ("Expected" and "ConditionExpression") and also filters ("QueryFilter", "ScanFilter", "FilterExpression"). The bug did *not* affect Query's key conditions ("KeyConditions", "KeyConditionExpression") because those already used Scylla's key comparison functions - which correctly compare binary blobs as unsigned bytes (in fact, this is why we have the compare_unsigned() function).

The patch also adds tests that reproduce the bugs in conditional operations, and show that the bug did not exist in key conditions.

Fixes #6573
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200603084257.394136-1-nyh@scylladb.com>
(cherry picked from commit f6b1f45)

Manually removed tests in test_key_conditions.py that did not exist in this branch.
We implemented the order operators (LT, GT, LE, GE, BETWEEN) incorrectly for binary attributes: DynamoDB requires that the bytes be treated as unsigned for the purpose of order (so byte 128 is higher than 127), but our implementation uses Scylla's "bytes" type which has signed bytes.

The solution is simple - we can continue to use the "bytes" type, but we need to use its compare_unsigned() function, not its "<" operator.

This bug affected conditional operations ("Expected" and "ConditionExpression") and also filters ("QueryFilter", "ScanFilter", "FilterExpression"). The bug did *not* affect Query's key conditions ("KeyConditions", "KeyConditionExpression") because those already used Scylla's key comparison functions - which correctly compare binary blobs as unsigned bytes (in fact, this is why we have the compare_unsigned() function).

The patch also adds tests that reproduce the bugs in conditional operations, and show that the bug did not exist in key conditions.

Fixes #6573
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200603084257.394136-1-nyh@scylladb.com>
(cherry picked from commit f6b1f45)

Manually removed tests in test_key_conditions.py which didn't exist in this branch.
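As a rough illustration of the fix described in the commit message, the sketch below derives the order operators from a single three-way unsigned comparison. It is not Scylla's code: compare_unsigned_sketch and the blob_* helpers are hypothetical names, and std::string_view stands in for the real blob type. It relies on std::memcmp, which compares bytes as unsigned char.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical stand-in for a three-way unsigned comparison of two blobs.
// Returns -1, 0 or 1. std::memcmp interprets each byte as unsigned char,
// so byte 0x80 (128) sorts after byte 0x7f (127).
int compare_unsigned_sketch(std::string_view a, std::string_view b) {
    const std::size_t n = std::min(a.size(), b.size());
    if (n != 0) {
        const int r = std::memcmp(a.data(), b.data(), n);
        if (r != 0) {
            return r < 0 ? -1 : 1;
        }
    }
    // The common prefix is equal: the shorter blob sorts first.
    if (a.size() == b.size()) {
        return 0;
    }
    return a.size() < b.size() ? -1 : 1;
}

// Order operators built on the unsigned comparison (hypothetical helpers).
bool blob_lt(std::string_view a, std::string_view b) { return compare_unsigned_sketch(a, b) < 0; }
bool blob_le(std::string_view a, std::string_view b) { return compare_unsigned_sketch(a, b) <= 0; }
bool blob_gt(std::string_view a, std::string_view b) { return compare_unsigned_sketch(a, b) > 0; }
bool blob_ge(std::string_view a, std::string_view b) { return compare_unsigned_sketch(a, b) >= 0; }

// BETWEEN is inclusive on both ends: lo <= v <= hi.
bool blob_between(std::string_view v, std::string_view lo, std::string_view hi) {
    return blob_ge(v, lo) && blob_le(v, hi);
}
```

With these helpers, a one-byte blob 0x7f compares lower than 0x80, and 0x80 falls between 0x7f and 0xff, which matches the unsigned order that DynamoDB requires.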
DynamoDB documentation defines (see https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_Condition.html) the order of byte arrays assuming they are unsigned bytes.

Unfortunately, Alternator in `conditions.cc`'s `check_compare()` does `return cmp(base64_decode(kv1.value), base64_decode(kv2.value));`. But `base64_decode()` returns Scylla's `bytes` type, which is an array of signed bytes... This means that Scylla sorts byte 128 as coming before byte 127, instead of after as it should.

In addition to `check_compare()`, similar buggy code also exists in `check_BETWEEN()`.

This wrong type results in incorrect comparison results in `Expected` or `ConditionExpression` conditional operations (unfortunately we don't have a test for the negative bytes for those!), and also results in incorrect comparisons in uncommitted `QueryFilter` code (fortunately, we do have a test there also for the negative bytes case).

It's also possible that KeyConditions / KeyConditionExpression works incorrectly with bytes as sort key with operations like < or BETWEEN and when there are negative bytes. We also need a test for that case (and if broken, fixing this would require different work than fixing the issues above...)
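The ordering problem can be reproduced outside Scylla with a minimal sketch. This is not the Alternator code: signed_blob, less_signed and less_unsigned are hypothetical names, and std::vector<std::int8_t> is only assumed to behave like an array of signed bytes.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a blob stored as signed 8-bit values.
using signed_blob = std::vector<std::int8_t>;

// Buggy ordering: std::vector's operator< compares elements as signed,
// so the byte with bit pattern 0x80 (value -128) sorts before 0x7f (127).
bool less_signed(const signed_blob& a, const signed_blob& b) {
    return a < b;
}

// Ordering DynamoDB expects: reinterpret each byte as unsigned first,
// so 0x80 (128) sorts after 0x7f (127).
bool less_unsigned(const signed_blob& a, const signed_blob& b) {
    return std::lexicographical_compare(
        a.begin(), a.end(), b.begin(), b.end(),
        [](std::int8_t x, std::int8_t y) {
            return static_cast<std::uint8_t>(x) < static_cast<std::uint8_t>(y);
        });
}
```

For the one-byte blobs `signed_blob{-128}` (byte 128 when read as unsigned) and `signed_blob{127}`, less_signed puts byte 128 first, while less_unsigned puts byte 127 first.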