LWT update with empty clustering key range causes a crash #13129
Add a test case which performs an LWT UPDATE, but the clustering key has 0 possible values, because it's supposed to be equal to two different values. This currently causes a crash, see scylladb#13129.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Nice catch. By the way, what happens when we do the same query without LWT (without the "IF ...")? Is it handled correctly (i.e., do we get a sensible error)? Also, doing a "c=2 AND c=3" may not be the only way to get a restriction that matches nothing: there is also "c IN ()", "c = 123" (where there is no match with 123), and "c = null" (never true). So please verify the fix is more general than just refusing two equality conditions (which apparently Cassandra does).
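To make the cases listed above concrete, here is a minimal Python sketch that models how ANDed restrictions on one clustering column reduce to a set of candidate values. It is not Scylla code: `allowed_values` and `possible_values` are hypothetical helpers, and only `=` and `IN` are modelled.

```python
# Hypothetical model, not Scylla's implementation.
# Each restriction on clustering column `c` is an (operator, operand) pair;
# Python's None stands in for CQL null.

def allowed_values(op, operand):
    """Return the set of values a single restriction can match."""
    if op == "=":
        # `c = null` is never true, so it allows nothing.
        return set() if operand is None else {operand}
    if op == "IN":
        # `c IN ()` allows nothing.
        return set(operand)
    raise ValueError(f"unsupported operator: {op}")

def possible_values(restrictions):
    """Intersect all restrictions on the same column (they are ANDed)."""
    if not restrictions:
        raise ValueError("no restrictions given")
    result = None
    for op, operand in restrictions:
        values = allowed_values(op, operand)
        result = values if result is None else result & values
    return result

print(possible_values([("=", 2), ("=", 3)]))  # set()
print(possible_values([("IN", [])]))          # set()
print(possible_values([("=", None)]))         # set()
print(possible_values([("=", 123)]))          # {123}
```

In this model `c = 123` is a different case from the other three: the range is non-empty and simply may not match any stored row, whereas the others produce no range at all.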
It should be ok; it's the same case as in #13001 and #13007. LWT modifications use a different code path that doesn't handle empty pk ranges.
… ranges are empty' from Jan Ciołek

Adds two test cases which test what happens when we perform an LWT UPDATE, but the partition/clustering key has 0 possible values. This can happen e.g. when a column is supposed to be equal to two different values (`c = 0 AND c = 1`).

Empty partition ranges work properly, empty clustering range currently causes a crash (#13129). I added tests for both of these cases.

Closes #13130

* github.com:scylladb/scylladb:
  cql-pytest/test_lwt: test LWT update with empty clustering range
  cql-pytest/test_lwt: test LWT update with empty partition range
I see it in Eliran's backlog, unsure when we are going to commit to it - putting it also in the scylla backlog until we have a different update.
Backports:
On cqlsh:ks> create table t (p int, c int , r int , primary key (p, c));
cqlsh:ks> insert into t (p, c, r) VALUES (0, 1, 2);
cqlsh:ks> UPDATE t SET r = 3 WHERE p = 0 AND c = 1 AND c = 2 AND r = 2;
InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING"
cqlsh:ks> UPDATE t SET r = 3 WHERE p = 0 AND c = 1 AND c = 2 IF r = 2;
[applied] | r
-----------+---
True | 2
cqlsh:ks> select * from t;
p | c | r
---+---+---
0 | 1 | 2
(1 rows)
cqlsh:ks> UPDATE t SET r = 4 WHERE p = 0 AND c = 1 IF r = 2;
[applied] | r
-----------+---
True | 2
cqlsh:ks> select * from t;
p | c | r
---+---+---
0 | 1 | 4
(1 rows)

This could be fixed by backporting the change that rejects such queries.
A user just reported this on the users forum: I have set up a 3-node Scylla cluster in GCP (machine type: n1-highmem-8(8 vCPU, 52 GB), version of scylla-server: 4.5.1-0.20211024.4c0eac049, OS: Linux 8-gcp #24~20.04.1-Ubuntu SMP Mon Sep 12 06:14:01 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux). We have a keyspace, and here is the information about the keyspace: mykeyspace | True | {‘class’: ‘org.apache.cassandra.locator.SimpleStrategy’, ‘replication_factor’: ‘2’} The Scylla instance sometimes restarts, and we see this message in the logs: Jul 25 18:14:16 db-scylla1 scylla[2190565]: Segmentation fault on shard 4. Decoded:
@cvybhu how come no backport is needed to |
@scylladb/scylla-maint please consider backport
I have the same question. How can 5.1 be immune to this issue, while both older and newer versions are vulnerable? @bhalevy please note that even if I backport this to 5.2 (which I plan to do now) and/or 5.1, it won't help the user you quoted who was using the ancient 4.5. We'll not backport anything to that.
Yes, I've already let him know that 4.5.x is not supported any more and they'd need to upgrade to a newer (ideally the latest available) version. Also, this fix will turn the crash into a CQL error, which is much better, but it will not solve the root cause, which is apparently a bad CQL LWT query that they'd need to fix in any case.
LWT queries with empty clustering range used to cause a crash. For example in:

```cql
UPDATE tab SET r = 9000 WHERE p = 1 AND c = 2 AND c = 2000 IF r = 3
```

The range of `c` is empty - there are no valid values.

This caused a segfault when accessing the `first` range:

```c++
op.ranges.front()
```

Cassandra rejects such queries at the preparation stage. It doesn't allow two `EQ` restrictions on the same clustering column when an IF is involved. We reject them during runtime, which is a worse solution. The user can prepare a query with `c = ? AND c = ?`, and then run it, but unexpectedly it will throw an `invalid_request_exception` when the two bound variables are different.

We could ban such queries as well, we already ban the usage of `IN` in conditional statements. The problem is that this would be a breaking change.

A better solution would be to allow empty ranges in `LWT` statements. When an empty range is detected we just wouldn't apply the change. This would be a larger change, for now let's just fix the crash.

Fixes: #13129

Closes #14429

* github.com:scylladb/scylladb:
  modification_statement: reject conditional statements with empty clustering key
  statements/cas_request: fix crash on empty clustering range in LWT

(cherry picked from commit 49c8c06)
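The two behaviours discussed in this commit message (rejecting at run time versus silently not applying) can be compared in a small Python sketch. This is a toy model under stated assumptions, not Scylla's code: `InvalidRequest`, `clustering_ranges`, `cas_update_reject` and `cas_update_skip` are hypothetical names, the error text is made up, and a single partition is modelled as a dict from `c` to `r`.

```python
# Toy model only; all names and the error message are hypothetical.

class InvalidRequest(Exception):
    """Stand-in for an invalid_request_exception."""

def clustering_ranges(eq_values):
    """Ranges selected by ANDed `c = v` restrictions:
    one point if all bound values agree, no range otherwise."""
    distinct = set(eq_values)
    return [distinct.pop()] if len(distinct) == 1 else []

def _apply(partition, c, expected_r, new_r):
    """Compare-and-set on one row; returns whether the update was applied."""
    applied = partition.get(c) == expected_r
    if applied:
        partition[c] = new_r
    return applied

def cas_update_reject(partition, eq_values, expected_r, new_r):
    """Models the run-time rejection: error out on an empty range."""
    ranges = clustering_ranges(eq_values)
    if not ranges:
        # Without this guard, ranges[0] below would fail; in the C++ code
        # the analogous unguarded access was op.ranges.front().
        raise InvalidRequest("conditional statement has an empty clustering range")
    return _apply(partition, ranges[0], expected_r, new_r)

def cas_update_skip(partition, eq_values, expected_r, new_r):
    """Models the 'better solution': an empty range means not applied."""
    ranges = clustering_ranges(eq_values)
    if not ranges:
        return False
    return _apply(partition, ranges[0], expected_r, new_r)

partition = {1: 2}  # one row: c = 1, r = 2
# Prepared `c = ? AND c = ?` bound to equal values behaves like `c = 1`.
print(cas_update_reject(partition, [1, 1], 2, 4))  # True
# Bound to different values: skip-style reports not applied...
print(cas_update_skip(partition, [1, 2], 4, 5))    # False
# ...while reject-style raises only at execution time.
try:
    cas_update_reject(partition, [1, 2], 4, 5)
except InvalidRequest as e:
    print("rejected:", e)
```

The sketch also shows why run-time rejection surprises users of prepared statements: the same statement succeeds or fails depending only on the bound values.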
Connected to c1-5.1 at 127.0.0.1:9042
[cqlsh 6.0.9 | Scylla 5.1.15-0.20230730.12966e84352e | CQL spec 3.3.1 | Native protocol v4]
Use HELP for help.
cqlsh> use ks;
cqlsh:ks> create table t (p int, c int , r int , primary key (p, c));
cqlsh:ks> insert into t (p, c, r) VALUES (0, 1, 2);
cqlsh:ks> UPDATE t SET r = 3 WHERE p = 0 AND c = 1 AND c = 2 AND r = 2;
InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING"
cqlsh:ks> UPDATE t SET r = 3 WHERE p = 0 AND c = 1 AND c = 2 IF r = 2;
InvalidRequest: Error from server: code=2200 [Invalid query] message="clustering key ranges empty - probably caused by an unset value"
cqlsh:ks>

The check is from another PR that fixed a similar problem: #13133
IIUC, it's not required in 5.1 - removing labels.
Trying to perform an LWT update where the clustering key has 0 possible values, for example when it's restricted by `c = 1 AND c = 2`, causes a crash.

Here's a cql-pytest reproducer:
Cassandra rejects such queries, running it there gives an error:
Decoded backtrace:
Scylla version: 25cf325
Found when looking at the causes of #13001
Doing such an update with empty partition range properly throws an error: