Missing key in cassandra-harry verification while doing rolling restart of the cluster #10598
More times it happened:
Installation details
Kernel Version: 5.13.0-1021-aws
Test:
Logs:
Can we try and run this on 4.6, to see whether this is a regression?
@fruch ^^
@eliransin please try to find someone who can help look at this
Happens also on 4.6.3, so it's not a regression:
Kernel Version: 5.11.0-1022-aws
OS / Image:
Test:
Logs:
Now we've proven that the nemesis mostly works, besides scylladb/scylladb#10598 which is still under investigation. This reverts commit 820d561.
Now we've proven that the nemesis mostly works, besides scylladb/scylladb#10598 which is still under investigation. This reverts commit 820d561. (cherry picked from commit 09a0292)
The same happened during:
Installation details
Kernel Version: 5.13.0-1025-aws
Scylla Nodes used in this run:
OS / Image:
Test:
Logs:
@eliransin will someone look at it?
@roydahan does it happen all the time? Can I have a reproducer job so I can debug?
You can use https://jenkins.scylladb.com/view/master/job/scylla-master/job/reproducers/job/longevity-harry-2h-test/, it's a 2.5h run.
We don't have an SCT setup to run Cassandra with our nemesis.
Again, just to add: in steady state, without restarting nodes, cassandra-harry verification works correctly.
Where is the cassandra-harry log (in which tar file)?
In the loader-set.
@eliransin are we planning to fix this for 5.1?
I am not sure we will have the bandwidth for it in 5.1.
@michoecho can you take it from here please?
It would be really good to have cassandra-harry available also for dtests, and to reproduce this issue with one, so that we can add such a dtest to our regression test suite.
Looking. |
Apparently reconciliation of range tombstones during reverse queries is very broken. Minimal reproducer:
The state of mutation fragments on both nodes is exactly as expected. cl=one queries and non-reverse cl=all queries give expected results. It's the reverse query reconciliation that's broken.
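To make the failure mode concrete, here is a toy Python model (not Scylla code; all names and the data layout are invented for illustration) of coordinator-side read reconciliation. Each replica returns live rows plus range tombstones; a correct coordinator merges both streams and suppresses rows covered by a newer tombstone. If one replica's range tombstones are silently dropped from reconciliation, a deleted row held only by a stale replica comes back.

```python
# Toy model of CL=ALL read reconciliation (hypothetical, not Scylla's code).
# rows: clustering key -> write timestamp
# tombstones: (start, end, delete timestamp), inclusive bounds

def reconcile(replicas, drop_tombstones=False):
    rows = {}
    tombstones = []
    for rep in replicas:
        # Keep the newest version of each row across replicas.
        for key, ts in rep["rows"].items():
            if key not in rows or ts > rows[key]:
                rows[key] = ts
        # The bug being modeled: range tombstones never reach the
        # reconciliation step when drop_tombstones is True.
        if not drop_tombstones:
            tombstones.extend(rep["tombstones"])
    # A row survives only if no newer tombstone covers it.
    return sorted(
        k for k, ts in rows.items()
        if not any(s <= k <= e and dts > ts for s, e, dts in tombstones)
    )

# Replica A saw the range delete; replica B was down and still has the row.
replica_a = {"rows": {}, "tombstones": [(0, 10, 2)]}
replica_b = {"rows": {5: 1}, "tombstones": []}

print(reconcile([replica_a, replica_b]))                        # [] - deleted row stays dead
print(reconcile([replica_a, replica_b], drop_tombstones=True))  # [5] - row resurrected
```

This matches the symptom described in the thread: the deleted data reappears in query results even at QUORUM/ALL, until the tombstone is repaired onto the stale replica.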
@michoecho thanks for reproducing and closing in on the root cause. Cc @denesb, @tgrabiec: your insight and review later on would be appreciated.
Fix:
As far as I understand, this means that range tombstones whose bounds appear in one order under the query schema and in the opposite order under the table schema are dropped from reverse query results. This is quite a terrible bug: it means that a workload with range deletes and reverse reads can't be trusted. This seems to be a Scylla 5.1 regression. (4629f7d).
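The schema-order mix-up can be sketched in a few lines of Python (a hypothetical illustration, not Scylla's actual assembler). In a reverse query the clustering comparator is flipped, so a tombstone covering (8; 3) is well-formed in query order (8 sorts first when descending) but inverted in table order; a component that validates bounds with the table comparator instead of the query comparator silently drops it.

```python
# Toy illustration of validating tombstone bounds with the wrong comparator.

def emit_tombstone(start, end, comparator):
    # Only emit ranges whose bounds are well-formed under the comparator;
    # otherwise the tombstone is dropped (modeling the bug's effect).
    return (start, end) if comparator(start, end) <= 0 else None

table_cmp = lambda a, b: (a > b) - (a < b)   # ascending: table schema order
query_cmp = lambda a, b: (b > a) - (b < a)   # descending: reverse query order

rt = (8, 3)  # tombstone bounds as seen by a reverse query
print(emit_tombstone(*rt, query_cmp))  # (8, 3): emitted, as it should be
print(emit_tombstone(*rt, table_cmp))  # None: dropped by the wrong comparator
```

With the table comparator the tombstone never leaves the replica, so the coordinator has nothing to reconcile against rows from other replicas.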
@michoecho please prepare a PR.
…ueries Reproducer for scylladb#10598.
@scylladb/scylla-maint - please backport to the relevant (5.2, 5.4) branches.
…ies' from Michał Chojnowski

reconcilable_result_builder passes range tombstone changes to _rt_assembler using the table schema, not the query schema. This means that a tombstone with bounds (a; b), where a < b in the query schema but a > b in the table schema, will not be emitted from mutation_query.

This is a very serious bug, because it means that such tombstones in reverse queries are not reconciled with data from other replicas. If *any* queried replica has a row, but not the range tombstone which deleted the row, the reconciled result will contain the deleted row. In particular, range deletes performed while a replica is down will not later be visible to reverse queries which select this replica, regardless of the consistency level.

As far as I can see, this doesn't result in any persistent data loss, only in some data appearing resurrected to reverse queries until the relevant range tombstone is fully repaired.

This series fixes the bug and adds a minimal reproducer test.

Fixes #10598
Closes #16003

* github.com:scylladb/scylladb:
mutation_query_test: test that range tombstones are sent in reverse queries
mutation_query: properly send range tombstones in reverse queries

(cherry picked from commit 65e42e4)
Backport to 5.4 and 5.2 queued.
Installation details
Kernel Version: 5.13.0-1021-aws
Scylla version (or git commit hash):
5.0~rc3-20220406.f92622e0d
with build-id 2b79c4744216b294fdbd2f277940044c899156ea
Cluster size: 6 nodes (i3.large)
Scylla Nodes used in this run:
OS / Image:
ami-0e4ae5e4a139c50f3
(aws: eu-north-1)
Test:
longevity-harry-2h-test
Test id:
5ad966f5-8ded-437d-9224-7601b95777f1
Test name:
scylla-staging/fruch/longevity-harry-2h-test
Test config file(s):
Issue description
During RollingRestartCluster, cassandra-harry fails in its verification, finding only 19 keys out of the 134 expected ones.
It does the query six times, and gets the same results in all of them.
This is the query, a prepared statement that runs with
ConsistencyLevel.QUORUM
:
From the cassandra-harry log, Partition state is the expected result and Observed state is the result from the query (notice that the expected is in reverse order relative to the observed).
With the nemesis running, this happens 3 out of 4 times it runs.
In steady state, without the nemesis, cassandra-harry finishes with success on 5.0.rc3.
On master we are facing a coredump without the nemesis, so we can't compare (#10553).
$ hydra investigate show-monitor 5ad966f5-8ded-437d-9224-7601b95777f1
$ hydra investigate show-logs 5ad966f5-8ded-437d-9224-7601b95777f1
Logs:
Jenkins job URL