4 nodes were aborted with segmentation fault during decommission streaming and creating index (with raft topology) #18460
Comments
Decoded backtrace:
|
Another segmentation fault and coredump occurred at the last point of decommission.
At that moment node 4 reported a segmentation fault.
|
Happened on
Packages
Scylla version:
Kernel Version:
Installation details
Cluster size: 5 nodes (i4i.8xlarge)
Scylla Nodes used in this run:
OS / Image:
Test:
Logs and commands
Logs:
|
Issue reproduced during this job
Packages
Scylla version:
Kernel Version:
Installation details
Cluster size: 5 nodes (i4i.2xlarge)
Scylla Nodes used in this run:
OS / Image:
Test:
Logs and commands
Logs:
|
@aleksbykov no need to post more reports -- we already know the problem and its root cause.
Packages
Scylla version:
5.5.0~dev-20240427.d8313dda43d7
with build-id 2c0b8475593d8ec95ecc44ea0b59d42ac20da322
Kernel Version:
5.15.0-1060-aws
Issue description
The cluster has a multi-DC configuration with raft topology. Two operations were running in parallel:
CreateSecondaryIndexes and DecommissionStreamingAbort.
While decommission streaming was running, 4 nodes crashed with a segmentation fault and produced coredumps:
node12, node5, node6, node9
Cores are located:
download_instructions=gsutil cp
gunzip /var/lib/systemd/coredump/core.scylla.112.3f23a2aec3c74a12bd33fd30b2c6f509.7715.1714307806000000.gz
gunzip /var/lib/systemd/coredump/core.scylla.112.7f018feb668f4d0bb85ea837ce977fc8.7657.1714307806000000.gz
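When triaging several cores at once, it can help to split the file names into fields. The sketch below assumes the usual systemd-coredump naming scheme, `core.<comm>.<uid>.<boot-id>.<pid>.<timestamp in microseconds>`; the `CoreName` type and `parse_core_name` helper are hypothetical names used only for illustration and are not part of Scylla or its test tooling.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class CoreName(NamedTuple):
    """Fields encoded in a systemd-coredump file name (assumed layout)."""
    comm: str          # process name
    uid: int           # user ID the process ran as
    boot_id: str       # boot ID of the machine that produced the core
    pid: int           # PID of the crashed process
    created: datetime  # crash time, UTC


def parse_core_name(path: str) -> CoreName:
    """Parse core.<comm>.<uid>.<boot-id>.<pid>.<usec>[.gz] (hypothetical helper)."""
    name = path.rsplit("/", 1)[-1]
    if name.endswith(".gz"):
        name = name[: -len(".gz")]
    # Split from the right so a process name containing dots still parses.
    prefix, uid, boot_id, pid, usec = name.rsplit(".", 4)
    if not prefix.startswith("core."):
        raise ValueError(f"not a systemd-coredump file name: {path!r}")
    comm = prefix[len("core."):]
    # The trailing field is assumed to be microseconds since the Unix epoch.
    created = datetime.fromtimestamp(int(usec) // 1_000_000, tz=timezone.utc)
    return CoreName(comm, int(uid), boot_id, int(pid), created)


if __name__ == "__main__":
    paths = [
        "/var/lib/systemd/coredump/core.scylla.112.3f23a2aec3c74a12bd33fd30b2c6f509.7715.1714307806000000.gz",
        "/var/lib/systemd/coredump/core.scylla.112.7f018feb668f4d0bb85ea837ce977fc8.7657.1714307806000000.gz",
    ]
    for p in paths:
        core = parse_core_name(p)
        print(core.boot_id, core.pid, core.created.isoformat())
```

Read this way, the two file names above carry different boot IDs but the same timestamp field, which fits crashes on separate nodes within the same second.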
Installation details
Cluster size: 12 nodes (i3en.2xlarge)
Scylla Nodes used in this run:
OS / Image:
ami-044f45ee3df20f616 ami-0817b1a4fd29bd52e
(aws: undefined_region)
Test:
longevity-multidc-schema-topology-changes-12h-with-raft-test
Test id:
930c515b-0222-49ff-aaff-bdfc6ac9ddbb
Test name:
scylla-master/raft/longevity-multidc-schema-topology-changes-12h-with-raft-test
Test config file(s):
Logs and commands
$ hydra investigate show-monitor 930c515b-0222-49ff-aaff-bdfc6ac9ddbb
$ hydra investigate show-logs 930c515b-0222-49ff-aaff-bdfc6ac9ddbb
Logs:
Jenkins job URL
Argus