
Segmentation fault on shard 2 during node decommission #5089

Closed
yarongilor opened this issue Sep 23, 2019 · 11 comments

yarongilor commented Sep 23, 2019

Installation details
Scylla version (or git commit hash): 3.1.0.rc7-0.20190915.024d1563a
Cluster size: 6
OS (RHEL/CentOS/Ubuntu/AWS AMI): ami-0a01b94dffe4bcc0d

scenario: 50GB longevity test. During the nodetool_decommission nemesis on node 10.0.63.76, a core dump was produced on node 10.0.244.153, which then reported "Segmentation fault on shard 2".

failure scenario and error:

< t:2019-09-19 23:04:07,694 f:nemesis.py      l:585  c:sdcm.nemesis         p:INFO  > sdcm.nemesis.ChaosMonkey: >>>>>>>>>>>>>Started random_disrupt_method nodetool_decommission
< t:2019-09-19 23:04:08,458 f:nemesis.py      l:338  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.ChaosMonkey: Set current_disruption -> Decommission Node longevity-tls-50gb-4d-3-1-db-node-5979c9a0-5 [34.248.223.143 | 10.0.63.76] (seed: False)


2019-09-20T01:34:58+00:00  ip-10-0-244-153 !INFO    | scylla: Segmentation fault on shard 2
< t:2019-09-20 01:36:48,942 f:cluster.py      l:833  c:sdcm.cluster         p:DEBUG > Node longevity-tls-50gb-4d-3-1-db-node-5979c9a0-8 [34.245.136.197 | 10.0.244.153] (seed: False): Found coredump file: /var/lib/systemd/coredump/core.scylla.996.c212eda51d7d40c1b68eb746938de6ae.1730.1568943298000000
2019-09-20T01:37:05+00:00  ip-10-0-244-153 !ERR     | scylla: [shard 3] hints_manager - Exception when draining 10.0.63.76: std::filesystem::__cxx11::filesystem_error (error system:39, filesystem 

segmentation fault backtrace:

2019-09-20T01:34:58+00:00  ip-10-0-244-153 !INFO    | scylla: Segmentation fault on shard 2.
2019-09-20T01:34:58+00:00  ip-10-0-244-153 !INFO    | scylla: Backtrace:
2019-09-20T01:34:58+00:00  ip-10-0-244-153 !INFO    | scylla: 0x0000000004186f22
2019-09-20T01:34:58+00:00  ip-10-0-244-153 !INFO    | scylla: 0x0000000004074235
2019-09-20T01:34:58+00:00  ip-10-0-244-153 !INFO    | scylla: 0x0000000004074535
2019-09-20T01:34:58+00:00  ip-10-0-244-153 !INFO    | scylla: 0x0000000004074583
2019-09-20T01:34:58+00:00  ip-10-0-244-153 !INFO    | scylla: 0x00007f62960b902f
2019-09-20T01:34:58+00:00  ip-10-0-244-153 !INFO    | scylla: 0x00000000024aca59
2019-09-20T01:34:58+00:00  ip-10-0-244-153 !INFO    | scylla: 0x0000000004071701
2019-09-20T01:34:58+00:00  ip-10-0-244-153 !INFO    | scylla: 0x000000000407191e
2019-09-20T01:34:58+00:00  ip-10-0-244-153 !INFO    | scylla: 0x00000000041536ad
2019-09-20T01:34:58+00:00  ip-10-0-244-153 !INFO    | scylla: 0x000000000417dd1b
2019-09-20T01:34:58+00:00  ip-10-0-244-153 !INFO    | scylla: 0x000000000403f9dd
2019-09-20T01:34:58+00:00  ip-10-0-244-153 !INFO    | scylla: /opt/scylladb/libreloc/libpthread.so.0+0x000000000000858d
2019-09-20T01:34:58+00:00  ip-10-0-244-153 !INFO    | scylla: /opt/scylladb/libreloc/libc.so.6+0x00000000000fd6a2
2019-09-20T01:34:58+00:00  ip-10-0-244-153 !INFO    | kernel: traps: reactor-2[1732] general protection fault ip:24aca5a sp:7f62949e8cf0 error:0 in scylla[400000+4d6f000]

decoded backtrace:

[centos@ip-10-0-42-134 ~]$ addr2line -Cpfie /usr/lib/debug/opt/scylladb/libexec/scylla-3.1.0.rc7-0.20190915.024d1563a.el7.x86_64.debug 

?? ??:0
0x0000000004186f22
0x0000000004074235
0x0000000004074535
0x0000000004074583
0x00007f62960b902f
0x00000000024aca59
0x0000000004071701
0x000000000407191e
0x00000000041536ad
0x000000000417dd1b
0x000000000403f9dd
/opt/scylladb/libreloc/libpthread.so.0+0x000000000000858d
/opt/scylladb/libreloc/libc.so.6+0x00000000000fd6a2

void seastar::backtrace<seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}>(seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}&&) at /usr/include/boost/program_options/variables_map.hpp:146
seastar::backtrace_buffer::append_backtrace() at /usr/include/boost/program_options/variables_map.hpp:146
 (inlined by) print_with_backtrace at /data/jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:1104
seastar::print_with_backtrace(char const*) at /usr/include/boost/program_options/variables_map.hpp:146
sigsegv_action at /usr/include/boost/program_options/variables_map.hpp:146
 (inlined by) operator() at /data/jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:5015
 (inlined by) _FUN at /data/jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:5011
?? ??:0
seastar::circular_buffer<seastar::shared_mutex::waiter, std::allocator<seastar::shared_mutex::waiter> >::front() at /data/jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:768
 (inlined by) seastar::shared_mutex::wake() at /data/jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/shared_mutex.hh:101
 (inlined by) seastar::shared_mutex::unlock() at /data/jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/shared_mutex.hh:96
 (inlined by) operator() at /data/jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/shared_mutex.hh:157
 (inlined by) operator() at /data/jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1091
 (inlined by) apply<seastar::future<>::finally_body<seastar::with_lock(seastar::shared_mutex&, Func&&) [with Func = db::hints::space_watchdog::on_timer()::<lambda(std::filesystem::__cxx11::path, seastar::directory_entry)>::<lambda()>]::<lambda()>, false>, seastar::future<> > at /data/jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1402
 (inlined by) operator()<seastar::future_state<> > at /data/jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1004
 (inlined by) run_and_dispose at /data/jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:377
seastar::reactor::run_tasks(seastar::reactor::task_queue&) at /usr/include/boost/program_options/variables_map.hpp:146
seastar::reactor::run_some_tasks() at /usr/include/boost/program_options/variables_map.hpp:146
seastar::reactor::run_some_tasks() at /usr/include/boost/program_options/variables_map.hpp:146
 (inlined by) seastar::reactor::run() at /data/jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:4246
seastar::smp::configure(boost::program_options::variables_map, seastar::reactor_config)::{lambda()#3}::operator()() const at /usr/include/boost/program_options/variables_map.hpp:146
std::function<void ()>::operator()() const at /usr/include/c++/8/bits/std_function.h:687
 (inlined by) seastar::posix_thread::start_routine(void*) at /data/jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/posix.cc:52
?? ??:0
?? ??:0

coredump details: download_instructions=
gsutil cp gs://upload.scylladb.com/core.scylla.996.c212eda51d7d40c1b68eb746938de6ae.1730.1568943298000000/core.scylla.996.c212eda51d7d40c1b68eb746938de6ae.1730.1568943298000000.gz .

test logs:
job log: scratch.scylladb.com/qa_issues/sct_08_42_22_09_2019.log.gz

$ hydra investigate show-logs 5979c9a0-a1c1-4601-a2c3-63f1850766d9 | grep '|'
|                                            Log links for testrun with test id 5979c9a0-a1c1-4601-a2c3-63f1850766d9                                             |
| Log type              | Link                                                                                                                                   |
| events                | https://cloudius-jenkins-test.s3.amazonaws.com/5979c9a0-a1c1-4601-a2c3-63f1850766d9/events_log.zip                                     |
| db-cluster            | https://cloudius-jenkins-test.s3.amazonaws.com/5979c9a0-a1c1-4601-a2c3-63f1850766d9/longevity-tls-50gb-4d-3-1-db-cluster-5979c9a0.zip  |
| monitor-set           | https://cloudius-jenkins-test.s3.amazonaws.com/5979c9a0-a1c1-4601-a2c3-63f1850766d9/longevity-tls-50gb-4d-3-1-monitor-set-5979c9a0.zip |
| monitoring_data_stack | https://cloudius-jenkins-test.s3.amazonaws.com/5979c9a0-a1c1-4601-a2c3-63f1850766d9/monitoring_data_stack_branch-2.4_3.1.tar.gz        |

snapshot:
https://snapshot.raintank.io/dashboard/snapshot/I6ue7n0F4RQkowyEhHqc2p44NdxIZwT1

bhalevy added this to the 3.1 milestone Sep 24, 2019
bhalevy (Member) commented Sep 24, 2019

@eliransin please do the triage

bhalevy (Member) commented Sep 25, 2019

@yarongilor is this segfault the same one hit in #5087?

bhalevy (Member) commented Sep 26, 2019

@amoskong / @avikivity could toppartitions be involved here too? (#5104)

eliransin (Contributor) commented:

@bhalevy it is not the same as #5087.
It seems like a race between the endpoint user and the endpoint destroyer; I am still trying to figure it out.
It looks like the endpoint is gone between the point where the resource manager finds it and the point
where it tries to take the lock (which, by then, I suspect no longer exists).
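
A minimal, hypothetical sketch of the kind of race described above (illustrative only, not the actual Scylla code; `endpoint_state`, the map, and the function names are made up): the scanning side looks up per-endpoint state and then asynchronously waits on a lock owned by that state, while the decommission side can destroy the state before the lock is granted.

```cpp
// Sketch of the suspected race (hypothetical types and names).
#include <seastar/core/future.hh>
#include <seastar/core/shared_mutex.hh>
#include <string>
#include <unordered_map>

struct endpoint_state {
    seastar::shared_mutex file_update_mutex;   // owned by the per-endpoint state
};

std::unordered_map<std::string, endpoint_state> endpoints;   // keyed by endpoint address

// Scanner side (space_watchdog-like): find the endpoint, then wait for its lock.
seastar::future<> check_endpoint(const std::string& ep) {
    auto& state = endpoints.at(ep);                           // (1) endpoint found
    return seastar::with_lock(state.file_update_mutex, [] {   // (2) may have to wait here
        return seastar::make_ready_future<>();                // ... scan hint files ...
    });
    // If (3) runs after (1) but before the lock is granted, the captured
    // reference to file_update_mutex dangles, and the eventual unlock()/wake()
    // walks a destroyed waiter list - matching the backtrace above.
}

// Decommission side: on_leave_cluster() -> drain_for() destroys the state.
void drain_endpoint(const std::string& ep) {
    endpoints.erase(ep);                                      // (3) endpoint destroyed
}
```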

eliransin (Contributor) commented:

(gdb) f 0
#0  seastar::shared_mutex::wake (this=0x602000065918) at /data/jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/shared_mutex.hh:103
103	            if (w.for_write) {
(gdb) p &w
$1 = (seastar::shared_mutex::waiter *) 0xbb31f25f87b37520

The suspicious logic is:
https://github.com/scylladb/scylla/blob/b0e0f29b062e7fc916b27e874fb027f3cca406fe/db/hints/resource_manager.cc#L137-L141
This code is part of the resource manager that checks the hints disk occupancy.
Since a decommission is involved, if we are waiting on the lock and, while waiting, manager::on_leave_cluster(...) is called, which in turn calls manager::drain_for(...), we can end up holding a reference to an already-destroyed per-endpoint file-update shared mutex.
I think there should be a gate for all hint writers, closed when a node is leaving the cluster and entered before acquiring the lock.
We also know the mutex must have been locked, since wake() was called.
/cc @bhalevy @psarna - since you edited code in this vicinity, maybe it rings a bell, or you can say whether my analysis and suggested solution sound reasonable.
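
A hedged sketch of the gate idea above (hypothetical structure and names; not necessarily the fix that was eventually merged): writers and scanners enter a per-endpoint gate before taking the file-update mutex, and the drain path closes the gate and waits for everyone to leave before tearing down the per-endpoint state.

```cpp
// Sketch of a gate guarding the per-endpoint file_update_mutex (hypothetical).
#include <seastar/core/future.hh>
#include <seastar/core/gate.hh>
#include <seastar/core/shared_mutex.hh>

struct endpoint_state {
    seastar::gate write_gate;                 // closed when the endpoint leaves the cluster
    seastar::shared_mutex file_update_mutex;
};

// Writer/scanner side: entering the gate fails (gate_closed_exception) once the
// endpoint is being drained; while inside the gate, the state cannot be destroyed.
// Assumes 'work' returns seastar::future<>.
template <typename Func>
seastar::future<> with_endpoint_lock(endpoint_state& state, Func work) {
    return seastar::with_gate(state.write_gate, [&state, work = std::move(work)] () mutable {
        return seastar::with_lock(state.file_update_mutex, std::move(work));
    });
}

// Drain side (on_leave_cluster/drain_for analogue): close the gate and wait for
// all current users, after which no waiter can reference the mutex any more and
// the per-endpoint state may be destroyed safely.
seastar::future<> drain_endpoint(endpoint_state& state) {
    return state.write_gate.close();
}
```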

psarna (Contributor) commented Sep 30, 2019

Duplicate of #4685? @vladzcloudius?

glommer (Contributor) commented Sep 30, 2019 via email

psarna (Contributor) commented Sep 30, 2019

@glommer true, I only found it today and reviewed it.

yarongilor (Author) commented:

> @yarongilor is this segfault the same one hit in #5087?

@bhalevy yes, #5087 points to the same time range in the log: 2019-09-20T01:34-01:37.

slivne (Contributor) commented Oct 3, 2019

Avi is working to merge the fixes for #4685 and #4836: "[PATCH scylla v2 0/5] hinted handoff: fix races during shutdown and draining".

slivne (Contributor) commented Oct 6, 2019

Patch merged as d9dc8f9 and backported - closing the issue.

It will be part of 3.1.rc9.

slivne closed this as completed Oct 6, 2019