repair_option_pr_multi_dc_test coredumps upon shutdown #4589

Closed
bhalevy opened this issue Jun 23, 2019 · 8 comments

@bhalevy
Contributor

commented Jun 23, 2019

Not sure when exactly this started.
The earliest clear evidence in dtest-release is mentioned below.
Since then it happens consistently.

From dtest-release/128/artifact/logs-release.2/1558446246519_repair_additional_test.RepairAdditionalTest.repair_option_pr_multi_dc_test/node2.log:

INFO  2019-05-21 13:42:06,202 [shard 0] view - Stopping view builder
INFO  2019-05-21 13:42:06,202 [shard 1] view - Stopping view builder
terminate called after throwing an instance of 'seastar::no_sharded_instance_exception'
  what():  sharded instance does not exists
Aborting on shard 0.
Backtrace:
  0x000000000389d852
  0x0000000003787375
  0x0000000003787675
  0x0000000003787723
  0x00007f4e2164402f
  /jenkins/workspace/scylla-master/dtest-release/scylla-dtest/../scylla/dynamic_libs/libc.so.6+0x000000000003853e
  /jenkins/workspace/scylla-master/dtest-release/scylla-dtest/../scylla/dynamic_libs/libc.so.6+0x0000000000022894
  /jenkins/workspace/scylla-master/dtest-release/scylla-dtest/../scylla/dynamic_libs/libstdc++.so.6+0x0000000000090f1a
  /jenkins/workspace/scylla-master/dtest-release/scylla-dtest/../scylla/dynamic_libs/libstdc++.so.6+0x000000000009738b
  /jenkins/workspace/scylla-master/dtest-release/scylla-dtest/../scylla/dynamic_libs/libstdc++.so.6+0x0000000000096398
  /jenkins/workspace/scylla-master/dtest-release/scylla-dtest/../scylla/dynamic_libs/libstdc++.so.6+0x0000000000096d57
  0x00007f4e210e62b2
  0x00007f4e210e6bcc
  0x000000000041e151
  0x00000000008602d7
  0x0000000000862ce7
  0x0000000000829fb1

@bhalevy bhalevy added the dtest label Jun 23, 2019

@slivne

Contributor

commented Jun 23, 2019

backtrace ?

@bhalevy

Contributor Author

commented Jun 24, 2019

From dtest-release/157/artifact/logs-release.2/1561284048806_repair_additional_test.RepairAdditionalTest.repair_option_pr_multi_dc_test/node2.log:

INFO  2019-06-23 09:57:07,506 [shard 0] init - Signal received; shutting down
INFO  2019-06-23 09:57:07,506 [shard 0] compaction_manager - Asked to stop
INFO  2019-06-23 09:57:07,506 [shard 0] compaction_manager - Stopped
INFO  2019-06-23 09:57:07,506 [shard 1] compaction_manager - Asked to stop
INFO  2019-06-23 09:57:07,506 [shard 1] compaction_manager - Stopped
INFO  2019-06-23 09:57:07,506 [shard 0] view - Stopping view builder
INFO  2019-06-23 09:57:07,506 [shard 1] view - Stopping view builder
terminate called after throwing an instance of 'seastar::no_sharded_instance_exception'
  what():  sharded instance does not exists
Aborting on shard 0.
Backtrace:
  0x0000000002dac432
  0x0000000002cbc835
  0x0000000002cbcb35
  0x0000000002cbcbe3
  0x00007f5bcb7cd02f
  /jenkins/workspace/scylla-master/dtest-release/scylla-dtest/../scylla/dynamic_libs/libc.so.6+0x000000000003853e
  /jenkins/workspace/scylla-master/dtest-release/scylla-dtest/../scylla/dynamic_libs/libc.so.6+0x0000000000022894
  /jenkins/workspace/scylla-master/dtest-release/scylla-dtest/../scylla/dynamic_libs/libstdc++.so.6+0x0000000000090f1a
  /jenkins/workspace/scylla-master/dtest-release/scylla-dtest/../scylla/dynamic_libs/libstdc++.so.6+0x000000000009738b
  /jenkins/workspace/scylla-master/dtest-release/scylla-dtest/../scylla/dynamic_libs/libstdc++.so.6+0x0000000000096398
  /jenkins/workspace/scylla-master/dtest-release/scylla-dtest/../scylla/dynamic_libs/libstdc++.so.6+0x0000000000096d57
  0x00007f5bcaf852b2
  0x00007f5bcaf85bcc
  0x00000000004215e0
  0x0000000000767d1c
  0x0000000000769207
  0x0000000000742e81

void seastar::backtrace<seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}>(seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}&&) at /jenkins/workspace/scylla-master/dtest-release/scylla/seastar/build/release/../../include/seastar/util/backtrace.hh:55
seastar::print_with_backtrace(seastar::backtrace_buffer&) at /jenkins/workspace/scylla-master/dtest-release/scylla/seastar/build/release/../../src/core/reactor.cc:1136
 (inlined by) print_with_backtrace at /jenkins/workspace/scylla-master/dtest-release/scylla/seastar/build/release/../../src/core/reactor.cc:1157
seastar::print_with_backtrace(char const*) at /jenkins/workspace/scylla-master/dtest-release/scylla/seastar/build/release/../../src/core/reactor.cc:1164
seastar::install_oneshot_signal_handler<6, &seastar::sigabrt_action>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) at /jenkins/workspace/scylla-master/dtest-release/scylla/seastar/build/release/../../src/core/reactor.cc:5123
 (inlined by) operator() at /jenkins/workspace/scylla-master/dtest-release/scylla/seastar/build/release/../../src/core/reactor.cc:5105
 (inlined by) _FUN at /jenkins/workspace/scylla-master/dtest-release/scylla/seastar/build/release/../../src/core/reactor.cc:5101
?? ??:0
?? ??:0
?? ??:0
?? ??:0
?? ??:0
?? ??:0
?? ??:0
?? ??:0
?? ??:0
?? ??:0
?? ??:0
?? ??:0
?? ??:0
?? ??:0
?? ??:0
main::{lambda()#1}::operator()() const::{lambda()#2}::operator()() const::{lambda()#33}::operator()() const [clone .isra.3838] [clone .cold.4813] at /jenkins/workspace/scylla-master/dtest-release/scylla/seastar/include/seastar/core/future.hh:722
 (inlined by) ?? at /jenkins/workspace/scylla-master/dtest-release/scylla/seastar/include/seastar/core/future.hh:715
 (inlined by) ?? at /jenkins/workspace/scylla-master/dtest-release/scylla/seastar/include/seastar/core/future.hh:830
 (inlined by) operator() at /jenkins/workspace/scylla-master/dtest-release/scylla/main.cc:926
main::{lambda()#1}::operator()() const::{lambda()#2}::operator()() const at /jenkins/workspace/scylla-master/dtest-release/scylla/seastar/include/seastar/util/defer.hh:44
 (inlined by) operator() at /jenkins/workspace/scylla-master/dtest-release/scylla/main.cc:927
_ZN7seastar20noncopyable_functionIFvvEE17direct_vtable_forIZZNS_5asyncIZZ4mainENKUlvE_clEvEUlvE0_JEEENS_8futurizeINSt9result_ofIFNSt5decayIT_E4typeEDpNS9_IT0_E4typeEEE4typeEE4typeENS_17thread_attributesEOSA_DpOSD_ENUlRZNS4_IS6_JEEESL_SM_SN_SP_E4workE_clESR_EUlvE_E4callEPKS2_ at /jenkins/workspace/scylla-master/dtest-release/scylla/seastar/include/seastar/core/apply.hh:35
 (inlined by) ?? at /jenkins/workspace/scylla-master/dtest-release/scylla/seastar/include/seastar/core/apply.hh:43
 (inlined by) ?? at /jenkins/workspace/scylla-master/dtest-release/scylla/seastar/include/seastar/core/future.hh:1306
 (inlined by) operator() at /jenkins/workspace/scylla-master/dtest-release/scylla/seastar/include/seastar/core/thread.hh:324
 (inlined by) call at /jenkins/workspace/scylla-master/dtest-release/scylla/seastar/include/seastar/util/noncopyable_function.hh:71
seastar::thread_context::main() at /jenkins/workspace/scylla-master/dtest-release/scylla/seastar/build/release/../../include/seastar/util/noncopyable_function.hh:145
 (inlined by) seastar::thread_context::main() at /jenkins/workspace/scylla-master/dtest-release/scylla/seastar/build/release/../../src/core/thread.cc:317

main.cc#L925-L927:

            auto stop_view_builder = defer([] {
                view_builder.stop().get();
            });
@bhalevy

Contributor Author

commented Jun 24, 2019

@psarna can you please look into this?

@psarna

Member

commented Jun 24, 2019

sure, I'll take a look

@psarna

Member

commented Jun 24, 2019

Hm, seems obvious. view_builder is started conditionally but stopped unconditionally, which leads to this failure on shutdown whenever it was never started. I don't really see how view building gets disabled in this test case, but it's a matter of configuration. I'll fix it by making the stopping conditional as well.
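
For readers following along, here is a minimal sketch of the idea, not the actual patch. It does not use Seastar: `mock_sharded_service`, `deferred_action`, `defer_action` and `run_and_shut_down` are hypothetical stand-ins written for this illustration, and checking a "started" flag is an assumption (the real fix may consult the configuration instead). The guard runs its callback from a destructor, and destructors are implicitly noexcept, so an exception escaping an unconditional stop() would presumably end in std::terminate, which matches the "terminate called after throwing" line in the logs.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for a sharded service: stop() throws if start()
// was never called, loosely mimicking no_sharded_instance_exception.
class mock_sharded_service {
    bool _started = false;
public:
    void start() { _started = true; }
    void stop() {
        if (!_started) {
            throw std::runtime_error("sharded instance does not exist");
        }
        _started = false;
    }
    bool started() const { return _started; }
};

// Minimal scope guard in the spirit of defer(): runs the callback on scope exit.
// The destructor is implicitly noexcept, so a throwing callback terminates the process.
template <typename Func>
class deferred_action {
    Func _func;
    bool _armed = true;
public:
    explicit deferred_action(Func f) : _func(std::move(f)) {}
    deferred_action(deferred_action&& o) noexcept
        : _func(std::move(o._func)), _armed(o._armed) {
        o._armed = false;  // the moved-from guard must not fire
    }
    deferred_action(const deferred_action&) = delete;
    deferred_action& operator=(const deferred_action&) = delete;
    ~deferred_action() {
        if (_armed) {
            _func();
        }
    }
};

template <typename Func>
deferred_action<Func> defer_action(Func f) {
    return deferred_action<Func>(std::move(f));
}

// Simulates startup followed by shutdown. Returns true if the guard stopped
// the service, false if there was nothing to stop.
bool run_and_shut_down(bool view_building_enabled) {
    mock_sharded_service view_builder;
    bool stopped = false;
    {
        auto stop_view_builder = defer_action([&] {
            // Conditional stop: only stop what was actually started.
            if (view_builder.started()) {
                view_builder.stop();
                stopped = true;
            }
        });
        if (view_building_enabled) {
            view_builder.start();
        }
    }  // guard fires here, like the deferred stop at shutdown
    assert(!view_builder.started());
    return stopped;
}
```

Calling `run_and_shut_down(false)` exercises the path where the service was never started: the guard skips stop() instead of throwing out of its destructor. Removing the `started()` check in the lambda would reproduce the abort in this mock.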

psarna added a commit to psarna/scylla that referenced this issue Jun 24, 2019

main: stop view builder conditionally
The view builder is started only if it's enabled in config,
via the view_building=true variable. Unfortunately, stopping
the builder was unconditional, which may result in failed
assertions during shutdown. To remedy this, view building
is stopped only if it was previously started.

Fixes scylladb#4589

avikivity added a commit that referenced this issue Jun 24, 2019

main: stop view builder conditionally
The view builder is started only if it's enabled in config,
via the view_building=true variable. Unfortunately, stopping
the builder was unconditional, which may result in failed
assertions during shutdown. To remedy this, view building
is stopped only if it was previously started.

Fixes #4589
@bhalevy

Contributor Author

commented Jun 24, 2019

> Hm, seems obvious. view_builder is started conditionally, but stopped unconditionally, which will lead to this failure on shutdown. I don't really see how view building gets disabled in this test case, but it's a matter of configuration. I'll fix it by making the stopping conditional as well.

I didn't see any evidence that the view builder was disabled, so I doubt #4594 will fix this issue.

@bhalevy

Contributor Author

commented Jun 25, 2019

@psarna FYI, I've seen this in another dtest: dtest-release/158/artifact/logs-release.2/1561433162725_materialized_views_test.TestMaterializedViews.mv_populating_from_existing_data_during_node_stop_test/node2.log:

INFO  2019-06-25 03:23:42,099 [shard 0] init - stopping view builder
INFO  2019-06-25 03:23:42,099 [shard 0] view - Stopping view builder
INFO  2019-06-25 03:23:42,099 [shard 1] view - Stopping view builder
ERROR 2019-06-25 03:23:42,099 [shard 0] view - Failed to update materialized view bookkeeping (seastar::no_sharded_instance_exception (sharded instance does not exists)), continuing anyway.

I'm not sure whether this came from view_builder::execute or view_builder::calculate_shard_build_step, since both log the same message.

a. I think we should make these two messages distinct so we can tell them apart.
b. Since we're continuing anyway, we should demote the log severity from error to warn; that would also keep the dtest from failing on it.

Opened #4600 for the above.

avikivity added a commit that referenced this issue Jun 26, 2019

main: stop view builder conditionally
The view builder is started only if it's enabled in config,
via the view_building=true variable. Unfortunately, stopping
the builder was unconditional, which may result in failed
assertions during shutdown. To remedy this, view building
is stopped only if it was previously started.

Fixes #4589

(cherry picked from commit efa7951)

@avikivity

Contributor

commented Jun 26, 2019

Backported to 2.3, 3.0, 3.1.
