Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reactor stalled for 32 ms after during repair in row_cache::update() #7181

Closed
fgelcer opened this issue Sep 6, 2020 · 38 comments
Closed

Reactor stalled for 32 ms after during repair in row_cache::update() #7181

fgelcer opened this issue Sep 6, 2020 · 38 comments

Comments

@fgelcer
Copy link

fgelcer commented Sep 6, 2020

This is Scylla's bug tracker, to be used for reporting bugs only.
If you have a question about Scylla, and not a bug, please ask it in
our mailing-list at scylladb-dev@googlegroups.com or in our slack channel.

  • [] I have read the disclaimer above, and I am reporting a suspected malfunction in Scylla.

Installation details
Scylla version (or git commit hash): 666.development-0.20200901.b3f00685e with build-id 0de9edcca81c447d53c39dd1adc333f769f4e240
Cluster size: 3
OS (RHEL/CentOS/Ubuntu/AWS AMI): ami-0096fe0439d41b6d6

void seastar::backtrace<seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}>(seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}&&) at reactor.cc:?
seastar::print_with_backtrace(seastar::backtrace_buffer&) at reactor.cc:?
 (inlined by) print_with_backtrace at /jenkins/workspace/scylla-master/next/scylla/seastar/src/core/reactor.cc:752
seastar::internal::cpu_stall_detector::generate_trace() at reactor.cc:?
seastar::reactor::block_notifier(int) at reactor.cc:?
 (inlined by) seastar::internal::cpu_stall_detector::on_signal() at /jenkins/workspace/scylla-master/next/scylla/seastar/src/core/reactor.cc:1061
 (inlined by) seastar::reactor::block_notifier(int) at /jenkins/workspace/scylla-master/next/scylla/seastar/src/core/reactor.cc:1184
?? ??:0
logalloc::region_impl::object_descriptor::encode(char*&, unsigned long) const at /jenkins/workspace/scylla-master/next/scylla/seastar/include/seastar/util/log.hh:126
 (inlined by) logalloc::region_impl::alloc_small(logalloc::region_impl::object_descriptor const&, unsigned int, unsigned long) at /jenkins/workspace/scylla-master/next/scylla/utils/logalloc.cc:1193
logalloc::region_impl::alloc_small(logalloc::region_impl::object_descriptor const&, unsigned int, unsigned long) at /jenkins/workspace/scylla-master/next/scylla/utils/logalloc.cc:1187
 (inlined by) logalloc::region_impl::alloc_small(logalloc::region_impl::object_descriptor const&, unsigned int, unsigned long) at /jenkins/workspace/scylla-master/next/scylla/utils/logalloc.cc:1176
 (inlined by) logalloc::region_impl::alloc(migrate_fn_type const*, unsigned long, unsigned long) at /jenkins/workspace/scylla-master/next/scylla/utils/logalloc.cc:1415
row::apply_monotonically(schema const&, column_kind, row&&) at mutation_partition.cc:?
 (inlined by) row::reserve(unsigned int) at /jenkins/workspace/scylla-master/next/scylla/mutation_partition.cc:1622
 (inlined by) row::reserve(unsigned int) at /jenkins/workspace/scylla-master/next/scylla/mutation_partition.cc:1616
 (inlined by) row::apply_monotonically(schema const&, column_kind, row&&) at /jenkins/workspace/scylla-master/next/scylla/mutation_partition.cc:1708
deletable_row::apply_monotonically(schema const&, deletable_row&&) at mutation_partition.cc:?
 (inlined by) deletable_row::apply_monotonically(schema const&, deletable_row&&) at /jenkins/workspace/scylla-master/next/scylla/mutation_partition.cc:1092
operator() at /jenkins/workspace/scylla-master/next/scylla/partition_version.cc:470
 (inlined by) consume_row<partition_entry::apply_to_incomplete(const schema&, partition_entry&&, mutation_cleaner&, logalloc::allocating_section&, logalloc::region&, cache_tracker&, partition_snapshot::phase_type, real_dirty_memory_accounter&)::<lambda()> mutable::<lambda()>::<lambda()>::<lambda(deletable_row&&)> > at /jenkins/workspace/scylla-master/next/scylla/partition_snapshot_row_cursor.hh:378
 (inlined by) operator() at /jenkins/workspace/scylla-master/next/scylla/partition_version.cc:469
 (inlined by) with_linearized_managed_bytes<partition_entry::apply_to_incomplete(const schema&, partition_entry&&, mutation_cleaner&, logalloc::allocating_section&, logalloc::region&, cache_tracker&, partition_snapshot::phase_type, real_dirty_memory_accounter&)::<lambda()> mutable::<lambda()>::<lambda()> > at /jenkins/workspace/scylla-master/next/scylla/utils/managed_bytes.hh:427
 (inlined by) operator() at /jenkins/workspace/scylla-master/next/scylla/partition_version.cc:484
 (inlined by) with_reclaiming_disabled<partition_entry::apply_to_incomplete(const schema&, partition_entry&&, mutation_cleaner&, logalloc::allocating_section&, logalloc::region&, cache_tracker&, partition_snapshot::phase_type, real_dirty_memory_accounter&)::<lambda()> mutable::<lambda()>&> at /jenkins/workspace/scylla-master/next/scylla/./utils/logalloc.hh:746
 (inlined by) operator() at /jenkins/workspace/scylla-master/next/scylla/./utils/logalloc.hh:768
 (inlined by) with_reserve<logalloc::allocating_section::operator()<partition_entry::apply_to_incomplete(const schema&, partition_entry&&, mutation_cleaner&, logalloc::allocating_section&, logalloc::region&, cache_tracker&, partition_snapshot::phase_type, real_dirty_memory_accounter&)::<lambda()> mutable::<lambda()> >::<lambda()> > at /jenkins/workspace/scylla-master/next/scylla/./utils/logalloc.hh:718
 (inlined by) operator()<partition_entry::apply_to_incomplete(const schema&, partition_entry&&, mutation_cleaner&, logalloc::allocating_section&, logalloc::region&, cache_tracker&, partition_snapshot::phase_type, real_dirty_memory_accounter&)::<lambda()> mutable::<lambda()> > at /jenkins/workspace/scylla-master/next/scylla/./utils/logalloc.hh:767
 (inlined by) operator() at /jenkins/workspace/scylla-master/next/scylla/partition_version.cc:485
 (inlined by) call at /jenkins/workspace/scylla-master/next/scylla/seastar/include/seastar/util/noncopyable_function.hh:145
seastar::noncopyable_function<seastar::bool_class<seastar::stop_iteration_tag> ()>::operator()() const at /jenkins/workspace/scylla-master/next/scylla/./utils/logalloc.hh:715
 (inlined by) coroutine::run() at /jenkins/workspace/scylla-master/next/scylla/utils/coroutine.hh:51
 (inlined by) operator() at /jenkins/workspace/scylla-master/next/scylla/row_cache.cc:982
 (inlined by) with_allocator<row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::<lambda(logalloc::allocating_section&, double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::iterator, memtable_entry&, partition_presence_checker&, real_dirty_memory_accounter&, const double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::bound_hint&)> >::<lambda()>::<lambda()> mutable::<lambda()> > at /jenkins/workspace/scylla-master/next/scylla/./utils/allocation_strategy.hh:307
 (inlined by) operator() at /jenkins/workspace/scylla-master/next/scylla/row_cache.cc:956
 (inlined by) __invoke_impl<void, row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::<lambda(logalloc::allocating_section&, double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::iterator, memtable_entry&, partition_presence_checker&, real_dirty_memory_accounter&, const double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::bound_hint&)> >::<lambda()>::<lambda()> > at /usr/include/c++/10/bits/invoke.h:60
 (inlined by) __invoke<row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::<lambda(logalloc::allocating_section&, double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::iterator, memtable_entry&, partition_presence_checker&, real_dirty_memory_accounter&, const double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::bound_hint&)> >::<lambda()>::<lambda()> > at /usr/include/c++/10/bits/invoke.h:95
 (inlined by) __apply_impl<row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::<lambda(logalloc::allocating_section&, double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::iterator, memtable_entry&, partition_presence_checker&, real_dirty_memory_accounter&, const double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::bound_hint&)> >::<lambda()>::<lambda()>, std::tuple<> > at /usr/include/c++/10/tuple:1723
 (inlined by) apply<row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::<lambda(logalloc::allocating_section&, double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::iterator, memtable_entry&, partition_presence_checker&, real_dirty_memory_accounter&, const double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::bound_hint&)> >::<lambda()>::<lambda()>, std::tuple<> > at /usr/include/c++/10/tuple:1734
 (inlined by) apply<row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::<lambda(logalloc::allocating_section&, double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::iterator, memtable_entry&, partition_presence_checker&, real_dirty_memory_accounter&, const double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::bound_hint&)> >::<lambda()>::<lambda()> > at /jenkins/workspace/scylla-master/next/scylla/seastar/include/seastar/core/future.hh:1988
 (inlined by) operator() at /jenkins/workspace/scylla-master/next/scylla/seastar/include/seastar/core/thread.hh:258
 (inlined by) call at /jenkins/workspace/scylla-master/next/scylla/seastar/include/seastar/util/noncopyable_function.hh:116
seastar::noncopyable_function<void ()>::operator()() const at /usr/include/c++/10/bits/basic_string.h:323
 (inlined by) seastar::thread_context::main() at /jenkins/workspace/scylla-master/next/scylla/seastar/src/core/thread.cc:297

db logs here

@fgelcer
Copy link
Author

fgelcer commented Sep 6, 2020

@avikivity , could this be the same as the one in the description?

void seastar::backtrace<seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}>(seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}&&) at reactor.cc:?
seastar::print_with_backtrace(seastar::backtrace_buffer&) at reactor.cc:?
 (inlined by) print_with_backtrace at /jenkins/workspace/scylla-master/next/scylla/seastar/src/core/reactor.cc:752
seastar::internal::cpu_stall_detector::generate_trace() at reactor.cc:?
seastar::reactor::block_notifier(int) at reactor.cc:?
 (inlined by) seastar::internal::cpu_stall_detector::on_signal() at /jenkins/workspace/scylla-master/next/scylla/seastar/src/core/reactor.cc:1061
 (inlined by) seastar::reactor::block_notifier(int) at /jenkins/workspace/scylla-master/next/scylla/seastar/src/core/reactor.cc:1184
?? ??:0
logalloc::region_impl::object_descriptor::encode(char*&, unsigned long) const at /jenkins/workspace/scylla-master/next/scylla/seastar/include/seastar/util/log.hh:126
 (inlined by) logalloc::region_impl::alloc_small(logalloc::region_impl::object_descriptor const&, unsigned int, unsigned long) at /jenkins/workspace/scylla-master/next/scylla/utils/logalloc.cc:1193
logalloc::region_impl::alloc_small(logalloc::region_impl::object_descriptor const&, unsigned int, unsigned long) at /jenkins/workspace/scylla-master/next/scylla/utils/logalloc.cc:1187
 (inlined by) logalloc::region_impl::alloc_small(logalloc::region_impl::object_descriptor const&, unsigned int, unsigned long) at /jenkins/workspace/scylla-master/next/scylla/utils/logalloc.cc:1176
 (inlined by) logalloc::region_impl::alloc(migrate_fn_type const*, unsigned long, unsigned long) at /jenkins/workspace/scylla-master/next/scylla/utils/logalloc.cc:1415
bplus::node<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>* allocation_strategy::construct<bplus::node<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>>() at /jenkins/workspace/scylla-master/next/scylla/./utils/logalloc.hh:715
 (inlined by) bplus::node<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::create() at /jenkins/workspace/scylla-master/next/scylla/./utils/bptree.hh:1398
 (inlined by) bplus::node<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::prealloc::push() at /jenkins/workspace/scylla-master/next/scylla/./utils/bptree.hh:1869
 (inlined by) bplus::node<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::insert(unsigned long, long, bplus::data<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>*, dht::raw_token_less_comparator) at /jenkins/workspace/scylla-master/next/scylla/./utils/bptree.hh:1655
bplus::tree<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::iterator bplus::tree<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::iterator::emplace_before<bplus::tree<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::iterator bplus::tree<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::iterator::emplace_before<cache_entry::evictable_tag, seastar::lw_shared_ptr<schema const>&, dht::decorated_key, partition_entry>(long, dht::raw_token_less_comparator, cache_entry::evictable_tag&&, seastar::lw_shared_ptr<schema const>&, dht::decorated_key&&, partition_entry&&)::{lambda(bplus::data<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>*)#1}, cache_entry::evictable_tag, seastar::lw_shared_ptr<schema const>&, dht::decorated_key, partition_entry>(bplus::tree<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::iterator bplus::tree<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::iterator::emplace_before<cache_entry::evictable_tag, seastar::lw_shared_ptr<schema const>&, dht::decorated_key, partition_entry>(long, dht::raw_token_less_comparator, cache_entry::evictable_tag&&, seastar::lw_shared_ptr<schema const>&, dht::decorated_key&&, partition_entry&&)::{lambda(bplus::data<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>*)#1}, dht::raw_token_less_comparator, cache_entry::evictable_tag&&, seastar::lw_shared_ptr<schema const>&, dht::decorated_key&&, partition_entry&&) at /jenkins/workspace/scylla-master/next/scylla/./utils/logalloc.hh:715
 (inlined by) bplus::tree<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::iterator bplus::tree<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::iterator::emplace_before<cache_entry::evictable_tag, seastar::lw_shared_ptr<schema const>&, dht::decorated_key, partition_entry>(long, dht::raw_token_less_comparator, cache_entry::evictable_tag&&, seastar::lw_shared_ptr<schema const>&, dht::decorated_key&&, partition_entry&&) at /jenkins/workspace/scylla-master/next/scylla/./utils/bptree.hh:768
 (inlined by) double_decker<long, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0, (bplus::with_debug)0>::iterator double_decker<long, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0, (bplus::with_debug)0>::emplace_before<cache_entry::evictable_tag, seastar::lw_shared_ptr<schema const>&, dht::decorated_key, partition_entry>(double_decker<long, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0, (bplus::with_debug)0>::iterator, long, double_decker<long, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0, (bplus::with_debug)0>::bound_hint const&, cache_entry::evictable_tag&&, seastar::lw_shared_ptr<schema const>&, dht::decorated_key&&, partition_entry&&) at /jenkins/workspace/scylla-master/next/scylla/utils/double-decker.hh:224
 (inlined by) operator() at /jenkins/workspace/scylla-master/next/scylla/row_cache.cc:1040
operator() at /jenkins/workspace/scylla-master/next/scylla/./utils/logalloc.hh:715
 (inlined by) with_linearized_managed_bytes<row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::<lambda(logalloc::allocating_section&, double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::iterator, memtable_entry&, partition_presence_checker&, real_dirty_memory_accounter&, const double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::bound_hint&)> >::<lambda()>::<lambda()> mutable::<lambda()>::<lambda()>::<lambda()> > at /jenkins/workspace/scylla-master/next/scylla/utils/managed_bytes.hh:427
 (inlined by) operator() at /jenkins/workspace/scylla-master/next/scylla/row_cache.cc:970
 (inlined by) with_reclaiming_disabled<row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::<lambda(logalloc::allocating_section&, double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::iterator, memtable_entry&, partition_presence_checker&, real_dirty_memory_accounter&, const double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::bound_hint&)> >::<lambda()>::<lambda()> mutable::<lambda()>::<lambda()>&> at /jenkins/workspace/scylla-master/next/scylla/./utils/logalloc.hh:746
 (inlined by) operator() at /jenkins/workspace/scylla-master/next/scylla/./utils/logalloc.hh:768
 (inlined by) with_reserve<logalloc::allocating_section::operator()<row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::<lambda(logalloc::allocating_section&, double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::iterator, memtable_entry&, partition_presence_checker&, real_dirty_memory_accounter&, const double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::bound_hint&)> >::<lambda()>::<lambda()> mutable::<lambda()>::<lambda()> >::<lambda()> > at /jenkins/workspace/scylla-master/next/scylla/./utils/logalloc.hh:718
 (inlined by) operator()<row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::<lambda(logalloc::allocating_section&, double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::iterator, memtable_entry&, partition_presence_checker&, real_dirty_memory_accounter&, const double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::bound_hint&)> >::<lambda()>::<lambda()> mutable::<lambda()>::<lambda()> > at /jenkins/workspace/scylla-master/next/scylla/./utils/logalloc.hh:767
 (inlined by) operator() at /jenkins/workspace/scylla-master/next/scylla/row_cache.cc:969
 (inlined by) with_allocator<row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::<lambda(logalloc::allocating_section&, double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::iterator, memtable_entry&, partition_presence_checker&, real_dirty_memory_accounter&, const double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::bound_hint&)> >::<lambda()>::<lambda()> mutable::<lambda()> > at /jenkins/workspace/scylla-master/next/scylla/./utils/allocation_strategy.hh:307
 (inlined by) operator() at /jenkins/workspace/scylla-master/next/scylla/row_cache.cc:956
 (inlined by) __invoke_impl<void, row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::<lambda(logalloc::allocating_section&, double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::iterator, memtable_entry&, partition_presence_checker&, real_dirty_memory_accounter&, const double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::bound_hint&)> >::<lambda()>::<lambda()> > at /usr/include/c++/10/bits/invoke.h:60
 (inlined by) __invoke<row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::<lambda(logalloc::allocating_section&, double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::iterator, memtable_entry&, partition_presence_checker&, real_dirty_memory_accounter&, const double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::bound_hint&)> >::<lambda()>::<lambda()> > at /usr/include/c++/10/bits/invoke.h:95
 (inlined by) __apply_impl<row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::<lambda(logalloc::allocating_section&, double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::iterator, memtable_entry&, partition_presence_checker&, real_dirty_memory_accounter&, const double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::bound_hint&)> >::<lambda()>::<lambda()>, std::tuple<> > at /usr/include/c++/10/tuple:1723
 (inlined by) apply<row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::<lambda(logalloc::allocating_section&, double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::iterator, memtable_entry&, partition_presence_checker&, real_dirty_memory_accounter&, const double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::bound_hint&)> >::<lambda()>::<lambda()>, std::tuple<> > at /usr/include/c++/10/tuple:1734
 (inlined by) apply<row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::<lambda(logalloc::allocating_section&, double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::iterator, memtable_entry&, partition_presence_checker&, real_dirty_memory_accounter&, const double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::bound_hint&)> >::<lambda()>::<lambda()> > at /jenkins/workspace/scylla-master/next/scylla/seastar/include/seastar/core/future.hh:1988
 (inlined by) operator() at /jenkins/workspace/scylla-master/next/scylla/seastar/include/seastar/core/thread.hh:258
 (inlined by) call at /jenkins/workspace/scylla-master/next/scylla/seastar/include/seastar/util/noncopyable_function.hh:116
seastar::noncopyable_function<void ()>::operator()() const at /usr/include/c++/10/bits/basic_string.h:323
 (inlined by) seastar::thread_context::main() at /jenkins/workspace/scylla-master/next/scylla/seastar/src/core/thread.cc:297

@fgelcer
Copy link
Author

fgelcer commented Sep 6, 2020

@avikivity , could this be the same as the one in the description?

void seastar::backtrace<seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}>(seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}&&) at reactor.cc:?
seastar::print_with_backtrace(seastar::backtrace_buffer&) at reactor.cc:?
 (inlined by) print_with_backtrace at /jenkins/workspace/scylla-master/next/scylla/seastar/src/core/reactor.cc:752
seastar::internal::cpu_stall_detector::generate_trace() at reactor.cc:?
seastar::reactor::block_notifier(int) at reactor.cc:?
 (inlined by) seastar::internal::cpu_stall_detector::on_signal() at /jenkins/workspace/scylla-master/next/scylla/seastar/src/core/reactor.cc:1061
 (inlined by) seastar::reactor::block_notifier(int) at /jenkins/workspace/scylla-master/next/scylla/seastar/src/core/reactor.cc:1184
?? ??:0
logalloc::region_impl::object_descriptor::encode(char*&, unsigned long) const at /jenkins/workspace/scylla-master/next/scylla/seastar/include/seastar/util/log.hh:126
 (inlined by) logalloc::region_impl::alloc_small(logalloc::region_impl::object_descriptor const&, unsigned int, unsigned long) at /jenkins/workspace/scylla-master/next/scylla/utils/logalloc.cc:1193
logalloc::region_impl::alloc_small(logalloc::region_impl::object_descriptor const&, unsigned int, unsigned long) at /jenkins/workspace/scylla-master/next/scylla/utils/logalloc.cc:1187
 (inlined by) logalloc::region_impl::alloc_small(logalloc::region_impl::object_descriptor const&, unsigned int, unsigned long) at /jenkins/workspace/scylla-master/next/scylla/utils/logalloc.cc:1176
 (inlined by) logalloc::region_impl::alloc(migrate_fn_type const*, unsigned long, unsigned long) at /jenkins/workspace/scylla-master/next/scylla/utils/logalloc.cc:1415
bplus::node<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>* allocation_strategy::construct<bplus::node<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>>() at /jenkins/workspace/scylla-master/next/scylla/./utils/logalloc.hh:715
 (inlined by) bplus::node<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::create() at /jenkins/workspace/scylla-master/next/scylla/./utils/bptree.hh:1398
 (inlined by) bplus::node<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::prealloc::push() at /jenkins/workspace/scylla-master/next/scylla/./utils/bptree.hh:1869
 (inlined by) bplus::node<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::insert(unsigned long, long, bplus::data<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>*, dht::raw_token_less_comparator) at /jenkins/workspace/scylla-master/next/scylla/./utils/bptree.hh:1655
bplus::tree<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::iterator bplus::tree<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::iterator::emplace_before<bplus::tree<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::iterator bplus::tree<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::iterator::emplace_before<cache_entry::evictable_tag, seastar::lw_shared_ptr<schema const>&, dht::decorated_key, partition_entry>(long, dht::raw_token_less_comparator, cache_entry::evictable_tag&&, seastar::lw_shared_ptr<schema const>&, dht::decorated_key&&, partition_entry&&)::{lambda(bplus::data<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>*)#1}, cache_entry::evictable_tag, seastar::lw_shared_ptr<schema const>&, dht::decorated_key, partition_entry>(bplus::tree<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::iterator bplus::tree<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::iterator::emplace_before<cache_entry::evictable_tag, seastar::lw_shared_ptr<schema const>&, dht::decorated_key, partition_entry>(long, dht::raw_token_less_comparator, cache_entry::evictable_tag&&, seastar::lw_shared_ptr<schema const>&, dht::decorated_key&&, partition_entry&&)::{lambda(bplus::data<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>*)#1}, dht::raw_token_less_comparator, cache_entry::evictable_tag&&, seastar::lw_shared_ptr<schema const>&, dht::decorated_key&&, partition_entry&&) at /jenkins/workspace/scylla-master/next/scylla/./utils/logalloc.hh:715
 (inlined by) bplus::tree<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::iterator bplus::tree<long, intrusive_array<cache_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::iterator::emplace_before<cache_entry::evictable_tag, seastar::lw_shared_ptr<schema const>&, dht::decorated_key, partition_entry>(long, dht::raw_token_less_comparator, cache_entry::evictable_tag&&, seastar::lw_shared_ptr<schema const>&, dht::decorated_key&&, partition_entry&&) at /jenkins/workspace/scylla-master/next/scylla/./utils/bptree.hh:768
 (inlined by) double_decker<long, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0, (bplus::with_debug)0>::iterator double_decker<long, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0, (bplus::with_debug)0>::emplace_before<cache_entry::evictable_tag, seastar::lw_shared_ptr<schema const>&, dht::decorated_key, partition_entry>(double_decker<long, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0, (bplus::with_debug)0>::iterator, long, double_decker<long, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0, (bplus::with_debug)0>::bound_hint const&, cache_entry::evictable_tag&&, seastar::lw_shared_ptr<schema const>&, dht::decorated_key&&, partition_entry&&) at /jenkins/workspace/scylla-master/next/scylla/utils/double-decker.hh:224
 (inlined by) operator() at /jenkins/workspace/scylla-master/next/scylla/row_cache.cc:1040
operator() at /jenkins/workspace/scylla-master/next/scylla/./utils/logalloc.hh:715
 (inlined by) with_linearized_managed_bytes<row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::<lambda(logalloc::allocating_section&, double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::iterator, memtable_entry&, partition_presence_checker&, real_dirty_memory_accounter&, const double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::bound_hint&)> >::<lambda()>::<lambda()> mutable::<lambda()>::<lambda()>::<lambda()> > at /jenkins/workspace/scylla-master/next/scylla/utils/managed_bytes.hh:427
 (inlined by) operator() at /jenkins/workspace/scylla-master/next/scylla/row_cache.cc:970
 (inlined by) with_reclaiming_disabled<row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::<lambda(logalloc::allocating_section&, double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::iterator, memtable_entry&, partition_presence_checker&, real_dirty_memory_accounter&, const double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::bound_hint&)> >::<lambda()>::<lambda()> mutable::<lambda()>::<lambda()>&> at /jenkins/workspace/scylla-master/next/scylla/./utils/logalloc.hh:746
 (inlined by) operator() at /jenkins/workspace/scylla-master/next/scylla/./utils/logalloc.hh:768
 (inlined by) with_reserve<logalloc::allocating_section::operator()<row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::<lambda(logalloc::allocating_section&, double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::iterator, memtable_entry&, partition_presence_checker&, real_dirty_memory_accounter&, const double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::bound_hint&)> >::<lambda()>::<lambda()> mutable::<lambda()>::<lambda()> >::<lambda()> > at /jenkins/workspace/scylla-master/next/scylla/./utils/logalloc.hh:718
 (inlined by) operator()<row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::<lambda(logalloc::allocating_section&, double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::iterator, memtable_entry&, partition_presence_checker&, real_dirty_memory_accounter&, const double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::bound_hint&)> >::<lambda()>::<lambda()> mutable::<lambda()>::<lambda()> > at /jenkins/workspace/scylla-master/next/scylla/./utils/logalloc.hh:767
 (inlined by) operator() at /jenkins/workspace/scylla-master/next/scylla/row_cache.cc:969
 (inlined by) with_allocator<row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::<lambda(logalloc::allocating_section&, double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::iterator, memtable_entry&, partition_presence_checker&, real_dirty_memory_accounter&, const double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::bound_hint&)> >::<lambda()>::<lambda()> mutable::<lambda()> > at /jenkins/workspace/scylla-master/next/scylla/./utils/allocation_strategy.hh:307
 (inlined by) operator() at /jenkins/workspace/scylla-master/next/scylla/row_cache.cc:956
 (inlined by) __invoke_impl<void, row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::<lambda(logalloc::allocating_section&, double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::iterator, memtable_entry&, partition_presence_checker&, real_dirty_memory_accounter&, const double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::bound_hint&)> >::<lambda()>::<lambda()> > at /usr/include/c++/10/bits/invoke.h:60
 (inlined by) __invoke<row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::<lambda(logalloc::allocating_section&, double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::iterator, memtable_entry&, partition_presence_checker&, real_dirty_memory_accounter&, const double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::bound_hint&)> >::<lambda()>::<lambda()> > at /usr/include/c++/10/bits/invoke.h:95
 (inlined by) __apply_impl<row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::<lambda(logalloc::allocating_section&, double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::iterator, memtable_entry&, partition_presence_checker&, real_dirty_memory_accounter&, const double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::bound_hint&)> >::<lambda()>::<lambda()>, std::tuple<> > at /usr/include/c++/10/tuple:1723
 (inlined by) apply<row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::<lambda(logalloc::allocating_section&, double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::iterator, memtable_entry&, partition_presence_checker&, real_dirty_memory_accounter&, const double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::bound_hint&)> >::<lambda()>::<lambda()>, std::tuple<> > at /usr/include/c++/10/tuple:1734
 (inlined by) apply<row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::<lambda(logalloc::allocating_section&, double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::iterator, memtable_entry&, partition_presence_checker&, real_dirty_memory_accounter&, const double_decker<long int, cache_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0>::bound_hint&)> >::<lambda()>::<lambda()> > at /jenkins/workspace/scylla-master/next/scylla/seastar/include/seastar/core/future.hh:1988
 (inlined by) operator() at /jenkins/workspace/scylla-master/next/scylla/seastar/include/seastar/core/thread.hh:258
 (inlined by) call at /jenkins/workspace/scylla-master/next/scylla/seastar/include/seastar/util/noncopyable_function.hh:116
seastar::noncopyable_function<void ()>::operator()() const at /usr/include/c++/10/bits/basic_string.h:323
 (inlined by) seastar::thread_context::main() at /jenkins/workspace/scylla-master/next/scylla/seastar/src/core/thread.cc:297

as a reference to myself, if need to report a new issue, this is the stall message in the log:
< t:2020-09-02 07:36:34,188 f:cluster.py l:1352 c:sdcm.cluster p:DEBUG > 2020-09-02T07:36:31+00:00 perf-latency-nemesis-master-db-node-74d9dea7-1 !INFO | scylla: Reactor stalled for 65 ms on shard 1.

@avikivity
Copy link
Member

Yes, they're exactly the same.

@avikivity avikivity changed the title Reactor stalled for 32 ms after during repair Reactor stalled for 32 ms after during repair in row_cache::update() Sep 7, 2020
@avikivity
Copy link
Member

avikivity commented Sep 7, 2020

btw, not repair related.

@avikivity
Copy link
Member

I don't understand it. The code is preemptible.

How many times did this happen?

What is the schema and data shape (sizes of column, number of rows in a partition)?

@fgelcer
Copy link
Author

fgelcer commented Sep 7, 2020

I don't understand it. The code is preemptible.

How many times did this happen?

What is the schema and data shape (sizes of column, number of rows in a partition)?

i can't confirm it right now, but it happened some 4 times in a small time window

@fgelcer
Copy link
Author

fgelcer commented Sep 7, 2020

those are the c-s commands we run:

prepare_write_cmd: ["cassandra-stress write no-warmup cl=ALL n=62500000 -schema 'replication(factor=3)' -port jmx=6868 -mode cql3 native -rate threads=200 -col 'size=FIXED(128) n=FIXED(8)' -pop seq=1..62500000",
                    "cassandra-stress write no-warmup cl=ALL n=62500000 -schema 'replication(factor=3)' -port jmx=6868 -mode cql3 native -rate threads=200 -col 'size=FIXED(128) n=FIXED(8)' -pop seq=62500001..125000000",
                    "cassandra-stress write no-warmup cl=ALL n=62500000 -schema 'replication(factor=3)' -port jmx=6868 -mode cql3 native -rate threads=200 -col 'size=FIXED(128) n=FIXED(8)' -pop seq=125000001..187500000",
                    "cassandra-stress write no-warmup cl=ALL n=62500000 -schema 'replication(factor=3)' -port jmx=6868 -mode cql3 native -rate threads=200 -col 'size=FIXED(128) n=FIXED(8)' -pop seq=187500001..250000000"]

stress_cmd_w: "cassandra-stress write no-warmup cl=QUORUM duration=700m -schema 'replication(factor=3)' -port jmx=6868 -mode cql3 native -rate 'threads=50 throttle=5000/s' -col 'size=FIXED(128) n=FIXED(8)' -pop 'dist=gauss(1..250000000,125000000,12500000)' "
stress_cmd_r: "cassandra-stress read no-warmup  cl=QUORUM duration=700m -schema 'replication(factor=3)' -port jmx=6868 -mode cql3 native -rate 'threads=50 throttle=4000/s' -col 'size=FIXED(128) n=FIXED(8)' -pop 'dist=gauss(1..250000000,125000000,12500000)' "
stress_cmd_m: "cassandra-stress mixed no-warmup cl=QUORUM duration=700m -schema 'replication(factor=3)' -port jmx=6868 -mode cql3 native -rate 'threads=50 throttle=3500/s' -col 'size=FIXED(128) n=FIXED(8)' -pop 'dist=gauss(1..250000000,125000000,12500000)' "

@avikivity
Copy link
Member

i can't confirm it right now, but it happened some 4 times in a small time window

Please extract the info. How large the time window? Was it always the same shard?

Maybe it was a fluke.

@fgelcer
Copy link
Author

fgelcer commented Sep 7, 2020

i can't confirm it right now, but it happened some 4 times in a small time window

Please extract the info. How large the time window? Was it always the same shard?

Maybe it was a fluke.

checking

@avikivity
Copy link
Member

@xemul please take a look. Why is there a with_debug in there?

@fgelcer
Copy link
Author

fgelcer commented Sep 7, 2020

i can't confirm it right now, but it happened some 4 times in a small time window

Please extract the info. How large the time window? Was it always the same shard?
Maybe it was a fluke.

checking

i found it happening 3 times (in that sequence, maybe in other parts as well, but i'm not sure about it)
each one on a different shard:

< t:2020-09-02 07:36:19,152 f:cluster.py      l:1352 c:sdcm.cluster         p:DEBUG > 020-09-02T07:36:07+00:00  perf-latency-nemesis-master-db-node-74d9dea7-1 !INFO    | scylla: Reactor stalled for 32 ms on shard 6.
< t:2020-09-02 07:36:19,154 f:cluster.py      l:1352 c:sdcm.cluster         p:DEBUG > 2020-09-02T07:36:07+00:00  perf-latency-nemesis-master-db-node-74d9dea7-1 !INFO    | scylla: Reactor stalled for 20 ms on shard 7.
< t:2020-09-02 07:36:34,188 f:cluster.py      l:1352 c:sdcm.cluster         p:DEBUG > 2020-09-02T07:36:31+00:00  perf-latency-nemesis-master-db-node-74d9dea7-1 !INFO    | scylla: Reactor stalled for 65 ms on shard 1.

@xemul
Copy link
Contributor

xemul commented Sep 7, 2020

@avikivity , there's (bplus::with_debug)0 which is false, isn't it?

@avikivity
Copy link
Member

Ah, didn't realize it was a value not a type.

@slivne
Copy link
Contributor

slivne commented Sep 13, 2020

@avikivity any direction on how to address this - so we can assign

@avikivity
Copy link
Member

I believe it's a fluke (the kernel took some time away from us, or the hypervisor). If it doesn't repeat we can close it.

@tgrabiec please opine.

@tgrabiec
Copy link
Contributor

Could be a fluke. The fact that there is no stall detector report for smaller periods, just straight to 32ms, supports this hypothesis.

@avikivity
Copy link
Member

Could be a fluke. The fact that there is no stall detector report for smaller periods, just straight to 32ms, supports this hypothesis.

That depends on the detector threshold. Though I think it is 5ms for these test.

@fgelcer when reporting latency issues, please report the stall detector threshold as well.

@tgrabiec
Copy link
Contributor

I saw single-digit stalls in the log as well.

@fgelcer
Copy link
Author

fgelcer commented Sep 15, 2020

Could be a fluke. The fact that there is no stall detector report for smaller periods, just straight to 32ms, supports this hypothesis.

That depends on the detector threshold. Though I think it is 5ms for these test.

@fgelcer when reporting latency issues, please report the stall detector threshold as well.

@avikivity , you are right. For those stalls checks I'm running, all of them have --blocked-reactor-notify-ms 5.

@avikivity
Copy link
Member

I see with_reserve, but not refill_emergency_reserve. So this isn't #325.

@slivne
Copy link
Contributor

slivne commented Sep 24, 2020

closing this for now

I believe it's a fluke (the kernel took some time away from us, or the hypervisor). If it doesn't repeat we can close it.

@slivne slivne closed this as completed Sep 24, 2020
@fgelcer
Copy link
Author

fgelcer commented Jan 7, 2021

this is being reproduced again.
Reactor stalled for 7 ms on shard 3. on node-1

void seastar::backtrace<seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}>(seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}&&) at ./build/release/seastar/./seastar/include/seastar/util/backtrace.hh:59
 (inlined by) seastar::backtrace_buffer::append_backtrace() at ./build/release/seastar/./seastar/src/core/reactor.cc:731
 (inlined by) seastar::print_with_backtrace(seastar::backtrace_buffer&) at ./build/release/seastar/./seastar/src/core/reactor.cc:752
 (inlined by) seastar::internal::cpu_stall_detector::generate_trace() at ./build/release/seastar/./seastar/src/core/reactor.cc:1210
seastar::internal::cpu_stall_detector::maybe_report() at ./build/release/seastar/./seastar/src/core/reactor.cc:1056
 (inlined by) seastar::internal::cpu_stall_detector::on_signal() at ./build/release/seastar/./seastar/src/core/reactor.cc:1070
 (inlined by) seastar::reactor::block_notifier(int) at ./build/release/seastar/./seastar/src/core/reactor.cc:1193
?? ??:0
bplus::tree<long, intrusive_array<memtable_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::iterator::erase(dht::raw_token_less_comparator) at row_cache.cc:?
bplus::node<long, intrusive_array<memtable_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::refill(dht::raw_token_less_comparator) at ././utils/bptree.hh:1750
 (inlined by) bplus::node<long, intrusive_array<memtable_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::remove_from(unsigned long, dht::raw_token_less_comparator) at ././utils/bptree.hh:1671
 (inlined by) bplus::node<long, intrusive_array<memtable_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::remove(unsigned long, dht::raw_token_less_comparator) at ././utils/bptree.hh:1796
 (inlined by) _ZN5bplus4treeIl15intrusive_arrayI14memtable_entryEN3dht25raw_token_less_comparatorELm16ELNS_10key_searchE0ELNS_10with_debugE0EE8iterator17erase_and_disposeIRDoFvPS3_EEES9_OT_S5_ at ././utils/bptree.hh:810
 (inlined by) bplus::tree<long, intrusive_array<memtable_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::iterator::erase(dht::raw_token_less_comparator) at ././utils/bptree.hh:816
double_decker<long, memtable_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0, (bplus::with_debug)0>::iterator double_decker<long, memtable_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0, (bplus::with_debug)0>::iterator::erase_and_dispose<seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}::operator()()::{lambda()#2}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}::operator()() const::{lambda(memtable_entry*)#1}>(dht::raw_token_less_comparator, seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}::operator()()::{lambda()#2}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}::operator()() const::{lambda(memtable_entry*)#1}&&) at ././utils/double-decker.hh:132
 (inlined by) operator() at ./row_cache.cc:986
 (inlined by) std::result_of<seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}::operator()()::{lambda()#2}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1} ()>::type with_linearized_managed_bytes<seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}::operator()()::{lambda()#2}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}>(seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}::operator()()::{lambda()#2}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}&&) at ././utils/managed_bytes.hh:427
 (inlined by) operator() at ./row_cache.cc:984
 (inlined by) _ZN8logalloc18allocating_section24with_reclaiming_disabledIRZZZZN9row_cache9do_updateIZNS2_6updateENS2_16external_updaterER8memtableE4$_19EEN7seastar6futureIvEES4_S6_T_ENKUlvE_clEvENUlvE0_clEvENKUlvE0_clEvEUlvE0_EEDcRNS_6regionEOSB_ at ././utils/logalloc.hh:755
 (inlined by) operator() at ././utils/logalloc.hh:777
 (inlined by) _ZN8logalloc18allocating_section12with_reserveIZNS0_clIZZZZN9row_cache9do_updateIZNS3_6updateENS3_16external_updaterER8memtableE4$_19EEN7seastar6futureIvEES5_S7_T_ENKUlvE_clEvENUlvE0_clEvENKUlvE0_clEvEUlvE0_EEDcRNS_6regionEOSC_EUlvE_EEDcSJ_ at ././utils/logalloc.hh:726
 (inlined by) _ZN8logalloc18allocating_sectionclIZZZZN9row_cache9do_updateIZNS2_6updateENS2_16external_updaterER8memtableE4$_19EEN7seastar6futureIvEES4_S6_T_ENKUlvE_clEvENUlvE0_clEvENKUlvE0_clEvEUlvE0_EEDcRNS_6regionEOSB_ at ././utils/logalloc.hh:776
 (inlined by) operator() at ./row_cache.cc:983
_Z14with_allocatorIZZZN9row_cache9do_updateIZNS0_6updateENS0_16external_updaterER8memtableE4$_19EEN7seastar6futureIvEES2_S4_T_ENKUlvE_clEvENUlvE0_clEvEUlvE0_EDcR19allocation_strategyOS9_ at ././utils/allocation_strategy.hh:308
 (inlined by) operator() at ./row_cache.cc:952
 (inlined by) void std::__invoke_impl<void, seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}>(std::__invoke_other, seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}&&) at /usr/lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/bits/invoke.h:60
 (inlined by) std::__invoke_result<seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}>::type std::__invoke<seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}>(seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}&&, (seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}&&)...) at /usr/lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/bits/invoke.h:95
 (inlined by) _ZSt12__apply_implIZZN9row_cache9do_updateIZNS0_6updateENS0_16external_updaterER8memtableE4$_19EEN7seastar6futureIvEES2_S4_T_ENKUlvE_clEvEUlvE0_St5tupleIJEEJEEDcOS9_OT0_St16integer_sequenceImJXspT1_EEE at /usr/lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/tuple:1723
 (inlined by) _ZSt5applyIZZN9row_cache9do_updateIZNS0_6updateENS0_16external_updaterER8memtableE4$_19EEN7seastar6futureIvEES2_S4_T_ENKUlvE_clEvEUlvE0_St5tupleIJEEEDcOS9_OT0_ at /usr/lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/tuple:1734
 (inlined by) seastar::future<void> seastar::futurize<void>::apply<seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}>(seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}&&, std::tuple<>&&) at ././seastar/include/seastar/core/future.hh:2099
 (inlined by) operator() at ././seastar/include/seastar/core/thread.hh:258
 (inlined by) seastar::noncopyable_function<void ()>::direct_vtable_for<seastar::futurize<std::result_of<std::decay<seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}>::type ()>::type>::type seastar::async<seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}>(seastar::thread_attributes, seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}&&, (std::decay<seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}>&&)...)::{lambda()#1}>::call(seastar::noncopyable_function<void ()> const*) at ././seastar/include/seastar/util/noncopyable_function.hh:124
seastar::noncopyable_function<void ()>::operator()() const at ./build/release/seastar/./seastar/include/seastar/util/noncopyable_function.hh:209
 (inlined by) seastar::thread_context::main() at ./build/release/seastar/./seastar/src/core/thread.cc:297

logs are here

@fgelcer fgelcer reopened this Jan 7, 2021
@fgelcer
Copy link
Author

fgelcer commented Jan 7, 2021

i will report in case it happens more than once

@fgelcer
Copy link
Author

fgelcer commented Jan 7, 2021

another time, on the same node:

Reactor stalled for 7 ms on shard 3.
void seastar::backtrace<seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}>(seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}&&) at ./build/release/seastar/./seastar/include/seastar/util/backtrace.hh:59
 (inlined by) seastar::backtrace_buffer::append_backtrace() at ./build/release/seastar/./seastar/src/core/reactor.cc:731
 (inlined by) seastar::print_with_backtrace(seastar::backtrace_buffer&) at ./build/release/seastar/./seastar/src/core/reactor.cc:752
 (inlined by) seastar::internal::cpu_stall_detector::generate_trace() at ./build/release/seastar/./seastar/src/core/reactor.cc:1210
seastar::internal::cpu_stall_detector::maybe_report() at ./build/release/seastar/./seastar/src/core/reactor.cc:1056
 (inlined by) seastar::internal::cpu_stall_detector::on_signal() at ./build/release/seastar/./seastar/src/core/reactor.cc:1070
 (inlined by) seastar::reactor::block_notifier(int) at ./build/release/seastar/./seastar/src/core/reactor.cc:1193
?? ??:0
bplus::tree<long, intrusive_array<memtable_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::iterator::erase(dht::raw_token_less_comparator) at row_cache.cc:?
 (inlined by) bplus::node<long, intrusive_array<memtable_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::remove_from(unsigned long, dht::raw_token_less_comparator) at ././utils/bptree.hh:1671
 (inlined by) bplus::node<long, intrusive_array<memtable_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::remove(unsigned long, dht::raw_token_less_comparator) at ././utils/bptree.hh:1796
 (inlined by) _ZN5bplus4treeIl15intrusive_arrayI14memtable_entryEN3dht25raw_token_less_comparatorELm16ELNS_10key_searchE0ELNS_10with_debugE0EE8iterator17erase_and_disposeIRDoFvPS3_EEES9_OT_S5_ at ././utils/bptree.hh:810
 (inlined by) bplus::tree<long, intrusive_array<memtable_entry>, dht::raw_token_less_comparator, 16ul, (bplus::key_search)0, (bplus::with_debug)0>::iterator::erase(dht::raw_token_less_comparator) at ././utils/bptree.hh:816
double_decker<long, memtable_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0, (bplus::with_debug)0>::iterator double_decker<long, memtable_entry, dht::raw_token_less_comparator, dht::ring_position_comparator, 16, (bplus::key_search)0, (bplus::with_debug)0>::iterator::erase_and_dispose<seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}::operator()()::{lambda()#2}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}::operator()() const::{lambda(memtable_entry*)#1}>(dht::raw_token_less_comparator, seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}::operator()()::{lambda()#2}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}::operator()() const::{lambda(memtable_entry*)#1}&&) at ././utils/double-decker.hh:132
 (inlined by) operator() at ./row_cache.cc:986
 (inlined by) std::result_of<seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}::operator()()::{lambda()#2}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1} ()>::type with_linearized_managed_bytes<seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}::operator()()::{lambda()#2}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}>(seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}::operator()()::{lambda()#2}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}&&) at ././utils/managed_bytes.hh:427
 (inlined by) operator() at ./row_cache.cc:984
 (inlined by) _ZN8logalloc18allocating_section24with_reclaiming_disabledIRZZZZN9row_cache9do_updateIZNS2_6updateENS2_16external_updaterER8memtableE4$_19EEN7seastar6futureIvEES4_S6_T_ENKUlvE_clEvENUlvE0_clEvENKUlvE0_clEvEUlvE0_EEDcRNS_6regionEOSB_ at ././utils/logalloc.hh:755
 (inlined by) operator() at ././utils/logalloc.hh:777
 (inlined by) _ZN8logalloc18allocating_section12with_reserveIZNS0_clIZZZZN9row_cache9do_updateIZNS3_6updateENS3_16external_updaterER8memtableE4$_19EEN7seastar6futureIvEES5_S7_T_ENKUlvE_clEvENUlvE0_clEvENKUlvE0_clEvEUlvE0_EEDcRNS_6regionEOSC_EUlvE_EEDcSJ_ at ././utils/logalloc.hh:726
 (inlined by) _ZN8logalloc18allocating_sectionclIZZZZN9row_cache9do_updateIZNS2_6updateENS2_16external_updaterER8memtableE4$_19EEN7seastar6futureIvEES4_S6_T_ENKUlvE_clEvENUlvE0_clEvENKUlvE0_clEvEUlvE0_EEDcRNS_6regionEOSB_ at ././utils/logalloc.hh:776
 (inlined by) operator() at ./row_cache.cc:983
_Z14with_allocatorIZZZN9row_cache9do_updateIZNS0_6updateENS0_16external_updaterER8memtableE4$_19EEN7seastar6futureIvEES2_S4_T_ENKUlvE_clEvENUlvE0_clEvEUlvE0_EDcR19allocation_strategyOS9_ at ././utils/allocation_strategy.hh:308
 (inlined by) operator() at ./row_cache.cc:952
 (inlined by) void std::__invoke_impl<void, seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}>(std::__invoke_other, seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}&&) at /usr/lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/bits/invoke.h:60
 (inlined by) std::__invoke_result<seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}>::type std::__invoke<seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}>(seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}&&, (seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}&&)...) at /usr/lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/bits/invoke.h:95
 (inlined by) _ZSt12__apply_implIZZN9row_cache9do_updateIZNS0_6updateENS0_16external_updaterER8memtableE4$_19EEN7seastar6futureIvEES2_S4_T_ENKUlvE_clEvEUlvE0_St5tupleIJEEJEEDcOS9_OT0_St16integer_sequenceImJXspT1_EEE at /usr/lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/tuple:1723
 (inlined by) _ZSt5applyIZZN9row_cache9do_updateIZNS0_6updateENS0_16external_updaterER8memtableE4$_19EEN7seastar6futureIvEES2_S4_T_ENKUlvE_clEvEUlvE0_St5tupleIJEEEDcOS9_OT0_ at /usr/lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/tuple:1734
 (inlined by) seastar::future<void> seastar::futurize<void>::apply<seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}>(seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}&&, std::tuple<>&&) at ././seastar/include/seastar/core/future.hh:2099
 (inlined by) operator() at ././seastar/include/seastar/core/thread.hh:258
 (inlined by) seastar::noncopyable_function<void ()>::direct_vtable_for<seastar::futurize<std::result_of<std::decay<seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}>::type ()>::type>::type seastar::async<seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}>(seastar::thread_attributes, seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}&&, (std::decay<seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}>&&)...)::{lambda()#1}>::call(seastar::noncopyable_function<void ()> const*) at ././seastar/include/seastar/util/noncopyable_function.hh:124
seastar::noncopyable_function<void ()>::operator()() const at ./build/release/seastar/./seastar/include/seastar/util/noncopyable_function.hh:209
 (inlined by) seastar::thread_context::main() at ./build/release/seastar/./seastar/src/core/thread.cc:297

@avikivity
Copy link
Member

This is erase_and_dispose() in

template <typename Updater>
future<> row_cache::do_update(external_updater eu, memtable& m, Updater updater) {
  return do_update(std::move(eu), [this, &m, updater = std::move(updater)] {
    real_dirty_memory_accounter real_dirty_acc(m, _tracker);
    m.on_detach_from_region_group();
    _tracker.region().merge(m); // Now all data in memtable belongs to cache
    _tracker.memtable_cleaner().merge(m._cleaner);
    STAP_PROBE(scylla, row_cache_update_start);
    auto cleanup = defer([&m, this] {
        invalidate_sync(m);
        STAP_PROBE(scylla, row_cache_update_end);
    });

    return seastar::async([this, &m, updater = std::move(updater), real_dirty_acc = std::move(real_dirty_acc)] () mutable {
        size_t size_entry;
        // In case updater fails, we must bring the cache to consistency without deferring.
        auto cleanup = defer([&m, this] {
            invalidate_sync(m);
            _prev_snapshot_pos = {};
            _prev_snapshot = {};
        });
        coroutine update; // Destroy before cleanup to release snapshots before invalidating.
        partition_presence_checker is_present = _prev_snapshot->make_partition_presence_checker();
        while (!m.partitions.empty()) {
            with_allocator(_tracker.allocator(), [&] () {
                auto cmp = dht::ring_position_comparator(*_schema);
                {
                    size_t partition_count = 0;
                    {
                        STAP_PROBE(scylla, row_cache_update_one_batch_start);
                        // FIXME: we should really be checking should_yield() here instead of
                        // need_preempt(). However, should_yield() is currently quite
                        // expensive and we need to amortize it somehow.
                        do {
                          STAP_PROBE(scylla, row_cache_update_partition_start);
                          {
                            if (!update) {
                                _update_section(_tracker.region(), [&] {
                                  with_linearized_managed_bytes([&] {
                                    memtable_entry& mem_e = *m.partitions.begin();
                                    size_entry = mem_e.size_in_allocator_without_rows(_tracker.allocator());
                                    partitions_type::bound_hint hint;
                                    auto cache_i = _partitions.lower_bound(mem_e.key(), cmp, hint);
                                    update = updater(_update_section, cache_i, mem_e, is_present, real_dirty_acc, hint);
                                  });
                                });
                            }
                            // We use cooperative deferring instead of futures so that
                            // this layer has a chance to restore invariants before deferring,
                            // in particular set _prev_snapshot_pos to the correct value.
                            if (update.run() == stop_iteration::no) {
                                return;
                            }
                            update = {};
                            real_dirty_acc.unpin_memory(size_entry);
                            _update_section(_tracker.region(), [&] {
                              with_linearized_managed_bytes([&] {
                                auto i = m.partitions.begin();
                                i.erase_and_dispose(dht::raw_token_less_comparator{}, [&] (memtable_entry* e) noexcept {
                                    m.evict_entry(*e, _tracker.memtable_cleaner());
                                });
                              });
                            });
                            ++partition_count;
                          }
                          STAP_PROBE(scylla, row_cache_update_partition_end);
                        } while (!m.partitions.empty() && !need_preempt());
                        with_allocator(standard_allocator(), [&] {
                            if (m.partitions.empty()) {
                                _prev_snapshot_pos = {};
                            } else {
                                _update_section(_tracker.region(), [&] {
                                    _prev_snapshot_pos = m.partitions.begin()->key();
                                });
                            }
                        });
                        STAP_PROBE1(scylla, row_cache_update_one_batch_end, partition_count);
                    }
                }
            });
            real_dirty_acc.commit();
            seastar::thread::yield();
        }
    }).finally([cleanup = std::move(cleanup)] {});
  });
}

There is a preemption check in the do/while loop, but how can a single iteration take 32ms? /cc @xemul

@fgelcer
Copy link
Author

fgelcer commented Jan 7, 2021

There is a preemption check in the do/while loop, but how can a single iteration take 32ms? /cc @xemul

well, if you are checking the latest trace above this comment, then it took 7 ms...
the original issue reported in here was 32 ms, and i reported the last trace in here, because of the function name in the subject of this issue

@fgelcer
Copy link
Author

fgelcer commented Jan 7, 2021

one more occurrence, this time on node-3:
2021-01-01T07:05:14+00:00 perf-latency-nemesis-perf-v10-db-node-236aac08-3 !INFO | scylla: Reactor stalled for 9 ms on shard 2.

void seastar::backtrace<seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}>(seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}&&) at ./build/release/seastar/./seastar/include/seastar/util/backtrace.hh:59
 (inlined by) seastar::backtrace_buffer::append_backtrace() at ./build/release/seastar/./seastar/src/core/reactor.cc:731
 (inlined by) seastar::print_with_backtrace(seastar::backtrace_buffer&) at ./build/release/seastar/./seastar/src/core/reactor.cc:752
 (inlined by) seastar::internal::cpu_stall_detector::generate_trace() at ./build/release/seastar/./seastar/src/core/reactor.cc:1210
seastar::internal::cpu_stall_detector::maybe_report() at ./build/release/seastar/./seastar/src/core/reactor.cc:1056
 (inlined by) seastar::internal::cpu_stall_detector::on_signal() at ./build/release/seastar/./seastar/src/core/reactor.cc:1070
 (inlined by) seastar::reactor::block_notifier(int) at ./build/release/seastar/./seastar/src/core/reactor.cc:1193
?? ??:0
large_bitset::test(unsigned long) const at ././utils/large_bitset.hh:63
 (inlined by) operator()<long> at ./utils/bloom_filter.cc:72
 (inlined by) void utils::filter::for_each_index<utils::filter::bloom_filter::is_present(utils::hashed_key)::$_0>(utils::hashed_key, int, long, utils::filter_format, utils::filter::bloom_filter::is_present(utils::hashed_key)::$_0&&) at ./utils/bloom_filter.cc:62
 (inlined by) utils::filter::bloom_filter::is_present(utils::hashed_key) at ./utils/bloom_filter.cc:71
sstables::sstable::filter_has_key(utils::hashed_key) const at ././sstables/sstables.hh:729
 (inlined by) operator() at ./table.cc:999
 (inlined by) partition_presence_checker_result std::__invoke_impl<partition_presence_checker_result, table::make_partition_presence_checker(seastar::lw_shared_ptr<sstables::sstable_set>)::$_31&, dht::decorated_key const&>(std::__invoke_other, table::make_partition_presence_checker(seastar::lw_shared_ptr<sstables::sstable_set>)::$_31&, dht::decorated_key const&) at /usr/lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/bits/invoke.h:60
 (inlined by) std::enable_if<is_invocable_r_v<partition_presence_checker_result, table::make_partition_presence_checker(seastar::lw_shared_ptr<sstables::sstable_set>)::$_31&, dht::decorated_key const&>, partition_presence_checker_result>::type std::__invoke_r<partition_presence_checker_result, table::make_partition_presence_checker(seastar::lw_shared_ptr<sstables::sstable_set>)::$_31&, dht::decorated_key const&>(table::make_partition_presence_checker(seastar::lw_shared_ptr<sstables::sstable_set>)::$_31&, dht::decorated_key const&) at /usr/lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/bits/invoke.h:113
 (inlined by) std::_Function_handler<partition_presence_checker_result (dht::decorated_key const&), table::make_partition_presence_checker(seastar::lw_shared_ptr<sstables::sstable_set>)::$_31>::_M_invoke(std::_Any_data const&, dht::decorated_key const&) at /usr/lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/bits/std_function.h:291
std::function<partition_presence_checker_result (dht::decorated_key const&)>::operator()(dht::decorated_key const&) const at /usr/lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/bits/std_function.h:622
 (inlined by) operator() at ./row_cache.cc:1031
 (inlined by) _Z14with_allocatorIZZN9row_cache6updateENS0_16external_updaterER8memtableEN4$_19clERN8logalloc18allocating_sectionEN13double_deckerIl11cache_entryN3dht25raw_token_less_comparatorENSA_24ring_position_comparatorELi16ELN5bplus10key_searchE0ELNSD_10with_debugE0EE8iteratorER14memtable_entryRSt8functionIF33partition_presence_checker_resultRKNSA_13decorated_keyEEER27real_dirty_memory_accounterRKNSG_10bound_hintEEUlvE_EDcR19allocation_strategyOT_ at ././utils/allocation_strategy.hh:308
 (inlined by) operator() at ./row_cache.cc:1031
 (inlined by) operator() at ./row_cache.cc:971
 (inlined by) std::result_of<seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}::operator()()::{lambda()#2}::operator()() const::{lambda()#1}::operator()() const::{lambda()#1} ()>::type with_linearized_managed_bytes<seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}::operator()()::{lambda()#2}::operator()() const::{lambda()#1}::operator()() const::{lambda()#1}>(seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}::operator()()::{lambda()#2}::operator()() const::{lambda()#1}::operator()() const::{lambda()#1}&&) at ././utils/managed_bytes.hh:427
 (inlined by) operator() at ./row_cache.cc:966
 (inlined by) _ZN8logalloc18allocating_section24with_reclaiming_disabledIRZZZZN9row_cache9do_updateIZNS2_6updateENS2_16external_updaterER8memtableE4$_19EEN7seastar6futureIvEES4_S6_T_ENKUlvE_clEvENUlvE0_clEvENKUlvE0_clEvEUlvE_EEDcRNS_6regionEOSB_ at ././utils/logalloc.hh:755
 (inlined by) operator() at ././utils/logalloc.hh:777
 (inlined by) _ZN8logalloc18allocating_section12with_reserveIZNS0_clIZZZZN9row_cache9do_updateIZNS3_6updateENS3_16external_updaterER8memtableE4$_19EEN7seastar6futureIvEES5_S7_T_ENKUlvE_clEvENUlvE0_clEvENKUlvE0_clEvEUlvE_EEDcRNS_6regionEOSC_EUlvE_EEDcSJ_ at ././utils/logalloc.hh:726
 (inlined by) _ZN8logalloc18allocating_sectionclIZZZZN9row_cache9do_updateIZNS2_6updateENS2_16external_updaterER8memtableE4$_19EEN7seastar6futureIvEES4_S6_T_ENKUlvE_clEvENUlvE0_clEvENKUlvE0_clEvEUlvE_EEDcRNS_6regionEOSB_ at ././utils/logalloc.hh:776
 (inlined by) operator() at ./row_cache.cc:965
_Z14with_allocatorIZZZN9row_cache9do_updateIZNS0_6updateENS0_16external_updaterER8memtableE4$_19EEN7seastar6futureIvEES2_S4_T_ENKUlvE_clEvENUlvE0_clEvEUlvE0_EDcR19allocation_strategyOS9_ at ././utils/allocation_strategy.hh:308
 (inlined by) operator() at ./row_cache.cc:952
 (inlined by) void std::__invoke_impl<void, seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}>(std::__invoke_other, seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}&&) at /usr/lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/bits/invoke.h:60
 (inlined by) std::__invoke_result<seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}>::type std::__invoke<seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}>(seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}&&, (seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}&&)...) at /usr/lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/bits/invoke.h:95
 (inlined by) _ZSt12__apply_implIZZN9row_cache9do_updateIZNS0_6updateENS0_16external_updaterER8memtableE4$_19EEN7seastar6futureIvEES2_S4_T_ENKUlvE_clEvEUlvE0_St5tupleIJEEJEEDcOS9_OT0_St16integer_sequenceImJXspT1_EEE at /usr/lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/tuple:1723
 (inlined by) _ZSt5applyIZZN9row_cache9do_updateIZNS0_6updateENS0_16external_updaterER8memtableE4$_19EEN7seastar6futureIvEES2_S4_T_ENKUlvE_clEvEUlvE0_St5tupleIJEEEDcOS9_OT0_ at /usr/lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/tuple:1734
 (inlined by) seastar::future<void> seastar::futurize<void>::apply<seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}>(seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}&&, std::tuple<>&&) at ././seastar/include/seastar/core/future.hh:2099
 (inlined by) operator() at ././seastar/include/seastar/core/thread.hh:258
 (inlined by) seastar::noncopyable_function<void ()>::direct_vtable_for<seastar::futurize<std::result_of<std::decay<seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}>::type ()>::type>::type seastar::async<seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}>(seastar::thread_attributes, seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}&&, (std::decay<seastar::future<void> row_cache::do_update<row_cache::update(row_cache::external_updater, memtable&)::$_19>(row_cache::external_updater, memtable&, row_cache::update(row_cache::external_updater, memtable&)::$_19)::{lambda()#1}::operator()() const::{lambda()#2}>&&)...)::{lambda()#1}>::call(seastar::noncopyable_function<void ()> const*) at ././seastar/include/seastar/util/noncopyable_function.hh:124
seastar::noncopyable_function<void ()>::operator()() const at ./build/release/seastar/./seastar/include/seastar/util/noncopyable_function.hh:209
 (inlined by) seastar::thread_context::main() at ./build/release/seastar/./seastar/src/core/thread.cc:297

@avikivity
Copy link
Member

@fgelcer what's the AMI ID for the last reproducer? And were there no reproducers after the last one?

Maybe it was caused by #7696 (although the close date makes it unlikely).

@fgelcer
Copy link
Author

fgelcer commented Feb 3, 2021

@fgelcer what's the AMI ID for the last reproducer? And were there no reproducers after the last one?

Maybe it was caused by #7696 (although the close date makes it unlikely).

this is the AMI id ami-047c88faadef0dbc6
and this is the version 4.4.dev-0.20201231.93a2c5233 with build-id 63738ab215ea91209c7488bf63b0556f6adf6c15
and this is the test_id for the record 236aac08-9e84-45d6-a707-35c201bbae18

@avikivity
Copy link
Member

So this rules out the tuned problem.

@avikivity
Copy link
Member

I don't understand it.

@tgrabiec
Copy link
Contributor

A stall in partition_presence_checker could be caused by a large number of sstables with overlapping partition ranges, so there is a lot of bloom filters to check

@bhalevy
Copy link
Member

bhalevy commented Feb 16, 2021

Maybe something along these lines could help:

diff --git a/table.cc b/table.cc
index a76b48fa4..72b032731 100644
--- a/table.cc
+++ b/table.cc
@@ -1051,10 +1051,11 @@ table::make_partition_presence_checker(lw_shared_ptr<sstables::sstable_set> ssta
         auto hk = sstables::sstable::make_hashed_key(*_schema, key.key());
         for (auto&& s : sst) {
             if (s->filter_has_key(hk)) {
                 return partition_presence_checker_result::maybe_exists;
             }
+            seastar::thread::maybe_yield();
         }
         return partition_presence_checker_result::definitely_doesnt_exist;
     };
 }
 

@avikivity
Copy link
Member

How many sstables do we need for the loop to take 32ms?

@raphaelsc
Copy link
Member

level 0 sstables are unconditionally returned by incremental_selector::select(). they're thought to span roughly all token range. however, after repair, we may have tons (thousands?) of L0 sstables which span only a subset of the token space. after off-strategy, we'll no longer have this issue. 1) at most 256 sstables per shard. 2) they're held in partitioned_sstable_set. while off-strategy is not in, we may consider changing incremental_selector to only return L0 ssts which overlap with the requested key, instead of returning them all unconditionally. this will not only fix the stall after repair, but also avoid unneeded read ampl., which results from reading index unnecessarily.

@avikivity
Copy link
Member

Even a thousand sstables. 32ms / 1000 = 32us. Does a bloom filter check take 32us? No, it doesn't.

@avikivity
Copy link
Member

object_descriptor::encode, see #8828

@avikivity
Copy link
Member

Closing as duplicate of #8828.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants