New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Compilation is S L O W #1
Comments
I'm using the following configuration for building Urchin:
It's slow but bearable. |
On Tue, Mar 24, 2015 at 3:27 PM, Pekka Enberg notifications@github.com
But isn't it sad that compiling just the tests - just 3,000 lines of code -
|
We certainly can do something about it and should do that. Are there any tools to "profile compilation" to detect which compilation units are slowing it down the most? |
2015-03-24 14:43 GMT+01:00 nyh notifications@github.com:
$ cat ~/.ninja-stats/stats | cut -d"," -f2 | series-stat.py |
On Tue, Mar 24, 2015 at 3:50 PM, Tomasz Grabiec notifications@github.com
You are suffering from http://en.wikipedia.org/wiki/Stockholm_syndrome :-) 9 minutes for a recompile is bad. 1.5 for an average compile is even It's even worse when you consider that more than half of this time is I'm not saying there's something trivial we can do to fix this, but at Nadav Har'El |
2015-03-24 15:00 GMT+01:00 nyh notifications@github.com:
I was of course being sarcastic, the compile times are horrible.
|
One of the things that mosts ruins my programming speed on Urchin is this: Say I need to modify sstable.cc and sstable.hh with a large new feature. This causes 100 other files need to be compiled. Ok, fine. But what drives me crazy is that sstable.cc itself gets compiled at a random time - on average after compiling 50 other files which takes a L O N G time - and if I have several errors I often need to repeat this many times. Even if we don't fix any of the other issues we talked about (smaller header files, less include-the-world, more pimpl, etc.), it would have really helped if Ninja simply found a way to compile sstable.cc first - and tell me my new error immediately. There's actually a request for this on the ninja bug tracker: ninja-build/ninja#60 |
Here's a little trick to do something similar to ninja#60, but manually: If I see that build of build/debug/sstables/sstables.o is what failed last time, then I manually run
next time, until the compilation resolve, and then I revert to the usual "ninja". |
This reverts commit a19d217. This commit breaks cql_query_test. [asias@hjpc urchin]$ ./cql_query_test Running 1 test case... WARNING: Not implemented: COMPACT_TABLES WARNING: Not implemented: METRICS WARNING: Not implemented: PERMISSIONS cql_query_test: core/distributed.hh:290: Service& distributed<Service>::local() [with Service = service::storage_service]: Assertion `local_is_initialized()' failed. unknown location(0): fatal error in "test_create_keyspace_statement": signal: SIGABRT (application abort requested) tests/test-utils.cc(31): last checkpoint *** 1 failure detected in test suite "tests/urchin/cql_query_test.cc" (gdb) bt #0 0x00000032930348d7 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:55 #1 0x000000329303653a in __GI_abort () at abort.c:89 #2 0x000000329302d47d in __assert_fail_base (fmt=0x3293186cb8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x8ec10a "local_is_initialized()", file=file@entry=0x92508d "core/distributed.hh", line=line@entry=290, function=function@entry=0x8ed440 <distributed<service::storage_service>::local()::__PRETTY_FUNCTION__> "Service& distributed<Service>::local() [with Service = service::storage_service]") at assert.c:92 #3 0x000000329302d532 in __GI___assert_fail (assertion=0x8ec10a "local_is_initialized()", file=0x92508d "core/distributed.hh", line=290, function=0x8ed440 <distributed<service::storage_service>::local()::__PRETTY_FUNCTION__> "Service& distributed<Service>::local() [with Service = service::storage_service]") at assert.c:101 #4 0x0000000000430f19 in local (this=<optimized out>) at core/distributed.hh:290 #5 get_local_storage_service () at service/storage_service.hh:3326 #6 keyspace::create_replication_strategy (this=0x7ffff6bf8350) at database.cc:690 #7 0x000000000061537a in _ZZZN2db20legacy_schema_tables15merge_keyspacesERN7service13storage_proxyEOSt3mapI13basic_sstringIcjLj15EE13lw_shared_ptrIN5query10result_setEESt4lessIS6_ESaISt4pairIKS6_SA_EEESI_ENKUlRT_E0_clISt6ve ctorISF_SG_EEEDaSK_ENKUlR8databaseE_clESQ_ () at db/legacy_schema_tables.cc:584 #8 0x0000000000617d19 in operator() (__closure=0x7ffff6bf8650) at ./core/distributed.hh:284 In the test, storage_service and other services are not stared. Let's revert it and figure out a way to run cql_query_test with the needed services started properly and then bring the "storage_service: Remove ad-hoc token_metadata creation" change back.
cql_query_test segfaults in debug mode due to infinite recursion (see trace below). The problem started to appear after Avi's always-defer change. It looks like the following boost::any() constructor template is instantiated when boost::any is copied into a lambda: // Perfect forwarding of ValueType template<typename ValueType> any(ValueType&& value, typename boost::disable_if<boost::is_same<any&, ValueType> >::type* = 0) : content(new holder< typename remove_reference<ValueType>::type >(static_cast<ValueType&&>(value))) { } It checks for any& to disable itself on forwarding, but it doesn't check for const any&. There are (non-template) constructors which take const any&, but I guess some C++ rule favors the template version. This results in infinite recursion of constructor invocations between any() and holder(). Workaround is to make the lambda mutable, so that the argument type is any& and template doesn't kick in. Trace: CU 0x4debed5, DIE 0x4f176e6>, this=0x602000044550) at /usr/include/boost/any.hpp:175 ql_query_test, CU 0x4debed5, DIE 0x4f6c509>, this=0x602000044518) at /usr/include/boost/any.hpp:72 CU 0x4debed5, DIE 0x4f176e6>, this=0x602000044510) at /usr/include/boost/any.hpp:175 ql_query_test, CU 0x4debed5, DIE 0x4f6c509>, this=0x6020000444e8) at /usr/include/boost/any.hpp:72 CU 0x4debed5, DIE 0x4f176e6>, this=0x6020000444e0) at /usr/include/boost/any.hpp:175 oost::any const>, void>::type*) (this=0x7ffff2ffa6f0, value=<unknown type in /home/tgrabiec/src/urchin2/build /release/tests/urchin/cql_query_test, CU 0x4debed5, DIE 0x4f6c509>) at /usr/include/boost/any.hpp:72 5u> const&, std::vector<boost::any, std::allocator<boost::any> >, std::vector<boost::any, std::allocator<boos t::any> >, basic_sstring<char, unsigned int, 15u> const&, boost::any)::{lambda(database&)#1}::operator()(data base&) const::{lambda(std::unique_ptr<mutation_partition const, std::default_delete<mutation_partition> >)#1} ::unique_ptr(std::unique_ptr<mutation_partition const, std::default_delete<mutation_partition> >&&) (this=0x7 ffff2ffa6c0) at tests/urchin/cql_test_env.cc:126
make_local_reader takes a partition-range parameter, passed by *reference*. The meaning of this passing-by-reference is not documented: when make_local_reader returns, which of the following two statements holds? 1. make_local_reader still holds this reference, so the caller is forced to keep the range object alive. For how long? or 2. make_local_reader reads the range before returning, and when it returns, the caller continues to "own" the range object, and is allowed to do anything, including delete it. The principle of least surprise suggests that #2 is better, unless there is a very good reason to choose #1, and unless this requirement to keep the argument alive (and for how long) was clearly documented. But neither is the case here - the overhead of copying the argument is negligable compared to the other overheads of make_local_reader, and nothing was documented. But it turns out the current code did #1 - the address of the range was passed around *after* make_local_reader returns - the different readers on different cpus continue to use it to eventually find the right sstable byte range to iterate over. It very easy to forget this and call make_local_reader on a on-stack range variable, and the result is a hard-to-debug use-after-free mess. This patch switches us to situation #2: Before make_local_reader returns, the range is copied to whoever needs to hold it after the return, namely each individual shard_reader. The patch to do this is trivial (one character removed). Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
I'm closing since we're not actively intend to change anything for the time being |
On Sun, Aug 30, 2015 at 10:31 AM, Dor Laor notifications@github.com wrote:
Not being active on fixing this problem doesn't mean it's solved ;-) |
There is no point to have plenty of open issues if we don't intend to fix On Sun, Aug 30, 2015 at 11:09 AM, Avi Kivity notifications@github.com
|
There is no point in opening issues if they are closed in order to remove clutter. Issues should be prioritized, not closed. |
in Python, a raw string is created using 'r' or 'R' prefix. when creating the regex using Python string, sometimes, we have to use "\" to escape the parenthesis so the tools like "sed" can consider the parenthesis as a capture group. but "\" is also used to escape strings in Python, in order to put "\" as it is, we use "\" instead of escaping "\" with "\\" which is obscure. turns out in CPython 3.12, it loyally generates "\\" if "\\" is in a raw string. so "sed" is confused when executing the generated build rule, like: ``` [3/36] ANTLR3 alternator/expressions.g FAILED: build/debug/gen/alternator/expressionsLexer.cpp build/debug/gen/alternator/expressionsLexer.hpp build/debug/gen/alternator/expressionsParser.cpp build/debug/gen/alternator/expressionsParser.hpp sed -e '/^#if 0/,/^#endif/d' alternator/expressions.g > build/debug/gen/alternator/expressions.g && antlr3 build/debug/gen/alternator/expressions.g && sed -i -e '/^.*On :.*$/d' build/debug/gen/alternator/expressionsLexer.hpp && sed -i -e '/^.*On :.*$/d' build/debug/gen/alternator/expressionsLexer.cpp && sed -i -e '/^.*On :.*$/d' build/debug/gen/alternator/expressionsParser.hpp && sed -i -e 's/^\\( *\)\\(ImplTraits::CommonTokenType\\* [a-zA-Z0-9_]* = NULL;\)$/\1const \2/' -e '/^.*On :.*$/d' -e '1i using ExceptionBaseType = int;' -e 's/^{/{ ExceptionBaseType\* ex = nullptr;/; s/ExceptionBaseType\* ex = new/ex = new/; s/exceptions::syntax_exception e/exceptions::syntax_exception\& e/' build/debug/gen/alternator/expressionsParser.cpp sed: -e expression #1, char 80: Unmatched ) or \) ``` in this change, we replace "\\" with "\" in raw string where we intend to put a single "\" in the created string. this is accepted by both Python 3.11 shipped along with Fedora 38 and Python 3.12 shipped along with Fedora 39. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
…ports' from Yaron Kaikov This PR includes 3 commits: - **[actions] Add a check for backport labels**: As part of the Automation of ScyllaDB backports project, each PR should get either a `backport/none` or `backport/X.Y` label. Based on this label we will automatically open a backport PR for the relevant OSS release. In this commit, I am adding a GitHub action to verify if such a label was added. This only applies to PR with a based branch of `master` or `next`. For releases, we don't need this check - **Add Mergify (https://mergify.com/) configuration file**: In this PR we introduce the `.mergify.yml` configuration file, which include a set of rules that we will use for automating our backport process. For each supported OSS release (currently 5.2 and 5.4) we have an almost identical configuration section which includes the four conditions before we open a backport pr: * PR should be closed * PR should have the proper label. for example: backport/5.4 (we can have multiple labels) * Base branch should be `master` * PR should be set with a `promoted` label - this condition will be set automatically once the commits are promoted to the `master` branch (passed gating) Once all conditions are applied, the verify bot will open a backport PR and will assign it to the author of the original PR, then CI will start running, and only after it pass. we merge - **[action] Add promoted label when commits are in master**: In Scylla, we don't merge our PR but use ./script/pull_github_pr.sh` to close the pull request, adding `closes scylladb/scylladb <PR number>` remark and push changes to `next` branch. One of the conditions for opening a backport PR is that all relevant commits are in `master` (passed gating), in this GitHub action, we will go through the list of commits once a push was made to `master` and will identify the relevant PR, and add `promoted` label to it. This will allow Mergify to start the process of backporting Closes #17365 * github.com:scylladb/scylladb: [action] Add promoted label when commits are in master Add mergify (https://mergify.com/) configuration file [actions] Add a check for backport labels
Unfreezing large mutations inside memtable::apply may cause stalls as seen with the test_add_many_nodes_under_load dtest ``` ++[1#1/159 4%] addr=0x1638989 total=568 count=7 avg=81: | utils::uleb64_express_encode_impl at ././utils/vle.hh:73 | (inlined by) utils::uleb64_express_encode<void (&)(char const*, unsigned long), void (&)(char const*, unsigned long)> at ././utils/vle.hh:82 | (inlined by) logalloc::region_impl::object_descriptor::encode at ./utils/logalloc.cc:1658 | (inlined by) logalloc::region_impl::alloc_small at ./utils/logalloc.cc:1743 ++ - addr=0x163a59f: | logalloc::region_impl::alloc at ./utils/logalloc.cc:2104 | ++[2#1/7 56%] addr=0x1172c9c total=669 count=9 avg=74: | | managed_bytes::managed_bytes at ././utils/managed_bytes.hh:552 | | ++[3#1/3 74%] addr=0x15539b6 total=547 count=7 avg=78: | | | compound_wrapper<clustering_key_prefix, clustering_key_prefix_view>::compound_wrapper at ././keys.hh:149 | | | (inlined by) prefix_compound_wrapper<clustering_key_prefix, clustering_key_prefix_view, clustering_key_prefix>::prefix_compound_wrapper at ././keys.hh:574 | | | (inlined by) clustering_key_prefix::clustering_key_prefix at ././keys.hh:865 | | | (inlined by) rows_entry::rows_entry at ./mutation/mutation_partition.hh:976 | | | ++[4#1/1 100%] addr=0x1543379 total=962 count=18 avg=53: | | | | allocation_strategy::construct<rows_entry, schema const&, rows_entry const&> at ././utils/allocation_strategy.hh:160 | | | ++[5#1/1 100%] addr=0x15289e3 total=2231 count=48 avg=46: | | | | mutation_partition::mutation_partition(schema const&, mutation_partition const&)::$_0::operator() at ./mutation/mutation_partition.cc:149 | | | | (inlined by) intrusive_b::node<rows_entry, &rows_entry::_link, rows_entry::tri_compare, 12ul, 20ul, (intrusive_b::key_search)0, (intrusive_b::with_debug)0>::clone<mutation_partition::mutation_partition(schema const&, mutation_partition const&)::$_0&, current_deleter<rows_entry>()::{lambda(rows_entry*)scylladb#1}&> at ././utils/intrusive_btree.hh:2083 | | | ++[6#1/1 100%] addr=0x1528a29 total=2310 count=50 avg=46: | | | | intrusive_b::node<rows_entry, &rows_entry::_link, rows_entry::tri_compare, 12ul, 20ul, (intrusive_b::key_search)0, (intrusive_b::with_debug)0>::clone<mutation_partition::mutation_partition(schema const&, mutation_partition const&)::$_0&, current_deleter<rows_entry>()::{lambda(rows_entry*)scylladb#1}&> at ././utils/intrusive_btree.hh:2094 | | | ++[7#1/2 67%] addr=0x1528a29 total=4629 count=95 avg=49: | | | | intrusive_b::node<rows_entry, &rows_entry::_link, rows_entry::tri_compare, 12ul, 20ul, (intrusive_b::key_search)0, (intrusive_b::with_debug)0>::clone<mutation_partition::mutation_partition(schema const&, mutation_partition const&)::$_0&, current_deleter<rows_entry>()::{lambda(rows_entry*)scylladb#1}&> at ././utils/intrusive_btree.hh:2094 | | | -> continued at addr=0x1528a29 above | | | ++[7#2/2 33%] addr=0x1516e9b total=2310 count=50 avg=46: | | | | intrusive_b::tree<rows_entry, &rows_entry::_link, rows_entry::tri_compare, 12ul, 20ul, (intrusive_b::key_search)0, (intrusive_b::with_debug)0>::clone_from<mutation_partition::mutation_partition(schema const&, mutation_partition const&)::$_0&, current_deleter<rows_entry>()::{lambda(rows_entry*)scylladb#1}> at ././utils/intrusive_btree.hh:638 | | | | (inlined by) mutation_partition::mutation_partition at ./mutation/mutation_partition.cc:151 | | | ++ - addr=0x15a6782: | | | | partition_entry::apply at ./mutation/partition_version.cc:427 | | | ++[8#1/2 99%] addr=0x14a0ebe total=2277 count=49 avg=46: | | | | replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}::operator() at ./replica/memtable.cc:806 | | | | (inlined by) logalloc::allocating_section::with_reclaiming_disabled<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}&> at ././utils/logalloc.hh:500 | | | | (inlined by) logalloc::allocating_section::operator()<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}>(logalloc::region&, replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}&&)::{lambda()scylladb#1}::operator() at ././utils/logalloc.hh:527 | | | | (inlined by) logalloc::allocating_section::with_reserve<logalloc::allocating_section::operator()<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}>(logalloc::region&, replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}&&)::{lambda()scylladb#1}> at ././utils/logalloc.hh:471 | | | | (inlined by) logalloc::allocating_section::operator()<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}> at ././utils/logalloc.hh:526 | | | | (inlined by) replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator() at ./replica/memtable.cc:800 | | | | (inlined by) with_allocator<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0> at ././utils/allocation_strategy.hh:318 | | | | (inlined by) replica::memtable::apply at ./replica/memtable.cc:799 | | | | ++[9#1/1 100%] addr=0x145740b total=2697 count=60 avg=45: | | | | | replica::table::do_apply<frozen_mutation const&, seastar::lw_shared_ptr<schema const>&> at ./replica/table.cc:2897 | | | | ++[10#1/2 57%] addr=0x1417dcd total=5467 count=147 avg=37: | | | | | replica::table::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const>, db::rp_handle&&, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_1::operator() at ./replica/table.cc:2933 | | | | | (inlined by) seastar::futurize<void>::invoke<replica::table::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const>, db::rp_handle&&, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_1&> at ././seastar/include/seastar/core/future.hh:2019 | | | | | (inlined by) seastar::futurize_invoke<replica::table::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const>, db::rp_handle&&, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_1&> at ././seastar/include/seastar/core/future.hh:2053 | | | | | (inlined by) replica::dirty_memory_manager_logalloc::region_group::run_when_memory_available<replica::table::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const>, db::rp_handle&&, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_1> at ./replica/dirty_memory_manager.hh:572 | | | | | (inlined by) replica::table::apply at ./replica/table.cc:2932 | | | | ++ - addr=0x106f00a: | | | | | std::__n4861::coroutine_handle<seastar::internal::coroutine_traits_base<void>::promise_type>::resume at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/coroutine:240 | | | | | (inlined by) seastar::internal::coroutine_traits_base<void>::promise_type::run_and_dispose at ././seastar/include/seastar/core/coroutine.hh:125 ``` This patch unfreezes all frozen mutations at the (async) table::apply layer where we can yield if needed and then eliminates the memtable apply path, given a frozen_mutation so all callers of memtable::apply will need to unfreeze their mutations first. Fortunately, this only place this is needed outside the replica::table layer is a unit test. With this change, table::do_apply can be detemplatized since it now accepts only ref to mutation. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Perevent stalls from "unpacking" of large canonical mutations seen with test_add_many_nodes_under_load when called from `group0_state_machine::transfer_snapshot`: ``` ++[1#1/44 14%] addr=0x395b2f total=569 count=6 avg=95: ?? ??:0 | ++[2#1/2 56%] addr=0x3991e3 total=321 count=4 avg=80: ?? ??:0 | ++ - addr=0x1587159: | | std::__new_allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> >::allocate at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/new_allocator.h:147 | | (inlined by) std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> >::allocate at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/allocator.h:198 | | (inlined by) std::allocator_traits<std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > >::allocate at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/alloc_traits.h:482 | | (inlined by) std::_Vector_base<seastar::basic_sstring<signed char, unsigned int, 31u, false>, std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > >::_M_allocate at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/stl_vector.h:378 | | (inlined by) std::vector<seastar::basic_sstring<signed char, unsigned int, 31u, false>, std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > >::reserve at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/vector.tcc:79 | | (inlined by) ser::idl::serializers::internal::vector_serializer<std::vector<seastar::basic_sstring<signed char, unsigned int, 31u, false>, std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > > >::read<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ././serializer_impl.hh:226 | | (inlined by) ser::deserialize<std::vector<seastar::basic_sstring<signed char, unsigned int, 31u, false>, std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > >, seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ././serializer.hh:264 | | (inlined by) ser::serializer<clustering_key_prefix>::read<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> >(seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator>&)::{lambda(auto:1&)scylladb#1}::operator()<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ./build/dev/gen/idl/keys.dist.impl.hh:31 | ++ - addr=0x1587085: | | seastar::with_serialized_stream<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator>, ser::serializer<clustering_key_prefix>::read<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> >(seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator>&)::{lambda(auto:1&)scylladb#1}, void, void> at ././seastar/include/seastar/core/simple-stream.hh:646 | | (inlined by) ser::serializer<clustering_key_prefix>::read<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ./build/dev/gen/idl/keys.dist.impl.hh:28 | | (inlined by) ser::deserialize<clustering_key_prefix, seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ././serializer.hh:264 | | (inlined by) ser::deletable_row_view::key() const::{lambda(auto:1&)scylladb#1}::operator()<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> const> at ./build/dev/gen/idl/mutation.dist.impl.hh:1268 | | ++[3#1/1 100%] addr=0x15865a3 total=577 count=7 avg=82: | | | seastar::memory_input_stream<bytes_ostream::fragment_iterator>::with_stream<ser::deletable_row_view::key() const::{lambda(auto:1&)scylladb#1}> at ././seastar/include/seastar/core/simple-stream.hh:491 | | | (inlined by) seastar::with_serialized_stream<seastar::memory_input_stream<bytes_ostream::fragment_iterator> const, ser::deletable_row_view::key() const::{lambda(auto:1&)scylladb#1}, void> at ././seastar/include/seastar/core/simple-stream.hh:639 | | | (inlined by) ser::deletable_row_view::key at ./build/dev/gen/idl/mutation.dist.impl.hh:1264 | | ++[4#1/1 100%] addr=0x157cf27 total=643 count=8 avg=80: | | | mutation_partition_view::do_accept<partition_builder> at ./mutation/mutation_partition_view.cc:212 | | ++ - addr=0x1516cac: | | | mutation_partition::apply at ./mutation/mutation_partition.cc:497 | | ++[5#1/1 100%] addr=0x14e4433 total=1765 count=22 avg=80: | | | canonical_mutation::to_mutation at ./mutation/canonical_mutation.cc:60 | | ++[6#1/2 98%] addr=0x2452a60 total=1732 count=21 avg=82: | | | service::storage_service::merge_topology_snapshot at ./service/storage_service.cc:761 | | ++ - addr=0x2858782: | | | service::group0_state_machine::transfer_snapshot at ./service/raft/group0_state_machine.cc:303 ``` Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Freezing large mutations synchronously may cause reactor stalls, as seen in the test_add_many_nodes_under_load dtest: ``` ++[1#1/37 5%] addr=0x15b0bf total=99 count=2 avg=50: ?? ??:0 | ++[2#1/2 67%] addr=0x15a331f total=66 count=1 avg=66: | | bytes_ostream::write at ././bytes_ostream.hh:248 | | (inlined by) bytes_ostream::write at ././bytes_ostream.hh:263 | | (inlined by) ser::serialize_integral<unsigned int, bytes_ostream> at ././serializer.hh:203 | | (inlined by) ser::integral_serializer<unsigned int>::write<bytes_ostream> at ././serializer.hh:217 | | (inlined by) ser::serialize<unsigned int, bytes_ostream> at ././serializer.hh:254 | | (inlined by) ser::writer_of_column<bytes_ostream>::write_id at ./build/dev/gen/idl/mutation.dist.impl.hh:4680 | | ++[3#1/1 100%] addr=0x159df71 total=132 count=2 avg=66: | | | (anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}::operator() at ./mutation/mutation_partition_serializer.cc:99 | | | (inlined by) row::maybe_invoke_with_hash<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1} const, cell_and_hash const> at ./mutation/mutation_partition.hh:133 | | | (inlined by) row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}::operator() at ./mutation/mutation_partition.hh:152 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>::operator() at ././utils/compact-radix-tree.hh:1888 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::visit_slot<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>&> at ././utils/compact-radix-tree.hh:1560 | | ++ - addr=0x159d84d: | | | compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u>::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>&> at ././utils/compact-radix-tree.hh:1364 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::node_base<cell_and_hash, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::direct_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u> >::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>&, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::direct_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u> > at ././utils/compact-radix-tree.hh:799 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::node_base<cell_and_hash, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::direct_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u> >::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>&> at ././utils/compact-radix-tree.hh:807 | | ++[4#1/1 100%] addr=0x1596f4a total=329 count=5 avg=66: | | | compact_radix_tree::tree<cell_and_hash, unsigned int>::node_head::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true> > at ././utils/compact-radix-tree.hh:473 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true> > at ././utils/compact-radix-tree.hh:1626 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::walk<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}> at ././utils/compact-radix-tree.hh:1909 | | | (inlined by) row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}> at ./mutation/mutation_partition.hh:151 | | | (inlined by) (anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> > at ./mutation/mutation_partition_serializer.cc:97 | | | (inlined by) write_row<ser::writer_of_deletable_row<bytes_ostream> > at ./mutation/mutation_partition_serializer.cc:168 | | ++[5#1/2 80%] addr=0x15a310c total=263 count=4 avg=66: | | | mutation_partition_serializer::write_serialized<ser::writer_of_mutation_partition<bytes_ostream> > at ./mutation/mutation_partition_serializer.cc:180 | | | ++[6#1/2 62%] addr=0x14eb60a total=428 count=7 avg=61: | | | | frozen_mutation::frozen_mutation(mutation const&)::$_0::operator()<ser::writer_of_mutation_partition<bytes_ostream> > at ./mutation/frozen_mutation.cc:85 | | | | (inlined by) ser::after_mutation__key<bytes_ostream>::partition<frozen_mutation::frozen_mutation(mutation const&)::$_0> at ./build/dev/gen/idl/mutation.dist.impl.hh:7058 | | | | (inlined by) frozen_mutation::frozen_mutation at ./mutation/frozen_mutation.cc:84 | | | | ++[7#1/1 100%] addr=0x14ed388 total=532 count=9 avg=59: | | | | | freeze at ./mutation/frozen_mutation.cc:143 | | | | ++[8#1/2 74%] addr=0x252cf55 total=394 count=6 avg=66: | | | | | service::storage_service::merge_topology_snapshot at ./service/storage_service.cc:763 ``` This change adds a preemptible mutation_partition_serializer called from `mutation_partition_serializer::write_in_thread` and latter called from `freeze_in_thread`. A thread is started in `storage_service::merge_topology_snapshot` in which we expect large enough mutations to warrant a seastar thread, and taking into account it is not on the hot data path, but rather it's on the management plane. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Freezing large mutations synchronously may cause reactor stalls, as seen in the test_add_many_nodes_under_load dtest: ``` ++[1#1/37 5%] addr=0x15b0bf total=99 count=2 avg=50: ?? ??:0 | ++[2#1/2 67%] addr=0x15a331f total=66 count=1 avg=66: | | bytes_ostream::write at ././bytes_ostream.hh:248 | | (inlined by) bytes_ostream::write at ././bytes_ostream.hh:263 | | (inlined by) ser::serialize_integral<unsigned int, bytes_ostream> at ././serializer.hh:203 | | (inlined by) ser::integral_serializer<unsigned int>::write<bytes_ostream> at ././serializer.hh:217 | | (inlined by) ser::serialize<unsigned int, bytes_ostream> at ././serializer.hh:254 | | (inlined by) ser::writer_of_column<bytes_ostream>::write_id at ./build/dev/gen/idl/mutation.dist.impl.hh:4680 | | ++[3#1/1 100%] addr=0x159df71 total=132 count=2 avg=66: | | | (anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}::operator() at ./mutation/mutation_partition_serializer.cc:99 | | | (inlined by) row::maybe_invoke_with_hash<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1} const, cell_and_hash const> at ./mutation/mutation_partition.hh:133 | | | (inlined by) row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}::operator() at ./mutation/mutation_partition.hh:152 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>::operator() at ././utils/compact-radix-tree.hh:1888 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::visit_slot<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>&> at ././utils/compact-radix-tree.hh:1560 | | ++ - addr=0x159d84d: | | | compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u>::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>&> at ././utils/compact-radix-tree.hh:1364 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::node_base<cell_and_hash, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::direct_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u> >::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>&, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::direct_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u> > at ././utils/compact-radix-tree.hh:799 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::node_base<cell_and_hash, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::direct_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u> >::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>&> at ././utils/compact-radix-tree.hh:807 | | ++[4#1/1 100%] addr=0x1596f4a total=329 count=5 avg=66: | | | compact_radix_tree::tree<cell_and_hash, unsigned int>::node_head::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true> > at ././utils/compact-radix-tree.hh:473 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true> > at ././utils/compact-radix-tree.hh:1626 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::walk<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}> at ././utils/compact-radix-tree.hh:1909 | | | (inlined by) row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}> at ./mutation/mutation_partition.hh:151 | | | (inlined by) (anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> > at ./mutation/mutation_partition_serializer.cc:97 | | | (inlined by) write_row<ser::writer_of_deletable_row<bytes_ostream> > at ./mutation/mutation_partition_serializer.cc:168 | | ++[5#1/2 80%] addr=0x15a310c total=263 count=4 avg=66: | | | mutation_partition_serializer::write_serialized<ser::writer_of_mutation_partition<bytes_ostream> > at ./mutation/mutation_partition_serializer.cc:180 | | | ++[6#1/2 62%] addr=0x14eb60a total=428 count=7 avg=61: | | | | frozen_mutation::frozen_mutation(mutation const&)::$_0::operator()<ser::writer_of_mutation_partition<bytes_ostream> > at ./mutation/frozen_mutation.cc:85 | | | | (inlined by) ser::after_mutation__key<bytes_ostream>::partition<frozen_mutation::frozen_mutation(mutation const&)::$_0> at ./build/dev/gen/idl/mutation.dist.impl.hh:7058 | | | | (inlined by) frozen_mutation::frozen_mutation at ./mutation/frozen_mutation.cc:84 | | | | ++[7#1/1 100%] addr=0x14ed388 total=532 count=9 avg=59: | | | | | freeze at ./mutation/frozen_mutation.cc:143 | | | | ++[8#1/2 74%] addr=0x252cf55 total=394 count=6 avg=66: | | | | | service::storage_service::merge_topology_snapshot at ./service/storage_service.cc:763 ``` This change adds a preemptible mutation_partition_serializer called from `mutation_partition_serializer::write_in_thread` and latter called from `freeze_in_thread`. A thread is started in `storage_service::merge_topology_snapshot` in which we expect large enough mutations to warrant a seastar thread, and taking into account it is not on the hot data path, but rather it's on the management plane. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Unfreezing large mutations inside memtable::apply may cause stalls as seen with the test_add_many_nodes_under_load dtest ``` ++[1#1/159 4%] addr=0x1638989 total=568 count=7 avg=81: | utils::uleb64_express_encode_impl at ././utils/vle.hh:73 | (inlined by) utils::uleb64_express_encode<void (&)(char const*, unsigned long), void (&)(char const*, unsigned long)> at ././utils/vle.hh:82 | (inlined by) logalloc::region_impl::object_descriptor::encode at ./utils/logalloc.cc:1658 | (inlined by) logalloc::region_impl::alloc_small at ./utils/logalloc.cc:1743 ++ - addr=0x163a59f: | logalloc::region_impl::alloc at ./utils/logalloc.cc:2104 | ++[2#1/7 56%] addr=0x1172c9c total=669 count=9 avg=74: | | managed_bytes::managed_bytes at ././utils/managed_bytes.hh:552 | | ++[3#1/3 74%] addr=0x15539b6 total=547 count=7 avg=78: | | | compound_wrapper<clustering_key_prefix, clustering_key_prefix_view>::compound_wrapper at ././keys.hh:149 | | | (inlined by) prefix_compound_wrapper<clustering_key_prefix, clustering_key_prefix_view, clustering_key_prefix>::prefix_compound_wrapper at ././keys.hh:574 | | | (inlined by) clustering_key_prefix::clustering_key_prefix at ././keys.hh:865 | | | (inlined by) rows_entry::rows_entry at ./mutation/mutation_partition.hh:976 | | | ++[4#1/1 100%] addr=0x1543379 total=962 count=18 avg=53: | | | | allocation_strategy::construct<rows_entry, schema const&, rows_entry const&> at ././utils/allocation_strategy.hh:160 | | | ++[5#1/1 100%] addr=0x15289e3 total=2231 count=48 avg=46: | | | | mutation_partition::mutation_partition(schema const&, mutation_partition const&)::$_0::operator() at ./mutation/mutation_partition.cc:149 | | | | (inlined by) intrusive_b::node<rows_entry, &rows_entry::_link, rows_entry::tri_compare, 12ul, 20ul, (intrusive_b::key_search)0, (intrusive_b::with_debug)0>::clone<mutation_partition::mutation_partition(schema const&, mutation_partition const&)::$_0&, current_deleter<rows_entry>()::{lambda(rows_entry*)scylladb#1}&> at ././utils/intrusive_btree.hh:2083 | | | ++[6#1/1 100%] addr=0x1528a29 total=2310 count=50 avg=46: | | | | intrusive_b::node<rows_entry, &rows_entry::_link, rows_entry::tri_compare, 12ul, 20ul, (intrusive_b::key_search)0, (intrusive_b::with_debug)0>::clone<mutation_partition::mutation_partition(schema const&, mutation_partition const&)::$_0&, current_deleter<rows_entry>()::{lambda(rows_entry*)scylladb#1}&> at ././utils/intrusive_btree.hh:2094 | | | ++[7#1/2 67%] addr=0x1528a29 total=4629 count=95 avg=49: | | | | intrusive_b::node<rows_entry, &rows_entry::_link, rows_entry::tri_compare, 12ul, 20ul, (intrusive_b::key_search)0, (intrusive_b::with_debug)0>::clone<mutation_partition::mutation_partition(schema const&, mutation_partition const&)::$_0&, current_deleter<rows_entry>()::{lambda(rows_entry*)scylladb#1}&> at ././utils/intrusive_btree.hh:2094 | | | -> continued at addr=0x1528a29 above | | | ++[7#2/2 33%] addr=0x1516e9b total=2310 count=50 avg=46: | | | | intrusive_b::tree<rows_entry, &rows_entry::_link, rows_entry::tri_compare, 12ul, 20ul, (intrusive_b::key_search)0, (intrusive_b::with_debug)0>::clone_from<mutation_partition::mutation_partition(schema const&, mutation_partition const&)::$_0&, current_deleter<rows_entry>()::{lambda(rows_entry*)scylladb#1}> at ././utils/intrusive_btree.hh:638 | | | | (inlined by) mutation_partition::mutation_partition at ./mutation/mutation_partition.cc:151 | | | ++ - addr=0x15a6782: | | | | partition_entry::apply at ./mutation/partition_version.cc:427 | | | ++[8#1/2 99%] addr=0x14a0ebe total=2277 count=49 avg=46: | | | | replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}::operator() at ./replica/memtable.cc:806 | | | | (inlined by) logalloc::allocating_section::with_reclaiming_disabled<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}&> at ././utils/logalloc.hh:500 | | | | (inlined by) logalloc::allocating_section::operator()<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}>(logalloc::region&, replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}&&)::{lambda()scylladb#1}::operator() at ././utils/logalloc.hh:527 | | | | (inlined by) logalloc::allocating_section::with_reserve<logalloc::allocating_section::operator()<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}>(logalloc::region&, replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}&&)::{lambda()scylladb#1}> at ././utils/logalloc.hh:471 | | | | (inlined by) logalloc::allocating_section::operator()<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}> at ././utils/logalloc.hh:526 | | | | (inlined by) replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator() at ./replica/memtable.cc:800 | | | | (inlined by) with_allocator<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0> at ././utils/allocation_strategy.hh:318 | | | | (inlined by) replica::memtable::apply at ./replica/memtable.cc:799 | | | | ++[9#1/1 100%] addr=0x145740b total=2697 count=60 avg=45: | | | | | replica::table::do_apply<frozen_mutation const&, seastar::lw_shared_ptr<schema const>&> at ./replica/table.cc:2897 | | | | ++[10#1/2 57%] addr=0x1417dcd total=5467 count=147 avg=37: | | | | | replica::table::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const>, db::rp_handle&&, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_1::operator() at ./replica/table.cc:2933 | | | | | (inlined by) seastar::futurize<void>::invoke<replica::table::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const>, db::rp_handle&&, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_1&> at ././seastar/include/seastar/core/future.hh:2019 | | | | | (inlined by) seastar::futurize_invoke<replica::table::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const>, db::rp_handle&&, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_1&> at ././seastar/include/seastar/core/future.hh:2053 | | | | | (inlined by) replica::dirty_memory_manager_logalloc::region_group::run_when_memory_available<replica::table::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const>, db::rp_handle&&, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_1> at ./replica/dirty_memory_manager.hh:572 | | | | | (inlined by) replica::table::apply at ./replica/table.cc:2932 | | | | ++ - addr=0x106f00a: | | | | | std::__n4861::coroutine_handle<seastar::internal::coroutine_traits_base<void>::promise_type>::resume at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/coroutine:240 | | | | | (inlined by) seastar::internal::coroutine_traits_base<void>::promise_type::run_and_dispose at ././seastar/include/seastar/core/coroutine.hh:125 ``` This patch unfreezes all frozen mutations at the (async) table::apply layer where we can yield if needed and then eliminates the memtable apply path, given a frozen_mutation so all callers of memtable::apply will need to unfreeze their mutations first. Fortunately, this only place this is needed outside the replica::table layer is a unit test. With this change, table::do_apply can be detemplatized since it now accepts only ref to mutation. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Perevent stalls from "unpacking" of large canonical mutations seen with test_add_many_nodes_under_load when called from `group0_state_machine::transfer_snapshot`: ``` ++[1#1/44 14%] addr=0x395b2f total=569 count=6 avg=95: ?? ??:0 | ++[2#1/2 56%] addr=0x3991e3 total=321 count=4 avg=80: ?? ??:0 | ++ - addr=0x1587159: | | std::__new_allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> >::allocate at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/new_allocator.h:147 | | (inlined by) std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> >::allocate at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/allocator.h:198 | | (inlined by) std::allocator_traits<std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > >::allocate at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/alloc_traits.h:482 | | (inlined by) std::_Vector_base<seastar::basic_sstring<signed char, unsigned int, 31u, false>, std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > >::_M_allocate at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/stl_vector.h:378 | | (inlined by) std::vector<seastar::basic_sstring<signed char, unsigned int, 31u, false>, std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > >::reserve at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/vector.tcc:79 | | (inlined by) ser::idl::serializers::internal::vector_serializer<std::vector<seastar::basic_sstring<signed char, unsigned int, 31u, false>, std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > > >::read<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ././serializer_impl.hh:226 | | (inlined by) ser::deserialize<std::vector<seastar::basic_sstring<signed char, unsigned int, 31u, false>, std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > >, seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ././serializer.hh:264 | | (inlined by) ser::serializer<clustering_key_prefix>::read<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> >(seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator>&)::{lambda(auto:1&)scylladb#1}::operator()<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ./build/dev/gen/idl/keys.dist.impl.hh:31 | ++ - addr=0x1587085: | | seastar::with_serialized_stream<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator>, ser::serializer<clustering_key_prefix>::read<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> >(seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator>&)::{lambda(auto:1&)scylladb#1}, void, void> at ././seastar/include/seastar/core/simple-stream.hh:646 | | (inlined by) ser::serializer<clustering_key_prefix>::read<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ./build/dev/gen/idl/keys.dist.impl.hh:28 | | (inlined by) ser::deserialize<clustering_key_prefix, seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ././serializer.hh:264 | | (inlined by) ser::deletable_row_view::key() const::{lambda(auto:1&)scylladb#1}::operator()<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> const> at ./build/dev/gen/idl/mutation.dist.impl.hh:1268 | | ++[3#1/1 100%] addr=0x15865a3 total=577 count=7 avg=82: | | | seastar::memory_input_stream<bytes_ostream::fragment_iterator>::with_stream<ser::deletable_row_view::key() const::{lambda(auto:1&)scylladb#1}> at ././seastar/include/seastar/core/simple-stream.hh:491 | | | (inlined by) seastar::with_serialized_stream<seastar::memory_input_stream<bytes_ostream::fragment_iterator> const, ser::deletable_row_view::key() const::{lambda(auto:1&)scylladb#1}, void> at ././seastar/include/seastar/core/simple-stream.hh:639 | | | (inlined by) ser::deletable_row_view::key at ./build/dev/gen/idl/mutation.dist.impl.hh:1264 | | ++[4#1/1 100%] addr=0x157cf27 total=643 count=8 avg=80: | | | mutation_partition_view::do_accept<partition_builder> at ./mutation/mutation_partition_view.cc:212 | | ++ - addr=0x1516cac: | | | mutation_partition::apply at ./mutation/mutation_partition.cc:497 | | ++[5#1/1 100%] addr=0x14e4433 total=1765 count=22 avg=80: | | | canonical_mutation::to_mutation at ./mutation/canonical_mutation.cc:60 | | ++[6#1/2 98%] addr=0x2452a60 total=1732 count=21 avg=82: | | | service::storage_service::merge_topology_snapshot at ./service/storage_service.cc:761 | | ++ - addr=0x2858782: | | | service::group0_state_machine::transfer_snapshot at ./service/raft/group0_state_machine.cc:303 ``` Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Freezing large mutations synchronously may cause reactor stalls, as seen in the test_add_many_nodes_under_load dtest: ``` ++[1#1/37 5%] addr=0x15b0bf total=99 count=2 avg=50: ?? ??:0 | ++[2#1/2 67%] addr=0x15a331f total=66 count=1 avg=66: | | bytes_ostream::write at ././bytes_ostream.hh:248 | | (inlined by) bytes_ostream::write at ././bytes_ostream.hh:263 | | (inlined by) ser::serialize_integral<unsigned int, bytes_ostream> at ././serializer.hh:203 | | (inlined by) ser::integral_serializer<unsigned int>::write<bytes_ostream> at ././serializer.hh:217 | | (inlined by) ser::serialize<unsigned int, bytes_ostream> at ././serializer.hh:254 | | (inlined by) ser::writer_of_column<bytes_ostream>::write_id at ./build/dev/gen/idl/mutation.dist.impl.hh:4680 | | ++[3#1/1 100%] addr=0x159df71 total=132 count=2 avg=66: | | | (anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}::operator() at ./mutation/mutation_partition_serializer.cc:99 | | | (inlined by) row::maybe_invoke_with_hash<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1} const, cell_and_hash const> at ./mutation/mutation_partition.hh:133 | | | (inlined by) row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}::operator() at ./mutation/mutation_partition.hh:152 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>::operator() at ././utils/compact-radix-tree.hh:1888 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::visit_slot<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>&> at ././utils/compact-radix-tree.hh:1560 | | ++ - addr=0x159d84d: | | | compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u>::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>&> at ././utils/compact-radix-tree.hh:1364 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::node_base<cell_and_hash, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::direct_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u> >::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>&, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::direct_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u> > at ././utils/compact-radix-tree.hh:799 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::node_base<cell_and_hash, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::direct_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u> >::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>&> at ././utils/compact-radix-tree.hh:807 | | ++[4#1/1 100%] addr=0x1596f4a total=329 count=5 avg=66: | | | compact_radix_tree::tree<cell_and_hash, unsigned int>::node_head::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true> > at ././utils/compact-radix-tree.hh:473 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true> > at ././utils/compact-radix-tree.hh:1626 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::walk<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}> at ././utils/compact-radix-tree.hh:1909 | | | (inlined by) row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}> at ./mutation/mutation_partition.hh:151 | | | (inlined by) (anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> > at ./mutation/mutation_partition_serializer.cc:97 | | | (inlined by) write_row<ser::writer_of_deletable_row<bytes_ostream> > at ./mutation/mutation_partition_serializer.cc:168 | | ++[5#1/2 80%] addr=0x15a310c total=263 count=4 avg=66: | | | mutation_partition_serializer::write_serialized<ser::writer_of_mutation_partition<bytes_ostream> > at ./mutation/mutation_partition_serializer.cc:180 | | | ++[6#1/2 62%] addr=0x14eb60a total=428 count=7 avg=61: | | | | frozen_mutation::frozen_mutation(mutation const&)::$_0::operator()<ser::writer_of_mutation_partition<bytes_ostream> > at ./mutation/frozen_mutation.cc:85 | | | | (inlined by) ser::after_mutation__key<bytes_ostream>::partition<frozen_mutation::frozen_mutation(mutation const&)::$_0> at ./build/dev/gen/idl/mutation.dist.impl.hh:7058 | | | | (inlined by) frozen_mutation::frozen_mutation at ./mutation/frozen_mutation.cc:84 | | | | ++[7#1/1 100%] addr=0x14ed388 total=532 count=9 avg=59: | | | | | freeze at ./mutation/frozen_mutation.cc:143 | | | | ++[8#1/2 74%] addr=0x252cf55 total=394 count=6 avg=66: | | | | | service::storage_service::merge_topology_snapshot at ./service/storage_service.cc:763 ``` This change adds a preemptible mutation_partition_serializer called from `mutation_partition_serializer::write_in_thread` and latter called from `freeze_in_thread`. A thread is started in `storage_service::merge_topology_snapshot` in which we expect large enough mutations to warrant a seastar thread, and taking into account it is not on the hot data path, but rather it's on the management plane. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Perevent stalls from "unpacking" of large canonical mutations seen with test_add_many_nodes_under_load when called from `group0_state_machine::transfer_snapshot`: ``` ++[1#1/44 14%] addr=0x395b2f total=569 count=6 avg=95: ?? ??:0 | ++[2#1/2 56%] addr=0x3991e3 total=321 count=4 avg=80: ?? ??:0 | ++ - addr=0x1587159: | | std::__new_allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> >::allocate at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/new_allocator.h:147 | | (inlined by) std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> >::allocate at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/allocator.h:198 | | (inlined by) std::allocator_traits<std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > >::allocate at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/alloc_traits.h:482 | | (inlined by) std::_Vector_base<seastar::basic_sstring<signed char, unsigned int, 31u, false>, std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > >::_M_allocate at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/stl_vector.h:378 | | (inlined by) std::vector<seastar::basic_sstring<signed char, unsigned int, 31u, false>, std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > >::reserve at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/vector.tcc:79 | | (inlined by) ser::idl::serializers::internal::vector_serializer<std::vector<seastar::basic_sstring<signed char, unsigned int, 31u, false>, std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > > >::read<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ././serializer_impl.hh:226 | | (inlined by) ser::deserialize<std::vector<seastar::basic_sstring<signed char, unsigned int, 31u, false>, std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > >, seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ././serializer.hh:264 | | (inlined by) ser::serializer<clustering_key_prefix>::read<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> >(seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator>&)::{lambda(auto:1&)scylladb#1}::operator()<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ./build/dev/gen/idl/keys.dist.impl.hh:31 | ++ - addr=0x1587085: | | seastar::with_serialized_stream<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator>, ser::serializer<clustering_key_prefix>::read<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> >(seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator>&)::{lambda(auto:1&)scylladb#1}, void, void> at ././seastar/include/seastar/core/simple-stream.hh:646 | | (inlined by) ser::serializer<clustering_key_prefix>::read<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ./build/dev/gen/idl/keys.dist.impl.hh:28 | | (inlined by) ser::deserialize<clustering_key_prefix, seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ././serializer.hh:264 | | (inlined by) ser::deletable_row_view::key() const::{lambda(auto:1&)scylladb#1}::operator()<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> const> at ./build/dev/gen/idl/mutation.dist.impl.hh:1268 | | ++[3#1/1 100%] addr=0x15865a3 total=577 count=7 avg=82: | | | seastar::memory_input_stream<bytes_ostream::fragment_iterator>::with_stream<ser::deletable_row_view::key() const::{lambda(auto:1&)scylladb#1}> at ././seastar/include/seastar/core/simple-stream.hh:491 | | | (inlined by) seastar::with_serialized_stream<seastar::memory_input_stream<bytes_ostream::fragment_iterator> const, ser::deletable_row_view::key() const::{lambda(auto:1&)scylladb#1}, void> at ././seastar/include/seastar/core/simple-stream.hh:639 | | | (inlined by) ser::deletable_row_view::key at ./build/dev/gen/idl/mutation.dist.impl.hh:1264 | | ++[4#1/1 100%] addr=0x157cf27 total=643 count=8 avg=80: | | | mutation_partition_view::do_accept<partition_builder> at ./mutation/mutation_partition_view.cc:212 | | ++ - addr=0x1516cac: | | | mutation_partition::apply at ./mutation/mutation_partition.cc:497 | | ++[5#1/1 100%] addr=0x14e4433 total=1765 count=22 avg=80: | | | canonical_mutation::to_mutation at ./mutation/canonical_mutation.cc:60 | | ++[6#1/2 98%] addr=0x2452a60 total=1732 count=21 avg=82: | | | service::storage_service::merge_topology_snapshot at ./service/storage_service.cc:761 | | ++ - addr=0x2858782: | | | service::group0_state_machine::transfer_snapshot at ./service/raft/group0_state_machine.cc:303 ``` Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Freezing large mutations synchronously may cause reactor stalls, as seen in the test_add_many_nodes_under_load dtest: ``` ++[1#1/37 5%] addr=0x15b0bf total=99 count=2 avg=50: ?? ??:0 | ++[2#1/2 67%] addr=0x15a331f total=66 count=1 avg=66: | | bytes_ostream::write at ././bytes_ostream.hh:248 | | (inlined by) bytes_ostream::write at ././bytes_ostream.hh:263 | | (inlined by) ser::serialize_integral<unsigned int, bytes_ostream> at ././serializer.hh:203 | | (inlined by) ser::integral_serializer<unsigned int>::write<bytes_ostream> at ././serializer.hh:217 | | (inlined by) ser::serialize<unsigned int, bytes_ostream> at ././serializer.hh:254 | | (inlined by) ser::writer_of_column<bytes_ostream>::write_id at ./build/dev/gen/idl/mutation.dist.impl.hh:4680 | | ++[3#1/1 100%] addr=0x159df71 total=132 count=2 avg=66: | | | (anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}::operator() at ./mutation/mutation_partition_serializer.cc:99 | | | (inlined by) row::maybe_invoke_with_hash<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1} const, cell_and_hash const> at ./mutation/mutation_partition.hh:133 | | | (inlined by) row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}::operator() at ./mutation/mutation_partition.hh:152 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>::operator() at ././utils/compact-radix-tree.hh:1888 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::visit_slot<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>&> at ././utils/compact-radix-tree.hh:1560 | | ++ - addr=0x159d84d: | | | compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u>::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>&> at ././utils/compact-radix-tree.hh:1364 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::node_base<cell_and_hash, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::direct_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u> >::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>&, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::direct_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u> > at ././utils/compact-radix-tree.hh:799 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::node_base<cell_and_hash, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::direct_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u> >::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>&> at ././utils/compact-radix-tree.hh:807 | | ++[4#1/1 100%] addr=0x1596f4a total=329 count=5 avg=66: | | | compact_radix_tree::tree<cell_and_hash, unsigned int>::node_head::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true> > at ././utils/compact-radix-tree.hh:473 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true> > at ././utils/compact-radix-tree.hh:1626 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::walk<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}> at ././utils/compact-radix-tree.hh:1909 | | | (inlined by) row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}> at ./mutation/mutation_partition.hh:151 | | | (inlined by) (anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> > at ./mutation/mutation_partition_serializer.cc:97 | | | (inlined by) write_row<ser::writer_of_deletable_row<bytes_ostream> > at ./mutation/mutation_partition_serializer.cc:168 | | ++[5#1/2 80%] addr=0x15a310c total=263 count=4 avg=66: | | | mutation_partition_serializer::write_serialized<ser::writer_of_mutation_partition<bytes_ostream> > at ./mutation/mutation_partition_serializer.cc:180 | | | ++[6#1/2 62%] addr=0x14eb60a total=428 count=7 avg=61: | | | | frozen_mutation::frozen_mutation(mutation const&)::$_0::operator()<ser::writer_of_mutation_partition<bytes_ostream> > at ./mutation/frozen_mutation.cc:85 | | | | (inlined by) ser::after_mutation__key<bytes_ostream>::partition<frozen_mutation::frozen_mutation(mutation const&)::$_0> at ./build/dev/gen/idl/mutation.dist.impl.hh:7058 | | | | (inlined by) frozen_mutation::frozen_mutation at ./mutation/frozen_mutation.cc:84 | | | | ++[7#1/1 100%] addr=0x14ed388 total=532 count=9 avg=59: | | | | | freeze at ./mutation/frozen_mutation.cc:143 | | | | ++[8#1/2 74%] addr=0x252cf55 total=394 count=6 avg=66: | | | | | service::storage_service::merge_topology_snapshot at ./service/storage_service.cc:763 ``` This change adds a preemptible mutation_partition_serializer called from `mutation_partition_serializer::write_in_thread` and latter called from `freeze_in_thread`. A thread is started in `storage_service::merge_topology_snapshot` in which we expect large enough mutations to warrant a seastar thread, and taking into account it is not on the hot data path, but rather it's on the management plane. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Prevent stalls coming from applying large mutations in memory synchronously, like the ones seen with the test_add_many_nodes_under_load dtest: ``` | | | ++[5#2/2 44%] addr=0x1498efb total=256 count=3 avg=85: | | | | replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}::operator() at ./replica/memtable.cc:804 | | | | (inlined by) logalloc::allocating_section::with_reclaiming_disabled<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}&> at ././utils/logalloc.hh:500 | | | | (inlined by) logalloc::allocating_section::operator()<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}>(logalloc::region&, replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}&&)::{lambda()scylladb#1}::operator() at ././utils/logalloc.hh:527 | | | | (inlined by) logalloc::allocating_section::with_reserve<logalloc::allocating_section::operator()<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}>(logalloc::region&, replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}&&)::{lambda()scylladb#1}> at ././utils/logalloc.hh:471 | | | | (inlined by) logalloc::allocating_section::operator()<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}> at ././utils/logalloc.hh:526 | | | | (inlined by) replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator() at ./replica/memtable.cc:800 | | | | (inlined by) with_allocator<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0> at ././utils/allocation_strategy.hh:318 | | | | (inlined by) replica::memtable::apply at ./replica/memtable.cc:799 | | | ++[6#1/1 100%] addr=0x145047b total=1731 count=21 avg=82: | | | | replica::table::do_apply<frozen_mutation const&, seastar::lw_shared_ptr<schema const>&> at ./replica/table.cc:2896 | | | ++[7#1/1 100%] addr=0x13ddccb total=2852 count=32 avg=89: | | | | replica::table::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const>, db::rp_handle&&, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0::operator() at ./replica/table.cc:2924 | | | | (inlined by) seastar::futurize<void>::invoke<replica::table::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const>, db::rp_handle&&, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0&> at ././seastar/include/seastar/core/future.hh:2032 | | | | (inlined by) seastar::futurize_invoke<replica::table::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const>, db::rp_handle&&, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0&> at ././seastar/include/seastar/core/future.hh:2066 | | | | (inlined by) replica::dirty_memory_manager_logalloc::region_group::run_when_memory_available<replica::table::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const>, db::rp_handle&&, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0> at ./replica/dirty_memory_manager.hh:572 | | | | (inlined by) replica::table::apply at ./replica/table.cc:2923 | | | ++ - addr=0x1330ba1: | | | | replica::database::apply_in_memory at ./replica/database.cc:1812 | | | ++ - addr=0x1360054: | | | | replica::database::do_apply at ./replica/database.cc:2032 ``` This change has virtuall no effect on small mutations (up to 128KB in size). build/release/scylla perf-simple-query --write Before: median 1637412.01 tps ( 58.3 allocs/op, 14.2 tasks/op, 53782 insns/op, 0 errors) After: median 1621789.02 tps ( 58.3 allocs/op, 14.2 tasks/op, 53772 insns/op, 0 errors) To estimate the performance ramifications on large mutations, I measured perf-simple-query --write calling unfreeze_and_apply_in_memory in all cases: median 1580850.76 tps ( 70.3 allocs/op, 14.2 tasks/op, 54240 insns/op, 0 errors) Showing the extra allocations and <1% cpu overhead required for applying mutations vs. frozen mutations. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Perevent stalls from "unpacking" of large canonical mutations seen with test_add_many_nodes_under_load when called from `group0_state_machine::transfer_snapshot`: ``` ++[1#1/44 14%] addr=0x395b2f total=569 count=6 avg=95: ?? ??:0 | ++[2#1/2 56%] addr=0x3991e3 total=321 count=4 avg=80: ?? ??:0 | ++ - addr=0x1587159: | | std::__new_allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> >::allocate at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/new_allocator.h:147 | | (inlined by) std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> >::allocate at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/allocator.h:198 | | (inlined by) std::allocator_traits<std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > >::allocate at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/alloc_traits.h:482 | | (inlined by) std::_Vector_base<seastar::basic_sstring<signed char, unsigned int, 31u, false>, std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > >::_M_allocate at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/stl_vector.h:378 | | (inlined by) std::vector<seastar::basic_sstring<signed char, unsigned int, 31u, false>, std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > >::reserve at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/vector.tcc:79 | | (inlined by) ser::idl::serializers::internal::vector_serializer<std::vector<seastar::basic_sstring<signed char, unsigned int, 31u, false>, std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > > >::read<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ././serializer_impl.hh:226 | | (inlined by) ser::deserialize<std::vector<seastar::basic_sstring<signed char, unsigned int, 31u, false>, std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > >, seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ././serializer.hh:264 | | (inlined by) ser::serializer<clustering_key_prefix>::read<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> >(seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator>&)::{lambda(auto:1&)scylladb#1}::operator()<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ./build/dev/gen/idl/keys.dist.impl.hh:31 | ++ - addr=0x1587085: | | seastar::with_serialized_stream<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator>, ser::serializer<clustering_key_prefix>::read<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> >(seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator>&)::{lambda(auto:1&)scylladb#1}, void, void> at ././seastar/include/seastar/core/simple-stream.hh:646 | | (inlined by) ser::serializer<clustering_key_prefix>::read<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ./build/dev/gen/idl/keys.dist.impl.hh:28 | | (inlined by) ser::deserialize<clustering_key_prefix, seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ././serializer.hh:264 | | (inlined by) ser::deletable_row_view::key() const::{lambda(auto:1&)scylladb#1}::operator()<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> const> at ./build/dev/gen/idl/mutation.dist.impl.hh:1268 | | ++[3#1/1 100%] addr=0x15865a3 total=577 count=7 avg=82: | | | seastar::memory_input_stream<bytes_ostream::fragment_iterator>::with_stream<ser::deletable_row_view::key() const::{lambda(auto:1&)scylladb#1}> at ././seastar/include/seastar/core/simple-stream.hh:491 | | | (inlined by) seastar::with_serialized_stream<seastar::memory_input_stream<bytes_ostream::fragment_iterator> const, ser::deletable_row_view::key() const::{lambda(auto:1&)scylladb#1}, void> at ././seastar/include/seastar/core/simple-stream.hh:639 | | | (inlined by) ser::deletable_row_view::key at ./build/dev/gen/idl/mutation.dist.impl.hh:1264 | | ++[4#1/1 100%] addr=0x157cf27 total=643 count=8 avg=80: | | | mutation_partition_view::do_accept<partition_builder> at ./mutation/mutation_partition_view.cc:212 | | ++ - addr=0x1516cac: | | | mutation_partition::apply at ./mutation/mutation_partition.cc:497 | | ++[5#1/1 100%] addr=0x14e4433 total=1765 count=22 avg=80: | | | canonical_mutation::to_mutation at ./mutation/canonical_mutation.cc:60 | | ++[6#1/2 98%] addr=0x2452a60 total=1732 count=21 avg=82: | | | service::storage_service::merge_topology_snapshot at ./service/storage_service.cc:761 | | ++ - addr=0x2858782: | | | service::group0_state_machine::transfer_snapshot at ./service/raft/group0_state_machine.cc:303 ``` Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Freezing large mutations synchronously may cause reactor stalls, as seen in the test_add_many_nodes_under_load dtest: ``` ++[1#1/37 5%] addr=0x15b0bf total=99 count=2 avg=50: ?? ??:0 | ++[2#1/2 67%] addr=0x15a331f total=66 count=1 avg=66: | | bytes_ostream::write at ././bytes_ostream.hh:248 | | (inlined by) bytes_ostream::write at ././bytes_ostream.hh:263 | | (inlined by) ser::serialize_integral<unsigned int, bytes_ostream> at ././serializer.hh:203 | | (inlined by) ser::integral_serializer<unsigned int>::write<bytes_ostream> at ././serializer.hh:217 | | (inlined by) ser::serialize<unsigned int, bytes_ostream> at ././serializer.hh:254 | | (inlined by) ser::writer_of_column<bytes_ostream>::write_id at ./build/dev/gen/idl/mutation.dist.impl.hh:4680 | | ++[3#1/1 100%] addr=0x159df71 total=132 count=2 avg=66: | | | (anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}::operator() at ./mutation/mutation_partition_serializer.cc:99 | | | (inlined by) row::maybe_invoke_with_hash<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1} const, cell_and_hash const> at ./mutation/mutation_partition.hh:133 | | | (inlined by) row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}::operator() at ./mutation/mutation_partition.hh:152 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>::operator() at ././utils/compact-radix-tree.hh:1888 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::visit_slot<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>&> at ././utils/compact-radix-tree.hh:1560 | | ++ - addr=0x159d84d: | | | compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u>::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>&> at ././utils/compact-radix-tree.hh:1364 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::node_base<cell_and_hash, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::direct_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u> >::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>&, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::direct_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u> > at ././utils/compact-radix-tree.hh:799 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::node_base<cell_and_hash, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::direct_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u> >::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>&> at ././utils/compact-radix-tree.hh:807 | | ++[4#1/1 100%] addr=0x1596f4a total=329 count=5 avg=66: | | | compact_radix_tree::tree<cell_and_hash, unsigned int>::node_head::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true> > at ././utils/compact-radix-tree.hh:473 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true> > at ././utils/compact-radix-tree.hh:1626 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::walk<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}> at ././utils/compact-radix-tree.hh:1909 | | | (inlined by) row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}> at ./mutation/mutation_partition.hh:151 | | | (inlined by) (anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> > at ./mutation/mutation_partition_serializer.cc:97 | | | (inlined by) write_row<ser::writer_of_deletable_row<bytes_ostream> > at ./mutation/mutation_partition_serializer.cc:168 | | ++[5#1/2 80%] addr=0x15a310c total=263 count=4 avg=66: | | | mutation_partition_serializer::write_serialized<ser::writer_of_mutation_partition<bytes_ostream> > at ./mutation/mutation_partition_serializer.cc:180 | | | ++[6#1/2 62%] addr=0x14eb60a total=428 count=7 avg=61: | | | | frozen_mutation::frozen_mutation(mutation const&)::$_0::operator()<ser::writer_of_mutation_partition<bytes_ostream> > at ./mutation/frozen_mutation.cc:85 | | | | (inlined by) ser::after_mutation__key<bytes_ostream>::partition<frozen_mutation::frozen_mutation(mutation const&)::$_0> at ./build/dev/gen/idl/mutation.dist.impl.hh:7058 | | | | (inlined by) frozen_mutation::frozen_mutation at ./mutation/frozen_mutation.cc:84 | | | | ++[7#1/1 100%] addr=0x14ed388 total=532 count=9 avg=59: | | | | | freeze at ./mutation/frozen_mutation.cc:143 | | | | ++[8#1/2 74%] addr=0x252cf55 total=394 count=6 avg=66: | | | | | service::storage_service::merge_topology_snapshot at ./service/storage_service.cc:763 ``` This change uses freeze_gently to freeze the cdc_generations_v2 mutations one at a time to prevent the stalls reported above. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Prevent stalls coming from applying large mutations in memory synchronously, like the ones seen with the test_add_many_nodes_under_load dtest: ``` | | | ++[5#2/2 44%] addr=0x1498efb total=256 count=3 avg=85: | | | | replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}::operator() at ./replica/memtable.cc:804 | | | | (inlined by) logalloc::allocating_section::with_reclaiming_disabled<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}&> at ././utils/logalloc.hh:500 | | | | (inlined by) logalloc::allocating_section::operator()<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}>(logalloc::region&, replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}&&)::{lambda()scylladb#1}::operator() at ././utils/logalloc.hh:527 | | | | (inlined by) logalloc::allocating_section::with_reserve<logalloc::allocating_section::operator()<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}>(logalloc::region&, replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}&&)::{lambda()scylladb#1}> at ././utils/logalloc.hh:471 | | | | (inlined by) logalloc::allocating_section::operator()<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}> at ././utils/logalloc.hh:526 | | | | (inlined by) replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator() at ./replica/memtable.cc:800 | | | | (inlined by) with_allocator<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0> at ././utils/allocation_strategy.hh:318 | | | | (inlined by) replica::memtable::apply at ./replica/memtable.cc:799 | | | ++[6#1/1 100%] addr=0x145047b total=1731 count=21 avg=82: | | | | replica::table::do_apply<frozen_mutation const&, seastar::lw_shared_ptr<schema const>&> at ./replica/table.cc:2896 | | | ++[7#1/1 100%] addr=0x13ddccb total=2852 count=32 avg=89: | | | | replica::table::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const>, db::rp_handle&&, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0::operator() at ./replica/table.cc:2924 | | | | (inlined by) seastar::futurize<void>::invoke<replica::table::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const>, db::rp_handle&&, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0&> at ././seastar/include/seastar/core/future.hh:2032 | | | | (inlined by) seastar::futurize_invoke<replica::table::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const>, db::rp_handle&&, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0&> at ././seastar/include/seastar/core/future.hh:2066 | | | | (inlined by) replica::dirty_memory_manager_logalloc::region_group::run_when_memory_available<replica::table::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const>, db::rp_handle&&, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0> at ./replica/dirty_memory_manager.hh:572 | | | | (inlined by) replica::table::apply at ./replica/table.cc:2923 | | | ++ - addr=0x1330ba1: | | | | replica::database::apply_in_memory at ./replica/database.cc:1812 | | | ++ - addr=0x1360054: | | | | replica::database::do_apply at ./replica/database.cc:2032 ``` This change has virtually no effect on small mutations (up to 128KB in size). build/release/scylla perf-simple-query --write --default-log-level=error --random-seed=1 -c 1 Before: median 80092.06 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 53291 insns/op, 0 errors) After: median 78780.86 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 53311 insns/op, 0 errors) To estimate the performance ramifications on large mutations, I measured perf-simple-query --write calling unfreeze_gently in all cases: median 77411.26 tps ( 71.3 allocs/op, 8.0 logallocs/op, 14.3 tasks/op, 53280 insns/op, 0 errors) Showing the allocations that moved out of logalloc (in memtable::apply of frozen_mutation) into seastar allocations (in unfreeze_gently) and <1% cpu overhead. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Perevent stalls from "unpacking" of large canonical mutations seen with test_add_many_nodes_under_load when called from `group0_state_machine::transfer_snapshot`: ``` ++[1#1/44 14%] addr=0x395b2f total=569 count=6 avg=95: ?? ??:0 | ++[2#1/2 56%] addr=0x3991e3 total=321 count=4 avg=80: ?? ??:0 | ++ - addr=0x1587159: | | std::__new_allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> >::allocate at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/new_allocator.h:147 | | (inlined by) std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> >::allocate at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/allocator.h:198 | | (inlined by) std::allocator_traits<std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > >::allocate at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/alloc_traits.h:482 | | (inlined by) std::_Vector_base<seastar::basic_sstring<signed char, unsigned int, 31u, false>, std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > >::_M_allocate at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/stl_vector.h:378 | | (inlined by) std::vector<seastar::basic_sstring<signed char, unsigned int, 31u, false>, std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > >::reserve at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/vector.tcc:79 | | (inlined by) ser::idl::serializers::internal::vector_serializer<std::vector<seastar::basic_sstring<signed char, unsigned int, 31u, false>, std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > > >::read<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ././serializer_impl.hh:226 | | (inlined by) ser::deserialize<std::vector<seastar::basic_sstring<signed char, unsigned int, 31u, false>, std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > >, seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ././serializer.hh:264 | | (inlined by) ser::serializer<clustering_key_prefix>::read<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> >(seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator>&)::{lambda(auto:1&)scylladb#1}::operator()<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ./build/dev/gen/idl/keys.dist.impl.hh:31 | ++ - addr=0x1587085: | | seastar::with_serialized_stream<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator>, ser::serializer<clustering_key_prefix>::read<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> >(seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator>&)::{lambda(auto:1&)scylladb#1}, void, void> at ././seastar/include/seastar/core/simple-stream.hh:646 | | (inlined by) ser::serializer<clustering_key_prefix>::read<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ./build/dev/gen/idl/keys.dist.impl.hh:28 | | (inlined by) ser::deserialize<clustering_key_prefix, seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ././serializer.hh:264 | | (inlined by) ser::deletable_row_view::key() const::{lambda(auto:1&)scylladb#1}::operator()<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> const> at ./build/dev/gen/idl/mutation.dist.impl.hh:1268 | | ++[3#1/1 100%] addr=0x15865a3 total=577 count=7 avg=82: | | | seastar::memory_input_stream<bytes_ostream::fragment_iterator>::with_stream<ser::deletable_row_view::key() const::{lambda(auto:1&)scylladb#1}> at ././seastar/include/seastar/core/simple-stream.hh:491 | | | (inlined by) seastar::with_serialized_stream<seastar::memory_input_stream<bytes_ostream::fragment_iterator> const, ser::deletable_row_view::key() const::{lambda(auto:1&)scylladb#1}, void> at ././seastar/include/seastar/core/simple-stream.hh:639 | | | (inlined by) ser::deletable_row_view::key at ./build/dev/gen/idl/mutation.dist.impl.hh:1264 | | ++[4#1/1 100%] addr=0x157cf27 total=643 count=8 avg=80: | | | mutation_partition_view::do_accept<partition_builder> at ./mutation/mutation_partition_view.cc:212 | | ++ - addr=0x1516cac: | | | mutation_partition::apply at ./mutation/mutation_partition.cc:497 | | ++[5#1/1 100%] addr=0x14e4433 total=1765 count=22 avg=80: | | | canonical_mutation::to_mutation at ./mutation/canonical_mutation.cc:60 | | ++[6#1/2 98%] addr=0x2452a60 total=1732 count=21 avg=82: | | | service::storage_service::merge_topology_snapshot at ./service/storage_service.cc:761 | | ++ - addr=0x2858782: | | | service::group0_state_machine::transfer_snapshot at ./service/raft/group0_state_machine.cc:303 ``` Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Freezing large mutations synchronously may cause reactor stalls, as seen in the test_add_many_nodes_under_load dtest: ``` ++[1#1/37 5%] addr=0x15b0bf total=99 count=2 avg=50: ?? ??:0 | ++[2#1/2 67%] addr=0x15a331f total=66 count=1 avg=66: | | bytes_ostream::write at ././bytes_ostream.hh:248 | | (inlined by) bytes_ostream::write at ././bytes_ostream.hh:263 | | (inlined by) ser::serialize_integral<unsigned int, bytes_ostream> at ././serializer.hh:203 | | (inlined by) ser::integral_serializer<unsigned int>::write<bytes_ostream> at ././serializer.hh:217 | | (inlined by) ser::serialize<unsigned int, bytes_ostream> at ././serializer.hh:254 | | (inlined by) ser::writer_of_column<bytes_ostream>::write_id at ./build/dev/gen/idl/mutation.dist.impl.hh:4680 | | ++[3#1/1 100%] addr=0x159df71 total=132 count=2 avg=66: | | | (anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}::operator() at ./mutation/mutation_partition_serializer.cc:99 | | | (inlined by) row::maybe_invoke_with_hash<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1} const, cell_and_hash const> at ./mutation/mutation_partition.hh:133 | | | (inlined by) row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}::operator() at ./mutation/mutation_partition.hh:152 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>::operator() at ././utils/compact-radix-tree.hh:1888 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::visit_slot<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>&> at ././utils/compact-radix-tree.hh:1560 | | ++ - addr=0x159d84d: | | | compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u>::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>&> at ././utils/compact-radix-tree.hh:1364 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::node_base<cell_and_hash, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::direct_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u> >::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>&, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::direct_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u> > at ././utils/compact-radix-tree.hh:799 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::node_base<cell_and_hash, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::direct_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u> >::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true>&> at ././utils/compact-radix-tree.hh:807 | | ++[4#1/1 100%] addr=0x1596f4a total=329 count=5 avg=66: | | | compact_radix_tree::tree<cell_and_hash, unsigned int>::node_head::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true> > at ././utils/compact-radix-tree.hh:473 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}, true> > at ././utils/compact-radix-tree.hh:1626 | | | (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::walk<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)scylladb#1}> at ././utils/compact-radix-tree.hh:1909 | | | (inlined by) row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)scylladb#1}> at ./mutation/mutation_partition.hh:151 | | | (inlined by) (anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> > at ./mutation/mutation_partition_serializer.cc:97 | | | (inlined by) write_row<ser::writer_of_deletable_row<bytes_ostream> > at ./mutation/mutation_partition_serializer.cc:168 | | ++[5#1/2 80%] addr=0x15a310c total=263 count=4 avg=66: | | | mutation_partition_serializer::write_serialized<ser::writer_of_mutation_partition<bytes_ostream> > at ./mutation/mutation_partition_serializer.cc:180 | | | ++[6#1/2 62%] addr=0x14eb60a total=428 count=7 avg=61: | | | | frozen_mutation::frozen_mutation(mutation const&)::$_0::operator()<ser::writer_of_mutation_partition<bytes_ostream> > at ./mutation/frozen_mutation.cc:85 | | | | (inlined by) ser::after_mutation__key<bytes_ostream>::partition<frozen_mutation::frozen_mutation(mutation const&)::$_0> at ./build/dev/gen/idl/mutation.dist.impl.hh:7058 | | | | (inlined by) frozen_mutation::frozen_mutation at ./mutation/frozen_mutation.cc:84 | | | | ++[7#1/1 100%] addr=0x14ed388 total=532 count=9 avg=59: | | | | | freeze at ./mutation/frozen_mutation.cc:143 | | | | ++[8#1/2 74%] addr=0x252cf55 total=394 count=6 avg=66: | | | | | service::storage_service::merge_topology_snapshot at ./service/storage_service.cc:763 ``` This change uses freeze_gently to freeze the cdc_generations_v2 mutations one at a time to prevent the stalls reported above. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Prevent stalls coming from applying large mutations in memory synchronously, like the ones seen with the test_add_many_nodes_under_load dtest: ``` | | | ++[5#2/2 44%] addr=0x1498efb total=256 count=3 avg=85: | | | | replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}::operator() at ./replica/memtable.cc:804 | | | | (inlined by) logalloc::allocating_section::with_reclaiming_disabled<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}&> at ././utils/logalloc.hh:500 | | | | (inlined by) logalloc::allocating_section::operator()<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}>(logalloc::region&, replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}&&)::{lambda()scylladb#1}::operator() at ././utils/logalloc.hh:527 | | | | (inlined by) logalloc::allocating_section::with_reserve<logalloc::allocating_section::operator()<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}>(logalloc::region&, replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}&&)::{lambda()scylladb#1}> at ././utils/logalloc.hh:471 | | | | (inlined by) logalloc::allocating_section::operator()<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()scylladb#1}> at ././utils/logalloc.hh:526 | | | | (inlined by) replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator() at ./replica/memtable.cc:800 | | | | (inlined by) with_allocator<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0> at ././utils/allocation_strategy.hh:318 | | | | (inlined by) replica::memtable::apply at ./replica/memtable.cc:799 | | | ++[6#1/1 100%] addr=0x145047b total=1731 count=21 avg=82: | | | | replica::table::do_apply<frozen_mutation const&, seastar::lw_shared_ptr<schema const>&> at ./replica/table.cc:2896 | | | ++[7#1/1 100%] addr=0x13ddccb total=2852 count=32 avg=89: | | | | replica::table::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const>, db::rp_handle&&, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0::operator() at ./replica/table.cc:2924 | | | | (inlined by) seastar::futurize<void>::invoke<replica::table::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const>, db::rp_handle&&, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0&> at ././seastar/include/seastar/core/future.hh:2032 | | | | (inlined by) seastar::futurize_invoke<replica::table::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const>, db::rp_handle&&, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0&> at ././seastar/include/seastar/core/future.hh:2066 | | | | (inlined by) replica::dirty_memory_manager_logalloc::region_group::run_when_memory_available<replica::table::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const>, db::rp_handle&&, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0> at ./replica/dirty_memory_manager.hh:572 | | | | (inlined by) replica::table::apply at ./replica/table.cc:2923 | | | ++ - addr=0x1330ba1: | | | | replica::database::apply_in_memory at ./replica/database.cc:1812 | | | ++ - addr=0x1360054: | | | | replica::database::do_apply at ./replica/database.cc:2032 ``` This change has virtually no effect on small mutations (up to 128KB in size). build/release/scylla perf-simple-query --write --default-log-level=error --random-seed=1 -c 1 Before: median 80092.06 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 53291 insns/op, 0 errors) After: median 78780.86 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 53311 insns/op, 0 errors) To estimate the performance ramifications on large mutations, I measured perf-simple-query --write calling unfreeze_gently in all cases: median 77411.26 tps ( 71.3 allocs/op, 8.0 logallocs/op, 14.3 tasks/op, 53280 insns/op, 0 errors) Showing the allocations that moved out of logalloc (in memtable::apply of frozen_mutation) into seastar allocations (in unfreeze_gently) and <1% cpu overhead. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Today, a fresh "configure.py --mode=release; ninja" takes a whopping 9 minutes on my machine.
All to often, chaning just one (header) file results in a large part of this being redone.
Life is too short for this :-)
We could try to improve these figures by reducing the inclusion of header files, moving more things to .cc files, not building all the tests by default, and perhaps other tricks.
P.S. as a reference to just how slow 9 minutes is, consider that a fresh compile of OSv on my machine takes just 4 minutes. Moreover in OSv, most changes require recompiling only a few files, and take just a few seconds.
The text was updated successfully, but these errors were encountered: