Useless linearization on large data during validation, of either type bytes or string-derived, potentially cause stalls due to reclaiming #7318

raphaelsc · 2020-10-01T21:55:41Z

The type bytes_type_impl doesn't need any validation, and so no linearization. But due to a regression introduced in commit b175657, bytes_typoe_impl no longer overrides validate(const fragmented_temporary_buffer::view&, ...), Meaning linearization is performed on bytes type even though it needs no validation. Found out this while auditing code (comparing old versions with the current one)

Large data here means data > 128k, as that's the default fragment size used by fragmented_temporary_buffer. Data smaller than that will not need any linearization.

Also, string-derived types are performing linearization needlessly, given that validation checks individual bytes and that could be done on the fragmented buffer instead.

Linearization can try to allocate a large contiguous buffer to place large data, potentially triggering memory reclaiming and causing reactor stalls.

Follow a stall backtrace coming from that linearization on a string-derived type:

 (inlined by) _M_invoke at /opt/scylladb/include/c++/7/bits/std_function.h:302
std::function<seastar::memory::reclaiming_result ()>::operator()() const at /opt/scylladb/include/c++/7/bits/std_function.h:706
 (inlined by) seastar::memory::reclaimer::do_reclaim() at /usr/src/debug/scylla-enterprise-2019.1.12/seastar/core/memory.hh:124
 (inlined by) seastar::memory::cpu_pages::run_reclaimers(seastar::memory::reclaimer_scope) at /usr/src/debug/scylla-enterprise-2019.1.12/seastar/core/memory.cc:1013
 (inlined by) seastar::memory::cpu_pages::find_and_unlink_span_reclaiming(unsigned int) at /usr/src/debug/scylla-enterprise-2019.1.12/seastar/core/memory.cc:555
 (inlined by) seastar::memory::cpu_pages::allocate_large_and_trim(unsigned int) at /usr/src/debug/scylla-enterprise-2019.1.12/seastar/core/memory.cc:578
seastar::memory::cpu_pages::allocate_large(unsigned int) at /usr/src/debug/scylla-enterprise-2019.1.12/seastar/core/memory.cc:624
 (inlined by) seastar::memory::allocate_large(unsigned long) at /usr/src/debug/scylla-enterprise-2019.1.12/seastar/core/memory.cc:1194
 (inlined by) seastar::memory::allocate(unsigned long) at /usr/src/debug/scylla-enterprise-2019.1.12/seastar/core/memory.cc:1244
operator new(unsigned long) at /usr/src/debug/scylla-enterprise-2019.1.12/seastar/core/memory.cc:1647
__gnu_cxx::new_allocator<char>::allocate(unsigned long, void const*) at /opt/scylladb/include/c++/7/ext/new_allocator.h:111
 (inlined by) std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) at /opt/scylladb/include/c++/7/bits/basic_string.tcc:1057
 (inlined by) std::string::_Rep::_M_clone(std::allocator<char> const&, unsigned long) at /opt/scylladb/include/c++/7/bits/basic_string.tcc:1078
 (inlined by) std::string::reserve(unsigned long) at /opt/scylladb/include/c++/7/bits/basic_string.tcc:960
std::basic_string<char, std::char_traits<char>, std::allocator<char> > boost::locale::conv::utf_to_utf<char, signed char>(signed char const*, signed char const*, boost::locale::conv::method_type) at /opt/scylladb/include/boost/locale/encoding_utf.hpp:37
string_type_impl::validate(std::experimental::fundamentals_v1::basic_string_view<signed char, std::char_traits<signed char> >) const at /usr/src/debug/scylla-enterprise-2019.1.12/types.cc:333
abstract_type::validate(fragmented_temporary_buffer::view const&) const::{lambda(std::experimental::fundamentals_v1::basic_string_view<signed char, std::char_traits<signed char> >)#1}::operator()(std::experimental::fundamentals_v1::basic_string_view<signed char, std::char_traits<signed char> >) const at /usr/src/debug/scylla-enterprise-2019.1.12/./types.hh:470
 (inlined by) _Z15with_linearizedIN27fragmented_temporary_buffer4viewEZNK13abstract_type8validateERKS1_EUlNSt12experimental15fundamentals_v117basic_string_viewIaSt11char_traitsIaEEEE_EDcRKT_OT0_ at /usr/src/debug/scylla-enterprise-2019.1.12/utils/fragment_range.hh:137
abstract_type::validate(fragmented_temporary_buffer::view const&) const at /usr/src/debug/scylla-enterprise-2019.1.12/./types.hh:469
 (inlined by) cql3::constants::marker::bind_and_get(cql3::query_options const&) at /usr/src/debug/scylla-enterprise-2019.1.12/cql3/constants.hh:172
cql3::constants::setter::execute(mutation&, clustering_key_prefix const&, cql3::update_parameters const&) at /usr/src/debug/scylla-enterprise-2019.1.12/cql3/constants.hh:194
cql3::statements::update_statement::execute_operations_for_key(mutation&, clustering_key_prefix const&, cql3::update_parameters const&, std::optional<std::unordered_map<seastar::basic_sstring<char, unsigned int, 15u, true>, std::experimental::fundamentals_v1::optional<seastar::basic_sstring<signed char, unsigned int, 31u, false> >, std::hash<seastar::basic_sstring<char, unsigned int, 15u, true> >, std::equal_to<seastar::basic_sstring<char, unsigned int, 15u, true> >, std::allocator<std::pair<seastar::basic_sstring<char, unsigned int, 15u, true> const, std::experimental::fundamentals_v1::optional<seastar::basic_sstring<signed char, unsigned int, 31u, false> > > > > > const&) at /usr/src/debug/scylla-enterprise-2019.1.12/cql3/statements/update_statement.cc:112

The text was updated successfully, but these errors were encountered:

raphaelsc · 2020-10-02T02:19:34Z

This following patch will restore the old behavior where linearization on bytes (blob) type is avoided:

diff --git a/concrete_types.hh b/concrete_types.hh
index c80e5dd08..c3bd920a1 100644
--- a/concrete_types.hh
+++ b/concrete_types.hh
@@ -119,6 +119,7 @@ struct utf8_type_impl final : public string_type_impl {
 
 struct bytes_type_impl final : public concrete_type<bytes> {
     bytes_type_impl();
+    virtual void validate(const fragmented_temporary_buffer::view&, cql_serialization_format) const override { }
 };

avikivity · 2020-10-03T18:44:19Z

Good catch!

slivne · 2020-10-04T09:52:55Z

@raphaelsc is out this week @bhalevy can you please have someone send this fix (we have issues in the field because of that - especially when users have large blobs)

bhalevy · 2020-10-04T12:28:04Z

@denesb can you please look into this? thanks

bhalevy · 2020-10-04T13:45:52Z

@raphaelsc did you close the issue on purpose?
If so, please provide more information.

raphaelsc · 2020-10-04T14:53:58Z

@raphaelsc did you close the issue on purpose?
If so, please provide more information.

No, by mistake. Sorry

denesb · 2020-10-05T06:33:08Z

I think overriding validate() would be a step back here, instead I'm going to remove the eager linearization from:

512      virtual void validate(const fragmented_temporary_buffer::view& view, cql_serialization_format sf) const {
   1         with_linearized(view, [this, sf] (bytes_view bv) {
   2             validate(bv, sf);
   3         });
   4     }

and have individual validating methods in the validator visitor decide whether they need linearising or not.

raphaelsc · 2020-10-05T17:45:35Z

I think overriding validate() would be a step back here, instead I'm going to remove the eager linearization from:
512      virtual void validate(const fragmented_temporary_buffer::view& view, cql_serialization_format sf) const {
   1         with_linearized(view, [this, sf] (bytes_view bv) {
   2             validate(bv, sf);
   3         });
   4     }
and have individual validating methods in the validator visitor decide whether they need linearising or not.

LGTM. Thanks for working on this, @denesb!

vladzcloudius · 2020-10-28T23:36:08Z

@avikivity @slivne Please, consider backporting this to all supported releases.

@eliransin @eyalgutkind @amihayday FYI

raphaelsc · 2020-10-29T00:36:57Z

Btw you're also interested in #7393, which was extracted out of this issue into a new one

…

On Wed, Oct 28, 2020, 8:36 PM vladzcloudius ***@***.***> wrote: @avikivity <https://github.com/avikivity> @slivne <https://github.com/slivne> Please, consider backporting this to all supported releases. @eliransin <https://github.com/eliransin> @eyalgutkind <https://github.com/eyalgutkind> @amihayday <https://github.com/amihayday> FYI — You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub <#7318 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAKYA4YRLP4SRUC6BOGRUWTSNCTHLANCNFSM4SA7JUWA> .

raphaelsc · 2020-10-29T00:40:57Z

Also, #7457 still needs to be fixed for this linearization issue to be really fixed. But backporting these available fixes will already help reducing the pressure on memory allocator. I plan to fix #7457 soon

avikivity · 2020-11-08T17:19:08Z

This isn't really a regression, the original code also used bytes_view. So not backporting.

vladzcloudius · 2020-11-09T13:59:08Z

This isn't really a regression, the original code also used bytes_view. So not backporting.

@avikivity Don't you think that it's worth backporting even if it's not a regression considering the comment #7318 (comment) ?

raphaelsc · 2020-11-09T13:59:20Z

This isn't really a regression, the original code also used bytes_view. So not backporting.

Actually it's a regression. The original code also used bytes_view in the validator, but before devirtualizing the types, we had bytes_type_impl overriding virtual void validate(const fragmented_temporary_buffer::view&, cql_serialization_format) const override { }, so the validation function which receives a bytes_view was never called for bytes type, meaning no linearization was performed.

raphaelsc · 2020-11-09T14:01:31Z

Given the latency impact on users produced by linearization on validation, I think the fix for this issue in addition to the one for #7393 are worth of backport.

avikivity · 2020-11-09T14:16:01Z

Ah, I looked at strings (which always need validation).

You're right that it is a regression. But won't we have to linearize in anyway, given that we'll convert it to bytes soon?

raphaelsc · 2020-11-09T14:22:28Z

Ah, I looked at strings (which always need validation).

You're right that it is a regression. But won't we have to linearize in anyway, given that we'll convert it to bytes soon?

If I understand correctly, the blob will be kept fragmented (within managed_bytes) in memtable, but it will indeed be linearized as soon as it reaches the SSTable layer. So we need #7457 fixed too (we need to extend its scope to writer too, not only parser)

raphaelsc · 2020-11-09T15:17:11Z

Ah, I looked at strings (which always need validation).
You're right that it is a regression. But won't we have to linearize in anyway, given that we'll convert it to bytes soon?

If I understand correctly, the blob will be kept fragmented (within managed_bytes) in memtable, but it will indeed be linearized as soon as it reaches the SSTable layer. So we need #7457 fixed too (we need to extend its scope to writer too, not only parser)

FWIW, confirmed this with large mem allocation detector, a large blob is only linearized when memtable is flushed into a sstable. Backporting these available fixes will reduce the # of times we linearize a given large blob from 2 to 1, which alleviates the pain to some extent, but we need to fix #7457 for the problem to be completely fixed.

Instead of eagerly linearizing all values as they are passed to validate(), defer linearization to those validators that actually need linearized values. Linearizing large values puts pressure on the memory allocator with large contiguous allocation requests. This is something we are trying to actively avoid, especially if it is not really neaded. Turns out the types, whose validators really want linearized values are a minority, as most validators just look at the size of the value, and some like bytes don't need validation at all, while usually having large values. This is achieved by templating the validator struct on the view and using the FragmentedRange concept to treat all passed in views (`bytes_view` and `fragmented_temporary_buffer_view`) uniformly. This patch makes no attempt at converting existing validators to work with fragmented buffers, only trivial cases are converted. The major offenders still left are ascii/utf8 and collections. Fixes: #7318 Tests: unit(dev) Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20201007054524.909420-1-bdenes@scylladb.com> (cherry picked from commit db56ae6)

avikivity · 2020-11-11T08:57:02Z

Backported to 4.1, 4.2 (4.3 has it already)

Instead of eagerly linearizing all values as they are passed to validate(), defer linearization to those validators that actually need linearized values. Linearizing large values puts pressure on the memory allocator with large contiguous allocation requests. This is something we are trying to actively avoid, especially if it is not really neaded. Turns out the types, whose validators really want linearized values are a minority, as most validators just look at the size of the value, and some like bytes don't need validation at all, while usually having large values. This is achieved by templating the validator struct on the view and using the FragmentedRange concept to treat all passed in views (`bytes_view` and `fragmented_temporary_buffer_view`) uniformly. This patch makes no attempt at converting existing validators to work with fragmented buffers, only trivial cases are converted. The major offenders still left are ascii/utf8 and collections. Fixes: #7318 Tests: unit(dev) Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20201007054524.909420-1-bdenes@scylladb.com> (cherry picked from commit db56ae6) [avi: squashed ed6775c ("types: adjust validation_visitor construction for clang") as gcc 9 in scylla 4.1 suffers from the same problem as clang 11]

Instead of eagerly linearizing all values as they are passed to validate(), defer linearization to those validators that actually need linearized values. Linearizing large values puts pressure on the memory allocator with large contiguous allocation requests. This is something we are trying to actively avoid, especially if it is not really neaded. Turns out the types, whose validators really want linearized values are a minority, as most validators just look at the size of the value, and some like bytes don't need validation at all, while usually having large values. This is achieved by templating the validator struct on the view and using the FragmentedRange concept to treat all passed in views (`bytes_view` and `fragmented_temporary_buffer_view`) uniformly. This patch makes no attempt at converting existing validators to work with fragmented buffers, only trivial cases are converted. The major offenders still left are ascii/utf8 and collections. Fixes: scylladb#7318 Tests: unit(dev) Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20201007054524.909420-1-bdenes@scylladb.com> (cherry picked from commit db56ae6) [avi: squashed ed6775c ("types: adjust validation_visitor construction for clang") as gcc 9 in scylla 4.1 suffers from the same problem as clang 11]

raphaelsc self-assigned this Oct 1, 2020

raphaelsc added symptom/latency symptom/performance Issues causing performance problems status/regression and removed status/regression symptom/latency labels Oct 1, 2020

slivne assigned bhalevy Oct 4, 2020

slivne added this to the 4.3 milestone Oct 4, 2020

slivne added the bug label Oct 4, 2020

bhalevy assigned denesb Oct 4, 2020

raphaelsc closed this as completed Oct 4, 2020

raphaelsc reopened this Oct 4, 2020

scylladb-promoter added the Backport candidate label Oct 7, 2020

scylladb-promoter closed this as completed in db56ae6 Oct 7, 2020

avikivity removed Backport candidate status/regression labels Nov 8, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Useless linearization on large data during validation, of either type bytes or string-derived, potentially cause stalls due to reclaiming #7318

Useless linearization on large data during validation, of either type bytes or string-derived, potentially cause stalls due to reclaiming #7318

raphaelsc commented Oct 1, 2020 •

edited

Loading

raphaelsc commented Oct 2, 2020 •

edited

Loading

avikivity commented Oct 3, 2020

slivne commented Oct 4, 2020

bhalevy commented Oct 4, 2020

bhalevy commented Oct 4, 2020

raphaelsc commented Oct 4, 2020

denesb commented Oct 5, 2020 •

edited

Loading

raphaelsc commented Oct 5, 2020

vladzcloudius commented Oct 28, 2020

raphaelsc commented Oct 29, 2020 via email

raphaelsc commented Oct 29, 2020

avikivity commented Nov 8, 2020

vladzcloudius commented Nov 9, 2020

raphaelsc commented Nov 9, 2020

raphaelsc commented Nov 9, 2020

avikivity commented Nov 9, 2020

raphaelsc commented Nov 9, 2020

raphaelsc commented Nov 9, 2020

avikivity commented Nov 11, 2020

Useless linearization on large data during validation, of either type bytes or string-derived, potentially cause stalls due to reclaiming #7318

Useless linearization on large data during validation, of either type bytes or string-derived, potentially cause stalls due to reclaiming #7318

Comments

raphaelsc commented Oct 1, 2020 • edited Loading

raphaelsc commented Oct 2, 2020 • edited Loading

avikivity commented Oct 3, 2020

slivne commented Oct 4, 2020

bhalevy commented Oct 4, 2020

bhalevy commented Oct 4, 2020

raphaelsc commented Oct 4, 2020

denesb commented Oct 5, 2020 • edited Loading

raphaelsc commented Oct 5, 2020

vladzcloudius commented Oct 28, 2020

raphaelsc commented Oct 29, 2020 via email

raphaelsc commented Oct 29, 2020

avikivity commented Nov 8, 2020

vladzcloudius commented Nov 9, 2020

raphaelsc commented Nov 9, 2020

raphaelsc commented Nov 9, 2020

avikivity commented Nov 9, 2020

raphaelsc commented Nov 9, 2020

raphaelsc commented Nov 9, 2020

avikivity commented Nov 11, 2020

raphaelsc commented Oct 1, 2020 •

edited

Loading

raphaelsc commented Oct 2, 2020 •

edited

Loading

denesb commented Oct 5, 2020 •

edited

Loading