MB-70044: Add MutateWithMeta command #7
Closed
jimwwalker wants to merge 1 commit into couchbase:master from
The MutateWithMeta command allows a user to store an item and have parts of the xattr segment replaced with the actual values for CAS and seqno (just as subdoc allows).

Subdoc cannot be used for this: the "multi" versions have a (configurable) maximum number of paths one may operate on, but there is no upper limit on the number of xattr paths a document may have, so XDCR could encounter a document with more paths than the configured subdoc path limit.

The command:
- Must be negotiated via HELLO (feature MutateWithMeta)
- Supports Add, Set, and Delete operations
- Uses JSON-encoded metadata appended to the value (length in 4-byte extras)
- Allows specifying byte offsets where CAS/seqno values should be written
- Supports the same options as existing WithMeta commands

Change-Id: I47d70a1e7bb754eedde7a19ab3980eadccfa23f0
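The payload layout described above (value followed by JSON metadata, with the metadata length carried in the 4-byte extras, plus byte offsets to patch) can be sketched roughly as follows. This is a minimal illustration, not the kv_engine implementation: the names `SplitPayload`, `decodeMetaLength`, `splitPayload` and `patchAt` are hypothetical, the big-endian byte order of the extras field is an assumption, and the 18-character `0x...` hex rendering written at each offset is an assumed encoding that the description does not specify.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <stdexcept>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical view of a MutateWithMeta payload: the document bytes followed
// by the JSON-encoded metadata. Illustrative only, not the kv_engine API.
struct SplitPayload {
    std::string_view document; // value, including the xattr segment
    std::string_view metaJson; // trailing JSON-encoded metadata
};

// Decode the 4-byte extras field as the metadata length.
// Assumption: the length is stored in network (big-endian) byte order.
uint32_t decodeMetaLength(const std::array<uint8_t, 4>& extras) {
    return (uint32_t(extras[0]) << 24) | (uint32_t(extras[1]) << 16) |
           (uint32_t(extras[2]) << 8) | uint32_t(extras[3]);
}

// Split the payload so that its last metaLen bytes are the metadata.
SplitPayload splitPayload(std::string_view payload, uint32_t metaLen) {
    if (metaLen > payload.size()) {
        throw std::invalid_argument("metadata length exceeds payload size");
    }
    const auto docLen = payload.size() - metaLen;
    return {payload.substr(0, docLen), payload.substr(docLen)};
}

// Overwrite an 18-byte placeholder at each offset with "0x" followed by 16
// hex digits. Assumption: the real on-the-wire rendering may differ.
void patchAt(std::string& document,
             const std::vector<std::size_t>& offsets,
             uint64_t value) {
    constexpr std::size_t width = 18;
    char buf[width + 1];
    std::snprintf(buf, sizeof(buf), "0x%016llx",
                  static_cast<unsigned long long>(value));
    for (const auto offset : offsets) {
        if (offset > document.size() || document.size() - offset < width) {
            throw std::out_of_range("patch offset outside the document");
        }
        document.replace(offset, width, buf, width);
    }
}
```

In this sketch a caller would decode the extras, split the payload, read the offsets out of the metadata JSON (parsing omitted here), and then call `patchAt` once with the assigned CAS and once with the assigned seqno. Working from explicit byte offsets means the cost does not depend on a per-request path limit, which is the constraint that rules out subdoc.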
Contributor
Author
testing co-pilot review
Contributor
Author
needs write access to add reviewers :/
jimwwalker pushed a commit to jimwwalker/kv_engine that referenced this pull request Mar 5, 2026
Some tests in ep_testsuite_dcp spawn a std::thread that runs a DCP
client loop: it opens a DCP producer, issues a StreamRequest and listens
for data.
The problematic part is the StreamRequest. The Producer's
stream-map is of type folly::AtomicHashArray<>, mapping vbid->streams.
That folly type uses a few thread-locals internally, which end up storing
references to memory allocated in the bucket arena in the
EPE::stream_req path. The std::thread is then destroyed outside any bucket
context, causing a memory-tracking mismatch when the thread's TSD is released.
E.g.:
% env CB_ARENA_MALLOC_VERIFY_DEALLOC_CLIENT=1 lldb -- ./ep_testsuite_dcp "-E" "ep" "-v" "-e" "compression_mode=active;dbname=./ep_testsuite_dcp.value_eviction.comp_active.db" -C 29
..
Running [29/91]: test producer keep stream open...===ERROR===: JeArenaMalloc deallocation mismatch
Memory freed by client:100 domain:None which is assigned arena:0, but memory was previously allocated from arena:2 (client-specific arena).
Allocation address:0x103dcda40 size:24
..
Process 90807 stopped
* thread #43, name = 'dcp_thread', stop reason = EXC_BREAKPOINT (code=1, subcode=0x185e46ae4)
frame #0: 0x0000000185e46ae4 libsystem_c.dylib` __abort + 168
libsystem_c.dylib`:
-> 0x185e46ae4 <+168>: brk #0x1
libsystem_c.dylib`abort_report_np:
0x185e46ae8 <+0>: pacibsp
0x185e46aec <+4>: sub sp, sp, #0x30
0x185e46af0 <+8>: stp x20, x19, [sp, #0x10]
Target 0: (ep_testsuite_dcp) stopped.
(lldb) bt
* thread #43, name = 'dcp_thread', stop reason = EXC_BREAKPOINT (code=1, subcode=0x185e46ae4)
* frame #0: 0x0000000185e46ae4 libsystem_c.dylib` __abort + 168
frame #1: 0x0000000185e46a3c libsystem_c.dylib` abort + 192
frame #2: 0x00000001001d8808 ep_testsuite_dcp` cb::verifyMemDeallocatedByCorrectClient(client=0x0000000171696c78, ptr=0x0000000103dcda40, size=24) + 536 at je_arena_malloc.cc:247
frame #3: 0x00000001001d8898 ep_testsuite_dcp` cb::_JEArenaMalloc<cb::JEArenaSimpleTracker>::sized_free(ptr=0x0000000103dcda40, size=24) + 84 at je_arena_malloc.cc:442
frame #4: 0x00000001000d6dc0 ep_testsuite_dcp` cb::_ArenaMalloc<cb::_JEArenaMalloc<cb::JEArenaSimpleTracker>>::sized_free(ptr=0x0000000103dcda40, size=24) + 32 at cb_arena_malloc.h:273
frame #5: 0x00000001000d6d8c ep_testsuite_dcp` cb_sized_free(ptr=0x0000000103dcda40, size=24) + 44 at cb_malloc_arena.cc:75
frame #6: 0x00000001000d7398 ep_testsuite_dcp` operator delete(ptr=0x0000000103dcda40, size=24) + 32 at global_new_replacement.cc:146
frame #7: 0x000000010081f534 ep_testsuite_dcp` void folly::threadlocal_detail::ElementWrapper::set<folly::ThreadCachedInt<unsigned long long, unsigned long long>::IntCache*>(this=0x0000000171696d33, pt=0x0000000103dcda40, (null)=THIS_THREAD)::'lambda'(void*, folly::TLPDestructionMode)::operator()(void*, folly::TLPDestructionMode) const + 68 at ThreadLocalDetail.h:138
frame #8: 0x000000010081f4e4 ep_testsuite_dcp` void folly::threadlocal_detail::ElementWrapper::set<folly::ThreadCachedInt<unsigned long long, unsigned long long>::IntCache*>(folly::ThreadCachedInt<unsigned long long, unsigned long long>::IntCache*)::'lambda'(void*, folly::TLPDestructionMode)::__invoke(pt=0x0000000103dcda40, (null)=THIS_THREAD) + 36 at ThreadLocalDetail.h:137
frame #9: 0x0000000100168e90 ep_testsuite_dcp` folly::threadlocal_detail::ElementWrapper::dispose(this=0x000000011f7053c8, mode=THIS_THREAD) + 324 at ThreadLocalDetail.h:114
frame #10: 0x0000000100cd0758 ep_testsuite_dcp` folly::threadlocal_detail::StaticMetaBase::onThreadExit(ptr=0x0000000103dd1800) + 404 at ThreadLocalDetail.cpp:153
frame #11: 0x0000000185f37870 libsystem_pthread.dylib` _pthread_tsd_cleanup + 488
frame #12: 0x0000000185f3a684 libsystem_pthread.dylib` _pthread_exit + 84
frame #13: 0x0000000185f39fa0 libsystem_pthread.dylib` _pthread_start + 148
Fix by executing DcpOpen+StreamRequest in the main test thread. All
memcached/bucket resources are released before the main thread shuts down.
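The difference between the two arrangements can be sketched with a toy model. Everything below is hypothetical and stands in for the real machinery: `currentArena` plays the role of the per-thread bucket context, `TrackedBlock` plays the role of an arena-tracked allocation, and `threadCache` plays the role of the thread-local that folly creates lazily. None of these names exist in kv_engine, platform or folly.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <thread>

// Toy model only; all names are hypothetical.
thread_local int currentArena = 0;  // 0 = no bucket context on this thread
std::atomic<int> mismatches{0};     // counts frees made under the wrong arena

// Stand-in for a tracked allocation: remembers the arena that was active
// when it was created and checks it again when it is destroyed.
struct TrackedBlock {
    int arena = currentArena;
    ~TrackedBlock() {
        if (arena != currentArena) {
            ++mismatches;
        }
    }
};

// Stand-in for a library-internal per-thread cache, created on first use.
thread_local std::unique_ptr<TrackedBlock> threadCache;

// RAII guard that switches the calling thread into a bucket arena.
struct ArenaGuard {
    explicit ArenaGuard(int arena) : previous(currentArena) {
        currentArena = arena;
    }
    ~ArenaGuard() {
        currentArena = previous;
    }
    int previous;
};

// Stand-in for the StreamRequest path: runs in bucket context and lazily
// creates the calling thread's cache there.
void streamRequest(int bucketArena) {
    ArenaGuard guard(bucketArena);
    if (!threadCache) {
        threadCache = std::make_unique<TrackedBlock>();
    }
}

// Stand-in for releasing bucket resources on the calling thread while the
// bucket context is still active.
void releaseBucketResources(int bucketArena) {
    ArenaGuard guard(bucketArena);
    threadCache.reset();
}

// Before: the worker issues the stream request, then exits with no bucket
// context, so its thread-local is destroyed under arena 0.
int runStreamRequestOnWorker() {
    mismatches = 0;
    std::thread worker([] { streamRequest(2); /* listen for data */ });
    worker.join();
    return mismatches.load();
}

// After: the calling thread issues the stream request and releases its
// resources in bucket context; the worker only listens.
int runStreamRequestOnCaller() {
    mismatches = 0;
    streamRequest(2);
    std::thread worker([] { /* listen for data */ });
    worker.join();
    releaseBucketResources(2);
    return mismatches.load();
}
```

In this model `runStreamRequestOnWorker()` records one mismatch, raised when the worker's thread-local is torn down at thread exit, while `runStreamRequestOnCaller()` records none because the only tracked block is freed while arena 2 is still selected.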
Notes:
- This is a test-only issue; no production bug is addressed here
- Patch verified locally with
https://review.couchbase.org/c/platform/+/234973, which is being
submitted after all related failures are fixed
Change-Id: Ida20abef00daddb8da4d65305316eba7baccaec7
Reviewed-on: https://review.couchbase.org/c/kv_engine/+/235361
Tested-by: Build Bot <build@couchbase.com>
Well-Formed: Restriction Checker
Reviewed-by: Jim Walker <jim@couchbase.com>
jimwwalker pushed a commit to jimwwalker/kv_engine that referenced this pull request Mar 5, 2026
This is the problematic code:
const T& MockConnection::getDescription() const {
static T obj{..};
return obj;
}
The problem is that in ep_testsuite that memory is (1) allocated
the first time the function is called (which is in a Bucket context)
and (2) released when the program shuts down (i.e. outside any Bucket
context, as by that point every bucket instance has already been released).
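The lifetime issue can be reproduced in miniature. The sketch below is a toy model under stated assumptions: `currentContext`, `Description` and `wouldMismatchAtExit` are hypothetical names, and the integer context stands in for the arena selected by the bucket; only the shape of the accessor (a function-local static returned by const reference) mirrors the snippet above.

```cpp
#include <cassert>
#include <string>

// Toy model only; all names are hypothetical.
static int currentContext = 0; // 0 = no bucket context

// Stand-in for the described object: records which context was active
// when it was constructed, i.e. where its memory was allocated.
struct Description {
    int allocatedIn = currentContext;
    std::string text = "mock connection";
};

// Same shape as the problematic accessor: the static is constructed on
// the first call and destroyed only during program shutdown.
const Description& getDescription() {
    static Description obj;
    return obj;
}

// True if destroying the static right now would release memory under a
// different context than the one it was allocated in.
bool wouldMismatchAtExit() {
    return getDescription().allocatedIn != currentContext;
}
```

If the first call happens while context 2 is active and the context is back to 0 by shutdown, `wouldMismatchAtExit()` returns true, which corresponds to the arena:2 versus arena:0 report in the output.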
In local testing that shows up as:
% env CB_ARENA_MALLOC_VERIFY_DEALLOC_CLIENT=1 lldb -- ../source_morpheus/build/kv_engine/ep_testsuite_dcp "-v" "-e" "compression_mode=active;item_eviction_policy=full_eviction;dbname=./ep_testsuite_dcp.full_eviction.comp_active.db" -C 90
(lldb) pr la
Process 11613 launched: '../source_morpheus/build/kv_engine/ep_testsuite_dcp' (arm64)
Running [90/91]: test oso backfill...(125 ms) OK
===ERROR===: JeArenaMalloc deallocation mismatch
Memory freed by client:100 domain:None which is assigned arena:0, but memory was previously allocated from arena:2 (client-specific arena).
Allocation address:0x1047fe4a0 size:24
..
Process 11613 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
frame #0: 0x00000001816955f0 libsystem_kernel.dylib` __pthread_kill + 8
libsystem_kernel.dylib`:
-> 0x1816955f0 <+8>: b.lo 0x181695610 ; <+40>
0x1816955f4 <+12>: pacibsp
0x1816955f8 <+16>: stp x29, x30, [sp, #-0x10]!
0x1816955fc <+20>: mov x29, sp
Target 0: (ep_testsuite_dcp) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
* frame #0: 0x00000001816955f0 libsystem_kernel.dylib` __pthread_kill + 8
frame #1: 0x00000001816cdc20 libsystem_pthread.dylib` pthread_kill + 288
frame #2: 0x00000001815daa30 libsystem_c.dylib` abort + 180
frame #3: 0x0000000100fb71a8 ep_testsuite_dcp` cb::verifyMemDeallocatedByCorrectClient(client=0x000000016fdfeca8, ptr=0x00000001047fe4a0, size=24) + 536 at je_arena_malloc.cc:260
frame #4: 0x0000000100fb7238 ep_testsuite_dcp` cb::_JEArenaMalloc<cb::JEArenaSimpleTracker>::sized_free(ptr=0x00000001047fe4a0, size=24) + 84 at je_arena_malloc.cc:455
frame #5: 0x00000001000e29f4 ep_testsuite_dcp` cb::_ArenaMalloc<cb::_JEArenaMalloc<cb::JEArenaSimpleTracker>>::sized_free(ptr=0x00000001047fe4a0, size=24) + 32 at cb_arena_malloc.h:273
frame #6: 0x00000001000e29c0 ep_testsuite_dcp` cb_sized_free(ptr=0x00000001047fe4a0, size=24) + 44 at cb_malloc_arena.cc:75
frame #7: 0x00000001000e2fcc ep_testsuite_dcp` operator delete(ptr=0x00000001047fe4a0, size=24) + 32 at global_new_replacement.cc:146
frame #8: 0x0000000100030a24 ep_testsuite_dcp` void std::__1::__libcpp_operator_delete[abi:ue170006]<void*, unsigned long>(__args=0x00000001047fe4a0, __args=24) + 32 at new:308
frame #9: 0x00000001000309c4 ep_testsuite_dcp` void std::__1::__do_deallocate_handle_size[abi:ue170006]<>(__ptr=0x00000001047fe4a0, __size=24) + 32 at new:334
frame #10: 0x0000000100030960 ep_testsuite_dcp` std::__1::__libcpp_deallocate[abi:ue170006](__ptr=0x00000001047fe4a0, __size=24, __align=8) + 80 at new:348
frame #11: 0x00000001000676b4 ep_testsuite_dcp` std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>::deallocate[abi:ue170006](this=0x000000016fdfee85, __p="me", __n=1) + 48 at allocator.h:130
frame #12: 0x0000000100067678 ep_testsuite_dcp` std::__1::allocator_traits<std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>::deallocate[abi:ue170006](__a=0x000000016fdfee85, __p="me", __n=1) + 40 at allocator_traits.h:288
frame #13: 0x000000010007c964 ep_testsuite_dcp` nlohmann::json_abi_v3_11_3::basic_json<std::__1::map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_11_3::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void>::json_value::destroy(this=0x000000016fdff0c0, t=string) + 1212 at json.hpp:646
frame #14: 0x000000010007c490 ep_testsuite_dcp` nlohmann::json_abi_v3_11_3::basic_json<std::__1::map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_11_3::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void>::data::~data(this=0x000000016fdff0b8) + 36 at json.hpp:4225
frame #15: 0x0000000100077fb8 ep_testsuite_dcp` nlohmann::json_abi_v3_11_3::basic_json<std::__1::map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_11_3::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void>::data::~data(this=0x000000016fdff0b8) + 28 at json.hpp:4224
frame #16: 0x0000000100085e58 ep_testsuite_dcp` nlohmann::json_abi_v3_11_3::basic_json<std::__1::map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_11_3::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void>::~basic_json(this=0x000000016fdff0b8) + 44 at json.hpp:1266
frame #17: 0x0000000100064694 ep_testsuite_dcp` nlohmann::json_abi_v3_11_3::basic_json<std::__1::map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_11_3::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void>::~basic_json(this=0x000000016fdff0b8) + 28 at json.hpp:1264
frame #18: 0x000000010007c874 ep_testsuite_dcp` nlohmann::json_abi_v3_11_3::basic_json<std::__1::map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_11_3::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void>::json_value::destroy(this=0x0000000101500b80, t=object) + 972 at json.hpp:621
frame #19: 0x000000010007c490 ep_testsuite_dcp` nlohmann::json_abi_v3_11_3::basic_json<std::__1::map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_11_3::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void>::data::~data(this=0x0000000101500b78) + 36 at json.hpp:4225
frame #20: 0x0000000100077fb8 ep_testsuite_dcp` nlohmann::json_abi_v3_11_3::basic_json<std::__1::map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_11_3::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void>::data::~data(this=0x0000000101500b78) + 28 at json.hpp:4224
frame #21: 0x0000000100085e58 ep_testsuite_dcp` nlohmann::json_abi_v3_11_3::basic_json<std::__1::map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_11_3::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void>::~basic_json(this=0x0000000101500b78) + 44 at json.hpp:1266
frame #22: 0x0000000100064694 ep_testsuite_dcp` nlohmann::json_abi_v3_11_3::basic_json<std::__1::map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_11_3::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void>::~basic_json(this=0x0000000101500b78) + 28 at json.hpp:1264
frame #23: 0x000000018158b2e8 libsystem_c.dylib` __cxa_finalize_ranges + 476
frame #24: 0x000000018158b070 libsystem_c.dylib` exit + 44
frame #25: 0x00000001816e6850 libdyld.dylib` dyld4::LibSystemHelpers::exit(int) const + 20
frame #26: 0x00000001813431a0 dyld` start + 2552
Change-Id: I6e4b8203eb5dfd67352386895d87dfd52f6c6641
Reviewed-on: https://review.couchbase.org/c/kv_engine/+/238640
Tested-by: Build Bot <build@couchbase.com>
Reviewed-by: Paolo Cocchi <paolo.cocchi@couchbase.com>
Well-Formed: Restriction Checker
Reviewed-by: Mohammad Zaeem <mohammad.zaeem@couchbase.com>