Conversation
Walkthrough

Adds a nested `SliceCoordinator` to `RangePartitionDataSyncScanCc`, replaces per-core iterator state with per-core slice indices and coordinator-driven pinning/batching, rewrites `template_cc_map` scan/batching to use the coordinator, and changes several dispatch paths to enqueue scans to a single randomly selected core.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor ScanLoop as ScanLoop
    participant SliceCoord as SliceCoordinator
    participant Store as Store/RangeState
    participant CoreSelector as CoreSelector
    participant CoreWorker as CoreWorker
    rect rgb(235,245,255)
        Note over ScanLoop,SliceCoord: Initialize coordinator with slices (is_continuous flag)
        ScanLoop->>SliceCoord: Prepare(slices_to_scan_, is_continuous)
        SliceCoord->>Store: register start indices / pinned=false
    end
    rect rgb(235,255,235)
        loop per batch iteration
            ScanLoop->>SliceCoord: IsReadyForScan(core_id)?
            alt ready
                SliceCoord-->>ScanLoop: (StartKey, EndKey) / pinned slice
                ScanLoop->>CoreSelector: choose random core (core_rand % n_cores)
                CoreSelector-->>CoreWorker: selected core id
                ScanLoop->>CoreWorker: Enqueue scan(StartKey, EndKey)
                CoreWorker-->>ScanLoop: complete / Wait
            else not ready
                SliceCoord-->>ScanLoop: not ready
            end
            ScanLoop->>SliceCoord: TheBatchEnd(core_id)?
            alt batch end
                SliceCoord-->>ScanLoop: true
                ScanLoop->>SliceCoord: MoveToNextSlice(core_id)
                SliceCoord->>Store: advance index / update pinned state
            end
        end
    end
    rect rgb(250,235,255)
        ScanLoop->>SliceCoord: IsLastBatch()?
        alt last batch
            ScanLoop->>SliceCoord: UnpinSlices()
            SliceCoord->>Store: clear pinned slices
        end
    end
```
Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches

❌ Failed checks (2 warnings, 1 inconclusive)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tx_service/include/cc/cc_request.h (1)
4179-4210: `unfinished_cnt_ = 1` initialization creates a race condition with early error paths

`Reset()` hard-sets `unfinished_cnt_ = 1`, but the subsequent call to `SetUnfinishedCoreCnt(core_cnt)` is not guaranteed to execute before `SetError()` in `Execute()`. At lines 5200–5206 and 5223–5226 in `template_cc_map.h`, early error paths call `SetError()` before line 5229 where `SetUnfinishedCoreCnt()` is called. This causes:

- `unfinished_cnt_` starts at 1 (after `Reset()`)
- `SetError()` decrements it to 0 and immediately calls `UnpinSlices()` + notifies `Wait()`
- Caller's `Wait()` returns prematurely while cores have not been dispatched
- Data corruption or missing scans if the caller proceeds

This affects both `sk_generator.cpp` (lines 360–365) and `local_cc_shards.cpp` (lines 4045–4047), which rely on `Wait()` blocking until all cores finish.

Fix: Initialize `unfinished_cnt_` to `core_cnt_` in `Reset()` directly, or guard early `SetError()` calls to ensure `SetUnfinishedCoreCnt()` is called first.

Also, `UnpinSlices()` correctly centralizes slice unpinning and ensures exactly-once semantics when the last core finishes or errors.
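The race described above can be sketched with a minimal countdown latch: arming the counter with the full core count in one mutex-guarded step removes the window where it transiently sits at 1. This is an illustrative sketch only; `CoreLatch` and its method names are stand-ins, not the tx_service API.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Minimal sketch of the suggested fix: the countdown is armed with the
// full core count under the same mutex that guards decrements, so an
// early error on one core can never drive the counter to zero before
// all cores are accounted for. All names are illustrative.
class CoreLatch
{
public:
    // Equivalent of Reset() + SetUnfinishedCoreCnt() fused into one
    // mutex-guarded step, removing the window where the count is 1.
    void Arm(uint16_t core_cnt)
    {
        std::lock_guard<std::mutex> lk(mux_);
        unfinished_cnt_ = core_cnt;
    }

    // Called by each core on finish or error.
    void SetFinish()
    {
        std::lock_guard<std::mutex> lk(mux_);
        assert(unfinished_cnt_ > 0);
        if (--unfinished_cnt_ == 0)
        {
            cv_.notify_all();
        }
    }

    // Blocks until every armed core has reported.
    void Wait()
    {
        std::unique_lock<std::mutex> lk(mux_);
        cv_.wait(lk, [this] { return unfinished_cnt_ == 0; });
    }

private:
    std::mutex mux_;
    std::condition_variable cv_;
    uint16_t unfinished_cnt_{0};
};
```

With this shape, `Wait()` cannot return until every armed core has called `SetFinish()`, regardless of how early an error path fires.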
🧹 Nitpick comments (3)
tx_service/src/cc/local_cc_shards.cpp (1)
4070-4075: Random single-core dispatch relies on `RangePartitionDataSyncScanCc` to fan out correctly

Switching from per-core enqueue to picking a single shard with `butil::fast_rand() % cc_shards_.size()` looks intentional given the comment that the "first core" will further dispatch to remaining cores. This should reduce queue pressure while preserving parallelism, assuming `RangePartitionDataSyncScanCc` indeed owns the cross-core fan-out. Two minor points to double-check:

- Ensure the header that declares `butil::fast_rand()` is included somewhere in the translation unit (directly or transitively); otherwise this will fail to compile.
- Confirm that `RangePartitionDataSyncScanCc` is only expected to be enqueued on one shard per iteration now; if it still assumes multi-enqueue from all cores, this change could silently reduce scan coverage/concurrency.

If both assumptions hold, the change looks sound.
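The dispatch shape under discussion can be sketched as follows. `Shard` and `DispatchToRandomShard` are illustrative stand-ins; in the PR the random value comes from `butil::fast_rand()` and the chosen shard is then responsible for fanning out to the remaining cores.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for a per-core shard with a request queue.
struct Shard
{
    std::vector<int> queue;  // pretend request queue
    void Enqueue(int req_id)
    {
        queue.push_back(req_id);
    }
};

// Enqueue the request on exactly one randomly chosen shard and return
// the shard index it landed on. The caller supplies the random value,
// mirroring fast_rand() % cc_shards_.size() in the PR.
size_t DispatchToRandomShard(std::vector<Shard> &shards,
                             int req_id,
                             uint64_t rand_val)
{
    size_t idx = rand_val % shards.size();
    shards[idx].Enqueue(req_id);
    return idx;
}
```

The key contract is that exactly one shard receives the request per iteration; parallelism is preserved only if that shard performs the cross-core fan-out itself.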
tx_service/include/cc/cc_request.h (1)
4041-4092: `SliceCoordinator` wiring and `slices_to_scan_` population look correct

Initializing `slice_coordinator_` with `export_base_table_item_` and `&slices_to_scan_`, reserving `slices_to_scan_` capacity from `old_slices_delta_size`, and using `GetShallowCopy()` to avoid mutating the source map all look sound and consistent with the intended ownership model. `curr_slice_index_` is correctly initialised per core.

If you want a tiny readability tweak later, a ranged `for` over the map would be clearer than `std::for_each`, but that's cosmetic only.

tx_service/include/cc/template_cc_map.h (1)
5244-5416: `SliceCoordinator` initialization and multi-core dispatch: behavior looks consistent, but a few invariants are worth double-checking

The new `RangePartitionDataSyncScanCc::Execute` flow using `slice_coordinator_` (pinning via `pin_range_slice`, `PreparedSliceCnt`, `UpdateBatchEnd`, `MoveToNextSlice`, and per-core `PausePos`/`IsDrained`) is generally coherent and matches the intended design of coordinated, batched slice scanning across cores.

A few points to explicitly verify in tests:

Single-core ownership of batch preparation

- `if (!req.slice_coordinator_.IsReadyForScan()) { ... SetReadyForScan(); ... Enqueue(..., &req); }` assumes this block is only ever run once per batch (on the initial core) before other cores see the request. Given `Enqueue` is used for cross-core dispatch, please confirm there is no path where a non-initiator core can execute `Execute` on the same `req` before `SetReadyForScan()` is set, which would cause duplicate pinning / slice-state corruption.

`PinRangeSlice` status handling vs. subsequent use of `new_slice_id`

- In `pin_range_slice`, `RangeSliceOpStatus::NotOwner` is treated as `succ = true` after an internal `assert("Dead branch")`. Downstream, the call site always does:

```cpp
req.slice_coordinator_.StorePinnedSlice(new_slice_id);
const TemplateStoreSlice<KeyT> *slice =
    static_cast<const TemplateStoreSlice<KeyT> *>(new_slice_id.Slice());
```

- Please confirm that in the `NotOwner` case `new_slice_id.Slice()` is guaranteed to be non-null and safe to dereference; otherwise this path should probably be treated as a failure (or short-circuited before accessing the slice) even in non-debug builds.

`find_non_empty_slice` invariants when the map/slices are empty

- When all slices in the batch are empty (or the CCM is empty), `find_non_empty_slice` can legitimately return `key_it == slice_end_it`, with `slice_end_key` coming from the last slice. The later `assert(key_it != slice_end_it || req.TheBatchEnd(shard_->core_id_));` relies on `TheBatchEnd()` being `true` in this case. That seems reasonable, but it's worth validating with a unit/integration test where a batch contains only empty slices, to ensure `TheBatchEnd` and `IsLastBatch` stay consistent and that the resume logic using `PausePos` works as expected.

Pause/resume semantics across batches

- `no_more_data` is computed as `(key_it == slice_end_it) && req.IsLastBatch();` and `PausePos` is updated with either the next key or an empty `TxKey`. Combined with the `TheBatchEnd(core_id)` condition in the final `if`:

```cpp
if (is_scan_mem_full || no_more_data ||
    accumulated >= scan_batch_size || req.TheBatchEnd(core_id))
{
    req.SetFinish(core_id_);
    ...
}
```

- Please double-check that for non-last batches we always leave enough state (`PausePos`, current slice index, coordinator state) for the next batch to resume correctly, and that `IsDrained(core_id)` transitions only when the final batch truly has no more slices for that core.

Cross-core dispatch helper choice

- Here we dispatch with `shard_->Enqueue(shard_->LocalCoreId(), core_id, &req);`, while many other multi-core flows in this file use `local_shards_.EnqueueCcRequest(...)`. If `Enqueue` is a newer convenience wrapper for cross-core CC dispatch that preserves all the invariants (`UnfinishedCoreCnt`, per-core state ordering), that's fine; otherwise it might be safer to align with the existing pattern. Please confirm the intended API here.

If all of the above invariants hold under tests (especially with empty-slice batches and split-slice scenarios), the refactored coordination logic looks solid and easier to reason about than the previous ad-hoc per-core slice tracking.
Also applies to: 5421-5432, 5436-5717, 5719-5724
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
tx_service/include/cc/cc_request.h
tx_service/include/cc/template_cc_map.h
tx_service/src/cc/local_cc_shards.cpp
tx_service/src/sk_generator.cpp
🧰 Additional context used
🧠 Learnings (6)
📓 Common learnings
Learnt from: liunyl
Repo: eloqdata/tx_service PR: 149
File: include/cc/cc_request.h:1876-1927
Timestamp: 2025-10-20T04:30:07.884Z
Learning: ScanNextBatchCc in include/cc/cc_request.h is used only for hash-partition scans; range-partition scans are handled by ScanSliceCc.
Learnt from: lzxddz
Repo: eloqdata/tx_service PR: 199
File: include/cc/local_cc_shards.h:233-234
Timestamp: 2025-11-11T07:10:40.346Z
Learning: In the LocalCcShards class in include/cc/local_cc_shards.h, the EnqueueCcRequest methods use `shard_code & 0x3FF` followed by `% cc_shards_.size()` to distribute work across processor cores for load balancing. This is intentional and separate from partition ID calculation. The 0x3FF mask creates a consistent distribution range (0-1023) before modulo by actual core count.
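The load-balancing arithmetic recorded in the learning above can be sketched as a one-liner. `DispatchCore` is an illustrative stand-in for the logic inside `LocalCcShards::EnqueueCcRequest`: the shard code is first masked into the fixed 0–1023 range, then folded onto the actual core count.

```cpp
#include <cstddef>
#include <cstdint>

// Sketch of the distribution described above: `shard_code & 0x3FF`
// yields a value in 0-1023, and the modulo maps it onto the actual
// number of processor cores. Illustrative stand-in, not the real API.
inline size_t DispatchCore(uint32_t shard_code, size_t core_count)
{
    return (shard_code & 0x3FF) % core_count;
}
```

As the learning notes, this is intentional load balancing and is separate from partition ID calculation.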
📚 Learning: 2025-11-11T07:10:40.346Z
Learnt from: lzxddz
Repo: eloqdata/tx_service PR: 199
File: include/cc/local_cc_shards.h:233-234
Timestamp: 2025-11-11T07:10:40.346Z
Learning: In the LocalCcShards class in include/cc/local_cc_shards.h, the EnqueueCcRequest methods use `shard_code & 0x3FF` followed by `% cc_shards_.size()` to distribute work across processor cores for load balancing. This is intentional and separate from partition ID calculation. The 0x3FF mask creates a consistent distribution range (0-1023) before modulo by actual core count.
Applied to files:
tx_service/src/sk_generator.cpp
tx_service/src/cc/local_cc_shards.cpp
tx_service/include/cc/cc_request.h
📚 Learning: 2025-10-20T04:30:07.884Z
Learnt from: liunyl
Repo: eloqdata/tx_service PR: 149
File: include/cc/cc_request.h:1876-1927
Timestamp: 2025-10-20T04:30:07.884Z
Learning: ScanNextBatchCc in include/cc/cc_request.h is used only for hash-partition scans; range-partition scans are handled by ScanSliceCc.
Applied to files:
tx_service/src/sk_generator.cpp
tx_service/src/cc/local_cc_shards.cpp
tx_service/include/cc/template_cc_map.h
tx_service/include/cc/cc_request.h
📚 Learning: 2025-10-21T06:46:53.700Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 149
File: src/remote/cc_stream_receiver.cpp:1066-1075
Timestamp: 2025-10-21T06:46:53.700Z
Learning: In src/remote/cc_stream_receiver.cpp, for ScanNextRequest handling, BucketIds() on RemoteScanNextBatch should never be empty—this is an expected invariant of the scan protocol.
Applied to files:
tx_service/src/sk_generator.cpp
tx_service/src/cc/local_cc_shards.cpp
tx_service/include/cc/template_cc_map.h
tx_service/include/cc/cc_request.h
📚 Learning: 2025-10-09T03:56:58.811Z
Learnt from: thweetkomputer
Repo: eloqdata/tx_service PR: 150
File: include/cc/local_cc_shards.h:626-631
Timestamp: 2025-10-09T03:56:58.811Z
Learning: For the LocalCcShards class in include/cc/local_cc_shards.h: Writer locks (unique_lock) should continue using the original meta_data_mux_ (std::shared_mutex) rather than fast_meta_data_mux_ (FastMetaDataMutex) at this stage. Only reader locks may use the FastMetaDataMutex wrapper.
Applied to files:
tx_service/src/cc/local_cc_shards.cpp
tx_service/include/cc/template_cc_map.h
📚 Learning: 2025-12-02T10:43:27.431Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 254
File: tx_service/src/cc/local_cc_shards.cpp:2949-3188
Timestamp: 2025-12-02T10:43:27.431Z
Learning: In tx_service/src/cc/local_cc_shards.cpp, whenever TryPinNodeGroupData is used, only call Sharder::Instance().UnpinNodeGroupData(node_group) if the recorded term is >= 0 (i.e., pin succeeded). Example: LocalCcShards::PostProcessFlushTaskEntries guards the unpin with `if (term >= 0)`.
Applied to files:
tx_service/src/cc/local_cc_shards.cpp
🧬 Code graph analysis (1)
tx_service/src/cc/local_cc_shards.cpp (1)
tx_service/include/cc/local_cc_shards.h (1)
`EnqueueLowPriorityCcRequestToShard` (413-419)
🔇 Additional comments (4)
tx_service/src/sk_generator.cpp (1)
358-362: The cross-core dispatch coordination is properly implemented. The handler for `RangePartitionDataSyncScanCc` (executed on the first core) pins the slices via `req.PinSlices()` and then explicitly dispatches to remaining cores through a loop in `template_cc_map.h` (lines 3660-3671), ensuring parallel scanning after the slice is pinned. This pattern is consistently applied across normal scans, range partition data sync scans, and remote scans.

tx_service/src/cc/local_cc_shards.cpp (1)
3836-3839: Extra debug logging for delta-size scan is reasonable

The DLOG adds useful context (range ID, table, last/data sync timestamps) and is guarded by debug logging, so it should not impact release performance. No functional concerns here.

tx_service/include/cc/template_cc_map.h (2)
tx_service/include/cc/template_cc_map.h (2)
623-636: Additional DLOG on cc_page last_dirty_commit_ts looks safeThe new debug log when updating
cc_page->last_dirty_commit_ts_is side‑effect free and gated behindDLOG, so it won’t affect release behavior; it should be helpful when diagnosing data‑sync / delta‑size inconsistencies.
7968-8001: New diagnostic logging for inconsistentNeedCkptvslast_dirty_commit_ts_is helpfulThe added
DLOGinScanSliceDeltaSizeCcForRangePartitionwhenccp->last_dirty_commit_ts_ <= req.LastDataSyncTs()butcce->NeedCkpt()is still true is a good safeguard: it makes the invariant enforced by the followingassert(!cce->NeedCkpt());observable, without changing behavior in release builds.No functional issues spotted here.
tx_service/include/cc/cc_request.h
```cpp
DLOG(INFO) << "SetFinish on core " << core_id << " unfinished_cnt_ "
           << unfinished_cnt_;
if (export_base_table_item_ && !pause_pos_[core_id].second)
{
    // Only not drained on this core, should set the paused key.
    UpdateMinPausedSlice(&pause_pos_[core_id].first);
}
else if (!export_base_table_item_)
{
    UpdateMinPausedSlice(curr_slice_index_[core_id]);
}
```
🧩 Analysis chain
🏁 Script executed:

```shell
# Search for TxKey definition and operator< semantics
rg -n 'struct TxKey|class TxKey' --type cpp --type h -A 5
```

Repository: eloqdata/tx_service
Length of output: 2069

🏁 Script executed:

```shell
# Search for UpdateMinPausedSlice implementation and usage
rg -n 'UpdateMinPausedSlice' --type cpp --type h -B 2 -A 5
```

Length of output: 2368

🏁 Script executed:

```shell
# Search for SliceCoordinator class definition
rg -n 'struct SliceCoordinator|class SliceCoordinator' --type cpp --type h -A 15
```

Length of output: 1237

🏁 Script executed:

```shell
# Search for FixCurrentSliceIndex implementation
rg -n 'FixCurrentSliceIndex' --type cpp --type h -B 2 -A 10
```

Length of output: 2119

🏁 Script executed:

```shell
# Get the full FixCurrentSliceIndex implementation
sed -n '4307,4330p' tx_service/include/cc/cc_request.h
```

Length of output: 939

🏁 Script executed:

```shell
# Check TxKey default constructor and operator< details
sed -n '289,350p' tx_service/include/tx_key.h
```

Length of output: 1684

🏁 Script executed:

```shell
# Look for operator< in TxKey
rg -n 'operator<' tx_service/include/tx_key.h -A 3
```

Length of output: 1016

🏁 Script executed:

```shell
# Check how min_paused_slice_index_ is used and if it's reset
rg -n 'min_paused_slice_index_' tx_service/include/cc/cc_request.h -B 2 -A 2
```

Length of output: 2303

🏁 Script executed:

```shell
# Check how pause_pos_ is set and its relationship to pinned_slices_
rg -n 'pause_pos_\[' tx_service/include/cc/cc_request.h -B 3 -A 1 | head -60
```

Length of output: 1308

🏁 Script executed:

```shell
# Check how pinned_slices_ is populated
rg -n 'pinned_slices_' tx_service/include/cc/cc_request.h -B 2 -A 2 | head -80
```

Length of output: 3129

🏁 Script executed:

```shell
# Check MoveNextSlice to understand min_paused_slice_index_ flow
sed -n '4524,4560p' tx_service/include/cc/cc_request.h
```

Length of output: 1415

🏁 Script executed:

```shell
# Check how pause_pos_ is initialized and where it comes from
rg -n 'pause_pos_' tx_service/include/cc/cc_request.h | head -20
```

Length of output: 1141

🏁 Script executed:

```shell
# Get initialization of pause_pos_
rg -n 'pause_pos_\(' tx_service/include/cc/cc_request.h -B 2 -A 2
```

Length of output: 340

🏁 Script executed:

```shell
# Check the context around SetFinish to see when UpdateMinPausedSlice is actually called
sed -n '4240,4260p' tx_service/include/cc/cc_request.h
```

Length of output: 780

🏁 Script executed:

```shell
# Check what calls FixCurrentSliceIndex and under what conditions
rg -n 'FixCurrentSliceIndex' tx_service/include/cc/ tx_service/src/cc/ --type cpp --type h -B 5 -A 2
```

Length of output: 1202

🏁 Script executed:

```shell
# Examine the full initialization context around line 4086 to understand pause_pos_ lifecycle
sed -n '4080,4090p' tx_service/include/cc/cc_request.h
```

Length of output: 556

🏁 Script executed:

```shell
# Check if there are any guards or contracts that ensure pause_pos_ is always within pinned_slices_
rg -n 'pause_pos_' tx_service/include/cc/template_cc_map.h -B 5 -A 5 | head -80
```

Length of output: 2481

🏁 Script executed:

```shell
# Look for any documentation or comments about the invariants of pause_pos_ vs pinned_slices_
sed -n '4590,4630p' tx_service/include/cc/cc_request.h
```

Length of output: 2060

🏁 Script executed:

```shell
# Check if there are bounds checks we might have missed in the while loop
sed -n '4307,4322p' tx_service/include/cc/cc_request.h
```

Length of output: 655

🏁 Script executed:

```shell
# Check the context where pause_pos_ is set during scanning to understand the invariant
sed -n '5695,5710p' tx_service/include/cc/template_cc_map.h
```

Length of output: 661
Bounds-checking vulnerability in `FixCurrentSliceIndex` requires immediate hardening

The overall refactor (per-core `curr_slice_index_`, `FixCurrentSliceIndex`, `CurrentSlice*`, `TheBatchEnd`, `IsSlicePinned`, and `SliceCoordinator`) centralises slice pinning and pause/batch state effectively. However, one critical issue and one semantic concern need attention:

- `FixCurrentSliceIndex` out-of-bounds vulnerability

  The loop walks `pinned_slices_` without bounds checking:

  ```cpp
  while (curr_slice->EndTxKey() < pause_pos_[core_id].first)
  {
      ++curr_slice_idx;
      curr_slice = slice_coordinator_.pinned_slices_[curr_slice_idx];
  }
  ```

  If `pause_pos_[core_id].first` is set to a key beyond the last pinned slice, this reads past the vector boundary. The paused key is set independently during scanning, and there is no local invariant enforcing that it falls within the pinned slice range. Add a bounds guard:

  ```cpp
  while (curr_slice_idx + 1 < slice_coordinator_.pinned_slices_.size() &&
         curr_slice->EndTxKey() < pause_pos_[core_id].first)
  ```

- Non-continuous `min_paused_slice_index_` semantics

  `min_paused_slice_index_` is initialized to `0`, but `UpdateMinPausedSlice(size_t slice_index)` only updates when `slice_index < min_paused_slice_index_`. This means a call with index `0` on the first batch will not update it. If the intent is to track the minimum paused slice index across cores for the next batch, the initialization logic should be reconsidered: either initialize to `std::numeric_limits<size_t>::max()` or adjust the update condition.
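The sentinel pattern proposed for the minimum-tracking issue can be sketched in isolation. `MinIndexTracker` is illustrative; the real field is `SliceCoordinator::min_paused_slice_index_`.

```cpp
#include <cstddef>
#include <limits>

// Sketch of the sentinel pattern: initialize the running minimum to
// SIZE_MAX so the first observed index, including 0, always wins, and
// restore the sentinel on Reset(). Illustrative stand-in only.
struct MinIndexTracker
{
    size_t min_index = std::numeric_limits<size_t>::max();

    void Update(size_t idx)
    {
        if (idx < min_index)
        {
            min_index = idx;
        }
    }

    void Reset()
    {
        min_index = std::numeric_limits<size_t>::max();
    }
};
```

With a `0` initializer, `Update(0)` is a no-op and the tracked minimum never moves; with the `SIZE_MAX` sentinel, the first `Update` always records its index and `Reset()` restores the "no index yet" state between runs.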
```cpp
// The scan slice request is dispatched to the first core. The first
// core tries to pin the slice if necessary and if succeeds, further
// dispatches the request to remaining cores for parallel scans.
```
Comment doesn't match the implementation.
The comment states the request is "dispatched to the first core," but Line 362 actually dispatches to a randomly selected core using `core_rand % core_cnt`.
🔎 Proposed fix
```diff
- // The scan slice request is dispatched to the first core. The first
- // core tries to pin the slice if necessary and if succeeds, further
- // dispatches the request to remaining cores for parallel scans.
+ // The scan slice request is dispatched to a randomly selected core.
+ // That core tries to pin the slice if necessary and if succeeds,
+ // further dispatches the request to remaining cores for parallel scans.
```

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```cpp
// The scan slice request is dispatched to a randomly selected core.
// That core tries to pin the slice if necessary and if succeeds,
// further dispatches the request to remaining cores for parallel scans.
```
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tx_service/include/cc/cc_request.h (1)
4178-4210: Guard `unfinished_cnt_` updates to avoid data races

`unfinished_cnt_` is read/modified under `mux_` in `Wait()`, `SetError()`, and `SetFinish()`, but `SetUnfinishedCoreCnt()` writes it without taking the same mutex. This creates a data race and formally undefined behaviour if `SetUnfinishedCoreCnt()` runs concurrently with any of these methods.

Wrap the assignment in the same mutex:

```diff
-    void SetUnfinishedCoreCnt(uint16_t core_cnt)
-    {
-        unfinished_cnt_ = core_cnt;
+    void SetUnfinishedCoreCnt(uint16_t core_cnt)
+    {
+        std::lock_guard<std::mutex> lk(mux_);
+        unfinished_cnt_ = core_cnt;
```
♻️ Duplicate comments (2)
tx_service/include/cc/cc_request.h (2)
4304-4318: Add bounds checks in `FixCurrentSliceIndex` to prevent out-of-bounds on `pinned_slices_`

`FixCurrentSliceIndex` walks `slice_coordinator_.pinned_slices_` with an unbounded `while`:

```cpp
while (curr_slice->EndTxKey() < pause_pos_[core_id].first)
{
    ++curr_slice_idx;
    curr_slice = slice_coordinator_.pinned_slices_[curr_slice_idx];
}
```

If `pause_pos_[core_id].first` is beyond the last pinned slice's end key, this will read past `pinned_slices_` and invoke undefined behaviour. There is no local invariant here that guarantees the pause key is ≤ the last pinned slice's end key.

Add a size guard and handle the empty-vector case defensively.

Suggested hardening of `FixCurrentSliceIndex`:

```diff
 void FixCurrentSliceIndex(uint16_t core_id)
 {
     assert(export_base_table_item_);
     if (pause_pos_[core_id].first.KeyPtr() != nullptr)
     {
-        size_t curr_slice_idx = 0;
-        StoreSlice *curr_slice =
-            slice_coordinator_.pinned_slices_[curr_slice_idx];
-        while (curr_slice->EndTxKey() < pause_pos_[core_id].first)
-        {
-            ++curr_slice_idx;
-            curr_slice = slice_coordinator_.pinned_slices_[curr_slice_idx];
-        }
-        curr_slice_index_[core_id] = curr_slice_idx;
+        size_t curr_slice_idx = 0;
+        auto &pinned = slice_coordinator_.pinned_slices_;
+        if (pinned.empty())
+        {
+            // Nothing pinned – nothing to fix; bail out defensively.
+            return;
+        }
+
+        StoreSlice *curr_slice = pinned[curr_slice_idx];
+        while (curr_slice_idx + 1 < pinned.size() &&
+               curr_slice->EndTxKey() < pause_pos_[core_id].first)
+        {
+            ++curr_slice_idx;
+            curr_slice = pinned[curr_slice_idx];
+        }
+        curr_slice_index_[core_id] = curr_slice_idx;
     }
 }
```
4386-4407: `min_paused_slice_index_` initialization breaks minimum-tracking semantics and isn't reset across runs

In non-continuous mode, `SliceCoordinator` does:

```cpp
if (is_continuous_)
{
    min_paused_key_ = TxKey();
}
else
{
    min_paused_slice_index_ = 0;
}
```

and `Reset()` does not reinitialize the union. `UpdateMinPausedSlice(size_t slice_index)` only updates when `slice_index < min_paused_slice_index_`. With an initial value of `0`, the first and all subsequent calls with valid indices (≥ 0) will never update the minimum; across `Reset()` calls the old value is also preserved. This means `min_paused_slice_index_` can get stuck at `0` and no longer reflects the true minimum paused slice index across cores/batches, which affects `StartKey()`, `MoveNextSlice()` and `UpdateBatchEnd()` semantics. This is the same issue previously called out and it still exists in this revision.

A more robust pattern is to initialize and reset to a sentinel "no index yet" value (e.g. `std::numeric_limits<size_t>::max()`) and then relax the logic accordingly.

Suggested fix for non-continuous mode initialization/reset:

```diff
 struct SliceCoordinator
 {
     static constexpr uint16_t MaxBatchSliceCount = 512;
@@
     SliceCoordinator(bool is_continuous,
                      std::vector<std::pair<TxKey, bool>> *slices_keys)
         : is_continuous_(is_continuous), slices_keys_ptr_(slices_keys)
     {
         pinned_slices_.reserve(MaxBatchSliceCount);
         if (is_continuous_)
         {
             min_paused_key_ = TxKey();
         }
         else
         {
-            min_paused_slice_index_ = 0;
+            min_paused_slice_index_ =
+                std::numeric_limits<size_t>::max();
         }
         batch_end_slice_index_ = 0;
     }
@@
     void Reset()
     {
         first_slice_id_.Reset();
         pinned_slices_.clear();
         prepared_slice_cnt_ = 0;
         ready_for_scan_.store(false, std::memory_order_relaxed);
         batch_end_slice_index_ = 0;
         is_end_slice_ = false;
+
+        if (is_continuous_)
+        {
+            min_paused_key_ = TxKey();
+        }
+        else
+        {
+            min_paused_slice_index_ =
+                std::numeric_limits<size_t>::max();
+        }
     }
@@
     void UpdateMinPausedSlice(size_t slice_index)
     {
         assert(!export_base_table_item_);
         if (slice_index < slice_coordinator_.min_paused_slice_index_)
         {
             slice_coordinator_.min_paused_slice_index_ = slice_index;
         }
     }
```

Also applies to: 4418-4546
🧹 Nitpick comments (6)
tx_service/include/cc/template_cc_map.h (4)
5237-5305: Status handling in `pin_range_slice` could be made more explicit

The `pin_range_slice` lambda (lines 5237–5305) is generally consistent with other `PinRangeSlice` call sites, but two aspects are worth tightening:

- Line 5283 (`RangeSliceOpStatus::NotOwner`) both triggers `assert("Dead branch")` and sets `succ = true`. In non-assert builds this silently treats a "NotOwner" slice as success. If this state is truly impossible, consider using `assert(false)` without also marking success; otherwise, treat it as a real error (e.g., log and propagate `PIN_RANGE_SLICE_FAILED`) so mis-routing during failover/split doesn't get masked.
- Line 5241 hard-codes `prefetch_size = 32`. If the desired prefetch depth is already encoded elsewhere (e.g., on `req` or in a config), it would be cleaner to thread that through rather than baking in a magic constant here.
5321-5414: Guard assumptions around `StoreRangePtr` and `MoveNextSlice` semantics

The new helpers around slice preparation and coordination look structurally sound but rely on a couple of non-obvious invariants:

- Line 5323 assumes `req.StoreRangePtr()` is non-null and that `FindSlice(slice_key)` never returns `nullptr`. If there is any path where the store range has not been pinned/set before this executes, this will dereference null. Adding an `assert(req.StoreRangePtr() != nullptr);` (and optionally asserting the `FindSlice` result) would make the contract explicit.
- Lines 5336–5341 only populate `req_end_key` when `req.export_base_table_item_` is true; for the index path it remains `nullptr`, yet it is still passed to `MoveNextSlice<KeyT>(slice_end_key, req_end_key)` on line 5389. Please double-check that `MoveNextSlice` is defined to accept a `nullptr` end key (and treat it as unbounded); otherwise, this can skew batch-end calculation.
- On lines 5345–5353, `prepared_slice_cnt` is derived from `slice_coordinator_.PreparedSliceCnt()` and then incremented for each loop iteration, regardless of whether `pin_range_slice` eventually succeeds. On the `RangeSliceOpStatus::Retry` path, `pin_range_slice` enqueues the request and returns `succ = false`, while `prepared_slice_cnt` has already been incremented. Updating `PreparedSliceCnt` with this value (lines 5371–5373) means the coordinator might count a slice as "prepared" even though the pin hasn't actually been performed yet. This is subtle; please verify the coordinator's semantics so slices are neither skipped nor double-counted across retries.
5424-5488: `find_non_empty_slice` relies on strong invariants for `CurrentSlice`

The new `find_non_empty_slice` helper (lines 5426–5488) is a nice way to centralize discovery of the next non-empty slice, but it assumes:

- Line 5464: `req.CurrentSlice(shard_->core_id_)` always returns a valid `StoreSlice*` whenever `req.TheBatchEnd(...)` is false. If `CurrentSlice` can ever be null (e.g., coordinator misalignment or range metadata races), this will crash before the `TheBatchEnd` check. A defensive `assert(store_slice != nullptr);` here (and/or earlier) would document the invariant.
- Lines 5438–5453: the lambda asserts that if `search_key` is beyond `curr_start_key`, it must still be strictly less than the slice's end key derived via `req.StoreRangePtr()->FindSlice(curr_start_tx_key)`. This depends on `PausePos` and `CurrentSliceKey` always being in sync with the underlying store-range layout (no concurrent range split/move that invalidates the mapping). If range splits can occur while a data-sync scan is in progress, it's worth confirming this assertion cannot fire in production.

Functionally the control flow (advancing slices until `it != end_it` or batch end) looks correct for both empty and non-empty slices; the main concern is making the hidden assumptions explicit.
5510-5524: Scan loop pause/resume and finish conditions are subtle; consider clarifying/validating with tests

The main per-core scan loop and finish logic tie together several moving parts:

- Lines 5510–5511: initial `key_it`/`slice_end_it` are obtained from `find_non_empty_slice`, and line 5524 asserts `key_it != slice_end_it || req.TheBatchEnd(core)`. Later, after the loop, line 5636 reasserts the same invariant.
- Lines 5617–5627: when a slice is exhausted, you clear `slice_pinned`, advance with `MoveToNextSlice`, and, if not at `TheBatchEnd`, call `find_non_empty_slice` again to continue scanning the next non-empty slice.
- Lines 5653–5656: `SetFinish(core)` is triggered when any of these are true: heap full, `no_more_data` (only when `IsLastBatch()` and `key_it == slice_end_it`), `accumulated_scan_cnt_ >= scan_batch_size_`, or `TheBatchEnd(core)`.

This appears consistent, but the interaction between `no_more_data` vs `TheBatchEnd` (per-batch vs overall), `PausePos` (lines 5638–5647), and `IsDrained` checked at entry (lines 5417–5422) is intricate enough that regressions are easy to introduce.

I'd recommend:

- Adding a brief comment near the `no_more_data` computation (lines 5639–5643) explaining the distinction between "end of current batch of slices" (`TheBatchEnd`) and "end of the entire scan" (`IsLastBatch`).
- Validating with targeted tests covering: multi-batch scans over several slices (including empty slices), resume from `PausePos` mid-range, and the path where `scan_heap_is_full_` forces early exit with remaining data.

Also applies to: 5617-5636, 5653-5656
tx_service/src/cc/local_cc_shards.cpp (1)
4066-4071: Randomized 'first core' selection looks fine; consider aligning the comment and factoring out core count

The change to route `scan_cc` to a randomly chosen shard each iteration is compatible with the "first core coordinates and fans out" contract, but the comment still reads as if there were a fixed "first" core. To reduce confusion for future readers, you might explicitly mention that this core is chosen randomly now, and optionally cache `const auto core_count = cc_shards_.size();` outside the loop and reuse it for the modulo, to avoid re-reading the vector size in a hot path.

tx_service/include/cc/cc_request.h (1)
4587-4592: Update `slices_to_scan_` comment to match new "pinned slice" semantics

`slices_to_scan_`'s `bool` is now used as a per-slice state flag (set by `SliceCoordinator::SlicePinned()` and read by `IsSlicePinned()`), but the comment still says "mark if the slice need to be split", which is misleading under the new coordinator model.

Consider updating the comment (or renaming the flag) to reflect that it tracks pinned/active slices for the current batch, not "need-split" state.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- tx_service/include/cc/cc_request.h
- tx_service/include/cc/template_cc_map.h
- tx_service/src/cc/local_cc_shards.cpp
🧰 Additional context used
🧠 Learnings (6)
📓 Common learnings
Learnt from: liunyl
Repo: eloqdata/tx_service PR: 149
File: include/cc/cc_request.h:1876-1927
Timestamp: 2025-10-20T04:30:07.884Z
Learning: ScanNextBatchCc in include/cc/cc_request.h is used only for hash-partition scans; range-partition scans are handled by ScanSliceCc.
Learnt from: lzxddz
Repo: eloqdata/tx_service PR: 199
File: include/cc/local_cc_shards.h:233-234
Timestamp: 2025-11-11T07:10:40.346Z
Learning: In the LocalCcShards class in include/cc/local_cc_shards.h, the EnqueueCcRequest methods use `shard_code & 0x3FF` followed by `% cc_shards_.size()` to distribute work across processor cores for load balancing. This is intentional and separate from partition ID calculation. The 0x3FF mask creates a consistent distribution range (0-1023) before modulo by actual core count.
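A tiny sketch of the mask-then-modulo distribution described in this learning (the `0x3FF` constant is taken from the learning itself; the function name is hypothetical):

```cpp
#include <cassert>
#include <cstdint>
#include <cstddef>

// Sketch of the dispatch calculation: mask the shard code into a fixed
// 0..1023 range first, then reduce modulo the actual core count, so the
// distribution range stays consistent regardless of shard_code's width.
inline size_t DispatchCore(uint32_t shard_code, size_t core_cnt)
{
    return static_cast<size_t>(shard_code & 0x3FF) % core_cnt;
}
```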
📚 Learning: 2025-10-20T04:30:07.884Z
Learnt from: liunyl
Repo: eloqdata/tx_service PR: 149
File: include/cc/cc_request.h:1876-1927
Timestamp: 2025-10-20T04:30:07.884Z
Learning: ScanNextBatchCc in include/cc/cc_request.h is used only for hash-partition scans; range-partition scans are handled by ScanSliceCc.
Applied to files:
- tx_service/include/cc/cc_request.h
- tx_service/src/cc/local_cc_shards.cpp
- tx_service/include/cc/template_cc_map.h
📚 Learning: 2025-11-11T07:10:40.346Z
Learnt from: lzxddz
Repo: eloqdata/tx_service PR: 199
File: include/cc/local_cc_shards.h:233-234
Timestamp: 2025-11-11T07:10:40.346Z
Learning: In the LocalCcShards class in include/cc/local_cc_shards.h, the EnqueueCcRequest methods use `shard_code & 0x3FF` followed by `% cc_shards_.size()` to distribute work across processor cores for load balancing. This is intentional and separate from partition ID calculation. The 0x3FF mask creates a consistent distribution range (0-1023) before modulo by actual core count.
Applied to files:
- tx_service/include/cc/cc_request.h
- tx_service/src/cc/local_cc_shards.cpp
📚 Learning: 2025-10-21T06:46:53.700Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 149
File: src/remote/cc_stream_receiver.cpp:1066-1075
Timestamp: 2025-10-21T06:46:53.700Z
Learning: In src/remote/cc_stream_receiver.cpp, for ScanNextRequest handling, BucketIds() on RemoteScanNextBatch should never be empty—this is an expected invariant of the scan protocol.
Applied to files:
- tx_service/include/cc/cc_request.h
- tx_service/src/cc/local_cc_shards.cpp
- tx_service/include/cc/template_cc_map.h
📚 Learning: 2025-10-09T03:56:58.811Z
Learnt from: thweetkomputer
Repo: eloqdata/tx_service PR: 150
File: include/cc/local_cc_shards.h:626-631
Timestamp: 2025-10-09T03:56:58.811Z
Learning: For the LocalCcShards class in include/cc/local_cc_shards.h: Writer locks (unique_lock) should continue using the original meta_data_mux_ (std::shared_mutex) rather than fast_meta_data_mux_ (FastMetaDataMutex) at this stage. Only reader locks may use the FastMetaDataMutex wrapper.
Applied to files:
tx_service/src/cc/local_cc_shards.cpp
📚 Learning: 2025-12-02T10:43:27.431Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 254
File: tx_service/src/cc/local_cc_shards.cpp:2949-3188
Timestamp: 2025-12-02T10:43:27.431Z
Learning: In tx_service/src/cc/local_cc_shards.cpp, whenever TryPinNodeGroupData is used, only call Sharder::Instance().UnpinNodeGroupData(node_group) if the recorded term is >= 0 (i.e., pin succeeded). Example: LocalCcShards::PostProcessFlushTaskEntries guards the unpin with `if (term >= 0)`.
Applied to files:
tx_service/src/cc/local_cc_shards.cpp
🧬 Code graph analysis (1)
tx_service/src/cc/local_cc_shards.cpp (1)
tx_service/include/cc/local_cc_shards.h (1)
`EnqueueLowPriorityCcRequestToShard` (413-419)
🔇 Additional comments (1)
tx_service/include/cc/cc_request.h (1)
4055-4091: Slice list construction and per-core state look reasonable

The constructor's initialization of `slice_coordinator_`, `slices_to_scan_`, and per-core vectors (`data_sync_vec_`, `archive_vec_`, `mv_base_idx_vec_`, `pause_pos_`, `curr_slice_index_`, counters) is coherent and consistent with the new coordinator-based design; no issues from a correctness perspective here.
fix normal scan fix split scan
13803e9 to
b91e939
Compare
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tx_service/include/cc/template_cc_map.h (1)
5247-5316: Guard `RangeSliceOpStatus::NotOwner` in `pin_range_slice` before treating the slice as pinned

In `pin_range_slice`, the `NotOwner` case sets `succ = true` and returns whatever `RangeSliceId` the underlying `PinRangeSlice` call produced. A few lines later, the caller unconditionally does:

```cpp
req.slice_coordinator_.StorePinnedSlice(new_slice_id);
auto *slice =
    static_cast<const TemplateStoreSlice<KeyT> *>(new_slice_id.Slice());
const KeyT *slice_end_key = slice->EndKey();
```

(Lines 5391–5396)

If `PinRangeSlice` represents `NotOwner` with an invalid/empty `RangeSliceId` (e.g., `Slice() == nullptr`), this will be UB / a hard crash. Other call sites in this file (e.g., `ScanSliceDeltaSizeCcForRangePartition`) explicitly avoid touching `slice_id` on `NotOwner`, which suggests it's not safe to use.

Consider treating `NotOwner` as "no usable slice" and skipping pin-related work for that slice rather than flowing it through as a successfully pinned slice. One possible shape:

- Return an "empty" `RangeSliceId` from `pin_range_slice` on `NotOwner`.
- In the caller, detect `new_slice_id.Slice() == nullptr` and skip `SlicePinned`/`StorePinnedSlice` and `MoveNextSlice(slice_end_key, ...)` for that iteration.

Example tweak:
Proposed defensive handling for `NotOwner` and caller
```diff
@@
-        auto pin_range_slice =
-            [this, &req, &next_slice_func](
-                const KeyT &search_key) -> std::pair<RangeSliceId, bool>
+        auto pin_range_slice =
+            [this, &req, &next_slice_func](
+                const KeyT &search_key) -> std::pair<RangeSliceId, bool>
         {
@@
             switch (pin_status)
             {
@@
             case RangeSliceOpStatus::NotOwner:
             {
-                assert("Dead branch");
-                // The recovered cc entry does not belong to this ng
-                // anymore. This will happen if ng failover after a
-                // range split just finished but before checkpointer
-                // is able to truncate the log. In this case the log
-                // records of the data that now falls on another ng
-                // will still be replayed on the old ng on recover.
-                // Skip the cc entry and remove it at the end.
-                succ = true;
+                assert("Dead branch");
+                // Slice is no longer owned by this node-group; treat it as
+                // having no usable RangeSliceId so downstream logic can skip it.
+                slice_id = RangeSliceId{};
+                succ = true;
                 break;
             }
@@
             return {slice_id, succ};
         };
@@
-        // Execute the pinslice operation.
-        auto [new_slice_id, succ] = pin_range_slice(*start_key);
-        if (!succ)
-        {
-            req.slice_coordinator_.UpdatePreparedSliceCnt(
-                prepared_slice_cnt);
-            return false;
-        }
+        // Execute the pinslice operation.
+        auto [new_slice_id, succ] = pin_range_slice(*start_key);
+        if (!succ)
+        {
+            req.slice_coordinator_.UpdatePreparedSliceCnt(
+                prepared_slice_cnt);
+            return false;
+        }
+
+        // `NotOwner` (or similar) may yield an "empty" RangeSliceId;
+        // in that case just advance to the next slice without pinning.
+        if (new_slice_id.Slice() == nullptr)
+        {
+            req.slice_coordinator_.MoveNextSlice<KeyT>();
+            continue;
+        }
```
♻️ Duplicate comments (1)
tx_service/include/cc/cc_request.h (1)
4291-4305: Guard `FixCurrentSliceIndex` against out-of-bounds on `pinned_slices_`

`FixCurrentSliceIndex` walks `slice_coordinator_.pinned_slices_` without any bounds checks. If `pause_pos_[core_id].first` is greater than the last pinned slice's end key (or `pinned_slices_` is empty), this will index past the end of the vector and trigger undefined behaviour.

Please add an empty-check and cap the loop at `pinned_slices_.size()`.

Proposed fix
```diff
 void FixCurrentSliceIndex(uint16_t core_id)
 {
     assert(export_base_table_item_);
-    if (pause_pos_[core_id].first.KeyPtr() != nullptr)
-    {
-        size_t curr_slice_idx = 0;
-        StoreSlice *curr_slice =
-            slice_coordinator_.pinned_slices_[curr_slice_idx];
-        while (curr_slice->EndTxKey() < pause_pos_[core_id].first)
-        {
-            ++curr_slice_idx;
-            curr_slice = slice_coordinator_.pinned_slices_[curr_slice_idx];
-        }
-        curr_slice_index_[core_id] = curr_slice_idx;
-    }
+    if (pause_pos_[core_id].first.KeyPtr() == nullptr)
+    {
+        return;
+    }
+
+    auto &pinned = slice_coordinator_.pinned_slices_;
+    if (pinned.empty())
+    {
+        return;
+    }
+
+    size_t curr_slice_idx = 0;
+    StoreSlice *curr_slice = pinned[curr_slice_idx];
+    while (curr_slice_idx + 1 < pinned.size() &&
+           curr_slice->EndTxKey() < pause_pos_[core_id].first)
+    {
+        ++curr_slice_idx;
+        curr_slice = pinned[curr_slice_idx];
+    }
+    curr_slice_index_[core_id] = curr_slice_idx;
 }
```
🧹 Nitpick comments (2)
tx_service/src/cc/local_cc_shards.cpp (1)
4067-4072: Random single-core dispatch is reasonable; clarify coordinator intent and dependency

The new logic on Lines 4067–4072 to pick a random core as the "first"/coordinator core and enqueue `RangePartitionDataSyncScanCc` only there looks aligned with the coordinator comment and should work fine as long as `RangePartitionDataSyncScanCc` is truly agnostic to which shard is the coordinator.

Two small follow-ups:

Clarify naming/comment & avoid magic modulo in-place

Renaming the variable and comment to make the "coordinator" role explicit improves readability and avoids re-doing the modulo inline:

Proposed readability tweak

```cpp
uint32_t core_rand = butil::fast_rand();
// The scan slice request is dispatched to the first core. The first
// core tries to pin the slice if necessary and if succeeds, further
// dispatches the request to remaining cores for parallel scans.
EnqueueLowPriorityCcRequestToShard(core_rand % cc_shards_.size(), &scan_cc);
```

```cpp
const size_t coordinator_core_idx =
    static_cast<size_t>(butil::fast_rand()) % cc_shards_.size();
// Dispatch the scan slice request to a randomly chosen coordinator
// core. That core tries to pin the slice if necessary and, on success,
// further dispatches the request to remaining cores for parallel scans.
EnqueueLowPriorityCcRequestToShard(coordinator_core_idx, &scan_cc);
```
Verify header and coordinator assumptions
- Ensure the translation unit includes the header that declares `butil::fast_rand` explicitly (rather than relying on a transitive include), so this doesn't become fragile if upstream headers change.
- Double-check that `RangePartitionDataSyncScanCc` no longer assumes coordinator core `0` and is safe to be enqueued from any shard index, since we now randomize the coordinator.

tx_service/include/cc/template_cc_map.h (1)
5331-5424: Slice-coordinator batch preparation looks sane; consider making `StoreRangePtr` invariants explicit

The new helpers around slice preparation:

- `check_split_slice` using `TemplateStoreRange<KeyT>::FindSlice` and `PostCkptSize()` to decide which slices actually need pinning.
- The `req.slice_coordinator_` loop that advances via `StartKey`/`MoveNextSlice`, pins slices via `pin_range_slice`, stores `RangeSliceId`s, calls `UpdateBatchEnd()`, and then broadcasts the request to all cores once `SetReadyForScan()` is set.

all line up with how other parts of this file reason about range slices and post-ckpt sizes, and the control flow for `PreparedSliceCnt`, `IsEndSlice`, and `SetUnfinishedCoreCnt` is consistent with existing multi-core scan patterns.

Two small points you may want to tighten up:

- `check_split_slice` and `CurrentSliceKey`/`CurrentSlice` assume `req.StoreRangePtr()` is non-null and that the range remains pinned for the whole scan. Other code paths here make the same assumption, but it's implicit. Adding an early `assert(req.StoreRangePtr() != nullptr);` near the top of `Execute(RangePartitionDataSyncScanCc &req)` would document that contract and fail fast if a future caller violates it.
- `check_split_slice` repeatedly fetches `TemplateStoreRange<KeyT> *range_ptr` and does a `FindSlice` for every candidate slice start. If you ever see this show up in profiles, hoisting `range_ptr` out of the lambda (capturing it by reference) would avoid redundant pointer casts / lookups.

Functionally this block looks correct; these are minor robustness / clarity tweaks.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- tx_service/include/cc/cc_request.h
- tx_service/include/cc/template_cc_map.h
- tx_service/src/cc/local_cc_shards.cpp
- tx_service/src/sk_generator.cpp
🚧 Files skipped from review as they are similar to previous changes (1)
- tx_service/src/sk_generator.cpp
🧰 Additional context used
🧠 Learnings (6)
📓 Common learnings
Learnt from: liunyl
Repo: eloqdata/tx_service PR: 149
File: include/cc/cc_request.h:1876-1927
Timestamp: 2025-10-20T04:30:07.884Z
Learning: ScanNextBatchCc in include/cc/cc_request.h is used only for hash-partition scans; range-partition scans are handled by ScanSliceCc.
📚 Learning: 2025-10-20T04:30:07.884Z
Learnt from: liunyl
Repo: eloqdata/tx_service PR: 149
File: include/cc/cc_request.h:1876-1927
Timestamp: 2025-10-20T04:30:07.884Z
Learning: ScanNextBatchCc in include/cc/cc_request.h is used only for hash-partition scans; range-partition scans are handled by ScanSliceCc.
Applied to files:
- tx_service/include/cc/template_cc_map.h
- tx_service/src/cc/local_cc_shards.cpp
- tx_service/include/cc/cc_request.h
📚 Learning: 2025-12-02T10:43:27.431Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 254
File: tx_service/src/cc/local_cc_shards.cpp:2949-3188
Timestamp: 2025-12-02T10:43:27.431Z
Learning: In tx_service/src/cc/local_cc_shards.cpp, whenever TryPinNodeGroupData is used, only call Sharder::Instance().UnpinNodeGroupData(node_group) if the recorded term is >= 0 (i.e., pin succeeded). Example: LocalCcShards::PostProcessFlushTaskEntries guards the unpin with `if (term >= 0)`.
Applied to files:
- tx_service/include/cc/template_cc_map.h
- tx_service/src/cc/local_cc_shards.cpp
📚 Learning: 2025-11-11T07:10:40.346Z
Learnt from: lzxddz
Repo: eloqdata/tx_service PR: 199
File: include/cc/local_cc_shards.h:233-234
Timestamp: 2025-11-11T07:10:40.346Z
Learning: In the LocalCcShards class in include/cc/local_cc_shards.h, the EnqueueCcRequest methods use `shard_code & 0x3FF` followed by `% cc_shards_.size()` to distribute work across processor cores for load balancing. This is intentional and separate from partition ID calculation. The 0x3FF mask creates a consistent distribution range (0-1023) before modulo by actual core count.
Applied to files:
- tx_service/include/cc/template_cc_map.h
- tx_service/src/cc/local_cc_shards.cpp
- tx_service/include/cc/cc_request.h
📚 Learning: 2025-10-21T06:46:53.700Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 149
File: src/remote/cc_stream_receiver.cpp:1066-1075
Timestamp: 2025-10-21T06:46:53.700Z
Learning: In src/remote/cc_stream_receiver.cpp, for ScanNextRequest handling, BucketIds() on RemoteScanNextBatch should never be empty—this is an expected invariant of the scan protocol.
Applied to files:
- tx_service/include/cc/template_cc_map.h
- tx_service/src/cc/local_cc_shards.cpp
- tx_service/include/cc/cc_request.h
📚 Learning: 2025-10-09T03:56:58.811Z
Learnt from: thweetkomputer
Repo: eloqdata/tx_service PR: 150
File: include/cc/local_cc_shards.h:626-631
Timestamp: 2025-10-09T03:56:58.811Z
Learning: For the LocalCcShards class in include/cc/local_cc_shards.h: Writer locks (unique_lock) should continue using the original meta_data_mux_ (std::shared_mutex) rather than fast_meta_data_mux_ (FastMetaDataMutex) at this stage. Only reader locks may use the FastMetaDataMutex wrapper.
Applied to files:
tx_service/src/cc/local_cc_shards.cpp
🧬 Code graph analysis (1)
tx_service/include/cc/template_cc_map.h (1)
tx_service/include/range_record.h (1)
- `` (338-814)
🔇 Additional comments (2)
tx_service/include/cc/template_cc_map.h (1)
5434-5671: Non-empty slice discovery and per-core scan loop are consistent, but rely on coordinator invariants

The new per-core scanning path:

- Uses `req.PausePos(core_id)` to resume from a per-core pause key and `find_non_empty_slice` to advance `CurrentSlice(core_id)` until it finds a slice whose `[start, end)` actually has keys in `TemplateCcMap`.
- Derives `(key_it, slice_end_it, slice_end_key)` with `deduce_iterator`, reusing the same boundary semantics used elsewhere (start at first key ≥ slice start, stop at first key ≥ slice end).
- Maintains `slice_pinned = req.IsSlicePinned(core_id)` and `export_persisted_key_only = !export_base_table_item_ && slice_pinned`, so that `need_export = (commit_ts <= data_sync_ts) && (slice_pinned || cce->NeedCkpt())` behaves per the comment: split/pinned slices export all persisted keys, unsplit slices only export entries that still need ckpt.
- On each slice boundary (`key_it == slice_end_it`), calls `MoveToNextSlice(core_id)` and recomputes `(key_it, slice_end_it, slice_end_key)` via `find_non_empty_slice`, while resetting `slice_pinned`/`export_persisted_key_only` for the new slice.
- Terminates a batch when either memory is full, `scan_batch_size_` is reached, or `req.TheBatchEnd(core_id)` is hit, and only marks `no_more_data` when `key_it == slice_end_it && req.IsLastBatch()`.

This is coherent with the rest of the scan/datasync machinery in this file, but it assumes:

- `CurrentSlice(core_id)` and `CurrentSliceKey(core_id)` have been correctly initialised by the coordinator before any core reaches this block, and
- `TheBatchEnd(core_id)`/`IsLastBatch()` reflect the same batch/slice set that was used to prepare and pin slices in the earlier coordinator block.

Given how subtle this state machine is, it would be good to exercise it with tests that cover:

- Batches where all slices in the batch are empty on a core (so `find_non_empty_slice` walks to `TheBatchEnd` and you exit the main loop immediately).
- Resume-from-pause scenarios (`PausePos` non-empty) crossing slice boundaries.
- Mixes of pinned and unpinned slices, ensuring `export_persisted_key_only` toggles as intended and `no_more_data` only flips true on the final batch.

From the perspective of this file the logic is internally consistent; the remaining risk is ensuring `RangePartitionDataSyncScanCc`'s coordinator invariants hold across the whole pipeline.

tx_service/include/cc/cc_request.h (1)
4165-4196: The lifecycle management of `unfinished_cnt_` is actually safe and does not require changes.

The concern about premature completion and negative counters cannot occur in practice:

- Reset() flow is correct: When `Reset()` sets `unfinished_cnt_ = 1` and resets `slice_coordinator_.ready_for_scan_ = false`, the next invocation always goes through `Execute()`.
- Execute() always calls SetUnfinishedCoreCnt() before dispatch: The code checks `!req.slice_coordinator_.IsReadyForScan()` and only proceeds to multi-core dispatch after calling `SetUnfinishedCoreCnt(shard_->core_cnt_)` at line 5414 in template_cc_map.h. There is no code path that dispatches work to multiple cores without first updating the counter.
- All reuse paths verified:
  - `local_cc_shards.cpp` (4 Reset calls): All follow the pattern of Reset → Wait → EnqueueLowPriorityCcRequestToShard → Execute
  - `sk_generator.cpp` (2 Reset calls): Same pattern
  - Early returns in Execute (schema mismatch, pin failure) call `SetError()` and exit without reaching dispatch code
- SetFinish/SetError are safe: Both methods decrement `unfinished_cnt_` and properly check for zero before unpinning and notifying.

The design is correct but somewhat implicit; it relies on `Execute()` being the sole entry point for dispatch and always calling `SetUnfinishedCoreCnt()` when `ready_for_scan_` is false. While the suggestion to move this responsibility into `Reset()` would improve clarity and make the API more robust against future misuse, the current implementation does not have the bug described.

Likely an incorrect or invalid review comment.
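The Reset → SetUnfinishedCoreCnt → per-core SetFinish contract verified above is essentially a countdown latch; a minimal sketch (hypothetical type and method names, not the actual request class):

```cpp
#include <cassert>
#include <atomic>
#include <cstdint>

// Countdown-latch sketch of the unfinished-core bookkeeping: Reset() arms
// the latch at 1 so a waiter cannot pass early; the dispatching core
// widens it to the real core count before fanning out; each core then
// decrements exactly once.
struct CoreLatch
{
    std::atomic<int32_t> unfinished{1};

    void Reset() { unfinished.store(1); }

    // Must be called by the coordinator before multi-core dispatch.
    void SetUnfinishedCoreCnt(int32_t core_cnt) { unfinished.store(core_cnt); }

    // Returns true for exactly one caller: the one that finishes last and
    // should therefore unpin slices and notify the waiter.
    bool SetFinish() { return unfinished.fetch_sub(1) == 1; }
};
```

The implicit dependency called out above maps to: skipping `SetUnfinishedCoreCnt` before fan-out would let the first finisher "win" the latch while other cores are still running.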
```cpp
// Execute the pinslice operation.
auto [new_slice_id, succ] = pin_range_slice(*start_key);
if (!succ)
{
```
make sure to yield during the slice pin. This could take a while if you need to pin 512 slices
If a load slice operation occurs, it will automatically yield.
```diff
@@ -5254,7 +5245,7 @@ class TemplateCcMap : public CcMap
         }

         auto pin_range_slice =
```
add a comprehensive comment at the beginning of Execute() to explain the workflow of this request. This request is too complicated and hard to follow.
```cpp
for (; prepared_slice_cnt <
           RangePartitionDataSyncScanCc::SliceCoordinator::
               MaxBatchSliceCount &&
       !req.slice_coordinator_.IsEndSlice();
```
can we have an estimate on how many slices we need to pin?
Actionable comments posted: 1
♻️ Duplicate comments (3)
tx_service/include/cc/cc_request.h (2)
4353-4394: SliceCoordinator pause tracking and `slices_to_scan_` flag semantics are still fragile

Two related points here, echoing earlier feedback:

**`min_paused_key_` and `min_paused_slice_index_` initialization vs update rules**

- Continuous mode initializes `min_paused_key_` to a default-constructed `TxKey`, and `UpdateMinPausedSlice(const TxKey *key)` only updates when `*key < min_paused_key_`. If the default key is not explicitly treated as "+∞" by `operator<`, the first real paused key may never be recorded, leaving `min_paused_key_` at its default and `StartKey<KeyT>()` returning `nullptr`.
- Non-continuous mode initializes `min_paused_slice_index_` to `0`, and `UpdateMinPausedSlice(size_t)` only updates when `slice_index < min_paused_slice_index_`. Without a separate "uninitialized" sentinel, a first paused index of `> 0` won't ever be reflected in `min_paused_slice_index_`, which is then used by `StartKey<KeyT>()` and `UpdateBatchEnd()`.

If the "prepare slices" phase guarantees these fields are pre-initialized to correct minima before any `UpdateMinPausedSlice` calls, documenting that invariant (and/or adding assertions around it) would make this much less brittle. Otherwise, consider explicit uninitialized handling (e.g., checking `min_paused_key_.KeyPtr() == nullptr` before comparing, or using a dedicated sentinel for the index case).

**`slices_to_scan_` comment doesn't match actual usage**

`slices_to_scan_` is documented as "bool is used to mark if the slice need to be split", but in this struct it's manipulated purely as a "pinned or not" flag (`SliceCoordinator::SlicePinned()` and `IsSlicePinned()`). If "need to split" is no longer the intended meaning here, the comment should be updated to avoid confusion for future readers.

Also applies to: 4405-4533, 4575-4579
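One way to realize the "dedicated sentinel" suggestion above for the index case (a sketch with hypothetical names, not the actual SliceCoordinator):

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Sketch: track the minimum paused slice index with an explicit
// "uninitialized" sentinel instead of relying on 0 as the initial value,
// so a first paused index > 0 is still recorded.
struct MinPausedIndex
{
    static constexpr size_t kUnset = std::numeric_limits<size_t>::max();
    size_t min_index = kUnset;

    void Update(size_t slice_index)
    {
        if (min_index == kUnset || slice_index < min_index)
        {
            min_index = slice_index;
        }
    }

    bool HasValue() const { return min_index != kUnset; }
};
```

With a 0-initialized minimum and a strict `<` comparison, a first call with index 5 would be silently dropped; with the sentinel, it is recorded.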
4291-4305: `FixCurrentSliceIndex` can walk past `pinned_slices_` bounds

The while-loop advances `curr_slice_idx` over `slice_coordinator_.pinned_slices_` without checking `curr_slice_idx` against `pinned_slices_.size()`. If `pause_pos_[core_id].first` is greater than the end key of the last pinned slice, this will read past the vector boundary and invoke UB.

Also, if `pinned_slices_` is ever empty, directly indexing `[0]` is immediately invalid.

Consider hardening this as follows:

- Early-out if `pinned_slices_` is empty.
- Bound the loop using `curr_slice_idx + 1 < pinned_slices_.size()` and stop at the last slice even if `pause_pos_` lies beyond its end.

For example:
curr_slice_idx + 1 < pinned_slices_.size()and stop at the last slice even ifpause_pos_lies beyond its end.For example:
Proposed fix
void FixCurrentSliceIndex(uint16_t core_id) { assert(export_base_table_item_); if (pause_pos_[core_id].first.KeyPtr() != nullptr && - !slice_coordinator_.pinned_slices_.empty()) + !slice_coordinator_.pinned_slices_.empty()) { size_t curr_slice_idx = 0; - StoreSlice *curr_slice = - slice_coordinator_.pinned_slices_[curr_slice_idx]; - while (curr_slice->EndTxKey() < pause_pos_[core_id].first) - { - ++curr_slice_idx; - curr_slice = slice_coordinator_.pinned_slices_[curr_slice_idx]; - } + StoreSlice *curr_slice = + slice_coordinator_.pinned_slices_[curr_slice_idx]; + while (curr_slice_idx + 1 < slice_coordinator_.pinned_slices_.size() && + curr_slice->EndTxKey() < pause_pos_[core_id].first) + { + ++curr_slice_idx; + curr_slice = slice_coordinator_.pinned_slices_[curr_slice_idx]; + } curr_slice_index_[core_id] = curr_slice_idx; } }tx_service/include/cc/template_cc_map.h (1)
5332-5510: Slice pinning/preparation phase: behavior looks sound; consider tightening a couple of edge cases

Overall the new `pin_range_slice` + coordinator-driven preparation logic is consistent with the two-phase design and correctly handles `Retry` vs `BlockedOnLoad` vs hard errors, but a few details are worth tightening:

- `check_split_slice` assumes `req.StoreRangePtr()` is non-null and that `FindSlice(slice_key)` always succeeds. That's true for the normal checkpoint path, but if this request is ever constructed without a pinned `TemplateStoreRange<KeyT>` the `static_cast` and subsequent dereferences will UB. Adding an `assert(req.StoreRangePtr() != nullptr);` before the cast would make this assumption explicit and safer.
- In `pin_range_slice`, the `RangeSliceOpStatus::NotOwner` branch is marked as "Dead branch" but still sets `succ = true`. That means the caller will treat the operation as successful and push the returned `RangeSliceId` into `slice_coordinator_.pinned_slices_`, even though ownership is wrong. If this branch ever fires in release builds (e.g., a rare failover window), that `RangeSliceId` may be invalid for later `CurrentSlice()`/`LastPinnedSlice()` usages. It would be safer either to:
  - Treat `NotOwner` like an error (set `PIN_RANGE_SLICE_FAILED` and `succ = false`), or
  - Return `succ = false` and ensure the caller skips storing the slice id in that case.
- The preparation loop pins up to `MaxBatchSliceCount` slices in one go. With `prefetch_size = 32` this is probably fine, but if `MaxBatchSliceCount` is ever raised again, you might want to periodically yield (e.g., via `EnqueueLowPriorityCcRequest`) instead of doing all pins in one tight loop on the TxProcessor thread.

Functionally this block looks correct; these changes would mostly harden edge cases and clarify intent.
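The "periodically yield" idea for long pin loops can be sketched as incremental progress plus re-enqueue (hypothetical shapes; in the real code the Cc request itself would be re-enqueued to the shard queue):

```cpp
#include <cassert>
#include <cstddef>

// Sketch of cooperative yielding in a long pin loop: instead of pinning
// all slices in one pass, stop after kYieldAfter pins and report that the
// caller should re-enqueue the request to continue later. `pinned` carries
// progress across invocations.
constexpr size_t kYieldAfter = 32;  // hypothetical per-invocation budget

// Returns true when all `target` slices are pinned; false means
// "made partial progress, re-enqueue and call again".
bool PinSlicesStep(size_t target, size_t &pinned)
{
    size_t budget = kYieldAfter;
    while (pinned < target && budget > 0)
    {
        ++pinned;  // stands in for the actual PinRangeSlice call
        --budget;
    }
    return pinned == target;
}
```

This keeps any single TxProcessor pass bounded even if `MaxBatchSliceCount` grows, at the cost of a few extra queue round-trips.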
🧹 Nitpick comments (4)
tx_service/include/cc/cc_request.h (3)
4042-4078: Constructor: guard `old_slices_delta_size` and clarify `slices_to_scan_` bool semantics

The wiring of `slice_coordinator_` and per-core `curr_slice_index_` looks coherent. However:

- In the `!export_base_table_item_` branch you unconditionally dereference `old_slices_delta_size`; given it defaults to `nullptr`, it's worth asserting (or otherwise enforcing) that it's non-null on this path to avoid accidental UB later.
- `slices_to_scan_` is now used downstream as a `(TxKey, pinned)` container; consider renaming or at least documenting the `.second` semantic here to keep it in sync with how `IsSlicePinned` and `SliceCoordinator::SlicePinned` actually use it.
4167-4196: Reset now relies on `SetUnfinishedCoreCnt` – verify all call sites

`Reset()` sets `unfinished_cnt_ = 1` and then expects `SetUnfinishedCoreCnt(core_cnt)` to be called before the next multi-core scan. That's a behavioral change from initializing to `core_cnt_` directly.

If any path forgets to call `SetUnfinishedCoreCnt`, `Wait()` could either unblock prematurely or block forever. Please double-check the TemplateCcMap callers and consider a brief comment here documenting the required call order.

Resetting `curr_slice_index_` (for `export_base_table_item_`) and `slice_coordinator_.Reset()` is otherwise consistent.
4308-4346: Accessor helpers over `curr_slice_index_` look correct; consider asserting `store_range_`

The new helpers (`CurrentSlice`, `CurrentSliceKey`, `MoveToNextSlice`, `TheBatchEnd`, `IsSlicePinned`) are consistent with the design:

- In continuous mode they index into `slice_coordinator_.pinned_slices_`.
- In non-continuous mode they use `slices_to_scan_` and `batch_end_slice_index_`.
- `IsSlicePinned` is effectively "always true" in continuous mode, which matches all-pinned semantics.

One small robustness improvement: in `CurrentSlice`'s non-continuous branch, it would be safer to assert `store_range_ != nullptr` before calling `store_range_->FindSlice(...)`, since a null `store_range_` would currently crash.

tx_service/include/cc/template_cc_map.h (1)
5512-5752: Scan loop, pause/resume, and termination semantics: mostly solid; add a couple of safety checks

The reworked per-core scan logic around `PausePos`, `find_non_empty_slice`, and the termination condition is coherent and matches the new coordinator model, but a few points deserve attention:

- In `find_non_empty_slice`, the non-migration path calls `req.CurrentSliceKey(shard_->core_id_)` and immediately does `curr_start_tx_key.GetKey<KeyT>()` and dereferences it. This assumes the per-core current-slice key is always initialized and non-null before the first scan on that core. If there is any path where `curr_slice_index_[core_id]` is unset, this becomes undefined behavior. A defensive `assert(curr_start_tx_key.KeyPtr() != nullptr);` (or early return) would make the invariant explicit.
- The `assert(key_it != slice_end_it || req.TheBatchEnd(shard_->core_id_));` after the first `find_non_empty_slice` call is good, but if all slices in a batch are empty and it is not the last batch, `TheBatchEnd()` must reliably return true or you'll hit this assert. It would be useful to double-check that `slice_coordinator_.MoveNextSlice`/`UpdateBatchEnd` guarantee `TheBatchEnd()` in that case.
- The termination condition:

```cpp
bool no_more_data = (key_it == slice_end_it) && req.IsLastBatch();
...
if (is_scan_mem_full || no_more_data ||
    req.accumulated_scan_cnt_[core] >= req.scan_batch_size_ ||
    req.TheBatchEnd(core))
{
    req.SetFinish(core);
    return false;
}
else
{
    shard_->EnqueueLowPriorityCcRequest(&req);
    return false;
}
```

  implies that `SetFinish(core_id_)` is called at most once per core per `RangePartitionDataSyncScanCc`, and any further progress for that core must come from a new request. That matches the usual pattern for these Cc requests, but it puts a lot of meaning into `IsLastBatch()` and `TheBatchEnd()`. Please verify that higher-level driver code never re-invokes this same request on a core after `SetFinish(core_id_)` has been called, otherwise `unfinished_cnt_` book-keeping could underflow.

These are mostly invariants and safety checks; the core pause/resume and slice-advance logic reads correctly.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- tx_service/include/cc/cc_request.h
- tx_service/include/cc/template_cc_map.h
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: liunyl
Repo: eloqdata/tx_service PR: 149
File: include/cc/cc_request.h:1876-1927
Timestamp: 2025-10-20T04:30:07.884Z
Learning: ScanNextBatchCc in include/cc/cc_request.h is used only for hash-partition scans; range-partition scans are handled by ScanSliceCc.
📚 Learning: 2025-10-20T04:30:07.884Z
Learnt from: liunyl
Repo: eloqdata/tx_service PR: 149
File: include/cc/cc_request.h:1876-1927
Timestamp: 2025-10-20T04:30:07.884Z
Learning: ScanNextBatchCc in include/cc/cc_request.h is used only for hash-partition scans; range-partition scans are handled by ScanSliceCc.
Applied to files:
- tx_service/include/cc/template_cc_map.h
- tx_service/include/cc/cc_request.h
📚 Learning: 2025-11-11T07:10:40.346Z
Learnt from: lzxddz
Repo: eloqdata/tx_service PR: 199
File: include/cc/local_cc_shards.h:233-234
Timestamp: 2025-11-11T07:10:40.346Z
Learning: In the LocalCcShards class in include/cc/local_cc_shards.h, the EnqueueCcRequest methods use `shard_code & 0x3FF` followed by `% cc_shards_.size()` to distribute work across processor cores for load balancing. This is intentional and separate from partition ID calculation. The 0x3FF mask creates a consistent distribution range (0-1023) before modulo by actual core count.
Applied to files:
- tx_service/include/cc/template_cc_map.h
- tx_service/include/cc/cc_request.h
📚 Learning: 2025-12-02T10:43:27.431Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 254
File: tx_service/src/cc/local_cc_shards.cpp:2949-3188
Timestamp: 2025-12-02T10:43:27.431Z
Learning: In tx_service/src/cc/local_cc_shards.cpp, whenever TryPinNodeGroupData is used, only call Sharder::Instance().UnpinNodeGroupData(node_group) if the recorded term is >= 0 (i.e., pin succeeded). Example: LocalCcShards::PostProcessFlushTaskEntries guards the unpin with `if (term >= 0)`.
Applied to files:
tx_service/include/cc/template_cc_map.h
📚 Learning: 2025-10-21T06:46:53.700Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 149
File: src/remote/cc_stream_receiver.cpp:1066-1075
Timestamp: 2025-10-21T06:46:53.700Z
Learning: In src/remote/cc_stream_receiver.cpp, for ScanNextRequest handling, BucketIds() on RemoteScanNextBatch should never be empty—this is an expected invariant of the scan protocol.
Applied to files:
tx_service/include/cc/template_cc_map.h
tx_service/include/cc/cc_request.h
🧬 Code graph analysis (1)
tx_service/include/cc/template_cc_map.h (2)
tx_service/include/range_record.h (2)
KeyT (646-649)
`` (338-814)
tx_service/include/cc/ccm_scanner.h (1)
Iterator(678-704)
🔇 Additional comments (1)
tx_service/include/cc/cc_request.h (1)
4198-4214: Good centralization of finish/error handling; confirm paused-slice bookkeeping invariants

Routing both error and normal completion through `SetError`/`SetFinish` and unconditionally calling `UnpinSlices()` when `unfinished_cnt_` reaches 0 is a solid improvement and should prevent leaked pins across batches.

The new `SetFinish(size_t core_id)` logic also updates the minimal paused position:

- Continuous (`export_base_table_item_`) mode: uses `pause_pos_[core_id].first` via `UpdateMinPausedSlice(const TxKey*)`.
- Non-continuous mode: uses `curr_slice_index_[core_id]` via `UpdateMinPausedSlice(size_t)`.

This assumes that, at the moment `SetFinish` is called, each core's `pause_pos_`/`curr_slice_index_` is already pointing at "the first unscanned slice/key" and that every participating core calls `SetFinish` exactly once. Please confirm those invariants in the TemplateCcMap paths, or add assertions/comments here to make the contract explicit.

Also applies to: 4228-4246
/*
 * RangePartitionDataSyncScanCc Workflow:
 * ======================================
 *
 * This function implements a two-phase execution model for
 * range-partitioned table data synchronization scanning with two distinct
 * execution modes:
 *
 * Execution Modes:
 * ----------------
 * 1. Checkpoint Mode (export_base_table_item_ = false):
 *    - Used during normal checkpoint operations
 *    - Exports only dirty data (keys that need checkpointing)
 *    - The `req.slices_to_scan_` vector determines which slices need to be
 *      scanned.
 *    - Only pins slices that require splitting(PostCkptSize <= upper_bound)
 *
 * 2. Data Migration Mode (export_base_table_item_ = true):
 *    - Used during range split operations or index creation
 *    - Exports ALL primary key data in the specified range
 *    - Processes slices sequentially (idx + 1)
 *    - Pins all slices in the target range for complete data export
 *
 * Two-Phase Execution:
 * --------------------
 * Phase 1 - Slice Preparation (Single Core):
 *   - Executed by the initial core that receives the request
 *   - Prepares up to MaxBatchSliceCount (128) slices for scanning
 *   - For each slice:
 *     * In checkpoint mode: Checks if slice needs splitting (PostCkptSize >
 *       upper_bound)
 *       - If not, skips pinning and only scans dirty keys
 *       - If yes, pins the slice for complete export
 *     * In migration mode: Always pins the slice
 *   - Stores pinned slice IDs in `slice_coordinator_.pinned_slices_`
 *   - Sets ready_for_scan flag when preparation completes
 *   - Distributes the request to all other cores for parallel execution
 *
 * Phase 2 - Parallel Data Export (All Cores):
 *   - All cores (including the initial one) execute in parallel
 *   - Each core maintains its own:
 *     * Current slice index (curr_slice_index_[core_id])
 *     * Pause position (pause_pos_[core_id])
 *     * Accumulated scan/flush counters
 *   - For each core's execution:
 *     1. Finds the first non-empty slice to scan (from pause position or
 *        start key)
 *     2. Iterates through keys in the slice using ForwardScanStart()
 *     3. For each key:
 *        - Checks if export is needed (CommitTs <= data_sync_ts_)
 *        - In checkpoint mode: Also checks if slice is pinned or
 *          cce->NeedCkpt()
 *        - Calls ExportForCkpt() to export the key/record data
 *        - Handles MVCC archive record recycling if enabled
 *     4. Processes up to DataSyncScanBatchSize (32) pages per round
 *     5. Stops when:
 *        - Reaches scan_batch_size_ limit
 *        - Scan heap is full
 *        - Reaches end of current batch slices
 *        - All data is drained (IsLastBatch())
 *     6. Updates pause position for resume on next execution
 *     7. Re-enqueues itself if more work remains
 *
 * Key Data Structures:
 * --------------------
 * - slice_coordinator_: Coordinates slice preparation and tracks pinned
 *   slices
 * - pause_pos_[core_id]: Tracks where each core paused for resumption
 * - data_sync_vec_[core_id]: Accumulates flush records per core
 * - archive_vec_[core_id]: Accumulates archive records for MVCC cleanup
 *
 * Termination:
 * ------------
 * - When all cores finish (unfinished_cnt_ == 0):
 *   * All pinned slices are unpinned
 *   * Condition variable is notified to wake waiting threads
 * - Each core calls SetFinish() when its portion is complete
 *
 * Error Handling:
 * ---------------
 * - Schema version mismatches: Returns false with
 *   REQUESTED_TABLE_SCHEMA_MISMATCH
 * - Pin slice failures: Returns false, request may be retried
 * - Heap full: Sets scan_heap_is_full_ flag, pauses and waits for cleanup
 */
Fix minor inaccuracies in the RangePartitionDataSyncScanCc workflow comment
The new high-level comment is very helpful, but there are a couple of inaccuracies that can be misleading when debugging this path:
- In the "Checkpoint Mode" section you say "Only pins slices that require splitting (PostCkptSize <= upper_bound)", while the code in `next_slice_func`/`check_split_slice` pins only when `PostCkptSize() > StoreSlice::slice_upper_bound`. The inequality sign should be corrected to match the implementation.
- Under "Error Handling", the bullet "Pin slice failures: Returns false, request may be retried" doesn't distinguish between retriable statuses (e.g. `Retry`, `BlockedOnLoad`) and hard failures where you set `PIN_RANGE_SLICE_FAILED` and will not retry. Clarifying that here would make the behavior easier to reason about.
Updating these lines to reflect the actual conditions will keep the documentation aligned with the execution semantics.
🤖 Prompt for AI Agents
In tx_service/include/cc/template_cc_map.h around lines 5186 to 5270, update the
RangePartitionDataSyncScanCc workflow comment: change the Checkpoint Mode
pinning condition to match the code (slice is pinned when PostCkptSize() >
StoreSlice::slice_upper_bound, not <=), and clarify Error Handling to
distinguish retriable pin failures (e.g. Retry, BlockedOnLoad) from
non-retriable failures that set PIN_RANGE_SLICE_FAILED and will not retry; make
both edits concise so the prose matches the implementation semantics.
Here are some reminders before you submit the pull request
- fixes eloqdb/tx_service#issue_id
- ./mtr --suite=mono_main,mono_multi,mono_basic

Summary by CodeRabbit
Refactor
Chores
✏️ Tip: You can customize this high-level summary in your review settings.