add schema_version check for delta scan#219
Conversation
WalkthroughAdds a uint64_t schema_version_ field and SchemaVersion() accessors to several CC request structs, propagates schema_version into call sites, and changes TemplateCcMap to detect requested/in-memory schema mismatches, set REQUESTED_TABLE_SCHEMA_MISMATCH, and return without re-enqueuing. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Local as LocalCcShards
participant Req as CC Request (has schema_version_)
participant Map as TemplateCcMap
Note over Local,Req: Request construction now includes schema_version
Local->>Req: construct(..., schema_version)
Local->>Map: submit(req)
Map->>Map: if req.schema_version_ == 0 or equals table_schema_.Version()
alt match or zero
Map->>Map: proceed with scan/data-sync
Map-->>Local: enqueue/continue
else mismatch
Map->>Map: assert req.schema_version_ > table_schema_.Version()
Map-->>Req: SetError(REQUESTED_TABLE_SCHEMA_MISMATCH)
Map-->>Local: return false (no retry)
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (4)
include/cc/cc_request.h (3)
8492-8516: Schema version plumbed into range-partition delta-scan ctor — add accessor for consistency.Ctor/initializers look good. To align with other requests and avoid relying solely on friend access, expose SchemaVersion() here as well.
Apply inside struct ScanSliceDeltaSizeCcForRangePartition:
@@ struct ScanSliceDeltaSizeCcForRangePartition : public CcRequestBase { @@ void SetHasDmlSinceDdl() { bool expect = false; has_dml_since_ddl_.compare_exchange_strong( expect, true, std::memory_order_acq_rel); } @@ + // Provide uniform access like other requests. + uint64_t SchemaVersion() const override { return schema_version_; }Also consider adding a small unit/integration test that verifies a mismatched schema_version triggers the intended guard in TemplateCcMap for this request.
Please confirm all call sites were updated to pass schema_version and that TemplateCcMap checks it for this request.
8723-8732: Minimize friend exposure; prefer accessor if possible.You added a private member schema_version_ and friended TemplateCcMap. If you add SchemaVersion() as suggested, you likely can drop or narrow the friend dependency unless TemplateCcMap needs other internals. If friend is still required, keep it; otherwise, remove to reduce coupling.
8899-8915: Private schema_version_ + friend added — OK; consider const and/or accessor.schema_version_ addition is fine. If it never changes post-ctor, making it const can communicate immutability. With an accessor in place, you may also be able to trim friend scope unless TemplateCcMap needs more than the version.
include/cc/template_cc_map.h (1)
8149-8160: Hash delta-size scan version check OK; align log/comment and table name usage
- Consistent with range variant—good.
- Same nits: message should say "delta-size scan"; fix "util"; and log via class table_name_ for consistency.
Suggested diff:
- LOG(WARNING) << "Table schema version mismatched for data sync " - "scan, table name: " - << req.table_name_.String() + LOG(WARNING) << "Table schema version mismatched for delta-size scan, table name: " + << table_name_.String() << " ,scan carried version: " << req.schema_version_ << " ,ccmap version: " << table_schema_->Version(); - // yield util version matched + // yield until version matches
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
include/cc/cc_request.h(6 hunks)include/cc/template_cc_map.h(2 hunks)src/cc/local_cc_shards.cpp(2 hunks)
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-10-21T06:46:53.700Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 149
File: src/remote/cc_stream_receiver.cpp:1066-1075
Timestamp: 2025-10-21T06:46:53.700Z
Learning: In src/remote/cc_stream_receiver.cpp, for ScanNextRequest handling, BucketIds() on RemoteScanNextBatch should never be empty—this is an expected invariant of the scan protocol.
Applied to files:
include/cc/template_cc_map.hinclude/cc/cc_request.h
📚 Learning: 2025-10-20T04:30:07.884Z
Learnt from: liunyl
Repo: eloqdata/tx_service PR: 149
File: include/cc/cc_request.h:1876-1927
Timestamp: 2025-10-20T04:30:07.884Z
Learning: ScanNextBatchCc in include/cc/cc_request.h is used only for hash-partition scans; range-partition scans are handled by ScanSliceCc.
Applied to files:
include/cc/template_cc_map.hinclude/cc/cc_request.hsrc/cc/local_cc_shards.cpp
📚 Learning: 2025-11-11T07:10:40.346Z
Learnt from: lzxddz
Repo: eloqdata/tx_service PR: 199
File: include/cc/local_cc_shards.h:233-234
Timestamp: 2025-11-11T07:10:40.346Z
Learning: In the LocalCcShards class in include/cc/local_cc_shards.h, the EnqueueCcRequest methods use `shard_code & 0x3FF` followed by `% cc_shards_.size()` to distribute work across processor cores for load balancing. This is intentional and separate from partition ID calculation. The 0x3FF mask creates a consistent distribution range (0-1023) before modulo by actual core count.
Applied to files:
include/cc/cc_request.hsrc/cc/local_cc_shards.cpp
📚 Learning: 2025-10-09T03:56:58.811Z
Learnt from: thweetkomputer
Repo: eloqdata/tx_service PR: 150
File: include/cc/local_cc_shards.h:626-631
Timestamp: 2025-10-09T03:56:58.811Z
Learning: For the LocalCcShards class in include/cc/local_cc_shards.h: Writer locks (unique_lock) should continue using the original meta_data_mux_ (std::shared_mutex) rather than fast_meta_data_mux_ (FastMetaDataMutex) at this stage. Only reader locks may use the FastMetaDataMutex wrapper.
Applied to files:
src/cc/local_cc_shards.cpp
🧬 Code graph analysis (1)
include/cc/cc_request.h (3)
include/cc/template_cc_map.h (3)
bool(11198-11205)RangePartitioned(11585-11588)TemplateCcMap(164-168)include/tx_request.h (1)
schema_version_(583-583)include/remote/remote_cc_request.h (1)
schema_version_(450-450)
🔇 Additional comments (2)
src/cc/local_cc_shards.cpp (2)
3505-3506: Parameter order is correct; verify caller properly handles async re-enqueue semantics during version mismatch.The constructor signature at line 8492 confirms
is_dirtyprecedesschema_version✓, and version-checking logic exists in template_cc_map.h. However, when a schema-version mismatch is detected during the data-sync scan, the request is re-enqueued (shard_->Enqueue(&req)) rather than setting an error. This meansslices_delta_sizeremains empty while awaiting schema alignment—not a silent success but an intentional wait-and-retry.Verify that the caller (lines 3543–3572) correctly handles the async retry semantics: specifically, confirm that on empty results during a version-mismatch scenario, the code does not prematurely commit or advance
last_data_sync_tsbefore the re-enqueued scan eventually completes.
4353-4355: Arg order confirmed correct; schema version mismatch handling requires clarification.The argument order in the hash path is correct:
TxNumber()andcatalog_rec.Schema()->Version()map to thetxnandschema_versionconstructor parameters.However, the schema version mismatch behavior is as flagged: in
template_cc_map.h(line 8149–8156), when the schema version mismatches, the code re-queues the request viashard_->Enqueue(&req)without callingSetError(). The caller insrc/cc/local_cc_shards.cpp(line 4359) checksif (scan_delta_size_cc.IsError()), which will befalseon re-queue, allowing subsequent logic to proceed with an uninitializedUpdatedMemory()value. This conflates "scan complete with zero updates" and "schema mismatch occurred", potentially advancing state incorrectly on version conflicts.Verify that this re-queue-without-error pattern is intentional or needs explicit error signaling to prevent premature state advancement.
0ee44cd to
f863877
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (1)
include/cc/template_cc_map.h (1)
7797-7808: Delta-size scan schema_version guard is good; tweak log wording and use class table_name_ for consistencyThe version gate and error handling for
ScanSliceDeltaSizeCcForRangePartitionandScanDeltaSizeCcForHashPartitionare sound and parallel the data-sync paths. Two small polish points:
- The log message says
"Table schema version mismatched for data sync scan"on these delta-size paths, which can be confusing when debugging. Consider changing to"delta-size scan"to distinguish from the main data-sync scan.- You’re using
req.table_name_.String()/req.table_name_->String(). SinceTemplateCcMapalready hastable_name_, usingtable_name_.String()here avoids differences in how individual request types store the name.For example:
- LOG(WARNING) << "Table schema version mismatched for data sync " - "scan, table name: " - << req.table_name_.String() + LOG(WARNING) << "Table schema version mismatched for delta-size scan, table name: " + << table_name_.String() << " ,scan carried version: " << req.schema_version_ << " ,ccmap version: " << table_schema_->Version();(Apply the same pattern to the hash-partition variant.)
Also applies to: 8155-8165
🧹 Nitpick comments (1)
include/cc/template_cc_map.h (1)
5110-5121: Schema-version mismatch gate for data-sync scans looks correct; consider strengthening the assertion messageThe new mismatch handling for
RangePartitionDataSyncScanCcandHashPartitionDataSyncScanCc—asserting the requested schema is ahead of the in-memory version, logging, settingREQUESTED_TABLE_SCHEMA_MISMATCH, and returning without continuing the scan—matches the intended behavior and aligns with the otherSchemaVersion()checks in this class.If you want easier debugging when the invariant is violated, you could embed a brief message in the assert, e.g.:
assert(req.schema_version_ > table_schema_->Version() && "data-sync scan: requested schema_version must be > in-memory version");This is optional; behavior-wise the change looks good.
Also applies to: 5667-5678
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
include/cc/cc_request.h(10 hunks)include/cc/template_cc_map.h(4 hunks)src/cc/local_cc_shards.cpp(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/cc/local_cc_shards.cpp
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-10-20T04:30:07.884Z
Learnt from: liunyl
Repo: eloqdata/tx_service PR: 149
File: include/cc/cc_request.h:1876-1927
Timestamp: 2025-10-20T04:30:07.884Z
Learning: ScanNextBatchCc in include/cc/cc_request.h is used only for hash-partition scans; range-partition scans are handled by ScanSliceCc.
Applied to files:
include/cc/template_cc_map.hinclude/cc/cc_request.h
📚 Learning: 2025-10-21T06:46:53.700Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 149
File: src/remote/cc_stream_receiver.cpp:1066-1075
Timestamp: 2025-10-21T06:46:53.700Z
Learning: In src/remote/cc_stream_receiver.cpp, for ScanNextRequest handling, BucketIds() on RemoteScanNextBatch should never be empty—this is an expected invariant of the scan protocol.
Applied to files:
include/cc/template_cc_map.hinclude/cc/cc_request.h
📚 Learning: 2025-11-11T07:10:40.346Z
Learnt from: lzxddz
Repo: eloqdata/tx_service PR: 199
File: include/cc/local_cc_shards.h:233-234
Timestamp: 2025-11-11T07:10:40.346Z
Learning: In the LocalCcShards class in include/cc/local_cc_shards.h, the EnqueueCcRequest methods use `shard_code & 0x3FF` followed by `% cc_shards_.size()` to distribute work across processor cores for load balancing. This is intentional and separate from partition ID calculation. The 0x3FF mask creates a consistent distribution range (0-1023) before modulo by actual core count.
Applied to files:
include/cc/cc_request.h
🧬 Code graph analysis (2)
include/cc/template_cc_map.h (2)
include/error_messages.h (1)
CcErrorCode(237-436)include/remote/remote_type.h (1)
CcErrorCode(474-479)
include/cc/cc_request.h (3)
include/tx_request.h (1)
schema_version_(583-583)include/remote/remote_cc_request.h (1)
schema_version_(450-450)include/cc/template_cc_map.h (1)
TemplateCcMap(164-168)
🔇 Additional comments (4)
include/cc/cc_request.h (4)
3936-3940: HashPartitionDataSyncScanCc: SchemaVersion() override added — LGTMAccessor is correct and matches intended bypass semantics (0 means skip).
4318-4322: RangePartitionDataSyncScanCc: SchemaVersion() override added — LGTMConsistent with other request types and exposes stored version.
8827-8835: Verify instantiation sites for new constructor parametersThe constructor signature at lines 8827-8835 has been updated to include
txnandschema_versionparameters. The corresponding accessor (SchemaVersion() at 8970-8974) and friend declaration (9000-9004) are in place. However, instantiation sites forScanDeltaSizeCcForHashPartitioncannot be located in the codebase to verify that all call sites supply these new parameters. Please manually confirm that every instantiation of this class provides bothtxnandschema_versionarguments.
8581-8595: All constructor parameters verified—instantiation is correctAt src/cc/local_cc_shards.cpp:3494, the instantiation passes all 12 constructor parameters in the correct order, including the new
is_dirtyandschema_versionparameters. The call site is properly aligned with the extended constructor signature.The suggestion to make
schema_version_const (line 8807) is valid: it is initialized once in the constructor initializer list and accessed only via the read-onlySchemaVersion()accessor. This is a safe and appropriate refactoring with no correctness implications.
Here are some reminders before you submit the pull request
fixes eloqdb/tx_service#issue_id./mtr --suite=mono_main,mono_multi,mono_basicSummary by CodeRabbit
New Features
Bug Fixes