Conversation
|
Note Currently processing new changes in this PR. This may take a few minutes, please wait... 📒 Files selected for processing (4)
✏️ Tip: You can disable in-progress messages and the fortune message in your review settings. WalkthroughThis PR updates the eloq data_substrate submodule pointer, adds EloqRecoveryUnit::batchReadCatalog to perform batched catalog reads, and refactors EloqCatalogRecordStoreCursor to prefetch catalog metadata in bulk during construction and consume from an internal prefetch cache. Changes
Sequence Diagram(s)sequenceDiagram
participant Cursor as EloqCatalogRecordStoreCursor
participant RecoveryUnit as EloqRecoveryUnit
participant TxService as TransactionService
participant Cache as PrefetchCache
Cursor->>Cursor: ctor(tableNameVector)
Cursor->>RecoveryUnit: batchReadCatalog(tableNameVector)
RecoveryUnit->>RecoveryUnit: build CatalogKey entries
RecoveryUnit->>TxService: submit BatchReadTxRequest (catalog batch)
TxService->>TxService: fetch multiple catalog entries
TxService-->>RecoveryUnit: return batch results
RecoveryUnit->>Cache: populate _prefetched (exists, CatalogRecord) pairs
RecoveryUnit-->>Cursor: ctor returns (prefetch ready)
Note over Cursor,Cache: iteration
Cursor->>Cache: next() reads _prefetched[_prefetchIndex]
Cache-->>Cursor: return Record (prefetched metadata)
Cursor->>Cursor: increment _prefetchIndex
Estimated Code Review Effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly Related PRs
Suggested Reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
This PR adds a batch catalog read path and updates the catalog record store cursor to prefetch catalog metadata in bulk.
Changes:
- Added
EloqRecoveryUnit::batchReadCatalog()API and implementation using a batch read TX request. - Updated
EloqCatalogRecordStoreCursorto prefetch catalog schema metadata for all tables up front and iterate over prefetched results. - Updated the
data_substratesubmodule revision.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/mongo/db/modules/eloq/src/eloq_recovery_unit.h | Declares new batch catalog read API with doc comment |
| src/mongo/db/modules/eloq/src/eloq_recovery_unit.cpp | Implements batch catalog read via BatchReadTxRequest |
| src/mongo/db/modules/eloq/src/eloq_record_store.cpp | Switches catalog cursor to use batch prefetch instead of per-table reads |
| src/mongo/db/modules/eloq/data_substrate | Bumps submodule to pick up batch read support dependencies |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| records.reserve(tableNames.size()); | ||
| for (const std::string& name : tableNames) { | ||
| keys.emplace_back(txservice::TableName(std::string_view(name), | ||
| txservice::TableType::RangePartition, |
There was a problem hiding this comment.
This constructs txservice::TableName with TableType::RangePartition, but the previous per-table path in EloqCatalogRecordStoreCursor used TableType::Primary. If the catalog entries are keyed by the table type, this will cause misses or reading the wrong catalog records. Use the same TableType as the existing single-read path (or derive the correct type from the input list) so batch reads address the same catalog keys.
| txservice::TableType::RangePartition, | |
| txservice::TableType::Primary, |
| /** | ||
| * Batch read catalog for multiple tables. out[i] corresponds to tableNames[i]: | ||
| * (true, record) if exists, else (false, empty). Uses readCatalog per table | ||
| * (serial); Phase 3 may switch to BatchReadCatalogTxRequest for concurrent read. | ||
| */ | ||
| void batchReadCatalog(OperationContext* opCtx, | ||
| const std::vector<std::string>& tableNames, | ||
| std::vector<std::pair<bool, txservice::CatalogRecord>>* out); |
There was a problem hiding this comment.
The comment says this method uses readCatalog serially and may later switch to a batch request, but the implementation already uses txservice::BatchReadTxRequest. Please update the comment to match the current behavior (and name the actual request type used) to avoid misleading future maintainers.
| txservice::BatchReadTxRequest req(&txservice::catalog_ccm_name, | ||
| 0, | ||
| read_batch, | ||
| false, | ||
| false, | ||
| false, | ||
| coro.yieldFuncPtr, | ||
| coro.resumeFuncPtr, | ||
| _txm, | ||
| false, | ||
| 0, | ||
| false, | ||
| true); // is_catalog_batch |
There was a problem hiding this comment.
This constructor call has many positional bool/numeric arguments (several false and 0), which is hard to read and easy to accidentally break when the API changes. Consider introducing clearly named local variables for each flag/value (or a small options struct/builder if available in txservice) and/or adding inline comments per argument so the intent of each parameter is explicit.
| txservice::BatchReadTxRequest req(&txservice::catalog_ccm_name, | |
| 0, | |
| read_batch, | |
| false, | |
| false, | |
| false, | |
| coro.yieldFuncPtr, | |
| coro.resumeFuncPtr, | |
| _txm, | |
| false, | |
| 0, | |
| false, | |
| true); // is_catalog_batch | |
| const auto* ccmName = &txservice::catalog_ccm_name; | |
| const int64_t txPriority = 0; | |
| const bool enableStrongConsistency = false; | |
| const bool enableSnapshotRead = false; | |
| const bool enableTracing = false; | |
| auto yieldFunc = coro.yieldFuncPtr; | |
| auto resumeFunc = coro.resumeFuncPtr; | |
| auto* txManager = _txm; | |
| const bool allowDirtyRead = false; | |
| const int64_t timeoutMs = 0; | |
| const bool allowPartialResult = false; | |
| const bool isCatalogBatch = true; // is_catalog_batch | |
| txservice::BatchReadTxRequest req(ccmName, | |
| txPriority, | |
| read_batch, | |
| enableStrongConsistency, | |
| enableSnapshotRead, | |
| enableTracing, | |
| yieldFunc, | |
| resumeFunc, | |
| txManager, | |
| allowDirtyRead, | |
| timeoutMs, | |
| allowPartialResult, | |
| isCatalogBatch); |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/mongo/db/modules/eloq/src/eloq_recovery_unit.cpp (1)
355-406: LGTM! Batch catalog read implementation is correct.The function correctly manages the lifetime of local vectors (
keys,records) that back theTxKeypointers inread_batch, and properly waits for completion before extracting results.Minor style note: Consider using
[[maybe_unused]]instead of(void)opCtxfor unused parameter suppression, which is more idiomatic in modern C++:void EloqRecoveryUnit::batchReadCatalog( - OperationContext* opCtx, + [[maybe_unused]] OperationContext* opCtx, const std::vector<std::string>& tableNames, std::vector<std::pair<bool, txservice::CatalogRecord>>* out) { - (void)opCtx;,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/mongo/db/modules/eloq/src/eloq_recovery_unit.cpp` around lines 355 - 406, Replace the C-style unused-parameter suppression in EloqRecoveryUnit::batchReadCatalog by marking the parameter as unused with the modern C++ attribute: remove the (void)opCtx statement and annotate the opCtx parameter with [[maybe_unused]] (or if you prefer, use [[maybe_unused]] on a local alias) so the compiler knows opCtx is intentionally unused; update the function signature reference to opCtx accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/mongo/db/modules/eloq/src/eloq_recovery_unit.h`:
- Around line 107-114: The doc comment for batchReadCatalog is outdated: update
the description to state that batchReadCatalog performs a batched catalog read
using BatchReadTxRequest with is_catalog_batch=true (not serial per-table
readCatalog calls), and mention that Phase 3 may adjust concurrency if needed;
also either remove the unused opCtx parameter from the signature or explicitly
document why it is unused (e.g., add a short inline comment noting "(void)opCtx;
// kept for API compatibility") so callers/readers understand the
intentionality; reference the batchReadCatalog method, the readCatalog mention
in the comment, and the use of BatchReadTxRequest/is_catalog_batch in the
implementation when making these edits.
---
Nitpick comments:
In `@src/mongo/db/modules/eloq/src/eloq_recovery_unit.cpp`:
- Around line 355-406: Replace the C-style unused-parameter suppression in
EloqRecoveryUnit::batchReadCatalog by marking the parameter as unused with the
modern C++ attribute: remove the (void)opCtx statement and annotate the opCtx
parameter with [[maybe_unused]] (or if you prefer, use [[maybe_unused]] on a
local alias) so the compiler knows opCtx is intentionally unused; update the
function signature reference to opCtx accordingly.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
src/mongo/db/modules/eloq/data_substratesrc/mongo/db/modules/eloq/src/eloq_record_store.cppsrc/mongo/db/modules/eloq/src/eloq_recovery_unit.cppsrc/mongo/db/modules/eloq/src/eloq_recovery_unit.h
| /** | ||
| * Batch read catalog for multiple tables. out[i] corresponds to tableNames[i]: | ||
| * (true, record) if exists, else (false, empty). Uses readCatalog per table | ||
| * (serial); Phase 3 may switch to BatchReadCatalogTxRequest for concurrent read. | ||
| */ | ||
| void batchReadCatalog(OperationContext* opCtx, | ||
| const std::vector<std::string>& tableNames, | ||
| std::vector<std::pair<bool, txservice::CatalogRecord>>* out); |
There was a problem hiding this comment.
Documentation is inconsistent with implementation.
The comment states "Uses readCatalog per table (serial); Phase 3 may switch to BatchReadCatalogTxRequest for concurrent read." However, the implementation already uses BatchReadTxRequest with is_catalog_batch=true for batch reading. Please update the comment to reflect the actual implementation.
Also, note that opCtx is explicitly unused in the implementation ((void)opCtx;). If this is intentional for API consistency, consider adding a comment. Otherwise, it could be removed.
📝 Suggested documentation fix
/**
* Batch read catalog for multiple tables. out[i] corresponds to tableNames[i]:
- * (true, record) if exists, else (false, empty). Uses readCatalog per table
- * (serial); Phase 3 may switch to BatchReadCatalogTxRequest for concurrent read.
+ * (true, record) if exists, else (false, empty). Uses BatchReadTxRequest for
+ * concurrent batch catalog reads.
*/
void batchReadCatalog(OperationContext* opCtx,📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| /** | |
| * Batch read catalog for multiple tables. out[i] corresponds to tableNames[i]: | |
| * (true, record) if exists, else (false, empty). Uses readCatalog per table | |
| * (serial); Phase 3 may switch to BatchReadCatalogTxRequest for concurrent read. | |
| */ | |
| void batchReadCatalog(OperationContext* opCtx, | |
| const std::vector<std::string>& tableNames, | |
| std::vector<std::pair<bool, txservice::CatalogRecord>>* out); | |
| /** | |
| * Batch read catalog for multiple tables. out[i] corresponds to tableNames[i]: | |
| * (true, record) if exists, else (false, empty). Uses BatchReadTxRequest for | |
| * concurrent batch catalog reads. | |
| */ | |
| void batchReadCatalog(OperationContext* opCtx, | |
| const std::vector<std::string>& tableNames, | |
| std::vector<std::pair<bool, txservice::CatalogRecord>>* out); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/mongo/db/modules/eloq/src/eloq_recovery_unit.h` around lines 107 - 114,
The doc comment for batchReadCatalog is outdated: update the description to
state that batchReadCatalog performs a batched catalog read using
BatchReadTxRequest with is_catalog_batch=true (not serial per-table readCatalog
calls), and mention that Phase 3 may adjust concurrency if needed; also either
remove the unused opCtx parameter from the signature or explicitly document why
it is unused (e.g., add a short inline comment noting "(void)opCtx; // kept for
API compatibility") so callers/readers understand the intentionality; reference
the batchReadCatalog method, the readCatalog mention in the comment, and the use
of BatchReadTxRequest/is_catalog_batch in the implementation when making these
edits.
f086fbf to
be48daa
Compare
Summary by CodeRabbit