align bulk write error with mongodb#402
Conversation
Note: Other AI code review bot(s) detected. CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds storage- and index-level batch duplicate-key checks before inserts and updates, implements those checks for the Eloq engine, adjusts insert/update control flow to run the checks before writing, updates error handling to allow command-level transaction continuation for DuplicateKey in specific paths, and updates tests to expect partial-success results.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor Client
    participant Collection
    participant RecordStore
    participant IndexCatalog
    participant IndexAccessMethod
    participant SortedDataInterface
    Client->>Collection: bulkInsert(docs)
    rect rgb(232,240,255)
        Note over Collection,RecordStore: Primary-key pre-check
        Collection->>RecordStore: batchCheckDuplicateKey(records)
        RecordStore-->>Collection: Status (OK / DuplicateKey)
        alt DuplicateKey
            Collection-->>Client: Error (early return)
        end
    end
    rect rgb(232,240,255)
        Note over Collection,IndexCatalog: Index pre-checks
        Collection->>IndexCatalog: batchCheckDuplicateKey(bsonObjPtrs)
        IndexCatalog->>IndexAccessMethod: batchCheckDuplicateKey(...)
        IndexAccessMethod->>SortedDataInterface: batchCheckDuplicateKey(...)
        SortedDataInterface-->>IndexAccessMethod: Status (OK / DuplicateKey)
        alt DuplicateKey
            IndexAccessMethod-->>IndexCatalog: Error
            IndexCatalog-->>Collection: Error
            Collection-->>Client: Error (early return)
        end
    end
    rect rgb(230,255,230)
        Note over Collection,RecordStore: Insert phase
        Collection->>RecordStore: insertRecords(...)
        RecordStore-->>Collection: inserted Records + RecordIds
    end
    rect rgb(255,244,214)
        Note over Collection,IndexCatalog: Post-insert indexing
        Collection->>IndexCatalog: indexRecords(bsonRecords)
        IndexCatalog->>IndexAccessMethod: index per-record keys / checkDuplicateKeysForUpdate
        IndexAccessMethod->>SortedDataInterface: write/index operations
        SortedDataInterface-->>IndexAccessMethod: OK
        IndexAccessMethod-->>IndexCatalog: OK
        IndexCatalog-->>Collection: OK
    end
    Collection-->>Client: WriteResult (nInserted, writeErrors)
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
Actionable comments posted: 2
🧹 Nitpick comments (9)
src/mongo/db/commands/write_commands/write_commands.cpp (1)
29-31: Question the logging component change and the commented-out `basic.h`

You're now defining the default log component as `kStorage` and have commented out `mongo/platform/basic.h`. The log component switch may be intentional, but it will move all `log()` output in this file under the storage category, which is different from typical command logging. Also, `basic.h` is usually included first in MongoDB source files; relying on transitive includes can be fragile.

Consider:

- Confirming that `kStorage` is really the desired component for write command logs.
- Reinstating `#include "mongo/platform/basic.h"` unless there is a concrete reason to omit it.

Also applies to: 58-59
tests/jstests/core/batch_write_command_insert.js (1)
190-248: Duplicate-key tests for unique index align with bulk write semantics

The single- and multi-document duplicate-key tests now:

- Expect `result.ok` with `writeErrors` populated.
- Distinguish ordered vs. unordered behavior via `n` and `writeErrors.length`.
- Verify error indices and the basic types of `code`/`errmsg`.

This matches MongoDB's bulk-write API, where duplicate-key errors are reported per-operation without failing the whole command.

You might optionally tighten these by asserting `writeErrors[i].code === ErrorCodes.DuplicateKey` everywhere you are explicitly testing duplicate-key behavior, for consistency with the new tests below.

src/mongo/db/modules/eloq/src/eloq_record_store.cpp (2)
60-70: Batch primary-key duplicate check for Eloq is well-structured

The new `BatchReadEntry` helper and the `EloqRecordStore::batchCheckDuplicateKey` implementation look sound:

- `_id` key extraction + `checkKeySize` mirror existing single-insert validation.
- `batchKeys` correctly enforces uniqueness within the incoming batch.
- The `batchGetKV` call checks for existing keys (or pending writes) in one round-trip and interprets `RecordStatus::Normal` as a duplicate.
- Assigning `record.id` from the `KeyString` ensures a stable mapping between primary key and `RecordId`, which is later used by the indexing path.

This gives Eloq an efficient pre-insert duplicate check on the primary key, consistent with MongoDB semantics.

If you ever need more performance, you could consider reusing the `BatchReadEntry` array from this function in `_insertRecords` instead of rebuilding it, but that's an optimization rather than a correctness issue.

Also applies to: 811-876
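The two-stage shape of this check (intra-batch uniqueness first, then a lookup against already-stored keys) can be sketched in isolation. All names below are hypothetical simplifications; the real implementation works on `KeyString`s and performs the second stage as a single `batchGetKV` round-trip.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for the real mongo::Status.
struct Status {
    bool ok;
    std::string reason;
};

// Stage 1: reject keys repeated within the incoming batch.
// Stage 2: reject keys the (simulated) store already holds; in the real
// code this is one batched KV lookup, where RecordStatus::Normal marks
// an existing key, i.e. a duplicate.
Status batchCheckDuplicateKeySketch(const std::vector<std::string>& ids,
                                    const std::unordered_set<std::string>& stored) {
    std::unordered_set<std::string> batchKeys;
    for (const auto& id : ids) {
        if (!batchKeys.insert(id).second) {
            return {false, "DuplicateKey within batch: " + id};
        }
    }
    for (const auto& id : ids) {
        if (stored.count(id) != 0) {
            return {false, "DuplicateKey already stored: " + id};
        }
    }
    return {true, ""};
}
```

The early return from either stage is what lets the caller fail the whole batch before any write is attempted.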
1139-1175: Assumptions and reuse in `_insertRecords`

In `_insertRecords` you:

- Fetch `DiscoveredTable` and compute `pkeySchemaVersion` once.
- Rebuild `BatchReadEntry` to generate primary-key `KeyString`s and `MongoKey`s, and call `setKV` with `checkUnique = true` per record.

A couple of points to call out:

- This implementation implicitly assumes that callers have already run `batchCheckDuplicateKey` (for key-size and duplicate detection) and that `records[i].id` has been initialized consistently with the `KeyString` you're now using. That's true for `CollectionImpl::_insertDocuments` in this PR, but please double-check that no other call sites invoke `insertRecords` directly without the pre-check, or you may lose early duplicate detection or key-size errors for those paths.
- You're recomputing `KeyString`s here that were already built in `batchCheckDuplicateKey`; if this becomes a hot path, consolidating that computation would avoid duplication.

Functionally this looks correct, but it would be good to document or assert the expected call ordering so future refactors don't accidentally bypass the pre-check.
src/mongo/db/catalog/collection_impl.cpp (2)
66-78: Duplicate `key_string.h` include

`mongo/db/storage/key_string.h` is now included twice (lines 66 and 77). It's harmless but redundant; you can safely drop the second include.
507-566: New pre-insert batch duplicate-check pipeline

The updated `_insertDocuments` now:

- Builds `records`/`timestamps` as before.
- Calls `_recordStore->batchCheckDuplicateKey(opCtx, &records)` to preflight primary-key duplicates (Eloq overrides this; other engines are effectively no-ops).
- Constructs `bsonRecords` and a parallel `bsonObjPtrs` array from the incoming documents.
- Calls `_indexCatalog.batchCheckDuplicateKey(opCtx, bsonObjPtrs)` to check all ready unique indexes in a single pass.
- Performs the actual insert via `_recordStore->insertRecords(...)`.
- Copies the resulting `RecordId`s from `records` back into `bsonRecords` and then calls `_indexCatalog.indexRecords(...)`.

This is a reasonable layering: it preserves the existing insert+index behavior while adding efficient pre-insert duplicate detection for engines that opt in.

Two things worth confirming:

- That all `RecordStore` implementations you care about either override `batchCheckDuplicateKey` or are happy with the new default of "no pre-check, rely on insert-time enforcement".
- That the `performInserts`/`WriteResult` machinery in `write_ops_exec.cpp` still attributes errors per-document correctly for ordered vs. unordered batches now that `_insertDocuments` returns a single `Status` for the whole group, with duplicate detection potentially happening before any writes.

If both hold, this pipeline should give you MongoDB-like bulk duplicate-key semantics while avoiding unnecessary writes when duplicates are detected early.
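The key property of this layering — pre-checks run before any write, so a failing batch leaves the store untouched — can be sketched with hypothetical stand-ins (none of these are the real MongoDB types):

```cpp
#include <string>
#include <unordered_set>
#include <vector>

struct Status {
    bool ok;
    std::string reason;
};

// Hypothetical in-memory stand-in for a record store.
struct StoreSketch {
    std::unordered_set<std::string> rows;
    std::size_t writesPerformed = 0;
};

// Pre-check first, write second: if the duplicate check fails, the
// function returns early and writesPerformed stays at its old value,
// so there is no partial insert to undo.
Status insertDocumentsSketch(StoreSketch& store, const std::vector<std::string>& ids) {
    // 1. Primary-key pre-check (intra-batch + existing rows).
    std::unordered_set<std::string> seen;
    for (const auto& id : ids) {
        if (!seen.insert(id).second || store.rows.count(id) != 0) {
            return {false, "DuplicateKey: " + id};
        }
    }
    // 2. (Unique-index pre-checks would run here in the real pipeline.)
    // 3. Actual insert, reached only when every pre-check passed.
    for (const auto& id : ids) {
        store.rows.insert(id);
        ++store.writesPerformed;
    }
    return {true, ""};
}
```

The "two things worth confirming" above matter precisely because this early-return property only holds for callers that go through the pre-check path.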
src/mongo/db/modules/eloq/src/eloq_index.cpp (3)
661-663: Consider defensive error handling instead of an invariant.

The `invariant` assertion will crash the process if `tuple.status_` is neither `Normal` nor `Deleted`. If the underlying txservice API introduces additional status values in the future, this could cause unexpected crashes.

🔎 Proposed fix

```diff
 } else {
-    invariant(tuple.status_ == txservice::RecordStatus::Deleted);
+    // Key does not exist - this is expected
+    if (tuple.status_ != txservice::RecordStatus::Deleted) {
+        MONGO_LOG(1) << "batchCheckDuplicateKey unexpected status: "
+                     << static_cast<int>(tuple.status_);
+    }
 }
```
764-774: Consider removing commented-out code or adding a clear explanation.

The commented-out existence check block documents the prior validation approach, but leaving large commented code blocks can reduce readability. If this is intentionally kept for reference, consider adding a brief comment explaining why (e.g., `// Prior approach - now handled by batchCheckDuplicateKey`). Otherwise, remove it.
786-788: Debug log line exceeds typical line-length limits.

The log statement on line 787 is quite long. Consider breaking it up or simplifying it for readability.

🔎 Proposed fix

```diff
 if (err != txservice::TxErrorCode::NO_ERROR) {
-    MONGO_LOG(1) << "yf: EloqUniqueIndex::insert setKV failed, index: " << _indexName.StringView() << ", key: " << key.toString() << ", error: " << txservice::TxErrorMessage(err);
+    MONGO_LOG(1) << "EloqUniqueIndex::insert setKV failed, index: "
+                 << _indexName.StringView() << ", key: " << key.toString()
+                 << ", error: " << txservice::TxErrorMessage(err);
 }
```
📒 Files selected for processing (17)
- src/mongo/db/catalog/collection_impl.cpp
- src/mongo/db/catalog/index_catalog.h
- src/mongo/db/catalog/index_catalog_impl.cpp
- src/mongo/db/catalog/index_catalog_impl.h
- src/mongo/db/commands/write_commands/write_commands.cpp
- src/mongo/db/index/index_access_method.cpp
- src/mongo/db/index/index_access_method.h
- src/mongo/db/modules/eloq/src/eloq_index.cpp
- src/mongo/db/modules/eloq/src/eloq_index.h
- src/mongo/db/modules/eloq/src/eloq_record_store.cpp
- src/mongo/db/modules/eloq/src/eloq_record_store.h
- src/mongo/db/operation_context.cpp
- src/mongo/db/operation_context.h
- src/mongo/db/ops/write_ops_exec.cpp
- src/mongo/db/storage/record_store.h
- src/mongo/db/storage/sorted_data_interface.h
- tests/jstests/core/batch_write_command_insert.js
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-11-28T10:55:21.020Z
Learnt from: githubzilla
Repo: eloqdata/eloqdoc PR: 279
File: src/mongo/db/modules/eloq/CMakeLists.txt:54-55
Timestamp: 2025-11-28T10:55:21.020Z
Learning: In src/mongo/db/modules/eloq/CMakeLists.txt, the CMake options EXT_TX_PROC_ENABLED, FORK_HM_PROCESS, and STATISTICS are used by the data_substrate submodule build system and should be retained as configuration interface options.
Applied to files:
src/mongo/db/modules/eloq/src/eloq_index.h
📚 Learning: 2025-11-14T03:10:22.027Z
Learnt from: githubzilla
Repo: eloqdata/eloqdoc PR: 279
File: src/mongo/db/dbmain.cpp:0-0
Timestamp: 2025-11-14T03:10:22.027Z
Learning: In this repository, avoid defining --config via gflags in src/mongo/db/dbmain.cpp; MongoDB’s options_parser owns --config. Prefer gflags flags with a distinct prefix (e.g., --data_substrate_config) parsed before options_parser, with options_parser configured to allow unregistered options.
Applied to files:
src/mongo/db/commands/write_commands/write_commands.cpp
🔇 Additional comments (21)
src/mongo/db/storage/sorted_data_interface.h (1)
129-140: LGTM! Well-designed extension point for batch duplicate-key validation.

The new `batchCheckDuplicateKey` method provides a clean extension point for storage engines to implement batch duplicate-key checking prior to inserts. The default no-op implementation ensures backward compatibility for engines that don't require this functionality, following the same pattern as other optional methods like `compact()` and `touch()`.

src/mongo/db/operation_context.cpp (1)
485-487: LGTM! Clean setter for recovery unit state management.

The implementation correctly provides direct state-manipulation capability as documented in the header. This enables resetting state after nested `WriteUnitOfWork` failures while maintaining the outer transaction.

src/mongo/db/catalog/index_catalog_impl.h (1)
344-345: LGTM! Proper catalog-level batch duplicate-key checking declaration.

The method declaration correctly exposes batch duplicate-key checking at the catalog implementation level, enabling validation across all unique indexes prior to insertion. The implementation (as shown in related snippets) appropriately delegates to individual index access methods.
src/mongo/db/operation_context.h (1)
120-125: LGTM! Clear documentation for the recovery unit state setter.

The method declaration and documentation clearly describe the purpose: enabling state resets after nested `WriteUnitOfWork` failures while keeping the outer transaction active. This is a useful addition to the existing recovery unit management API.

src/mongo/db/modules/eloq/src/eloq_index.h (2)
25-27: LGTM! Required includes for batch duplicate-key checking.

The added includes are necessary for the `batchCheckDuplicateKey` implementation:

- `index_catalog.h` and `multikey_paths.h` for index key extraction
- `simple_bsonobj_comparator.h` for intra-batch duplicate detection
83-92: LGTM! Proper override of batch duplicate-key checking for EloqIndex.

The method declaration correctly overrides the `SortedDataInterface` API, enabling Eloq-specific batch duplicate-key validation. The documentation clearly describes the parameters and return values. The implementation (shown in related snippets) demonstrates comprehensive validation, including intra-batch duplicate detection.

src/mongo/db/storage/record_store.h (1)
420-432: LGTM! Well-designed extension point for batch primary-key validation.

The new `batchCheckDuplicateKey` method provides a clean extension point for record stores to validate primary keys (`RecordId`) in batch before insertion. The default no-op implementation maintains backward compatibility for storage engines that don't require this functionality. The input/output parameter design (records with potentially set `RecordId`) is appropriate for this use case.

src/mongo/db/index/index_access_method.h (2)
159-167: LGTM! Useful accessor for storage-engine-specific implementations.

The `getSortedDataInterface()` accessor provides a clean way to access storage-engine-specific implementations (like `EloqIndex`), enabling delegation patterns for specialized functionality. The documentation clearly describes its purpose.
169-177: LGTM! Appropriate index-access-method API for batch duplicate checking.

The method declaration correctly introduces batch duplicate-key checking at the index access layer, with clear documentation. The implementation (shown in related snippets) properly delegates to the underlying `SortedDataInterface` when available, falling back to OK otherwise, which maintains correct behavior across different storage engines.

src/mongo/db/modules/eloq/src/eloq_record_store.h (1)
240-241: LGTM! Proper override for Eloq record store batch duplicate checking.

The method declaration correctly overrides the `RecordStore` API, enabling Eloq-specific batch duplicate-key validation for primary keys. The implementation (shown in related snippets) demonstrates comprehensive validation, including intra-batch duplicate detection and efficient batch KV lookups.

src/mongo/db/index/index_access_method.cpp (1)
621-630: Delegating batch duplicate checks to the SDI looks correct

The new `IndexAccessMethod::batchCheckDuplicateKey` cleanly delegates to the underlying `SortedDataInterface` when present and is a no-op otherwise, which is the right behavior for engines that don't implement batch checking.

src/mongo/db/commands/write_commands/write_commands.cpp (1)
93-95: No semantic change in the `shouldSkipOutput` guard

Wrapping `if (shouldSkipOutput(opCtx))` in braces is a cosmetic/readability change only; behavior is unchanged.

tests/jstests/core/batch_write_command_insert.js (3)
105-121: Illegal-key insert now validated via `ok: 1` + `writeErrors`

Switching this test from `!result.ok` to `result.ok` with a single `writeErrors` entry matches modern write-command behavior (the command succeeds; the failure is operation-level). Assertions on `n == 0` and the collection count keep the semantics tight.
393-408: Ordered bulk duplicate-key count expectation corrected

Changing the expectation from `coll.count() == 1` to `coll.count() == 50` in the ordered bulk test is correct: documents `_id: 1..49` should be inserted before the duplicate `_id: 50` stops the batch. This makes the test consistent with upstream bulk-insert semantics.
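The ordered vs. unordered duplicate-key semantics these bulk tests exercise can be sketched with simplified, hypothetical types (not the actual server implementation): the command as a whole "succeeds", each failing operation lands in `writeErrors`, and only the ordered variant stops at the first failure.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-ins for the real write-result types.
struct WriteError {
    std::size_t index;   // position of the failing op in the batch
    std::string errmsg;
};

struct WriteResult {
    std::size_t nInserted = 0;
    std::vector<WriteError> writeErrors;
};

// tryInsert returns true on success, false on a (simulated) DuplicateKey.
WriteResult runBatch(const std::vector<std::string>& docs,
                     bool ordered,
                     const std::function<bool(const std::string&)>& tryInsert) {
    WriteResult result;
    for (std::size_t i = 0; i < docs.size(); ++i) {
        if (tryInsert(docs[i])) {
            ++result.nInserted;
        } else {
            result.writeErrors.push_back({i, "DuplicateKey"});
            if (ordered) {
                break;  // ordered batches stop after the first error
            }
        }
    }
    return result;
}
```

With docs `{"x", "dup", "y"}` and an insert that fails only on `"dup"`, the ordered run reports one success and one error at index 1, while the unordered run reports two successes and the same single error — exactly the shape the tests assert on.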
428-473: New ordered vs. unordered DuplicateKey tests look solid

The added tests:

- For ordered inserts, assert `n: 0`, a single `writeErrors[0]` at index 0, and `ErrorCodes.DuplicateKey`.
- For unordered inserts, assert `n: 2`, at least one `writeErrors` entry, all with `ErrorCodes.DuplicateKey`, and a final document count of 3 (1 initial + 2 successes).

These precisely capture the expected BulkWriteException-style behavior and provide good regression coverage for future changes.
src/mongo/db/catalog/index_catalog_impl.cpp (1)
1449-1478: Unique-index batch duplicate check implementation is appropriate

`IndexCatalogImpl::batchCheckDuplicateKey` correctly:

- Skips unfinished indexes and non-unique indexes.
- Delegates to each ready unique index's access method.
- Short-circuits on the first non-OK status.

This matches the intended use as a pre-insert guard over all relevant unique indexes and composes properly with the new SDI-level hook.
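That skip-and-short-circuit control flow can be sketched with simplified stand-ins (none of these are the real catalog types):

```cpp
#include <functional>
#include <string>
#include <vector>

struct Status {
    bool ok;
    std::string reason;
};

// Hypothetical simplified index descriptor.
struct IndexSketch {
    bool ready;                      // index build finished?
    bool unique;                     // unique constraint?
    std::function<Status()> check;   // per-index duplicate check
};

// Skip unfinished and non-unique indexes, delegate to each ready unique
// index, and short-circuit on the first non-OK status.
Status batchCheckAllUnique(const std::vector<IndexSketch>& indexes) {
    for (const auto& idx : indexes) {
        if (!idx.ready || !idx.unique) {
            continue;  // not a pre-insert uniqueness concern
        }
        Status s = idx.check();
        if (!s.ok) {
            return s;  // short-circuit: first failure wins
        }
    }
    return {true, ""};
}
```

Short-circuiting keeps the pre-insert guard cheap in the common failure case: later unique indexes are never consulted once one has reported a duplicate.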
src/mongo/db/catalog/index_catalog.h (1)
234-236: IndexCatalog API for batch duplicate checking is consistent

Adding `Impl::batchCheckDuplicateKey` and the public `IndexCatalog::batchCheckDuplicateKey` wrapper cleanly exposes the new batch-check capability at the catalog layer, matching the implementations in `IndexCatalogImpl` and `IndexAccessMethod`.

The doc comment is clear about the primary contract (OK vs. `DuplicateKey`), while still allowing other error statuses to propagate.

Also applies to: 533-543
src/mongo/db/ops/write_ops_exec.cpp (4)
225-250: LGTM with a note on intentional asymmetry.

The new `allowCommandLevelTransaction` parameter enables error handling within command-level transactions specifically for insert operations. Update and delete operations continue to throw (via the default `false` value). Ensure this asymmetry between insert and update/delete error handling is intentional and documented in design docs.
476-485: Clarify whether `abandonSnapshot()` is needed when resetting RecoveryUnit state.

The commented-out `abandonSnapshot()` call on line 483 suggests uncertainty about proper state cleanup. Directly resetting the RecoveryUnit state to `kActiveUnitOfWork` without clearing any failed snapshot state could potentially leave the recovery unit in an inconsistent state.

Please verify whether `abandonSnapshot()` should be called, and either uncomment it or remove it with an explanatory comment if it's confirmed unnecessary.
515-532: Consistent error handling for single inserts.

The error-handling pattern here mirrors the batch-failure handling above, which is good for consistency. The same concern about the commented-out `abandonSnapshot()` applies here as well.
646-649: LGTM - consistent with other insert error-handling paths.

The `true` argument for `allowCommandLevelTransaction` aligns with the insert error-handling behavior established elsewhere in this file.
Pull request overview
This pull request aligns bulk write error handling with MongoDB's standard behavior by changing error responses from ok: false to ok: true with a writeErrors array for duplicate key errors. The implementation introduces batch duplicate key checking for improved performance in the Eloq storage engine.
Key changes:
- Modified the bulk write error format to return `ok: 1` with a `writeErrors` array instead of `ok: false`
- Implemented batch duplicate-key checking in `EloqRecordStore` and `EloqIndex` to detect duplicates before insertion
- Added recovery unit state management to handle nested `WriteUnitOfWork` failures in command-level transactions
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 27 comments.
Summary per file:
| File | Description |
|---|---|
| tests/jstests/core/batch_write_command_insert.js | Updated test assertions to expect ok: 1 with writeErrors; added new test cases for ordered/unordered batch error handling |
| src/mongo/db/ops/write_ops_exec.cpp | Modified error handling to support command-level transactions; added recovery unit state reset logic |
| src/mongo/db/operation_context.h/cpp | Added setRecoveryUnitState() method to manipulate recovery unit state without aborting transactions |
| src/mongo/db/storage/sorted_data_interface.h | Declared virtual batchCheckDuplicateKey() method for duplicate checking in storage interfaces |
| src/mongo/db/storage/record_store.h | Declared virtual batchCheckDuplicateKey() method for primary key duplicate checking |
| src/mongo/db/modules/eloq/src/eloq_record_store.h/cpp | Implemented batchCheckDuplicateKey() with batch key validation and duplicate detection |
| src/mongo/db/modules/eloq/src/eloq_index.h/cpp | Implemented batchCheckDuplicateKey() for unique index duplicate checking; commented out inline duplicate check in insert |
| src/mongo/db/index/index_access_method.h/cpp | Added batchCheckDuplicateKey() method that delegates to underlying SortedDataInterface |
| src/mongo/db/catalog/index_catalog.h/cpp | Added batchCheckDuplicateKey() to check all unique indexes before insertion |
| src/mongo/db/catalog/collection_impl.cpp | Modified _insertDocuments() to call batch duplicate checks before actual insertion |
| src/mongo/db/commands/write_commands/write_commands.cpp | Added logging include and minor formatting changes |
```cpp
            << idObj.toString();
        return {ErrorCodes::DuplicateKey, "DuplicateKey"};
    }
    batchKeys.insert(idObj.getOwned());
```

Potential performance issue: calling getOwned() for every key in the batch may cause unnecessary memory allocations. Since the BSONObj keys are extracted from documents that will remain in scope, consider using a different approach or ensuring this overhead is acceptable for the expected batch sizes.

Suggested change:

```diff
-    batchKeys.insert(idObj.getOwned());
+    batchKeys.insert(idObj);
```
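The trade-off behind this suggestion — owned copies vs. views into buffers that outlive the lookup set — can be illustrated in isolation with standard-library types (the names here are illustrative only, not the MongoDB API):

```cpp
#include <set>
#include <string>
#include <string_view>
#include <vector>

// Storing string_view keys avoids the per-key copy that an owned
// std::string (analogous to BSONObj::getOwned()) would make. This is
// only safe because `docs` outlives `seen` for the duration of the
// batch, which mirrors the situation described in the comment above.
bool batchHasDuplicate(const std::vector<std::string>& docs) {
    std::set<std::string_view> seen;  // views into docs; no key copies
    for (const auto& d : docs) {
        if (!seen.insert(d).second) {
            return true;
        }
    }
    return false;
}
```

If the backing documents could be freed or mutated mid-batch, the owned-copy version would be the correct choice despite the extra allocations; the suggestion only holds under the stated lifetime assumption.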
src/mongo/db/ops/write_ops_exec.cpp
Outdated
```cpp
MONGO_LOG(0) << "yf: handleError, exception = " << ex.toString() << ", code = " << ex.code()
             << ", key = " << it->doc.firstElement().fieldNameStringData()
             << ", value = " << it->doc.firstElement().toString();
```

Debug logging statements should be removed before merging to production. These MONGO_LOG statements were likely added during development for debugging purposes.

Suggested change: delete the MONGO_LOG line above.
Actionable comments posted: 1
♻️ Duplicate comments (4)
src/mongo/db/modules/eloq/src/eloq_record_store.cpp (4)
838-842: Remove developer identifier prefix and improve error context.The logging statement contains a developer identifier "yf:" that should be removed before merging, and the getOwned() call may cause unnecessary allocations.
These issues were previously flagged:
- Line 838: Developer identifier prefix "yf:" should be removed from logging statements
- Line 840: Error message "DuplicateKey" is too generic and lacks context about which key caused the duplicate
- Line 842: Calling getOwned() for every key in the batch may cause unnecessary memory allocations
867-869: Remove developer identifier prefix and improve error context.The logging statement contains a developer identifier "yf:" that should be removed before merging, and the error message lacks context.
These issues were previously flagged:
- Line 867: Developer identifier prefix "yf:" should be removed from logging statements
- Line 869: Error message "DuplicateKey" is too generic and doesn't provide information about which key caused the duplicate or which collection/index was affected
1166-1166: Remove developer identifier prefix from logging.The logging statement contains the developer identifier "yf:" which should be removed before merging to production.
This issue was previously flagged in past reviews.
1174-1174: Remove developer identifier prefix from logging.The logging statement contains the developer identifier "yf:" which should be removed before merging to production.
This issue was previously flagged in past reviews.
🧹 Nitpick comments (1)
src/mongo/db/modules/eloq/src/eloq_record_store.cpp (1)
1146-1153: Optimize by reusing `record.id` if `batchCheckDuplicateKey` was called.

If `batchCheckDuplicateKey` is called before `_insertRecords` (as suggested by the assertion at line 1132), then `record.id` is already set and contains the computed `RecordId`. This code redundantly re-extracts the ID from the BSONObj.

Once the integration issue (see the previous comment on lines 1131-1132) is resolved, consider optimizing this to reuse the existing `record.id`:

🔎 Proposed optimization (apply after fixing the integration)

```diff
-    // All keys are valid (no duplicates found). Now insert all records.
-    // Rebuild batchEntries for insertion
     auto batchEntries = std::make_unique<BatchReadEntry[]>(nRecords);
     for (size_t i = 0; i < nRecords; i++) {
         Record& record = records[i];
-        BSONObj obj{record.data.data()};
-        const BSONObj idObj = getIdBSONObjWithoutFieldName(obj);
         BatchReadEntry& entry = batchEntries[i];
-        entry.resetToKey(idObj);
+        // Reuse the record.id that was already computed by batchCheckDuplicateKey
+        entry.keyString.resetFromBuffer(record.id.repr(), record.id.reprSize());
+        entry.mongoKey = std::make_unique<Eloq::MongoKey>(record.id);
     }
```

Note: Verify that the KeyString and MongoKey constructors support initialization from the RecordId representation.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/mongo/db/modules/eloq/src/eloq_record_store.cpp
🔇 Additional comments (1)
src/mongo/db/modules/eloq/src/eloq_record_store.cpp (1)
60-70: LGTM: Struct definition is clean and well-structured.

The BatchReadEntry structure consolidates the previously duplicated definition and provides a clear helper method for key initialization.
Actionable comments posted: 0
♻️ Duplicate comments (7)
src/mongo/db/modules/eloq/src/eloq_record_store.cpp (7)
840-840: Improve error message with context.

The error message lacks information about which key caused the duplicate or which collection was affected, making debugging difficult. Based on past review comments.

Enhance error message

```diff
- return {ErrorCodes::DuplicateKey, "DuplicateKey"};
+ return {ErrorCodes::DuplicateKey,
+         str::stream() << "Duplicate key found in batch for collection "
+                       << _tableName.StringView() << ": " << idObj.toString()};
```
842-842: Potential performance issue with getOwned() calls in batch.

Calling `getOwned()` for every key in the batch may cause unnecessary memory allocations. Since the BSONObj keys are extracted from documents that remain in scope throughout the batch operation, consider whether owned copies are truly necessary here. Based on past review comments. The concern is that for large batches, repeated allocations could impact performance. Evaluate whether `batchKeys.insert(idObj)` would suffice, or implement a check to avoid `getOwned()` when the object is already owned.
869-869: Improve error message with context.

The error message lacks information about which key caused the duplicate or which index was affected, making debugging difficult. Based on past review comments.

Enhance error message

```diff
- return {ErrorCodes::DuplicateKey, "DuplicateKey"};
+ return {ErrorCodes::DuplicateKey,
+         str::stream() << "Duplicate key found for collection "
+                       << _tableName.StringView() << ": "
+                       << batchEntries[i].keyString.toString()};
```
838-838: Remove developer identifier prefix from production code.

The logging prefix "yf:" is a developer identifier that should be removed before merging to production. Based on past review comments.

Apply this change

```diff
- MONGO_LOG(1) << "yf: EloqRecordStore::batchCheckDuplicateKey duplicate key found within batch, key: "
+ MONGO_LOG(1) << "EloqRecordStore::batchCheckDuplicateKey duplicate key found within batch, key: "
               << idObj.toString();
```
867-867: Remove developer identifier prefix from production code.

The logging prefix "yf:" should be removed before merging to production. Based on past review comments.

Apply this change

```diff
- MONGO_LOG(1) << "yf: EloqRecordStore::batchCheckDuplicateKey duplicate key found, key: "
+ MONGO_LOG(1) << "EloqRecordStore::batchCheckDuplicateKey duplicate key found, key: "
               << batchEntries[i].keyString.toString();
```
1166-1166: Remove developer identifier prefix from production code.

The logging prefix "yf:" should be removed before merging to production. Based on past review comments.

Apply this change

```diff
- MONGO_LOG(1) << "yf: EloqRecordStore::_insertRecords setKV, table: " << _tableName.StringView() << ", txn: " << ru->getTxm()->TxNumber() << ", key: " << ks.toString() << ", checkUnique: " << checkUnique;
+ MONGO_LOG(1) << "EloqRecordStore::_insertRecords setKV, table: " << _tableName.StringView() << ", txn: " << ru->getTxm()->TxNumber() << ", key: " << ks.toString() << ", checkUnique: " << checkUnique;
```
1174-1174: Remove developer identifier prefix from production code.

The logging prefix "yf:" should be removed before merging to production. Based on past review comments.

Apply this change

```diff
- MONGO_LOG(1) << "yf: _insertRecords setKV failed, table: " << _tableName.StringView() << ", txn: " << ru->getTxm()->TxNumber() << ", key: " << ks.toString() << ", checkUnique: " << checkUnique << ", error: " << txservice::TxErrorMessage(err);
+ MONGO_LOG(1) << "_insertRecords setKV failed, table: " << _tableName.StringView() << ", txn: " << ru->getTxm()->TxNumber() << ", key: " << ks.toString() << ", checkUnique: " << checkUnique << ", error: " << txservice::TxErrorMessage(err);
```
🧹 Nitpick comments (1)
tests/jstests/core/write_result.js (1)
166-171: Test expectations correctly updated for partial success behavior.

The updated test now expects that when a bulk insert encounters a duplicate key on the second document, the first document successfully inserts (`nInserted == 1`), with write errors present but no write concern error. This aligns with the PR's goal to implement batch duplicate-key validation that enables partial successes.

Consider removing commented-out code

The commented-out lines 170-171 represent the old behavior and could be removed for code cleanliness, unless they're intentionally preserved for reference during the transition period.

```diff
  assert.eq(coll.count(), 1);
-// assert.commandFailedWithCode(result, ErrorCodes.DuplicateKey, result.errmsg)
-// assert.eq(coll.count(), 0);
```
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- src/mongo/db/catalog/collection_impl.cpp
- src/mongo/db/modules/eloq/src/eloq_record_store.cpp
- tests/jstests/core/write_result.js
🔇 Additional comments (4)
src/mongo/db/modules/eloq/src/eloq_record_store.cpp (2)
60-70: LGTM! Well-structured helper for batch operations.

The `BatchReadEntry` struct provides a clean abstraction for batch read operations, encapsulating key string conversion and storage pointers. The `resetToKey` helper method properly manages the KeyString and MongoKey lifecycle.
1139-1176: LGTM! Insertion flow correctly sequences ID assignment and validation.

The implementation properly:
- Validates the precondition that RecordIds are null on entry (line 1131)
- Rebuilds batch entries and assigns RecordIds from key strings (lines 1145-1153)
- Performs per-record insertion with duplicate checking (lines 1167-1172)
This design correctly separates the pre-insert validation phase (via `batchCheckDuplicateKey`) from the actual insertion phase, where RecordIds are assigned.

src/mongo/db/catalog/collection_impl.cpp (2)
77-77: Added include for KeyString support.

The new include is necessary for the batch duplicate-key validation infrastructure.
518-555: Well-structured batch duplicate-key validation flow.

The implementation correctly sequences the validation and insertion phases:
- Pre-insert validation (lines 520, 534): Batch checks for duplicates at both the record store and index catalog levels before any mutations
- Actual insertion (line 540): Only proceeds if all duplicate checks pass
- Post-insert index updates (lines 545-558): Uses the newly assigned RecordIds for index operations

This design ensures atomicity: either all records pass validation and are inserted, or none are. That aligns with the PR's goal of proper bulk write error handling. The assertion at line 551 appropriately verifies that `insertRecords` has assigned RecordIds.
Actionable comments posted: 1
♻️ Duplicate comments (12)
src/mongo/db/ops/write_ops_exec.cpp (6)
488-488: Remove commented-out code.

🔎 Proposed fix

```diff
- //opCtx->recoveryUnit()->abandonSnapshot(); // Clear any failed snapshot state
```
535-535: Remove commented-out code.

🔎 Proposed fix

```diff
- // opCtx->recoveryUnit()->abandonSnapshot();
```
371-371: Remove debug log before merge.

This always-on debug log with a developer prefix should be removed or changed to a higher log level.

🔎 Proposed fix

Remove the line:

```diff
- MONGO_LOG(0) << "yf: insertDocuments, commit";
```

or change to:

```diff
- MONGO_LOG(0) << "yf: insertDocuments, commit";
+ MONGO_LOG(3) << "insertDocuments commit";
```
477-477: Remove debug log before merge.

🔎 Proposed fix

```diff
- MONGO_LOG(0) << "yf: catach, exception = " << e.toString() << ", code = " << e.code();
```
503-503: Remove debug log before merge.

🔎 Proposed fix

```diff
- MONGO_LOG(0) << "yf: insertDocuments, key = " << it->doc.firstElement().fieldNameStringData() << ", value = " << it->doc.firstElement().toString();
```
528-529: Remove debug log before merge.

🔎 Proposed fix

```diff
- MONGO_LOG(0) << "yf: handleError, exception = " << ex.toString() << ", code = " << ex.code() << ", key = " << it->doc.firstElement().fieldNameStringData() << ", value = " << it->doc.firstElement().toString() ;
```

src/mongo/db/modules/eloq/src/eloq_index.cpp (6)
702-703: Consider avoiding getOwned() for keys in batch.

Calling `getOwned()` for every key creates unnecessary copies. Since the source BSON documents remain in scope for the duration of the batch check, consider whether owned copies are truly necessary. If owned copies are required for safety, the current approach is correct. Otherwise:

💡 Alternative approach

```diff
  // Add all keys to the vector
  for (const BSONObj& key : keys) {
-     allKeys.push_back(key.getOwned());
+     allKeys.push_back(key);  // Unowned reference, valid for batch duration
  }
```

Only use this if the lifetime of the source documents is guaranteed.
577-674: Critical: Memory safety issue with vector reallocation.

The `keyStringBuffers` vector is not pre-reserved, and as elements are added via `emplace_back` (line 616), reallocation can occur. This invalidates pointers held by `MongoKey` objects in `mongoKeys` (lines 619-620), creating dangling pointers. Additionally, for multikey indexes, `GetKeys()` can return multiple keys per document, but the code at lines 650/661 uses `keyStringBuffers[batchIdx]` assuming a 1:1 mapping with `indexBatchTuples`, which will be incorrect.

🔎 Proposed fix

Use a stable container for key storage and track the correct key-to-tuple association:

```diff
- std::vector<std::string> keyStringBuffers;  // Store KeyString buffer data
- std::vector<std::unique_ptr<Eloq::MongoKey>> mongoKeys;  // Store MongoKey objects
+ // Use deque for stable storage - elements don't move on push_back
+ std::deque<std::string> keyStringBuffers;
+ std::vector<std::unique_ptr<Eloq::MongoKey>> mongoKeys;
+ std::vector<size_t> tupleToBufferIndex;  // Map batch tuple index to buffer index
  std::vector<Eloq::MongoRecord> mongoRecords;  // Store MongoRecord objects
  std::vector<txservice::ScanBatchTuple> indexBatchTuples;
+ keyStringBuffers.reserve(keys.size());  // Reserve estimate
+ mongoKeys.reserve(keys.size());
+ mongoRecords.reserve(keys.size());
+ indexBatchTuples.reserve(keys.size());
+ tupleToBufferIndex.reserve(keys.size());
```

Then when building the batch:

```diff
  // Store KeyString buffer data
  keyStringBuffers.emplace_back(keyString.getBuffer(), keyString.getSize());
+ size_t bufferIndex = keyStringBuffers.size() - 1;

  // Create MongoKey from buffer
  auto mongoKey = std::make_unique<Eloq::MongoKey>(
      keyStringBuffers.back().data(), keyStringBuffers.back().size());
  mongoKeys.push_back(std::move(mongoKey));

  // Create MongoRecord
  mongoRecords.emplace_back();

  // Add to batch
  indexBatchTuples.emplace_back(txservice::TxKey(mongoKeys.back().get()),
                                &mongoRecords.back());
+ tupleToBufferIndex.push_back(bufferIndex);
```

Then when checking results:

```diff
  for (size_t batchIdx = 0; batchIdx < indexBatchTuples.size(); batchIdx++) {
      const txservice::ScanBatchTuple& tuple = indexBatchTuples[batchIdx];
+     size_t bufferIdx = tupleToBufferIndex[batchIdx];
      if (tuple.status_ == txservice::RecordStatus::Normal) {
          // Key exists in database or writeset
          if (currentRecordId.isNull()) {
              // For insert operations, any existing key is a duplicate
              MONGO_LOG(1) << "yf: _checkDuplicateKeysInternal duplicate key found, index: "
-                          << _indexName.StringView() << ", key = " << keyStringBuffers[batchIdx];
+                          << _indexName.StringView() << ", key = " << keyStringBuffers[bufferIdx];
```
606-608: Remove debug log prefix and improve error message.

🔎 Proposed fix

```diff
- MONGO_LOG(1) << "yf: _checkDuplicateKeysInternal duplicate key found within batch, index: "
+ MONGO_LOG(1) << "_checkDuplicateKeysInternal duplicate key found within batch, index: "
               << _indexName.StringView() << ", key: " << key.toString();
- return {ErrorCodes::DuplicateKey, "DuplicateKey"};
+ return {ErrorCodes::DuplicateKey,
+         str::stream() << "Duplicate key in batch for index " << _indexName.StringView()
+                       << ": " << key.toString()};
```
649-651: Remove debug log prefix and improve error message.

🔎 Proposed fix

```diff
- MONGO_LOG(1) << "yf: _checkDuplicateKeysInternal duplicate key found, index: "
+ MONGO_LOG(1) << "_checkDuplicateKeysInternal duplicate key found, index: "
               << _indexName.StringView() << ", key = " << keyStringBuffers[batchIdx];
- return {ErrorCodes::DuplicateKey, "DuplicateKey"};
+ return {ErrorCodes::DuplicateKey,
+         str::stream() << "Duplicate key for index " << _indexName.StringView()};
```
660-664: Remove debug log prefix and improve error message.

🔎 Proposed fix

```diff
- MONGO_LOG(1) << "yf: _checkDuplicateKeysInternal duplicate key found, index: "
+ MONGO_LOG(1) << "_checkDuplicateKeysInternal duplicate key found, index: "
               << _indexName.StringView() << ", key = " << keyStringBuffers[batchIdx]
               << ", existingRecordId = " << existingRecordId
               << ", currentRecordId = " << currentRecordId;
- return {ErrorCodes::DuplicateKey, "DuplicateKey"};
+ return {ErrorCodes::DuplicateKey,
+         str::stream() << "Duplicate key for index " << _indexName.StringView()
+                       << " (existing doc: " << existingRecordId.toString() << ")"};
```
840-840: Remove debug log prefix.

🔎 Proposed fix

```diff
- MONGO_LOG(1) << "yf: EloqUniqueIndex::insert setKV failed, index: " << _indexName.StringView() << ", key: " << key.toString() << ", error: " << txservice::TxErrorMessage(err);
+ MONGO_LOG(1) << "EloqUniqueIndex::insert setKV failed, index: " << _indexName.StringView() << ", key: " << key.toString() << ", error: " << txservice::TxErrorMessage(err);
```
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- src/mongo/db/catalog/collection_impl.cpp
- src/mongo/db/index/index_access_method.cpp
- src/mongo/db/index/index_access_method.h
- src/mongo/db/modules/eloq/data_substrate
- src/mongo/db/modules/eloq/src/eloq_index.cpp
- src/mongo/db/modules/eloq/src/eloq_index.h
- src/mongo/db/ops/write_ops_exec.cpp
- src/mongo/db/storage/sorted_data_interface.h
✅ Files skipped from review due to trivial changes (1)
- src/mongo/db/modules/eloq/data_substrate
🚧 Files skipped from review as they are similar to previous changes (1)
- src/mongo/db/storage/sorted_data_interface.h
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-11-28T10:55:21.020Z
Learnt from: githubzilla
Repo: eloqdata/eloqdoc PR: 279
File: src/mongo/db/modules/eloq/CMakeLists.txt:54-55
Timestamp: 2025-11-28T10:55:21.020Z
Learning: In src/mongo/db/modules/eloq/CMakeLists.txt, the CMake options EXT_TX_PROC_ENABLED, FORK_HM_PROCESS, and STATISTICS are used by the data_substrate submodule build system and should be retained as configuration interface options.
Applied to files:
- src/mongo/db/modules/eloq/src/eloq_index.h
- src/mongo/db/ops/write_ops_exec.cpp
- src/mongo/db/modules/eloq/src/eloq_index.cpp
📚 Learning: 2025-11-14T03:10:22.027Z
Learnt from: githubzilla
Repo: eloqdata/eloqdoc PR: 279
File: src/mongo/db/dbmain.cpp:0-0
Timestamp: 2025-11-14T03:10:22.027Z
Learning: In this repository, avoid defining --config via gflags in src/mongo/db/dbmain.cpp; MongoDB’s options_parser owns --config. Prefer gflags flags with a distinct prefix (e.g., --data_substrate_config) parsed before options_parser, with options_parser configured to allow unregistered options.
Applied to files:
src/mongo/db/ops/write_ops_exec.cpp
📚 Learning: 2025-11-24T03:11:24.955Z
Learnt from: lzxddz
Repo: eloqdata/eloqdoc PR: 325
File: src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp:1362-1384
Timestamp: 2025-11-24T03:11:24.955Z
Learning: The EloqDoc codebase uses an older version of glog that supports the two-parameter signature `google::InitGoogleLogging(const char* argv0, PrefixCallback)` for custom log prefix formatting, as seen in src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp line 1383.
Applied to files:
- src/mongo/db/ops/write_ops_exec.cpp
- src/mongo/db/modules/eloq/src/eloq_index.cpp
🧬 Code graph analysis (3)
src/mongo/db/index/index_access_method.h (4)
src/mongo/db/storage/sorted_data_interface.h (4)
Status (136-140), Status (152-157), Status (163-165), Status (205-208)

src/mongo/db/modules/eloq/src/eloq_index.cpp (6)
batchCheckDuplicateKey (676-709), batchCheckDuplicateKey (676-677), opCtx (218-223), opCtx (218-218), checkDuplicateKeysForUpdate (711-721), checkDuplicateKeysForUpdate (711-713)

src/mongo/db/index/index_access_method.cpp (6)
batchCheckDuplicateKey (621-630), batchCheckDuplicateKey (621-622), checkDuplicateKeysForUpdate (632-642), checkDuplicateKeysForUpdate (632-634), checkDuplicateKeysForUpdate (644-648), checkDuplicateKeysForUpdate (644-645)

src/mongo/db/modules/eloq/src/eloq_record_store.cpp (8)
batchCheckDuplicateKey (811-876), batchCheckDuplicateKey (811-812), opCtx (147-149), opCtx (147-147), opCtx (439-448), opCtx (439-439), opCtx (549-554), opCtx (549-549)
src/mongo/db/modules/eloq/src/eloq_index.h (3)
src/mongo/db/modules/eloq/src/eloq_index.cpp (8)
batchCheckDuplicateKey (676-709), batchCheckDuplicateKey (676-677), opCtx (218-223), opCtx (218-218), checkDuplicateKeysForUpdate (711-721), checkDuplicateKeysForUpdate (711-713), _checkDuplicateKeysInternal (579-674), _checkDuplicateKeysInternal (579-581)

src/mongo/db/catalog/index_catalog.h (4)
batchCheckDuplicateKey (540-543), vector (478-480), BSONObj (449-452), BSONObj (454-457)

src/mongo/db/storage/sorted_data_interface.h (4)
Status (136-140), Status (152-157), Status (163-165), Status (205-208)
src/mongo/db/modules/eloq/src/eloq_index.cpp (5)
src/mongo/db/modules/eloq/src/eloq_record_store.cpp (15)
opCtx (147-149), opCtx (147-147), opCtx (439-448), opCtx (439-439), opCtx (549-554), opCtx (549-549), MONGO_LOG (98-126), MONGO_LOG (450-483), MONGO_LOG (519-526), MONGO_LOG (528-533), MONGO_LOG (535-540), MONGO_LOG (542-547), mongoKey (724-724), batchCheckDuplicateKey (811-876), batchCheckDuplicateKey (811-812)

src/mongo/db/modules/eloq/src/eloq_recovery_unit.cpp (2)
get (277-279), get (277-277)

src/mongo/db/modules/eloq/src/base/eloq_util.cpp (2)
TxErrorCodeToMongoStatus (11-75), TxErrorCodeToMongoStatus (11-11)

src/mongo/db/index/index_access_method.cpp (6)
batchCheckDuplicateKey (621-630), batchCheckDuplicateKey (621-622), checkDuplicateKeysForUpdate (632-642), checkDuplicateKeysForUpdate (632-634), checkDuplicateKeysForUpdate (644-648), checkDuplicateKeysForUpdate (644-645)

src/mongo/db/catalog/index_catalog_impl.cpp (2)
batchCheckDuplicateKey (1449-1478), batchCheckDuplicateKey (1449-1450)
🔇 Additional comments (13)
src/mongo/db/index/index_access_method.h (1)
159-201: LGTM! Well-documented batch duplicate-check API.

The new public methods provide a clean interface for batch duplicate-key validation and update-time checks. Documentation is thorough and the delegation pattern to SortedDataInterface is appropriate.
src/mongo/db/index/index_access_method.cpp (1)
621-648: LGTM! Straightforward delegation pattern.

The implementations correctly delegate to the underlying SortedDataInterface with appropriate fallback behavior when the interface is unavailable. The wrapper method properly extracts added keys and location from the UpdateTicket.
src/mongo/db/catalog/collection_impl.cpp (4)
519-524: Batch duplicate-key check at record store level.

The pre-insert duplicate check at the record store level is correctly positioned before index checks and actual insertion.
727-743: Update-time duplicate-key validation added.

The code correctly iterates through ready unique indexes and performs duplicate-key checks before applying updates. The check uses the UpdateTicket which contains the added keys and current RecordId.
526-537: Batches are already bounded; multikey index performance for unique indexes warrants monitoring.

Batch sizes are capped at 256 KB (`insertVectorMaxBytes`), so the batch size concern is mitigated. The implementation properly uses batch operations (`batchGetKV`) for efficient checking. However, for collections with multiple unique indexes, especially multikey indexes where each document generates multiple keys, monitor performance as the index catalog iterates all unique indexes during the batch check.
545-555: RecordId availability after insertRecords is correctly managed.

Verification confirms that `RecordStore::insertRecords` always sets `record.id` for all records before returning, and `RecordStore::batchCheckDuplicateKey` never modifies `record.id` (the id assignment in `EloqRecordStore::batchCheckDuplicateKey` is commented out). The code flow is correct.

src/mongo/db/modules/eloq/src/eloq_index.h (2)
25-27: Necessary includes for batch duplicate-check functionality.

The added includes support the new batch duplicate-check methods.

83-119: Well-documented batch duplicate-check interface.

The new public methods and private helper are clearly documented with parameter descriptions and return value semantics. The separation of the internal helper allows code reuse between insert and update paths.
src/mongo/db/ops/write_ops_exec.cpp (1)
225-250: Command-level transaction support added for inserts.

The new `allowCommandLevelTransaction` parameter enables insert operations to handle errors within a command-level transaction without throwing. The logic correctly distinguishes between multi-document transactions (which should still throw) and command-level transactions.
30-30: Include added for MongoRecord support.
676-709: Batch duplicate-check delegates to internal helper.

The implementation correctly extracts keys from documents using the index schema and delegates to the internal helper with a null RecordId for insert operations.
711-721: Update duplicate-check delegates to internal helper.

Correctly delegates to the internal helper with the `currentRecordId` to allow the existing key if it belongs to the document being updated.
816-842: Constraint enforcement is properly delegated to TxUpsert.

The setKV method with the `checkUnique` parameter (not `checkConstraint`) correctly delegates unique constraint enforcement to the underlying `_txm->TxUpsert()` call. Error codes including `DUPLICATE_KEY` and `UNIQUE_CONSTRAINT` are properly mapped to `ErrorCodes::DuplicateKey` via `TxErrorCodeToMongoStatus()`. The method correctly returns the error status, ensuring duplicates are detected and reported to the caller.
Actionable comments posted: 2
♻️ Duplicate comments (14)
src/mongo/db/ops/write_ops_exec.cpp (7)
371-371: Remove or adjust debug logging before merge.

This debug log uses level 0 (always logged) and has a developer prefix "yf:". This will cause excessive logging in production.

463-463: Variable name is unclear.

The name `e` for the caught exception is inconsistent with other catch blocks in this file that use `ex`. Consider using a more descriptive name for consistency.

477-477: Remove debug log before committing.

The debug log contains developer initials and a typo ("catach" instead of "catch"), indicating this is unfinished development code.

488-488: Commented-out code should be removed.

If `abandonSnapshot()` is not needed, the comment should be deleted to improve code cleanliness.

503-503: Debug logging statements should be removed before merging to production.

These MONGO_LOG statements were likely added during development for debugging purposes.

528-528: Debug logging statements should be removed before merging to production.

These MONGO_LOG statements were likely added during development for debugging purposes.

535-535: Commented-out code should be removed.

If recovery unit snapshot abandonment is not needed, the comments should be deleted to improve code cleanliness.
src/mongo/db/storage/sorted_data_interface.h (1)
129-157: Strengthen documentation for new duplicate-key hooks

The new hooks are well-placed and the default OK implementations preserve backward compatibility, but the comments leave some behavior ambiguous:

- For `batchCheckDuplicateKey`, please document explicitly whether implementers should:
  - detect duplicates within `bsonObjPtrs` themselves, and
  - detect duplicates against existing index entries, and that this is typically relevant only for unique indexes.
- For `checkDuplicateKeysForUpdate`, clarify that:
  - `addedKeys` are the new keys for a single document,
  - `currentRecordId` should be excluded from duplicate detection, and
  - the method is a preflight; uniqueness must still be enforced in the actual write path.

Also consider stating that the default implementations are no-ops and may return `Status::OK()` even when duplicates exist, so callers must treat these as best-effort checks.
811-876: Reintroduce key-size validation in `_insertRecords` and improve duplicate-key reporting

Two related concerns here:

- Key-size validation appears to be skipped for some insert paths
  - `batchCheckDuplicateKey()` validates `_id` size via `checkKeySize(idObj, "RecordStore")`, but `_insertRecords()` (lines 1145–1153) now only rebuilds keys and `record.id` without any size check. `EloqRecordStore::insertRecord()` calls `_insertRecords()` directly and never calls `batchCheckDuplicateKey()`, so single-document inserts will bypass the `_id` key-size validation that `batchCheckDuplicateKey()` performs.
  - To keep behavior consistent and robust, `_insertRecords()` itself should validate key size for every record before constructing the `KeyString`/`RecordId`, so that:
    - single-document inserts are covered, and
    - bulk callers are protected even if they forget to invoke `batchCheckDuplicateKey()` first.

  Consider adding something like:

```cpp
BSONObj obj{record.data.data()};
const BSONObj idObj = getIdBSONObjWithoutFieldName(obj);
Status s = checkKeySize(idObj, "RecordStore");
if (!s.isOK()) {
    return s;
}
BatchReadEntry& entry = batchEntries[i];
entry.resetToKey(idObj);
record.id = RecordId{entry.keyString.getBuffer(), entry.keyString.getSize()};
```

  inside `_insertRecords()`'s loop, and optionally factoring the common "extract id + checkKeySize + resetToKey" into a small helper shared with `batchCheckDuplicateKey()`.

- Duplicate-key diagnostics and logging are still very minimal / personalized
  - Log messages include a developer-specific prefix (`"yf: ..."`), e.g. in `batchCheckDuplicateKey` and `_insertRecords` (`setKV` logs). These should be cleaned up for production.
  - Both duplicate-key returns in `batchCheckDuplicateKey()` use a bare `"DuplicateKey"` message, without indicating the offending key or collection/table, which makes debugging harder.

  Consider:
  - Dropping the `"yf:"` prefix and using a neutral message.
  - Including at least the offending key and `_tableName` in the returned `Status` message, e.g.:

```cpp
return Status(ErrorCodes::DuplicateKey,
              str::stream() << "Duplicate primary key " << idObj << " in table "
                            << _tableName.StringView());
```

  and similarly for the on-disk duplicate case using `batchEntries[i].keyString.toString()` plus `_tableName`.

This keeps wire-level behavior aligned with `ErrorCodes::DuplicateKey` while giving operators actionable context.

Also applies to: 1139-1176
src/mongo/db/catalog/collection_impl.cpp (1)
518-543: Verify that batchCheckDuplicateKey no longer sets RecordId prematurely.

A past review comment (from Copilot) flagged that `batchCheckDuplicateKey` sets `record.id` at line 846 in eloq_record_store.cpp, which would cause the assertion `record.id.isNull()` at line 1131 in `_insertRecords` to fail. While the current code flow appears correct (check duplicates → insert records → use RecordIds), please verify that `batchCheckDuplicateKey` implementations do not modify `record.id` fields.

Run the following script to verify that `batchCheckDuplicateKey` does not set `record.id`:

```shell
#!/bin/bash
# Search for record.id assignments within batchCheckDuplicateKey implementations
rg -A 10 -B 2 'batchCheckDuplicateKey.*\{' --type=cpp | rg 'record\.id\s*='
```

src/mongo/db/modules/eloq/src/eloq_index.cpp (4)
610-610: Consider removing redundant getOwned() call.

The `key` parameter comes from the `keys` vector (line 603), which should already contain owned `BSONObj` instances. Calling `getOwned()` here may create an unnecessary copy. Consider inserting `key` directly into `batchKeys` if the keys are guaranteed to be owned.

🔎 Proposed optimization

```diff
- batchKeys.insert(key.getOwned());
+ batchKeys.insert(key);
```

Only apply this change if you can verify that the `keys` parameter always contains owned BSONObj instances. Based on learnings, past review from Copilot identified a similar concern.
817-842: Verify that removing inline duplicate check is safe.

The existing duplicate-key check via `getKV` (lines 817-827) has been commented out, presumably because duplicate checking now occurs earlier via `batchCheckDuplicateKey`. Please verify that:

- All code paths that call `EloqUniqueIndex::insert` are protected by prior duplicate-key checks
- There are no scenarios where `insert` could be called without prior validation

Additionally, the log message at line 840 contains the developer identifier `"yf:"` which should be removed (duplicate of past review comment).

```shell
#!/bin/bash
# Find all call sites of EloqUniqueIndex::insert to verify they perform prior duplicate checks
ast-grep --pattern $'$_.insert($$$)' | rg -B 5 'EloqUniqueIndex'
```
592-629: Critical: Memory safety issue due to vector reallocation.

The `keyStringBuffers` vector (line 594) is not pre-reserved. When `emplace_back` at line 616 triggers reallocation, all existing elements are moved to a new memory location, invalidating the pointers stored in `mongoKeys` (created at lines 619-620 using `keyStringBuffers.back().data()`). When these dangling pointers are later used in `indexBatchTuples` at line 627 and in `batchGetKV` at line 634, this causes undefined behavior.

🔎 Recommended fix

Use a stable container like `std::deque` for `keyStringBuffers`, or have `MongoKey` take ownership of the data:

```diff
  // Build batch for this unique index
- // Use vectors to store KeyString buffers and MongoKey objects
- std::vector<std::string> keyStringBuffers;  // Store KeyString buffer data
+ // Use deque to prevent pointer invalidation on reallocation
+ std::deque<std::string> keyStringBuffers;  // Store KeyString buffer data
  std::vector<std::unique_ptr<Eloq::MongoKey>> mongoKeys;  // Store MongoKey objects
  std::vector<Eloq::MongoRecord> mongoRecords;  // Store MongoRecord objects
  std::vector<txservice::ScanBatchTuple> indexBatchTuples;
```

This ensures that adding new elements doesn't invalidate pointers to existing elements.

Based on learnings, the past review from coderabbitai[bot] identified this same issue.
606-608: Remove developer identifier from logs and improve error messages.

Multiple issues flagged in past reviews:

- Log messages at lines 606, 649, and 660 contain the developer identifier prefix `"yf:"` which should be removed before merging.
- Error messages at lines 608, 651, and 664 return generic `"DuplicateKey"` without context about which key or index caused the duplicate.

🔎 Proposed fix

```diff
  if (batchKeys.find(key) != batchKeys.end()) {
-     MONGO_LOG(1) << "yf: _checkDuplicateKeysInternal duplicate key found within batch, index: "
+     MONGO_LOG(1) << "_checkDuplicateKeysInternal duplicate key found within batch, index: "
                   << _indexName.StringView() << ", key: " << key.toString();
-     return {ErrorCodes::DuplicateKey, "DuplicateKey"};
+     return {ErrorCodes::DuplicateKey,
+             str::stream() << "Duplicate key in index " << _indexName.StringView()
+                           << ": " << key.toString()};
  }
```

Apply similar changes to lines 649-651 and 660-664.
Also applies to: 649-651, 660-664
🧹 Nitpick comments (2)
src/mongo/db/storage/record_store.h (1)
420-432: Clarify contract for `batchCheckDuplicateKey` in the interface doc

The API surface looks fine, but the comment is quite vague for storage engines implementing it. Consider clarifying a few points:

- Whether implementations are expected to check both:
  - duplicates within `records`, and
  - duplicates between `records` and existing data in the store.
- That this is a best-effort preflight and engines must still enforce uniqueness at insert time.
- That the default implementation is a no-op and may return `Status::OK()` even if duplicates exist.

This will make it easier for other engines to implement consistent behavior.
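The no-op default the comment should spell out can be sketched like this (hypothetical `StatusLite`/`RecordStoreLite` types, not MongoDB's real classes):

```cpp
#include <string>
#include <vector>

struct StatusLite {
    bool ok;
    std::string reason;
};

// Base-class sketch: the preflight is best-effort, and the default
// implementation reports success even when the batch holds duplicates,
// so insert paths must still enforce uniqueness themselves.
struct RecordStoreLite {
    virtual StatusLite batchCheckDuplicateKey(const std::vector<std::string>&) {
        return {true, ""};  // no-op default: may return OK despite duplicates
    }
    virtual ~RecordStoreLite() = default;
};

bool defaultPreflightIsNoOp() {
    RecordStoreLite rs;
    return rs.batchCheckDuplicateKey({"k1", "k1"}).ok;  // OK despite the duplicate
}
```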
src/mongo/db/catalog/index_catalog.h (1)
234-235: Index-level batch duplicate check is wired correctly; consider documenting uniqueness/scope more precisely

The new `Impl::batchCheckDuplicateKey` plus the inline wrapper on `IndexCatalog` are consistent with the rest of the design (delegation down to `IndexAccessMethod`/`SortedDataInterface`).

To make this easier to use correctly from higher layers, consider extending the comment around the public wrapper to note that:

- Only ready, unique indexes are expected to participate (as enforced in `IndexCatalogImpl::batchCheckDuplicateKey`).
- The method is a best-effort preflight; actual insert paths must still enforce uniqueness.
- Callers should not assume per-document granularity from the `Status` alone (it represents "some duplicate in this batch"), which matters for how they map errors back to individual writes.

Functionally the addition looks good.
Also applies to: 533-543
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (18)
- src/mongo/db/catalog/collection_impl.cpp
- src/mongo/db/catalog/index_catalog.h
- src/mongo/db/catalog/index_catalog_impl.cpp
- src/mongo/db/catalog/index_catalog_impl.h
- src/mongo/db/commands/write_commands/write_commands.cpp
- src/mongo/db/index/index_access_method.cpp
- src/mongo/db/index/index_access_method.h
- src/mongo/db/modules/eloq/src/eloq_index.cpp
- src/mongo/db/modules/eloq/src/eloq_index.h
- src/mongo/db/modules/eloq/src/eloq_record_store.cpp
- src/mongo/db/modules/eloq/src/eloq_record_store.h
- src/mongo/db/operation_context.cpp
- src/mongo/db/operation_context.h
- src/mongo/db/ops/write_ops_exec.cpp
- src/mongo/db/storage/record_store.h
- src/mongo/db/storage/sorted_data_interface.h
- tests/jstests/core/batch_write_command_insert.js
- tests/jstests/core/write_result.js
🚧 Files skipped from review as they are similar to previous changes (6)
- tests/jstests/core/write_result.js
- src/mongo/db/commands/write_commands/write_commands.cpp
- src/mongo/db/catalog/index_catalog_impl.h
- src/mongo/db/operation_context.h
- src/mongo/db/index/index_access_method.h
- src/mongo/db/operation_context.cpp
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-11-14T03:10:22.027Z
Learnt from: githubzilla
Repo: eloqdata/eloqdoc PR: 279
File: src/mongo/db/dbmain.cpp:0-0
Timestamp: 2025-11-14T03:10:22.027Z
Learning: In this repository, avoid defining --config via gflags in src/mongo/db/dbmain.cpp; MongoDB’s options_parser owns --config. Prefer gflags flags with a distinct prefix (e.g., --data_substrate_config) parsed before options_parser, with options_parser configured to allow unregistered options.
Applied to files:
src/mongo/db/ops/write_ops_exec.cpp
📚 Learning: 2025-11-24T03:11:24.955Z
Learnt from: lzxddz
Repo: eloqdata/eloqdoc PR: 325
File: src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp:1362-1384
Timestamp: 2025-11-24T03:11:24.955Z
Learning: The EloqDoc codebase uses an older version of glog that supports the two-parameter signature `google::InitGoogleLogging(const char* argv0, PrefixCallback)` for custom log prefix formatting, as seen in src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp line 1383.
Applied to files:
src/mongo/db/ops/write_ops_exec.cpp
src/mongo/db/modules/eloq/src/eloq_index.cpp
📚 Learning: 2025-11-28T10:55:21.020Z
Learnt from: githubzilla
Repo: eloqdata/eloqdoc PR: 279
File: src/mongo/db/modules/eloq/CMakeLists.txt:54-55
Timestamp: 2025-11-28T10:55:21.020Z
Learning: In src/mongo/db/modules/eloq/CMakeLists.txt, the CMake options EXT_TX_PROC_ENABLED, FORK_HM_PROCESS, and STATISTICS are used by the data_substrate submodule build system and should be retained as configuration interface options.
Applied to files:
src/mongo/db/ops/write_ops_exec.cpp
src/mongo/db/modules/eloq/src/eloq_index.cpp
src/mongo/db/modules/eloq/src/eloq_index.h
🧬 Code graph analysis (10)
src/mongo/db/catalog/index_catalog_impl.cpp (5)
- src/mongo/db/modules/eloq/src/eloq_index.cpp (4): batchCheckDuplicateKey (676-709), batchCheckDuplicateKey (676-677), opCtx (218-223), opCtx (218-218)
- src/mongo/db/index/index_access_method.cpp (2): batchCheckDuplicateKey (621-630), batchCheckDuplicateKey (621-622)
- src/mongo/db/modules/eloq/src/eloq_record_store.cpp (8): batchCheckDuplicateKey (811-876), batchCheckDuplicateKey (811-812), opCtx (147-149), opCtx (147-147), opCtx (439-448), opCtx (439-439), opCtx (549-554), opCtx (549-549)
- src/mongo/db/catalog/index_catalog.h (3): batchCheckDuplicateKey (540-543), accessMethod (117-119), indexRecords (515-519)
- src/mongo/db/commands/create_indexes.cpp (4): opCtx (261-457), opCtx (261-265), opCtx (460-478), opCtx (460-462)

src/mongo/db/modules/eloq/src/eloq_record_store.h (5)
- src/mongo/db/storage/sorted_data_interface.h (4): Status (136-140), Status (152-157), Status (163-165), Status (205-208)
- src/mongo/db/modules/eloq/src/eloq_index.cpp (4): batchCheckDuplicateKey (676-709), batchCheckDuplicateKey (676-677), opCtx (218-223), opCtx (218-218)
- src/mongo/db/index/index_access_method.cpp (2): batchCheckDuplicateKey (621-630), batchCheckDuplicateKey (621-622)
- src/mongo/db/modules/eloq/src/eloq_record_store.cpp (8): batchCheckDuplicateKey (811-876), batchCheckDuplicateKey (811-812), opCtx (147-149), opCtx (147-147), opCtx (439-448), opCtx (439-439), opCtx (549-554), opCtx (549-549)
- src/mongo/db/catalog/index_catalog_impl.cpp (2): batchCheckDuplicateKey (1449-1478), batchCheckDuplicateKey (1449-1450)

src/mongo/db/storage/sorted_data_interface.h (3)
- src/mongo/db/modules/eloq/src/eloq_index.cpp (6): batchCheckDuplicateKey (676-709), batchCheckDuplicateKey (676-677), opCtx (218-223), opCtx (218-218), checkDuplicateKeysForUpdate (711-721), checkDuplicateKeysForUpdate (711-713)
- src/mongo/db/index/index_access_method.cpp (6): batchCheckDuplicateKey (621-630), batchCheckDuplicateKey (621-622), checkDuplicateKeysForUpdate (632-642), checkDuplicateKeysForUpdate (632-634), checkDuplicateKeysForUpdate (644-648), checkDuplicateKeysForUpdate (644-645)
- src/mongo/db/catalog/index_catalog_impl.cpp (2): batchCheckDuplicateKey (1449-1478), batchCheckDuplicateKey (1449-1450)

src/mongo/db/modules/eloq/src/eloq_record_store.cpp (2)
- src/mongo/db/modules/eloq/src/eloq_index.cpp (4): batchCheckDuplicateKey (676-709), batchCheckDuplicateKey (676-677), opCtx (218-223), opCtx (218-218)
- src/mongo/db/modules/eloq/src/base/eloq_util.cpp (2): TxErrorCodeToMongoStatus (11-75), TxErrorCodeToMongoStatus (11-11)

src/mongo/db/modules/eloq/src/eloq_index.cpp (4)
- src/mongo/db/modules/eloq/src/eloq_record_store.cpp (15): opCtx (147-149), opCtx (147-147), opCtx (439-448), opCtx (439-439), opCtx (549-554), opCtx (549-549), MONGO_LOG (98-126), MONGO_LOG (450-483), MONGO_LOG (519-526), MONGO_LOG (528-533), MONGO_LOG (535-540), MONGO_LOG (542-547), mongoKey (724-724), batchCheckDuplicateKey (811-876), batchCheckDuplicateKey (811-812)
- src/mongo/db/modules/eloq/src/base/eloq_util.cpp (2): TxErrorCodeToMongoStatus (11-75), TxErrorCodeToMongoStatus (11-11)
- src/mongo/db/index/index_access_method.cpp (6): batchCheckDuplicateKey (621-630), batchCheckDuplicateKey (621-622), checkDuplicateKeysForUpdate (632-642), checkDuplicateKeysForUpdate (632-634), checkDuplicateKeysForUpdate (644-648), checkDuplicateKeysForUpdate (644-645)
- src/mongo/db/catalog/index_catalog_impl.cpp (2): batchCheckDuplicateKey (1449-1478), batchCheckDuplicateKey (1449-1450)

src/mongo/db/catalog/collection_impl.cpp (2)
- src/mongo/db/commands/create_indexes.cpp (4): opCtx (261-457), opCtx (261-265), opCtx (460-478), opCtx (460-462)
- src/mongo/db/catalog/index_catalog_entry.h (1): begin (306-308)

src/mongo/db/modules/eloq/src/eloq_index.h (1)
- src/mongo/db/modules/eloq/src/eloq_index.cpp (8): batchCheckDuplicateKey (676-709), batchCheckDuplicateKey (676-677), opCtx (218-223), opCtx (218-218), checkDuplicateKeysForUpdate (711-721), checkDuplicateKeysForUpdate (711-713), _checkDuplicateKeysInternal (579-674), _checkDuplicateKeysInternal (579-581)

src/mongo/db/storage/record_store.h (6)
- src/mongo/db/storage/sorted_data_interface.h (5): Status (136-140), Status (152-157), Status (163-165), Status (205-208), std (420-422)
- src/mongo/db/modules/eloq/src/eloq_index.cpp (4): batchCheckDuplicateKey (676-709), batchCheckDuplicateKey (676-677), opCtx (218-223), opCtx (218-218)
- src/mongo/db/index/index_access_method.cpp (2): batchCheckDuplicateKey (621-630), batchCheckDuplicateKey (621-622)
- src/mongo/db/modules/eloq/src/eloq_record_store.cpp (8): batchCheckDuplicateKey (811-876), batchCheckDuplicateKey (811-812), opCtx (147-149), opCtx (147-147), opCtx (439-448), opCtx (439-439), opCtx (549-554), opCtx (549-549)
- src/mongo/db/catalog/index_catalog.h (2): batchCheckDuplicateKey (540-543), vector (478-480)
- src/mongo/db/catalog/index_catalog_impl.cpp (2): batchCheckDuplicateKey (1449-1478), batchCheckDuplicateKey (1449-1450)

src/mongo/db/index/index_access_method.cpp (4)
- src/mongo/db/modules/eloq/src/eloq_index.cpp (6): batchCheckDuplicateKey (676-709), batchCheckDuplicateKey (676-677), opCtx (218-223), opCtx (218-218), checkDuplicateKeysForUpdate (711-721), checkDuplicateKeysForUpdate (711-713)
- src/mongo/db/modules/eloq/src/eloq_record_store.cpp (8): batchCheckDuplicateKey (811-876), batchCheckDuplicateKey (811-812), opCtx (147-149), opCtx (147-147), opCtx (439-448), opCtx (439-439), opCtx (549-554), opCtx (549-549)
- src/mongo/db/catalog/index_catalog_impl.cpp (2): batchCheckDuplicateKey (1449-1478), batchCheckDuplicateKey (1449-1450)
- src/mongo/db/index/index_access_method.h (1): getSortedDataInterface (165-167)

src/mongo/db/catalog/index_catalog.h (5)
- src/mongo/db/storage/sorted_data_interface.h (4): Status (136-140), Status (152-157), Status (163-165), Status (205-208)
- src/mongo/db/modules/eloq/src/eloq_index.cpp (4): batchCheckDuplicateKey (676-709), batchCheckDuplicateKey (676-677), opCtx (218-223), opCtx (218-218)
- src/mongo/db/index/index_access_method.cpp (2): batchCheckDuplicateKey (621-630), batchCheckDuplicateKey (621-622)
- src/mongo/db/modules/eloq/src/eloq_record_store.cpp (8): batchCheckDuplicateKey (811-876), batchCheckDuplicateKey (811-812), opCtx (147-149), opCtx (147-147), opCtx (439-448), opCtx (439-439), opCtx (549-554), opCtx (549-549)
- src/mongo/db/catalog/index_catalog_impl.cpp (2): batchCheckDuplicateKey (1449-1478), batchCheckDuplicateKey (1449-1450)
🔇 Additional comments (14)
src/mongo/db/ops/write_ops_exec.cpp (5)
225-230: LGTM: Command-level transaction parameter addition.

The new `allowCommandLevelTransaction` parameter with a default value of `false` maintains backward compatibility while enabling special error handling for insert operations within command-level transactions, consistent with the PR objective.

246-250: LGTM: Differential error handling for insert vs update/delete.

The conditional logic correctly allows command-level transaction error handling for insert operations while preserving the original throw behavior for update/delete operations, aligning with MongoDB's bulk write semantics.

466-475: LGTM: Correct guard for multi-document transactions and non-DuplicateKey errors.

The logic appropriately rethrows exceptions for multi-document transactions and non-DuplicateKey errors, allowing only DuplicateKey errors to be handled in command-level transactions. This aligns with the PR objective of supporting partial inserts with duplicate-key violations.

487-490: LGTM: Correct recovery unit state reset.

The state reset to `kActiveUnitOfWork` is correct and allows continued processing within the outer command-level transaction after a nested WriteUnitOfWork failure, as documented in previous reviews.

656-658: LGTM: Consistent handleError call for insert validation errors.

The updated call correctly passes `true` for `allowCommandLevelTransaction`, consistent with the new command-level transaction support for insert operations.

tests/jstests/core/batch_write_command_insert.js (2)
105-121: Updated expectations correctly reflect bulk insert semantics

The revised assertions for:

- illegal-key insert (lines 105-121),
- single and multi-document unique-index duplicates (lines ~190-248), and
- the ordered bulk failure count (lines ~398-408)

all now expect `ok: 1` with `writeErrors` and an appropriate `n`, which matches bulk write behavior (partial success with per-write errors). The final collection counts in each case (1 doc for unique-index scenarios) also look correct.

Also applies to: 190-248, 398-408
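The partial-success semantics these assertions pin down can be modeled with a small standalone simulation (illustrative only, not the server code): ordered batches stop at the first duplicate, unordered batches record an error per duplicate and continue:

```cpp
#include <cstddef>
#include <set>
#include <vector>

struct BulkResult {
    int n;                               // documents inserted
    std::vector<int> writeErrorIndexes;  // per-write error positions
};

// Simulates a bulk insert of _id values against a set of pre-existing ids.
BulkResult runBulk(const std::vector<int>& ids, bool ordered, std::set<int> existing) {
    BulkResult result{0, {}};
    for (std::size_t i = 0; i < ids.size(); ++i) {
        if (!existing.insert(ids[i]).second) {
            result.writeErrorIndexes.push_back(static_cast<int>(i));
            if (ordered) {
                break;  // ordered: stop at the first error
            }
        } else {
            ++result.n;
        }
    }
    return result;
}
```

Under this model an ordered batch of three duplicates yields `n: 0` with one writeError at index 0, while an unordered mix of two duplicates and two fresh ids yields `n: 2` with two writeErrors, matching the test expectations.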
428-473: New bulk write error-format tests provide good coverage

The new ordered and unordered tests at the end:

- Assert `ok: 1` and verify `writeErrors[].code === ErrorCodes.DuplicateKey` and proper `index` values.
- Distinguish ordered behavior (`n: 0`, stop at first error) from unordered behavior (`n: 2`, continue after duplicate errors).
- Confirm final collection counts (`1` and `3` respectively) match expected partial inserts.

These checks nicely pin down the wire protocol shape for DuplicateKey in bulk writes.

src/mongo/db/modules/eloq/src/eloq_record_store.h (1)
src/mongo/db/modules/eloq/src/eloq_record_store.h (1)
240-242: Header override forbatchCheckDuplicateKeymatches implementationThe new override in
EloqRecordStorecleanly exposes the batch duplicate-key hook fromRecordStoreand matches the implementation in the.cpp. No interface issues here.src/mongo/db/catalog/index_catalog_impl.cpp (1)
1449-1478: Index-level batch duplicate check is implemented correctly over unique, ready indexes

`IndexCatalogImpl::batchCheckDuplicateKey`:

- Iterates all entries, skips unfinished and non-unique indexes, and
- Delegates to each index's `IndexAccessMethod::batchCheckDuplicateKey`, returning the first non-OK status.

This matches the intended layering (catalog → access method → SDI/EloqIndex) and keeps non-participating engines effectively no-op via their defaults. The updated `indexRecords` loop is also consistent with how `_indexRecords` is applied per entry.

src/mongo/db/index/index_access_method.cpp (1)
621-648: Delegation pattern is clean; verify call sites don't rely on these as the sole uniqueness check

The new methods correctly delegate to `SortedDataInterface`:

- `batchCheckDuplicateKey(...)` → forwards to `getSortedDataInterface()->batchCheckDuplicateKey(...)` or returns OK if the SDI is unavailable.
- `checkDuplicateKeysForUpdate(...)` (both overloads) → similarly forwards to the SDI or returns OK.

Storage engines can override these (e.g., Eloq does). Call sites in `collection_impl.cpp` check return statuses and fail if non-OK, indicating these are relied upon. Since default SDI implementations are no-ops returning `Status::OK()`, verify that the existing unique-index logic is still enforced in the same code paths; these should be preflight checks in addition to, not instead of, the underlying storage engine's uniqueness guarantees.

src/mongo/db/catalog/collection_impl.cpp (1)
727-743: Good addition: pre-update duplicate-key validation.

The new duplicate-key check for update operations correctly validates unique index constraints before applying the update. This aligns with the pre-insert validation pattern and should catch duplicate-key violations early.

src/mongo/db/modules/eloq/src/eloq_index.h (1)

25-27: LGTM: Clean API additions with comprehensive documentation.

The new includes, public methods (`batchCheckDuplicateKey`, `checkDuplicateKeysForUpdate`), and private helper (`_checkDuplicateKeysInternal`) are well-documented and integrate cleanly with the existing interface. The documentation clearly explains the purpose, parameters, and return values.

Also applies to: 83-119

src/mongo/db/modules/eloq/src/eloq_index.cpp (2)

676-709: LGTM: Batch duplicate-key check implementation is correct.

The method correctly:
676-709: LGTM: Batch duplicate-key check implementation is correct.The method correctly:
- Early-returns for non-unique indexes and empty batches
- Extracts keys from documents using the key schema (handling multikey indexes)
- Delegates to the internal helper with a null RecordId (indicating insert operation)
The `getOwned()` call at line 703 is necessary to ensure extracted keys outlive the loop.

711-721: LGTM: Update duplicate-key check delegates correctly.

The method appropriately delegates to `_checkDuplicateKeysInternal` with the `currentRecordId` parameter, enabling the internal method to distinguish between updates (where the existing record with the same RecordId is acceptable) and inserts (where any existing key is a duplicate).
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (3)
src/mongo/db/modules/eloq/src/eloq_index.cpp (3)
594-628: Vector reallocation can invalidate pointers stored in `mongoKeys`.

This issue was flagged in a previous review. The `keyStringBuffers` vector is not pre-reserved, so as elements are emplaced, reallocation can invalidate pointers to previously stored data. The `MongoKey` objects in `mongoKeys` hold pointers to `keyStringBuffers[i].data()`, which become dangling after reallocation.

606-608: Remove developer identifier prefix and improve error message.

The `"yf:"` prefix should be removed from log messages before merging, and the error message `"DuplicateKey"` should include context (key value, index name) to aid debugging. These issues were flagged in previous reviews.

800-827: Approve the refactored insert path; note the log prefix issue.

Commenting out the pre-existence `getKV` check aligns with the new batch duplicate-checking flow. The `setKV` call remains as the actual insertion mechanism.

The `"yf:"` prefix on line 824 should be removed as noted in previous reviews.
🧹 Nitpick comments (1)
src/mongo/db/catalog/collection_impl.cpp (1)
726-743: Consider consolidating duplicate index iterations.

The code correctly performs pre-update duplicate-key checks for ready, unique indexes before applying the update. However, this results in two separate iterations through the indexes (lines 708-725 for `validateUpdate`, then lines 728-743 for `checkDuplicateKeysForUpdate`).

🔎 Optional refactor to combine index iterations

```diff
 IndexCatalog::IndexIterator ii = _indexCatalog.getIndexIterator(opCtx, true);
 while (ii.more()) {
     IndexDescriptor* descriptor = ii.next();
     IndexCatalogEntry* entry = ii.catalogEntry(descriptor);
     IndexAccessMethod* iam = ii.accessMethod(descriptor);

     InsertDeleteOptions options;
     IndexCatalog::prepareInsertDeleteOptions(opCtx, descriptor, &options);

     UpdateTicket* updateTicket = new UpdateTicket();
     updateTickets.mutableMap()[descriptor] = updateTicket;
     uassertStatusOK(iam->validateUpdate(opCtx,
                                         oldDoc.value(),
                                         newDoc,
                                         oldLocation,
                                         options,
                                         updateTicket,
                                         entry->getFilterExpression()));
+
+    // Check for duplicate keys immediately after validation
+    if (entry->isReady(opCtx) && descriptor->unique()) {
+        uassertStatusOK(iam->checkDuplicateKeysForUpdate(opCtx, *updateTicket));
+    }
 }
-
-// Check for duplicate keys in the added keys of each index before updating the record
-ii = _indexCatalog.getIndexIterator(opCtx, true);
-while (ii.more()) {
-    IndexDescriptor* descriptor = ii.next();
-    IndexCatalogEntry* entry = ii.catalogEntry(descriptor);
-    IndexAccessMethod* iam = ii.accessMethod(descriptor);
-    if (!entry->isReady(opCtx)) {
-        continue;
-    }
-
-    if (!descriptor->unique()) {
-        continue;
-    }
-
-    UpdateTicket* updateTicket = updateTickets.mutableMap()[descriptor];
-    uassertStatusOK(iam->checkDuplicateKeysForUpdate(opCtx, *updateTicket));
-}
```
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- src/mongo/db/catalog/collection_impl.cpp
- src/mongo/db/modules/eloq/src/eloq_index.cpp
- src/mongo/db/modules/eloq/src/eloq_index.h
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-11-24T03:11:24.955Z
Learnt from: lzxddz
Repo: eloqdata/eloqdoc PR: 325
File: src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp:1362-1384
Timestamp: 2025-11-24T03:11:24.955Z
Learning: The EloqDoc codebase uses an older version of glog that supports the two-parameter signature `google::InitGoogleLogging(const char* argv0, PrefixCallback)` for custom log prefix formatting, as seen in src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp line 1383.
Applied to files:
src/mongo/db/modules/eloq/src/eloq_index.cpp
📚 Learning: 2025-11-28T10:55:21.020Z
Learnt from: githubzilla
Repo: eloqdata/eloqdoc PR: 279
File: src/mongo/db/modules/eloq/CMakeLists.txt:54-55
Timestamp: 2025-11-28T10:55:21.020Z
Learning: In src/mongo/db/modules/eloq/CMakeLists.txt, the CMake options EXT_TX_PROC_ENABLED, FORK_HM_PROCESS, and STATISTICS are used by the data_substrate submodule build system and should be retained as configuration interface options.
Applied to files:
src/mongo/db/modules/eloq/src/eloq_index.cpp
src/mongo/db/modules/eloq/src/eloq_index.h
🧬 Code graph analysis (1)
src/mongo/db/modules/eloq/src/eloq_index.h (5)
- src/mongo/db/modules/eloq/src/eloq_index.cpp (8): batchCheckDuplicateKey (660-693), batchCheckDuplicateKey (660-661), opCtx (218-223), opCtx (218-218), checkDuplicateKeysForUpdate (695-705), checkDuplicateKeysForUpdate (695-697), _checkDuplicateKeysInternal (579-658), _checkDuplicateKeysInternal (579-581)
- src/mongo/db/index/index_access_method.cpp (6): batchCheckDuplicateKey (621-630), batchCheckDuplicateKey (621-622), checkDuplicateKeysForUpdate (632-642), checkDuplicateKeysForUpdate (632-634), checkDuplicateKeysForUpdate (644-648), checkDuplicateKeysForUpdate (644-645)
- src/mongo/db/catalog/index_catalog.h (4): batchCheckDuplicateKey (540-543), vector (478-480), BSONObj (449-452), BSONObj (454-457)
- src/mongo/db/catalog/index_catalog_impl.cpp (2): batchCheckDuplicateKey (1449-1478), batchCheckDuplicateKey (1449-1450)
- src/mongo/db/storage/sorted_data_interface.h (4): Status (136-140), Status (152-157), Status (163-165), Status (205-208)
🔇 Additional comments (5)
src/mongo/db/catalog/collection_impl.cpp (3)
77-77: Duplicate include already flagged.

This duplicate include was already identified in a previous review.

545-555: LGTM: Correct use of assigned RecordIds.

The code correctly retrieves RecordIds from the `records` vector after `insertRecords` has assigned them. The assertion at line 551 ensures the RecordId is valid before constructing the `BsonRecord`.

518-543: No action required. The code correctly handles RecordIds.

The `batchCheckDuplicateKey` function (line 846 in eloq_record_store.cpp) has the RecordId assignment commented out (`// record.id = RecordId{...}`), so it does not modify `record.id`. The subsequent `insertRecords` call at line 540 proceeds with null RecordIds, satisfying the assertion at line 1131 (`assert(record.id.isNull())`).

Likely an incorrect or invalid review comment.

src/mongo/db/modules/eloq/src/eloq_index.h (1)

83-119: LGTM! Well-documented batch duplicate-key checking API.

The new public methods align with the `SortedDataInterface` base class declarations, and the private helper appropriately consolidates the duplicate-checking logic. The documentation clearly explains the purpose and parameters of each method.

src/mongo/db/modules/eloq/src/eloq_index.cpp (1)

660-705: Implementation structure is correct; depends on fixing the internal helper.

The `batchCheckDuplicateKey` method correctly extracts keys from documents and `checkDuplicateKeysForUpdate` properly passes the `currentRecordId`. Both delegate to `_checkDuplicateKeysInternal`, so they will work correctly once the `currentRecordId` exclusion logic is implemented in that helper.
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (4)
tests/jstests/core/batch_write_command_insert.js (1)
426-426: Remove trailing whitespace.

This line contains trailing whitespace that should be cleaned up.

src/mongo/db/ops/write_ops_exec.cpp (3)

371-371: Remove debug logging before merge.

This debug log uses level 0 (always logged) and has a developer prefix `"yf:"`. This will cause excessive logging in production. Similar debug logs appear at lines 477, 503, and 529.

🔎 Proposed fix

```diff
-    MONGO_LOG(0) << "yf: insertDocuments, commit";
```

Apply similar removal to lines 477, 503, and 529.

529-530: Remove debug logging.

This debug log with the `"yf:"` prefix and level 0 should be removed before merge.

🔎 Proposed fix

```diff
-    MONGO_LOG(0) << "yf: handleError, exception = " << ex.toString() << ", code = " << ex.code()
-                 << ", key = " << it->doc.firstElement().fieldNameStringData()
-                 << ", value = " << it->doc.firstElement().toString();
```

503-503: Remove debug logging.

This debug log should be removed before merge.

🔎 Proposed fix

```diff
-    MONGO_LOG(0) << "yf: insertDocuments, key = " << it->doc.firstElement().fieldNameStringData()
-                 << ", value = " << it->doc.firstElement().toString();
```

🧹 Nitpick comments (2)

src/mongo/db/ops/write_ops_exec.cpp (1)

481-481: Remove trailing whitespace.

src/mongo/db/catalog/collection_impl.cpp (1)

520-522: Nitpick: Inconsistent brace placement.

The opening brace on line 520 is on a separate line, which differs from the predominant style in this codebase (e.g., lines 519, 533, 539). Consider aligning with the established pattern for consistency.

🔎 Proposed fix

```diff
-    if (!status.isOK())
-    {
+    if (!status.isOK()) {
         return status;
     }
```
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- src/mongo/db/catalog/collection_impl.cpp
- src/mongo/db/ops/write_ops_exec.cpp
- tests/jstests/core/batch_write_command_insert.js
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-11-14T03:10:22.027Z
Learnt from: githubzilla
Repo: eloqdata/eloqdoc PR: 279
File: src/mongo/db/dbmain.cpp:0-0
Timestamp: 2025-11-14T03:10:22.027Z
Learning: In this repository, avoid defining --config via gflags in src/mongo/db/dbmain.cpp; MongoDB’s options_parser owns --config. Prefer gflags flags with a distinct prefix (e.g., --data_substrate_config) parsed before options_parser, with options_parser configured to allow unregistered options.
Applied to files:
src/mongo/db/ops/write_ops_exec.cpp
📚 Learning: 2025-11-24T03:11:24.955Z
Learnt from: lzxddz
Repo: eloqdata/eloqdoc PR: 325
File: src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp:1362-1384
Timestamp: 2025-11-24T03:11:24.955Z
Learning: The EloqDoc codebase uses an older version of glog that supports the two-parameter signature `google::InitGoogleLogging(const char* argv0, PrefixCallback)` for custom log prefix formatting, as seen in src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp line 1383.
Applied to files:
src/mongo/db/ops/write_ops_exec.cpp
📚 Learning: 2025-11-28T10:55:21.020Z
Learnt from: githubzilla
Repo: eloqdata/eloqdoc PR: 279
File: src/mongo/db/modules/eloq/CMakeLists.txt:54-55
Timestamp: 2025-11-28T10:55:21.020Z
Learning: In src/mongo/db/modules/eloq/CMakeLists.txt, the CMake options EXT_TX_PROC_ENABLED, FORK_HM_PROCESS, and STATISTICS are used by the data_substrate submodule build system and should be retained as configuration interface options.
Applied to files:
src/mongo/db/ops/write_ops_exec.cpp
🧬 Code graph analysis (2)
src/mongo/db/catalog/collection_impl.cpp (1)
- src/mongo/db/commands/create_indexes.cpp (4): opCtx (261-457), opCtx (261-265), opCtx (460-478), opCtx (460-462)

src/mongo/db/ops/write_ops_exec.cpp (1)
- src/mongo/db/commands/write_commands/write_commands.cpp (10): opCtx (230-230), opCtx (232-247), opCtx (232-232), opCtx (258-265), opCtx (258-258), opCtx (267-282), opCtx (267-267), opCtx (306-315), opCtx (306-306), opCtx (353-361)
🔇 Additional comments (14)
tests/jstests/core/batch_write_command_insert.js (5)
115-120: LGTM - Test now validates MongoDB-compatible bulk write error semantics.

The change from asserting `!result.ok` to `result.ok` with a `writeErrors` array correctly aligns with MongoDB's BulkWriteException format, where partial failures return `ok: 1` with errors in the `writeErrors` array.

214-226: LGTM - Thorough validation of unordered batch error semantics.

The test correctly validates that in unordered mode:

- First document succeeds (`n: 1`)
- Subsequent duplicates generate `writeErrors` at correct indices (1 and 2)
- Error structure includes required fields (`code`, `errmsg`)

238-246: LGTM - Ordered batch error handling is correctly validated.

The test verifies that ordered inserts stop at the first error, resulting in 1 successful insert and 1 `writeError` at index 1.

405-406: Verify the expected count logic.

The test inserts `{_id: 50}` initially, then bulk inserts `{_id: 1}` through `{_id: 99}`. Since this is an ordered bulk operation:

- Documents `_id: 1` through `_id: 49` insert successfully (49 docs)
- `_id: 50` fails with duplicate key, stopping the ordered batch
- Total count: 1 (initial) + 49 (successful inserts) = 50

The assertion `coll.count() == 50` appears correct for the new partial-success semantics. The commented-out assertion expecting 1 document reflects the old behavior where the entire batch would fail.
coll.count() == 50appears correct for the new partial-success semantics. The commented-out assertion expecting 1 document reflects the old behavior where the entire batch would fail.
433-472: LGTM - Comprehensive new tests for bulk write error format.These new tests thoroughly validate:
- Ordered batch (lines 438-450): Stops at first error, returns
ok: 1,n: 0, singlewriteErrorat index 0 withDuplicateKeycode- Unordered batch (lines 456-472): Continues after errors, returns
ok: 1,n: 2,writeErrorsfor all duplicatesThe
forEachvalidation ensures each error has the propercode(number) anderrmsg(string) types.src/mongo/db/ops/write_ops_exec.cpp (7)
229-230: LGTM - New parameter enables command-level transaction handling for inserts.

The `allowCommandLevelTransaction` parameter with a default of `false` maintains backward compatibility while allowing insert operations to handle errors within command-level transactions.

246-250: LGTM - Correctly differentiates insert vs update/delete error handling.

The logic allows inserts (with `allowCommandLevelTransaction=true`) to continue error handling within a `WriteUnitOfWork`, while updates and deletes maintain the original behavior of throwing.

463-475: Exception handling order is correct.

The flow correctly:

- Checks for a multi-document transaction (rethrows if true)
- Checks for non-DuplicateKey errors (rethrows if true)
- Falls through to recovery logic for DuplicateKey errors

This allows only DuplicateKey errors to be handled gracefully during batch inserts.
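The three-step filter can be reduced to a standalone predicate (hypothetical names; 11000 stands in for `ErrorCodes::DuplicateKey`):

```cpp
// Returns true only when the error may fall through to the batch
// recovery logic: not in a multi-document transaction, and the error
// code is DuplicateKey. Everything else is rethrown.
bool fallsThroughToRecovery(bool inMultiDocumentTransaction, int errorCode) {
    const int kDuplicateKey = 11000;  // stand-in for ErrorCodes::DuplicateKey
    if (inMultiDocumentTransaction) {
        return false;  // rethrow: multi-document transactions must abort
    }
    if (errorCode != kDuplicateKey) {
        return false;  // rethrow: only DuplicateKey is handled here
    }
    return true;
}
```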
487-490: State reset logic is correct.

The reset from `kFailedUnitOfWork` to `kActiveUnitOfWork` is necessary for the fallback single-insert path. The commented-out `abandonSnapshot()` is correctly left commented, as it would violate invariants within the active `WriteUnitOfWork`.

521-526: Error handling order is now correct.

The code correctly checks `ex.code() != ErrorCodes::DuplicateKey` before calling `handleError`, ensuring that non-DuplicateKey errors are rethrown without being recorded in `out->results`. This addresses the previously flagged critical issue about error handling order.

535-538: State reset logic is consistent with the batch fallback path.

The same pattern used in lines 487-490 is correctly applied here for the single-insert error recovery path.

656-658: Consistent use of `allowCommandLevelTransaction=true` for insert error handling.

This ensures document validation errors during inserts also benefit from the new command-level transaction error handling.
src/mongo/db/catalog/collection_impl.cpp (2)
724-741: Good: Duplicate key validation added for updates.

The second pass through indexes to check for duplicate keys before applying updates is well-structured. It correctly:
- Filters for ready indexes only
- Processes only unique indexes
- Uses the prepared UpdateTickets from the first pass
Ensure that transaction isolation or collection-level locking prevents concurrent inserts from creating duplicate keys between the check (line 740) and the actual update (line 746+).
516-541: No critical issue exists with RecordId handling.

The review comment's premise is incorrect. In `batchCheckDuplicateKey` (eloq_record_store.cpp), the line that would modify `record.id` at line 846 is commented out: `// record.id = RecordId{entry.keyString.getBuffer(), entry.keyString.getSize()};`

As a result, `batchCheckDuplicateKey` does not modify `record.id`. Records retain their null `RecordId` values when passed to `insertRecords`, which correctly expects null ids (assert at line 1131). The implementation follows the correct sequence:

- Records are created with null RecordIds
- `batchCheckDuplicateKey` validates without modifying RecordIds
- `insertRecords` receives null RecordIds and sets them during actual insertion
- Post-insertion, RecordIds are populated for index operations

No RecordId clearing is needed. The code is correct as written.
Likely an incorrect or invalid review comment.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
tests/jstests/core/opcounters_write_cmd.js (2)
48-53: Correct alignment with documented behavior for ordered inserts. The updated expectation `opCounters.insert + 2` correctly implements the documented behavior on lines 21-23: "counted as K + 1 ops, where K is number of docs successfully inserted, adding the failed attempt." In this scenario, the first document succeeds (K=1) and the second fails on duplicate key (+1), totaling 2 operations. Consider removing the commented-out assertion on line 53 to reduce clutter, as the old behavior is superseded by this PR's changes.
🔎 Suggested cleanup
```diff
 assert.eq(opCounters.insert + 2, newdb.serverStatus().opcounters.insert);
-// assert.eq(opCounters.insert, newdb.serverStatus().opcounters.insert);
```
55-61: Correct alignment with documented behavior for unordered inserts. The updated expectation `opCounters.insert + 3` correctly implements the documented behavior on lines 19-20: "counted as N ops, regardless of errors." All three insert attempts are counted in unordered mode, even though the second insert fails due to a duplicate `_id: 5`. Remove the commented-out assertion on line 61 since it represents the previous incorrect behavior.
Actionable comments posted: 1
♻️ Duplicate comments (4)
src/mongo/db/modules/eloq/src/eloq_index.cpp (3)
671-683: Consider avoiding unnecessary `getOwned()` calls. Calling `getOwned()` for every key in the batch may cause unnecessary memory allocations. Since the source BSON documents remain in scope for the duration of this batch, the keys could be used directly if lifetime is carefully managed. This is an optional optimization: only pursue it if profiling shows this is a bottleneck.
577-652: Critical: the `currentRecordId` parameter is accepted but never used. The method signature accepts `currentRecordId`, and the documentation states it should be excluded from duplicate checks for update operations. However, the implementation never references `currentRecordId`. Lines 641-647 return `DuplicateKey` for any existing key without checking whether the existing record belongs to the document being updated. This will cause update operations to incorrectly fail with duplicate key errors when the document's own indexed key is being checked.
🔎 Required fix
```diff
 // Check results for duplicates
 for (size_t batchIdx = 0; batchIdx < indexBatchTuples.size(); batchIdx++) {
     const txservice::ScanBatchTuple& tuple = indexBatchTuples[batchIdx];
     if (tuple.status_ == txservice::RecordStatus::Normal) {
-        // For insert operations, any existing key is a duplicate
-        return {ErrorCodes::DuplicateKey, "DuplicateKey"};
-
+        // For update operations, check if the existing record belongs to the current document
+        if (currentRecordId.isValid()) {
+            const Eloq::MongoRecord* existingRecord =
+                static_cast<const Eloq::MongoRecord*>(tuple.record_);
+            if (existingRecord) {
+                RecordId existingRecordId = existingRecord->ToRecordId(false);
+                if (existingRecordId == currentRecordId) {
+                    // Same document, not a duplicate
+                    continue;
+                }
+            }
+        }
+        // Different document or insert operation - this is a duplicate
+        return {ErrorCodes::DuplicateKey,
+                str::stream() << "Duplicate key in index " << _indexName.StringView()};
     } else if (tuple.status_ == txservice::RecordStatus::Deleted) {
-        invariant(tuple.status_ == txservice::RecordStatus::Deleted);
+        // Expected case
+    } else {
+        return {ErrorCodes::InternalError,
+                str::stream() << "Unexpected record status: " << static_cast<int>(tuple.status_)};
     }
 }
```
592-627: Critical: Vector reallocation causes dangling pointers in MongoKey objects.
`keyStringBuffers` is not pre-reserved, so as elements are emplaced, reallocation can invalidate pointers to previously stored data. The `MongoKey` objects in `mongoKeys` hold pointers to `keyStringBuffers[i].data()`, which become dangling after reallocation. Additionally, for multikey indexes where `GetKeys()` returns multiple keys per document, the current 1:1 index mapping assumption breaks.

🔎 Proposed fix
```diff
 // Build batch for this unique index
-// Use vectors to store KeyString buffers and MongoKey objects
-std::vector<std::string> keyStringBuffers;  // Store KeyString buffer data
+// Use deque for stable addresses even after insertions
+std::deque<std::string> keyStringBuffers;  // Store KeyString buffer data
 std::vector<std::unique_ptr<Eloq::MongoKey>> mongoKeys;  // Store MongoKey objects
 std::vector<Eloq::MongoRecord> mongoRecords;  // Store MongoRecord objects
 std::vector<txservice::ScanBatchTuple> indexBatchTuples;
```
Alternatively, store the key data inside MongoKey or use `reserve()` with careful capacity management.

src/mongo/db/modules/eloq/src/eloq_record_store.cpp (1)
836-840: Improve error message with context. The error message "DuplicateKey" is too generic and doesn't identify which key or table is affected. Include the duplicate key value and table name for better debugging.
🔎 Suggested improvement
```diff
 // Check if this key already exists in the batch
 if (batchKeys.find(idObj) != batchKeys.end()) {
-    return {ErrorCodes::DuplicateKey, "DuplicateKey"};
+    return {ErrorCodes::DuplicateKey,
+            str::stream() << "Duplicate key in batch for table "
+                          << _tableName.StringView() << ": " << idObj};
 }
```
🧹 Nitpick comments (1)
src/mongo/db/modules/eloq/src/eloq_index.cpp (1)
795-805: Remove commented-out code before merging. The old duplicate-check logic is commented out rather than deleted. Since the new approach delegates duplicate detection to `setKV` with `checkUnique=true`, the old code should be removed to avoid clutter.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- src/mongo/db/commands/write_commands/write_commands.cpp
- src/mongo/db/modules/eloq/src/eloq_index.cpp
- src/mongo/db/modules/eloq/src/eloq_record_store.cpp
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-11-14T03:10:22.027Z
Learnt from: githubzilla
Repo: eloqdata/eloqdoc PR: 279
File: src/mongo/db/dbmain.cpp:0-0
Timestamp: 2025-11-14T03:10:22.027Z
Learning: In this repository, avoid defining --config via gflags in src/mongo/db/dbmain.cpp; MongoDB’s options_parser owns --config. Prefer gflags flags with a distinct prefix (e.g., --data_substrate_config) parsed before options_parser, with options_parser configured to allow unregistered options.
Applied to files:
src/mongo/db/commands/write_commands/write_commands.cpp
📚 Learning: 2025-11-24T03:11:24.955Z
Learnt from: lzxddz
Repo: eloqdata/eloqdoc PR: 325
File: src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp:1362-1384
Timestamp: 2025-11-24T03:11:24.955Z
Learning: The EloqDoc codebase uses an older version of glog that supports the two-parameter signature `google::InitGoogleLogging(const char* argv0, PrefixCallback)` for custom log prefix formatting, as seen in src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp line 1383.
Applied to files:
src/mongo/db/modules/eloq/src/eloq_index.cpp
📚 Learning: 2025-11-28T10:55:21.020Z
Learnt from: githubzilla
Repo: eloqdata/eloqdoc PR: 279
File: src/mongo/db/modules/eloq/CMakeLists.txt:54-55
Timestamp: 2025-11-28T10:55:21.020Z
Learning: In src/mongo/db/modules/eloq/CMakeLists.txt, the CMake options EXT_TX_PROC_ENABLED, FORK_HM_PROCESS, and STATISTICS are used by the data_substrate submodule build system and should be retained as configuration interface options.
Applied to files:
src/mongo/db/modules/eloq/src/eloq_index.cpp
🧬 Code graph analysis (2)
src/mongo/db/modules/eloq/src/eloq_record_store.cpp (3)
- src/mongo/db/modules/eloq/src/eloq_index.cpp (4): batchCheckDuplicateKey (654-687), batchCheckDuplicateKey (654-655), opCtx (218-223), opCtx (218-218)
- src/mongo/db/modules/eloq/src/eloq_recovery_unit.cpp (2): get (277-279), get (277-277)
- src/mongo/db/modules/eloq/src/base/eloq_util.cpp (2): TxErrorCodeToMongoStatus (11-75), TxErrorCodeToMongoStatus (11-11)
src/mongo/db/modules/eloq/src/eloq_index.cpp (3)
- src/mongo/db/modules/eloq/src/eloq_record_store.cpp (9): opCtx (147-149), opCtx (147-147), opCtx (439-448), opCtx (439-439), opCtx (549-554), opCtx (549-549), mongoKey (724-724), batchCheckDuplicateKey (811-872), batchCheckDuplicateKey (811-812)
- src/mongo/db/modules/eloq/src/eloq_recovery_unit.cpp (2): get (277-279), get (277-277)
- src/mongo/db/modules/eloq/src/base/eloq_util.cpp (2): TxErrorCodeToMongoStatus (11-75), TxErrorCodeToMongoStatus (11-11)
🔇 Additional comments (3)
src/mongo/db/commands/write_commands/write_commands.cpp (1)
57-57: Remove unused logging include. The `mongo/util/log.h` include at line 57 is unused; no logging calls (LOG, LOGV2, warning, error) are present in this file. Remove it to minimize unnecessary compilation dependencies.
Likely an incorrect or invalid review comment.
src/mongo/db/modules/eloq/src/eloq_record_store.cpp (2)
60-70: LGTM! Clean batch entry structure. The `BatchReadEntry` struct provides a well-organized container for batch read operations, with proper initialization via `resetToKey`.
1113-1149: Well-designed batch insert flow. The method correctly handles the ID extraction and assignment workflow:
- Expects a null `record.id` on entry (line 1127)
- Extracts IDs from BSON documents (lines 1141-1148)
- Sets `record.id` before insertion

This properly addresses the previous concern about `insertRecord` failing due to null IDs.
```cpp
// Check results for duplicates
for (size_t i = 0; i < nRecords; i++) {
    const txservice::ScanBatchTuple& tuple = batchTuples[i];
    if (tuple.status_ == txservice::RecordStatus::Normal) {
        return {ErrorCodes::DuplicateKey, "DuplicateKey"};
    } else {
        invariant(tuple.status_ == txservice::RecordStatus::Deleted);
    }
}
```
Replace unsafe invariant with explicit status handling.
The invariant at line 867 assumes only Normal or Deleted statuses, but the getKV() function explicitly handles txservice::RecordStatus::Unknown with retry logic. If Unknown appears in batch results, the invariant will crash the server.
🔎 Proposed fix
```diff
 // Check results for duplicates
 for (size_t i = 0; i < nRecords; i++) {
     const txservice::ScanBatchTuple& tuple = batchTuples[i];
     if (tuple.status_ == txservice::RecordStatus::Normal) {
-        return {ErrorCodes::DuplicateKey, "DuplicateKey"};
-    } else {
-        invariant(tuple.status_ == txservice::RecordStatus::Deleted);
+        return {ErrorCodes::DuplicateKey,
+                str::stream() << "Duplicate key for table " << _tableName.StringView()};
+    } else if (tuple.status_ == txservice::RecordStatus::Deleted) {
+        // Expected case for insert - key doesn't exist
+    } else {
+        return {ErrorCodes::InternalError,
+                str::stream() << "Unexpected record status from batchGetKV: "
+                              << static_cast<int>(tuple.status_)};
     }
 }
```
🤖 Prompt for AI Agents
}🤖 Prompt for AI Agents
In src/mongo/db/modules/eloq/src/eloq_record_store.cpp around lines 861-869,
replace the unsafe invariant that assumes only Normal or Deleted statuses with
explicit handling for all RecordStatus values: for Normal return DuplicateKey,
for Deleted continue, and for Unknown (or any other unexpected status) return a
transient/ retryable error (e.g. WriteConflict or a clearly named error) so
callers that implement getKV() retry logic can handle it safely; add a short log
or comment explaining the Unknown case is transient.
Actionable comments posted: 0
♻️ Duplicate comments (7)
src/mongo/db/ops/write_ops_exec.cpp (1)
485-488: Consider removing commented-out code. The commented-out `abandonSnapshot()` call adds noise. The surrounding comment already explains why we reset state without aborting. If you want to preserve the historical context, consider a brief inline comment instead of the dead code.

🔎 Proposed fix
```diff
 if (opCtx->getRecoveryUnitState() == WriteUnitOfWork::RecoveryUnitState::kFailedUnitOfWork) {
-    //opCtx->recoveryUnit()->abandonSnapshot(); // Clear any failed snapshot state
+    // Note: abandonSnapshot() is intentionally NOT called here - it would violate
+    // invariant(!_inUnitOfWork) in recovery unit implementations.
     opCtx->setRecoveryUnitState(WriteUnitOfWork::RecoveryUnitState::kActiveUnitOfWork);
 }
```
src/mongo/db/modules/eloq/src/eloq_index.cpp (4)
671-683: Consider avoiding `getOwned()` if keys remain in scope. Line 681 calls `getOwned()` for every key. If the source documents in `bsonObjPtrs` remain valid for the duration of the batch check, consider storing references, or ensure the source lifetime is sufficient, to avoid the allocation overhead.
594-627: Memory safety issue with vector reallocation. `keyStringBuffers` is not pre-reserved, so reallocation can invalidate pointers held by `MongoKey` objects (lines 617-618), creating dangling pointers. Additionally, for multikey indexes where `GetKeys()` returns multiple keys per document, the 1:1 indexing assumption for `keyStringBuffers[batchIdx]` breaks.

🔎 Proposed fix
```diff
 // Build batch for this unique index
-std::vector<std::string> keyStringBuffers;
+// Use deque for stable addresses
+std::deque<std::string> keyStringBuffers;
 std::vector<std::unique_ptr<Eloq::MongoKey>> mongoKeys;
 std::vector<Eloq::MongoRecord> mongoRecords;
 std::vector<txservice::ScanBatchTuple> indexBatchTuples;
+
+// Reserve space to reduce allocations
+mongoKeys.reserve(keys.size());
+mongoRecords.reserve(keys.size());
+indexBatchTuples.reserve(keys.size());
```
579-643: Critical: `currentRecordId` parameter is accepted but never used. The method accepts `currentRecordId`, and the documentation states it should exclude the current record from duplicate checks during updates. However, lines 641-643 return `DuplicateKey` for any existing key without verifying whether it belongs to the document being updated. This will cause update operations to incorrectly fail with duplicate key errors.

🔎 Proposed fix
```diff
 for (size_t batchIdx = 0; batchIdx < indexBatchTuples.size(); batchIdx++) {
     const txservice::ScanBatchTuple& tuple = indexBatchTuples[batchIdx];
     if (tuple.status_ == txservice::RecordStatus::Normal) {
-        // For insert operations, any existing key is a duplicate
-        return {ErrorCodes::DuplicateKey, "DuplicateKey"};
+        // Check if this is the current document being updated
+        if (currentRecordId.isValid()) {
+            const Eloq::MongoRecord* existingRecord =
+                static_cast<const Eloq::MongoRecord*>(tuple.record_);
+            if (existingRecord) {
+                RecordId existingRecordId = existingRecord->ToRecordId(false);
+                if (existingRecordId == currentRecordId) {
+                    continue;  // Same document, not a duplicate
+                }
+            }
+        }
+        return {ErrorCodes::DuplicateKey,
+                str::stream() << "Duplicate key in index " << _indexName.StringView()};
     } else {
         invariant(tuple.status_ == txservice::RecordStatus::Deleted);
     }
 }
```
645-647: Replace unsafe invariant with explicit status handling. The invariant assumes only `Normal` or `Deleted` statuses, but `getKV()` in the codebase handles `txservice::RecordStatus::Unknown` with retry logic (see the related code in eloq_recovery_unit.cpp). If `Unknown` appears, the invariant will crash the server.

🔎 Proposed fix
```diff
 if (tuple.status_ == txservice::RecordStatus::Normal) {
     // ... duplicate key handling
-} else {
-    invariant(tuple.status_ == txservice::RecordStatus::Deleted);
+} else if (tuple.status_ == txservice::RecordStatus::Deleted) {
+    // Expected: key doesn't exist
+} else {
+    return {ErrorCodes::InternalError,
+            str::stream() << "Unexpected record status: "
+                          << static_cast<int>(tuple.status_)};
 }
```
837-840: Generic error message lacks context. The error message "DuplicateKey" doesn't indicate which key caused the duplicate or which collection was affected. Including the key value and table name would aid debugging.
861-869: Replace unsafe invariant with explicit status handling. The invariant at line 867 assumes only `Normal` or `Deleted` statuses exist, but `getKV()` elsewhere handles `txservice::RecordStatus::Unknown` with retry logic. If `Unknown` appears in batch results, this invariant will crash the server.

🔎 Proposed fix
```diff
 // Check results for duplicates
 for (size_t i = 0; i < nRecords; i++) {
     const txservice::ScanBatchTuple& tuple = batchTuples[i];
     if (tuple.status_ == txservice::RecordStatus::Normal) {
         return {ErrorCodes::DuplicateKey,
                 str::stream() << "Duplicate key in table " << _tableName.StringView()};
-    } else {
-        invariant(tuple.status_ == txservice::RecordStatus::Deleted);
+    } else if (tuple.status_ == txservice::RecordStatus::Deleted) {
+        // Expected: key doesn't exist
+    } else {
+        return {ErrorCodes::InternalError,
+                str::stream() << "Unexpected record status from batchGetKV: "
+                              << static_cast<int>(tuple.status_)};
     }
 }
```
🧹 Nitpick comments (2)
src/mongo/db/ops/write_ops_exec.cpp (1)
462-462: Fix formatting: missing space after `&`. There's a missing space between `&` and `e` in the catch clause.

🔎 Proposed fix
```diff
-} catch (const DBException&e) {
+} catch (const DBException& e) {
```
src/mongo/db/modules/eloq/src/eloq_record_store.cpp (1)
1135-1149: Consider refactoring duplicate key extraction logic. Lines 1141-1149 duplicate the key extraction logic found in `batchCheckDuplicateKey` (lines 828-846). Consider extracting this into a shared helper method to reduce code duplication and ensure consistency.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- src/mongo/db/commands/write_commands/write_commands.cpp
- src/mongo/db/modules/eloq/src/eloq_index.cpp
- src/mongo/db/modules/eloq/src/eloq_record_store.cpp
- src/mongo/db/ops/write_ops_exec.cpp
🚧 Files skipped from review as they are similar to previous changes (1)
- src/mongo/db/commands/write_commands/write_commands.cpp
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-11-24T03:11:24.955Z
Learnt from: lzxddz
Repo: eloqdata/eloqdoc PR: 325
File: src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp:1362-1384
Timestamp: 2025-11-24T03:11:24.955Z
Learning: The EloqDoc codebase uses an older version of glog that supports the two-parameter signature `google::InitGoogleLogging(const char* argv0, PrefixCallback)` for custom log prefix formatting, as seen in src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp line 1383.
Applied to files:
src/mongo/db/modules/eloq/src/eloq_index.cpp
📚 Learning: 2025-11-28T10:55:21.020Z
Learnt from: githubzilla
Repo: eloqdata/eloqdoc PR: 279
File: src/mongo/db/modules/eloq/CMakeLists.txt:54-55
Timestamp: 2025-11-28T10:55:21.020Z
Learning: In src/mongo/db/modules/eloq/CMakeLists.txt, the CMake options EXT_TX_PROC_ENABLED, FORK_HM_PROCESS, and STATISTICS are used by the data_substrate submodule build system and should be retained as configuration interface options.
Applied to files:
src/mongo/db/modules/eloq/src/eloq_index.cpp
📚 Learning: 2025-11-14T03:10:22.027Z
Learnt from: githubzilla
Repo: eloqdata/eloqdoc PR: 279
File: src/mongo/db/dbmain.cpp:0-0
Timestamp: 2025-11-14T03:10:22.027Z
Learning: In this repository, avoid defining --config via gflags in src/mongo/db/dbmain.cpp; MongoDB’s options_parser owns --config. Prefer gflags flags with a distinct prefix (e.g., --data_substrate_config) parsed before options_parser, with options_parser configured to allow unregistered options.
Applied to files:
src/mongo/db/ops/write_ops_exec.cpp
🧬 Code graph analysis (1)
src/mongo/db/modules/eloq/src/eloq_record_store.cpp (4)
- src/mongo/db/modules/eloq/src/eloq_index.cpp (4): batchCheckDuplicateKey (654-687), batchCheckDuplicateKey (654-655), opCtx (218-223), opCtx (218-218)
- src/mongo/db/catalog/index_catalog_impl.cpp (3): batchCheckDuplicateKey (1449-1478), batchCheckDuplicateKey (1449-1450), i (1576-1576)
- src/mongo/db/modules/eloq/src/eloq_recovery_unit.cpp (2): get (277-279), get (277-277)
- src/mongo/db/modules/eloq/src/base/eloq_util.cpp (2): TxErrorCodeToMongoStatus (11-75), TxErrorCodeToMongoStatus (11-11)
🔇 Additional comments (11)
src/mongo/db/ops/write_ops_exec.cpp (4)
225-230: LGTM on the parameter addition. The new `allowCommandLevelTransaction` parameter, with a default of `false`, maintains backward compatibility with existing call sites while enabling the new DuplicateKey handling behavior for insert operations.
245-250: Logic is correct. The conditional properly gates error handling: inserts (with `allowCommandLevelTransaction=true`) can continue processing in command-level transactions, while updates/deletes maintain the original throw behavior.
517-536: Error handling logic is correct. The order is now right: non-DuplicateKey errors are rethrown immediately (lines 518-520), then `handleError` is called only for DuplicateKey errors. This addresses the previous review concern about recording errors before type checking. The same commented-out `abandonSnapshot()` at line 531 should be cleaned up (same suggestion as above).
651-654: Verify: does the fixedDoc error path match the intended error handling scope? The code comment at lines 245-247 states "For insert operations, allow error handling in command-level transactions," which appears to justify this behavior. However, the DuplicateKey paths (lines 472, 518) explicitly check `if (ex.code() != ErrorCodes::DuplicateKey) { throw; }` before using `allowCommandLevelTransaction=true`, whereas this fixedDoc validation error path calls `handleError(..., true)` without filtering the error code. If the intent is to allow only DuplicateKey errors to continue in command-level transactions, this path should similarly filter before passing `true`. If the intent is to allow all insert errors to continue, the other paths are unnecessarily selective.
src/mongo/db/modules/eloq/src/eloq_record_store.cpp (4)
60-70: LGTM: Clean batch entry structure. The `BatchReadEntry` struct provides a clear container for batch read operations, managing KeyString, MongoKey, and MongoRecord together.
874-880: LGTM: Clean delegation to internal insert. The updated signature accepts vector pointers and delegates appropriately to `_insertRecords`.
1127-1129: Verify assertion expectations are documented. The assertion at line 1127 expects `record.id.isNull()` for new inserts. Ensure callers understand this precondition, as the method sets `record.id` during processing (line 1148).
1161-1167: Verify checkUnique flag behavior. Line 1161 sets `checkUnique = true` for all inserts, which will cause `setKV` to perform duplicate checks at the storage layer. Since `batchCheckDuplicateKey` is expected to have been called before this point, confirm whether the redundant check is intentional (defense in depth) or whether it introduces unnecessary overhead.
src/mongo/db/modules/eloq/src/eloq_index.cpp (3)
30-30: LGTM: Include addition. The include for `eloq_record.h` is appropriate for the new batch checking functionality.
689-699: LGTM once the underlying bug is fixed. This method correctly delegates to `_checkDuplicateKeysInternal` with the `currentRecordId` parameter. However, it depends on that method actually using the parameter (currently flagged as a critical bug).
795-818: Verify consistency with batch duplicate checking. The pre-existence check via `getKV` (lines 796-804) is now commented out, and the code relies on `setKV` with `checkUnique=true` (line 816) to detect duplicates. Confirm this approach is consistent with the new batch checking flow, where `batchCheckDuplicateKey` should be called before `insert`. If batch checking already validates uniqueness, the `checkUnique=true` flag may be redundant or serve as a defense-in-depth mechanism.
Actionable comments posted: 0
♻️ Duplicate comments (5)
src/mongo/db/modules/eloq/src/eloq_record_store.cpp (2)
861-869: Replace unsafe invariant with explicit status handling. The invariant at line 867 assumes only `Normal` or `Deleted` statuses, but the `getKV()` function explicitly handles `txservice::RecordStatus::Unknown` with retry logic. If `Unknown` appears in batch results, the invariant will crash the server.

🔎 Proposed fix
```diff
 // Check results for duplicates
 for (size_t i = 0; i < nRecords; i++) {
     const txservice::ScanBatchTuple& tuple = batchTuples[i];
     if (tuple.status_ == txservice::RecordStatus::Normal) {
-        return {ErrorCodes::DuplicateKey, "DuplicateKey"};
-    } else {
-        invariant(tuple.status_ == txservice::RecordStatus::Deleted);
+        return {ErrorCodes::DuplicateKey,
+                str::stream() << "Duplicate key for table " << _tableName.StringView()};
+    } else if (tuple.status_ == txservice::RecordStatus::Deleted) {
+        // Expected case for insert - key doesn't exist
+    } else {
+        return {ErrorCodes::InternalError,
+                str::stream() << "Unexpected record status from batchGetKV: "
+                              << static_cast<int>(tuple.status_)};
     }
 }
```
Based on learnings, the retry logic in `getKV()` suggests `Unknown` is a valid transient state that should be handled, not asserted away.
837-838: Improve error message context for debugging. The generic "DuplicateKey" message doesn't indicate which key or table caused the duplicate, making debugging difficult in production. A similar issue exists at line 865.

🔎 Proposed fix
```diff
 if (batchKeys.find(idObj) != batchKeys.end()) {
-    return {ErrorCodes::DuplicateKey, "DuplicateKey"};
+    return {ErrorCodes::DuplicateKey,
+            str::stream() << "Duplicate key in batch for table "
+                          << _tableName.StringView() << ": " << idObj};
 }
```
Apply a similar fix at line 865:
```diff
-return {ErrorCodes::DuplicateKey, "DuplicateKey"};
+return {ErrorCodes::DuplicateKey,
+        str::stream() << "Duplicate key exists for table "
+                      << _tableName.StringView() << ": "
+                      << getIdBSONObjWithoutFieldName(BSONObj{records->at(i).data.data()})};
```
src/mongo/db/modules/eloq/src/eloq_index.cpp (3)
594-627: Memory safety issue: vector reallocation invalidates pointers. `keyStringBuffers` is not pre-reserved. As elements are emplaced (line 614), reallocation can invalidate pointers to previously stored data. The `MongoKey` objects in `mongoKeys` (lines 617-619) hold pointers to `keyStringBuffers[i].data()`, which become dangling after reallocation.

🔎 Proposed fix
```diff
 // Build batch for this unique index
 std::vector<std::string> keyStringBuffers;
+keyStringBuffers.reserve(keys.size());
 std::vector<std::unique_ptr<Eloq::MongoKey>> mongoKeys;
+mongoKeys.reserve(keys.size());
 std::vector<Eloq::MongoRecord> mongoRecords;
+mongoRecords.reserve(keys.size());
 std::vector<txservice::ScanBatchTuple> indexBatchTuples;
+indexBatchTuples.reserve(keys.size());
```
Note: for multikey indexes, `GetKeys()` at line 677 can return multiple keys per document, so the actual count may exceed `keys.size()`. Consider using `std::deque<std::string>` for `keyStringBuffers`, which never invalidates references on growth.
579-581: Critical bug: the `currentRecordId` parameter is ignored. The method signature accepts `currentRecordId` (for update operations, to exclude the current document from duplicate checks), but the implementation never uses it. Line 642 returns `DuplicateKey` for any existing key without checking whether it belongs to the document being updated. This will cause update operations to incorrectly fail with duplicate key errors when checking the document's own indexed keys.

🔎 Proposed fix
```diff
 // Check results for duplicates
 for (size_t batchIdx = 0; batchIdx < indexBatchTuples.size(); batchIdx++) {
     const txservice::ScanBatchTuple& tuple = indexBatchTuples[batchIdx];
     if (tuple.status_ == txservice::RecordStatus::Normal) {
-        // For insert operations, any existing key is a duplicate
-        return {ErrorCodes::DuplicateKey, "DuplicateKey"};
+        // For update operations, check if the existing record is the current document
+        if (currentRecordId.isValid()) {
+            const Eloq::MongoRecord* existingRecord =
+                static_cast<const Eloq::MongoRecord*>(tuple.record_);
+            if (existingRecord) {
+                RecordId existingRecordId = existingRecord->ToRecordId(false);
+                if (existingRecordId == currentRecordId) {
+                    // Same document, not a duplicate
+                    continue;
+                }
+            }
+        }
+        // Different document or insert operation - this is a duplicate
+        return {ErrorCodes::DuplicateKey,
+                str::stream() << "Duplicate key for index " << _indexName.StringView()};
     } else if (tuple.status_ == txservice::RecordStatus::Deleted) {
         // Expected case
     } else {
         invariant(tuple.status_ == txservice::RecordStatus::Deleted);
     }
 }
```
Also applies to: 641-643
645-647: Replace unsafe invariant with explicit status handling. The invariant assumes only `Normal` or `Deleted` statuses, but `getKV()` explicitly handles `txservice::RecordStatus::Unknown` with retry logic (per the related code in eloq_recovery_unit.cpp). If an `Unknown` status appears in batch results, the invariant will crash the server.

🔎 Proposed fix
```diff
 for (size_t batchIdx = 0; batchIdx < indexBatchTuples.size(); batchIdx++) {
     const txservice::ScanBatchTuple& tuple = indexBatchTuples[batchIdx];
     if (tuple.status_ == txservice::RecordStatus::Normal) {
         // ... duplicate key handling
+    } else if (tuple.status_ == txservice::RecordStatus::Deleted) {
+        // Expected case for insert
     } else {
-        invariant(tuple.status_ == txservice::RecordStatus::Deleted);
+        return {ErrorCodes::InternalError,
+                str::stream() << "Unexpected record status from batchGetKV: "
+                              << static_cast<int>(tuple.status_)};
     }
 }
```
Based on learnings, the retry logic in `getKV()` suggests `Unknown` is a valid transient state that should be handled.
🧹 Nitpick comments (2)
src/mongo/db/modules/eloq/src/eloq_record_store.cpp (1)
840-840: Consider performance impact of `getOwned()` in a tight loop. Calling `getOwned()` for every key in the batch allocates memory for each BSON object. For large batches, this overhead may be significant. If the source documents remain in scope for the batch lifetime (which is typical), consider storing pointers or references instead, or profile to verify the overhead is acceptable for expected batch sizes.
src/mongo/db/modules/eloq/src/eloq_index.cpp (1)
680-682: Consider performance impact of `getOwned()` in a tight loop. Calling `getOwned()` for every key allocates memory. For large batches or multikey indexes (which can produce many keys per document), this overhead may be significant. If the source documents remain in scope, consider using references, or measure whether the overhead is acceptable for expected workloads.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- src/mongo/db/commands/write_commands/write_commands.cpp
- src/mongo/db/modules/eloq/data_substrate
- src/mongo/db/modules/eloq/src/eloq_index.cpp
- src/mongo/db/modules/eloq/src/eloq_record_store.cpp
- src/mongo/db/ops/write_ops_exec.cpp
✅ Files skipped from review due to trivial changes (1)
- src/mongo/db/modules/eloq/data_substrate
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-11-24T03:11:24.955Z
Learnt from: lzxddz
Repo: eloqdata/eloqdoc PR: 325
File: src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp:1362-1384
Timestamp: 2025-11-24T03:11:24.955Z
Learning: The EloqDoc codebase uses an older version of glog that supports the two-parameter signature `google::InitGoogleLogging(const char* argv0, PrefixCallback)` for custom log prefix formatting, as seen in src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp line 1383.
Applied to files:
src/mongo/db/modules/eloq/src/eloq_index.cpp
📚 Learning: 2025-11-28T10:55:21.020Z
Learnt from: githubzilla
Repo: eloqdata/eloqdoc PR: 279
File: src/mongo/db/modules/eloq/CMakeLists.txt:54-55
Timestamp: 2025-11-28T10:55:21.020Z
Learning: In src/mongo/db/modules/eloq/CMakeLists.txt, the CMake options EXT_TX_PROC_ENABLED, FORK_HM_PROCESS, and STATISTICS are used by the data_substrate submodule build system and should be retained as configuration interface options.
Applied to files:
src/mongo/db/modules/eloq/src/eloq_index.cpp
📚 Learning: 2025-11-14T03:10:22.027Z
Learnt from: githubzilla
Repo: eloqdata/eloqdoc PR: 279
File: src/mongo/db/dbmain.cpp:0-0
Timestamp: 2025-11-14T03:10:22.027Z
Learning: In this repository, avoid defining --config via gflags in src/mongo/db/dbmain.cpp; MongoDB’s options_parser owns --config. Prefer gflags flags with a distinct prefix (e.g., --data_substrate_config) parsed before options_parser, with options_parser configured to allow unregistered options.
Applied to files:
src/mongo/db/ops/write_ops_exec.cppsrc/mongo/db/commands/write_commands/write_commands.cpp
🧬 Code graph analysis (1)
src/mongo/db/modules/eloq/src/eloq_record_store.cpp (3)
- src/mongo/db/modules/eloq/src/eloq_index.cpp (11): mongoKey (237-237), batchCheckDuplicateKey (654-687), batchCheckDuplicateKey (654-655), opCtx (218-223), opCtx (218-218), MONGO_LOG (188-190), MONGO_LOG (192-196), MONGO_LOG (198-209), MONGO_LOG (211-216), MONGO_LOG (332-364), MONGO_LOG (390-414)
- src/mongo/db/modules/eloq/src/eloq_recovery_unit.cpp (2): get (277-279), get (277-277)
- src/mongo/db/modules/eloq/src/eloq_record_store.h (1): insertRecords (87-92)
Summary by CodeRabbit
New Features
Bug Fixes
Tests