Skip to content

Conversation

jyemin
Copy link
Collaborator

@jyemin jyemin commented Jan 6, 2022

The previous code was incorrect because it was comparing absolute write indexes
with indexes that are relative to the current batch. This patch avoids that
by using the insertedId map from SplittablePayload directly, which already
contains absolute write indexes.

JAVA-4436

@jyemin jyemin requested a review from rozza January 6, 2022 13:28
@jyemin jyemin self-assigned this Jan 6, 2022
The previous code was incorrect because it was comparing absolute write indexes
with indexes that are relative to the current batch. This patch avoids that
by using the insertedId map from SplittablePayload directly, which already
contains absolute write indexes.

JAVA-4436
@@ -282,21 +281,11 @@ private BulkWriteResult getBulkWriteResult(final BsonDocument result) {
}

private List<BulkWriteInsert> getInsertedItems(final BsonDocument result) {
if (payload.getPayloadType() == SplittablePayload.Type.INSERT) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the conditional just to simplify the code. If the payload type is anything but INSERT, then payload.getInsertedIds will be empty anyway.

if (payload.getPayloadType() == SplittablePayload.Type.INSERT) {

Stream<WriteRequestWithIndex> writeRequests = payload.getWriteRequestWithIndexes().stream();
List<Integer> writeErrors = getWriteErrors(result).stream().map(BulkWriteError::getIndex).collect(Collectors.toList());
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed this to a Set to make the contains check more efficient.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack


Stream<WriteRequestWithIndex> writeRequests = payload.getWriteRequestWithIndexes().stream();
List<Integer> writeErrors = getWriteErrors(result).stream().map(BulkWriteError::getIndex).collect(Collectors.toList());
if (!writeErrors.isEmpty()) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed this conditional, as it seems an unnecessary optimization. The contains check will be fast enough on any empty set.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack

}
return Collections.emptyList();
Set<Integer> writeErrors = getWriteErrors(result).stream().map(BulkWriteError::getIndex).collect(Collectors.toSet());
return payload.getInsertedIds().entrySet().stream()
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's safe to use SplittablePayload`s insertedIds map directly, since it will only contain the ids from the previously inserted batch.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So much neater and easier to read.

@@ -228,23 +229,71 @@ class BulkWriteBatchSpecification extends Specification {
!bulkWriteBatch.hasAnotherBatch()
}

def 'should only map inserts up to the payload position'() {
def 'should map all inserted ids'() {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might have made this test more complicated that it needs to be, as I changed it first just to reproduce the bug, but before I settled on the simplified approach in BulkWriteBatch

new BulkWriteInsert(2, new BsonInt32(2))]
}

def 'should not map inserted id with a write error'() {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There wasn't a test for this so added one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good stuff

Copy link
Member

@rozza rozza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much easier to read now.

I think you mentioned checking upsertedIds as well - does anything need happen with them?

}
return Collections.emptyList();
Set<Integer> writeErrors = getWriteErrors(result).stream().map(BulkWriteError::getIndex).collect(Collectors.toSet());
return payload.getInsertedIds().entrySet().stream()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So much neater and easier to read.

if (payload.getPayloadType() == SplittablePayload.Type.INSERT) {

Stream<WriteRequestWithIndex> writeRequests = payload.getWriteRequestWithIndexes().stream();
List<Integer> writeErrors = getWriteErrors(result).stream().map(BulkWriteError::getIndex).collect(Collectors.toList());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack


Stream<WriteRequestWithIndex> writeRequests = payload.getWriteRequestWithIndexes().stream();
List<Integer> writeErrors = getWriteErrors(result).stream().map(BulkWriteError::getIndex).collect(Collectors.toList());
if (!writeErrors.isEmpty()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack

new BulkWriteInsert(2, new BsonInt32(2))]
}

def 'should not map inserted id with a write error'() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good stuff

@jyemin
Copy link
Collaborator Author

jyemin commented Jan 6, 2022

I think you mentioned checking upsertedIds as well - does anything need happen with them?

No, I was wrong about that, as upsertedIds come directly from the server's response and wouldn't suffer from the same mapping issue.

@jyemin jyemin merged commit 3e5992e into mongodb:master Jan 6, 2022
@jyemin jyemin deleted the j4436 branch January 6, 2022 14:39
jyemin added a commit that referenced this pull request Jan 6, 2022
The previous code was incorrect because it was comparing absolute write indexes
with indexes that are relative to the current batch. This patch avoids that
by using the insertedId map from SplittablePayload directly, which already
contains absolute write indexes.

JAVA-4436
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants