Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eliminate temporary buffer allocations during bson serialization #1628

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

Edarke
Copy link
Contributor

@Edarke Edarke commented Feb 10, 2025

Fixes: JAVA-5786

  • Override Outputbuffer.toByteArray() to avoid extraneous array copy
  • Modify BsonBinaryWriter to call OutputBuffers.writeObjectId so implementations can opt to use putToByteBuffer rather than allocating a temporary byte[]
  • Wrap array in ByteBuffer so we can write numeric values 8 bytes at a time

Motivation

Atlas Search uses the Java driver to communicate with the Mongodb Server. We serialize up to ~300k documents per batch, primarily consisting of ObjectID and numbers. For these large batches, serialization has measurable overhead. This PR proposes some simple optimizations for BasicOutputBuffer without altering the API.

JMH Benchmarks

Benchmark Main (ops/ms) GC (B/op) This PR (ops/ms) GC (B/op)
toByteArray_large 0.907 26,777,857 2.269 13,388,928
toByteArray_small 110,515 160 184,816 80
encodeId 23,774 134 32,696 104
write_id 28,255 168 36,467 80
write_numbers 16,327 80 18,125 80

@jyemin
Copy link
Collaborator

jyemin commented Feb 10, 2025

@Edarke please open a Jira issue in the JAVA project for this PR, as per our contribution guide.

Thanks!

@Edarke Edarke requested a review from jyemin February 11, 2025 03:57
@jyemin jyemin requested a review from vbabanin February 20, 2025 16:18
@jyemin jyemin requested review from jyemin and removed request for jyemin March 7, 2025 19:29
Copy link
Member

@vbabanin vbabanin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! I just have a few minor comments.

throw new IllegalArgumentException(format("position must be >= 0 but was %d", absolutePosition));
}
if (absolutePosition > buffer.position() - bytesToWrite) {
throw new IllegalArgumentException(format("position must be <= %d but was %d", buffer.position() - 1, absolutePosition));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
throw new IllegalArgumentException(format("position must be <= %d but was %d", buffer.position() - 1, absolutePosition));
throw new IllegalArgumentException(format("position must be <= %d but was %d", buffer.position() - bytesToWrite, absolutePosition));

throw new IllegalArgumentException();
}
position = newPosition;
((Buffer) buffer).position(newPosition);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
((Buffer) buffer).position(newPosition);
// The cast is required for compatibility with JDK 9+ where ByteBuffer's position method is inherited from Buffer.
((Buffer) buffer).position(newPosition);

}

@Override
public void writeByte(final int value) {
ensureOpen();

ensure(1);
buffer[position++] = (byte) (0xFF & value);
buffer.put((byte) (0xFF & value));
}

@Override
protected void write(final int absolutePosition, final int value) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that the writeInt32(final int value) method has been updated to use ByteBuffer's bulk operation putInt(...), this method is no longer covered by tests. Previously, it was indirectly tested through writeInt32. Could you add a dedicated test case for this?

bsonOutput.position == 4
bsonOutput.size == 4
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def 'toByteArray creates a copy'() {
given:
def bsonOutput = new BasicOutputBuffer(10)
bsonOutput.writeBytes([1, 2, 3, 4] as byte[])
when:
def first = bsonOutput.toByteArray()
def second = bsonOutput.toByteArray()
then:
first !== second
first == [1, 2, 3, 4] as byte[]
second == [1, 2, 3, 4] as byte[]
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants