Performance enhancements for the MongoDB Java driver #66
that is wrapped around the DBPort network socket input stream. - Set the default buffer size to 65536, overriding the default BufferedInputStream buffer size, which seems to be currently set at 8192 bytes. This results in a consistent throughput increase of about 2% in a microbenchmark. Users can set a different size, should the need (or curiosity) arise.
for optimization. Using an extra class field to avoid unnecessary method calls results in a throughput increase of at least 2% in a few benchmarks.
reading bytes into a temporary buffer (_random) and using the more efficient PoolOutputBuffer.write(byte[], int, int) method to update the output buffer. Micro-benchmarks show a throughput increase in the 5-6% range.
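The idea behind the third change (collect the C-string bytes in a scratch array and hand them to the output buffer in bulk, rather than one `write(int)` call per byte) can be sketched roughly as below. This is not the driver's code: `ByteArrayOutputStream` stands in for `PoolOutputBuffer`, and the class and method names are hypothetical, chosen only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only. ByteArrayOutputStream is a stand-in for the
// driver's PoolOutputBuffer; all names here are hypothetical.
final class CStringReadSketch {

    // Baseline: one output-buffer call per byte until the NUL terminator.
    static String readPerByte(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) > 0) {
            out.write(b);
        }
        if (b < 0) {
            throw new EOFException("unterminated C-string");
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Bulk variant: stage bytes in a reusable scratch array (assumed non-empty)
    // and flush them with write(byte[], int, int) only when it fills up.
    static String readBulk(InputStream in, byte[] scratch) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int used = 0;
        int b;
        while ((b = in.read()) > 0) {
            scratch[used++] = (byte) b;
            if (used == scratch.length) {
                out.write(scratch, 0, used);
                used = 0;
            }
        }
        if (b < 0) {
            throw new EOFException("unterminated C-string");
        }
        out.write(scratch, 0, used); // flush whatever is left in the scratch array
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = "hello world\0next".getBytes(StandardCharsets.UTF_8);
        // A deliberately tiny scratch array exercises the intermediate flush path.
        System.out.println(readBulk(new ByteArrayInputStream(wire), new byte[4]));
    }
}
```

Both methods return the same string; the bulk variant simply makes fewer calls into the output buffer, which is where the commit message reports its gain. Whether that translates into a measurable difference outside the driver depends on the buffer implementation.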
Cool, the changes look good generally. I've added a JIRA issue to track this: https://jira.mongodb.org/browse/JAVA-541
Hi. Can you provide your benchmark in a comment or gist?
Are you asking about the benchmark results, or the benchmark code itself? The first I could type into a comment, but the second refers to significant parts of my codebase that I would not be able to release at this time - at least not without a lot of work.
The benchmark code. I understand you can't provide your application's source code, but can you create a small benchmark program that demonstrates the effectiveness of this change?
I cobbled together this little benchmark - just point it at a DB and collection with lots of C-strings, although some of the improvements should be visible everywhere:
In my own code I was using a custom callback that did some pre-processing on the received documents. This one is as simple as it gets and it seems to highlight the performance improvements even more. I see a 25% improvement over the 2.7.3 stable release - I'll see about comparing with git master shortly...
Any plans to incorporate any of these changes (in particular the buffer size)? I'm working with large result sets, and I suspect a larger network buffer size would help things for me.
Hi, we're currently working on a new driver on the 3.0 branch. We won't be making major changes to the 2.x driver, especially purely for performance, but we do hope the 3.0 driver will (eventually) be more performant. We certainly hope that people will be able to tweak and tune it better. When that's released, please feel free to profile that version and send us feedback.
Trisha
After several runs under a profiler for a specific use case, a few alterations resulted in a noticeable throughput increase. The improvement was consistently measured in the 10-15% range (or better - this number is rather conservative). This changeset affects three classes:
- DBPort (input stream buffer size, set by default to 64K)
- PoolOutputBuffer
- BasicBSONDecoder$BSONInput.readCstr()
The target benchmark was a single-threaded dump of a collection that contains mostly ASCII C-style strings. The changes are self-contained - each commit can be applied independently. The test suite passes cleanly on my own single-server setup. Considering the particulars, I believe these changes to be an improvement with no potential for adverse effects.
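For readers unfamiliar with the buffer-size change, here is a minimal sketch of the general technique: wrapping a raw input stream in a `BufferedInputStream` with an explicit, caller-adjustable size. It is not the `DBPort` code; the class, constant, and helper names are hypothetical, and an in-memory stream stands in for the network socket. The counting wrapper only shows how many bulk reads reach the underlying stream, which says nothing by itself about real network throughput.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Illustrative sketch only; names are hypothetical, not taken from the driver.
final class InputBufferSketch {

    // 64K default, as described in the changeset; callers may pass another size.
    static final int DEFAULT_BUFFER_SIZE = 65536;

    static InputStream buffered(InputStream raw, int bufferSize) {
        return new BufferedInputStream(raw, bufferSize);
    }

    // Counts bulk read calls that reach the wrapped (stand-in "socket") stream.
    static final class CountingInputStream extends FilterInputStream {
        int bulkReads = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            bulkReads++;
            return super.read(b, off, len);
        }
    }

    // Drains a zero-filled payload one byte at a time through the buffer and
    // reports how many bulk reads the underlying stream had to serve.
    static int countUnderlyingReads(int payloadBytes, int bufferSize) throws IOException {
        CountingInputStream raw =
                new CountingInputStream(new ByteArrayInputStream(new byte[payloadBytes]));
        InputStream in = buffered(raw, bufferSize);
        while (in.read() >= 0) {
            // discard; only the read count matters here
        }
        return raw.bulkReads;
    }

    public static void main(String[] args) throws IOException {
        int payload = 1_000_000;
        System.out.println("8192-byte buffer:  " + countUnderlyingReads(payload, 8192));
        System.out.println("65536-byte buffer: " + countUnderlyingReads(payload, DEFAULT_BUFFER_SIZE));
    }
}
```

A larger buffer means fewer trips to the underlying stream for the same payload. On a real socket each of those trips can involve a system call, which is the plausible source of the modest gain reported for this commit.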