perf(NODE-3570): short circuit check for keys that start with $ #78
Hi @znewsham, thanks for taking the time to offer some performance improvements for bson-ext. We have not maintained the library for some time, so performance degradation is expected. It was written when Node.js and V8 were in a very different place perf-wise, so there was motivation to write native addons to increase performance. Today we believe we can get more performance out of improvements to our js-bson library than out of bson-ext, due to the inherent limitation of crossing the JS/C++ boundary. So we planned to deprecate this library soon in favor of making js-bson the go-to library for performance. Notably, a recent release of js-bson contained an improvement similar to the one you suggest here, in how we go about parsing DBRef. That being said, we appreciate the effort to improve this library and are open to accepting changes. If you still want to see these improvements land, could you share the numbers you've measured and how you went about measuring them? Also, what version of js-bson are you comparing against? And, just confirming, is this your community post? If so I'll just link to this thread and we can continue the conversation here :) Tracking: NODE-3570
Hi @nbbeeken - I had no idea there was a plan to deprecate this library :( I can't find any reference to this anywhere in the documentation, and it's still referenced by the mongo node driver. We recently switched to it as it offered a small (~2% in our testing) performance improvement in our general cases, though it is slower in some specific cases. Yes - that's my post! Figured this out about 5 mins after I posted that :D I'm a little surprised that native modules don't have a clear performance benefit over JS for cases such as this - I guess perhaps there is a tradeoff where smaller objects are faster with JS and larger ones are faster with native? Perhaps a hybrid model could work here, something heuristic based, e.g., > X bytes use native? I'm comparing I'm getting results like this - where each line is the average time taken over about 333 passes (randomized order)
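For anyone wanting to reproduce this kind of measurement, here is a minimal sketch of averaging over many passes with the case order shuffled on each pass. It is not the benchmark used in this thread; `shuffle`, `averageTimes`, and the shape of the `cases` array are hypothetical names made up for illustration.

```javascript
const { performance } = require('perf_hooks');

// Fisher-Yates shuffle that returns a new array and leaves the input untouched.
function shuffle(items) {
  const out = items.slice();
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [out[i], out[j]] = [out[j], out[i]];
  }
  return out;
}

// Runs every case once per pass, in a fresh random order each pass,
// and returns the average wall-clock time per case in milliseconds.
// `cases` is an array of { name, run } objects (a hypothetical shape).
function averageTimes(cases, passes) {
  const totals = new Map(cases.map((c) => [c.name, 0]));
  for (let pass = 0; pass < passes; pass++) {
    for (const c of shuffle(cases)) {
      const start = performance.now();
      c.run();
      totals.set(c.name, totals.get(c.name) + (performance.now() - start));
    }
  }
  const averages = {};
  for (const [name, total] of totals) {
    averages[name] = total / passes;
  }
  return averages;
}
```

Shuffling per pass is meant to keep any one implementation from always running first or last, which could otherwise skew results through JIT warm-up or garbage collection timing.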
Don't give up, we use it for large object processing and get a significant performance improvement.
TL;DR: we'll look into merging this soon if the logic checks out. Okay! So I've taken the benchmark you provided here and expanded it a bit in this repository; take a look at the code here, and the results can be seen here (from a run on a GitHub Actions machine). I added a large object from this JSON file to give us something big with a complex shape; it is marked as DATAS in the results. I also tested serialization, just to see how it compares. Bear with me while I describe the steps I've taken below and how they confirm our performance expectations. (Also feel free to correct the direction I took for testing this if you see something is off.) Focusing only on deserialization and bson-ext, the results currently posted are:
Calculating the low end of the error margin for
Calculating the high end of the error margin for
So the calculations above show that even when taking the margins of error to their extreme in either direction, we still have a clear improvement in performance. Thanks again for taking the time to contribute this! I will bring the team in to take a look at the logic and confirm we're set to merge this in. On the subject of bson-ext itself: you are correct that the deprecation I mentioned isn't properly documented anywhere, apologies, so I should instead say the planned support for this lib is TBD. We will clear up the messaging on that soon. I agree that maybe there is a future for some hybrid model, or possibly there are more improvements that could be made to the JavaScript implementation.
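The worst-case comparison described above can be sketched roughly as follows, assuming each benchmark result is reported as operations per second plus a relative margin of error in percent. The field names `opsPerSec` and `marginPercent` and both helpers are hypothetical, and the numbers in the usage lines are made-up placeholders, not the results from this thread.

```javascript
// Low and high ends of a result given as ops/sec with a relative margin in percent.
function bounds(result) {
  const delta = result.opsPerSec * (result.marginPercent / 100);
  return { low: result.opsPerSec - delta, high: result.opsPerSec + delta };
}

// Most pessimistic speedup: the slowest plausible "after" run
// divided by the fastest plausible "before" run.
// A value above 1 means the change is still faster at the extremes.
function worstCaseSpeedup(before, after) {
  return bounds(after).low / bounds(before).high;
}

// Placeholder numbers for illustration only.
const before = { opsPerSec: 1000, marginPercent: 2 };
const after = { opsPerSec: 1300, marginPercent: 2 };
console.log(worstCaseSpeedup(before, after) > 1);
```

If the ratio stays above 1 even in this pessimistic pairing, the improvement cannot be explained by the reported margins of error alone.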
Started a patch here: https://spruce.mongodb.com/version/6137869a850e61403b8b54d4
LGTM. Thanks again for this!
Description
If no key starts with a $, there can be no $ref/$id/$db property. Looking for these is currently pretty inefficient.
Doing this results in about a 25% performance improvement (making it faster than JS where it would otherwise be slower).
All tests are passing.
Turn off whitespace changes when reviewing - most of the change is just indenting the existing functionality into an if.
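To illustrate the idea in the description, here is a minimal JavaScript sketch of the short circuit. It is not the C++ change in this PR and not the js-bson source: `buildDocument`, `isDBRefLike`, the `[key, value]` entry format, and the plain object returned for a DBRef are all hypothetical stand-ins, and the DBRef shape check is simplified.

```javascript
const DOLLAR = 0x24; // character code of '$'

// Simplified shape check: a string $ref, some $id, and an optional string $db.
function isDBRefLike(doc) {
  return (
    typeof doc.$ref === 'string' &&
    doc.$id !== undefined &&
    (doc.$db === undefined || typeof doc.$db === 'string')
  );
}

// Builds a document from already-decoded [key, value] entries.
// While copying keys, it records whether any key starts with '$'.
function buildDocument(entries) {
  const doc = {};
  let sawDollarKey = false;
  for (const [key, value] of entries) {
    if (key.charCodeAt(0) === DOLLAR) sawDollarKey = true;
    doc[key] = value;
  }
  // Short circuit: without a '$' key there can be no $ref/$id/$db,
  // so the property lookups in isDBRefLike are skipped entirely.
  if (sawDollarKey && isDBRefLike(doc)) {
    // Stand-in for constructing a real DBRef instance.
    return { isDBRef: true, collection: doc.$ref, id: doc.$id, db: doc.$db };
  }
  return doc;
}
```

The flag costs one character comparison per key during a loop that already visits every key, so documents without `$` keys, presumably the common case, pay almost nothing for DBRef detection.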