Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(NODE-3993): Improve serializer performance by avoiding V8-bottleneck #490

Closed
wants to merge 6 commits into from

Conversation

Uzlopak
Copy link
Contributor

@Uzlopak Uzlopak commented Feb 14, 2022

https://jira.mongodb.org/browse/NODE-3993

This PR targets to improve the serializer performance. Benchmarks are included.

There is a bottleneck in V8 since node 10 when using buffer.write results in a performance penalty. nodejs/node#32226

I used this gist as blueprint for the fast buffer write functions.
https://gist.github.com/ledbit/fa90e5fbefd8076f4395aa6d8e410aa6

before:

$node benchmarks/serialize.js 
serialize bigboard x 1,363 ops/sec ±1.12% (92 runs sampled)
serialize checkitem x 401,855 ops/sec ±0.91% (93 runs sampled)

after:

$ node benchmarks/serialize.js 
serialize bigboard x 3,316 ops/sec ±0.63% (92 runs sampled)
serialize checkitem x 650,828 ops/sec ±1.84% (81 runs sampled)

So there is a 50-100% performance gain.

CCing the performance gang.
@mcollina
@kibertoad
@zekth

@dariakp
Copy link
Contributor

dariakp commented Feb 14, 2022

@Uzlopak Hi, thanks for submitting this improvement. As it stands, this would be a breaking change - it seems we do not have an existing test for the case of invalid utf-8 roundtripping checked into the codebase, but this is the difference in behavior for this PR vs our released version:

> BSONUzlo.serialize({ a: '😎'.slice(0,1) })
<Buffer 11 00 00 00 02 61 00 05 00 00 00 f0 9f 90 80 00 00>
> BSONUzlo.deserialize(BSONUzlo.serialize({ a: '😎'.slice(0,1) }))
{ a: '🐀' }
//////////////////////////////////////////////////////////////////////////////
> BSONRelease.serialize({ a: '😎'.slice(0,1) })
<Buffer 10 00 00 00 02 61 00 04 00 00 00 ef bf bd 00 00>
> BSONRelease.deserialize(BSONRelease.serialize({ a: '😎'.slice(0,1) }))
{ a: '�' }

In general, this feature aligns with our goal to improve the BSON library's performance; however, we'll want to make sure that any major changes to the serializer are thoroughly unit tested, and come after we do some work to get a perf testing suite into our CI as well as revamp the underlying dependencies of this library in our effort to modernize the codebase. We'll discuss where this work fits into our existing feature queue and give you an update once we have a more solid idea of the path forward and the corresponding timeline.

@Uzlopak
Copy link
Contributor Author

Uzlopak commented Feb 14, 2022

@dariakp

I took now some code from the buffer package. It seems that it works now like your code example.

@Uzlopak
Copy link
Contributor Author

Uzlopak commented Feb 14, 2022

The performance gain is still existing.

node benchmarks/serialize.js
serialize bigboard x 3,515 ops/sec ±0.60% (93 runs sampled)
serialize checkitem x 658,001 ops/sec ±2.03% (80 runs sampled)

@Uzlopak
Copy link
Contributor Author

Uzlopak commented Feb 19, 2022

@dariakp

I wonder. Is it possible to create from js-bson a conformancy test suite?
Also making mongodb-native-driver to accept a custom js-bson serializer/deserializer?

This would make empower us to write our own fast bson implementations and use them in mongodb-native-driver.

Also maybe consider to hire other experts to write a fast bson serializer/deserializer, like @marcj or @kriszyp .
Tbh, I think @kriszyp based on his work on his projects cbor-x and msgpackr would be a perfect dev for this work. Also I think we should consider to not use typescript for this if it means slower performance. It needs to be as fast as possible and as lightweight as possible.

@nbbeeken
Copy link
Contributor

Hi @Uzlopak BSON conformance testing would be our BSON corpus test suite described in our specifications repo here. And I've addressed your other questions in the related tickets: NODE-4024, NODE-4022

@jimmywarting

This comment was marked as resolved.

@baileympearson
Copy link
Contributor

Hey @jimmywarting - We use Jira for feature requests so that they're visible to more than just our team and easier to keep track of. I created a ticket to track this particular request here - https://jira.mongodb.org/browse/NODE-4055.

Feel free to comment on the ticket with additional context if you'd like!

@jimmywarting

This comment was marked as resolved.

@bajanam bajanam added the tracked-in-jira There is a ticket in Mongo's Jira instance tracking this issue/PR label Mar 7, 2022
@billouboq
Copy link
Contributor

Hello, is this PR gonna be merged soon ?

@nbbeeken
Copy link
Contributor

Hi @billouboq, we have this tagged for our next major version of the library which is on our horizon but hasn't been kicked off yet. Currently, the BSON library was not directly responsible for UTF encoding so this is a sizable change that we need to make sure we've adequately covered and considered the impact across all environments we support (non-chrome web, electron, react native) and downstream projects. You can click "Start watching this issue" on JIRA to follow for updates.

But luckily there is an immediate solution if you are comfortable using an unsupported version of the library, npm overrides can pull this change in:

"overrides": {
    "bson": "github:Uzlopak/js-bson#add-benchmarks"
  },
  "dependencies": {
    "mongodb": "^4.5.0"
  }

@j3bb9z
Copy link

j3bb9z commented Oct 5, 2022

Hi @nbbeeken, when is the next major release planned? Around October 18th, along with MongoDB 6.0?

@Uzlopak
Copy link
Contributor Author

Uzlopak commented Oct 5, 2022

I dont think so. Anyway this perf-bottleneck is probably fixed or not fixed in node 18. I close this as there is no point to have this open as it will never be merged.

@Uzlopak Uzlopak closed this Oct 5, 2022
@Uzlopak Uzlopak deleted the add-benchmarks branch October 5, 2022 11:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tracked-in-jira There is a ticket in Mongo's Jira instance tracking this issue/PR
Projects
None yet
8 participants