Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace buffer implementation with standard Uint8Array #35

Closed
jfirebaugh opened this issue May 13, 2016 · 10 comments
Closed

Replace buffer implementation with standard Uint8Array #35

jfirebaugh opened this issue May 13, 2016 · 10 comments
Assignees

Comments

@jfirebaugh
Copy link

The conditional use of node's Buffer versus a shim based on Uint8Array leads to subtle incompatibilities between node and the browser. I just spent a couple hours debugging an issue in mapbox-gl-js that stems from the following difference:

var Pbf = require('pbf');
var pbf = new Pbf();
pbf.writeVarint(0);
console.log(pbf.finish().buffer.byteLength);

In the browser, this prints 16, as expected. In node, it prints 8192.

To eliminate this gotcha, pbf should use standard Uint8Array's everywhere, and drop usage of Buffer.

@jfirebaugh
Copy link
Author

Upstream issue: nodejs/node#6744

@mourner
Copy link
Member

mourner commented May 14, 2016

I think the main reason to use Buffer instead of Uint8Array was that native write, toString and other shimmed methods were simply much faster in the Node internal version. This may have changed with new Node versions — we need to check this again.

@mourner
Copy link
Member

mourner commented May 16, 2016

Yeah, just confirmed by running the benchmark in Node on both:

# Node Buffer
decode vector tile with pbf x 750 ops/sec ±2.59% (84 runs sampled)
encode vector tile with pbf x 596 ops/sec ±1.70% (80 runs sampled)

# Uint8Array shim
decode vector tile with pbf x 404 ops/sec ±1.17% (83 runs sampled)
encode vector tile with pbf x 631 ops/sec ±1.96% (84 runs sampled)

Writing is surprisingly slightly faster, but reading performance drop is dramatic (almost twice as slow) — we can't allow such a regression.

@jfirebaugh
Copy link
Author

It's on the writing side that I lost most of a day to debugging a subtle difference between Buffer and Uint8Array, namely that slice() returns a new Uint8Array but a view of the underlying Buffer.

Can we use Uint8Array unconditionally for writing, or at least ensure that finish() returns a fresh Uint8Array in both implementations?

@mourner
Copy link
Member

mourner commented May 16, 2016

I don't yet see a good solution here. We can't switch the shim for writing only because you create a Pbf object once and can then mix read/write on it. And converting from Buffer to Uint8Array in finish would require a full copy of the data.

Would it be acceptable to leave the current implementation but add a warning to the finish method description in the docs? E.g. warning about type difference and advocating the use of pbf.length instead of result.buffer.byteLength.

@jfirebaugh
Copy link
Author

Nobody will find that comment until they've already spent a long time debugging some obscure issue.

You've already benchmarked that using Uint8Array for writes is actually faster. So why not use them?

@jfirebaugh
Copy link
Author

Workaround in gl-js: mapbox/mapbox-gl-js@57be2ce

@mourner
Copy link
Member

mourner commented May 16, 2016

You've already benchmarked that using Uint8Array for writes is actually faster. So why not use them?

As I mentioned earlier, we can only switch to Uint8Array for both read/write at the same time, and making reads 2 times slower is not an option. I'm investigating why exactly it's slower, but can't promise to find an actionable cause.

@jfirebaugh
Copy link
Author

jfirebaugh commented Aug 24, 2016

I still think we should just use and accept the performance of Uint8Array. High-performance node applications are going to want to use a native module for pbfs anyway.

The current code forces copying the data in the browser when you do new Pbf(uint8Array).

@mourner
Copy link
Member

mourner commented Aug 25, 2016

Working on this. Managed to get the Uint8Array decoding perf drop to ~9% from ~46%.

@mourner mourner self-assigned this Aug 25, 2016
jfirebaugh added a commit to mapbox/mapbox-gl-js that referenced this issue Aug 25, 2016
Refs mapbox/pbf#35 (comment)

A copy, rather than a transfer, _is_ needed when sending the rawTileData to the main thread. In that case, don't use a transferrable.
jfirebaugh added a commit to mapbox/mapbox-gl-js that referenced this issue Aug 25, 2016
Refs mapbox/pbf#35 (comment)

A copy, rather than a transfer, _is_ needed when sending the rawTileData to the main thread. In that case, don't use a transferrable.
@mourner mourner closed this as completed Aug 25, 2016
jfirebaugh added a commit to mapbox/mapbox-gl-js that referenced this issue Aug 30, 2016
Refs mapbox/pbf#35 (comment)

A copy, rather than a transfer, _is_ needed when sending the rawTileData to the main thread. In that case, don't use a transferrable.
jfirebaugh added a commit to mapbox/mapbox-gl-js that referenced this issue Aug 30, 2016
Refs mapbox/pbf#35 (comment)

A copy, rather than a transfer, _is_ needed when sending the rawTileData to the main thread. In that case, don't use a transferrable.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants