Possible performance optimizations of structured clone algorithm against ArrayBufferView? #9450

shuhaowu · 2023-06-21T15:57:33Z

I'm working on a web application where there is a large ArrayBuffer holding a large amount of data and the code is occasionally copying a subset of this ArrayBuffer (via a view constructed as a Uint8Array) to a web worker. I've noticed that this copy is quite slow as each copy is copying the entire underlying array buffer as opposed only the viewed data.

Looking under the hood, I see the structured clone algorithm is implemented via the StructuredSerializedInternal operation specified in section 2.7.3. If I'm reading this correctly, the Uint8Array is being cloned via point 14.2 and point 14.3, which invokes StructuredSerializedInternal again in which point 13.2 applies, where the entire underlying buffer is copied.

IMO this is somewhat of a footgun, as I and others assumed that copying an ArrayBufferView to a web worker would not copy the entire underlying buffer (in my case the underlying buffer is ~100x bigger than the view). If the standard were to change to copy only the viewed segment of the buffer when copying an ArrayBufferView, it would eliminate this footgun. However, this means the cloned ArrayBufferView would not be an exact clone, which might have side effects that I don't know about. In any case, I created this issue to raise awareness about this.

The text was updated successfully, but these errors were encountered:

annevk · 2023-06-21T17:25:53Z

Thanks for raising this. It's been this way for a while: 2cb7f4e. And that was based on a specification by @kenrussell per https://www.w3.org/Bugs/Public/show_bug.cgi?id=13800 that had the same behavior: https://web.archive.org/web/20130906003847/http://www.khronos.org/registry/typedarray/specs/latest/#9.

https://www.khronos.org/pipermail/public_webgl_khronos.org/2011-April/003287.html discusses how the transferring semantics were decided upon.

Per https://www.khronos.org/pipermail/public_webgl_khronos.org/2011-April/003343.html I think they might have tried to always have transfer semantics initially, so maybe the "clone" semantics were more of an afterthought?

That's a long way of saying that we most likely cannot change the default at this point, but we might be able to offer some kind of opt-in for creating a new view based on a slice of the original buffer. What would help motivate that is Stack Overflow questions, library code that already does essentially that, etc.

shuhaowu · 2023-06-21T17:55:09Z

Thanks for the prompt response and all the links to the relevant context. It is helpful for a more in-depth understanding.

That's a long way of saying that we most likely cannot change the default at this point, but we might be able to offer some kind of opt-in for creating a new view based on a slice of the original buffer. What would help motivate that is Stack Overflow questions, library code that already does essentially that, etc.

I don't know what to do either given it's been in the standards for this long. Creating an opt-in API to illustrate this problem might be good, but I don't think many people will be posting about this because it's a relatively niche problems that requires all of the following to be true:

You have a large ArrayBuffer.
You have to take a view of a small subset of said ArrayBuffer
You have to copy this view from/to a webworker.
You have to actually notice that this is a performance problem issue, as in many cases this is not going to be that big of a deal.

It happened to me because I'm working on a relatively complex application where milliseconds matter. Typically copying a small array takes about ~100us and I noticed it started to take 12ms with this issue, which is enough to cause problems for the application I'm working on. My point is that people are likely to not know about this problem as it flies under the radar and it causes only minor performance degradation that likely don't matter.

If others encounters this issue, the workaround is relatively simple: just copy the view into a separate typed array and transfer that instead.

domenic · 2023-06-21T21:53:15Z

I think changing this would not make a lot of sense with the semantics of structured clone. When you clone a plain JS object foo whose property bar is a cloneable object such as an ArrayBuffer, you end up with clonedFoo.bar being a full copy of foo.bar. Similarly, when you clone an ArrayBufferView view, you should end up with clonedView.buffer being a full copy of view.buffer.

In other words, ArrayBufferViews are just convenient classes that let you couple together { buffer, byteOffset, length }. Cloning one of these three properties in an incomplete fashion, based on the other two, seems like a bad idea. If you explicitly want to lose information, you should slice or resize the buffer directly.

shuhaowu · 2023-06-21T21:57:45Z

Agreed. That's what I meant earlier when I said changing the API would imply it is no longer an exact clone.

In that case, what about changing how postMessage work such that typed array is copied more efficiently? I haven't looked at that part of the spec and don't know what kind of side effects it would have. It seems unlikely that someone would want to postMessage with a typed array and intentionally wants to access the underlying buffer instead of the view.

domenic · 2023-06-21T22:54:25Z

I don't think that makes sense either. postMessage sends a clone of the values, and people have the same expectations about such clones.

kenrussell · 2023-06-21T23:48:24Z

@shuhaowu the APIs already exist to make a quick copy of the typed array, if the application's intent is to copy over only a small sub-region of a larger ArrayBuffer - just call the constructor (for example, new Uint8Array), passing in the existing Uint8Array. Then the copy can be sent to the worker via postMessage.

A zero-copy mechanism already exists for sending data to workers for processing: transferring the ArrayBuffer during postMessage, rather than copying it. However, this has the side-effect that the sending thread can not access the data until the worker is done, and transfers the ArrayBuffer back. Depending on the use case, the app might use a set of smaller ArrayBuffers, rather than one large one.

If using postMessage to send a typed array, it's implicitly sending the object graph reachable from that typed array, and that includes the ArrayBuffer. I don't think API changes are warranted here; the application needs to understand what is going on, and use the available primitives.

annevk added the topic: serialize and transfer label Jun 21, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Possible performance optimizations of structured clone algorithm against ArrayBufferView? #9450

Possible performance optimizations of structured clone algorithm against ArrayBufferView? #9450

shuhaowu commented Jun 21, 2023

annevk commented Jun 21, 2023

shuhaowu commented Jun 21, 2023

domenic commented Jun 21, 2023

shuhaowu commented Jun 21, 2023

domenic commented Jun 21, 2023

kenrussell commented Jun 21, 2023

Possible performance optimizations of structured clone algorithm against ArrayBufferView? #9450

Possible performance optimizations of structured clone algorithm against ArrayBufferView? #9450

Comments

shuhaowu commented Jun 21, 2023

annevk commented Jun 21, 2023

shuhaowu commented Jun 21, 2023

domenic commented Jun 21, 2023

shuhaowu commented Jun 21, 2023

domenic commented Jun 21, 2023

kenrussell commented Jun 21, 2023