Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

discuss: object serializer/deserializer + transferables #6300

Closed
eljefedelrodeodeljefe opened this issue Apr 20, 2016 · 15 comments
Closed

discuss: object serializer/deserializer + transferables #6300

eljefedelrodeodeljefe opened this issue Apr 20, 2016 · 15 comments
Labels
c++ Issues and PRs that require attention from people who are familiar with C++. discuss Issues opened for discussions and feedbacks. feature request Issues that request new features to be added to Node.js. v8 engine Issues and PRs related to the V8 dependency.

Comments

@eljefedelrodeodeljefe
Copy link
Contributor

I wanted to propose this issue as a top-level home for discussions from multiple issues, amongst #2133, #3145.

Both issues are too big to have immediate action (judging from what I have read), but they have a serializer/deserializer feature in common, node, c++ and v8 are not providing out of the box. @bnoordhuis mentioned that it seems to be unrealistic to drag in Blink code because of the many dependencies within that repo. I had a look also into their WebWorker implementation, which is fairly huge.

(Ignoring the WebWorker and IPC discussion) the question would be whether we first wanted to provide an efficient serializer/ deserializer + transferables of c++ objects, which might aid a lot of problems at once and also be a nice standalone feature.

Reading up on this from various c++ resources this is of course a non-trivial task. Whereas I also don't believe there will ever be a single method for arbitrary object serialization (please prove me wrong), due to the way v8 works. We would need to judge this before hand because that means a lot of serialization strategies would be insufficient for us.

As a quick guess, that would probably lead us to implement ::serialize() methods per Class, no matter if binary or text, which would ensure serilization happens correctly at dev / build time.

@Fishrock123 Fishrock123 added the discuss Issues opened for discussions and feedbacks. label Apr 20, 2016
@mscdex mscdex added the c++ Issues and PRs that require attention from people who are familiar with C++. label Apr 20, 2016
@Fishrock123 Fishrock123 added the v8 engine Issues and PRs related to the V8 dependency. label Apr 24, 2016
@Fishrock123
Copy link
Member

Should this be pushed for in v8?

Otherwise, is this an eventual candidate for the VM/API shim?

Honestly, there isn't much point for this until workers land afaik.

@bnoordhuis
Copy link
Member

Cluster would benefit from faster serialization/deserialization. People complain it's slow to send large objects, thinking it's because it goes over a UNIX socket, but in my experience, it's the JSON.stringify/JSON.parse calls that are most expensive.

@ChALkeR
Copy link
Member

ChALkeR commented Apr 24, 2016

Another potentialy related issue: #5453.

@ChALkeR
Copy link
Member

ChALkeR commented Apr 24, 2016

child_process is documented to use JSON.stringify since #5723, so any change there would be a semver-major.

@eljefedelrodeodeljefe
Copy link
Contributor Author

eljefedelrodeodeljefe commented Apr 24, 2016

@ChALkeR wouldn't (w/o searching the bottleneck) need to be a replacement but rather be an option. E.g. we could provide the option for shared memory IPC where this would be useful.

@Fishrock123 looking at specs I doubt that we want to be a vm-concurrency only program, no? Since there are multiple workers also, I see anyhow a multi-concurrency model coming for the lang.

So from the discussion so far I conclude, that it would be interesting to at least look into it. If time allows I do so in the next couple of weeks. @bnoordhuis have you done any work on this so far? I know you have deferred this more then once in the past.

@bnoordhuis
Copy link
Member

child_process is documented to use JSON.stringify since #5723, so any change there would be a semver-major.

We could make it opt-in by adding a new configuration option or a command line flag. That would make it semver-minor.

@bnoordhuis have you done any work on this so far?

Nothing substantial.

@targos
Copy link
Member

targos commented Apr 25, 2016

While I am for some kind of serializer that would allow to transfer more things than JSON.stringify (thinking about undefined or circular structures), I don't think we would achieve the main goal (performance) with Blink's code.
It is a lot faster to use JSON.parse/JSON.stringify to transfer objects to Web Workers or iframes in Chrome.

cjihrig added a commit to cjihrig/node that referenced this issue Apr 26, 2016
By default process.send() serializes messages using vanilla
JSON.stringify(). This commit makes the serialization function
configurable.

Refs: nodejs#6300
@camillobruni
Copy link
Contributor

V8-dev here: There work in progress to migrate the blink serializer to V8 see https://bugs.chromium.org/p/chromium/issues/detail?id=148757.
Note that as a first step we will implement a legacy serialization format which is not supposed to be used outside of blink as it has certain shortcomings. In a second step a newer format is going to be implemented which should be more beneficial. If Node is interested in a better serialization format other than JSON, I suggest to have a look at the new format once it's ready.

@eljefedelrodeodeljefe
Copy link
Contributor Author

Thanks for sharing. Might have missed it, but is there any other place a design discussion happened (on the internet)?

@jasnell jasnell added the feature request Issues that request new features to be added to Node.js. label Aug 16, 2016
@jasnell
Copy link
Member

jasnell commented Aug 16, 2016

@camillobruni ... out of curiosity, what is the newer format that is being looked at? Wouldn't be https://tools.ietf.org/html/rfc7049 by chance, would it? If so, +1.

@camillobruni
Copy link
Contributor

@jeremyroman is the implementor, he will know more details. From what I recall, its definitely not based on rfc7049 but rather an extension of the existing format.

@jeremyroman
Copy link

Hi, Blink dev here.

My objective is to support the HTML structured clone algorithm for Blink, and for legacy reasons, we need to be able to read the format we already store on disk for IndexedDB. Even without the legacy constraint, the RFC7049 format doesn't seem to support some of the things HTML structured clone requires (e.g. reference equality is preserved, i.e., if a is an object, [a,a] deserializes such that a === a). Once that's done, I'll probably tweak the format to remove some legacy quirks and store some extra information to help speed up deserialization a little.

I have a slightly out of date doc about that work. It's written from the perspective of Blink's needs.

While I do expect it to substantially outperform Blink's current structured clone, there is some cost to tracking references (to handle circular structures etc.), so it may not outperform JSON.stringify.

@rumkin
Copy link
Contributor

rumkin commented Sep 12, 2016

@jeremyroman Will it support regular expression, functions and streams?

@jeremyroman
Copy link

It will support RegExp, yes. Functions and streams are not structured clonable, so at present, no. I believe it's "no" to the others. (It's not even clear how a function would be cloned in general, given the existence of both "native" functions and functions which close over variables.)

@addaleax
Copy link
Member

This has happened in #11048, closing 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ Issues and PRs that require attention from people who are familiar with C++. discuss Issues opened for discussions and feedbacks. feature request Issues that request new features to be added to Node.js. v8 engine Issues and PRs related to the V8 dependency.
Projects
None yet
Development

No branches or pull requests