Async methods for JSON.parse and JSON.stringify #2031

Closed
asilvas opened this Issue Jun 22, 2015 · 17 comments

@asilvas

asilvas commented Jun 22, 2015

Reviving a request made in the old joyent/node repo: nodejs/node-v0.x-archive#7543

Given that parsing or stringifying large JSON objects can tie up the current thread, it would be nice to see async versions of these methods. I know that parseAsync and stringifyAsync would look a bit alien in node.js, but this is functionality that would be best served internally by a thread pool, off the main thread.

It would also help to expose options controlling when a given level of fallback is used internally, based on the size of the string or object. For example, a 300 KB JSON string takes about 12-14 ms to parse, holding up the main thread. This is an extreme example, but it all adds up.
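
To make the blocking cost concrete, here is a rough sketch that times a synchronous JSON.parse on a payload of about 300 KB. The helper names `buildPayload` and `timeParse` and the shape of the generated items are assumptions made for illustration; the measured time depends on the machine and the V8 version, so no particular figure should be expected.

```javascript
// Sketch: time one synchronous JSON.parse call on a ~300 KB string.
// buildPayload and timeParse are hypothetical helpers, not part of any API.
const { performance } = require('perf_hooks');

function buildPayload(targetBytes) {
  const items = [];
  let size = 2; // the surrounding "[" and "]"
  for (let i = 0; size < targetBytes; i++) {
    const item = { id: i, name: 'item-' + i, tags: ['a', 'b', 'c'], active: i % 2 === 0 };
    size += JSON.stringify(item).length + 1; // +1 for the separating comma
    items.push(item);
  }
  return JSON.stringify(items);
}

function timeParse(text) {
  const start = performance.now();
  const value = JSON.parse(text); // nothing else runs on this thread meanwhile
  return { value, ms: performance.now() - start };
}

const payload = buildPayload(300 * 1024);
const result = timeParse(payload);
console.log(`parsed ${payload.length} chars in ${result.ms.toFixed(2)} ms`);
```

While `JSON.parse` is running, no timers, I/O callbacks, or incoming requests are serviced on that thread, which is the concern raised above.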

@trevnorris

trevnorris Jun 22, 2015

Contributor

It is impossible to serialize a V8 object, or convert a string back to a V8 object, off the main thread. This is a limitation of the VM and not something io.js can implement on its own.

@Qard

Qard Jun 22, 2015

Member

It could conceivably be done using an abstract representation of V8 object types outside V8 that can be converted to and from real V8 types on the main thread. I can't really comment on what the performance would be like though...it might be even worse than the regular sync calls. It'd also be rather complicated.

@Fishrock123

Fishrock123 Jun 22, 2015

Member

It would be possible to offload this from the main thread, and much simpler to do, in a worker (working proposal, see: #1159).

Though it'd only be worthwhile for large JSON objects.

@ChALkeR

ChALkeR Jun 23, 2015

Member

@asilvas How is that supposed to work if you change the object while it's being serialized?

@bnoordhuis

bnoordhuis Jun 23, 2015

Member

Something (exactly what TBD) might be possible once V8 implements typed objects. I don't expect that to happen anytime soon though.

@trevnorris

trevnorris Jun 23, 2015

Contributor

@Fishrock123 In order to migrate the object from one thread to another it would have to be serialized. There's no way to share native objects across Isolates.
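
The serialization cost described here can be sketched with the `worker_threads` module that ships with later Node.js releases (it did not exist when this thread was written). The function name `parseInWorker` is an assumption for illustration. The parse itself runs off the main thread, but `postMessage` still copies the result with the structured clone algorithm, so the object graph is rebuilt on the receiving thread.

```javascript
// Sketch: parse JSON text in a worker, then receive the result on the main thread.
// parseInWorker is a hypothetical helper name.
const { Worker } = require('worker_threads');

function parseInWorker(text) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(
      `
      const { parentPort, workerData } = require('worker_threads');
      // JSON.parse runs in the worker's own isolate...
      const parsed = JSON.parse(workerData);
      // ...but postMessage serializes the object so it can cross isolates.
      parentPort.postMessage(parsed);
      `,
      { eval: true, workerData: text }
    );
    // The message handler receives a freshly deserialized copy, built on this thread.
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

parseInWorker('{"user":"asilvas","ids":[1,2,3]}').then((value) => {
  console.log(value.ids.length);
});
```

Whether this is a net win depends on how the cost of rebuilding the cloned object compares with the cost of parsing the text directly, which would need to be measured for the payloads in question.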

@benjamingr

benjamingr Jun 23, 2015

Member

Yes, as trev said, this is impossible: V8 isolates cannot share structured memory, only buffers.

You can, however, write a streaming JSON parser that gives you the next meaningful result as it is ready (for example, when parsing an array, the next element). JSON was never meant for holding large sets of data - it's readable and simple and nice but it's not built for fast serialization or deserialization.

Also, for what it's worth - we still have a lot of advancement to make with JSON parsing - for example through type hinting or even schemas which would make it much faster.

In either case, there is nothing core can provide here over a user library because of the technical limitations of V8.

@meandmycode

meandmycode Jun 23, 2015

I would say that, because the JSON object is part of the ECMAScript spec, this would be more likely to happen if you proposed async versions of each method that return Promises.

@thelinuxlich

thelinuxlich commented Jun 23, 2015

👍

@chrisdickinson

chrisdickinson Jun 24, 2015

Contributor

From our end, it's not possible to create a JSON.parse that operates in a thread and returns asynchronously: we'd still need to do some marshaling on the objects to get them into V8. @Qard's proposal fits in this vein, as does @Fishrock123's, though for different reasons. In the former case, the text is parsed into structs, and then those structs have to be visited on the main thread and turned into objects. In the latter case, there's no "zero-copy" method for transferring non-Transferable objects (which is to say, most vanilla JS objects) between workers. @bnoordhuis' suggestion of typed objects also falls into this category: we would still have to marshal the typed objects into JS objects on the main thread.

As @trevnorris notes, V8 would have to support creating objects from another thread for this to be feasible. For the time being the recommendation is to use a streaming JSON parser (like JSONStream) if you need to be able to yield back to the event loop while parsing JSON text.

There may be a case where we'd want to include such a streaming parser at some point, in order to remove a blocking point from the existing JSON-IPC communication, but it hasn't been picked up in quite some time. If you'd like to work on that it'd be great to open a PR or an issue for that specific concern.
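
The idea of yielding back to the event loop while parsing can be sketched as follows. This is not JSONStream's API; it is a simplified stand-in that assumes the data can be split into newline-delimited JSON records, and the function name `parseLinesCooperatively` and its `batchSize` parameter are made up for illustration.

```javascript
// Sketch: parse newline-delimited JSON in small batches, yielding between batches.
// parseLinesCooperatively is a hypothetical helper, not a library function.
function parseLinesCooperatively(text, batchSize, onDone) {
  const lines = text.split('\n').filter((line) => line.trim() !== '');
  const results = [];
  let index = 0;

  function runBatch() {
    const end = Math.min(index + batchSize, lines.length);
    try {
      for (; index < end; index++) {
        results.push(JSON.parse(lines[index])); // each call is short and synchronous
      }
    } catch (err) {
      onDone(err);
      return;
    }
    if (index < lines.length) {
      setImmediate(runBatch); // let pending I/O callbacks run before the next batch
    } else {
      onDone(null, results);
    }
  }

  runBatch();
}

parseLinesCooperatively('{"n":1}\n{"n":2}\n{"n":3}\n', 2, (err, values) => {
  if (err) throw err;
  console.log(values.length);
});
```

The total parsing work is not reduced. It is split into slices so that other callbacks can run in between, at the cost of some scheduling overhead.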

@asilvas

asilvas Jun 24, 2015

Understandable response, but I want to point out that JSON Streaming is not a very good alternative. https://github.com/asilvas/json-stream-bench

@isiahmeadows

isiahmeadows Jun 27, 2015

One more thing: JSON.stringify is implemented in almost pure JS, except for the common case of JSON.stringify(value), which is implemented in pure C++.

@vkurchatkin

vkurchatkin Aug 1, 2015

Member

@thelinuxlich I'm pretty sure that what you describe is no better than using setTimeout. You are not blocking the main thread now; you are blocking it later. json() returns a promise because it reads the response body asynchronously, not because it parses JSON in a background thread.

Here is the code from blink: https://github.com/nwjs/blink/blob/be948afff52a140cdb9339c918e62fc71759904e/Source/modules/fetch/Body.cpp#L322

As you can see, parsing is implemented using the standard V8 API.
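
A minimal sketch of the point about deferring versus offloading: wrapping JSON.parse in a promise changes when the blocking happens, not whether it happens. The name `parseDeferred` is hypothetical and is not how `json()` is implemented.

```javascript
// Sketch: a promise-returning wrapper that still parses on the main thread.
// parseDeferred is a hypothetical helper name.
function parseDeferred(text) {
  return new Promise((resolve, reject) => {
    setTimeout(() => {
      try {
        // This call still runs synchronously on the main thread, just later.
        resolve(JSON.parse(text));
      } catch (err) {
        reject(err);
      }
    }, 0);
  });
}

const events = [];
const pending = parseDeferred('{"ok":true}').then((value) => {
  events.push('parsed');
  return value;
});
events.push('caller continues'); // runs before the deferred parse starts
```

The caller gets control back immediately, but when the timer fires the event loop is blocked for the full duration of the parse, exactly as it would have been with a direct call.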

@sindresorhus

sindresorhus Aug 2, 2015

ESDiscuss thread about writing a spec for async JSON APIs and proposal.

@smasher164

smasher164 Aug 21, 2015

SharedArrayBuffer is undergoing revision at the moment, for which the current specification can be accessed at this link. http://lars-t-hansen.github.io/ecmascript_sharedmem/shmem.html

It can be mutably shared among workers, without the need for postMessage(). By the way, this is how C++ code compiled to asm.js can keep threading-like functionality. This is referenced in Mozilla's post on the topic. https://blog.mozilla.org/javascript/2015/02/26/the-path-to-parallel-javascript/

@binji

binji Sep 16, 2015

Contributor

@smasher164 SharedArrayBuffer won't help here. Anything you could do with SAB to parse JSON on a Worker you could do with postMessage. The real issue has been mentioned multiple times above: you cannot share objects between isolates without serializing them.
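
To illustrate why shared memory does not remove the serialization step, here is a small sketch. A SharedArrayBuffer holds raw bytes only, so an object has to be encoded into it on one side and decoded into a new object on the other. The helper names `writeJsonToShared` and `readJsonFromShared` are assumptions for illustration, and both halves run on one thread here for brevity.

```javascript
// Sketch: an object can only cross through shared memory as encoded bytes.
// writeJsonToShared and readJsonFromShared are hypothetical helpers.
function writeJsonToShared(value) {
  const bytes = Buffer.from(JSON.stringify(value), 'utf8'); // serialize to bytes
  const shared = new SharedArrayBuffer(bytes.length);
  new Uint8Array(shared).set(bytes); // copy the bytes into shared memory
  return shared;
}

function readJsonFromShared(shared) {
  const bytes = Buffer.from(new Uint8Array(shared)); // copy out of shared memory
  return JSON.parse(bytes.toString('utf8')); // rebuild the object on the reading side
}

const shared = writeJsonToShared({ id: 7, tags: ['x', 'y'] });
const copy = readJsonFromShared(shared);
console.log(copy.tags.join(','));
```

The reader ends up calling JSON.parse (or some equivalent decoder) on its own thread, which is the step the original request wanted to avoid.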
