Async methods for JSON.parse and JSON.stringify #2031

Closed
asilvas opened this Issue Jun 22, 2015 · 17 comments

@asilvas
asilvas commented Jun 22, 2015

Reviving a request made in the old joyent/node repo: nodejs/node-v0.x-archive#7543

Given that parsing or stringifying large JSON objects can tie up the current thread, it would be nice to see async versions of these methods. I know that parseAsync and stringifyAsync are a bit alien in node.js; all the same, this functionality would be best served internally by a thread pool running off the main thread.

It would also help to expose options controlling when each level of fallback is used internally, based on the size of the string or object. For example, a 300 KB JSON string takes about 12-14 ms to parse, holding up the main thread. This is an extreme example, but it all adds up.
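
For anyone who wants to check the cost on their own machine, here is a minimal timing sketch. The `makeJson` and `timeParse` helpers and the shape of the generated items are assumptions made for illustration; the numbers will vary with hardware, Node version, and payload shape, so treat the output as a rough indication only.

```javascript
// Hypothetical helper: build a JSON array string of roughly `targetBytes` characters.
function makeJson(targetBytes) {
  const items = [];
  let size = 2; // the surrounding "[" and "]"
  for (let i = 0; size < targetBytes; i++) {
    const item = { id: i, name: 'item-' + i, tags: ['a', 'b', 'c'], active: i % 2 === 0 };
    size += JSON.stringify(item).length + 1; // +1 for the separating comma
    items.push(item);
  }
  return JSON.stringify(items);
}

// Hypothetical helper: time one synchronous JSON.parse call, in milliseconds.
function timeParse(text) {
  const start = process.hrtime.bigint();
  const value = JSON.parse(text); // blocks the calling thread until done
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { elapsedMs, length: value.length };
}

const text = makeJson(300 * 1024);
const result = timeParse(text);
console.log(`${text.length} chars, ${result.length} items, ${result.elapsedMs.toFixed(2)} ms`);
```

A single run is noisy; repeating the measurement and discarding the first few runs gives a steadier figure.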

@trevnorris
Contributor

It is impossible to serialize a V8 object, or convert a string back to a V8 object, off the main thread. This is a limitation of the VM and not something io.js can implement on its own.

@Qard
Member
Qard commented Jun 22, 2015

It could conceivably be done using an abstract representation of V8 object types outside V8 that can be converted to and from real V8 types on the main thread. I can't really comment on what the performance would be like though...it might be even worse than the regular sync calls. It'd also be rather complicated.

@Fishrock123
Member

It would be possible to offload this from the main thread, and much simpler to do, in a worker (working proposal, see: #1159).

Though it'd only be worthwhile for large JSON objects.

@ChALkeR
Member
ChALkeR commented Jun 23, 2015

@asilvas How is that supposed to work if you change the object while it's being serialized?

@bnoordhuis
Member

Something (exactly what TBD) might be possible once V8 implements typed objects. I don't expect that to happen anytime soon though.

@trevnorris
Contributor

@Fishrock123 In order to migrate the object from one thread to another it would have to be serialized. There's no way to share native objects across Isolates.

@benjamingr
Member

Yes, as trev said, this is impossible: V8 isolates cannot share structured memory, only buffers.

You can, however, write a streaming JSON parser that gives you the next meaningful result as it is ready (for example, when parsing an array, the next element). JSON was never meant for holding large sets of data - it's readable and simple and nice but it's not built for fast serialization or deserialization.

Also, for what it's worth - we still have a lot of advancement to make with JSON parsing - for example through type hinting or even schemas which would make it much faster.

In either case, there is nothing core can provide here over a user library because of the technical limitations of V8.
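
As a rough illustration of the "next element as it is ready" idea above, here is a minimal sketch that splits a top-level JSON array into its elements and parses them one at a time, yielding to the event loop every so often. The `arrayElements` and `parseArrayInSlices` names and the `sliceSize` parameter are hypothetical. The sketch assumes the input is a valid JSON array and, unlike a real streaming parser, it needs the whole text in memory; it only shows how the blocking work can be cut into smaller pieces.

```javascript
// Yield the source text of each element of a top-level JSON array.
// Assumes `text` is valid JSON; this scanner performs no validation itself.
function* arrayElements(text) {
  let depth = 0;        // current nesting depth of [] and {}
  let inString = false; // true while inside a string literal
  let start = -1;       // start index of the current element, or -1 if none
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (ch === '\\') i++;                 // skip the escaped character
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') {
      inString = true;
      if (depth === 1 && start < 0) start = i;
    } else if (ch === '[' || ch === '{') {
      if (depth === 1 && start < 0) start = i;
      depth++;
    } else if (ch === ']' || ch === '}') {
      depth--;
      if (depth === 0 && start >= 0) {      // the outer array just closed
        yield text.slice(start, i);
        start = -1;
      }
    } else if (depth === 1) {
      if (ch === ',') {                     // separator between elements
        yield text.slice(start, i);
        start = -1;
      } else if (start < 0 && !/\s/.test(ch)) {
        start = i;                          // first character of a scalar element
      }
    }
  }
}

// Parse elements one by one, returning control to the event loop every `sliceSize` items.
async function parseArrayInSlices(text, onItem, sliceSize = 1000) {
  let count = 0;
  for (const source of arrayElements(text)) {
    onItem(JSON.parse(source));
    if (++count % sliceSize === 0) {
      await new Promise((resolve) => setImmediate(resolve));
    }
  }
  return count;
}

parseArrayInSlices('[{"id":1},{"id":2},{"id":3}]', (item) => console.log(item.id), 2)
  .then((count) => console.log('parsed', count, 'items'));
```

The total CPU time is not reduced (it is likely higher than a single JSON.parse call); the only gain is that other callbacks get a chance to run between slices.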

@meandmycode

I would say that, because the JSON object is part of the ECMAScript spec, this would probably be more likely to happen if you proposed async versions of each method that returned Promises.

@thelinuxlich

👍

@chrisdickinson
Contributor

From our end, it's not possible to create a JSON.parse that operates in a thread and returns asynchronously: we'd still need to do some marshaling on the objects to get them into V8. @qard's proposal fits in this vein, as does @Fishrock123's, though for different reasons. In the former case, the text is parsed into structs, and then those structs have to be visited on the main thread and turned into objects. In the latter case, there's no "zero-copy" method for transferring non-Transferable objects between workers (which is to say, most vanilla JS objects). @bnoordhuis' suggestion of typed objects also falls into this category: we would still have to marshal the typed objects into JS objects on the main thread.

As @trevnorris notes, V8 would have to support creating objects from another thread for this to be feasible. For the time being the recommendation is to use a streaming JSON parser (like JSONStream) if you need to be able to yield back to the event loop while parsing JSON text.

There may be a case where we'd want to include such a streaming parser at some point, in order to remove a blocking point from the existing JSON-IPC communication, but it hasn't been picked up in quite some time. If you'd like to work on that it'd be great to open a PR or an issue for that specific concern.
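
The difference between copying and transferring mentioned above can be seen in a small sketch. It uses the `worker_threads` module that ships with current Node.js releases, which postdates this discussion, so it is an illustration of the constraint rather than anything available to io.js at the time. It assumes a Node version that provides `receiveMessageOnPort`, and it keeps both ports on one thread purely so the example stays synchronous.

```javascript
const { MessageChannel, receiveMessageOnPort } = require('worker_threads');

const { port1, port2 } = new MessageChannel();

// A plain object is serialized and rebuilt on the receiving side (a full copy).
const original = { user: { name: 'a' }, list: [1, 2, 3] };
port1.postMessage(original);
const copy = receiveMessageOnPort(port2).message;
console.log('same object?', copy === original);
console.log('same nested object?', copy.user === original.user);

// An ArrayBuffer can be listed as transferable: ownership moves, nothing is copied,
// and the sender's reference is left detached.
const buffer = new ArrayBuffer(16);
port1.postMessage(buffer, [buffer]);
const moved = receiveMessageOnPort(port2).message;
console.log('sender byteLength:', buffer.byteLength, 'receiver byteLength:', moved.byteLength);

port1.close();
```

So a worker could parse JSON text on its own thread, but the resulting object graph would still have to be copied back to the main thread, which is the cost described in the comment above.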

@asilvas
asilvas commented Jun 24, 2015

Understandable response, but I want to point out that JSON Streaming is not a very good alternative. https://github.com/asilvas/json-stream-bench

@isiahmeadows

One more thing: JSON.stringify is implemented in almost pure JS, except for the common case of JSON.stringify(value), which is implemented in pure C++.

@vkurchatkin
Member

@thelinuxlich I'm pretty sure that what you describe is no better than using setTimeout. You are not blocking the main thread now; you are blocking it later. json() returns a promise because it reads the response body asynchronously, not because it parses JSON in a background thread.

Here is the code from blink: https://github.com/nwjs/blink/blob/be948afff52a140cdb9339c918e62fc71759904e/Source/modules/fetch/Body.cpp#L322

As you can see, parsing is implemented using the standard V8 API.
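
To make the "blocking it later" point concrete, here is a minimal sketch of a promise-returning wrapper. `parseLater` is a hypothetical helper written for this illustration, not an existing API.

```javascript
// Hypothetical helper: defer JSON.parse with setTimeout and wrap it in a promise.
function parseLater(text) {
  return new Promise((resolve, reject) => {
    setTimeout(() => {
      try {
        resolve(JSON.parse(text)); // still a synchronous parse on the main thread
      } catch (err) {
        reject(err);
      }
    }, 0);
  });
}

const big = JSON.stringify(Array.from({ length: 100000 }, (_, i) => ({ id: i })));

let finished = false;
const pending = parseLater(big).then((value) => {
  finished = true;
  return value.length;
});

// The call returns immediately, but the parse has only been postponed.
console.log('finished right after the call?', finished);
pending.then((length) => console.log('parsed', length, 'items, later, on the same thread'));
```

When the timer fires, the whole parse runs in one uninterrupted step, so any other callback queued at that moment waits just as long as it would behind a direct JSON.parse call.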

@sindresorhus

ESDiscuss thread about writing a spec for async JSON APIs and proposal.

@mohsen1 mohsen1 referenced this issue in mohsen1/async-json Aug 2, 2015
Open

Thread safety when JSON.stringifyAsync #6

@smasher164

SharedArrayBuffer is undergoing revision at the moment; the current specification is available at http://lars-t-hansen.github.io/ecmascript_sharedmem/shmem.html

It can be mutably shared among workers, without the need for postMessage(). By the way, this is how code compiled from C++ to asm.js maintains threading-like functionality. This is referenced in Mozilla's post on the topic: https://blog.mozilla.org/javascript/2015/02/26/the-path-to-parallel-javascript/

@binji
binji commented Sep 16, 2015

@smasher164 SharedArrayBuffer won't help here. Anything you could do with SAB to parse JSON on a Worker you could do with postMessage. The real issue is mentioned multiple times above: you cannot share objects between isolates without serializing them.
