Async methods for JSON.parse and JSON.stringify #2031

Closed
asilvas opened this issue Jun 22, 2015 · 17 comments

@asilvas asilvas commented Jun 22, 2015

Reviving a request made in the old joyent/node repo: nodejs/node-v0.x-archive#7543

Given that parse/stringify of large JSON objects can tie up the current thread, it would be nice to see async versions of these methods. I know that parseAsync and stringifyAsync are a bit alien in node.js, but this is functionality that would be best served internally by a thread pool, off the main thread.

It would also help to expose options controlling when to fall back to each approach internally, based on the size of the string/object. For example, a 300 KB JSON string takes about 12-14 ms to parse, holding up the main thread. This is an extreme example, but it all adds up.
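
For anyone who wants to check figures like these on their own machine, here is a rough timing sketch. The payload shape and the helper names `buildPayload` and `timeMs` are made up for illustration, and the numbers will vary with hardware, Node version, and the structure of the data, so treat the output as indicative only.

```javascript
// Build a JSON string of roughly the requested size (hypothetical test data).
function buildPayload(targetBytes) {
  const items = [];
  let size = 2; // account for the surrounding "[" and "]"
  for (let i = 0; size < targetBytes; i++) {
    const item = { id: i, name: 'item-' + i, tags: ['a', 'b', 'c'], active: i % 2 === 0 };
    size += JSON.stringify(item).length + 1; // +1 for the separating comma
    items.push(item);
  }
  return JSON.stringify(items);
}

// Time a synchronous call in milliseconds using the high-resolution clock.
function timeMs(fn) {
  const start = process.hrtime.bigint();
  const result = fn();
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { result, elapsedMs };
}

const text = buildPayload(300 * 1024);
const parsed = timeMs(() => JSON.parse(text));
const stringified = timeMs(() => JSON.stringify(parsed.result));

console.log('payload: ' + text.length + ' chars');
console.log('parse: ' + parsed.elapsedMs.toFixed(2) + ' ms');
console.log('stringify: ' + stringified.elapsedMs.toFixed(2) + ' ms');
```

Both calls are synchronous, so whatever time is reported is time during which no other callback on the main thread can run.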

Contributor

@trevnorris trevnorris commented Jun 22, 2015

It is impossible to serialize a V8 object, or convert a string back to a V8 object, off the main thread. This is a limitation of the VM and not something io.js can implement on its own.

Member

@Qard Qard commented Jun 22, 2015

It could conceivably be done using an abstract representation of V8 object types outside V8 that can be converted to and from real V8 types on the main thread. I can't really comment on what the performance would be like though...it might be even worse than the regular sync calls. It'd also be rather complicated.

Member

@Fishrock123 Fishrock123 commented Jun 22, 2015

It would be possible to offload this from the main thread, and much simpler to do, in a worker (working proposal, see: #1159).

Though it'd only be worthwhile for large JSON objects.

Member

@ChALkeR ChALkeR commented Jun 23, 2015

@asilvas How is that supposed to work if you change the object while it's being serialized?

Member

@bnoordhuis bnoordhuis commented Jun 23, 2015

Something (exactly what TBD) might be possible once V8 implements typed objects. I don't expect that to happen anytime soon though.

Contributor

@trevnorris trevnorris commented Jun 23, 2015

@Fishrock123 In order to migrate the object from one thread to another it would have to be serialized. There's no way to share native objects across Isolates.
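
A small sketch of what "would have to be serialized" means in practice. It assumes a Node version that ships the `v8` module's `serialize`/`deserialize` functions (these postdate this thread) and runs in a single thread only, to show that an object crossing a serialization boundary arrives as an independent copy rather than as a shared object.

```javascript
const v8 = require('v8');

const original = { user: { name: 'ada', roles: ['admin'] } };

// v8.serialize produces a Buffer; v8.deserialize rebuilds a fresh object graph.
const wire = v8.serialize(original);
const copy = v8.deserialize(wire);

// The copy is structurally equal but shares no identity with the original,
// so a later mutation on one side is invisible to the other.
copy.user.roles.push('guest');

console.log(Buffer.isBuffer(wire));
console.log(copy === original);
console.log(original.user.roles.length);
console.log(copy.user.roles.length);
```

The work of building the copy happens on whichever thread receives the bytes, which is the same cost the original proposal was trying to move elsewhere.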

Member

@benjamingr benjamingr commented Jun 23, 2015

Yes, as trev said, this is impossible: V8 isolates cannot share structured memory, only buffers.

You can, however, write a streaming JSON parser that gives you the next meaningful result as it is ready (for example, when parsing an array, the next element). JSON was never meant for holding large sets of data - it's readable and simple and nice but it's not built for fast serialization or deserialization.

Also, for what it's worth - we still have a lot of advancement to make with JSON parsing - for example through type hinting or even schemas which would make it much faster.

In either case, there is nothing core can provide here over a user library because of the technical limitations of V8.
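
To make the streaming idea concrete, here is a minimal sketch of handing over each element of a top-level array as soon as its text is complete. `ArrayElementSplitter` and `feedInTurns` are hypothetical names, not an existing library. The splitter assumes well-formed input, does no validation, and still calls JSON.parse on each element, so it only bounds how long any single blocking step takes.

```javascript
// Incrementally splits the text of a top-level JSON array into elements.
// Only element boundaries are tracked here; each element is still handed
// to JSON.parse once its text is complete.
class ArrayElementSplitter {
  constructor(onElement) {
    this.onElement = onElement; // called with each parsed element
    this.buffer = '';           // text of the element being collected
    this.depth = 0;             // nesting depth of [] and {}
    this.inString = false;      // currently inside a string literal?
    this.escaped = false;       // previous char was a backslash in a string?
  }

  write(chunk) {
    for (const ch of chunk) {
      if (this.inString) {
        this.buffer += ch;
        if (this.escaped) this.escaped = false;
        else if (ch === '\\') this.escaped = true;
        else if (ch === '"') this.inString = false;
        continue;
      }
      if (ch === '"') {
        this.inString = true;
        this.buffer += ch;
      } else if (ch === '[' || ch === '{') {
        // The outermost "[" opens the array itself and is not buffered.
        if (this.depth > 0) this.buffer += ch;
        this.depth++;
      } else if (ch === ']' || ch === '}') {
        this.depth--;
        if (this.depth === 0) this.flush(); // end of the outer array
        else this.buffer += ch;
      } else if (ch === ',' && this.depth === 1) {
        this.flush(); // separator between top-level elements
      } else {
        this.buffer += ch;
      }
    }
  }

  flush() {
    const text = this.buffer.trim();
    this.buffer = '';
    if (text !== '') this.onElement(JSON.parse(text));
  }
}

// Feed one chunk per event-loop turn so other callbacks can run in between.
function feedInTurns(splitter, chunks, done) {
  let i = 0;
  (function next() {
    if (i === chunks.length) return done();
    splitter.write(chunks[i++]);
    setImmediate(next);
  })();
}

const seen = [];
const splitter = new ArrayElementSplitter((el) => seen.push(el));
const chunks = ['[1, "a,', ']", {"b":[2', ',3]}, [4]]'];
feedInTurns(splitter, chunks, () => console.log(seen));
```

Chunk boundaries may fall anywhere, including inside a string or a nested value, because the splitter keeps its state between `write` calls.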

@meandmycode meandmycode commented Jun 23, 2015

I would say that, since the JSON object is part of the ECMAScript spec, this would be more likely to happen if you proposed async versions of each method that return Promises.

@thelinuxlich thelinuxlich commented Jun 23, 2015

👍

Contributor

@chrisdickinson chrisdickinson commented Jun 24, 2015

From our end, it's not possible to create a JSON.parse that operates in a thread and returns asynchronously; we'd still need to do some marshaling on the objects to get them into V8. @Qard's proposal fits in this vein, as does @Fishrock123's, though for different reasons. In the former case, the text is parsed into structs, and then those structs have to be visited on the main thread and turned into objects. In the latter case, there's no "zero-copy" method for transferring non-Transferable objects (which is to say, most vanilla JS objects) between workers. @bnoordhuis' suggestion of typed objects also falls into this category: we would still have to marshal the typed objects into JS objects on the main thread.

As @trevnorris notes, V8 would have to support creating objects from another thread for this to be feasible. For the time being the recommendation is to use a streaming JSON parser (like JSONStream) if you need to be able to yield back to the event loop while parsing JSON text.

There may be a case where we'd want to include such a streaming parser at some point, in order to remove a blocking point from the existing JSON-IPC communication, but it hasn't been picked up in quite some time. If you'd like to work on that it'd be great to open a PR or an issue for that specific concern.

Author

@asilvas asilvas commented Jun 24, 2015

Understandable response, but I want to point out that JSON Streaming is not a very good alternative. https://github.com/asilvas/json-stream-bench

@isiahmeadows isiahmeadows commented Jun 27, 2015

One more thing: JSON.stringify is implemented in almost pure JS, except for the common case of JSON.stringify(value), which is implemented in pure C++.

Contributor

@vkurchatkin vkurchatkin commented Aug 1, 2015

@thelinuxlich I'm pretty sure that what you describe is not better than using setTimeout. You are not blocking the main thread now; you are blocking it later. json() returns a promise because it reads the response body asynchronously, not because it parses JSON in a background thread.

Here is the code from blink: https://github.com/nwjs/blink/blob/be948afff52a140cdb9339c918e62fc71759904e/Source/modules/fetch/Body.cpp#L322

As you can see, parsing is implemented using the standard V8 API.
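
The "blocking it later" point can be sketched with a promise-returning wrapper. `parseLater` is a hypothetical helper, not an existing API: it gives the caller a Promise, but the parse still runs synchronously on the main thread in a later event-loop turn.

```javascript
// Hypothetical helper: defers JSON.parse to a later event-loop turn.
// The parse itself still runs synchronously on the main thread.
function parseLater(text) {
  return new Promise((resolve, reject) => {
    setImmediate(() => {
      try {
        resolve(JSON.parse(text));
      } catch (err) {
        reject(err);
      }
    });
  });
}

const events = [];
const text = JSON.stringify({ items: Array.from({ length: 1000 }, (_, i) => i) });

const pending = parseLater(text).then((value) => {
  events.push('parsed ' + value.items.length + ' items');
  return value;
});
events.push('caller continues');

pending.then(() => console.log(events));
```

The caller gets control back immediately, but when the deferred callback finally runs, nothing else on the main thread can run until JSON.parse returns.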

@sindresorhus sindresorhus commented Aug 2, 2015

ESDiscuss thread about writing a spec for async JSON APIs and proposal.

@smasher164 smasher164 commented Aug 21, 2015

SharedArrayBuffer is undergoing revision at the moment; the current specification is available here: http://lars-t-hansen.github.io/ecmascript_sharedmem/shmem.html

It can be mutably shared among workers, without the need for postMessage(). By the way, this is how ASM.js transpiles C++ to JavaScript and maintains threading-like functionality. This is referenced in Mozilla's post on the topic: https://blog.mozilla.org/javascript/2015/02/26/the-path-to-parallel-javascript/

Contributor

@binji binji commented Sep 16, 2015

@smasher164 SharedArrayBuffer won't help here. Anything you could do with SAB to parse JSON on a Worker you could do with postMessage. The real issue is mentioned multiple times above: you cannot share objects between isolates without serializing them.
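
A single-threaded sketch of why shared memory does not remove the parse step. It assumes a runtime where `SharedArrayBuffer`, `TextEncoder`, and `TextDecoder` are available as globals. The "consumer" here is the same thread, purely for illustration; a real worker would read the same bytes and face the same parsing work.

```javascript
const payload = JSON.stringify({ id: 7, tags: ['x', 'y'] });
const bytes = new TextEncoder().encode(payload);

// Shared memory holds raw bytes only; it cannot hold a JS object graph.
const shared = new SharedArrayBuffer(bytes.length);
const view = new Uint8Array(shared);
view.set(bytes);

// A consumer (in practice, another thread) sees the same bytes, but must
// still decode and parse them itself to obtain objects in its own heap.
// slice() copies the bytes into an unshared buffer before decoding.
const received = JSON.parse(new TextDecoder().decode(view.slice()));
console.log(received);
```

Sharing the bytes avoids copying the text, but whichever side wants objects still pays for JSON.parse on its own thread.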
