
Simpler implementation is possible? #34

Open
ORESoftware opened this issue Jul 30, 2017 · 8 comments

Comments

@ORESoftware

ORESoftware commented Jul 30, 2017

I am probably missing something or failing on edge cases, but I was able to implement a simple JSON parser like so:

https://github.com/ORESoftware/tap-json-parser/blob/master/index.ts

The reason I started working on my own version was that JSONStream (which uses jsonparse) was failing when parsing stdout that contained non-JSON data in between the JSON. So if JSON and non-JSON lines are mixed together, it appears to fail.

E.g.:

console.log(JSON.stringify({foo: "bar"}));
console.log('yolo rolo cholo');
console.log(JSON.stringify({zim: "zam"}));

The above should make JSONStream fail (and perhaps jsonparse too?).

So I made an attempt based on a super simple try/catch around JSON.parse for each line of data, and it works.
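The gist of it is something like this (a rough TypeScript sketch of the technique, not the exact code from the repo above):

function parseJsonLines(text: string): any[] {
  const results: any[] = [];
  for (const line of text.split('\n')) {
    try {
      results.push(JSON.parse(line)); // a JSON line: keep the parsed value
    } catch {
      // a non-JSON line (e.g. plain console.log output): skip it
    }
  }
  return results;
}

Note this naive version assumes each chunk of data contains only whole lines.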

Maybe you know of scenarios or edge cases where my implementation would fail. I am honestly hoping you can tell me where it might be insufficient, so I can fix it.

thanks!

ORESoftware changed the title from "Simpler implementation" to "Simpler implementation is possible?" on Aug 1, 2017
@ORESoftware
Author

bump

@creationix
Owner

I'm afraid I don't have time at the moment to help with this. If you have a newline-delimited stream and a way to break it into discrete events for each line, then plain JSON.parse with a try..catch on each line is a great technique. Just make sure your stream handles lines getting split or merged at the TCP (or other transport) layer, and that something reassembles complete lines before passing them to the JSON parser.
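For example (a rough TypeScript sketch using Node's stream.Transform; the class name is made up):

import { Transform } from 'stream';

// Buffers the trailing partial line across chunks so that only complete
// lines are ever handed to JSON.parse.
class LineReassembler extends Transform {
  private buf = '';

  constructor() {
    super({ readableObjectMode: true }); // bytes in, parsed values out
  }

  _transform(chunk: Buffer, _enc: BufferEncoding, cb: (err?: Error | null) => void) {
    this.buf += chunk.toString('utf8');
    const lines = this.buf.split('\n');
    this.buf = lines.pop() ?? ''; // keep the partial last line for the next chunk
    for (const line of lines) {
      try {
        this.push(JSON.parse(line)); // a complete JSON line
      } catch {
        // a complete non-JSON line: ignore it
      }
    }
    cb();
  }

  _flush(cb: (err?: Error | null) => void) {
    try {
      if (this.buf) this.push(JSON.parse(this.buf)); // final line may lack a newline
    } catch {
      // trailing non-JSON data: ignore it
    }
    cb();
  }
}

Pipe your stdout stream through that and the downstream consumer only ever sees whole parsed values, no matter how the transport chunked the bytes.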

@ORESoftware
Author

@creationix yeah, I haven't extensively tested my transform stream to see if it handles edge cases. I was looking for some explanation of why the jsonparse implementation is the way it is. It seems overcomplicated, but maybe it's that way for a good reason.

@creationix
Owner

My parser is for streaming large JSON documents. That is a much harder problem than a stream of small JSON documents with newlines between them. For most things, a newline-delimited stream of JSON is probably a better format.

@the1mills

I don't intend to demand too much of your time, but I fail to see how large JSON documents are any different from small ones if they are both simply newline-delimited.

Maybe the large JSON documents you are talking about are themselves broken up by newlines internally; then I guess I would understand.

For example, it makes some sense to me to send a huge array of JSON objects with each object on its own line.

@the1mills

If I can prove out my JSON streaming parser with more tests, and maybe benchmark it against yours to show it's faster for small JSON documents, would you be willing to cite my project in your README?

Maybe I can steal some of the tests in your project.

@creationix
Owner

creationix commented Aug 3, 2017

I mean:

[
  {"name": "First"},
  {"name": "Second"}
]

vs

{"name": "First"}
{"name": "Second"}

The first is a single large JSON document; the second is several small documents separated by newlines.
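The second form can be consumed with the per-line JSON.parse technique above. The first needs an incremental parser that emits values as they complete, which is what jsonparse does. Roughly (a sketch assuming the onValue/write API shown in the jsonparse README):

const Parser = require('jsonparse');

const p = new Parser();
p.onValue = function (value: any) {
  // Fires each time any value finishes parsing, without waiting for the
  // whole document. A stack-depth check (the way JSONStream does it) can
  // filter for just the top-level array elements.
  console.log('value:', value);
};
process.stdin.on('data', (chunk: Buffer) => p.write(chunk));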

@the1mills

Yeah ok that makes sense
