Parse single long string value #996

Closed
ghost opened this issue Jun 22, 2017 · 1 comment

ghost commented Jun 22, 2017

I recently ran into a rather strange problem:

Our system's original design uploads attachments embedded in JSON, much like npm publish does. For example:

{ "foo": "bar", "_attachments": "base 64 encoded attachment" }

The problem is that the _attachments field has now grown to several gigabytes, which puts the server under heavy memory pressure.

As far as I can tell, RapidJSON parses the JSON in this form:

Stream -> SAX Parser -> callback (Key: String)

But for this stupid situation I would prefer this one:

Stream -> SAX Parser -> callback (Key: Stream)

So here is the idea: maybe RapidJSON could provide a new option so that, for a very large string (say, over 1 MB), it stops reading the value into memory and hands the callback a stream instead.

I know it is a stupid idea to use JSON to store attachments, and base64 inflates the data by about a third (4 output characters for every 3 input bytes). But I can't change the protocol without a rewrite.
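
For reference, the size overhead of standard padded base64 can be worked out directly: every 3 raw bytes (or final partial group) become 4 encoded characters. The helper below is a minimal sketch with a hypothetical name, and it ignores any line breaks an encoder might insert.

```cpp
#include <cstdint>

// Hypothetical helper: length in characters of the padded base64 encoding
// of `rawBytes` bytes. Each group of up to 3 input bytes yields 4 characters.
constexpr std::uint64_t Base64EncodedSize(std::uint64_t rawBytes) {
    return (rawBytes + 2) / 3 * 4;
}
```

Under this formula a 3 GiB attachment occupies 4 GiB inside the JSON string, before any JSON escaping is counted.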

ghost changed the title from "Parse single large field" to "Parse single long string value" on Jun 22, 2017

StilesCrisis (Contributor) commented

Take a look at the example "simplepullreader.cpp".

You can parse the JSON right up until you get the key "_attachments" and then stop parsing entirely. At that point your JSON stream will be pointing right to the start of the attachment block. If you mess with the stream you won't be able to continue parsing anymore (unless you rewind it manually), but it sounds like you are OK with this.
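
The step after the parser has stopped might look roughly like the sketch below. It uses only the standard library, not RapidJSON, and `StreamRawStringValue` is a hypothetical helper. It assumes the input stream is positioned just after the "_attachments" key token, so the ':' and the opening quote are still unread. If the reader has already consumed them, or if a buffering stream wrapper has read ahead of the parser, the position would need adjusting first. The string body is passed to a callback in fixed-size chunks, still in its raw, escaped form.

```cpp
#include <cstddef>
#include <functional>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper, not part of RapidJSON.
// Assumes `in` is positioned just after the key token, before the ':'.
// Streams the raw (still escaped) string body to `sink` in chunks of at
// most `chunkSize` bytes. Returns false on malformed or truncated input.
bool StreamRawStringValue(std::istream& in, std::size_t chunkSize,
                          const std::function<void(const std::string&)>& sink) {
    if (chunkSize == 0) return false;
    in >> std::ws;
    if (in.get() != ':') return false;
    in >> std::ws;
    if (in.get() != '"') return false;

    std::string chunk;
    chunk.reserve(chunkSize);
    bool escaped = false;  // true when the previous character was an unescaped '\'
    for (int c = in.get(); c != std::char_traits<char>::eof(); c = in.get()) {
        if (!escaped && c == '"') {
            if (!chunk.empty()) sink(chunk);  // flush the final partial chunk
            return true;                      // closing quote reached
        }
        escaped = !escaped && c == '\\';
        chunk.push_back(static_cast<char>(c));
        if (chunk.size() == chunkSize) {
            sink(chunk);
            chunk.clear();
        }
    }
    return false;  // stream ended before the closing quote
}
```

Memory use is bounded by `chunkSize` rather than by the size of the attachment. The sink could, for example, feed an incremental base64 decoder or write straight to disk.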

Of course you will lose things like Unicode conversion or backslash de-escaping.
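
For a base64 payload the lost de-escaping may matter less than it sounds, since the only character in the standard base64 alphabet that a JSON encoder is likely to escape is '/', written as `\/`. A minimal sketch of undoing the simple two-character escapes on chunked input follows. `SimpleUnescaper` is a hypothetical class, not a RapidJSON API, and it deliberately leaves `\uXXXX` sequences undecoded.

```cpp
#include <string>

// Hypothetical helper, not part of RapidJSON.
// Undoes the simple two-character JSON escapes in a raw string body that
// arrives in chunks. An escape may be split across two chunks, so the
// pending backslash is remembered between calls.
// \uXXXX sequences are NOT decoded; they are passed through unchanged.
class SimpleUnescaper {
public:
    std::string Feed(const std::string& raw) {
        std::string out;
        out.reserve(raw.size());
        for (char c : raw) {
            if (!pendingBackslash_) {
                if (c == '\\') pendingBackslash_ = true;
                else out.push_back(c);
                continue;
            }
            pendingBackslash_ = false;
            switch (c) {
                case 'b': out.push_back('\b'); break;
                case 'f': out.push_back('\f'); break;
                case 'n': out.push_back('\n'); break;
                case 'r': out.push_back('\r'); break;
                case 't': out.push_back('\t'); break;
                case 'u': out += "\\u"; break;      // left undecoded
                default:  out.push_back(c); break;  // covers \" \\ and \/
            }
        }
        return out;
    }

private:
    bool pendingBackslash_ = false;
};
```

Each raw chunk read from the stream would be passed through `Feed` before being handed to the base64 decoder.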

pagict closed this as completed on Oct 15, 2019