Streaming multi-byte UTF8 characters not being parsed correctly #13

Open
jlank opened this issue Mar 13, 2013 · 1 comment

Comments

@jlank
Contributor

jlank commented Mar 13, 2013

When streaming data that contains multi-byte UTF-8 characters into jsonparse, if a chunk boundary splits a multi-byte character, jsonparse does not reassemble the character across data events. I wrote a quick demo repo to show this behavior and started writing a blog post to explain the issue in more detail (not finished). In the meantime, check out the demo repo; it has both the current implementation and the proposed patch working. For more context on this issue, see this thread with @mikeal discussing where the "proper" place to reconcile / parse multi-byte UTF-8 characters is. I already have a proposed fix for jsonparse written up with test cases, but wanted to open an issue first and get your feedback before making a PR.

Thanks!
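
For anyone who wants to see the failure mode without the demo repo, here is a minimal sketch of the underlying problem. It does not use jsonparse and is not necessarily how the proposed patch works; it only assumes Node's built-in `string_decoder` module. The helper names `decodeNaive` and `decodeBuffered` and the sample payload are made up for illustration.

```javascript
const { StringDecoder } = require('string_decoder');

// Illustrative payload: "é" is two bytes in UTF-8 (0xC3 0xA9).
const whole = Buffer.from('{"name":"café"}', 'utf8');

// Split the buffer so the chunk boundary falls between 0xC3 and 0xA9,
// as could happen with two consecutive 'data' events.
const splitAt = whole.indexOf(0xc3) + 1;
const chunks = [whole.subarray(0, splitAt), whole.subarray(splitAt)];

// Naive approach: decode each chunk on its own. Each half of the split
// character is invalid by itself, so it is replaced with U+FFFD.
function decodeNaive(parts) {
  return parts.map((part) => part.toString('utf8')).join('');
}

// Buffered approach: StringDecoder holds back an incomplete trailing
// sequence and prepends it to the next chunk before decoding.
function decodeBuffered(parts) {
  const decoder = new StringDecoder('utf8');
  return parts.map((part) => decoder.write(part)).join('') + decoder.end();
}

console.log(decodeNaive(chunks) === whole.toString('utf8'));
console.log(decodeBuffered(chunks) === whole.toString('utf8'));
```

The naive version corrupts the split character, while the buffered version reproduces the original string. Whether this reconciliation belongs in the parser or in the stream feeding it is the open question from the thread mentioned above.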

@creationix
Owner

I don't see any problem with the patch. Go ahead and send a PR.
