UTF-8 BOM produces 'parse error: Invalid numeric literal' #45

Closed
jklaiho opened this issue Nov 14, 2012 · 2 comments

@jklaiho

jklaiho commented Nov 14, 2012

I attempted to pipe some JSON from cURL to jq but got the parse error quoted in the title. After some initial confusion, and after validating the JSON with JSONLint, I noticed that the data began with the UTF-8 encoding of the BOM character (U+FEFF): the three bytes 0xEF 0xBB 0xBF. Once I stripped those bytes out before piping to jq, everything worked.
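
The workaround described above (dropping the three BOM bytes before the data reaches the JSON parser) can be sketched in Python. This is only an illustration: the helper name `strip_utf8_bom` and the sample payload are made up for the example and are not part of jq or of the original report.

```python
import codecs
import json


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (0xEF 0xBB 0xBF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Hypothetical payload standing in for the third-party feed: BOM followed by JSON.
raw = codecs.BOM_UTF8 + b'{"name": "jq"}'

clean = strip_utf8_bom(raw)
parsed = json.loads(clean.decode("utf-8"))
print(parsed)
```

Only a BOM at the very start is removed; input without one passes through unchanged, so the same filter is safe to apply to every response.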

I haven't tested whether the same happens with, say, UTF-16 (little- or big-endian) encoded data, but either way, jq needs to handle the BOM correctly.

@stedolan
Contributor

jq doesn't handle UTF-16 documents. The BOM makes no sense at all in UTF-8 documents; how was that document generated?

On the other hand, the BOM can't be misinterpreted as anything else, so it wouldn't break anything to silently ignore one at the start of the document.
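
The "silently ignore one at the start of the document" idea can be sketched as follows. This is not jq's implementation (jq is written in C, and the thread does not show a fix); it is a minimal Python illustration, and the function name `load_json_ignoring_bom` is hypothetical.

```python
import codecs
import io
import json


def load_json_ignoring_bom(stream):
    """Parse JSON from a binary stream, skipping a UTF-8 BOM only at the very start."""
    head = stream.read(len(codecs.BOM_UTF8))
    if head == codecs.BOM_UTF8:
        # Leading BOM: drop it and parse the rest.
        body = stream.read()
    else:
        # Not a BOM: these bytes belong to the document, so keep them.
        body = head + stream.read()
    return json.loads(body.decode("utf-8"))


with_bom = load_json_ignoring_bom(io.BytesIO(codecs.BOM_UTF8 + b'{"a": 1}'))
without_bom = load_json_ignoring_bom(io.BytesIO(b"[1, 2]"))
print(with_bom, without_bom)
```

Because the check happens once, before parsing begins, a U+FEFF anywhere else in the input is still passed to the parser and rejected as usual.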

@jklaiho
Author

jklaiho commented Nov 16, 2012

From Wikipedia: "The Unicode Standard permits the BOM in UTF-8, but does not require or recommend for or against its use. Byte order has no meaning in UTF-8, so its only use in UTF-8 is to signal at the start that the text stream is encoded in UTF-8."
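
A small Python check makes the quoted point concrete: U+FEFF encodes to a different byte order in each UTF-16 variant, while UTF-8 has a single fixed encoding, so there the bytes can only act as a signature. The variable names are illustrative only.

```python
BOM = "\ufeff"

# In UTF-16 the two bytes swap depending on endianness, which is what the BOM signals.
utf16_le = BOM.encode("utf-16-le")  # FF FE
utf16_be = BOM.encode("utf-16-be")  # FE FF

# In UTF-8 there is only one possible byte sequence.
utf8 = BOM.encode("utf-8")          # EF BB BF

for label, encoded in [("utf-16-le", utf16_le), ("utf-16-be", utf16_be), ("utf-8", utf8)]:
    print(label, " ".join(f"{b:02X}" for b in encoded))

# Python's "utf-8-sig" codec treats the leading BOM as a signature and drops it;
# the plain "utf-8" codec keeps it as a U+FEFF character in the decoded text.
with_signature = (utf8 + b"[]").decode("utf-8-sig")
without_handling = (utf8 + b"[]").decode("utf-8")
```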

The JSON feed is generated by a third party and I have no control over it. But since a BOM is valid (if pointless) in UTF-8, jq should handle it, and I suppose the silent ignoring approach is just fine.

dtolnay added the bug label Jul 27, 2015