
v1.4.0: UTF-16 and BOM handling

@neilpa neilpa released this 06 Oct 23:30
· 5 commits to master since this release
b2b9a85

Adds support for UTF-16 (LE and BE) encoded JSON files.

In the absence of a BOM, the encoding is detected with a simple heuristic: if the first byte is NUL, UTF-16BE is assumed; if the second byte is NUL, UTF-16LE is assumed; otherwise the file is treated as UTF-8.
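The heuristic works because a JSON text starts with an ASCII character, so a UTF-16 encoding places a NUL in one of the first two bytes. As a rough illustration only (not the project's actual code), the check could be sketched in Python like this; the function name `guess_json_encoding` is hypothetical:

```python
def guess_json_encoding(data: bytes) -> str:
    """Guess the encoding of BOM-less JSON bytes from where a NUL byte appears.

    Hypothetical helper for illustration; it mirrors the rule described in
    these release notes, not the tool's real implementation.
    """
    # UTF-16BE stores an ASCII character as 0x00 followed by the character.
    if len(data) >= 1 and data[0] == 0:
        return "utf-16-be"
    # UTF-16LE stores an ASCII character as the character followed by 0x00.
    if len(data) >= 2 and data[1] == 0:
        return "utf-16-le"
    # No NUL in the first two bytes: assume UTF-8.
    return "utf-8"


sample = '{"a": 1}'
for codec in ("utf-8", "utf-16-le", "utf-16-be"):
    raw = sample.encode(codec)
    print(codec, "->", guess_json_encoding(raw))
```

The returned strings are Python codec names, so the result can be passed straight to `bytes.decode`.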

If a JSON file starts with a BOM, this is treated as an error by default. The new -b flag overrides that behavior and uses the BOM to determine the encoding instead. This follows the language of section 8.1 of RFC 8259 (the current JSON spec).
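The sketch below shows one way this BOM policy could look in Python. It is an assumption-laden illustration, not the tool's implementation: the function `decode_json_bytes` and its `allow_bom` parameter (standing in for the -b flag) are hypothetical, and accepting a UTF-8 BOM alongside the UTF-16 ones is an assumption not stated in these notes.

```python
import codecs

# Byte order marks considered by this sketch, with the matching Python codec.
# Handling a UTF-8 BOM here is an assumption, not something the notes confirm.
BOMS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
)


def decode_json_bytes(data: bytes, allow_bom: bool = False) -> str:
    """Decode JSON bytes to text; `allow_bom` plays the role of the -b flag."""
    for bom, encoding in BOMS:
        if data.startswith(bom):
            if not allow_bom:
                # Default behavior: a leading BOM is rejected for JSON.
                raise ValueError("JSON input starts with a BOM")
            # With the flag set, the BOM selects the encoding and is stripped.
            return data[len(bom):].decode(encoding)
    # No BOM: fall back to the NUL-byte heuristic for BOM-less input.
    if len(data) >= 1 and data[0] == 0:
        return data.decode("utf-16-be")
    if len(data) >= 2 and data[1] == 0:
        return data.decode("utf-16-le")
    return data.decode("utf-8")


with_bom = codecs.BOM_UTF16_LE + '{"a": 1}'.encode("utf-16-le")
print(decode_json_bytes(with_bom, allow_bom=True))
```

Calling `decode_json_bytes(with_bom)` without `allow_bom=True` raises `ValueError`, matching the default of treating a BOM as an error.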

Note that UTF-16 already worked for YAML files if a BOM existed (as required by the spec).

See also #13 for more discussion.