Commit
Add ops for strict decoding of windows-1251/1252
This adds new versions of `encode`, `decode`, and `encoderep` that accept a new config flag. They differ from the previous encode/decode ops in that they use strict rather than permissive methods by default: they throw on encountering a codepoint that is not mapped in the standard, and can optionally decode permissively. The new ops all end in `config`. All but one are `config` variants of existing ops; `decoderep` never existed before, so `decoderepconfig` has been added to allow decoding with replacements.

By default, with windows-1252 and windows-1251, these new ops:

* Throw when they encounter a character which does not map to the other encoding.
* When used in replacement mode (`*repconfig`), apply the replacement to codepoints that may fit into the target encoding but are invalid (e.g. 129 in windows-1252).

This adds new ops:

* `decoderepconfig`: strict; replaces decoded characters that don't have official mappings with a supplied replacement string. Currently it is limited to substituting the first grapheme of the supplied replacement string (this should be sufficient in most cases).
* `encodeconfig`: like `encode`, but strict by default and takes the new config flag.
* `decodeconfig`: like `decode`, but strict by default and takes the new config flag.
* `encoderepconfig`: like `encoderep`, but strict by default and takes the new config flag.
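The difference between strict, permissive, and replacement decoding can be sketched in Python, whose built-in `cp1252` codec also rejects the five bytes that windows-1252 leaves unmapped (0x81, 0x8D, 0x8F, 0x90, 0x9D). This is only an illustration of the behavior described above, not the implementation in this commit: the function names `decode_1252` and `decode_1252_rep` and the `permissive` parameter are hypothetical, the permissive fallback (mapping the byte to the codepoint of the same value) is an assumption, and the first code point stands in for the first grapheme.

```python
# Hypothetical sketch; these names are not the ops added by this commit.

def decode_1252(data: bytes, permissive: bool = False) -> str:
    """Decode windows-1252, strict by default."""
    out = []
    for b in data:
        try:
            out.append(bytes([b]).decode("cp1252"))
        except UnicodeDecodeError:
            if not permissive:
                # Strict mode: an unmapped byte is an error.
                raise
            # Assumed permissive fallback: pass the byte through as the
            # codepoint with the same value (e.g. 0x81 -> U+0081).
            out.append(chr(b))
    return "".join(out)


def decode_1252_rep(data: bytes, replacement: str) -> str:
    """Decode windows-1252, substituting for unmapped bytes."""
    # Simplification: take the first code point of the replacement,
    # standing in for "the first grapheme" mentioned above.
    rep = replacement[:1]
    out = []
    for b in data:
        try:
            out.append(bytes([b]).decode("cp1252"))
        except UnicodeDecodeError:
            out.append(rep)
    return "".join(out)
```

With these definitions, `decode_1252(b"\x81")` raises `UnicodeDecodeError`, `decode_1252(b"\x81", permissive=True)` returns the single codepoint U+0081, and `decode_1252_rep(b"a\x81b", "?!")` returns `"a?b"`.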
Showing 10 changed files with 462 additions and 239 deletions.