hz-gb-2312 encoding and WHATWG compatibility #84

Closed
aneeshusa opened this issue Apr 13, 2015 · 4 comments · Fixed by #85

Comments

@aneeshusa
Contributor

The WHATWG Encoding Spec lists hz-gb-2312 as mapping to the replacement encoding, which uses the UTF-8 encoder and throws a special replacement encoding error for its decoder. However, it looks like this crate implements the actual HZ encoding. For WHATWG compatibility, this would have to get folded in with the rest of the replacement encodings, but I don't know if that's acceptable considering other people may be using the current implementation.

Would you prefer to maintain strict WHATWG compatibility or keep the current implementation? If the current implementation is kept, this deviation needs to be well documented - it isn't too hard to work around, but is a bit annoying and could catch someone unaware because the rest of the crate is compatible.
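For readers unfamiliar with the replacement encoding, here is a minimal sketch of the behavior described above: the decoder reports a single error for any non-empty input and then stops, while the encoder side is plain UTF-8. The names ReplacementDecoder, DecodeStep, decode_replacement and encode_replacement are hypothetical and are not part of this crate's API; the sketch also assumes errors are rendered as U+FFFD.

```rust
/// Hypothetical stand-in for a WHATWG-style "replacement" decoder.
/// Illustration only; this is not the rust-encoding API.
#[derive(Default)]
struct ReplacementDecoder {
    /// Set once the single decoding error has been reported.
    error_returned: bool,
}

/// Result of feeding one byte (or end of input) to the decoder.
#[derive(Debug, PartialEq)]
enum DecodeStep {
    Finished,
    Error,
}

impl ReplacementDecoder {
    /// `byte` is `None` at end of input.
    fn handle(&mut self, byte: Option<u8>) -> DecodeStep {
        match byte {
            // Nothing left to read: done.
            None => DecodeStep::Finished,
            // First byte seen: report the one and only error.
            Some(_) if !self.error_returned => {
                self.error_returned = true;
                DecodeStep::Error
            }
            // Everything after the first error is ignored.
            Some(_) => DecodeStep::Finished,
        }
    }
}

/// Decode `input`, turning the error into U+FFFD (assumed error mode).
fn decode_replacement(input: &[u8]) -> String {
    let mut decoder = ReplacementDecoder::default();
    let mut out = String::new();
    for &b in input {
        match decoder.handle(Some(b)) {
            DecodeStep::Error => out.push('\u{FFFD}'),
            DecodeStep::Finished => break,
        }
    }
    out
}

/// Encoder side of the sketch: plain UTF-8 bytes.
fn encode_replacement(input: &str) -> Vec<u8> {
    input.as_bytes().to_vec()
}
```

Under these assumptions, any non-empty byte string labeled hz-gb-2312 decodes to exactly one U+FFFD, and an empty input decodes to an empty string.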

@lifthrasiir
Owner

Confirmed. This is another change I've missed. It should be enough to fix encoding_from_whatwg_label and whatwg_name.

Any deviation from the current WHATWG specification is unintentional and to be fixed. Please file an issue or PR dealing with such deviations. (I'm kind of lazy and not always aware of all changes to the specification, but I think at some point I have implemented all encodings in the specification correctly.)

@aneeshusa
Contributor Author

Haha, I went through the whole spec and this is the last one I've found with regard to names and labels. Do you just want to ...rip out the entire HZ implementation though? I think it'd be a shame to throw it away/not expose it some other way.

@lifthrasiir
Owner

@aneeshusa I want to keep the encoding, simply making it invisible from encoding_from_whatwg_label.

@aneeshusa
Contributor Author

OK, that's reasonable. I can put in a PR for that in a few minutes (it should be only a line, I think.)

aneeshusa added a commit to aneeshusa/rust-encoding that referenced this issue Apr 13, 2015
Fixes lifthrasiir#84.

Keep the current HZ implementation, but surface hz-gb-2312 as
replacement in encoding_from_whatwg_label to match the spec.

This maintains WHATWG compatibility and also allows use of the actual HZ
encoding/decoding implementation.
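
The approach in this commit message can be sketched roughly as follows. This is a self-contained illustration, not the crate's code: the Codec enum and codec_from_whatwg_label are hypothetical stand-ins for the crate's encoding objects and encoding_from_whatwg_label, whatwg_name here is a free function written for the sketch, the label table is abbreviated, and returning no WHATWG name for HZ is an assumption of the sketch.

```rust
/// Hypothetical stand-ins for the crate's encoding objects.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Codec {
    Utf8,
    /// The real HZ implementation, still usable when named directly.
    Hz,
    Replacement,
}

/// Sketch of a WHATWG label lookup (abbreviated label table).
/// Labels are trimmed of ASCII whitespace and matched case-insensitively.
fn codec_from_whatwg_label(label: &str) -> Option<Codec> {
    let label = label
        .trim_matches(|c: char| matches!(c, '\t' | '\n' | '\x0C' | '\r' | ' '))
        .to_ascii_lowercase();
    match label.as_str() {
        "utf-8" | "utf8" | "unicode-1-1-utf-8" => Some(Codec::Utf8),
        // Per the spec, hz-gb-2312 is one of the labels of "replacement",
        // so the lookup never hands out the HZ codec.
        "csiso2022kr" | "hz-gb-2312" | "iso-2022-cn" | "iso-2022-cn-ext"
        | "iso-2022-kr" | "replacement" => Some(Codec::Replacement),
        _ => None,
    }
}

/// Sketch of the reverse mapping from a codec to its WHATWG name.
fn whatwg_name(codec: Codec) -> Option<&'static str> {
    match codec {
        Codec::Utf8 => Some("utf-8"),
        Codec::Replacement => Some("replacement"),
        // Assumption: HZ is kept but has no WHATWG name of its own.
        Codec::Hz => None,
    }
}
```

A caller that wants the actual HZ codec would reference Codec::Hz directly instead of going through the label lookup, which is the workaround the thread has in mind.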