Shift-JIS encoding/decoding support #61

Closed
r12a opened this issue Jun 20, 2016 · 13 comments

@r12a
Collaborator

r12a commented Jun 20, 2016

Results for a series of tests for Shift-JIS encoding/decoding can be found at
https://www.w3.org/International/tests/repo/results/encoding-dbl-byte.en#shiftjis

The tests can be run from that page (select the link in the left-most column) or obtained from the WPT repo. There is a PR at
web-platform-tests/wpt#3200

The tests check whether:

  1. the browser produces the expected byte sequences for all characters in the shift_jis encoding after 0x9F when encoding bytes for a URL produced by a form, using the encoder steps in the specification.
  2. the browser produces percent-escaped character references for a URL produced by a form when encoding miscellaneous characters that are not in the shift_jis encoding. (tests for several ranges)
  3. the same two types of test pass when characters are written to an href value
  4. the browser decodes all characters as expected from a file generated by encoding all pointers in the shift_jis encoding per the shift_jis encoder steps in the specification.
  5. the browser decodes characters that are not recognised from the shift_jis index as replacement characters.
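
As a rough illustration of what these checks expect, here is a minimal Python sketch. It is not the test code. The function names are hypothetical, and Python's built-in `shift_jis` codec is only an approximation of the Shift_JIS index in the Encoding spec (it differs for some code points), so treat the output as indicative. The percent-encoding via `urllib.parse` also ignores form-specific details such as encoding a space as `+`. `pointer_to_bytes` follows the lead/trail arithmetic of the encoder steps as I understand them.

```python
from typing import Tuple
from urllib.parse import quote_from_bytes


def encode_for_form(text: str) -> str:
    """Approximate how a shift_jis form submission serialises `text`."""
    # Characters missing from the encoding become decimal numeric character
    # references (for example "&#8364;"), which are then percent-encoded.
    raw = text.encode("shift_jis", errors="xmlcharrefreplace")
    return quote_from_bytes(raw, safe="")


def pointer_to_bytes(pointer: int) -> Tuple[int, int]:
    """Turn an index pointer into a lead/trail byte pair."""
    lead, trail = divmod(pointer, 188)
    lead_offset = 0x81 if lead < 0x1F else 0xC1
    trail_offset = 0x40 if trail < 0x3F else 0x41
    return lead + lead_offset, trail + trail_offset


def decode_bytes(data: bytes) -> str:
    """Decode shift_jis bytes, mapping unrecognised bytes to U+FFFD."""
    return data.decode("shift_jis", errors="replace")


# U+3042 (hiragana A) is in the encoding: bytes 0x82 0xA0.
print(encode_for_form("\u3042"))  # %82%A0
# U+20AC (euro sign) is not, so it becomes a percent-encoded "&#8364;".
print(encode_for_form("\u20ac"))  # %26%238364%3B
# Pointer 283 is the index position of U+3042.
print([hex(b) for b in pointer_to_bytes(283)])  # ['0x82', '0xa0']
# Round trip, and an unrecognised byte decoding to U+FFFD.
print(decode_bytes(bytes(pointer_to_bytes(283))) == "\u3042")  # True
print(decode_bytes(b"\xff") == "\ufffd")  # True
```

The real tests iterate over every pointer in the index rather than a single sample, which is how they arrive at a count of failing characters per browser.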

The following summarises the current situation according to my testing, for major desktop browsers. (I will be adding nightly results and perhaps other browsers in time.) The table lists the number of characters that were NOT successfully converted by the test.

[Screenshot, 2016-06-20: table of test results per browser]

Notes:

  • Edge fails all href encode tests because characters are not converted to percent-escapes in the href attribute.
  • Firefox fails all href encode tests for characters not in the encoding because it converts characters to percent-escaped Unicode values instead.
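
To make the difference concrete, the following hedged Python sketch contrasts the two outputs for a character that is not in the encoding. The function names are hypothetical. It assumes that "percent-escaped Unicode values" means percent-encoded UTF-8 bytes, which the notes above do not state explicitly. It also uses Python's `shift_jis` codec as a stand-in for the spec's index.

```python
from urllib.parse import quote, quote_from_bytes


def href_expected(ch: str) -> str:
    """What the tests look for: a percent-encoded numeric character reference."""
    raw = ch.encode("shift_jis", errors="xmlcharrefreplace")
    return quote_from_bytes(raw, safe="")


def href_utf8_escaped(ch: str) -> str:
    """Assumed Firefox-style output: the character's UTF-8 bytes, percent-encoded."""
    return quote(ch, safe="", encoding="utf-8")


euro = "\u20ac"  # U+20AC, not present in shift_jis
print(href_expected(euro))      # %26%238364%3B
print(href_utf8_escaped(euro))  # %E2%82%AC
```

The Edge behaviour described above would correspond to neither form, since the character is left unconverted in the attribute value.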

Can we please investigate the failures to ascertain whether:

  1. the browser needs to be changed
  2. the spec needs to be changed
  3. the test is at fault

The following tool may be helpful for investigating issues. It converts between byte sequences and characters for all encodings in the Encoding spec. http://r12a.github.io/apps/encodings/

@annevk
Member

annevk commented Jun 21, 2016

The single encode failure in Firefox seems like a bug (does not map U+0080 to 0x80).
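
For context, U+0080 is one of a handful of code points that the shift_jis encoder maps to a single byte before it consults the index. A minimal sketch of those early steps, as I read them in the Encoding spec, is below. The function name is hypothetical and the index lookup itself is left out.

```python
from typing import Optional


def encode_single_byte(code_point: int) -> Optional[int]:
    """Return the byte for code points handled before the index lookup, else None."""
    if code_point <= 0x80:
        # ASCII code points and U+0080 are emitted unchanged.
        return code_point
    if code_point == 0x00A5:
        # YEN SIGN maps to 0x5C.
        return 0x5C
    if code_point == 0x203E:
        # OVERLINE maps to 0x7E.
        return 0x7E
    if 0xFF61 <= code_point <= 0xFF9F:
        # Half-width katakana map into the range 0xA1 to 0xDF.
        return code_point - 0xFF61 + 0xA1
    # Everything else needs the Shift_JIS index (not shown here).
    return None


print(hex(encode_single_byte(0x80)))  # 0x80
print(encode_single_byte(0x3042))     # None
```

A browser that reports U+0080 as unencodable is skipping the first branch, which would account for a single failing character.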

@jungshik

Again, Chromium's form(misc) test failure has the same root cause as #62 and #59.
See https://bugs.chromium.org/p/chromium/issues/detail?id=647568

@vyv03354
Collaborator

Firefox Nightly 56 passed all tests.

@jungshik

Chrome fixed this issue last September.

@r12a
Collaborator Author

r12a commented Jun 15, 2017

Today and yesterday I updated the results at https://www.w3.org/International/tests/repo/results/encoding-dbl-byte.en#shiftjis for Firefox, Firefox Nightly, Chrome, and Canary. The latest summary is:

[Screenshot, 2017-06-15: updated summary table of test results]

@hsivonen
Member

Thank you. The Shift_JIS tests LGTM for merging into WPT. /cc @domenic

@domenic
Member

domenic commented Jun 15, 2017

Let's close this as web-platform-tests/wpt#6257 is ready to merge.

@domenic domenic closed this as completed Jun 15, 2017
@r12a
Collaborator Author

r12a commented Jun 16, 2017

@domenic first, thanks for your help with this. However, these issues were intended to track browser implementations, rather than to track the test development. Since we still have significant progress to make with respect to Safari and Edge, I'd like to reopen them.

Btw, perhaps we should also contact the Safari and Edge folks again, pointing to these issues (since they show the results), and ask whether any progress could be made in conforming to the Encoding spec.

@domenic
Member

domenic commented Jun 16, 2017

Hmm... I thought the issues on the browser bug trackers would track browser implementations, whereas issues on the spec bug trackers would track spec implementations. What am I missing?

@r12a
Collaborator Author

r12a commented Jun 16, 2017

Each of these issues summarises and tracks the cross-browser status of the implementation of an index from the point of view of the Encoding spec. I would expect to close them only when we no longer expect further progress from any of the major browsers. In the meantime, the issue at hand is that "the Encoding spec is not interoperably implemented for encoding/decoding X".

Each issue also points to the various browser bugs for a given encoding/decoding, which is handy.

They also allow for discussions related to the indexes that arise from testing and that span more than one browser. (It's also often been a useful place for implementers to raise issues they encounter with the tests themselves during implementation, which can be discussed in a cross-browser way. Having that in one place rather than spread across several issues is also handy.)

@domenic
Member

domenic commented Jun 16, 2017

Hmm OK, well I guess I'm happy to reopen, although I'm not sure that's quite the intended use of the spec issue tracker, since I don't see what other work we can do on the spec side here (especially given the convergence of two browsers with the spec so that the spec is quite unlikely to change).

But I'll leave the general question about issue tracker usage for @annevk to decide when he gets back from vacation, and in the meantime err on the side of caution by reopening.

@annevk
Member

annevk commented Oct 17, 2018

Now that Firefox passes all these tests and a year has passed, I'm happy to consider this done. A new issue would also be less noisy at this point, were one warranted.

@annevk annevk closed this as completed Oct 17, 2018