[GB18030] Wrong codepoint at index 7533 #271

Closed
ldelabre opened this issue Aug 8, 2021 · 4 comments
Comments

@ldelabre

ldelabre commented Aug 8, 2021

Hi,
According to https://encoding.spec.whatwg.org/index-gb18030.txt, pointer 7533 decodes as U+1E3F, which looks wrong when compared with https://www.w3.org/International/tests/repo/encoding/legacy-mb-schinese/gb18030/gb18030_chars.html.
In that test, the byte sequence 0xA8 0xBC (which maps to pointer 7533) decodes as U+E7C7, not U+1E3F.

Regards,
Ludovic.
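
For readers following the arithmetic, here is a minimal sketch of how the two-byte sequence 0xA8 0xBC maps to pointer 7533. It assumes the two-byte pointer formula from the Encoding Standard's gb18030 decoder; the function name `gb18030_two_byte_pointer` is hypothetical and not part of any library.

```python
from typing import Optional


def gb18030_two_byte_pointer(lead: int, trail: int) -> Optional[int]:
    """Sketch: compute the index gb18030 pointer for a two-byte sequence.

    Assumes the formula pointer = (lead - 0x81) * 190 + (trail - offset),
    where offset is 0x40 for trail bytes below 0x7F and 0x41 otherwise.
    Returns None when the bytes are outside the two-byte ranges.
    """
    if not 0x81 <= lead <= 0xFE:
        return None
    if not (0x40 <= trail <= 0x7E or 0x80 <= trail <= 0xFE):
        return None
    offset = 0x40 if trail < 0x7F else 0x41
    return (lead - 0x81) * 190 + (trail - offset)


# The byte sequence from the report: 0xA8 0xBC.
print(gb18030_two_byte_pointer(0xA8, 0xBC))  # 7533
```

This only shows which index entry the bytes select; which code point that entry should hold is the question raised in this issue.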

@hsivonen
Member

hsivonen commented Aug 9, 2021

The index gb18030 ranges pointer operation special-cases U+E7C7 on the encoder side, so on the surface this looks like an error in generating the test case.

@r12a, is this a test error or an intentional disagreement with the spec?
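
To illustrate the encoder-side special case mentioned above, here is a rough sketch of how a ranges pointer becomes a four-byte sequence. It assumes, from the Encoding Standard text rather than from this thread, that the special case returns ranges pointer 7457 for U+E7C7, and it uses the standard's four-byte arithmetic; the names `E7C7_RANGES_POINTER` and `four_bytes_from_ranges_pointer` are hypothetical.

```python
# Assumption: the spec's special case maps U+E7C7 to ranges pointer 7457.
E7C7_RANGES_POINTER = 7457


def four_bytes_from_ranges_pointer(pointer: int) -> bytes:
    """Sketch: turn an index gb18030 ranges pointer into a four-byte sequence."""
    byte1, pointer = divmod(pointer, 10 * 126 * 10)
    byte2, pointer = divmod(pointer, 10 * 126)
    byte3, byte4 = divmod(pointer, 10)
    return bytes([byte1 + 0x81, byte2 + 0x30, byte3 + 0x81, byte4 + 0x30])


encoded = four_bytes_from_ranges_pointer(E7C7_RANGES_POINTER)
print(encoded.hex(" ").upper())  # 81 35 F4 37
```

Under that assumption, an encoder following the spec emits U+E7C7 as a four-byte sequence rather than as 0xA8 0xBC, which is why a test expecting 0xA8 0xBC for U+E7C7 would disagree with the spec.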

@annevk
Member

annevk commented Aug 9, 2021

The source for this reads

<span data-cp="E7C7" data-bytes="A8 BC">ḿ</span>

and ḿ is U+1E3F, while data-cp says E7C7, so I think there's an error of sorts here in the test.

@vyv03354
Collaborator

vyv03354 commented Aug 9, 2021

Maybe the test does not reflect #26?

@annevk
Member

annevk commented Aug 10, 2021

Let's duplicate this into #57. Those tests were not maintained and haven't been successfully upstreamed into web-platform-tests. Anyone is welcome to do that work, but they will have to make them match the specification first.

@annevk annevk closed this as completed Aug 10, 2021