Codepage GB18030 does not implement the latest version of the standard GB18030-2022 #91068
The latest version of the standard, GB18030-2022, specifies three implementation levels that build on each other. When testing my application on .NET 6, 7 and 8 for compliance with implementation levels 1 and 2, I found exactly one requirement that the current implementation in GB18030Encoding.cs does not yet fulfill. All other extensions mandated by GB18030-2022 implementation levels 1 and 2 work out of the box.
In detail, GB18030-2022 changes a set of code mappings so that they no longer point to the Private Use Area (PUA), but to code points that Unicode has standardized in the meantime. The changed mappings are described in this blog post, section "No PUA Requirement". The Unicode Consortium has a pragmatic proposal to implement the changed mappings in one direction only, to ease transcoding into the standard.
With this one missing piece implemented, the .NET codepage would be fully compliant with GB18030-2022 implementation levels 1 and 2.
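To check which mapping a given runtime applies, a small probe along the following lines may help. It is only a sketch under stated assumptions: the sample pair (two-byte code 0xA6D9, mapped to U+E78D in the PUA by older versions and to U+FE10 by GB18030-2022) is, to my understanding, one of the changed mappings and should be verified against the standard's mapping table. The helper names `IsBmpPrivateUse` and `Classify` are made up for illustration. No expected output is given because the result depends on the runtime version.

```csharp
using System;
using System.Text;

// On .NET (Core), GB18030 is not built in; it is supplied by the
// code pages provider, which has to be registered before use.
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
Encoding gb18030 = Encoding.GetEncoding("gb18030");

// ASSUMPTION: 0xA6D9 is one of the codes whose mapping changed in
// GB18030-2022 (PUA U+E78D before, U+FE10 afterwards).
byte[] sample = { 0xA6, 0xD9 };
string decoded = gb18030.GetString(sample);
int codePoint = char.ConvertToUtf32(decoded, 0);
Console.WriteLine($"A6D9 decodes to U+{codePoint:X4} ({Classify(codePoint)})");

// Reverse direction: the bytes produced for the standardized code point.
byte[] encoded = gb18030.GetBytes("\uFE10");
Console.WriteLine($"U+FE10 encodes to {Convert.ToHexString(encoded)}");

// Hypothetical helper: true for the BMP Private Use Area, U+E000..U+F8FF.
static bool IsBmpPrivateUse(int cp) => cp >= 0xE000 && cp <= 0xF8FF;

// Hypothetical helper: labels a code point for the report line above.
static string Classify(int cp) =>
    IsBmpPrivateUse(cp) ? "private use area" : "standardized code point";
```

If the first line reports a private use code point, the runtime still decodes with the older mapping. The second line shows the encoding direction separately, which matters because the proposal mentioned above changes the mappings in one direction only.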