Should Keywords remove restriction to Latin-1? #1616

Closed
aphillips opened this issue Nov 15, 2022 · 2 comments
Labels
rejected s:png https://w3c.github.io/png/

Comments

@aphillips
Contributor

aphillips commented Nov 15, 2022

Proposed comment

Keywords and text string
https://www.w3.org/TR/2022/WD-png-3-20221025/#11keywords

Keywords shall contain only printable Latin-1 [ISO_8859-1] characters and spaces; that is, only character codes 32-126 and 161-255 decimal are allowed. To reduce the chances for human misreading of a keyword, leading spaces, trailing spaces, and consecutive spaces are not permitted in keywords, nor is the non-breaking space (code 160) since it is visually indistinguishable from an ordinary space.

While the above is probably true/necessary for historical reasons, it seems extremely limiting to restrict the title/author/description/etc. to printable Latin-1 characters. This statement is really only true for tEXt and zTXt chunks. If iTXt is used instead, the normative "shall" text shown here wouldn't apply, right?

Failing that, these restrictions seem onerous for non-Latin/non-Latin-1 languages. If the various keywords in question really must be restricted, providing a mechanism to smuggle in UTF-8, such as using the UTF-8 BOM, would perhaps make sense. However, I don't think that's strictly necessary.

I'd also observe that hex is generally preferred to decimal when referring to code point ranges. For example, instead of "the non-breaking space (code 160)" it should say U+00A0 NO-BREAK SPACE, and the ranges "32-126" and "161-255" would be 0x20-0x7E and 0xA1-0xFF respectively. This is easier for folks familiar with Unicode ;-).
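For illustration, the rules in the quoted paragraph can be expressed as a small check using the hex ranges. This is only a sketch: the function name `keyword_problems` is hypothetical, it covers just the rules quoted above, and it does not check anything else the spec may require of keywords (such as length limits).

```python
def keyword_problems(keyword: str) -> list[str]:
    """Return the quoted-rule violations found in a keyword (empty list = passes)."""
    problems = []
    for ch in keyword:
        cp = ord(ch)
        if cp == 0xA0:
            # Explicitly excluded: looks identical to an ordinary space.
            problems.append("contains U+00A0 NO-BREAK SPACE")
        elif not (0x20 <= cp <= 0x7E or 0xA1 <= cp <= 0xFF):
            # Outside printable Latin-1 (decimal 32-126 and 161-255).
            problems.append(f"U+{cp:04X} is outside 0x20-0x7E and 0xA1-0xFF")
    if keyword.startswith(" "):
        problems.append("leading space")
    if keyword.endswith(" "):
        problems.append("trailing space")
    if "  " in keyword:
        problems.append("consecutive spaces")
    return problems


if __name__ == "__main__":
    for candidate in ["Title", "Café", "タイトル", " Title", "a\u00a0b"]:
        print(repr(candidate), keyword_problems(candidate))
```

A keyword such as "Café" passes because U+00E9 falls inside 0xA1-0xFF, while any keyword in a non-Latin script fails on every character, which is the limitation this comment is about.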

Instructions:

This follows the process at https://w3c.github.io/i18n-activity/guidelines/review-instructions.html

  1. Create the review comment you want to propose by replacing the prompts above these instructions, but LEAVE ALL THE INSTRUCTIONS INTACT

  2. Set a label to identify the spec: this starts with s: followed by the spec's short name. If you are unable to do that, ask a W3C staff contact to help.

  3. Ask the i18n WG to review your comment.

  4. After discussion with the i18n WG, raise an issue in the repository of the WG that owns the spec. Use the text above these instructions as the starting point for that comment, but add any suggestions that arose from the i18n WG. In the other WG's repo, add an 'i18n-needs-resolution' label to the new issue. If you think any of the participants in layout requirements task force groups would be interested in following the discussion, add also the appropriate i18n-*lreq label(s).

  5. Delete the text below that says 'url_for_the_issue_raised', then add in its place the URL for the issue you raised in the other WG's repository. Do NOT remove the initial '§ '. Do NOT use [...](...) notation – you need to delete the placeholder, then paste the URL.

  6. Remove the 'pending' label, and add a 'needs-resolution' tag to this tracker issue.

  7. If you added an *lreq label, add the label 'spec-type-issue', add the corresponding language label, and a label to indicate the relevant typographic feature(s), eg. 'i:line_breaking'. The latter represent categories related to the Language Enablement Index, and all start with i:.

  8. Edit this issue to REMOVE ALL THE INSTRUCTIONS & THE PROPOSED COMMENT, ie. the line below that is '---' and all the text before it to the very start of the issue.


This is a tracker issue. Only discuss things here if they are i18n WG internal meta-discussions about the issue. Contribute to the actual discussion at the following link:

§ url_for_the_issue_raised

@aphillips added the pending and s:png labels on Nov 15, 2022
@himorin
Contributor

himorin commented Nov 16, 2022

(Memo on a comment to raise in the i18n call. Sorry to put it here, but I would forget otherwise...)
I believe the Keyword is Latin-1 even in iTXt, since the Translated Keyword is defined as a counterpart of the Keyword and is allowed to use UTF-8.
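To make the distinction concrete, here is a rough sketch of how the data portion of an iTXt chunk is laid out, as I read the PNG spec: the keyword is Latin-1, while the translated keyword and the text are UTF-8. The function name `build_itxt_data` is hypothetical, only the uncompressed case is shown, and the chunk length, type, and CRC that surround this data in a real file are left out.

```python
def build_itxt_data(keyword: str, text: str,
                    language_tag: str = "",
                    translated_keyword: str = "") -> bytes:
    """Assemble uncompressed iTXt chunk data (not a complete chunk)."""
    return (
        keyword.encode("latin-1")             # Keyword: Latin-1 only
        + b"\x00"                             # null separator
        + b"\x00"                             # compression flag: 0 = uncompressed
        + b"\x00"                             # compression method: 0
        + language_tag.encode("ascii")        # e.g. "ja"
        + b"\x00"                             # null separator
        + translated_keyword.encode("utf-8")  # Translated Keyword: UTF-8
        + b"\x00"                             # null separator
        + text.encode("utf-8")                # Text: UTF-8
    )


if __name__ == "__main__":
    data = build_itxt_data("Title", "東京の夜景", "ja", "タイトル")
    print(data)
```

Passing a non-Latin-1 string as `keyword` raises `UnicodeEncodeError` in this sketch, which mirrors the restriction: the localized name has to go in the translated keyword field instead.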

@aphillips added the rejected label and removed the pending label on Nov 22, 2022
@aphillips
Contributor Author

We decided not to post this, except for the comment about character numbering/naming, for which I created #1621 as a separate issue. The Latin-1 restriction does not make I18N happy, but this is a legacy issue. If PNG were a new spec, we'd plump for either UTF-8 or ASCII-only keywords.
