Address isomorphic string and ByteString guidance #158

Open: 5 commits to be merged into base: gh-pages

Conversation

aphillips
Contributor

@aphillips aphillips commented Apr 18, 2025

Fixes #151

Adds a new subsection about byte-oriented formats. Adds guidance about ByteString and isomorphic strings. Moves the guidance about not defining legacy encoding to the encoding section.


@aphillips aphillips requested review from xfq and r12a April 18, 2025 18:22

netlify bot commented Apr 18, 2025

Deploy Preview for bp-i18n-specdev ready!

Name Link
🔨 Latest commit 7e0573e
🔍 Latest deploy log https://app.netlify.com/projects/bp-i18n-specdev/deploys/682f37e1693fe400084e3798
😎 Deploy Preview https://deploy-preview-158--bp-i18n-specdev.netlify.app

@aphillips
Contributor Author

@annevk Mentioning you for a potential review. I'm working here to make the last bits of consistency with https://www.w3.org/TR/design-principles/#idl-string-types especially your comment here

index.html Outdated
<p>See also <a href="#char_choosing"></a>.</p>
</details>
</div>
<p>{{ByteString}} isn’t a general-purpose string type. The type {{ByteString}} defines strings as sequences of bytes (octets). Interpretation of byte strings thus requires the specification of a [=character encoding form=]. UTF-8 is the preferred encoding for wire and document formats on the Web [[ENCODING]] or the Internet in general [[RFC3629]]. If the field is encoded in UTF-8, there is rarely a reason to interact with it as a byte sequence.</p>
Member

This is still confusing. A ByteString maps to an isomorphic string. There's no choice of encoding.
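
To illustrate the point for readers of this thread: an isomorphic decode does not consult any character encoding. Each byte simply becomes the code point with the same numeric value. The sketch below is not from the PR; `isomorphic_decode` is a hypothetical helper name, and Python is used only for illustration. It contrasts that mapping with a UTF-8 decode of the same bytes.

```python
def isomorphic_decode(data: bytes) -> str:
    """Hypothetical helper: map each byte to the code point of the same value."""
    return "".join(chr(b) for b in data)


# The UTF-8 serialization of U+00E9 (é) is the two bytes 0xC3 0xA9.
utf8_bytes = "\u00e9".encode("utf-8")

# Isomorphic decode: two bytes in, two code points out (U+00C3, U+00A9).
as_isomorphic = isomorphic_decode(utf8_bytes)

# UTF-8 decode: the same two bytes are interpreted as one code point (U+00E9).
as_utf8 = utf8_bytes.decode("utf-8")
```

The isomorphic result always has exactly as many code points as the input has bytes, which is why no encoding choice is involved at that step.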

Contributor Author

This is a good callout. I tried to stick to the text in design principles, WebIDL and INFRA, but have to admit that the text there is fairly mysterious. I've refried this section again. See what you think. Suggestions welcome.

- define 'byte string'
- quote HTTP RFC 9112
- rewrite guidance
@aphillips aphillips requested a review from annevk April 23, 2025 20:39
index.html Outdated

<p>It is preferable, however, to specify these fields as a {{DOMString}} (or, rarely, a {{USVString}}), since the data encoded into these fields must be serialized from and deserialized into in-memory string representations, such as the [[DOM]] or JavaScript strings or your platform's native Unicode string type.</p>
</aside>
<p>{{ByteString}} isn’t a general-purpose string type. Frequently processing of these will be done by performing an [=isomorphic decode=] of the {{ByteString}} into an [=isomorphic string=] or by performing an [=isomorphic encode=] of such a string back into bytes [[INFRA]]. (It is also possible that the specification will work with the bytes directly.)</p>
Member

Per Web IDL a ByteString always maps to an isomorphic string. I suppose it's true that unusual processing might take place, but in all cases I can recall we leave it at that. And unusual processing might take place for all string types, but we're not calling that out either, so it seems weird to treat this type so circumspectly.
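
For context on the reverse direction: an isomorphic string contains only code points in the range U+0000 to U+00FF, so encoding it back to bytes is a one-to-one mapping with a range check. The following is a minimal sketch under that assumption; `isomorphic_encode` is a hypothetical helper name, not something defined in the PR, and raising `ValueError` is an illustrative choice for signalling out-of-range input.

```python
def isomorphic_encode(text: str) -> bytes:
    """Hypothetical helper: map each code point (0..255) to the byte of the same value."""
    for ch in text:
        if ord(ch) > 0xFF:
            # Assumption for this sketch: reject input that is not an isomorphic string.
            raise ValueError(f"code point U+{ord(ch):04X} is above U+00FF")
    return bytes(ord(ch) for ch in text)


# A header-like value made only of code points at or below U+00FF round-trips byte for byte.
encoded = isomorphic_encode("text/plain; name=\u00e9")
```

Input containing a code point above U+00FF, such as U+20AC, has no byte of the same value and is rejected by this sketch rather than being transcoded.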

index.html Outdated
</aside>
<p>{{ByteString}} isn’t a general-purpose string type. Frequently processing of these will be done by performing an [=isomorphic decode=] of the {{ByteString}} into an [=isomorphic string=] or by performing an [=isomorphic encode=] of such a string back into bytes [[INFRA]]. (It is also possible that the specification will work with the bytes directly.)</p>

<p>{{ByteString}} should not be confused with the more general term [=byte string=].</p>
Member

I would rather no specification be able to use a term like "byte string". That does seem like it could lead to confusion.

Development

Successfully merging this pull request may close these issues.

Mention IsomorphicString and update ByteString guidance appropriately
2 participants