<meta charset="utf-8"> missing from intro example #8726

Closed
gnpivo opened this issue Jan 13, 2023 · 4 comments
Labels
i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response.

Comments

@gnpivo

gnpivo commented Jan 13, 2023

The example from the introduction does not have a character encoding declaration. Although the character encoding declaration is not necessary in all circumstances, it would be helpful to include it here to better illustrate a strictly conforming HTML document.

<!DOCTYPE html>
<html lang="en">
 <head>
+ <meta charset="utf-8">
  <title>Sample page</title>
 </head>
 <body>
  <h1>Sample page</h1>
  <p>This is a <a href="demo.html">simple</a> sample.</p>
  <!-- this is a comment -->
 </body>
</html>
@gnpivo
Author

gnpivo commented Jan 13, 2023

If we were to make this change, we would also need to update the DOM tree representation later in the section. I would be glad to make a pull request if these changes are appropriate.

@domenic
Member

domenic commented Jan 13, 2023

A character encoding declaration is not required for conformant HTML. What is required is that if you include one, its contents be utf-8. But you don't have to include one.
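As a rough illustration of the rule described above (a declaration is optional, but any declaration present must say utf-8), here is a minimal Python sketch. The names `CharsetCollector`, `declared_charsets`, and `charset_declaration_ok` are hypothetical helpers, not part of any specification or tool, and the sketch only looks at `<meta charset>`; it assumes the `http-equiv` form is out of scope.

```python
from html.parser import HTMLParser


class CharsetCollector(HTMLParser):
    """Collects the value of every <meta charset=...> attribute seen."""

    def __init__(self):
        super().__init__()
        self.charsets = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names already lowercased.
        if tag == "meta":
            for name, value in attrs:
                if name == "charset" and value is not None:
                    self.charsets.append(value)


def declared_charsets(markup):
    """Return the list of charset values declared via <meta charset>."""
    collector = CharsetCollector()
    collector.feed(markup)
    collector.close()
    return collector.charsets


def charset_declaration_ok(markup):
    """True if no declaration is present, or every declaration says utf-8.

    Assumption: the comparison is ASCII case-insensitive.
    """
    return all(value.lower() == "utf-8" for value in declared_charsets(markup))
```

Under these assumptions, the intro example without a `meta` element passes the check, and so does the version with `<meta charset="utf-8">` added; only a declaration naming some other encoding would fail.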

@domenic domenic closed this as completed Jan 13, 2023
@gnpivo
Author

gnpivo commented Jan 13, 2023

Thank you for your reply. If I understand the spec correctly, the character encoding declaration is required in a certain circumstance:

If an HTML document does not start with a BOM, and its encoding is not explicitly given by Content-Type metadata, and the document is not an iframe srcdoc document, then the encoding must be specified using a meta element with a charset attribute or a meta element with an http-equiv attribute in the Encoding declaration state.

This is quoted from the relevant section of the specification.

Are we assuming that, for the purposes of the examples, there is either a byte-order marker or Content-Type metadata specified for the document?
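The condition in the quoted passage can be read as a three-way conjunction. The following Python sketch spells that reading out; the function and parameter names are hypothetical, and the BOM helper assumes a UTF-8 BOM only.

```python
import codecs


def starts_with_utf8_bom(data: bytes) -> bool:
    """Assumption: only the UTF-8 BOM (EF BB BF) is checked here."""
    return data.startswith(codecs.BOM_UTF8)


def needs_encoding_declaration(
    starts_with_bom: bool,
    content_type_gives_encoding: bool,
    is_iframe_srcdoc: bool,
) -> bool:
    """Return True when, per the quoted passage, a meta declaration is required.

    The declaration is required only if all three escape hatches are absent:
    no BOM, no encoding in Content-Type metadata, and not an iframe srcdoc
    document.
    """
    return not (starts_with_bom or content_type_gives_encoding or is_iframe_srcdoc)
```

For example, `needs_encoding_declaration(False, True, False)` is `False`: a document served with an explicit encoding in its Content-Type metadata does not need the `meta` element, which is the situation the question above asks about.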

@r12a r12a added the i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response. label Jan 16, 2023
@aphillips
Contributor

I was actioned by the I18N WG to look into this issue and its friend #8728.

Our tendency as a WG is to agree that it would be nice to have a <meta charset="utf-8"> in the "basic example" in the introduction, but in my investigation of the current state of character encoding handling in the HTML suite of specs and in re-reading the introduction, I can see how introducing this nicety would be a distraction. I support closing this issue and #8728 and have marked our i18n-activity mirror issues for closing.


4 participants