Proposed erratum: if a UTF-8 grammar begins with a BOM, the BOM must be ignored #174

Closed
ndw opened this issue Apr 13, 2023 · 3 comments · Fixed by #178

Comments

@ndw
Contributor

ndw commented Apr 13, 2023

E002: Ignore UTF-8 BOM

Implementations are instructed to ignore the BOM if it occurs at the beginning of a grammar encoded in UTF-8.

The second paragraph in the description of the grammar is changed to:

A grammar is an optional prolog, followed by a sequence of one or more rules, surrounded and separated by spacing and comments. Spacing and comments are entirely optional, except that rules must be separated by at least one of either (error S01). If an input grammar encoded in UTF-8 begins with a byte order mark (BOM), the BOM must be ignored.
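
As an illustration of what "ignoring" a leading BOM could look like in an implementation, here is a minimal Python sketch. It assumes the grammar arrives as raw bytes; the function name `strip_leading_utf8_bom` and the sample grammar are hypothetical and are not taken from the erratum or from any ixml processor.

```python
import codecs


def strip_leading_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (bytes EF BB BF), if present.

    Hypothetical helper: only a BOM at the very start is dropped.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Hypothetical grammar bytes that begin with a BOM.
grammar_bytes = codecs.BOM_UTF8 + "S: 'a'.".encode("utf-8")

# After stripping, the text the parser sees starts at the first rule.
grammar_text = strip_leading_utf8_bom(grammar_bytes).decode("utf-8")
```

Input without a BOM passes through unchanged, so the helper can be applied unconditionally before decoding.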

@graydon2014

I don't think this is sufficient, because the input could have been constructed by concatenating UTF-8 files that each begin with a BOM.

@ndw
Contributor Author

ndw commented Apr 13, 2023

That feels like a bug in the program you're using to concatenate them. And critically, U+FEFF has another meaning when it is not at the beginning of the file: it's a ZERO WIDTH NO-BREAK SPACE. I bet the Unicode consortium regrets that decision!
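
To make the distinction between a leading and an interior U+FEFF concrete, here is a small Python sketch of the concatenation case. It relies on Python's standard `utf-8-sig` codec, which skips a BOM only at the start of the data; the function name `decode_grammar` and the two-rule input are hypothetical.

```python
def decode_grammar(data: bytes) -> str:
    """Decode UTF-8 bytes, dropping a BOM only if it is at the very start.

    Hypothetical helper built on the standard "utf-8-sig" codec.
    """
    return data.decode("utf-8-sig")


# Two hypothetical BOM-prefixed files, naively concatenated.
concatenated = b"\xef\xbb\xbfa: 'x'.\n" + b"\xef\xbb\xbfb: 'y'.\n"

text = decode_grammar(concatenated)

# The leading BOM is gone; the second one survives as an ordinary
# U+FEFF character in the middle of the text.
interior_count = text.count("\ufeff")
```

Under this approach the second U+FEFF is left in place for the grammar itself to accept or reject, which matches the reading that only a BOM at the beginning is ignored.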

@cmsmcq
Contributor

cmsmcq commented May 8, 2023

Should there also be a rule about BOMs in the input string? Since we sometimes speak as if we believe ixml could be used to parse binary data, perhaps any rule about BOMs in the input string should use SHOULD, not MUST. Or am I missing something?
