Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not allow surrogate code points in ABNF #619

Merged
merged 3 commits into from
May 27, 2019
Merged

Do not allow surrogate code points in ABNF #619

merged 3 commits into from
May 27, 2019

Conversation

sevmeyer
Copy link
Contributor

@sevmeyer sevmeyer commented May 11, 2019

This is more a suggestion than a request.

I noticed that the current ABNF grammar allows UTF-16 surrogate code points (0xD800 - 0xDFFF) in strings and comments. This seems inconsistent, because it is invalid to insert these code points via escape sequences (e.g. \uD800).

This commit excludes the surrogate range. Because this affects five separate rules, I added a reusable rule non-ascii, to reduce redundancy.

@pradyunsg
Copy link
Member

Thanks for finding this and filing this PR (and for your patience!) @sevmeyer.

I'm split on whether we should include a note on this in the specification. We should however add a note to the changelog. Would you be willing to update your PR to do that?

@pradyunsg pradyunsg added this to Nice to Haves in TOML 1.0 May 25, 2019
@pradyunsg pradyunsg moved this from Nice to Haves to Critical Path in TOML 1.0 May 25, 2019
@sevmeyer
Copy link
Contributor Author

I have added an entry to CHANGELOG.md, if that is alright.

I don't think the specification needs any changes.
It already excludes the surrogate range with the statement:

A TOML file must be a valid UTF-8 encoded Unicode document.

According to Unicode 12.0, D92:

Because surrogate code points are not Unicode scalar values, any UTF-8 byte sequence that would otherwise map to code points U+D800..U+DFFF is ill-formed.

@pradyunsg pradyunsg merged commit 1138bfc into toml-lang:master May 27, 2019
TOML 1.0 automation moved this from Critical Path to Done May 27, 2019
@pradyunsg
Copy link
Member

Thanks for filing this @sevmeyer, for your patience and for the super quick response!

I tweaked the wording a bit before merging. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet

2 participants