Move non-ascii-idents content from unstable book to reference. #999

Merged
merged 1 commit into from
Apr 19, 2021

Conversation

crlf0710
Member

@crlf0710 crlf0710 commented Apr 3, 2021

The non-ascii-idents feature is now in FCP. This PR moves its unstable book content to the reference.

@ehuss
Contributor

ehuss commented Apr 3, 2021

Thanks for the PR!

Would you be willing to help update it with the details of the various restrictions and behaviors? Or at least help identify the list of them? The things I'm aware of:

  • Explicitly specify the Unicode version.
  • Does not allow non-ascii idents:
    • File-loaded modules
    • no_mangle items
    • extern crate
    • paths referencing external crates
  • The RFC mentioned rejecting non-ascii extern mod items, was that removed?
  • How does normalization work? If it is the same as specified in UAX #31, maybe it can just link to that? If not, then it should probably be spelled out clearly. Also, when does the normalization happen (particularly in relation to macros)?

@crlf0710
Member Author

crlf0710 commented Apr 6, 2021

Thanks for the comment!

  1. Unicode version. Following Rust's overall Unicode strategy, the plan here is to update to new Unicode releases on a regular basis. We rely on Unicode's backward compatibility guarantee to make sure existing code doesn't break. Notably, the confusable detection lints related to this feature don't have this guarantee (the confusable set is expected to grow over time), which is why we advise that the level of these lints not be set to forbid.

From the implementation side, the Unicode-related data for this feature is scattered across unicode-xid, unicode-normalization and unicode-security. Currently all of them use Unicode 13.0, and all these crates (and a few others, including libcore) need an upgrade when a new Unicode version is released.

  2. Nothing to add here, except that for "paths referencing external crates" the restriction on non-ascii idents applies only to the crate name part.

  3. This was not mentioned during the implementation. I think this might be an oversight. I'll create a rustc issue for it. (Non-ASCII identifiers should be restricted within extern blocks. rust#83923)

  4. Normalization follows the RFC and uses NFC. This is not exactly the same as UAX #31, which recommends normalizing to NFKC instead. The reasons are described in the RFC text.

When the normalization happens:

  1. For ordinary lexing, the symbol string is normalized to NFC before the symbol is created.
  2. For proc-macros, whenever the proc-macro-server crate creates an identifier with Ident::new, the symbol is normalized and re-interned.
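
A minimal sketch of what lexer-time NFC normalization means in practice, assuming a compiler with non-ascii-idents stabilized. The names `PRECOMPOSED`, `DECOMPOSED`, `literals_differ` and `ident_text` are made up for illustration. String literals are plain data and are left alone, while identifier text is normalized before the symbol is created, so `stringify!` should report the NFC spelling however the source file encoded it:

```rust
/// Two spellings of "é" as string data (hypothetical names for this sketch):
/// precomposed U+00E9, and "e" followed by combining acute accent U+0301.
const PRECOMPOSED: &str = "\u{e9}";
const DECOMPOSED: &str = "e\u{301}";

/// String literals are not normalized by the compiler,
/// so the two spellings compare unequal as data.
fn literals_differ() -> bool {
    PRECOMPOSED != DECOMPOSED
}

/// `stringify!` prints the identifier as the compiler stored it.
/// Identifiers are normalized to NFC during lexing, so the stored
/// text uses the precomposed form of "é".
fn ident_text() -> &'static str {
    stringify!(café)
}
```

Calling `literals_differ()` shows the two literal spellings stay distinct, while `ident_text()` compares equal to `"caf\u{e9}"`.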

Member

@Manishearth Manishearth left a comment

I think this should be good to go when rust-lang/rust#83799 lands. It would be nice to mention that "in some contexts non-ascii identifiers are not allowed, like extern functions"
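
For readers skimming the thread, here is a rough sketch of where non-ASCII identifiers are and are not accepted. The item names (`Rechteck`, `fläche`, `größen`) are invented for illustration, and the rejected forms are shown only as comments, based on the restrictions listed earlier in this conversation (rustc reports them as E0754 at the time of writing):

```rust
// Ordinary items, fields, parameters and locals may use non-ASCII names.
struct Rechteck {
    breite: u32,
    höhe: u32,
}

fn fläche(r: &Rechteck) -> u32 {
    r.breite * r.höhe
}

// An inline module may have a non-ASCII name, because no file name
// has to be derived from it.
mod größen {
    pub fn doppelt(x: u32) -> u32 {
        x * 2
    }
}

// Rejected forms, left as comments so the sketch still compiles:
//
//   mod größen;                 // file-loaded module with a non-ASCII name
//
//   #[no_mangle]
//   pub fn fläche_c() {}        // non-ASCII name on a no_mangle item
```

For example, `fläche(&Rechteck { breite: 2, höhe: 3 })` evaluates to 6 and `größen::doppelt(4)` to 8.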

@Manishearth Manishearth merged commit bf03819 into rust-lang:master Apr 19, 2021
@Manishearth
Member

Merging as is since this is strictly an improvement, but we should probably work on adding more about this.

m-ou-se added a commit to m-ou-se/rust that referenced this pull request Apr 28, 2021
Update books

## reference

5 commits in e1abb17cd94cd5a8a374b48e1bc8134a2208ed48..d23f9da8469617e6c81121d9fd123443df70595d
2021-04-07 08:09:48 -0700 to 2021-04-28 11:16:44 -0700
- Document or-patterns (rust-lang/reference#957)
- fixed a typo in traits.md (rust-lang/reference#1009)
- Improve clarity and style consistency of crate type list (rust-lang/reference#1005)
- added macro_rules to weak keywords (rust-lang/reference#1008)
- Move non-ascii-idents content from unstable book to reference. (rust-lang/reference#999)

## book

1 commits in b54090a99ec7c4b46a5203a9c927fdbc311bb1f5..50dd06cb71beb27fdc0eebade5509cdcc1f821ed
2021-03-24 11:21:46 -0500 to 2021-04-23 13:21:54 -0500
- Update link in COPYRIGHT (http to https) (rust-lang/book#2704)

## rust-by-example

3 commits in c80f0b09fc15b9251825343be910c08531938ab2..e0a721f5202e6d9bec0aff99f10e44480c0da9e7
2021-04-08 10:28:17 -0300 to 2021-04-27 09:32:15 -0300
- broken long comments in src/types/cast.md to several shortones (rust-lang/rust-by-example#1430)
- Fix link of formatting traits (rust-lang/rust-by-example#1410)
- chore: Fix the indention of Borrowed definition (rust-lang/rust-by-example#1436)

## rustc-dev-guide

8 commits in a9bd2bbf31e4f92b5d3d8e80b22839d0cc7a2022..e72b43a64925ce053dc7830e21c1a57ba00499bd
2021-04-09 18:12:21 -0400 to 2021-04-27 12:35:37 -0700
- Suggest using `git range-diff` (rust-lang/rustc-dev-guide#1092)
- Remove the possible unnecessary flag
- Replace some Travis-related things completely
- Trigger GHA only on the original repo
- Add sample nix shell
- more RA config suggestions (rust-lang/rustc-dev-guide#1114)
- Add Polymorphisation paper (rust-lang/rustc-dev-guide#1093)
- Mention unpretty=mir-cfg for debugging MIR
jackh726 added a commit to jackh726/rust that referenced this pull request Apr 28, 2021
Update books

jackh726 added a commit to jackh726/rust that referenced this pull request Apr 29, 2021
Update books

jackh726 added a commit to jackh726/rust that referenced this pull request Apr 29, 2021
Update books
