Kmeakin
Contributor

Kmeakin commented Sep 3, 2025

Split off from #145219

The ASCII subset of Unicode is fixed and will never change, so we don't
need to generate tables for it with every new Unicode version. This
saves a few bytes of static data and speeds up `char::is_control` and
`char::is_grapheme_extended` on ASCII inputs.

Since the table lookup functions exported from the `unicode` module will
give nonsensical results on ASCII input (and in fact will panic in debug
mode), I had to add some private wrapper methods to `char` that check
for ASCII-ness first.
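
To illustrate the wrapper pattern described above, here is a minimal sketch. It is not the code from this PR: `cc_lookup_non_ascii` and `is_control_sketch` are hypothetical names, and the `matches!` range stands in for the generated table. It relies on the fact that the non-ASCII control characters (general category `Cc`) are exactly U+0080 through U+009F.

```rust
/// Hypothetical stand-in for a generated table lookup that only covers
/// non-ASCII code points. The real tables live in `unicode_data.rs`; this
/// range is just the non-ASCII part of general category `Cc`.
fn cc_lookup_non_ascii(c: char) -> bool {
    // Assumption: mirrors the "panics in debug mode" behaviour mentioned above.
    debug_assert!(!c.is_ascii(), "table lookup called with ASCII input");
    matches!(c, '\u{80}'..='\u{9F}')
}

/// Hypothetical wrapper: answer ASCII inputs from fixed ranges and only
/// consult the table for everything else.
fn is_control_sketch(c: char) -> bool {
    if c.is_ascii() {
        // The ASCII controls never change: U+0000..=U+001F and U+007F.
        matches!(c, '\0'..='\x1F' | '\x7F')
    } else {
        cc_lookup_non_ascii(c)
    }
}

fn main() {
    // Compare the sketch against the standard library on a few inputs.
    for c in ['\n', 'a', '\x7F', '\u{85}', 'é'] {
        assert_eq!(is_control_sketch(c), c.is_control());
    }
    println!("sketch agrees with char::is_control on the sample inputs");
}
```

The ASCII branch compiles down to a couple of comparisons, which is where the speedup on ASCII inputs would come from. The table no longer needs entries below U+0080.
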
@rustbot
Collaborator

rustbot commented Sep 3, 2025

r? @jhpratt

rustbot has assigned @jhpratt.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Sep 3, 2025
@rustbot
Collaborator

rustbot commented Sep 3, 2025

library/core/src/unicode/unicode_data.rs is generated by the src/tools/unicode-table-generator tool.

If you want to modify unicode_data.rs, please modify the tool then regenerate the library source file via ./x run src/tools/unicode-table-generator instead of editing unicode_data.rs manually.

@jhpratt
Member

jhpratt commented Sep 4, 2025

The change itself looks fine, but let's check perf just in case. Reviewing the other discussion, it seems that the team would like to see the impact before potentially merging this.

@bors2 try @rust-timer queue

@rust-timer
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

rust-bors bot added a commit that referenced this pull request Sep 4, 2025
Don't include ASCII characters in Unicode tables
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 4, 2025
@rust-bors

rust-bors bot commented Sep 4, 2025

☀️ Try build successful (CI)
Build commit: cdb994f (cdb994f1ff4d7f9776a14f9b741a104e6119f9a0, parent: a1208bf765ba783ee4ebdc4c29ab0a0c215806ef)

@rust-timer
Collaborator

Queued cdb994f with parent a1208bf, future comparison URL.
There are currently 2 preceding artifacts in the queue.
It will probably take at least ~2.3 hours until the benchmark run finishes.
