@Kmeakin Kmeakin commented Sep 3, 2025

Split off from #145219

`Cased` is a derived property: it is the union of the `Lowercase` property, the `Uppercase` property, and the `Titlecase_Letter` general category. We already have lookup tables for `Lowercase` and `Uppercase`, and `Titlecase_Letter` is very small. So instead of duplicating a lookup table for `Cased`, just test each of those properties in turn.
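As a rough illustration of the union described above, here is a minimal sketch built only on stable `char` methods. The names `is_titlecase_letter` and `is_cased` are hypothetical and not the identifiers used in `core`, and the hard-coded `Titlecase_Letter` ranges are an assumption that should be checked against the Unicode Character Database; the real implementation works from the generated tables.

```rust
/// Hypothetical stand-in for a `Titlecase_Letter` (general category Lt) test.
/// The ranges below are assumed to cover the Lt code points; verify them
/// against the Unicode Character Database before relying on them.
fn is_titlecase_letter(c: char) -> bool {
    matches!(
        c,
        '\u{01C5}' | '\u{01C8}' | '\u{01CB}' | '\u{01F2}'
            | '\u{1F88}'..='\u{1F8F}'
            | '\u{1F98}'..='\u{1F9F}'
            | '\u{1FA8}'..='\u{1FAF}'
            | '\u{1FBC}' | '\u{1FCC}' | '\u{1FFC}'
    )
}

/// Hypothetical `Cased` test: the union of Lowercase, Uppercase, and
/// Titlecase_Letter, checked one after another instead of via its own table.
fn is_cased(c: char) -> bool {
    c.is_lowercase() || c.is_uppercase() || is_titlecase_letter(c)
}
```

With this sketch, `is_cased('a')` and `is_cased('Σ')` hold through the existing `Lowercase` and `Uppercase` checks, while a digraph such as `'\u{01C5}'` is caught only by the small titlecase test.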

This will probably be slower than the old approach, but `Cased` is not a public API: it is only used in `str::to_lowercase` when deciding whether a Greek capital sigma (`Σ`) should be mapped to `ς` or to `σ`. This is a very rare case, so it should not be performance sensitive.
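For context, the sigma behaviour in question can be observed from stable Rust through `str::to_lowercase`. The sketch below only exercises the public API; the helper name `sigma_examples` is hypothetical, and nothing here shows the internal lookup this PR changes.

```rust
/// Hypothetical helper that lowercases a few Greek inputs with the
/// standard library so the word-final sigma rule can be inspected.
fn sigma_examples() -> [String; 3] {
    [
        // The last Σ follows a cased letter and ends the word, so it becomes ς.
        "ΟΔΥΣΣΕΥΣ".to_lowercase(),
        // A lone Σ is not preceded by a cased letter, so it becomes σ.
        "Σ".to_lowercase(),
        // A word-initial Σ also becomes σ.
        "ΣΑ".to_lowercase(),
    ]
}
```

Only inputs that contain `Σ` reach the `Cased` check at all, which is why the lookup is expected to run rarely.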


rustbot commented Sep 3, 2025

r? @scottmcm

rustbot has assigned @scottmcm.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Sep 3, 2025

rustbot commented Sep 3, 2025

library/core/src/unicode/unicode_data.rs is generated by the src/tools/unicode-table-generator tool.

If you want to modify unicode_data.rs, please modify the tool then regenerate the library source file via ./x run src/tools/unicode-table-generator instead of editing unicode_data.rs manually.

@Kmeakin Kmeakin changed the title from "optimization: Eliminate Cased table" to "Remove Cased Unicode table" Sep 3, 2025

Kobzol commented Sep 4, 2025

@bors try @rust-timer queue

@rust-timer

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

rust-bors bot commented Sep 4, 2025

⌛ Trying commit a765086 with merge abd6680

To cancel the try build, run the command @bors try cancel.

rust-bors bot added a commit that referenced this pull request Sep 4, 2025
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 4, 2025