Add Char#titlecase for correct mixed-case transformations #13539

Merged

Contributor

@HertzDevil HertzDevil commented Jun 1, 2023

Fixes #13533. The name #titlecase was deliberately chosen to differ from #titleize, because only the latter deals with word boundaries; #titlecase transforms a single character.
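
To illustrate the distinction, here is a minimal sketch in Python rather than Crystal, chosen only because Python's `str` methods expose the same Unicode case mappings. The helper names `titlecase_char` and `titleize` are hypothetical and do not reflect how Crystal implements either method; the word splitting on single spaces is likewise a simplifying assumption.

```python
def titlecase_char(ch: str) -> str:
    """Apply the Unicode titlecase mapping to exactly one character.

    The result may be longer than one character (e.g. U+00DF -> "Ss").
    """
    if len(ch) != 1:
        raise ValueError("expected a single character")
    # On a one-character string, str.title() applies the full titlecase mapping.
    return ch.title()


def titleize(text: str) -> str:
    """Word-level transformation: titlecase the first character of each
    space-separated word and lowercase the rest (hypothetical helper)."""
    return " ".join(
        titlecase_char(word[0]) + word[1:].lower() if word else word
        for word in text.split(" ")
    )


if __name__ == "__main__":
    dz_small = "\u01C6"  # LATIN SMALL LETTER DZ WITH CARON
    # Per-character titlecase gives U+01C5, while uppercase gives U+01C4.
    print(f"U+{ord(titlecase_char(dz_small)):04X}", f"U+{ord(dz_small.upper()):04X}")
    # Word-level titleizing relies on the per-character mapping at each word start.
    print(titleize(dz_small + "amija je blizu"))
```

The per-character function knows nothing about word boundaries; only the word-level helper decides where a word starts.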

The full list of affected characters is:

| Character | Titlecase | Uppercase |
| --- | --- | --- |
| ß U+00DF | Ss U+0053 U+0073 | SS U+0053 U+0053 |
| DŽ U+01C4 | Dž U+01C5 | DŽ U+01C4 |
| Dž U+01C5 | Dž U+01C5 | DŽ U+01C4 |
| dž U+01C6 | Dž U+01C5 | DŽ U+01C4 |
| LJ U+01C7 | Lj U+01C8 | LJ U+01C7 |
| Lj U+01C8 | Lj U+01C8 | LJ U+01C7 |
| lj U+01C9 | Lj U+01C8 | LJ U+01C7 |
| NJ U+01CA | Nj U+01CB | NJ U+01CA |
| Nj U+01CB | Nj U+01CB | NJ U+01CA |
| nj U+01CC | Nj U+01CB | NJ U+01CA |
| DZ U+01F1 | Dz U+01F2 | DZ U+01F1 |
| Dz U+01F2 | Dz U+01F2 | DZ U+01F1 |
| dz U+01F3 | Dz U+01F2 | DZ U+01F1 |
| և U+0587 | Եւ U+0535 U+0582 | ԵՒ U+0535 U+0552 |
| U+10D0 | U+10D0 | U+1C90 |
| U+10D1 | U+10D1 | U+1C91 |
| U+10D2 | U+10D2 | U+1C92 |
| U+10D3 | U+10D3 | U+1C93 |
| U+10D4 | U+10D4 | U+1C94 |
| U+10D5 | U+10D5 | U+1C95 |
| U+10D6 | U+10D6 | U+1C96 |
| U+10D7 | U+10D7 | U+1C97 |
| U+10D8 | U+10D8 | U+1C98 |
| U+10D9 | U+10D9 | U+1C99 |
| U+10DA | U+10DA | U+1C9A |
| U+10DB | U+10DB | U+1C9B |
| U+10DC | U+10DC | U+1C9C |
| U+10DD | U+10DD | U+1C9D |
| U+10DE | U+10DE | U+1C9E |
| U+10DF | U+10DF | U+1C9F |
| U+10E0 | U+10E0 | U+1CA0 |
| U+10E1 | U+10E1 | U+1CA1 |
| U+10E2 | U+10E2 | U+1CA2 |
| U+10E3 | U+10E3 | U+1CA3 |
| U+10E4 | U+10E4 | U+1CA4 |
| U+10E5 | U+10E5 | U+1CA5 |
| U+10E6 | U+10E6 | U+1CA6 |
| U+10E7 | U+10E7 | U+1CA7 |
| U+10E8 | U+10E8 | U+1CA8 |
| U+10E9 | U+10E9 | U+1CA9 |
| U+10EA | U+10EA | U+1CAA |
| U+10EB | U+10EB | U+1CAB |
| U+10EC | U+10EC | U+1CAC |
| U+10ED | U+10ED | U+1CAD |
| U+10EE | U+10EE | U+1CAE |
| U+10EF | U+10EF | U+1CAF |
| U+10F0 | U+10F0 | U+1CB0 |
| U+10F1 | U+10F1 | U+1CB1 |
| U+10F2 | U+10F2 | U+1CB2 |
| U+10F3 | U+10F3 | U+1CB3 |
| U+10F4 | U+10F4 | U+1CB4 |
| U+10F5 | U+10F5 | U+1CB5 |
| U+10F6 | U+10F6 | U+1CB6 |
| U+10F7 | U+10F7 | U+1CB7 |
| U+10F8 | U+10F8 | U+1CB8 |
| U+10F9 | U+10F9 | U+1CB9 |
| U+10FA | U+10FA | U+1CBA |
| U+10FD | U+10FD | U+1CBD |
| U+10FE | U+10FE | U+1CBE |
| U+10FF | U+10FF | Ჿ U+1CBF |
| U+1FB2 | Ὰͅ U+1FBA U+0345 | ᾺΙ U+1FBA U+0399 |
| U+1FB4 | Άͅ U+0386 U+0345 | ΆΙ U+0386 U+0399 |
| U+1FB7 | ᾼ͂ U+0391 U+0342 U+0345 | Α͂Ι U+0391 U+0342 U+0399 |
| U+1FC2 | Ὴͅ U+1FCA U+0345 | ῊΙ U+1FCA U+0399 |
| U+1FC4 | Ήͅ U+0389 U+0345 | ΉΙ U+0389 U+0399 |
| U+1FC7 | ῌ͂ U+0397 U+0342 U+0345 | Η͂Ι U+0397 U+0342 U+0399 |
| U+1FF2 | Ὼͅ U+1FFA U+0345 | ῺΙ U+1FFA U+0399 |
| U+1FF4 | Ώͅ U+038F U+0345 | ΏΙ U+038F U+0399 |
| U+1FF7 | ῼ͂ U+03A9 U+0342 U+0345 | Ω͂Ι U+03A9 U+0342 U+0399 |
| U+FB00 | Ff U+0046 U+0066 | FF U+0046 U+0046 |
| U+FB01 | Fi U+0046 U+0069 | FI U+0046 U+0049 |
| U+FB02 | Fl U+0046 U+006C | FL U+0046 U+004C |
| U+FB03 | Ffi U+0046 U+0066 U+0069 | FFI U+0046 U+0046 U+0049 |
| U+FB04 | Ffl U+0046 U+0066 U+006C | FFL U+0046 U+0046 U+004C |
| U+FB05 | St U+0053 U+0074 | ST U+0053 U+0054 |
| U+FB06 | St U+0053 U+0074 | ST U+0053 U+0054 |
| U+FB13 | Մն U+0544 U+0576 | ՄՆ U+0544 U+0546 |
| U+FB14 | Մե U+0544 U+0565 | ՄԵ U+0544 U+0535 |
| U+FB15 | Մի U+0544 U+056B | ՄԻ U+0544 U+053B |
| U+FB16 | Վն U+054E U+0576 | ՎՆ U+054E U+0546 |
| U+FB17 | Մխ U+0544 U+056D | ՄԽ U+0544 U+053D |
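
As a rough cross-check of a few rows from the list above, the following sketch compares the titlecase and uppercase forms that Python's built-in Unicode data produces. It is written in Python, not Crystal, and says nothing about how `Char#titlecase` is implemented; `SAMPLES`, `case_forms` and `fmt` are hypothetical names introduced for this illustration.

```python
# (code point, titlecase, uppercase) copied from rows of the list above.
SAMPLES = [
    (0x00DF, "\u0053\u0073", "\u0053\u0053"),  # sharp s
    (0x01C6, "\u01C5", "\u01C4"),              # dz with caron digraph
    (0x0587, "\u0535\u0582", "\u0535\u0552"),  # Armenian ech yiwn ligature
    (0xFB01, "\u0046\u0069", "\u0046\u0049"),  # fi ligature
    (0xFB13, "\u0544\u0576", "\u0544\u0546"),  # Armenian men now ligature
]


def case_forms(code_point: int) -> tuple[str, str]:
    """Return (titlecase, uppercase) of one character using full case mappings."""
    ch = chr(code_point)
    # On a one-character string, str.title() applies the titlecase mapping.
    return ch.title(), ch.upper()


def fmt(text: str) -> str:
    """Render a string as a space-separated sequence of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in text)


if __name__ == "__main__":
    for code_point, expected_title, expected_upper in SAMPLES:
        title, upper = case_forms(code_point)
        status = "ok" if (title, upper) == (expected_title, expected_upper) else "MISMATCH"
        print(f"U+{code_point:04X}: title={fmt(title)} upper={fmt(upper)} [{status}]")
```

In every sampled row the titlecase form differs from the uppercase form, which is why a transformation that reuses the uppercase mapping for the first character gives the wrong result for these characters.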

Member

@beta-ziliani beta-ziliani left a comment

Looks great, thanks Quinton!

@beta-ziliani beta-ziliani added this to the 1.9.0 milestone Jun 14, 2023
src/char.cr (outdated review thread, resolved)
@straight-shoota straight-shoota removed this from the 1.9.0 milestone Jun 15, 2023
@straight-shoota straight-shoota added this to the 1.9.0 milestone Jun 23, 2023
@straight-shoota straight-shoota merged commit 5b8cee0 into crystal-lang:master Jun 24, 2023
50 checks passed
@HertzDevil HertzDevil deleted the feature/char-titlecase branch June 26, 2023 16:47
Blacksmoke16 pushed a commit to Blacksmoke16/crystal that referenced this pull request Dec 11, 2023
Successfully merging this pull request may close these issues.

String#titleize and #capitalize ignore Unicode Titlecase_Mapping property