add String.toTitleCase, String.toLocaleTitleCase #294
Comments
Shouldn't this be filed against tc39/ecma262 (as well)?
I think it's enough to track it here, even though we would make changes in the main spec too. See also #99.
Also, CLDR (data) and ICU (implementation) implement casing via transforms (transliterators) because of its complexity.
Title case is more complex than just these issues.
This is a big pain point in JavaScript. Here's a SO question with 461 upvotes where almost all of the answers are "take the first char and make it upper case": https://stackoverflow.com/q/196972/1407170 We should discuss whether the right answer is title case or whether we should use sentence casing, etc.
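For illustration, the approach most of those answers take can be sketched roughly as follows. The helper name `capitalizeFirst` is hypothetical and not part of any proposal; it is only meant to show why uppercasing the first character is not the same thing as titlecasing.

```javascript
// Hypothetical helper: the common "uppercase the first char" answer.
function capitalizeFirst(str) {
  if (str.length === 0) return str;
  // Destructuring iterates by code point, so astral characters are not split.
  const [first, ...rest] = str;
  return first.toUpperCase() + rest.join('');
}

console.log(capitalizeFirst('hello world')); // "Hello world"

// U+01C6 (ǆ) uppercases to U+01C4 (Ǆ), but its Unicode titlecase form
// is U+01C5 (ǅ). JavaScript exposes no titlecase mapping, so the naive
// approach produces the uppercase digraph instead.
console.log(capitalizeFirst('\u01C6ep') === '\u01C4ep'); // true
```

The sketch touches only the start of the string, so it says nothing about word or sentence boundaries or about locale-specific rules.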
I was pointed to this very issue by @littledan because I just wrote an article on a potential CSS
Title case is usually used in article titles. Its rules differ across languages and locales. Because natural languages are not like programming languages, there may be more complicated rules, uncertain variants, and exceptions. I don't know the rules of every language. I think that if a language has static, definite rules for title case that are unambiguous and not affected by contextual semantics, title case could be implemented in JavaScript for it; otherwise it is not suitable for implementation in JavaScript. Chinese has almost no concept of capitalization or of uppercase and lowercase, so title case does not apply to Chinese. For English, I found these references and an implementation at https://individed.com/code/to-title-case/ by @gouch.
Another use case to consider is tc39/proposal-intl-displaynames#13. Different types of display names have different capitalization rules based on context. For example, you might titlecase month names in some locales but not in others.
@markusicu What are your thoughts on putting titlecasing more front and center in JavaScript?
The question is what people mean by "titlecasing". Unicode has a decent spec, and ICU has a solid implementation, for titlecasing at certain boundaries (with adjustment options) and leaving alone or lowercasing the rest of the string. However, different people use it for different things. Some people want just the start of the string titlecased. Some want the start of each sentence. Some want the start of each word. ICU lets you provide different BreakIterator instances/options for these choices. In the US, there is a peculiar style of "titlecasing" book titles and article headlines that titlecases some words but not others. This is language- and style-specific and not built into ICU. You would need to provide ICU with the offsets for where to titlecase and where not to. Note that, like all case mapping operations, titlecasing is a lossy operation. It's also not always obvious. It is not always actually desirable to titlecase the first character of a word and lowercase the rest. Think of acronyms like NASA, names like McDonald, product names like iPhone. The best we have for that is the "don't lowercase the rest" option. FYI: For some characters, titlecasing is different from uppercasing. FYI: Yes, CLDR/ICU have "Transliterator" rules for case mappings, but most people don't use them. For example, Greek uppercasing would be more difficult with a Transliterator rule than with the hand-coded implementation in the low-level API, I think.
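As a rough sketch of the word-by-word variant described above, with an option that mirrors "don't lowercase the rest": the function name `capitalizeWords` and the option `lowercaseRest` are hypothetical, the regular expression is only a crude stand-in for a real word BreakIterator, and `toUpperCase()` is used because JavaScript has no titlecase mapping. This is not how ICU implements it.

```javascript
// Hypothetical sketch: capitalize the start of each word.
// Assumption: a "word" is a letter followed by letters, combining marks,
// or apostrophes. Real word segmentation is more involved than this.
function capitalizeWords(str, { lowercaseRest = true } = {}) {
  return str.replace(/\p{L}[\p{L}\p{M}']*/gu, (word) => {
    // Split off the first code point of the word.
    const [first, ...rest] = word;
    const tail = rest.join('');
    return first.toUpperCase() + (lowercaseRest ? tail.toLowerCase() : tail);
  });
}

// Lowercasing the rest damages acronyms and product names.
console.log(capitalizeWords('nasa launches iPhone app'));
// "Nasa Launches Iphone App"

// Leaving the rest alone preserves existing capitals.
console.log(capitalizeWords('NASA and McDonald', { lowercaseRest: false }));
// "NASA And McDonald"
```

Neither setting reproduces US headline style, where a word such as "and" would stay lowercase; that would need language- and style-specific rules on top. Where available, `Intl.Segmenter` with `granularity: 'word'` would likely be a closer analogue to a BreakIterator than the regular expression used here.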
Just chiming in because opinions were solicited on Twitter...
If there was a JS standard library function called
@domenic The US isn't the world. There exist different cultural conventions regarding titlecasing, even within English. If the JS standard library function W3C-I18N recently closed an issue related to CSS (it was quite an old issue--we were housekeeping). Basically CSS decided that the I do think a locale-aware titlecasing function would be useful. As @markusicu mentions, ICU has a solid implementation that covers most users' needs for most strings. But the gaps are not confined to obscure locales or scripts.
To be clear, I'm not suggesting adding a toTitleCase() that does US English titlecasing. I'm simply saying that if a toTitleCase() is added, and it fails to do US English titlecasing, that would be extremely surprising. As such, I was suggesting that a different name be used for a function that does the style of titlecasing that this thread seems to be discussing.
@domenic Ah, I get it. Still, most functions that claim to "titlecase" in other programming languages are algorithmic and don't get US English titlecasing correct either. To your point, notice that CSS's transform is called "capitalize". That might be a good choice here too, since unlike
As the editor, I don't have a specific preference on whether we should add the features discussed here, but I have some observations:
IMO, @sffc let's add this to the discussions for the next TG2 meeting?
@everyone: if you want to see titlecasing (regardless of the exact implementation, e.g. capitalize individual words versus string toTitleCase), please 👍 the OP. There are still only 2 votes for this issue. I can't tell whether the discussion in this thread is "if we were to theoretically do this, this is what it should look like" or "I think we should do this, and here's some discussion to get the ball rolling".
@leobalter
@aphillips it seems it's not that simple even for capitalization, then. TIL, thanks for the heads up. @sffc in my position I'd be using the feature, rather than implementing it. I'm definitely down to see it being discussed.
Some background on title case issues for Georgian is in this document: https://gist.github.com/srl295/1d9603ecfbcae55a08b04e9cd925d349#problem