Title-case for uppercase strings (textcase_CapitalsUntouched) #9

Open
larsgw opened this issue Oct 31, 2019 · 8 comments

@larsgw

larsgw commented Oct 31, 2019

The specs say, about title case conversion:

  1. For uppercase strings, the first character of each word remains capitalized. All other letters are lowercased.

Yet in the following test case, two uppercase strings ("UK" and "OC 1") are capitalized according to rule 2 (where non-lowercase words stay the same):

>>===== RESULT =====>>
UK
Review of Book by A.N. Author
Writings on UK People and Places
All for One. For All.
OC 1
<<===== RESULT =====<<
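To make the mismatch concrete, here is a minimal Python sketch of rule 1 as quoted above. The function name `spec_rule_1` is made up for illustration, and it assumes the fixture inputs for the two strings in question are exactly "UK" and "OC 1"; it is not how any CSL processor is actually implemented.

```python
def spec_rule_1(title: str) -> str:
    """Rule 1 as quoted: for an all-uppercase string, keep the first
    character of each word capitalized and lowercase all other letters."""
    if not title.isupper():
        return title  # rule 1 only applies to uppercase strings
    return " ".join(w[:1] + w[1:].lower() for w in title.split(" "))


# Assumed fixture inputs; the fixture's RESULT block leaves both unchanged.
for s in ("UK", "OC 1"):
    print(f"{s!r}: rule 1 gives {spec_rule_1(s)!r}, fixture expects {s!r}")
```

Under a literal reading of rule 1, "UK" would become "Uk" and "OC 1" would become "Oc 1", which is not what the fixture expects.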

@fbennett
Member

fbennett commented May 24, 2020

Yes, the behavior doesn't fit the stated rules here, does it? It's reflecting the behaviour of citeproc-js, which I think is coded to avoid touching things that look like acronyms (OECD, WTO, WHO, FAA...), which might well turn up as the sole "word" in a title. (If there are objections, fixtures that reflect this behaviour could be removed from the standard test suite.)

@larsgw
Author

larsgw commented May 24, 2020

What things look like acronyms? How would one distinguish between a book titled Dr Who but written in uppercase and a website about Disaster Relief by the World Health Organization? I guess this is an edge case, and I don't have a specific reason for objecting to the fixture; I was just wondering.
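As a rough illustration of the ambiguity, consider a hypothetical acronym heuristic in Python. The function `looks_like_acronym` and its length cutoff are assumptions invented for this example, not anything citeproc-js is known to do.

```python
def looks_like_acronym(word: str) -> bool:
    """Hypothetical heuristic: a short, purely alphabetic, all-uppercase token."""
    return word.isalpha() and word.isupper() and len(word) <= 5


# The same input string stands for both cases in the question:
# an all-caps "Dr Who", or "DR" (disaster relief) by the "WHO".
title = "DR WHO"
print([looks_like_acronym(w) for w in title.split()])
```

Both tokens match the heuristic, so any rule that looks only at the string has to treat the two cases identically.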

@fbennett
Member

fbennett commented May 24, 2020

I take that back. It's not guessing at acronym-like things, it's doing what the name of the test suggests: capital letters are not lowercased. That makes the behaviour even further removed from the spec language, but also simpler to explain. It also fits the advice commonly given to users by the CSL team: to set titles in sentence case, capitalizing only things (like acronyms and proper nouns) that should always be in capitals. It's not one for my desk, but this would be a case for amendment of the spec language, I think.
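A minimal Python sketch of the "capitals untouched" behaviour described here might look like the following. The function name, the stop-word subset, and the lowercase input strings are all assumptions for illustration; the real CSL stop-word list and the handling of words after sentence-ending punctuation are deliberately left out.

```python
# Illustrative subset only, not the stop-word list from the CSL spec.
STOP_WORDS = {"a", "an", "and", "by", "for", "of", "on", "the", "to"}


def title_case_capitals_untouched(title: str) -> str:
    """Capitalize all-lowercase words; never lowercase an existing capital."""
    words = title.split(" ")
    out = []
    for i, word in enumerate(words):
        if word != word.lower():
            out.append(word)  # contains a capital letter: leave untouched
        elif word in STOP_WORDS and 0 < i < len(words) - 1:
            out.append(word)  # stop word that is neither first nor last
        else:
            out.append(word[:1].upper() + word[1:])
    return " ".join(out)


# Hypothetical inputs chosen to resemble lines of the RESULT block above.
for s in ("UK", "OC 1", "writings on UK people and places"):
    print(title_case_capitals_untouched(s))
```

With this approach "UK" and "OC 1" pass through unchanged, because the function only ever adds capitals and never removes them.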

@bdarcus
Member

bdarcus commented May 29, 2020

Sounds like this is another issue that would benefit from a PR against the spec?

@larsgw
Author

larsgw commented May 29, 2020

I think the amendment would be removing point 1 from the spec entirely; I don't know if that's wanted. But I can make a PR later.

@bdarcus
Member

bdarcus commented May 29, 2020

To be clear, I haven't looked at the specifics closely. I'm just trying to triage, across a number of repos, and gathered that from what I could tell of where the conversation ended up.

Thoughts @fbennett?

@adam3smith
Member

Removing point 1 (and adjusting the remaining phrasing) seems right to me.
This removes the ability for title-case to fix poor data entry, but it avoids the risk of unintended lowercasing of an all-acronym (or otherwise intentionally all-caps) title. This seems like the right approach to me -- I'm not aware of any complaints about the current citeproc-js behavior.

@rmzelle
Member

rmzelle commented May 30, 2020

> Removing point 1 (and adjusting the remaining phrasing) seems right to me.
> This removes the ability for title-case to fix poor data entry, but it avoids the risk of unintended lowercasing of an all acronym (or otherwise intentionally all caps) title. This seems like the right approach to me -- I'm not aware of any complaints about the current citeproc-js behavior.

I have a bunch of old items in my Zotero library with all-caps titles (mostly patents like https://patentscope.wipo.int/search/en/detail.jsf?docId=WO2011142027), but it looks like Zotero nowadays sentence-cases them when saving. (Zotero also has the option to re-case titles post-save.)

I'd be fine with the tenet that CSL processors shouldn't try to clean up messy metadata, which is the main purpose of the described behavior for all-uppercase strings (although we might want to mention that somewhere).

5 participants