Extend \Make...case to support languages #936

Merged
merged 1 commit into from
Oct 26, 2022


josephwright
Member

This PR focuses on language support in two ways:

  • An optional argument taking a BCP-47 string
  • Automatic babel detection (still to do at the time of writing)

Member

@davidcarlisle davidcarlisle left a comment


In general this looks good to me; a couple of points are raised by the following test document:

\documentclass{article}

\begin{document}

\MakeUppercase{xyz}

\MakeUppercase[]{xyz}

\MakeUppercase[langg=en]{xyz}

\MakeUppercase[lang=fr]{xyz}

\typeout{\MakeUppercase{xyz}}
\typeout{\MakeUppercase[lang=fr]{xyz}}

\end{document}

I wonder if it is still necessary to use a \protected inner command as the whole code is expandable. The typeouts produce

\MakeUppercase    []{xyz}
\MakeUppercase    [lang=fr]{xyz}

rather than XYZ which is more or less compatible behaviour. Previously it was necessary but I'm not sure it would often have been required, and we could probably change this to allow expansion?
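
To make the contrast concrete, here is a minimal sketch (not the code in this PR) comparing a protected wrapper with an expandable one inside \typeout. Both command names are hypothetical, and the sketch assumes a LaTeX release recent enough to provide \NewExpandableDocumentCommand and \text_uppercase:n in the format.

```latex
\documentclass{article}
\ExplSyntaxOn
% Both command names are hypothetical and exist only for this sketch.
% \NewDocumentCommand defines an engine-protected command.
\NewDocumentCommand \DemoProtectedUpper { m }
  { \text_uppercase:n {#1} }
% \NewExpandableDocumentCommand leaves the command expandable.
\NewExpandableDocumentCommand \DemoExpandableUpper { m }
  { \text_uppercase:n {#1} }
\ExplSyntaxOff
\begin{document}
% The protected command should be written to the log unexpanded.
\typeout{\DemoProtectedUpper{xyz}}
% The expandable command should be written as its result, XYZ.
\typeout{\DemoExpandableUpper{xyz}}
\end{document}
```

The second form is what "allow expansion" would amount to for simple input. Whether it can be kept once key handling is added is a separate question.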

The other point shows up in the error message: the keys are in a generic kernel group, but should (probably?) be in a group specific to these commands so that they don't get mixed in with other keys as we add more key-value interfaces to the format.

@josephwright
Member Author

I wonder if it is still necessary to use a \protected inner command as the whole code is expandable. The typeouts produce

\MakeUppercase    []{xyz}
\MakeUppercase    [lang=fr]{xyz}

rather than XYZ which is more or less compatible behaviour. Previously it was necessary but I'm not sure it would often have been required, and we could probably change this to allow expansion?

If we want keys, then working by expansion is a bit more tricky (though doable). If we decide we only want a simple optional argument, that would be easier.

@josephwright
Member Author

The other point shows up in the error message: the keys are in a generic kernel group, but should (probably?) be in a group specific to these commands so that they don't get mixed in with other keys as we add more key-value interfaces to the format.

I was imagining that lang would eventually be a document-wide key that could be modified locally. However, particularly at present, I'm happy to shift to a dedicated tree. But this depends on whether keys are actually a good idea here.
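
For illustration, a dedicated key tree could look roughly like the sketch below. The module path `demo / case`, the variable \l__demo_case_locale_str and the command \DemoUppercase are all hypothetical; the real code lives in the kernel and uses its own names. The sketch also assumes that an empty locale string falls back to the default case mappings.

```latex
\documentclass{article}
\ExplSyntaxOn
% Hypothetical storage for the selected locale; empty when no key is given.
\str_new:N \l__demo_case_locale_str
% Keys live under their own path, so an unknown key is reported
% against "demo/case" rather than a shared kernel-wide group.
\keys_define:nn { demo / case }
  {
    locale .str_set:N = \l__demo_case_locale_str ,
    lang   .meta:n    = { locale = {#1} }
  }
% Make sure the V-type variant used below is available.
\cs_generate_variant:Nn \text_uppercase:nn { V }
% Key processing involves assignments, so this command is not expandable.
\NewDocumentCommand \DemoUppercase { O{} m }
  {
    \group_begin:
      \keys_set:nn { demo / case } {#1}
      \text_uppercase:Vn \l__demo_case_locale_str {#2}
    \group_end:
  }
\ExplSyntaxOff
\begin{document}
\DemoUppercase{xyz}
\DemoUppercase[lang=fr]{xyz}
\end{document}
```

With this layout a misspelled key such as `langg=en` should produce an l3keys error that names the `demo/case` path, which is the separation being asked for in the review.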

@Skillmon
Contributor

Skillmon commented Oct 19, 2022 via email

@josephwright
Member Author

Yay, expkv-cs :P More seriously: Do you expect any more keys than the lang key? I don't see how the case-changers would need so many more keys. So I'd go this route:

\NewExpandableDocumentCommand \MakeUppercase { O{\l__kernel_lang_key_tl} m } { <code> }

Well there's a babel-aware change to make yet, but something like that, yes. (I plan to add a common internal auxiliary today to cover picking up the current language if the optional argument is absent.)
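
A minimal sketch of that simple-optional-argument route follows. It is an illustration only: \DemoUppercase and \l_demo_lang_tl are hypothetical stand-ins (the latter fixed to `en` instead of tracking the current language), and the `<code>` placeholder from the quoted suggestion is filled with one possible body.

```latex
\documentclass{article}
\ExplSyntaxOn
% \l_demo_lang_tl is a hypothetical stand-in for the kernel variable
% that would hold the current language; it is fixed to "en" here.
\tl_new:N \l_demo_lang_tl
\tl_set:Nn \l_demo_lang_tl { en }
% \DemoUppercase is a hypothetical name, used to avoid redefining
% the kernel's \MakeUppercase.
\NewExpandableDocumentCommand \DemoUppercase { O{\l_demo_lang_tl} m }
  {
    % e-type expansion turns the default token list into its content
    % before the language tag reaches \text_uppercase:nn.
    \exp_args:Ne \text_uppercase:nn {#1} {#2}
  }
\ExplSyntaxOff
\begin{document}
\DemoUppercase{xyz}
\DemoUppercase[fr]{xyz}
% Being expandable, the command can also be used inside \typeout.
\typeout{\DemoUppercase[fr]{xyz}}
\end{document}
```

The trade-off is the one discussed in this thread: the command stays expandable, but the optional argument is a bare language tag, with no room for further keys.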

@josephwright
Member Author

If we want keys, then working by expansion is a bit more tricky (though doable). If we decide we only want a simple optional argument, that would be easier.
Yay, expkv-cs :P More seriously: Do you expect any more keys than the lang key?

The feeling was e.g. tagging

@u-fischer
Member

More seriously: Do you expect any more keys than the lang key?

The feeling was e.g. tagging

Yes, I think one should keep the option open for keys to add a structure or an alternate text.

@josephwright
Member Author

Thoughts on the key names would be welcome: I think lang is relatively informal but likely clear, bcp47 is more formal but of course one might argue for BCP47 or BCP-47 instead (I'm assuming a hyphen in place of a space). Personally, I like lang, but I think @jbezos is likely best placed to comment.

@josephwright
Member Author

More seriously: Do you expect any more keys than the lang key?

The feeling was e.g. tagging

Yes, I think one should keep the option open for keys to add a structure or an alternate text.

Indeed, even if we do keys by expansion, any tagging is non-expandable.

@u-fischer
Member

Thoughts on the key names would be welcome: I think lang is relatively informal but likely clear, bcp47 is more formal but of course one might argue for BCP47 or BCP-47 instead (I'm assuming a hyphen in place of a space). Personally, I like lang, but I think @jbezos is likely best placed to comment.

I use lang in \DocumentMetadata (and it takes bcp-tags too as value). I don't like bcp-whatever as -- unless I use it everyday -- I would always have to look it up to find out if it is bcp or bpc or 47 or 67 or 74 ... ;-).

@Skillmon
Contributor

Fully agree with @u-fischer here, KISS for key names. lang or locale, but not the technically more correct ones that no one can remember.

@jbezos
Contributor

jbezos commented Oct 22, 2022

[Very busy these last few days.] It’s ok for me, even lang (which is, after all, what HTML uses), but I think locale is better (I wonder how many average users know what bcp47 means).

This option can be useful in some cases, but from the point of view of localization, the macro, without any changes or options, must upper- or lowercase a string according to the current rules, even if tailored by the user. So, \foreignlanguage{spanish}{\MakeUppercase{á}} prints ‘Á’, but if for some reason I create a new locale named spanish-old with the rule ‘á → A’, then \foreignlanguage{spanish-old}{\MakeUppercase{á}} must print ‘A’.

@josephwright
Member Author

[Very busy these last few days.]

I'd noticed :)

It’s ok for me, even lang (which is, after all, what HTML uses), but I think locale is better (I wonder how many average users know what bcp47 means).

I'd seen that HTML uses lang, and I had wondered about locale as that is more precise. Probably those two as equivalent keys, then? (I think we've agreed that BCP47 is technically accurate but not useful to users.)

This option can be useful in some cases, but from the point of view of localization, the macro, without any changes or options, must upper- or lowercase a string according to the current rules, even if tailored by the user. So, \foreignlanguage{spanish}{\MakeUppercase{á}} prints ‘Á’, but if for some reason I create a new locale named spanish-old with the rule ‘á → A’, then \foreignlanguage{spanish-old}{\MakeUppercase{á}} must print ‘A’.

Indeed: that is, I hope, covered by 639e761, which is part of the PR. That change should mean that \Make...case picks up the babel locale. (I haven't done polyglossia yet, in part because I don't really have a good way to obtain locale info by expansion there.)
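
As a usage sketch of the behaviour described here, assuming the `lang` key name discussed in this thread and a kernel that includes this PR, the three cases would look like this. The choice of Spanish with English as the main language is only an example.

```latex
\documentclass{article}
\usepackage[spanish,english]{babel}
\begin{document}
% No option: the rules of the current (main) language apply.
\MakeUppercase{á}

% Inside \foreignlanguage the babel locale should be picked up.
\foreignlanguage{spanish}{\MakeUppercase{á}}

% An explicit BCP-47 tag in the optional argument.
\MakeUppercase[lang=es]{á}
\end{document}
```

The second line is the case raised above: a locale tailored by the user should be honoured without any optional argument being given.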

Member

@FrankMittelbach FrankMittelbach left a comment


looks fine to me with some minor comments

base/ltfinal.dtx Outdated
Comment on lines 1121 to 1118
lang .meta:n = { locale = {#1} } ,
locale .str_set:N = \reserved@a
Member


I wonder what the predominant use will be. My guess is "lang", not "locale", so perhaps the .meta should be the other way around (not really important though).

base/ltfinal.dtx Outdated
Comment on lines 1114 to 1116
% The odd use of \emph{three} spaces here is needed as \pkg{ltcmd} uses the
% name with one and two spaces to give a `friendly' error message for a runaway
% argument: that means we can't use it here.
Member


I'm probably dense (or the diff is not showing what I need) but which 3 spaces are we talking about?

Comment on lines +1152 to +1154
\cs_generate_variant:cn { text_ \str_lowercase:n {#1} case:nn } { V }
\cs_new_protected:cpx { Make#1case \c_space_tl \c_space_tl \c_space_tl } [##1] ##2
Member


ahh these, so perhaps the comment should really be here
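
To show what the three spaces refer to, the sketch below builds an inner command whose name ends in three space characters, in the same way as the quoted \cs_new_protected:cpx line. \DemoCase is a hypothetical name and the wrapper is a simplification, not the kernel's definition.

```latex
\documentclass{article}
\ExplSyntaxOn
% Literal spaces are ignored under \ExplSyntaxOn, so \c_space_tl is
% used to put three real space characters at the end of the name.
\cs_new_protected:cpn
  { DemoCase \c_space_tl \c_space_tl \c_space_tl } [#1] #2
  { \text_uppercase:nn {#1} {#2} }
% Hypothetical expandable front end that hands over to the inner command.
\NewExpandableDocumentCommand \DemoCase { O{en} m }
  { \use:c { DemoCase \c_space_tl \c_space_tl \c_space_tl } [#1] {#2} }
\ExplSyntaxOff
\begin{document}
\DemoCase[fr]{xyz}
% In \typeout the protected inner command is not expanded, so its
% name, including the trailing spaces, should appear in the log.
\typeout{\DemoCase[fr]{xyz}}
\end{document}
```

This is the same effect seen in the \typeout lines earlier in the review, where \MakeUppercase is followed by a run of spaces before the bracket.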
