Why is there a requirement that 'cvXX' feature has the same number of variants for all glyphs? #778
Comments
I am very interested in this topic and hope it won't be forgotten. I suppose the purpose of this requirement is to encourage font makers to design rationally, so there should be matching variants on all design axes. An example posted in a discussion of my JuniusX font was the way Charis SIL handles v with hook, where there are matching variants for lowercase, uppercase, modifier, and small cap on cv62:
But I work in the Middle Ages, where the material doesn’t always cooperate. For example, the Medieval Unicode Font Initiative (MUFI) specification has upper- and lowercase variants of “Insular A,” but not of “Uncial A,” “Open-Top A,” etc. A designer could supply the missing variants so as to make the cvNN features correct, but that would be a little foolish, since the reason they don’t appear in the MUFI spec is that they don’t occur in medieval manuscripts, and so they would (probably) never get used. I tied myself in knots trying to come up with a way to conform to the OpenType spec while dealing with the MUFI situation, and ended up pleasing no one, so for now JuniusX doesn’t conform to the OpenType spec. I’d like to know if the font will break in some future software environment.
The requirement is that for each input glyph in an individual cvxx feature, there should be the same number of output variant glyphs. The rationale is twofold: a) the features are intended for variants of individual characters, not for sets of characters (for which the ssxx features would be appropriate), so they are expected to be used for, e.g., uppercase A and its diacritic forms, and not for both upper- and lowercase characters unless they happen to follow the same pattern of variants; and b) having the same number of variants for each input glyph in the feature allows for the feature to be applied across a body of text with the same enumerated variant producing predictable results on all glyphs affected by the feature. If you find yourself dealing with characters that have different numbers of variants, that is an indication that they do not belong in the same Character Variant feature.
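The same-count rule described above can be checked mechanically. Here is a minimal Python sketch, assuming a cvxx feature is represented as a plain dictionary from input glyph name to its list of alternates. The glyph names and the `variant_counts_match` helper are hypothetical and not part of any font tool's API.

```python
def variant_counts_match(feature):
    """Return True if every input glyph in the feature has the same
    number of alternates (an empty feature trivially passes)."""
    counts = {len(alternates) for alternates in feature.values()}
    return len(counts) <= 1


# Hypothetical glyph names, for illustration only.
cv01_uniform = {
    "A": ["A.cv01a", "A.cv01b"],
    "Aacute": ["Aacute.cv01a", "Aacute.cv01b"],
}

# Upper- and lowercase mixed in one feature, with unequal variant sets.
cv01_mixed = {
    "A": ["A.alt1", "A.alt2", "A.alt3"],
    "a": ["a.alt1", "a.alt2", "a.alt3", "a.alt4", "a.alt5"],
}

print(variant_counts_match(cv01_uniform))
print(variant_counts_match(cv01_mixed))
```

A check like this only reports the mismatch; whether to split the feature is still a design decision.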
@khaledhosny Can you provide examples of your use of cvxx features with different numbers of variants for the input glyphs in a single cvxx feature? I’m having trouble understanding what sort of things would go into such a feature. It may be that there is a gap between cvxx and ssxx in terms of design (ignoring the practical issue of the limited number of registered ssxx features). The cvxx features are defined in terms of input (‘What character-related glyphs share one or more variant forms?’), and the ssxx features are defined in terms of output (‘What set of non-character-related glyphs share a related variant form?’). There isn’t an obvious place to answer ‘What set of non-character-related glyphs share one or more variant forms?’
Thanks, John, this is very helpful. I was taking the most permissive possible understanding of this passage:
But it's going to be more complicated in some cases, and I have a concern about how all this is going to look to a user. For example, here are all the variants for Aa in the MUFI spec: But to be fair, what I've got now is also complicated, maybe irrational, and I care a good bit about your second rationale, allowing "for the feature to be applied across a body of text with the same enumerated variant producing predictable results on all glyphs affected by the feature." There doesn't appear to be a solution lacking a downside to the problems presented by (e.g.) MUFI.
I think you only need two cvxx features: one for uppercase A with three variants and one for lowercase a with five variants. Yes, one of the variants in each case happens to be of similar form, but this doesn’t mean you must put them in the same feature, only that you could, all else being equal. In this case, I would say that the non-equivalent variant sets override the option to put upper- and lowercase in the same cvxx feature.
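As a rough illustration of that split, the Python sketch below groups input glyphs by how many alternates they have, so that each group could become its own cvxx feature. The glyph names and the `group_by_variant_count` helper are hypothetical. Grouping by count alone is only a heuristic: the criterion given in this thread is that the glyphs in one feature relate to the same character.

```python
from collections import defaultdict


def group_by_variant_count(alternates):
    """Group input glyphs by their number of alternates.

    Returns a dict mapping each variant count to the sub-mapping of
    glyphs that have exactly that many alternates.
    """
    groups = defaultdict(dict)
    for glyph, alts in alternates.items():
        groups[len(alts)][glyph] = alts
    return dict(groups)


# Hypothetical glyph names: uppercase A with three variants,
# lowercase a with five.
mixed_A_a = {
    "A": ["A.alt1", "A.alt2", "A.alt3"],
    "a": ["a.alt1", "a.alt2", "a.alt3", "a.alt4", "a.alt5"],
}

for count, glyphs in sorted(group_by_variant_count(mixed_A_a).items()):
    print(count, sorted(glyphs))
```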
It's a recommendation, not a requirement. As such, no implementation should ever be enforcing this. The feature was recorded as having been registered by Microsoft, but it was originally requested by SIL. @tiroj has captured the gist of the reasoning behind the recommendation: When SIL was creating fonts with broad Latin coverage (Gentium, Doulos, etc.), they had glyph variants for several characters, including cases in which, say, "a" (my hypothetical) plus all of the precomposed accented forms had a systematic set of variants; or "b" and its barred forms; etc. For the sake of discussion, let me refer to these as shared-base sets. And they knew that, for some language's orthography, some mix of these variants would be preferred, but they couldn't predict what languages might need what mix. So they wanted a set of features that could be defined on a per-shared-base-set basis, with the feature(s) applied across entire documents. That way, e.g., "a" and any of its precomposed accented forms would get the appropriate variants throughout the document. If anyone thinks the description of the features can be improved, please suggest wording.
Thanks, @PeterConstable, for the explanation, and for the interesting background re: SIL. I don't know if there's a widely shared understanding of words like "should" in technical specifications. Several of us have misunderstood the sentence quoted by @khaledhosny above as stating a requirement, not a recommendation. Perhaps there would be less misunderstanding if it read, "It is recommended that the number of variants be identical for all glyphs within each 'cvXX' feature."
@PeterConstable Is it defined anywhere what the behaviour should be if the number of enumerated variants in a cvxx feature is not the same for the input glyphs in the lookup? So, for example, if these lower- and uppercase variants were included in a single cvxx feature, what would be the correct output if a block of text containing both |a| and |A| characters were selected and the character variant feature applied with the enumerated variant 4 or 5? Which form of the uppercase |A| should be displayed in that situation: the default form, because there is no corresponding enumerated variant, or enumerated variant 3, because that is the highest, closest enumerated variant for that character?
In general, I'd say there's a widely shared understanding: "should" is used for recommendations; requirements are typically stated with words like "must" or "shall". The OT spec consistently (AFAIK) uses "must" and "should" in these ways. Saying "it is recommended" for every recommendation would be verbose.
If it's not stated in the feature description, or logically implied, then it's not defined anywhere. IIRC, the idea SIL was after was that entire documents could be marked up to indicate particular alternates from particular 'cvXX' features. E.g., for 'cv01', use the fourth alternate throughout. Two lines of reasoning follow:
In this case no revision is necessary, if a recommendation is your intention. But you seem also to be saying that there is no defined behavior when a font doesn't follow the recommendation, so I wonder if it should be "must" or "shall." In the cases @tiroj is talking about, HarfBuzz displays the default character instead of the last in the sequence, but I don't know about the other engines.
If a font was created with a different number of alternates within a given 'cvXX' feature, it could be workable if the content author knew the details and marked up individual runs as needed rather than the entire document. Still workable, but not the ideal.
It's sounding deeply inadvisable. This is the kind of guidance I was hoping for. Thanks!
I think that is the most sensible approach and a good general principle: if the requested enumerated variant of a glyph is not available in the font, display the default form. This produces predictable results not only in the case under discussion here but also if a user changes to a different font.
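The fallback principle above can be sketched in a few lines of Python. This is an illustrative model only, not how any shaping engine is actually implemented: the feature is assumed to be a dictionary from input glyph name to its list of alternates, the variant index is assumed to be 1-based with 0 meaning "feature off", and the glyph names and both helper functions are hypothetical.

```python
def select_variant(glyph, feature, index):
    """Return the requested alternate for `glyph`, or the default glyph
    when the feature has no alternate at that 1-based index."""
    alternates = feature.get(glyph, [])
    if 1 <= index <= len(alternates):
        return alternates[index - 1]
    return glyph  # requested variant not available: keep the default form


def apply_feature(glyphs, feature, index):
    """Apply the same enumerated variant across a whole run of glyphs."""
    return [select_variant(g, feature, index) for g in glyphs]


# Hypothetical feature mixing uppercase A (three variants)
# and lowercase a (five variants).
cv_mixed = {
    "A": ["A.alt1", "A.alt2", "A.alt3"],
    "a": ["a.alt1", "a.alt2", "a.alt3", "a.alt4", "a.alt5"],
}

# Variant 4 exists for "a" but not for "A", so "A" stays in its default form.
print(apply_feature(["A", "a", "b"], cv_mixed, 4))
```

With this fallback, a request for variant 4 never silently substitutes variant 3. The alternative raised earlier in the thread (clamping to the highest available variant) would replace the final `return glyph` with a return of the last alternate.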
Thanks, everyone. I think I now have a clearer understanding of how cvXX features are supposed to be used. I think I was using them more like
To help clarify, the following revision is proposed for OT 1.9:
Addressed in OpenType 1.9. Closing.
Quoting the spec:
That requirement seems peculiar. What is the rationale behind it, and is there any implementation that actually enforces it? I have built several fonts with cvXX features and hadn’t paid attention to this requirement. Do I need to revise my fonts?