Clarify single character of mi as italic #231

Closed
bert-github opened this issue Apr 9, 2024 · 11 comments
Labels
i18n-needs-resolution Issue the Internationalization Group has raised and looks for a response on.

Comments

@bert-github
Contributor

(This is part of the I18n WG review.)

Example
https://www.w3.org/TR/2023/WD-mathml-core-20231127/#mi-example

Note that identifiers containing a single letter are italic by default.

A minor issue is that the example uses the word ‘letter’, suggesting that <mi>a9</mi> would also be italic, because it only contains one letter. But that is maybe not worth fixing, as it is only an example and the normative text below talks about ‘character’.

The transformation to italic is via text-transform: math-auto (section 4.2)
https://www.w3.org/TR/2023/WD-mathml-core-20231127/#new-text-transform-values

On text nodes containing a single character, if the computed value is math-auto then the transformed text is obtained by performing conversion of each character according to the italic table.

This does not mention that white space is collapsed before the characters are counted. E.g., <mi> a </mi> counts as a single character. The MathML3 spec defines that (in 2.1.7 Collapsing Whitespace in Input), but it should probably be recalled here.

Also, the word ‘each’ when we know there is only one could be confusing.
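To make the whitespace point concrete, here is a minimal sketch of the MathML 3 style collapsing rule and how it changes the character count for `<mi> a </mi>`. The function names are hypothetical, and the set of whitespace characters (space, tab, newline, carriage return) is my reading of MathML 3 section 2.1.7, not something taken from MathML Core.

```python
import re

# Assumption: MathML 3 whitespace is U+0020, U+0009, U+000A and U+000D.
_MATHML_SPACE = " \t\n\r"
_SPACE_RUN = re.compile(r"[ \t\n\r]+")


def collapse_whitespace(text: str) -> str:
    """Trim leading/trailing whitespace and collapse inner runs to one space."""
    return _SPACE_RUN.sub(" ", text.strip(_MATHML_SPACE))


def is_single_character(text: str, collapse: bool) -> bool:
    """Count characters (code points) with or without collapsing first."""
    if collapse:
        text = collapse_whitespace(text)
    return len(text) == 1


# Content of <mi> a </mi>
print(is_single_character(" a ", collapse=True))   # True
print(is_single_character(" a ", collapse=False))  # False
```

Whether the count happens before or after collapsing decides whether ` a ` qualifies for the italic treatment.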

bert-github added the i18n-needs-resolution label Apr 9, 2024
@fred-wang
Contributor

> A minor issue is that the example uses the word ‘letter’

right, the non-normative text should probably use "character" too

> This does not mention that white space is collapsed before the characters are counted

This is on purpose, since the current version of MathML Core does not perform the whitespace collapsing from MathML 3 (there are other issues about that).

> Also, the word ‘each’ when we know there is only one could be confusing.

This is probably legacy stuff from the time when other mathvariants were supported e.g. <mi mathvariant="bold">sin</mi> where we had to convert each character one by one.
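As an illustration of that legacy per-character conversion, here is a rough sketch of what mapping `sin` to bold one character at a time could look like. The function name is hypothetical, and only ASCII letters and digits are handled; the offsets are the starting code points of the bold ranges in the Unicode Mathematical Alphanumeric Symbols block.

```python
def to_math_bold(text: str) -> str:
    """Map each ASCII letter or digit to its mathematical bold counterpart."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr(0x1D400 + ord(ch) - ord("A")))  # MATHEMATICAL BOLD CAPITAL A..Z
        elif "a" <= ch <= "z":
            out.append(chr(0x1D41A + ord(ch) - ord("a")))  # MATHEMATICAL BOLD SMALL A..Z
        elif "0" <= ch <= "9":
            out.append(chr(0x1D7CE + ord(ch) - ord("0")))  # MATHEMATICAL BOLD DIGIT ZERO..NINE
        else:
            out.append(ch)  # anything else is left untouched in this sketch
    return "".join(out)


print(to_math_bold("sin"))
```

With only math-auto left, the text node has a single character, so the loop degenerates to one lookup.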

@SmashManiac

I'm not sure if replacing "letter" with "character" is the way to go here. I am not aware of any existing MathML renderer where the infinity symbol or an ellipsis is rendered in italic inside a <mi> by default, nor would I intuitively expect them to based on ISO 80000-2 rules.

@dginev

dginev commented Aug 19, 2024

Related: Unicode combining characters can also be used to modify a letter variable name, as with circumflex, while traditionally expecting an italic rendering, matching the unmodified letter.

  • For example U+0302 to create a variable name <mi>x̂</mi>.
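A small sketch of why this case is awkward, assuming "single character" is counted in code points (an assumption on my part): `x` followed by U+0302 is two code points, and NFC normalization does not help because Unicode has no precomposed x with circumflex. The helper name is hypothetical.

```python
import unicodedata


def code_point_count(text: str, normalize: bool = False) -> int:
    """Number of code points, optionally after NFC normalization."""
    if normalize:
        text = unicodedata.normalize("NFC", text)
    return len(text)


x_hat = "x\u0302"  # LATIN SMALL LETTER X + COMBINING CIRCUMFLEX ACCENT
a_hat = "a\u0302"  # composes to U+00E2 under NFC

print(code_point_count(x_hat))                  # 2
print(code_point_count(x_hat, normalize=True))  # 2, no precomposed form exists
print(code_point_count(a_hat, normalize=True))  # 1
```

Under that counting, a text node holding `x̂` would not be treated as a single character.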

I agree MathML Core should be as clear as possible what is and isn't meant to be covered by the italic treatment.

Possibly 4.2 New text-transform value does that best:

On text nodes containing a single character, if the computed value is math-auto then the transformed text is obtained by performing conversion of each character according to the italic table.

Maybe a few extra words on the intended coverage of the italic table would help? Reading through it today shows that "letter" currently means what is more traditional for math: a "Latin or Greek letter".

@SmashManiac

I just realized that MathML 3 currently specifies that all mi elements containing a single character default to a mathvariant of italic. So while the original suggestion would align with full MathML 3, it doesn't seem to match what's currently being rendered on the web right now for some reason.

I would speculate that this may be because non-letter characters generally don't have corresponding italic characters in Unicode so existing renderers don't know what to do?

In any case, it would be nice to clarify this, especially when considering that there are many cases where one would NOT want a one-letter mi element in italic, such as mathematical constants, well-known function names and abstract identifiers.

@fred-wang
Contributor

So just to repeat, the text mentioning "single letter" is a non-normative example; it's just meant to say that the two <mi>c</mi> are italic by default and that mathvariant="normal" overrides that.

The normative text is the one quoted by Delan: the UA stylesheet has mi { text-transform: math-auto; } by default, and mathvariant="normal" is treated as a presentational hint for text-transform: none. The exact behavior is then defined in the text-transform and "italic mappings" sections.
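For readers following along, that chain of decisions can be sketched roughly as below. This is an illustrative model, not the spec algorithm: the function names are hypothetical, `ITALIC_SAMPLE` is a two-entry excerpt standing in for the full italic mappings table, and the case-insensitive match on "normal" is an assumption.

```python
from typing import Optional

# Tiny excerpt of the italic mappings, for illustration only.
ITALIC_SAMPLE = {
    "c": "\U0001D450",  # MATHEMATICAL ITALIC SMALL C
    "x": "\U0001D465",  # MATHEMATICAL ITALIC SMALL X
}


def text_transform_for_mi(mathvariant: Optional[str]) -> str:
    """UA default is math-auto; mathvariant="normal" maps to none."""
    if mathvariant is not None and mathvariant.lower() == "normal":
        return "none"
    return "math-auto"


def render_mi_text(text: str, mathvariant: Optional[str] = None) -> str:
    """Apply the italic mapping only to single-character text under math-auto."""
    if text_transform_for_mi(mathvariant) == "math-auto" and len(text) == 1:
        return ITALIC_SAMPLE.get(text, text)  # unmapped characters stay as-is
    return text


print(render_mi_text("c"))                        # italic c
print(render_mi_text("c", mathvariant="normal"))  # plain c
print(render_mi_text("sin"))                      # unchanged, more than one character
```

A character with no entry in the table, such as ∞, comes back unchanged even though it is a single character.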

fred-wang added a commit that referenced this issue Sep 17, 2024
- Provide more explanation in the mi example and add an example of
  a character not mapped to italic.
- Be more explicit about how the "italic mappings" is used by
  text-transform to convert a character.
- Remove confusing "each character" which was a legacy thing for
  other mathvariant values applying on text nodes with an arbitrary
  number of characters.

Current behavior is unchanged and covered by WPT tests.
@fred-wang
Contributor

I pushed a commit that does not change the behavior and just tries to be more explicit. Hopefully that addresses all the issues reported here.

For the record #149 is the issue about trimming whitespace characters.

@davidcarlisle
Collaborator

> I just realized that MathML 3 currently specifies that all mi elements containing a single character default to a mathvariant of italic. So while the original suggestion would align with full MathML 3, it doesn't seem to match what's currently being rendered on the web right now for some reason.
>
> I would speculate that this may be because non-letter characters generally don't have corresponding italic characters in Unicode so existing renderers don't know what to do?
>
> In any case, it would be nice to clarify this, especially when considering that there are many cases where one would NOT want a one-letter mi element in italic, such as mathematical constants, well-known function names and abstract identifiers.

The MathML3 status is:

A single-character mi (after space stripping) defaults to mathvariant=italic.

By default, mathvariant=italic has no effect (does not make text italic) except on the characters listed at

https://www.w3.org/TR/xml-entity-names/italic.html

mathvariant is not a font change; it's a code point shift to the Unicode math italic block, so it only has an effect on characters that have a counterpart in that block.
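A minimal sketch of that code point shift, limited to ASCII Latin letters; the Greek letters and the few symbols in the linked table are deliberately left out, and the function name is hypothetical. The one irregular entry in this range is `h`, whose italic form lives at U+210E because the slot in the math italic range is reserved.

```python
def to_math_italic(ch: str) -> str:
    """Shift one ASCII Latin letter to its mathematical italic code point."""
    if ch == "h":
        return "\u210E"  # PLANCK CONSTANT serves as italic small h
    if "A" <= ch <= "Z":
        return chr(0x1D434 + ord(ch) - ord("A"))  # MATHEMATICAL ITALIC CAPITAL A..Z
    if "a" <= ch <= "z":
        return chr(0x1D44E + ord(ch) - ord("a"))  # MATHEMATICAL ITALIC SMALL A..Z
    return ch  # no counterpart in this sketch: digits, ∞, … stay unchanged


for ch in ["a", "h", "9", "\u221E", "\u2026"]:
    mapped = to_math_italic(ch)
    print(f"U+{ord(ch):04X} -> U+{ord(mapped):04X}")
```

Because the result is a different code point rather than a styled glyph, characters without a target simply pass through.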

There are some words hinting that systems may use CSS or font changes to style other characters, but this is not guaranteed (and I think in Core it should not be automatic, although a document may supply its own CSS of course).

MathML3 section 3.2.1:

Renderers should support those combinations of character data and mathvariant values that correspond to Unicode characters, and that they can visually distinguish using available font characters. Renderers may ignore or support those combinations of character data and mathvariant values that do not correspond to an assigned Unicode code point, and authors should recognize that support for mathematical symbols that do not correspond to assigned Unicode code points may vary widely from one renderer to another.

@omentic

omentic commented Sep 17, 2024

FYI @fred-wang the latest commit failed to deploy, and https://w3c.github.io/mathml-core/ currently results in a 404.

@davidcarlisle
Collaborator

@omentic oh that's odd, there was no error from respec, but one stage of the gh action timed out for some reason. I just forced a rebuild and it's there now, thanks.

@SmashManiac

Thank you very much for the clarifications! I had not previously realized that the italics mapping table was normative as all other mapping tables in section C aren't, nor that the italics mappings covered all possible CSS character substitutions. I personally find that @fred-wang's commit eliminated that particular "letter vs character" confusion for me as an external MathML Core user.

Note that I'm not currently in a position to comment on whether additional clarifications should be made or not.

@bkardell
Collaborator

Given fred's push ("Clarify single character of mi as italic"), I think the issue is resolved and am closing it.


7 participants