[css-inline-3] top metrics for non-Western non-CJK writing systems with obvious top edge #5244

fantasai · 2020-06-19T04:08:54Z

[This issue has been annotated in the spec for a while, but doesn't seem to have a corresponding GH issue, so filing one here.]

Both Thai and Hebrew are writing systems with strong top edges (similar to Latin/CJK). But while OpenType defines multiple top edge metrics (cap-height, x-height, ideographic, and hanging), none of these necessarily coincide with the Hebrew or Thai top metrics, which in a given font will often fall somewhere between the x-height and the cap-height, but not consistently in the same place across fonts.

If initial-letter-align and text-edge are to treat all writing systems as equal citizens of the Web, we need metrics for them in OpenType, and we need values for them in CSS that will select those metrics.

Note: See also CSSWG OpenType liaison statement.


faceless2 · 2020-07-28T16:03:04Z

I don't think we can realistically expect the baseline table to provide this information, now or in the immediate future. Even if baselines for the world's scripts were added in the next OpenType revision, every font would need updating before they could be used. I also don't think we should be itemising every one of these baselines as a list of idents that can be set in initial-letter-align.

Pulling the metrics from the ink bounds of a representative glyph for the script seems like the best option - this is already proposed for Hebrew. I did some testing - first, here are the results of using "cap-height" and the alphabetic baseline for the 16 Hebrew fonts at fonts.google.com, plus Noto Sans and Noto Sans Serif. This is what we'll get if we get rid of the "hebrew" keyword and fall back to the default:

Awful. Lots of glyphs have big gaps at the top, which (in our implementation) currently causes the first line to run flush to the margin. Next, the top alignment point is taken from the horizontal center of the ink-outline of U+05BE (Hebrew Maqaf) as suggested in the spec:

Better, but not so great. But using the ink-top in the horizontal center of U+05D4 (Hebrew He) works really well:

So I think in general the idea of pulling alignment points from glyph outlines is a good one. We're already making use of glyph outlines in initial-letter anyway. So long as each script uses the same mechanism (i.e. choosing the point in the horizontal (or vertical) center at the appropriate edge of the glyph outline), then adding new scripts is no more than determining which glyph is representative. After the initial implementation, that should be fairly low cost both for developers, and for anyone wanting to propose a new script.
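To make the mechanism concrete, here is a minimal sketch of "ink-top at the horizontal center" for a single glyph. It is not taken from any implementation: it assumes the outline has already been flattened into closed polygons in font units with y pointing up (real outlines contain curves), and the name `ink_top_at_center` is hypothetical.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Contour = List[Point]  # closed polygon; last point connects back to the first


def ink_top_at_center(contours: List[Contour]) -> Optional[float]:
    """Highest y at which the vertical line through the horizontal
    center of the glyph's bounding box crosses the outline."""
    xs = [x for contour in contours for x, _ in contour]
    if not xs:
        return None  # empty glyph, nothing to measure
    cx = (min(xs) + max(xs)) / 2.0

    top: Optional[float] = None
    for contour in contours:
        n = len(contour)
        for i in range(n):
            (x0, y0), (x1, y1) = contour[i], contour[(i + 1) % n]
            if not (min(x0, x1) <= cx <= max(x0, x1)):
                continue  # segment does not reach the center line
            if x0 == x1:
                y = max(y0, y1)  # vertical segment lying on the center line
            else:
                t = (cx - x0) / (x1 - x0)
                y = y0 + t * (y1 - y0)  # linear interpolation along the segment
            top = y if top is None else max(top, y)
    return top


# A bar whose top is at 600, with a flourish on the left rising to 800.
flourish = [(0, 0), (600, 0), (600, 600), (100, 600), (100, 800), (0, 800)]
center_top = ink_top_at_center([flourish])
```

Measuring at the center ignores the flourish, whereas the top of the bounding box would not. The same function applied to the bottom edge (taking the minimum instead of the maximum) would give the "under" point.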

To that end I'd suggest we consider something like initial-letter-align: auto, which would determine the Unicode script from the first non-common character of the text following the initial letter, exactly as we're doing now for the script inside the initial-letter. We can then simply (and briefly) list the alignment points for each script, e.g.

Latn: over=cap-height, under=alphabetic baseline.
Hebr: over=U+05D4, under=alphabetic baseline.
Deva: over=BASE.hang or U+0915 if not defined, under=alphabetic baseline.
Beng: over=BASE.hang or U+0995 if not defined, under=alphabetic baseline.
Hans, Hant: over=BASE.icft or U+6C38 if not defined, under=BASE.icfb or U+6C38 if not defined.
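The list above could be encoded as a small lookup table with a metric-then-glyph fallback. The following is only a sketch of that idea: the table layout, the `metrics` dictionary keyed by metric name, and the `measure_glyph` callback are all assumptions made for illustration, not part of any spec or font API.

```python
from typing import Callable, Dict, Optional, Tuple

# (font metric name or None, fallback code point or None) for one edge.
Rule = Tuple[Optional[str], Optional[int]]

_CJK: Dict[str, Rule] = {"over": ("icft", 0x6C38), "under": ("icfb", 0x6C38)}

ALIGNMENT_POINTS: Dict[str, Dict[str, Rule]] = {
    "Latn": {"over": ("cap-height", None), "under": ("alphabetic", None)},
    "Hebr": {"over": (None, 0x05D4), "under": ("alphabetic", None)},
    "Deva": {"over": ("hang", 0x0915), "under": ("alphabetic", None)},
    "Beng": {"over": ("hang", 0x0995), "under": ("alphabetic", None)},
    "Hans": _CJK,
    "Hant": _CJK,
}


def resolve_edge(
    script: str,
    edge: str,  # "over" or "under"
    metrics: Dict[str, float],  # metrics the font actually provides (assumed shape)
    measure_glyph: Callable[[int, str], Optional[float]],  # hypothetical outline measurer
) -> Optional[float]:
    """Prefer the font's own metric; otherwise measure the representative glyph."""
    metric, codepoint = ALIGNMENT_POINTS[script][edge]
    if metric is not None and metric in metrics:
        return metrics[metric]
    if codepoint is not None:
        return measure_glyph(codepoint, edge)
    return None  # no metric and no fallback glyph listed for this edge


def fake_measure(codepoint: int, edge: str) -> Optional[float]:
    # Stand-in for a real outline measurement; values are made up.
    return 560.0 if edge == "over" else -20.0


hebrew_top = resolve_edge("Hebr", "over", {"alphabetic": 0.0}, fake_measure)
deva_top = resolve_edge("Deva", "over", {"hang": 610.0}, fake_measure)
```

Adding a script then amounts to adding one row to the table, which is the low-cost property argued for above.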

If further control over which baseline to select is required (and, the more I think about this, the more I doubt it is) then perhaps something like initial-letter-align: [alphabetic | hanging | ideographic] || [<string> <string>?] - to let you select a baseline pair as we do now, and/or specify a glyph (or under and over glyphs) directly in case the baseline isn't available.
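For illustration, the value syntax floated above can be checked with a few lines of Python. This is a sketch of the proposed grammar only, under simplifying assumptions: strings are quoted without escapes, the `Align` tuple and `parse_align` are hypothetical names, and the two glyph slots are called `first_glyph` and `second_glyph` because the proposal does not pin down their order.

```python
import re
from typing import List, NamedTuple, Optional

KEYWORDS = {"alphabetic", "hanging", "ideographic"}
TOKEN = re.compile(r'"[^"]*"|\'[^\']*\'|\S+')


class Align(NamedTuple):
    baseline: Optional[str]
    first_glyph: Optional[str]
    second_glyph: Optional[str]


def _is_string(tok: str) -> bool:
    return len(tok) >= 2 and tok[0] in "\"'" and tok[-1] == tok[0]


def parse_align(value: str) -> Optional[Align]:
    """Parse `[alphabetic | hanging | ideographic] || [<string> <string>?]`.

    Returns None for anything that does not match the grammar.
    """
    tokens = TOKEN.findall(value)
    baseline: Optional[str] = None
    glyphs: Optional[List[str]] = None
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in KEYWORDS and baseline is None:
            baseline = tok
            i += 1
        elif _is_string(tok) and glyphs is None:
            # One or two adjacent strings form the glyph group.
            glyphs = []
            while i < len(tokens) and _is_string(tokens[i]) and len(glyphs) < 2:
                glyphs.append(tokens[i][1:-1])
                i += 1
        else:
            return None  # repeated component or unknown token
    if baseline is None and glyphs is None:
        return None  # `||` requires at least one component
    first = glyphs[0] if glyphs else None
    second = glyphs[1] if glyphs and len(glyphs) > 1 else None
    return Align(baseline, first, second)


with_fallback = parse_align('hanging "\u0915"')
```

Because `||` allows either order, `"a" "b" ideographic` parses the same way as `ideographic "a" "b"`, while a second keyword or a third string is rejected.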

fantasai · 2020-08-13T05:03:30Z

Please, let's not mix up this issue, which is about finding metrics for a given script, with the issue of whether the question of “which script” should be automatically determined. There is enough complexity in just this one issue.

fantasai · 2020-08-13T05:13:35Z

> So I think in general the idea of pulling alignment points from glyph outlines is a good one.

Using glyph outlines is an acceptable heuristic for simply-styled fonts, and if UAs want to implement that I would be thrilled. But it is not as good as if the font designer sets the metric themselves. The font designer can account for the effects of flourishes, stroke variations, and other artistic effects correctly. We can only guess that the middle of the glyph is the least likely to be affected by such things, and try to pick a character that has a wide target to measure. So while measuring the glyph is a great tactic for handling fonts and font formats that don't have relevant metrics, that doesn't mean the need for metrics goes away.

As for maintaining a database of ideal glyphs to measure for these things... that should definitely not be the job of the CSS specs. We could make a jointly-maintained registry with i18n for the time being. But ideally I think Unicode and OpenType should be collaborating on this. There should be optional metrics in OpenType to provide this info; there should be defined fallback heuristics based on glyph outlines for when the font is missing those metrics so that implementers have a reference for all the scripts they are unfamiliar with; and the CSSWG should not be the ones maintaining this heuristics registry.

fantasai added the i18n-tracker (Group bringing to attention of Internationalization, or tracked by i18n but not needing response) and i18n-sealreq (Southeast Asian language enablement) labels Jun 19, 2020

fantasai mentioned this issue Jun 19, 2020

[css-inline-3] Drop 'hebrew' alignment from initial-letter-align #5208

Closed

frivoal added the css-inline-3 Current Work label Jun 19, 2020

w3cbot mentioned this issue Jun 19, 2020

[css-inline-3] top metrics for non-Western non-CJK writing systems with obvious top edge w3c/i18n-activity#931

Open

r12a added the i18n-hlreq Hebrew language enablement label Jun 26, 2020

This was referenced Jul 19, 2020

[css-inline] vertically align to middle of cap height #4707

Open

[css-inline] Leading control at start/end of block #3240

Open

faceless2 mentioned this issue Jul 27, 2020

[css-inline-3] initial-letter sizing for non-western scripts #5366

Open

css-meeting-bot mentioned this issue Jul 28, 2020

[css-inline] alignment of initial-letter for South Asian scripts without hanging baseline #864

Open

faceless2 mentioned this issue Aug 5, 2020

[css-inline-3] Add new value "auto" for initial-letter-align #5398

Open

fantasai added a commit that referenced this issue Aug 13, 2020

[css-inline-3] Use Hebrew He instead of maqaf per #5244 (comment)

7bde09f

fantasai mentioned this issue Aug 13, 2020

[css-inline-3] Wrong Unicode code point for Hebrew maqaf (U+05B3) #5337

Closed

faceless2 mentioned this issue Sep 16, 2020

Interoperable font metrics via explicit font metrics overrides #4792

Closed

This was referenced Apr 6, 2021

Allow font-size-adjust to be used in @font-face block w3c/font-text-cg#51

Open

[css-fonts] Any proposals to make CSS font-size-adjust work better for all scripts? #4540

Open

fantasai added the topic: text edge control label Nov 14, 2022
