Why exclude hebrew and arabic proportional reference fonts? #237

Closed
r12a opened this issue May 25, 2017 · 16 comments · Fixed by #245
Comments

@r12a

r12a commented May 25, 2017

A. Reference Fonts
https://www.w3.org/TR/ttml-imsc1.0.1/#reference-fonts

proportionalSansSerif
All code points specified in B. Recommended Character Sets, excluding the code points defined for Hebrew and Arabic scripts.
Arial or Helvetica or Liberation Sans

Why are codepoints for hebr and arab excluded from the proportional font list? Actually, monospaced fonts are particularly problematic for Arabic script text, since they create an appearance of baseline stretching between narrow glyphs, and can make it difficult to render wide characters elegantly (such as س). So if there was a preference one way or the other, i'd expect it to be biased towards proportional fonts.

@r12a r12a added the i18n-needs-resolution Issue the Internationalization Group has raised and looks for a response on. label May 25, 2017
@nigelmegitt
Contributor

nigelmegitt commented May 31, 2017

My understanding here is that the text is not meant to exclude Hebrew and Arabic glyphs per se but to exclude them from the definition of reference metrics. I suspect this is because in those languages (and others I guess) the same code point results in multiple variant glyphs dependent for example on the position within a word, so it is not straightforward to tabulate the metrics per code point.

I would suggest that we add a Note explaining that the exclusion applies only to reference font metrics and is not meant to suggest that the code points themselves should not be supported.
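
To illustrate the point about one code point mapping to several positional glyphs, here is a minimal sketch using Python's standard `unicodedata` module. It relies on the legacy Arabic Presentation Forms-B block, where each positional shape of ARABIC LETTER BEH happens to have its own compatibility code point. Real renderers select these shapes through font shaping rather than through these code points, so this only shows that the variants exist; the helper name `forms_of` is made up for this example.

```python
import unicodedata

BEH = "\u0628"  # ARABIC LETTER BEH, the code point an author actually types

# Legacy compatibility code points for the four positional shapes of BEH.
PRESENTATION_FORMS = ["\uFE8F", "\uFE90", "\uFE91", "\uFE92"]


def forms_of(base, candidates):
    """Return the names of candidates that NFKC-normalize back to `base`."""
    return [
        unicodedata.name(c)
        for c in candidates
        if unicodedata.normalize("NFKC", c) == base
    ]


names = forms_of(BEH, PRESENTATION_FORMS)
for name in names:
    print(name)
```

All four forms normalize to the single code point U+0628, which is why a table of metrics keyed by code point alone cannot describe the rendered width of Arabic text.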

@r12a
Author

r12a commented May 31, 2017

Hmm, then this appears to be worse than i thought. Tying that back to

https://w3c.github.io/imsc/imsc1/spec/ttml-ww-profiles.html#reference-fonts-1

The flow of text within a region depends the dimensions and spacing (kerning) between individual glyphs. The following allows, for instance, region extents to be set such that text flows without clipping.

suggests to me that line-breaking is only properly supported for proportionally-spaced fonts for a few Latin/Greek/Cyrillic languages. That's a much bigger issue, and suggests a very old-fashioned, almost archaic, and highly western-biased mindset about how to handle text.

Also, the CJK scripts, which are normally mono-spaced by nature, are not included in the reference fonts either, so presumably line-wrapping isn't supported for them either?

(Note btw the extract above is missing the word 'on' after depends.)

@palemieux
Contributor

palemieux commented May 31, 2017

suggests to me that line-breaking is only properly supported for proportionally-spaced fonts for a few Latin/Greek/Cyrillic languages.

No. Any combination of character and font family can be used by authors, and line breaking is specified for all such combinations -- using the UAX 14 algorithm.

Processors have to support mandatory metrics only for a smaller set of font family and characters.

@nigelmegitt
Contributor

@r12a I think that line is not supposed to be an exhaustive list of the things that the flow of text depends on, just an example. It certainly isn't supposed to exclude anything that is needed for line wrapping. Possibly I have not understood your concern about that text though?

The main thing is the normative text:

a processor shall use a font that generates a glyph sequence whose dimension is substantially identical to the glyph sequence that would have been generated by one of the specified reference fonts.

This is independent of line breaking.

Perhaps the wording you quoted could be written in a more precise way?

@r12a
Author

r12a commented May 31, 2017

Sorry to be so slow in understanding all this. I think some additional explanations in the spec to address these topics would be useful. There were similar questions coming from other members of the i18n WG.

I guess this is just a profile, but my reading of 7.3 seemed to indicate that text flow (which may not involve line wrapping) was only done by counting characters, and i see no mention of UAX 14. Perhaps there needs to be something that explains the relationship between sections 7.2, 7.3, App A and App B, and the main TTML spec(?)

@nigelmegitt
Contributor

my reading of 7.3 seemed to indicate that text flow (which may not involve line wrapping) was only done by counting characters, and i see no mention of UAX 14.

@r12a I'm trying to understand this reading - from the words present, what led you to that?

By the way, §7.4 includes a mandatory requirement to support UAX 14 via the #lineBreak-uax14 feature designator:

#lineBreak-uax14 The processor shall implement the #lineBreak-uax14 feature defined in the TT Feature namespace.

So it is there...

@r12a
Author

r12a commented Jun 1, 2017

wrt linebreak-uax14, ah! - not sure why my search of the document didn't find that.

@r12a I'm trying to understand this reading - from the words present, what led you to that?

Perhaps because of the following text in the section 7.3 Reference Fonts

The following allows, for instance, region extents to be set such that text flows without clipping.

I may be leaping to unwarranted conclusions, but it seemed to indicate that if you want to avoid clipping, you need to use reference fonts.

@nigelmegitt
Contributor

I may be leaping to unwarranted conclusions, but it seemed to indicate that if you want to avoid clipping, you need to use reference fonts.

Not necessarily - you could use named fonts that correspond to specific font resources you know will be available at presentation time.

The intent of this, in my understanding, is to solve the general problem that when authoring a document the actual font that will be used at presentation time is not known, particularly when a generic font family name is used. This requirement sets a reasonable expectation of the size of the rendered text (sequence of glyphs) so that other fixed size elements such as regions can be given an appropriate dimension to include all that text. This is a particular issue when content elements cannot grow to fit, or when scrollbars cannot usefully be made available, which is the case when presenting text overlaid on video.

It also allows the author to meet the accessibility requirement to position text to avoid important areas of underlying video.
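
The sizing argument above can be sketched roughly as follows, assuming a monospaced font whose advance width is a fixed fraction of the font size. The function name `fits_in_region` and the default `advance_em=0.6` are assumptions for illustration only; they are not values taken from IMSC or from any particular reference font. Counting code points is also a simplification that only holds for simple scripts without combining marks or contextual shaping.

```python
def fits_in_region(text: str, font_size_px: float, region_width_px: float,
                   advance_em: float = 0.6) -> bool:
    """Return True if `text`, laid out on one line, fits the region width.

    advance_em is an assumed per-glyph advance, expressed as a fraction of
    the font size; it is illustrative and not taken from any specification.
    """
    # Estimated line width: one fixed advance per code point.
    line_width_px = len(text) * advance_em * font_size_px
    return line_width_px <= region_width_px


short_line_fits = fits_in_region("a" * 32, font_size_px=20, region_width_px=400)
long_line_fits = fits_in_region("a" * 40, font_size_px=20, region_width_px=400)
print(short_line_fits, long_line_fits)
```

If the font used at presentation time has metrics substantially identical to the one the author assumed, an estimate like this stays valid; if it does not, the text may be clipped or wrap differently.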

@palemieux
Contributor

The following allows, for instance, region extents to be set such that text flows without clipping.

I may be leaping to unwarranted conclusions, but it seemed to indicate that if you want to avoid clipping, you need to use reference fonts.

Ok. This is not the intent of the sentence, and there is no conformance terminology in the sentence that would compel implementation behavior.

I am happy to remove the sentence if it causes more confusion than it helps.

@nigelmegitt
Contributor

I agree that sentence could usefully be changed but I would not remove it altogether. I would simply qualify it by appending "when using the generic font family names monospaceSerif or proportionalSansSerif".

@palemieux palemieux self-assigned this Jun 8, 2017
@palemieux palemieux added this to the imsc1.0.1 CR milestone Jun 8, 2017
@palemieux
Contributor

I plan to generate a PR based on #237 (comment)

@r12a
Author

r12a commented Jun 14, 2017

I suspect this is because in those languages (and others I guess) the same code point results in multiple variant glyphs dependent for example on the position within a word, so it is not straightforward to tabulate the metrics per code point.

I don't believe that this applies for Hebrew. Hebrew isn't a cursive script like Arabic.

Also, in most scripts other than the simple alphabetic ones like Latin/Cyrillic/Greek, or CJK, glyphs vary based on context. So the list of exclusions needs to be much bigger than just Arabic - the spec should, i'd have thought, at least mention that just using monospaced fonts for Arabic doesn't solve the real problem here. To be honest, I'm worried that the current approach will just reinforce the tendency that already exists to tailor processors just for 'easy' scripts and languages, rather than really making developers aware of the need to consider a worldwide audience when they create their technology.

@palemieux
Contributor

the spec should, i'd have thought, at least mention that just using monospaced fonts for Arabic doesn't solve the real problem here.

Yes, it would be good to note best practices.

@nigelmegitt
Contributor

@r12a I had imagined that HEBREW LETTER FINAL KAF and HEBREW LETTER KAF, for example, were the same code point and the selected glyph would be based on position within the word, but it turns out that they are distinct code points. I stand corrected on the Hebrew point.

@r12a
Author

r12a commented Jun 16, 2017

Yep, that's a result of legacy technologies and keyboards (and typewriters). Same applies to Greek wrt sigma. It doesn't really apply in the same way to any other scripts i can think of.

(background reading: http://r12a.github.io/scripts/tutorial/part3#word-final and https://r12a.github.io/uniview/?charlist=%CF%83%CF%82)
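
The Hebrew and Greek cases discussed above can be checked with Python's standard `unicodedata` module. This is a small sketch; the helper name `describe` is made up for the example. It shows that the ordinary and word-final letters are encoded as separate code points, unlike the Arabic positional shapes.

```python
import unicodedata

# (ordinary form, word-final form) pairs mentioned in the discussion.
PAIRS = [
    ("\u05DB", "\u05DA"),  # Hebrew kaf and final kaf
    ("\u03C3", "\u03C2"),  # Greek small sigma and final sigma
]


def describe(ch):
    """Format a character as 'U+XXXX NAME' using the Unicode database."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


described = [(describe(a), describe(b)) for a, b in PAIRS]
for ordinary, final in described:
    print(ordinary, "|", final)
```

Because each final form has its own code point, per-code-point metrics can be tabulated for these letters without knowing their position in a word.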

@nigelmegitt
Contributor

Thank you @r12a - those links are extremely useful!

palemieux added a commit that referenced this issue Jun 22, 2017
Clarified use of reference fonts to control appearance
Closes #241 and #237
@palemieux palemieux removed the pr open label Jan 29, 2018