diff --git a/gap-analysis/index.html b/gap-analysis/index.html
index 382b296..d5ff0d4 100644
--- a/gap-analysis/index.html
+++ b/gap-analysis/index.html
@@ -127,10 +127,19 @@
Do the standard fallback fonts used in browsers (eg. serif, sans-serif, cursive, etc.) match expectations? Are special font or OpenType features needed for this script that are not available? See available information or check for currently needed data.
+Do the standard fallback fonts used in browsers match expectations? Are special font features needed for this script that are not available? Do italic fonts lean in the right direction? See available information or check for currently needed data.
+Do italic fonts lean in the right direction? Is synthesised italicisation problematic? See available information or check for currently needed data.
When you double- or triple-click on the text, is the expected range of characters highlighted? When you move through the text with the cursor, or backspace, etc. do you see the expected behaviour? Are there issues when applying punctuation than could be fixed by the application?
-See available information or check for currently needed data.
+Does your script need special text transforms that are not supported? Does your script convert letters to uppercase, capitalised and lowercase alternatives according to your typographic needs? Do you need to convert between half-width and full-width presentation forms?
+See available information or check for currently needed data.
Are there any issues when dealing with quotations marks, especially when nested? Should block quotes be indented or handled specially?
-See available information or check for currently needed data.
-Does your script need special text transforms that are not supported? Does your script convert letters to uppercase, capitalised and lowercase alternatives according to your typographic needs? Do you need to to convert between half-width and full-width presentation forms?
-See available information or check for currently needed data.
+When you double- or triple-click on the text, is the expected range of characters highlighted? When you move through the text with the cursor, or backspace, etc. do you see the expected behaviour? Are there issues when applying punctuation that could be fixed by the application?
+See available information or check for currently needed data.
+Are there any issues when dealing with quotation marks, especially when nested? Should block quotes be indented or handled specially?
+See available information or check for currently needed data.
If you have comments about this page, send them to ishida@w3.org.
-Content last changed 2017-09-21 15:52 GMT
+Content last changed 2017-09-22 11:58 GMT
Copyright © 2017 W3C® (MIT,
Introduction
Many scripts use native punctuation marks in addition to or instead of those used in Latin script text. In other cases, such as Greek, common Latin punctuation marks may mean something different from what they mean in English. It may be important to understand what needs to be supported, how these punctuation marks function, and how they interact with other operations applied to the text.
Some scripts require special handling with regard to how font properties are specified and how font resources are loaded dynamically. In some scripts it is common to use different fonts for headings or emphasis, rather than bolding or italicisation. Fallback font families used by browsers (e.g. serif, sans-serif, cursive) may need to be mapped differently to fonts for different scripts. Special OpenType features may need to be supported.
Quotation marks vary from language to language, not just from script to script. Also, you should expect variations in behaviour when quotation marks are nested. Furthermore, the quotation marks used for vertical Japanese text are not the same as those typically used for the same text when horizontally laid out.
In CSS, italic and oblique are described as font styles. Non-Latin scripts can add requirements for such styling. For example, oblique styles in Arabic or Hebrew script text may lean to the left. Proper italic glyphs in Cyrillic text can look very different from normal variants, and so synthesising italics can produce poor results. Chinese, Japanese and Korean fonts almost always lack italic or oblique faces, because their characters are too complicated to support them well.
In some scripts, such as Arabic, it may be desirable to allow the content author to control the placement of glyphs such as diacritics, or to control ligation, etc. Languages written with Arabic and Hebrew scripts have particular rules, of course, about when it is appropriate to show or hide diacritics for short vowel sounds.
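As a rough illustration of what "hiding" short-vowel diacritics can mean at the text level, the sketch below removes a fixed set of combining marks from a string. It is not taken from the gap-analysis document: the names `HIDEABLE_MARKS` and `hide_vowel_marks` are hypothetical, and the choice of code points (Arabic tashkil U+064B to U+0652, Hebrew points U+05B0 to U+05BC plus U+05C1, U+05C2 and U+05C7) is an assumption about which marks an author would want hidden. Real rules are language-specific and more selective.

```python
# Hypothetical helper, not part of the gap-analysis document.
# Assumption: the marks worth hiding are Arabic tashkil (U+064B to U+0652)
# and Hebrew points (U+05B0 to U+05BC, plus U+05C1, U+05C2 and U+05C7).
HIDEABLE_MARKS = (
    set(range(0x064B, 0x0653))
    | set(range(0x05B0, 0x05BD))
    | {0x05C1, 0x05C2, 0x05C7}
)


def hide_vowel_marks(text: str) -> str:
    """Return text with the code points listed in HIDEABLE_MARKS removed."""
    return "".join(ch for ch in text if ord(ch) not in HIDEABLE_MARKS)


# Arabic "kataba" written with three fatha marks (U+064E).
vocalised_arabic = "\u0643\u064E\u062A\u064E\u0628\u064E"
# Hebrew "shalom" with qamats (U+05B8), shin dot (U+05C1) and holam (U+05B9).
pointed_hebrew = "\u05E9\u05B8\u05C1\u05DC\u05D5\u05B9\u05DD"

print(hide_vowel_marks(vocalised_arabic))
print(hide_vowel_marks(pointed_hebrew))
```

The string is deliberately not decomposed first, so letters such as alef with madda (U+0622) keep marks that are part of the letter itself. A browser or font-based approach would hide the marks at rendering time instead, leaving the stored text unchanged.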
Many complex scripts have rules about how characters combine in syllabic structures, and scripts like Arabic may need controls to indicate where ligatures are wanted or not wanted.
In scripts such as Arabic, Mongolian and N'Ko, adjacent characters are joined together in normal printed text. It is important to ensure that those connections can be maintained correctly when characters are forced apart, or when transparency is applied to the text, etc. There are also situations where cursive joining behaviour exists when there is no adjacent character, or where joining needs to be disabled between glyphs.
Conversion between lower, upper and title case only applies to a few scripts; most scripts are unicameral. Where it does apply, the rules can vary by language. In other cases, a particular script may require a different type of transform. For example, in Japanese it is important to be able to convert between half-width and full-width presentation forms.
Some scripts have one or sometimes more sets of their own numeric characters. In some cases, numeric characters represent numbers like 100, or 10,000. Numeric formats can also vary significantly, in terms not only of the separators and negative signs used, but also the groupings used for digits.
In some scripts, such as Arabic, it may be desirable to allow the content author to control the placement of glyphs such as diacritics, or to control ligation, etc. Languages written with Arabic and Hebrew scripts have particular rules, of course, about when it is appropriate to show or hide diacritics for short vowel sounds.
Many complex scripts have rules about how characters combine in syllabic structures, and scripts like Arabic may need controls to indicate where ligatures are wanted or not wanted.
In scripts such as Arabic, Mongolian and N'Ko, adjacent characters are joined together in normal printed text.
It is important to ensure that those connections can be maintained correctly when characters are forced apart, or when transparency is applied to the text, etc. There are also situations where cursive joining behaviour exists when there is no adjacent character, or where joining needs to be disabled between glyphs.
Many scripts use native punctuation marks in addition to or instead of those used in Latin script text. In other cases, such as Greek, common Latin punctuation marks may mean something different from what they mean in English. It may be important to understand what needs to be supported, how these punctuation marks function, and how they interact with other operations applied to the text.
Conversion between lower, upper and title case only applies to a few scripts; most scripts are unicameral. Where it does apply, the rules can vary by language. In other cases, a particular script may require a different type of transform. For example, in Japanese it is important to be able to convert between half-width and full-width presentation forms.
Quotation marks vary from language to language, not just from script to script. Also, you should expect variations in behaviour when quotation marks are nested. Furthermore, the quotation marks used for vertical Japanese text are not the same as those typically used for the same text when horizontally laid out.
Many scripts create emphasis or other effects by spacing out the letters or syllables in a word. There are questions about how this should work in Indic and SE Asian scripts, and in Arabic-based scripts which join up adjacent letters. Another aspect of inline spacing relates to separation of characters or items in text. For example, French uses spaces before certain punctuation marks, and the traditional Mongolian script requires special spacing between word stems and certain suffixes.
Ruby is used for phonetic and semantic annotations of East Asian text, including furigana, pinyin and zhuyin fuhao systems.
In addition to positioning annotations along the correct side of the base text, there are many fine adjustments of the annotation and base text to support.
Bold and italic are not always appropriate for expressing emphasis, and some scripts have their own unique ways of doing it that are not in the Western tradition at all. Note that italicisation is not only a way to express emphasis: see also the section on font style. See also the section on text decoration.
In CSS, italic and oblique are described as font styles. Non-Latin scripts can add requirements for such styling. For example, oblique styles in Arabic script text tend to lean to the left. Proper italic glyphs in Cyrillic text can look very different from normal variants, and so synthesising italics can produce poor results. Chinese, Japanese and Korean fonts almost always lack italic or oblique faces, because their characters are too complicated to support them well.
Does the browser or ereader correctly handle special styling of the initial letter of a line or paragraph, such as for drop caps?
Scripts whose characters are typically written right-to-left, like Arabic, Hebrew, Thaana, and so on, become bidirectional when they include numbers or text from other scripts (such as Latin acronyms). Browsers and applications need to support bidirectionality. This means supporting the Unicode Bidirectional Algorithm, but also different visual locations of line start and end, isolation of embedded strings, correct line alignment, and so forth.
Some scripts require special handling with regard to how font properties are specified and how font resources are loaded dynamically.
A script may call for other inline features than those mentioned above. Two examples in Japanese include warichu and kumimoji. Warichu is a kind of inline annotation where the note text is two approximately equal lines of half-sized text, one above the other, but both within the normal line height.
Kumimoji is a way of combining several characters into a single character space.
Scripts whose characters are typically written right-to-left, like Arabic, Hebrew, Thaana, and so on, become bidirectional when they include numbers or text from other scripts (such as Latin acronyms). Browsers and applications need to support bidirectionality. This means supporting the Unicode Bidirectional Algorithm, but also different visual locations of line start and end, isolation of embedded strings, correct line alignment, and so forth.
Does the browser or ereader correctly handle special styling of the initial letter of a line or paragraph, such as for drop caps?
Browsers and applications must accurately and comprehensively cover requirements for baseline alignment between mixed scripts. For example, Arabic script descenders go far below those of the Latin script, and Armenian characters need to be aligned with ideographic characters in Chinese appropriately with regard to comparative heights and baselines. European, Far Eastern and South Asian scripts tend to use different baselines, which must be aligned correctly.
When content can flow vertically and to the left or right, how do you specify the location of objects, text, etc. relative to the flow? For example, keywords 'left' and 'right' are likely to need to be reversed for pages written in English and pages written in Arabic.
There are special requirements for vertically oriented text. For example, it's common for content authors to want to mix short horizontal runs of text, such as 2-digit numbers, in a vertical column (tate chu yoko). It's also important to provide appropriate support for text in scripts that are normally only horizontal.
The following changes have been made since the document was last published to the TR space:
Characters & phrases
- Punctuation
- Fonts
+
-
-
-
+
+
+
+
-
- Quotations
- Font styles
+
-
-
+
-
Numbers
+ Glyphs and diacritics
+
+
+
+
+ Cursive text
+
+
+
+
+ Transforming characters
+
+
+
+
+
+
+ Numbers & digits
Numbers
Identifying boundaries of graphemes, words and larger groupings
@@ -276,96 +337,116 @@ Identifying boundaries of graphemes, words and larger groupings
Glyphs and diacritics
-
-
-
-
- Cursive text
- Punctuation
+
-
-
-
-
+
+
+ Transforming characters
- Quotations
+
+
-
-
-
-
+
+ Inline spacing
+ Inline spacing
Inline spacing
Ruby annotation
Text decoration
Emphasis
+ Emphasis & highlights
Emphasis
Font style
-
-
-
-
-
-
- Initial letter styling
-
+
+
+
Bidirectional text direction
+
-
-
-
Fonts
- Other inline features
+
@@ -837,49 +898,48 @@
-
+
+
-
-
Lists, counters, etc
Bidirectional text direction
-
+
+
+
Initial letter styling
+
-
+
-
Baselines & inline alignment
Other paragraph features
Layout & pages
+
+
+
+
+ Bidirectional layout
+
+
+ Vertical text
Changes Since the Last Published
Version
+