diff --git a/gap-analysis/index.html b/gap-analysis/index.html
index 382b296..d5ff0d4 100644
--- a/gap-analysis/index.html
+++ b/gap-analysis/index.html
@@ -127,10 +127,19 @@

Encoding considerations

+
+
+

Fonts

+

Do the standard fallback fonts used in browsers (eg. serif, sans-serif, cursive, etc.) match expectations? Are special font or OpenType features needed for this script that are not available? See available information or check for currently needed data.

+
+
+ + +

Font styles

-

Do the standard fallback fonts used in browsers match expectations? Are special font features needed for this script that are not available? Do italic fonts lean in the right direction? See available information or check for currently needed data.

+

Do italic fonts lean in the right direction? Is synthesised italicisation problematic? See available information or check for currently needed data.

@@ -156,26 +165,16 @@

Cursive text

- -
+
-

Text boundaries and selection

-

When you double- or triple-click on the text, is the expected range of characters highlighted? When you move through the text with the cursor, or backspace, etc. do you see the expected behaviour? Are there issues when applying punctuation that could be fixed by the application? -See available information or check for currently needed data.

+

Transforming characters

+

Does your script need special text transforms that are not supported? Does your script convert letters to uppercase, capitalised and lowercase alternatives according to your typographic needs? Do you need to convert between half-width and full-width presentation forms? +See available information or check for currently needed data.

-
-
-

Quotations

-

Are there any issues when dealing with quotation marks, especially when nested? Should block quotes be indented or handled specially? -See available information or check for currently needed data.

-
-
- -

Numbers and digits

@@ -187,11 +186,21 @@

Numbers and digits

-
+
-

Transforming characters

-

Does your script need special text transforms that are not supported? Does your script convert letters to uppercase, capitalised and lowercase alternatives according to your typographic needs? Do you need to convert between half-width and full-width presentation forms? -See available information or check for currently needed data.

+

Text boundaries and selection

+

When you double- or triple-click on the text, is the expected range of characters highlighted? When you move through the text with the cursor, or backspace, etc. do you see the expected behaviour? Are there issues when applying punctuation that could be fixed by the application? +See available information or check for currently needed data.

+
+
+ + + +
+
+

Quotations

+

Are there any issues when dealing with quotation marks, especially when nested? Should block quotes be indented or handled specially? +See available information or check for currently needed data.

@@ -438,7 +447,7 @@

What else?

If you have comments about this page, send them to ishida@w3.org.

-

Content last changed 2017-09-21 15:52 GMT

+

Content last changed 2017-09-22 11:58 GMT

Characters & phrases

-
-

Punctuation

-

Many scripts use native punctuation marks in addition to or instead of those used in Latin script text. In other cases, such as Greek, common Latin punctuation marks may mean something different from what they mean in English. It may be important to understand what needs to be supported, how these punctuation marks function, and how they interact with other operations applied to the text.

+ + +
+

Fonts

+

Some scripts require special handling with regard to how font properties are specified and how font resources are loaded dynamically. In some scripts it is common to use different fonts for headings or emphasis, rather than bolding or italicisation. Fallback font families used by browsers (eg. serif, sans-serif, cursive, etc.) may need to be mapped differently to fonts for different scripts. Special OpenType features may need to be supported.

-

See also and .

-
-

Quotations

-

Quotation marks vary from language to language, not just from script to script. Also, you should expect variations in behavior when quotation marks are nested. Furthermore, the quotation marks used for vertical Japanese text are not the same as those typically used for the same text when horizontally laid out.

+
+ + + + + + + +
+

Font styles

+

In CSS, italic and oblique are described as font styles. Non-Latin scripts can add requirements for such styling. For example, oblique styles in Arabic or Hebrew script text may lean to the left. Proper italic glyphs in Cyrillic text can look very different from normal variants, and so synthesising italics can produce poor results. Chinese, Japanese and Korean fonts almost always lack italic or oblique faces, because their characters are too complicated to support them well.

-

See also .

+ - - -
-

Numbers

+
+

Glyphs and diacritics

+

In some scripts, such as Arabic, it may be desirable to allow the content author to control the placement of glyphs such as diacritics, or to control ligation, etc. Languages written with Arabic and Hebrew scripts have particular rules, of course, about when it is appropriate to show or hide diacritics for short vowel sounds. Many complex scripts have rules about how characters combine in syllabic structures, and scripts like Arabic may need controls to indicate where ligatures are wanted or not wanted.

+ +
+ + + + + +
+

Cursive text

+

In scripts such as Arabic, Mongolian and N'Ko adjacent characters are joined together in normal printed text. It is important to ensure that those connections can be maintained correctly when characters are forced apart, or when transparency is applied to the text, etc. There are also situations where cursive joining behaviour exists when there is no adjacent character, or where joining needs to be disabled between glyphs.

+ +
+ + +
+

Transforming characters

+

Conversion between lower, upper and title case applies to only a few scripts; most scripts are unicameral. Where it does apply, the rules can vary by language.

+

In other cases, a particular script may require a different type of transform. For example, in Japanese it is important to be able to convert between half-width and full-width presentation forms.
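As a rough illustration of the width conversion described above, the following Python sketch uses only the standard library. The helper names `normalize_width` and `to_fullwidth` are hypothetical, introduced here for illustration; they are not part of the page or of any browser API. Note that NFKC normalization also folds other compatibility characters, so it is a blunt tool rather than a dedicated width transform.

```python
import unicodedata

# Hypothetical helper: fold full-width Latin letters/digits to ASCII and
# half-width katakana to their standard (full-width) forms via NFKC.
def normalize_width(text: str) -> str:
    return unicodedata.normalize("NFKC", text)

# Mapping from printable ASCII (U+0021..U+007E) to the full-width block,
# which sits at a fixed offset of 0xFEE0; the space maps to U+3000.
_TO_FULLWIDTH = {code: code + 0xFEE0 for code in range(0x21, 0x7F)}
_TO_FULLWIDTH[0x20] = 0x3000

# Hypothetical helper: convert ASCII text to full-width presentation forms.
def to_fullwidth(text: str) -> str:
    return text.translate(_TO_FULLWIDTH)

print(normalize_width("ＡＢＣ１２３"))  # full-width Latin folded to ASCII
print(normalize_width("ｶﾀｶﾅ"))        # half-width katakana widened
print(to_fullwidth("ABC 123"))         # ASCII widened to full-width forms

# Case conversion is not always one-to-one: German sharp s expands.
print("straße".upper())
```

Python's `str.upper()` applies the default Unicode case mappings only; language-specific rules (for example Turkish dotted and dotless i) would need locale-aware handling that this sketch does not attempt.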

+ +
+ + + + +
+

Numbers & digits

Some scripts have one or sometimes more sets of their own numeric characters. In some cases, numeric characters represent numbers like 100, or 10,000. Numeric formats can also vary significantly, in terms not only of the separators and negative signs used, but also the groupings used for digits.
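The points above about native digits, numeric characters with large values, and digit grouping can be sketched with Python's standard `unicodedata` module. The `group_indian` helper is hypothetical and handles non-negative integers only; it is included to show that grouping need not be in threes, not as a substitute for proper locale data.

```python
import unicodedata

# Native decimal digits carry the same values as ASCII digits.
print(unicodedata.decimal("३"))   # DEVANAGARI DIGIT THREE
print(int("٣٤٥"))                 # Arabic-Indic digits parsed as an integer

# Some numeric characters stand for whole numbers such as 100 or 10,000.
print(unicodedata.numeric("百"))
print(unicodedata.numeric("万"))

# Hypothetical helper: group a non-negative integer in the Indian style,
# where the last three digits form one group and the rest pair up.
def group_indian(n: int) -> str:
    s = str(n)
    if len(s) <= 3:
        return s
    head, tail = s[:-3], s[-3:]
    parts = []
    while len(head) > 2:
        parts.insert(0, head[-2:])
        head = head[:-2]
    if head:
        parts.insert(0, head)
    return ",".join(parts + [tail])

print(f"{1234567:,}")         # grouping in threes
print(group_indian(1234567))  # Indian-style grouping
```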

See also .

- - +

Identifying boundaries of graphemes, words and larger groupings

@@ -276,96 +337,116 @@

Identifying boundaries of graphemes, words and larger groupings

- - - -
-

Glyphs and diacritics

-

In some scripts, such as Arabic, it may be desirable to allow the content author to control the placement of glyphs such as diacritics, or to control ligation, etc. Languages written with Arabic and Hebrew scripts have particular rules, of course, about when it is appropriate to show or hide diacritics for short vowel sounds. Many complex scripts have rules about how characters combine in syllabic structures, and scripts like Arabic may need controls to indicate where ligatures are wanted or not wanted.

- -
- - - - - -
-

Cursive text

-

In scripts such as Arabic, Mongolian and N'Ko adjacent characters are joined together in normal printed text. It is important to ensure that those connections can be maintained correctly when characters are forced apart, or when transparency is applied to the text, etc. There are also situations where cursive joining behaviour exists when there is no adjacent character, or where joining needs to be disabled between glyphs.

+
+

Punctuation

+

Many scripts use native punctuation marks in addition to or instead of those used in Latin script text. In other cases, such as Greek, common Latin punctuation marks may mean something different from what they mean in English. It may be important to understand what needs to be supported, how these punctuation marks function, and how they interact with other operations applied to the text.

-
- - -
-

Transforming characters

-

Conversion between lower, upper and title case applies to only a few scripts; most scripts are unicameral. Where it does apply, the rules can vary by language.

-

In other cases, a particular script may require a different type of transform. For example, in Japanese it is important to be able to convert between half-width and full-width presentation forms.

+

See also and .

+
+ + +
+

Quotations

+

Quotation marks vary from language to language, not just from script to script. Also, you should expect variations in behavior when quotation marks are nested. Furthermore, the quotation marks used for vertical Japanese text are not the same as those typically used for the same text when horizontally laid out.

+

See also .

+ +
-

Inline spacing

+

Inline spacing

Many scripts create emphasis or other effects by spacing out the letters or syllables in a word. There are questions about how this should work in Indic and SE Asian scripts, and in Arabic-based scripts which join up adjacent letters. Another aspect of inline-spacing relates to separation of characters or items in text. For example, French uses spaces before certain punctuation marks, and the traditional Mongolian script requires special spacing between word stems and certain suffixes.

+ + +

Ruby annotation

Ruby is used for phonetic and semantic annotations of East Asian text, including furigana, pinyin and zhuyin fuhao systems. In addition to positioning annotations along the correct side of the base text, there are many fine adjustments of the annotation and base text to support.

@@ -483,7 +567,7 @@

Text decoration

-

Emphasis

+

Emphasis & highlights

Bold and italic are not always appropriate for expressing emphasis, and some scripts have their own ways of doing it that are not part of the Western tradition at all. Note that italicisation is not only a way to express emphasis: see also the section on font style. See also the section on text decoration.

- - - -
-

Font style

-

In CSS, italic and oblique are described as font styles. Non-Latin scripts can add requirements for such styling. For example, oblique styles in Arabic script text tend to lean to the left. Proper italic glyphs in Cyrillic text can look very different from normal variants, and so synthesising italics can produce poor results. Chinese, Japanese and Korean fonts almost always lack italic or oblique faces, because their characters are too complicated to support them well.

- -
-
-

Initial letter styling

-

Does the browser or ereader correctly handle special styling of the initial letter of a line or paragraph, such as for drop caps?

-
-
-

Bidirectional text direction

-

Scripts whose characters are typically written right-to-left, like Arabic, Hebrew, Thaana, and so on, become bidirectional when they include numbers or text from other scripts (such as Latin acronyms). Browsers and applications need to support bidirectionality. This means supporting the Unicode Bidirectional Algorithm, but also different visual locations of line start and end, isolation of embedded strings, correct line alignment, and so forth.