i18n glossary

[Implementers’ doc] [WIP]

This document lists terms implementers might not be familiar with. It can be useful when encountering one of those terms in bug reports, issues, or feedback.

Arabic

This glossary has been created from the Arabic Script Requirements.

Eastern Arabic numerals (الأَرقَامْ العَرَبِيَّة المَشْرِقِيَّة)
Numerals used in Arabic-Indic and Eastern Arabic-Indic (٠ ١ ٢ ٣ ٤ ٥ ٦ ٧ ٨ ٩). Use the preferred terminology to avoid confusion.
Elongation (التَّطْوِيلْ)
See Tatweel.
European numerals (أَرْقَامْ أُورُوبِيَّة)
Any of the symbols in [0-9] used to represent numbers. Some cultures call them “Arabic numerals.” Use “European numerals” to avoid confusion.
Harakat (حَرَكَاتْ)
Tashkil marks representing short vowel sounds.
Ihmal (إِهْمَالْ)
See Tashkil.
Ijam (إِعْجَامْ)
Diacritical marks applied to a basic letter shape (or skeleton) to derive a new letter. For example a dot under a “curve” to get the letter Beh. In Unicode each letter plus ijam combination is encoded as a separate, atomic character.
Joining forms
Arabic script is a cursive writing system; every Arabic letter has one, two, or four different joining forms (isolated, initial, medial, final), which allow the letter to join to its neighbors, if applicable. See the Joining section of the Arabic Script Requirements for further details.
Kashida (الْكَشِيدَة)
A justification method that aligns both edges of every line to the same given length by extending the horizontal connection between joined letters. See the Kashida section of the Arabic Script Requirements for further details.
Mabsut (مَبْسُوطْ)
Kind of writing style that tends to rigidity and firmness with pronounced angularity.
Mukawwar (مُكَوَّرْ)
Kind of writing style, generally opposed to mabsut, that is more flexible and rounded.
Shadda (شَدَّة)
A tashkil mark indicating gemination of the base consonant.
Sukun (سُكُونْ)
A tashkil mark indicating the lack of a vowel after the consonant to which it is attached.
Tanwin (تَنْوِينْ)
Tashkil marks indicating postnasalized or long vowels at the end of a word, and indicated by doubling the sign of one of the harakat diacritics.
Tashkil (تَشْكِيلْ)
Marks that are added to letters to indicate vocalisation of text or to correct pronunciation.
Tatweel (التَّطْوِيلْ)
Tatweel is a dual-joining character that can be inserted between two joined letters to widen their connection. See the Tatweel section of the Arabic Script Requirements for further details.
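
When triaging bug reports, it can help to identify which of the characters above actually appear in a string. The following is a minimal sketch that assumes the Unicode code points for tatweel (U+0640), the tashkil marks named in this glossary (U+064B to U+0652), the Arabic-Indic digits (U+0660 to U+0669), and the Extended Arabic-Indic digits (U+06F0 to U+06F9). The function names and the labels they return are hypothetical.

```python
import unicodedata

TATWEEL = "\u0640"  # ARABIC TATWEEL

# Tashkil marks named in the glossary, keyed by code point.
TASHKIL = {
    "\u064B": "tanwin",  # FATHATAN
    "\u064C": "tanwin",  # DAMMATAN
    "\u064D": "tanwin",  # KASRATAN
    "\u064E": "haraka",  # FATHA
    "\u064F": "haraka",  # DAMMA
    "\u0650": "haraka",  # KASRA
    "\u0651": "shadda",  # SHADDA
    "\u0652": "sukun",   # SUKUN
}


def digit_family(ch):
    """Return which digit set a character belongs to, or None."""
    if "0" <= ch <= "9":
        return "european"
    if "\u0660" <= ch <= "\u0669":
        return "arabic-indic"
    if "\u06F0" <= ch <= "\u06F9":
        return "extended arabic-indic"
    return None


def to_european_digits(text):
    """Replace Arabic-Indic digits of either set with European numerals."""
    return "".join(
        str(unicodedata.digit(ch)) if digit_family(ch) else ch for ch in text
    )


def strip_tatweel(text):
    """Remove tatweel characters, e.g. before comparing or searching text."""
    return text.replace(TATWEEL, "")
```

Stripping tatweel is only appropriate for comparison or search; the character should be preserved when rendering, since it widens the connection between joined letters.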

Chinese

This glossary has been created from the Chinese Text Layout Requirements.

Annotation text / Zhùwén (注文)
Interlinear text run indicating pronunciation or definitions.
Base text (基文)
A character to be annotated by ruby, ornament characters, or emphasis dots.
Big5 (大五码)
Big-5 or Big5 is a Chinese character encoding method used in Taiwan, Hong Kong, and Macau for Traditional Chinese characters. A total of 13,060 Traditional Chinese characters are encoded with Big5.
Bilingual annotations (中外文对照)
To accompany a Chinese term with its original or its translation, in the form of annotation text or base text; an instance of interlinear annotation.
Bopomofo / Zhuyin (注音符号)
The general name of Mandarin Phonetic Symbols and Taiwanese Phonetic Symbols.
End point / Mòduān (末端)
The ending point of a line, meaning the bottom side in vertical writing mode, or the right side in horizontal writing mode.
Fixed inter-character spacing setting / Shūpái (疏排)
A text setting with a uniform inter-character spacing.
Fullwidth (全角/全形)
A square character frame that has a character advance of character size.
Grid alignment / Zònghéng duìqí (纵横对齐)
The process, under the premise of justification, of arranging characters within grids to make sure that they are aligned in both horizontal and vertical axes.
Halfwidth (半形/半角)
A character frame that has a character advance of 1/2 character size.
Han characters / Hànzì (汉字)
Characters that form the basis of the Chinese language.
Hanging punctuation (行尾点号悬挂)
Typesetting punctuation for line endings outside the margin of alignment, also known as hanging punctuation or exdentation.
Hanyu Pinyin (汉语拼音)
Hanyu Pinyin provides a method of using Latin characters with diacritics to indicate tone. It is often used to teach Standard Chinese and encourage its use as a common language for communication.
Horizontal writing mode / Héngpái (横排)
The process or the result of arranging characters on a line from left to right, of lines on a page from top to bottom, and/or of columns on a page from left to right.
Horizontal-in-vertical setting / Zòngzhōnghéngpái (纵中横排)
To typeset a (small) group of characters horizontally within a vertical line of main text.
Interlinear annotations (行间注)
Annotations between lines used to indicate the pronunciation of a word or an explanation of a word.
Interlinear comments (行间批语)
Comments included between lines, generally free-form with no restrictions on line length. These can exceed the length of a single line.
Hindu–Arabic numerals (阿拉伯数字)
Hindu–Arabic numerals, also called Arabic numerals or European digits, are the ten digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, based on the Hindu–Arabic numeral system.
Phonetic annotation (标音)
The way to indicate the pronunciation of the Chinese characters, e.g. interlinear annotations.
Proportional type (比例字体)
A proportional typeface contains glyphs of varying widths in order to make the glyphs properly related in size. It is widely used for Latin characters.
Rhotacization of syllable finals (儿化音)
In standard Chinese and certain dialects, some words have rhotacized endings that result in phonetic changes to how the word sounds.
Romanization (罗马拼音)
A system for converting Chinese pronunciation into Roman/Latin script.
Simplified Chinese (简体中文)
The writing system of Chinese using characters that are relatively simpler in structure and stroke count, mainly referring to the Jianhuazi (Simplified Characters) published and revised in the 1960s in Mainland China.
Solid setting / Mìpái (密排)
To arrange characters with no inter-character space between adjacent character frames.
Starting point / Shǐduān (始端)
The text or character that is near the beginning of the line. Normally, when the text is in vertical writing mode, the starting point is on the top; in horizontal writing mode, the starting point is on the left.
Traditional Chinese (繁体中文)
The writing system of Chinese using characters that are relatively more complex in structure and stroke count. Known as Traditional Chinese because of its long history of use.
Vertical writing mode / Zhípái / Shùpái (直排/竖排)
The process or the result of arranging characters on a line from top to bottom, of lines on a page from right to left, and/or of columns on a page from top to bottom.
Western scripts / Xīwén (西文)
Writing systems derived from Greek and Latin.
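
The fullwidth, halfwidth, solid setting, and fixed inter-character spacing entries above can be related with a small calculation. The following is a minimal sketch, not a layout engine: it assumes that characters whose Unicode East Asian Width property is F or W advance by one character size and that every other character advances by half of it, which ignores proportional fonts. The function names `advance_em` and `line_length_em` are hypothetical.

```python
import unicodedata


def advance_em(ch):
    """Approximate character advance, in units of the character size.

    Assumption: East Asian Width "F" (fullwidth) and "W" (wide) map to a
    fullwidth frame; everything else is treated as halfwidth.
    """
    return 1.0 if unicodedata.east_asian_width(ch) in ("F", "W") else 0.5


def line_length_em(text, spacing_em=0.0):
    """Length of a run of text in units of the character size.

    spacing_em == 0 models solid setting (no inter-character space);
    spacing_em > 0 models fixed inter-character spacing, where the same
    gap is inserted between each pair of adjacent character frames.
    """
    if not text:
        return 0.0
    return sum(advance_em(c) for c in text) + spacing_em * (len(text) - 1)
```

With solid setting, two Han characters occupy two character sizes; a uniform gap of a quarter character size adds one gap between them.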

Indic

This glossary has been created from the Indic Text Layout Requirements.

Akhand Ligatures
Required consonant ligatures that may appear anywhere in the syllable. They have the highest priority and are formed first.
Anuswara
A modifier denoted by a dot above the letter after which it is to be pronounced.
Augmented Backus-Naur Form (ABNF)
A meta-language based on Backus-Naur Form (BNF), but with its own syntax and derivation rules. The linguistic definition of the Indic orthographic syllable has been mapped to ABNF for the purpose of text segmentation, line breaking, drop letter, and letter spacing. See the Indic Text Layout Requirements for further details.
Avagraha
A modifier shaped like the letter “S” used in the Sanskrit text.
Bengali
A script used in the Bengali, Assamese, and Manipuri languages.
Chandrabindu
A modifier denoted by a breve with a dot superposed above the letter after which it is to be pronounced.
Danda (purna viram)
A Devanagari phrase separator used to mark the end of the verse in Sanskrit text, shlokas, etc. The properties of the Danda and double Danda should be the same as other punctuation marks, and a line should never start with those characters.
Devanagari
A script used in the Hindi, Sanskrit, Marathi, Konkani, Nepali, Maithili, Sindhi, Bodo, Dogri, Santhali, and Kashmiri languages.
Gurmukhi
A script used in the Punjabi language.
Halant (Virama)
The character used after a consonant to strip it of its inherent vowel.
Indic orthographic syllable
The effective orthographic unit of Indic writing systems, consisting of a consonant and a vowel core, and optionally preceded by one or more consonants. The syllable is indivisible, so it must not be possible to position the cursor within it.
Matra
A character used to represent a vowel sound that is not inherent to the consonant.
Nukta
A character that alters the way a preceding consonant is pronounced.
Visarga
A modifier denoted by two dots placed one above the other (it looks like a colon).
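
To make the Indic orthographic syllable entry more concrete, here is a rough segmentation sketch for Devanagari only. It is not the ABNF from the Indic Text Layout Requirements: it is a simplified regular expression that assumes the basic consonant, nukta, halant, matra, and modifier code points, and it ignores ZWJ/ZWNJ, Vedic marks, and other scripts. The names `split_syllables` and `may_start_line` are hypothetical.

```python
import re

# Simplified Devanagari character classes (an assumption, not the full ABNF).
CONSONANT = "[\u0915-\u0939\u0958-\u095F]\u093C?"  # consonant, optional nukta
HALANT = "\u094D"                                  # halant (virama)
MATRA = "[\u093E-\u094C]"                          # dependent vowel signs
MODIFIER = "[\u0901-\u0903]"                       # chandrabindu, anuswara, visarga
INDEPENDENT_VOWEL = "[\u0904-\u0914]"

SYLLABLE = re.compile(
    # Zero or more "consonant + halant" pairs, a final consonant, then an
    # optional trailing halant or matra, then any modifiers.
    f"(?:{CONSONANT}{HALANT})*{CONSONANT}(?:{HALANT}|{MATRA})?{MODIFIER}*"
    # Or an independent vowel followed by modifiers.
    f"|{INDEPENDENT_VOWEL}{MODIFIER}*"
    # Anything else is passed through one character at a time.
    "|.",
    re.DOTALL,
)

DANDAS = {"\u0964", "\u0965"}  # danda and double danda


def split_syllables(text):
    """Split text into approximate orthographic syllables."""
    return SYLLABLE.findall(text)


def may_start_line(ch):
    """A line should never start with a danda or double danda."""
    return ch not in DANDAS
```

A cursor or line-break implementation would then only offer positions at the boundaries between the returned units, never inside one.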

Japanese

This glossary has been created from the Japanese Text Layout Requirements.

Base character / Oya moji (親文字)
A character to be annotated by ruby, ornament characters, or emphasis dots.
Base line / Narabi sen (並び線)
A virtual line on which almost all glyphs in Western fonts are designed to be aligned. (JIS Z 8125)
Block heading / Betsugyōmidashi (別行見出し)
A kind of heading style. The heading is set on a line of its own, separate from the basic text. (JIS Z 8125)
Bound on the left-hand side / Hidari toji (左綴じ)
Binding of a book to be opened from the left (horizontal writing mode).
Bound on the right-hand side / Migi toji (右綴じ)
Binding of a book to be opened from the right (vertical writing mode).
Bousen (sideline) / Bōsen (傍線)
A line drawn by the left or right side of a character or a run of text in vertical writing mode. (JIS Z 8125)
Character advance / Jihaba (字幅)
Size of a character frame in the inline direction, generally indicated as a ratio of the size of a full-width character, as in full-width, half-width, or quarter em width. Character advance is the width of a given character in horizontal writing-mode, while it is the height in vertical writing-mode.
Character frame / Sotowaku (外枠)
Rectangular area occupied by a character when it is set solid.
Characters not ending line / Gyōmatsu kinsoku moji (行末禁則文字)
Any character for which "line-end prohibition rule" is invoked. (JIS Z 8125)
Characters not starting line / Gyōtō kinsoku moji (行頭禁則文字)
Any character for which "line-start prohibition rule" is invoked. (JIS Z 8125)
Chu-boso-kei / Chūbosokei (中細罫)
Middle width line, usually about 0.25mm.
Column spanning heading / Dannuki no midashi (段抜きの見出し)
Headings using multiple columns.
Compound word / Jukugo (熟語)
A combination of two or more kanji characters which makes one word.
Cut-in heading / Madomidashi (窓見出し)
A style of heading in which the heading does not occupy full lines but shares the line area with the main text lines that follow.
Emphasis dots / Kenten (圏点)
Symbols attached alongside a run of base characters to emphasize them. (JIS Z 8125)
European numerals / Arabia sūji (アラビア数字)
Any of the symbols in [0-9] used to represent numbers. (JIS Z 8125)
Even inter-character spacing / Kintō wari (均等割り)
A text setting with uniform inter-character spacing per line so that each line is aligned on the same line-head and line-end. (JIS Z 8125)
Fixed inter-character spacing / Aki gumi (アキ組)
A text setting with a uniform inter-character spacing. (JIS Z 8125)
Fixed-width / Monosupēsu (モノスペース)
A characteristic of a font where the same character advance is assigned for all glyphs. (JIS Z 8125)
Footnote / Kyakuchū (脚注)
A note in a smaller face than that of main text, placed at the bottom of a page. (JIS Z 8125)
Full-width / Zenkaku (全角)
Character frame whose character advance is equal to a given character size. A full-width character frame is square in shape by definition.
Futoji (太字)
A kind of font style. Similar to bold in Western typography.
Furigana (振り仮名)
A method of ruby annotation using kana characters to indicate how to read kanji characters. This term derives from a Japanese verb “furu” (to attach alongside) and “kana”, and has been used synonymously with “ruby.”
Furikanji (振り漢字)
A method of ruby annotation using Kanji characters for ruby instead of kana characters.
Furiwake (振分け)
A method of placing multiple runs of text in a line. (JIS Z 8125)
Gaiji (外字)
Gaiji are small, inline images that represent characters that are not available in a character or font set. Gaiji are typically used for older symbols or characters in Japanese that have fallen out of use. They should be treated as text, especially in reading modes (e.g. inverted in night mode).
General-ruby / Sō rubi (総ルビ)
A method of ruby annotation that attaches ruby text for all Kanji characters in the text. (JIS Z 8125)
Group-ruby / Gurūpu rubi (グループルビ)
A method of ruby character distribution such that the length of ruby text matches to that of the base text by giving the same adjusted amount of space between ruby characters.
Gyodori (行取り)
To reserve space in the block direction for headings and similar elements, measured in line units of the kihon-hanmen. The width of the gyodori space is calculated with the following formula: line width × number of lines + line gap × (number of lines − 1). Implementers can think of it as something like “vertical rhythm.”
Half-width / Hankaku (半角)
Character frame which has a character advance of a half em.
Han-tobira (半扉)
A simplified version of naka-tobira, the verso side of which text of the new part starts. (JIS Z 8125)
Hanmen (版面)
Actual printed area in a page excluding the margins. It is the page content area.
Horizontal writing mode / Yokogumi (横組)
The process or the result of arranging characters on a line from left to right, of lines on a page from top to bottom, and/or of columns on a page from left to right. (JIS Z 8125)
Ideographic numerals / Kansūji (漢数字)
Ideographic characters representing numbers.
Inline cutting note / Warichu (割注)
A note of two or more lines inserted in the text. It includes the brackets that surround the note. (JIS Z 8125)
Inseparable characters rule / Bunri kinshi (分離禁止)
A line adjustment rule that prohibits inserting any space between specific combinations of characters. (JIS Z 8125)
Inter-character space / Jikan (字間)
Amount of space between two adjacent character frames on the same line.
Itemization / Kajō gaki (箇条書き)
To list ordered or unordered items one under the other. (JIS Z 8125)
Japanese gothic face / Goshikku tai (ゴシック体)
A Japanese typeface, with strokes almost the same in thickness, and no special ornament on a stroke such as a triangular element commonly seen in the Mincho typeface. Used for text emphasis and/or headings. Similar to “sans serif” in Western typography.
Jidori (字取り)
A method of aligning a run of text to both edges which is specified by a position to start and the length calculated by a specified number of a given size of characters. (JIS Z 8125)
Jouyou Kanji Table / Jōyō kanji hyō (常用漢字表)
The official list of Kanji characters “for general use in society, such as in legal and official documents, newspapers, magazines, broadcasting and the like.” It was established in 1981 as a reference guide for people composing contemporary Japanese. It listed 1,945 Kanji characters together with their orthographic shapes, Japanese native readings (Kun), Chinese-derived readings (On), and other useful information.
Jukugo-ruby (熟語ルビ)
A method of ruby character distribution determined by two functions, one is to provide reading for each Kanji character, the other is to give a united appearance attached to a word.
Kanbun (漢文)
Chinese classic text (or text in the same style) with various auxiliary symbols so that it can be read as Japanese text.
Katatsuki (肩付き)
A method of attaching ruby at the upper right of each base character. (JIS Z 8125)
Kihon hanmen (基本版面)
The default dimensions of the main area of a typeset page specified by text direction, number of columns, character size, number of characters in a line, number of lines in a column, inter-line spacing and inter-column spacing. (JIS X 4051)
Line adjustment by hanging punctuation / Burasage gumi (ぶら下げ組)
A line breaking rule to avoid commas or full stops at a line head (which is prohibited in Japanese typography) by taking them back to the end of the previous line beyond the specified line length. (JIS Z 8125)
Line breaking rules / Kinsoku shori (禁則処理)
A set of rules to avoid prohibited layout in Japanese typography, such as “line-start prohibition rule,” “line-end prohibition rule,” inseparable or unbreakable character sequences and so on. (JIS Z 8125)
Line feed / Gyō okuri (行送り)
The distance between two adjacent lines measured by their reference points. (JIS Z 8125) Implementers can think of it as the line-height CSS property.
Line gap / Gyōkan (行間)
The smallest amount of space between adjacent lines.
Line-end prohibition rule / Gyōmatsu kinsoku (行末禁則)
A line breaking rule that prohibits specific characters at a line end. (JIS Z 8125)
Line-start prohibition rule / Gyōtō kinsoku (行頭禁則)
A line breaking rule that prohibits specific characters at a line head. (JIS Z 8125)
Mincho typeface / Minchōtai (明朝体)
A major style of Japanese font. Horizontal lines are thin and vertical lines are thick. At the start and end positions of a stroke, there are triangular elements representing the press of a brush. Kana are designed to balance the Kanji design. In Japanese text setting, the Mincho typeface is most frequently used for main text, especially for long text. Similar to “serif” in Western typography.
Mono-ruby (モノルビ)
A method of ruby distribution where a run of ruby text is attached to each base character. (JIS Z 8125)
Naka tobira (中扉)
A recto or a page inserted to divide two different parts in a book. It often has a title or other text to describe the new part. (JIS Z 8125)
Nakatsuki (中付き)
A method of ruby character distribution where each ruby character is aligned to the vertical center of the corresponding base character in vertical writing mode, or to the horizontal center of the base character in horizontal writing mode. (JIS Z 8125)
Omotekei (表罫)
Thin width line. Usually about 0.12mm. (JIS Z 8125)
One-third-ruby / Sanbu rubi (三分ルビ)
Ruby characters, narrow enough so that three can fit within the width of a full-width base character.
Para-ruby (パラルビ)
A method of ruby annotation where ruby text is only attached to selected Kanji characters in the text. (JIS Z 8125)
Parallel note / Heiretsuchū (並列注)
Areas of notes are kept when the kihon-hanmen is designed. Related notes are set in these areas, with page unit or spread unit. Parallel-note is the general name for head note (in vertical writing mode), foot note (in vertical writing mode) and side note (in horizontal writing mode).
Proportional / Puropōshonaru (プロポーショナル)
A characteristic of a font where character advance is different per glyph. (JIS Z 8125)
Ruby (ルビ)
Supplementary small characters indicating pronunciation, meaning, etc. for the character or the block of characters they annotate. (JIS Z 8125) (Sometimes these annotations are referred to as “furigana.”)
Run-in heading / Dōgyōmidashi (同行見出し)
A kind of heading style in which the main text continues right after the heading without a line break.
Sidenote / Bōchū (傍注)
A kind of note. In vertical writing mode, notes are handled per spread and set from the left end of the left page in a smaller font size than the main text. In horizontal writing mode, an area is reserved beforehand on the right side or fore-edge side of the kihon-hanmen, and related notes are set in that area in a smaller font size than the main text.
Small kana / Kogaki no kana (小書きの仮名)
Kana with smaller letter faces to be used mainly for representing contracted sounds or prolonged vowels. (JIS Z 8125)
Solid setting / Beta gumi (ベタ組)
To arrange characters with no inter-character space between adjacent character frames.
Tate-chu-yoko / Tate chū yoko (縦中横)
To typeset a (small) group of characters horizontally within a vertical line of main text, e.g. abbreviations, digits, etc.
Touyou Kanji Table / Tōyō kanji hyō (当用漢字表)
The official list of Kanji characters established in 1946, which was designed to restrict the Kanji characters for general use in society to only those 1,850 specified in the list. The list together with other related tables was superseded by the Jouyou Kanji Table.
Tsumegumi (詰め組)
Adjustment of inter-character space by making the distance between the letter face of adjacent characters shorter than that produced by solid setting. (JIS Z 8125)
Unbreakable characters rule / Bunkatsu kinshi (分割禁止)
A line breaking rule that prohibits breaking a line between consecutive dashes or leaders, or between other specific combinations of characters.
Underline / Kasen (下線)
A line drawn under a character or a run of text in horizontal writing mode. (JIS Z 8125)
Urakei (裏罫)
Thick width line. Usually about 0.4mm. (JIS Z 8125)
Vertical writing mode / Tate gumi (縦組)
The process or the result of arranging characters on a line from top to bottom, of lines on a page from right to left, and/or of columns on a page from top to bottom. (JIS Z 8125)
Widow Adjustment / Danraku matsubi shori (段落末尾処理)
A method of line composition to adjust lines in a paragraph so that the last line consists of more than a given number of characters.
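
Two of the Japanese entries above describe small calculations that can be sketched in code: the gyodori formula and line adjustment by hanging punctuation (burasage gumi). This is an illustration under stated assumptions, not an implementation of JIS X 4051: the function names are hypothetical, the breaker counts characters rather than measuring advances, and only the ideographic comma and full stop are allowed to hang, one per line.

```python
# Characters allowed to hang past the line end (assumption: comma and full stop only).
HANGING = "、。"


def gyodori_extent(line_width, line_gap, n_lines):
    """Block-direction space reserved by gyodori for n_lines lines.

    Implements the glossary formula:
    line width x number of lines + line gap x (number of lines - 1).
    """
    if n_lines < 1:
        raise ValueError("n_lines must be at least 1")
    return line_width * n_lines + line_gap * (n_lines - 1)


def break_lines_with_hanging(text, chars_per_line):
    """Greedy line breaking with a very simplified burasage rule.

    If the character that would start the next line is a comma or full
    stop, it is kept at the end of the current line instead, beyond the
    nominal line length.
    """
    lines = []
    i = 0
    while i < len(text):
        end = i + chars_per_line
        if end < len(text) and text[end] in HANGING:
            end += 1  # let the punctuation hang on the current line
        lines.append(text[i:end])
        i = end
    return lines
```

For example, a three-line heading space with a line width of 9 units and a line gap of 6 units reserves 9 × 3 + 6 × 2 = 39 units, which keeps the text that follows aligned to the line grid of the kihon-hanmen.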