diff --git a/index.html b/index.html
index ec4442a..4bb2297 100644
--- a/index.html
+++ b/index.html
@@ -223,18 +223,18 @@
Unicode Bidirectional Algorithm
- [[!BIDI]] details an algorithm for rendering right-to-left text and covers a myriad of
- situations in mixing different kinds of characters. A simpler explanation of the basics of
- the algorithm exists in the W3C article Unicode
+ Unicode Bidirectional Algorithm (or
+ bidi algorithm, for short) [[!BIDI]] details an algorithm for
+ rendering right-to-left text and covers a myriad of situations in mixing different kinds of
+ characters. A simpler explanation of the basics of the algorithm exists in the W3C article
+ Unicode
Bidirectional Algorithm basics. [[UBA-BASICS]] You can refer to these documents for more
information about Unicode’s bidirectional algorithm. A brief overview of the bidirectional (bidi for short)
- algorithm follows, because the direction is an essential part of how Arabic script is
- used. A brief overview of the bidirectional algorithm follows, because the direction is
+ an essential part of how Arabic script is used. The characters of a text are digitally stored and transferred in the same order that they
@@ -303,9 +303,10 @@ Unicode has a bidi category property defined for each character
- that is used to determine the direction of each character. All the Arabic letters are marked
- as right-to-left characters, while Latin characters have the left-to-right category. Unicode has a bidi class (or bidi
+ type) property defined for each character that is used to determine the direction of
+ each character. All the Arabic letters are marked as right-to-left characters, while Latin
+ characters have the left-to-right category. Some characters, mostly punctuations, are neutral. The
@@ -347,19 +348,20 @@ There are different categories of letters based on their joining
- behavior:
- There are different categories of letters based on their joining behavior:Direction
-
Joining Forms
-
@@ -381,14 +383,14 @@ Joining Forms
-
@@ -398,50 +400,49 @@ Joining Categories
-
+
Besides ZWJ, there’s another special Unicode character, - U+0640 ARABIC TATWEEL, which enforces joining behavior (join - causing) on letters next to it. But, in contrast to ZWJ, - TATWEEL has a glyph shape, looking like a hyphen and usually - as wide as the SPACE glyph, which connects to the letters on the main joining line - (a.k.a. base-line). So, using TATWEEL would give a similar - Joining Enforcement behavior, but has a side effect of wider length for the letter, which - is not always desired. That’s why it’s highly recommended to only use ZWJ for joining control.
+Besides ZWJ, there’s another special Unicode character, U+0640 ARABIC TATWEEL, which enforces joining behavior (join causing) on + letters next to it. But, in contrast to ZWJ, TATWEEL has a glyph shape, + looking like a hyphen and usually as wide as the SPACE glyph, which connects to the + letters on the main joining line (a.k.a. base-line). So, using TATWEEL would give + a similar Joining Enforcement behavior, but has a side effect of wider length for the + letter, which is not always desired. That’s why it’s highly recommended to only use + ZWJ for joining control.
Two enforcement methods mentioned above can be combined together to form a - Joining-Disjoining Enforcement method, that enables - Joining Rule 3 for cases when there’s a dual-joining/right-joining letter after a - join-to-left letter, which should not be joined to its - previous letter.
+ Joining-Disjoining Enforcement method, that enables Joining Rule 3 for cases when there’s a + dual-joining/right-joining letter after a join-to-left letter, which + should not be joined to its previous letter.A sequence of letters that join together are called a Joining - Segment. Regardless of language, joining segments have no - direct relationship to syllables.
+A sequence of letters that join together is called a Joining Segment. + Regardless of language, joining segments have no direct relationship to + syllables.
-Two types of joining segments exist: closed and open.
+Two types of joining segments exist: closed and open.
Joining Segments usually have a closed form, meaning that they start in a non-join-to-right form and end in a non-join-to-left - form. Closed joining segments are the result of segments - either start and end with their normal behavior (Joining Rule - 1), or by disjoining enforcement (Joining Rule 2).
+Joining Segments usually have a closed form, meaning that they start in a + non-join-to-right form and end in a non-join-to-left form. Closed + joining segments result either from segments that start and end with their normal + behavior (Joining Rule 1), or from disjoining enforcement (Joining + Rule 2).
There are two possible types of closed segments:
Under the certain cases, as noted in Joining Rules 3 and 4, - joining segments can start with a join-to-right form, or end with an join-to-left - form, or both.
+In certain cases, as noted in Joining Rules 3 + and 4, joining segments can start with a + join-to-right form, or end with a join-to-left form, or both.
There are three possible types of these segments:
Arabic Letters, two Joining Control Characters (ZWNJ and ZWJ), and TATWEEL are the only characters used in the Arabic writing system with - joining behavior.
+Arabic Letters, two Joining Control Characters (ZWNJ and ZWJ), and + TATWEEL are the only characters used in the Arabic writing system with joining + behavior.
Arabic diacritics, other Unicode non-spacing marks, and most - Unicode format control characters are considered transparent in joining behavior.
+ Unicode format control characters are considered transparent in joining behavior.All other Unicode characters in Arabic script (as well as Latin and many other major scripts) are non-joining and do not take any joining forms other than Isolated.
-For the details of how Arabic Cursive Joining algorithm +
For more details on the Arabic Cursive Joining algorithm, please refer to chapter Middle East-I — Modern and Liturgical Scripts of The Unicode Standard. [[!UNICODE]]
@@ -824,12 +822,12 @@In general we group under the generic term Naskh - (copy/inscription) the scripts which are meant for reading at smaller sizes and are - suitable for books and texts to be read, e.g. the Korʼan, and as - Kufic the highly stylized font styles used for ornamentation and - more styled writings. Nevertheless, the rich evolution of the Arabic script led to the - distinctive enumeration of a number of additional named styles.
+In general we group under the generic term Naskh (copy/inscription) the + scripts which are meant for reading at smaller sizes and are suitable for books and texts + to be read, e.g. the Korʼan, and as Kufic the highly stylized font + styles used for ornamentation and more styled writings. Nevertheless, the rich evolution of + the Arabic script led to the distinctive enumeration of a number of additional named + styles.
Similarly, two other generic terms are used to classify styles : Mabsut (wa
@@ -1042,205 +1040,195 @@ Arabic Script and Typography
simplest Naskh style?
Multi-level baselines -
+Letters may join through a finely inclined line
+Letters may join through a finely inclined line
-or two, square-ended lines
+or two, square-ended lines
-Multilevel baselines don't occur in all fonts. The above examples use the Arabic - Typesetting font. Compare those examples to to more typical fonts:
+Multilevel baselines don't occur in all fonts. The above examples use the Arabic + Typesetting font. Compare those examples to more typical fonts:
-
-
+
Multi-context joining -
+Rendering of letters depends not only on their place in the word (initial, medial, - final) but also on their neighboring letters, i.e. the letter they join with. Each - letter has a different appearance in each combination.
+Rendering of letters depends not only on their place in the word (initial, medial, + final) but also on their neighboring letters, i.e. the letters they join with. Each letter + has a different appearance in each combination.
-Fonts don't always comply with or respect this kind of tuning. To do so, fonts need many glyphs in order to adapt to each - context. In more modern typefaces some of these connections are implemented by - ligatures, but ligatures can't capture or cover all joining behavior.
+Fonts don't always comply with or respect this kind of tuning. To do so, fonts need many glyphs in order to adapt to each + context. In more modern typefaces some of these connections are implemented by ligatures, + but ligatures can't capture or cover all joining behavior.
-In the two left most words, the initial noon differs in that one raises a kind of
- stroke. This property of raising a stroke is common for a number of letters (beh, teh,
- noon, theh) which are taller than their connected letters in order to be distinguished
- in some contexts, such as vs.
, or to resolve ambiguity. See also the section about teeth letters
- below.
In the two leftmost words, the initial noon differs in that one raises a kind of
+ stroke. This property of raising a stroke is common for a number of letters (beh, teh,
+ noon, theh) which are taller than their connected letters in order to be distinguished in
+ some contexts, such as vs.
, or to resolve ambiguity. See also the section about teeth letters
+ below.
Words as groups of letters -
+A word shape is not (only) a "horizontal" connections of letters, but of groups of - letters (syntagmes).
+A word shape is not (only) a "horizontal" connection of letters, but of groups of + letters (syntagmes).
-Example two words in some nice Naskh font.
+Example: two words in a nice Naskh font.
-![]() |
+ |
![]() |
-
To compare with the same words in more usual font:
+Compare with the same words in a more usual font:
-![]() |
-
Group combinations cannot be covered by general or usual ligatures.
-Group combinations cannot be covered by general or usual ligatures.
+ -Vertical joining
+Groups of letters may also "join" vertically (top down) instead of right to left. - And not all fonts permit this.
+Groups of letters may also "join" vertically (top down) instead of right to left. And + not all fonts permit this.
-vs. | +vs. | -||
Joining happens almost vertical | +|||
Joining happens almost vertical | -- | ++ | -Joining happens horizontal | -
Once again, some fonts try standard ligatures, but this is not ligature. This is - rather (good) writing practice/style.
+Once again, some fonts try standard ligatures, but this is not a ligature. This is + rather (good) writing practice/style.
-One should note that all this characteristics has not only an aesthetic side, but - also play a role in justification. It is at the discretion of (hand writing) authors to - chose the best kind of joining to suit the desired line width. Should then be a general - rule on that. But to achieve such justification would require sophisticated - algorithms.
-One should note that all these characteristics have not only an aesthetic side, but also + play a role in justification. It is at the discretion of (handwriting) authors to choose + the best kind of joining to suit the desired line width. Should then be a general rule on + that. But achieving such justification would require sophisticated algorithms.
+ -The so called teeth letters. -
+Letters having uniform medial shape, align in a kind of teeth.
+Letters having uniform medial shape, align in a kind of teeth.
-Even in the teeth context letter shape may vary. It's not the same letters (in red) - which raise the stroke in the two figures.
-Even in the teeth context letter shape may vary. It's not the same letters (in red) + which raise the stroke in the two figures.
+ @@ -1571,12 +1559,13 @@Of the four basic justification methods (flush left, flush right, justified, and centered), justified is the most challenging, as it requires changing the widths of the lines - to a pre-defined measure. Measure refers to the width of a column - of text. In a justified paragraph the width of all the lines should be the same as the - paragraph’s measure (except, of course, the last line).
+ to a pre-defined measure. Measure refers to the width of a column of text. In a + justified paragraph the width of all the lines should be the same as the paragraph’s measure + (except, of course, the last line).In Arabic there are six mechanisms for changing the width of a line of text. Each one has @@ -2183,10 +2170,9 @@
An important factor in the application of these mechanisms is their success in creating an - even color. The color of the text refers to the amount of - ink/blackness used to print or show a block of text. Color describes the density of the text - against its background. Poorly justifying paragraphs can create uneven distribution of - color.
+ even color. The color of the text refers to the amount of ink (or blackness) used + to print or show a block of text. Color describes the density of the text against its + background. Poorly justifying paragraphs can create uneven distribution of color.These mechanisms are not exclusive. Quite the contrary, they are commonly used