Syntax for applying ruby annotations (CJK texts) (add \rb ...\rb*) #31
Updated: January 2018
In the course of implementing support for ruby editing in one application (Paratext), the specification and markup needs were clarified and refined. Note the following updated proposal. The original proposal has been retained at the end of this description.
For example, if the base text being glossed is a phrase of two Han characters (B), then the ruby gloss text (gg) may contain two elements, one glossing each of the base text characters that make up the phrase.
This syntax allows the decision to present glosses by phrase or by group to be made at the publication stage, rather than pre-determined during translation.
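The deferred choice between the two presentations can be sketched roughly as follows. This is a minimal illustration, not part of the proposal: it assumes the base text and the gloss string have already been extracted from the marker, that gloss pieces are separated by a colon, and that each base character is one code point. The function names and the sample word 東京 (read とうきょう) are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

SEPARATOR = ":"  # assumed separator between gloss pieces


def by_character(base: str, gloss: str) -> List[Tuple[str, str]]:
    """Pair each base character with its own gloss piece."""
    pieces = gloss.split(SEPARATOR)
    if len(pieces) != len(base):
        raise ValueError("expected one gloss piece per base character")
    return list(zip(base, pieces))


def by_group(base: str, gloss: str) -> List[Tuple[str, str]]:
    """Gloss the whole phrase as one unit by joining the pieces."""
    return [(base, "".join(gloss.split(SEPARATOR)))]


# The same stored data supports either presentation at publication time.
per_char = by_character("東京", "とう:きょう")
grouped = by_group("東京", "とう:きょう")
```

Because the stored gloss keeps the piece boundaries, a publishing tool could call either function on the same data, which is the point of deferring the decision.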
Supporting a null gloss: Allow parts of the gloss to be empty
In order to preserve the whole phrase unit (rather than breaking off just the characters that have glosses), USFM needs some way to specify a null gloss piece. Since the separator character (colon
Examples of omission:
Second and fourth base characters are unglossed:
Second base character is unglossed:
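As a rough sketch of how a tool might read null gloss pieces, the snippet below assumes that an empty piece between (or after) colon separators marks an unglossed base character. That reading, the function name, and the placeholder strings (`ABCD` for base characters, `g1`, `g3` for gloss pieces) are assumptions made for illustration and are not taken from the proposal's own examples.

```python
from typing import List, Optional, Tuple


def pair_with_nulls(base: str, gloss: str, sep: str = ":") -> List[Tuple[str, Optional[str]]]:
    """Pair base characters with gloss pieces, mapping empty pieces to None."""
    pieces = gloss.split(sep)  # str.split keeps empty strings between separators
    if len(pieces) != len(base):
        raise ValueError("expected one (possibly empty) piece per base character")
    return [(ch, piece or None) for ch, piece in zip(base, pieces)]


# Second and fourth base characters unglossed (placeholder data).
four = pair_with_nulls("ABCD", "g1::g3:")
# Second base character unglossed (placeholder data).
two = pair_with_nulls("AB", "g1:")
```

Keeping the empty pieces in place preserves the one-to-one alignment between base characters and gloss pieces, so the phrase stays a single unit.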
A companion USX 3.0 proposal exists at: ubsicap/usx#24
A result of this proposal is the corresponding proposal to deprecate the existing pronunciation marker
Han characters: Chinese, Japanese, and Korean texts share a common set of characters. In Japanese these are called Kanji (literally “Han characters”). There are several thousand of these characters to learn. New readers, or readers new to the Biblical texts, may find it very difficult to recognize which Chinese or Japanese word corresponds to the Han character(s) they are seeing.
Ruby glosses: To help these readers, some Bibles are printed with glosses in small phonetic characters (e.g. Japanese uses the hiragana syllabary) placed above the more symbolic Han characters to tell the reader how to pronounce them. These phonetic characters are generically called "ruby glosses" or "rubies". In Japanese this technique is called Furigana.
Note: These are character glosses regarding pronunciation, not linguistic glosses per se, though they do effectively indicate the word’s meaning.
Ruby for characters and phrases
A single piece of gloss data handles individual characters well, but not phrases. With only one piece, the typesetting decision to gloss phrases "by character" or "by group" must be made too early, since we must choose between (a) identifying the phrase or (b) identifying word glosses (the pieces of the phrase gloss).
Note: In cases where the annotation text is associated with only a single preceding ideogram, only the