Syntax for assigning word-level linking attributes #25

Closed
klassenjm opened this Issue Jul 7, 2016 · 1 comment

klassenjm commented Jul 7, 2016

Proposal

  • Specify a set of attributes, following the syntax for word-level descriptive attributes (#24), for assigning linking properties to USFM character level elements.
    • Linking attributes may be added to any character level element.
    • See also USFM 3.0 proposal for "Link text (add \jmp ...\jmp*)" (#30)

A companion USX 3.0 proposal exists at: ubsicap/usx#19

General Syntax

Names given to linking attributes begin with link-, distinguishing them from any other descriptive attributes (#24). As with descriptive attributes, linking attributes are separated from the text content by a vertical bar |. Attributes are listed as pairs of name and corresponding value using the syntax:

link-<attribute> = "value"

Linking attributes are combined with any other descriptive attributes added to the same marker. The order of attributes is not significant, although it would benefit readability to have descriptive and linking attributes grouped together.
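The syntax above can be sketched in a few lines of Python. This is only an illustration of the proposal, not a reference parser: the function name `split_marker_content` is hypothetical, the `lemma` attribute in the usage note is a made-up stand-in for a descriptive attribute, and tolerating spaces around `=` is an assumption based on the `link-<attribute> = "value"` form shown above.

```python
import re

# One name="value" pair. Spaces around "=" are tolerated (assumption).
ATTR_PAIR = re.compile(r'([A-Za-z][\w-]*)\s*=\s*"([^"]*)"')


def split_marker_content(content):
    """Split a character marker's content into text and attributes.

    `content` is whatever sits between the opening and closing marker,
    for example: RSV|link-href="prj:RSV52 GEN 1:1"
    Returns (text, linking_attrs, descriptive_attrs).
    """
    # Everything before the first vertical bar is text content.
    text, bar, attr_part = content.partition("|")
    attrs = dict(ATTR_PAIR.findall(attr_part)) if bar else {}
    # Names starting with "link-" are linking attributes; the rest are descriptive.
    linking = {k: v for k, v in attrs.items() if k.startswith("link-")}
    descriptive = {k: v for k, v in attrs.items() if not k.startswith("link-")}
    return text, linking, descriptive
```

For instance, `split_marker_content('gracious|lemma="grace" link-href="#grace"')` separates the hypothetical descriptive attribute `lemma` from the linking attribute `link-href`, regardless of the order in which they are written.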

Proposed Attribute List

  • link-href - identifies the resource being linked to as a URI.
    • Additional USFM specified URI prefixes are:
      • prj: + standard USFM / USX scripture reference syntax (book, chapter, verse). Example: prj:RSV52 MAT 3:1-4
    • A link reference within the same project text does not require a URI prefix but must follow the standard USFM / USX scripture reference syntax. Example: MAT 3:1-4.
    • The resource may be identified by unique id. Example: #article-Ruth or prj:GNTSB #article-Ruth
  • link-title - plain text describing the remote resource such as might be shown in a tooltip.
  • link-id - a unique identifier for this content location (an anchor).

The set of URI prefixes used within a link-href attribute can be extended beyond the set predefined for USFM 3.0. Any user-defined URI prefix must begin with x-.
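To make the different kinds of link-href value concrete, here is a rough Python sketch that sorts a value into one of the categories listed above. Several details are assumptions rather than part of the proposal: the function name `classify_href` is hypothetical, the scripture reference pattern is a deliberate simplification (three-character book code, chapter, optional verse or verse range), and user-defined `x-` prefixes are assumed to end with a colon in the same way as `prj:`.

```python
import re

# Simplified scripture reference, e.g. "MAT 3:1-4" (assumption, not the full syntax).
SCRIPTURE_REF = re.compile(r'^[A-Z0-9]{3} \d+(:\d+(-\d+)?)?$')


def classify_href(href):
    """Return (kind, qualifier, target) for a link-href value."""
    if href.startswith("prj:"):
        # "prj:" + project name, then a reference or #id after a space.
        project, _, target = href[4:].partition(" ")
        return ("project", project, target)
    if href.startswith("x-"):
        # User-defined prefix; the colon separator is an assumption.
        prefix, _, rest = href.partition(":")
        return ("user-defined", prefix, rest)
    if href.startswith("#"):
        # Unique id of a target within the same project.
        return ("id", None, href)
    if SCRIPTURE_REF.match(href):
        # Unprefixed reference within the same project text.
        return ("scripture", None, href)
    # Anything else is treated as an ordinary URI or relative path.
    return ("uri", None, href)
```

With this sketch, `classify_href("prj:GNTSB #article-Ruth")` yields `("project", "GNTSB", "#article-Ruth")`, while `classify_href("figures/storehouse.png")` falls through to the generic `"uri"` case.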

Examples

Link to other project text

The traditional translation of verse 1, as given in 
\jmp RSV|link-href="prj:RSV52 GEN 1:1" link-title="Revised Standard Version"\jmp*, 
may be appropriate.

Link to illustration / media

Storehouses, as used here, refers to large buildings with walls and roof, where grain was 
kept until needed. (See illustration: \jmp Storehouse|link-href="figures/storehouse.png" 
link-title="Ancient storehouse"\jmp*)

Assigning an identifier (anchor)

In this example the markup is a milestone, indicating a location but not marking text.

\q1 “Someone is shouting in the desert,
\q2 ‘Prepare a road for the Lord;
\q2 make a straight path for him to travel!’ ”
\esb \cat People\cat*
\ms \jmp |link-id="article-john_the_baptist"\jmp*John the Baptist
\p John is sometimes called the last “Old Testament prophet” because of the warnings he 
brought about God's judgment and because he announced the coming of God's “Chosen 
One” (Messiah).

Glossary entry including a link reference to an external URL

\w gracious|link-href="http://bibles.org/search/grace/eng-GNTD/all"\w*

Reference to named target (in same project)

\p \v 2-6a From Abraham to King David, the following ancestors are listed: Abraham, 
Isaac, Jacob, Judah and his brothers; then Perez and Zerah (their mother was Tamar*), 
Hezron, Ram, Amminadab, Nahshon, Salmon, Boaz (his mother was Rahab*), Obed (his 
mother was \jmp Ruth|link-href="#article-Ruth"\jmp*), Jesse, and King David. 

Nested markup

\ef - \fr 1.2-6a: \fq Ruth: \ft A Moabite (Ruth 1.4). Only outstanding 
women were normally included in Jewish genealogical lists. See article 
on \+jmp Ruth|link-href="#article-Ruth"\+jmp*\ef*

DavidHaslam commented Jan 4, 2017

Several Indic languages use the vertical bar at the end of a sentence in modern orthography in place of the Devanagari Danda.

The danda (with the same Unicode encoding) has also been used as a full stop in the scripts of several other Indic languages, including Bengali (pronounced as দাঁড়ি / dari), Gurmukhi, Gujarati, Hindi, Oriya, Tamil, Telugu, Kannada, and Malayalam. However, Western punctuation has largely replaced it in contemporary orthography.

I assume this to mean a Western full-stop in the place where you'd formerly find a Danda.

Even so, I have seen a digitized Punjabi Bible, with the text in the Gurmukhi script, in which the vertical bar (and either a double-bar character or simply two bars) has been keyed where the printed Bible would perhaps have had the Danda or Double Danda. Here's Genesis 1:1 to illustrate:

\v 1 ਆਦ ਵਿੱਚ ਪਰਮੇਸ਼ੁਰ ਨੇ ਅਕਾਸ਼ ਤੇ ਧਰਤੀ ਨੂੰ ਉਤਪਤ ਕੀਤਾ |

And a verse from the same chapter ending with two bars.

\v 5 ਪਰਮੇਸ਼ੁਰ ਨੇ ਚਾਨਣ ਨੂੰ ਦਿਨ ਆਖਿਆ ਤੇ ਅਨ੍ਹੇਰੇ ਨੂੰ ਰਾਤ ਆਖਿਆ ਸੋ ਸੰਝ ਤੇ ਸਵੇਰ ਪਹਿਲਾ ਦਿਨ ਹੋਇਆ ||

Yet the same digitization also has instances of the Danda used as a similar punctuation mark.
Job 6:19 reads:

\v 19 ਤੇਮਾ ਦੇ ਵਪਾਰੀ ਪਾਣੀ ਦੀ ਤਲਾਸ਼ ਕਰਦੇ ਨੇ ਸ਼ੇਬਾ ਦੇ ਮੁਸਾਫਰ ਆਸ ਨਾਲ ਤੱਕਦੇ ਨੇ।

This might simply reflect inconsistencies in keyboarding among different members of a team.

These observations prompt me to ask whether the use of a vertical bar as punctuation in some scripts might clash with the use of the vertical bar in the word-level linking syntax for USFM?

I'm guessing that this question may not have occurred to anyone at ubsicap, so I thought it was worth mentioning here.
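One way to gauge how exposed a given text is to such a clash is to look for vertical bars that fall between the opening and closing tags of a character marker. The Python sketch below does only that. It assumes, and this issue does not confirm it, that a bar is treated as an attribute separator only inside such a span; the function name `bars_inside_char_markers` is hypothetical, and the pattern is a rough approximation that also matches note markers such as `\ef ...\ef*`.

```python
import re

# A span like "\w text\w*" or "\+jmp text\+jmp*" (rough approximation).
CHAR_SPAN = re.compile(r'\\(\+?[a-z][a-z0-9]*) (.*?)\\\1\*', re.DOTALL)


def bars_inside_char_markers(usfm):
    """Return (marker, content) for each span whose content contains a bar."""
    return [
        (m.group(1), m.group(2))
        for m in CHAR_SPAN.finditer(usfm)
        if "|" in m.group(2)
    ]
```

Under that assumption, a verse that merely ends in a bar, as in the Genesis examples above, is not reported because `\v` has no closing tag. A bar keyed as punctuation inside a span such as `\add ...\add*` (an illustrative marker not discussed in this issue) would be reported and would need checking by hand.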

@klassenjm klassenjm modified the milestones: 3.0.rc2, 3.0.0 Oct 27, 2017

@klassenjm klassenjm closed this Apr 13, 2018
