Add Journal of Semitic Studies schema #73

charlesLoder · 2023-06-14T02:42:32Z

I don't think it's possible to recognize short/long vowels(?) to distinguish e.g. i and ī, so I used ī for hireq-yod and i for plain hireq, which may require some manual fixes from the user (what would be helpful for this is a separate field for hireq+meteg and qibbuts+meteg).

Could you provide examples? I think the ADDITIONAL_FEATURES may be able to account for that.

The style guide also prescribes that qamets before hatef qamets be transliterated as long qamets: בַּֽצָּהֳרָֽיִם should be baṣṣå̄hå̆rå̄yim, not baṣṣåhå̆rå̄yim. I am not sure if this can be specified in the current system.

Interesting, so they say the qamets under the tsade should be a qamets qatan, but they maintain a distinction between qamets qatan and qamets gadol in transliteration. The stylesheet says:

This transcription of the quality of the vowels corresponds to the Tiberian reading tradition of Biblical Hebrew,
with the exception of the shewa. The distribution of vocalic and silent shewa, however, follows the Tiberian
tradition.

Given that Khan is the editor, I would assume that means there is no distinction between qamets qatan and qamets gadol. Maybe I'll have to pry into this one.

Tsere-he is not recognized correctly, I'm not sure why: וְהִנֵּ֥ה should be wǝhinnē, not wǝhinnɛ.

I'll research that.

Let me know what you think of the two questions above.

Initial JSON

{
  "VOCAL_SHEVA": "ǝ",
  "HATAF_SEGOL": "ɛ̆",
  "HATAF_PATAH": "ă",
  "HATAF_QAMATS": "å̆",
  "HIRIQ": "i",
  "TSERE": "ē",
  "SEGOL": "ɛ",
  "PATAH": "a",
  "QAMATS": "å̄",
  "HOLAM": "ō",
  "QUBUTS": "u",
  "DAGESH": "",
  "DAGESH_CHAZAQ": true,
  "MAQAF": " ",
  "PASEQ": "",
  "SOF_PASUQ": "",
  "QAMATS_QATAN": "å",
  "FURTIVE_PATAH": "a",
  "HIRIQ_YOD": "ī",
  "TSERE_YOD": "ē",
  "SEGOL_YOD": "ɛ",
  "SHUREQ": "ū",
  "HOLAM_VAV": "ō",
  "QAMATS_HE": "å̄",
  "SEGOL_HE": "ɛ",
  "TSERE_HE": "ē",
  "MS_SUFX": "å̄yw",
  "ALEF": "ʾ",
  "BET_DAGESH": "b",
  "BET": "ḇ",
  "GIMEL": "ḡ",
  "GIMEL_DAGESH": "g",
  "DALET": "ḏ",
  "DALET_DAGESH": "d",
  "HE": "h",
  "VAV": "w",
  "ZAYIN": "z",
  "HET": "ḥ",
  "TET": "ṭ",
  "YOD": "y",
  "FINAL_KAF": "ḵ",
  "KAF": "ḵ",
  "KAF_DAGESH": "k",
  "LAMED": "l",
  "FINAL_MEM": "m",
  "MEM": "m",
  "FINAL_NUN": "n",
  "NUN": "n",
  "SAMEKH": "s",
  "AYIN": "ʿ",
  "FINAL_PE": "p̄",
  "PE": "p̄",
  "PE_DAGESH": "p",
  "FINAL_TSADI": "ṣ",
  "TSADI": "ṣ",
  "QOF": "q",
  "RESH": "r",
  "SHIN": "š",
  "SIN": "ś",
  "TAV": "ṯ",
  "TAV_DAGESH": "t",
  "DIVINE_NAME": "yhwh",
  "SYLLABLE_SEPARATOR": "",
  "ADDITIONAL_FEATURES": [],
  "STRESS_MARKER": {
    "location": "",
    "mark": ""
  },
  "longVowels": true,
  "qametsQatan": true,
  "sqnmlvy": true,
  "wawShureq": true,
  "article": true
}

camilstaps · 2023-06-14T12:46:03Z

Thanks for handling this so quickly!

An example where recognizing hireq-meteg could be helpful would be הִֽתְקַבְּצ֔וּ. With the schema above this is transliterated as hiṯǝ-, while it should be hīṯǝ-. The system does correctly recognize that the meteg means that the shewa is vocal, which is very nice. I think there are still other cases where an etymologically long ī or ū is not written with either vowel letter or meteg, and that JSS would want these to be transcribed with macron as well. But there is no way to recognize these cases (except for having a built-in dictionary).
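
For illustration, here is a minimal sketch of how hireq + meteg or qibbuts + meteg could be detected in the text of a single syllable. The helper name `longVowelByMeteg` is hypothetical and not part of the library; it only inspects a plain string and assumes the meteg sits on the same consonant as the vowel.

```javascript
// Hypothetical helper, not part of hebrew-transliteration.
const HIREQ = "\u05B4";

// Hireq (U+05B4) or qibbuts (U+05BB), then optionally accents, dagesh,
// rafe, or shin/sin dots, then meteg (U+05BD).
const VOWEL_WITH_METEG =
  /([\u05B4\u05BB])[\u0591-\u05AF\u05BC\u05BF\u05C1\u05C2]*\u05BD/u;

// Returns "ī" or "ū" if the syllable has hireq/qibbuts + meteg, else null.
function longVowelByMeteg(syllableText) {
  // NFC puts combining marks in canonical order, so the vowel point
  // always precedes the meteg regardless of how the text was typed.
  const match = syllableText.normalize("NFC").match(VOWEL_WITH_METEG);
  if (!match) return null;
  return match[1] === HIREQ ? "ī" : "ū";
}

// he + hireq + meteg, the first consonant of the example word above
console.log(longVowelByMeteg("\u05D4\u05B4\u05BD")); // "ī"
// he + hireq without meteg
console.log(longVowelByMeteg("\u05D4\u05B4")); // null
```

A check like this only covers the cases marked by meteg; etymologically long vowels without meteg remain undetectable this way.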

The style sheet specifically gives the example of צָחֳרַיִם, which should be transliterated ṣå̄ḥå̆rayim. I myself have learned to pronounce qamets before hatef qamets as short, but I don't know based on what tradition that is, and I cannot find this rule in Khan 2020. In Blau 2010 I do find the rule, but only for the Sephardic tradition (in §3.5.3.4; but see also the note which says that in genuine Sephardic pronunciation the qamets is unaffected by a following hatef qamets). However, in §3.5.3.7 Blau writes that "The Tiberian vocalization marks only qualitative differences and not quantitative ones (with the exception of the ultra-short vowels ...)", and then I'm confused why the JSS system distinguishes qamets gadol and qamets qatan at all. I'm sorry to not be able to be of more help (but let me know if you need a copy of Blau).

charlesLoder · 2023-06-14T20:30:23Z

I think there are still other cases where an etymologically long ī or ū is not written with either vowel letter or meteg, and that JSS would want these to be transcribed with macron as well. But there is no way to recognize these cases (except for having a builtin dictionary).

Yeah, SBL requires the same, and there is no way to do it w/o a dictionary. I tossed around the idea once, maybe I'll try to incorporate one.
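
As a rough sketch of the dictionary idea: look a word up in a table of known exceptions first, and fall back to rule-based transliteration otherwise. All names here (`OVERRIDES`, `lookupKey`, `transliterateWithOverrides`) are hypothetical, and the single entry is only an illustration that has not been checked against the JSS style sheet.

```javascript
// Cantillation accents (U+0591 to U+05AF) are ignored for lookup.
const ACCENTS = /[\u0591-\u05AF]/gu;

// Normalize mark order and drop accents so that differently typed
// copies of the same word share one key.
function lookupKey(word) {
  return word.normalize("NFC").replace(ACCENTS, "");
}

// Hypothetical exception table: pointed Hebrew word -> transliteration.
// The entry (dalet, dagesh, qamets, vav, hireq, dalet) is illustrative only.
const OVERRIDES = new Map([
  [lookupKey("\u05D3\u05BC\u05B8\u05D5\u05B4\u05D3"), "då̄wīḏ"],
]);

// `fallback` stands in for whatever rule-based transliteration is used.
function transliterateWithOverrides(word, fallback) {
  return OVERRIDES.get(lookupKey(word)) ?? fallback(word);
}
```

Keying on the normalized, accent-free form keeps the table small, at the cost of treating words that differ only in accents as identical.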

if you need a copy of Blau

If you could that would be wonderful! I'll send you a Twitter DM

charlesLoder · 2023-06-16T14:21:10Z

Acc. to the senior editor:

In the Tiberian reading tradition there is no distinction in the quality of qameṣ gadol and qameṣ qaṭan. However, there is a difference in the quantity, which is indicated in the different diacritics in our transcription system.

In Khan's Tiberian Pronunciation v1, the difference between qamets qatan and qamets gadol (though he never uses those terms) is length.

Qatan
See: קָדְשֵׁי [qɔðˈʃeː]
and: כָּל־ [kʰɔl]
with short vowels as indicated by no ː mark.

Gadol
The "typical" qamets usually has a ː mark.
See: יָמִים [jɔːˈmiːim]

As for the schema, I think an ADDITIONAL_FEATURE may work. Something like this untested code:

{
  ADDITIONAL_FEATURES: [
    {
      FEATURE: "syllable",
      // match any syllable that contains a qamets qatan character (U+05C7);
      // the `u` flag is needed for the \u{...} escape to work in a regex
      HEBREW: /\u{05C7}/u,
      TRANSLITERATION: (syllable) => {
        const next = syllable?.next?.value?.text;
        // if the next syllable includes a hatef qamets (U+05B3), replace the qamets qatan with a regular qamets (U+05B8)
        if (next && next.includes("\u05B3")) {
          return syllable.text.replace("\u{05C7}", "\u{05B8}");
        }
        return syllable.text;
      }
    }
  ]
}
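
The callback logic above can be exercised without the library by using mock objects. This is a sketch under assumptions: `mockSyllables` is a hypothetical stand-in that only mimics the `text` and `next.value` shape the callback reads, and the split of the Hebrew into syllables is done by hand.

```javascript
const QAMETS = "\u05B8";
const HATEF_QAMETS = "\u05B3";
const QAMETS_QATAN = "\u05C7";

// Same logic as the TRANSLITERATION callback, as a standalone function.
function lengthenQametsBeforeHatef(syllable) {
  const next = syllable?.next?.value?.text;
  if (next && next.includes(HATEF_QAMETS)) {
    return syllable.text.replace(QAMETS_QATAN, QAMETS);
  }
  return syllable.text;
}

// Hypothetical stand-in for linked syllables: each object has `text`
// and a `next` node whose `value` is the following syllable.
function mockSyllables(texts) {
  const nodes = texts.map((text) => ({ text, next: null }));
  nodes.forEach((node, i) => {
    if (i + 1 < nodes.length) node.next = { value: nodes[i + 1] };
  });
  return nodes;
}

// tsade + qamets qatan, followed by het + hatef qamets
const [first, second] = mockSyllables(["\u05E6\u05C7", "\u05D7\u05B3"]);
console.log(lengthenQametsBeforeHatef(first) === "\u05E6" + QAMETS); // true
console.log(lengthenQametsBeforeHatef(second) === second.text); // true
```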

camilstaps · 2023-06-16T18:28:46Z

That's great! I didn't realize an ADDITIONAL_FEATURE could contain code. In that case, something similar should definitely work to recognize hireq/qibbuts + meteg as well.

If you want, I can have a go, but it may take a while, I have a lot on my plate at the moment. Totally understandable if that's also the case for you of course!

charlesLoder · 2023-06-16T18:32:56Z

@camilstaps have at it! I'm totally swamped too :)

There's a folder for schema tests; that would be my only ask. You can just duplicate sblSimple and add tests for these special cases.

charlesLoder added the enhancement New feature or request label Jun 14, 2023

charlesLoder self-assigned this Jun 14, 2023

charlesLoder mentioned this issue Jun 14, 2023

JSS schema charlesLoder/hebrewTransliteration#71

Closed

charlesLoder added this to the v2.5.0 milestone Jun 16, 2023

camilstaps mentioned this issue Jun 21, 2023

JSS schema #74

Merged

charlesLoder closed this as completed in 44476bd Jun 26, 2023
