In [1]:
from datetime import datetime
print(f'Notebook last updated on {datetime.now().__str__()}')

Notebook last updated on 2017-05-11 18:28:08.479105


# Parallel Suffixes and the Continuation of Actants in 4Q246

## Research Question:

Are there places, first in Biblical Aramaic, where a suffix in a clause agrees in person, gender, and number with a subject suffix in an immediately preceding clause, but has a different referent?

If so: What kinds of factors allow for distinguishing between the two participants? Are there any formal indicators?

## Motivation:

In [4Q246 (Aramaic)](http://www.deadseascrolls.org.il/explore-the-archive/manuscript/4Q246-1), a controversial figure appears who is called ברה די אל ("son of God") and בר עליון ("son of the Most High"). Much of the material before this point (column 2 line 1) is damaged, which complicates the identification of the figure. But the material which follows the reference leaves an open question as to whether the son of God is presented as a positive or negative character.

The relevant text of the passage in question is:

> ברה די אל יתאמר ובר עליון יקרונה <br>
> כזיקיא חזותא כן מלכותהן תהוה <br>
> שנין ימלכון על ארעא וכלא ידשון <br>
> ...<br>
> עד יקום עם אל

<br>
> The Son of God he will be called, and son of the Most High they will name him. <br>
> Like the sparks/comets of the vision, thus will their kingdom be. <br>
> Years they will reign over the land, and everyone they will trample. <br>
> ... <br>
> Until [the] people of God arise. (4Q246, Col 2, lns. 1-3, 5) <br>

A major issue in interpreting the "son of God" figure is the identification of the 3MP "they" between lines 1-2. Specifically, the text says, "...and son of the Most High **they** will name him" and then progresses to say, "Like the comets of the vision, thus will **their** kingdom be." The 3MP suffix on "their kingdom" refers to a tumultuous kingdom that ends after the people of God arise ("until the people of God..."). If the 3MP forms in both clauses share the same referent, the "son of God" must therefore be so named by a wicked people.

## Inventory of Parameters

Based on the problem described above, here is a set of parameters the code must find:

* Two contiguous clauses
    * If a clause is defective or casus pendens, it must be considered together with its last portion.
    * Clause 1 contains a verb with PGN X
    * Clause 2 contains a noun with a pronominal suffix with PGN X
    * No other PGN indicators can occur between PGN reference 1 and PGN reference 2
        * Exception: if the PGN indicator is a noun with a comparative particle such as כ inside of a casus pendens
    
Export results to a spreadsheet for examination. Hits will need to be manually labeled as Y or N on the question: is the referent the same? Include a separate column in the spreadsheet for this.

Begin first with Biblical Aramaic. If data is sparse, or if it is examined and further data is preferable, expand the search to Biblical Hebrew.

## See the Results of this Query: 

* [Aramaic](results/contiguous_suffixes_BA.csv)
* [Hebrew](results/contiguous_suffixes_BH_Quotation.csv)

In [2]:
# import tools
import collections, csv

# import TF and data
from tf.fabric import Fabric

# initialize TF
TF = Fabric(modules='hebrew/etcbc4c')

# load features
api = TF.load('''
              book chapter verse language
              prs prs_gn prs_nu prs_ps
              gn nu ps
              function pdp typ rela domain
              ''')

api.makeAvailableIn(globals())

This is Text-Fabric 2.3.6
Api reference : https://github.com/ETCBC/text-fabric/wiki/Api
Tutorial      : https://github.com/ETCBC/text-fabric/blob/master/docs/tutorial.ipynb
Data sources  : https://github.com/ETCBC/text-fabric-data
Data docs     : https://etcbc.github.io/text-fabric-data
Shebanq docs  : https://shebanq.ancient-data.org/text
Slack team    : https://shebanq.slack.com/signup
Questions? Ask shebanq@ancient-data.org for an invite to Slack
109 features found and 0 ignored
  0.00s loading features ...
   |     0.01s B book                 from /Users/Cody/github/text-fabric-data/hebrew/etcbc4c
   |     0.01s B chapter              from /Users/Cody/github/text-fabric-data/hebrew/etcbc4c
   |     0.01s B verse                from /Users/Cody/github/text-fabric-data/hebrew/etcbc4c
   |     0.12s B language             from /Users/Cody/github/text-fabric-data/hebrew/etcbc4c
   |     0.13s B prs                  from /Users/Cody/github/text-fabric-data/hebrew/etcbc4c
   |     0.21s

### PGN Matching Functions

First we need some functions to match person, gender, and number indicators between pronominal suffixes -> verbs or nouns -> verbs. The key function is `match_pgn()`, which is made only for third person matching. Afterward, we provide some testing examples to ensure the algorithm works.

In [3]:
def get_pgn(word, pronom=False):
    '''
    Return a person, gender, number (PGN) tuple 
    for a word or pronominal suffix.
    '''
    # return word PGN tuple
    if not pronom:
        return (F.ps.v(word), F.gn.v(word), F.nu.v(word))
    
    # return pronominal suffix PGN tuple
    else:
        return (F.prs_ps.v(word), F.prs_gn.v(word), F.prs_nu.v(word))
    
    
def match_pgn(main_pgn, cmp_pgn):
    '''
    Return True/False for person, gender, and number agreement between:
        * a third person pronominal suffix and verbs
        OR
        * a third person pronominal suffix and nouns
    Requires two tuples formatted as: (person, gender, number)
        for the pronominal PGN and the compared PGN (verb or noun)
    '''
    # label pgn data
    main_ps, main_gn, main_nu = main_pgn
    cmp_ps, cmp_gn, cmp_nu = cmp_pgn
    
    # check the parameters for p3 subject/verb agreement
    if all([main_ps in {'p3','unknown','NA'},
            cmp_ps in {'p3','unknown','NA'},
            main_nu == cmp_nu,
            main_gn in {cmp_gn, 'unknown'} or cmp_gn == 'unknown']):
        
        return True
    
    else:
        return False

Testing the PGN matcher...

In [4]:
pronA, cmpA = ('unknown','m','sg'),('p3','m','sg')
pronB, cmpB = ('p3','f','sg'),('unknown','f','sg')
pronC, cmpC = ('p3','m','sg'),('unknown','f','sg')
pronD, cmpD = ('NA', 'm', 'sg'),('p3', 'm', 'sg')

print('test A:', match_pgn(pronA, cmpA))
print('test B:', match_pgn(pronB, cmpB))
print('test C:', match_pgn(pronC, cmpC))
print('test D:', match_pgn(pronD, cmpD))

test A: True
test B: True
test C: False
test D: True
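
The four tests above all use singular forms. As a further sanity check, the sketch below repeats the `match_pgn()` logic in a self-contained form and tries a few cases that are not in the original notebook: a number mismatch, a non-third-person form, and unknown gender on either side. The case names (`pronE`, etc.) are only illustrative.

```python
def match_pgn(main_pgn, cmp_pgn):
    '''Same agreement logic as the notebook's match_pgn(), repeated here
    so that the example runs on its own.'''
    main_ps, main_gn, main_nu = main_pgn
    cmp_ps, cmp_gn, cmp_nu = cmp_pgn
    return all([main_ps in {'p3', 'unknown', 'NA'},
                cmp_ps in {'p3', 'unknown', 'NA'},
                main_nu == cmp_nu,
                main_gn in {cmp_gn, 'unknown'} or cmp_gn == 'unknown'])

# illustrative cases (not from the notebook)
pronE, cmpE = ('p3', 'm', 'pl'), ('p3', 'm', 'sg')        # number differs
pronF, cmpF = ('p1', 'unknown', 'sg'), ('p3', 'm', 'sg')  # first person
pronG, cmpG = ('p3', 'unknown', 'pl'), ('p3', 'm', 'pl')  # suffix gender unknown
pronH, cmpH = ('p3', 'm', 'pl'), ('p3', 'unknown', 'pl')  # compared gender unknown

print('test E:', match_pgn(pronE, cmpE))  # False
print('test F:', match_pgn(pronF, cmpF))  # False
print('test G:', match_pgn(pronG, cmpG))  # True
print('test H:', match_pgn(pronH, cmpH))  # True
```

Tests G and H show that an `unknown` gender on either side is treated as compatible, which makes the matcher permissive rather than strict.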


### Contiguous Verb Suffix // Pronominal Suffix Function

Now we build a big function to find third person verbs in one clause, and the pronominal suffixes that agree with them in person, gender, and number. There can be no intervening PGN markers between these two points of reference that agree with the verb. Thus, there is, in theory, an unbroken chain of PGN indicators up to the suffix. This will allow for testing the problem as it appears in 4Q246. 

In [5]:
def get_contiguous_suffixes(clause_atoms):
    '''
    Return a list of dictionary matches that have:
        * Clause 1 with a third person predicate verb with PGN X
        * Clause 2 with noun + pronominal suffix X
        * No interrupting PGN markers between the verb and the suffix
    Dictionary contains row data for a spreadsheet.
    Requires a list of clauses in canonical order.
    '''
    
    # print info to console with TF.info()
    indent(level=0, reset=True)
    info('Beginning search for contiguous suffixes...')
    indent(level=1, reset=True)
    
    # put results here
    matches = []
    
    # find matches
    for ca in clause_atoms:
        
        # |1| Check for third person predicate verb.
        
        # get third person verb
        verb = [word for word in L.d(ca, otype='word') # get all words in clause atom
                    if F.function.v(L.u(word, otype='phrase')[0]) == 'Pred' # predicate word 
                    and F.pdp.v(word) == 'verb' # word must be verb
                    and F.ps.v(word) == 'p3' # verb must be third person
               ]
        
        # skip if no verb or third person verb
        if not verb:
            continue
            
        # take the last verb (in case there are multiples)
        verb = verb[-1]
        
        # |2| Check subsequent clause atom for noun with pronominal suffix.
        
        # get subsequent clause atom
        subsequent_ca = ca + 1 if ca + 1 in clause_atoms else None
        
        # skip if there is no following clause atom
        if not subsequent_ca:
            continue
            
        # get all words in subsequent ca (as a list, so it can be extended below)
        subs_words = list(L.d(subsequent_ca, otype='word'))
            
        # if the subsequent ca is casus pendens or defective, check the resumptive clause too
        if F.typ.v(subsequent_ca) in {'CPen','Defc'}\
            and F.rela.v(subsequent_ca + 1) == 'Resu':
                # get resumption clause atom
                resume_ca = subsequent_ca + 1
                # extend subsequent words with the words in the resumption clause atom
                subs_words.extend(L.d(resume_ca, otype='word'))
        
        # get all words in subsequent words with 
        # a pronominal suffix that agree with the verb
        suffixed_nouns = [word for word in subs_words
                              if F.pdp.v(word) == 'subs' # word functions as noun
                              and F.prs.v(word) not in {'NA','absent'} # word has pronominal sfx
                              and match_pgn(get_pgn(word, pronom=True), get_pgn(verb)) # PGN agree
                         ]
        
        # skip if there are no nouns with pronominal suffixes
        if not suffixed_nouns:
            continue
            
        # take first suffixed noun
        suff_noun = suffixed_nouns[0]
            
        # |3| Check every word in between the verb in clause 1 
        # and the suffixed noun in clause 2
        # for any PGN marker that agrees with the verb.
        # Skip ca if there is an interrupting marker.
        
        # get any intervening PGN markers
        intervening_words = [word for word in range(verb+1,suff_noun)
                                if F.typ.v(L.u(word, otype='phrase')[0]) != 'PP' # not prep phrs
                                and match_pgn(get_pgn(word),get_pgn(verb)) # agrees with verb
                                or match_pgn(get_pgn(word, pronom=True), get_pgn(verb)) # has sfx
                            ]
        
        # skip if there is an intervening word
        if intervening_words:
            continue
        
        # |4| Any clause atom up to this point is a match!
        
        # save and append match data
        
        # format reference
        book, chapter, verse1 = T.sectionFromNode(ca)
        verse2 = T.sectionFromNode(subsequent_ca)[2]
        reference = f'{book} {chapter}:{verse1}' if verse1 == verse2\
                        else f'{book} {chapter}:{verse1}-{verse2}'
        
        # format rest of data
        match_data = {'reference':reference,
                      'text':T.text(L.d(ca, otype='word') + L.d(subsequent_ca, otype='word')),
                      'same?': '', # empty column for manually sorting
                      'notes':'', # empty column for notes on sorting
                      'verb':T.text([verb]),
                      'pronominal': T.text([suff_noun]),
                      'verb_PGN': get_pgn(verb),
                      'suffx_PGN': get_pgn(suff_noun, pronom=True),
                      'vb_Cl_type': F.typ.v(ca), # clause type of verb
                      'pn_Cl_type': F.typ.v(subsequent_ca), # cl type of pronom
                      'verb_ca': ca,
                      'pron_ca': subsequent_ca
                     }
        
        # append the match data
        matches.append(match_data)
        
        # print updates for larger datasets
        if len(matches) % 50 == 0:
            info(f'{len(matches)} matches found...')
        
    # give final report
    indent(level=0)
    info(f'DONE with {len(matches)} results.')
    
    # return the matches
    return matches
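
One detail of the `intervening_words` filter is worth keeping in mind when reading the results. Python's `and` binds more tightly than `or`, so a condition written as `A and B or C` is evaluated as `(A and B) or C`. In the function above, this means the prepositional-phrase exception applies to the word's own PGN check but not to the pronominal-suffix check. The notebook does not say which grouping was intended. The sketch below uses hypothetical boolean stand-ins (no Text-Fabric data) simply to show how the two groupings differ.

```python
# Hypothetical stand-ins for the three conditions in the filter:
#   not_pp      -> the word is not inside a prepositional phrase
#   word_agrees -> the word's own PGN agrees with the verb
#   sfx_agrees  -> the word's pronominal suffix agrees with the verb

def as_written(not_pp, word_agrees, sfx_agrees):
    # same shape as the condition in the list comprehension
    return not_pp and word_agrees or sfx_agrees

def explicit_grouping(not_pp, word_agrees, sfx_agrees):
    # how Python actually groups the condition above
    return (not_pp and word_agrees) or sfx_agrees

def alternative_grouping(not_pp, word_agrees, sfx_agrees):
    # a different reading: the PP exception covers both checks
    return not_pp and (word_agrees or sfx_agrees)

# a suffixed word inside a prepositional phrase whose suffix agrees with the verb
case = dict(not_pp=False, word_agrees=False, sfx_agrees=True)

print(as_written(**case))            # True  -> counts as intervening
print(explicit_grouping(**case))     # True
print(alternative_grouping(**case))  # False -> would be ignored
```

Under the grouping as written, an agreeing suffix inside a prepositional phrase still blocks a match; under the alternative grouping it would not.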

### Apply the Function to Aramaic Texts and Export the Results

In [6]:
# get all clauses in Aramaic

aramaic_clauses = [ca for ca in F.otype.s('clause_atom')
                      if F.language.v(L.d(ca, otype='word')[0]) == 'arc'
                  ]

print(f'{len(aramaic_clauses)} Aramaic clause atoms ready for processing...')

1378 Aramaic clause atoms ready for processing...


In [7]:
# apply the function to the clauses

aramaic_results = get_contiguous_suffixes(aramaic_clauses)

  0.00s Beginning search for contiguous suffixes...
  0.17s DONE with 8 results.


In [8]:
# export the results

# fieldnames for columns
fieldnames = ('reference','text','same?','notes','verb','pronominal',
              'verb_PGN','suffx_PGN', 'vb_Cl_type','pn_Cl_type','verb_ca',
              'pron_ca')

aramaic_file = 'results/contiguous_suffixes_BA.csv'

# export commented out to prevent overwrite of manually edited files
# with open(aramaic_file, 'w') as outfile:
#    writer = csv.DictWriter(outfile, fieldnames=fieldnames)
    
#    writer.writeheader()
#    writer.writerows(aramaic_results)

### For Hebrew

There are too many Hebrew results without an additional filter, so we keep only Hebrew clause atoms whose clause has a `domain` of "Q" (quotation).

In [9]:
hebrew_clauses = [ca for ca in F.otype.s('clause_atom')
                      if F.language.v(L.d(ca, otype='word')[0]) == 'hbo'
                      and F.domain.v(L.u(ca, otype='clause')[0]) == 'Q'
                  ]

print(f'{len(hebrew_clauses)} Hebrew clause atoms ready for processing...')

53238 Hebrew clause atoms ready for processing...


In [10]:
# apply the function to the clauses

hebrew_results = get_contiguous_suffixes(hebrew_clauses)


  0.00s Beginning search for contiguous suffixes...
   |     0.72s 50 matches found...
   |     2.67s 100 matches found...
   |     4.48s 150 matches found...
   |     5.64s 200 matches found...
   |     6.41s 250 matches found...
   |     8.00s 300 matches found...
   |     8.82s 350 matches found...
   |       10s 400 matches found...
    10s DONE with 405 results.


In [11]:
hebrew_file = 'results/contiguous_suffixes_BH_Quotation.csv'

# export commented out to prevent overwrite of manually edited files
#with open(hebrew_file, 'w') as outfile:
#    writer = csv.DictWriter(outfile, fieldnames=fieldnames)
    
#    writer.writeheader()
#    writer.writerows(hebrew_results)
    
# print('Done!')

## See Results

* [Aramaic](results/contiguous_suffixes_BA.csv)
* [Hebrew](results/contiguous_suffixes_BH_Quotation.csv)

# Analysis of Manually Categorized Results

Now that the results above have been exported and manually tagged as to whether the verb and pronominal suffix refer to the same referent, we can re-import the data with the manual tags (y for "yes", it is the same referent; n for "no", it is not the same referent).

The goal of the analysis is to answer the question: When the referent is different between a verb and pronominal suffix with the same PGN data, and without an interrupting noun of the same PGN, what kind of factors are present? In other words: what kinds of factors lead to the disconnect of the pronominal suffix from the adjacent, agreeing verb?

Finally, what do these results bear out for our understanding of the same problem in 4Q246?

Since there were only 8 Aramaic results, we focus on the Hebrew first.

After the results were manually labeled, they were saved within the same directory by the same names. We will access their data by simply re-opening them.

In [12]:
# (re)load the manually labeled data
with open(hebrew_file, 'r') as hfile:
    reader = csv.DictReader(hfile)
    
    # put results in list
    hebrew_results = [r for r in reader]
    
print(f'{len(hebrew_results)} Hebrew results loaded...')

405 Hebrew results loaded...
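
A point to keep in mind when working with the reloaded rows: `csv.DictReader` returns every value as a string. Columns that were tuples at export time, such as `verb_PGN` and `suffx_PGN`, therefore come back as the string form of a tuple. The sketch below illustrates this with a made-up row written to an in-memory buffer (the row values are hypothetical, not taken from the results file), and shows one way to recover the tuple with `ast.literal_eval`.

```python
import ast
import csv
import io

# hypothetical row with a tuple value, mimicking the exported match data
fieldnames = ('reference', 'same?', 'verb_PGN')
row = {'reference': 'Example 1:1', 'same?': 'y', 'verb_PGN': ('p3', 'm', 'pl')}

# write the row to an in-memory CSV instead of a file on disk
buffer = io.StringIO(newline='')
writer = csv.DictWriter(buffer, fieldnames=fieldnames)
writer.writeheader()
writer.writerow(row)

# read it back the same way the notebook reloads its results
buffer.seek(0)
reloaded = next(csv.DictReader(buffer))

print(type(reloaded['verb_PGN']))  # <class 'str'>

# turn the string back into a tuple when the PGN values are needed
pgn = ast.literal_eval(reloaded['verb_PGN'])
print(pgn)  # ('p3', 'm', 'pl')
```

The analysis below only uses the `same?`, `notes`, `reference`, and `text` columns, which are plain strings, so no conversion is needed there.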


## Simple Data Exploration

What kind of results did the search find?

In [13]:
# A counting object to count the kinds of results
referent_match_counts = collections.Counter()

# add the manual categorizations to the count
for result in hebrew_results:
    referent_match_counts[result['same?']] += 1
    
# print the results
for category, count in referent_match_counts.items():
    print(category, count)

y 342
n 57
? 6
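
For a sense of proportion, the counts printed above can be converted into percentages of the 405 results. This is a small sketch that hard-codes the counts from the output rather than reading the CSV.

```python
# counts copied from the output above
counts = {'y': 342, 'n': 57, '?': 6}
total = sum(counts.values())

# share of each category, rounded to one decimal place
shares = {tag: round(100 * n / total, 1) for tag, n in counts.items()}

print(total)  # 405
for tag, share in shares.items():
    print(f'{tag}: {counts[tag]} ({share}%)')
# y: 342 (84.4%)
# n: 57 (14.1%)
# ?: 6 (1.5%)
```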


Thus: in 342 cases, the referent of the verb and the pronominal suffix is identical. In 57 cases (higher than I expected!), the referent is different. In 6 cases I chose to forgo a formal classification because of the initial difficulties in disambiguating the referents. Now let's dive deeper into the 'no' results.

There are tags on each 'no' result that give a clue as to the reason the referents are different. Let's peek into those tags and their counts.

In [14]:
# counter object to count the reasons for "no"
no_notes = collections.Counter()

# count the no tags
for result in hebrew_results:
    
    # count if result is tagged as no
    if result['same?'] == 'n':
        
        # up the count
        no_notes[result['notes']] += 1
        
    
# print the results from most to least
for category, count in sorted(no_notes.items(), key=lambda k: k[-1], reverse=True):
    print(category)
    print('\t', count)
    print()

additional_noun
	 36

new_referent
	 12

placeholder_referent
	 5

marked_discourse
	 2

fronted_element
	 2
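
As a side note, the same most-to-least ordering can be obtained with `Counter.most_common()` instead of `sorted()` with a key function. The sketch below hard-codes the tag counts from the output above (it does not read the CSV) and also confirms that they add up to the 57 'no' results.

```python
import collections

# tag counts copied from the output above
no_notes = collections.Counter({
    'additional_noun': 36,
    'new_referent': 12,
    'placeholder_referent': 5,
    'marked_discourse': 2,
    'fronted_element': 2,
})

# most_common() returns (tag, count) pairs from highest to lowest count
for category, count in no_notes.most_common():
    print(category)
    print('\t', count)
    print()

# the tags should account for every 'no' result
print(sum(no_notes.values()))  # 57
```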



The meanings of the tags are as follows:

* **additional_noun** - there is an additional noun subject, often a fronted, preverbal subject (hence it does not interrupt between the verb and the matched pronominal suffix). Occasionally the two subjects are a clause or a few away. The interpretation of the referent then depends on a choice between two explicitly stated subjects.
* **new_referent** - there is an intervening noun subject between the verb and the pronominal suffix that was allowed in the search results since I made an exception for prepositional phrases.
* **placeholder_referent** - the verb is in the passive and has no actual referent, yet it agrees in PGN with the pronominal suffix, which actually has a real referent (since such an exception occurs in 4Q246). 
* **marked_discourse** - the verb and the pronominal suffix are separated by the beginning of marked discourse; the verb is thus a verb of speaking. The two referents are different.
* **fronted_element** - there are two examples in Numbers in which a fronted element sets off a new section and separates the verb from the pronominal suffix. 

It is the results with the tag `placeholder_referent` which are of the greatest interest to the 4Q246 problem. In the 4Q246 text, there are parallel statements, one in the passive, and one in the active 3MP, before the clause with the pronominal suffix:

> ברה די אל יתאמר <br>
> ובר עליון יקרונה <br>
> "The son of God **he will be called**,<br>
> and son of the Most High **they** will name him."

Tucker Ferda, a proponent of reading the "son of God" title as a positive one, explains the argument:

> One initial reason for suspicion is the third-person masculine plural verb in ii 1 (יקרונה), which questions the notion that the previous three Ithpeel/Ithpaal forms are reflexive. It instead suggests the opposite: the figure “is named” by some unspecified subject/s throughout (discussed more below). - "Naming the Messiah: A Contribution to the 4Q246 'Son of God' Debate," *Dead Sea Discoveries* 21 (2014): 165 

Thus, we look more closely at these results. Their respective clause atom numbers are stored under the `verb_ca` and `pron_ca` keys.

In [15]:
# count matching results here
result_num = 0

# print out the text and verses of the 'no' results with the placeholder tag
for result in hebrew_results: # loop over the manually labeled rows
    
    # print matching results
    if result['same?'] == 'n' and result['notes'] == 'placeholder_referent':
        
        # up the results count
        result_num += 1
        
        # print the results
        print(result_num)
        print(result['reference'])
        print(result['text'])
        print()

1
1_Samuel 6:3
וְנֹודַ֣ע לָכֶ֔ם לָ֛מָּה לֹא־תָס֥וּר יָדֹ֖ו מִכֶּֽם׃ 

2
1_Samuel 23:23
וְהָיָה֙ אִם־יֶשְׁנֹ֣ו בָאָ֔רֶץ 

3
Isaiah 66:23
וְהָיָ֗ה מִֽדֵּי־חֹ֨דֶשׁ֙ בְּחָדְשֹׁ֔ו וּמִדֵּ֥י שַׁבָּ֖ת בְּשַׁבַּתֹּ֑ו יָבֹ֧וא כָל־בָּשָׂ֛ר 

4
Joel 2:2-3
וְאַֽחֲרָיו֙ לֹ֣א יֹוסֵ֔ף עַד־שְׁנֵ֖י דֹּ֥ור וָדֹֽור׃ לְפָנָיו֙ אָ֣כְלָה אֵ֔שׁ 

5
Ecclesiastes 8:16
אֲשֶׁ֥ר נַעֲשָׂ֖ה עַל־הָאָ֑רֶץ כִּ֣י גַ֤ם בַּיֹּום֙ וּבַלַּ֔יְלָה שֵׁנָ֕ה בְּעֵינָ֖יו אֵינֶ֥נּוּ רֹאֶֽה׃ 



To do: describe how these results are similar or different from the 4Q246 example.