In [1]:
from datetime import datetime
print(f'Notebook last updated on {datetime.now().__str__()}')

Notebook last updated on 2017-05-05 18:09:15.572905


# Parallel Suffixes and the Continuation of Actants in 4Q246

## Research Question:

Are there places, first in Biblical Aramaic, where a suffix in a clause agrees in person, gender, and number with a subject suffix in an immediately preceding clause, but refers to a different referent?

If so: What kinds of factors allow for distinguishing between the two participants? Are there any formal indicators?

## Motivation:

In [4Q246 (Aramaic)](http://www.deadseascrolls.org.il/explore-the-archive/manuscript/4Q246-1), a controversial figure appears who is called ברה די אל ("son of God") and בר עליון ("son of the Most High"). Much of the material before this point (column 2 line 1) is damaged, which complicates the identification of the figure. But the material which follows the reference leaves an open question as to whether the son of God is presented as a positive or negative character.

The relevant text of the passage in question is:

> ברה די אל יתאמר ובר עליון יקרונה <br>
> כזיקיא חזותא כן מלכותהן תהוה <br>
> שנין ימלכון על ארעא וכלא ידשון <br>
> ...<br>
> עד יקום עם אל

<br>
> The Son of God he will be called, and son of the Most High they will name him. <br>
> Like the sparks/comets of the vision, thus will their kingdom be. <br>
> Years they will reign over the land, and everyone they will trample. <br>
> ... <br>
> Until [the] people of God arise. (4Q246, Col 2, lns. 1-3, 5) <br>

A major issue in interpreting the "son of God" figure is the identification of the 3MP "they" in lines 1-2. Specifically, the text says, "...and son of the Most High **they** will name him" and then goes on to say, "Like the comets of the vision, thus will **their** kingdom be." The 3MP suffix on "their kingdom" refers to a tumultuous kingdom that ends once the people of God arise ("until the people of God..."). If the 3MP forms in both clauses share the same referent, then the "son of God" must be so named by a wicked people.

## Inventory of Parameters

Based on the problem described above, here is a set of parameters the code must find:

* Two contiguous clauses
    * If a clause is defective or casus pendens, it must be considered together with its last portion.
    * Clause 1 contains a verb with PGN X
    * Clause 2 contains a noun with a pronominal suffix with PGN X
    * No other PGN indicators can occur between PGN reference 1 and PGN reference 2
        * Exception: if the PGN indicator is a noun with a comparative particle such as כ inside of a casus pendens
    
Export results to a spreadsheet for examination. Hits will need to be manually labeled as Y or N on the question: do both forms refer to the same referent? Include a separate column in the spreadsheet for this.

Begin first with Biblical Aramaic. If data is sparse, or if it is examined and further data is preferable, expand the search to Biblical Hebrew.

In [2]:
# import tools
import collections, csv

# import TF and data
from tf.fabric import Fabric

# initialize TF
TF = Fabric(modules='hebrew/etcbc4c')

# load features
api = TF.load('''
              book chapter verse language
              prs prs_gn prs_nu prs_ps
              gn nu ps
              function pdp typ rela domain
              ''')

api.makeAvailableIn(globals())

This is Text-Fabric 2.3.6
Api reference : https://github.com/ETCBC/text-fabric/wiki/Api
Tutorial      : https://github.com/ETCBC/text-fabric/blob/master/docs/tutorial.ipynb
Data sources  : https://github.com/ETCBC/text-fabric-data
Data docs     : https://etcbc.github.io/text-fabric-data
Shebanq docs  : https://shebanq.ancient-data.org/text
Slack team    : https://shebanq.slack.com/signup
Questions? Ask shebanq@ancient-data.org for an invite to Slack
109 features found and 0 ignored
  0.00s loading features ...
   |     0.01s B book                 from /Users/Cody/github/text-fabric-data/hebrew/etcbc4c
   |     0.00s B chapter              from /Users/Cody/github/text-fabric-data/hebrew/etcbc4c
   |     0.00s B verse                from /Users/Cody/github/text-fabric-data/hebrew/etcbc4c
   |     0.12s B language             from /Users/Cody/github/text-fabric-data/hebrew/etcbc4c
   |     0.11s B prs                  from /Users/Cody/github/text-fabric-data/hebrew/etcbc4c
   |     0.11s

### PGN Matching Functions

First we need some functions to match person, gender, and number indicators between pronominal suffixes and verbs, or between nouns and verbs. The key function is `match_pgn()`, which is made only for third person matching. Afterward, we provide some test examples to ensure the algorithm works.

In [3]:
def get_pgn(word, pronominal=False):
    '''
    Return a person, gender, number (PGN) tuple 
    for a word or pronominal suffix.
    '''
    # return word PGN tuple
    if not pronominal:
        return (F.ps.v(word), F.gn.v(word), F.nu.v(word))
    
    # return pronominal suffix PGN tuple
    else:
        return (F.prs_ps.v(word), F.prs_gn.v(word), F.prs_nu.v(word))
    
    
def match_pgn(main_pgn, cmp_pgn):
    '''
    Return True/False for person, gender, and number agreement between:
        * a third person pronominal suffix and verbs
        OR
        * a third person pronominal suffix and nouns
    Requires two tuples formatted as: (person, gender, number)
        for the pronominal PGN and the compared PGN (verb or noun)
    '''
    # label pgn data
    main_ps, main_gn, main_nu = main_pgn
    cmp_ps, cmp_gn, cmp_nu = cmp_pgn
    
    # check the parameters for p3 subject/verb agreement
    if all([main_ps in {'p3','unknown','NA'},
            cmp_ps in {'p3','unknown','NA'},
            main_nu == cmp_nu,
            main_gn in {cmp_gn, 'unknown'} or cmp_gn == 'unknown']):
        
        return True
    
    else:
        return False

Testing the PGN matcher...

In [4]:
pronA, cmpA = ('unknown','m','sg'),('p3','m','sg')
pronB, cmpB = ('p3','f','sg'),('unknown','f','sg')
pronC, cmpC = ('p3','m','sg'),('unknown','f','sg')
pronD, cmpD = ('NA', 'm', 'sg'),('p3', 'm', 'sg')

print('test A:', match_pgn(pronA, cmpA))
print('test B:', match_pgn(pronB, cmpB))
print('test C:', match_pgn(pronC, cmpC))
print('test D:', match_pgn(pronD, cmpD))

test A: True
test B: True
test C: False
test D: True
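
A few further cases are worth checking, for instance a number mismatch, a suffix that is not third person, and an unknown gender. The sketch below repeats `match_pgn()` in condensed form only so that the cell runs on its own; the extra test tuples are made-up inputs, not values drawn from the ETCBC data.

```python
def match_pgn(main_pgn, cmp_pgn):
    '''Condensed copy of match_pgn() from above (third person agreement only).'''
    main_ps, main_gn, main_nu = main_pgn
    cmp_ps, cmp_gn, cmp_nu = cmp_pgn
    third = {'p3', 'unknown', 'NA'}
    return (main_ps in third
            and cmp_ps in third
            and main_nu == cmp_nu
            and (main_gn in {cmp_gn, 'unknown'} or cmp_gn == 'unknown'))

# hypothetical edge cases: (suffix PGN, compared PGN, expected result)
edge_cases = [
    (('p3', 'm', 'pl'), ('p3', 'm', 'sg'), False),       # number differs
    (('p2', 'm', 'sg'), ('p3', 'm', 'sg'), False),       # second person suffix
    (('p3', 'unknown', 'pl'), ('p3', 'm', 'pl'), True),  # unknown gender is compatible
    (('p3', 'm', 'pl'), ('NA', 'm', 'pl'), True),        # compared word has no person value
]

for pron, cmp_, expected in edge_cases:
    assert match_pgn(pron, cmp_) == expected, (pron, cmp_)

print('all edge cases pass')
```

Note that number must match exactly, while gender and person tolerate `unknown` values, so the matcher errs on the side of including doubtful cases.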


### Contiguous Verb Suffix // Pronominal Suffix Function

Now we build the main function. It finds a third person verb in one clause atom and, in the following clause atom, the first noun whose pronominal suffix agrees with that verb in person, gender, and number. No word between these two points of reference may carry a PGN marker that agrees with the verb. Thus, there is, in theory, an unbroken chain of PGN indicators from the verb up to the suffix. This will allow for testing the problem as it appears in 4Q246.

In [5]:
def get_contiguous_suffixes(clause_atoms):
    '''
    Return a list of dictionary matches that have:
        * Clause 1 with verbal suffix X
        * Clause 2 with noun + pronominal suffix X
        * No interrupting PGN markers between the suffixes
    Dictionary contains row data for a spreadsheet.
    Requires a list of clauses in canonical order.
    '''
    
    # put results here
    matches = []
    
    # find matches
    for ca in clause_atoms:
        
        # |1| Check for third person predicate verb.
        
        # get third person verb
        verb = [word for word in L.d(ca, otype='word') # get all words in clause atom
                    if F.function.v(L.u(word, otype='phrase')[0]) == 'Pred' # get word in predicate phrase
                    and F.pdp.v(word) == 'verb' # word must be verb
                    and F.ps.v(word) == 'p3' # verb must be third person
               ]
        # skip if there is no third person predicate verb
        if not verb:
            continue
            
        # take the last verb (in case there are multiples)
        verb = verb[-1]
        
        # |2| Check subsequent clause atom for noun with pronominal suffix.
        
        # get subsequent clause atom
        subsequent_ca = ca + 1 if ca + 1 in clause_atoms else None
        
        # skip if there is no following clause atom
        if not subsequent_ca:
            continue
            
        # get all words in subsequent ca
        # (copied into a list so that it can be extended below)
        subs_words = list(L.d(subsequent_ca, otype='word'))
            
        # if the subsequent ca is casus pendens or defective, check the resumptive clause too
        if F.typ.v(subsequent_ca) in {'CPen','Defc'}\
            and F.rela.v(subsequent_ca + 1) == 'Resu':
                # get resumption clause atom
                resume_ca = subsequent_ca + 1
                # extend subsequent words with the words in the resumption clause atom
                subs_words.extend(L.d(resume_ca, otype='word'))
        
        # get all words in subsequent words with 
        # a pronominal suffix that agree with the verb
        suffixed_nouns = [word for word in subs_words
                              if F.pdp.v(word) == 'subs' # word functions as noun
                              and F.prs.v(word) not in {'NA','absent'} # word has pronominal suffix
                              and match_pgn(get_pgn(word, pronominal=True), get_pgn(verb)) # word agrees with verb
                         ]
        
        # skip if there are no nouns with pronominal suffixes
        if not suffixed_nouns:
            continue
            
        # take first suffixed noun
        suff_noun = suffixed_nouns[0]
            
        # |3| Check every word in between the verb in clause 1 
        # and the suffixed noun in clause 2
        # for any PGN marker that agrees with the verb.
        # Skip ca if there is an interrupting marker.
        
        # get any intervening PGN markers
        intervening_words = [word for word in range(verb+1,suff_noun)
                                if match_pgn(get_pgn(word),get_pgn(verb)) # word agrees in PGN with verb
                                or match_pgn(get_pgn(word, pronominal=True), get_pgn(verb)) # or word has sfx
                            ]
        
        # skip if there is an intervening word
        if intervening_words:
            continue
        
        # |4| Any clause atom up to this point is a match!
        
        # save and append match data
        match_data = {'reference':f'{T.sectionFromNode(ca)}-{T.sectionFromNode(subsequent_ca)}',
                      'text':T.text(L.d(ca, otype='word') + L.d(subsequent_ca, otype='word')),
                      'verb':T.text([verb]),
                      'pronominal': T.text([suff_noun]),
                      'verb_PGN': get_pgn(verb),
                      'suffx_PGN': get_pgn(suff_noun, pronominal=True),
                      'same?': '', # empty column for manually sorting
                      'verb_ca': ca,
                      'pron_ca': subsequent_ca
                     }
        
        # append the match data
        matches.append(match_data)
        
    # return the matches
    return matches
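
The interruption check in step |3| can be illustrated on toy data without Text-Fabric. The following is only a sketch under simplifying assumptions: words are plain dictionaries with hypothetical keys (`pos`, `pgn`, `sfx`), agreement is exact tuple equality rather than the more lenient `match_pgn()`, and the glosses are invented rather than taken from the corpus.

```python
def agrees(a, b):
    '''Simplified stand-in for match_pgn(): exact equality, None never agrees.'''
    return a is not None and a == b


def find_unbroken_suffix(words, verb_idx):
    '''
    Return the index of the first noun after the verb whose suffix agrees
    with the verb, or None if there is no such noun or if an agreeing
    PGN marker stands between the verb and that noun.
    '''
    verb_pgn = words[verb_idx]['pgn']

    # nouns after the verb that carry an agreeing pronominal suffix
    candidates = [i for i in range(verb_idx + 1, len(words))
                  if words[i]['pos'] == 'noun' and agrees(words[i]['sfx'], verb_pgn)]
    if not candidates:
        return None
    target = candidates[0]

    # an agreeing word or suffix strictly between verb and target breaks the chain
    for i in range(verb_idx + 1, target):
        if agrees(words[i]['pgn'], verb_pgn) or agrees(words[i]['sfx'], verb_pgn):
            return None
    return target


P3MP, P3MS = ('p3', 'm', 'pl'), ('p3', 'm', 'sg')

# invented example: verb (3mp) ... preposition ... noun + 3mp suffix
words_ok = [
    {'gloss': 'they-said', 'pos': 'verb', 'pgn': P3MP, 'sfx': None},
    {'gloss': 'in', 'pos': 'prep', 'pgn': None, 'sfx': None},
    {'gloss': 'house-their', 'pos': 'noun', 'pgn': P3MS, 'sfx': P3MP},
]

# same, but a plural noun intervenes and could itself be the antecedent
words_broken = [
    {'gloss': 'they-said', 'pos': 'verb', 'pgn': P3MP, 'sfx': None},
    {'gloss': 'kings', 'pos': 'noun', 'pgn': P3MP, 'sfx': None},
    {'gloss': 'house-their', 'pos': 'noun', 'pgn': P3MS, 'sfx': P3MP},
]

print(find_unbroken_suffix(words_ok, 0))      # index of 'house-their'
print(find_unbroken_suffix(words_broken, 0))  # None: 'kings' interrupts the chain
```

In the second case the intervening plural noun is a possible antecedent for the suffix, so the pair is discarded rather than reported as a hit.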

### Apply the Function to Aramaic Texts and Export the Results

In [6]:
# get all clauses in Aramaic

aramaic_clauses = [ca for ca in F.otype.s('clause_atom')
                      if F.language.v(L.d(ca, otype='word')[0]) == 'arc'
                  ]

print(f'{len(aramaic_clauses)} Aramaic clause atoms ready for processing...')

1378 Aramaic clause atoms ready for processing...


In [7]:
# apply the function to the clauses

aramaic_results = get_contiguous_suffixes(aramaic_clauses)

print(f'Done with {len(aramaic_results)} results!')

Done with 7 results!


In [8]:
# export the results

# fieldnames for columns
fieldnames = ('reference','text','verb','pronominal',
              'verb_PGN','suffx_PGN','same?','verb_ca',
              'pron_ca')

export = 'results/contiguous_suffixes_BA.csv'

with open(export, 'w', newline='', encoding='utf-8') as outfile:
    writer = csv.DictWriter(outfile, fieldnames=fieldnames)
    
    writer.writeheader()
    writer.writerows(aramaic_results)
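
Once the `same?` column has been filled in by hand, the labels can be tallied with a few lines of standard-library code. The sketch below is one possible way to do this; `tally_labels` is a hypothetical helper, and the demonstration writes a throwaway file with made-up rows rather than reading the real results.

```python
import collections
import csv
import os
import tempfile


def tally_labels(path, column='same?'):
    '''Count the manual labels (Y / N / blank) in one column of a results CSV.'''
    with open(path, newline='', encoding='utf-8') as infile:
        reader = csv.DictReader(infile)
        # normalise case and whitespace; blank cells count as 'unlabeled'
        return collections.Counter(row[column].strip().upper() or 'unlabeled'
                                   for row in reader)


# demonstration on a temporary file with made-up rows (not real results)
demo_rows = [
    {'reference': 'hit 1', 'same?': 'Y'},
    {'reference': 'hit 2', 'same?': 'n'},
    {'reference': 'hit 3', 'same?': ''},
]

with tempfile.TemporaryDirectory() as tmpdir:
    demo_path = os.path.join(tmpdir, 'demo.csv')
    with open(demo_path, 'w', newline='', encoding='utf-8') as outfile:
        writer = csv.DictWriter(outfile, fieldnames=('reference', 'same?'))
        writer.writeheader()
        writer.writerows(demo_rows)
    demo_counts = tally_labels(demo_path)

print(demo_counts)
```

For the real data, the call would presumably be `tally_labels('results/contiguous_suffixes_BA.csv')` after the spreadsheet has been labeled and saved as CSV again.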

### For Hebrew

There are too many Hebrew results without an additional filter. So we keep only those Hebrew clause atoms whose clause has a `domain` value of "Q" (quotation).

In [9]:
hebrew_clauses = [ca for ca in F.otype.s('clause_atom')
                      if F.language.v(L.d(ca, otype='word')[0]) == 'hbo'
                      and F.domain.v(L.u(ca, otype='clause')[0]) == 'Q'
                  ]

print(f'{len(hebrew_clauses)} Hebrew clause atoms ready for processing...')

53238 Hebrew clause atoms ready for processing...
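
Because `get_contiguous_suffixes()` tests `ca + 1 in clause_atoms`, a pair of clause atoms is only examined when both members survive the filter. The sketch below illustrates that adjacency test on made-up node numbers. It uses a set for the membership lookup, which is a suggested alternative to the list lookup in the function above and may matter for speed with tens of thousands of clause atoms; `adjacent_pairs` is a hypothetical helper, not part of Text-Fabric.

```python
def adjacent_pairs(clause_atoms):
    '''
    Yield (ca, ca + 1) for every clause atom whose immediate successor
    is also in the input. Assumes, as the function above does, that
    consecutive clause atoms have consecutive node numbers.
    '''
    present = set(clause_atoms)  # constant-time membership tests
    for ca in clause_atoms:
        if ca + 1 in present:
            yield ca, ca + 1


# made-up node numbers: 103 was removed by a filter
toy_atoms = [100, 101, 102, 104, 105]
pairs = list(adjacent_pairs(toy_atoms))
print(pairs)
```

Here the pairs (102, 103) and (103, 104) are never considered, just as a quoted clause atom followed by a non-quoted one is never considered in the Hebrew search.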


In [10]:
# apply the function to the clauses

hebrew_results = get_contiguous_suffixes(hebrew_clauses)

print(f'Done with {len(hebrew_results)} results!')

Done with 318 results!


In [11]:
Heb_export = 'results/contiguous_suffixes_BH_Quotation.csv'

with open(Heb_export, 'w', newline='', encoding='utf-8') as outfile:
    writer = csv.DictWriter(outfile, fieldnames=fieldnames)
    
    writer.writeheader()
    writer.writerows(hebrew_results)
    
print('Done!')

Done!
