# Valency of `שכב`

In [13]:
from datetime import datetime
last_modified = datetime.now()
print('Notebook last modified on {}'.format(last_modified))

Notebook last modified on 2016-12-29 21:47:47.701212


## Methodology
The goal is to inventory and categorise the satellites surrounding the verb שכב ("to lie") in Biblical Hebrew, in order to determine which elements give rise to which meanings of שכב. Valency tracks the interaction between semantics and syntax.<br>
<br>
Dyk et al. suggest that few "watertight" methods exist to separate obligatory complement functions from non-obligatory adjunct functions (see Dyk, Glanz, Oosting, "Analysing Valence Patterns," 4-5). They apply a "distributional method" as follows:

* "Collect all occurrences of a verb with the complete patterns of elements occurring in the data."
* "Sort these by pattern."
* "Analyse the differences between the various patterns, observing what relation the separate sentence constituents have to the verb." *(Dyk et al., 6)*
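The third step needs access to the occurrences behind each pattern, not only to a count. A minimal sketch of that grouping follows, assuming toy data: the clause identifiers and the helper name `group_by_pattern` are hypothetical and are not taken from Dyk et al. or from the ETCBC data.

```python
from collections import defaultdict

# Hypothetical toy data: clause identifier -> ordered phrase functions.
# Identifiers and patterns are invented for illustration only.
occurrences = {
    'clause_a': ('Conj', 'Pred', 'Subj', 'Cmpl'),
    'clause_b': ('Conj', 'Pred', 'Cmpl'),
    'clause_c': ('Conj', 'Pred', 'Subj', 'Cmpl'),
    'clause_d': ('Pred',),
}

def group_by_pattern(occurrences):
    """Map each pattern to the list of clauses that realise it."""
    groups = defaultdict(list)
    for clause, pattern in occurrences.items():
        groups[pattern].append(clause)
    return dict(groups)

groups = group_by_pattern(occurrences)

# Most frequent patterns first; ties keep their insertion order.
for pattern, clauses in sorted(groups.items(), key=lambda kv: -len(kv[1])):
    print(len(clauses), pattern, clauses)
```

Keeping the clause lists alongside the counts makes it possible to return to the individual occurrences of a pattern when comparing how each constituent relates to the verb.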

Which elements to use? Dyk et al. use:
* "predicate (Pred), subject (Subj), object (Objc), complement (Cmpl), adjunct (Adju)." (7)
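One way to restrict a pattern to these five functions is sketched below. The names `RELEVANT` and `keep_relevant` are hypothetical, and the sketch assumes that every other function label (for example `Conj`, `Rela`, or `Time`) is simply dropped; whether labels such as `PreO`, `PreS`, `Time`, or `Loca` should also count is a separate decision that this sketch does not settle.

```python
# Functions named by Dyk et al. (7). Treating all other labels as
# irrelevant is an assumption made for this illustration only.
RELEVANT = {'Pred', 'Subj', 'Objc', 'Cmpl', 'Adju'}

def keep_relevant(pattern, relevant=RELEVANT):
    """Return the pattern with only the relevant functions, order preserved."""
    return tuple(function for function in pattern if function in relevant)

examples = [
    ('Conj', 'Pred', 'Subj', 'Cmpl'),
    ('Rela', 'Pred', 'Cmpl', 'Time'),
    ('Modi', 'Pred'),
]

for pattern in examples:
    print(pattern, '->', keep_relevant(pattern))
```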

The valence corrections notebook ([here](https://shebanq.ancient-data.org/shebanq/static/docs/tools/valence/corr_enrich.html)) contains good information on procedure. Certain functions are considered "core," i.e., many of the functions above.<br><br>
Let's begin by applying the simplest measures first. We'll keep track of how many examples of the predicate we've accounted for as we work from simpler to more complex examples. 
<br>
**Here's the objective:**
1. Inventory all **relevant** phrase functions; organise the clauses first by these groups.
    * Procedural question: should the order of elements matter?
    * **Why this step?** This part establishes the valency type of the verb. Is it transitive or intransitive? Monovalent, divalent, or trivalent? Are there examples of valence expansion or valence reduction?
2. Further subdivide 
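
On the procedural question of element order, the sketch below shows what changes if order is ignored. It assumes that an order-insensitive key should still keep repeated functions (such as a double `Objc`), so it sorts the tuple rather than converting it to a set; the helper name `unordered` is hypothetical.

```python
from collections import Counter

def unordered(pattern):
    """Order-insensitive key that still keeps repeated functions."""
    return tuple(sorted(pattern))

# A few illustrative patterns; the counts here are not corpus counts.
patterns = [
    ('Pred', 'Cmpl'),
    ('Cmpl', 'Pred'),
    ('Subj', 'Pred'),
    ('Conj', 'Pred', 'Subj', 'Objc', 'Objc'),
]

ordered_counts = Counter(patterns)
unordered_counts = Counter(unordered(p) for p in patterns)

print(len(ordered_counts), 'patterns when order matters')
print(len(unordered_counts), 'patterns when order is ignored')
```

Sorting merges `('Pred', 'Cmpl')` with `('Cmpl', 'Pred')` while leaving the doubled `Objc` visible; a set-based key would merge the same pair but would also hide the repetition.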

In [1]:
from tf.fabric import Fabric

TF = Fabric(modules='Hebrew/etcbc4c')
print()
api = TF.load("""otype
                 book chapter verse
                 function pdp vs
                 lex g_cons g_cons_utf8
                """)

This is Text-Fabric 2.0.0
Api reference : https://github.com/ETCBC/text-fabric/wiki/Api
Tutorial      : https://github.com/ETCBC/text-fabric/blob/master/docs/tutorial.ipynb
Data sources  : https://github.com/ETCBC/text-fabric-data
Data docs     : https://etcbc.github.io/text-fabric-data/features/hebrew/etcbc4c/0_overview.html
Shebanq docs  : https://shebanq.ancient-data.org/text
Slack team    : https://shebanq.slack.com/signup
Questions? Ask shebanq@ancient-data.org for an invite to Slack
106 features found and 0 ignored

  0.00s loading features ...
   |     0.06s B otype                from /Users/Cody/github/text-fabric-data/Hebrew/etcbc4c
   |     0.01s B book                 from /Users/Cody/github/text-fabric-data/Hebrew/etcbc4c
   |     0.01s B chapter              from /Users/Cody/github/text-fabric-data/Hebrew/etcbc4c
   |     0.01s B verse                from /Users/Cody/github/text-fabric-data/Hebrew/etcbc4c
   |     0.24s B g_cons               from /Users/Cody/github/text-

In [2]:
api.makeAvailableIn(globals())

import collections as col

In [18]:
# collect all clauses that contain the target verb with a function of predicate

target = 'CKB['
stem = 'qal'

def find_satellites(target, stem, no_repeats = True):
    '''
    takes in a lexeme string and a verbal stem,
    finds all instances of the lexeme as a predicate in that stem
    returns satellites in a dict with clause atom as key and the phrase functions
    as a set (no_repeats) or as an ordered list (repeats kept)
    '''
    pred_function = {'Pred','PreO','PreS'}
    satellites = col.defaultdict(set if no_repeats else list)
    for word in F.otype.s('word'):
        lex = F.lex.v(word)
        if lex != target:
            continue
        phrase_node = L.u(word, otype='phrase')[0]
        phrase_func = F.function.v(phrase_node)
        # skip words that are not predicates or are in a different stem
        if phrase_func not in pred_function or F.vs.v(word) != stem:
            continue
        clause_node = L.u(phrase_node, otype = 'clause_atom')[0]
        phrase_functs = list(F.function.v(phraseF) for phraseF in L.d(clause_node, otype = 'phrase'))
        # collapse to a set only when repeats are unwanted (the original test was inverted)
        if no_repeats:
            phrase_functs = set(phrase_functs)
        satellites[clause_node] = phrase_functs
    return satellites

# keep order and repeats so that the patterns can be compared as tuples below
ckb_sats = find_satellites(target, stem, no_repeats = False)

len(ckb_sats)

169

In [19]:
ckb_inventory = col.Counter()

for clause, sats in ckb_sats.items():
    ckb_inventory[tuple(sats)] += 1
    
len(ckb_inventory)

67

In [20]:
for pattern, stat in sorted(ckb_inventory.items(), key = lambda k: -k[1]):
    print(stat, pattern)

33 ('Conj', 'Pred', 'Subj', 'Cmpl')
26 ('Conj', 'Pred', 'Cmpl')
11 ('Conj', 'Pred')
11 ('Pred', 'Cmpl')
7 ('Pred',)
4 ('PreS',)
4 ('Rela', 'Pred', 'Cmpl')
4 ('Rela', 'Pred', 'Objc')
2 ('Conj', 'Pred', 'Subj', 'Time')
2 ('Cmpl', 'Pred')
2 ('Adju', 'Pred')
2 ('Rela', 'Pred', 'Cmpl', 'Subj')
2 ('Modi', 'Pred')
2 ('Pred', 'Subj', 'Cmpl')
2 ('Conj', 'Pred', 'Objc')
2 ('Pred', 'Adju')
2 ('Conj', 'PreS')
2 ('Subj', 'Pred')
1 ('Rela', 'Pred', 'Cmpl', 'Adju')
1 ('Subj', 'Cmpl', 'Pred')
1 ('Conj', 'Subj', 'Nega', 'Pred')
1 ('Conj', 'Pred', 'Cmpl', 'Adju')
1 ('Rela', 'Pred')
1 ('Conj', 'Pred', 'Adju')
1 ('Conj', 'Pred', 'Subj', 'Objc', 'Objc')
1 ('Time', 'Nega', 'Pred', 'Subj')
1 ('Conj', 'Pred', 'Objc', 'Adju')
1 ('Rela', 'Pred', 'Objc', 'Adju')
1 ('Conj', 'Objc', 'Pred', 'Time')
1 ('Rela', 'Pred', 'Cmpl', 'Time')
1 ('Conj', 'Adju', 'Cmpl', 'Pred')
1 ('Pred', 'Time')
1 ('Rela', 'Pred', 'Loca')
1 ('Subj', 'Pred', 'Cmpl')
1 ('Modi', 'Cmpl', 'Pred')
1 ('Modi', 'Pred', 'Cmpl', 'Time', 'Adju')
1 ('Co