# Dependency Data

This notebook will investigate using spaCy dependency data to analyse and generate patent text.

## Getting the Data

Let's look at some patent descriptions.

We have a patent corpus with 100 example specifications. We can start with that.

In [1]:
# imports
from patentdata.models.patentcorpus import PatentCorpus
import os, pickle

In [2]:
# Let's import spaCy
import spacy

# 'en' is a spaCy 2.x shortcut link; spaCy 3+ needs a full package name such as 'en_core_web_sm'.
nlp = spacy.load('en')

In [3]:
filename = "g06_100_grants.patcorp.zip"
pc = PatentCorpus.load(filename)

In [4]:
doc1 = pc.documents[0]

Maybe as a hack we can replace "FIG." with "Figure" and "FIGS." with "Figures".  

We also need to remove the reference numerals. This is where our entity recognition would come in handy, as we could replace each entity with a labelled or unlabelled string.
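
The substitution hack can also be wrapped in one small helper. The sketch below is only an illustration: the function name `expand_figure_abbreviations` is hypothetical, and it assumes that matching at a word boundary is what we want, so that a token such as "CONFIG." is left alone.

```python
import re


def expand_figure_abbreviations(text):
    """Replace 'FIGS.' with 'Figures' and 'FIG.' with 'Figure'."""
    # \b stops the pattern from matching inside a longer word such as 'CONFIG.'.
    text = re.sub(r"\bFIGS\.", "Figures", text)
    return re.sub(r"\bFIG\.", "Figure", text)


print(expand_figure_abbreviations("FIGS. 8 and 9 show the stylus of FIG. 11"))
# Figures 8 and 9 show the stylus of Figure 11
```

Removing reference numerals is left out on purpose: a plain regex cannot tell a numeral such as "21" in "handle body 21" from a genuine quantity.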

In [5]:
"FIG. 11 shows an disassembly of the stylus configuration of FIG. 8".replace("FIG.", "Figure")

'Figure 11 shows an disassembly of the stylus configuration of Figure 8'

In [6]:
spacy_doc = nlp(doc1.description.text.replace("FIG.", "Figure").replace("FIGS.", "Figures"))

In [7]:
roots = [sent.root for sent in spacy_doc.sents]
print(roots)

[handle, comprises, has, has, has, is, relates, known, connected, is, is, allow, is, is, attached, result, be, provide, is, attached, handle, provided, comprises, has, has, has, is, is, shows, is, recognised, include, have, recognised, recognised, recognised, has, has, shaped, have, referred, rotated, is, material, recognised, recognised, recognised, recognised, attached, has, have, have, have, has, facilitated, Connection, shaped, coupled, Control, configured, configured, recognised, take, recognised, positioned, recognised, is, is, recognised, recognised, Coupling, configured, recognised, Further, configured, configured, inserted, located, causes, recognised, extend, has, coupled, configured, is, restricts, recognised, varied, facilitate, be, facilitates, shows, occurs, recognised, be, located, located, Accordingly, handle, comprises, has, has, recognised, have, comprising, have, recognised, recognised, Stylus, is, have, be, used, configured, configured, ease, has, configured, has, b

In [8]:
spacy_doc

A handle for a portable terminal having a user interface configured for interaction with a stylus. The handle comprises a handle body having the grip portion shaped for grasping by a hand of the user of the portable terminal, a proximal end of the grip portion for coupling to a housing of the portable computer, and a distal end of the grip portion. The handle has a receptacle connected to the handle body and located in the interior of the handle body, such that the receptacle is adapted for releasably retaining the stylus in the interior of the handle body. The handle body 21 has an aperture for facilitating access of the stylus into the receptacle. The receptacle has at least one arm having a first shaped feature (e.g. protrusion and/or notch/groove) adapted for engaging a second shaped feature (e.g. corresponding notch/groove and/or protrusion) of the stylus for providing the releasable retaining of the stylus when resident in the receptacle. The arm is biased towards a first positio

In [9]:
compressed = []
for root_token in roots:
    subj = obj = ""
    for child in root_token.children:
        if 'subj' in child.dep_:
            subj = child.text
        if 'obj' in child.dep_:
            obj = child.text
    compressed.append(" ".join([subj, root_token.text, obj]))

In [10]:
compressed

[' handle ',
 'handle comprises body',
 'handle has receptacle',
 'body has aperture',
 'receptacle has arm',
 'arm is ',
 'invention relates ',
 'It known ',
 'molded connected ',
 ' is ',
 'removal is ',
 'handles allow ',
 'attaching is ',
 'disadvantage is ',
 'stylus attached ',
 'techniques result ',
 'triggers be ',
 'handles provide configuration',
 'It is ',
 'stylus attached ',
 'result handle ',
 'Contrary provided handle',
 'handle comprises body',
 'handle has receptacle',
 'body has aperture',
 'receptacle has arm',
 'arm is ',
 'aspect is ',
 'Figure shows assembly',
 'shown is ',
 'It recognised ',
 '18 include ',
 'terminal have handle',
 'It recognised ',
 ' recognised ',
 ' recognised ',
 'computer has number',
 'terminal has source',
 'handle shaped ',
 'portion have ',
 'portion referred ',
 'component rotated ',
 'cavity is ',
 ' material ',
 ' recognised ',
 'It recognised ',
 'It recognised ',
 'It recognised ',
 'end attached ',
 'handle has assembly',
 'assemb

In [11]:
[" ".join([w.tag_ for w in sent]) for sent in spacy_doc.sents]

['DT NN IN DT JJ NN VBG DT NN NN VBN IN NN IN DT NN .',
 'DT NN VBZ DT NN NN VBG DT NN NN VBN IN VBG IN DT NN IN DT NN IN DT JJ NN , DT JJ NN IN DT NN NN IN NN IN DT NN IN DT JJ NN , CC DT JJ NN IN DT NN NN .',
 'DT NN VBZ DT NN VBN IN DT NN NN CC VBN IN DT NN IN DT NN NN , JJ IN DT NN VBZ VBN IN RB VBG DT NN IN DT NN IN DT NN NN .',
 'DT NN NN CD VBZ DT NN IN VBG NN IN DT NN IN DT NN .',
 'DT NN VBZ RB JJS CD NN VBG DT JJ VBN NN -LRB- NN NN CC NN SYM NN -RRB- VBD IN VBG DT RB VBN NN -LRB- JJ VBG NN SYM NN CC NN -RRB- IN DT NN IN VBG DT JJ NN IN DT NN WRB NN IN DT NN .',
 'DT NN VBZ JJ IN DT JJ NN IN VBG DT NN IN DT JJ VBN NN -LRB- NN NN CC NN SYM NN -RRB- IN DT NN VBN NN -LRB- JJ VBG NN SYM NN CC NN -RRB- . SP',
 'DT NN VBZ IN DT NN NN NN IN DT NN IN DT JJ NN . SP',
 'PRP VBZ RB VBN TO VB DT NN IN VBG NN JJ IN DT JJ NN IN NNS NNS .',
 'RB , JJ VBN NNS VBP RB CC VBN IN DT JJ NN IN DT NN IN DT NN NN , CC VBP VBN IN DT NN NNS CC VBG NNS .',
 'IN DT NN IN DT JJ NN , NN IN DT NN IN DT NN N

In [12]:
[" ".join([w.pos_ for w in sent]) for sent in spacy_doc.sents]

['DET NOUN ADP DET ADJ NOUN VERB DET NOUN NOUN VERB ADP NOUN ADP DET NOUN PUNCT',
 'DET NOUN VERB DET NOUN NOUN VERB DET NOUN NOUN VERB ADP VERB ADP DET NOUN ADP DET NOUN ADP DET ADJ NOUN PUNCT DET ADJ NOUN ADP DET NOUN NOUN ADP NOUN ADP DET NOUN ADP DET ADJ NOUN PUNCT CCONJ DET ADJ NOUN ADP DET NOUN NOUN PUNCT',
 'DET NOUN VERB DET NOUN VERB ADP DET NOUN NOUN CCONJ VERB ADP DET NOUN ADP DET NOUN NOUN PUNCT ADJ ADP DET NOUN VERB VERB ADP ADV VERB DET NOUN ADP DET NOUN ADP DET NOUN NOUN PUNCT',
 'DET NOUN NOUN NUM VERB DET NOUN ADP VERB NOUN ADP DET NOUN ADP DET NOUN PUNCT',
 'DET NOUN VERB ADV ADJ NUM NOUN VERB DET ADJ VERB NOUN PUNCT NOUN NOUN CCONJ NOUN SYM NOUN PUNCT VERB ADP VERB DET ADV VERB NOUN PUNCT ADJ VERB NOUN SYM NOUN CCONJ NOUN PUNCT ADP DET NOUN ADP VERB DET ADJ NOUN ADP DET NOUN ADV NOUN ADP DET NOUN PUNCT',
 'DET NOUN VERB ADJ ADP DET ADJ NOUN ADP VERB DET NOUN ADP DET ADJ VERB NOUN PUNCT NOUN NOUN CCONJ NOUN SYM NOUN PUNCT ADP DET NOUN VERB NOUN PUNCT ADJ VERB NOUN SYM

In [13]:
# Look at probabilistic rules for NN > E
from collections import Counter

NN_rules = Counter([word.text for word in spacy_doc if word.tag_ == "NN"])
print(NN_rules.most_common())

[('handle', 123), ('body', 107), ('portion', 103), ('surface', 80), ('stylus', 76), ('actuator', 72), ('user', 54), ('Figure', 53), ('end', 50), ('position', 48), ('receptacle', 45), ('example', 40), ('enclosure', 37), ('grip', 34), ('assembly', 28), ('terminal', 28), ('slot', 28), ('housing', 28), ('protrusion', 27), ('arm', 26), ('computer', 25), ('latch', 23), ('connection', 23), ('control', 23), ('contact', 22), ('engagement', 21), ('region', 21), ('device', 20), ('side', 19), ('member', 18), ('scanning', 18), ('e.g.', 18), ('hand', 17), ('view', 17), ('feature', 16), ('switch', 15), ('configuration', 14), ('notch', 14), ('trigger', 13), ('wall', 13), ('groove', 13), ('covering', 13), ('material', 13), ('finger', 13), ('interface', 12), ('coupling', 12), ('depression', 12), ('operation', 11), ('mechanism', 11), ('face', 11), ('respect', 10), ('embodiment', 10), ('point', 10), ('overmold', 10), ('interaction', 10), ('location', 9), ('part', 9), ('b', 9), ('interior', 8), ('c', 8), (

In [14]:
DT_rules = Counter([word.text.lower() for word in spacy_doc if word.tag_ == "DT"])
print(DT_rules.most_common())

[('the', 1210), ('a', 205), ('an', 57), ('another', 19), ('either', 11), ('any', 7), ('these', 4), ('each', 3), ('that', 3), ('this', 2), ('some', 1), ('both', 1)]


In [15]:
JJ_rules = Counter([word.text.lower() for word in spacy_doc if word.tag_ == "JJ"])
print(JJ_rules.most_common())

[('first', 48), ('proximal', 47), ('portable', 38), ('resilient', 33), ('second', 32), ('adjacent', 29), ('such', 28), ('other', 20), ('interior', 16), ('distal', 16), ('releasable', 15), ('electrical', 14), ('corresponding', 10), ('operable', 9), ('elongated', 9), ('different', 8), ('inclined', 8), ('mechanical', 8), ('alternative', 7), ('e.g.', 7), ('external', 7), ('current', 6), ('optional', 6), ('above', 6), ('subsequent', 5), ('biased', 5), ('same', 4), ('overmold', 4), ('unactuated', 4), ('foreign', 3), ('rigid', 3), ('handle', 3), ('electronic', 3), ('protruding', 3), ('present', 3), ('additional', 3), ('front', 3), ('wireless', 3), ('integral', 3), ('further', 3), ('molded', 3), ('onboard', 3), ('secondary', 3), ('depressed', 3), ('flush', 3), ('non', 2), ('awkward', 2), ('dotted', 2), ('similar', 2), ('physical', 2), ('comfortable', 2), ('actuated', 2), ('bottom', 2), ('accidental', 2), ('unactivated', 2), ('problematic', 2), ('engaged', 2), ('thermoplastic', 2), ('respective
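
If the counts above are read as rewrite rules from a tag to a word, they can be turned into probabilities by dividing by the total for that tag. This is a minimal sketch assuming a plain relative-frequency estimate with no smoothing; `relative_frequencies` is a hypothetical helper, and the determiner counts are copied from the `DT_rules` output above.

```python
from collections import Counter


def relative_frequencies(counter):
    """Turn raw counts into relative frequencies that sum to 1."""
    total = sum(counter.values())
    return {word: count / total for word, count in counter.items()}


# Determiner counts copied from the DT_rules output in this notebook.
dt_counts = Counter({
    "the": 1210, "a": 205, "an": 57, "another": 19, "either": 11, "any": 7,
    "these": 4, "each": 3, "that": 3, "this": 2, "some": 1, "both": 1,
})

dt_probs = relative_frequencies(dt_counts)
print(round(dt_probs["the"], 3))
```

The same helper would apply to `NN_rules` and `JJ_rules`, since all three are `Counter` objects.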

## The point is that our "grammars" are independent of the words.

We can start by looking at n-grams. Start with POS tags, as the tag set is small.

Let's have a look at the set of POS labels over our 100 documents.

In [16]:
POS_counter = Counter([
    word.pos_ 
    for doc in pc.documents 
    for word in nlp(doc.description.text.replace("FIG.", "Figure").replace("FIGS.", "Figures"))
])

In [17]:
POS_counter.most_common()

[('NOUN', 283372),
 ('VERB', 142978),
 ('DET', 129360),
 ('ADP', 120407),
 ('PUNCT', 109050),
 ('ADJ', 73079),
 ('NUM', 47758),
 ('CCONJ', 27478),
 ('ADV', 27452),
 ('PROPN', 25368),
 ('SPACE', 16415),
 ('PART', 11712),
 ('PRON', 3445),
 ('SYM', 2976),
 ('X', 1814),
 ('INTJ', 507)]

In [18]:
POS_set = set(POS_counter.elements())
print(POS_set)

{'VERB', 'CCONJ', 'SPACE', 'PRON', 'PROPN', 'SYM', 'ADJ', 'ADV', 'X', 'DET', 'PUNCT', 'ADP', 'NOUN', 'INTJ', 'PART', 'NUM'}


In [19]:
spacy_docs = [nlp(doc.description.text.replace("FIG.", "Figure").replace("FIGS.", "Figures")) for doc in pc.documents]

Look at comma clause structures within sentences. Would these maybe come out of n-gram patterns?
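
N-grams of any order can be produced by one helper rather than a separate loop per order. The sketch below is an illustration only: `ngrams` is a hypothetical name, and the tag list is hand-written. With spaCy the input would be something like `[token.pos_ for token in doc]`.

```python
from collections import Counter


def ngrams(tags, n):
    """Yield every run of n consecutive tags as a space-joined string."""
    # range() is empty when the sequence is shorter than n, so nothing is yielded.
    for i in range(len(tags) - n + 1):
        yield " ".join(tags[i:i + n])


# Hand-written POS sequence, used only for illustration.
example_tags = ["DET", "NOUN", "VERB", "DET", "NOUN", "PUNCT"]

bigram_counts = Counter(ngrams(example_tags, 2))
print(bigram_counts.most_common(1))
# [('DET NOUN', 2)]
```

Running the helper per sentence instead of per document would stop n-grams from spanning sentence boundaries, which may matter when looking for clause patterns.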

In [20]:
spacy_doc[-2]

132

In [21]:
bigrams = []
trigrams = []

for doc in spacy_docs:
    for i, token in enumerate(doc[:-2]):
        bigrams.append("{0} {1}".format(token.pos_, doc[i+1].pos_))
        trigrams.append("{0} {1} {2}".format(token.pos_, doc[i+1].pos_, doc[i+2].pos_))
    bigrams.append("{0} {1}".format(doc[-2].pos_, doc[-1].pos_))
    
bigram_count = Counter(bigrams)
trigram_count = Counter(trigrams)

In [22]:
print("50 most common bigrams: {0}".format(bigram_count.most_common(50)))
print("50 most common trigrams: {0}".format(trigram_count.most_common(50)))

50 most common bigrams: [('DET NOUN', 83866), ('NOUN NOUN', 73864), ('ADP DET', 66556), ('NOUN PUNCT', 53281), ('NOUN ADP', 52587), ('ADJ NOUN', 47744), ('NOUN VERB', 39931), ('VERB VERB', 34326), ('VERB ADP', 31899), ('DET ADJ', 30461), ('NOUN NUM', 29969), ('VERB DET', 25421), ('ADP NOUN', 24667), ('PUNCT DET', 21446), ('VERB NOUN', 17625), ('NUM PUNCT', 16979), ('PUNCT NOUN', 14285), ('PUNCT ADP', 13368), ('PUNCT SPACE', 11912), ('NUM VERB', 11412), ('NOUN CCONJ', 10622), ('PART VERB', 9461), ('ADV VERB', 9101), ('ADP ADJ', 8900), ('VERB ADV', 8530), ('PROPN PUNCT', 8046), ('PUNCT VERB', 8005), ('ADP VERB', 7839), ('PUNCT CCONJ', 7494), ('VERB ADJ', 7474), ('VERB PUNCT', 7080), ('PUNCT ADV', 7051), ('PUNCT PUNCT', 6992), ('NOUN ADJ', 6796), ('PUNCT ADJ', 6573), ('ADV PUNCT', 6435), ('CCONJ DET', 6363), ('ADJ VERB', 6140), ('ADJ ADP', 5934), ('DET VERB', 5902), ('CCONJ NOUN', 5841), ('NUM ADP', 5020), ('NOUN PROPN', 5016), ('CCONJ VERB', 4994), ('DET PROPN', 4864), ('NUM NOUN', 4827)

In [23]:
fourgrams = []
fivegrams = []

for doc in spacy_docs:
    for i, token in enumerate(doc[:-4]):
        fourgrams.append("{0} {1} {2} {3}".format(token.pos_, doc[i+1].pos_, doc[i+2].pos_, doc[i+3].pos_))
        fivegrams.append("{0} {1} {2} {3} {4}".format(token.pos_, doc[i+1].pos_, doc[i+2].pos_, doc[i+3].pos_, doc[i+4].pos_))
    fourgrams.append("{0} {1} {2} {3}".format(doc[-4].pos_, doc[-3].pos_, doc[-2].pos_, doc[-1].pos_))
    
fourgram_count = Counter(fourgrams)
fivegram_count = Counter(fivegrams)
print("50 most common 4-grams: {0}".format(fourgram_count.most_common(50)))
print("50 most common 5-grams: {0}".format(fivegram_count.most_common(50)))

50 most common 4-grams: [('NOUN ADP DET NOUN', 18744), ('ADP DET NOUN NOUN', 14908), ('ADP DET ADJ NOUN', 14393), ('VERB ADP DET NOUN', 12028), ('DET NOUN ADP DET', 9982), ('ADP DET NOUN ADP', 8578), ('VERB VERB ADP DET', 7957), ('NOUN ADP DET ADJ', 7855), ('ADP DET NOUN PUNCT', 7525), ('DET ADJ NOUN NOUN', 7468), ('NOUN PUNCT DET NOUN', 7345), ('NOUN NOUN ADP DET', 6515), ('VERB DET NOUN ADP', 6036), ('NOUN VERB ADP DET', 5858), ('DET NOUN NOUN NOUN', 5840), ('PUNCT DET NOUN NOUN', 5703), ('DET NOUN NOUN VERB', 5609), ('VERB DET NOUN NOUN', 5423), ('ADP DET NOUN VERB', 5258), ('DET NOUN NOUN NUM', 5234), ('DET ADJ NOUN PUNCT', 5172), ('ADJ NOUN ADP DET', 5156), ('VERB ADP DET ADJ', 4942), ('VERB DET ADJ NOUN', 4912), ('DET NOUN ADP NOUN', 4900), ('DET ADJ NOUN ADP', 4820), ('DET NOUN NOUN PUNCT', 4758), ('NOUN NOUN VERB VERB', 4516), ('DET NOUN NOUN ADP', 4511), ('NOUN NOUN NUM PUNCT', 4469), ('NOUN NOUN NUM VERB', 4162), ('NOUN VERB VERB ADP', 4094), ('NOUN VERB VERB VERB', 4044), ('

These don't seem to help for building grammar rules: flat n-grams don't capture the compositional structure.

See here for some discussion of why POS tags group into constituents: http://www.mit.edu/~6.863/spring2009/jmnew/11.pdf. Noun phrases are constituents that appear before and after a verb. This matches the ROOT verb from our dependency parse.

Do our groups correspond to different levels of children in the dependency parse?

In [24]:
test_sent = nlp("""
It is also recognised that the body of the receptacle can be configured to include only one resilient arm that is biased with respect to the wall of the body, thus facilitating the retention of the stylus between the one arm and the wall (and/or other rigid secondary structures of the body)
""")

In [25]:
sent = list(test_sent.sents)[0]

In [26]:
sent.root

recognised

In [27]:
list(sent.root.children)

[It, is, also, configured]

In [28]:
list(sent.root.lefts)

[It, is, also]

In [29]:
list(list(sent.root.lefts)[0].lefts)

[]

In [30]:
list(sent.root.rights)

[configured]

In [31]:
list(list(sent.root.rights)[0].children)

[that, body, can, be, include]

In [32]:
list(list(sent.root.children)[-1].lefts)

[that, body, can, be]

In [33]:
list(list(sent.root.children)[0].subtree)

[, It]

In [34]:
list(list(list(sent.root.children)[-1].lefts)[1].children)

[the, of]

In [43]:
def print_l_pos(node):
    """Recursively print the POS tags of each node's left children, followed by the node's own POS."""
    if list(node.lefts):
        print([word.pos_ for word in node.lefts] + [node.pos_])
        for child in node.lefts:
            print_l_pos(child)

def print_r_pos(node):
    """Recursively print each node's own POS, followed by the POS tags of its right children."""
    if list(node.rights):
        print([node.pos_] + [word.pos_ for word in node.rights])
        for child in node.rights:
            print_r_pos(child)

In [44]:
print_l_pos(sent.root)

['PRON', 'VERB', 'ADV', 'VERB']
['SPACE', 'PRON']


In [45]:
print_r_pos(sent.root)

['VERB', 'VERB']
['VERB', 'VERB']
['VERB', 'NOUN']
['NOUN', 'VERB']
['VERB', 'ADP', 'VERB']
['ADP', 'NOUN']
['NOUN', 'ADP']
['ADP', 'NOUN']
['NOUN', 'ADP', 'PUNCT']
['ADP', 'NOUN']
['VERB', 'NOUN']
['NOUN', 'ADP']
['ADP', 'NOUN']
['NOUN', 'ADP']
['ADP', 'NOUN']
['NOUN', 'CCONJ', 'NOUN', 'PUNCT']
['NOUN', 'PUNCT', 'CCONJ', 'NOUN', 'ADP']
['ADP', 'NOUN']
['PUNCT', 'SPACE']


### Subtrees

In [48]:
def print_subtree(node):
    """Print the subtree of a node, then recurse into each child."""
    print(list(node.subtree))
    # node.children is a generator (always truthy), so no emptiness check is needed.
    for child in node.children:
        print_subtree(child)

In [49]:
print_subtree(sent.root)

[
, It, is, also, recognised, that, the, body, of, the, receptacle, can, be, configured, to, include, only, one, resilient, arm, that, is, biased, with, respect, to, the, wall, of, the, body, ,, thus, facilitating, the, retention, of, the, stylus, between, the, one, arm, and, the, wall, (, and/or, other, rigid, secondary, structures, of, the, body, ), 
]
[
, It]
[
]
[is]
[also]
[that, the, body, of, the, receptacle, can, be, configured, to, include, only, one, resilient, arm, that, is, biased, with, respect, to, the, wall, of, the, body, ,, thus, facilitating, the, retention, of, the, stylus, between, the, one, arm, and, the, wall, (, and/or, other, rigid, secondary, structures, of, the, body, ), 
]
[that]
[the, body, of, the, receptacle]
[the]
[of, the, receptacle]
[the, receptacle]
[the]
[can]
[be]
[to, include, only, one, resilient, arm, that, is, biased, with, respect, to, the, wall, of, the, body, ,, thus, facilitating, the, retention, of, the, stylus, between, the, one, arm, and,

These subtrees do contain useful segments of the sentence.

For example:
* ```[that, the, body, of, the, receptacle, can, be, configured, to, include, only, one, resilient, arm, that, is, biased, with, respect, to, the, wall, of, the, body, ,, thus, facilitating, the, retention, of, the, stylus, between, the, one, arm, and, the, wall, (, and/or, other, rigid, secondary, structures, of, the, body, ), 
]```
* ``` [that, is, biased, with, respect, to, the, wall, of, the, body, ,, thus, facilitating, the, retention, of, the, stylus, between, the, one, arm, and, the, wall, (, and/or, other, rigid, secondary, structures, of, the, body, ), 
] ```
* ```
thus, facilitating, the, retention, of, the, stylus, between, the, one, arm, and, the, wall, (, and/or, other, rigid, secondary, structures, of, the, body,
```

In [63]:
def print_long_subtree(node):
    """Print a node's subtree if it spans more than one token, then recurse into each child."""
    subtree = list(node.subtree)
    if len(subtree) > 1:
        print(node.text, node.pos_, node.dep_, subtree)
    # node.children is a generator, so materialise it once before counting and printing.
    children = list(node.children)
    print(len(children))
    print([c.text for c in children])
    for child in children:
        print_long_subtree(child)

In [64]:
print_long_subtree(sent.root)

recognised VERB ROOT [
, It, is, also, recognised, that, the, body, of, the, receptacle, can, be, configured, to, include, only, one, resilient, arm, that, is, biased, with, respect, to, the, wall, of, the, body, ,, thus, facilitating, the, retention, of, the, stylus, between, the, one, arm, and, the, wall, (, and/or, other, rigid, secondary, structures, of, the, body, ), 
]
4
['It', 'is', 'also', 'configured']
It PRON nsubjpass [
, It]
1
['\n']
0
[]
0
[]
0
[]
configured VERB ccomp [that, the, body, of, the, receptacle, can, be, configured, to, include, only, one, resilient, arm, that, is, biased, with, respect, to, the, wall, of, the, body, ,, thus, facilitating, the, retention, of, the, stylus, between, the, one, arm, and, the, wall, (, and/or, other, rigid, secondary, structures, of, the, body, ), 
]
5
['that', 'body', 'can', 'be', 'include']
0
[]
body NOUN nsubjpass [the, body, of, the, receptacle]
2
['the', 'of']
0
[]
of ADP prep [of, the, receptacle]
1
['receptacle']
receptacle N

Do we need to look at the depth of the child links? We take the subtree above terminal nodes, but phrases such as "of the body" may span several layers.

Can we use graph processing libraries? The dependency parse is a tree, which is a special case of a DAG.
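
The depth question can be explored without a graph library, because the parse only needs a recursive walk over `.children`. The sketch below is an illustration under assumptions: `node_depths` and `StubNode` are hypothetical names, and `StubNode` is a stand-in that mimics just the `.text` and `.children` attributes of a spaCy token so the example runs on its own.

```python
def node_depths(node, depth=0):
    """Yield (node, depth) pairs for a node and everything below it, depth-first."""
    yield node, depth
    for child in node.children:
        yield from node_depths(child, depth + 1)


class StubNode:
    """Hypothetical stand-in for a spaCy token: only .text and .children."""

    def __init__(self, text, children=()):
        self.text = text
        self.children = list(children)


# "of the receptacle" as parsed above: 'of' heads 'receptacle', which heads 'the'.
phrase = StubNode("of", [StubNode("receptacle", [StubNode("the")])])

for node, depth in node_depths(phrase):
    print(depth, node.text)
```

Passing `sent.root` instead of the stub should give the depth of every token in a sentence, which would show how many layers a phrase such as "of the body" actually spans.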

### Noun Chunks

In [56]:
for nc in sent.noun_chunks:
    print(["{0}_{1}".format(word.text, word.pos_) for word in nc])

['\n_SPACE', 'It_PRON']
['the_DET', 'body_NOUN']
['the_DET', 'receptacle_NOUN']
['only_ADV', 'one_NUM', 'resilient_ADJ', 'arm_NOUN']
['respect_NOUN']
['the_DET', 'wall_NOUN']
['the_DET', 'body_NOUN']
['the_DET', 'retention_NOUN']
['the_DET', 'stylus_NOUN']
['the_DET', 'one_NUM', 'arm_NOUN']
['the_DET', 'wall_NOUN']
['other_ADJ', 'rigid_ADJ', 'secondary_ADJ', 'structures_NOUN']
['the_DET', 'body_NOUN']


These don't seem as relevant: they mainly extract the DET-NOUN relationship.

From https://web.stanford.edu/~jurafsky/slp3/14.pdf:

> The motivation for all of the relations in the Universal Dependency scheme is beyond the scope of this chapter, but the core set of frequently used relations can be broken into two sets: clausal relations that describe syntactic roles with respect to a predicate (often a verb), and modifier relations that categorize the ways that words that can modify their heads.

See this paper, which discusses converting dependency parses into phrase structures: https://homes.cs.washington.edu/~nasmith/papers/kong+rush+smith.naacl15.pdf. The accompanying code is at https://github.com/ikekonglp/PAD/tree/master/python.

spaCy uses the terms head and children to describe the words connected by a single arc in the dependency tree. The term dep is used for the arc label, which describes the type of syntactic relation that connects the child to the head. 
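
The head, child and dep terminology can be made concrete by listing one arc per token. This is a sketch only: `arcs` and `StubToken` are hypothetical names, and `StubToken` mimics just the `.text`, `.dep_` and `.head` attributes of a spaCy token so the example runs without a model. The labels are copied from the `print_long_subtree` output above.

```python
def arcs(tokens):
    """Return one (child text, dep label, head text) triple per token."""
    return [(token.text, token.dep_, token.head.text) for token in tokens]


class StubToken:
    """Hypothetical stand-in for a spaCy token: only .text, .dep_ and .head."""

    def __init__(self, text, dep_, head=None):
        self.text = text
        self.dep_ = dep_
        # In spaCy the root token is its own head; mirror that here.
        self.head = head if head is not None else self


# A few arcs from the example sentence, with labels taken from the output above.
recognised = StubToken("recognised", "ROOT")
configured = StubToken("configured", "ccomp", recognised)
body = StubToken("body", "nsubjpass", configured)
of = StubToken("of", "prep", body)

for child_text, dep, head_text in arcs([recognised, configured, body, of]):
    print("{0} --{1}--> {2}".format(child_text, dep, head_text))
```

With a parsed sentence, `arcs(sent)` should work unchanged, since it only reads those three attributes.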

Can we look at patterns in the left and right branches of the tree?
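
One way to look for such patterns is to reduce each root to a triple of left-child POS tags, its own POS and right-child POS tags, then count the triples. This is a sketch under assumptions: `branch_pattern` and `StubNode` are hypothetical names, and `StubNode` mimics only the `.pos_`, `.lefts` and `.rights` attributes of a spaCy token. The example tags are those printed for the root of the test sentence above.

```python
from collections import Counter


def branch_pattern(node):
    """Return (left child POS tags, node POS, right child POS tags) for one node."""
    lefts = tuple(child.pos_ for child in node.lefts)
    rights = tuple(child.pos_ for child in node.rights)
    return lefts, node.pos_, rights


class StubNode:
    """Hypothetical stand-in for a spaCy token: only .pos_, .lefts and .rights."""

    def __init__(self, pos_, lefts=(), rights=()):
        self.pos_ = pos_
        self.lefts = list(lefts)
        self.rights = list(rights)


# Root 'recognised': lefts are It/is/also (PRON, VERB, ADV), right is 'configured' (VERB).
root = StubNode(
    "VERB",
    lefts=[StubNode("PRON"), StubNode("VERB"), StubNode("ADV")],
    rights=[StubNode("VERB")],
)

pattern_counts = Counter(branch_pattern(node) for node in [root, root])
print(pattern_counts.most_common(1))
```

Tuples are used rather than lists so that each pattern is hashable and can be counted directly.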

In [None]:
for doc in spacy_docs:
    for sent in doc.sents:
        lefts, pos, rights = print_pos(sent.root)
        for child in lefts