# Composing Time Constructions

The current method for isolating phrase heads ([here](https://nbviewer.jupyter.org/github/ETCBC/heads/blob/master/phrase_heads.ipynb)) requires strenuous and ineloquent processing of BHSA subphrase relations. The subphrases are not always consistently encoded and suffer from numerous exceptional cases. The result is that the method is rather convoluted and ineloquent.

This notebook will explore the possibility of disconnecting semantic head analysis from the ETCBC subphrase encoding. 

A "semantic" head is the primary content word of a phrase, following Croft's "Primary Information Bearing Unit":

> **The noun and the verb are the PRIMARY INFORMATION_BEARING UNITS (PIBUs) of the phrase and clause respectively. In common parlance, they are the content words. PIBUs have major informational content that functional elements such as articles and [auxiliaries] do not have. (Croft, *Radical Construction Grammar*, 2001, 258; see also Shead, *Radical Frame Semantics and Biblical Hebrew*, 104)**

> **A (semantic) head is the profile equivalent that is the primary information-bearing unit, that is, the most contentful item that most closely profiles the same kind of thing that the whole constituent profiles. (ibid., 259)**

Croft also provides an additional criterion to "profile equivalence":

> **If the criterion of profile equivalence produces two candidates for headhood, the less schematic meaning is the PIBU; that is, the PIBU is the one with the narrower extension, in the formal semantic sense of that term (ibid., 259)**

## Inquiry

Can we isolate semantic phrase heads in BHSA using only the phrase_atom and phrase limits? This question indeed means that we  take the phrase_atom/phrase boundaries for granted. Empirically, the validity of BHSA phrase boundaries needs to be tested. But for now, the exercise of isolating semantic phrase heads could be seen as the first step towards reproducible phrase boundaries.

## Construction-Specific Heads and Roles

A semantic head is the central idea of the phrase and is construction-dependent. In Biblical Hebrew, it could be said that the majority of semantic heads are those words which do not stand in a "genitive" or appositional relationship to another word. But this is not always the case! For instance, in the case of the quantifier ◊õ◊ú, the head of the quantified phrase is most usually in the genitive position ("all of"). And there are other cases as well. Thus, one of the efforts in this project is to define headship on a construction by construction basis. A head is modeled as a semantic role in the noun-phrase. It is the central idea, which is somehow modified or specified by the words and phrases which surround it.

In [1]:
import sys
import collections
import pickle
import random
import re
import itertools
import copy
import uuid
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
from Levenshtein import distance as lev_dist
from IPython.display import display, HTML
from datetime import datetime
from pprint import pprint
from tf.app import use
from tf.fabric import Fabric
from tools.locations import data_locations

# load semantic vectors
with open('semvector.pickle', 'rb') as infile: 
    semdist = pickle.load(infile)

# load custom BHSA data + heads
TF = Fabric(locations=data_locations.values())
load_features = ['g_cons_utf8', 'trailer_utf8', 'label', 'lex',
                 'role', 'rela', 'typ', 'function', 'language',
                 'pdp', 'gloss', 'vs', 'vt', 'nhead', 'head', 
                 'mother', 'nu', 'prs', 'sem_set', 'ls', 'st',
                 'kind', 'top_assoc', 'number', 'obj_prep',
                 'embed', 'freq_lex', 'sp']
api = TF.load(' '.join(load_features))
F, E, T, L = api.F, api.E, api.T, api.L # shortform TF methods

A = use('bhsa', api=api, silent=True)
A.displaySetup(condenseType='phrase', withNodes=True, extraFeatures='st')

This is Text-Fabric 7.8.12
Api reference : https://annotation.github.io/text-fabric/Api/Fabric/

123 features found and 6 ignored
  0.00s loading features ...
   |     0.00s No structure info in otext, the structure part of the T-API cannot be used
  7.42s All features loaded/computed - for details use loadLog()


# Machinery

We could use some machinery to do the hard work of looking in and around a node. In the older approach we used TF search templates. But these are not very efficient at scale, and they are always bound by the limits of the query language. I take another approach here: a set of classes that specify locations and directions within a specified context.

In [2]:
from tools.langtools import Positions, PositionsTF, Walker, Dummy

## `Positions(TF)`

The `Positions` class enables concise access to adjacent nodes within a given context. This allows us to write algorithms with query-like efficiency with all of the power of Python. 

This class is instantiated on a word node and can provide contextual look-up data for a given word. For example, given a phrase containing the following word nodes:

> (189681, 189682, **189683**, 189684, 189685, 189686) <br>

representing the following phrase (space separated for clarity):

> ◊ë ◊©◊Å◊†◊™ **◊©◊Å◊ú◊©◊Å◊ô◊ù** ◊ï ◊©◊Å◊û◊†◊î ◊©◊Å◊†◊î

Given that the bolded node, `189683` is our `source` word, we instantiate the class, feeding in the node, the "phrase_atom" string (which is the context we want to search within), and an instance of Text-Fabric (`tf`):

In [3]:
      #    source node    context  TF instance  
      #         |            |       |
P = PositionsTF(189683, 'phrase_atom', A).get

If we want to obtain the word adjacent one space forward, we simply ask `P` for `1`, which gives us the next word in the phrase.

In [4]:
P(1)

189684

If we try to ask for 4 words forward, we go beyond the bounds of the phrase. But `P` handles this by returning nothing:

In [5]:
P(4)

To look back one word, we simply give a negative value:

In [6]:
P(-1)

189682

Finally, `P` can be used to quickly call features on these words. For instance, in order to get the lexeme of the word two words in front of `189683`:

In [7]:
P(2,'lex')

'CMNH/'

And if we want to get a number of features, we can just add other features to the arguments. The result is a feature set:

In [8]:
P(2, 'lex', 'nu')

{'CMNH/', 'sg'}

`P` can also handle features on the source node itself by giving a positionality of `0`:

In [9]:
P(0, 'lex')

'CLC/'

### `Positions` also exists in a non-TF version

When the non-tf version of `Positions` is provided any iterable, it can perform the same functions.

In [10]:
test_ps = ['The', 'good', 'dog', 'jumped.']

P = Positions('good', test_ps).get

In [11]:
P(1)

'dog'

Positions can perform a function on the result with an option `do`. In the example below, the word two words ahead is found and an upper-case function is called on the string.

In [12]:
P(2, do=lambda w: w.upper())

'JUMPED.'

The non-tf version of `Positions` makes it possible to do positionality searches with any ordered list of Python objects that represent linguistic units.

## `Walker`

`Walker` performs a similar function to `Positions`, except it is ambiguous to exact positions, walking either `ahead` or `back` from the source to a target node in the context. A function must be supplied that returns `True` on the target node.

We instantiate the `Walker` using the same source and context as above.

In [13]:
source = 189683
# get words inside source's phrase_atom
positions = L.d(
    L.u(189683,'phrase_atom')[0], 'word'
)

Wk = Walker(source, positions)

`Walker` is demonstrated below with the same word. A simple `lambda` function is used to test for the lexeme. In the example below, we find the first word ahead of `189683` that is a cardinal number:

In [14]:
Wk.ahead(lambda w: F.ls.v(w) == 'card')

189685

An alternative demonstrates the `None` returned on the lack of a valid match.

In [15]:
Wk.ahead(lambda w: F.ls.v(w) == 'BOOGABOOGA')

Another example wherein we walk backwards to the preposition:

In [16]:
Wk.back(lambda w: F.sp.v(w) == 'prep')

189681

We can also specify that the walk should be interrupted under certain conditions with a `stop` function. In this case we walk forward to the next cardinal number, but the walk is interrupted when the `stop` function detects a conjunction.

In [17]:
Wk.ahead(lambda w: F.ls.v(w) == 'card',
         stop=lambda w: F.sp.v(w) == 'conj')

We can also specify the opposite with a `go` function argument, which defines the nodes that allowed to intervene between `source` and `target`. Below we specify that *only* a conjunction should intervene.

In [18]:
Wk.ahead(lambda w: F.ls.v(w) == 'card',
         go=lambda w: F.sp.v(w) == 'conj')

189685

The `go` and `stop` functions can be as permissive or strict as desired.

Finally, we can tell `Walker` that the output of the validation function should be returned instead of the node itself with the optional argument `output=True`:

In [19]:
val_funct = lambda w: F.ls.v(w) if F.ls.v(w)=='card' else None

Wk.ahead(val_funct, output=True)

'card'

This ability is useful for certain tests.

Like `Positions`, `Walker` can be used in non-TF contexts:

In [20]:
test_ps = ['The', 'bad', 'cat', 'swatted.']

Wk_notf = Walker('bad', test_ps)

In [21]:
Wk_notf.ahead(lambda w: w.startswith('sw'))

'swatted.'

### Returning All Results along Path

`Walker` can also return all results along the path by toggling `every=True`

In [22]:
Wk_notf.ahead(lambda w: type(w)==str, every=True)

['cat', 'swatted.']

## `Dummy`

When writing conditions and logic, we want an object that passively receives `NoneType`s or zero `int`s without throwing errors. Such an object should also return `None` to reflect its `False` value. `Dummy`, provides such functionality. `Dummy` can receive all of the arguments, kwargs, and function calls as a `Positions` or `Walker` object. But it returns absolutely nothing. Ouch.

In [23]:
D = Dummy(None, 'phrase_atom', A)

The function call below returns `None`:

In [24]:
D.get(1)

As does this:

In [25]:
D.get(1, 'lex')

And even this:

In [26]:
D.ahead(1)

`D` is essentially a souless void that consumes whatever you throw at it and gives nothing in return.

For safe-calls on a `Position` or `Walker` object, assign nodes to it via a function with a `Dummy` given on null nodes:

In [27]:
def getPos(node, context, tf):
    """A function to get Positions safely."""
    if node:
        return PositionsTF(node, context, tf)
    else:
        return Dummy() # <- give dummy on empty node

So:

In [28]:
P = getPos(None, 'phrase_atom', A)
P.get(1)

Or:

In [29]:
P = getPos(1, 'phrase_atom', A)
P.get(1)

2

# Need for Semantic Data

The accurate processing of word connections depends on fuller semantic data than BHSA provides. Future semantic data could be stored in a similar way to word sets (`wsets`). 

For example, in the two phrases

> (Exod 25:39) ◊õ◊õ◊® ◊ñ◊î◊ë ◊ò◊î◊ï◊® <br>
> (2 Sam 24:24) ◊ë◊õ◊°◊£ ◊©◊ß◊ú◊ô◊ù ◊ó◊û◊©◊Å◊ô◊ù

we see that ◊ñ◊î◊ë and ◊õ◊°◊£, despite being in two different positions with two different words indicates a kind of "composed of" semantic concept: "round gold" (i.e. round composed of gold) and "silver shekels" (shekels composed of silver). To process these kinds of links, we need a list of nouns that often function as "material." But this is only the beginning. Many other words will have specific semantic values that motivate their syntactic behavior. Such a scope lies outside the bounds of this author's current project on Hebrew time phrases.

## A Compromise: Time Phrases

Since constructing these semantic classes is vastly time consuming, I want to start with a smaller set of cases. I will instead focus on parsing connections within time phrases for now. This is because I am analyzing time phrases in my current ongoing PhD project. 

In [30]:
def disjoint(ph):
    """Isolate phrases with gaps."""
    ph = L.d(ph,'word')
    for w in ph:
        if ph[-1] == w:
            break
        elif (ph[ph.index(w)+1] - w) > 1:
            return True

In [31]:
alltimes = [
    ph for ph in F.otype.s('timephrase') 
]
    
timephrases = [ph for ph in alltimes if not disjoint(ph)]

print(f'{len(timephrases)} phrases ready')

3864 phrases ready


## Search & Display Functions

The functions below allow for fast searching and displaying of queries using a `Construction` object, described in the next section.

In [32]:
# NB: For the future. Here is a template to plot 
# a network graph using networkx.

# graph = GIVE GRAPH HERE

# plt.figure(figsize=(10,5))
# pos = nx.drawing.spectral_layout(graph)
# nx.draw_networkx(graph, pos)

# edge_labels = {
#     (n1,n2):graph[n1][n2]['role']
#         for n1,n2 in graph.edges
# }
    
# nx.draw_networkx_edge_labels(graph, pos, font_size=10, edge_labels=edge_labels)
# plt.show()

In [33]:
def pretty(obj, condense='phrase', **kwargs):
    """Show a linguistic object that is not native to TF app."""
    index = kwargs.get('index')
    kwargs = {k:v for k,v in kwargs.items() if k not in {'index'}}
    show = L.d(obj, condense) if index is None else (L.d(obj, condense)[index],)
    print(show, not index, index)
    A.prettyTuple(show, seq=kwargs.get('seq', obj), **kwargs)

def prettyconds(cx):
    '''
    Iterate through an explain dict for a rela
    and print out all of checked conditions.
    '''
    cx_tree = [
        n for n in nx.bfs_tree(cx.graph, cx)
            if type(n) == Construction
    ]
    
    for node in cx_tree:
        print(f'-- {node} --')
        for case in node.cases:
            print(f'pattern: {case.get("pattern", case["name"])}')
            for cond, value in case['conds'].items():
                print('{:<30} {:>30}'.format(cond, str(value)))
            print()
        
def showcx(cx, **kwargs):
    """Display a construction object with TF.
    
    Calls TF.show() with HTML highlights for 
    words/stretch of words that serve a role
    within the construction. 
    """
    
    # get slots for display
    refslots = cx.slots if cx.slots else cx.element.slots
    showcontext = tuple(set(L.u(s, 'phrase')[0] for s in refslots))
    timephrase = L.u(list(refslots)[0], 'timephrase')[0]        

    if not cx:
        print('NO MATCHES')
        print('-'*20)
        A.prettyTuple(showcontext, extraFeatures='sp st', withNodes=True, seq=f'{timephrase} -> {cx}')
        if kwargs.get('conds'):
            prettyconds(cx)
        return None

    colors = itertools.cycle([
        '#96ceb4', '#ffeead', '#ffcc5c', '#ff6f69',
        '#bccad6', '#8d9db6', '#667292', '#f1e3dd',
    ])
    highlights = {}
    role2color = {}
    
    for node in cx.graph.adj[cx]:
        role = cx.graph[cx][node]['role']
        slots = cx.getslots(node)
        color = next(colors)
        role2color[role] = color
        for slot in slots:
            highlights[slot] = color
    
    A.prettyTuple(
        showcontext, 
        extraFeatures=kwargs.get('extraFeatures', 'sp st lex'), 
        withNodes=True, 
        seq=f'{timephrase} -> {cx}', 
        highlights=highlights
    )
    # reveal color meanings
    for role,color in role2color.items():
        colmean = '<div style="background: {}; text-align: center">{}</div>'.format(color, role)
        display(HTML(colmean))
    
    pprint(cx.unfoldroles(), indent=4)
    print()
    if kwargs.get('conds'):
        prettyconds(cx)
    display(HTML('<hr>'))
        
def test_search(
    elements, cxtest, 
    pattern='', 
    show=None, 
    end=None, 
    shuffle=True,
    updatei=1000,
    select=None,
    **kwargs
):
    '''
    Searches phrases with the specified relation 
    and prints out their descriptive explanation.
    '''
    
    start = datetime.now()
    print('beginning search')
    
    # random shuffle to get good diversity of examples
    if shuffle:
        random.shuffle(elements)
    matches = []
    
    # iterate and find matches on words
    for i,el in enumerate(elements):

        # update every 5000 iterations
        if i%updatei == 0:
            print(f'\t{len(matches)} found ({i}/{len(elements)})')
        
        # run test for construction
        test = cxtest(el)
        
        # save results
        if test:
            if pattern:
                if test.pattern == pattern:
                    matches.append(test)
            else:
                matches.append(test)
            
        # stop at end
        if end and len(matches) == end:
            break
        
    # display
    print('done at', datetime.now() - start)
    print(len(matches), 'matches found...')
    if show:
        print(f'showing top {show}')
    
    # option for filtering results
    if select:
        matches = [m for m in matches if select(m)]
        print(f'\tresults filtered to {len(matches)}')
    
    for match in matches[:show]:
        showcx(match, **kwargs)

## Construction Classes

* `Construction` - an object that represents a linguistic construction; the class records roles and the words that occupy them, as well as has methods for accessing and retrieving data on embedded roles/other constructions
* `CXBuilder` - matches conditions to build `Construction` objects; populates them with requisite data

### Construction2

In [34]:
class Construction(object):
    """A linguistic construction and its attributes.
    
    This is version 2, which utilizes NetworkX graphs
    instead of standard dictionaries.
    """
    
    def __init__(self, **specs):
        """Make a new construction.
        
        **specs:
            name: A name for the construction (CX).
            kind: A kind for the CX.
            pattern: Name of pattern that matched to license
                this CX.
            conds: A dictionary of conditions that all eval to
                True to license this CX. Keys are strings that
                describe what was tested; values are booleans.
            cases: A tuple containing all of the possible conds
                dicts that were tested and their results, including 
                non-matches. Useful for debugging.
                
        Key Attributes:
            slots: An ordered tuple of TF slot integers which
                describe what span of words in the corpus this
                CX represents.
            graph: A NetworkX graph object that contains the
                internal structure of this cx. Top node of
                the graph is this object; edges have values of
                "role" that give semantic role of each node.
            parent: a parent CX if this one is contained in 
                another's graph.
        """
        
        # map optional attributes
        for k,v in specs.items():
            setattr(self, k, v)
            
        # map obligatory attributes
        self.element = specs.get('element', str(uuid.uuid4()))
        self.match = specs.get('match', {})
        self.name = specs.get('name', '')
        self.kind = specs.get('kind', '')
        self.pattern = specs.get('pattern', specs.get('name', ''))
        self.conds = specs.get('conds', {})
        self.cases = specs.get('cases', tuple())
        
        # map roles and slots
        self.graph = nx.DiGraph()
        self.populate_graph(specs.get('roles', {}))
        self.slots = tuple()
        self.updateslots() # populates self.slots
    
    def __bool__(self):
        """Determine truth value of CX."""
        if self.match:
            return True
        else:
            return False
        
    def __repr__(self):
        """Display CX name with slots."""
        if self:
            return f'CX {self.name} {self.slots}'
        else:
            return '{CX EMPTY}'
        
    def _cx_att(self, attr, item):
        """Get an attribute on a cx or return int"""
        if type(item) == Construction:
            return item.__dict__[attr]
        elif type(item) == int:
            return item
            
    def _rolestuple(self):
        return tuple(
            (n1, n2, self.graph[n1][n2]['role'])
                 for n1, n2 in nx.bfs_edges(self.graph, self)
        )
            
    def __eq__(self, other):
        """Determine slot/role-based equality between CXs."""
        if (
            self.__class__ == other.__class__
            and self.name == other.name
            and str(self._rolestuple) == str(other._rolestuple)
        ):
            return True
        else:
            return False
        
    def __hash__(self):
        return hash(
            (self.name, self.element)
        )
    
    def __int__(self):
        """Provide integers for first slot in cx.
        
        Most relevant for word-level CXs and for
        using TF methods on those objects.
        """
        return next(iter(sorted(self.slots)), 0)
        
    def __contains__(self, cx):
        """Determine whether certain CX is contained in this one."""
        return cx in self.subgraph()
        
    def __deepcopy__(self, memo):
        """Return a copied version of this CX"""
        roles = {
            self.graph[self][node]['role']:node 
                for node in self.graph.succ[self]
        }
        attribs = {
            k:v for k,v in self.__dict__.items()
                if k != 'graph'
        }
        attribs['roles'] = roles
        return Construction(**attribs)
        
    def getslots(self, item):
        """Get TF integer slots as tuple."""
        slots = self._cx_att('slots', item)
        if type(slots) == tuple:
            return slots
        else:
            return (slots,)
            
    def populate_graph(self, rolesdict):
        """Populate the graph with the CX's structure"""
        
        # populate graph with roles
        self.graph.add_node(self)
        for role, child in rolesdict.items():
            
            # create unique copy of child 
            # esp. relevant for CX objects
            # that are shared between other CXs
            child = copy.deepcopy(child)
            
            # add child to graph
            self.graph.add_edge(self, child, role=role)
            
            # import child's graph structure
            if type(child) == Construction:
                self.graph.update(child.graph)
                child.graph = self.graph # assign graph to child
    
    def subgraph(self):
        """Return graph governed by this CX"""        
        # return subgraph
        return self.graph.subgraph(nx.bfs_tree(self.graph, self))
    
    def updategraph(self, oldnode, newnode):
        """Update the internal structure of CX graph.
        
        Change left to right.
        """
        
        # get predecessor for reassignment
        pred = next(iter(self.graph.pred[oldnode]))
        
        # get replacement role 
        role = self.graph[pred][oldnode]['role']

        # remove old node
        self.graph.remove_node(oldnode)

        # make unique copy of newnode
        newnode = copy.deepcopy(newnode)

        # add new node
        self.graph.add_edge(pred, newnode, role=role)

        # add new nodes's constituents & roles to graph
        if type(newnode) == Construction:
            self.graph.update(newnode.graph)
            newnode.graph = self.graph # assign graph to child
            
        # remap slots to reflect new nodes
        self.updateslots()
        
        # remap slots for constituent cxs
        for node in self.graph:
            if type(node) == Construction:
                node.updateslots()
        
    def updateslots(self):
        """Update the slots list."""
        self.slots = tuple(sorted(set(
            slot for node in nx.bfs_tree(self.graph, self)
                for slot in self.getslots(node)
        )))
        
    def getrole(self, role, default=None):
        """Retrieves the adjacent node of a specific role.
        
        If node is not present, return default.
        """
        for node in self.graph.succ[self]:
            if self.graph[self][node]['role'] == role:
                return node
        return default
    
    def getsuccroles(self, role, start=None):
        """Retrieve successive roles.
        
        Recursively calls down the graph looking
        for successive roles.
        E.g. 
        >    head -> head -> head
        but not
        >    head -> adjv -> head
        """
        start = start or self
        for adj_node in self.graph.adj[start]:
            if self.graph[start][adj_node]['role'] == role:
                yield adj_node
                yield from self.getsuccroles(role, start=adj_node)
                
    def unfoldroles(self, cx=None):
        """Return all contained construction roles as a dict.

        Recursively calls down into graph nodes to populate
        a recursive dict along with labels.
        """
        cx = cx if cx is not None else self
        roledict = {}
        roledict['__cx__'] = cx.name
        for child in self.graph.succ[cx]:
            role = self.graph[cx][child]['role']
            if type(child) == Construction:
                roledict[role] = self.unfoldroles(child)
            elif type(child) == int:
                roledict[role] = child
        return roledict

### CXbuilder2

In [35]:
class Debugger(object):
    """Display debugging messages if toggled"""
    def __init__(self, boolean):
        self.report = boolean
        self.indent = 0
    def say(self,msg, end='\n', **kwargs):
        self.indent = kwargs.get('indent', self.indent)
        if self.report:
            indent = self.indent * '\t'
            fmtmsg = f'{indent}{msg}{end}'
            sys.stderr.write(fmtmsg)

class CXbuilder(object):
    """Identifies and builds constructions using Text-Fabric nodes."""
    
    def __init__(self):
        """Initialize CXbuilder, giving methods for CX detection."""
        
        # cache matched constructions for backreferences
        self.cache = collections.defaultdict(
            lambda: collections.defaultdict()
        )
        
        # NB: objects below should be overwritten 
        # and configured for the particular cxs needed
        self.cxs = tuple()
        self.yieldsto = {} 
        
        # for drip-bucket categories
        self.dripbucket = tuple()
    
    def cxcache(self, element, name, method):
        """Get cx from cache or run."""
        try:
            return self.cache[element][name]
        except KeyError:
            return method(element)
    
    def test(self, *cases):
        """Populate Construction obj based on a cases's all Truth value.
        
        The last-matching case will be used to populate
        a Construction object. This allows more complex
        cases to take precedence over simpler ones.
        
        Args:
            cases: an arbitrary number of dictionaries,
                each of which contains a string key that
                describes the test and a test that evals 
                to a Boolean.
        
        Returns:
            a populated or blank Construction object
        """
        
        # find cases where all cnds == True
        test = [
            case for case in cases
                if all(case['conds'].values())
                    and all(case['roles'].values())
        ]
        
        # return last test
        if test:
            cx = Construction(
                match=test[-1],
                cases=cases,
                **test[-1]
            )
            self.cache[cx.element][cx.name] = cx
            return cx
        else:
            return Construction(cases=cases, **cases[0])
        
    def findall(self, element):
        """Runs analysis for all constructions with an element.
        
        Returns as dict with test:result as key:value.
        """
        results = []
        
        # add cxs from this builder
        for funct in self.cxs:
            cx = funct(element)
            if cx:
                results.append(cx)
        
        return results
                        
    def sortbyslot(self, cxlist):
        """Sort constructions by order of contained slots."""
        sort = sorted(
            ((sorted(cx.slots), cx) for cx in cxlist),
            key=lambda k: k[0]
        )
        return [cx[-1] for cx in sort]
    
    def clusterCXs(self, cxlist):
        """Cluster constructions which overlap in their slots/roles.

        Overlapping constructions form a graph wherein the constructions 
        are nodes and the overlaps are edges. This algorithm retrieves all 
        interconnected constructions. It does so with a recursive check 
        for overlapping slot sets. Merging the slot sets produces new 
        overlaps. The algorithm passes over all constructions until no 
        further overlaps are detected.

        Args:
            cxlist: list of Construction objects

        Returns:
            list of lists, where each embedded list 
            is a cluster of overlapping constructions.
        """

        clusters = []
        cxlist = [i for i in cxlist] # operate on copy

        # iterate until no more intersections found
        thiscluster = [cxlist.pop(0)]
        theseslots = set(s for s in thiscluster[0].slots)

        # loop continues as it snowballs and picks up slots
        # loop stops when a complete loop produces no other matches
        while cxlist:

            matched = False # whether loop was successful

            for cx in cxlist:
                if theseslots & set(cx.slots):
                    thiscluster.append(cx)
                    theseslots |= set(cx.slots)
                    matched = True

            # cxlist shrinks; when empty, it stops loop
            cxlist = [
                cx for cx in cxlist 
                    if cx not in thiscluster
            ]

            # assemble loop
            if not matched:
                clusters.append(thiscluster)
                thiscluster = [cxlist.pop(0)]
                theseslots = set(s for s in thiscluster[0].slots)
        
        # add last cluster
        clusters.append(thiscluster)

        return clusters

    def test_yield(self, cx1, cx2):
        """Determine whether to submit cx1 to cx2."""
        
        # get name or class yields
        cx1yields = self.yieldsto.get(
            cx1.name,
            self.yieldsto.get(cx1.kind, set())
        )
        # test yields
        if type(cx1yields) == set:
            return bool({cx2.name, cx2.kind} & cx1yields)
        elif type(cx1yields) == bool:
            return cx1yields
        
    def interslots(self, cx1, cx2):
        """Get the intersecting slots of two CXs
        
        Return as sorted tuple.
        """
        return tuple(sorted(
            set(cx1.slots) & set(cx2.slots)
        ))
    
    def slots2node(self, cx, slots):
        """Get a CX node from a tuple of slots."""
        for node in nx.bfs_tree(cx.graph, cx):
            if cx.getslots(node) == slots:
                return node
    
    def intersect_node(self, cx1, cx2):
        """Get node from cx1 with slots common with cx2."""
        intersect = self.interslots(cx1, cx2)
        return self.slots2node(cx1, intersect)

    def weaveCX(self, cxlist, debug=False):
        """Weave together constructions on their intersections.

        Overlapping constructions form a graph wherein constructions 
        are nodes and the overlaps are edges. The graph indicates
        that the constructions function together as one single unit.
        weaveCX combines all constructions into a single one. Moving
        from right-to-left (Hebrew), the function consumes and subsumes
        subsequent constructions to previous ones. The result is a 
        single unit with embedding based on the order of consumption.
        Roles in previous constructions are thus expanded into the 
        constructions of their subsequent constituents.
        
        For instance, take the following phrase in English:
        
            >    "to the dog"
            
        Say a CXbuilder object contains basic noun patterns and can
        recognize the following contained constructions:
        
            >    cx Preposition: ('prep', to), ('obj', the),
            >    cx Definite: ('art', the), ('noun', dog)
        
        When the words of the constructions are compared, an overlap
        can be seen:
        
            >    cx Preposition:    to  the
            >    cx Definite:           the  dog
        
        The overlap in this case is "the". The overlap suggests that
        the slot filled by "the" in the Preposition construction 
        should be expanded. This can be done by remapping the role
        filled by "the" alone to the subsequent Definite construction.
        This results in embedding:
        
            >    cx Preposition: ('prep', to), 
                                 ('obj', cx Definite: ('art', the), 
                                                      ('noun', dog))
        
        weaveCX accomplishes this by calling the updaterole method native
        to Construction objects. The end result is a list of merged 
        constructions that contain embedding.
        
        Args: 
            cxlist: a list of constructions pre-sorted for word order;
                the list shrinks throughout recursive iteration until
                the job is finished
            cx: a construction object to begin/continue analysis on
            debug: an option to display debugging messages for when 
                things go wrong ü§™
                
        Prerequisites:
            self.yieldsto: A dictionary in CXbuilder that tells weaveCX
                to subsume one construction into another regardless of
                word order. Key is name of submissive construction, value
                is a set of dominating constructions. Important for, e.g., 
                cases of quantification where a head-noun might be preceded 
                by a chain of quantifiers but should still be at the top of 
                the structure since it is more semantically prominent.
                
        Returns:
            a list of composed constructions
        """
        
        db = Debugger(debug)
        
        db.say(f'\nReceived cxlist {cxlist}', indent=0)

        # compile all cxs to here
        root = copy.deepcopy(cxlist.pop(0))
        
        db.say(f'Beginning analysis with {root}')
        
        # begin matching and remapping
        while cxlist:
            
            # get next cx
            ncx = copy.deepcopy(cxlist.pop(0))
            
            # find root node with slots intersecting next cx
            db.say(f'comparing {root} with {ncx}', indent=1)
            node = self.intersect_node(root, ncx)
            db.say(f'intersect is at {node}')
            
            # remove cxs covered by larger version
            if root in ncx:
                db.say(f'root {root} in ncx {ncx}...replacing root with ncx')
                root = ncx
            
            # update yielded nodes
            elif self.test_yield(node, ncx):
                
                db.say(f'{node} being yielded to {ncx}')
                   
                # get top-most yielding node
                path = nx.shortest_path(root.graph, root, node)
                while path and self.test_yield(path[-1], ncx):
                    node = path.pop(-1)
                
                db.say(f'top-yielding node is {node}', indent=2)
                   
                # update ncx graph
                db.say(f'comparing {ncx} with {node}')
                ncxnode = self.intersect_node(ncx, node)
                db.say(f'intersect is at {ncxnode}')
                ncx.updategraph(ncxnode, node)
                db.say(f'ncx updated to {ncx}')
                
                # update root graph or remap root to ncx
                if root != node:
                    rnode = self.intersect_node(root, ncx)
                    db.say(f'replacing node {rnode} in root {root} with {ncx}')
                    root.updategraph(rnode, ncx)
                    
                else:
                    # switch root and ncx
                    db.say(f'switching {root} with {ncx}')
                    root = ncx
                 
            # update all non-yielding nodes
            else:
                db.say(f'\tupdating {node} in root with {ncx}')
                root.updategraph(node, ncx)
            
        return root
            
    def analyzestretch(self, stretch, debug=False):
        """Analyze an entire stretch of a linguistic unit.
        
        Applies construction tests for every constituent 
        and merges all overlapping constructions into a 
        single construction.
        
        Args:
            stretch: an iterable containing elements that
                are tested by construction tests to build
                Construction objects. e.g. stretch might be 
                a list of TF word nodes.
            debug: option to display debuggin messages
        
        Returns:
            list of merged constructions
        """
                   
        db = Debugger(debug)
        
        # match elements to constructions based on tests
        rawcxs = []
        covered = set()
        for element in stretch:
            matches = self.findall(element)
            if matches:
                rawcxs.extend(matches)
                covered |= set(
                    el for cx in matches 
                        for el in cx.graph
                )
        
        # apply drip-bucket categories
        for element in set(stretch) - covered:
            for funct in self.dripbucket:
                dripcx = funct(element)
                if dripcx:
                    rawcxs.append(dripcx)
        
        db.say(f'rawcxs found: {rawcxs}...')
        
        # return empty results
        if not rawcxs:
            db.say(f'!no cx pattern matches! returning []')
            return []
            
        # cluster and sort matched constructions
        clsort = [
            self.sortbyslot(cxlist)
                for cxlist in self.clusterCXs(rawcxs)    
        ]
    
        db.say(f'cxs clustered into: {clsort}...')
    
        db.say(f'Beginning weaveCX method...')
        # merge overlapping constructions
        cxs = [
            self.weaveCX(cluster, debug=debug)
                for cluster in clsort
        ]
        
        return self.sortbyslot(cxs)

### CXbuilder with Text-Fabric Methods

In [36]:
class CXbuilderTF(CXbuilder):
    """Build Constructions with TF integration."""
    
    def __init__(self, tf, **kwargs):
        
        # set up TF data for tests
        self.tf = tf
        self.F, self.T, self.L = tf.api.F, tf.api.T, tf.api.L
        self.context = kwargs.get('context', 'timephrase')
        
        # set up CXbuilder
        CXbuilder.__init__(self)

    def getP(self, node):
        """Get Positions object for a TF node.
        
        Return Dummy object if not node.
        """
        if not node:
            return Dummy
        return PositionsTF(node, self.context, self.tf).get
    
    def getWk(self, node):
        """Get Walker object for a TF word node.
        
        Return Dummy object if not node.
        """
        if not node:
            return Dummy()
        
        # format tf things to send
        thisotype = self.F.otype.v(node)
        context = self.L.u(node, self.context)[0]
        positions = self.L.d(context, thisotype)        
        return Walker(node, positions)

## Word Constructions

The `wordConstructions` builder class recognizes word semantic classes and types based on provided criteria.

In [37]:
class wordConstructions(CXbuilderTF):
    """Build word constructions."""
    
    def __init__(self, tf, **kwargs):
        
        """Initialize with Constructions attribs/methods."""
        CXbuilderTF.__init__(self, tf, **kwargs)
        
        # Order matters! More specific meanings last
        self.cxs = (
            self.pos,
            self.prep,
            self.qual_quant,
            self.card,
            self.ordn,
            self.name,
            self.cont_ptcp,
        )
        
        self.kind = 'word_cx'
    
    def cxdict(self, slotlist):
        """Map all TF word slots to a construction.
        
        Method returns a dictionary of slot:cx
        mappings.
        """
        
        slot2cx = {}
        for w in slotlist:
            for cx in self.findall(w):
                slot2cx[w] = cx
    
        return slot2cx
    
    def pos(self, w):
        """A drip-bucket part of speech CX.
        
        The standard ETCBC feature is pdp,
        which is "phrase-dependent part of
        speech." I.e. it is a contextually
        sensive pos label.
        """
        
        F = self.F
        
        # map
        pdplabel = {
            'subs': 'cont',
            'adjv': 'cont',
            'advb': 'cont',
        }
        pdp = F.pdp.v(w)
        
        return self.test(
            {
                'element': w,
                'name': f'{pdplabel.get(pdp, pdp)}',
                'kind': self.kind,
                'roles': {'head': w},
                'conds': {
                    f'bool(F.pdp.v({w}))':
                        bool(F.pdp.v(w)),
                }
            }
        )
    
    def prep(self, w):
        """A preposition word."""
        
        P = self.getP(w)
        F = self.F
        name = 'prep'
        roles = {'head': w}
        return self.test(
            {
                'element': w,
                'name': name,
                'kind': self.kind,
                'pattern': 'ETCBC pdp',
                'roles': roles,
                'conds': {
                    'F.pdp.v(w) == prep':
                        F.pdp.v(w) == 'prep',
                }
            },
            {
                'element': w,
                'name': name,
                'kind': self.kind,
                'pattern': 'ETCBC ppre words',
                'roles': roles,
                'conds': {
                    'F.ls.v(w) == ppre':
                        F.ls.v(w) == 'ppre',
                    'F.lex.v(w) != DRK/':
                        F.lex.v(w) != 'DRK/',
                }
            },
            {
                'element': w,
                'name': name,
                'kind': self.kind,
                'pattern': 'R>C/',
                'roles': roles,
                'conds': {
                    'F.lex.v(w) == R>C/':
                        F.lex.v(w) == 'R>C/',
                    'F.st.v(w) == c':
                        F.st.v(w) == 'c',
                    'P(-1,pdp) == prep':
                        P(-1,'pdp') == 'prep',
                    'phrase is adverbial':
                        F.function.v(
                            L.u(w,'phrase')[0]
                        ) in {
                            'Time', 'Adju', 
                            'Cmpl', 'Loca',
                        },
                }
            },
            {
                'element': w,
                'name': name,
                'kind': self.kind,
                'pattern': 'construct lexs',
                'roles': roles,
                'conds': {
                    'F.lex.v(w) in lexset':
                        F.lex.v(w) in {
                            'PNH/','TWK/', 
                            'QY/', 'QYH=/', 
                            'QYT/', '<WD/'
                        },
                    'F.prs.v(w) == absent':
                        F.prs.v(w) == 'absent',
                    'F.st.v(w) == c':
                        F.st.v(w) == 'c'
                }
            },
            {
                'element': w,
                'name': name,
                'kind': self.kind,
                'pattern': 'L+BD',
                'roles': roles,
                'conds': {
                    'F.lex.v(w) == BD/':
                        F.lex.v(w) == 'BD/',
                    'P(-1,lex) == L':
                        P(-1,'lex') == 'L',
                }
            },
            {
                'element': w,
                'name': name,
                'kind': self.kind,
                'pattern': '>XRJT/',
                'roles': roles,
                'conds': {
                    'F.lex.v(w) == >XRJT/':
                        F.lex.v(w) == '>XRJT/',
                    'F.st.v(w) == c':
                        F.st.v(w) == 'c',
                    'P(1,lex) or P(2,lex) not >JWB|RC</':
                        not {
                            P(1,'lex'), P(2,'lex')
                        } & {
                            '>JWB/', 'RC</'
                        }
                }
            },
            {
                'element': w,
                'name': name,
                'kind': self.kind,
                'pattern': '<YM/ time',
                'roles': roles,
                'conds': {
                    'F.lex.v(w) == <YM/':
                        F.lex.v(w) == '<YM/',
                    'F.st.v(w) == c':
                        F.st.v(w) == 'c',
                    'F.function.v(phrase) == Time':
                        F.function.v(
                            L.u(w,'phrase')[0]
                        ) == 'Time',
                }
            }
        )
    
    def name(self, w):
        """A name word (i.e. proper noun)."""
        return self.test(
            {
                'element': w,
                'name': 'name',
                'kind': self.kind,
                'roles': {'head': w},
                'conds': {
                    'F.pdp.v(w) == nmpr':
                        self.F.pdp.v(w) == 'nmpr'
                }
            }
        )
    
    def cont_ptcp(self, w):
        """A content word participle.
        
        A participle which can potentially
        function like a "noun" i.e. a content word.
        """
        
        F = self.F
        
        return self.test(
            {
                'element': w,
                'name': 'cont',
                'kind': self.kind,
                'pattern': 'participle',
                'roles': {'head': w},
                'conds': {
                    'F.sp.v(w) == verb':
                        F.sp.v(w) == 'verb',
                    'F.vt.v(w) in {ptcp, ptca}':
                        F.vt.v(w) in {'ptcp', 'ptca'},
                }
            },
        )    
    
    def card(self, w):
        """A cardinal number."""
        
        F = self.F
        P = self.getP(w)
        name = 'card'
        roles = {'head': w}
        
        return self.test(
            {
                'element': w,
                'name': name,
                'kind': self.kind,
                'roles': roles,
                'conds': {
                    'F.ls.v(w) == card':
                        F.ls.v(w) == 'card',
                }
            },
        )
    
    def ordn(self, w):
        """An ordinal word."""
        
        F = self.F
        P = self.getP(w)
        roles = {'head': w}
        
        return self.test(
            {
                'element': w,
                'name': 'ordn',
                'kind': self.kind,
                'pattern': 'ETCBC ls',
                'roles': roles,
                'conds': {
                    'F.ls.v(w) == ordn':
                        F.ls.v(w) == 'ordn',
                }
            },
        )
    
    def qual_quant(self, w):
        """A qualitative quantifier word."""
        
        F = self.F
        P = self.getP(w)
        name = 'qquant'
        roles = {'head': w}
        
        return self.test(
            {
                'element': w,
                'name': name,
                'kind': self.kind,
                'pattern': 'qualitative',
                'roles': roles,
                'conds': {
                    f'{F.lex.v(w)} in lexset':
                        F.lex.v(w) in {
                            'KL/', 'M<V/', 'JTR/',
                            'XYJ/', 'C>R=/', 'MSPR/', 
                            'RB/', 'RB=/',
                        },
                }
            },
            {
                'element': w,
                'name': name,
                'kind': self.kind,
                'pattern': 'portion',
                'roles': roles,
                'conds': {
                    f'{F.lex.v(w)} in lexset':
                        F.lex.v(w) in {
                            'M<FR/', '<FRWN/',
                            'XMJCJT/',
                        },
                }
            },
        )

## Subphrase Constructions

The `SPConstructions` class prepares subphrase constructions.

In [49]:
class SPConstructions(CXbuilderTF):
    """Class for building time phrase constructions."""
    
    def __init__(self, wordcxs, tf, **kwargs):
        
        """Initialize with Constructions attribs/methods."""
        CXbuilderTF.__init__(self, tf, **kwargs)
        
        self.words = wordcxs
        
        # map cx searches for full analyses
        self.cxs = (
            self.defi,
            self.card_chain,
            self.demon,
            self.adjv,
            self.advb,
            self.attrib,
            self.geni,
            self.numb,
            self.prep,
        )
        
        self.dripbucket = (
            self.wordphrase,
        )
        
        self.kind = 'subphrase'
        
        # submit these cxs to cx in set 
        self.yieldsto = {
            'card_chain': {'numb_ph'},
            'word_cx': {self.kind},
        }
        
    def word(self, w):
        """Safely get word CX"""
        return self.words.get(w, Construction())
        
    def wordphrase(self, w):
        """A phrase construction for one word.
        
        Returns first matching word cx for a word.
        """
        return self.word(w)
        
    def getindex(self, indexable, index, default=None):
        """Safely get an index on an item"""
        try:
            return indexable[index]
        except IndexError:
            return default
        
    def defi(self, w):
        """Matches a definite construction."""
        
        P = self.getP(w)
        
        return self.test( 
            {
                'element': w,
                'name': 'defi_ph',
                'kind': self.kind,
                'roles': {'art': self.word(w), 'head': self.word(P(1))},
                'conds': {

                    f'F.sp.v({w}) == art':
                        self.F.sp.v(w) == 'art',

                    'bool(P(1))':
                        bool(P(1))
                }
            }
        )
    
    def prep(self, w):
        """Matches a preposition with a modified element."""
                
        P = self.getP(w)
        Wk =  self.getWk(w)
        F = self.F
        
        return self.test(
            {
                'element': w,
                'name': 'prep_ph',
                'kind': self.kind,
                'roles': {'prep':self.word(w), 'head':self.word(P(1))},
                'conds': {

                    f'({w}).name == prep':
                        self.word(w).name == 'prep',

                    f'F.prs.v({w}) == absent':
                        self.F.prs.v(w) == 'absent',
                    
                    'bool(P(1))':
                        bool(P(1)),
                }
            },
            {
                'element': w,
                'name': 'prep_ph',
                'pattern': 'suffix',
                'kind': self.kind,
                'roles': {'prep': self.word(w), 'head': self.word(w)},
                'conds': {
                    
                    f'({w}).name == prep':
                        self.word(w).name == 'prep',
                    
                    'F.prs.v(w) not in {absent, NA}':
                        F.prs.v(w) not in {'absent', 'NA'},
                }
                
            },
            {
                'element': w,
                'name': 'prep_ph',
                'pattern': 'prep...on',
                'kind': self.kind,
                'roles': {'prep': self.word(w), 'head': self.word(w)},
                'conds': {
                    f'{F.lex.v(w)} in lexset':
                        F.lex.v(w) in {'M<L/', 'HL>H'},
                    f'Wk.back(({w}).name == prep)':
                        bool(Wk.back(lambda n: self.word(n).name=='prep'))
                }
                
            }
        )
        
    def geni(self, w):
        """Queries for "genitive" relations on a word."""
        
        P = self.getP(w)
        word = self.word
        
        return self.test(
            {
                'element': w,
                'name': 'geni_ph',
                'kind': self.kind,
                'roles': {'geni': self.word(w), 'head': self.word(P(-1))},
                'conds': {

                    'P(-1, st) == c': 
                        P(-1,'st') == 'c',

                    'P(-1).name not in {qquant,card}':
                        word(P(-1)).name not in {'qquant','card'},
                    
                    'P(-1).name != prep':
                        word(P(-1)).name != 'prep',
                }
            }
        )

    def advb(self, w):
        """Match and adverb and its mod."""
        
        P = self.getP(w)
        word = self.word
        
        return self.test(
           {
                'element': w,
                'name': 'advb_ph',
                'kind': self.kind,
                'roles': {'advb': word(w), 'head': word(P(1))},
                'conds': {
                    f'F.sp.v({w}) == advb':
                        self.F.sp.v(w) == 'advb',
                    'P(-1,sp) != art':
                        P(-1,'sp') != 'art',
                    'bool(P(1))':
                        bool(P(1)),
                    'P(1,sp) != conj': # ensure not a nominal use
                        P(1,'sp') != 'conj',
                    'P(-1).name != prep': # ensure not nominal
                        word(P(-1)).name != 'prep',
                    f'F.lex.v({F.lex.v(w)}) not in noadvb_set':
                        F.lex.v(w) not in {'JWMM'},
                }
            }
        )
    
    def adjv(self, w):
        """Matches a word serving as an adjective."""
        
        P = self.getP(w)
        F = self.F
        word = self.word
        name = 'adjv_ph'
        
        # check for recursive adjective matches 
        a2match = self.adjv(P(-1)) if P(-1) else Construction()
        a2match_head = int(a2match.getrole('head', 0))
        
        common = {
            
            'w.name not in {qquant,card}':
                word(w).name not in {'qquant','card'},
            
            'P(-1).name == cont':
                word(P(-1)).name == 'cont',
                        
            'P(-1, st) & {NA, a}': 
                P(-1,'st') in {'NA', 'a'},   
            
            'P(-1).name != quant':
                word(P(-1)).name != 'quant',
            
            'P(-1).name != prep':
                word(P(-1)).name != 'prep',
        }
                
        tests = (
            
            {
                'element': w,
                'name': name,
                'kind': self.kind,
                'pattern': 'adjv (1x)',
                'roles': {'adjv':word(w), 'head': word(P(-1))},
                'conds': dict(common, **{
                    'F.sp.v(w) in {adjv, verb}':
                        F.sp.v(w) in {'adjv', 'verb'},
                })
            },
            {
                'element': w,
                'name': name,
                'kind': self.kind,
                'pattern': 'adjv (2x)',
                'roles': {'adjv': word(w), 'head': word(a2match_head)},
                'conds': dict(common, **{
                    
                    'F.sp.v(w) in {adjv, verb}':
                        F.sp.v(w) in {'adjv', 'verb'},
                    
                     'self.adjv(P(-1)) and target != P(0)':
                        bool(a2match) and a2match_head != P(0)
                })
            }
        )

        return self.test(*tests)
     
    def attrib(self, w):
        """Identify elements in a attrib construction.
        
        In Hebrew this construction typically consists of four slots:
            > ◊î + A + ◊î + B
        Attrib identifies each of these elements and labels them.
        A is assumed to be the head, or modified, element and B
        is assumed to be an adjectival element.
        """
                
        # CX consists of two constituent cxs
        # start walk from head of first match
        P = self.getP(w)
        defi1 = self.defi(w)
        d1head = int(defi1.getrole('head', 0))
        Wk = self.getWk(d1head)

        # walk to next valid defi match
        # and allow adjectives to intervene:
        defi2 = Wk.ahead(
            lambda n: self.defi(n),
            go=lambda n: self.F.sp.v(n)=='adjv',
            output=True
        ) if Wk else Construction()
        defi2 = defi2 or Construction()

        # check for single_defi (only two cases)
        defi_p1 = self.defi(P(1))
        
        return self.test(
            {
                'element': w,
                'name': 'attrib_ph',
                'pattern': 'double_defi',
                'kind': self.kind,
                'roles': {'head': defi1, 'attrib': defi2},
                'conds': {
                    'bool(defi1)':
                        bool(defi1),
                    'bool(defi2)':
                        bool(defi2), 
                }
            },
            {
                'element': w,
                'name': 'attrib_ph',
                'pattern': 'single_defi',
                'kind': self.kind,
                'roles': {'head': self.word(w), 'attrib': defi_p1},
                'conds': {
                    'name(w) == cont':
                        self.word(w).name == 'cont',
                    'F.st.v(w) == a':
                        self.F.st.v(w) == 'a',
                    'P(-1,lex) != H':
                        P(-1,'lex') != 'H',
                    'bool(defi_p1)':
                        bool(defi_p1),
                }
            }
        )
        
    def numb(self, w):
        """Defines numerical relations with an non-quant word.
        
        Often but not always indicates quantification as other
        semantic relations are possible.
        """

        P = self.getP(w)
        Wk = self.getWk(w)
        word = self.word
        is_nom = (
            lambda n: word(n).name == 'cont'
        )
        
        # for the quant ahead check
        # should stop at a preposition or another quantifier
        stop_ahead = (
            lambda n: (word(n).name == 'prep'
                or word(n).name in {'card', 'qquant'} and word(n).name != word(w).name)
        )
        
        behind_nom = Wk.back(is_nom, stop=lambda n: not is_nom(n)) 
        
        return self.test(
        
            {
                'element': w,
                'name': 'numb_ph',
                'kind': self.kind,
                'pattern': 'numbered forward',
                'roles': {'numb': word(w), 'head': word(P(1))},
                'conds': {
                    
                    'w.name in {qquant,card}':
                     word(w).name in {'qquant', 'card'},
                    
                    'bool(P(1))':
                        bool(P(1)),
                    
                    'P(1,sp) != conj':
                        P(1,'sp') != 'conj',
                    
                    'P(1).name not in {qquant,card,prep}':
                        word(P(1)).name not in {'qquant','card','prep'},
        
                    'P(-1,sp) != art':
                        P(-1,'sp') != 'art',
                },
            },  
            {
                'element': w,
                'name': 'numb_ph',
                'kind': self.kind,
                'pattern': 'numbered backward',
                'roles': {'numb': word(w), 'head': word(behind_nom)},
                'conds': {
                    
                    'w.name in {qquant,card}':
                        word(w).name in {'qquant','card'},
                    
                    'not Wk.ahead(is_nominal)':
                        not Wk.ahead(is_nom, stop=stop_ahead),
                    
                    'bool(Wk.back(is_nominal))':
                        bool(behind_nom),
                    
                    'F.st.v(behind_nom) in {a, NA}':
                        self.F.st.v(behind_nom) in {'a', 'NA'},
                }
            }
        )
        
    def card_chain(self, w):
        """Defines cardinal number chain constructions"""
        
        P = self.getP(w)
        F = self.F
        word = self.word
        
        return self.test(
            {
                'element': w,
                'name': 'card_chain',
                'kind': self.kind,
                'pattern': 'adjacent',
                'roles': {'card':word(w), 'head':word(P(-1))},
                'conds': {
                    
                    'F.ls.v(w) == card':
                        F.ls.v(w) == 'card',
                    'P(-1,ls) == card':
                        P(-1,'ls') == 'card',                    
                }
            },
            {
                'element': w,
                'name': 'card_chain',
                'kind': self.kind,
                'pattern': 'conjunctive',
                'roles': {'card': word(w), 'head': word(P(-2)), 'conj': word(P(-1))},
                'conds': {
                    'F.ls.v(w) == card':
                        F.ls.v(w) == 'card',
                    'P(-1,lex) == W':
                        P(-1,'lex') == 'W',
                    'P(-2,ls) == card':
                        P(-2,'ls') == 'card',   
                }
            }
        )
    
    def demon(self, w):
        """Defines an adjacent demonstrative construction."""
        
        P = self.getP(w)
        word = self.word
        F = self.F
        name = 'demon_ph'
        
        return self.test(
            {
                'element': w,
                'name': name,
                'kind': self.kind,
                'pattern': 'adjacent forward',
                'roles': {'demon': word(w), 'head': word(P(1))},
                'conds': {
                    'prde in {F.pdp.v(w), F.sp.v(w)}':
                        'prde' in {F.pdp.v(w), F.sp.v(w)},
                    
                    'P(-1,sp) != art': # ensure not part of attrib pattern
                        P(-1,'sp') != 'art',
                    
                    'P(-1).name != prep':
                        word(P(-1)).name != 'prep',
                    
                    'bool(P(1))':
                        bool(P(1)),
                    
                    'P(1).name == cont':
                        word(P(1)).name == 'cont',
                }
            },
            {
                'element': w,
                'name': name,
                'kind': self.kind,
                'pattern': 'adjacent back',
                'roles': {'demon':word(w), 'head':word(P(-1))},
                'conds': {
                    'prde in {F.pdp.v(w), F.sp.v(w)}':
                        'prde' in {F.pdp.v(w), F.sp.v(w)},
                    
                    'P(-1).name not in {prep,qquant,card}':
                        word(P(-1)).name not in {'prep','qquant','card'},
                    
                    'P(-1,sp) == subs':
                        P(-1,'sp') == 'subs',
                }
            }
        )
    def apposition(self, w):
        """Looks for non-definite appositional constructions"""
        P = self.getP(w)
        F = self.F
        wd = self.word
        
        return self.test(
            {
                'element': w,
                'name': 'appo',
                'kind': self.kind,
                'roles': {'head': wd(P(-1)), 'appo': wd(w)},
                'conds': {
                    
                    'name(w) == cont':
                        wd(w).name == 'cont',
                    
                    'not adjv(w)':
                        not self.adjv(w),
                    'not advb(w)':
                        not self.advb(w),
                    
                    'name(P-1) == cont':
                        wd(P(-1)).name == 'cont',
                    
                    'st(P-1) == a':
                        F.st.v(P(-1)) == 'a',
                    
                }
            }
        )

### Load Constructions

In [39]:
words = wordConstructions(A) # word CX builder

# analyze all matches; return as dict
start = datetime.now()
print(f'Beginning word construction analysis...')
wordcxs = words.cxdict(
    s for tp in timephrases
        for s in L.d(tp,'word')
)
print(f'\t{datetime.now() - start} COMPLETE \t[ {len(wordcxs)} ] words loaded')

Beginning word construction analysis...
	0:00:06.291684 COMPLETE 	[ 12887 ] words loaded


In [50]:
# time phrase CX builder
spc = SPConstructions(wordcxs, A)

### TO FIX:

In [41]:
#pretty(1447386)

NB: L> is marked as the object of the preposition

<hr>

### Small Tests

In [42]:
pretty(1448320)

(739399,) True None


In [43]:
# test_small = spc.attrib(153682)
# showcx(test_small, conds=True)

### Stretch Tests

In [44]:
# test = spc.analyzestretch(L.d(1448269, 'word'), debug=True)

# for res in test:
#     showcx(res, conds=True)

In [55]:
semdist['JWM/']['NCP/']

0.48298183171350717

In [56]:
semdist['JWM/']['<RB/']

0.32622363715870806

### Pattern Searches

In [53]:
words = [w for ph in timephrases for w in L.d(ph, 'word')]

test_search(words, spc.apposition, pattern='', show=100, shuffle=False)

beginning search
	0 found (0/12887)
	5 found (1000/12887)
	11 found (2000/12887)
	12 found (3000/12887)
	18 found (4000/12887)
	29 found (5000/12887)
	32 found (6000/12887)
	38 found (7000/12887)
	42 found (8000/12887)
	42 found (9000/12887)
	42 found (10000/12887)
	49 found (11000/12887)
	57 found (12000/12887)
done at 0:00:29.073419
58 matches found...
showing top 100


{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 3108},
    'head': {'__cx__': 'cont', 'head': 3107}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 16129},
    'head': {'__cx__': 'cont', 'head': 16128}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 21639},
    'head': {'__cx__': 'cont', 'head': 21638}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 22357},
    'head': {'__cx__': 'cont', 'head': 22356}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 30964},
    'head': {'__cx__': 'cont', 'head': 30963}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 33386},
    'head': {'__cx__': 'cont', 'head': 33385}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 35963},
    'head': {'__cx__': 'cont', 'head': 35962}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 37487},
    'head': {'__cx__': 'cont', 'head': 37486}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 38478},
    'head': {'__cx__': 'cont', 'head': 38477}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 40444},
    'head': {'__cx__': 'cont', 'head': 40443}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 40574},
    'head': {'__cx__': 'cont', 'head': 40573}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 96159},
    'head': {'__cx__': 'cont', 'head': 96158}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 101795},
    'head': {'__cx__': 'cont', 'head': 101794}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 103950},
    'head': {'__cx__': 'cont', 'head': 103949}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 104027},
    'head': {'__cx__': 'cont', 'head': 104026}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 114041},
    'head': {'__cx__': 'cont', 'head': 114040}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 114808},
    'head': {'__cx__': 'cont', 'head': 114807}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 124246},
    'head': {'__cx__': 'cont', 'head': 124245}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 135366},
    'head': {'__cx__': 'cont', 'head': 135365}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 141373},
    'head': {'__cx__': 'cont', 'head': 141372}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 141592},
    'head': {'__cx__': 'cont', 'head': 141591}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 142458},
    'head': {'__cx__': 'cont', 'head': 142457}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 143463},
    'head': {'__cx__': 'cont', 'head': 143462}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 145883},
    'head': {'__cx__': 'cont', 'head': 145882}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 146483},
    'head': {'__cx__': 'cont', 'head': 146482}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 148785},
    'head': {'__cx__': 'cont', 'head': 148784}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 152997},
    'head': {'__cx__': 'cont', 'head': 152996}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 153682},
    'head': {'__cx__': 'cont', 'head': 153681}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 154481},
    'head': {'__cx__': 'cont', 'head': 154480}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 167641},
    'head': {'__cx__': 'cont', 'head': 167640}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 168716},
    'head': {'__cx__': 'cont', 'head': 168715}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 173717},
    'head': {'__cx__': 'cont', 'head': 173716}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 191435},
    'head': {'__cx__': 'cont', 'head': 191434}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 192030},
    'head': {'__cx__': 'cont', 'head': 192029}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 198930},
    'head': {'__cx__': 'cont', 'head': 198929}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 199468},
    'head': {'__cx__': 'cont', 'head': 199467}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 201270},
    'head': {'__cx__': 'cont', 'head': 201269}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 203186},
    'head': {'__cx__': 'cont', 'head': 203185}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 231928},
    'head': {'__cx__': 'cont', 'head': 231927}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 242368},
    'head': {'__cx__': 'cont', 'head': 242367}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 249113},
    'head': {'__cx__': 'cont', 'head': 249112}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 249329},
    'head': {'__cx__': 'cont', 'head': 249328}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 320456},
    'head': {'__cx__': 'cont', 'head': 320455}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 321446},
    'head': {'__cx__': 'cont', 'head': 321445}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 348663},
    'head': {'__cx__': 'cont', 'head': 348662}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 349114},
    'head': {'__cx__': 'cont', 'head': 349113}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 349155},
    'head': {'__cx__': 'cont', 'head': 349154}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 355633},
    'head': {'__cx__': 'cont', 'head': 355632}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 356489},
    'head': {'__cx__': 'cont', 'head': 356488}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 367503},
    'head': {'__cx__': 'cont', 'head': 367502}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 375509},
    'head': {'__cx__': 'cont', 'head': 375508}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 376481},
    'head': {'__cx__': 'cont', 'head': 376480}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 377200},
    'head': {'__cx__': 'cont', 'head': 377199}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 378046},
    'head': {'__cx__': 'cont', 'head': 378045}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 383558},
    'head': {'__cx__': 'cont', 'head': 383557}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 385749},
    'head': {'__cx__': 'cont', 'head': 385748}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 389048},
    'head': {'__cx__': 'cont', 'head': 389047}}



{   '__cx__': 'appo',
    'appo': {'__cx__': 'cont', 'head': 410977},
    'head': {'__cx__': 'cont', 'head': 410976}}



### Testing on Random Phrases

In [46]:
shuff = [k for k in timephrases
            if len(L.d(k,'word')) > 4]
random.shuffle(shuff)

In [47]:
# for phrase in shuff[:25]:
    
#     print('analyzing', phrase)
#     elements = L.d(phrase,'word')
    
#     try:
#         cxs = tpc.analyzestretch(elements)
#         if cxs:
#             for cx in cxs:
#                 showcx(cx, refslots=elements)
#         else:
#             showcx(Construction(), refslots=elements)
    
#     except:
#         sys.stderr.write(f'\nFAIL...running with debug...\n')
#         pretty(phrase)
#         tpc.analyzestretch(elements, debug=True)
#         raise Exception('...debug complete...')

### Testing on All Timephrases

In [48]:
phrase2cxs = collections.defaultdict(list)
nocxs = []

# time it
start = datetime.now()

print(f'{datetime.now()-start} beginning analysis...')

for i, phrase in enumerate(timephrases):
     
    # analyze all known relas
    elements = L.d(phrase,'word')
    
    # analyze with debug exceptions
    try:
        cxs = spc.analyzestretch(elements)
    except:
        sys.stderr.write(f'\nFAIL...running with debug...\n')
        pretty(phrase)
        spc.analyzestretch(elements, debug=True)
        raise Exception('...debug complete...')

    # save those phrases that have no matching constructions
    if not cxs:
        nocxs.append(phrase)
    else:
        phrase2cxs[phrase] = cxs
        
    # report status
    if i % 500 == 0 and i:
        print(f'\t{datetime.now()-start}\tdone with iter {i}/{len(timephrases)}')
        
print(f'{datetime.now()-start}\tCOMPLETE')
print('-'*20)
print(f'{len(phrase2cxs)} phrases matched with Constructions...')
print(f'{len(nocxs)} phrases not yet matched with Constructions...')

0:00:00.000037 beginning analysis...
	0:00:10.120056	done with iter 500/3864
	0:00:20.784187	done with iter 1000/3864
	0:00:30.299543	done with iter 1500/3864
	0:00:39.503434	done with iter 2000/3864
	0:00:51.143791	done with iter 2500/3864
	0:01:03.166122	done with iter 3000/3864
	0:01:14.305286	done with iter 3500/3864
0:01:24.362490	COMPLETE
--------------------
3864 phrases matched with Constructions...
0 phrases not yet matched with Constructions...


## Closing Gaps

### Identify Gaps

Find timephrases that contain un-covered words besides waw conjunctions.

In [49]:
gapped = []
tested = []

for ph, cxs in phrase2cxs.items():
    
    tested.append(ph)
    
    ph_slots = set(
        s for s in L.d(ph,'word')
    )
    cx_slots = set(
        s for cx in cxs
            for s in cx.slots
    )
    
    if ph_slots.difference(cx_slots):
        gapped.append(cxs)
        
print(f'{len(gapped)} gapped phrases logged...')

0 gapped phrases logged...


In [50]:
for gp in gapped[:25]:
    for cx in gp:
        showcx(cx)

## Connecting Constructions

Developing a CXbuilder to connect all constructions in a complete phrase.


### Ambiguity with Coordinate CXs

Considerable ambiguity is present in several coordinate constructions:

**`A B and C`**<br>
Given A, B, C == nominal words. Is their relationship `A // B // C` or `A+B // C`. In other words: **what is the relationship of two adjacent nominal words given a list?** Is B a descriptor of A or is it an independent element? 

**`A of B and C`**<br>
Is it, `(A of B) // (C)` or `(A of (B // C)`

Or even:

**`A of B C and D`**<br>
This pattern combines elements from both ambiguous cases.

### Method

To address these ambiguities we will apply a battery of disambiguation attempts. At the core of these attempts is a [Semantic Vector Space](https://en.wikipedia.org/wiki/Vector_space_model), which is able to quantify the semantic distance between two words based on their contextual uses throughout the Hebrew Bible.

The working hypothesis of this method is
> Words in coordination with each other will be more semantically similar (i.e. the least distance in the vector space) than other candidates in the phrase.

Semantic similarity in a vector space is not the only method used, however. Another aspect of semantic closeness is phrase structure. For instance, the identity of phrase types is taken into consideration above semantic similarity. 

In [51]:
class CXbuilderPH(CXbuilder):
    """Build complete phrase constructions."""
    
    def __init__(self, phrase2cxs, semdists, tf):
        CXbuilder.__init__(self)
        
        # set up tf methods
        self.tf = tf
        self.F, self.T, self.L = tf.api.F, tf.api.T, tf.api.L
        
        # map cx to phrase node for context retrieval
        self.cx2phrase = {
            cx:ph 
                for ph in phrase2cxs
                    for cx in phrase2cxs[ph]
        }
        
        self.phrase2cxs = phrase2cxs
        self.semdists = semdists
        
        self.cxs = (        
            self.plus_prep,
            self.adjacent
        )
        self.dripbucket = (
            self.cxph,
        )
        
        self.kind = 'phrase'
        
    def cxph(self, cx):
        """Dripbucket function that returns cx as is."""
        return cx
        
    def get_context(self, cx):
        """Get context for a given cx."""
        phrase = self.cx2phrase.get(cx, None)
        if phrase:
            return self.phrase2cxs[phrase]
        else:
            return tuple()
        
    def getP(self, cx):
        """Index positions on phrase context"""
        positions = self.get_context(cx)
        if positions:
            return Positions(
                cx, positions, default=Construction()
            ).get
        else:
            return Dummy

    def getWk(self, cx):
        """Index walks on phrase context"""
        positions = self.get_context(cx)
        if positions:
            return Walker(cx, positions)
        else:
            return Dummy()
    
    def getindex(
        self, indexable, index, 
        default=Construction()
    ):
        """Safe index on iterables w/out IndexErrors."""
        try:
            return indexable[index]
        except:
            return default
    
    def getname(self, cx):
        """Get a cx name"""
        return cx.name
    
    def getkind(self, cx):
        """Get a cx kind."""
        return cx.kind
    
    def getsuccrole(self, cx, role, index=-1):
        """Get a cx role from a list of successive roles.
        
        e.g.
        [big_head, medium_head, small_head][-1] == small_head
        """
        cands = list(cx.getsuccroles(role))
        try:
            return cands[index]
        except IndexError:
            return Construction()
    
    def string_plus(self, cx, plus=1):
        """Stringifies a CX + N-slots for Levenshtein tests."""
        
        # get all slots in the context for plussing
        allslots = sorted(set(
            s for scx in self.get_context(cx)
                for s in scx.slots
        ))
        
        # get plus slots
        P = (Positions(self.getindex(cx.slots, -1), allslots).get
                 if cx.slots and allslots else Dummy)
        plusses = []
        for i in range(plus, plus+1):
            plusses.append(P(i,-1)) # -1 for null slots (== empty string in T.text)
        plusses = [p for p in plusses if type(p) == int]
        
        # format the text string for Levenshtein testing
        ptxt = T.text(
            cx.slots + tuple(plusses),
            fmt='text-orig-plain'
        ) if cx.slots else ''
        
        return ptxt

    
    def coord(self, cx):
        """A coordinate construction.
        
        In order to match a coordinate cx, we need to determine
        which item in the previous phrase this cx belongs with. 
        This is done using a semantic vector space, which can
        quantify the approximate semantic distance between the
        heads of this cx and a candidate cx.
        
        Criteria utilized in validating a coordinate cx between
        an origin cx and a candidate cx are the following:
            TODO: fill in
        """
        
        F, T = self.F, self.T
        P = self.getP(cx)
        semdist = self.semdists
        Wk = self.getWk(cx)
                         
        # get all top-level cxs behind this one that match in name
        cx_behinds = Wk.back(
            lambda c: c.name == cx.name,
            every=True,
            stop=lambda c: (
                c.name == 'conj' and (c != P(-1))
            )
        )
        
        # if top level phrases produce no results,
        # use subphrases instead
        if not cx_behinds:
            topcontext = self.get_context(cx)
            
            # gather all valid subphrase candidates
            subcontext = []
            for topcx in topcontext:
                for subcx in topcx.subgraph():
                    if type(subcx) == int: # skip TF slots
                        continue
                    if (
                        subcx in topcontext or subcx.name != 'conj'
                        and subcx not in cx
                    ):
                        subcontext.append(subcx)        
            
            # walk the new candidates
            Wk2 = Walker(cx, subcontext)
            cx_behinds = Wk2.back(
                lambda c: c.name != 'conj', 
                default=[P(-2)],
                every=True,
                stop=lambda c: (
                    c.name == 'conj' and (c != P(-1))
                )
            )
        
        # map each back-cx to its last slot to make sure
        # every candidate is the last item in its phrase
        # check is made in next series of lines
        cx2last = {
            cxb:self.getindex(sorted(cxb.slots), -1, 0)
                for cxb in cx_behinds
        }
        
        # find coordinate candidate subphrases that stand
        # at the end of the phrase
        cx_subphrases = []
        
        for cx_back in cx_behinds:
            for cxsp in cx_back.subgraph():
                if type(cxsp) == int:
                    continue
                elif (
                    cx2last[cx_back] in cxsp.slots
                    and cxsp.getrole('head')
                ):
                    cx_subphrases.append(cxsp)
        
        # get subphrase heads for semantic tests
        cx2heads = [
            (cxsp, self.getsuccrole(cxsp,'head'))
                for cxsp in cx_behinds
        ]

        # get head of this cx
        head1 = self.getsuccrole(cx,'head')     
        head1lex = F.lex.v(head1)
        
        # sort on a set of priorities
        # the default sort behavior is used (least to greatest)
        # thus when a bigger value should be more important, 
        # a negative is added to the number
        stringp = self.string_plus
        
        # arrange candidates by priority
        cxpriority = []
        for cxsp, headsp in cx2heads:
            name_eq = 0 if cxsp.name == cx.name else 1
            semantic_dist = semdist.get(
                head1lex,{}
            ).get(F.lex.v(headsp), np.inf)
            size = -len(cxsp.slots)
            levenshtein = lev_dist(stringp(cx), stringp(cxsp))
            slot_dist = -next(iter(cxsp.slots), 0)
            heads = (head1, headsp) # for reporting purposes only
            
            cxpriority.append((
                name_eq,
                semantic_dist,
                size,
                levenshtein,
                slot_dist,
                heads,
                cxsp
            ))
            
        # make the sorting
        cxpriority = sorted(cxpriority, key=lambda k: k[:-1])
        
        # select the first priority candidate
        cand = next(iter(cxpriority), (0,0,Construction()))
        
        # add data for conds report / debugging
        data = collections.defaultdict(str)
        for namescore,sdist,leng,ldist,lslot,heads,cxp in cxpriority:
            # name equality
            data['namescore'] += f'\n\t{cxp} namescore: {namescore}'
            # semantic distance
            data['semdists'] += (
                f'\n\t{round(sdist, 2)}, {F.lex.v(heads[0])} ~ {F.lex.v(heads[1])}, {cxp}'
            )
            # size of cx
            data['size'] += f'\n\t{cxp} length: {abs(leng)}'
            
            # Levenstein distance
            data['ldist'] += f'\n\t{cxp} dist: {ldist}'
            
            # dist of last slot
            data['lslot'] += f'\n\t{cxp} last slot: {abs(lslot)}'
    
        
        return self.test(
            {
                'element': cx,
                'name': 'coord',
                'kind': self.kind,
                'roles': {'part2':cx, 'conj': P(-1), 'part1': cand[-1]},
                'conds': {
                    'P(-1).name == conj':
                        P(-1).name == 'conj',
                    'bool(cand)':
                        bool(cand[-1]),
                    f'name matches {data["namescore"]}\n':
                        bool(cxpriority),
                    f'is shortest sem. distance of {data["semdists"]}\n':
                        bool(cxpriority),
                    f'is longest length of: {data["size"]}\n':
                        bool(cxpriority),
                    f'is shortest Levenshtein distance: {data["ldist"]}\n':
                        bool(cxpriority),
                    f'is closest last slot of: {data["lslot"]}\n':
                        bool(cxpriority)
                }
            }
        )
    
    def appo_name(self, cx):
        """Apposition of name"""
        
        P = self.getP(cx)
        geti = self.getindex
        
        # get head and first slot of construction
        cxhead = self.getsuccrole(cx, 'head') # a tf integer
        headcx = next(iter(cx.graph.pred[cxhead])) # a CX
        first_slot = cx.slots[0] # for tests
        
        # get very last embedded cx in P(-1)
        back = P(-1)
        name = geti(back.slots, -1)
        try:
            namecx = next(iter(back.graph.pred[name])) # a CX
        except KeyError:
            namecx = Construction()
        
        return self.test(
        
            {
                'element': cx,
                'name': 'appo_name',
                'kind': self.kind,
                'roles': {'name': cx, 'head':namecx},
                'conds': {
                    
                    'cx(head).name == cont':
                        headcx.name == 'cont',
                    
                    'cx.name not in {prep_ph}':
                        cx.name not in {'prep_ph'},
                    
                    'bool(P(-1))':
                        bool(P(-1)),
                    
                    'backcx.name == name':
                        namecx.name == 'name',
                    
                    f'F.nu.v({cxhead}) == F.nu.v({name})':
                        F.nu.v(cxhead) == F.nu.v(name),
                    
                    'cxhead == first_slot or first_slot==art':
                        (
                            cxhead == first_slot
                            or self.F.sp.v(first_slot) == 'art'
                        ),
                    
                    # NB:
                    # rule below reveals the need to be able to say
                    # what head_slot should be; i.e., the lexeme should
                    # be semantically consistent with the ID of the proper name
                    # if person, head_slot should ~ person, etc.
                    # but for now I'll use a work-around solution
                    'F.lex.v(head_slot) not in timeword set':
                        F.lex.v(cxhead) not in {'CNH/'}
                }
            }
        )
    
    def adjacent(self, cx):
        """Find adjacent CXs"""
        
        P = self.getP(cx)
        
        return self.test(
            {
                'element': cx,
                'name': 'adjacent',
                'kind': self.kind,
                'roles': {'phrase1':cx, 'phrase2':P(1)},
                'conds': {
                    'cx.name != conj':
                        cx.name != 'conj',
                    'P(1).name != prep':
                        P(1).name != 'prep',
                    'bool(P(1))':
                        bool(P(1)),
                    f'name({P(1).name}) not in (conj, prep_ph)':
                        P(1).name not in {'conj','prep_ph'},
                    'not appo_name(P(1))':
                        not (self.appo_name(P(1)) if P(1) else False),
                    'not appo_name(cx)':
                        not self.appo_name(cx),
                }
            }
        
        )
    
    def plus_prep(self, cx):
        """Find phrase+prep CXs"""
        
        P = self.getP(cx)
                
        return self.test(
            {
                'element': cx,
                'name': '+prep',
                'kind': self.kind,
                'roles': {'+prep': cx, 'head': P(-1)},
                'conds': {
                    'cx.name == prep_ph':
                        cx.name == 'prep_ph',
                    'bool(P(-1))':
                        bool(P(-1)),
                    'P(-1,name) != conj':
                        P(-1).name != 'conj',
                }
            }
        )
    
cxp = CXbuilderPH(phrase2cxs, semdist, A)

## Tests

In [55]:
# A.show(A.search('''

# timephrase
#     word pdp=subs ls#card|prpe lex#KL/|JWM/ st=a

#     <: word lex=JWM/
# ''')[:10])

In [136]:
# the following phrases contain cases that still
# need to be fixed for the coordinate cx; some should
# actually be done in the previous cx builder at subphrase level

to_fix = [
    1450039, # coord, add adjacent advb cx with JWM
    1450075, # coord, add adjacent advb cx with >Z
    1450647, # coord, consider prioritizing Levenshtein over size
    
]

### Test Small

In [162]:
# testph = phrase2cxs[1450540]
# testph

In [161]:
# test = cxp.appo_name(testph[-1])

# showcx(test, conds=True)

### Pattern Matches

In [139]:
def filt_gaps(cx):
    """Isolate cxs with gaps"""
    timephrase = L.u(next(iter(cx.slots)),'phrase')[0]
    if set(L.d(timephrase,'word')) - cx.slots:
        return True
    else:
        return False
    
def filt(cx):
    """Find specific lexeme"""
    timephrase = L.u(next(iter(cx.slots)),'phrase')[0]
    phrasewords = L.d(timephrase, 'word')
    if (
        {'JWM/', 'LJLH/'}.issubset(set(F.lex.v(w) for w in phrasewords))
        and len(phrasewords) == 3
    ):
        return True
    else:
        return False

In [52]:
elements = [
    cx for ph in list(phrase2cxs.values())
        for cx in ph
]

test_search(
    elements, 
    cxp.adjacent, 
    pattern='', 
    shuffle=False,
    #select=lambda c: filt(c),
    extraFeatures='lex st',
    show=150
)

beginning search
	0 found (0/4871)
	17 found (1000/4871)
	35 found (2000/4871)
	61 found (3000/4871)
	78 found (4000/4871)
done at 0:00:08.131292
106 matches found...
showing top 150


{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {'__cx__': 'cont', 'head': 3107},
                   'prep': {'__cx__': 'prep', 'head': 3106}},
    'phrase2': {   '__cx__': 'numb_ph',
                   'head': {'__cx__': 'cont', 'head': 3108},
                   'numb': {'__cx__': 'card', 'head': 3109}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 4518},
                               'head': {'__cx__': 'cont', 'head': 4519}},
                   'prep': {'__cx__': 'prep', 'head': 4517}},
    'phrase2': {   '__cx__': 'numb_ph',
                   'head': {'__cx__': 'cont', 'head': 4522},
                   'numb': {   '__cx__': 'card_chain',
                               'card': {'__cx__': 'card', 'head': 4521},
                               'head': {'__cx__': 'card', 'head': 4520}}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {'__cx__': 'cont', 'head': 16128},
                   'prep': {'__cx__': 'prep', 'head': 16127}},
    'phrase2': {'__cx__': 'cont', 'head': 16129}}



{   '__cx__': 'adjacent',
    'phrase1': {'__cx__': 'cont', 'head': 21638},
    'phrase2': {'__cx__': 'cont', 'head': 21639}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'attrib_ph',
                               'attrib': {   '__cx__': 'defi_ph',
                                             'art': {   '__cx__': 'art',
                                                        'head': 22288},
                                             'head': {   '__cx__': 'ordn',
                                                         'head': 22289}},
                               'head': {   '__cx__': 'defi_ph',
                                           'art': {   '__cx__': 'art',
                                                      'head': 22286},
                                           'head': {   '__cx__': 'cont',
                                                       'head': 22287}}},
                   'prep': {'__cx__': 'prep', 'head': 22285}},
    'phrase2': {'__cx__': 'cont', 'head': 22290}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'prep_ph',
                               'head': {'__cx__': 'cont', 'head': 22356},
                               'prep': {'__cx__': 'prep', 'head': 22355}},
                   'prep': {'__cx__': 'prep', 'head': 22354}},
    'phrase2': {'__cx__': 'cont', 'head': 22357}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'advb_ph',
                   'advb': {'__cx__': 'cont', 'head': 30375},
                   'head': {   '__cx__': 'prep_ph',
                               'head': {'__cx__': 'cont', 'head': 30377},
                               'prep': {'__cx__': 'prep', 'head': 30376}}},
    'phrase2': {   '__cx__': 'advb_ph',
                   'advb': {'__cx__': 'cont', 'head': 30378},
                   'head': {   '__cx__': 'prep_ph',
                               'head': {'__cx__': 'cont', 'head': 30380},
                               'prep': {'__cx__': 'prep', 'head': 30379}}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'advb_ph',
                   'advb': {'__cx__': 'cont', 'head': 30378},
                   'head': {   '__cx__': 'prep_ph',
                               'head': {'__cx__': 'cont', 'head': 30380},
                               'prep': {'__cx__': 'prep', 'head': 30379}}},
    'phrase2': {   '__cx__': 'advb_ph',
                   'advb': {'__cx__': 'cont', 'head': 30381},
                   'head': {   '__cx__': 'prep_ph',
                               'head': {'__cx__': 'cont', 'head': 30383},
                               'prep': {'__cx__': 'prep', 'head': 30382}}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'advb_ph',
                   'advb': {'__cx__': 'cont', 'head': 31086},
                   'head': {'__cx__': 'cont', 'head': 31087}},
    'phrase2': {   '__cx__': 'advb_ph',
                   'advb': {'__cx__': 'cont', 'head': 31088},
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 31089},
                               'head': {'__cx__': 'cont', 'head': 31090}}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 33384},
                               'head': {'__cx__': 'cont', 'head': 33385}},
                   'prep': {'__cx__': 'prep', 'head': 33383}},
    'phrase2': {'__cx__': 'cont', 'head': 33386}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {'__cx__': 'cont', 'head': 35962},
                   'prep': {'__cx__': 'prep', 'head': 35961}},
    'phrase2': {'__cx__': 'cont', 'head': 35963}}



{   '__cx__': 'adjacent',
    'phrase1': {'__cx__': 'cont', 'head': 37486},
    'phrase2': {'__cx__': 'cont', 'head': 37487}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {'__cx__': 'cont', 'head': 38477},
                   'prep': {'__cx__': 'prep', 'head': 38476}},
    'phrase2': {'__cx__': 'cont', 'head': 38478}}



{   '__cx__': 'adjacent',
    'phrase1': {'__cx__': 'cont', 'head': 56848},
    'phrase2': {   '__cx__': 'numb_ph',
                   'head': {'__cx__': 'cont', 'head': 56850},
                   'numb': {'__cx__': 'card', 'head': 56849}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {'__cx__': 'nega', 'head': 61997},
                   'prep': {'__cx__': 'prep', 'head': 61996}},
    'phrase2': {   '__cx__': 'geni_ph',
                   'geni': {'__cx__': 'cont', 'head': 61999},
                   'head': {'__cx__': 'cont', 'head': 61998}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'prep_ph',
                               'head': {'__cx__': 'name', 'head': 78152},
                               'prep': {'__cx__': 'prep', 'head': 78151}},
                   'prep': {'__cx__': 'prep', 'head': 78150}},
    'phrase2': {'__cx__': 'name', 'head': 78153}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 87743},
                               'head': {'__cx__': 'cont', 'head': 87744}},
                   'prep': {'__cx__': 'prep', 'head': 87742}},
    'phrase2': {   '__cx__': 'numb_ph',
                   'head': {'__cx__': 'cont', 'head': 87746},
                   'numb': {'__cx__': 'card', 'head': 87745}}}



{   '__cx__': 'adjacent',
    'phrase1': {'__cx__': 'cont', 'head': 101794},
    'phrase2': {'__cx__': 'cont', 'head': 101795}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'attrib_ph',
                               'attrib': {   '__cx__': 'defi_ph',
                                             'art': {   '__cx__': 'art',
                                                        'head': 107424},
                                             'head': {   '__cx__': 'ordn',
                                                         'head': 107425}},
                               'head': {   '__cx__': 'defi_ph',
                                           'art': {   '__cx__': 'art',
                                                      'head': 107422},
                                           'head': {   '__cx__': 'cont',
                                                       'head': 107423}}},
                   'prep': {'__cx__': 'prep', 'head': 107421}},
    'phrase2': {   '__cx__': 'geni_ph',
                   'geni': {   '__cx__': 'defi_ph',
         

{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {'__cx__': 'cont', 'head': 135365},
                   'prep': {'__cx__': 'prep', 'head': 135364}},
    'phrase2': {'__cx__': 'cont', 'head': 135366}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'geni_ph',
                               'geni': {'__cx__': 'cont', 'head': 137198},
                               'head': {'__cx__': 'cont', 'head': 137197}},
                   'prep': {'__cx__': 'prep', 'head': 137196}},
    'phrase2': {   '__cx__': 'numb_ph',
                   'head': {'__cx__': 'cont', 'head': 137200},
                   'numb': {'__cx__': 'card', 'head': 137199}}}



{   '__cx__': 'adjacent',
    'phrase1': {'__cx__': 'cont', 'head': 139129},
    'phrase2': {   '__cx__': 'numb_ph',
                   'head': {'__cx__': 'cont', 'head': 139131},
                   'numb': {'__cx__': 'card', 'head': 139130}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {'__cx__': 'cont', 'head': 141372},
                   'prep': {'__cx__': 'prep', 'head': 141371}},
    'phrase2': {'__cx__': 'cont', 'head': 141373}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {'__cx__': 'cont', 'head': 141591},
                   'prep': {'__cx__': 'prep', 'head': 141590}},
    'phrase2': {'__cx__': 'cont', 'head': 141592}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {'__cx__': 'cont', 'head': 142457},
                   'prep': {'__cx__': 'prep', 'head': 142456}},
    'phrase2': {'__cx__': 'cont', 'head': 142458}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 145881},
                               'head': {'__cx__': 'cont', 'head': 145882}},
                   'prep': {'__cx__': 'prep', 'head': 145880}},
    'phrase2': {'__cx__': 'cont', 'head': 145883}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'defi_ph',
                   'art': {'__cx__': 'art', 'head': 145988},
                   'head': {'__cx__': 'cont', 'head': 145989}},
    'phrase2': {   '__cx__': 'numb_ph',
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 145991},
                               'head': {'__cx__': 'cont', 'head': 145992}},
                   'numb': {'__cx__': 'card', 'head': 145990}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 153680},
                               'head': {'__cx__': 'cont', 'head': 153681}},
                   'prep': {'__cx__': 'prep', 'head': 153679}},
    'phrase2': {   '__cx__': 'attrib_ph',
                   'attrib': {   '__cx__': 'defi_ph',
                                 'art': {'__cx__': 'art', 'head': 153683},
                                 'head': {'__cx__': 'ordn', 'head': 153684}},
                   'head': {'__cx__': 'cont', 'head': 153682}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'advb_ph',
                   'advb': {'__cx__': 'cont', 'head': 154000},
                   'head': {'__cx__': 'cont', 'head': 154001}},
    'phrase2': {   '__cx__': 'advb_ph',
                   'advb': {'__cx__': 'cont', 'head': 154002},
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 154003},
                               'head': {'__cx__': 'cont', 'head': 154004}}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'advb_ph',
                   'advb': {'__cx__': 'cont', 'head': 156845},
                   'head': {'__cx__': 'cont', 'head': 156846}},
    'phrase2': {   '__cx__': 'advb_ph',
                   'advb': {'__cx__': 'cont', 'head': 156847},
                   'head': {'__cx__': 'cont', 'head': 156848}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'advb_ph',
                   'advb': {'__cx__': 'cont', 'head': 156847},
                   'head': {'__cx__': 'cont', 'head': 156848}},
    'phrase2': {   '__cx__': 'numb_ph',
                   'head': {'__cx__': 'cont', 'head': 156850},
                   'numb': {'__cx__': 'qquant', 'head': 156849}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'advb_ph',
                   'advb': {'__cx__': 'cont', 'head': 162031},
                   'head': {'__cx__': 'cont', 'head': 162032}},
    'phrase2': {   '__cx__': 'advb_ph',
                   'advb': {'__cx__': 'cont', 'head': 162033},
                   'head': {'__cx__': 'cont', 'head': 162034}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'advb_ph',
                   'advb': {'__cx__': 'cont', 'head': 162945},
                   'head': {'__cx__': 'cont', 'head': 162946}},
    'phrase2': {   '__cx__': 'advb_ph',
                   'advb': {'__cx__': 'cont', 'head': 162947},
                   'head': {'__cx__': 'cont', 'head': 162948}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {'__cx__': 'cont', 'head': 167640},
                   'prep': {'__cx__': 'prep', 'head': 167639}},
    'phrase2': {'__cx__': 'cont', 'head': 167641}}



{   '__cx__': 'adjacent',
    'phrase1': {'__cx__': 'cont', 'head': 168715},
    'phrase2': {'__cx__': 'cont', 'head': 168716}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'geni_ph',
                               'geni': {'__cx__': 'name', 'head': 173714},
                               'head': {'__cx__': 'cont', 'head': 173713}},
                   'prep': {'__cx__': 'prep', 'head': 173712}},
    'phrase2': {   '__cx__': 'numb_ph',
                   'head': {'__cx__': 'cont', 'head': 173716},
                   'numb': {'__cx__': 'card', 'head': 173715}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'numb_ph',
                   'head': {'__cx__': 'cont', 'head': 173716},
                   'numb': {'__cx__': 'card', 'head': 173715}},
    'phrase2': {'__cx__': 'cont', 'head': 173717}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'numb_ph',
                   'head': {'__cx__': 'cont', 'head': 183625},
                   'numb': {'__cx__': 'card', 'head': 183624}},
    'phrase2': {   '__cx__': 'numb_ph',
                   'head': {'__cx__': 'cont', 'head': 183628},
                   'numb': {   '__cx__': 'card_chain',
                               'card': {'__cx__': 'card', 'head': 183627},
                               'head': {'__cx__': 'card', 'head': 183626}}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 188233},
                               'head': {'__cx__': 'cont', 'head': 188234}},
                   'prep': {'__cx__': 'prep', 'head': 188232}},
    'phrase2': {'__cx__': 'name', 'head': 188235}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 188368},
                               'head': {'__cx__': 'cont', 'head': 188369}},
                   'prep': {'__cx__': 'prep', 'head': 188367}},
    'phrase2': {'__cx__': 'name', 'head': 188370}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 191433},
                               'head': {'__cx__': 'cont', 'head': 191434}},
                   'prep': {'__cx__': 'prep', 'head': 191432}},
    'phrase2': {'__cx__': 'cont', 'head': 191435}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 192028},
                               'head': {'__cx__': 'cont', 'head': 192029}},
                   'prep': {'__cx__': 'prep', 'head': 192027}},
    'phrase2': {'__cx__': 'cont', 'head': 192030}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 198928},
                               'head': {'__cx__': 'cont', 'head': 198929}},
                   'prep': {'__cx__': 'prep', 'head': 198927}},
    'phrase2': {'__cx__': 'cont', 'head': 198930}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 199466},
                               'head': {'__cx__': 'cont', 'head': 199467}},
                   'prep': {'__cx__': 'prep', 'head': 199465}},
    'phrase2': {'__cx__': 'cont', 'head': 199468}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 201268},
                               'head': {'__cx__': 'cont', 'head': 201269}},
                   'prep': {'__cx__': 'prep', 'head': 201267}},
    'phrase2': {'__cx__': 'cont', 'head': 201270}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 202678},
                               'head': {'__cx__': 'cont', 'head': 202679}},
                   'prep': {'__cx__': 'prep', 'head': 202677}},
    'phrase2': {'__cx__': 'name', 'head': 202680}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 206687},
                               'head': {'__cx__': 'cont', 'head': 206688}},
                   'prep': {'__cx__': 'prep', 'head': 206686}},
    'phrase2': {'__cx__': 'name', 'head': 206689}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 206780},
                               'head': {'__cx__': 'cont', 'head': 206781}},
                   'prep': {'__cx__': 'prep', 'head': 206779}},
    'phrase2': {'__cx__': 'name', 'head': 206782}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 209277},
                               'head': {'__cx__': 'cont', 'head': 209278}},
                   'prep': {'__cx__': 'prep', 'head': 209276}},
    'phrase2': {'__cx__': 'name', 'head': 209279}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 210502},
                               'head': {'__cx__': 'cont', 'head': 210503}},
                   'prep': {'__cx__': 'prep', 'head': 210501}},
    'phrase2': {'__cx__': 'name', 'head': 210504}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 211368},
                               'head': {'__cx__': 'cont', 'head': 211369}},
                   'prep': {'__cx__': 'prep', 'head': 211367}},
    'phrase2': {'__cx__': 'name', 'head': 211370}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'geni_ph',
                               'geni': {'__cx__': 'name', 'head': 212082},
                               'head': {'__cx__': 'cont', 'head': 212081}},
                   'prep': {'__cx__': 'prep', 'head': 212080}},
    'phrase2': {'__cx__': 'name', 'head': 212083}}



{   '__cx__': 'adjacent',
    'phrase1': {'__cx__': 'name', 'head': 212083},
    'phrase2': {'__cx__': 'name', 'head': 212084}}



{   '__cx__': 'adjacent',
    'phrase1': {'__cx__': 'name', 'head': 212084},
    'phrase2': {'__cx__': 'name', 'head': 212085}}



{   '__cx__': 'adjacent',
    'phrase1': {'__cx__': 'name', 'head': 212085},
    'phrase2': {   '__cx__': 'geni_ph',
                   'geni': {'__cx__': 'name', 'head': 212087},
                   'head': {'__cx__': 'cont', 'head': 212086}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'geni_ph',
                               'geni': {   '__cx__': 'geni_ph',
                                           'geni': {   '__cx__': 'defi_ph',
                                                       'art': {   '__cx__': 'art',
                                                                  'head': 217247},
                                                       'head': {   '__cx__': 'cont',
                                                                   'head': 217248}},
                                           'head': {   '__cx__': 'cont',
                                                       'head': 217246}},
                               'head': {'__cx__': 'cont', 'head': 217245}},
                   'prep': {'__cx__': 'prep', 'head': 217244}},
    'phrase2': {'__cx__': 'name', 'head': 217249}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 223744},
                               'head': {'__cx__': 'cont', 'head': 223745}},
                   'prep': {'__cx__': 'prep', 'head': 223743}},
    'phrase2': {'__cx__': 'name', 'head': 223746}}



{   '__cx__': 'adjacent',
    'phrase1': {'__cx__': 'cont', 'head': 231927},
    'phrase2': {'__cx__': 'cont', 'head': 231928}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {'__cx__': 'cont', 'head': 242367},
                   'prep': {'__cx__': 'prep', 'head': 242366}},
    'phrase2': {'__cx__': 'cont', 'head': 242368}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'prep_ph',
                               'head': {'__cx__': 'cont', 'head': 249112},
                               'prep': {'__cx__': 'prep', 'head': 249111}},
                   'prep': {'__cx__': 'prep', 'head': 249110}},
    'phrase2': {'__cx__': 'cont', 'head': 249113}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'prep_ph',
                               'head': {'__cx__': 'cont', 'head': 249328},
                               'prep': {'__cx__': 'prep', 'head': 249327}},
                   'prep': {'__cx__': 'prep', 'head': 249326}},
    'phrase2': {'__cx__': 'cont', 'head': 249329}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 264016},
                               'head': {'__cx__': 'cont', 'head': 264017}},
                   'prep': {'__cx__': 'prep', 'head': 264015}},
    'phrase2': {'__cx__': 'name', 'head': 264018}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'geni_ph',
                               'geni': {'__cx__': 'cont', 'head': 264733},
                               'head': {'__cx__': 'cont', 'head': 264732}},
                   'prep': {'__cx__': 'prep', 'head': 264731}},
    'phrase2': {   '__cx__': 'numb_ph',
                   'head': {   '__cx__': 'geni_ph',
                               'geni': {'__cx__': 'cont', 'head': 264736},
                               'head': {'__cx__': 'cont', 'head': 264735}},
                   'numb': {'__cx__': 'qquant', 'head': 264734}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'attrib_ph',
                               'attrib': {   '__cx__': 'defi_ph',
                                             'art': {   '__cx__': 'art',
                                                        'head': 284195},
                                             'head': {   '__cx__': 'prde',
                                                         'head': 284196}},
                               'head': {   '__cx__': 'defi_ph',
                                           'art': {   '__cx__': 'art',
                                                      'head': 284193},
                                           'head': {   '__cx__': 'cont',
                                                       'head': 284194}}},
                   'prep': {'__cx__': 'prep', 'head': 284192}},
    'phrase2': {'__cx__': 'cont', 'head': 284197}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 289003},
                               'head': {'__cx__': 'cont', 'head': 289004}},
                   'prep': {'__cx__': 'prep', 'head': 289002}},
    'phrase2': {   '__cx__': 'numb_ph',
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 289006},
                               'head': {'__cx__': 'cont', 'head': 289007}},
                   'numb': {'__cx__': 'card', 'head': 289005}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'geni_ph',
                               'geni': {'__cx__': 'name', 'head': 290929},
                               'head': {'__cx__': 'cont', 'head': 290928}},
                   'prep': {'__cx__': 'prep', 'head': 290927}},
    'phrase2': {'__cx__': 'name', 'head': 290930}}



{   '__cx__': 'adjacent',
    'phrase1': {'__cx__': 'name', 'head': 290930},
    'phrase2': {'__cx__': 'name', 'head': 290931}}



{   '__cx__': 'adjacent',
    'phrase1': {'__cx__': 'name', 'head': 290931},
    'phrase2': {'__cx__': 'name', 'head': 290932}}



{   '__cx__': 'adjacent',
    'phrase1': {'__cx__': 'name', 'head': 290932},
    'phrase2': {   '__cx__': 'geni_ph',
                   'geni': {'__cx__': 'name', 'head': 290934},
                   'head': {'__cx__': 'cont', 'head': 290933}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'geni_ph',
                               'geni': {'__cx__': 'name', 'head': 299550},
                               'head': {'__cx__': 'cont', 'head': 299549}},
                   'prep': {'__cx__': 'prep', 'head': 299548}},
    'phrase2': {'__cx__': 'name', 'head': 299551}}



{   '__cx__': 'adjacent',
    'phrase1': {'__cx__': 'name', 'head': 299551},
    'phrase2': {'__cx__': 'name', 'head': 299552}}



{   '__cx__': 'adjacent',
    'phrase1': {'__cx__': 'name', 'head': 299552},
    'phrase2': {   '__cx__': 'geni_ph',
                   'geni': {'__cx__': 'name', 'head': 299554},
                   'head': {'__cx__': 'cont', 'head': 299553}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {'__cx__': 'prin', 'head': 306762},
                   'prep': {'__cx__': 'prep', 'head': 306761}},
    'phrase2': {'__cx__': 'cont', 'head': 306763}}



{   '__cx__': 'adjacent',
    'phrase1': {'__cx__': 'cont', 'head': 320455},
    'phrase2': {'__cx__': 'cont', 'head': 320456}}



{   '__cx__': 'adjacent',
    'phrase1': {'__cx__': 'cont', 'head': 321445},
    'phrase2': {'__cx__': 'cont', 'head': 321446}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {'__cx__': 'nega', 'head': 340049},
                   'prep': {'__cx__': 'prep', 'head': 340048}},
    'phrase2': {'__cx__': 'cont', 'head': 340050}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {'__cx__': 'prde', 'head': 346911},
                   'prep': {'__cx__': 'prep', 'head': 346910}},
    'phrase2': {   '__cx__': 'numb_ph',
                   'head': {'__cx__': 'cont', 'head': 346915},
                   'numb': {   '__cx__': 'card_chain',
                               'card': {'__cx__': 'card', 'head': 346914},
                               'conj': {'__cx__': 'conj', 'head': 346913},
                               'head': {'__cx__': 'card', 'head': 346912}}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {'__cx__': 'cont', 'head': 348662},
                   'prep': {'__cx__': 'prep', 'head': 348661}},
    'phrase2': {'__cx__': 'cont', 'head': 348663}}



{   '__cx__': 'adjacent',
    'phrase1': {'__cx__': 'cont', 'head': 349113},
    'phrase2': {'__cx__': 'cont', 'head': 349114}}



{   '__cx__': 'adjacent',
    'phrase1': {'__cx__': 'cont', 'head': 349154},
    'phrase2': {'__cx__': 'cont', 'head': 349155}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {'__cx__': 'cont', 'head': 355632},
                   'prep': {'__cx__': 'prep', 'head': 355631}},
    'phrase2': {'__cx__': 'cont', 'head': 355633}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {'__cx__': 'cont', 'head': 356366},
                   'prep': {'__cx__': 'prep', 'head': 356365}},
    'phrase2': {   '__cx__': 'defi_ph',
                   'art': {'__cx__': 'art', 'head': 356367},
                   'head': {'__cx__': 'cont', 'head': 356368}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 361455},
                               'head': {'__cx__': 'cont', 'head': 361456}},
                   'prep': {'__cx__': 'prep', 'head': 361454}},
    'phrase2': {   '__cx__': 'numb_ph',
                   'head': {   '__cx__': 'geni_ph',
                               'geni': {   '__cx__': 'geni_ph',
                                           'geni': {   '__cx__': 'cont',
                                                       'head': 361460},
                                           'head': {   '__cx__': 'cont',
                                                       'head': 361459}},
                               'head': {'__cx__': 'cont', 'head': 361458}},
                   'numb': {'__cx__': 'qquant', 'head': 361457}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {'__cx__': 'nega', 'head': 361720},
                   'prep': {'__cx__': 'prep', 'head': 361719}},
    'phrase2': {'__cx__': 'cont', 'head': 361721}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'numb_ph',
                   'head': {'__cx__': 'cont', 'head': 365527},
                   'numb': {'__cx__': 'qquant', 'head': 365528}},
    'phrase2': {   '__cx__': 'numb_ph',
                   'head': {'__cx__': 'cont', 'head': 365532},
                   'numb': {   '__cx__': 'card_chain',
                               'card': {'__cx__': 'card', 'head': 365531},
                               'conj': {'__cx__': 'conj', 'head': 365530},
                               'head': {'__cx__': 'card', 'head': 365529}}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 366818},
                               'head': {'__cx__': 'cont', 'head': 366819}},
                   'prep': {'__cx__': 'prep', 'head': 366817}},
    'phrase2': {'__cx__': 'name', 'head': 366820}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'numb_ph',
                   'head': {'__cx__': 'cont', 'head': 367502},
                   'numb': {'__cx__': 'card', 'head': 367501}},
    'phrase2': {'__cx__': 'cont', 'head': 367503}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {'__cx__': 'cont', 'head': 375508},
                   'prep': {'__cx__': 'prep', 'head': 375507}},
    'phrase2': {   '__cx__': 'numb_ph',
                   'head': {'__cx__': 'cont', 'head': 375509},
                   'numb': {   '__cx__': 'card_chain',
                               'card': {   '__cx__': 'card_chain',
                                           'card': {   '__cx__': 'card',
                                                       'head': 375513},
                                           'head': {   '__cx__': 'card',
                                                       'head': 375512}},
                               'conj': {'__cx__': 'conj', 'head': 375511},
                               'head': {'__cx__': 'card', 'head': 375510}}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'numb_ph',
                   'head': {'__cx__': 'cont', 'head': 376480},
                   'numb': {'__cx__': 'card', 'head': 376479}},
    'phrase2': {'__cx__': 'cont', 'head': 376481}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'prep_ph',
                               'head': {   '__cx__': 'defi_ph',
                                           'art': {   '__cx__': 'art',
                                                      'head': 377198},
                                           'head': {   '__cx__': 'cont',
                                                       'head': 377199}},
                               'prep': {'__cx__': 'prep', 'head': 377197}},
                   'prep': {'__cx__': 'prep', 'head': 377196}},
    'phrase2': {'__cx__': 'cont', 'head': 377200}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {'__cx__': 'cont', 'head': 378045},
                   'prep': {'__cx__': 'prep', 'head': 378044}},
    'phrase2': {'__cx__': 'cont', 'head': 378046}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'geni_ph',
                               'geni': {'__cx__': 'name', 'head': 383423},
                               'head': {'__cx__': 'cont', 'head': 383422}},
                   'prep': {'__cx__': 'prep', 'head': 383421}},
    'phrase2': {   '__cx__': 'geni_ph',
                   'geni': {'__cx__': 'card', 'head': 383425},
                   'head': {'__cx__': 'cont', 'head': 383424}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'defi_ph',
                   'art': {'__cx__': 'art', 'head': 383556},
                   'head': {'__cx__': 'cont', 'head': 383557}},
    'phrase2': {'__cx__': 'cont', 'head': 383558}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'geni_ph',
                               'geni': {'__cx__': 'name', 'head': 383716},
                               'head': {'__cx__': 'cont', 'head': 383715}},
                   'prep': {'__cx__': 'prep', 'head': 383714}},
    'phrase2': {   '__cx__': 'geni_ph',
                   'geni': {'__cx__': 'card', 'head': 383718},
                   'head': {'__cx__': 'cont', 'head': 383717}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'adjv_ph',
                               'adjv': {'__cx__': 'cont', 'head': 389047},
                               'head': {'__cx__': 'cont', 'head': 389046}},
                   'prep': {'__cx__': 'prep', 'head': 389045}},
    'phrase2': {'__cx__': 'cont', 'head': 389048}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'geni_ph',
                               'geni': {'__cx__': 'name', 'head': 389941},
                               'head': {'__cx__': 'cont', 'head': 389940}},
                   'prep': {'__cx__': 'prep', 'head': 389939}},
    'phrase2': {'__cx__': 'name', 'head': 389942}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {'__cx__': 'name', 'head': 392144},
                   'prep': {'__cx__': 'prep', 'head': 392143}},
    'phrase2': {'__cx__': 'name', 'head': 392145}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'advb_ph',
                   'advb': {'__cx__': 'cont', 'head': 397009},
                   'head': {'__cx__': 'cont', 'head': 397010}},
    'phrase2': {   '__cx__': 'advb_ph',
                   'advb': {'__cx__': 'cont', 'head': 397011},
                   'head': {'__cx__': 'cont', 'head': 397012}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'attrib_ph',
                               'attrib': {   '__cx__': 'defi_ph',
                                             'art': {   '__cx__': 'art',
                                                        'head': 399743},
                                             'head': {   '__cx__': 'prde',
                                                         'head': 399744}},
                               'head': {   '__cx__': 'defi_ph',
                                           'art': {   '__cx__': 'art',
                                                      'head': 399741},
                                           'head': {   '__cx__': 'cont',
                                                       'head': 399742}}},
                   'prep': {'__cx__': 'prep', 'head': 399740}},
    'phrase2': {'__cx__': 'cont', 'head': 399745}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'geni_ph',
                               'geni': {'__cx__': 'name', 'head': 407329},
                               'head': {'__cx__': 'cont', 'head': 407328}},
                   'prep': {'__cx__': 'prep', 'head': 407327}},
    'phrase2': {'__cx__': 'cont', 'head': 407330}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'attrib_ph',
                               'attrib': {   '__cx__': 'defi_ph',
                                             'art': {   '__cx__': 'art',
                                                        'head': 410224},
                                             'head': {   '__cx__': 'prde',
                                                         'head': 410225}},
                               'head': {   '__cx__': 'defi_ph',
                                           'art': {   '__cx__': 'art',
                                                      'head': 410222},
                                           'head': {   '__cx__': 'cont',
                                                       'head': 410223}}},
                   'prep': {'__cx__': 'prep', 'head': 410221}},
    'phrase2': {   '__cx__': 'numb_ph',
                   'head': {'__cx__': 'cont', 'head': 410227}

{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 410883},
                               'head': {'__cx__': 'cont', 'head': 410884}},
                   'prep': {'__cx__': 'prep', 'head': 410882}},
    'phrase2': {   '__cx__': 'numb_ph',
                   'head': {'__cx__': 'cont', 'head': 410886},
                   'numb': {'__cx__': 'card', 'head': 410885}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 410975},
                               'head': {'__cx__': 'cont', 'head': 410976}},
                   'prep': {'__cx__': 'prep', 'head': 410974}},
    'phrase2': {   '__cx__': 'geni_ph',
                   'geni': {   '__cx__': 'geni_ph',
                               'geni': {'__cx__': 'name', 'head': 410979},
                               'head': {'__cx__': 'cont', 'head': 410978}},
                   'head': {'__cx__': 'cont', 'head': 410977}}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 412640},
                               'head': {'__cx__': 'cont', 'head': 412641}},
                   'prep': {'__cx__': 'prep', 'head': 412639}},
    'phrase2': {'__cx__': 'name', 'head': 412642}}



{   '__cx__': 'adjacent',
    'phrase1': {   '__cx__': 'prep_ph',
                   'head': {   '__cx__': 'defi_ph',
                               'art': {'__cx__': 'art', 'head': 412982},
                               'head': {'__cx__': 'cont', 'head': 412983}},
                   'prep': {'__cx__': 'prep', 'head': 412981}},
    'phrase2': {'__cx__': 'name', 'head': 412984}}



{   '__cx__': 'adjacent',
    'phrase1': {'__cx__': 'cont', 'head': 418580},
    'phrase2': {   '__cx__': 'numb_ph',
                   'head': {   '__cx__': 'geni_ph',
                               'geni': {'__cx__': 'name', 'head': 418583},
                               'head': {'__cx__': 'cont', 'head': 418582}},
                   'numb': {'__cx__': 'qquant', 'head': 418581}}}



## Stretch Tests

Testing across a whole phrase.

In [53]:
# test = cxp.analyzestretch(phrase2cxs[1449168], debug=True)
# for res in test:
#     showcx(res, conds=False)