# Edge Cases by Tense

The opposition that emerges from the data is between yiqtol and wayyiqtol, the two dominant tenses (see esp. [`2_markers_with_tenses.ipynb`](2_markers_with_tenses.ipynb)). Among the time markers dominated by one of these tenses, however, there are edge cases in which the opposing tense is used. Is it possible to isolate the circumstances that lead to these exceptions?

The examples examined here are:

* לעולם - yiqtol-dominant (46 cases) with 2 wayyiqtol edge cases
* עד היום הזה - wayyiqtol-dominant (35) with 3 yiqtol edge cases

Also of interest is the role of the qatal, which varies considerably among both yiqtol-dominant and wayyiqtol-dominant time markers. Two qatal-dominant time markers are of particular interest:

* היום הזה - qatal-dominant (14), yiqtol-secondary (7)
* אחריו - qatal-dominant (15), wayyiqtol-secondary (4)

The strategy of this notebook is simple: gather data on the edge cases and examine it manually, by process of elimination, for features that may help explain them. Why, for instance, would a wayyiqtol appear with a time marker that usually accompanies a yiqtol verb?

Factors to consider:

* Other words/particles before the clause in question (semantic signals), interpreted within the discourse structure.
* Other verbs/tenses before the clause in question, interpreted within the discourse structure.
* How much is tense dominance a product of uneven distribution within the Hebrew Bible (HB)? E.g., are the qatal-dominant time markers truly qatal-dominant, or does the statistic reflect a motif or repeated phrase in a single work?
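The distribution question in the last factor can be approached with a simple tally per book. The following is a minimal sketch, not part of the original analysis: `book_distribution` and `sample_refs` are hypothetical names, and the references are made-up placeholders rather than data from the time-marker set.

```python
from collections import Counter

def book_distribution(references):
    """Count clauses per book; return (book, count, share) rows, largest first.

    `references` is an iterable of (book, chapter, verse) tuples.
    """
    counts = Counter(book for book, chapter, verse in references)
    total = sum(counts.values())
    return [(book, n, round(n / total, 2)) for book, n in counts.most_common()]

# hypothetical references for illustration only (not taken from the dataset)
sample_refs = [
    ('Deuteronomy', 2, 22),
    ('Deuteronomy', 3, 14),
    ('Deuteronomy', 10, 8),
    ('Joshua', 4, 9),
]

rows = book_distribution(sample_refs)
for book, n, share in rows:
    print(f'{book}: {n} ({share:.0%})')
```

In this notebook the tuples could presumably be collected with `T.sectionFromNode(clause)` for each clause of a marker. A single book holding most of the clauses would suggest that the dominance reflects a repeated phrase rather than a general pattern.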

In [1]:
# import modules and Text-Fabric

import pickle, collections, random
import pandas as pd
from pprint import pprint
from tf.fabric import Fabric
from IPython.display import display, HTML

TF = Fabric(modules='hebrew/etcbc4c', silent=True)
api = TF.load('''book chapter verse
                 pdp vt domain lex tab
                 typ function
              ''')

api.makeAvailableIn(globals())

  0.00s loading features ...
   |     0.01s B book                 from /Users/Cody/github/text-fabric-data/hebrew/etcbc4c
   |     0.00s B chapter              from /Users/Cody/github/text-fabric-data/hebrew/etcbc4c
   |     0.01s B verse                from /Users/Cody/github/text-fabric-data/hebrew/etcbc4c
   |     0.11s B pdp                  from /Users/Cody/github/text-fabric-data/hebrew/etcbc4c
   |     0.11s B vt                   from /Users/Cody/github/text-fabric-data/hebrew/etcbc4c
   |     0.02s B domain               from /Users/Cody/github/text-fabric-data/hebrew/etcbc4c
   |     0.12s B lex                  from /Users/Cody/github/text-fabric-data/hebrew/etcbc4c
   |     0.02s B tab                  from /Users/Cody/github/text-fabric-data/hebrew/etcbc4c
   |     0.20s B typ                  from /Users/Cody/github/text-fabric-data/hebrew/etcbc4c
   |     0.07s B function             from /Users/Cody/github/text-fabric-data/hebrew/etcbc4c
   |     0.00s Feature overview

In [2]:
# import custom function for weqatal detection
from functions.verbs import is_weqt

In [3]:
# import time markers data
tm_data_file = 'data/time_markers.pickle'

# load data
with open(tm_data_file, 'rb') as infile:
    tm_data = pickle.load(infile)

print('data available: ', ', '.join(tm_data.keys()))

data available:  markers, top_markers, stats_rows, preposition_cl_lists


In [4]:
# assign the data
markers = tm_data['markers']
top_markers = tm_data['top_markers']
stats_rows = tm_data['stats_rows']

In [5]:
# show the keys of the data available for each time marker
markers['L <WLM'].keys()

dict_keys(['count', 'clauses', 'tense_cl_lists', 'tense_counts', 'tense_percents', 'example_phrase'])
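To make the notions of "dominant" tense and "edge case" concrete, here is a hedged sketch of how they could be derived from a `tense_cl_lists`-style dictionary. It assumes that the dictionary maps a tense code (e.g. `impf`, `wayq`) to a list of clause nodes, as the later cells suggest; `tense_profile` and the toy node numbers are hypothetical and do not reproduce how `tense_counts` or `tense_percents` were actually computed.

```python
def tense_profile(tense_cl_lists):
    """Summarise a tense -> clause-list mapping.

    Returns (counts, percents, dominant, edge_cases), where edge_cases
    holds every attested tense other than the dominant one.
    """
    counts = {tense: len(clauses) for tense, clauses in tense_cl_lists.items()}
    total = sum(counts.values())
    percents = ({tense: round(100 * n / total, 1) for tense, n in counts.items()}
                if total else {})
    dominant = max(counts, key=counts.get) if counts else None
    edge_cases = {tense: n for tense, n in counts.items()
                  if tense != dominant and n > 0}
    return counts, percents, dominant, edge_cases

# toy data: the integers stand in for clause nodes (assumption)
toy = {'impf': [101, 102, 103], 'wayq': [104], 'perf': []}

counts, percents, dominant, edge_cases = tense_profile(toy)
print(dominant, percents, edge_cases)
```

With the real data, `tense_profile(markers['L <WLM']['tense_cl_lists'])` would be the analogous call, provided the structure matches the assumption above.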

# Basic Examination

Print the plain-text examples and make comparisons. Some examples come from the dominant-tense selections; more extensive examples are provided for the edge cases.

## לעולם, yiqtol dominant


In [6]:
for_forever = 'L <WLM'

# get yiqtol examples
for_forever_yiqtols = markers[for_forever]['tense_cl_lists']['impf']

# five random yiqtol samples (drawn with replacement, so repeats are possible)
rand_forever_yiqtols = random.choices(for_forever_yiqtols, k=5)

# get wayyiqtol examples
for_forever_wayyiqtols = markers[for_forever]['tense_cl_lists']['wayq']

len(for_forever_wayyiqtols)

2

In [7]:
html_span = '<span style="font-size: 14pt; font-family: Times New Roman">{content}</span>'

# show header
display(HTML(html_span.format(content='לעולם with Yiqtol')))
print()

# print examples of yiqtols with marker
for clause in for_forever_yiqtols:
    
    book, chapter, verse = T.sectionFromNode(clause)
    reference = f'{book} {chapter}.{verse}'.replace('_',' ')
    text = T.text(L.d(clause, otype='word'))
    
    # display text in readable format
    display(HTML(html_span.format(content=reference)))
    display(HTML(html_span.format(content=text)))
    print()

## Look at All Time Marker Phrases with Wayyiqtol + ל

In [11]:
# show header
display(HTML(html_span.format(content='ל')))
print()

# print wayyiqtol clauses whose time marker begins with the preposition ל
for i, clause in enumerate(tm_data['preposition_cl_lists']['L']['wayq']):
    
    book, chapter, verse = T.sectionFromNode(clause)
    reference = f'{book} {chapter}.{verse}'.replace('_',' ')
    text = T.text(L.d(clause, otype='word'))
    
    # display text in readable format
    display(HTML(html_span.format(content=i+1)))
    display(HTML(html_span.format(content=reference)))
    display(HTML(html_span.format(content=text)))
    print()

### Print Edge Cases with Contextual Data in the Chapter

The edge cases are marked in red. Clause atoms that contain other time markers are marked in blue to aid interpretation. Each line also shows the clause atom's tab (indentation) level and its type.

In [12]:
# print examples of wayyiqtol with for_forever marker
# also print contextual information/data

def display_edge_cases(clause_atom_list):

    html_div =\
    '''
    <div style="font-size: 15pt; 
                font-family: Times New Roman; 
                direction: rtl; 
                color:{color};
                width: 80%">

                {content} 
    </div>
    '''


    # show header
    display(HTML(html_span.format(content='לעולם with Wayyiqtol')))
    print()

    for clause in clause_atom_list:

        book, chapter, verse = T.sectionFromNode(clause)

        # get every clause atom in the chapter
        chapter_node = L.u(clause, otype='chapter')[0]
        chapter_clause_atoms = L.d(chapter_node, otype='clause_atom')

        # show header
        display(HTML(html_span.format(content=f'{book} {chapter}')))
        print()

        for ca in chapter_clause_atoms:

            # look for time markers
            time_markers = [phrase for phrase in L.d(ca, otype='phrase')
                               if F.function.v(phrase) == 'Time'
                           ]

            text = T.text(L.d(ca, otype='word'))
            indent = '...' * F.tab.v(ca)
            typ = F.typ.v(ca)
            text_indented = str(F.tab.v(ca)) + '&nbsp;&nbsp;' + typ + '&nbsp;&nbsp;&nbsp;&nbsp;' + indent + text

            # format color
            cur_clause = L.u(ca, otype='clause')[0]
            color = 'red' if cur_clause == clause\
                    else 'blue' if time_markers\
                    else ''

            display(HTML(html_div.format(color=color, content=text_indented)))


In [13]:
display_edge_cases(for_forever_wayyiqtols)
