<img align="right" src="images/tf-small.png"/>

# ETCBC versions

In this notebook we try to map the nodes between the versions `3`, `4`, `4b` and `2016` of the BHSA dataset.

If we succeed, Text-Fabric notebooks that are based on an older version of the data
can also be used unmodified on newer versions of the data.

In general, node mappings between versions cannot be perfect. We try and see how far we get.

Let us start with *slot* mappings.
We map the slots of a version to the slots of the next version.
Mappings go from old to new, and they are between successive versions.

We have data in text-fabric format for the ETCBC Hebrew Bible Database, versions `3`, `4`, `4b`, and `2016`.

Stephen Ku has prepared a Strong number mapping for version `4`, based on 
[OpenScriptures Bible Lexicon](https://github.com/openscriptures/HebrewLexicon).

This provides us with a nice use case:
can we apply the Strong number mapping for version `4` to versions `3`, `4b` and `2016`
as well?
See notebook
[evolutionStrong](https://github.com/ETCBC/bhsa/blob/master/programs/evolutionStrong.ipynb)
for how we add Strong numbers to the BHSA dataset.

Below we will get a pretty good view of the differences between the versions.
We use the
[BHSA transcription](https://shebanq.ancient-data.org/shebanq/static/docs/BHSA-transcription.pdf)
to write down the diffs.

In [1]:
import os,collections
from utils import caption
from tf.fabric import Fabric

We specify our versions and the subtle differences between them as far as they are relevant.

In [2]:
baseDir = '~/github/etcbc/bhsa/tf'

versions = '''
    3 
    4 
    4b 
    2016
'''.strip().split()

versionInfo = {
    '': dict(
            OCC='g_word',
            LEX='lex',
        ),
    '3': dict(
            OCC='text_plain',
            LEX='lexeme',
        ),
} 
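As a small illustration of how this table is consulted: a version without its own entry falls back to the default entry under the empty key. The sketch below uses hypothetical names (`version_info`, `feature_names`) and does not touch Text-Fabric.

```python
# Hypothetical stand-in for the versionInfo table above.
version_info = {
    '': dict(OCC='g_word', LEX='lex'),          # default feature names
    '3': dict(OCC='text_plain', LEX='lexeme'),  # version 3 uses other names
}

def feature_names(version):
    """Return the feature names for a version, falling back to the default entry."""
    return version_info.get(version, version_info[''])

for v in ['3', '4', '4b', '2016']:
    names = feature_names(v)
    print(v, names['OCC'], names['LEX'])
```

Only version `3` needs its own entry; the other versions share the default names.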

Load all versions in one go!

In [3]:
TF = {}
api = {}
for v in versions:
    for (param, value) in versionInfo.get(v, versionInfo['']).items():
        globals()[param] = value
    caption(4, 'Version -> {} <- loading ...'.format(v))
    TF[v] = Fabric(locations='{}/{}'.format(baseDir, v), modules=[''])
    api[v] = TF[v].load('{} {}'.format(OCC, LEX))

..............................................................................................
.       0.00s Version -> 3 <- loading ...                                                    .
..............................................................................................
This is Text-Fabric 3.0.6
Api reference : https://github.com/Dans-labs/text-fabric/wiki/Api
Tutorial      : https://github.com/Dans-labs/text-fabric/blob/master/docs/tutorial.ipynb
Example data  : https://github.com/Dans-labs/text-fabric-data

118 features found and 0 ignored
  0.00s loading features ...
   |     0.11s B lexeme               from /Users/dirk/github/etcbc/bhsa/tf/3
   |     0.18s B text_plain           from /Users/dirk/github/etcbc/bhsa/tf/3
   |     0.00s Feature overview: 115 for nodes; 2 for edges; 1 configs; 7 computed
  4.09s All features loaded/computed - for details use loadLog()
..............................................................................................
.       4.

We want to switch easily between the APIs for the versions.

In [4]:
def activate(v):
    for (param, value) in versionInfo.get(v, versionInfo['']).items():
        globals()[param] = value
    api[v].makeAvailableIn(globals())
    caption(4, 'Active version is now -> {} <-'.format(v))

Inspect the number of slots in all versions.

In [5]:
nSlots = {}
for v in versions:
    activate(v)
    nSlots[v] = F.otype.maxSlot
    caption(0, '\t {} slots'.format(nSlots[v]))

..............................................................................................
.         17s Active version is now -> 3 <-                                                  .
..............................................................................................
|         17s 	 426499 slots
..............................................................................................
.         17s Active version is now -> 4 <-                                                  .
..............................................................................................
|         17s 	 426555 slots
..............................................................................................
.         17s Active version is now -> 4b <-                                                 .
..............................................................................................
|         17s 	 426568 slots
..........................................................

# Method

When we compare two versions, we inspect the lexemes found at corresponding positions in the versions.
We start at the beginning, and when the lexemes do not match, we have a closer look.

However, in order not to be disturbed by minor discrepancies in the lexemes, we mask the lexemes: we
apply a few transformations to them, such as removing alefs and waws, and finally even turning them into
sorted sets of letters, thereby losing the order and multiplicity of the letters.
We also strip the disambiguation marks.

We maintain a current mapping between the slots of the two versions, and we update it if we encounter
disturbances. 
Initially, this map is the identity map.

What we encounter as remaining differences boils down to the following:

* a lexeme is split into two lexemes with the same total material, typically involving `H`, `MN`, or `B`
* the lexeme is part of a special case, listed in the `cases` table (which has been found by repeatedly
  chasing the first remaining difference)
* both lexemes differ, but that's it: no map updates have to be done.
  
The first two types of cases can be solved by splitting a lexeme into `k` parts or combining `k` lexemes into one.
After that the mapping has to be shifted to the right or to the left from a certain point onwards.
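To make the shifting concrete, here is a minimal sketch on a toy mapping of six slots. The helper names `split` and `collapse` are hypothetical; the loops follow the same index arithmetic as the `doDiffs` function further down, but on a plain dictionary.

```python
def split(mp, n, k, max_slot):
    """Old slot n became k new slots: shift the targets of all later slots right by k-1."""
    for m in range(n + 1, max_slot + 1):
        mp[m] += k - 1

def collapse(mp, n, k, max_slot):
    """Old slots n..n+k-1 became one new slot: shift later targets left, map the run to mp[n]."""
    for m in range(max_slot, n + k - 1, -1):
        mp[m] = mp[m - k + 1]
    for m in range(n + 1, n + k):
        mp[m] = mp[n]

MAX_SLOT = 6

mp_split = {n: n for n in range(1, MAX_SLOT + 1)}  # identity map
split(mp_split, 3, 2, MAX_SLOT)
print(mp_split)     # {1: 1, 2: 2, 3: 3, 4: 5, 5: 6, 6: 7}

mp_collapse = {n: n for n in range(1, MAX_SLOT + 1)}
collapse(mp_collapse, 3, 2, MAX_SLOT)
print(mp_collapse)  # {1: 1, 2: 2, 3: 3, 4: 3, 5: 4, 6: 5}
```

After a split, the old slot keeps pointing at the first of its new parts; after a collapse, all collapsed old slots point at the same new slot.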

The loop then is as follows:

* find the first slot with a lexeme in the first version that is different from the lexeme at the mapped slot
  in the second version
* analyse what is the case:
  * if the disturbance is recognized on the basis of existing patterns and cases, update the map and
    consider this case solved
  * if the disturbance is not recognized, the case is unsolved, and we break out of the loop.
    More analysis is needed, and the outcome of that has to be coded as an extra pattern or case.
* if the status is solved, go back to the first step

We end up with a mapping from the slots of the first version to those of the other version that links
slots with approximately equal lexemes together.
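The loop described above can be sketched on two tiny word lists. This is a simplified toy under stated assumptions: lexemes are compared unmasked, the only recognized pattern is a split on `_`, and the function name `align` is hypothetical.

```python
def align(old, new):
    """Map 1-based slots of `old` to slots of `new`, recognizing only splits on '_'."""
    mp = {n: n for n in range(1, len(old) + 1)}  # start with the identity map
    start = 1
    while True:
        # first slot at or after `start` whose lexeme differs from its mapped partner
        diff = next(
            (n for n in range(start, len(old) + 1) if old[n - 1] != new[mp[n] - 1]),
            None,
        )
        if diff is None:
            return mp  # no more differences
        if '_' not in old[diff - 1]:
            # unrecognized disturbance: more analysis would be needed
            raise ValueError('unsolved difference at slot {}'.format(diff))
        parts = old[diff - 1].count('_') + 1
        for m in range(diff + 1, len(old) + 1):
            mp[m] += parts - 1  # shift everything after the split
        start = diff + 1

old = ['W', 'B>R_CB<', 'W', 'HLK']
new = ['W', 'B>R', 'CB<', 'W', 'HLK']
print(align(old, new))  # {1: 1, 2: 2, 3: 4, 4: 5}
```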

# Lexeme masking
We start by defining our masking function, and compile lists of all lexemes and masked lexemes for all versions.

In [7]:
masks = [
    (lambda lex: lex.rstrip('[/='),                         'strip disambiguation'),
    (lambda lex: lex[0:-2] if lex.endswith('JM') else lex,  'remove JM'),
    (lambda lex: lex[0:-2] if lex.endswith('WT') else lex,  'remove WT'),
    (lambda lex: lex.replace('J', ''),                      'remove J'),
    (lambda lex: lex.replace('>', ''),                      'remove Alef'),
    (lambda lex: lex.replace('W', ''),                      'remove W'),
    (lambda lex: lex.replace('Z', 'N'),                     'identify Z and N'),
    (lambda lex: lex.rstrip('HT'),                          'strip HT'),
    (lambda lex: ''.join(sorted(set(lex))) + '_' * lex.count('_'), 'ignore order and multiplicity'),
]

def mask(lex, trans=None):
    if trans is not None:
        return masks[trans][0](lex)
    for (fun, desc) in masks:
        lex = fun(lex)
    return lex
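To see what masking does to concrete lexemes, the following self-contained sketch repeats the transformations (without their descriptions) and applies them to a few lexemes that occur in the diffs further down. The list is a copy for illustration only, not a separate definition.

```python
masks = [
    lambda lex: lex.rstrip('[/='),                         # strip disambiguation
    lambda lex: lex[:-2] if lex.endswith('JM') else lex,   # remove JM
    lambda lex: lex[:-2] if lex.endswith('WT') else lex,   # remove WT
    lambda lex: lex.replace('J', ''),                      # remove J
    lambda lex: lex.replace('>', ''),                      # remove Alef
    lambda lex: lex.replace('W', ''),                      # remove W
    lambda lex: lex.replace('Z', 'N'),                     # identify Z and N
    lambda lex: lex.rstrip('HT'),                          # strip trailing H and T
    lambda lex: ''.join(sorted(set(lex))) + '_' * lex.count('_'),  # ignore order and multiplicity
]

def mask(lex):
    """Apply all transformations in order."""
    for fun in masks:
        lex = fun(lex)
    return lex

for lex in ['CXH[', 'XWH[', 'B>R_CB</', 'MQYT/', 'HLZH']:
    print('{:<10} -> {}'.format(lex, mask(lex)))
# CXH[       -> CX
# XWH[       -> X
# B>R_CB</   -> <BCR__
# MQYT/      -> MQY
# HLZH       -> HLN
```

Note that `rstrip` removes any trailing run of the given characters, not a literal suffix, and that the underscore survives in the sorted set and is then appended once more per occurrence.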

Carry out the lexeme masking for all versions.

In [8]:
lexemes = {}

caption(4, 'Masking lexemes')
for v in versions:
    activate(v)
    lexemes[v] = collections.OrderedDict()
    for n in F.otype.s('word'):
        lex = Fs(LEX).v(n)
        lexemes[v][n] = (lex, mask(lex, trans=0), mask(lex))
caption(0, 'Done')

..............................................................................................
.         33s Masking lexemes                                                                .
..............................................................................................
..............................................................................................
.         33s Active version is now -> 3 <-                                                  .
..............................................................................................
..............................................................................................
.         36s Active version is now -> 4 <-                                                  .
..............................................................................................
..............................................................................................
.         39s Active version is now -> 4b <-      

# Cases and mappings
In `cases` we store special cases that we stumbled upon.
Every time we encountered a disturbance which did not follow a recognized pattern,
we turned it into a case.
The number is the slot number in the first version where the case will be applied.
Cases will only be applied at these exact slot numbers and nowhere else.

In [9]:
cases = {}
mappings = {}

# Algorithm

Here is the code that directly implements the method.
Every pair of distinct versions can be mapped.
We store the mappings in a dictionary, keyed by tuples of version names like `('4', '4b')`,
for the mapping from version `4` to `4b`, for instance.
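The notebook only builds mappings between successive versions. As an assumption about how they might be combined, two successive mappings can be chained into a mapping between non-adjacent versions. The helper `compose` below is hypothetical and works on plain dictionaries.

```python
def compose(first, second):
    """Chain two slot mappings; slots whose intermediate target is missing are left out."""
    return {n: second[m] for (n, m) in first.items() if m in second}

# Toy mappings, keyed like the notebook's `mappings` dictionary.
toy_mappings = {
    ('3', '4'): {1: 1, 2: 2, 3: 4},
    ('4', '4b'): {1: 1, 2: 3, 3: 4, 4: 5},
}
toy_mappings[('3', '4b')] = compose(toy_mappings[('3', '4')], toy_mappings[('4', '4b')])
print(toy_mappings[('3', '4b')])  # {1: 1, 2: 3, 3: 5}
```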

The loop is in `doDiffs` below.

In [10]:
def inspect(v1, v2, start, end):
    mapKey = (v1, v2)
    mp = mappings[mapKey]
    for n in range(start, end):
        print('{:>6}: {:<8} {:<8}'.format(
            n, 
            api[v1].Fs(LEX).v(n),
            api[v2].Fs(LEX).v(mp[n]),

        ))

def firstDiff(v1, v2, start):
    mapKey = (v1, v2)
    mp = mappings[mapKey]

    fDiff = None
    for (n, (lx1, sx1, mx1)) in lexemes[v1].items():
        if n < start: continue
        if mx1 != lexemes[v2][mp[n]][2]:
            fDiff = n
            break
    return fDiff

def printDiff(v1, v2, n):
    mapKey = (v1, v2)
    mp = mappings[mapKey]

    (lx1, sx1, mx1) = lexemes[v1][n]
    (lx2, sx2, mx2) = lexemes[v2][mp[n]]
    if n < api[v1].F.otype.maxSlot:
        (lx1n, sx1n, mx1n) = lexemes[v1][n+1]
    else:
        (lx1n, sx1n, mx1n) = ('max', 'max', 'max')
    if mp[n] < api[v2].F.otype.maxSlot:
        (lx2n, sx2n, mx2n) = lexemes[v2][mp[n+1]]
    else:
        (lx2n, sx2n, mx2n) = ('max', 'max', 'max')
    if n > 1:
        (lx1p, sx1p, mx1p) = lexemes[v1][n-1]
    else:
        (lx1p, sx1p, mx1p) = ('min', 'min', 'min')
    if mp[n] > 1:
        (lx2p, sx2p, mx2p) = lexemes[v2][mp[n-1]]
    else:
        (lx2p, sx2p, mx2p) = ('min', 'min', 'min')

    #print('''{} {}:{} ==> slot {} ==> {}
    #{:<2}: {:<6} ~ |{:<6}| ~ {:<6}   {:<6} ~ |{:<6}| ~ {:<6}   {:<6} ~ |{:<6}| {:<6}
    #{:<2}: {:<6} ~ |{:<6}| ~ {:<6}   {:<6} ~ |{:<6}| ~ {:<6}   {:<6} ~ |{:<6}| {:<6}'''.format(
    #    *api[v1].T.sectionFromNode(n),
    #    n, mp[n],
    #    v1, lx1p, lx1, lx1n, sx1p, sx1, sx1n, mx1p, mx1, mx1n,
    #    v2, lx2p, lx2, lx2n, sx2p, sx2, sx2n, mx2p, mx2, mx2n,
    #)) 
    print('''{} {}:{} ==> slot {} ==> {}
    {:>4}: ┣{:<6}┫ ▷{:>10}◁▶{:>10}◀▷{:<8}◁
    {:>4}: ┣{:<6}┫ ▷{:>10}◁▶{:>10}◀▷{:<8}◁'''.format(
        *api[v1].T.sectionFromNode(n),
        n, mp[n],
        v1, mx1, lx1p, lx1, lx1n, 
        v2, mx2, lx2p, lx2, lx2n,
    )) 

# doDiffs

This function contains the loop to walk through all differences.

In [11]:
MAX_ITER = 250

def doDiffs(v1, v2):
    mapKey = (v1, v2)
    mappings[mapKey] = dict(((n, n) for n in api[v1].F.otype.s('word')))
    mp = mappings[mapKey]
    theseCases = cases.get(mapKey, {})
    it = 0
    start = 1
    solved = True  # stays truthy when no difference is found at all
    while True:
        n = firstDiff(v1, v2, start)

        if n is None:
            print('No more differences.\nFound {} points of disturbance'.format(it))
            break

        if it > MAX_ITER: 
            print('There might be more disturbances: increase MAX_ITER')
            break
            
        it += 1

        printDiff(v1, v2, n)

        (lx1, sx1, mx1) = lexemes[v1][n]
        (lx2, sx2, mx2) = lexemes[v2][mp[n]]
        (lx1n, sx1n, mx1n) = lexemes[v1][n+1]
        (lx2n, sx2n, mx2n) = lexemes[v2][mp[n+1]]

        solved = None
        skip = 0
        if n in theseCases:
            (action, param) = theseCases[n]
            if action == 'collapse':
                solved = '{} {} slots'.format(action, param)
                skip = param
                for m in range(api[v1].F.otype.maxSlot, n + param -1, -1):
                    mp[m] = mp[m-param+1]
                for m in range(n+1, n+param):
                    mp[m] = mp[n]
            elif action == 'split':
                solved = '{} into {} slots'.format(action, param)
                for m in range(n+1, api[v1].F.otype.maxSlot+1):
                    mp[m] = mp[m] + param -1
            elif action == 'ok':
                solved = 'incidental variation in lexeme'
#        elif lx1.replace('C', 'X') == lx2:
#            solved = 'letter C replaced by X'
        elif lx1 in theseCases:
            (action, param) = theseCases[lx1]
            if action == 'ok':            
                if lx2 == param:
                    solved = 'systematic variation in lexeme' 
            elif action == 'split':
                solved = 'systematic {} on _ into {} slots'.format(action, param)
                for m in range(n+1, api[v1].F.otype.maxSlot+1):
                    mp[m] = mp[m] + param -1
        elif '_' in lx1:
            action = 'split'
            param = lx1.count('_') + 1
            solved = '{} on _ into {} slots'.format(action, param)
            for m in range(n+1, api[v1].F.otype.maxSlot+1):
                mp[m] = mp[m] + param -1
        elif lx1 == lx2 + lx2n:
            if lx2 == 'H':
                solved = 'split article off'
                for m in range(n+1, api[v1].F.otype.maxSlot+1):
                    mp[m] = mp[m] + 1
        elif set(mx1) == set(mx2) | set(mx2n):
            if lx2 == 'B' or lx2 == 'MN':
                solved = 'split preposition off'
                for m in range(n+1, api[v1].F.otype.maxSlot+1):
                    mp[m] = mp[m] + 1
        print('Action: {}\n'.format(solved if solved else 'BLOCKED'))

        if not solved: break
        
        start = n + 1 + skip

    if not solved:
        print('Blocking difference in {} iterations'.format(it))

The mappings themselves are needed elsewhere in Text-Fabric, so let us write them to file.
We write them into the dataset corresponding to the target version.
So the map `3-4` ends up in version `4`.

In [25]:
def writeMaps():
    for ((v1, v2), mp) in mappings.items():
        fName = 'omap@{}-{}'.format(v1, v2)
        caption(4, 'Write slot mapping {}'.format(fName))

        edgeFeatures = {
            fName: dict(((n, (mp[n],)) for n in range(1, api[v1].F.otype.maxSlot + 1)))
        }
        metaData = {
            fName: {
                'about': 'Mapping from the slots of BHSA version {} to version {}'.format(v1, v2),
                'encoder': 'Dirk Roorda by a semi-automatic method',
                'see': 'https://github.com/ETCBC/bhsa/blob/master/programs/evolutionVersions.ipynb',
                'valueType': 'str',
            }
        }
        TF[v2].save(
            nodeFeatures={},
            edgeFeatures=edgeFeatures,
            metaData=metaData,
        )
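The dictionary that `writeMaps` hands over as an edge feature can be inspected without Text-Fabric. The sketch below mirrors its shape on a toy mapping; `slot_edges` is a hypothetical helper and nothing is written to disk.

```python
def slot_edges(v1, v2, mp):
    """Return the feature name and the slot -> (target,) dictionary for one mapping."""
    name = 'omap@{}-{}'.format(v1, v2)
    edges = {n: (mp[n],) for n in sorted(mp)}  # one outgoing edge per old slot
    return name, edges

name, edges = slot_edges('3', '4', {1: 1, 2: 2, 3: 4})
print(name)   # omap@3-4
print(edges)  # {1: (1,), 2: (2,), 3: (4,)}
```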

# Running

Here we run the mapping between `3` and `4`.

## 3 => 4

Here are the special cases for this conversion.

In [13]:
cases[('3', '4')] = {}

In [14]:
cases.update({
    ('3', '4'): {
        'CXH[' : ('ok', 'XWH['),
        'MQYT/': ('split', 2),
        28730  : ('ok', None),
        121812 : ('ok', None),
        174515 : ('ok', None),
        201089 : ('ok', None),
        218383 : ('split', 3),
        221436 : ('ok', None),
        247730 : ('ok', None),
        272884 : ('collapse', 2),
        353611 : ('ok', None),
#        370037 : ('split', 2),
#        370138 : ('split', 2),
#        370329 : ('split', 2),
    },
})    

In [15]:
doDiffs('3', '4')

Genesis 18:2 ==> slot 7840 ==> 7840
       3: ┣CX    ┫ ▷         W◁▶      CXH[◀▷>RY/    ◁
       4: ┣X     ┫ ▷         W◁▶      XWH[◀▷>RY/    ◁
Action: systematic variation in lexeme

Genesis 19:1 ==> slot 8447 ==> 8447
       3: ┣CX    ┫ ▷         W◁▶      CXH[◀▷>P/     ◁
       4: ┣X     ┫ ▷         W◁▶      XWH[◀▷>P/     ◁
Action: systematic variation in lexeme

Genesis 21:14 ==> slot 9856 ==> 9856
       3: ┣<BCR__┫ ▷     MDBR/◁▶  B>R_CB</◀▷W       ◁
       4: ┣BR    ┫ ▷     MDBR/◁▶      B>R/◀▷CB<==/  ◁
Action: split on _ into 2 slots

Genesis 21:31 ==> slot 10174 ==> 10175
       3: ┣<BCR__┫ ▷       HW>◁▶  B>R_CB</◀▷KJ      ◁
       4: ┣BR    ┫ ▷       HW>◁▶      B>R/◀▷CB<==/  ◁
Action: split on _ into 2 slots

Genesis 21:32 ==> slot 10183 ==> 10185
       3: ┣<BCR__┫ ▷         B◁▶  B>R_CB</◀▷W       ◁
       4: ┣BR    ┫ ▷         B◁▶      B>R/◀▷CB<==/  ◁
Action: split on _ into 2 slots

Genesis 21:33 ==> slot 10200 ==> 10203
       3: ┣<BCR__┫ ▷         B◁▶  B>R_CB</◀▷W       ◁
 

Action: systematic variation in lexeme

Leviticus 26:1 ==> slot 68138 ==> 68159
       3: ┣CX    ┫ ▷         L◁▶      CXH[◀▷<L      ◁
       4: ┣X     ┫ ▷         L◁▶      XWH[◀▷<L      ◁
Action: systematic variation in lexeme

Numbers 22:31 ==> slot 84445 ==> 84466
       3: ┣CX    ┫ ▷         W◁▶      CXH[◀▷L       ◁
       4: ┣X     ┫ ▷         W◁▶      XWH[◀▷L       ◁
Action: systematic variation in lexeme

Numbers 25:2 ==> slot 85620 ==> 85641
       3: ┣CX    ┫ ▷         W◁▶      CXH[◀▷L       ◁
       4: ┣X     ┫ ▷         W◁▶      XWH[◀▷L       ◁
Action: systematic variation in lexeme

Deuteronomy 4:19 ==> slot 95563 ==> 95584
       3: ┣CX    ┫ ▷         W◁▶      CXH[◀▷L       ◁
       4: ┣X     ┫ ▷         W◁▶      XWH[◀▷L       ◁
Action: systematic variation in lexeme

Deuteronomy 5:9 ==> slot 96432 ==> 96453
       3: ┣CX    ┫ ▷        L>◁▶      CXH[◀▷L       ◁
       4: ┣X     ┫ ▷        L>◁▶      XWH[◀▷L       ◁
Action: systematic variation in lexeme

Deuteronomy 8:19 ==>

Action: split on _ into 2 slots

2_Samuel 18:21 ==> slot 171572 ==> 171603
       3: ┣CX    ┫ ▷         W◁▶      CXH[◀▷KCJ/    ◁
       4: ┣X     ┫ ▷         W◁▶      XWH[◀▷KCJ/    ◁
Action: systematic variation in lexeme

2_Samuel 18:28 ==> slot 171749 ==> 171780
       3: ┣CX    ┫ ▷         W◁▶      CXH[◀▷L       ◁
       4: ┣X     ┫ ▷         W◁▶      XWH[◀▷L       ◁
Action: systematic variation in lexeme

2_Samuel 22:27 ==> slot 174515 ==> 174546
       3: ┣BR    ┫ ▷      BRR[◁▶      BRR[◀▷W       ◁
       4: ┣BRT   ┫ ▷      BRR[◁▶      TBR[◀▷W       ◁
Action: incidental variation in lexeme

2_Samuel 24:2 ==> slot 175416 ==> 175447
       3: ┣<BCR__┫ ▷        <D◁▶  B>R_CB</◀▷W       ◁
       4: ┣BR    ┫ ▷        <D◁▶      B>R/◀▷CB<==/  ◁
Action: split on _ into 2 slots

2_Samuel 24:6 ==> slot 175525 ==> 175557
       3: ┣<DN__ ┫ ▷      BW>[◁▶   DN_J<N/◀▷W       ◁
       4: ┣DN    ┫ ▷      BW>[◁▶       DN/◀▷J<N/    ◁
Action: split on _ into 2 slots

2_Samuel 24:7 ==> slot 175547 ==>

Jeremiah 7:2 ==> slot 238176 ==> 238217
       3: ┣CX    ┫ ▷         L◁▶      CXH[◀▷L       ◁
       4: ┣X     ┫ ▷         L◁▶      XWH[◀▷L       ◁
Action: systematic variation in lexeme

Jeremiah 8:2 ==> slot 238952 ==> 238993
       3: ┣CX    ┫ ▷       >CR◁▶      CXH[◀▷L       ◁
       4: ┣X     ┫ ▷       >CR◁▶      XWH[◀▷L       ◁
Action: systematic variation in lexeme

Jeremiah 13:10 ==> slot 241282 ==> 241323
       3: ┣CX    ┫ ▷         L◁▶      CXH[◀▷L       ◁
       4: ┣X     ┫ ▷         L◁▶      XWH[◀▷L       ◁
Action: systematic variation in lexeme

Jeremiah 16:11 ==> slot 242826 ==> 242867
       3: ┣CX    ┫ ▷         W◁▶      CXH[◀▷L       ◁
       4: ┣X     ┫ ▷         W◁▶      XWH[◀▷L       ◁
Action: systematic variation in lexeme

Jeremiah 22:9 ==> slot 245440 ==> 245481
       3: ┣CX    ┫ ▷         W◁▶      CXH[◀▷L       ◁
       4: ┣X     ┫ ▷         W◁▶      XWH[◀▷L       ◁
Action: systematic variation in lexeme

Jeremiah 25:6 ==> slot 247079 ==> 247120
       3: ┣CX 

       4: ┣MN    ┫ ▷         L◁▶        MN◀▷QYT/    ◁
Action: systematic split on _ into 2 slots

Nehemiah 7:69 ==> slot 386987 ==> 387034
       3: ┣MQY   ┫ ▷         W◁▶     MQYT/◀▷R>C/    ◁
       4: ┣MN    ┫ ▷         W◁▶        MN◀▷QYT/    ◁
Action: systematic split on _ into 2 slots

Nehemiah 8:6 ==> slot 387279 ==> 387327
       3: ┣CX    ┫ ▷         W◁▶      CXH[◀▷L       ◁
       4: ┣X     ┫ ▷         W◁▶      XWH[◀▷L       ◁
Action: systematic variation in lexeme

Nehemiah 9:3 ==> slot 387715 ==> 387763
       3: ┣CX    ┫ ▷         W◁▶      CXH[◀▷L       ◁
       4: ┣X     ┫ ▷         W◁▶      XWH[◀▷L       ◁
Action: systematic variation in lexeme

Nehemiah 9:6 ==> slot 387815 ==> 387863
       3: ┣CX    ┫ ▷         L◁▶      CXH[◀▷>TH     ◁
       4: ┣X     ┫ ▷         L◁▶      XWH[◀▷>TH     ◁
Action: systematic variation in lexeme

Nehemiah 11:27 ==> slot 389626 ==> 389674
       3: ┣<BCR__┫ ▷         B◁▶  B>R_CB</◀▷W       ◁
       4: ┣BR    ┫ ▷         B◁▶      B>R/◀▷CB<==

# Running

Here we run the mapping between `4` and `4b`.
The points of disturbance will be written into the output cell.

## 4 => 4b

Here are the special cases for this conversion.

In [16]:
cases.update({
    ('4', '4b'): {
        214730: ('collapse', 4),
        260028: ('split', 2),
        289948: ('ok', None),
        307578: ('split', 2),
        323067: ('ok', None),
        389774: ('ok', None),
        407543: ('split', 2),
        408429: ('split', 2),
    },
})

In [17]:
doDiffs('4', '4b')

Genesis 24:65 ==> slot 12369 ==> 12369
       4: ┣HLN   ┫ ▷      >JC/◁▶      HLZH◀▷H       ◁
      4b: ┣      ┫ ▷      >JC/◁▶         H◀▷LZH     ◁
Action: split article off

Genesis 37:19 ==> slot 20514 ==> 20515
       4: ┣HLN   ┫ ▷     XLWM/◁▶      HLZH◀▷BW>[    ◁
      4b: ┣      ┫ ▷     XLWM/◁▶         H◀▷LZH     ◁
Action: split article off

Judges 6:20 ==> slot 130846 ==> 130848
       4: ┣HLN   ┫ ▷      SL</◁▶       HLZ◀▷W       ◁
      4b: ┣      ┫ ▷      SL</◁▶         H◀▷LZ      ◁
Action: split article off

1_Samuel 14:1 ==> slot 148319 ==> 148322
       4: ┣HLN   ┫ ▷      <BR/◁▶       HLZ◀▷W       ◁
      4b: ┣      ┫ ▷      <BR/◁▶         H◀▷LZ      ◁
Action: split article off

1_Samuel 17:26 ==> slot 151331 ==> 151335
       4: ┣HLN   ┫ ▷    PLCTJ/◁▶       HLZ◀▷W       ◁
      4b: ┣      ┫ ▷    PLCTJ/◁▶         H◀▷LZ      ◁
Action: split article off

1_Samuel 20:19 ==> slot 153816 ==> 153821
       4: ┣HLN   ┫ ▷      >BN/◁▶     H>ZL/◀▷W       ◁
      4b: ┣      ┫ ▷      >BN

Just have a look at the first point of disturbance:

In [18]:
(v1, v2) = ('4', '4b')
(n, m) = [x for x in mappings[(v1, v2)].items() if x[0] != x[1]][0]
print('{} {}:{} node {}: {} versus {} becomes {}'.format(
    *api[v1].T.sectionFromNode(n),
    n,
    api[v1].F.lex.v(n),
    api[v2].F.lex.v(n),
    api[v2].F.lex.v(m),
))

Genesis 24:65 node 12370: H versus LZH becomes H
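Beyond the first point of disturbance, it can help to list every slot where the offset between source and target changes. The following is a minimal sketch on a toy mapping; `offset_changes` is a hypothetical helper, not part of the notebook or of Text-Fabric.

```python
def offset_changes(mp):
    """Return (slot, offset) pairs for each slot where target - source changes."""
    changes = []
    previous = 0
    for n in sorted(mp):
        offset = mp[n] - n
        if offset != previous:
            changes.append((n, offset))
            previous = offset
    return changes

# Toy mapping: a split at slot 3 shifts everything after it by one.
toy = {1: 1, 2: 2, 3: 3, 4: 5, 5: 6, 6: 7}
print(offset_changes(toy))  # [(4, 1)]
```

Applied to a real mapping such as `mappings[('4', '4b')]`, this would list one entry per split or collapse.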


## 4b => 2016

We need other cases.

In [19]:
cases.update({
    ('4b', '2016'): {
         28423: ('split', 3),
         28455: ('split', 3),
         91193: ('split', 2),
         91197: ('split', 2),
        122218: ('split', 2),
        122247: ('split', 2),
        123160: ('split', 2),
        184086: ('split', 2),
        394186: ('collapse', 2),
        395150: ('ok', None),
        395190: ('ok', None),
        401036: ('split', 3),
        404503: ('ok', None),
        419138: ('split', 3),
    },    
})

In [20]:
doDiffs('4b', '2016')

Genesis 50:10 ==> slot 28423 ==> 28423
      4b: ┣DGNRV__┫ ▷        <D◁▶  GRN_>VD/◀▷>CR     ◁
    2016: ┣GNR   ┫ ▷        <D◁▶      GRN/◀▷H       ◁
Action: split into 3 slots

Genesis 50:11 ==> slot 28455 ==> 28457
      4b: ┣DGNRV__┫ ▷         B◁▶  GRN_>VD/◀▷W       ◁
    2016: ┣GNR   ┫ ▷         B◁▶      GRN/◀▷H       ◁
Action: split into 3 slots

Numbers 33:45 ==> slot 91193 ==> 91197
      4b: ┣BDGN__┫ ▷         B◁▶ DJBWN_GD/◀▷W       ◁
    2016: ┣BDN   ┫ ▷         B◁▶     DJBN/◀▷GD==/   ◁
Action: split into 2 slots

Numbers 33:46 ==> slot 91197 ==> 91202
      4b: ┣BDGN__┫ ▷        MN◁▶ DJBWN_GD/◀▷W       ◁
    2016: ┣BDN   ┫ ▷        MN◁▶     DJBN/◀▷GD==/   ◁
Action: split into 2 slots

Joshua 16:3 ==> slot 122218 ==> 122224
      4b: ┣BNRTX___┫ ▷     GBWL/◁▶BJT_XRWN_TXTWN/◀▷W       ◁
    2016: ┣BNRTX__┫ ▷     GBWL/◁▶BJT_XWRWN/◀▷TXTWN/  ◁
Action: split into 2 slots

Joshua 16:5 ==> slot 122247 ==> 122254
      4b: ┣<BLNRTX___┫ ▷        <D◁▶BJT_XRWN_<LJWN/◀▷W       ◁
    2016: ┣BN

The bit below is handy if you need a closer look at what is going on in some range of slots.

In [21]:
inspect('4b', '2016', 419135, 419145)

419135: <M/      <M/     
419136: W        W       
419137: HLK[     HLK[    
419138: GJ>_MLX/ GJ>/    
419139: W        W       
419140: NKH[     NKH[    
419141: >T       >T      
419142: BN/      BN/     
419143: F<JR====/ F<JR====/
419144: <FRH=/   <FRH=/  


Just have a look at the first point of disturbance:

In [22]:
(v1, v2) = ('4b', '2016')
(n, m) = [x for x in mappings[(v1, v2)].items() if x[0] != x[1]][0]
print('{} {}:{} node {}: {} versus {} becomes {}'.format(
    *api[v1].T.sectionFromNode(n),
    n,
    api[v1].F.lex.v(n),
    api[v2].F.lex.v(n),
    api[v2].F.lex.v(m),
))

Genesis 50:10 node 28424: >CR versus H becomes >CR


In [26]:
caption(4, 'Constructed mappings:')
for (v1, v2) in sorted(mappings.keys()):
    caption(0, '\t {:>4} ==> {:<4}'.format(v1, v2))

writeMaps()

..............................................................................................
.      7m 19s Constructed mappings:                                                          .
..............................................................................................
|      7m 19s 	    3 ==> 4   
|      7m 19s 	    4 ==> 4b  
|      7m 19s 	   4b ==> 2016
..............................................................................................
.      7m 19s Write slot mapping omap@3-4                                                    .
..............................................................................................
  0.00s Exporting 0 node and 1 edge and 0 config features to /Users/dirk/github/etcbc/bhsa/tf/4:
   |     1.26s T omap@3-4             to /Users/dirk/github/etcbc/bhsa/tf/4
  1.26s Exported 0 node features and 1 edge features and 0 config features to /Users/dirk/github/etcbc/bhsa/tf/4
......................................................