<img align="right" src="tf-small.png"/>

# ETCBC versions

In this notebook we try to map the nodes between versions 4, 4b, and 4c of the ETCBC dataset.

If we succeed, text-fabric notebooks that are based on an older version of the data can be used unmodified on newer versions of the data.

In general, node mappings between versions cannot be perfect: data may have been added, removed, or changed from one version to the next. We try and see how far we get.

Let us start with *slot* mappings.
We map the slots of a version to the slots of the next version.
Mappings go from old to new, and they are between successive versions.
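
Because the mappings are between successive versions, a mapping from version 4 to version 4c would have to be obtained by chaining 4 → 4b with 4b → 4c. The following is a minimal sketch of that idea, assuming each mapping is a plain dict from old slot to new slot. The function name `compose` and the toy numbers are illustrative assumptions, not part of the ETCBC data or the Text-Fabric API.

```python
def compose(first, second):
    """Chain two old->new slot mappings.

    Slots that have no image in either step are dropped,
    so the result may cover fewer slots than `first`.
    """
    return {
        old: second[mid]
        for (old, mid) in first.items()
        if mid in second
    }


# Toy mappings (hypothetical, not real ETCBC slots)
map_4_4b = {1: 1, 2: 2, 3: 4}
map_4b_4c = {1: 1, 2: 3, 4: 5}

map_4_4c = compose(map_4_4b, map_4b_4c)
print(map_4_4c)
```

Storing only the successive mappings keeps each of them small and independently checkable; longer jumps are derived on demand.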

We have data in text-fabric format for the ETCBC Hebrew Bible Database, versions 4, 4b, and 4c.

Stephen Ku has prepared a Strong number mapping for version 4, based on 
[]().

This provides us with a nice use case: can we apply the Strong number mapping for version 4 to versions 4b and 4c
as well?
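
To make the use case concrete, here is a hedged sketch of how values attached to slots in one version could be carried over to the next version once a slot mapping exists. The name `transfer`, the placeholder values `'S1'`, `'S2'`, `'S3'`, and the toy slot map are assumptions for illustration only; they are not real Strong numbers or real ETCBC slots.

```python
def transfer(values, slot_map):
    """Carry per-slot values to the next version via an old->new slot mapping.

    Slots without a counterpart in the new version are skipped.
    """
    return {
        slot_map[old]: value
        for (old, value) in values.items()
        if old in slot_map
    }


# Placeholder values keyed by slot in the old version (hypothetical)
strong_old = {1: 'S1', 2: 'S2', 3: 'S3'}
# Toy old->new slot mapping: slot 3 has no counterpart in the new version
slot_map = {1: 1, 2: 3}

strong_new = transfer(strong_old, slot_map)
print(strong_new)
```

Slots that fall outside the mapping are exactly the cases that would need manual inspection.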

In [1]:
import os
import collections
from tf.fabric import Fabric

In [3]:
locations = ['~/github/text-fabric-data', '~/github/text-fabric-data-legacy']
versions = ['4', '4b', '4c']
TF = {}
api = {}
for v in versions:
    TF[v] = Fabric(locations=locations, modules='hebrew/etcbc{}'.format(v))
    api[v] = TF[v].load('''
        g_word lex
    ''')
A4 = api['4']
A4b = api['4b']
A4c = api['4c']

This is Text-Fabric 2.2.1
Api reference : https://github.com/ETCBC/text-fabric/wiki/Api
Tutorial      : https://github.com/ETCBC/text-fabric/blob/master/docs/tutorial.ipynb
Data sources  : https://github.com/ETCBC/text-fabric-data
Data docs     : https://etcbc.github.io/text-fabric-data
Shebanq docs  : https://shebanq.ancient-data.org/text
Slack team    : https://shebanq.slack.com/signup
Questions? Ask shebanq@ancient-data.org for an invite to Slack
110 features found and 0 ignored
  0.00s loading features ...
   |     0.25s B g_word               from /Users/dirk/github/text-fabric-data-legacy/hebrew/etcbc4
   |     0.16s B lex                  from /Users/dirk/github/text-fabric-data-legacy/hebrew/etcbc4
   |     0.00s Feature overview: 105 nodes; 4 edges; 1 configs; 7 computeds
  5.77s All features loaded/computed - for details use loadLog()

In [7]:
nSlots = {}
for v in versions:
    nSlots[v] = api[v].F.otype.maxSlot
nSlots

{'4': 426555, '4b': 426568, '4c': 426581}
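
The slot counts differ between versions, so the slot mapping cannot be the identity everywhere. A small sketch that computes the net growth between successive versions, with the counts copied from the output above (the variable names `n_slots` and `growth` are chosen here for illustration):

```python
# Slot counts copied from the output above
n_slots = {'4': 426555, '4b': 426568, '4c': 426581}
versions = ['4', '4b', '4c']

# Net difference in slot count for each pair of successive versions
growth = {
    (old, new): n_slots[new] - n_slots[old]
    for (old, new) in zip(versions, versions[1:])
}
for ((old, new), diff) in growth.items():
    print('{:<2} -> {:<2}: {:+d} slots'.format(old, new, diff))
```

These are net differences only: they do not say how many slots were inserted and how many were deleted in each step.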

In [8]:
lexemes = {}
for v in versions:
    lexemes[v] = [api[v].F.lex.v(n) for n in api[v].F.otype.s('word')]

In [10]:
for v in versions:
    print('{:<2}: {}'.format(
        v,
        ' '.join(lexemes[v][0:15])
    ))

4 : B R>CJT/ BR>[ >LHJM/ >T H CMJM/ W >T H >RY/ W H >RY/ HJH[
4b: B R>CJT/ BR>[ >LHJM/ >T H CMJM/ W >T H >RY/ W H >RY/ HJH[
4c: B R>CJT/ BR>[ >LHJM/ >T H CMJM/ W >T H >RY/ W H >RY/ HJH[
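
The first 15 lexemes agree across all three versions. One way to go further is to look for the first position where two lexeme lists stop agreeing. Below is a minimal sketch, assuming plain Python lists of lexemes; the helper `first_difference` and the toy lists (including the made-up lexeme `'X'`) are illustrative assumptions, not real data.

```python
def first_difference(old, new):
    """Return the first 0-based index where the two lists differ.

    If one list is a proper prefix of the other, return the length
    of the shorter list. Return None if the lists are identical.
    """
    for (i, (a, b)) in enumerate(zip(old, new)):
        if a != b:
            return i
    if len(old) != len(new):
        return min(len(old), len(new))
    return None


# Toy lexeme lists: 'X' is an invented lexeme inserted in the newer list
old_lexemes = ['B', 'R>CJT/', 'BR>[', '>LHJM/']
new_lexemes = ['B', 'R>CJT/', 'X', 'BR>[', '>LHJM/']

print(first_difference(old_lexemes, new_lexemes))
```

The returned index is a list position; since slots start at 1, the corresponding slot number would be the index plus one.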
