<img align="right" src="tf-small.png"/>

# Using mappings between versions

The notebook 
[etcbc_versions](https://github.com/ETCBC/text-fabric/blob/master/Versions/etcbc-versions.ipynb)
has created slot mappings between the versions 4, 4b and 4c of the ETCBC dataset.

The mappings are available as features `omap@`*sourcev-targetv*`.tf`, where
*sourcev* and *targetv* are the version numbers of the source and the target, respectively.

These are files in 
[.tf](https://github.com/ETCBC/text-fabric/wiki/File-formats)
format. 

That means that you can use them as *features* in your data set, like

```
nodesIn4b = Es('omap@4b-4c').t(nodeIn4c)
```

So if you are using Text-Fabric, you do not have to do anything: the mappings are already available.
But if you want to use the mappings outside Text-Fabric, you might want them in `.csv` format.

We give a minimal example of how you can export a mapping feature as a csv file.

Remember that mapping features are edge features: they map a node to a set of other nodes.
The mapping cannot always be perfect, so it might happen that one source node is mapped to multiple target nodes, and vice versa.
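
To make the many-to-many nature of such a mapping concrete, here is a small sketch in plain Python, without Text-Fabric. The node numbers and the names `toyMap`, `invert` and `ambiguous` are invented for illustration; a real mapping would come from an `omap@` feature.

```python
import collections

# Hypothetical toy mapping: source node -> set of target nodes.
# The node numbers are made up for illustration only.
toyMap = {
    1: {1},
    2: {2, 3},  # one source node split over two target nodes
    3: {4},
    4: {4},     # two source nodes merged into one target node
}


def invert(mapping):
    """Turn a source -> targets mapping into a target -> sources mapping."""
    inverse = collections.defaultdict(set)
    for (source, targets) in mapping.items():
        for target in targets:
            inverse[target].add(source)
    return dict(inverse)


def ambiguous(mapping):
    """Return the keys that map to more than one node, in sorted order."""
    return sorted(node for (node, others) in mapping.items() if len(others) > 1)


inverseMap = invert(toyMap)
print(ambiguous(toyMap))      # source nodes with several targets
print(ambiguous(inverseMap))  # target nodes with several sources
```

A check like this can help decide how to treat nodes that do not correspond one-to-one before relying on the mapping downstream.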

The file we produce has tab-delimited fields.
The first field is a node in the source version, and the second field is the set of corresponding nodes in the target version,
given as comma-separated values.
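
For readers who want to consume such a file outside Text-Fabric, the following is a minimal sketch of a reader for the format just described. The function name `readMapping` and the sample content are assumptions made for illustration; they are not part of Text-Fabric.

```python
import io


def readMapping(lines):
    """Parse tab-delimited lines into a dict: source node -> set of target nodes."""
    mapping = {}
    for line in lines:
        line = line.rstrip('\n')
        if not line:
            continue  # skip blank lines
        (source, targets) = line.split('\t')
        mapping[int(source)] = {int(t) for t in targets.split(',')}
    return mapping


# Made-up sample content in the format described above.
sample = io.StringIO('1\t1\n2\t2,3\n3\t4\n')
mapping = readMapping(sample)
print(mapping[2])
```

With a real export you would pass an open file handle instead of the `io.StringIO` sample, for example `with open(path) as fh: mapping = readMapping(fh)`, where `path` points to the csv file you wrote.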

#### Note
> If you are going to use the mapping from 4 to 4b, you need the etcbc4b data, which is also on
GitHub, in
[text-fabric-data-legacy](https://github.com/ETCBC/text-fabric-data-legacy).

In [1]:
import os, collections
from tf.fabric import Fabric

# 4b => 4c
Run the next cell if you want to map from 4b to 4c.
Skip it if you want to map from 4 to 4b.

In [9]:
(vSource, vTarget) = ('4b', '4c')
location = '~/github/text-fabric-data'

# 4 => 4b
Run the next cell if you want to map from 4 to 4b.
Skip it if you want to map from 4b to 4c.

In [6]:
(vSource, vTarget) = ('4', '4b')
location='~/github/text-fabric-data-legacy'

Continue with the common cells.

In [10]:
ETCBC = 'hebrew/etcbc'+vTarget
TF = Fabric(locations=location, modules=ETCBC)
mapFeature = 'omap@{}-{}'.format(vSource, vTarget)
api = TF.load(mapFeature)
api.makeAvailableIn(globals())

This is Text-Fabric 2.3.0
Api reference : https://github.com/ETCBC/text-fabric/wiki/Api
Tutorial      : https://github.com/ETCBC/text-fabric/blob/master/docs/tutorial.ipynb
Data sources  : https://github.com/ETCBC/text-fabric-data
Data docs     : https://etcbc.github.io/text-fabric-data
Shebanq docs  : https://shebanq.ancient-data.org/text
Slack team    : https://shebanq.slack.com/signup
Questions? Ask shebanq@ancient-data.org for an invite to Slack
108 features found and 0 ignored
  0.00s loading features ...
   |     0.67s B omap@4b-4c           from /Users/dirk/github/text-fabric-data/hebrew/etcbc4c
   |     0.00s Feature overview: 102 nodes; 5 edges; 1 configs; 7 computeds
  6.46s All features loaded/computed - for details use loadLog()


We are going to walk through the nodes of the target version, 
and get the corresponding sets of nodes in the source version.
Note that the mapping is an edge from source nodes to target nodes,
so we use the mapping edge in the opposite direction, using `.t()` rather than `.f()`
(see the
[edge API](https://github.com/ETCBC/text-fabric/wiki/Api#edge-features)).

We store the mapping in a dict, then output it as a csv file.

In [11]:
outputDir = os.path.expanduser('~/Downloads')
mapCsv = '{}/{}.csv'.format(outputDir, mapFeature)

mapping = collections.defaultdict(set)

indent(reset=True)
info('fetching map info...')
for nTarget in N():
    nSources = Es(mapFeature).t(nTarget)
    if nSources is not None:
        for nSource in nSources:
            mapping[nSource].add(nTarget)
info('{} {}-nodes in map'.format(len(mapping), vSource))

info('writing mapping for {} as csv ...'.format(mapFeature))
with open(mapCsv, 'w') as fh:
    for (nSource, nTargets) in sorted(mapping.items()):
        # sort the target nodes so that the output is deterministic
        fh.write('{}\t{}\n'.format(nSource, ','.join(str(nTarget) for nTarget in sorted(nTargets))))
info('Done')

  0.00s fetching map info...
  4.61s 426568 4b-nodes in map
  4.61s writing mapping for omap@4b-4c as csv ...
  5.94s Done
