<img align="right" src="tf-small.png"/>

# SHEBANQ from Text-Fabric

This notebook assembles the data from the text-fabric repositories
of the ETCBC that is needed
to compile the website data for
[SHEBANQ](https://shebanq.ancient-data.org).

## Pipeline

A run of the pipeline produces a SHEBANQ data *version*.
It should be run whenever new or updated data sources affect the output data.
Since all input data is delivered in GitHub repos, we can rely on Git
to handle versioning.

The pipeline works by pointing text-fabric to the data contained in the GitHub repositories.

All this is specified in the configuration below.

### Core data

The core data is what resides in
the GitHub repo [bhsa](https://github.com/ETCBC/bhsa) in directory `tf`.

This data will be converted by `tfFromMQL` in the `programs` directory.

The result of this action will be an updated TF resource in its 
`tf/core` directory.

### Additional data

The pipeline will try to load any text-fabric features found in the `tf` subdirectories
of the designated additional repos.
It will descend one level deeper, according to the *version* that is to be gathered.
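
The directory lookup described above can be sketched as follows. This is a minimal illustration, not the code inside `pipeline`: the function name `feature_dirs`, the base directory `github/etcbc`, and the exact layout `<base>/<repo>/tf/<version>` are assumptions made for the example.

```python
from pathlib import Path


def feature_dirs(base, repo_order, version):
    """Return the directory where each repo is expected to keep its TF features.

    Assumption (hypothetical layout): <base>/<repo>/tf/<version>.
    `repo_order` is a whitespace-separated string of repo names.
    """
    repos = repo_order.strip().split()
    return [Path(base) / repo / 'tf' / version for repo in repos]


# Additional repos only; the names are taken from the configuration in this notebook.
ADDITIONAL = '''
    phono
    valence
    parallels
'''

dirs = feature_dirs('github/etcbc', ADDITIONAL, 'c')
for d in dirs:
    print(d.as_posix())
```

Keeping the repo list as an ordered sequence preserves the order in which the repos are listed, which matters if later repos are meant to add to earlier ones.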

### Resulting data

The resulting data will be delivered in the `shebanq` subdirectory of the core repo `bhsa`, 
and then under a *version* subdirectory.

The resulting data consists of three parts:

* One big MQL file, containing the core data plus **all** additions: `bhsa_xx.mql`.
  It is delivered bzip2-compressed as `bhsa_xx.mql.bz2`. This smallish file (< 30 MB) is your *portable ETCBC*.
  You can easily transfer it to your own machine (preferably by just `git pull origin master`),
  and then you can run MQL queries on the full, actual ETCBC data.
* **not yet implemented** 
  A subdirectory `mysql` with database tables, containing everything SHEBANQ needs to construct its pages.
* **not yet implemented**
  A subdirectory `annotations`, containing bulk-uploadable annotation sets, that SHEBANQ can show in notes view,
  between the clause atoms of the text.
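
Before the bzipped MQL file can be imported into an MQL engine, it has to be decompressed. The following is a minimal sketch using only the Python standard library; the helper name `decompress_bz2` is hypothetical, and the demo runs on a small placeholder file rather than on real BHSA data.

```python
import bz2
import shutil
import tempfile
from pathlib import Path


def decompress_bz2(src, dst=None):
    """Decompress a .bz2 file; by default, write next to it without the .bz2 suffix."""
    src = Path(src)
    dst = Path(dst) if dst is not None else src.with_suffix('')
    # Stream in chunks so that a large file does not have to fit in memory.
    with bz2.open(src, 'rb') as fin, open(dst, 'wb') as fout:
        shutil.copyfileobj(fin, fout)
    return dst


# Demo on a throwaway file (placeholder content, not real MQL).
payload = b'-- placeholder content, not real BHSA data\n'
with tempfile.TemporaryDirectory() as tmp:
    packed = Path(tmp) / 'demo_xx.mql.bz2'
    packed.write_bytes(bz2.compress(payload))
    unpacked = decompress_bz2(packed)
    restored_name = unpacked.name
    restored = unpacked.read_bytes()
```

On a real checkout you would call `decompress_bz2` on the `bhsa_xx.mql.bz2` file in the `shebanq` version subdirectory.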

In [1]:
import os, sys, collections
from pipeline import webPipeline
from tf.fabric import Fabric

# Config

In [2]:
if 'SCRIPT' not in locals(): 
    SCRIPT = False

In [3]:
pipeline = dict(
    repoOrder = '''
        bhsa
        phono
        valence
        parallels
    ''',
    coreModule='core',
)

In [4]:
good = webPipeline(pipeline, version='c', force=False)


##############################################################################################
#                                                                                            #
#       0.00s Aggregate MLQ for version c                                                    #
#                                                                                            #
##############################################################################################

|       0.00s 	Already up to date
|       0.00s 	bzipping /Users/dirk/github/etcbc/bhsa/_temp/c/bhsa_xx.mql
|       0.00s 	and delivering as /Users/dirk/github/etcbc/bhsa/shebanq/c/bhsa_xx.mql.bz2
|       0.00s 	NOTE: Using existing bzipped file which is newer than unzipped one
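
The note about reusing the existing bzipped file suggests a freshness check based on file modification times. How `webPipeline` decides this internally is not shown here; the sketch below only illustrates the general idea, with a hypothetical helper `is_fresh` and the assumption that equal timestamps count as fresh.

```python
import os
import tempfile
from pathlib import Path


def is_fresh(derived, source):
    """True if `derived` exists and is at least as new as `source`.

    Assumption: equal modification times count as fresh.
    """
    derived, source = Path(derived), Path(source)
    if not derived.exists():
        return False
    return derived.stat().st_mtime >= source.stat().st_mtime


# Demo with explicit timestamps on throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / 'demo_xx.mql'
    derived = Path(tmp) / 'demo_xx.mql.bz2'
    source.write_text('placeholder')
    derived.write_text('placeholder')

    os.utime(source, (1000, 1000))
    os.utime(derived, (2000, 2000))
    reuse = is_fresh(derived, source)      # derived is newer: can be reused

    os.utime(source, (3000, 3000))
    rebuild = not is_fresh(derived, source)  # source changed later: rebuild needed
```

Passing `force=True` to the pipeline presumably bypasses this kind of check, but that is an inference from the parameter name, not something the log confirms.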
