![pipeline](pictures/pictures.003.png)

# SHEBANQ from Text-Fabric

This notebook assembles data from the relevant GitHub repositories of the ETCBC
and selects the data needed for the website
[SHEBANQ](https://shebanq.ancient-data.org).


## Pipeline
This is **pipe 2** of the pipeline from ETCBC data to the website SHEBANQ.

A run of this pipe produces SHEBANQ data for a chosen *version*.
Run it whenever new or updated data sources affect the output data.
Since all input data is delivered in GitHub repositories, versioning is handled by the
repositories themselves.

Which directories the pipe should access for which version is specified in the configuration below.

### Core data
The core data resides in the `tf` directory of
the GitHub repo [bhsa](https://github.com/ETCBC/bhsa).

This data is converted by the notebook `tfFromMQL` in the repo's `programs` directory.

The result of this conversion is an updated TF resource in the repo's
`tf/core` directory, under the chosen *version*.

### Additional data

The pipe tries to load any Text-Fabric data features found in the `tf` subdirectories
of the designated additional repos.
Within each `tf` directory it descends one level further, into the subdirectory named after the chosen *version*.
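
The lookup described above can be pictured with a minimal sketch. It assumes that all repos are cloned side by side under one base directory and that each additional repo keeps its features in `<repo>/tf/<version>`. The helper name `additional_tf_dirs` and the base path are hypothetical and are not part of the `pipeline` module.

```python
from pathlib import PurePosixPath


def additional_tf_dirs(base, repos, version):
    """Map each additional repo to the directory where its features are expected.

    Assumption: the layout is <base>/<repo>/tf/<version>.
    """
    return {repo: PurePosixPath(base) / repo / "tf" / version for repo in repos}


# Hypothetical base directory; adjust to wherever the repos are cloned.
dirs = additional_tf_dirs("/data/github/etcbc", ["phono", "valence", "parallels"], "c")
for repo, path in dirs.items():
    print(f"{repo:<10} {path}")
```

The sketch only computes paths. Whether a directory actually exists and contains feature files still has to be checked at load time.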

### Resulting data
The resulting data will be delivered in the `shebanq` subdirectory of the core repo `bhsa`, 
and then under the chosen *version* subdirectory.

The resulting data consists of three parts:

* One big MQL file, containing the core data plus **all** additions: `bhsa_xx.mql`.
  It is delivered bzip2-compressed, as `bhsa_xx.mql.bz2`.
* **not yet implemented** 
  A subdirectory `mysql` with database tables, containing everything SHEBANQ needs to construct its pages.
* **not yet implemented**
  A subdirectory `annotations`, containing bulk-uploadable annotation sets that SHEBANQ can show in notes view,
  between the clause atoms of the text.
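
The compression step for the MQL file can be sketched as follows. This is an illustration using only the standard library, not the code inside `webPipeline`. It assumes that compression is skipped when a compressed file already exists and is at least as new as the uncompressed one. The function name `bzip_if_stale` and the placeholder file content are hypothetical.

```python
import bz2
import os
import shutil
import tempfile


def bzip_if_stale(src, dst):
    """Compress src to dst with bzip2; return True if compression was done.

    Nothing is written when dst exists and is at least as new as src.
    """
    if os.path.exists(dst) and os.path.getmtime(dst) >= os.path.getmtime(src):
        return False
    os.makedirs(os.path.dirname(dst) or ".", exist_ok=True)
    with open(src, "rb") as fin, bz2.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return True


# Demonstration in a throwaway directory with placeholder content.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "bhsa_xx.mql")
    dst = os.path.join(tmp, "shebanq", "c", "bhsa_xx.mql.bz2")
    with open(src, "w") as fh:
        fh.write("placeholder content\n")
    first = bzip_if_stale(src, dst)   # no compressed file yet, so it is created
    second = bzip_if_stale(src, dst)  # compressed file is current, so it is reused
    with bz2.open(dst, "rt") as fh:
        restored = fh.read()
```

Comparing modification times keeps repeated runs cheap, at the cost of trusting the file system's timestamps.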

In [1]:
import os
import sys
import collections
from pipeline import webPipeline
from tf.fabric import Fabric

# Config

In [2]:
if 'SCRIPT' not in locals(): 
    SCRIPT = False

In [3]:
pipeline = dict(
    repoOrder = '''
        bhsa
        phono
        valence
        parallels
    ''',
    coreModule='core',
)
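
The `repoOrder` value is a whitespace-separated block of repo names. As a minimal sketch of how such a value can be turned into an ordered list, the snippet below splits it on whitespace. Treating the first entry as the core repo is an assumption made here for illustration; how `webPipeline` interprets the string internally is not shown in this notebook.

```python
# Same configuration as in the cell above, repeated so the snippet runs on its own.
pipeline = dict(
    repoOrder='''
        bhsa
        phono
        valence
        parallels
    ''',
    coreModule='core',
)

# str.split() without arguments splits on any run of whitespace,
# so indentation and blank lines in the block do not matter.
repos = pipeline['repoOrder'].split()

# Assumption: the first repo is the core repo, the rest are additional repos.
core_repo, additional_repos = repos[0], repos[1:]
print(core_repo, additional_repos)
```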

In [4]:
good = webPipeline(pipeline, version='c', force=False)


##############################################################################################
#                                                                                            #
#       0.00s Aggregate MLQ for version c                                                    #
#                                                                                            #
##############################################################################################

|       0.00s 	Already up to date
|       0.00s 	bzipping /Users/dirk/github/etcbc/bhsa/_temp/c/bhsa_xx.mql
|       0.00s 	and delivering as /Users/dirk/github/etcbc/bhsa/shebanq/c/bhsa_xx.mql.bz2
|       0.00s 	NOTE: Using existing bzipped file which is newer than unzipped one
