In [1]:
%load_ext autoreload
%autoreload 2

# Make a new data release of the current version of the NENA data

The NENA data should reside locally on your system.

By default, we assume it is under `~/github` and then under *org/repo/folder/version*
where

* *org* = `CambridgeSemiticsLab`
* *repo* = `nena_tf`
* *folder* = `tf`
* *version* = `alpha`

Pass *org*, *repo*, *folder*, and *version* as parameters.
To use a base directory other than `~/github`, pass a `source` parameter.
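
As a rough illustration of the layout described above, the following sketch composes the expected data directory from the parameters. The helper name `dataLocation` is hypothetical and not part of Text-Fabric; it only mirrors the *source/org/repo/folder/version* convention stated here.

```python
import os


def dataLocation(org, repo, folder, version, source="~/github"):
    """Compose the directory where the data is expected (hypothetical helper)."""
    # expand "~" to the home directory, then append org/repo/folder/version
    return os.path.join(os.path.expanduser(source), org, repo, folder, version)


print(dataLocation("CambridgeSemiticsLab", "nena_tf", "tf", "alpha"))
```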

The data will be zipped into a file that will be attached to a new release on GitHub.
This zipfile is created by default in `~/Downloads`, but you can override this by passing a `dest` parameter.

In the call below, the TF data resides under `~/local/data`, and `~/local/zips` is the landing directory for the zip file.

Note the parameter `3` after `VERSION` in the call of `releaseData()` below.
It indicates which part of the release version number is increased by one.
Any parts after it are dropped, and missing parts before it are filled in with `0`.

Examples:

old version | method | new version
--- | --- | ---
v2.4.6 | 3 | v2.4.7
v2.4.6 | 2 | v2.5
v2.4.6 | 1 | v3
v2.4 | 3 | v2.4.1
v2 | 3 | v2.0.1
v2 | 2 | v2.1
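
The rule in the table can be sketched in a few lines of Python. This is only an illustration of the behaviour shown in the examples, not the code that `releaseData()` uses internally; the function name `bumpVersion` is hypothetical.

```python
def bumpVersion(version, method):
    """Increase part number `method` (1-based) of a version like "v2.4.6"."""
    parts = [int(p) for p in version.lstrip("v").split(".")]
    # fill in missing parts with 0, so that "v2" can be bumped at part 3
    parts += [0] * (method - len(parts))
    # drop the parts after the one that is increased
    parts = parts[:method]
    parts[-1] += 1
    return "v" + ".".join(str(p) for p in parts)


for (old, method) in (("v2.4.6", 3), ("v2.4.6", 2), ("v2", 3)):
    print(old, method, bumpVersion(old, method))
```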

In [2]:
from tf.advanced.repo import releaseData

ORG = "CambridgeSemiticsLab"
REPO = "nena_tf"
FOLDER = "tf"
VERSION = "alpha"
DATA_IN = "~/local/data"
DATA_ZIP = "~/local/zips"

releaseData(ORG, REPO, FOLDER, VERSION, 3, source=DATA_IN, dest=DATA_ZIP)

Create release data for CambridgeSemiticsLab/nena_tf/tf
Found 1 versions
zip files end up in ~/local/zips/CambridgeSemiticsLab-release/nena_tf
zipping CambridgeSemiticsLab/nena_tf alpha with  35 features ==> tf-alpha.zip
rate limit is 5000 requests per hour, with 5000 left for this hour
	connecting to online GitHub repo CambridgeSemiticsLab/nena_tf ... connected
Latest release = v0.1.3
New release = v0.1.4
tf-alpha.zip attached to release v0.1.4


True

# Make all search clients for the NENA dataset

Suppose a new release has been made of the NENA corpus data.
Now we want to regenerate its client search apps.

We assume the config data for the apps is in a local directory on the system.

The config directory should have the same structure as
[layeredsearch](https://github.com/annotation/app-nena/tree/master/layeredsearch).
You can tweak it, as long as it remains compatible with the client generation process.

We generate the client search data in a local directory on the system.

In [12]:
import os
from tf.client.make.build import makeSearchClients

APP_DIR = os.path.expanduser("~/local/app-nena")
OUTPUT_DIR = os.path.expanduser("~/local/lsOut")
DATASET = "nena"

In [13]:
makeSearchClients(DATASET, OUTPUT_DIR, APP_DIR, dataDir="~/local/tfNew")

This is Text-Fabric 8.5.10
Api reference : https://annotation.github.io/text-fabric/tf/cheatsheet.html

26 features found and 0 ignored
  0.00s loading features ...
   |     0.00s Dataset without structure sections in otext:no structure functions in the T-API
  1.94s All features loaded/computed - for details use loadLog()
  0.00s loading features ...
  0.34s All additional features loaded - for details use loadLog()




o-o-o-o-o-o-o phono o-o-o-o-o-o-o-o


Node type declared as result focus:

	sentence

Layers declared as visible in the result ('visible'):

	word/text, word/fuzzy

  0.46s Config written to file ~/local/lsOut/phono/corpus/config.js
  0.47s Make links ...
  0.47s links for types text, line
text                :    127 links
line                :   2587 links
  0.62s done
  0.62s Recording ...
  0.00s preparing ... 
  0.00s start recording
127 Women Do Things Best                                                            
    11s done
    11s Dumping ...
    11s wrap recorders for delivery
    11s 	word
    11s 		text
    11s 		full
    11s 		fuzzy
    12s 		lite
    12s 		cls
    12s 		lang
    12s 		speaker
    12s 	line
    12s 		number
    12s 	text
    12s 		title
    12s 		tid
    12s 		dialect
    12s 		place
    12s wrap accumulators for delivery
    12s 	word
    12s 		voice
    12s 		place
    12s 		manner
    12s 	line
    12s 	text
  0.00s Dumping data to compact json fil

# Ship all apps for all corpora

From now on we work in a setting where we ship client apps to the GitHub Pages site of the `app-`*dataset* repo.

In [2]:
import collections
import os

from tf.client.make.build import Make

In [3]:
APPS = (
    ("bhsa", "structure"),
    ("missieven", "text"),
    ("nena", "phono"),
    ("nena", "fuzzy"),
)

In [4]:
APPS_BY_DATASET = (
    ("nena", ("fuzzy", "phono")),
    ("bhsa", ("structure",)),
    ("missieven", ("text",)),
)
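
The grouping in `APPS_BY_DATASET` was written by hand, but it can also be derived from `APPS`. The sketch below does that with `collections.defaultdict`; the name `APPS_GROUPED` is hypothetical, and the resulting order follows `APPS`, so it differs from the hand-written tuple above.

```python
import collections

APPS = (
    ("bhsa", "structure"),
    ("missieven", "text"),
    ("nena", "phono"),
    ("nena", "fuzzy"),
)

# collect the apps per dataset, in order of first appearance
grouped = collections.defaultdict(list)
for (dataset, app) in APPS:
    grouped[dataset].append(app)

APPS_GROUPED = tuple((dataset, tuple(apps)) for (dataset, apps) in grouped.items())
print(APPS_GROUPED)
```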

In [5]:
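for (dataset, apps) in APPS_BY_DATASET:
    nApps = len(apps)
    for app in apps:
        Mk = Make(dataset, app, debugState="off")
        # publish right away only if this is the sole app of the dataset
        Mk.ship(publish=nApps == 1)
    if nApps > 1:
        # several apps for one dataset: publish once, after all are shipped
        Mk = Make(dataset, None, debugState="off")
        Mk.publish()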

Version went from `v074@2021-05-20T16:04:29` to `v075@2021-06-14T11:31:30`


Node type declared as result focus:

	sentence

Layers declared as visible in the result ('visible'):

	word/fuzzy, line/number, text/title, text/dialect

  0.35s Config written to file ~/github/annotation/app-nena/site/fuzzy/corpus/config.js
  0.35s Make links ...
  0.35s links for types text, line
text                :    126 links
line                :   2544 links
  0.47s done
  0.47s Recording ...
  0.00s preparing ... 
  0.00s start recording
126 Women Do Things Best                                                            
  1.49s done
  1.49s Dumping ...
  1.49s wrap recorders for delivery
  1.49s 	word
  1.49s 		fuzzy
  1.64s 	line
  1.65s 		number
  1.65s 	text
  1.65s 		title
  1.65s 		dialect
  1.65s wrap accumulators for delivery
  1.65s 	word
  1.65s 	line
  1.65s 	text
  0.00s Dumping data to compact json files
  0.01s Data texts-word-fuzzy stored in ~/github/annotation/app-nena/site/fuzzy/corpus/texts-word-fuzzy.js
  0.01s Data texts-line-number stored in ~/github/ann

Node type declared as result focus:

	sentence

Layers declared as visible in the result ('visible'):

	word/text, word/fuzzy

  0.43s Config written to file ~/github/annotation/app-nena/site/phono/corpus/config.js
  0.43s Make links ...
  0.44s links for types text, line
text                :    126 links
line                :   2544 links
  0.57s done
  0.57s Recording ...
  0.00s preparing ... 
  0.00s start recording
126 Women Do Things Best                                                            
  9.37s done
  9.37s Dumping ...
  9.37s wrap recorders for delivery
  9.37s 	word
  9.37s 		text
  9.53s 		full
  9.72s 		fuzzy
  9.89s 		lite
    10s 		cls
    10s 		pos
    10s 		lang
    10s 		speaker
    11s 	line
    11s 		number
    11s 	text
    11s 		title
    11s 		tid
    11s 		dialect
    11s 		place
    11s wrap accumulators for delivery
    11s 	word
    11s 		voice
    11s 		place
    11s 		manner
    11s 	line
    11s 	text
  0.00s Dumping data to compact json files
  0

Node type declared as result focus:

	sentence

Layers declared as visible in the result ('visible'):

	word/lex, word/phono, word/pdp, word/png, word/vs, word/vt, phrase/function, clause/rela, verse/number, chapter/number, book/book

  7.18s Config written to file ~/github/annotation/app-bhsa/site/structure/corpus/config.js
  7.18s Make links ...
  7.18s links for types book, chapter, verse
book                :     39 links
chapter             :    929 links
verse               :  23213 links
  8.39s done
  8.39s Recording ...
  0.00s preparing ... 
  0.00s start recording
 39 2_Chronicles                                                                    
    45s done
    45s Dumping ...
    45s wrap recorders for delivery
    45s 	word
    45s 		lex
    45s 		phono
    46s 		gloss
    47s 		pdp
    47s 	phrase
    47s 		ptype
    48s 	clause
    48s 		ttype
    48s 		ctype
    48s 	sentence
    48s 		number
    48s 	verse
    48s 		number
    48s 	chapter
    48s 		number
    48s 	

  0.80s Combining features ...
Node type declared as result focus:

	line

Layers declared as visible in the result ('visible'):

	word/text, word/kind, letter/author, letter/date, letter/place

  6.87s Config written to file ~/github/annotation/app-missieven/site/text/corpus/config.js
  6.87s Make links ...
  6.87s links for types volume, page
volume              :     13 links
page                :  10149 links
  7.19s done
  7.19s Recording ...
  0.00s preparing ... 
  0.00s start recording
  1 1
  2 2                                                                         
  3 3                                                                         
  4 4                                                                         
  5 5                                                                         
  6 6                                                                         
  7 7                                                                         
  8 8                

# Individual apps in debug mode

In [20]:
# APPS[1] is ("missieven", "text")
Mk = Make(*APPS[1], debugState="on")

# Load data

The Text-Fabric dataset is loaded.

In [21]:
Mk.loadTf()

In [22]:
A = Mk.A
api = A.api
Fs = api.Fs
F = api.F
L = api.L
T = api.T

In [23]:
T.sectionTypes

['volume', 'page', 'line']

# Configure

If you changed critical files in the layered search app (`mkdata.py` or `config.yaml`),
run this command to update the configuration inside the maker.

In [24]:
# do this if you have changed mkdata.py or config.yaml

Mk.config()

True

# Settings

Generate the settings for the client app, but do not dump them to file yet.
The legends are also generated here, and they may use the loaded data.

In [25]:
Mk.makeClientSettings()

  0.81s Combining features ...
Node type declared as result focus:

	line

Layers declared as visible in the result ('visible'):

	word/text, word/kind, letter/author, letter/date, letter/place



# Links

Generate links from section nodes to online locations of those sections.

This is done by simply calling the Text-Fabric API.

In [26]:
Mk.makeLinks()

  6.56s links for types volume, page
volume              :     13 links
page                :  10149 links
  6.85s done


# Record

Here we call the app-dependent function `record()`,
which records the corpus data in `recorders` and `accumulators`.

Some layers can use the position data of other layers, and these layers are stored in accumulators.

Layers with their own position data are stored in recorders, which remember the node positions within
the stored material. This is a Text-Fabric device, see [Recorder](https://annotation.github.io/text-fabric/tf/convert/recorder.html).
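
To make the idea of "remembering node positions" concrete, here is a toy stand-in written from scratch. It is not the Text-Fabric `Recorder` class and does not reproduce its API; the class name `ToyRecorder` and its methods are assumptions for illustration only. It builds up a text and tracks, for each character, which nodes were active when that character was added.

```python
class ToyRecorder:
    """Toy stand-in for the idea of a recorder; not the Text-Fabric class."""

    def __init__(self):
        self.chunks = []   # pieces of text in the order they were added
        self.nodesAt = []  # for each character: the nodes active at that point
        self.active = []   # nodes that have been started but not yet ended

    def start(self, node):
        self.active.append(node)

    def end(self, node):
        self.active.remove(node)

    def add(self, string):
        self.chunks.append(string)
        # every character of the string belongs to all currently active nodes
        self.nodesAt.extend(tuple(self.active) for _ in string)

    def text(self):
        return "".join(self.chunks)

    def positions(self):
        return list(self.nodesAt)


rec = ToyRecorder()
rec.start(1)  # node 1 plays the role of a line
rec.start(2)  # node 2 plays the role of the first word
rec.add("ab")
rec.end(2)
rec.add(" ")
rec.start(3)  # node 3 plays the role of the second word
rec.add("c")
rec.end(3)
rec.end(1)

print(rec.text())
print(rec.positions())
```

With such a mapping, a match at a character position in the recorded text can be traced back to the nodes it belongs to.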

In [27]:
Mk.config()
Mk.record()

  0.00s preparing ... 
  0.00s start recording
  1 1
  2 2                                                                         
  3 3                                                                         
  4 4                                                                         
  5 5                                                                         
  6 6                                                                         
  7 7                                                                         
  8 8                                                                         
  9 9                                                                         
 10 10                                                                        
 11 11                                                                       
 12 12                                                                       
 13 13                                                                       
  

# Dump data

The corpus texts and positions are derived from the recorders and accumulators, and written to file.

In [28]:
Mk.config()
Mk.dumpCorpus()

 1m 08s wrap recorders for delivery
 1m 08s 	word
 1m 08s 		text
 1m 15s 		kind
 1m 17s 	line
 1m 17s 		location
 1m 18s 	letter
 1m 18s 		page
 1m 18s 		author
 1m 18s 		date
 1m 18s 		place
 1m 18s 	volume
 1m 18s 		number
 1m 18s wrap accumulators for delivery
 1m 18s 	word
 1m 18s 	line
 1m 18s 	letter
 1m 18s 	volume
  0.00s Dumping data to compact json files
  0.32s Data texts-word-text stored in ~/github/annotation/app-missieven/site/text/corpus/texts-word-text.js
  0.36s Data texts-word-kind stored in ~/github/annotation/app-missieven/site/text/corpus/texts-word-kind.js
  0.38s Data texts-line-location stored in ~/github/annotation/app-missieven/site/text/corpus/texts-line-location.js
  0.38s Data texts-letter-page stored in ~/github/annotation/app-missieven/site/text/corpus/texts-letter-page.js
  0.38s Data texts-letter-author stored in ~/github/annotation/app-missieven/site/text/corpus/texts-letter-author.js
  0.38s Data texts-letter-date stored in ~/github/annotation/app-mis

True

# Dump config

The client settings, generated in an earlier step, are dumped to file, next to the corpus data.

In [29]:
Mk.dumpConfig()

  5.92s Config written to file ~/github/annotation/app-missieven/site/text/corpus/config.js


# Make

The client app is composed as an HTML page with CSS styles and a JavaScript program,
and it is moved to the `site` directory of the repo.

We also set the debug flag to match how the maker was initialized: here, debug is on.

In [30]:
Mk.makeClient()
Mk.adjustDebug()

Copied static files
html file written to /Users/dirk/github/annotation/app-missieven/site/index.html
html file written to /Users/dirk/github/annotation/app-missieven/site/text/index.html
html file (for use with file://) written to /Users/dirk/github/annotation/app-missieven/site/text/index-local.html
Adjusting debug in /Users/dirk/github/annotation/app-missieven/site/text/js/defs.js
Debug set to true
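
The individual steps above can be chained in a small driver. The sketch below uses only the method names that appear in this notebook and calls them in the order of the sections. The helper `runSteps` and the `FakeMaker` stand-in are hypothetical, and the convention that a step signals failure by returning `False` is an assumption, since the notebook only shows some steps returning `True`.

```python
STEPS = (
    "loadTf",
    "config",
    "makeClientSettings",
    "makeLinks",
    "record",
    "dumpCorpus",
    "dumpConfig",
    "makeClient",
    "adjustDebug",
)


def runSteps(maker, steps=STEPS):
    """Call the named methods on maker in order; stop at the first that returns False.

    Returns the list of steps that were completed.
    """
    done = []
    for step in steps:
        result = getattr(maker, step)()
        # assumption: a step signals failure by returning False;
        # steps that return None or True count as successful
        if result is False:
            break
        done.append(step)
    return done


class FakeMaker:
    """Stand-in object so that the sketch runs without Text-Fabric."""

    def __getattr__(self, name):
        return lambda: True


print(runSteps(FakeMaker()))
```

With a real maker, the call would be `runSteps(Mk)` for an `Mk` created as in the cells above.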
