<img align="right" src="tf-small.png"/>

# Programming theologians

[Text-Fabric](https://github.com/ETCBC/text-fabric): Ancient texts as fabrics of source and annotations.

[data model](https://github.com/ETCBC/text-fabric/wiki/Data-model): Text objects, relationships, features.

Got it? Get it! [home page](https://github.com/ETCBC/text-fabric/wiki)

Join the computing

1. go to [https://shebanq.jove.surfsara.nl](https://shebanq.jove.surfsara.nl) and log in (see paper ticket)
1. select assignment `prog_theo`, fetch `leipzig` and click it
1. click `Programming theologians.ipynb` and off you go

![shot](jove.png)

# Before the beginning

In [None]:
import collections, pandas
import matplotlib.pyplot as plt
from IPython.display import display
%matplotlib inline
pandas.set_option('display.notebook_repr_html', True)

In [None]:
from tf.fabric import Fabric

ETCBC = 'hebrew/etcbc4c'
PHONO = 'hebrew/phono'

TF_H = Fabric( modules=[ETCBC, PHONO], silent=False )

In [None]:
TF_G = Fabric(modules='greek/sblgnt')

In [None]:
apiH = TF_H.load('sp')

In [None]:
apiG = TF_G.load('psp')

In [None]:
def doGreek():
    # point the shared handles at the Greek API
    global T, L, F, Fs
    T = apiG.T
    L = apiG.L
    F = apiG.F
    Fs = apiG.Fs

def doHebrew():
    # point the shared handles at the Hebrew API
    global T, L, F, Fs
    T = apiH.T
    L = apiH.L
    F = apiH.F
    Fs = apiH.Fs

def doingHebrew():
    return F is apiH.F
def doingGreek():
    return F is apiG.F

# In the beginning

The first verse

In [None]:
doHebrew()

In [None]:
T.text(range(1,12))

In [None]:
T.text(range(1,12), fmt='text-phono-full')

In [None]:
T.formats

In [None]:
T.text(range(1,12), fmt='lex-orig-plain')

In [None]:
doGreek()

In [None]:
firstVerse = T.nodeFromSection(('Matthew', 1, 1))
F.otype.v(firstVerse)

In [None]:
words = L.d(firstVerse, otype='word')
words

In [None]:
T.text(words)

In [None]:
T.formats

In [None]:
T.text(words, fmt='text-orig-plain')

In [None]:
T.text(words, fmt='lex-orig-full')

# Man and woman
God created the genders, we count them.

Which genders do we have?

In [None]:
doHebrew()

TF_H.load('gn', add=True)

{F.gn.v(w) for w in F.otype.s('word')}

In [None]:
def getGenders():
    featureName = 'gn' if doingHebrew() else 'Gender'
    return {Fs(featureName).v(w) for w in F.otype.s('word')}
getGenders()

In [None]:
doGreek()
TF_G.load('Gender', add=True)
print(getGenders())
doHebrew()

In [None]:
def countGenders():
    featureName = 'gn' if doingHebrew() else 'Gender'
    stats = collections.Counter()
    for w in F.otype.s('word'):
        stats[Fs(featureName).v(w)] += 1
    print(stats)
countGenders()
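
The counting step can be tried without any corpus loaded. This is a minimal sketch on hypothetical gender values, one per word, where `None` stands for a word without a value.

```python
import collections

# Hypothetical gender values, one per word; None means "no value".
genders = ['m', 'f', 'm', None, 'NA', 'm', 'f', None]

# Counter accepts any iterable and tallies equal items.
stats = collections.Counter(genders)

# most_common() lists the values by descending count.
for (value, count) in stats.most_common():
    print('{:<5} {}'.format(str(value), count))
```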

## ... in graphic detail ...

In [None]:
def genderBias(book):
    bookNode = T.nodeFromSection((book,))
    chapterNodes = L.d(bookNode, otype='chapter')
    x = [T.sectionFromNode(c)[1] for c in chapterNodes]
    masc = dict((c, 0) for c in x)
    fem = dict((c, 0) for c in x)
    neut = dict((c, 0) for c in x)
    absent = dict((c, 0) for c in x)
    total = dict((c, 0) for c in x)

    genderFeature = 'gn' if doingHebrew() else 'Gender'

    for chapterNode in chapterNodes:
        chapter = T.sectionFromNode(chapterNode)[1]
        words = L.d(chapterNode, otype='word')
        for w in words:
            total[chapter] += 1
            gender = Fs(genderFeature).v(w)
            if gender in {'m', 'Masculine'}: masc[chapter] += 1
            if gender in {'f', 'Feminine'}: fem[chapter] += 1
            if gender in {'Neuter'}: neut[chapter] += 1
            if gender in {'NA', 'unknown', None}: absent[chapter] += 1
    m = [100 * masc[c] / total[c] for c in x]
    f = [100 * fem[c] / total[c] for c in x]
    n = [100 * neut[c] / total[c] for c in x]
    a = [100 * absent[c] / total[c] for c in x]

    fig = plt.figure()
    plt.plot(x, m, 'b-', x, f, 'r-', x, n, 'g-', x, a, '0.5')
    plt.axis([x[0], x[-1], 0, 70])
    plt.xticks(x, x, rotation='vertical')
    plt.margins(0.2)
    plt.subplots_adjust(bottom=0.15);
    plt.title('gender in {} {}-{}'.format(book, x[0], x[-1]))
    

In [None]:
genderBias('Leviticus')
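
The numbers behind the plot are percentages per chapter. The sketch below redoes that arithmetic on hypothetical (chapter, gender) pairs, without Text-Fabric or matplotlib. The function name `genderShares` is an illustration only.

```python
# Hypothetical (chapter, gender) pairs, one per word.
words = [
    (1, 'm'), (1, 'f'), (1, 'm'), (1, 'NA'),
    (2, 'f'), (2, 'f'), (2, 'm'), (2, None),
]

def genderShares(words, wanted):
    """Percentage of words per chapter whose gender is in the set `wanted`."""
    total = {}
    hits = {}
    for (chapter, gender) in words:
        total[chapter] = total.get(chapter, 0) + 1
        hits[chapter] = hits.get(chapter, 0) + (1 if gender in wanted else 0)
    return {c: 100 * hits[c] / total[c] for c in sorted(total)}

masc = genderShares(words, {'m', 'Masculine'})
fem = genderShares(words, {'f', 'Feminine'})
print(masc)
print(fem)
```

Passing both the Hebrew and the Greek value names in `wanted` mirrors the way `genderBias` handles the two corpora with one test.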

## Inspect some peaks and dips

In [None]:
TF_H.load('gloss', add=True)

In [None]:
def atAGlance(book, chapter):
    words = L.d(T.nodeFromSection((book, chapter)), otype='word')
    freqs = collections.Counter()
    for w in words:
        if doingHebrew():
            lexeme = L.u(w, otype='lex')[0]
            freqs[F.gloss.v(lexeme)] += 1
        else:
            freqs[F.UnicodeLemma.v(w)] += 1
    for (gloss, freq) in sorted(freqs.items(), key=lambda x: (-x[1], x[0])):
        print('{:>3} {}'.format(freq, gloss))
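
The sort key in `atAGlance` orders by frequency first and alphabetically second. A small sketch on hypothetical glosses shows the effect.

```python
import collections

# Hypothetical glosses, one per word occurrence.
glosses = ['land', 'be', 'land', 'give', 'be', 'land', 'say']
freqs = collections.Counter(glosses)

# Negating the count puts high frequencies first;
# ties fall back to alphabetical order of the gloss.
ranked = sorted(freqs.items(), key=lambda x: (-x[1], x[0]))
for (gloss, freq) in ranked:
    print('{:>3} {}'.format(freq, gloss))
```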

In [None]:
        
def inDepth(book, chapter):
    chapterNode = T.nodeFromSection((book, chapter))
    verseNodes = L.d(chapterNode, otype='verse')
    for verseNode in verseNodes:
        words = L.d(verseNode, otype='word')
        print('{}: {}'.format(T.sectionFromNode(verseNode)[2], T.text(words)))  

In [None]:
genderBias('Leviticus')

In [None]:
atAGlance('Leviticus', 18)

In [None]:
genderBias('Leviticus')

In [None]:
atAGlance('Leviticus', 26)

In [None]:
inDepth('Leviticus', 26)

## Man, woman and thing

In [None]:
doGreek()

TF_G.load('UnicodeLemma', add=True)

The Greek genders

In [None]:
getGenders()

In [None]:
genderBias('Matthew')

In [None]:
atAGlance('Matthew', 24)

In [None]:
inDepth('Matthew', 24)

In [None]:
genderBias('John')

In [None]:
atAGlance('John', 16)


# Six days of work

Semantic plurals in the letter of Jude.

Let's get all nominal phrases.

In [None]:
doGreek()

TF_G.load('Cat', add=True)

bookNode = T.nodeFromSection(('Jude',))
phraseNodes = L.d(bookNode, otype='phrase')
NPs = [p for p in phraseNodes if F.Cat.v(p) == 'np']

print('{} NPs in Jude'.format(len(NPs)))

Export this data as CSV

so that an expert can add a feature: *semantically plural*.

In [None]:
enrichFile = 'np.csv'
enrichedFile = 'np-enriched.csv'

with open(enrichFile, 'w') as f:
    fieldNames = ['passage', 'node', 'phrase', 'semantic plural', 'sentence']
    f.write('{}\n'.format('\t'.join(fieldNames)))
    for np in NPs:
        sn = L.u(np, otype='sentence')[0]
        sentence = L.d(sn, otype='word')
        phrase = L.d(np, otype='word')
        fields = [
            '{} {}:{}'.format(*T.sectionFromNode(np)),
            str(np),
            T.text(phrase),
            '',
            T.text(sentence),
        ]
        f.write('{}\n'.format('\t'.join(fields)))

dataFrame = pandas.read_csv(enrichFile, sep='\t')

dataFrame.head(100)
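
Joining fields with tabs by hand works as long as no text contains a tab. As an alternative sketch, Python's standard `csv` module quotes such fields when needed. The rows and node numbers below are hypothetical, and an in-memory buffer stands in for `np.csv`.

```python
import csv
import io

fieldNames = ['passage', 'node', 'phrase', 'semantic plural', 'sentence']

# Hypothetical rows; the second phrase contains a tab on purpose.
rows = [
    ['Jude 1:1', '1001', 'phrase one', '', 'sentence one'],
    ['Jude 1:2', '1002', 'phrase\ttwo', '', 'sentence two'],
]

buffer = io.StringIO()  # stands in for the file np.csv
writer = csv.writer(buffer, delimiter='\t', lineterminator='\n')
writer.writerow(fieldNames)
writer.writerows(rows)

# Reading back yields the original fields, including the embedded tab.
buffer.seek(0)
readBack = list(csv.reader(buffer, delimiter='\t'))
print(readBack[2][2] == 'phrase\ttwo')
```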

In [None]:
semNumber = dict()

with open(enrichedFile) as f:
    for (i, line) in enumerate(f):
        if i == 0: continue                    # header row

        # assumes the enriched file is ';'-separated (the export above uses tabs)
        fields = line.rstrip('\n').split(';')
        value = fields[3]
        if value == '': continue               # no data entered

        node = int(fields[1])
        semNumber[node] = value
        
for p in sorted(semNumber):
    print('{} => {}'.format(p, semNumber[p]))
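
The same reading step can be sketched with `csv.DictReader`, which addresses columns by header name instead of position. The file content below is hypothetical, and the `;` separator is an assumption about how the enriched file was saved.

```python
import csv
import io

# Hypothetical content of the enriched file, assuming ';' as the separator.
enriched = io.StringIO(
    'passage;node;phrase;semantic plural;sentence\n'
    'Jude 1:1;1001;phrase one;plural;sentence one\n'
    'Jude 1:2;1002;phrase two;;sentence two\n'
)

semNumberSketch = {}
for row in csv.DictReader(enriched, delimiter=';'):
    value = row['semantic plural']
    if value == '':
        continue  # no data entered for this phrase
    semNumberSketch[int(row['node'])] = value

print(semNumberSketch)
```

Only rows with a filled-in value end up in the mapping, as in the cell above.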

Save the new feature as a text-fabric file.

In [None]:
metaData = dict(
    semNumber=dict(
        valueType='str',
        source='Semantic plurality training set',
        author='J.S. Bach, Leipzig',
    ),
)
TF_G = Fabric(locations='.', modules='semantic')
TF_G.save(
    nodeFeatures=dict(semNumber=semNumber),
    metaData=metaData,
)

In [None]:
!cat semantic/semNumber.tf

## Use the new feature

In [None]:
LOCATIONS = [
    '~/Downloads/text-fabric-data',
    '~/text-fabric-data',
    '~/github/text-fabric-data',
    '/mnt/shared/text-fabric-data',
]

TF_G = Fabric(
    locations=LOCATIONS+['.'], 
    modules=['greek/sblgnt', 'semantic'],
)

In [None]:
apiG = TF_G.load('Number semNumber')
doGreek()

In [None]:
for np in NPs:
    value = F.semNumber.v(np)          # local name, so the semNumber dict is not overwritten
    if not value: continue
    words = L.d(np, otype='word')
    print('NP {}: semantically "{}", words marked as {}'.format(
        np,
        value,
        ' '.join(F.Number.v(w) for w in words if F.Number.v(w)),
    ))
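
A possible next step is to list the phrases where the semantic and the grammatical number disagree. The sketch below does that on hypothetical data. The value names `'plural'`, `'Singular'` and `'Plural'` are assumptions, and `mismatches` is an illustrative helper, not part of Text-Fabric.

```python
# Hypothetical data: semantic number per NP node,
# and the grammatical number of each word in that NP.
semantic = {1001: 'plural', 1002: 'singular', 1003: 'plural'}
grammatical = {
    1001: ['Singular', 'Singular'],
    1002: ['Singular'],
    1003: ['Plural', 'Plural'],
}

def mismatches(semantic, grammatical):
    """NPs marked semantically plural whose words are all grammatically singular."""
    return [
        np for np in sorted(semantic)
        if semantic[np] == 'plural'
        and all(n == 'Singular' for n in grammatical[np])
    ]

print(mismatches(semantic, grammatical))
```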

# Sabbath
Have a look at the (un)finished work and see whether it is good.

## Martijn Naaijer

Won a grassroots prize for setting up a theology course based on SHEBANQ, Jupyter, and R.
See [Python course here](https://shebanq.jove.surfsara.nl/user/dirkr/notebooks/shared/martijn/Python_Course/Introduction_to_text_fabric.ipynb).

![poster](PosterGrassroots_Naaijer.jpg)

## Christiaan Erwich

Tries to track who is who in the Psalms, and is deeply into graph visualization.
![doxo](doxology.pdf)

## Cody Kingham

Helped to convert the SBL Greek New Testament to Text-Fabric format.
Tries to
[explain to the world](http://www.codykingham.com/etcbc/datacreation)
how the ETCBC encoded the Hebrew Bible after 40 years of struggling with computers.

![schema](ps4.p_description.png)

## Dirk Roorda

Tries to recombine everything.

[Phonetic transcription of Hebrew](https://rawgit.com/ETCBC/text-fabric/master/phono/phonoTf.html)

![phono](phono_tests.png)

[Parallel passages](https://shebanq.ancient-data.org/shebanq/static/docs/tools/parallel/parallels.html)

See it in action on SHEBANQ:
[etcbc4b Genesis 10:1](https://shebanq.ancient-data.org/hebrew/text?qactive=hlcustom&qsel_one=grey&qpub=x&qget=x&wactive=hlcustom&wsel_one=gray&wpub=x&wget=x&nactive=hlcustom&nsel_one=black&npub=x&nget=v&chapter=10&lang=en&book=Genesis&qw=q&tr=hb&tp=txt_tb1&iid=Mnxjcm9zc3JlZg__&verse=1&version=4b&mr=m&page=1&wd4_statfl=v&ph_arela=v&wd4_statrl=v&sn_an=v&cl=v&wd1_lang=x&wd1_subpos=x&wd2_person=v&sp_rela=v&wd1_pdp=x&sn_n=v&wd3_uvf=x&ph_fun=v&wd1_nmtp=v&gl=v&sp_n=v&pt=v&ph_an=v&ph_typ=x&cl_typ=v&tt=v&wd4_statro=x&wd3_vbs=x&wd1=v&tl=x&wd3=x&wd4=v&wd2_gender=v&ph=v&wd3_vbe=v&wd1_pos=v&ph_det=v&ph_rela=x&wd4_statfo=x&tl_tlv=x&wd2_stem=v&wd2_state=v&ht=v&ph_n=v&tl_tlc=x&cl_tab=v&wd3_nme=x&hl=v&cl_par=v&cl_an=v&cl_n=v&wd3_prs=v&wd3_pfm=x&sp=v&cl_code=v&ht_hk=v&wd2=v&hl_hlc=x&cl_rela=v&wd2_gnumber=v&wd2_tense=v&cl_txt=v&wd1_n=x&sn=v&ht_ht=v&hl_hlv=v&pref=alt)

![parallel](parallel.png)

# Thanks

Dirk Roorda

[Linguistic Annotation and Philology Workshop](http://www.dh.uni-leipzig.de/wo/laphw/)

Leipzig, July 6-7, 2017

![logo](DANS-logo.png)

[Data Archiving and Networked Services (DANS)](https://dans.knaw.nl/en/front-page?set_language=en)