<img align="left" src="images/peshitta2.png"/>

<img align="right" src="images/tf-small.png"/>
<img align="right" src="images/etcbc.png"/>

Here are a few examples of processing the CALAP text with Text-Fabric.

For background on CALAP and how we converted it to Text-Fabric, see [tfFromMql.ipynb](tfFromMql.ipynb).

In [1]:
import os, sys, collections
from tf.fabric import Fabric

In [2]:
PROJECT = 'calap'
VERSION = '2014'

repoBase = os.path.expanduser('~/github/etcbc')
thisRepo = '{}/{}'.format(repoBase, PROJECT)

thisTemp = '{}/_temp/{}'.format(thisRepo, VERSION)

CALAP = '{}/tf/{}'.format(thisRepo, VERSION)

In [3]:
TF = Fabric(locations=CALAP, modules=[''])
api = TF.load('')

allFeatures = TF.explore(silent=False, show=True)
loadableFeatures = allFeatures['nodes'] + allFeatures['edges']
api = TF.load(loadableFeatures)
api.makeAvailableIn(globals())

This is Text-Fabric 3.1.5
Api reference : https://github.com/Dans-labs/text-fabric/wiki/Api
Tutorial      : https://github.com/Dans-labs/text-fabric/blob/master/docs/tutorial.ipynb
Example data  : https://github.com/Dans-labs/text-fabric-data

39 features found and 0 ignored
  0.00s loading features ...
   |     0.00s Feature overview: 36 for nodes; 1 for edges; 2 configs; 7 computed
  0.32s All features loaded/computed - for details use loadLog()
   |     0.00s Feature overview: 36 for nodes; 1 for edges; 2 configs; 7 computed
  0.00s loading features ...
   |     0.03s B analyzed_form        from /Users/dirk/github/etcbc/calap/tf/2014
   |     0.02s B determination        from /Users/dirk/github/etcbc/calap/tf/2014
   |     0.01s B emf                  from /Users/dirk/github/etcbc/calap/tf/2014
   |     0.01s B frv                  from /Users/dirk/github/etcbc/calap/tf/2014
   |     0.02s B gender               from /Users/dirk/github/etcbc/calap/tf/2014
   |     0.03s B is_apposit

# First verse in all formats

To display the Syriac text, you need to install a font that has glyphs for the Syriac Unicode block (U+0700 to U+074F).
For example: Estrangelo Edessa from [Meltho](http://www.bethmardutho.org/index.php/resources/fonts.html).

In [4]:
F.vix.freqList()

(('-', 53409), ('J', 493), ('W', 8), ('G', 5), ('WR', 5))

In [6]:
# Print nodes 1 to 11 (the first eleven words) in every available text format
for fmt in T.formats:
    print('{}'.format(fmt))
    print('\t{}'.format(T.text(range(1,12), fmt=fmt)))

text-orig-full
	ܘ ܡܠܟܐ ܕܘܝܕ ܣܐܒ ܘ ܥܠ ܒ ܫܢܝܐ ܘ ܡܟܣܝܢ ܗܘܘ 
text-trans-full
	W MLK> DWJD S>B W <L B CNJ> W MKSJN HWW 
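Judging from this sample alone, the two formats line up character by character. As a rough sketch (not part of Text-Fabric, and `derive_mapping` is a hypothetical helper), the correspondence between transliteration and Syriac letters can be read off the two strings above. It only covers the letters that happen to occur in these eleven words.

```python
# First eleven words in both formats, copied from the output above.
orig = 'ܘ ܡܠܟܐ ܕܘܝܕ ܣܐܒ ܘ ܥܠ ܒ ܫܢܝܐ ܘ ܡܟܣܝܢ ܗܘܘ'
trans = 'W MLK> DWJD S>B W <L B CNJ> W MKSJN HWW'


def derive_mapping(trans_text, orig_text):
    """Derive a transliteration -> Syriac letter table from aligned text.

    Assumes both strings contain the same words in the same order and
    that every word has the same number of characters in both formats.
    """
    mapping = {}
    t_words = trans_text.split()
    o_words = orig_text.split()
    if len(t_words) != len(o_words):
        raise ValueError('different number of words')
    for t_word, o_word in zip(t_words, o_words):
        if len(t_word) != len(o_word):
            raise ValueError(f'length mismatch in word {t_word!r}')
        for t_char, o_char in zip(t_word, o_word):
            # setdefault keeps the first pairing; any later conflict is an error
            if mapping.setdefault(t_char, o_char) != o_char:
                raise ValueError(f'inconsistent mapping for {t_char!r}')
    return mapping


mapping = derive_mapping(trans, orig)
for t_char, o_char in sorted(mapping.items()):
    print(f'{t_char} -> {o_char} (U+{ord(o_char):04X})')
```

A table derived this way is only as complete as the sample; running it over more text would be needed before using it for conversion.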


# Generate full text locally

We write the full text to local files, one file per text format (Syriac Unicode and transliteration), with book, chapter and verse divisions.

In [7]:
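# Make sure the output directory exists before writing to it
os.makedirs(thisTemp, exist_ok=True)

for fmt in T.formats:
    # One output file per text format, written as UTF-8
    with open(f'{thisTemp}/{fmt}.txt', 'w', encoding='utf8') as fh:
        for b in F.otype.s('book'):
            book = T.sectionFromNode(b)[0]
            fh.write(f'\n\nBOOK {book}\n')
            # Walk down from book to chapter to verse
            for c in L.d(b, otype='chapter'):
                chapter = T.sectionFromNode(c)[1]
                fh.write(f'\n{book} {chapter}\n\n')
                for v in L.d(c, otype='verse'):
                    verse = T.sectionFromNode(v)[2]
                    # The verse text is the text of all words in the verse
                    text = T.text(L.d(v, otype='word'), fmt=fmt)
                    fh.write(f'{verse} {text}\n')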

In [8]:
!head -n 10 {thisTemp}/text-orig-full.txt



BOOK I_Kings

I_Kings 1

1 ܘ ܡܠܟܐ ܕܘܝܕ ܣܐܒ ܘ ܥܠ ܒ ܫܢܝܐ ܘ ܡܟܣܝܢ ܗܘܘ ܠܗ ܒ ܠܒܘܫܐ ܘ ܠܐ ܫܚܢ 
2 ܘ ܐܡܪܘ ܠܗ ܥܒܕܘܗܝ ܗܐ ܥܒܕܝܟ ܩܕܡܝܟ ܢܒܥܘܢ ܠ ܡܪܢ ܡܠܟܐ ܥܠܝܡܬܐ ܒܬܘܠܬܐ ܘ ܬܩܘܡ ܩܕܡ ܡܠܟܐ ܘ ܬܗܘܐ ܠܗ ܡܫܡܫܢܝܬܐ ܘ ܬܫܟܒ ܒ ܥܘܒܟ ܘ ܢܫܚܢ ܠ ܡܪܢ ܡܠܟܐ 
3 ܘ ܒܥܘ ܥܠܝܡܬܐ ܕ ܫܦܝܪܐ ܒ ܟܠܗ ܬܚܘܡܐ ܕ ܐܝܣܪܝܠ ܘ ܐܫܟܚܘ ܠ ܐܒܝܫܓ ܫܝܠܘܡܝܬܐ ܘ ܐܝܬܝܘܗ ܠ ܡܠܟܐ 
4 ܘ ܥܠܝܡܬܐ ܫܦܝܪܐ ܗܘܬ ܒ ܚܙܘܗ ܛܒ ܘ ܗܘܬ ܠ ܡܠܟܐ ܡܫܡܫܢܝܬܐ ܘ ܡܫܡܫܐ ܠܗ ܘ ܡܠܟܐ ܠܐ ܝܕܥܗ 


In [9]:
!head -n 10 {thisTemp}/text-trans-full.txt



BOOK I_Kings

I_Kings 1

1 W MLK> DWJD S>B W <L B CNJ> W MKSJN HWW LH B LBWC> W L> CXN 
2 W >MRW LH <BDWHJ H> <BDJK QDMJK NB<WN L MRN MLK> <LJMT> BTWLT> W TQWM QDM MLK> W THW> LH MCMCNJT> W TCKB B <WBK W NCXN L MRN MLK> 
3 W B<W <LJMT> D CPJR> B KLH TXWM> D >JSRJL W >CKXW L >BJCG CJLWMJT> W >JTJWH L MLK> 
4 W <LJMT> CPJR> HWT B XZWH VB W HWT L MLK> MCMCNJT> W MCMC> LH W MLK> L> JD<H 
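The generated files have a simple layout: a `BOOK` line, a chapter header consisting of book name and chapter number, and one numbered line per verse. As a minimal sketch of reading such a file back, assuming exactly the layout shown above, the hypothetical helper `parse_export` below turns the text into a dictionary keyed by (book, chapter, verse). The sample is copied from the transliterated output.

```python
import re

# Sample in the layout of text-trans-full.txt, copied from the output above.
SAMPLE = """

BOOK I_Kings

I_Kings 1

1 W MLK> DWJD S>B W <L B CNJ> W MKSJN HWW LH B LBWC> W L> CXN 
2 W >MRW LH <BDWHJ H> <BDJK QDMJK NB<WN L MRN MLK> <LJMT> BTWLT> W TQWM QDM MLK> W THW> LH MCMCNJT> W TCKB B <WBK W NCXN L MRN MLK> 
3 W B<W <LJMT> D CPJR> B KLH TXWM> D >JSRJL W >CKXW L >BJCG CJLWMJT> W >JTJWH L MLK> 
4 W <LJMT> CPJR> HWT B XZWH VB W HWT L MLK> MCMCNJT> W MCMC> LH W MLK> L> JD<H 
"""


def parse_export(text):
    """Return {(book, chapter, verse): verse_text} for exported text."""
    verses = {}
    book = None
    chapter = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith('BOOK '):
            # Start of a new book
            book = line[len('BOOK '):]
            chapter = None
            continue
        if book is not None and line.startswith(book + ' '):
            rest = line[len(book) + 1:]
            if rest.isdigit():
                # Chapter header: "<book> <chapter number>"
                chapter = int(rest)
                continue
        match = re.fullmatch(r'(\d+) (.*)', line)
        if match is None:
            raise ValueError(f'unexpected line: {line!r}')
        # Verse line: "<verse number> <text>"
        verses[(book, chapter, int(match.group(1)))] = match.group(2)
    return verses


verses = parse_export(SAMPLE)
print(len(verses), 'verses parsed')
print(verses[('I_Kings', 1, 1)])
```

To check a whole file, the same function can be applied to the contents of the generated file instead of `SAMPLE`.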


# Distribution of part of speech

In [4]:
F.psp.freqList()

(('noun', 18087),
 ('preposition', 13991),
 ('verb', 9744),
 ('conjunction', 7270),
 ('pronoun', 1553),
 ('adjective', 1463),
 ('negative', 1123),
 ('adverb', 511),
 ('interjection', 136),
 ('interrogative', 42))
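Raw counts are easier to compare as percentages. The sketch below takes the frequency list exactly as printed above (pairs of value and count) and computes each part of speech's share of the total. `relative_shares` is a hypothetical helper, not a Text-Fabric function.

```python
# Frequency list copied from the output of F.psp.freqList() above.
psp_freq = (
    ('noun', 18087),
    ('preposition', 13991),
    ('verb', 9744),
    ('conjunction', 7270),
    ('pronoun', 1553),
    ('adjective', 1463),
    ('negative', 1123),
    ('adverb', 511),
    ('interjection', 136),
    ('interrogative', 42),
)


def relative_shares(freq_list):
    """Return (total, {value: percentage}) for a list of (value, count)."""
    total = sum(count for _, count in freq_list)
    shares = {value: 100 * count / total for value, count in freq_list}
    return total, shares


total, shares = relative_shares(psp_freq)
print(f'total: {total}')
for value, share in shares.items():
    print(f'{value:<14} {share:5.1f}%')
```

The counts add up to 53920, the same total as the `vix` frequency list earlier in this notebook, so nouns make up roughly a third of all words.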

# Distribution of phrase type

In [5]:
F.phrase_type.freqList()

(('VP', 9259),
 ('CP', 8441),
 ('PP', 7418),
 ('NP', 5452),
 ('NegP', 1113),
 ('PrNP', 1012),
 ('PPrP', 951),
 ('AdvP', 439),
 ('AdjP', 396),
 ('IPrP', 176),
 ('InjP', 136),
 ('DPrP', 85),
 ('InrP', 17))
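A few phrase types account for most phrases. As a small sketch, assuming the frequency list printed above, the hypothetical helper `types_needed` finds how many of the most frequent types are needed to cover a given fraction of all phrases.

```python
# Frequency list copied from the output of F.phrase_type.freqList() above.
phrase_freq = (
    ('VP', 9259),
    ('CP', 8441),
    ('PP', 7418),
    ('NP', 5452),
    ('NegP', 1113),
    ('PrNP', 1012),
    ('PPrP', 951),
    ('AdvP', 439),
    ('AdjP', 396),
    ('IPrP', 176),
    ('InjP', 136),
    ('DPrP', 85),
    ('InrP', 17),
)


def types_needed(freq_list, threshold=0.9):
    """Return the most frequent values that together reach the threshold."""
    total = sum(count for _, count in freq_list)
    running = 0
    selected = []
    # Go through the values from most to least frequent
    for value, count in sorted(freq_list, key=lambda pair: -pair[1]):
        running += count
        selected.append(value)
        if running >= threshold * total:
            break
    return selected


covering = types_needed(phrase_freq, threshold=0.9)
print(covering)
```

With the counts above, the five most frequent types (VP, CP, PP, NP and NegP) cover just over 90% of the 34895 phrases.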

In [5]:
!head -n 10 {my_file('calap_plainx.txt')}


I_Kings
I_Kings 1
1:1 ܘ ܡܠܟܐ ܕܘܝܕ ܣܐܒ ܘ ܥܠ ܒ ܫܢܝܐ ܘ ܡܟܣܝܢ ܗܘܘ ܠܗ ܒ ܠܒܘܫܐ ܘ ܠܐ ܫܚܢ 
1:2 ܘ ܐܡܪܘ ܠܗ ܥܒܕܘܗܝ ܗܐ ܥܒܕܝܟ ܩܕܡܝܟ ܢܒܥܘܢ ܠ ܡܪܢ ܡܠܟܐ ܥܠܝܡܬܐ ܒܬܘܠܬܐ ܘ ܬܩܘܡ ܩܕܡ ܡܠܟܐ ܘ ܬܗܘܐ ܠܗ ܡܫܡܫܢܝܬܐ ܘ ܬܫܟܒ ܒ ܥܘܒܟ ܘ ܢܫܚܢ ܠ ܡܪܢ ܡܠܟܐ 
1:3 ܘ ܒܥܘ ܥܠܝܡܬܐ ܕ ܫܦܝܪܐ ܒ ܟܠܗ ܬܚܘܡܐ ܕ ܐܝܣܪܝܠ ܘ ܐܫܟܚܘ ܠ ܐܒܝܫܓ ܫܝܠܘܡܝܬܐ ܘ ܐܝܬܝܘܗ ܠ ܡܠܟܐ 
1:4 ܘ ܥܠܝܡܬܐ ܫܦܝܪܐ ܗܘܬ ܒ ܚܙܘܗ ܛܒ ܘ ܗܘܬ ܠ ܡܠܟܐ ܡܫܡܫܢܝܬܐ ܘ ܡܫܡܫܐ ܠܗ ܘ ܡܠܟܐ ܠܐ ܝܕܥܗ 
1:5 ܘ ܐܕܘܢܝܐ ܒܪ ܚܓܝܬ ܡܬܪܘܪܒ ܘ ܐܡܪ ܐܢܐ ܐܡܠܟ ܘ ܥܒܕ ܠܗ ܡܪܟܒܬܐ ܘ ܦܪܫܐ ܘ ܚܡܫܝܢ ܓܒܪܝܢ ܕ ܪܗܛܝܢ ܗܘܘ ܩܕܡܘܗܝ 
1:6 ܘ ܠܐ ܟܐܐ ܒܗ ܐܒܘܗܝ ܡܢ ܝܘܡܘܗܝ ܘ ܐܡܪ ܠܗ ܡܛܠ ܡܢܐ ܗܟܢܐ ܥܒܕ ܐܢܬ ܘ ܐܦ ܗܘ ܫܦܝܪ ܗܘܐ ܒ ܚܙܘܗ ܛܒ ܘ ܠܗ ܝܠܕܬ ܒܬܪ ܐܒܫܠܘܡ 
1:7 ܘ ܗܘܘ ܦܬܓܡܘܗܝ ܥܡ ܝܘܐܒ ܒܪ ܨܘܪܝܐ ܘ ܥܡ ܐܒܝܬܪ ܟܗܢܐ ܘ ܡܥܕܪܝܢ ܒܬܪ ܐܕܘܢܝܐ 
