# Words in the Psalms

In this simple notebook we collect every word in the Psalms and export the list to a CSV file.

The file has three columns, with one row per word:
* UTF-8 word
* ASCII transcription of the word (consonants only)
* English gloss of the lexeme
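
As a rough sketch of the target layout, the two sample rows printed later in this notebook can be written to an in-memory buffer with the standard `csv` module. The names `sample_rows`, `buffer` and `csv_text` are illustrative only and are not used in the cells below.

```python
import csv
import io

# Column names match the header used in the export cell.
header = ('utf8', 'transcription', 'gloss')

# Two rows copied from this notebook's printed examples (illustrative sample).
sample_rows = [
    ('אַ֥שְֽׁרֵי', '>CRJ', 'happiness'),
    ('הָ', 'H', 'the'),
]

# Write to an in-memory buffer instead of a file, just to inspect the layout.
buffer = io.StringIO(newline='')
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerows(sample_rows)

csv_text = buffer.getvalue()
print(csv_text)
```

Each tuple becomes one comma-separated line, with the header line first.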

In [1]:
import csv

# import text-fabric
from tf.fabric import Fabric

# get data processor
TF = Fabric(modules='hebrew/etcbc4c')

# load node features 
api = TF.load('''
              book chapter verse
              g_word_utf8 g_cons gloss
              ''')

# globalize TF methods
api.makeAvailableIn(globals())

This is Text-Fabric 2.3.7
Api reference : https://github.com/ETCBC/text-fabric/wiki/Api
Tutorial      : https://github.com/ETCBC/text-fabric/blob/master/docs/tutorial.ipynb
Data sources  : https://github.com/ETCBC/text-fabric-data
Data docs     : https://etcbc.github.io/text-fabric-data
Shebanq docs  : https://shebanq.ancient-data.org/text
Slack team    : https://shebanq.slack.com/signup
Questions? Ask shebanq@ancient-data.org for an invite to Slack
109 features found and 0 ignored
  0.00s loading features ...
   |     0.01s B book                 from /Users/Cody/github/text-fabric-data/hebrew/etcbc4c
   |     0.00s B chapter              from /Users/Cody/github/text-fabric-data/hebrew/etcbc4c
   |     0.01s B verse                from /Users/Cody/github/text-fabric-data/hebrew/etcbc4c
   |     0.15s B g_cons               from /Users/Cody/github/text-fabric-data/hebrew/etcbc4c
   |     0.19s B g_word_utf8          from /Users/Cody/github/text-fabric-data/hebrew/etcbc4c
   |     0.00s

In [2]:
# get the Psalms book node number
psalms = T.nodeFromSection(('Psalms',))

# get all word nodes in the Psalms
psalms_words = L.d(psalms, otype='word')

print(f'{len(psalms_words)} words loaded from the Psalms...')

25371 words loaded from the Psalms...


In [3]:
# get the word data

# list to hold rows
word_rows = []

# iterate, call features on words, and append to word_rows
for word in psalms_words:
    
    # get the lexeme node that contains this word (the gloss is stored on the lexeme)
    lex = L.u(word, otype='lex')[0] # L.u returns a 1-item tuple here; take the item by index
    
    # call features on the word nodes
    utf = F.g_word_utf8.v(word)
    transliterated = F.g_cons.v(word)
    gloss = F.gloss.v(lex) # looked up on the lexeme node, not on the word
    
    # gather into tuple
    word_data = (utf, transliterated, gloss)
    
    # append to rows list
    word_rows.append(word_data)

In [4]:
# example 1
print(word_rows[0])

print()

# example 2
print(word_rows[1])

('אַ֥שְֽׁרֵי', '>CRJ', 'happiness')

('הָ', 'H', 'the')


In [5]:
# export to csv

# header
header = ('utf8', 'transcription', 'gloss')

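# newline='' is what the csv module expects; utf-8 keeps the Hebrew column portable
with open('psalms_words.csv', 'w', newline='', encoding='utf-8') as file: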
    
    writer = csv.writer(file)
    
    writer.writerow(header)
    writer.writerows(word_rows)
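
To check the export, the file can be read back with `csv.reader`. The helper below is a minimal sketch and not part of the original notebook: `preview_csv` and its `limit` parameter are hypothetical names, and it assumes the file was written as UTF-8.

```python
import csv
from itertools import islice

def preview_csv(path, limit=3):
    """Return the first `limit` rows of a CSV file as lists of strings."""
    # newline='' lets the csv module handle line endings itself;
    # UTF-8 is an assumption, chosen because the first column holds Hebrew text.
    with open(path, newline='', encoding='utf-8') as file:
        return list(islice(csv.reader(file), limit))
```

After the export cell has run, `preview_csv('psalms_words.csv')` should return the header row followed by the first two word rows.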