# Qere and Ketiv Investigation

26th January 2017

While examining perfective verbs with pronominal suffixes in Accordance's ETCBC module, I noticed that the qere carry full morphological tagging while the ketiv do not. This notebook repeats the same investigation against the ETCBC data directly, via Text-Fabric (TF).

In [5]:
%load_ext autoreload
%autoreload 2

In [6]:
import sys, collections
from tf.fabric import Fabric
TF = Fabric(modules=['hebrew/etcbc4c'])

This is Text-Fabric 2.2.0
Api reference : https://github.com/ETCBC/text-fabric/wiki/Api
Tutorial      : https://github.com/ETCBC/text-fabric/blob/master/docs/tutorial.ipynb
Data sources  : https://github.com/ETCBC/text-fabric-data
Data docs     : https://etcbc.github.io/text-fabric-data
Shebanq docs  : https://shebanq.ancient-data.org/text
Slack team    : https://shebanq.slack.com/signup
Questions? Ask shebanq@ancient-data.org for an invite to Slack
108 features found and 0 ignored


Now load the lexeme, word, and qere features, along with gender, number, and person.

In [40]:
api = TF.load('''g_lex_utf8 qere_utf8 g_word_utf8
                 gn nu ps qere_trailer_utf8''')
api.makeAvailableIn(globals())

  0.00s loading features ...
   |     0.00s Feature overview: 102 nodes; 5 edges; 1 configs; 7 computeds
  0.10s All features loaded/computed - for details use loadLog()


Print the first seven words and check that their g_lex_utf8 values are pointed.

In [18]:
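indent(reset=True)
# Slots (word nodes) are numbered from 1; print the first seven.
first_slots = range(1, 8)
for n in first_slots:
    # First value is the lexeme form, second is the word as it stands in the text.
    print('Lex {} word {}'.format(F.g_lex_utf8.v(n), F.g_word_utf8.v(n)))

Lex בְּ word בְּ
Lex רֵאשִׁית word רֵאשִׁ֖ית
Lex בָּרָא word בָּרָ֣א
Lex אֱלֹה word אֱלֹהִ֑ים
Lex אֵת word אֵ֥ת
Lex הַ word הַ
Lex שָּׁמַי word שָּׁמַ֖יִם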
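
A visual check like the one above can also be automated. The following is a minimal sketch and not part of the ETCBC data or the Text-Fabric API: the helper names `has_vowel_points` and `strip_accents` are hypothetical, and the code point ranges are taken from the Unicode Hebrew block (vowel points U+05B0 to U+05BB plus U+05C7, cantillation accents U+0591 to U+05AF). The shin and sin dots are deliberately not counted as vowel points, on the assumption that otherwise unpointed consonantal text may still carry them.

```python
# Hypothetical helpers; code point ranges come from the Unicode Hebrew block.
VOWEL_POINTS = set(range(0x05B0, 0x05BC)) | {0x05C7}  # sheva .. qubuts, qamats qatan
ACCENTS = set(range(0x0591, 0x05B0))                  # cantillation marks


def has_vowel_points(text):
    """Return True if text contains at least one Hebrew vowel point."""
    return any(ord(ch) in VOWEL_POINTS for ch in text)


def strip_accents(text):
    """Return text with cantillation accents removed and vowel points kept."""
    return ''.join(ch for ch in text if ord(ch) not in ACCENTS)


# bet + sheva: a pointed form
print(has_vowel_points('\u05d1\u05b0'))
# alef, he, lamed, he: consonants only
print(has_vowel_points('\u05d0\u05d4\u05dc\u05d4'))
# resh + tsere + tipcha: stripping the accent leaves resh + tsere
print(strip_accents('\u05e8\u05b5\u0596') == '\u05e8\u05b5')
```

With the features loaded, the same functions could be applied to the value of `F.g_lex_utf8.v(n)` or `F.g_word_utf8.v(n)` for each slot `n`.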


## Finding qere and ketiv
The output below shows that the qere are pointed while the ketiv are not. I do not know why the Accordance module shows morphology only on the qere, but I imagine it is simply how they set up the data in their module. In any case, the code below dumps every ketiv and qere together with basic morphology (gender, number, and person).

In [45]:
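indent(reset=True)

def get_morph(n):
    # Gender, number and person of slot n, separated by spaces.
    return ' '.join((F.gn.v(n), F.nu.v(n), F.ps.v(n)))

fmt = "{:10s}\t{:20s}\t{:20s}"
i = 0  # number of qere found
ms = F.otype.maxSlot
print(fmt.format('Ketiv', 'Qere', 'Morphology'))
# Slots run from 1 to maxSlot inclusive, so the upper bound is ms + 1.
for n in range(1, ms + 1):
    qere = F.qere_utf8.v(n)
    # Words without a qere reading return None for this feature.
    if qere is not None:
        ketiv = F.g_word_utf8.v(n)
        print(fmt.format(ketiv, qere, get_morph(n)))
        i += 1
info('Done and found {} qere'.format(i))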


Ketiv     	Qere                	Morphology          
הוצא      	הַיְצֵ֣א            	m sg p2             
אהלה      	אָהֳלֹֽו            	m sg NA             
אהלה      	אָהֳלֹ֑ו            	m sg NA             
אהלה      	אָֽהֳלֹו֙           	m sg NA             
צביים     	צְבֹויִ֔ם           	unknown sg NA       
צביים     	צְבֹויִ֔ם           	unknown sg NA       
ו         	וַ                  	NA NA NA            
יישׂם     	יּוּשַׂ֤ם            	m sg p3             
גיים      	גֹויִם֙             	m pl NA             
צידה      	צָֽיִד              	m sg NA             
ו         	וְ                  	NA NA NA            
ישׁתחו    	יִֽשְׁתַּחֲו֤וּ      	m pl p3             
ב         	בָּ֣א               	m sg p3             
גד        	גָ֑ד                	m sg NA             
צוארו     	צַוָּארָ֖יו         	m pl NA             
אהלה      	אָֽהֳלֹ֔ו           	m sg NA             
יעישׁ     	יְע֥וּשׁ             	m sg NA             
יעישׁ     	יְע֥וּשׁ             	m sg NA       