# Greek and Hebrew Wordsearch in Jupyter

19th March 2017

This is a test to see if we can use the wordsearch generator and render the result inline in a notebook.

Please follow this yellow-brick road carefully: the error handling is not up to snuff yet. Mismatching things, such as a Greek verse reference with a Hebrew corpus designation, might produce interesting results. Do not expect repeated generations of a wordsearch for the same verses to yield identical grids: the layout algorithm makes random selections, and apparently the word list sorts are unstable. You may occasionally need to restart the notebook server to get it to reload state. I am not sure what is going on there yet.
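A note on the varying order: Python's built-in sort is itself stable, but the cells below hand `WordSearch` a `set`, and the iteration order of a set of strings can change from one interpreter session to the next because of hash randomization. If the word list is then sorted by length alone (an assumption about the library's internals), words of equal length would come out in a different order each session. The sketch below shows one way to pin the order down. `stable_word_order` is a hypothetical helper, not part of `puzzles`, and seeding only helps if the layout code draws from the global `random` module, which is also an assumption.

```python
import random


def stable_word_order(words):
    """Deduplicate, then sort longest first with ties broken by code point order.

    Hypothetical helper: the full sort key removes any dependence on set
    iteration order.
    """
    return sorted(set(words), key=lambda w: (-len(w), w))


# Only effective if the layout algorithm uses the global `random` module (assumption).
random.seed(2017)

words = ['καὶ', 'τῆς', 'λόγου', 'ἐν', 'τῶν', 'καὶ']

# The same words in a different input order give the same result.
print(stable_word_order(words) == stable_word_order(reversed(words)))
```

Setting the `PYTHONHASHSEED` environment variable before starting the notebook server is another way to make set iteration order repeatable across sessions.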

In [1]:
%load_ext autoreload
%autoreload 2

Import the necessary modules. In this case I have set PYTHONPATH to include the directories where these packages are located.

In [2]:
from bibleutils.versification import convert_refs, expand_refs, parse_refs, \
                                     ReferenceFormID
from puzzles.core.etcbc import Corpus, get_words
from puzzles.wordsearch.wordsearch import WordSearch

from IPython.display import HTML, display
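If setting PYTHONPATH is inconvenient, the search path can also be extended from inside the notebook, in a cell that runs before the imports above. This is only a sketch: the directory names in `PACKAGE_DIRS` are hypothetical placeholders and need to be replaced with wherever `bibleutils` and `puzzles` actually live.

```python
import sys
from pathlib import Path

# Hypothetical locations; substitute the real parent directories of the packages.
PACKAGE_DIRS = [
    Path.home() / 'src' / 'bibleutils-checkout',
    Path.home() / 'src' / 'puzzles-checkout',
]


def add_to_sys_path(dirs):
    """Prepend each directory to sys.path unless it is already present."""
    for d in dirs:
        entry = str(d)
        if entry not in sys.path:
            sys.path.insert(0, entry)


add_to_sys_path(PACKAGE_DIRS)
```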

## Greek Wordsearch

Now generate a wordsearch grid with words drawn from a couple of verses of Luke.

In [3]:
refs = expand_refs(convert_refs(parse_refs('Luke 1:2,5', form='ETCBCG'),
                   ReferenceFormID.ETCBCG))
        
# Get the list of words from TF
# Build one (book, chapter, verse) tuple per expanded reference
sections = [(r.st_book, r.st_ch, r.st_vs) for r in refs]
word_list = get_words(*sections, work=Corpus.GREEK)

# Note that the rows and columns values (here both 10) are ignored by the current placement algorithm
ws = WordSearch(set(word_list), 10, 10, WordSearch.LTR)

Dump a few basic statistics to show things are as expected.

In [4]:
print(f'stats={ws._stat_map}, number of words={len(ws._words)}, extents: right={ws._right}, bottom={ws._bottom}')

stats={'placedR': 12, 'placedRD': 21, 'placedD': 4}, number of words=37, extents: right=13, bottom=14
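One quick way to read these numbers, assuming each key of `_stat_map` counts the words placed in one direction (the key names suggest right, right-down and down, but that is a guess), is to check that the counts add up to the number of words. The values below are copied from the output above.

```python
# Values copied from the printed output; the meaning of the keys is an assumption.
stat_map = {'placedR': 12, 'placedRD': 21, 'placedD': 4}
n_words = 37

placed = sum(stat_map.values())
assert placed == n_words, f'{n_words - placed} word(s) were not placed'
print(f'placed {placed} of {n_words} words')
# prints: placed 37 of 37 words
```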


Now display the grid.

In [5]:
display(HTML(ws.get_grid('html')))

0,1,2,3,4,5,6,7,8,9,10,11,12,13
π,α,ρ,έ,δ,ο,σ,α,ν,κ,α,θ,ὼ,ς
ἐ,γ,ῴ,Ζ,Ἰ,ο,υ,δ,α,ί,α,ς,π,ἀ
θ,φ,ε,β,α,σ,ι,λ,έ,ω,ς,ὶ,ἐ,ξ
υ,Ἐ,η,ν,ἡ,χ,ὀ,Ἡ,α,λ,ό,γ,ο,υ
γ,α,λ,μ,ό,μ,α,ν,ρ,ὐ,τ,ο,ῦ,ί
α,ὐ,ὑ,ι,ε,μ,έ,ρ,ό,ῴ,τ,α,ῖ,ς
τ,τ,Ἐ,π,σ,ρ,ε,ρ,ί,μ,δ,ῆ,ἀ,π
έ,ό,ἱ,γ,η,ά,ί,ν,α,α,α,ο,ς,ί
ρ,π,Ἀ,ε,έ,ρ,β,α,ο,ι,ς,τ,υ,ὸ
ω,τ,ὄ,α,ρ,ν,έ,ε,ς,ι,ς,ῦ,ι,Ζ


Now display the list of words in the grid.

In [6]:
ws.get_word_list()

['παρέδοσαν',
 'ἐφημερίας',
 'γενόμενοι',
 'θυγατέρων',
 'Ἐλισάβετ',
 'βασιλέως',
 'αὐτόπται',
 'Ἰουδαίας',
 'Ζαχαρίας',
 'ὑπηρέται',
 'Ἐγένετο',
 'ἡμέραις',
 'ὀνόματι',
 'ἱερεύς',
 'Ἡρῴδου',
 'Ἀαρών',
 'αὐτῆς',
 'ὄνομα',
 'λόγου',
 'καθὼς',
 'ἀρχῆς',
 'ταῖς',
 'αὐτῷ',
 'γυνὴ',
 'ἡμῖν',
 'Ἀβιά',
 'τῶν',
 'καὶ',
 'τῆς',
 'τοῦ',
 'τις',
 'ἐκ',
 'ἐν',
 'οἱ',
 'τὸ',
 'ἀπ',
 'ἐξ']

In [7]:
len(ws.get_word_list())

37

## Hebrew Wordsearch

In [8]:
refs = expand_refs(convert_refs(parse_refs('Genesis 1:1-3', form='ETCBCH'),
                   ReferenceFormID.ETCBCH))
        
# Get the list of words from TF
# Build one (book, chapter, verse) tuple per expanded reference
sections = [(r.st_book, r.st_ch, r.st_vs) for r in refs]
word_list = get_words(*sections, work=Corpus.HEBREW)

# Note that the rows and columns values (here both 10) are ignored by the current placement algorithm
ws = WordSearch(set(word_list), 10, 10, WordSearch.RTL)

Dump a few basic statistics to show things are as expected.

In [9]:
print(f'stats={ws._stat_map}, number of words={len(ws._words)}, extents: right={ws._right}, bottom={ws._bottom}')

stats={'placedD': 12, 'placedL': 4, 'placedLD': 16}, number of words=32, extents: right=0, bottom=10


Now display the grid.

In [10]:
display(HTML(ws.get_grid('html')))

0,1,2,3,4,5,6,7,8,9
יִ,ר,ו,אֹֽ,ם,י,הִ֔,לֹ,אֱ,רֵ
עַ,בְּ,ה,תָ֥,יְ,הָ,תְ,לֹ,מְ,א
פֶ,וַֽ,מָּֽ,אָ֗,בָּ,הֹ֑,הִ֖,אֱ,רַ,שִׁ֖
שָּׁ,יִ,רֶ,רָ֣,ו,י,לֹ,הַ,חֶ֖,י
ם,ץ,א,ם,ם,הִ֑,יֹּ֥,שָּׁ,פֶ,ת
לֹ,וַ,תֹ֨,פְּ,י,יְ,א,מַ֖,ת,פְּ
ם,ה,נֵ֣,ם,הִ,אֹ֑,מֶ,יִ,אָֽ,נֵ֥
וּ֙,י,יְ,י,חֹ֖,ו,ר,ם,רֶ,י
תְ,הִ֣,וְ,עַ,שֶׁ,ר,אֵ֥,בֹ֔,ץ,ר֣
י,וּ֙,ךְ,ל,ךְ,ת,ה,אָֽ,וָ,וּ


You will notice that in the Hebrew case every difference in pointing is treated as a different word, so the same word with a different accent is counted twice. Obviously that is not desirable and I will fix it.
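One possible approach, which is not necessarily the fix that will end up in the generator, is to strip the combining marks from each word before deduplicating. The sketch below uses only the standard library. `strip_pointing` is a hypothetical helper name, and the two sample strings are the same consonants and vowels carrying two different accents.

```python
import unicodedata


def strip_pointing(word: str) -> str:
    """Return `word` without nonspacing combining marks (Unicode category Mn).

    For Hebrew this drops vowel points and cantillation accents. It also drops
    the shin/sin dots, so those two letters become indistinguishable.
    """
    decomposed = unicodedata.normalize('NFD', word)
    return ''.join(ch for ch in decomposed if unicodedata.category(ch) != 'Mn')


# Same letters and vowels, different accent (U+0591 versus U+0594).
a = '\u05D0\u05B1\u05DC\u05B9\u05D4\u05B4\u0591\u05D9\u05DD'
b = '\u05D0\u05B1\u05DC\u05B9\u05D4\u05B4\u0594\u05D9\u05DD'

print(len({a, b}), len({strip_pointing(a), strip_pointing(b)}))
# prints: 2 1
```

Applied to the cell above, this would mean building the set from `strip_pointing(w)` for each `w` in `word_list` before passing it to `WordSearch`, at the cost of an unpointed grid. Stripping only the accents while keeping the vowels would need a narrower filter on the code point range.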

But that's it for now!