# Ecce - "Behold"

## Table of Contents

1. English Standard Version
2. Nave's Topical Index
   - Data Prep
   - Model Prediction
   - Result Mapping
3. Treasury of Scripture Knowledge
   - Data Prep
   - Model Prediction
4. Ecce Model

In [1]:
from glob import glob
from ecce.utils import *

## English Standard Version

In [2]:
glob('ecce/data/ESV*')

['ecce/data/ESV.json']

In [3]:
import ecce.esv as esv
import ecce.reference as reference

list(esv.flattened_verses())[0:2]

INFO:root:Loading ESV JSON...


[('Judges',
  '11',
  '24',
  'Will you not possess what Chemosh your god gives you to possess? And all that the LORD our God has dispossessed before us, we will possess.'),
 ('Judges',
  '11',
  '25',
  'Now are you any better than Balak the son of Zippor, king of Moab? Did he ever contend against Israel, or did he ever go to war with them?')]

In [4]:
esv.text(reference.init('Genesis', 1, 1))

'In the beginning, God created the heavens and the earth.'

## Nave's Topical Index

In [5]:
glob('ecce/data/nave/[atsnrc]*')

['ecce/data/nave/sample_sql.txt',
 'ecce/data/nave/altered_nave.dat',
 'ecce/data/nave/topicxref.txt',
 'ecce/data/nave/subtopics.txt',
 'ecce/data/nave/topics.txt',
 'ecce/data/nave/nave.dat',
 'ecce/data/nave/readme.txt',
 'ecce/data/nave/categories.txt']

In [6]:
import ecce.nave as nave
import ecce.model.nave.data as nave_data

nave.df().head()

|   | topic_key | category_key | subtopic_key | subtopic_text | sort_order_subtopic | source_topic_key | reference_list | topic_name | category_text | sort_order_category |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -2146356859 | -2108238392 | 0 | General reference(s) to this category | 0 | $$T0001327 | Ex32:19,25 | DANCING | Idolatrous | 3.0 |
| 1 | -2146356859 | -1508129049 | 0 | General reference(s) to this category | 0 | $$T0001327 | Mt14:6; Mr6:22 | DANCING | Herodias dances in the presence of Herod (Anti... | 2.0 |
| 2 | -2146356859 | 696243684 | 0 | General reference(s) to this category | 0 | $$T0001327 | Ex15:20; Ex32:19; Jud11:34; Jud21:19-21; 1Sa18... | DANCING | General scriptures concerning | 1.0 |
| 3 | -2145736290 | -148185818 | 0 | General reference(s) to this category | 0 | $$T0001234 | Le11:17; De14:17; Isa34:11; Zep2:14 | CORMORANT | A bird forbidden as food | 1.0 |
| 4 | -2145523250 | 1308325410 | 0 | General reference(s) to this category | 0 | $$T0000957 | Ac8:27 | CANDACE | Queen of Ethiopia | 1.0 |


In [7]:
nave.df().at[3, 'reference_list']

'Le11:17; De14:17; Isa34:11; Zep2:14'

In [8]:
nave.parse(nave.df().at[3, 'reference_list'])

[Reference(book='Leviticus', chapter=11, verse=17),
 Reference(book='Deuteronomy', chapter=14, verse=17),
 Reference(book='Isaiah', chapter=34, verse=11),
 Reference(book='Zephaniah', chapter=2, verse=14)]

In [9]:
nave.init()[1000:1002]

[(Reference(book='1 Corinthians', chapter=9, verse=13),
  {'topic_key': -2054210176,
   'category_key': -529042571,
   'subtopic_key': 1255183917,
   'subtopic_text': 'In supporting himself',
   'sort_order_subtopic': 2,
   'source_topic_key': '$$T0001742',
   'reference_list': '1Co9:7-23',
   'topic_name': 'EVIL',
   'category_text': 'INSTANCES OF',
   'sort_order_category': 2.0}),
 (Reference(book='1 Corinthians', chapter=9, verse=14),
  {'topic_key': -2054210176,
   'category_key': -529042571,
   'subtopic_key': 1255183917,
   'subtopic_text': 'In supporting himself',
   'sort_order_subtopic': 2,
   'source_topic_key': '$$T0001742',
   'reference_list': '1Co9:7-23',
   'topic_name': 'EVIL',
   'category_text': 'INSTANCES OF',
   'sort_order_category': 2.0})]

In [10]:
nave_data.frame().iloc[1000:1010]



|   | book | chapter | verse | topics | text |
|---|---|---|---|---|---|
| 1004 | 1 Corinthians | 3 | 18 | [PRIDE, HUMILITY, WISDOM, PARADOX] | Let no one deceive himself. If anyone among yo... |
| 1005 | 1 Corinthians | 3 | 19 | [QUOTATIONS, ALLUSIONS, WISDOM, IGNORANCE] | For the wisdom of this world is folly with God... |
| 1006 | 1 Corinthians | 3 | 20 | [WISDOM, ALLUSIONS, QUOTATIONS, VANITY, GOD, H... | and again, "The Lord knows the thoughts of the... |
| 1007 | 1 Corinthians | 3 | 21 | [CONFIDENCE, Christian, RIGHTEOUS, DEATH, MINI... | So let no one boast in men. For all things are... |
| 1008 | 1 Corinthians | 3 | 22 | [DEATH, GOD, RIGHTEOUS] | whether Paul or Apollos or Cephas or the world... |
| 1009 | 1 Corinthians | 3 | 23 | [JESUS, RIGHTEOUS, DEATH, THE CHRIST, GOD] | and you are Christ's, and Christ is God's. |
| 1010 | 1 Corinthians | 4 | 1 | [ZEAL, Christian, STEWARD, MINISTER, SERVANT] | This is how one should regard us, as servants ... |
| 1011 | 1 Corinthians | 4 | 2 | [FAITHFULNESS, ZEAL, Christian, STEWARD, MINIS... | Moreover, it is required of stewards that they... |
| 1012 | 1 Corinthians | 4 | 3 | [ZEAL] | But with me it is a very small thing that I sh... |
| 1013 | 1 Corinthians | 4 | 4 | [JESUS, THE CHRIST, ZEAL] | I am not aware of anything against myself, but... |


In [11]:
from funcy import flatten, compose

topic_count = compose(len, set, flatten, attr('topics'))

raw_length = topic_count(nave_data.frame())
filtered_length = topic_count(nave_data.filtered_frame(min_per_topic=30))

print(f'Filtering topics (verses >= 30) changes count from {raw_length} to {filtered_length}')

Filtering topics (verses >= 30) changes count from 4398 to 853
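The filtering step above can be pictured with plain Python. This is only a sketch: `filter_topics`, `topic_count`, and the tiny `verse_topics` list are hypothetical stand-ins, not the code behind `nave_data.filtered_frame`, and it assumes a topic is kept when it is attached to at least `min_per_topic` verses.

```python
from collections import Counter

# Hypothetical miniature of the verse/topic frame: one topic list per verse.
verse_topics = [
    ["GOD", "WISDOM"],
    ["GOD", "PRIDE"],
    ["GOD", "WISDOM", "ZEAL"],
]


def filter_topics(rows, min_per_topic):
    """Drop topics attached to fewer than `min_per_topic` verses."""
    # Count each topic at most once per verse.
    counts = Counter(topic for topics in rows for topic in set(topics))
    keep = {topic for topic, n in counts.items() if n >= min_per_topic}
    return [[t for t in topics if t in keep] for topics in rows]


def topic_count(rows):
    """Number of distinct topics across all verses."""
    return len({topic for topics in rows for topic in topics})


filtered = filter_topics(verse_topics, min_per_topic=2)
print(topic_count(verse_topics), "->", topic_count(filtered))
```

In this toy input only `GOD` and `WISDOM` appear in two or more verses, so the distinct topic count drops from 4 to 2.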


In [12]:
nave_data.print_topic_graph()

# Topics per Verse
###################################################################
████████████                                        1011  1 topic  
██████████████████████████                          2192  2 topics 
███████████████████████████████████████             3238  3 topics 
████████████████████████████████████████████████    3966  4 topics 
██████████████████████████████████████████████████  4063  5 topics 
██████████████████████████████████████████████      3755  6 topics 
███████████████████████████████████████             3176  7 topics 
████████████████████████████████                    2631  8 topics 
███████████████████████                             1944  9 topics 
█████████████████                                   1443  10 topics
████████████                                        1033  11 topics
█████████                                            778  12 topics
██████                                               536  13 topics
████                         
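A text histogram like the one above can be approximated in a few lines. The sketch below is an assumption about the general technique, not the code behind `print_topic_graph`: `topics_per_verse` and `bar_chart` are hypothetical helpers, and bars are scaled so that the largest bucket fills the full width, rounding down to whole cells.

```python
from collections import Counter


def topics_per_verse(rows):
    """Map 'number of distinct topics' to 'number of verses with that many'."""
    return Counter(len(set(topics)) for topics in rows)


def bar_chart(counts, width=50):
    """Return one line per bucket; the largest bucket fills `width` cells."""
    largest = max(counts.values())
    lines = []
    for size in sorted(counts):
        count = counts[size]
        bar = "█" * (count * width // largest)  # floor to whole cells
        lines.append(f"{bar:<{width}}  {count:>4}  {size} topic(s)")
    return lines


# Hypothetical miniature input: one topic list per verse.
rows = [["GOD"], ["GOD", "ZEAL"], ["WISDOM", "PRIDE"], ["JESUS", "GOD", "ZEAL"]]
for line in bar_chart(topics_per_verse(rows), width=10):
    print(line)
```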

In [13]:
nave_data.verse_counts().sort_values(by='verse_count', ascending=False).iloc[0:15]

|   | topic_name | verse_count |
|---|---|---|
| 3536 | GOD | 6047 |
| 3968 | JESUS | 5069 |
| 3137 | THE CHRIST | 5069 |
| 836 | ISRAEL | 5008 |
| 1062 | PSALMS | 2738 |
| 3637 | DAVID | 2336 |
| 1938 | MINISTER | 2220 |
| 3679 | CHURCH | 2198 |
| 2799 | Christian | 2155 |
| 4079 | ADVERSITIES | 1856 |


In [14]:
nave_data.topic_counts().sort_values(by='topic_count', ascending=False).iloc[0:10]

|   | verse | topic_count |
|---|---|---|
| 24276 | Matthew 27:24 | 27 |
| 24257 | Matthew 27:5 | 27 |
| 7036 | Acts 26:18 | 27 |
| 24692 | Nehemiah 8:7 | 26 |
| 3748 | 2 Chronicles 17:8 | 26 |
| 24689 | Nehemiah 8:4 | 26 |
| 24256 | Matthew 27:4 | 26 |
| 24255 | Matthew 27:3 | 25 |
| 5066 | 2 Kings 19:37 | 24 |
| 5127 | 2 Kings 22:14 | 23 |


In [15]:
esv.text(reference.init('Matthew', 27, 24))

'So when Pilate saw that he was gaining nothing, but rather that a riot was beginning, he took water and washed his hands before the crowd, saying, "I am innocent of this man\'s blood; see to it yourselves."'

In [16]:
print({d['topic_name'] for d in nave.by_reference()['Matthew'][27][24]})

{'PURIFICATION', 'HYPOCRISY', 'WASHING', 'HAND', 'COURT', 'INNOCENCY', 'BLOOD', 'PILATE, PONTIUS', 'HOMICIDE', 'OPINION, PUBLIC', 'GOVERNMENT', 'JESUS, THE CHRIST', 'RESPONSIBILITY', 'POLITICS', 'PRAYER', 'CAPERNAUM', 'DEMAGOGISM', 'BARABBAS', 'JUDGE', 'PRISONERS', 'COMPLICITY', 'VERDICT', 'ABLUTION', 'MONTH', 'RULERS'}


In [17]:
nave.by_topic_nodes().head()

|   | id | label | reference_count |
|---|---|---|---|
| 0 | tpc:-2146356859 | DANCING | 28 |
| 1 | tpc:-2145736290 | CORMORANT | 4 |
| 2 | tpc:-2145523250 | CANDACE | 1 |
| 3 | tpc:-2145052645 | CHRONOLOGY | 1 |
| 4 | tpc:-2144306684 | GINNETHON | 3 |


In [18]:
nave.by_topic()['CORMORANT']['A bird forbidden as food']['General reference(s) to this category']

[Passage(name='Leviticus 11:17', references=[Reference(book='Leviticus', chapter=11, verse=17)], text='the little owl, the cormorant, the short-eared owl,'),
 Passage(name='Deuteronomy 14:17', references=[Reference(book='Deuteronomy', chapter=14, verse=17)], text='and the tawny owl, the carrion vulture and the cormorant,'),
 Passage(name='Isaiah 34:11', references=[Reference(book='Isaiah', chapter=34, verse=11)], text='But the hawk and the porcupine shall possess it, the owl and the raven shall dwell in it. He shall stretch the line of confusion over it, and the plumb line of emptiness.'),
 Passage(name='Zephaniah 2:14', references=[Reference(book='Zephaniah', chapter=2, verse=14)], text='Herds shall lie down in her midst, all kinds of beasts; even the owl and the hedgehog shall lodge in her capitals; a voice shall hoot in the window; devastation will be on the threshold; for her cedar work will be laid bare.')]

In [19]:
import ecce.passage as passage

nave.topics_frame(passage.init([reference.init('Leviticus', 11, 17)]))

|   | id | label | reference_count | references |
|---|---|---|---|---|
| 1 | tpc:-2145736290 | CORMORANT | 4 | ((Deuteronomy, 14, 17), (Isaiah, 34, 11), (Zep... |
| 17 | tpc:-2134662781 | OWL | 12 | ((Isaiah, 34, 13), (Leviticus, 11, 17), (Isaia... |
| 295 | tpc:-1883475154 | FOOD | 111 | ((1 Samuel, 25, 18), (Romans, 14, 14), (Leviti... |
| 736 | tpc:-1469473955 | SANITATION | 545 | ((Deuteronomy, 14, 26), (Leviticus, 13, 39), (... |
| 2286 | tpc:-46757338 | BIRDS | 138 | ((Jeremiah, 5, 28), (Genesis, 1, 24), (Jeremia... |
| 3499 | tpc:1093495174 | UNCLEAN | 76 | ((Deuteronomy, 14, 13), (Leviticus, 11, 26), (... |
| 4141 | tpc:1688371376 | ANIMALS | 648 | ((Exodus, 9, 5), (Numbers, 22, 24), (Job, 40, ... |


### Data Prep

In [20]:
nave.extract_topics_of(dict(topic_name='JESUS, THE CHRIST'))

['JESUS', 'THE CHRIST']

In [21]:
nave_data.tokenize(['In the beginning, God created the heavens and the earth.'])

<1x13337 sparse matrix of type '<class 'numpy.int64'>'
	with 8 stored elements in Compressed Sparse Row format>
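The sparse row above has 8 stored elements, which matches the number of distinct lowercase words of two or more letters in the verse. The sketch below reproduces that count with a plain bag-of-words. It is an illustration only: `bag_of_words` and `vectorize` are hypothetical helpers, the token rule is an assumption, and the toy vocabulary is built from this one verse rather than the 13337-column vocabulary used by `nave_data.tokenize`.

```python
import re
from collections import Counter

# Assumed token rule: lowercase words of two or more characters.
TOKEN = re.compile(r"\b\w\w+\b")


def bag_of_words(text):
    """Count each token in the text."""
    return Counter(TOKEN.findall(text.lower()))


def vectorize(text, vocabulary):
    """Return a sparse row as {column index: count} for known tokens."""
    counts = bag_of_words(text)
    return {vocabulary[t]: n for t, n in counts.items() if t in vocabulary}


verse = "In the beginning, God created the heavens and the earth."
# Toy vocabulary built from the verse itself, one column per distinct token.
vocabulary = {token: i for i, token in enumerate(sorted(bag_of_words(verse)))}
row = vectorize(verse, vocabulary)
print(len(row))  # number of stored (non-zero) elements
```

Here "the" occurs three times but occupies a single column, which is why the row stores 8 elements for a 10-word verse.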

In [22]:
train_text, test_text, train_topics, test_topics = nave_data.data_split()

print('train_text', train_text.shape)
print('train_topics', train_topics.shape)
print('test_text', test_text.shape)
print('test_topics', test_topics.shape)

train_text (24515, 13337)
train_topics (24515, 853)
test_text (6129, 13337)
test_topics (6129, 853)
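The shapes above are consistent with an 80/20 split: 6129 of the 30644 rows are held out, and each row of the topic matrix has one column per retained topic. A minimal sketch of that preparation follows, assuming a multi-hot label encoding and a shuffled split. `multi_hot` and `split` are hypothetical helpers and are not taken from `nave_data.data_split`.

```python
import math
import random


def multi_hot(rows):
    """Encode topic lists as 0/1 vectors over a sorted topic vocabulary."""
    labels = sorted({topic for topics in rows for topic in topics})
    column = {label: i for i, label in enumerate(labels)}
    matrix = []
    for topics in rows:
        vector = [0] * len(labels)
        for topic in topics:
            vector[column[topic]] = 1
        matrix.append(vector)
    return labels, matrix


def split(rows, test_fraction=0.2, seed=0):
    """Shuffle row indices reproducibly and hold out `test_fraction` of them."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    n_test = math.ceil(len(rows) * test_fraction)
    test = [rows[i] for i in order[:n_test]]
    train = [rows[i] for i in order[n_test:]]
    return train, test


# Hypothetical miniature input: one topic list per verse.
verse_topics = [["GOD"], ["GOD", "ZEAL"], ["WISDOM"], ["PRIDE", "WISDOM"], ["ZEAL"]]
labels, matrix = multi_hot(verse_topics)
train, test = split(matrix)
print(labels)
print(len(train), len(test))
```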


### Model Prediction

In [23]:
from ecce.model.nave.model import NaveModel

nm = NaveModel()
nm.load_weights('ecce/data/checkpoints/nave-4576e8.hdf5')
nm.predict('What is man that you are mindful of him?', threshold=0.02)

Using TensorFlow backend.


[TopicResult(probability=0.1233959, id='tpc:1494754856', label='QUOTATIONS AND ALLUSIONS'),
 TopicResult(probability=0.08785395, id='tpc:1487961918', label='JOB'),
 TopicResult(probability=0.071867436, id='tpc:-175446433', label='PSALMS'),
 TopicResult(probability=0.04190604, id='tpc:-817545140', label='MAN'),
 TopicResult(probability=0.04133786, id='tpc:-1808138860', label='PRAISE'),
 TopicResult(probability=0.032243002, id='tpc:-1481495806', label='PROPHECY'),
 TopicResult(probability=0.024686696, id='tpc:1455684257', label='CONDESCENSION OF GOD'),
 TopicResult(probability=0.02251567, id='tpc:-1195302428', label='SIN (1)'),
 TopicResult(probability=0.021504914, id='tpc:1324765133', label='RELIGION')]
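The `threshold` argument suggests that per-topic probabilities are filtered and then ranked. The sketch below shows that idea in isolation. `TopicScore` and `select_topics` are hypothetical names, the inclusive comparison is an assumption, and the `DANCING` score is invented for illustration (the other three are rounded from the output above).

```python
from typing import NamedTuple


class TopicScore(NamedTuple):
    probability: float
    label: str


def select_topics(probabilities, labels, threshold):
    """Keep topics scoring at least `threshold`, highest probability first."""
    scored = [
        TopicScore(p, label)
        for p, label in zip(probabilities, labels)
        if p >= threshold
    ]
    return sorted(scored, key=lambda s: s.probability, reverse=True)


labels = ["RELIGION", "JOB", "MAN", "DANCING"]
probabilities = [0.0215, 0.0879, 0.0419, 0.0004]  # DANCING value is made up
for score in select_topics(probabilities, labels, threshold=0.02):
    print(score)
```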

In [24]:
nm.evaluate()

INFO:root:Splitting train/val/test data...


Evaluating...
Accuracy 13.61%


### Result Mapping

In [25]:
nave.topics_matching_extracted('SIMON')

|   | id | label | reference_count | simple_label |
|---|---|---|---|---|
| 2780 | tpc:438585048 | SIMON | 40 | SIMON |
| 4432 | tpc:1955587128 | PARSIMONY (STINGINESS) | 13 | PARSIMONY STINGINESS |
| 4337 | tpc:1868692075 | SIMONY | 2 | SIMONY |


In [26]:
nave.best_match_topic_for('JESUS')

{'id': 'tpc:-314322582',
 'label': 'JESUS, THE CHRIST',
 'reference_count': 10151,
 'simple_label': 'JESUS, THE CHRIST'}
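The two lookups above behave like a substring search over simplified labels, ordered by `reference_count`. The sketch below reproduces that behaviour on a handful of rows copied from the outputs in this notebook. The helper names (`simplify`, `topics_matching`, `best_match`) and the ranking rule are assumptions, not the implementation in `ecce.nave`.

```python
import re

# A few rows copied from the outputs in this notebook.
TOPICS = [
    {"label": "SIMONY", "reference_count": 2},
    {"label": "SIMON", "reference_count": 40},
    {"label": "PARSIMONY (STINGINESS)", "reference_count": 13},
    {"label": "DANCING", "reference_count": 28},
]


def simplify(label):
    """Strip parentheses so 'PARSIMONY (STINGINESS)' becomes 'PARSIMONY STINGINESS'."""
    return re.sub(r"[()]", "", label)


def topics_matching(name, topics):
    """Topics whose simplified label contains `name`, most referenced first."""
    hits = [
        dict(topic, simple_label=simplify(topic["label"]))
        for topic in topics
        if name in simplify(topic["label"])
    ]
    return sorted(hits, key=lambda t: t["reference_count"], reverse=True)


def best_match(name, topics):
    """The most referenced matching topic, or None when nothing matches."""
    hits = topics_matching(name, topics)
    return hits[0] if hits else None


print([t["simple_label"] for t in topics_matching("SIMON", TOPICS)])
print(best_match("SIMON", TOPICS)["label"])
```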

## Treasury of Scripture Knowledge

In [27]:
glob('ecce/data/tsk/*')

['ecce/data/tsk/tskxref.txt',
 'ecce/data/tsk/parsed.csv',
 'ecce/data/tsk/readme.txt']

In [28]:
import ecce.tsk as tsk
import ecce.model.tsk.data as tsk_data

tsk.df().head()

|   | book | chapter | verse | sort_order | phrase | reference_list |
|---|---|---|---|---|---|---|
| 0 | Genesis | 1 | 1 | 2 | beginning | pr 8:22-24;pr 16:4;mr 13:19;joh 1:1-3;heb 1:10... |
| 1 | Genesis | 1 | 1 | 3 | God | ex 20:11;ex 31:18;1ch 16:26;ne 9:6;job 26:13;j... |
| 2 | Genesis | 1 | 2 | 1 | without | job 26:7;isa 45:18;jer 4:23;na 2:10 |
| 3 | Genesis | 1 | 2 | 2 | Spirit | job 26:14;ps 33:6;ps 104:30;isa 40:12-14 |
| 4 | Genesis | 1 | 3 | 1 | God | ps 33:6,9;ps 148:5;mt 8:3;joh 11:43 |


In [29]:
tsk.df().at[0, 'reference_list']

'pr 8:22-24;pr 16:4;mr 13:19;joh 1:1-3;heb 1:10;1jo 1:1'

In [30]:
tsk.parse(tsk.df().at[0, 'reference_list'])

[Reference(book='Proverbs', chapter=8, verse=22),
 Reference(book='Proverbs', chapter=8, verse=23),
 Reference(book='Proverbs', chapter=8, verse=24),
 Reference(book='Proverbs', chapter=16, verse=4),
 Reference(book='Mark', chapter=13, verse=19),
 Reference(book='John', chapter=1, verse=1),
 Reference(book='John', chapter=1, verse=2),
 Reference(book='John', chapter=1, verse=3),
 Reference(book='Hebrews', chapter=1, verse=10),
 Reference(book='1 John', chapter=1, verse=1)]
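The parse step expands ranges such as `pr 8:22-24` into one reference per verse. A standalone sketch of that expansion is shown below. `expand` and the small `BOOKS` table are hypothetical, the sketch assumes every entry has the form `book chapter:verses`, and it does not handle ranges that span chapters.

```python
import re

# Only the abbreviations needed for this example; a full table would cover every book.
BOOKS = {
    "pr": "Proverbs",
    "mr": "Mark",
    "joh": "John",
    "heb": "Hebrews",
    "1jo": "1 John",
    "ps": "Psalms",
}

# book abbreviation, chapter, then verses written as numbers, commas and hyphens
ENTRY = re.compile(r"^(\w+) (\d+):([\d,\-]+)$")


def expand(reference_list):
    """Expand 'pr 8:22-24;ps 33:6,9' into (book, chapter, verse) tuples."""
    refs = []
    for entry in reference_list.split(";"):
        book, chapter, verses = ENTRY.match(entry.strip()).groups()
        for part in verses.split(","):
            first, _, last = part.partition("-")
            # A single verse has no '-', so the range covers just that verse.
            for verse in range(int(first), int(last or first) + 1):
                refs.append((BOOKS[book], int(chapter), verse))
    return refs


refs = expand("pr 8:22-24;pr 16:4;mr 13:19;joh 1:1-3;heb 1:10;1jo 1:1")
print(len(refs))
print(refs[:3])
```

Comma lists are handled the same way, so `ps 33:6,9` yields Psalms 33:6 and Psalms 33:9.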

In [31]:
tsk.init().head()

|   | uuid | linked_book | linked_chapter | linked_verse | phrase | book | chapter | verse |
|---|---|---|---|---|---|---|---|---|
| 0 | 93a9c7b7 | Genesis | 1 | 5 | Day, and | Genesis | 8 | 22 |
| 1 | 93a9c7b7 | Genesis | 1 | 5 | Day, and | Psalms | 19 | 2 |
| 2 | 93a9c7b7 | Genesis | 1 | 5 | Day, and | Psalms | 74 | 16 |
| 3 | 93a9c7b7 | Genesis | 1 | 5 | Day, and | Psalms | 104 | 20 |
| 4 | 93a9c7b7 | Genesis | 1 | 5 | Day, and | Isaiah | 45 | 7 |


In [32]:
tsk.find_by_uuid('93a9c7b7')

|   | uuid | linked_book | linked_chapter | linked_verse | phrase | book | chapter | verse |
|---|---|---|---|---|---|---|---|---|
| 0 | 93a9c7b7 | Genesis | 1 | 5 | Day, and | Genesis | 8 | 22 |
| 1 | 93a9c7b7 | Genesis | 1 | 5 | Day, and | Psalms | 19 | 2 |
| 2 | 93a9c7b7 | Genesis | 1 | 5 | Day, and | Psalms | 74 | 16 |
| 3 | 93a9c7b7 | Genesis | 1 | 5 | Day, and | Psalms | 104 | 20 |
| 4 | 93a9c7b7 | Genesis | 1 | 5 | Day, and | Isaiah | 45 | 7 |
| 5 | 93a9c7b7 | Genesis | 1 | 5 | Day, and | Jeremiah | 33 | 20 |
| 6 | 93a9c7b7 | Genesis | 1 | 5 | Day, and | 1 Corinthians | 3 | 13 |
| 7 | 93a9c7b7 | Genesis | 1 | 5 | Day, and | Ephesians | 5 | 13 |
| 8 | 93a9c7b7 | Genesis | 1 | 5 | Day, and | 1 Thessalonians | 5 | 5 |


In [33]:
tsk.passages_by_uuid('93a9c7b7', include_text=True)[0:1]

[Passage(name='Genesis 8:22', references=[Reference(book='Genesis', chapter=8, verse=22)], text='While the earth remains, seedtime and harvest, cold and heat, summer and winter, day and night, shall not cease."')]

### Data Prep

In [34]:
train_text, test_text, train_clusters, test_clusters = tsk_data.data_split()

print('train_text', train_text.shape)
print('train_clusters', train_clusters.shape)
print('test_text', test_text.shape)
print('test_clusters', test_clusters.shape)

train_text (24831, 150)
train_clusters (24831, 63581)
test_text (6208, 150)
test_clusters (6208, 63581)
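Unlike the Nave input, `train_text` here has only 150 columns, which suggests fixed-length sequences of word ids rather than a bag-of-words matrix. The sketch below illustrates that kind of encoding under that assumption. `encode`, the toy vocabulary, the reserved ids, and the choice to pad at the end are all hypothetical and may differ from what `tsk_data.data_split` does.

```python
def encode(text, vocabulary, length=150, pad_id=0, unknown_id=1):
    """Map words to integer ids, then truncate or right-pad to `length`."""
    ids = [vocabulary.get(word, unknown_id) for word in text.lower().split()]
    ids = ids[:length]
    return ids + [pad_id] * (length - len(ids))


# Toy vocabulary; ids 0 and 1 are reserved for padding and unknown words.
vocabulary = {"what": 2, "is": 3, "man": 4}
row = encode("What is man that you are mindful of him?", vocabulary)
print(len(row))
print(row[:10])
```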


### Model Prediction

In [1]:
from ecce.model.tsk.model import ClusterModel

cm = ClusterModel()
cm.load_weights('ecce/data/checkpoints/tsk-cluster-8a1db9.hdf5')
result = cm.predict('What is man that you are mindful of him?', n_max=2)

Using TensorFlow backend.
INFO:root:Loading ESV JSON...


In [3]:
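from pprint import pprint
import json

from funcy import compose  # needed again because the kernel was restarted

# Round-trip through JSON so that tuple-like results print as plain nested lists.
print_raw = compose(pprint, json.loads, json.dumps)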

print_raw(result)

[[0.018426116555929184,
  '00b37c33',
  ['Isaiah', 2, 22],
  'Stop regarding man in whose nostrils is breath, for of what account is he?',
  [['Job 7:15-21',
    [['Job', 7, 15],
     ['Job', 7, 16],
     ['Job', 7, 17],
     ['Job', 7, 18],
     ['Job', 7, 19],
     ['Job', 7, 20],
     ['Job', 7, 21]],
    '15 so that I would choose strangling and death rather than my bones.\n'
    '16 I loathe my life; I would not live forever. Leave me alone, for my '
    'days are a breath.\n'
    '17 What is man, that you make so much of him, and that you set your heart '
    'on him,\n'
    '18 visit him every morning and test him every moment?\n'
    '19 How long will you not look away from me, nor leave me alone till I '
    'swallow my spit?\n'
    '20 If I sin, what do I do to you, you watcher of mankind? Why have you '
    'made me your mark? Why have I become a burden to you?\n'
    '21 Why do you not pardon my transgression and take away my iniquity? For '
    'now I shall lie in the eart

## Ecce Model

In [4]:
from ecce.model.ecce import EcceModel

m = EcceModel('ecce/data/checkpoints/nave-4576e8.hdf5',
              'ecce/data/checkpoints/tsk-cluster-8a1db9.hdf5')

result = m.predict('What is man that you are mindful of him?', top_clusters=2)



In [6]:
print_raw(result.topics)

[[0.1233958974480629, 'tpc:1494754856', 'QUOTATIONS AND ALLUSIONS'],
 [0.08785395324230194, 'tpc:1487961918', 'JOB'],
 [0.071867436170578, 'tpc:-175446433', 'PSALMS'],
 [0.04190604016184807, 'tpc:-817545140', 'MAN'],
 [0.04133785888552666, 'tpc:-1808138860', 'PRAISE'],
 [0.03224300220608711, 'tpc:-1481495806', 'PROPHECY'],
 [0.024686696007847786, 'tpc:1455684257', 'CONDESCENSION OF GOD'],
 [0.022515669465065002, 'tpc:-1195302428', 'SIN (1)'],
 [0.02150491438806057, 'tpc:1324765133', 'RELIGION']]


In [7]:
print_raw(result.clusters)

[[0.018426116555929184,
  '00b37c33',
  ['Isaiah', 2, 22],
  'Stop regarding man in whose nostrils is breath, for of what account is he?',
  [['Job 7:15-21',
    [['Job', 7, 15],
     ['Job', 7, 16],
     ['Job', 7, 17],
     ['Job', 7, 18],
     ['Job', 7, 19],
     ['Job', 7, 20],
     ['Job', 7, 21]],
    '15 so that I would choose strangling and death rather than my bones.\n'
    '16 I loathe my life; I would not live forever. Leave me alone, for my '
    'days are a breath.\n'
    '17 What is man, that you make so much of him, and that you set your heart '
    'on him,\n'
    '18 visit him every morning and test him every moment?\n'
    '19 How long will you not look away from me, nor leave me alone till I '
    'swallow my spit?\n'
    '20 If I sin, what do I do to you, you watcher of mankind? Why have you '
    'made me your mark? Why have I become a burden to you?\n'
    '21 Why do you not pardon my transgression and take away my iniquity? For '
    'now I shall lie in the eart

In [9]:
print_raw(result.passage_topics)

[[0.12403968987829263, 'tpc:-1776404894', 'PHILOSOPHY'],
 [0.12403968987829263, 'tpc:1071709628', 'ELIHU'],
 [0.12403968987829263, 'tpc:1528792171', 'INFIDELITY'],
 [0.07421056689121147, 'tpc:-1561663083', 'MOABITES'],
 [0.07421056689121147, 'tpc:-1347520606', 'PISGAH'],
 [0.07421056689121147, 'tpc:916989502', 'TRUTH'],
 [0.07421056689121147, 'tpc:1180418977', 'WORLDLINESS'],
 [0.07421056689121147, 'tpc:1860581920', 'SORCERY'],
 [0.07421056689121147, 'tpc:-2015850403', 'SECRET'],
 [0.07421056689121147, 'tpc:-17202098', 'CHURCH AND STATE']]
