<img align="right" src="images/tf-small.png" width="128"/>
<img align="right" src="images/etcbc.png"/>
<img align="right" src="images/dans-small.png"/>

You might want to consider the [start](search.ipynb) of this tutorial.

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
from tf.app import use

In [3]:
A = use('bhsa', hoist=globals())

	connecting to online GitHub repo annotation/app-bhsa ... connected
Using TF-app in /Users/dirk/text-fabric-data/annotation/app-bhsa/code:
	#d3cf8f0c2ab5d690a0fda14ea31c33da5c5c8483 (latest commit)
	connecting to online GitHub repo etcbc/bhsa ... connected
Using data in /Users/dirk/text-fabric-data/etcbc/bhsa/tf/c:
	rv1.6 (latest release)
	connecting to online GitHub repo etcbc/phono ... connected
Using data in /Users/dirk/text-fabric-data/etcbc/phono/tf/c:
	r1.2 (latest release)
	connecting to online GitHub repo etcbc/parallels ... connected
Using data in /Users/dirk/text-fabric-data/etcbc/parallels/tf/c:
	r1.2 (latest release)
   |     0.00s No structure info in otext, the structure part of the T-API cannot be used


# Relations

So far we have seen search templates specifying feature conditions on nodes 
and a bit of nesting of those nodes, with an occasional extra constraint on their 
positions.

We show some more possibilities.
A more thorough treatment is in [relations](searchRelations.ipynb).

We can refer to (spatial) relationships between nodes by means of extra constraints
of the form 

```
n relop m
```

where `n` and `m` are names of node parts in your template, and `relop` is the name of a relational operator.

Text-Fabric comes with a fixed bunch of spatial relational operators,
and your data set may contain *edge*-features, which correspond to additional relational operators.

You can get the list of all relops that you can currently use:

In [4]:
S.relationsLegend()

                      = left equal to right (as node)
                      # left unequal to right (as node)
                      < left before right (in canonical node ordering)
                      > left after right (in canonical node ordering)
                     == left occupies same slots as right
                     && left has overlapping slots with right
                     ## left and right do not have the same slot set
                     || left and right do not have common slots
                     [[ left embeds right
                     ]] left embedded in right
                     << left completely before right
                     >> left completely after right
                     =: left and right start at the same slot
                     := left and right end at the same slot
                     :: left and right start and end at the same slot
                     <: left immediately before right
                     :> left immediately after right
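
To make the descriptions in this legend concrete, here is a minimal sketch in plain Python. It assumes a toy model in which a node is nothing more than the set of slot numbers it occupies; the function names and the sample nodes are made up for illustration, and this is not how Text-Fabric implements the operators.

```python
# Toy model: a node is just the set of slot numbers (word positions) it occupies.
# This mimics the wording of the legend; it is not Text-Fabric's own code.

def same_slots(a, b):           # ==
    return a == b

def overlaps(a, b):             # &&
    return bool(a & b)

def disjoint(a, b):             # ||
    return not (a & b)

def embeds(a, b):               # [[
    return b <= a

def completely_before(a, b):    # <<
    return max(a) < min(b)

def immediately_before(a, b):   # <:
    return max(a) + 1 == min(b)

def same_start(a, b):           # =:
    return min(a) == min(b)

def same_end(a, b):             # :=
    return max(a) == max(b)

clause = frozenset(range(1, 8))   # hypothetical clause on slots 1..7
phrase = frozenset(range(5, 8))   # hypothetical phrase on slots 5..7
word = frozenset({4})             # hypothetical word on slot 4

print(embeds(clause, phrase), same_end(clause, phrase), immediately_before(word, phrase))
```

In this toy setting the clause embeds the phrase and ends at the same slot, while the word sits immediately before the phrase.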
   

## Feature comparison

Note the operators that are surrounded by `. .` and have `f` and/or `g` and/or `r` in them.
You can supply any node features `f` and `g` in your dataset, and any regular expression `r`.

We start by looking for adjacent verb-noun pairs that have the same grammatical number.

Moreover, the noun must be part of the subject.

In [37]:
query = '''

phrase function=Subj
  w2:word sp=subs

w1:word sp=verb
<: w2

w1 .nu. w2
'''

In [38]:
results = A.search(query)

  1.32s 4150 results


In [39]:
A.show(results, condenseType='clause', end=4)

Now we want such pairs, but then where the grammatical number differs.

In [40]:
query = '''

phrase function=Subj
  w2:word sp=subs

w1:word sp=verb
<: w2

w1 .nu#nu. w2
'''

In [41]:
results = A.search(query)

  1.30s 1557 results


In [42]:
A.show(results, condenseType='clause', end=4)

And now the same, but excluding the cases where the subject noun is God(s), i.e. has lexeme `>LHJM/`.

In [43]:
query = '''

phrase function=Subj
  w2:word sp=subs lex#>LHJM/

w1:word sp=verb
<: w2

w1 .nu#nu. w2
'''

In [44]:
results = A.search(query)

  1.68s 1350 results


In [45]:
A.show(results, condenseType='clause', end=4)
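
The three templates in this section differ only in the comparison between `w1` and `w2` and in an optional extra condition on the noun. A small helper can make that explicit. This is only a sketch: the function `verb_noun_template` and its parameters are hypothetical names, not part of Text-Fabric.

```python
def verb_noun_template(relation=".nu.", noun_condition=""):
    """Build a template for a verb immediately followed by a noun in a subject phrase.

    relation:       the feature comparison between the verb (w1) and the noun (w2)
    noun_condition: an optional extra feature condition on the noun
    """
    noun = f"w2:word sp=subs {noun_condition}".rstrip()
    return "\n".join([
        "phrase function=Subj",
        f"  {noun}",
        "",
        "w1:word sp=verb",
        "<: w2",
        "",
        f"w1 {relation} w2",
        "",
    ])

same_number = verb_noun_template()
different_number = verb_noun_template(".nu#nu.")
different_number_not_god = verb_noun_template(".nu#nu.", "lex#>LHJM/")

print(different_number_not_god)
```

Each of the resulting strings can be passed to `A.search()` in the same way as the hand-written templates.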

## Edges

Note that all *edge* features in the dataset correspond to three relational operators.
For example, `mother` gives rise to the operators `-mother>` and `<mother-` and `<mother>`.
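
As a rough illustration of the three directions, the following sketch models an edge feature as a plain dictionary. It assumes that `-mother>` follows the edge from left to right, `<mother-` from right to left, and `<mother>` in either direction; the node numbers and function names are invented for the example.

```python
# Toy edge feature: each key has an edge to every node in its value set.
# The node numbers are made up for illustration.
mother = {10: {20}, 30: {20}, 40: {10}}

def edge_to(edges, left, right):        # left -mother> right
    return right in edges.get(left, set())

def edge_from(edges, left, right):      # left <mother- right
    return left in edges.get(right, set())

def edge_either(edges, left, right):    # left <mother> right
    return edge_to(edges, left, right) or edge_from(edges, left, right)

print(edge_to(mother, 10, 20), edge_from(mother, 20, 10), edge_either(mother, 20, 10))
```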

### Simple edges
Here is an example: look for pairs of clauses of which one is the mother of the other.
In our dataset, there is an *edge* between the two clauses, and this edge is coded in the feature `mother`.
The following query shows how to use the `mother` edge information.

In [5]:
query = '''
clause
-mother> clause
'''
results = A.search(query)
A.table(results, end=10)
total = len(results)

  0.25s 13897 results


n,p,clause,clause.1
1,Genesis 1:4,כִּי־טֹ֑וב,וַיַּ֧רְא אֱלֹהִ֛ים אֶת־הָאֹ֖ור
2,Genesis 1:10,כִּי־טֹֽוב׃,וַיַּ֥רְא אֱלֹהִ֖ים
3,Genesis 1:12,כִּי־טֹֽוב׃,וַיַּ֥רְא אֱלֹהִ֖ים
4,Genesis 1:14,לְהַבְדִּ֕יל בֵּ֥ין הַיֹּ֖ום וּבֵ֣ין הַלָּ֑יְלָה,יְהִ֤י מְאֹרֹת֙ בִּרְקִ֣יעַ הַשָּׁמַ֔יִם
5,Genesis 1:15,לְהָאִ֖יר עַל־הָאָ֑רֶץ,וְהָי֤וּ לִמְאֹורֹת֙ בִּרְקִ֣יעַ הַשָּׁמַ֔יִם
6,Genesis 1:17,לְהָאִ֖יר עַל־הָאָֽרֶץ׃,וַיִּתֵּ֥ן אֹתָ֛ם אֱלֹהִ֖ים בִּרְקִ֣יעַ הַשָּׁמָ֑יִם
7,Genesis 1:17,וְלִמְשֹׁל֙ בַּיֹּ֣ום וּבַלַּ֔יְלָה,לְהָאִ֖יר עַל־הָאָֽרֶץ׃
8,Genesis 1:18,וּֽלֲהַבְדִּ֔יל בֵּ֥ין הָאֹ֖ור וּבֵ֣ין הַחֹ֑שֶׁךְ,וְלִמְשֹׁל֙ בַּיֹּ֣ום וּבַלַּ֔יְלָה
9,Genesis 1:18,כִּי־טֹֽוב׃,וַיַּ֥רְא אֱלֹהִ֖ים
10,Genesis 1:21,כִּי־טֹֽוב׃,וַיַּ֥רְא אֱלֹהִ֖ים


A clause and its mother do not have to be in the same verse.
We are going to fetch the cases where they are in different verses.

Note that we need a more flexible syntax here, where we specify a few templates, give names
to a few positions in the template, and then constrain those positions
by stipulating relationships between them.

> **Caution**
Referring to verses is not as innocent as it seems.
That will be addressed in [gaps](searchGaps.ipynb).

In [6]:
query = '''
v1:verse
    c1:clause
v2:verse
    c2:clause

c1 -mother> c2
v1 # v2
'''
results = A.search(query)
A.table(results, end=10)
differentVerse = len(results)

  0.33s 710 results


n,p,verse,clause,verse.1,clause.1
1,Genesis 1:18,וְלִמְשֹׁל֙ בַּיֹּ֣ום וּבַלַּ֔יְלָה וּֽלֲהַבְדִּ֔יל בֵּ֥ין הָאֹ֖ור וּבֵ֣ין הַחֹ֑שֶׁךְ וַיַּ֥רְא אֱלֹהִ֖ים כִּי־טֹֽוב׃,וְלִמְשֹׁל֙ בַּיֹּ֣ום וּבַלַּ֔יְלָה,וַיִּתֵּ֥ן אֹתָ֛ם אֱלֹהִ֖ים בִּרְקִ֣יעַ הַשָּׁמָ֑יִם לְהָאִ֖יר עַל־הָאָֽרֶץ׃,לְהָאִ֖יר עַל־הָאָֽרֶץ׃
2,Genesis 2:7,וַיִּיצֶר֩ יְהוָ֨ה אֱלֹהִ֜ים אֶת־הָֽאָדָ֗ם עָפָר֙ מִן־הָ֣אֲדָמָ֔ה וַיִּפַּ֥ח בְּאַפָּ֖יו נִשְׁמַ֣ת חַיִּ֑ים וַֽיְהִ֥י הָֽאָדָ֖ם לְנֶ֥פֶשׁ חַיָּֽה׃,וַיִּיצֶר֩ יְהוָ֨ה אֱלֹהִ֜ים אֶת־הָֽאָדָ֗ם עָפָר֙ מִן־הָ֣אֲדָמָ֔ה,אֵ֣לֶּה תֹולְדֹ֧ות הַשָּׁמַ֛יִם וְהָאָ֖רֶץ בְּהִבָּֽרְאָ֑ם בְּיֹ֗ום עֲשֹׂ֛ות יְהוָ֥ה אֱלֹהִ֖ים אֶ֥רֶץ וְשָׁמָֽיִם׃,בְּיֹ֗ום
3,Genesis 7:3,גַּ֣ם מֵעֹ֧וף הַשָּׁמַ֛יִם שִׁבְעָ֥ה שִׁבְעָ֖ה זָכָ֣ר וּנְקֵבָ֑ה לְחַיֹּ֥ות זֶ֖רַע עַל־פְּנֵ֥י כָל־הָאָֽרֶץ׃,לְחַיֹּ֥ות זֶ֖רַע עַל־פְּנֵ֥י כָל־הָאָֽרֶץ׃,מִכֹּ֣ל׀ הַבְּהֵמָ֣ה הַטְּהֹורָ֗ה תִּֽקַּח־לְךָ֛ שִׁבְעָ֥ה שִׁבְעָ֖ה אִ֣ישׁ וְאִשְׁתֹּ֑ו וּמִן־הַבְּהֵמָ֡ה אֲ֠שֶׁר לֹ֣א טְהֹרָ֥ה הִ֛וא שְׁנַ֖יִם אִ֥ישׁ וְאִשְׁתֹּֽו׃,מִכֹּ֣ל׀ הַבְּהֵמָ֣ה הַטְּהֹורָ֗ה תִּֽקַּח־לְךָ֛ שִׁבְעָ֥ה שִׁבְעָ֖ה אִ֣ישׁ וְאִשְׁתֹּ֑ו
4,Genesis 22:17,כִּֽי־בָרֵ֣ךְ אֲבָרֶכְךָ֗ וְהַרְבָּ֨ה אַרְבֶּ֤ה אֶֽת־זַרְעֲךָ֙ כְּכֹוכְבֵ֣י הַשָּׁמַ֔יִם וְכַחֹ֕ול אֲשֶׁ֖ר עַל־שְׂפַ֣ת הַיָּ֑ם וְיִרַ֣שׁ זַרְעֲךָ֔ אֵ֖ת שַׁ֥עַר אֹיְבָֽיו׃,כִּֽי־בָרֵ֣ךְ אֲבָרֶכְךָ֗,וַיֹּ֕אמֶר בִּ֥י נִשְׁבַּ֖עְתִּי נְאֻם־יְהוָ֑ה כִּ֗י יַ֚עַן אֲשֶׁ֤ר עָשִׂ֨יתָ֙ אֶת־הַדָּבָ֣ר הַזֶּ֔ה וְלֹ֥א חָשַׂ֖כְתָּ אֶת־בִּנְךָ֥ אֶת־יְחִידֶֽךָ׃,כִּ֗י
5,Genesis 24:44,וְאָמְרָ֤ה אֵלַי֙ גַּם־אַתָּ֣ה שְׁתֵ֔ה וְגַ֥ם לִגְמַלֶּ֖יךָ אֶשְׁאָ֑ב הִ֣וא הָֽאִשָּׁ֔ה אֲשֶׁר־הֹכִ֥יחַ יְהוָ֖ה לְבֶן־אֲדֹנִֽי׃,הִ֣וא הָֽאִשָּׁ֔ה,הִנֵּ֛ה אָנֹכִ֥י נִצָּ֖ב עַל־עֵ֣ין הַמָּ֑יִם וְהָיָ֤ה הָֽעַלְמָה֙ הַיֹּצֵ֣את לִשְׁאֹ֔ב וְאָמַרְתִּ֣י אֵלֶ֔יהָ הַשְׁקִֽינִי־נָ֥א מְעַט־מַ֖יִם מִכַּדֵּֽךְ׃,הָֽעַלְמָה֙
6,Genesis 27:45,עַד־שׁ֨וּב אַף־אָחִ֜יךָ מִמְּךָ֗ וְשָׁכַח֙ אֵ֣ת אֲשֶׁר־עָשִׂ֣יתָ לֹּ֔ו וְשָׁלַחְתִּ֖י וּלְקַחְתִּ֣יךָ מִשָּׁ֑ם לָמָ֥ה אֶשְׁכַּ֛ל גַּם־שְׁנֵיכֶ֖ם יֹ֥ום אֶחָֽד׃,עַד־שׁ֨וּב אַף־אָחִ֜יךָ מִמְּךָ֗,וְיָשַׁבְתָּ֥ עִמֹּ֖ו יָמִ֣ים אֲחָדִ֑ים עַ֥ד אֲשֶׁר־תָּשׁ֖וּב חֲמַ֥ת אָחִֽיךָ׃,עַ֥ד אֲשֶׁר־תָּשׁ֖וּב חֲמַ֥ת אָחִֽיךָ׃
7,Genesis 36:16,אַלּֽוּף־קֹ֛רַח אַלּ֥וּף גַּעְתָּ֖ם אַלּ֣וּף עֲמָלֵ֑ק אֵ֣לֶּה אַלּוּפֵ֤י אֱלִיפַז֙ בְּאֶ֣רֶץ אֱדֹ֔ום אֵ֖לֶּה בְּנֵ֥י עָדָֽה׃,אַלּֽוּף־קֹ֛רַח אַלּ֥וּף גַּעְתָּ֖ם אַלּ֣וּף עֲמָלֵ֑ק,אֵ֖לֶּה אַלּוּפֵ֣י בְנֵֽי־עֵשָׂ֑ו בְּנֵ֤י אֱלִיפַז֙ בְּכֹ֣ור עֵשָׂ֔ו אַלּ֤וּף תֵּימָן֙ אַלּ֣וּף אֹומָ֔ר אַלּ֥וּף צְפֹ֖ו אַלּ֥וּף קְנַֽז׃,בְּנֵ֤י אֱלִיפַז֙ בְּכֹ֣ור עֵשָׂ֔ו אַלּ֤וּף תֵּימָן֙ אַלּ֣וּף אֹומָ֔ר אַלּ֥וּף צְפֹ֖ו אַלּ֥וּף קְנַֽז׃
8,Genesis 36:30,אַלּ֥וּף דִּשֹׁ֛ן אַלּ֥וּף אֵ֖צֶר אַלּ֣וּף דִּישָׁ֑ן אֵ֣לֶּה אַלּוּפֵ֧י הַחֹרִ֛י לְאַלֻּפֵיהֶ֖ם בְּאֶ֥רֶץ שֵׂעִֽיר׃ פ,אַלּ֥וּף דִּשֹׁ֛ן אַלּ֥וּף אֵ֖צֶר אַלּ֣וּף דִּישָׁ֑ן,אֵ֖לֶּה אַלּוּפֵ֣י הַחֹרִ֑י אַלּ֤וּף לֹוטָן֙ אַלּ֣וּף שֹׁובָ֔ל אַלּ֥וּף צִבְעֹ֖ון אַלּ֥וּף עֲנָֽה׃,אַלּ֤וּף לֹוטָן֙ אַלּ֣וּף שֹׁובָ֔ל אַלּ֥וּף צִבְעֹ֖ון אַלּ֥וּף עֲנָֽה׃
9,Genesis 36:41,אַלּ֧וּף אָהֳלִיבָמָ֛ה אַלּ֥וּף אֵלָ֖ה אַלּ֥וּף פִּינֹֽן׃,אַלּ֧וּף אָהֳלִיבָמָ֛ה אַלּ֥וּף אֵלָ֖ה אַלּ֥וּף פִּינֹֽן׃,וְ֠אֵלֶּה שְׁמֹ֞ות אַלּוּפֵ֤י עֵשָׂו֙ לְמִשְׁפְּחֹתָ֔ם לִמְקֹמֹתָ֖ם בִּשְׁמֹתָ֑ם אַלּ֥וּף תִּמְנָ֛ע אַלּ֥וּף עַֽלְוָ֖ה אַלּ֥וּף יְתֵֽת׃,אַלּ֥וּף תִּמְנָ֛ע אַלּ֥וּף עַֽלְוָ֖ה אַלּ֥וּף יְתֵֽת׃
10,Genesis 36:42,אַלּ֥וּף קְנַ֛ז אַלּ֥וּף תֵּימָ֖ן אַלּ֥וּף מִבְצָֽר׃,אַלּ֥וּף קְנַ֛ז אַלּ֥וּף תֵּימָ֖ן אַלּ֥וּף מִבְצָֽר׃,אַלּ֧וּף אָהֳלִיבָמָ֛ה אַלּ֥וּף אֵלָ֖ה אַלּ֥וּף פִּינֹֽן׃,אַלּ֧וּף אָהֳלִיבָמָ֛ה אַלּ֥וּף אֵלָ֖ה אַלּ֥וּף פִּינֹֽן׃
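
The two counts stored in `total` and `differentVerse` can be put side by side. The sketch below simply repeats the numbers reported by the two queries above so that it runs on its own; keep the caution about verses in mind when interpreting the ratio.

```python
total = 13897          # clause pairs linked by a mother edge (first query)
differentVerse = 710   # pairs found by the query with two different verses

share = differentVerse / total
print(f"{differentVerse} of {total} pairs ({share:.1%}) involve different verses")
```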


### Edges with values

There are also edge features that somehow *qualify* the relation between nodes they specify.

The edge feature `crossref` in the
[parallels](https://github.com/ETCBC/parallels)
module specifies a relationship between verses: they are *parallel* if they are similar. 
But `crossref` also tells you how similar, in the form of a number that is the percentage of similarity
according to the measure used by the algorithm to detect the parallels.

This number is called the *value* of the `crossref` edge. 
In our search templates we make use of the *values* of edge features.

Not all edge features provide values. `mother` does not. But `crossref` does.

Here is how many cross-references we have. The `crossref` edge feature is symmetric: if v is parallel to w, w is parallel to v. So in our query we stipulate that v comes before w, which makes each pair count only once:

In [7]:
query = '''
v:verse
-crossref> w:verse
v < w
'''
results = A.search(query)

  0.14s 15871 results


We get a quick overview of the similarity distribution of parallels by means of `freqList()`:

In [8]:
E.crossref.freqList()

((100, 8456),
 (80, 7796),
 (84, 2874),
 (86, 2328),
 (76, 1274),
 (77, 1220),
 (78, 1170),
 (79, 844),
 (81, 844),
 (75, 836),
 (83, 754),
 (88, 730),
 (82, 720),
 (92, 250),
 (85, 248),
 (90, 240),
 (91, 216),
 (94, 160),
 (87, 148),
 (95, 148),
 (89, 142),
 (96, 90),
 (93, 88),
 (98, 76),
 (99, 58),
 (97, 32))

If we want the cases with a high similarity, we can say:

In [9]:
query = '''
v:verse
-crossref>95> w:verse
v < w
'''
results = A.search(query)
A.table(results, end=10)

  0.09s 4356 results


n,p,verse,verse.1
1,1_Chronicles 1:5,בְּנֵ֣י יֶ֔פֶת גֹּ֣מֶר וּמָגֹ֔וג וּמָדַ֖י וְיָוָ֣ן וְתֻבָ֑ל וּמֶ֖שֶׁךְ וְתִירָֽס׃,בְּנֵ֣י יֶ֔פֶת גֹּ֣מֶר וּמָגֹ֔וג וּמָדַ֖י וְיָוָ֣ן וְתֻבָ֑ל וּמֶ֖שֶׁךְ וְתִירָֽס׃ ס
2,1_Chronicles 1:8,וּבְנֵ֖י חָ֑ם כּ֥וּשׁ וּמִצְרַ֖יִם וּפ֥וּט וּכְנָֽעַן׃,בְּנֵ֖י חָ֑ם כּ֥וּשׁ וּמִצְרַ֖יִם פּ֥וּט וּכְנָֽעַן׃
3,1_Chronicles 1:9,וּבְנֵ֣י כ֔וּשׁ סְבָא֙ וַֽחֲוִילָ֔ה וְסַבְתָּ֥ה וְרַעְמָ֖ה וְסַבְתְּכָ֑א וּבְנֵ֥י רַעְמָ֖ה שְׁבָ֥א וּדְדָֽן׃,וּבְנֵ֣י כ֔וּשׁ סְבָא֙ וַחֲוִילָ֔ה וְסַבְתָּ֥א וְרַעְמָ֖א וְסַבְתְּכָ֑א וּבְנֵ֥י רַעְמָ֖א שְׁבָ֥א וּדְדָֽן׃ ס
4,1_Chronicles 1:10,וְכ֖וּשׁ יָלַ֣ד אֶת־נִמְרֹ֑ד ה֣וּא הֵחֵ֔ל לִֽהְיֹ֥ות גִּבֹּ֖ר בָּאָֽרֶץ׃,וְכ֖וּשׁ יָלַ֣ד אֶת־נִמְרֹ֑וד ה֣וּא הֵחֵ֔ל לִהְיֹ֥ות גִּבֹּ֖ור בָּאָֽרֶץ׃ ס
5,1_Chronicles 1:11,וּמִצְרַ֡יִם יָלַ֞ד אֶת־לוּדִ֧ים וְאֶת־עֲנָמִ֛ים וְאֶת־לְהָבִ֖ים וְאֶת־נַפְתֻּחִֽים׃,וּמִצְרַ֡יִם יָלַ֞ד אֶת־לוּדִ֧ים וְאֶת־עֲנָמִ֛ים וְאֶת־לְהָבִ֖ים וְאֶת־נַפְתֻּחִֽים׃
6,1_Chronicles 1:12,וְֽאֶת־פַּתְרֻסִ֞ים וְאֶת־כַּסְלֻחִ֗ים אֲשֶׁ֨ר יָצְא֥וּ מִשָּׁ֛ם פְּלִשְׁתִּ֖ים וְאֶת־כַּפְתֹּרִֽים׃ ס,וְֽאֶת־פַּתְרֻסִ֞ים וְאֶת־כַּסְלֻחִ֗ים אֲשֶׁ֨ר יָצְא֥וּ מִשָּׁ֛ם פְּלִשְׁתִּ֖ים וְאֶת־כַּפְתֹּרִֽים׃ ס
7,1_Chronicles 1:13,וּכְנַ֗עַן יָלַ֛ד אֶת־צִידֹ֥ן בְּכֹרֹ֖ו וְאֶת־חֵֽת׃,וּכְנַ֗עַן יָלַ֛ד אֶת־צִידֹ֥ון בְּכֹרֹ֖ו וְאֶת־חֵֽת׃
8,1_Chronicles 1:14,וְאֶת־הַיְבוּסִי֙ וְאֶת־הָ֣אֱמֹרִ֔י וְאֵ֖ת הַגִּרְגָּשִֽׁי׃,וְאֶת־הַיְבוּסִי֙ וְאֶת־הָ֣אֱמֹרִ֔י וְאֵ֖ת הַגִּרְגָּשִֽׁי׃
9,1_Chronicles 1:15,וְאֶת־הַֽחִוִּ֥י וְאֶת־הַֽעַרְקִ֖י וְאֶת־הַסִּינִֽי׃,וְאֶת־הַחִוִּ֥י וְאֶת־הַֽעַרְקִ֖י וְאֶת־הַסִּינִֽי׃
10,1_Chronicles 1:18,וְאַרְפַּכְשַׁ֖ד יָלַ֣ד אֶת־שָׁ֑לַח וְשֶׁ֖לַח יָלַ֥ד אֶת־עֵֽבֶר׃,וְאַרְפַּכְשַׁ֖ד יָלַ֣ד אֶת־שָׁ֑לַח וְשֶׁ֖לַח יָלַ֥ד אֶת־עֵֽבֶר׃


If we want to inspect the cases with a lower similarity:

In [10]:
query = '''
v:verse
-crossref<80> w:verse
v < w
'''
results = A.search(query)
A.table(results, end=10)

  0.09s 2672 results


n,p,verse,verse.1
1,Genesis 1:17,וְהָי֤וּ לִמְאֹורֹת֙ בִּרְקִ֣יעַ הַשָּׁמַ֔יִם לְהָאִ֖יר עַל־הָאָ֑רֶץ וַֽיְהִי־כֵֽן׃,וַיִּתֵּ֥ן אֹתָ֛ם אֱלֹהִ֖ים בִּרְקִ֣יעַ הַשָּׁמָ֑יִם לְהָאִ֖יר עַל־הָאָֽרֶץ׃
2,Genesis 5:7,וַיִּֽהְי֣וּ יְמֵי־אָדָ֗ם אַֽחֲרֵי֙ הֹולִידֹ֣ו אֶת־שֵׁ֔ת שְׁמֹנֶ֥ה מֵאֹ֖ת שָׁנָ֑ה וַיֹּ֥ולֶד בָּנִ֖ים וּבָנֹֽות׃,וַֽיְחִי־שֵׁ֗ת אַֽחֲרֵי֙ הֹולִידֹ֣ו אֶת־אֱנֹ֔ושׁ שֶׁ֣בַע שָׁנִ֔ים וּשְׁמֹנֶ֥ה מֵאֹ֖ות שָׁנָ֑ה וַיֹּ֥ולֶד בָּנִ֖ים וּבָנֹֽות׃
3,Genesis 5:13,וַיִּֽהְי֣וּ יְמֵי־אָדָ֗ם אַֽחֲרֵי֙ הֹולִידֹ֣ו אֶת־שֵׁ֔ת שְׁמֹנֶ֥ה מֵאֹ֖ת שָׁנָ֑ה וַיֹּ֥ולֶד בָּנִ֖ים וּבָנֹֽות׃,וַיְחִ֣י קֵינָ֗ן אַחֲרֵי֙ הֹולִידֹ֣ו אֶת־מַֽהֲלַלְאֵ֔ל אַרְבָּעִ֣ים שָׁנָ֔ה וּשְׁמֹנֶ֥ה מֵאֹ֖ות שָׁנָ֑ה וַיֹּ֥ולֶד בָּנִ֖ים וּבָנֹֽות׃
4,Genesis 5:16,וַיִּֽהְי֣וּ יְמֵי־אָדָ֗ם אַֽחֲרֵי֙ הֹולִידֹ֣ו אֶת־שֵׁ֔ת שְׁמֹנֶ֥ה מֵאֹ֖ת שָׁנָ֑ה וַיֹּ֥ולֶד בָּנִ֖ים וּבָנֹֽות׃,וַֽיְחִ֣י מַֽהֲלַלְאֵ֗ל אַֽחֲרֵי֙ הֹולִידֹ֣ו אֶת־יֶ֔רֶד שְׁלֹשִׁ֣ים שָׁנָ֔ה וּשְׁמֹנֶ֥ה מֵאֹ֖ות שָׁנָ֑ה וַיֹּ֥ולֶד בָּנִ֖ים וּבָנֹֽות׃
5,Genesis 5:30,וַיִּֽהְי֣וּ יְמֵי־אָדָ֗ם אַֽחֲרֵי֙ הֹולִידֹ֣ו אֶת־שֵׁ֔ת שְׁמֹנֶ֥ה מֵאֹ֖ת שָׁנָ֑ה וַיֹּ֥ולֶד בָּנִ֖ים וּבָנֹֽות׃,וַֽיְחִי־לֶ֗מֶךְ אַֽחֲרֵי֙ הֹולִידֹ֣ו אֶת־נֹ֔חַ חָמֵ֤שׁ וְתִשְׁעִים֙ שָׁנָ֔ה וַחֲמֵ֥שׁ מֵאֹ֖ת שָׁנָ֑ה וַיֹּ֥ולֶד בָּנִ֖ים וּבָנֹֽות׃
6,Genesis 11:11,וַיִּֽהְי֣וּ יְמֵי־אָדָ֗ם אַֽחֲרֵי֙ הֹולִידֹ֣ו אֶת־שֵׁ֔ת שְׁמֹנֶ֥ה מֵאֹ֖ת שָׁנָ֑ה וַיֹּ֥ולֶד בָּנִ֖ים וּבָנֹֽות׃,וַֽיְחִי־שֵׁ֗ם אַֽחֲרֵי֙ הֹולִידֹ֣ו אֶת־אַרְפַּכְשָׁ֔ד חֲמֵ֥שׁ מֵאֹ֖ות שָׁנָ֑ה וַיֹּ֥ולֶד בָּנִ֖ים וּבָנֹֽות׃ ס
7,Genesis 11:13,וַיִּֽהְי֣וּ יְמֵי־אָדָ֗ם אַֽחֲרֵי֙ הֹולִידֹ֣ו אֶת־שֵׁ֔ת שְׁמֹנֶ֥ה מֵאֹ֖ת שָׁנָ֑ה וַיֹּ֥ולֶד בָּנִ֖ים וּבָנֹֽות׃,וַֽיְחִ֣י אַרְפַּכְשַׁ֗ד אַֽחֲרֵי֙ הֹולִידֹ֣ו אֶת־שֶׁ֔לַח שָׁלֹ֣שׁ שָׁנִ֔ים וְאַרְבַּ֥ע מֵאֹ֖ות שָׁנָ֑ה וַיֹּ֥ולֶד בָּנִ֖ים וּבָנֹֽות׃ ס
8,Genesis 11:15,וַיִּֽהְי֣וּ יְמֵי־אָדָ֗ם אַֽחֲרֵי֙ הֹולִידֹ֣ו אֶת־שֵׁ֔ת שְׁמֹנֶ֥ה מֵאֹ֖ת שָׁנָ֑ה וַיֹּ֥ולֶד בָּנִ֖ים וּבָנֹֽות׃,וַֽיְחִי־שֶׁ֗לַח אַחֲרֵי֙ הֹולִידֹ֣ו אֶת־עֵ֔בֶר שָׁלֹ֣שׁ שָׁנִ֔ים וְאַרְבַּ֥ע מֵאֹ֖ות שָׁנָ֑ה וַיֹּ֥ולֶד בָּנִ֖ים וּבָנֹֽות׃ ס
9,Genesis 11:17,וַיִּֽהְי֣וּ יְמֵי־אָדָ֗ם אַֽחֲרֵי֙ הֹולִידֹ֣ו אֶת־שֵׁ֔ת שְׁמֹנֶ֥ה מֵאֹ֖ת שָׁנָ֑ה וַיֹּ֥ולֶד בָּנִ֖ים וּבָנֹֽות׃,וַֽיְחִי־עֵ֗בֶר אַחֲרֵי֙ הֹולִידֹ֣ו אֶת־פֶּ֔לֶג שְׁלֹשִׁ֣ים שָׁנָ֔ה וְאַרְבַּ֥ע מֵאֹ֖ות שָׁנָ֑ה וַיֹּ֥ולֶד בָּנִ֖ים וּבָנֹֽות׃ ס
10,Genesis 11:23,וַיִּֽהְי֣וּ יְמֵי־אָדָ֗ם אַֽחֲרֵי֙ הֹולִידֹ֣ו אֶת־שֵׁ֔ת שְׁמֹנֶ֥ה מֵאֹ֖ת שָׁנָ֑ה וַיֹּ֥ולֶד בָּנִ֖ים וּבָנֹֽות׃,וַיְחִ֣י שְׂר֗וּג אַחֲרֵ֛י הֹולִידֹ֥ו אֶת־נָחֹ֖ור מָאתַ֣יִם שָׁנָ֑ה וַיֹּ֥ולֶד בָּנִ֖ים וּבָנֹֽות׃ ס


This shows how all features in your data can be queried in search templates, even the features that give values
to edges.
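
The result counts of the three `crossref` queries can be cross-checked against the frequency list shown earlier. The sketch below copies that list and assumes that every parallel is stored as two edges, one in each direction, which is why the edge counts are halved; the helper `pairs_where` is a hypothetical name.

```python
# (similarity, number of edges) as printed by E.crossref.freqList() above
crossref_freqs = (
    (100, 8456), (80, 7796), (84, 2874), (86, 2328), (76, 1274), (77, 1220),
    (78, 1170), (79, 844), (81, 844), (75, 836), (83, 754), (88, 730),
    (82, 720), (92, 250), (85, 248), (90, 240), (91, 216), (94, 160),
    (87, 148), (95, 148), (89, 142), (96, 90), (93, 88), (98, 76),
    (99, 58), (97, 32),
)

def pairs_where(predicate):
    """Count verse pairs (v before w) whose similarity satisfies the predicate."""
    edges = sum(count for value, count in crossref_freqs if predicate(value))
    return edges // 2  # every pair is stored in both directions

all_pairs = pairs_where(lambda value: True)
high = pairs_where(lambda value: value > 95)
low = pairs_where(lambda value: value < 80)

print(all_pairs, high, low)
```

The three numbers agree with the result counts reported by the queries, which also shows that `>95` and `<80` are strict comparisons.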

# Feature conditions

So far we have seen feature conditions in templates of this form:

```
node feature=value
```

But there is more.

## Trivially true

You can say

```
node feature*
```

which selects all nodes, irrespective of the existence or value of feature.

This is a useless criterion in the sense that it does not influence the set of results.

But when some applications run queries for you, they might use the features mentioned in your query
to decorate the results retrieved. 

This is your way to tell such applications that you want the values of `feature` included in your results.

The Text-Fabric browser looks at the features when it exports your results to CSV.

In [11]:
query1 = '''
word vt*
'''

query2 = '''
word
'''

results = A.search(query1)
print(len(results))

results = A.search(query2)
print(len(results))

  0.77s 426584 results
426584
  0.76s 426584 results
426584


## Inequality

You can also say

```
node feature#value
```
which selects nodes where the feature does not have `value`.

## Multiple values

When stating a feature condition, such as `chapter=1`,
you may also specify a list of alternative values:

```
  chapter=1|2|3
```

You may list as many values as you wish, for every feature.

It also works with inequalities:

```
  chapter#1|2|3
```
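
The meaning of a list of alternatives can be mimicked in plain Python. This sketch only covers nodes that actually have a value for the feature; how nodes without a value behave is not modeled here. The helper names are hypothetical.

```python
def has_one_of(value, spec):
    """feature=a|b|c : the value is one of the listed alternatives."""
    return str(value) in spec.split("|")

def has_none_of(value, spec):
    """feature#a|b|c : the value is none of the listed alternatives."""
    return str(value) not in spec.split("|")

chapters = [1, 2, 3, 4, 5]  # sample chapter numbers
selected = [c for c in chapters if has_one_of(c, "1|2|3")]
excluded = [c for c in chapters if has_none_of(c, "1|2|3")]

print(selected, excluded)
```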

Let's find all verbally inflected words that are:
not in the qal, not in the third person, not in the singular,
not in the masculine.

In [12]:
query = '''
word sp=verb vs#qal vt#infc|infa|ptca|ptcp ps#p3 nu#sg gn#m
'''

A.displaySetup(extraFeatures='vt ps nu gn')
results = A.search(query, shallow=True)
for r in sorted(results)[0:5]:
    A.pretty(r)

  1.00s 271 results


In [13]:
A.displayReset('extraFeatures')

## Existence of values

If you are not interested in the particular value of a feature,
but only in whether there is a value or not, you can express that.

### Qeres

We can ask for all words that have a qere.
Just leave out the `=value` part.

```
word qere
```

Conversely, we can ask for words without a qere.
Just add a `#` after the feature name.

```
word qere#
```

Let's test it.

In [14]:
query = '''
word
'''
print('Words in total:')
results = A.search(query)
allWords = len(results)

print('Words with a qere:')
query = '''
word qere
'''
results = A.search(query)
qereWords = len(results)

print('Words without a qere:')
query = '''
word qere#
'''
results = A.search(query)
plainWords = len(results)

print(f'qereWords + plainWords == allWords ? {qereWords + plainWords == allWords}')

Words in total:
  0.39s 426584 results
Words with a qere:
  0.25s 1892 results
Words without a qere:
  0.59s 424692 results
qereWords + plainWords == allWords ? True
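
The two conditions split the words into complementary groups. A minimal sketch with made-up data, where a missing value is represented by `None`, shows the same partition; the last lines repeat the arithmetic on the counts reported above.

```python
# Made-up word nodes with an optional qere value (None means: no value).
words = {1: None, 2: "placeholder-a", 3: None, 4: "placeholder-b"}

with_qere = {n for n, qere in words.items() if qere is not None}   # word qere
without_qere = {n for n, qere in words.items() if qere is None}    # word qere#

# The two groups never overlap and together cover all words.
partition_ok = not (with_qere & without_qere) and (with_qere | without_qere) == set(words)

# The same check on the counts reported by the three queries above.
counts_ok = 1892 + 424692 == 426584

print(partition_ok, counts_ok)
```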


## Boundaries

For features with *numerical* values, we may ask for values higher or lower than a given value.

The 
[dist](https://etcbc.github.io/bhsa/features/hebrew/2017/dist.html)
feature gives the distance between an object and its mother.

We want to see its values by means of `freqList()`, but the feature is not yet loaded.
Let's run a query that uses it; after that, the feature is loaded.

In [15]:
query = '''
clause dist=1
'''
results = A.search(query)

  0.19s 598 results


Now we can explore the frequencies:

In [16]:
F.dist.freqList()[0:10]

((0, 631195),
 (-1, 104875),
 (-2, 38155),
 (-3, 14985),
 (-4, 7662),
 (-5, 3650),
 (-6, 2137),
 (1, 1773),
 (-7, 1375),
 (-8, 918))

Let us say we are interested in clauses only. The feature `dist` is defined for multiple node types.
We can pass a set of node types to `freqList()` in order to get the frequencies restricted to those types:

In [17]:
F.dist.freqList({'clause'})[0:10]

((0, 67352),
 (-1, 11574),
 (-2, 3263),
 (-3, 2438),
 (-4, 1383),
 (-5, 667),
 (1, 598),
 (-6, 328),
 (-7, 166),
 (-8, 71))

There are negative distances. In those cases the mother precedes the daughter. Let's get the clauses whose
mothers precede them by a large amount.

In [18]:
query = '''
verse
    clause dist<-10
'''
results = A.search(query)
A.table(sorted(results), end=10)

  0.09s 86 results


n,p,verse,clause
1,Genesis 25:12,וְאֵ֛לֶּה תֹּלְדֹ֥ת יִשְׁמָעֵ֖אל בֶּן־אַבְרָהָ֑ם אֲשֶׁ֨ר יָלְדָ֜ה הָגָ֧ר הַמִּצְרִ֛ית שִׁפְחַ֥ת שָׂרָ֖ה לְאַבְרָהָֽם׃,אֲשֶׁ֨ר יָלְדָ֜ה הָגָ֧ר הַמִּצְרִ֛ית שִׁפְחַ֥ת שָׂרָ֖ה לְאַבְרָהָֽם׃
2,Genesis 30:33,וְעָֽנְתָה־בִּ֤י צִדְקָתִי֙ בְּיֹ֣ום מָחָ֔ר כִּֽי־תָבֹ֥וא עַל־שְׂכָרִ֖י לְפָנֶ֑יךָ כֹּ֣ל אֲשֶׁר־אֵינֶנּוּ֩ נָקֹ֨ד וְטָל֜וּא בָּֽעִזִּ֗ים וְחוּם֙ בַּכְּשָׂבִ֔ים גָּנ֥וּב ה֖וּא אִתִּֽי׃,אֲשֶׁר־אֵינֶנּוּ֩ נָקֹ֨ד וְטָל֜וּא בָּֽעִזִּ֗ים וְחוּם֙ בַּכְּשָׂבִ֔ים
3,Genesis 49:11,אֹסְרִ֤י לַגֶּ֨פֶן֙ עִירֹ֔ו וְלַשֹּׂרֵקָ֖ה בְּנִ֣י אֲתֹנֹ֑ו כִּבֵּ֤ס בַּיַּ֨יִן֙ לְבֻשֹׁ֔ו וּבְדַם־עֲנָבִ֖ים סוּתֹֽו׃,אֹסְרִ֤י לַגֶּ֨פֶן֙ עִירֹ֔ו
4,Genesis 50:13,וַיִּשְׂא֨וּ אֹתֹ֤ו בָנָיו֙ אַ֣רְצָה כְּנַ֔עַן וַיִּקְבְּר֣וּ אֹתֹ֔ו בִּמְעָרַ֖ת שְׂדֵ֣ה הַמַּכְפֵּלָ֑ה אֲשֶׁ֣ר קָנָה֩ אַבְרָהָ֨ם אֶת־הַשָּׂדֶ֜ה לַאֲחֻזַּת־קֶ֗בֶר מֵאֵ֛ת עֶפְרֹ֥ן הַחִתִּ֖י עַל־פְּנֵ֥י מַמְרֵֽא׃,אֲשֶׁ֣ר קָנָה֩ אַבְרָהָ֨ם אֶת־הַשָּׂדֶ֜ה לַאֲחֻזַּת־קֶ֗בֶר מֵאֵ֛ת עֶפְרֹ֥ן הַחִתִּ֖י
5,Exodus 18:8,וַיְסַפֵּ֤ר מֹשֶׁה֙ לְחֹ֣תְנֹ֔ו אֵת֩ כָּל־אֲשֶׁ֨ר עָשָׂ֤ה יְהוָה֙ לְפַרְעֹ֣ה וּלְמִצְרַ֔יִם עַ֖ל אֹודֹ֣ת יִשְׂרָאֵ֑ל אֵ֤ת כָּל־הַתְּלָאָה֙ אֲשֶׁ֣ר מְצָאָ֣תַם בַּדֶּ֔רֶךְ וַיַּצִּלֵ֖ם יְהוָֽה׃,אֲשֶׁ֨ר עָשָׂ֤ה יְהוָה֙ לְפַרְעֹ֣ה וּלְמִצְרַ֔יִם עַ֖ל אֹודֹ֣ת יִשְׂרָאֵ֑ל
6,Exodus 25:9,כְּכֹ֗ל אֲשֶׁ֤ר אֲנִי֙ מַרְאֶ֣ה אֹותְךָ֔ אֵ֚ת תַּבְנִ֣ית הַמִּשְׁכָּ֔ן וְאֵ֖ת תַּבְנִ֣ית כָּל־כֵּלָ֑יו וְכֵ֖ן תַּעֲשֽׂוּ׃ ס,אֲשֶׁ֤ר אֲנִי֙ מַרְאֶ֣ה אֹותְךָ֔ אֵ֚ת תַּבְנִ֣ית הַמִּשְׁכָּ֔ן וְאֵ֖ת תַּבְנִ֣ית כָּל־כֵּלָ֑יו
7,Exodus 38:26,בֶּ֚קַע לַגֻּלְגֹּ֔לֶת מַחֲצִ֥ית הַשֶּׁ֖קֶל בְּשֶׁ֣קֶל הַקֹּ֑דֶשׁ לְכֹ֨ל הָעֹבֵ֜ר עַל־הַפְּקֻדִ֗ים מִבֶּ֨ן עֶשְׂרִ֤ים שָׁנָה֙ וָמַ֔עְלָה לְשֵׁשׁ־מֵאֹ֥ות אֶ֨לֶף֙ וּשְׁלֹ֣שֶׁת אֲלָפִ֔ים וַחֲמֵ֥שׁ מֵאֹ֖ות וַחֲמִשִּֽׁים׃,הָעֹבֵ֜ר עַל־הַפְּקֻדִ֗ים מִבֶּ֨ן עֶשְׂרִ֤ים שָׁנָה֙ וָמַ֔עְלָה
8,Exodus 39:13,וְהַטּוּר֙ הָֽרְבִיעִ֔י תַּרְשִׁ֥ישׁ שֹׁ֖הַם וְיָשְׁפֵ֑ה מֽוּסַבֹּ֛ת מִשְׁבְּצֹ֥ות זָהָ֖ב בְּמִלֻּאֹתָֽם׃,מֽוּסַבֹּ֛ת מִשְׁבְּצֹ֥ות זָהָ֖ב בְּמִלֻּאֹתָֽם׃
9,Leviticus 11:9,אֶת־זֶה֙ תֹּֽאכְל֔וּ מִכֹּ֖ל אֲשֶׁ֣ר בַּמָּ֑יִם כֹּ֣ל אֲשֶׁר־לֹו֩ סְנַפִּ֨יר וְקַשְׂקֶ֜שֶׂת בַּמַּ֗יִם בַּיַּמִּ֛ים וּבַנְּחָלִ֖ים אֹתָ֥ם תֹּאכֵֽלוּ׃,אֲשֶׁר־לֹו֩ סְנַפִּ֨יר וְקַשְׂקֶ֜שֶׂת בַּמַּ֗יִם בַּיַּמִּ֛ים וּבַנְּחָלִ֖ים
10,Leviticus 11:10,וְכֹל֩ אֲשֶׁ֨ר אֵֽין־לֹ֜ו סְנַפִּ֣יר וְקַשְׂקֶ֗שֶׂת בַּיַּמִּים֙ וּבַנְּחָלִ֔ים מִכֹּל֙ שֶׁ֣רֶץ הַמַּ֔יִם וּמִכֹּ֛ל נֶ֥פֶשׁ הַחַיָּ֖ה אֲשֶׁ֣ר בַּמָּ֑יִם שֶׁ֥קֶץ הֵ֖ם לָכֶֽם׃,אֲשֶׁ֨ר אֵֽין־לֹ֜ו סְנַפִּ֣יר וְקַשְׂקֶ֗שֶׂת בַּיַּמִּים֙ וּבַנְּחָלִ֔ים


## Regular expressions

An even more powerful way of specifying desired feature values is by regular expressions.
You can do this for *string-valued* features only.

Instead of specifying a feature condition like this

```
typ=WIm0
```

or

```
typ=WIm0|WImX
```

you can say

```
typ~WIm[0X]
```

Note that you do not use the `=` between feature name and value specification, 
but `~`.

The syntax and semantics of regular expressions are those as defined in the
[Python docs](https://docs.python.org/3/library/re.html#regular-expression-syntax).

Note that if you need to enter a `\` in the regular expression, you have to double it.
Also, when you need a space in it, you have to put a `\` in front of it.

### No value no match

If you search with regular expressions, then nodes without a value do not match any regular expression.

The regular expression `.*` matches everything.

#### Qeres

Not all words have a qere.

So we expect the following template to list all words that do have a qere and none of those that don't.

In [19]:
query = '''
word qere~.*
'''
results = list(A.search(query))
matchWords = len(results)
print(
    'Compare this with qere words: '
    f'{qereWords}: {"Equal" if matchWords == qereWords else "Unequal"}')

  0.35s 1892 results
Compare this with qere words: 1892: Equal
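
The behaviour seen here can be imitated with Python's `re` module. The sketch assumes that a pattern may match anywhere inside the value (which is why the examples in this tutorial use `^` and `$` when they mean the whole value) and that a missing value, represented by `None`, never matches. The helper `regex_condition` is a hypothetical name.

```python
import re

def regex_condition(pattern):
    """Return a test that mimics a `feature~pattern` condition on a single value."""
    compiled = re.compile(pattern)

    def matches(value):
        # No value, no match; otherwise the pattern may match anywhere in the value.
        return value is not None and compiled.search(value) is not None

    return matches

anything = regex_condition(r".*")
aleph_plus_one = regex_condition(r"^>.$")
has_space = regex_condition(r"[\ ]")

print(anything("text"), anything(None), aleph_plus_one(">L"), has_space("two words"))
```

Writing the patterns as raw strings (`r"..."`) keeps Python itself from interpreting the backslashes.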


### More examples

#### Two letter nouns

We pick two letter nouns that start with an aleph.

In [20]:
query = '''
word sp=subs g_cons~^>.$
'''
results = A.search(query)
A.table(results, end=20)

  0.51s 816 results


n,p,word
1,Genesis 2:6,אֵ֖ד
2,Genesis 3:20,אֵ֥ם
3,Genesis 14:18,אֵ֥ל
4,Genesis 14:19,אֵ֣ל
5,Genesis 14:20,אֵ֣ל
6,Genesis 14:22,אֵ֣ל
7,Genesis 15:17,אֵ֔שׁ
8,Genesis 16:13,אֵ֣ל
9,Genesis 17:1,אֵ֣ל
10,Genesis 17:4,אַ֖ב


Hover over the words and you see where in the Bible they are.
Click on it, and you go to the word in SHEBANQ.

Let us zoom in on one of the results.
We want to know more about the lexeme in question.

There are several methods to do that.

##### Show the nodes

First of all, let us show the nodes.

In [21]:
A.table(results, start=9, end=9, withNodes=True)

n,p,word
9,Genesis 17:1,אֵ֣ל 7342


Now we can use `pretty()` to get more info.

In [22]:
A.pretty(7342)

Note that under the word is a link to its lexeme entry in SHEBANQ.

##### Programmatically
With a bit of TF juggling you could also have got this link programmatically:

In [23]:
A.webLink(L.u(results[8][0], otype='lex')[0])

##### Enrich the query

We can also add some context to the query.
Since we are interested in the lexemes, let's add those to the query.

Every word lies embedded in a lexeme.

In [24]:
query = '''
lex
  word sp=subs g_cons~^>.$
'''
results = A.search(query)
A.table(results, end=10)

  0.51s 816 results


n,p,lex,word
1,Exodus 4:8,אֹות,אֹ֣ת
2,Exodus 4:8,אֹות,אֹ֥ת
3,Exodus 8:19,אֹות,אֹ֥ת
4,Exodus 12:13,אֹות,אֹ֗ת
5,Genesis 2:6,אֵד,אֵ֖ד
6,Genesis 27:45,אַף,אַף־
7,Genesis 30:2,אַף,אַ֥ף
8,Exodus 4:14,אַף,אַ֨ף
9,Exodus 11:8,אַף,אָֽף׃ ס
10,Exodus 32:19,אַף,אַ֣ף


The same number of results, but in a different order.
We just use Python to get the lexemes only, together with their first occurrence.
We make a list of tuples, and feed that to `A.table()`.

In [25]:
lexemes = set()
lexResults = []
for (lex, word) in results:
    if lex not in lexemes:
        lexemes.add(lex)
        lexResults.append((lex, word))
A.table(lexResults)

n,p,lex,word
1,Exodus 4:8,אֹות,אֹ֣ת
2,Genesis 2:6,אֵד,אֵ֖ד
3,Genesis 27:45,אַף,אַף־
4,Genesis 17:4,אָב,אַ֖ב
5,Genesis 3:20,אֵם,אֵ֥ם
6,Genesis 24:29,אָח,אָ֖ח
7,Isaiah 20:6,אִי,אִ֣י
8,Genesis 14:18,אֵל,אֵ֥ל
9,Genesis 15:17,אֵשׁ,אֵ֔שׁ
10,Genesis 31:29,אֵל,אֵ֣ל


Observe how you can use a query to get an interesting node set,
which you can then massage using standard Python machinery,
after which you can display the results prettily with `A.table()` or `A.show()`.

**The take-away lesson is: you can use `A.table()` and `A.show()` on arbitrary iterables of tuples of nodes,
whether or not they come from an executed query.**
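
The de-duplication loop above follows a general pattern: keep the first tuple for every distinct value in one position. A minimal sketch of that pattern, with a hypothetical helper `first_per_key` and made-up node numbers, could look like this:

```python
def first_per_key(tuples, key_index=0):
    """Keep the first tuple for each distinct value at position key_index."""
    seen = {}
    for tup in tuples:
        # setdefault stores the tuple only if this key has not been seen yet
        seen.setdefault(tup[key_index], tup)
    # dictionaries preserve insertion order, so the original order is kept
    return list(seen.values())

sample = [(900, 1), (900, 2), (901, 3), (900, 4), (902, 5)]  # made-up node numbers
unique = first_per_key(sample)

print(unique)
```

The resulting list is again an iterable of tuples of nodes, so it can be handed to `A.table()` or `A.show()`.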

The headers of the tables are taken from the node types of the first tuple, so if your tuples
are not consistent in their types, the headers will be nonsensical:

In [26]:
tuples = (
    (1, 1000000),
    (1000001, 2),
)
A.table(tuples)

n,p,word,phrase_atom
1,1_Samuel 25:25,בְּ,אֶל־אִישׁ֩ הַבְּלִיַּ֨עַל הַזֶּ֜ה
2,Genesis 1:1,עַל־נָבָ֗ל,רֵאשִׁ֖ית


But `A.show()` makes perfect sense, also in this case.

In [27]:
A.show(tuples)

In [28]:
A.show(tuples, condensed=False)

#### we-x clauses with a non-qal verb

If you look at the [clause types](https://etcbc.github.io/bhsa/features/hebrew/2017/typ.html)
you see a lot of types indicating that the clause starts with `we`:

```
Way0	Wayyiqtol-null clause
WayX	Wayyiqtol-X clause
WIm0	We-imperative-null clause
WImX	We-imperative-X clause
WQt0	We-qatal-null clause
WQtX	We-qatal-X clause
WxI0	We-x-imperative-null clause
WXIm	We-X-imperative clause
WxIX	We-x-imperative-X clause
WxQ0	We-x-qatal-null clause
WXQt	We-X-qatal clause
WxQX	We-x-qatal-X clause
WxY0	We-x-yiqtol-null clause
WXYq	We-X-yiqtol clause
WxYX	We-x-yiqtol-X clause
WYq0	We-yiqtol-null clause
WYqX	We-yiqtol-X clause
```

We are interested in the `We-x` and `We-X` clauses, so all clauses whose `typ` starts with `Wx` or `WX`.

There are quite a number of verb stems. By means of a regular expression we can pick everything except `qal`.

In the
[Python docs on regular expressions](https://docs.python.org/3/library/re.html#regular-expression-syntax)
we see that a negative lookahead, `^(?!qal)`, would check for that; the query below uses the simpler inequality condition `vs#qal` instead.
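
In Python's regular expression syntax a negative lookahead is written `(?!...)`, whereas `(?:...)` is a non-capturing group. The following sketch contrasts the two on a few stem labels; the labels are sample values chosen for illustration.

```python
import re

stems = ["qal", "nif", "piel", "hif"]   # sample stem labels, assumed for illustration

not_qal = re.compile(r"^(?!qal)")       # negative lookahead: does not start with "qal"
literal = re.compile(r"^(?:!qal)")      # non-capturing group: starts with the text "!qal"

kept = [s for s in stems if not_qal.search(s)]
kept_literal = [s for s in stems if literal.search(s)]

print(kept, kept_literal)
```

Only the lookahead variant filters out `qal`; the non-capturing group looks for a literal exclamation mark and therefore matches none of the labels.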

In [29]:
query = '''
clause typ~^W[xX]
  word sp=verb vs#qal
'''
results = list(A.search(query))
A.table(results, end=10)

  0.62s 3098 results


n,p,clause,word
1,Genesis 1:20,וְעֹוף֙ יְעֹופֵ֣ף עַל־הָאָ֔רֶץ עַל־פְּנֵ֖י רְקִ֥יעַ הַשָּׁמָֽיִם׃,יְעֹופֵ֣ף
2,Genesis 2:10,וּמִשָּׁם֙ יִפָּרֵ֔ד,יִפָּרֵ֔ד
3,Genesis 2:25,וְלֹ֖א יִתְבֹּשָֽׁשׁוּ׃,יִתְבֹּשָֽׁשׁוּ׃
4,Genesis 3:18,וְקֹ֥וץ וְדַרְדַּ֖ר תַּצְמִ֣יחַֽ לָ֑ךְ,תַּצְמִ֣יחַֽ
5,Genesis 4:4,וְהֶ֨בֶל הֵבִ֥יא גַם־ה֛וּא מִבְּכֹרֹ֥ות צֹאנֹ֖ו וּמֵֽחֶלְבֵהֶ֑ן,הֵבִ֥יא
6,Genesis 4:7,וְאִם֙ לֹ֣א תֵיטִ֔יב,תֵיטִ֔יב
7,Genesis 4:14,וּמִפָּנֶ֖יךָ אֶסָּתֵ֑ר,אֶסָּתֵ֑ר
8,Genesis 4:26,וּלְשֵׁ֤ת גַּם־הוּא֙ יֻלַּד־בֵּ֔ן,יֻלַּד־
9,Genesis 6:1,וּבָנֹ֖ות יֻלְּד֥וּ לָהֶֽם׃,יֻלְּד֥וּ
10,Genesis 6:12,וְהִנֵּ֣ה נִשְׁחָ֑תָה,נִשְׁחָ֑תָה


In [30]:
A.show(results, start=0, end=3)

#### Find all glosses with a space

In [31]:
query = '''
lex gloss~[\ ] sp=subs
'''
results = list(A.search(query))
A.table(results, start=1, end=5)

  0.02s 406 results


n,p,lex
1,תְּהֹום,תְּהֹום
2,תַּחַת,תַּחַת
3,יַבָּשָׁה,יַבָּשָׁה
4,דֶּשֶׁא,דֶּשֶׁא
5,שֶׁרֶץ,שֶׁרֶץ


In [32]:
A.show(results, condensed=False, start=1, end=5)

## Custom sets

Eventually you reach cases where search templates are just not up to it.

Examples:

* What if you want to restrict a search to sentences that do not contain infrequent words?
* It is fairly tricky to look for gapped phrases. What if you look for complex patterns, but only in
  gapped phrases?

Before you dive head over heels into hand coding, here is an intermediate solution.
You can create node sets by means of search, and then use those node sets in other search templates
at the places where you have node types.

You can make custom sets with arbitrary nodes, not all of the same type.
Let's collect all non-lex nodes that contain fairly frequent words only.
We also collect a set of nodes that contain highly infrequent words.

There is a feature for that, [rank_lex](https://etcbc.github.io/bhsa/features/hebrew/2017/rank_lex.html).
Since we have not loaded it, we do so now.

In [33]:
TF.load('rank_lex', add=True)

  0.00s loading features ...
   |     0.00s Not enough info for structure in otext, structure functionality will not work
  0.01s All additional features loaded - for details use loadLog()


We set a threshold `COMMON_RANK`, and pick all objects with only high ranking words, their ranks between 0 and `COMMON_RANK`.

We set a threshold `RARE_RANK`, and pick all objects that contain at least one low ranking word, its rank higher than `RARE_RANK`.

In [34]:
COMMON_RANK = 100
RARE_RANK = 500

frequent = set()
infrequent = set()

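for n in N():
    nTp = F.otype.v(n)
    # lexeme nodes are left out of both sets
    if nTp == 'lex':
        continue
    # take the rank of the word itself, or the ranks of all words inside a bigger node
    if nTp == 'word':
        ranks = [F.rank_lex.v(n)]
    else:
        ranks = [F.rank_lex.v(w) for w in L.d(n, otype='word')]
    maxRank = max(ranks)
    # only common words: even the rarest word ranks below COMMON_RANK
    if maxRank < COMMON_RANK:
        frequent.add(n)
    # at least one rare word: the rarest word ranks above RARE_RANK
    if maxRank > RARE_RANK:
        infrequent.add(n)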
        
print(f'{len(frequent):>6} members in set frequent')
print(f'{len(infrequent):>6} members in set infrequent')

669186 members in set frequent
424804 members in set infrequent
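
The classification rule can be tried out on a tiny made-up example without loading any data. In this sketch the container names and rank values are invented; only the two thresholds and the two tests are taken from the cell above.

```python
COMMON_RANK = 100
RARE_RANK = 500

# Made-up container nodes mapped to the ranks of the words they contain.
ranks_in = {
    "clause_a": [3, 17, 42],    # only common words
    "clause_b": [5, 730],       # contains one rare word
    "clause_c": [120, 300],     # neither: not all common, none rare
}

frequent = {n for n, ranks in ranks_in.items() if max(ranks) < COMMON_RANK}
infrequent = {n for n, ranks in ranks_in.items() if max(ranks) > RARE_RANK}

print(sorted(frequent), sorted(infrequent))
```

Note that a node can end up in neither set, as `clause_c` does here, so the two sets do not partition the nodes.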


Now we can do all kinds of searches within the domain of `frequent` and `infrequent` things.

We give the names to all the sets and put them in a dictionary.

In [35]:
customSets=dict(
    frequent=frequent,
    infrequent=infrequent,
)

Then we pass it to `A.search()` with a query to look for sentences with a rare word that have a clause with only frequent words:

In [36]:
query = '''
infrequent otype=sentence
  frequent otype=clause
'''
results = A.search(query, sets=customSets)
A.table(results, start=1, end=10)

  1.37s 4301 results


n,p,sentence,clause
1,Genesis 1:10,וַיַּ֥רְא אֱלֹהִ֖ים כִּי־טֹֽוב׃,וַיַּ֥רְא אֱלֹהִ֖ים
2,Genesis 1:12,וַיַּ֥רְא אֱלֹהִ֖ים כִּי־טֹֽוב׃,וַיַּ֥רְא אֱלֹהִ֖ים
3,Genesis 1:18,וַיַּ֥רְא אֱלֹהִ֖ים כִּי־טֹֽוב׃,וַיַּ֥רְא אֱלֹהִ֖ים
4,Genesis 1:21,וַיַּ֥רְא אֱלֹהִ֖ים כִּי־טֹֽוב׃,וַיַּ֥רְא אֱלֹהִ֖ים
5,Genesis 1:25,וַיַּ֥רְא אֱלֹהִ֖ים כִּי־טֹֽוב׃,וַיַּ֥רְא אֱלֹהִ֖ים
6,Genesis 1:29,הִנֵּה֩ נָתַ֨תִּי לָכֶ֜ם אֶת־כָּל־עֵ֣שֶׂב׀ זֹרֵ֣עַ זֶ֗רַע אֲשֶׁר֙ עַל־פְּנֵ֣י כָל־הָאָ֔רֶץ וְאֶת־כָּל־הָעֵ֛ץ אֲשֶׁר־בֹּ֥ו פְרִי־עֵ֖ץ זֹרֵ֣עַ זָ֑רַע וּֽלְכָל־חַיַּ֣ת הָ֠אָרֶץ וּלְכָל־עֹ֨וף הַשָּׁמַ֜יִם וּלְכֹ֣ל׀ רֹומֵ֣שׂ עַל־הָאָ֗רֶץ אֲשֶׁר־בֹּו֙ נֶ֣פֶשׁ חַיָּ֔ה אֶת־כָּל־יֶ֥רֶק עֵ֖שֶׂב לְאָכְלָ֑ה,אֲשֶׁר֙ עַל־פְּנֵ֣י כָל־הָאָ֔רֶץ
7,Genesis 2:2,וַיְכַ֤ל אֱלֹהִים֙ בַּיֹּ֣ום הַשְּׁבִיעִ֔י מְלַאכְתֹּ֖ו אֲשֶׁ֣ר עָשָׂ֑ה,אֲשֶׁ֣ר עָשָׂ֑ה
8,Genesis 2:2,וַיִּשְׁבֹּת֙ בַּיֹּ֣ום הַשְּׁבִיעִ֔י מִכָּל־מְלַאכְתֹּ֖ו אֲשֶׁ֥ר עָשָֽׂה׃,אֲשֶׁ֥ר עָשָֽׂה׃
9,Genesis 2:3,כִּ֣י בֹ֤ו שָׁבַת֙ מִכָּל־מְלַאכְתֹּ֔ו אֲשֶׁר־בָּרָ֥א אֱלֹהִ֖ים לַעֲשֹֽׂות׃ פ,לַעֲשֹֽׂות׃ פ
10,Genesis 2:4,בְּיֹ֗ום עֲשֹׂ֛ות יְהוָ֥ה אֱלֹהִ֖ים אֶ֥רֶץ וְשָׁמָֽיִם׃ וַיִּיצֶר֩ יְהוָ֨ה אֱלֹהִ֜ים אֶת־הָֽאָדָ֗ם עָפָר֙ מִן־הָ֣אֲדָמָ֔ה,בְּיֹ֗ום


We are going to display this nicely:

* we add the feature `rank_lex` to the display
* we suppress the other features
* we color the rare words and the common words differently

In [37]:
A.displaySetup(extraFeatures='rank_lex')
highlights = {}
for (sentence, clause) in results:
    highlights[sentence] = 'magenta'
    highlights[clause] = 'cyan'
    for w in L.d(sentence, otype='word'):
        if F.rank_lex.v(w) > RARE_RANK:
            highlights[w] = 'magenta'
    for w in L.d(clause, otype='word'):
        if F.rank_lex.v(w) < COMMON_RANK:
            highlights[w] = 'cyan'
A.show(
    results,
    condensed=False,
    start=6,
    end=7,
    suppress={'sp', 'vt', 'vs', 'function', 'typ'},
    highlights=highlights,
    withNodes=True,
)
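
To see what the `highlights` dictionary looks like, here is a minimal sketch of the same loop on hypothetical toy data. The function `buildHighlights` and the node numbers are invented for illustration; in the notebook the two lookups would be `L.d(n, otype='word')` and `F.rank_lex.v`.

```python
def buildHighlights(results, wordsIn, rankOf, commonRank=100, rareRank=500):
    """Map nodes to colors: rare material magenta, common material cyan."""
    highlights = {}
    for (sentence, clause) in results:
        highlights[sentence] = 'magenta'
        highlights[clause] = 'cyan'
        # rare words anywhere in the sentence
        for w in wordsIn(sentence):
            if rankOf(w) > rareRank:
                highlights[w] = 'magenta'
        # common words inside the clause
        for w in wordsIn(clause):
            if rankOf(w) < commonRank:
                highlights[w] = 'cyan'
    return highlights


# Hypothetical toy corpus: sentence 100 holds words 1-4, clause 200 holds words 1-2.
toyWords = {100: [1, 2, 3, 4], 200: [1, 2]}
toyRanks = {1: 5, 2: 40, 3: 300, 4: 900}

toyHighlights = buildHighlights([(100, 200)], toyWords.get, toyRanks.get)
print(toyHighlights)
```

Word 3 has a rank between the two thresholds, so it gets no color of its own.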

Now we look for infrequent sentences that end in a frequent word.
The `:=` operator requires that the word ends at the same slot as the sentence:

In [38]:
query = '''
infrequent otype=sentence
  := frequent otype=word
'''
results = A.search(query, sets=customSets)
A.table(results, start=1, end=10)

  1.46s 10793 results


n,p,sentence,word
1,Genesis 1:1,בְּרֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ׃,אָֽרֶץ׃
2,Genesis 1:2,וְר֣וּחַ אֱלֹהִ֔ים מְרַחֶ֖פֶת עַל־פְּנֵ֥י הַמָּֽיִם׃,מָּֽיִם׃
3,Genesis 1:6,יְהִ֥י רָקִ֖יעַ בְּתֹ֣וךְ הַמָּ֑יִם,מָּ֑יִם
4,Genesis 1:6,וִיהִ֣י מַבְדִּ֔יל בֵּ֥ין מַ֖יִם לָמָֽיִם׃,מָֽיִם׃
5,Genesis 1:9,יִקָּו֨וּ הַמַּ֜יִם מִתַּ֤חַת הַשָּׁמַ֨יִם֙ אֶל־מָקֹ֣ום אֶחָ֔ד,אֶחָ֔ד
6,Genesis 1:10,וַיִּקְרָ֨א אֱלֹהִ֤ים׀ לַיַּבָּשָׁה֙ אֶ֔רֶץ,אֶ֔רֶץ
7,Genesis 1:11,תַּֽדְשֵׁ֤א הָאָ֨רֶץ֙ דֶּ֔שֶׁא עֵ֚שֶׂב מַזְרִ֣יעַ זֶ֔רַע עֵ֣ץ פְּרִ֞י עֹ֤שֶׂה פְּרִי֙ לְמִינֹ֔ו אֲשֶׁ֥ר זַרְעֹו־בֹ֖ו עַל־הָאָ֑רֶץ,אָ֑רֶץ
8,Genesis 1:15,וְהָי֤וּ לִמְאֹורֹת֙ בִּרְקִ֣יעַ הַשָּׁמַ֔יִם לְהָאִ֖יר עַל־הָאָ֑רֶץ,אָ֑רֶץ
9,Genesis 1:22,וְהָעֹ֖וף יִ֥רֶב בָּאָֽרֶץ׃,אָֽרֶץ׃
10,Genesis 1:26,וְיִרְדּוּ֩ בִדְגַ֨ת הַיָּ֜ם וּבְעֹ֣וף הַשָּׁמַ֗יִם וּבַבְּהֵמָה֙ וּבְכָל־הָאָ֔רֶץ וּבְכָל־הָרֶ֖מֶשׂ הָֽרֹמֵ֥שׂ עַל־הָאָֽרֶץ׃,אָֽרֶץ׃


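As a check, we replace the custom set `frequent` by the ordinary type `word` with a rank condition.
A word is in `frequent` exactly when its own rank is below `COMMON_RANK` (100), so both queries should give the same results.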

In [39]:
query = '''
infrequent otype=sentence
  := word rank_lex<100
'''
results = A.search(query, sets=customSets)
A.table(results, start=1, end=10)

  1.03s 10793 results


n,p,sentence,word
1,Genesis 1:1,בְּרֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ׃,אָֽרֶץ׃
2,Genesis 1:2,וְר֣וּחַ אֱלֹהִ֔ים מְרַחֶ֖פֶת עַל־פְּנֵ֥י הַמָּֽיִם׃,מָּֽיִם׃
3,Genesis 1:6,יְהִ֥י רָקִ֖יעַ בְּתֹ֣וךְ הַמָּ֑יִם,מָּ֑יִם
4,Genesis 1:6,וִיהִ֣י מַבְדִּ֔יל בֵּ֥ין מַ֖יִם לָמָֽיִם׃,מָֽיִם׃
5,Genesis 1:9,יִקָּו֨וּ הַמַּ֜יִם מִתַּ֤חַת הַשָּׁמַ֨יִם֙ אֶל־מָקֹ֣ום אֶחָ֔ד,אֶחָ֔ד
6,Genesis 1:10,וַיִּקְרָ֨א אֱלֹהִ֤ים׀ לַיַּבָּשָׁה֙ אֶ֔רֶץ,אֶ֔רֶץ
7,Genesis 1:11,תַּֽדְשֵׁ֤א הָאָ֨רֶץ֙ דֶּ֔שֶׁא עֵ֚שֶׂב מַזְרִ֣יעַ זֶ֔רַע עֵ֣ץ פְּרִ֞י עֹ֤שֶׂה פְּרִי֙ לְמִינֹ֔ו אֲשֶׁ֥ר זַרְעֹו־בֹ֖ו עַל־הָאָ֑רֶץ,אָ֑רֶץ
8,Genesis 1:15,וְהָי֤וּ לִמְאֹורֹת֙ בִּרְקִ֣יעַ הַשָּׁמַ֔יִם לְהָאִ֖יר עַל־הָאָ֑רֶץ,אָ֑רֶץ
9,Genesis 1:22,וְהָעֹ֖וף יִ֥רֶב בָּאָֽרֶץ׃,אָֽרֶץ׃
10,Genesis 1:26,וְיִרְדּוּ֩ בִדְגַ֨ת הַיָּ֜ם וּבְעֹ֣וף הַשָּׁמַ֗יִם וּבַבְּהֵמָה֙ וּבְכָל־הָאָ֔רֶץ וּבְכָל־הָרֶ֖מֶשׂ הָֽרֹמֵ֥שׂ עַל־הָאָֽרֶץ׃,אָֽרֶץ׃
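
Equal result counts suggest, but do not prove, that both queries return the same results. A minimal sketch of a stricter check follows, assuming the two result lists have been kept under separate names. The names `withCustomSet` and `withRankCondition`, the helper `sameResults`, and the node numbers are all hypothetical; the notebook itself overwrites `results` on each run.

```python
def sameResults(first, second):
    """True if both result lists contain the same tuples, ignoring order."""
    return set(first) == set(second)


# Hypothetical (sentence, word) node pairs standing in for real search results.
withCustomSet = [(1001, 7), (1002, 15), (1003, 21)]
withRankCondition = [(1002, 15), (1001, 7), (1003, 21)]

print(sameResults(withCustomSet, withRankCondition))
```

Comparing as sets ignores the order of the tuples, which is enough here because the question is only whether both queries select the same sentence and word pairs.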


Note that no matter how expensive the construction of a set has been, once you have it, queries based on it are just as fast. There is no penalty for using custom sets instead of the familiar node types.

# Next

You have seen how to filter on feature values of nodes and of edges, and how to search within custom sets.

Next, we set up sets for real:
[sets](searchSets.ipynb)

---

[basic](search.ipynb)
advanced
[sets](searchSets.ipynb)
[relations](searchRelations.ipynb)
[quantifiers](searchQuantifiers.ipynb)
[rough](searchRough.ipynb)
[gaps](searchGaps.ipynb)