<img align="right" src="tf-small.png"/>

# Search

*Search* in Text-Fabric is a template-based way of looking for structural patterns in your dataset.

It is inspired by the idea of
[topographic query](http://books.google.nl/books?id=9ggOBRz1dO4C),
as worked out in 
[MQL](https://shebanq.ancient-data.org/shebanq/static/docs/MQL-Query-Guide.pdf)
which has been implemented in 
[Emdros](http://emdros.org).
See also [pitfalls of MQL](https://etcbc.github.io/text-fabric-data/features/hebrew/etcbc4c/0_mql.html).

Within Text-Fabric we have the unique possibility to combine the ease of formulating search templates for
complicated syntactical patterns with the power of programmatically processing the results.

This notebook will show you how to get up and running.

# Before we continue
Search is a big feature in Text-Fabric.
It is also a very recent addition.

##### Caution:
> There might be bugs.

Search is also costly.
Much of the work on implementing search has gone into optimizing performance.
But the search templates are very powerful, and can be very diverse.
I do not pretend to have found strategies that work optimally for all search templates.

That being said, I think search might turn out useful in many cases, and I welcome your feedback.

*Dirk Roorda, 2016-12-23*

# Search command

It all starts by saying (just an example):

```
S.study('''
# here comes my search template:

phrase det=und
    w1:word sp=verb gn=f nu=pl ps=p3
    w2:word sp=subs
    
  w1 < w2  
''')
```

See also the complete reference to the
[search template syntax](https://github.com/ETCBC/text-fabric/wiki/Api#search-template-syntax).

All search related things use the
[`S` api](https://github.com/ETCBC/text-fabric/wiki/Api#search).

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
from tf.fabric import Fabric

In [3]:
ETCBC = 'hebrew/etcbc4c'
TF = Fabric( modules=ETCBC )

This is Text-Fabric 1.2.7
Api reference : https://github.com/ETCBC/text-fabric/wiki/Api
Tutorial      : https://github.com/ETCBC/text-fabric/blob/master/docs/tutorial.ipynb
Data sources  : https://github.com/ETCBC/text-fabric-data
Data docs     : https://etcbc.github.io/text-fabric-data/features/hebrew/etcbc4c/0_overview.html
Shebanq docs  : https://shebanq.ancient-data.org/text
Slack team    : https://shebanq.slack.com/signup
Questions? Ask shebanq@ancient-data.org for an invite to Slack
107 features found and 0 ignored


Let us just *not* load any specific features.

In [4]:
api = TF.load('')
api.makeAvailableIn(globals())

  0.00s loading features ...
   |     0.00s Feature overview: 102 nodes; 4 edges; 1 configs; 6 computeds
  4.18s All features loaded/computed - for details use loadLog()


Here is a simple query to get started.
We are interested in two lexemes, but we would also like to fetch the nodes in their context.

##### Note
> This is not a very good use case, 
because in Text-Fabric it is easy to find context nodes around your nodes of interest.

In [5]:
query = '''
book
  chapter
    verse
      clause
        clause_atom
          phrase
            phrase_atom
              word lex=JC/|>JN/
'''

The next thing to do is to feed it to the search api, which will *study* it.
The syntax will be checked, features loaded, the search space will be set up, narrowed down, 
and the fetching of results will be prepared, but not yet executed.

In [6]:
S.study(query)

  0.00s Checking search template ...
  0.00s loading features ...
   |     0.21s B lex                  from /Users/dirk/github/text-fabric-data/hebrew/etcbc4c
  0.21s All additional features loaded - for details use loadLog()
  0.21s Setting up search space for 8 objects ...
  0.94s Constraining search space with 7 relations ...
  0.99s Setting up retrieval plan ...
  1.01s Ready to deliver results from 5870 nodes
Iterate over S.fetch() to get the results
See S.showPlan() to interpret the results


Before we rush to the results, let's have a look at the *plan*.

In [7]:
S.showPlan()

  7.07s The results are connected to the original search template as follows:
 0     
 1 R0  book
 2 R1    chapter
 3 R2      verse
 4 R3        clause
 5 R4          clause_atom
 6 R5            phrase
 7 R6              phrase_atom
 8 R7                word lex=JC/|>JN/
 9     


Here you see already what your results will look like.
Each result `r` is a *tuple* of nodes:
```
(R0, R1, R2, R3, R4, R5, R6, R7)
```
that instantiate the objects in your template.
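
Because each result is a plain tuple, its positions can be given names in ordinary Python. The sketch below is only an illustration: the `Result` type is a hypothetical helper, not part of Text-Fabric, and the node numbers are copied from the first row of the `S.fetch(amount=10)` output shown later in this notebook.

```python
from collections import namedtuple

# Hypothetical helper type: one field per object in the template,
# in the order of the plan (R0 .. R7).
Result = namedtuple(
    'Result',
    'book chapter verse clause clause_atom phrase phrase_atom word',
)

# Node numbers copied from the first fetched result in this notebook.
r = Result(1367552, 1368104, 1428265, 486532, 576266, 781925, 1045061, 299521)

print(r.word)   # the node that instantiates R7
print(r[0])     # positional access still works: the node for R0 (book)
```

Named fields make later code easier to read than bare indices such as `r[7]`.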

## Excursion
In case you are curious, you can get details about the search space as well:

In [8]:
S.showPlan(details=True)

Search with 8 objects and 7 relations
Results are instantiations of the following objects:
node  0-book                              (    39   choices)
node  1-chapter                           (   416   choices)
node  2-verse                             (   799   choices)
node  3-clause                            (   922   choices)
node  4-clause_atom                       (   922   choices)
node  5-phrase                            (   923   choices)
node  6-phrase_atom                       (   923   choices)
node  7-word                              (   926   choices)
Instantiations are computed along the following relations:
node                      0-book          (    39   choices)
edge  0-book          [[  1-chapter       (     9.5 choices)
edge  1-chapter       [[  2-verse         (     2.0 choices)
edge  2-verse         [[  3-clause        (     1.0 choices)
edge  3-clause        [[  4-clause_atom   (     1.0 choices)
edge  4-clause_atom   [[  5-phrase        (     1.0 choic

The part about the *nodes* shows you how many possible instantiations for each object in your template
have been found.
These are not results yet, because only combinations of instantiations
that satisfy all constraints are results.

The constraints come from the relations between the objects that you specified.
In this case, there are only implicit relations: those of embedding `[[`. 
Later on we'll examine all basic relations.

The part about the *edges* shows you the constraints, and in what order they will be computed
when stitching results together.
In this case the order is exactly the order in which the relations appear in the template,
but that will not always be the case.
Text-Fabric spends some time and ingenuity to find out an optimal *stitch plan*.

Nevertheless, fetching results may take time. 

For some queries, it can take a large amount of time to walk through all results.
Even worse, it may happen that it takes a large amount of time before getting the *first* result.

This has to do with search strategies on the one hand,
and the very likely possibility to encounter *pathological* search patterns,
which have billions of results, mostly unintended.
For example, a simple query that asks for 5 words in the Hebrew Bible without further constraints
will have 425,000 to the power of 5 results.
That is roughly 10^28 (a one with 28 zeroes),
about the number of molecules in a few hundred cubic metres of air.
That may not sound like much, but it is 10,000 times the number of bytes
that can currently be stored on the whole internet.
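
The size of that number is easy to check with Python's arbitrary-precision integers. The word count of 425,000 is the round figure used in the text, not the exact size of the corpus.

```python
words = 425_000    # round figure used in the text
objects = 5        # five unconstrained word objects in the template

# Every word object can be instantiated by every word, independently.
combinations = words ** objects

print(combinations)            # the exact integer
print(f'{combinations:.2e}')   # the same number in scientific notation
```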

Text-Fabric search is not yet done with finding optimal search strategies,
and I hope to refine its arsenal of methods in the future, depending on what you report.

## Back to business
It is always a good idea to get a feel for the amount of results, before you dive into them head-on.

In [9]:
S.count(progress=1, limit=5)

  0.00s Counting results per 1 up to 5 ...
   |     0.00s 1
   |     0.00s 2
   |     0.00s 3
   |     0.01s 4
   |     0.01s 5
  0.01s Done: 5 results


We asked for 5 results in total, with a progress message for every one.
That was a bit conservative.

In [10]:
S.count(progress=100, limit=500)

  0.00s Counting results per 100 up to 500 ...
   |     0.02s 100
   |     0.05s 200
   |     0.07s 300
   |     0.12s 400
   |     0.15s 500
  0.15s Done: 500 results


Still pretty quick. Now we want to count all results.

In [11]:
S.count(progress=200, limit=-1)

  0.00s Counting results per 200 up to  the end of the results ...
   |     0.05s 200
   |     0.09s 400
   |     0.15s 600
   |     0.19s 800
  0.20s Done: 926 results


Now it is time to see something of those results.

In [12]:
S.fetch(amount=10)

((1367552, 1368104, 1428265, 486532, 576266, 781925, 1045061, 299521),
 (1367553, 1368106, 1428282, 486600, 576336, 782114, 1045258, 299826),
 (1367553, 1368107, 1428301, 486694, 576432, 782371, 1045520, 300192),
 (1367553, 1368108, 1428315, 486756, 576495, 782555, 1045709, 300482),
 (1367553, 1368108, 1428310, 486735, 576473, 782501, 1045653, 300389),
 (1367553, 1368109, 1428327, 486823, 576563, 782738, 1045900, 300764),
 (1367553, 1368111, 1428352, 486915, 576657, 782998, 1046167, 301139),
 (1367553, 1368111, 1428351, 486910, 576652, 782986, 1046155, 301122),
 (1367554, 1368113, 1428393, 487091, 576838, 783442, 1046621, 301774),
 (1367554, 1368113, 1428394, 487094, 576841, 783449, 1046628, 301781))

Not very informative.
Just a quick observation: look at the last column.
These are the result nodes for the `word` part in the query, indicated as `R7` by `showPlan()` before.
And indeed, they are all below 425,000, the number of words in the Hebrew Bible.
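
That observation can be checked mechanically. The rows below are copied from the output above. The upper bound 426581 is the number of choices that `S.showPlan(details=True)` reports for an unconstrained `word` further on in this notebook, and the sketch assumes that word nodes are numbered from 1 up to that count.

```python
# First three result tuples, copied from the S.fetch(amount=10) output.
rows = [
    (1367552, 1368104, 1428265, 486532, 576266, 781925, 1045061, 299521),
    (1367553, 1368106, 1428282, 486600, 576336, 782114, 1045258, 299826),
    (1367553, 1368107, 1428301, 486694, 576432, 782371, 1045520, 300192),
]

# Assumption: word nodes run from 1 to the number of words in the corpus.
MAX_WORD = 426581

# The last member of each tuple instantiates the word object (R7).
last_column = [row[-1] for row in rows]

print(last_column)
print(all(1 <= w <= MAX_WORD for w in last_column))
```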

Nevertheless, we want to glean a bit more information off them.

In [13]:
for r in S.fetch(amount=10):
    print(S.glean(r))

  Jonah 4:11 clause[אֲשֶׁ֣ר יֶשׁ־בָּ֡הּ ] clause_atom[אֲשֶׁ֣ר יֶשׁ־בָּ֡הּ ] phrase[יֶשׁ־] phrase_atom[יֶשׁ־] 
  Micah 2:1 clause[כִּ֥י יֶשׁ־לְאֵ֖ל יָדָֽם׃ ] clause_atom[כִּ֥י יֶשׁ־לְאֵ֖ל יָדָֽם׃ ] phrase[יֶשׁ־] phrase_atom[יֶשׁ־] 
  Micah 3:7 clause[כִּ֛י אֵ֥ין מַעֲנֵ֖ה אֱלֹהִֽים׃ ] clause_atom[כִּ֛י אֵ֥ין מַעֲנֵ֖ה אֱלֹהִֽים׃ ] phrase[אֵ֥ין ] phrase_atom[אֵ֥ין ] 
  Micah 4:9 clause[הֲמֶ֣לֶךְ אֵֽין־בָּ֗ךְ ] clause_atom[הֲמֶ֣לֶךְ אֵֽין־בָּ֗ךְ ] phrase[אֵֽין־] phrase_atom[אֵֽין־] 
  Micah 4:4 clause[וְאֵ֣ין מַחֲרִ֑יד ] clause_atom[וְאֵ֣ין מַחֲרִ֑יד ] phrase[אֵ֣ין ] phrase_atom[אֵ֣ין ] 
  Micah 5:7 clause[וְאֵ֥ין מַצִּֽיל׃ ] clause_atom[וְאֵ֥ין מַצִּֽיל׃ ] phrase[אֵ֥ין ] phrase_atom[אֵ֥ין ] 
  Micah 7:2 clause[וְיָשָׁ֥ר בָּאָדָ֖ם ...] clause_atom[וְיָשָׁ֥ר בָּאָדָ֖ם ...] phrase[אָ֑יִן ] phrase_atom[אָ֑יִן ] 
  Micah 7:1 clause[אֵין־אֶשְׁכֹּ֣ול ] clause_atom[אֵין־אֶשְׁכֹּ֣ול ] phrase[אֵין־] phrase_atom[אֵין־] 
  Nahum 2:9 clause[וְאֵ֥ין מַפְנֶֽה׃ ] clause_atom[וְאֵ֥ין מַפְנֶֽה׃ ] phrase[אֵ֥

##### Caution
> It is not possible to do `len(S.fetch())`,
because it is a *generator*, not a list.
It will deliver a result every time it is asked, for as long as there are results,
but it does not know in advance how many there will be.

> Fetching a result can be costly because, due to the constraints, a lot of possibilities
may have to be tried and rejected before the next result is found.

> That is why you often see results coming in at varying speeds when counting them.
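
The same behaviour can be reproduced with any Python generator. In the sketch below `fake_fetch` is a hypothetical stand-in for `S.fetch()`; it only serves to show why `len()` fails and how `itertools.islice` takes a bounded number of results without walking through all of them.

```python
from itertools import islice


def fake_fetch():
    """Hypothetical stand-in for S.fetch(): yields result tuples one at a time."""
    for n in range(1, 927):   # 926 results, the count reported above
        yield (n,)


results = fake_fetch()

try:
    len(results)
except TypeError as err:
    # Generators do not know their length in advance.
    print('no len():', err)

first_ten = list(islice(results, 10))   # consumes only ten results
remaining = sum(1 for _ in results)     # counting means walking through the rest

print(len(first_ten), remaining)
```

Note that a generator is consumed as you go: after taking the first ten results, only the rest is left to iterate over.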

This search template has some pretty tight constraints on one of its objects,
so the amount of data to be dealt with is pretty limited.

Let us turn to a template where this is not so.

In [14]:
query = '''
# test
# verse book=Genesis chapter=2 verse=25
verse
  clause
                                 
    p1:phrase
        w1:word
        w3:word
        w1 < w3

    p2:phrase
        w2:word
        w1 < w2 
        w3 > w2
    
    p1 # p2   
'''

A couple of remarks.

* some objects have got a name
* there are additional relations specified between named objects
* `<` means: *comes before*, and `>`: *comes after*
* `#` means: *is not the same thing*
* later on we describe those relations in more detail

##### Note on order
> Look at the words `w1` and `w3` below phrase `p1`.
Although in the template `w1` comes before `w3`, this is not
translated into a search constraint of the same nature.

> Order between objects in a template is never significant, only embedding is.

Because order is not significant, you have to specify order relations yourself.

It turns out that this is better than the other way around.
In MQL order *is* significant, and it is very difficult to 
search for `w1` and `w2` in any order.

##### Note on gaps
> Look at the phrases `p1` and `p2`.
We do not specify an order here, only that they are different.
There are many spatial relationships possible between different objects.
In many cases, neither the one comes before the other, nor vice versa.
They can overlap, one can occur in a gap of the other, they can be completely disjoint
and interleaved, etc.

> When we use `#`, we say: it does not matter what their spatial configuration is,
as long as they are not the same.

In [15]:
S.study(query)

  0.00s Checking search template ...
  0.00s Setting up search space for 7 objects ...
  0.51s Constraining search space with 10 relations ...
  0.54s Setting up retrieval plan ...
  0.60s Ready to deliver results from 1897304 nodes
Iterate over S.fetch() to get the results
See S.showPlan() to interpret the results


That was quick!
Well, Text-Fabric knows that narrowing down the search space in this case would take ages,
without resulting in a significantly shrunken space.
So it skips doing so for most constraints.

Let us see the plan, with details.

In [16]:
S.showPlan(details=True)

Search with 7 objects and 10 relations
Results are instantiations of the following objects:
node  0-verse                             ( 23213   choices)
node  1-clause                            ( 88000   choices)
node  2-phrase                            (253174   choices)
node  3-word                              (426581   choices)
node  4-word                              (426581   choices)
node  5-phrase                            (253174   choices)
node  6-word                              (426581   choices)
Instantiations are computed along the following relations:
node                      0-verse         ( 23213   choices)
edge  0-verse         [[  1-clause        (     3.4 choices)
edge  1-clause        [[  5-phrase        (     2.4 choices)
edge  5-phrase        [[  6-word          (     1.4 choices)
edge  1-clause        [[  2-phrase        (     3.3 choices)
edge  2-phrase        #   5-phrase        (252920.8 choices)
edge  2-phrase        [[  4-word          (     1.7 choi

As you see, we have a hefty search space here.
Let us play with the `count()` function.

In [17]:
S.count(progress=10, limit=100)

  0.00s Counting results per 10 up to 100 ...
   |     0.21s 10
   |     0.21s 20
   |     0.21s 30
   |     0.26s 40
   |     0.26s 50
   |     0.27s 60
   |     0.30s 70
   |     0.30s 80
   |     0.30s 90
   |     0.30s 100
  0.31s Done: 100 results


We can be bolder than this!

In [18]:
S.count(progress=100, limit=1000)

  0.00s Counting results per 100 up to 1000 ...
   |     0.26s 100
   |     0.32s 200
   |     0.32s 300
   |     0.59s 400
   |     0.74s 500
   |     0.76s 600
   |     0.93s 700
   |     1.11s 800
   |     1.34s 900
   |     1.47s 1000
  1.47s Done: 1000 results


Ok, not too bad, but note that it takes a big fraction of a second to get just 100 results.

Now let us go for all of them by the thousand.

In [19]:
S.count(progress=1000, limit=-1)

  0.00s Counting results per 1000 up to  the end of the results ...
   |     1.37s 1000
   |     2.23s 2000
   |     3.77s 3000
   |     4.47s 4000
   |     5.80s 5000
   |     7.41s 6000
   |       11s 7000
    13s Done: 7512 results


See? This is substantial work.

In [20]:
for r in S.fetch(amount=10):
    print(S.glean(r))

Genesis 2:25 clause[וַיִּֽהְי֤וּ שְׁנֵיהֶם֙ עֲרוּמִּ֔ים הָֽ...] phrase[שְׁנֵיהֶם֙ הָֽאָדָ֖ם וְאִשְׁתֹּ֑ו ]   phrase[עֲרוּמִּ֔ים ] 
Genesis 2:25 clause[וַיִּֽהְי֤וּ שְׁנֵיהֶם֙ עֲרוּמִּ֔ים הָֽ...] phrase[שְׁנֵיהֶם֙ הָֽאָדָ֖ם וְאִשְׁתֹּ֑ו ]   phrase[עֲרוּמִּ֔ים ] 
Genesis 2:25 clause[וַיִּֽהְי֤וּ שְׁנֵיהֶם֙ עֲרוּמִּ֔ים הָֽ...] phrase[שְׁנֵיהֶם֙ הָֽאָדָ֖ם וְאִשְׁתֹּ֑ו ]   phrase[עֲרוּמִּ֔ים ] 
Genesis 2:25 clause[וַיִּֽהְי֤וּ שְׁנֵיהֶם֙ עֲרוּמִּ֔ים הָֽ...] phrase[שְׁנֵיהֶם֙ הָֽאָדָ֖ם וְאִשְׁתֹּ֑ו ]   phrase[עֲרוּמִּ֔ים ] 
Genesis 4:4 clause[וְהֶ֨בֶל הֵבִ֥יא גַם־ה֛וּא ...] phrase[הֶ֨בֶל גַם־ה֛וּא ]   phrase[הֵבִ֥יא ] 
Genesis 4:4 clause[וְהֶ֨בֶל הֵבִ֥יא גַם־ה֛וּא ...] phrase[הֶ֨בֶל גַם־ה֛וּא ]   phrase[הֵבִ֥יא ] 
Genesis 10:21 clause[גַּם־ה֑וּא אֲבִי֙ כָּל־בְּנֵי־...] phrase[גַּם־ה֑וּא אֲחִ֖י יֶ֥פֶת הַ...]   phrase[אֲבִי֙ כָּל־בְּנֵי־עֵ֔בֶר ] 
Genesis 10:21 clause[גַּם־ה֑וּא אֲבִי֙ כָּל־בְּנֵי־...] phrase[גַּם־ה֑וּא אֲחִ֖י יֶ֥פֶת הַ...]   phrase[אֲבִי֙ כָּל־בְּנֵי־עֵ֔בֶר ] 
Genesis 10:21 cl

By the way, here is some code that looks for basically the same phenomenon: a phrase within the
gap of another phrase.
It does not use search, and it gets a bit more focused results, in much less time.

##### Hint
> If you are comfortable with programming, and what you look for is fairly generic,
you are better off without search, provided you can translate your insight in the
data into an effective procedure within Text-Fabric.

In [21]:
info('Getting gapped phrases')
results = []
for v in F.otype.s('verse'):
    for c in L.d(v, otype='clause'):
        ps = L.d(c, otype='phrase')
        for p in ps:
            words = L.d(p, otype='word')
            # first and last word of phrase p
            (bp, ep) = (words[0], words[-1])
            for q in ps:
                if p == q:
                    continue
                # first word of the other phrase q
                bq = L.d(q, otype='word')[0]
                # q starts strictly between the first and last word of p
                if bp < bq < ep:
                    results.append((v, c, p, q, bp, bq, ep))
info('{} results'.format(len(results)))
for r in results[0:10]:
    print(r)

    44s Getting gapped phrases
    49s 369 results
(1413737, 426799, 605793, 605794, 1159, 1160, 1164)
(1413765, 426921, 606150, 606151, 1720, 1721, 1723)
(1413937, 427418, 607746, 607747, 4819, 4821, 4828)
(1413997, 427601, 608322, 608323, 5803, 5805, 5809)
(1414001, 427616, 608369, 608370, 5868, 5869, 5875)
(1414034, 427723, 608705, 608706, 6515, 6521, 6530)
(1414086, 427917, 609286, 609287, 7431, 7432, 7437)
(1414143, 428159, 609997, 609998, 8502, 8507, 8520)
(1414143, 428159, 609997, 609999, 8502, 8508, 8520)
(1414172, 428286, 610379, 610380, 9127, 9129, 9133)


But we can use the pretty printing of `glean()` here as well!

In [22]:
for r in results[0:10]:
    print(S.glean(r))

Genesis 2:25 clause[וַיִּֽהְי֤וּ שְׁנֵיהֶם֙ עֲרוּמִּ֔ים הָֽ...] phrase[שְׁנֵיהֶם֙ הָֽאָדָ֖ם וְאִשְׁתֹּ֑ו ] phrase[עֲרוּמִּ֔ים ]   
Genesis 4:4 clause[וְהֶ֨בֶל הֵבִ֥יא גַם־ה֛וּא ...] phrase[הֶ֨בֶל גַם־ה֛וּא ] phrase[הֵבִ֥יא ]   
Genesis 10:21 clause[גַּם־ה֑וּא אֲבִי֙ כָּל־בְּנֵי־...] phrase[גַּם־ה֑וּא אֲחִ֖י יֶ֥פֶת הַ...] phrase[אֲבִי֙ כָּל־בְּנֵי־עֵ֔בֶר ]   
Genesis 12:17 clause[וַיְנַגַּ֨ע יְהוָ֧ה׀ אֶת־פַּרְעֹ֛ה ...] phrase[אֶת־פַּרְעֹ֛ה וְאֶת־בֵּיתֹ֑ו ] phrase[נְגָעִ֥ים גְּדֹלִ֖ים ]   
Genesis 13:1 clause[וַיַּעַל֩ אַבְרָ֨ם מִמִּצְרַ֜יִם ...] phrase[אַבְרָ֨ם ה֠וּא וְאִשְׁתֹּ֧ו וְ...] phrase[מִמִּצְרַ֜יִם ]   
Genesis 14:16 clause[וְגַם֩ אֶת־לֹ֨וט אָחִ֤יו ...] phrase[גַם֩ אֶת־לֹ֨וט אָחִ֤יו וּ...] phrase[הֵשִׁ֔יב ]   
Genesis 17:7 clause[לִהְיֹ֤ות לְךָ֙ לֵֽאלֹהִ֔ים ...] phrase[לְךָ֙ וּֽלְזַרְעֲךָ֖ אַחֲרֶֽיךָ׃ ] phrase[לֵֽאלֹהִ֔ים ]   
Genesis 19:4 clause[וְאַנְשֵׁ֨י הָעִ֜יר אַנְשֵׁ֤י ...] phrase[אַנְשֵׁ֨י הָעִ֜יר אַנְשֵׁ֤י סְדֹם֙ ...] phrase[נָסַ֣בּוּ ]   
Genesis 19:4 clause[וְאַנְשֵׁ

# Testing basic relations

Basic relations are about the identity and spatial ordering of objects.
Are they the same, do they occupy the same slots, do they overlap, is one embedded in the other,
does one come before the other?

We also have edge features, that specify relationships between nodes.

Although the basic relationships are easy to define, and even easy to implement,
they may be very costly to use. 
When searching, most of them have to be computed very many times.

Some of them have been precomputed and stored in an index, e.g. the embedding relationships.
They can be used without penalty.

Other relations are not suitable for precomputing: most inequality relations are of that kind.
It would require an enormous amount of storage to precompute for each node the set of nodes that
occupy different slots. This type of relation will not be used in narrowing down the search space,
which means that it may take more time to get the results.
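
To make the slot-based relations concrete, here is a sketch that models each object as a Python set of slot numbers. It is not how Text-Fabric implements these relations; the function names and the example objects are made up for illustration. The `==` example further on, where sentences embedded in a verse also occupy the same slots as that verse, suggests that embedding allows equal slot sets, and the sketch assumes so.

```python
def same_slots(a, b):
    """a == b : both objects occupy exactly the same slots."""
    return a == b


def overlap(a, b):
    """a && b : the objects share at least one slot."""
    return bool(a & b)


def different_slots(a, b):
    """a ## b : the slot sets are not identical."""
    return a != b


def disjoint(a, b):
    """a || b : the objects have no slot in common."""
    return not (a & b)


def embeds(a, b):
    """a [[ b : every slot of b is also a slot of a (assumed to allow equality)."""
    return b <= a


# Hypothetical objects, given as sets of slot numbers.
clause = {1, 2, 3, 4, 5}
phrase = {2, 3}
other = {5, 6}

print(embeds(clause, phrase))           # phrase lies inside clause
print(overlap(clause, other))           # they share slot 5
print(disjoint(phrase, other))          # no common slot
print(different_slots(clause, phrase))  # a very loose condition
```

Each test is cheap for a single pair of objects; the cost in a search comes from the number of pairs that have to be tried.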

We are going to test all of our basic relationships here.

Let us first see what relations we have:

In [23]:
print(S.relationLegend)

                      = left equal to right (as node)
                      # left unequal to right (as node)
                      < left before right (in canonical node ordering)
                      > left after right (in canonical node ordering)
                     == left occupies same slots as right
                     && left has overlapping slots with right
                     ## left and right do not have the same slot set
                     || left and right do not have common slots
                     [[ left embeds right
                     ]] left embedded in right
                     << left completely before right
                     >> left completely after right
-distributional_parent> edge feature "distributional_parent"
<distributional_parent- edge feature "distributional_parent" (opposite direction)
    -functional_parent> edge feature "functional_parent"
    <functional_parent- edge feature "functional_parent" (opposite direction)
               -mother> 

# = (equal as node)

The `=` means that both parts are the same node. Left and right are not two things with similar properties,
no, they are one and the same thing.

Useful if the thing you search for is part of two wildly different patterns.

In [24]:
query = '''
v1:verse
  sentence
    clause rela=Objc
      phrase
        word sp=verb gn=f nu=pl
v2:verse
  sentence
    c1:clause
    c2:clause
    c3:clause
    c1 < c2
    c2 < c3
v1 = v2
'''
S.study(query)
S.showPlan()
for r in S.fetch(amount=10):
    print(S.glean(r))

  0.00s Checking search template ...
  0.00s loading features ...
   |     0.18s B gn                   from /Users/dirk/github/text-fabric-data/hebrew/etcbc4c
   |     0.15s B nu                   from /Users/dirk/github/text-fabric-data/hebrew/etcbc4c
   |     0.24s B rela                 from /Users/dirk/github/text-fabric-data/hebrew/etcbc4c
   |     0.14s B sp                   from /Users/dirk/github/text-fabric-data/hebrew/etcbc4c
  0.72s All additional features loaded - for details use loadLog()
  0.72s Setting up search space for 10 objects ...
  1.59s Constraining search space with 11 relations ...
  1.96s Setting up retrieval plan ...
  2.00s Ready to deliver results from 327603 nodes
Iterate over S.fetch() to get the results
See S.showPlan() to interpret the results
  2.00s The results are connected to the original search template as follows:
 0     
 1 R0  v1:verse
 2 R1    sentence
 3 R2      clause rela=Objc
 4 R3        phrase
 5 R4          word sp=verb gn=f nu=pl
 6 R

# # (unequal as node)

`n # m` if `n` and `m` are not the same node.

If you write a template, and you know that the one should come before the other,
consider using `<` or `>`, which will constrain the results better.

We have seen this in action in the search for gapped phrases.

# < and > (canonical)

`n < m` if `n` comes before `m` in the
[canonical ordering](https://github.com/ETCBC/text-fabric/wiki/Api#sorting-nodes)
of nodes.

We have seen them in action before.
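
As a rough illustration of what such an ordering looks like, the sketch below sorts objects modelled as slot sets. It assumes the following rule, which should be checked against the linked documentation: an embedder comes before what it embeds, and otherwise the object that owns the smallest slot not shared with the other comes first. Tie-breaking between objects with identical slots is left out, and all names are hypothetical.

```python
from functools import cmp_to_key


def canonical_cmp(a, b):
    """Compare two (name, slots) pairs; a negative value means a comes first."""
    slots_a, slots_b = a[1], b[1]
    only_a = slots_a - slots_b
    only_b = slots_b - slots_a
    if not only_a and not only_b:
        return 0     # identical slots: the tie is not modelled in this sketch
    if not only_b:
        return -1    # a embeds b, so the embedder a comes first
    if not only_a:
        return 1     # b embeds a
    # Neither embeds the other: the smallest unshared slot decides.
    return -1 if min(only_a) < min(only_b) else 1


# Hypothetical objects with their slot sets, in arbitrary order.
objects = [
    ('phrase_b', {3}),
    ('gapped', {1, 4}),
    ('clause', {1, 2, 3}),
    ('sentence', {1, 2, 3, 4, 5}),
    ('phrase_a', {1, 2}),
]

ordered = [name for (name, _) in sorted(objects, key=cmp_to_key(canonical_cmp))]
print(ordered)
```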

# == (same slots)

Two objects are extensionally equal if they occupy exactly the same slots.

Quite an expensive relation, as you will see: 30 seconds for 3608 results.

In [25]:
query = '''
v:verse
    s:sentence
v == s
'''
S.study(query)
S.showPlan()
for r in S.fetch(amount=10):
    print(S.glean(r))
S.count(progress=1000, limit=10000)

  0.00s Checking search template ...
  0.00s Setting up search space for 2 objects ...
  0.04s Constraining search space with 2 relations ...
  0.91s Setting up retrieval plan ...
  0.94s Ready to deliver results from 7216 nodes
Iterate over S.fetch() to get the results
See S.showPlan() to interpret the results
  0.94s The results are connected to the original search template as follows:
 0     
 1 R0  v:verse
 2 R1      s:sentence
 3     v == s
 4     
Ecclesiastes 3:12 sentence[יָדַ֕עְתִּי כִּ֛י אֵ֥ין טֹ֖וב בָּ֑ם ...]
Ecclesiastes 3:13 sentence[וְגַ֤ם כָּל־הָאָדָם֙ ...]
Ecclesiastes 3:18 sentence[אָמַ֤רְתִּֽי אֲנִי֙ בְּלִבִּ֔י עַל־...]
Jeremiah 10:17 sentence[אִסְפִּ֥י מֵאֶ֖רֶץ כִּנְעָתֵ֑ךְ יֹשֶׁ֖בֶת ...]
Ecclesiastes 3:21 sentence[מִ֣י יֹודֵ֗עַ ר֚וּחַ בְּנֵ֣י הָ...]
Jeremiah 10:22 sentence[קֹ֤ול שְׁמוּעָה֙ הִנֵּ֣ה בָאָ֔ה וְ...]
Jeremiah 10:23 sentence[יָדַ֣עְתִּי יְהוָ֔ה כִּ֛י לֹ֥א לָ...]
Ecclesiastes 4:3 sentence[וְטֹוב֙ מִשְּׁנֵיהֶ֔ם אֵ֥ת ...]
Jeremiah 11:1 sentence[הַדָּבָר֙ אֲשֶ

# && (overlap)

Two objects overlap if and only if they share at least one slot.
This is quite costly to use in some cases.

In [26]:
query = '''
verse
    phrase
      s1:subphrase
      s2:subphrase
      s1 # s2
      s1 && s2
'''
S.study(query)
S.showPlan()
for r in S.fetch(amount=10):
    print(S.glean(r))

  0.00s Checking search template ...
  0.00s Setting up search space for 4 objects ...
  0.17s Constraining search space with 5 relations ...
  0.80s Setting up retrieval plan ...
  1.10s Ready to deliver results from 503971 nodes
Iterate over S.fetch() to get the results
See S.showPlan() to interpret the results
  1.10s The results are connected to the original search template as follows:
 0     
 1 R0  verse
 2 R1      phrase
 3 R2        s1:subphrase
 4 R3        s2:subphrase
 5           s1 # s2
 6           s1 && s2
 7     
Genesis 1:14 phrase[לְאֹתֹת֙ וּלְמֹ֣ועֲדִ֔ים ...] subphrase[לְאֹתֹת֙ וּלְמֹ֣ועֲדִ֔ים ] subphrase[לְאֹתֹת֙ ]
Genesis 1:14 phrase[לְאֹתֹת֙ וּלְמֹ֣ועֲדִ֔ים ...] subphrase[לְאֹתֹת֙ ] subphrase[לְאֹתֹת֙ וּלְמֹ֣ועֲדִ֔ים ]
Genesis 1:14 phrase[לְאֹתֹת֙ וּלְמֹ֣ועֲדִ֔ים ...] subphrase[לְמֹ֣ועֲדִ֔ים ] subphrase[לְאֹתֹת֙ וּלְמֹ֣ועֲדִ֔ים ]
Genesis 1:14 phrase[לְאֹתֹת֙ וּלְמֹ֣ועֲדִ֔ים ...] subphrase[לְאֹתֹת֙ וּלְמֹ֣ועֲדִ֔ים ] subphrase[לְמֹ֣ועֲדִ֔ים ]
Genesis 1:14 phrase[לְא

# ## (not the same slots)

True when the two objects in question do not occupy exactly the same set of slots.
This is a very loose relationship.

It is implemented, but not yet tested, and at the moment I do not have a clear use case for it.

# || (disjoint slots)

True when the two objects in question do not share any slots.
This is a rather loose relationship.

It is implemented, but not yet tested, and at the moment I do not have a clear use case for it.

# [[ and ]] (embedding)

`n [[ m` if object `n` embeds `m`.

`n ]] m` if object `n` lies embedded in `m`.

These relations are used implicitly in templates when there is indentation:

```
s:sentence
  p:phrase
    w1:word gn=f
    w2:word gn=m
```

implicitly states the following embeddings:

* `s [[ p`
* `p [[ w1`
* `p [[ w2`

We have seen these relations in action.
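
How indentation turns into embedding relations can be sketched in a few lines of Python. This is not Text-Fabric's template parser: `implied_embeddings` is a hypothetical helper that handles only lines of the form `name:type ...` and treats the nearest less-indented line above as the embedder.

```python
def implied_embeddings(template):
    """Return (embedder, embedded) name pairs implied by indentation."""
    stack = []   # (indent, name) of the currently open ancestors
    pairs = []
    for line in template.splitlines():
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip())
        name = line.strip().split(':', 1)[0]
        # Drop objects that are not ancestors of the current line.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        if stack:
            pairs.append((stack[-1][1], name))
        stack.append((indent, name))
    return pairs


template = '''
s:sentence
  p:phrase
    w1:word gn=f
    w2:word gn=m
'''

for (left, right) in implied_embeddings(template):
    print(f'{left} [[ {right}')
```

Siblings such as `w1` and `w2` get no relation between them: only embedding is derived from the layout, never order.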

# << and >> (before and after with slots)

These relations test whether one object comes before or after another, in the sense that the slots
occupied by the one object lie completely before or after the slots occupied by the other object.
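
A sketch with slot sets shows what "completely before" means, and why an object that sits in the gap of another is neither before nor after it. The function names and slot numbers are made up for illustration; this is not Text-Fabric's implementation.

```python
def completely_before(a, b):
    """a << b : the last slot of a precedes the first slot of b."""
    return max(a) < min(b)


def completely_after(a, b):
    """a >> b : the first slot of a follows the last slot of b."""
    return min(a) > max(b)


# Hypothetical objects, given as sets of slot numbers.
c1 = {1, 2, 3}
p = {4, 5}
c2 = {6, 7}

print(completely_before(c1, p), completely_after(c2, p))

# An object with a gap, and another object that fills the gap.
gapped = {1, 2, 6}
inner = {3, 4}

print(completely_before(gapped, inner), completely_after(gapped, inner))
```

For the gapped pair both tests fail, although the two objects have no slot in common.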

In [27]:
query = '''
verse
  sentence
    c1:clause
    p:phrase
    c2:clause
    c1 << p
    c2 >> p
'''
S.study(query)
S.showPlan()
for r in S.fetch(amount=10):
    print(S.glean(r))

  0.00s Checking search template ...
  0.00s Setting up search space for 5 objects ...
  0.17s Constraining search space with 6 relations ...
  0.18s Setting up retrieval plan ...
  0.19s Ready to deliver results from 515957 nodes
Iterate over S.fetch() to get the results
See S.showPlan() to interpret the results
  0.20s The results are connected to the original search template as follows:
 0     
 1 R0  verse
 2 R1    sentence
 3 R2      c1:clause
 4 R3      p:phrase
 5 R4      c2:clause
 6         c1 << p
 7         c2 >> p
 8     
Genesis 1:11 sentence[תַּֽדְשֵׁ֤א הָאָ֨רֶץ֙ דֶּ֔שֶׁא עֵ֚שֶׂב ...] clause[תַּֽדְשֵׁ֤א הָאָ֨רֶץ֙ דֶּ֔שֶׁא עֵ֚שֶׂב ...] phrase[עֹ֤שֶׂה ] clause[אֲשֶׁ֥ר זַרְעֹו־בֹ֖ו ]
Genesis 1:11 sentence[תַּֽדְשֵׁ֤א הָאָ֨רֶץ֙ דֶּ֔שֶׁא עֵ֚שֶׂב ...] clause[תַּֽדְשֵׁ֤א הָאָ֨רֶץ֙ דֶּ֔שֶׁא עֵ֚שֶׂב ...] phrase[פְּרִי֙ ] clause[אֲשֶׁ֥ר זַרְעֹו־בֹ֖ו ]
Genesis 1:11 sentence[תַּֽדְשֵׁ֤א הָאָ֨רֶץ֙ דֶּ֔שֶׁא עֵ֚שֶׂב ...] clause[תַּֽדְשֵׁ֤א הָאָ֨רֶץ֙ דֶּ֔שֶׁא עֵ֚שֶׂב ...] phrase[לְמִינֹ֔