We reproduce the MQL queries in the
[SHEBANQ tutorial by Bas Meeuse](https://github.com/ETCBC/Tutorials/blob/master/SHEBANQ%20tutorial%202021.pdf)
by means of Text-Fabric queries.

Here is the
[TF Query Guide](https://annotation.github.io/text-fabric/tf/about/searchusage.html).

In [3]:
from tf.app import use
from tf.core.helpers import project

In [4]:
VERSION = '2017'
# A = use('bhsa', hoist=globals(), version=VERSION)
A = use('bhsa:clone', checkout="clone", hoist=globals(), version=VERSION)

# Example 1

[Bas Meeuse: Example 1: Moses starts the speech](https://shebanq.ancient-data.org/hebrew/query?version=2017&id=4439)

```
[clause FOCUS
  [word lex = '>MR[' OR lex = 'DBR['] 
  [phrase function = Subj
    [word lex = 'MCH=/']
  ]
  ..
  [phrase function IN (Cmpl)
    [word lex = 'JHWH/' OR lex = '>LHJM/']
  ]
]

```

**Results**

results | verses | words
--- | --- | ---
8 | 8 | 42

In [46]:
query = """
clause
  word lex=>MR[|DBR[
  <: phrase function=Subj
    word lex=MCH=/
  < phrase function=Cmpl
    word lex=JHWH/|>LHJM/
"""

In [47]:
results = A.search(query)

  1.64s 8 results


We need to make some effort to count the distinct verses and focused words in the results.
The following function does it.
Focus is an iterable of positions in the result tuples that correspond to the `FOCUS` objects
in the MQL query.

In [48]:
def getTfVerses(results, focus):
    verses = set()
    words = set()

    for result in results:
        verse = L.u(result[0], otype="verse")[0]
        verses.add(verse)
        for pos in focus:
            focusNode = result[pos]
            if F.otype.v(focusNode) == "word":
                words.add(focusNode)
            else:
                for word in L.d(focusNode, otype="word"):
                    words.add(word)

    verses = sorted(verses)

    print(f"{len(verses):>3} verses")
    print(f"{len(words):>3} words")
    return verses

In [50]:
tfVerses = getTfVerses(results, (0,))

  8 verses
 42 words


**The numbers agree exactly**.

This is not a rigorous proof that the MQL results are identical to the TF results, but it is a very good indication.

In [51]:
A.table(results)

n,p,clause,word,phrase,word.1,phrase.1,word.2
1,Exodus 3:11,וַיֹּ֤אמֶר מֹשֶׁה֙ אֶל־הָ֣אֱלֹהִ֔ים,יֹּ֤אמֶר,מֹשֶׁה֙,מֹשֶׁה֙,אֶל־הָ֣אֱלֹהִ֔ים,אֱלֹהִ֔ים
2,Exodus 3:13,וַיֹּ֨אמֶר מֹשֶׁ֜ה אֶל־הָֽאֱלֹהִ֗ים,יֹּ֨אמֶר,מֹשֶׁ֜ה,מֹשֶׁ֜ה,אֶל־הָֽאֱלֹהִ֗ים,אֱלֹהִ֗ים
3,Exodus 4:10,וַיֹּ֨אמֶר מֹשֶׁ֣ה אֶל־יְהוָה֮,יֹּ֨אמֶר,מֹשֶׁ֣ה,מֹשֶׁ֣ה,אֶל־יְהוָה֮,יְהוָה֮
4,Exodus 19:23,וַיֹּ֤אמֶר מֹשֶׁה֙ אֶל־יְהוָ֔ה,יֹּ֤אמֶר,מֹשֶׁה֙,מֹשֶׁה֙,אֶל־יְהוָ֔ה,יְהוָ֔ה
5,Exodus 33:12,וַיֹּ֨אמֶר מֹשֶׁ֜ה אֶל־יְהוָ֗ה,יֹּ֨אמֶר,מֹשֶׁ֜ה,מֹשֶׁ֜ה,אֶל־יְהוָ֗ה,יְהוָ֗ה
6,Numbers 11:11,וַיֹּ֨אמֶר מֹשֶׁ֜ה אֶל־יְהוָ֗ה,יֹּ֨אמֶר,מֹשֶׁ֜ה,מֹשֶׁ֜ה,אֶל־יְהוָ֗ה,יְהוָ֗ה
7,Numbers 14:13,וַיֹּ֥אמֶר מֹשֶׁ֖ה אֶל־יְהוָ֑ה,יֹּ֥אמֶר,מֹשֶׁ֖ה,מֹשֶׁ֖ה,אֶל־יְהוָ֑ה,יְהוָ֑ה
8,Numbers 27:15,וַיְדַבֵּ֣ר מֹשֶׁ֔ה אֶל־יְהוָ֖ה,יְדַבֵּ֣ר,מֹשֶׁ֔ה,מֹשֶׁ֔ה,אֶל־יְהוָ֖ה,יְהוָ֖ה


We compute the verses and words involved in the results.

# Example 2

[Bas Meeuse: Example 2: FJM + prep. L](https://shebanq.ancient-data.org/hebrew/query?version=2017&id=4440)

```
[clause
  [word FOCUS lex = 'FJM[']
  ..
  [word FOCUS lex = "L"]
  [word lex <> '<JN/' AND lex <> 'PNH/']
]
```

**Results**

results | verses | words
--- | --- | ---
156 | 136 | 294

In [53]:
query = """
clause
  word lex=FJM[
  < word lex=L
  <: word lex#<JN/|PNH/
"""

In [54]:
results = A.search(query)

  1.50s 155 results


**N.B.:** one result fewer than in SHEBANQ.

In [55]:
tfVerses = getTfVerses(results, (1, 2))

135 verses
292 words


We show where the missing result should have been; in the appendix below we explain how we found it.

In [56]:
A.table(results, start=30, end=31)

n,p,clause,word,word.1,word.2
30,Joshua 7:19,שִֽׂים־נָ֣א כָבֹ֗וד לַֽיהוָ֛ה אֱלֹהֵ֥י יִשְׂרָאֵ֖ל,שִֽׂים־,לַֽ,יהוָ֛ה
31,Joshua 8:12,וַיָּ֨שֶׂם אֹותָ֜ם אֹרֵ֗ב בֵּ֧ין בֵּֽית־אֵ֛ל וּבֵ֥ין הָעַ֖י מִיָּ֥ם לָעִֽיר׃,יָּ֨שֶׂם,לָ,


Joshua 8:2 is missing!

We expand this verse in SHEBANQ:

![josh](images/josh.png)

We see that there is a gap in the clause, right after the word `L`, between words 116853 and 116858.
In MQL, the adjacency of objects is relative to the container they are in.
If the container has a gap, the words on either side of the gap are considered adjacent.

In this example it means that this part of the query:

```
  [word FOCUS lex = "L"]
  [word lex <> '<JN/' AND lex <> 'PNH/']
```

is matched by words 116853 and 116858.
And the MQL query considers those two words as adjacent *within the clause*.

In Text-Fabric, adjacency between words is absolute: it is not relative to a container object.
The Text-Fabric notion of adjacency is more crude. 
The reason is that in Text-Fabric, the query does not have to be a tree, where each object has a unique
immediate parent object. There could be several parent objects, and each of the parents may have different
gaps, and if we had the concept of relative adjacency, our query language would need a way to express that.

It does not, and to me it is an open question whether we should complicate search templates in that way.
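The two notions of adjacency can be contrasted in a small, self-contained sketch. This is a toy model in plain Python, not Text-Fabric or MQL code: a clause is represented as the list of word slots it contains, and the helper names `adjacent_absolute` and `adjacent_in_container` are made up for this illustration. Only the slot numbers 116853 and 116858 come from the example above; the other two are assumed.

```python
# Toy model: a clause is the list of word slots it contains.
# Only 116853 and 116858 come from the text; the other slots are assumptions.
clause_slots = [116852, 116853, 116858, 116859]


def adjacent_absolute(a, b):
    """Absolute adjacency: b is the very next slot after a in the corpus."""
    return b == a + 1


def adjacent_in_container(a, b, container):
    """Relative adjacency: b is the next slot after a among the container's slots."""
    slots = sorted(container)
    if a not in slots:
        return False
    i = slots.index(a)
    return i + 1 < len(slots) and slots[i + 1] == b


print(adjacent_absolute(116853, 116858))                    # False
print(adjacent_in_container(116853, 116858, clause_slots))  # True
```

In this model the gap makes the two words non-adjacent in the absolute sense, while they are neighbours relative to the clause.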

Anyway, as it stands, there is no obvious workaround to get exactly the same behaviour as the MQL query.

That said, we can try something that comes close:

We state that the `L` is not immediately followed by a word that is `<JN/` or `PNH/`.

In [57]:
query = """
clause
  word lex=FJM[
  < word lex=L
  /without/
  <: w3:word lex=<JN/|PNH/
  /-/
"""

In [58]:
results = A.search(query)

  0.77s 161 results


It turns out that we get more results than in SHEBANQ.
We first count the verses and words involved in the results.

In [59]:
tfVerses2 = getTfVerses(results, (1, 2))

139 verses
302 words


Here is a result that is not in SHEBANQ (see below how we found it).

In [60]:
A.table(results, start=3, end=3)

n,p,clause,word,word.1
3,Genesis 27:37,הֵ֣ן גְּבִ֞יר שַׂמְתִּ֥יו לָךְ֙,שַׂמְתִּ֥יו,לָךְ֙


The Genesis 27:37 result has something in common with the Joshua 8:2 result in SHEBANQ that we saw above: the `L` has a pronominal suffix.

Here it is also the last word in the clause.
So it seems to be an intended result of the MQL query.

Let's make a mental shift: what *is* the intention of the MQL query?
Here is a bit of query-exegesis, in that the query itself is the object of the exegesis.

The MQL query mentions three `[word]` objects, but it puts only the first two of them in `FOCUS`.
Two things stand out about the third one:

1. the query is not interested in its actual value;
2. it is constrained by a very loose restriction: it can be anything, except two specific values.

These two things point to the intended meaning of the query, namely

> find a clause with the word `FJM[`, and somewhere after that the word `L`, 
which is not followed by either the word `<JN/` or the word `PNH/`.

This differs subtly from what the query actually says:

> find a clause with the word `FJM[`, and somewhere after that the word `L`, 
which is followed by another word that is not `<JN/` and not `PNH/`.

The difference is one of *quantification*.

More schematically, the MQL states literally:

> there is a **word** after **a** that is not **b** and not **c**

But the intention is:

> for each **word** after **a** it is not **b** and not **c**
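The difference between the two readings shows up when `L` is the last word of a clause. The following is a minimal sketch in plain Python over toy clauses, given as lists of lexemes that are illustrative and not taken from the corpus; the function names are hypothetical, and gaps are ignored.

```python
EXCLUDED = {"<JN/", "PNH/"}


def literal_reading(lexemes, i):
    """There IS a next word, and it is not one of the excluded lexemes."""
    return i + 1 < len(lexemes) and lexemes[i + 1] not in EXCLUDED


def intended_reading(lexemes, i):
    """There is NO next word that is one of the excluded lexemes."""
    return not (i + 1 < len(lexemes) and lexemes[i + 1] in EXCLUDED)


# Toy clauses as lists of lexemes (illustrative only).
clause_ends_with_l = ["FJM[", "L"]
clause_l_then_other = ["FJM[", "L", "JHWH/"]
clause_l_then_excluded = ["FJM[", "L", "PNH/"]

for clause in (clause_ends_with_l, clause_l_then_other, clause_l_then_excluded):
    i = clause.index("L")
    print(clause, literal_reading(clause, i), intended_reading(clause, i))
```

Only the first toy clause separates the two readings: the literal reading rejects it because there is no third word at all, while the intended reading accepts it.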

MQL also has a concept of quantifier, a bit more limited than in TF, but it suffices in this case: `NOTEXIST`.
The query then becomes:

```
[clause
  [word FOCUS lex = 'FJM[']
  ..
  [word FOCUS lex = "L"]
  NOTEXIST [word lex = '<JN/' OR lex = 'PNH/']
]

```

See 
[Dirk Roorda: Example 2: not exist](https://shebanq.ancient-data.org/hebrew/query?version=2017&id=4467)

**Results**

results | verses | words
--- | --- | ---
160 | 138 | 300

Now we have one result more in Text-Fabric than in SHEBANQ.
That is 2 Samuel 14:7, see below.

In [61]:
A.table(results, start=58, end=58)

n,p,clause,word,word.1
58,2_Samuel 14:7,לְבִלְתִּ֧י שִׂים־לְאִישִׁ֛י שֵׁ֥ם וּשְׁאֵרִ֖ית עַל־פְּנֵ֥י הָאֲדָמָֽה׃ פ,שִׂים־,לְ


When we look it up in SHEBANQ we find this:

![sam](images/sam.png)

The thing here is that word 168188 is `PNH/`.
It turns out that the `NOTEXIST` operator in MQL quantifies over all words that *follow* from that position.

If `NOTEXIST [word properties]` meant that there is no word *at* that position with those properties, all would be well for our purposes.
But it means that there is no word *from* that position with those properties.
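The contrast between "at that position" and "from that position" can be sketched as follows. This is a toy model in plain Python that only mirrors the behaviour described above; it is not an implementation of MQL. The function names are hypothetical, and the lexemes `X/` and `Y/` are placeholders.

```python
EXCLUDED = {"<JN/", "PNH/"}


def none_at_next_position(lexemes, i):
    """No excluded lexeme directly after position i (what we wanted)."""
    return not any(lex in EXCLUDED for lex in lexemes[i + 1 : i + 2])


def none_from_next_position(lexemes, i):
    """No excluded lexeme anywhere after position i (the behaviour observed above)."""
    return not any(lex in EXCLUDED for lex in lexemes[i + 1 :])


# Toy clause: `L`, two placeholder words, and `PNH/` further on.
clause = ["FJM[", "L", "X/", "Y/", "PNH/"]
i = clause.index("L")

print(none_at_next_position(clause, i))    # True
print(none_from_next_position(clause, i))  # False
```

A clause of this shape is accepted under the "at" reading and rejected under the "from" reading, which is the pattern seen in 2 Samuel 14:7.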

So it turns out: nice idea, but it does not work out in MQL.

Now the tide has turned: we have trouble in MQL to find a query that exactly matches our intention, while in TF we can.

Still, there might be problems.

If there is a clause, with `L`, then a gap, and then either `<JN/` or `PNH/`,
the SHEBANQ query would skip it, but the Text-Fabric query would include it.

Let's check in Text-Fabric whether this occurs.

In [62]:
query = """
clause
  clause_atom
    word lex=L
    :=
  < clause_atom
    =: word lex=<JN/|PNH/
"""

In [63]:
results = A.search(query)

  0.87s 0 results


Nope.

But is this query itself right?
Let's look for a known case, namely Joshua 8:2 above.

In [64]:
query = """
clause
  clause_atom
    word lex=L
    :=
  < clause_atom
    =: word lex=MN
"""

In [65]:
results = A.search(query)

  0.88s 2 results


In [66]:
A.table(results)

n,p,clause,clause_atom,word,clause_atom.1,word.1
1,Joshua 8:2,שִׂים־לְךָ֥ מֵאַחֲרֶֽיהָ׃,שִׂים־לְךָ֥,לְךָ֥,מֵאַחֲרֶֽיהָ׃,מֵ
2,Hosea 10:15,כָּ֗כָה עָשָׂ֤ה לָכֶם֙ מִפְּנֵ֖י רָעַ֣ת רָֽעַתְכֶ֑ם,כָּ֗כָה עָשָׂ֤ה לָכֶם֙,לָכֶם֙,מִפְּנֵ֖י רָעַ֣ת רָֽעַתְכֶ֑ם,מִ


Yes, this kind of query finds exactly what we are looking for.

**Conclusion**

In Text-Fabric we have found a query with slightly different results.
But these results match the intention of the query just a bit better than the original query.

We tried to improve the MQL query by using `NOTEXIST`, but that did not work out.

However, the TF query might include (contrived) cases that the MQL query would rightfully skip. 
We can verify whether those cases actually exist by running a separate TF query, and it turns out they do not exist.

**Lesson**

Whenever an exegesis hinges on the results of a query, check and double check.
You probably will have to run multiple queries in SHEBANQ and combine the results.
This will quickly get very cumbersome.
If that happens, it starts to pay off to use Text-Fabric, where you have more complete power over 
the computations and their results.

**Appendix**

How did we spot the differences between the results of the MQL query and the TF query?

We just fetched the results from SHEBANQ by means of its CSV export and compared them with the TF results,
as follows.

Read the CSV file, and convert the book/chapter/verse labels to verse nodes in TF.

In [67]:
def getShebanqVerses(fileName):
    sVerses = set()
    with open(fileName) as fh:
        for line in fh:
            fields = line.rstrip("\n").split("\t")[0:3]
            sVerses.add(T.nodeFromSection((fields[0], int(fields[1]), int(fields[2])), lang="la"))

    sVerses = sorted(sVerses)
    print(f"{len(sVerses)} verses in SHEBANQ")
    return sVerses

In [68]:
sVerses = getShebanqVerses("mql-ex2.tsv")

136 verses in SHEBANQ


In [69]:
sVerses2 = getShebanqVerses("mql-ex2dr.tsv")

138 verses in SHEBANQ


Compare the two lists, and find the first discrepancy.

In [70]:
def firstDifference(leftVerses, rightVerses):
    equal = True
    
    for (i, lv) in enumerate(leftVerses):
        leftVerse = T.sectionFromNode(lv)
        if i < len(rightVerses):
            rv = rightVerses[i]
            if lv == rv:
                continue
            rightVerse = T.sectionFromNode(rv)
            print(f"DIFFERENCE:\n{leftVerse}\n{rightVerse}")
            equal = False
            break
        else:
            print(f"DIFFERENCE:\n{leftVerse}\nno verses left")
            equal = False
            break
    if equal and len(leftVerses) < len(rightVerses):
        rv = rightVerses[len(leftVerses)]
        rightVerse = T.sectionFromNode(rv)
        print(f"DIFFERENCE:\nno verses left\n{rightVerse}")
        equal = False
    if equal:
        print("EQUAL")
    return equal

Find the first difference between the original MQL query and the first TF query:

In [71]:
firstDifference(sVerses, tfVerses)

DIFFERENCE:
('Joshua', 8, 2)
('Joshua', 8, 12)


False

Find the first difference between the original MQL query and the second TF query (with the quantifier):

In [72]:
firstDifference(sVerses, tfVerses2)

DIFFERENCE:
('Genesis', 43, 32)
('Genesis', 27, 37)


False

Find the first difference between the improved MQL query and the second TF query (with the quantifier):

In [73]:
firstDifference(sVerses2, tfVerses2)

DIFFERENCE:
('2_Samuel', 23, 5)
('2_Samuel', 14, 7)


False
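`firstDifference` stops at the first discrepancy. To list all discrepancies at once, a set-based comparison is a possible alternative. The following is a minimal sketch with a hypothetical helper `allDifferences`; it uses toy section tuples in place of verse nodes so that it runs without a loaded corpus.

```python
def allDifferences(leftVerses, rightVerses):
    """Return (items only in left, items only in right) as sorted lists."""
    left = set(leftVerses)
    right = set(rightVerses)
    return sorted(left - right), sorted(right - left)


# Toy data: section tuples stand in for verse nodes.
shebanq = [("Joshua", 7, 19), ("Joshua", 8, 2), ("Joshua", 8, 12)]
textFabric = [("Joshua", 7, 19), ("Joshua", 8, 12)]

onlyLeft, onlyRight = allDifferences(shebanq, textFabric)
print(onlyLeft)   # [('Joshua', 8, 2)]
print(onlyRight)  # []
```

With the real data one would presumably pass the verse node lists, such as `sVerses` and `tfVerses`, and convert the resulting nodes with `T.sectionFromNode` for display.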

# Example 3

[Bas Meeuse: Example 3: Whose people?](https://shebanq.ancient-data.org/hebrew/query?version=2017&id=4441)

```
[phrase_atom FOCUS
  [word AS P sp = prps]
  ..
  [word lex = "W" OR lex = ">W"]
  ..
  [word prs !~ "a" AND prs_ps = P.ps]
]
```

**Results**

results | verses | words
--- | --- | ---
308 | 150 | 685

In [74]:
query = """
phrase_atom
  p:word sp=prps
  < word lex=W|>W
  < w:word prs#a
  
p .ps=prs_ps. w
"""

In [75]:
results = A.search(query)

  1.63s 308 results


In [76]:
tfVerses = getTfVerses(results, (0,))

150 verses
685 words


**The numbers agree exactly**.

In [77]:
A.table(results, end=5)

n,p,phrase_atom,word,word.1,word.2
1,Genesis 6:18,אַתָּ֕ה וּבָנֶ֛יךָ וְאִשְׁתְּךָ֥ וּנְשֵֽׁי־בָנֶ֖יךָ,אַתָּ֕ה,וּ,בָנֶ֛יךָ
2,Genesis 6:18,אַתָּ֕ה וּבָנֶ֛יךָ וְאִשְׁתְּךָ֥ וּנְשֵֽׁי־בָנֶ֖יךָ,אַתָּ֕ה,וּ,אִשְׁתְּךָ֥
3,Genesis 6:18,אַתָּ֕ה וּבָנֶ֛יךָ וְאִשְׁתְּךָ֥ וּנְשֵֽׁי־בָנֶ֖יךָ,אַתָּ֕ה,וּ,בָנֶ֖יךָ
4,Genesis 6:18,אַתָּ֕ה וּבָנֶ֛יךָ וְאִשְׁתְּךָ֥ וּנְשֵֽׁי־בָנֶ֖יךָ,אַתָּ֕ה,וְ,אִשְׁתְּךָ֥
5,Genesis 6:18,אַתָּ֕ה וּבָנֶ֛יךָ וְאִשְׁתְּךָ֥ וּנְשֵֽׁי־בָנֶ֖יךָ,אַתָּ֕ה,וְ,בָנֶ֖יךָ


# Example 4

[Wido van Peursen: Judges 5.1 (Sample query)](https://shebanq.ancient-data.org/hebrew/query?version=2017&id=53)

```
select all objects where
[clause
  [phrase FOCUS function=Pred
    [word sp=verb AND nu=sg AND gn=f]
  ]
  ..
  [phrase FOCUS function=Subj
    [word sp=conj]
  ]
]
```

**Results**

results | verses | words
--- | --- | ---
65 | 51 | 315

In [78]:
query = """
clause
  phrase function=Pred
    word sp=verb nu=sg gn=f
  < phrase function=Subj
    word sp=conj
"""

In [79]:
results = A.search(query)

  1.42s 65 results


In [80]:
tfVerses = getTfVerses(results, (1, 3))

 51 verses
315 words


**The numbers agree exactly**.

In [81]:
A.table(results, end=5)

n,p,clause,phrase,word,phrase.1,word.1
1,Genesis 24:61,וַתָּ֨קָם רִבְקָ֜ה וְנַעֲרֹתֶ֗יהָ,תָּ֨קָם,תָּ֨קָם,רִבְקָ֜ה וְנַעֲרֹתֶ֗יהָ,וְ
2,Genesis 31:14,וַתַּ֤עַן רָחֵל֙ וְלֵאָ֔ה,תַּ֤עַן,תַּ֤עַן,רָחֵל֙ וְלֵאָ֔ה,וְ
3,Genesis 33:7,וַתִּגַּ֧שׁ גַּם־לֵאָ֛ה וִילָדֶ֖יהָ,תִּגַּ֧שׁ,תִּגַּ֧שׁ,גַּם־לֵאָ֛ה וִילָדֶ֖יהָ,וִ
4,Genesis 47:13,וַתֵּ֜לַהּ אֶ֤רֶץ מִצְרַ֨יִם֙ וְאֶ֣רֶץ כְּנַ֔עַן מִפְּנֵ֖י הָרָעָֽב׃,תֵּ֜לַהּ,תֵּ֜לַהּ,אֶ֤רֶץ מִצְרַ֨יִם֙ וְאֶ֣רֶץ כְּנַ֔עַן,וְ
5,Exodus 15:16,תִּפֹּ֨ל עֲלֵיהֶ֤ם אֵימָ֨תָה֙ וָפַ֔חַד,תִּפֹּ֨ל,תִּפֹּ֨ל,אֵימָ֨תָה֙ וָפַ֔חַד,וָ
