# Tutorial: Greek Syntax Queries using Lowfat and Jupyter Notebooks

This tutorial illustrates some of the queries you can run against the <a href="https://github.com/biblicalhumanities/greek-new-testament/tree/master/syntax-trees/nestle1904-lowfat">Nestle 1904 Lowfat Syntax Trees</a> using <a href="https://jupyter.org/">Jupyter notebooks</a>. It is aimed at readers who know Greek fairly well but may not have experience with query languages or programming. It uses the `greeksyntax` package, which was written to simplify writing queries in this environment.

This tutorial does not cover installation. It assumes that you have installed <a href="http://basex.org/">BaseX</a> and that the current <a href="https://github.com/biblicalhumanities/greek-new-testament/tree/master/syntax-trees/nestle1904-lowfat">Nestle 1904 Lowfat Syntax Trees</a> are installed in a database called "nestle1904lowfat".  It also assumes that you are running a <a href="https://jupyter.org/">Jupyter notebook</a> from the <code>labnotes</code> subdirectory in the <a href="https://github.com/biblicalhumanities/greek-new-testament">greek-new-testament</a> repo from <a href="http://biblicalhumanities.org/dashboard">biblicalhumanities.org</a>.

## Opening the Database

The following code imports the functions we need and opens the database:

In [1]:
from greeksyntax.lowfat import *
q = lowfat("nestle1904lowfat")

Let's make sure that we have successfully opened the database using a simple query:

In [2]:
q.xquery("count(//book)")

'27'

If the query works, you are up and running.  Let's get on with the tutorial.

## Don't Try to Return the Whole Database

You should be aware that Jupyter limits the amount of data a query can return.  Queries can return large results (a later query in this tutorial returns all of Matthew 5), but if your query returns too much data, you will see the following error:

In [3]:
# This query attempts to return every word in the Greek New Testament.  Jupyter returns an error.
q.xquery("//w")

IOPub data rate exceeded.
The notebook server will temporarily stop sending output
to the client in order to avoid crashing it.
To change this limit, set the config variable
`--NotebookApp.iopub_data_rate_limit`.


The solution is to write a more specific query.  You will see how to do that in the following sections.
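
One way to keep results small is to ask the database for a count, or for only the first few matches. The sketch below builds such query strings. The helper names `count_of` and `first_n` are hypothetical and are not part of `greeksyntax`; they rely only on the standard XQuery functions `count()` and `subsequence()`.

```python
def count_of(query):
    """Wrap a query so that it returns only the number of matches."""
    return f"count({query})"


def first_n(query, n=10):
    """Wrap a query so that it returns at most the first n matches."""
    return f"subsequence({query}, 1, {n})"


print(count_of("//w"))    # count(//w)
print(first_n("//w", 5))  # subsequence(//w, 1, 5)
```

The resulting strings are meant to be passed to `q.xquery()` like any other query in this tutorial, for example `q.xquery(first_n("//w", 5))`, which should return five `w` elements rather than every word in the New Testament.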

## Book, Chapter, Verse, Word

Let's start by looking up specific texts. The following query returns the sentences in Matthew 5.  Because this is a large result, Jupyter creates a scrollable window.

In [4]:
q.find(milestone("Matt.5"))

The words in this output contain information on the morphology - if you let your mouse rest over a word you will see a tooltip that looks like this:

![image.png](attachment:image.png)

That can be helpful if you are not sure of the morphology for an individual word.

The `milestone` function generates a query that looks for sentences corresponding to a particular reference.  You can execute it by itself to see the query it generates.  

In [5]:
milestone("Matt.5.6")

"//sentence[milestone[@id='Matt.5.6']]"

Milestones have the following structure:

- `Matt` - an entire book
- `Matt.5` - a chapter
- `Matt.5.36` - a verse
- `Matt.5.36!4` - a word
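
To make the four reference levels concrete, here is a rough sketch of how a function like `milestone` could turn a reference into a query string. It is not the `greeksyntax` implementation. The verse form matches the output shown above and the book form matches the query this tutorial prints for `milestone("Luke")`, but the chapter and word forms are guesses, and `milestone_sketch` is a hypothetical name.

```python
def milestone_sketch(ref):
    """Build a query string for 'Matt', 'Matt.5', 'Matt.5.6' or 'Matt.5.6!1'."""
    if "!" in ref:
        # Word (assumption): words carry their reference in the osisId attribute.
        return f"//w[@osisId='{ref}']"
    parts = ref.split(".")
    if len(parts) == 3:
        # Verse: same form as the output of milestone("Matt.5.6") above.
        return f"//sentence[milestone[@id='{ref}']]"
    if len(parts) == 2:
        # Chapter (assumption): the trailing dot keeps 'Matt.1' from
        # also matching 'Matt.10'.
        return f"//sentence[milestone[starts-with(@id,'{ref}.')]]"
    # Book: prefix match on the book abbreviation.
    return f"//sentence[milestone[starts-with(@id,'{ref}')]]"


for ref in ("Matt", "Matt.5", "Matt.5.36", "Matt.5.36!4"):
    print(ref, "->", milestone_sketch(ref))
```

Running `milestone()` itself on the same references is the reliable way to see what the package actually generates.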

Let's look at an individual verse, then an individual word.

In [6]:
q.find(milestone("Matt.5.6"))

In [7]:
q.find(milestone("Matt.5.6!3"))

Like the previous query, these results have tooltips so you can hover over a word with your mouse and see information about the word.

## Words, Lemmas, and Morphology

In [8]:
q.find(milestone("Matt.5.6!1"))

In this tutorial, most results are presented as readable text, but words have a rich structure that contains a great deal of information.  Let's use the `xquery()` function to see the raw structure of that same word:

In [9]:
q.xquery(milestone("Matt.5.6!1"))

'<w xmlns:xi="http://www.w3.org/2001/XInclude" role="p" class="adj" osisId="Matt.5.6!1" n="400050060010010" lemma="μακάριος" normalized="μακάριοι" strong="3107" number="plural" gender="masculine" case="nominative" head="true">μακάριοι</w>'

If you like color, you can use the `pretty()` function to make that a little more readable:

In [10]:
pretty(q.xquery(milestone("Matt.5.6!1")))

We can use this information to look for specific characteristics of words.  Let's take a look at the individual parts of this:

- `<w>` - Each word is wrapped in a `w` element.  You can count the words in the Greek New Testament with this query: `count(//w)`.
- `xmlns:xi="http://www.w3.org/2001/XInclude"` is just noise for our purposes.  Ignore it.  It comes from including individual books into a master file using XInclude.
- `class="adj"` - this word is an adjective.  You can count the adjectives in the Greek New Testament with this query: `count(//w[@class='adj'])`, which counts the `w` elements that have `class` attributes with the value `adj`.  Replacing `adj` with `verb` counts the verbs instead.
- `role` - the grammatical role of the word within its clause, in this case `p` means `predicate`. Not all words have roles - sometimes the role is given to a group of words rather than individual words, and some words like conjunctions do not have clausal roles.  You can count individual words that occur as predicates using this query: `count(//w[@role='p'])`.
- `osisId` - the milestone for the individual word. You can find this word using the following query: `//w[@osisId='Matt.5.6!1']`.
- `n` - an integer that can be used to sort words into sentence order.
- `lemma` - the dictionary form of the word.  You can look up other instances of this word with this query: `//w[@lemma='μακάριος']`.
- `normalized` - a "normalized" form of the word that ignores changes in accent due to phonological context such as position in the sentence or the presence of clitics.  You can look up other instances of this normalized form with this query: `//w[@normalized='μακάριοι']`.
- `strong` - a Strong's number.
- `number`, `gender`, `case`, etc - morphology of the word. You can look up other adjectives that are plural, masculine, and nominative using this query: `//w[@class='adj' and @number='plural' and @gender='masculine' and @case='nominative']`.
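
Because `q.xquery()` returns the element as a string, you can also inspect these attributes with Python's standard library. This is a minimal sketch, assuming the string holds a single element as in the output above; a result with several elements has no single root and would need to be wrapped in one before parsing.

```python
import xml.etree.ElementTree as ET

# The string returned for Matt.5.6!1 earlier in this section.
raw = ('<w xmlns:xi="http://www.w3.org/2001/XInclude" role="p" class="adj" '
       'osisId="Matt.5.6!1" n="400050060010010" lemma="μακάριος" '
       'normalized="μακάριοι" strong="3107" number="plural" '
       'gender="masculine" case="nominative" head="true">μακάριοι</w>')

w = ET.fromstring(raw)

print(w.text)          # the word as it appears in the text
print(w.get("lemma"))  # the dictionary form

# Print the morphology attributes discussed in the list above.
for name in ("class", "number", "gender", "case"):
    print(name, "=", w.get(name))
```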

For more documentation on this format, see [the Lowfat documentation](https://github.com/biblicalhumanities/greek-new-testament/tree/master/syntax-trees/nestle1904-lowfat).


You can play with the queries shown above by creating new cells with the + button in the menu bar and putting your conditions in a string like this:

In [11]:
q.find("//w[@class='adj' and @number='plural' and @gender='masculine' and @case='nominative']")
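
If you experiment with many combinations of attributes, typing the conditions by hand gets tedious. The following sketch assembles the same kind of query from a dictionary. `word_query` is a hypothetical helper, not part of `greeksyntax`, and it only handles simple equality tests joined with `and`.

```python
def word_query(conditions):
    """Build a //w[...] query from a dict of attribute names and values."""
    if not conditions:
        # No conditions: match every word (beware the output limit).
        return "//w"
    tests = " and ".join(
        f"@{name}='{value}'" for name, value in conditions.items()
    )
    return f"//w[{tests}]"


query = word_query({
    "class": "adj",
    "number": "plural",
    "gender": "masculine",
    "case": "nominative",
})
print(query)
```

The printed string is identical to the query in the cell above, so it can be passed to `q.find()` in the same way.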

You may want to return your query results in other ways. For instance, you can use `highlight()` to show query results highlighted in the context of the original sentence.  Here is the same query using the `highlight()` function:

In [12]:
q.highlight("//w[@class='adj' and @number='plural' and @gender='masculine' and @case='nominative']")

A similar function, `sentence()`, shows the matching item after the sentence.

In [13]:
q.sentence("//w[@class='adj' and @number='plural' and @gender='masculine' and @case='nominative']")

The `milestone()` function is designed to make it easy to specify where to search.  Let's do the same query, but limit it to Matthew 6.

In [14]:
q.highlight(milestone("Matt.6") + "//w[@class='adj' and @number='plural' and @gender='masculine' and @case='nominative']")

Since we are reusing that query, let's put it in a variable so we don't have to type it each time.  Then we will apply the same query to look for instances in Luke and Acts.

In [15]:
pmn_adjs = "//w[@class='adj' and @number='plural' and @gender='masculine' and @case='nominative']"

q.highlight(milestone("Luke") + pmn_adjs)
q.highlight(milestone("Acts") + pmn_adjs)

Incidentally, if you want to see the query that corresponds to `milestone("Luke") + pmn_adjs`, you can simply evaluate it like this:

In [16]:
milestone("Luke") + pmn_adjs

"//sentence[milestone[starts-with(@id,'Luke')]]//w[@class='adj' and @number='plural' and @gender='masculine' and @case='nominative']"

## Syntax

Syntax is largely about exploring relationships within a clause. The `@role` attribute identifies these relationships.  Clauses can contain other clauses and phrases in complex recursive structures.

Groups of words are found in `<wg>` elements ("word group").  A clause is identified by the attribute `class='cl'`.  Like words, word groups can have `role` attributes that identify their role in a clause. 

Let's look for clauses that function as objects of other clauses.  

In [17]:
q.highlight("//wg[@class='cl' and @role='o']")

Like morphology queries, syntax queries can be combined with milestones to scope them. Let's use a milestone to look only in the Book of Acts:

In [18]:
q.highlight(milestone("Acts")+"//wg[@class='cl' and @role='o']")

Queries can combine conditions on individual words and conditions on word groups.  Let's modify that query to show only clauses that contain participles and function as objects of other clauses.  We will use `role='v'` rather than `class='verb'` so that we find only clauses in which the participle governs the clause.

In [19]:
q.highlight(milestone("Acts")+"//wg[@class='cl' and @role='o' and w[@role='v' and @mood='participle']]")

Word groups can also represent phrases of various kinds (see [this documentation](https://github.com/biblicalhumanities/greek-new-testament/tree/master/syntax-trees/nestle1904-lowfat)).

Let's look for prepositional phrases that contain the word πίστις:

In [20]:
q.highlight("//wg[@class='pp' and .//w[@lemma='πίστις']]")

And let's narrow that to prepositional phrases where the preposition is ἐν:

In [21]:
q.highlight("//wg[@class='pp' and w[@lemma='ἐν'] and .//w[@lemma='πίστις']]")

Now let's narrow these results further, showing only phrases where πίστις occurs in the same word group as ἐν or the word group immediately below it.

In [24]:
q.highlight("//wg[@class='pp' and w[@lemma='ἐν'] and (w, wg/w)[@lemma='πίστις']]")
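
The difference between `.//w` (a word at any depth) and `(w, wg/w)` (a word directly in the phrase or one word group below it) can be tried out without the database. The sketch below uses Python's standard library on two hand-made toy fragments. They are invented for illustration and are not taken from the Lowfat trees, and the function names `anywhere` and `near` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Toy fragments for illustration only, NOT real Lowfat data.
# In "shallow", πίστις sits one word group below the preposition.
shallow = ET.fromstring(
    "<wg class='pp'><w lemma='ἐν'/>"
    "<wg><w lemma='ὁ'/><w lemma='πίστις'/></wg></wg>"
)
# In "deep", πίστις sits two word groups below the preposition.
deep = ET.fromstring(
    "<wg class='pp'><w lemma='ἐν'/>"
    "<wg><w lemma='ὁ'/><wg><w lemma='πίστις'/></wg></wg></wg>"
)


def anywhere(pp, lemma):
    """Like .//w[@lemma=...]: words at any depth inside the phrase."""
    return pp.findall(f".//w[@lemma='{lemma}']")


def near(pp, lemma):
    """Like (w, wg/w)[@lemma=...]: direct children or one group down."""
    return (pp.findall(f"w[@lemma='{lemma}']")
            + pp.findall(f"wg/w[@lemma='{lemma}']"))


for name, pp in (("shallow", shallow), ("deep", deep)):
    print(name, len(anywhere(pp, "πίστις")), len(near(pp, "πίστις")))
# shallow 1 1
# deep 1 0
```

Both fragments match the looser `.//w` test, but only the shallow one matches the narrower test, which is why the last query returns fewer results than the one before it.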

## Next Steps

This is only an introductory tutorial showing a small number of queries.  It is meant to whet your appetite, to inspire you to think of queries that will teach you about aspects of biblical Greek you are interested in.

I plan to follow this up with more Jupyter notebooks, illustrating specific questions I would like to explore.  I also expect to add more resources to the `greeksyntax` package.  If you want to follow this work, I encourage you to [follow my blog](http://jonathanrobie.biblicalhumanities.org/). 