# A User-Friendl(ier) Approach to Jupyter Notebooks for Greek

This is a tutorial aimed at Greek nerds who want a simpler way to use Jupyter Notebooks to explore questions related to New Testament Greek. 

I have recently written two tutorials aimed at programmers and people who might set up a system. This tutorial assumes much less programming background and shows very little XML. I use CSS and HTML display to create result lists that are easier to read. It does assume, however, that you have a running instance of BaseX, with the [Nestle1904 Lowfat Greek New Testament Syntax Trees](https://github.com/biblicalhumanities/greek-new-testament/tree/master/syntax-trees/nestle1904-lowfat) installed in a database named `nestle1904lowfat`.

## Opening the Database, Defining some Functions

Before we get started, you will have to open the database and run the code to define some functions that we will be using.

This code opens the database:

In [19]:
from BaseXClient import BaseXClient
session = BaseXClient.Session('localhost', 1984, 'admin', 'admin')
session.execute("open nestle1904lowfat")

''

Before we define our querying functions, we need to store the location of two `.css` stylesheets from the Lowfat trees in a variable (change the path to match your own copy of the repository):

In [10]:
css_dir = "/Users/jonathan/git/greek-new-testament/syntax-trees/nestle1904-lowfat/xml/"

Now let's define a few functions that will hide the details of XML for most queries:

In [26]:
from IPython.display import HTML, display
from pygments import highlight
from pygments.lexers import XmlLexer
from pygments.formatters import HtmlFormatter

def xquery(query):
    return session.query(query).execute()

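def boxwood(xml):
    # Render a query result as HTML, styled with the two Lowfat stylesheets.
    # read() returns each stylesheet as one string; readlines() would insert
    # the text of a Python list into the <style> element.
    display(HTML('<style type="text/css">{}{}{}</style>{}'.format(
        "hit { display: block; margin-top: 2em; }",
        open(css_dir+'treedown.css').read(),
        open(css_dir+'boxwood.css').read(),
        xml)))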

def s(query):
    xq = "for $i in " + query + """
          let $s := $i/ancestor::sentence 
          return <hit>{ $s/milestone, $s/p, "➡️ ", $i}</hit>"""
    boxwood(xquery(xq))
    
def t(query):
    boxwood(xquery(query))
    
def xq(query):
    formatter = HtmlFormatter()
    display(
        HTML('<style type="text/css">{}</style>{}'.format (
            formatter.get_style_defs('.highlight'),
            highlight(xquery(query), XmlLexer(), formatter)))) 

For instance, this query finds the first preposition in the New Testament:

In [27]:
q = '/descendant::w[@class="prep"][1]'

Most of the queries in this tutorial use the `s()` function, which displays query results in the context of the sentences they are found in.  Let's see what that looks like for our query:

In [22]:
s(q)
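
If you are curious what `s()` actually sends to BaseX, the sketch below rebuilds the same XQuery text without contacting the database. The name `sentence_query` is a hypothetical helper introduced only for this illustration; its body copies the string concatenation used in `s()` above.

```python
def sentence_query(query):
    # Hypothetical helper: return the XQuery text that s() would run,
    # built the same way as in s(), but without executing it.
    return "for $i in " + query + """
          let $s := $i/ancestor::sentence
          return <hit>{ $s/milestone, $s/p, "➡️ ", $i}</hit>"""

built = sentence_query('/descendant::w[@class="prep"][1]')
print(built)
```

Each hit is wrapped in a `<hit>` element that holds the sentence's milestone, the sentence text, an arrow, and then the matching node itself.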

If you don't need the verse reference and the complete sentence, you can use the `t()` function instead:

In [23]:
t(q)

If you want to see the actual XML representation of a result, you can use the `xq()` function:

In [28]:
xq(q)

Jupyter Notebooks limit the amount of data you can retrieve with any one query, so your queries should be specific enough to avoid returning huge amounts of data.  For instance, the following query will fail:

In [32]:
q = "//w"

s(q)

IOPub data rate exceeded.
The notebook server will temporarily stop sending output
to the client in order to avoid crashing it.
To change this limit, set the config variable
`--NotebookApp.iopub_data_rate_limit`.
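
One possible way to stay under that limit while exploring is to cap the number of results inside the query itself. This is a minimal sketch, not part of the original notebook: `limit` is a hypothetical helper that wraps any query in a positional predicate so that only the first `n` items come back.

```python
def limit(query, n=50):
    # Hypothetical helper: keep only the first n items of the query result.
    # The parentheses make the predicate apply to the whole result sequence.
    return "({})[position() le {}]".format(query, n)

capped = limit("//w", 20)
print(capped)  # (//w)[position() le 20]
```

In the notebook, `s(limit("//w", 20))` should then show only the first 20 words instead of the whole New Testament.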


But queries can return a reasonably large number of results. Let's look for uses of the preposition "ἐκ":

In [35]:
q = "//w[@lemma='ἐκ']"

s(q)

## Use Case: Exploring Prepositions

In this tutorial, we will look for examples of prepositional phrases using the Greek word πίστις, such as:

- ἀπὸ τῆς πίστεως
- διὰ πίστεως 
- ἐκ πίστεως 
- εἰς πίστιν
- ἐν τῇ πίστει / ἐν πίστει
- κατὰ πίστιν 
- μετὰ πίστεως
- περὶ τὴν πίστιν
- χωρὶς πίστεως

A `wg` element ("word group") with the class attribute `pp` is a prepositional phrase.  Let's look for prepositional phrases that contain the word "ἐκ":

In [143]:
s('//wg[@class="pp" and w/@lemma="ἐκ"]')

Let's look at just the examples containing  ἐκ πίστεως:

In [36]:
s('//wg[@class="pp" and w/@lemma="ἐκ" and .//w="πίστεως"]')

Now let's look at the more general case: prepositional phrases that contain πίστις. (This query misses some examples; the next query shows why.)

In [126]:
s('//wg[@class="pp" and w/@lemma="πίστις"]')

In some phrases, πίστις is found in a word group below the prepositional phrase.  For instance, the phrase  ἀπὸ τῆς πίστεως contains the word group τῆς πίστεως.  Let's look for examples that contain such a word group:

In [128]:
s('//wg[@class="pp" and wg/w/@lemma="πίστις"]')
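
To see why the two queries above find different phrases, here is a small sketch that needs no database. The XML is a hand-made toy fragment, not copied from the Lowfat trees; it only imitates the `wg`/`w` nesting and the `class` and `lemma` attributes used in the queries. The functions `direct` and `nested` are hypothetical names for this illustration.

```python
import xml.etree.ElementTree as ET

# Hand-made toy fragment (an assumption for illustration, not database content).
toy = """
<sentence>
  <wg class="pp">
    <w class="prep" lemma="ἐκ">ἐκ</w>
    <w class="noun" lemma="πίστις">πίστεως</w>
  </wg>
  <wg class="pp">
    <w class="prep" lemma="ἀπό">ἀπὸ</w>
    <wg class="np">
      <w class="det" lemma="ὁ">τῆς</w>
      <w class="noun" lemma="πίστις">πίστεως</w>
    </wg>
  </wg>
</sentence>
""".strip()

root = ET.fromstring(toy)

def direct(pp, lemma):
    # Mirrors w/@lemma="...": a word directly under the prepositional phrase.
    return any(w.get("lemma") == lemma for w in pp.findall("w"))

def nested(pp, lemma):
    # Mirrors wg/w/@lemma="...": a word inside a child word group.
    return any(w.get("lemma") == lemma for w in pp.findall("wg/w"))

results = [
    (pp.find("w").get("lemma"), direct(pp, "πίστις"), nested(pp, "πίστις"))
    for pp in root.findall(".//wg[@class='pp']")
]
for row in results:
    print(row)
```

In this toy tree the ἐκ phrase matches only the first test and the ἀπό phrase matches only the second, which is the same split the two queries produce.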

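Now let's make a list of the prepositions that occur in either of the two lists above. The comparison `(w/@lemma, wg/w/@lemma)="πίστις"` is true if any lemma in the sequence matches, so it covers both cases in one query. We will use the `distinct-values()` function to remove duplicates.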

In [136]:
xq('distinct-values(//wg[@class="pp" and (w/@lemma, wg/w/@lemma)="πίστις"]/w[@class="prep"]/@lemma)')

ἀπό
εἰς
ἐκ
διά
ἐν
μετά
ἐπί
περί
κατά
χωρίς


There are quite a few prepositions in this list, so it would be nice to have a separate list for each one. One way is simply to create a separate query for each preposition. For instance, this query looks for constructions using ἀπὸ:

In [138]:
s('//wg[@class="pp" and w/@lemma="ἀπό" and (w/@lemma, wg/w/@lemma)="πίστις"]')

This query looks for constructions using ἐπί:

In [37]:
s('//wg[@class="pp" and w/@lemma="ἐπί" and (w/@lemma, wg/w/@lemma)="πίστις"]')
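
Rather than typing each of these queries by hand, you could generate them from the list of prepositions found earlier. This is only a sketch: `pistis_pp_query` is a hypothetical helper that fills one preposition lemma into the query pattern used in the two cells above.

```python
# The ten lemmas returned by the distinct-values() query above.
prepositions = ["ἀπό", "εἰς", "ἐκ", "διά", "ἐν", "μετά", "ἐπί", "περί", "κατά", "χωρίς"]

def pistis_pp_query(prep):
    # Hypothetical helper: build the query for one preposition lemma.
    return ('//wg[@class="pp" and w/@lemma="{}" '
            'and (w/@lemma, wg/w/@lemma)="πίστις"]').format(prep)

queries = {prep: pistis_pp_query(prep) for prep in prepositions}
for prep, query in queries.items():
    print(prep, query)
```

In the notebook, `s(queries["ἐπί"])` would then run the same query as the cell above, and the other nine work the same way.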

Another way is to learn how to do grouping in XQuery. This is considerably more complex, but more useful for a variety of queries.  Teaching this is beyond the scope of this tutorial, but here is an example to illustrate the level of complexity and the useful output it provides.

In [38]:
q = """
   declare default collation "http://basex.org/collation?lang=el";

   for $p in //wg[@class="pp" and (w/@lemma, wg/w/@lemma)="πίστις"]
   let $prep := ($p/w[@class="prep"]/@lemma, "multiple")[1]
   group by $prep
   order by $prep
   return 
     <group>
        <h2>{ $prep, "(", count($p), ")" }</h2>
        { 
           $p ! <hit><b>{ string((.//w)[1]/@osisId)}</b> { . }</hit> 
        }
     </group>
"""

t(q)
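
If the XQuery `group by` clause is unfamiliar, the same idea can be sketched in plain Python. The data below is made up purely for illustration (the labels are placeholders, not real verse references), and the names `hits` and `groups` are assumptions of this sketch. It also mirrors the fallback in `($p/w[@class="prep"]/@lemma, "multiple")[1]`, which takes the first preposition lemma or the string "multiple" when there is none.

```python
from collections import defaultdict

# Made-up stand-ins for query hits: (direct preposition lemmas, label).
hits = [
    (["ἐκ"], "hit A"),
    (["διά"], "hit B"),
    (["ἐκ"], "hit C"),
    ([], "hit D"),  # no preposition directly under the phrase
]

groups = defaultdict(list)
for lemmas, label in hits:
    # Take the first lemma, or fall back to "multiple" when the list is empty.
    key = lemmas[0] if lemmas else "multiple"
    groups[key].append(label)

# sorted() orders by code point, not by the Greek collation the XQuery declares.
for key in sorted(groups):
    print(key, "(", len(groups[key]), ")", groups[key])
```

The XQuery version does the same three things in one expression: it computes a key for each phrase, collects the phrases that share a key, and prints a heading with a count for each group.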