# Print the syntax tree of a specific verse (N1904-TF)

## Table of contents (ToC)<a class="anchor" id="TOC"></a>
* <a href="#bullet1">1 - Introduction</a>
* <a href="#bullet2">2 - Load Text-Fabric app and data</a>
* <a href="#bullet3">3 - Performing the queries</a>
     * <a href="#bullet3x1">3.1 - Show a specific verse</a>
         * <a href="#bullet3x1x1">3.1.1 - Alternative methods</a>
     * <a href="#bullet3x2">3.2 - Selecting individual words of the verse</a>
     * <a href="#bullet3x3">3.3 - Available text output formats</a>
     * <a href="#bullet3x4">3.4 - Compare text output formats</a>
* <a href="#bullet4">4 - Notebook version details</a> 

# 1 - Introduction <a class="anchor" id="bullet1"></a>
##### [Back to ToC](#TOC)

This Jupyter Notebook demonstrates how to display the syntax tree for a specific verse. It begins by explaining how you can use Text-Fabric to select a particular verse from the Greek New Testament. Additionally, it uses the <a href="https://centerblc.github.io/N1904/viewtypes.html#start" target="_blank">A.viewType()</a> function to showcase the differences between the two types of syntax tree presentations included in the dataset.

# 2 - Load Text-Fabric app and data <a class="anchor" id="bullet2"></a>
##### [Back to ToC](#TOC)

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
# Loading the Text-Fabric code
# Note: it is assumed Text-Fabric is installed in your environment
from tf.fabric import Fabric
from tf.app import use

In [3]:
# Load the N1904 app and data
N1904 = use("CenterBLC/N1904", version="1.0.0", hoist=globals())

**Locating corpus resources ...**

Name,# of nodes,# slots / node,% coverage
book,27,5102.93,100
chapter,260,529.92,100
verse,7944,17.34,100
sentence,8011,17.2,100
group,8945,7.01,46
clause,42506,8.36,258
wg,106868,6.88,533
phrase,69007,1.9,95
subphrase,116178,1.6,135
word,137779,1.0,100


Display is setup for viewtype [syntax-view](https://github.com/CenterBLC/N1904/blob/main/docs/syntax-view.md#start)

See [here](https://github.com/CenterBLC/N1904/blob/main/docs/viewtypes.md#start) for more information on viewtypes

In [4]:
# The following will push the Text-Fabric stylesheet to this notebook (to facilitate proper display with notebook viewer)
N1904.dh(N1904.getCss())

# 3 - Performing the queries <a class="anchor" id="bullet3"></a>
##### [Back to ToC](#TOC)

## 3.1 - Show a specific verse<a class="anchor" id="bullet3x1"></a>

The following example demonstrates a query for a specific verse (Mark 1:1). As expected, the query returns a single result.

In [6]:
# Define the query template
VerseQuery = '''
book book=Mark
  chapter chapter=1
      verse verse=1
'''

In [7]:
# The following will create a list containing ordered tuples consisting of node numbers of the items as they appear in the query
VerseResult = N1904.search(VerseQuery)

  0.01s 1 result


The result stored in object `VerseResult` is a list of tuples. In this example, the list contains only one tuple. Each tuple holds the nodes retrieved for one match of the query template, one node per line of the template: here three nodes, for the book, the chapter, and the verse.

You can inspect the contents of `VerseResult` using the following print statement:

```python
print(VerseResult)
```
This will display the numeric values for the selected book, chapter, and verse nodes, which are in this example:
```text
[(137781, 137835, 383782)]
```
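
Because each result is an ordinary Python tuple, it can be unpacked into named variables. The following is a minimal sketch in plain Python that reuses the node numbers printed above; the variable names are illustrative and not part of the notebook:

```python
# Result copied from the output above: one tuple of (book, chapter, verse) nodes
example_result = [(137781, 137835, 383782)]

# Unpack the first (and only) tuple in the same order as the query template
book_node, chapter_node, verse_node = example_result[0]

print(f"book={book_node} chapter={chapter_node} verse={verse_node}")
```
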
Next we will print the syntax tree for the obtained results:

In [36]:
# Print the result
N1904.show(VerseResult, queryFeatures=False)

### 3.1.1 - Alternative methods <a class="anchor" id="bullet3x1x1"></a>

In addition to the straightforward example provided earlier, there are other, more advanced methods for selecting a specific verse in Text-Fabric. This section briefly describes two of them.

Firstly, the code from the previous cells can be combined into a single, compact, and efficient line of code. This approach yields a list of tuples, with each tuple containing only one element (namely a `verse` node), which is then used as argument for the `A.show()` function:

```python
N1904.show(N1904.search('verse book=Mark chapter=1 verse=1'))
```

Another method takes advantage of the fact that `A.show()` expects a list of tuples. The numeric value of the `verse` node is first made into a one-element tuple, using parentheses and a trailing comma `( , )`, and that tuple is then wrapped in a list, using square brackets `[ ]`. The function `T.nodeFromSection()` itself expects a tuple as input, which is created using `('Mark',1,1)`. Combining these steps results in the following construction:

```python
N1904.show([(T.nodeFromSection(('Mark',1,1)), )])
```
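
The trailing comma is what turns the parentheses into a tuple. The following small sketch in plain Python illustrates this; it uses the verse node number 383782 from section 3.1 as a stand-in for the value returned by `T.nodeFromSection()`, and the variable names are only illustrative:

```python
# Stand-in for T.nodeFromSection(('Mark', 1, 1)); value taken from section 3.1
verse_node = 383782

not_a_tuple = (verse_node)     # parentheses alone: still an int
one_tuple = (verse_node,)      # trailing comma: a one-element tuple
show_argument = [one_tuple]    # a list of tuples, the shape passed to show()

print(type(not_a_tuple).__name__, type(one_tuple).__name__, show_argument)
```
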

## 3.2 - Selecting individual words of the verse <a class="anchor" id="bullet3x2"></a>

A similar, but not identical, result can be obtained by selecting all words of the verse individually. Since each word counts as a separate result, the total number of results is higher: in this case, seven. Note also that the found items (the individual words) are highlighted in yellow. The argument `condensed=True` combines all found items and limits the display to a single instance of the verse, since all results come from the same verse. If the argument `condensed=False` were supplied, the verse would be displayed seven times, with each instance highlighting the next consecutive word in yellow.

In [37]:
# Define the query template
AltVerseQuery = '''
word book=Mark chapter=1 verse=1
'''

# The following will create a list containing ordered tuples consisting of node numbers of the items as they appear in the query
AltVerseResult = N1904.search(AltVerseQuery)

# Print some of the results
N1904.show(AltVerseResult, start=1, end=15, condensed=True, queryFeatures=False)

  0.10s 7 results
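
The effect of condensing can be pictured as grouping the found nodes by the verse that contains them. The sketch below is a conceptual illustration in plain Python, not Text-Fabric's implementation: the function names `condense` and `verse_of` are hypothetical, and the word node numbers are placeholders rather than the real nodes of Mark 1:1.

```python
from collections import defaultdict

def condense(results, container_of):
    """Group all nodes found in `results` under their containing node."""
    grouped = defaultdict(list)
    for result in results:
        for node in result:
            grouped[container_of(node)].append(node)
    return dict(grouped)

# Placeholder word node numbers (NOT the real ones), one tuple per search result
word_results = [(n,) for n in range(101, 108)]

# Hypothetical lookup: every placeholder word belongs to verse node 383782 (Mark 1:1)
def verse_of(word_node):
    return 383782

condensed = condense(word_results, verse_of)
print(len(word_results), "results ->", len(condensed), "verse to display")
```

Seven separate results collapse into one group, which corresponds to the single verse display with all seven words highlighted.
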


## 3.3 - Available text output formats <a class="anchor" id="bullet3x3"></a>

Text-Fabric's data design enables a very flexible representation of the corpus text (<a href="https://centerblc.github.io/N1904/textformats.html#start" target="_blank">see this section</a>). If no specific format is defined, the default format will be used, which was set during the dataset's creation (for this dataset, the default is `text-orig-full`). Additionally, the dataset includes several other formats that are particularly relevant to the Greek New Testament corpus.

To view the available formats for displaying the text in this dataset:

In [40]:
T.formats

{'lex-orig-plain': 'word',
 'lex-translit-plain': 'word',
 'text-orig-full': 'word',
 'text-orig-plain': 'word',
 'text-translit-plain': 'word',
 'text-unaccent-plain': 'word'}

This list reveals that all defined formats are based on `word` nodes. This means that the output for any given format is generated using a specific set of features associated with `word` nodes. The exact combination of features used for each text format can be examined by running the following command:

In [42]:
N1904.showFormats()

format | level | template
--- | --- | ---
`lex-orig-plain` | **word** | `{lemma}{trailer}`
`lex-translit-plain` | **word** | `{lemmatranslit}{trailer}`
`text-orig-full` | **word** | `{before}{text}{after}`
`text-orig-plain` | **word** | `{text}{trailer}`
`text-translit-plain` | **word** | `{translit}{trailer}`
`text-unaccent-plain` | **word** | `{unaccent}{trailer}`
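
To see how such a template turns feature values into text, the following sketch fills two of the templates above using Python's `str.format`. This only mimics the idea and is not how Text-Fabric evaluates templates internally; the feature values for the word are assumed for illustration and were not read from the dataset.

```python
# Templates copied from the table above
templates = {
    "text-orig-full": "{before}{text}{after}",
    "text-orig-plain": "{text}{trailer}",
}

# Illustrative feature values for one word (assumed, not read from the dataset)
word_features = {"before": "", "text": "Ἀρχὴ", "after": " ", "trailer": " "}

def render(template, features):
    """Fill a format template with the feature values of one word."""
    return template.format(**features)

for name, template in templates.items():
    print(f"{name}: {render(template, word_features)!r}")
```
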


#### Remark regarding data origin

This data originates from file `otext.tf`:

```
@config
...
@fmt:text-orig-full={before}{text}{after}
...
```


## 3.4 - Compare text output formats <a class="anchor" id="bullet3x4"></a>

Using the example verse, we can illustrate how the different formats in this dataset influence the presentation of Mark 1:1. To do this, we iterate over the defined text formats and display the text associated with the `verse` node in each format.

In [43]:
# Note: node 383782 is the 'verse' node for Mark 1:1
for fmt in T.formats:
    print(f'fmt={fmt}\t: {T.text(383782, fmt=fmt)}')

fmt=lex-orig-plain	: ἀρχή ὁ εὐαγγέλιον Ἰησοῦς Χριστός υἱός θεός. 
fmt=lex-translit-plain	: arkhe o euaggelion Iesous Khristos uios theos. 
fmt=text-orig-full	: Ἀρχὴ τοῦ εὐαγγελίου Ἰησοῦ Χριστοῦ (Υἱοῦ Θεοῦ). 
fmt=text-orig-plain	: Ἀρχὴ τοῦ εὐαγγελίου Ἰησοῦ Χριστοῦ Υἱοῦ Θεοῦ. 
fmt=text-translit-plain	: Arkhe tou euaggeliou Iesou Khristou Uiou Theou. 
fmt=text-unaccent-plain	: Αρχη του ευαγγελιου Ιησου Χριστου Υιου Θεου. 
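
The relation between `text-orig-plain` and `text-unaccent-plain` can be approximated with the Python standard library. The dataset stores the unaccented form as its own feature (`unaccent`), so the sketch below is not how that feature was produced; it only shows that stripping the combining marks from the plain text should yield the same letters. The helper name `strip_diacritics` is hypothetical.

```python
import unicodedata

def strip_diacritics(text):
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)

# Text of Mark 1:1 as printed for fmt=text-orig-plain above
plain = "Ἀρχὴ τοῦ εὐαγγελίου Ἰησοῦ Χριστοῦ Υἱοῦ Θεοῦ."
print(strip_diacritics(plain))
```
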


# 4 - Notebook version details <a class="anchor" id="bullet4"></a>
##### [Back to ToC](#TOC)

<div style="float: left;">
  <table>
    <tr>
      <td><strong>Author</strong></td>
      <td>Tony Jurg</td>
    </tr>
    <tr>
      <td><strong>Version</strong></td>
      <td>1.1</td>
    </tr>
    <tr>
      <td><strong>Date</strong></td>
      <td>9 October 2024</td>
    </tr>
  </table>
</div>