<img align="right" src="images/tf.png"/>
<img align="right" src="images/etcbc.png"/>
<img align="right" src="images/logo.png"/>

# Search Introduction

*Search* in Text-Fabric is a template-based way of looking for structural patterns in your dataset.

Text-Fabric lets you combine the ease of formulating search templates for
complicated syntactic patterns with the power of processing the results programmatically.

This notebook will show you how to get up and running.

## Easy command

In its simplest form, a search looks like this (just an example, where `template` stands for a search template string):

```python
results = A.search(template)
A.show(results)
```

See all ins and outs in the
[search template docs](https://annotation.github.io/text-fabric/Use/Search/#search-templates).

# Incantation

The ins and outs of installing Text-Fabric, getting the corpus, and initializing a notebook are
explained in the [start tutorial](start.ipynb).

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import unicodedata
from tf.app import use

In [3]:
A = use('athenaeus:clone', hoist=globals())

Using TF-app in /Users/dirk/github/annotation/app-athenaeus/code:
	repo clone offline under ~/github (local github)
rate limit is 5000 requests per hour, with 5000 left for this hour
	connecting to online GitHub repo pthu/athenaeus ... connected
	no releases
Using data in /Users/dirk/text-fabric-data/pthu/athenaeus/Athenaeus/Deipnosophistae/tf/1.1:
	#8139435fc5d337cbfb841267bdb42986201b153b (latest commit)


# Basic search command

We start with the simplest form of issuing a query.
Let's look for the words in book 3, chapter 4.

All work involved in searching takes place under the hood.

In [12]:
query = '''
book book=3
  chapter chapter=4
    word
'''
results = A.search(query)
A.table(results, end=10)

  0.05s 66 results


n,p,book,chapter,word
1,3 4,book 3,chapter 4,ΣΙΚΥΟΣ.
2,3 4,book 3,chapter 4,παροιμία·
3,3 4,book 3,chapter 4,σικυὸν
4,3 4,book 3,chapter 4,"τρώγουσα,"
5,3 4,book 3,chapter 4,"γύναι,"
6,3 4,book 3,chapter 4,τὴν
7,3 4,book 3,chapter 4,χλαῖναν
8,3 4,book 3,chapter 4,ὕφαινε.
9,3 4,book 3,chapter 4,Μάτρων
10,3 4,book 3,chapter 4,ἐν


The hyperlinks all take us to book 3, chapter 4, the passage that contains these words.

Each result is a tuple of nodes, one for each line of the template, in this case `(book, chapter, word)`.
So we can pick the chapter node from the first result and display it.

In [13]:
ch = results[0][1]
ch

286140

In [15]:
A.plain(ch)

Note that we can choose start and/or end points in the results list.

In [5]:
A.table(results, start=8, end=13, linked=3)

n,p,book,chapter,word
8,3 4:1812,book 3,chapter 4,ὕφαινε.
9,3 4:1813,book 3,chapter 4,Μάτρων
10,3 4:1813,book 3,chapter 4,ἐν
11,3 4:1813,book 3,chapter 4,παρῳδίαις( )·
12,3 4:1813,book 3,chapter 4,καὶ
13,3 4:1813,book 3,chapter 4,σικυὸν
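
The output above suggests that `start` and `end` count results from 1 and include both end points. As a rough illustration of that numbering only, here is a minimal sketch in plain Python. The helper `window` is hypothetical and is not part of Text-Fabric, and the list of numbers merely stands in for the 66 results.

```python
def window(results, start=None, end=None):
    """Select results[start..end], counting from 1 and including both ends.

    Hypothetical helper: it only mimics the numbering seen in the table output.
    """
    first = 1 if start is None else max(start, 1)
    last = len(results) if end is None else min(end, len(results))
    return results[first - 1:last]


# Stand-in for the 66 results of the query: just their sequence numbers.
rows = list(range(1, 67))

selected = window(rows, start=8, end=13)
first_ten = window(rows, end=10)
print(selected)
```

In terms of ordinary Python slicing, `start=8, end=13` corresponds to `results[7:13]`.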


We can show the results more fully with `show()`.

In [6]:
A.show(results, start=1, end=3)

Before we go on, a note about Unicode.

All Greek strings in this corpus are in decomposed normal form (NFD). That means, for example, that
`ἐπί` consists of 5 code points: 3 base letters and 2 combining diacritics.

However, when such a string is printed in a Jupyter notebook, it is converted to composed normal form (NFC).
So when we copy and paste such a string into a query, we must make sure that we pass the decomposed form.

We use a utility function for that:

In [16]:
def ud(s):
  return unicodedata.normalize('NFD', s)
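
To make the difference between the two normal forms concrete, the following sketch compares the composed and decomposed forms of `ἐπί` using only the standard library. The string is written with escape codes so that its form does not depend on how this page was copied. The variable names are chosen here for illustration.

```python
import unicodedata

# ἐπί in composed form (NFC): epsilon with psili, pi, iota with tonos
composed = "\u1f10\u03c0\u03af"

# The same word in decomposed form (NFD), as stored in the corpus
decomposed = unicodedata.normalize("NFD", composed)

print(len(composed), len(decomposed))

# List the code points of the decomposed form by name
for char in decomposed:
    print(f"U+{ord(char):04X}", unicodedata.name(char))

# The two forms look the same on screen but are different strings
print(composed == decomposed)
```

Because the two strings compare as unequal, a query value pasted in composed form would not match feature values stored in decomposed form.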

In [17]:
query = f'''
word lemma={ud('ἐπί')}
'''
results = A.search(query)
A.show(results, condenseType='_sentence', condensed=True, end=10)

  0.24s 1390 results


In [18]:
A.table(results, end=10)

n,p,word
1,1 2:12,ἐπὶ
2,1 2:12,ἐπὶ
3,1 2:12,ἐπὶ
4,1 2:12,ἐπὶ
5,1 4:20,ἐπὶ
6,1 4:21,ἐπ’
7,1 4:24,ἐπὶ
8,1 4:27,ἐπὶ
9,1 7:55,ἐπὶ
10,1 8:68,ἐπὶ
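
The table shows that one lemma can surface in different word forms. As a small illustration of processing results programmatically, the sketch below counts the forms in the ten rows shown above. The forms are copied by hand from that table. In a live session you would presumably collect them from the result nodes instead, for example with `T.text(w)`, which is an assumption about the most convenient call here.

```python
from collections import Counter

full = "ἐπὶ"
elided = "ἐπ’"

# Word forms of rows 1-10 in the table above, in order
forms = [full] * 5 + [elided] + [full] * 4

counts = Counter(forms)
for form, n in counts.most_common():
    print(form, n)
```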


In [19]:
query = '''
word lemma=*ακαδημαικων
'''
results = A.search(query)
A.show(results, condenseType='_sentence', end=10)

  0.25s 2 results
