<img align="right" src="images/tf.png" width="128"/>
<img align="right" src="images/uu-small.png" width="128"/>
<img align="right" src="images/dans.png" width="128"/>

---

To get started: consult [start](start.ipynb)

---

# Export to Excel

In a notebook, you can perform searches, view the results in a tabular display, and zoom in on items with
pretty displays.

But there are times when you want to take your results outside Text-Fabric, outside a notebook, outside Python, and just
work with them in other programs, such as Excel.

You want to do that not only with query results, but with all kinds of lists of tuples of nodes.

There is a function for that, `A.export()`, and here we show what it can do.

In [1]:
%load_ext autoreload
%autoreload 2

# Incantation

The ins and outs of installing Text-Fabric, getting the corpus, and initializing a notebook are
explained in the [start tutorial](start.ipynb).

In [2]:
import os
from tf.app import use

In [3]:
A = use('quran', hoist=globals())

Using quran commit 2a86beb3f768265578d137052ca7f2b314f06ea1
  in /Users/dirk/text-fabric-data/__apps__/quran
Using q-ran/quran/tf - 0.2 rv0.3 in /Users/dirk/text-fabric-data


**Documentation:** <a target="_blank" href="https://github.com/q-ran/quran/blob/master/docs" title="provenance of Quran">QURAN</a> <a target="_blank" href="https://annotation.github.io/text-fabric/Writing/Arabic" title="('Arabic characters and transcriptions',)">Character table</a> <a target="_blank" href="https://github.com/q-ran/quran/blob/master/docs/features-0.2.md#features.md" title="QURAN feature documentation">Feature docs</a> <a target="_blank" href="https://github.com/annotation/app-quran" title="quran API documentation">quran API</a> <a target="_blank" href="https://annotation.github.io/text-fabric/Api/Fabric/" title="text-fabric-api">Text-Fabric API 7.3.8</a> <a target="_blank" href="https://annotation.github.io/text-fabric/Use/Search/" title="Search Templates Introduction and Reference">Search Reference</a>

# Inspect the contents of a file
We write a function that can peek into a file on your system and show the first few lines.
We'll use it to inspect the exported files that we are going to produce.

In [4]:
EXPORT_FILE = os.path.expanduser('~/Downloads/results.tsv')
UPTO = 10

def checkout():
    with open(EXPORT_FILE, encoding='utf_16') as fh:
        for (i, line) in enumerate(fh):
            if i >= UPTO:
                break
            print(line)

# Encoding

Our exported `.tsv` files open in Excel without hassle, even if they contain non-Latin characters.
That is because TF writes such files in an
encoding that works well with Excel: `utf_16_le`.
You can just open them in Excel; there is no need for conversion before or after opening these files.

Should you want to process these files by means of a (Python) program,
take care to read them with encoding `utf_16`.
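
As an illustration of reading such a file from Python, here is a minimal sketch using the standard `csv` module. So that it runs without Text-Fabric, it first writes a small stand-in file whose header and rows are copied from the ASCII-transcription export in this tutorial. The names `SAMPLE_ROWS`, `write_sample` and `read_export` are hypothetical helpers, and writing the stand-in with `utf_16` (which adds a byte order mark) is an assumption about what a real export looks like.

```python
import csv
import os
import tempfile

# Stand-in for an exported file: header and rows copied from this tutorial.
# Assumption: like this sample, a real export starts with a byte order mark,
# so that encoding 'utf_16' can detect the byte order when reading.
SAMPLE_ROWS = [
    ["R", "S1", "S2", "NODE1", "TYPE1", "TEXT1", "pos1", "posx1",
     "NODE2", "TYPE2", "TEXT2", "pos2", "root2"],
    ["1", "2", "7", "131", "word", "xatama ", "verb", "",
     "132", "word", "{ll~ahu ", "noun", "Alh"],
    ["2", "2", "17", "341", "word", "*ahaba ", "verb", "",
     "342", "word", "{ll~ahu ", "noun", "Alh"],
]


def write_sample(path):
    """Write the sample rows as a tab-separated UTF-16 file."""
    with open(path, "w", encoding="utf_16", newline="") as fh:
        fh.write("\n".join("\t".join(row) for row in SAMPLE_ROWS) + "\n")


def read_export(path):
    """Return the rows of a tab-separated export as a list of dicts."""
    with open(path, encoding="utf_16", newline="") as fh:
        # QUOTE_NONE: treat quote characters in the text as ordinary data
        reader = csv.DictReader(fh, delimiter="\t", quoting=csv.QUOTE_NONE)
        return list(reader)


sample_path = os.path.join(tempfile.mkdtemp(), "results.tsv")
write_sample(sample_path)
rows = read_export(sample_path)

for row in rows:
    print(row["R"], row["NODE1"], row["TEXT1"], row["root2"])
```

To read a real export, pass its path (for example the `EXPORT_FILE` defined above) to `read_export` instead of the sample path.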

# Example query

We first run a query in order to export the results.

In [5]:
query = '''
aya
  word pos=verb
  <: word pos=noun posx=proper root=Alh
  <: word
'''
results = A.search(query)

  0.33s 529 results


# Bare export

You can export the table of results to Excel.

The following command writes a tab-separated file `results.tsv` to your downloads directory.

You can specify arguments `toDir=directory` and `toFile=file name` to write to a different file.
If the directory does not exist, it will be created.

We stick to the default, however.

In [6]:
A.export(results)

Check out the contents:

In [7]:
checkout()

R	S1	S2	NODE1	TYPE1	NODE2	TYPE2	TEXT2	pos2	NODE3	TYPE3	TEXT3	pos3	posx3	root3	NODE4	TYPE4	TEXT4

1	2	7	205662	aya	131	word	خَتَمَ 	verb	132	word	ٱللَّهُ 	noun	proper	Alh	133	word	عَلَىٰ 

2	2	17	205672	aya	341	word	ذَهَبَ 	verb	342	word	ٱللَّهُ 	noun	proper	Alh	343	word	بِ

3	2	20	205675	aya	417	word	شَآءَ 	verb	418	word	ٱللَّهُ 	noun	proper	Alh	419	word	لَ

4	2	26	205681	aya	641	word	أَرَادَ 	verb	642	word	ٱللَّهُ 	noun	proper	Alh	643	word	بِ

5	2	27	205682	aya	676	word	أَمَرَ 	verb	677	word	ٱللَّهُ 	noun	proper	Alh	678	word	بِ

6	2	55	205710	aya	1393	word	نَرَى 	verb	1394	word	ٱللَّهَ 	noun	proper	Alh	1395	word	جَهْرَةً 

7	2	70	205725	aya	1922	word	شَآءَ 	verb	1923	word	ٱللَّهُ 	noun	proper	Alh	1924	word	لَ

8	2	73	205728	aya	1996	word	يُحْىِ 	verb	1997	word	ٱللَّهُ 	noun	proper	Alh	1998	word	ٱلْ

9	2	76	205731	aya	2127	word	فَتَحَ 	verb	2128	word	ٱللَّهُ 	noun	proper	Alh	2129	word	عَلَيْ



You see the following columns:

* **R** the sequence number of the result tuple in the result list
* **S1 S2** the section as sura and aya in separate columns
* **NODEi TYPEi** the node and its type, for each node **i** in the result tuple
* **TEXTi** the full text of node **i**, if the node type admits a concise text representation
* **pos2** the value of feature `pos` on node 2, since our query mentions the feature `pos` on node 2
* other features: likewise for `pos3`, `posx3`, `root3`
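
The naming scheme of the columns can be made explicit with a small sketch that groups the header names by the tuple member they belong to. The header is copied from the export above. The function `columns_per_member` is a hypothetical helper, and it assumes that every column other than `R` and `S1`, `S2`, ... ends in the number of its tuple member, which would be ambiguous for a feature whose own name ends in a digit.

```python
import re
from collections import defaultdict

# Header of the bare export shown above
HEADER = (
    "R S1 S2 NODE1 TYPE1 NODE2 TYPE2 TEXT2 pos2 "
    "NODE3 TYPE3 TEXT3 pos3 posx3 root3 NODE4 TYPE4 TEXT4"
).split()


def columns_per_member(header):
    """Split header names into row-level columns and per-member columns."""
    row_level = []
    per_member = defaultdict(list)
    for name in header:
        match = re.fullmatch(r"(.+?)(\d+)", name)
        if name == "R" or re.fullmatch(r"S\d+", name) or match is None:
            # result number and section columns describe the whole row
            row_level.append(name)
        else:
            # e.g. 'pos2' is column 'pos' of tuple member 2
            per_member[int(match.group(2))].append(match.group(1))
    return row_level, dict(per_member)


row_level, per_member = columns_per_member(HEADER)
print(row_level)
for member, names in sorted(per_member.items()):
    print(member, names)
```

Member 1 (the aya) only has `NODE` and `TYPE` columns, while the word members also get a `TEXT` column and the features mentioned in the query.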

# Richer exports

If we want to see the feature `posx` and the word gender (feature `gn`) on the last word (4), we must mention them
in the query. 

We can do so as follows:

In [8]:
query = '''
aya
  word pos=verb
  <: word pos=noun posx=proper root=Alh
  <: word posx* gn*
'''
results = A.search(query)

  0.55s 529 results


We get the same number of results as before, because `*` is a trivial condition: it is always true.
Its only effect is that the feature is mentioned in the query.

We do the export again and peek at the results.

In [9]:
A.export(results)
checkout()

R	S1	S2	NODE1	TYPE1	NODE2	TYPE2	TEXT2	pos2	NODE3	TYPE3	TEXT3	pos3	posx3	root3	NODE4	TYPE4	TEXT4	gn4	posx4

1	2	7	205662	aya	131	word	خَتَمَ 	verb	132	word	ٱللَّهُ 	noun	proper	Alh	133	word	عَلَىٰ 		

2	2	17	205672	aya	341	word	ذَهَبَ 	verb	342	word	ٱللَّهُ 	noun	proper	Alh	343	word	بِ		

3	2	20	205675	aya	417	word	شَآءَ 	verb	418	word	ٱللَّهُ 	noun	proper	Alh	419	word	لَ		emphatic

4	2	26	205681	aya	641	word	أَرَادَ 	verb	642	word	ٱللَّهُ 	noun	proper	Alh	643	word	بِ		

5	2	27	205682	aya	676	word	أَمَرَ 	verb	677	word	ٱللَّهُ 	noun	proper	Alh	678	word	بِ		

6	2	55	205710	aya	1393	word	نَرَى 	verb	1394	word	ٱللَّهَ 	noun	proper	Alh	1395	word	جَهْرَةً 	f	

7	2	70	205725	aya	1922	word	شَآءَ 	verb	1923	word	ٱللَّهُ 	noun	proper	Alh	1924	word	لَ		emphatic

8	2	73	205728	aya	1996	word	يُحْىِ 	verb	1997	word	ٱللَّهُ 	noun	proper	Alh	1998	word	ٱلْ		

9	2	76	205731	aya	2127	word	فَتَحَ 	verb	2128	word	ٱللَّهُ 	noun	proper	Alh	2129	word	عَلَيْ		



As you see, you have extra columns **gn4** and **posx4**.

This gives you a lot of control over the generation of spreadsheets.

# Not from queries

You can also export lists of node tuples that are not obtained by a query:

In [10]:
tuples = (
    tuple(results[0][1:3]),
    tuple(results[1][1:3]),
)

tuples

((131, 132), (341, 342))

Two rows, each consisting of two word nodes: a verb and the word that follows it.

Let's do a bare export:

In [11]:
A.export(tuples)
checkout()

R	S1	S2	NODE1	TYPE1	TEXT1	NODE2	TYPE2	TEXT2	pos2

1	2	7	131	word	خَتَمَ 	132	word	ٱللَّهُ 	noun

2	2	17	341	word	ذَهَبَ 	342	word	ٱللَّهُ 	noun



Wait a minute: why is the `pos2` there?

It is because we have run a query before where we asked for `pos`.

If we do not want to be influenced by previous things we've run, we need to reset the display:

In [12]:
A.displayReset('tupleFeatures')

Again:

In [13]:
A.export(tuples)
checkout()

R	S1	S2	NODE1	TYPE1	TEXT1	NODE2	TYPE2	TEXT2

1	2	7	131	word	خَتَمَ 	132	word	ٱللَّهُ 

2	2	17	341	word	ذَهَبَ 	342	word	ٱللَّهُ 



# Display setup

We can get richer exports by means of
`A.displaySetup()`, using the parameter `tupleFeatures`:

In [14]:
A.displaySetup(tupleFeatures=(
    (0, 'pos posx'),
    (1, 'pos root'),
))

We assign extra features per member of the tuple.

In the above case:

* the first (`0`) member (the first word node), gets features `pos` and `posx`;
* the second (`1`) member (the second word node), gets features `pos` and `root`.
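
Note that the members in `tupleFeatures` are counted from 0, while the column names in the export are numbered from 1. The following sketch makes that offset explicit. It is inferred from the headers of the exports in this tutorial, not taken from Text-Fabric's code, and `feature_columns` is a hypothetical helper.

```python
def feature_columns(tuple_features):
    """Predict the feature column names for a tupleFeatures specification.

    Assumption (inferred from the export headers in this tutorial):
    member i (0-based) gets its feature columns suffixed with i + 1.
    """
    columns = []
    for member, features in tuple_features:
        for feature in features.split():
            columns.append(f"{feature}{member + 1}")
    return columns


spec = (
    (0, "pos posx"),
    (1, "pos root"),
)
print(feature_columns(spec))
```

For the specification above this yields the names `pos1`, `posx1`, `pos2` and `root2`, which are the feature columns in the header of the export that follows.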

In [15]:
A.export(tuples)
checkout()

R	S1	S2	NODE1	TYPE1	TEXT1	pos1	posx1	NODE2	TYPE2	TEXT2	pos2	root2

1	2	7	131	word	خَتَمَ 	verb		132	word	ٱللَّهُ 	noun	Alh

2	2	17	341	word	ذَهَبَ 	verb		342	word	ٱللَّهُ 	noun	Alh



Talking about display setup: other parameters also have an effect, e.g. the text format.

Let's change it to the ASCII transcription.

In [16]:
A.export(tuples, fmt='text-trans-full')
checkout()

R	S1	S2	NODE1	TYPE1	TEXT1	pos1	posx1	NODE2	TYPE2	TEXT2	pos2	root2

1	2	7	131	word	xatama 	verb		132	word	{ll~ahu 	noun	Alh

2	2	17	341	word	*ahaba 	verb		342	word	{ll~ahu 	noun	Alh



# Chained queries

You can chain queries like this:

In [17]:
results = (
    A.search('''
aya
  word pos=verb
  <: word pos=noun posx=proper root=Alh
  <: word pos=verb
''')[0:2]
    +
    A.search('''
aya
  word pos=verb
  <: word pos=noun posx=proper root=Alh
  <: word pos=noun
''')[0:2]
)

  0.42s 24 results
  0.41s 101 results


In such cases, it is better to set up the features yourself:

In [18]:
A.displaySetup(
    tupleFeatures=(
        (1, 'root formation tense'),
        (3, 'pos gn nu ps mood'),
    ),
    fmt='text-phono-full',
)

Now we can do a fine export:

In [19]:
A.export(results)
checkout()

R	S1	S2	NODE1	TYPE1	NODE2	TYPE2	TEXT2	root2	formation2	tense2	NODE3	TYPE3	TEXT3	NODE4	TYPE4	TEXT4	pos4	gn4	nu4	ps4	mood4

1	2	91	205746	aya	2645	word	أَنزَلَ 	nzl	IV	perfect	2646	word	ٱللَّهُ 	2647	word	قَالُ	verb	m	p	3	

2	2	170	205825	aya	5164	word	أَنزَلَ 	nzl	IV	perfect	5165	word	ٱللَّهُ 	5166	word	قَالُ	verb	m	p	3	

3	2	55	205710	aya	1393	word	نَرَى 	rAy		imperfect	1394	word	ٱللَّهَ 	1395	word	جَهْرَةً 	noun	f			

4	2	80	205735	aya	2239	word	يُخْلِفَ 	xlf	IV	imperfect	2240	word	ٱللَّهُ 	2241	word	عَهْدَ	noun	m			



---

All chapters:

* **[start](start.ipynb)** introduction to computing with your corpus
* **[display](display.ipynb)** become an expert in creating pretty displays of your text structures
* **[search](search.ipynb)** turbo charge your hand-coding with search templates
* **exportExcel** make tailor-made spreadsheets out of your results
* (to come) **[share](share.ipynb)** draw in other people's data and let them use yours
* **[similarAyas](similarAyas.ipynb)** spot the similarities between lines

Back to [start](start.ipynb)