![dans](images/dans.png)
![tf](images/tf-small.png)

---
Start with [convert](convert.ipynb)

---

# Use the Banks example corpus

## Load TF

We are going to load the new data: all features.

We start a new instance of the TF machinery.

In [19]:
import os
import re

from tf.fabric import Fabric

In [20]:
TF_DIR = os.path.expanduser('~/github/annotation/tutorials/text-fabric/examples/banks/tf')

VERSION = '0.1'

TF_PATH = f'{TF_DIR}/{VERSION}'
TF = Fabric(locations=TF_PATH)

This is Text-Fabric 7.5.4
Api reference : https://annotation.github.io/text-fabric/Api/Fabric/

10 features found and 0 ignored


We ask for a list of all features:

In [21]:
allFeatures = TF.explore(silent=True, show=True)
loadableFeatures = allFeatures['nodes'] + allFeatures['edges']
loadableFeatures

('author',
 'gap',
 'letters',
 'number',
 'otype',
 'punc',
 'terminator',
 'title',
 'oslots')

We load all features:

In [22]:
api = TF.load(loadableFeatures, silent=False)

  0.00s loading features ...
   |     0.00s T otype                from /Users/dirk/github/annotation/tutorials/text-fabric/examples/banks/tf/0.1
   |     0.00s T oslots               from /Users/dirk/github/annotation/tutorials/text-fabric/examples/banks/tf/0.1
   |     0.00s T title                from /Users/dirk/github/annotation/tutorials/text-fabric/examples/banks/tf/0.1
   |     0.00s T number               from /Users/dirk/github/annotation/tutorials/text-fabric/examples/banks/tf/0.1
   |     0.00s T letters              from /Users/dirk/github/annotation/tutorials/text-fabric/examples/banks/tf/0.1
   |     0.00s T punc                 from /Users/dirk/github/annotation/tutorials/text-fabric/examples/banks/tf/0.1
   |      |     0.00s C __levels__           from otype, oslots, otext
   |      |     0.00s C __order__            from otype, oslots, __levels__
   |      |     0.00s C __rank__             from otype, __order__
   |      |     0.00s C __levUp__            from otype

You see that all files are marked with a `T`.

That means that Text-Fabric loads the features by reading the plain text `.tf` files.
But after reading, it makes a binary equivalent and stores it as a `.tfx`
file in the hidden `.tf` directory next to it.

Furthermore, you see some lines marked with `C`. Here Text-Fabric is computing derived data,
mostly about sections, the order of nodes, and the relative positions of nodes with respect to the slots they
are linked to.

The results of this pre-computation are also stored in that hidden `.tf` directory.

The next time, Text-Fabric loads the data from their binary `.tfx` files, which is much faster.
And the pre-computation step will be skipped.

If the binary files get outdated, Text-Fabric will recompile and recompute everything automatically.
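
To see the cached binary files for yourself, you can list the contents of the hidden `.tf` directory. The following is a minimal sketch, not part of Text-Fabric: the helper `list_binary_cache` is a hypothetical name, and it assumes the cache lives in `.tf` directly under the directory that holds the `.tf` feature files, possibly in a subdirectory (hence the recursive search).

```python
from pathlib import Path


def list_binary_cache(tf_path):
    """Return the sorted names of all `.tfx` files below `<tf_path>/.tf`.

    Assumption: the binary cache sits in a hidden `.tf` directory inside
    the feature directory, possibly one level deeper, so we search recursively.
    """
    cache_dir = Path(tf_path).expanduser() / ".tf"
    if not cache_dir.is_dir():
        # Nothing has been loaded (and therefore cached) from this location yet.
        return []
    return sorted(p.name for p in cache_dir.rglob("*.tfx"))
```

In this notebook you would call `list_binary_cache(TF_PATH)` after the first load. An empty list means that no binary files have been written yet.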

So let's load again.

In [23]:
TF = Fabric(locations=TF_PATH)
api = TF.load(loadableFeatures, silent=False)

This is Text-Fabric 7.5.4
Api reference : https://annotation.github.io/text-fabric/Api/Fabric/

10 features found and 0 ignored
  0.00s loading features ...
   |     0.00s B otype                from /Users/dirk/github/annotation/tutorials/text-fabric/examples/banks/tf/0.1
   |     0.00s B oslots               from /Users/dirk/github/annotation/tutorials/text-fabric/examples/banks/tf/0.1
   |     0.00s B title                from /Users/dirk/github/annotation/tutorials/text-fabric/examples/banks/tf/0.1
   |     0.00s B number               from /Users/dirk/github/annotation/tutorials/text-fabric/examples/banks/tf/0.1
   |     0.00s B letters              from /Users/dirk/github/annotation/tutorials/text-fabric/examples/banks/tf/0.1
   |     0.00s B punc                 from /Users/dirk/github/annotation/tutorials/text-fabric/examples/banks/tf/0.1
   |     0.00s B author               from /Users/dirk/github/annotation/tutorials/text-fabric/examples/banks/tf/0.1
   |     0.00s B gap    

Where there were `T`s before, there are now `B`s: the features have been read from their binary files.
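
If you want to check the markers without reading the log by eye, you can tally them from the log text. This sketch assumes you have the log available as a string (for example, copied from the cell output). The pattern and the helper `load_methods` are illustrative, not part of Text-Fabric. The sample lines are taken from the two load logs above.

```python
import re

# A log line looks like: "   |     0.00s T otype     from /some/path"
# We capture the one-letter marker (T, B or C) and the feature name after it.
LOG_LINE = re.compile(r"\d+\.\d+s\s+([TBC])\s+(\S+)")


def load_methods(log_text):
    """Map each feature named in a load log to its marker: T, B, or C."""
    return {m.group(2): m.group(1) for m in LOG_LINE.finditer(log_text)}


sample = """\
  0.00s loading features ...
   |     0.00s T otype                from /Users/dirk/github/annotation/tutorials/text-fabric/examples/banks/tf/0.1
   |      |     0.00s C __levels__           from otype, oslots, otext
   |     0.00s B oslots               from /Users/dirk/github/annotation/tutorials/text-fabric/examples/banks/tf/0.1
"""

markers = load_methods(sample)
print(markers)
```

The header line `0.00s loading features ...` is skipped, because the word after the timing is not one of the three markers.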

### Hoisting

We can access all TF data programmatically by using `api.Features`, or `api.F` (same thing), and a bunch of
other API members.

But if we are working with a single data source, we can hoist those API members to the global namespace.

This is not something to do when you write modules for other people, but if you are the only user,
why not make life just a little bit easier?

In [24]:
api.makeAvailableIn(globals())

[('Computed',
  'computed-data',
  ('C Computed', 'Call AllComputeds', 'Cs ComputedString')),
 ('Features', 'edge-features', ('E Edge', 'Eall AllEdges', 'Es EdgeString')),
 ('Fabric', 'loading', ('ensureLoaded', 'TF', 'ignored', 'loadLog')),
 ('Locality', 'locality', ('L Locality',)),
 ('Misc', 'messaging', ('cache', 'error', 'indent', 'info', 'reset')),
 ('Nodes',
  'navigating-nodes',
  ('N Nodes', 'sortKey', 'sortKeyTuple', 'otypeRank', 'sortNodes')),
 ('Features',
  'node-features',
  ('F Feature', 'Fall AllFeatures', 'Fs FeatureString')),
 ('Search', 'search', ('S Search',)),
 ('Text', 'text', ('T Text',))]

As a result, you have an overview of the names you can use.
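
To make the idea of hoisting concrete, here is a toy illustration in plain Python. It is not how Text-Fabric implements `makeAvailableIn`. The function `hoist` and the stand-in object `toy_api` are hypothetical and only show what it means to copy members of an object into a namespace dictionary such as `globals()`.

```python
from types import SimpleNamespace


def hoist(api, namespace, names):
    """Copy the attributes `names` of `api` into `namespace` under the same names."""
    for name in names:
        namespace[name] = getattr(api, name)
    return sorted(names)


# A stand-in for an API object with a few members.
toy_api = SimpleNamespace(F="node features", L="locality", T="text")

# In a notebook you would pass globals(); a plain dict keeps this example tidy.
target = {}
hoist(toy_api, target, ["F", "L", "T"])
print(target["F"])  # the same object as toy_api.F
```

After hoisting into `globals()`, you can write `F` instead of `api.F`, which is what the cells below do.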

## Exploration

Finally, let's explore this set by means of Text-Fabric.

### Frequency list

We can get ordered frequency lists for the values of all features.

First the words:

In [25]:
F.letters.freqList()

(('the', 8),
 ('of', 5),
 ('and', 4),
 ('in', 3),
 ('we', 3),
 ('everything', 2),
 ('know', 2),
 ('most', 2),
 ('ones', 2),
 ('patterns', 2),
 ('us', 2),
 ('Besides', 1),
 ('Culture', 1),
 ('Everything', 1),
 ('So', 1),
 ('a', 1),
 ('about', 1),
 ('aid', 1),
 ('any', 1),
 ('around', 1),
 ('as', 1),
 ('barbarian', 1),
 ('bottom', 1),
 ('can', 1),
 ('care', 1),
 ('climbing', 1),
 ('composed', 1),
 ('control', 1),
 ('dead', 1),
 ('elegant', 1),
 ('enjoyable', 1),
 ('final', 1),
 ('find', 1),
 ('free', 1),
 ('games', 1),
 ('good', 1),
 ('harness', 1),
 ('have', 1),
 ('high', 1),
 ('humans', 1),
 ('impossible', 1),
 ('is', 1),
 ('it', 1),
 ('languages', 1),
 ('left', 1),
 ('life', 1),
 ('line', 1),
 ('make', 1),
 ('mattered', 1),
 ('mountains', 1),
 ('not', 1),
 ('nothing', 1),
 ('our', 1),
 ('over', 1),
 ('own', 1),
 ('problems', 1),
 ('really', 1),
 ('romance', 1),
 ('safety', 1),
 ('societies', 1),
 ('sports', 1),
 ('studying', 1),
 ('such', 1),
 ('take', 1),
 ('terms', 1),
 ('that', 1),
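
The list above appears to be ordered by descending frequency and, within the same frequency, by value, with capitalized words before lowercase ones. A minimal sketch of that ordering in plain Python follows. It is an illustration of the ordering, not Text-Fabric's own code, and `freq_list` and the sample words are made up for the example.

```python
from collections import Counter


def freq_list(values):
    """Return (value, count) pairs, most frequent first, ties sorted by value."""
    counts = Counter(values)
    return tuple(sorted(counts.items(), key=lambda item: (-item[1], item[0])))


words = ["the", "of", "the", "Culture", "of", "the", "a"]
print(freq_list(words))
# (('the', 3), ('of', 2), ('Culture', 1), ('a', 1))
```

Plain string comparison puts uppercase letters before lowercase ones, which is why `Culture` precedes `a` here, just as `Besides` precedes `a` in the output above.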

We can get information about the node types from the precomputed `levels` data:

In [26]:
C.levels.data

(('book', 99.0, 100, 100),
 ('chapter', 49.5, 101, 102),
 ('sentence', 33.0, 115, 117),
 ('line', 7.666666666666667, 103, 114),
 ('word', 1, 1, 99))

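Each entry lists a node type, the average number of slots (here: words) per node of that type,
and the first and last node of that type.

So chapters are 49.5 words long on average, and the chapter nodes are 101 and 102.

And you see that we have 99 words: the word nodes run from 1 to 99.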
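
The arithmetic behind these numbers can be worked out directly from the tuples. The sketch below copies the output of `C.levels.data` shown above into a literal, so it runs without Text-Fabric. The helper `summarize` is a hypothetical name.

```python
# The value of C.levels.data as shown in the output above.
levels = (
    ("book", 99.0, 100, 100),
    ("chapter", 49.5, 101, 102),
    ("sentence", 33.0, 115, 117),
    ("line", 7.666666666666667, 103, 114),
    ("word", 1, 1, 99),
)


def summarize(levels):
    """For each node type, return (number of nodes, summed size in slots)."""
    summary = {}
    for node_type, avg_slots, first, last in levels:
        count = last - first + 1  # node numbers of one type are consecutive
        summary[node_type] = (count, round(count * avg_slots))
    return summary


for node_type, (count, slots) in summarize(levels).items():
    print(f"{node_type:<8} {count:>3} nodes, {slots:>3} slots in total")
```

For example, there are 2 chapter nodes (101 and 102) with an average of 49.5 words each, which accounts for all 99 words.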

---
All chapters:

* [convert](convert.ipynb)
* *use*
* [share](share.ipynb)
* [app](app.ipynb)

---