In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
from tf.app import use

# Cluster display in Old Babylonian

We show some details of the display logic by following an example: cluster nodes in the Old Babylonian corpus.

Clusters are difficult to display, because

* they do not necessarily respect proper embedding
* material can be part of several clusters

We show how we deal with the second point: material that belongs to several clusters must not be displayed more than once.
As an illustration, we show the effect of an earlier bug and indicate the fix.
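
To make the two difficulties concrete, here is a minimal sketch that does not use Text-Fabric at all. It models clusters as plain sets of sign positions; the cluster names and the positions are hypothetical and only mimic the bracket structure of a line such as `[a-na] _{d}suen_-i-[din-nam]`.

```python
from itertools import combinations

# Hypothetical toy data: positions 1..7 stand for the seven signs of
# "[a-na] _{d}suen_-i-[din-nam]"; the cluster names are made up for this sketch.
clusters = {
    "missing_1": {1, 2},  # [a-na]
    "langalt": {3, 4},    # _{d}suen_
    "det": {3},           # {d}
    "missing_2": {6, 7},  # [din-nam]
}


def clusters_of(sign, clusters):
    """Return the names of all clusters that contain the given sign."""
    return sorted(name for name, signs in clusters.items() if sign in signs)


def crossing_pairs(clusters):
    """Return pairs of clusters that overlap without one containing the other."""
    return [
        (a, b)
        for (a, sa), (b, sb) in combinations(sorted(clusters.items()), 2)
        if sa & sb and not (sa <= sb or sb <= sa)
    ]


print(clusters_of(3, clusters))     # the sign at position 3 is in two clusters
print(crossing_pairs(clusters))     # no crossing clusters in this toy line
print(crossing_pairs({"a": {1, 2}, "b": {2, 3}}))  # a crossing pair
```

A sign that sits in two clusters, like position 3 here, is exactly the kind of material that a naive display could show twice.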

We start with loading the corpus.

In [3]:
A = use('oldbabylonian', hoist=globals())

Using TF-app in /Users/dirk/text-fabric-data/annotation/app-oldbabylonian/code:
	rv2.0.5=#6350ec8fbf3f11676e951b5c18fe3f29cc3d7c7b offline under ~/text-fabric-data (local release)
Using data in /Users/dirk/text-fabric-data/Nino-cunei/oldbabylonian/tf/1.0.5:
	rv1.5=#53a9dcaf54ed598cf21a1e271f30838824cb1c62 offline under ~/text-fabric-data (local release)
   |     0.00s Dataset without structure sections in otext:no structure functions in the T-API


# An example line

Here is a line with some nested clusters.

The node number is stored in the variable `ln`.

We show the raw ATF source of the line, and the text according to several text formats.

In [4]:
ln = 230788

In [5]:
F.srcLn.v(ln)

'1. [a-na] _{d}suen_-i-[din-nam]'

In [6]:
T.text(ln)

'[a-na] _{d}suen_-i-[din-nam]'

In [7]:
T.text(ln, fmt='text-orig-rich')

'a-na d⁼suen-i-din-nam'

In [8]:
T.text(ln, fmt='text-orig-unicode')

'𒀀𒈾 𒀭𒂗𒍪𒄿𒁷𒉆'

N.B.: These are the right Unicode code points, but they are not rendered as the right signs; we need another font for that.

We can get the right signs by using `plain`:

In [9]:
A.plain(ln, fmt='text-orig-unicode')

Even better, we translate the effect of clusters into layout:

In [10]:
A.plain(ln, fmt='layout-orig-unicode')

Click on the passage link in order to go to the page for this tablet on CDLI, where you can read off the
exact source:

```
1. [a-na] _{d}suen_-i-[din-nam]
```

## The clusters

We find the clusters of this line by means of the
[`L` API](https://annotation.github.io/text-fabric/Api/Locality/).

They are returned as a tuple of nodes.

In [11]:
cls = L.d(ln, otype="cluster")
cls

(203220, 203221, 203222, 203223)

We'll give each cluster its own highlight color:

In [12]:
colors = """
    cyan
    yellow
    lightsalmon
    lightgreen
""".strip().split()

highlights = dict(zip(cls, colors))
highlights

{203220: 'cyan', 203221: 'yellow', 203222: 'lightsalmon', 203223: 'lightgreen'}

In [13]:
A.plain(ln, highlights=highlights)

By default, `pretty` displays in this corpus unfold down to the word level.

But first we unfold all the way down, to the sign level.

In [14]:
A.pretty(ln, highlights=highlights, baseTypes="sign")

Unfolding to the word level poses a challenge: an earlier implementation of the display algorithm resulted in this:

<img src="images/faultyClusterDisplayOBB.png" width="740">

This is wrong because all clusters have a spurious occurrence in the display.

What happened here?

The display algorithm (let's call it D) arrived at the line and was instructed to display all children of type `word` and `cluster`.
Where does this child information come from? From defaults and configuration, which result in a mapping called `childType`:

In [15]:
A.showContext(key="childType")

<details open><summary><b>oldbabylonian</b> <i>app context</i></summary>

<details open><summary>11. childType</summary>


*   **`cluster`**: 
    *   `cluster`
    *   `sign`
    *   `word`
*   **`document`**: 
    *   `face`
*   **`face`**: 
    *   `line`
*   **`line`**: 
    *   `cluster`
    *   `word`
*   **`word`**: 
    *   `cluster`
    *   `sign`

</details>
</details>
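
One way to read this mapping is as a lookup table from a node type to the types of children that get unfolded below it. The sketch below copies the mapping from the output above into a plain dictionary. The helper `types_to_unfold` is hypothetical and not part of Text-Fabric; it only illustrates the rule that a node of a base type is not unfolded any further.

```python
# Copied by hand from the `childType` output above.
child_type = {
    "cluster": {"cluster", "sign", "word"},
    "document": {"face"},
    "face": {"line"},
    "line": {"cluster", "word"},
    "word": {"cluster", "sign"},
}


def types_to_unfold(node_type, base_types=frozenset({"word"})):
    """Hypothetical helper: the child types that are unfolded below a node.

    A node whose type is a base type is not unfolded;
    it would be rendered in plain mode instead.
    """
    if node_type in base_types:
        return set()
    return child_type.get(node_type, set())


print(types_to_unfold("line"))                        # cluster and word
print(types_to_unfold("word"))                        # nothing: word is a base type
print(types_to_unfold("word", base_types={"sign"}))   # cluster and sign
```

With `word` as base type, unfolding stops at words; with `sign` as base type, words are unfolded into their clusters and signs.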


D arrives at the first child, a word (`a-na`).
Since `word` is in the `baseTypes`, D stops unfolding, and switches to plain display.
In doing so D encounters the first cluster, and it gets displayed and highlighted in `plain`.

Then D moves on to the next child, which is the first cluster (`[a-na]`), the one that D has already displayed.

**But due to a subtle bug, D has forgotten that!**

We have remedied that:

In [16]:
A.pretty(ln, highlights=highlights)
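
The failure mode can be mimicked in a toy model. This is a sketch under simplifying assumptions and not the actual Text-Fabric code: the children of the line are given as a flat list, the cluster names `c1` to `c4` are made up, and the function `render` with its `remember` switch is hypothetical. Switching `remember` off reproduces the spurious cluster occurrences; switching it on gives the intended behaviour.

```python
def render(children, remember=True):
    """Render a flat list of (kind, name, clusters_inside) children.

    Hypothetical toy model: a word is shown in plain mode together with
    the clusters inside it; a cluster child is shown on its own unless
    it is remembered as already shown.
    """
    seen = set()
    out = []
    for kind, name, inside in children:
        if kind == "word":
            out.append(f"word {name} with {', '.join(inside) or 'no cluster'}")
            if remember:
                # the step that the buggy version effectively skipped
                seen.update(inside)
        elif name not in seen:
            out.append(f"cluster {name}")
            seen.add(name)
    return out


# Made-up children of the example line, in display order.
children = [
    ("word", "a-na", ["c1"]),
    ("cluster", "c1", []),
    ("word", "suen-i-din-nam", ["c2", "c3", "c4"]),
    ("cluster", "c2", []),
    ("cluster", "c3", []),
    ("cluster", "c4", []),
]

print(render(children, remember=False))  # every cluster shows up a second time
print(render(children))                  # only the two words remain
```

The design choice is to record what has been displayed at the moment it is displayed, so that later siblings can be skipped without inspecting the output.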

# More examples

We finish off with some more examples.

In [23]:
colors = """
    cyan
    yellow
    lightsalmon
    lightgreen
""".strip().split()

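def example(ln):
    """Show one line: source, text, and the plain and pretty displays."""
    # node number, ATF source line, and text in the default format
    print(ln)
    print(F.srcLn.v(ln))
    print(T.text(ln))
    A.plain(ln, fmt='layout-orig-unicode')
    # give each cluster in the line its own highlight color
    cls = L.d(ln, otype="cluster")
    highlights = dict(zip(cls, colors[0:len(cls)]))
    print(highlights)
    A.plain(ln, highlights=highlights)
    # pretty display: first unfolded to the sign level, then to the default word level
    A.pretty(ln, highlights=highlights, baseTypes="sign")
    A.pretty(ln, highlights=highlights)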
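
Because `zip` stops at the shorter of its inputs, a line with more than four clusters would leave the later clusters without a highlight color. A possible variation, shown here as a standalone sketch with a hypothetical helper `color_map`, recycles the colors with `itertools.cycle`. Whether reusing colors is desirable depends on how many clusters a line has; the node numbers below are taken from the outputs in this notebook.

```python
from itertools import cycle

colors = ["cyan", "yellow", "lightsalmon", "lightgreen"]


def color_map(nodes, colors=colors):
    """Assign a color to every node, reusing colors when they run out."""
    return dict(zip(nodes, cycle(colors)))


# plain zip: the fifth and sixth node get no color
print(dict(zip(range(6), colors)))
# cycling: the fifth node gets the first color again
print(color_map(range(6)))
print(color_map([203268, 203269]))
```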

In [24]:
example(F.otype.s('line')[22])

230810
6'. x _[a-sza3_ s,i]-bi-it ku-un-zu-lum
x _[a-sza3_ s,i]-bi-it ku-un-zu-lum


{203268: 'cyan', 203269: 'yellow'}


In [32]:
example(F.otype.s('line')[2553])

233341
6. isz-tu sza-ad-da-aq-dam# a-na _sze#-[numun?_]
isz-tu sza-ad-da-aq-dam# a-na _sze#-[numun?_]


{205585: 'cyan', 205586: 'yellow'}


# Developer's cells


Use `A.reuse()` if you have changed the `config.yaml` of this corpus and want to reapply the settings.

Inspect the result of the new settings by means of `A.showContext()`.

In [26]:
A.reuse()

In [17]:
A.showContext()

<details ><summary><b>oldbabylonian</b> <i>app context</i></summary>

<details ><summary>1. afterChild</summary>

{}

</details>
<details ><summary>2. allowedBaseTypes</summary>


1.  `word`
2.  `cluster`

</details>
<details ><summary>3. apiVersion</summary>

`1`

</details>
<details ><summary>4. appName</summary>

`oldbabylonian`

</details>
<details ><summary>5. appPath</summary>

`/Users/dirk/text-fabric-data/annotation/app-oldbabylonian/code`

</details>
<details ><summary>6. baseTypes</summary>


1.  `word`

</details>
<details ><summary>7. browseContentPretty</summary>

False

</details>
<details ><summary>8. browseNavLevel</summary>

`2`

</details>
<details ><summary>9. charText</summary>

`mapping from readings to UNICODE`

</details>
<details ><summary>10. charUrl</summary>

`https://github.com/Nino-cunei/oldbabylonian/blob/master/docs/programs/mapReadings.ipynb`

</details>
<details ><summary>11. childType</summary>


*   **`cluster`**: 
    *   `cluster`
    *   `sign`
    *   `word`
*   **`document`**: 
    *   `face`
*   **`face`**: 
    *   `line`
*   **`line`**: 
    *   `cluster`
    *   `word`
*   **`word`**: 
    *   `cluster`
    *   `sign`

</details>
<details ><summary>12. childrenCustom</summary>

{}

</details>
<details ><summary>13. chunkedTypes</summary>

set()

</details>
<details ><summary>14. commit</summary>

`6350ec8fbf3f11676e951b5c18fe3f29cc3d7c7b`

</details>
<details ><summary>15. condenseType</summary>

`line`

</details>
<details ><summary>16. condenseTypes</summary>


1.  
    *   `document`
    *   `158.14708171206226`
    *   `226669`
    *   `227953`
2.  
    *   `face`
    *   `71.70748059280169`
    *   `227954`
    *   `230787`
3.  
    *   `line`
    *   `7.423525114155251`
    *   `230788`
    *   `258162`
4.  
    *   `word`
    *   `2.6436180641788116`
    *   `258163`
    *   `334667`
5.  
    *   `cluster`
    *   `1.782122905027933`
    *   `203220`
    *   `226668`
6.  
    *   `sign`
    *   `1`
    *   `1`
    *   `203219`

</details>
<details ><summary>17. corpus</summary>

`Old Babylonian Letters 1900-1600: Cuneiform tablets `

</details>
<details ><summary>18. css</summary>


```
.pnum {
    font-family: sans-serif;
    font-size: small;
    font-weight: bold;
    color: #444444;
}
.op {
    padding:  0.5em 0.1em 0.1em 0.1em;
    margin: 0.8em 0.1em 0.1em 0.1em;
    font-family: monospace;
    font-size: x-large;
    font-weight: bold;
}
.period {
    font-family: monospace;
    font-size: medium;
    font-weight: bold;
    color: #0000bb;
}
.comment {
    color: #7777dd;
    font-family: monospace;
    font-size: small;
}
.operator {
    color: #ff77ff;
    font-size: large;
}
/* LANGUAGE: superscript and subscript */

/* cluster */
.det {
    vertical-align: super;
}
/* cluster */
.langalt {
    vertical-align: sub;
}
/* REDACTIONAL: line over or under  */

/* flag */
.collated {
    font-weight: bold;
    text-decoration: underline;
}
/* cluster */
.excised {
    color: #dd0000;
    text-decoration: line-through;
}
/* cluster */
.supplied {
    color: #0000ff;
    text-decoration: overline;
}
/* flag */
.remarkable {
    font-weight: bold;
    text-decoration: overline;
}

/* UNSURE: italic*/

/* cluster */
.uncertain {
    font-style: italic
}
/* flag */
.question {
    font-weight: bold;
    font-style: italic
}

/* BROKEN: text-shadow */

/* cluster */
.missing {
    color: #999999;
    text-shadow: #bbbbbb 1px 1px;
}
/* flag */
.damage {
    font-weight: bold;
    color: #999999;
    text-shadow: #bbbbbb 1px 1px;
}
.empty {
  color: #ff0000;
}


```


</details>
<details ><summary>19. dataDisplay</summary>


*   **`showVerseInTuple`**: `True`
*   **`textFormats`**: 
    *   **`layout-orig-rich`**: 
        *   **`method`**: `layoutRich`
        *   **`style`**: `trans`
    *   **`layout-orig-unicode`**: 
        *   **`method`**: `layoutUnicode`
        *   **`style`**: `orig`
    *   **`text-orig-full`**: 
        *   **`style`**: `source`
    *   **`text-orig-plain`**: 
        *   **`style`**: `trans`
    *   **`text-orig-rich`**: 
        *   **`style`**: `trans`
    *   **`text-orig-unicode`**: 
        *   **`style`**: `orig`

</details>
<details ><summary>20. defaultClsOrig</summary>

`txtu akk`

</details>
<details ><summary>21. defaultFormat</summary>

`text-orig-full`

</details>
<details ><summary>22. direction</summary>

`ltr`

</details>
<details ><summary>23. docBase</summary>

`https://github.com/Nino-cunei/oldbabylonian/blob/master/docs`

</details>
<details ><summary>24. docExt</summary>

`.md`

</details>
<details ><summary>25. docPage</summary>

`about`

</details>
<details ><summary>26. docRoot</summary>

`https://github.com`

</details>
<details ><summary>27. docUrl</summary>

`https://github.com/Nino-cunei/oldbabylonian/blob/master/docs/about.md`

</details>
<details ><summary>28. docs</summary>


*   **`charText`**: `mapping from readings to UNICODE`
*   **`charUrl`**: `https://github.com/Nino-cunei/oldbabylonian/blob/master/docs/programs/mapReadings.ipynb`
*   **`docBase`**: `https://github.com/Nino-cunei/oldbabylonian/blob/master/docs`
*   **`docExt`**: `.md`
*   **`docPage`**: `about`
*   **`docRoot`**: `https://github.com`
*   **`docUrl`**: `https://github.com/Nino-cunei/oldbabylonian/blob/master/docs/about.md`
*   **`featureBase`**: `https://github.com/Nino-cunei/oldbabylonian/blob/master/docs/transcription.md`
*   **`featurePage`**: *empty*

</details>
<details ><summary>29. doi</summary>

`10.5281/zenodo.2579207`

</details>
<details ><summary>30. exampleSection</summary>

`P509373 obverse:1`

</details>
<details ><summary>31. exampleSectionHtml</summary>

`<code>P509373 obverse:1</code>`

</details>
<details ><summary>32. excludedFeatures</summary>

set()

</details>
<details ><summary>33. extension</summary>

` akk`

</details>
<details ><summary>34. featureBase</summary>

`https://github.com/Nino-cunei/oldbabylonian/blob/master/docs/transcription.md`

</details>
<details ><summary>35. featurePage</summary>

*empty*

</details>
<details ><summary>36. features</summary>


*   **`cluster`**: 
    *   []
    *   {}
*   **`document`**: 
    *   []
    *   {}
*   **`face`**: 
    *   []
    *   {}
*   **`line`**: 
    *   
        *   `remarks`
        *   `translation@en`
    *   {}
*   **`sign`**: 
    *   
        *   `collated`
        *   `remarkable`
        *   `question`
        *   `damage`
        *   `det`
        *   `uncertain`
        *   `missing`
        *   `excised`
        *   `supplied`
        *   `langalt`
        *   `comment`
        *   `remarks`
        *   `repeat`
        *   `fraction`
        *   `operator`
        *   `grapheme`
    *   {}
*   **`word`**: 
    *   []
    *   {}

</details>
<details ><summary>37. featuresBare</summary>


*   **`cluster`**: 
    *   []
    *   {}
*   **`document`**: 
    *   
        *   `collection`
        *   `volume`
        *   `docnumber`
        *   `docnote`
    *   {}
*   **`face`**: 
    *   
        *   `object`
    *   {}
*   **`line`**: 
    *   []
    *   {}
*   **`sign`**: 
    *   []
    *   {}
*   **`word`**: 
    *   []
    *   {}

</details>
<details ><summary>38. formatCls</summary>


*   **`layout-orig-rich`**: `txtt`
*   **`layout-orig-unicode`**: `txtu akk`
*   **`text-orig-full`**: `txto`
*   **`text-orig-plain`**: `txtt`
*   **`text-orig-rich`**: `txtt`
*   **`text-orig-unicode`**: `txtu akk`

</details>
<details ><summary>39. formatHtml</summary>


1.  `layout-orig-rich`
2.  `layout-orig-unicode`

</details>
<details ><summary>40. formatMethod</summary>


*   **`layout-orig-rich`**: `layoutRich`
*   **`layout-orig-unicode`**: `layoutUnicode`

</details>
<details ><summary>41. formatStyle</summary>


*   **`normal`**: `txtn`
*   **`orig`**: `txtu akk`
*   **`phono`**: `txtp`
*   **`source`**: `txto`
*   **`trans`**: `txtt`

</details>
<details ><summary>42. graphicsRelative</summary>

None

</details>
<details ><summary>43. hasGraphics</summary>

set()

</details>
<details ><summary>44. interfaceDefaults</summary>


*   **`lineNumbers`**: False
*   **`prettyTypes`**: `True`
*   **`queryFeatures`**: `True`
*   **`showChunks`**: None
*   **`showGraphics`**: None
*   **`standardFeatures`**: False
*   **`withNodes`**: False
*   **`withTypes`**: False

</details>
<details ><summary>45. isChunkOf</summary>

{}

</details>
<details ><summary>46. isCompatible</summary>

`True`

</details>
<details ><summary>47. labels</summary>


*   **`cluster`**: 
    *   `{type}`
    *   
        *   `type`
*   **`document`**: 
    *   `True`
    *   ()
*   **`face`**: 
    *   `True`
    *   ()
*   **`line`**: 
    *   *empty*
    *   ()
*   **`sign`**: 
    *   `True`
    *   ()
*   **`word`**: 
    *   `True`
    *   ()

</details>
<details ><summary>48. language</summary>

`Akkadian`

</details>
<details ><summary>49. levelCls</summary>


*   **`cluster`**: 
    *   **`children`**: `children hor wrap`
    *   **`container`**: `contnr c1`
    *   **`label`**: `lbl c1`
*   **`document`**: 
    *   **`children`**: `children hor wrap`
    *   **`container`**: `contnr c4`
    *   **`label`**: `lbl c4`
*   **`face`**: 
    *   **`children`**: `children hor wrap`
    *   **`container`**: `contnr c4`
    *   **`label`**: `lbl c4`
*   **`line`**: 
    *   **`children`**: `children hor wrap`
    *   **`container`**: `contnr c3`
    *   **`label`**: `lbl c3`
*   **`sign`**: 
    *   **`children`**: *empty*
    *   **`container`**: `contnr c0`
    *   **`label`**: `lbl c0`
*   **`word`**: 
    *   **`children`**: `children hor `
    *   **`container`**: `contnr c2`
    *   **`label`**: `lbl c2`

</details>
<details ><summary>50. levels</summary>


*   **`cluster`**: 
    *   **`flow`**: `hor`
    *   **`level`**: `1`
    *   **`stretch`**: False
    *   **`wrap`**: `True`
*   **`document`**: 
    *   **`flow`**: `hor`
    *   **`level`**: `4`
    *   **`stretch`**: `True`
    *   **`wrap`**: `True`
*   **`face`**: 
    *   **`flow`**: `hor`
    *   **`level`**: `4`
    *   **`stretch`**: `True`
    *   **`wrap`**: `True`
*   **`line`**: 
    *   **`flow`**: `hor`
    *   **`level`**: `3`
    *   **`stretch`**: `True`
    *   **`wrap`**: `True`
*   **`sign`**: 
    *   **`flow`**: `ver`
    *   **`level`**: 0
    *   **`stretch`**: False
    *   **`wrap`**: False
*   **`word`**: 
    *   **`flow`**: `hor`
    *   **`level`**: `2`
    *   **`stretch`**: `True`
    *   **`wrap`**: False

</details>
<details ><summary>51. lexMap</summary>

{}

</details>
<details ><summary>52. lexTypes</summary>

set()

</details>
<details ><summary>53. lineNumberFeature</summary>


*   **`document`**: `srcLnNum`
*   **`face`**: `srcLnNum`
*   **`line`**: `srcLnNum`

</details>
<details ><summary>54. local</summary>

`local`

</details>
<details ><summary>55. localDir</summary>

`/Users/dirk/text-fabric-data/Nino-cunei/oldbabylonian/_temp`

</details>
<details ><summary>56. moduleSpecs</summary>

()

</details>
<details ><summary>57. noChildren</summary>

set()

</details>
<details ><summary>58. noDescendTypes</summary>

set()

</details>
<details ><summary>59. noneValues</summary>


1.  None

</details>
<details ><summary>60. org</summary>

`Nino-cunei`

</details>
<details ><summary>61. plainCustom</summary>

{}

</details>
<details ><summary>62. prettyCustom</summary>

{}

</details>
<details ><summary>63. provenanceSpec</summary>


*   **`corpus`**: `Old Babylonian Letters 1900-1600: Cuneiform tablets `
*   **`doi`**: `10.5281/zenodo.2579207`
*   **`graphicsRelative`**: None
*   **`moduleSpecs`**: ()
*   **`org`**: `Nino-cunei`
*   **`relative`**: `tf`
*   **`repo`**: `oldbabylonian`
*   **`version`**: `1.0.5`
*   **`webBase`**: `https://cdli.ucla.edu`
*   **`webHint`**: `Show this document on CDLI`
*   **`webLang`**: None
*   **`webLexId`**: None
*   **`webUrl`**: `https://cdli.ucla.edu/search/search_results.php?SearchMode=Text&ObjectID=<1>`
*   **`webUrlLex`**: None
*   **`zip`**: None

</details>
<details ><summary>64. relative</summary>

`tf`

</details>
<details ><summary>65. release</summary>

`v2.0.5`

</details>
<details ><summary>66. repo</summary>

`oldbabylonian`

</details>
<details ><summary>67. sectionSep1</summary>

` `

</details>
<details ><summary>68. sectionSep2</summary>

`:`

</details>
<details ><summary>69. showVerseInTuple</summary>

`True`

</details>
<details ><summary>70. styles</summary>

{}

</details>
<details ><summary>71. templates</summary>


*   **`cluster`**: 
    *   *empty*
    *   ()
*   **`document`**: 
    *   `True`
    *   ()
*   **`face`**: 
    *   `True`
    *   ()
*   **`line`**: 
    *   *empty*
    *   ()
*   **`sign`**: 
    *   `True`
    *   ()
*   **`word`**: 
    *   *empty*
    *   ()

</details>
<details ><summary>72. tfDoc</summary>

`https://annotation.github.io/text-fabric`

</details>
<details ><summary>73. transform</summary>

{}

</details>
<details ><summary>74. typeDisplay</summary>


*   **`cluster`**: 
    *   **`children`**: 
        *   `cluster`
        *   `word`
        *   `sign`
    *   **`label`**: `{type}`
    *   **`stretch`**: False
*   **`document`**: 
    *   **`featuresBare`**: `collection volume docnumber docnote`
    *   **`lineNumber`**: `srcLnNum`
*   **`face`**: 
    *   **`featuresBare`**: `object`
    *   **`lineNumber`**: `srcLnNum`
*   **`line`**: 
    *   **`children`**: 
        *   `cluster`
        *   `word`
    *   **`features`**: `remarks translation@en`
    *   **`lineNumber`**: `srcLnNum`
*   **`sign`**: 
    *   **`features`**: `collated remarkable question damage det uncertain missing excised supplied langalt comment remarks repeat fraction operator grapheme`
*   **`word`**: 
    *   **`base`**: `True`
    *   **`children`**: 
        *   `cluster`
        *   `sign`
    *   **`label`**: `True`
    *   **`wrap`**: False

</details>
<details ><summary>75. urlGh</summary>

`https://github.com`

</details>
<details ><summary>76. urlNb</summary>

`https://nbviewer.jupyter.org/github`

</details>
<details ><summary>77. verseTypes</summary>


1.  `line`

</details>
<details ><summary>78. version</summary>

`1.0.5`

</details>
<details ><summary>79. webBase</summary>

`https://cdli.ucla.edu`

</details>
<details ><summary>80. webHint</summary>

`Show this document on CDLI`

</details>
<details ><summary>81. webLang</summary>

None

</details>
<details ><summary>82. webLexId</summary>

None

</details>
<details ><summary>83. webUrl</summary>

`https://cdli.ucla.edu/search/search_results.php?SearchMode=Text&ObjectID=<1>`

</details>
<details ><summary>84. webUrlLex</summary>

None

</details>
<details ><summary>85. writing</summary>

`akk`

</details>
<details ><summary>86. zip</summary>


1.  `oldbabylonian`

</details>
</details>



In [31]:
for ln in F.otype.s('line')[2550:2600]:
    print(F.srcLn.v(ln))

3. um-ma ip-qu2-{d}sza-la#-[ma]
4. {d}utu u3 {d}marduk li-ba-al#-[li-t,u2-ka]
5. lu sza-al-ma-[ta]
6. isz-tu sza-ad-da-aq-dam# a-na _sze#-[numun?_]
7. sza sza-ma-asz-ki-il-li u2-na-i-[id-ka-ma]
8. u2-ul tu-sza-x-[...]
9. tup-pa-ti-ia a-di ha-am-szi-[szu u2-sza-bi-la-ku-ma]
10. me-he-er tup-pa-ti-ia u2-ul [tu-sza-bi-lam]
11. u3 t,e4-em-ka u2-ul ta-[asz-pu-ra-am]
12. ki-sza ta-al-la-kam
1. a-na u2-di-ia szu-ur-ku-bi#-[im]
2. uz-na-ia i-ba-asz-szi-a [(x)]
3. {disz}{d}ia-ab-li-ia-isz-ta-mar x [...]
4. ki-ma u2-de-e ku-un-nu-[ki-im?]
5. _1(u) gin2 ku3-babbar_ i-na gi-ir-ri-im-ma [...]
6. asz-szu ki-a-am-ma lu-up-pu-da-ku
7. _iti kin#-{d}inanna u4 3(u)#-kam_
8. a-na ma-ah-ri-ka [a]-al-la-kam
9. 1(ban2) ze2!(SZE)-ra-am sza# sza#-ma-asz-ki-il#-[li ...]
10. {na4}ku-nu-kam a-na {d}ia-ab-li-ia#-[isz-ta-mar]
11. i-di-im-ma a-na ma-ah-ri-ia#
12. t,u2#-ur-dam#
13. sza#-at-tum la i-iz-zi-ba#-[an-ni]
1. _1(asz) sze gur_ i-di-in-ma _{gi}x-x-na hi-a_
2. u3 _{gi}x hi-a_ szu-pi2-isz
1. a-na a-wi#-lim
2. q

CC-BY Dirk Roorda