<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc" style="margin-top: 1em;"><ul class="toc-item"><li><span><a href="#Start-up" data-toc-modified-id="Start-up-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Start up</a></span></li></ul></div>

<img align="left" src="images/P005381-obverse-photo.png" width="15%"/>
<img align="left" src="images/P005381-obverse-lineart-annot.png" width="15%"/>
<img align="right" src="images/P005381-reverse-photo.png" width="15%"/>
<img align="right" src="images/P005381-reverse-lineart.png" width="15%"/>

<p>
```
&P005381 = MSVO 3, 70
```
</p>
<p>
<img src="images/P005381-obverse-atf.png" width="40%"/>
<img src="images/P005381-reverse-atf.png" width="40%"/>
</p>

<img align="right" src="images/tf-small.png"/>


# Numbers

## Start up

We import the Python modules we need.

In [20]:
%load_ext autoreload
%autoreload 2

In [21]:
import sys, os, collections
from glob import glob
from IPython.display import Markdown, Image, display
from tf.fabric import Fabric

We set up our working locations on the file system.

In [22]:
GITHUB = 'https://github.com'
REPO_REL = 'Dans-labs/Nino-cunei'
REPO = f'~/github/{REPO_REL}'
SOURCE = 'uruk'
VERSION = '0.1'
CORPUS = f'{REPO}/tf/{SOURCE}/{VERSION}'
SOURCE_DIR = os.path.expanduser(f'{REPO}/sources/cdli')
IDEO_DIR = os.path.expanduser(f'{REPO}/sources/ideographs')
PROGRAM_DIR = os.path.expanduser(f'{REPO}/programs')
TEMP_DIR = os.path.expanduser(f'{REPO}/_temp')
REPORT_DIR = os.path.expanduser(f'{REPO}/reports')

We make the programs directory importable and load our helper modules. Then we create the temporary and report directories, if they do not exist already.

In [23]:
sys.path.append(PROGRAM_DIR)
from cunei import Cunei
from utils import Compare

In [24]:
for cdir in (TEMP_DIR, REPORT_DIR):
    os.makedirs(cdir, exist_ok=True)

In [25]:
TF = Fabric(locations=[CORPUS], modules=[''], silent=False)

This is Text-Fabric 3.2.2
Api reference : https://github.com/Dans-labs/text-fabric/wiki/Api
Tutorial      : https://github.com/Dans-labs/text-fabric/blob/master/docs/tutorial.ipynb
Example data  : https://github.com/Dans-labs/text-fabric-data

33 features found and 0 ignored


In [26]:
api = TF.load('''
    grapheme prime repeat
    variant variantOuter
    modifier modifierInner modifierFirst
    damage uncertain remarkable written
    period name type identifier catalogId excavation
    number fullNumber origNumber badNumbering
    crossref text
    srcLn srcLnNum
    op sub comments''')
api.makeAvailableIn(globals())
CUNEI = Cunei(api)
COMP = Compare(api, SOURCE_DIR, TEMP_DIR)

  0.00s loading features ...
   |     0.00s B catalogId            from /Users/dirk/github/Dans-labs/Nino-cunei/tf/uruk/0.1
   |     0.02s B fullNumber           from /Users/dirk/github/Dans-labs/Nino-cunei/tf/uruk/0.1
   |     0.02s B number               from /Users/dirk/github/Dans-labs/Nino-cunei/tf/uruk/0.1
   |     0.05s B grapheme             from /Users/dirk/github/Dans-labs/Nino-cunei/tf/uruk/0.1
   |     0.05s B srcLn                from /Users/dirk/github/Dans-labs/Nino-cunei/tf/uruk/0.1
   |     0.02s B srcLnNum             from /Users/dirk/github/Dans-labs/Nino-cunei/tf/uruk/0.1
   |     0.00s B prime                from /Users/dirk/github/Dans-labs/Nino-cunei/tf/uruk/0.1
   |     0.01s B repeat               from /Users/dirk/github/Dans-labs/Nino-cunei/tf/uruk/0.1
   |     0.01s B variant              from /Users/dirk/github/Dans-labs/Nino-cunei/tf/uruk/0.1
   |     0.00s B variantOuter         from /Users/dirk/github/Dans-labs/Nino-cunei/tf/uruk/0.1
   |     0.00s B modi

In [27]:
def dm(markdown): display(Markdown(markdown))

We specify the Shin systems with just the bare minimum of information: for each system, the numbers of the numerals that belong to it.

In [28]:
numberSystems = dict(
    shinP = (40, 3, 18, 24, 45),
    shinPP = (4, 19, 36, 41, 46, 49),
    shinS = (25, 27, 28, 42, 5, 20, 47, 37),
)

We turn the numbers into numeral graphemes:

In [29]:
systems = {}

for (shin, numbers) in numberSystems.items():
    systems[shin] = {f'N{n:>02}' for n in numbers}

Sanity check:

In [30]:
systems

{'shinP': {'N03', 'N18', 'N24', 'N40', 'N45'},
 'shinPP': {'N04', 'N19', 'N36', 'N41', 'N46', 'N49'},
 'shinS': {'N05', 'N20', 'N25', 'N27', 'N28', 'N37', 'N42', 'N47'}}
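
The format specification `>02` used above right-aligns the number in a field of width 2 and pads it with zeros. A minimal sketch of that step in isolation, where the helper name `numeral_name` is hypothetical and not part of the notebook:

```python
def numeral_name(n):
    """Return the grapheme name for numeral number n, e.g. 3 -> 'N03'."""
    # '>02' right-aligns n in a field of width 2, padded with zeros;
    # numbers with more than two digits are left unchanged
    return f'N{n:>02}'


names = sorted(numeral_name(n) for n in (40, 3, 18))
print(names)
```

Because the names are zero-padded, sorting them as strings gives the same order as sorting the underlying numbers, at least for numbers below 100.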

We also want the opposite: given a numeral, which system is it?

In [31]:
numeralMap = {}

for (shin, numerals) in systems.items():
    for n in numerals:
        if n in numeralMap:
            dm(f'**warning:** Numeral {n} in {shin} was already in {numeralMap[n]}')
        numeralMap[n] = shin

numeralMap

{'N03': 'shinP',
 'N04': 'shinPP',
 'N05': 'shinS',
 'N18': 'shinP',
 'N19': 'shinPP',
 'N20': 'shinS',
 'N24': 'shinP',
 'N25': 'shinS',
 'N27': 'shinS',
 'N28': 'shinS',
 'N36': 'shinPP',
 'N37': 'shinS',
 'N40': 'shinP',
 'N41': 'shinPP',
 'N42': 'shinS',
 'N45': 'shinP',
 'N46': 'shinPP',
 'N47': 'shinS',
 'N49': 'shinPP'}
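
The inversion above overwrites an earlier assignment when a numeral occurs in two systems, and only prints a warning. As a sketch of an alternative, assuming we would rather keep the first assignment and collect the conflicts for later inspection, the hypothetical helper `invert_systems` below works on toy data rather than on the real systems:

```python
def invert_systems(systems):
    """Map each numeral to its system and collect numerals claimed by several systems."""
    numeral_map = {}
    conflicts = {}
    # sorted() makes the outcome independent of set and dict ordering
    for system in sorted(systems):
        for numeral in sorted(systems[system]):
            if numeral in numeral_map:
                # keep the first assignment, remember every system that claims the numeral
                conflicts.setdefault(numeral, [numeral_map[numeral]]).append(system)
            else:
                numeral_map[numeral] = system
    return numeral_map, conflicts


# toy systems (not real data): 'N02' is deliberately listed twice
toy = {'a': {'N01', 'N02'}, 'b': {'N02', 'N03'}}
numeral_map, conflicts = invert_systems(toy)
print(numeral_map)
print(conflicts)
```

For the three Shin systems defined in this notebook the conflict dictionary would be empty, since no warning was shown.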

Exercise:

For each tablet, add three properties: hasShinP, hasShinPP, hasShinS.
They are True if and only if the tablet has a numeral in that category.
Even better: instead of True or False, we let them record how many numerals of that system the tablet has.

In [33]:
tabletNumerics = collections.defaultdict(collections.Counter)

for tablet in F.otype.s('tablet'):
    pNum = F.catalogId.v(tablet)
    for sign in L.d(tablet, otype='sign'):
        if F.type.v(sign) == 'numeral':
            numeral = F.grapheme.v(sign)
            system = numeralMap.get(numeral, None)
            if system is not None:
                tabletNumerics[pNum][system] += 1
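
The counting pattern can be tried out without the corpus. The sketch below uses a hypothetical list of (tablet, grapheme) pairs in place of the Text-Fabric nodes, and then derives the True/False properties from the counts:

```python
import collections

# a small subset of the numeral-to-system mapping
numeral_map = {'N03': 'shinP', 'N04': 'shinPP', 'N05': 'shinS'}

# hypothetical (tablet, grapheme) pairs standing in for the signs of the corpus
signs = [('T1', 'N03'), ('T1', 'N03'), ('T1', 'N05'), ('T2', 'N04'), ('T2', 'N99')]

counts = collections.defaultdict(collections.Counter)
for tablet, grapheme in signs:
    system = numeral_map.get(grapheme)
    if system is not None:  # graphemes outside the three systems are skipped
        counts[tablet][system] += 1

# a Counter returns 0 for a missing key, so absent systems become False
system_names = ('shinP', 'shinPP', 'shinS')
has_flags = {
    tablet: {f'has{s[0].upper()}{s[1:]}': counter[s] > 0 for s in system_names}
    for tablet, counter in counts.items()
}
print(has_flags)
```

Storing counts and deriving the flags afterwards keeps both pieces of information, whereas flags alone could not be turned back into counts.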

Now we write a tab-separated (TSV) file to the report directory, so that you can work with the data in Excel.

We show the first few lines in the notebook.

In [39]:
filePath = f'{REPORT_DIR}/tabletNumerics.tsv'
lines = []
systemNames = sorted(systems)
fieldNames = "\t".join(systemNames)
for pNum in sorted(tabletNumerics):
    data = tabletNumerics[pNum]
    values = "\t".join(str(data[s]) for s in systemNames)
    lines.append(f'{pNum}\t{values}\n')
with open(filePath, 'w') as fh:
    fh.write(f'tablet\t{fieldNames}\n')
    fh.write(''.join(lines))

print(''.join(lines[0:10]))

P000148	0	0	1
P000245	2	0	0
P000266	0	1	0
P000308	2	0	0
P000434	2	0	0
P000511	1	0	0
P000517	1	0	0
P000550	2	0	0
P000734	2	0	0
P000735	2	0	1
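
The report can also be read back in Python instead of Excel. A minimal sketch, assuming the column order `shinP`, `shinPP`, `shinS` that results from sorting the system names, and using three of the rows shown above as an in-memory sample instead of the real file:

```python
import csv
import io

# header plus three rows copied from the output above
sample = (
    'tablet\tshinP\tshinPP\tshinS\n'
    'P000148\t0\t0\t1\n'
    'P000245\t2\t0\t0\n'
    'P000735\t2\t0\t1\n'
)

system_names = ('shinP', 'shinPP', 'shinS')
rows = list(csv.DictReader(io.StringIO(sample), delimiter='\t'))

# tablets that have numerals from more than one system
mixed = [
    row['tablet']
    for row in rows
    if sum(int(row[s]) > 0 for s in system_names) > 1
]
print(mixed)
```

To read the actual report, the `io.StringIO(sample)` argument would be replaced by a file handle opened on the report file.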



Below is what we did on Thursday 2018-03-01 in Leiden.

In [11]:
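# SHINPP_GRAPHEMES is not defined in the cells above;
# we assume it is the set of shinPP numerals, so the count below may differ
SHINPP_GRAPHEMES = systems['shinPP']

shinPpCases = []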

nCases = 0
for case in F.otype.s('case'):
    if F.fullNumber.v(case) is None:
        continue
    nCases += 1
    caseGraphemes = {F.grapheme.v(s) for s in L.d(case, otype='sign')}
    if caseGraphemes & SHINPP_GRAPHEMES:
        shinPpCases.append(case)

dm(f'**shinPpCases** {len(shinPpCases)} out of {nCases} terminal cases')

**shinPpCases** 684 out of 41142 terminal cases

In [12]:
fields = '''
    tablet
    face
    column
    line
'''.strip().split()

formatStr = ('{}\t' * (len(fields) - 1)) + '{}\n'

headerLine = formatStr.format(*fields)
headerLine

'tablet\tface\tcolumn\tline\n'
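
The format string above puts a tab after every field except the last, which gets a newline. A sketch of an equivalent formulation with `str.join`, where the helper name `tsv_line` is hypothetical:

```python
fields = ['tablet', 'face', 'column', 'line']

# the approach used above: one '{}' slot per field, tab-separated, newline at the end
format_str = ('{}\t' * (len(fields) - 1)) + '{}\n'


def tsv_line(values):
    """Join the values with tabs and terminate the line with a newline."""
    return '\t'.join(str(v) for v in values) + '\n'


# both approaches produce the same header line
print(format_str.format(*fields) == tsv_line(fields))
```

The `join` variant does not depend on the number of fields, while the format string fails with an `IndexError` when it is given fewer values than it has slots.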

In [13]:
with open(f'{REPORT_DIR}/shinpp.tsv', 'w') as fh:
    # write the column names first, so that the file is self-describing
    fh.write(headerLine)
    for case in shinPpCases:
        (tablet, column, ln) = T.sectionFromNode(case)
        (face, columnNum) = column.split(':')
        line = F.srcLn.v(case)
        fh.write(formatStr.format(tablet, face, columnNum, line))