Alpheios.net implementation demo
===

The Alpheios.net implementation runs on the Capitains.org software suite, using [Nautilus](http://github.com/capitains/nautilus).

## Configuration

The following cell stores the address of the DTS API in a single variable, so that only one cell has to change if the address of the API changes.


In [2]:
import requests
import requests_cache
from urllib.parse import urljoin

URI = "http://texts.alpheios.net/api/dts"

## Getting the available endpoints

The DTS entry point is a listing of the available endpoints and their URLs. This means that for each implementation of DTS, this single URL gives you all the information you need to perform arbitrary queries. The Alpheios Capitains implementation provides three endpoints:

In [3]:
entry_request = requests.get(URI)
ENDPOINTS = entry_request.json()
ENDPOINTS

{'@context': 'dts/EntryPoint.jsonld',
 '@type': 'EntryPoint',
 'navigation': '/api/dts/navigation',
 'collections': '/api/dts/collections',
 '@id': '/api/dts',
 'documents': '/api/dts/document'}

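As you can see, all three endpoints have been given URIs. The values are paths on the server, which is why the cells below combine them with `URI` through `urljoin`. Because we do not yet know which text we want to read, we'll start browsing from here.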
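To make the URL resolution explicit, here is a small offline sketch. It copies the endpoint paths from the output above into a plain dictionary (the name `entry_point` is only illustrative) and resolves each of them against the base address:

```python
from urllib.parse import urljoin

BASE = "http://texts.alpheios.net/api/dts"

# Endpoint paths copied from the entry point response shown above.
entry_point = {
    "navigation": "/api/dts/navigation",
    "collections": "/api/dts/collections",
    "documents": "/api/dts/document",
}

# Paths starting with "/" replace the whole path of the base URL.
absolute = {name: urljoin(BASE, path) for name, path in entry_point.items()}

# A path without a leading "/" would instead replace only the last
# segment of the base ("dts"), which is usually not what we want here.
relative_example = urljoin(BASE, "collections")

print(absolute["collections"])
print(relative_example)
```

Because the server returns paths with a leading slash, `urljoin` yields the expected absolute URLs; the second call shows what would happen with a bare relative name.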

## Browsing the root of the catalog

The root of the data catalog is the result of a plain GET request on the `collections` endpoint, without any parameter:

In [4]:
ROOT_COLLECTION = requests.get(urljoin(URI, ENDPOINTS["collections"])).json()
ROOT_COLLECTION

{'@context': {'dts': 'https://w3id.org/dts/api#',
  '@vocab': 'https://www.w3.org/ns/hydra/core#'},
 '@type': 'Collection',
 'totalItems': 3,
 'title': 'None',
 '@id': 'default',
 'member': [{'totalItems': 3,
   'title': 'Classical Latin',
   '@id': 'urn:alpheios:latinLit',
   '@type': 'Collection'},
  {'totalItems': 4,
   'title': 'Ancient Greek',
   '@id': 'urn:alpheios:greekLit',
   '@type': 'Collection'},
  {'totalItems': 3,
   'title': 'Classical Arabic',
   '@id': 'urn:alpheios:arabicLit',
   '@type': 'Collection'}]}

The root collection has 3 members, each of them a collection in its own right: Classical Latin, Ancient Greek and Classical Arabic. Let's go see the one for Classical Latin.

## Requesting a specific collection

Requesting a specific collection is simple: you query the `collections` endpoint and add the parameter `id`, set to the `@id` property of your item.
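As a sketch of that construction, the helper below (a hypothetical `collection_url`, not part of any library) builds the query string with `urlencode` instead of concatenating strings by hand. Note that `urlencode` percent-encodes the colons of the identifier; the links returned by the server elsewhere in this notebook use the same `%3A` form, so both spellings appear to be accepted.

```python
from urllib.parse import urlencode, urljoin

def collection_url(base, collections_path, identifier):
    """Build the URL of one collection from its @id (illustrative helper)."""
    query = urlencode({"id": identifier})  # ':' becomes '%3A'
    return urljoin(base, collections_path) + "?" + query

url = collection_url(
    "http://texts.alpheios.net/api/dts",
    "/api/dts/collections",
    "urn:alpheios:latinLit",
)
print(url)
```

With `requests`, the same effect can be had by passing `params={"id": identifier}` to `requests.get`, which takes care of the encoding.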

### LatinLit

We first want to get the LatinLit collection: it is available at the URI http://texts.alpheios.net/api/dts/collections?id=urn:alpheios:latinLit

In [5]:
LatinLit = requests.get(urljoin(URI, ENDPOINTS["collections"]+"?id=urn:alpheios:latinLit")).json()
LatinLit

{'dts:extensions': {'ns2:prefLabel': [{'@value': 'Classical Latin',
    '@language': 'eng'},
   {'@value': 'Latin Classique', '@language': 'fre'}]},
 '@context': {'dts': 'https://w3id.org/dts/api#',
  'ns2': 'http://www.w3.org/2004/02/skos/core#',
  '@vocab': 'https://www.w3.org/ns/hydra/core#'},
 '@type': 'Collection',
 'totalItems': 3,
 'title': 'Classical Latin',
 '@id': 'urn:alpheios:latinLit',
 'member': [{'totalItems': 1,
   'title': 'Catullus',
   '@id': 'urn:cts:latinLit:phi0472',
   '@type': 'Collection'},
  {'totalItems': 1,
   'title': 'Propertius, Sextus',
   '@id': 'urn:cts:latinLit:phi0620',
   '@type': 'Collection'},
  {'totalItems': 1,
   'title': 'Ovid',
   '@id': 'urn:cts:latinLit:phi0959',
   '@type': 'Collection'}]}

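See the `dts:extensions` property? Its `ns2` prefix is bound to the SKOS namespace in `@context`, so our collection has two `skos:prefLabel` values in two languages: "Classical Latin" in English and "Latin Classique" in French!

### Ovid

But wait! Each member of LatinLit is itself another collection. Let's open the one for Ovid: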

In [6]:
Ovid_Collection = requests.get(urljoin(URI, ENDPOINTS["collections"]+"?id=urn:cts:latinLit:phi0959")).json()
Ovid_Collection

{'dts:extensions': {'ns1:prefLabel': [{'@value': 'Ovid', '@language': 'eng'}],
  'cts:groupname': [{'@value': 'Ovid', '@language': 'eng'}]},
 '@context': {'ns1': 'http://www.w3.org/2004/02/skos/core#',
  '@vocab': 'https://www.w3.org/ns/hydra/core#',
  'dts': 'https://w3id.org/dts/api#',
  'cts': 'http://chs.harvard.edu/xmlns/cts/'},
 '@type': 'Collection',
 'totalItems': 1,
 'title': 'Ovid',
 '@id': 'urn:cts:latinLit:phi0959',
 'member': [{'totalItems': 1,
   'title': 'Metamorphoses',
   '@id': 'urn:cts:latinLit:phi0959.phi006',
   '@type': 'Collection'}]}

### Metamorphoses

Wait, it contains yet another collection. Let's go! Let's see where this ends!

In [7]:
Metamorphoses_Collection = requests.get(
    urljoin(URI, ENDPOINTS["collections"]+"?id=urn:cts:latinLit:phi0959.phi006")
).json()
Metamorphoses_Collection

{'dts:extensions': {'ns1:prefLabel': [{'@value': 'Metamorphoses',
    '@language': 'eng'}],
  'ns3:language': 'lat',
  'cts:title': [{'@value': 'Metamorphoses', '@language': 'eng'}]},
 '@context': {'ns1': 'http://www.w3.org/2004/02/skos/core#',
  'ns3': 'http://purl.org/dc/elements/1.1/',
  'dts': 'https://w3id.org/dts/api#',
  'cts': 'http://chs.harvard.edu/xmlns/cts/',
  '@vocab': 'https://www.w3.org/ns/hydra/core#'},
 '@type': 'Collection',
 'totalItems': 1,
 'title': 'Metamorphoses',
 '@id': 'urn:cts:latinLit:phi0959.phi006',
 'member': [{'dts:citeDepth': 2,
   'dts:passage': '/api/dts/document?id=urn%3Acts%3AlatinLit%3Aphi0959.phi006.alpheios-text-lat1',
   'dts:citeStructure': {'dts:citeStructure': [{'dts:citeType': 'line'}],
    'dts:citeType': 'book'},
   'dts:extensions': {'ns1:prefLabel': [{'@value': 'Metamorphoses',
      '@language': 'lat'}],
    'ns3:language': 'lat',
    'cts:description': [{'@value': 'TEI XML Edition Enhanced with Syntax Diagrams for 1.1-1.9 and 1.163-1.
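Descending one collection at a time by hand gets repetitive. The sketch below shows one way the descent could be automated. It is an illustration under assumptions: `walk` and `FAKE_CATALOG` are hypothetical names, the catalog is a tiny in-memory stand-in shaped like the responses in this notebook, and the walk stops at any item whose `@type` is not `Collection`.

```python
def walk(identifier, fetch, depth=0):
    """Yield (depth, type, title) for an item and everything below it."""
    node = fetch(identifier)
    yield depth, node["@type"], node["title"]
    # Only collections are expected to have members worth descending into.
    if node["@type"] == "Collection":
        for member in node.get("member", []):
            yield from walk(member["@id"], fetch, depth + 1)

# In-memory stand-in for the API, so the example runs offline.
FAKE_CATALOG = {
    "urn:cts:latinLit:phi0959": {
        "@type": "Collection", "title": "Ovid",
        "member": [{"@id": "urn:cts:latinLit:phi0959.phi006"}],
    },
    "urn:cts:latinLit:phi0959.phi006": {
        "@type": "Collection", "title": "Metamorphoses",
        "member": [{"@id": "urn:cts:latinLit:phi0959.phi006.alpheios-text-lat1"}],
    },
    "urn:cts:latinLit:phi0959.phi006.alpheios-text-lat1": {
        "@type": "Resource", "title": "Metamorphoses",
    },
}

tree = list(walk("urn:cts:latinLit:phi0959", FAKE_CATALOG.__getitem__))
for depth, kind, title in tree:
    print("  " * depth + f"{title} ({kind})")
```

Against the live API, `fetch` could be a small function that calls `requests.get` on the `collections` endpoint with `params={"id": identifier}` and returns `.json()`.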

### Alpheios Edition of the Metamorphoses

Wait, the next member looks more complicated. Let's request it on its own and read what's in there:

In [8]:
Metamorphoses_Edition_Collection = requests.get(
    urljoin(URI, ENDPOINTS["collections"]+"?id=urn:cts:latinLit:phi0959.phi006.alpheios-text-lat1")
).json()
Metamorphoses_Edition_Collection

{'dts:citeDepth': 2,
 'dts:passage': '/api/dts/document?id=urn%3Acts%3AlatinLit%3Aphi0959.phi006.alpheios-text-lat1',
 'dts:citeStructure': {'dts:citeStructure': [{'dts:citeType': 'line'}],
  'dts:citeType': 'book'},
 'dts:extensions': {'ns1:prefLabel': [{'@value': 'Metamorphoses',
    '@language': 'lat'}],
  'ns3:language': 'lat',
  'cts:description': [{'@value': 'TEI XML Edition Enhanced with Syntax Diagrams for 1.1-1.9 and 1.163-1.773. Ovid. Metamorphoses. Hugo Magnus.\n      Gotha (Germany). Friedr. Andr. Perthes. 1892',
    '@language': 'eng'}],
  'cts:label': [{'@value': 'Metamorphoses', '@language': 'lat'}]},
 '@type': 'Resource',
 'dts:references': '/api/dts/navigation?id=urn%3Acts%3AlatinLit%3Aphi0959.phi006.alpheios-text-lat1',
 'totalItems': 0,
 'title': 'Metamorphoses',
 '@id': 'urn:cts:latinLit:phi0959.phi006.alpheios-text-lat1',
 '@context': {'ns1': 'http://www.w3.org/2004/02/skos/core#',
  'ns3': 'http://purl.org/dc/elements/1.1/',
  '@vocab': 'https://www.w3.org/ns/hydr

So, there are a few things we can see:

- There are prefLabels again, and some values for `cts` ontology properties.
- More importantly, the `@type` is no longer `Collection` but `Resource`! This means the current item can actually be read; it is not only metadata. Good to know, hmm?
- You see the `dts:citeDepth`? It means the text has two levels of citation. For this resource, the data curator actually named them in `dts:citeStructure`:
    1. The first level has the name `book`. This level contains a second level:
        1. The second level, inside `book`, has the name `line`.
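The nesting described above can be read programmatically. A minimal sketch, assuming each level declares at most one child level as in this response (`cite_levels` is an illustrative name):

```python
def cite_levels(structure):
    """Return the citation level names, outermost first."""
    levels = []
    node = structure
    while node:
        levels.append(node["dts:citeType"])
        children = node.get("dts:citeStructure", [])
        # Assumption: follow only the first child level.
        node = children[0] if children else None
    return levels

# The dts:citeStructure value copied from the response above.
structure = {
    "dts:citeStructure": [{"dts:citeType": "line"}],
    "dts:citeType": "book",
}

levels = cite_levels(structure)
print(levels)
```

The length of the resulting list should match `dts:citeDepth`, which is 2 for this edition.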

Now we have two really interesting links, `dts:references` and `dts:passage`. Let's go see what's in there!

## Which passages can I single out in the Alpheios Edition of the Metamorphoses?

To answer this long but quite clear title, there is only one thing to do: go to the `dts:references` URI we see here.

But wait, see the URI? It's actually a simple construction:

- We use `navigation` from `ENDPOINTS`.
- We add the `@id` of the Resource we are interested in, as the `id` parameter!

### All the Books!

In [9]:
Ovid_Poems = requests.get(
    urljoin(URI, Metamorphoses_Edition_Collection["dts:references"])).json()
Ovid_Poems

{'citeType': 'book',
 'hydra:member': [{'ref': '1'},
  {'ref': '2'},
  {'ref': '3'},
  {'ref': '4'},
  {'ref': '5'},
  {'ref': '6'},
  {'ref': '7'},
  {'ref': '8'},
  {'ref': '9'},
  {'ref': '10'},
  {'ref': '11'},
  {'ref': '12'},
  {'ref': '13'},
  {'ref': '14'},
  {'ref': '15'}],
 '@context': {'hydra': 'https://www.w3.org/ns/hydra/core#',
  '@vocab': 'https://w3id.org/dts/api#'},
 'citeDepth': 2,
 'level': 1,
 '@id': '/api/dts/navigation?id=urn%3Acts%3AlatinLit%3Aphi0959.phi006.alpheios-text-lat1&groupBy=1&level=1',
 'passage': '/api/dts/document?id=urn%3Acts%3AlatinLit%3Aphi0959.phi006.alpheios-text-lat1{&ref}{&start}{&end}'}

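There we are! The `hydra:member` list holds 15 references, one for each book of the Metamorphoses. The `citeType` confirms that these are books, and `level` tells us we are at the first of the two citation levels.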

### Well, then all the lines of Book 1!

But wait, didn't we say there was a second level? To see what's in there, we can simply query the same URI, adding the reference we want to look into: `&ref=1`

In [10]:
Ovid_Poem_1_Lines = requests.get(
    urljoin(URI, Metamorphoses_Edition_Collection["dts:references"]+"&ref=1")).json()
Ovid_Poem_1_Lines

{'citeType': 'line',
 'hydra:member': [{'ref': '1.1'},
  {'ref': '1.2'},
  {'ref': '1.3'},
  {'ref': '1.4'},
  {'ref': '1.5'},
  {'ref': '1.6'},
  {'ref': '1.7'},
  {'ref': '1.8'},
  {'ref': '1.9'},
  {'ref': '1.10'},
  {'ref': '1.11'},
  {'ref': '1.12'},
  {'ref': '1.13'},
  {'ref': '1.14'},
  {'ref': '1.15'},
  {'ref': '1.16'},
  {'ref': '1.17'},
  {'ref': '1.18'},
  {'ref': '1.19'},
  {'ref': '1.20'},
  {'ref': '1.21'},
  {'ref': '1.22'},
  {'ref': '1.23'},
  {'ref': '1.24'},
  {'ref': '1.25'},
  {'ref': '1.26'},
  {'ref': '1.27'},
  {'ref': '1.28'},
  {'ref': '1.29'},
  {'ref': '1.30'},
  {'ref': '1.31'},
  {'ref': '1.32'},
  {'ref': '1.33'},
  {'ref': '1.34'},
  {'ref': '1.35'},
  {'ref': '1.36'},
  {'ref': '1.37'},
  {'ref': '1.38'},
  {'ref': '1.39'},
  {'ref': '1.40'},
  {'ref': '1.41'},
  {'ref': '1.42'},
  {'ref': '1.43'},
  {'ref': '1.44'},
  {'ref': '1.45'},
  {'ref': '1.46'},
  {'ref': '1.47'},
  {'ref': '1.48'},
  {'ref': '1.49'},
  {'ref': '1.50'},
  {'ref': '1.51'},
  {

That's quite a lot of references, and all of them are lines. **A listing like this also shows which lines the edition actually contains:** in the visible part of the output, the numbering runs without interruption from 1.1 to 1.51.
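A check like this does not have to be done by eye. The sketch below looks for gaps in a list of references. It assumes references of the form `book.line` with integer line numbers; `missing_lines` is a hypothetical helper and the sample list is invented for the demonstration, not taken from the API.

```python
def missing_lines(refs, book):
    """Return (first, last) ranges of line numbers absent from refs."""
    prefix = f"{book}."
    numbers = sorted(
        int(ref[len(prefix):]) for ref in refs if ref.startswith(prefix)
    )
    gaps = []
    for previous, current in zip(numbers, numbers[1:]):
        if current - previous > 1:
            gaps.append((previous + 1, current - 1))
    return gaps

# Invented sample: lines 4 to 6 are missing from book 1.
sample_refs = ["1.1", "1.2", "1.3", "1.7", "1.8"]
gaps = missing_lines(sample_refs, "1")
print(gaps)
```

With the response above, the input would be `[m["ref"] for m in Ovid_Poem_1_Lines["hydra:member"]]`.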

### TL;DR: Groups!

But I don't want to check each line one by one. So, let's group them with the `groupBy` parameter:

In [11]:
Ovid_Poem_1_Lines_Grouped = requests.get(
    urljoin(URI, Metamorphoses_Edition_Collection["dts:references"]+"&ref=1&groupBy=20")).json()
Ovid_Poem_1_Lines_Grouped

{'citeType': 'line',
 'hydra:member': [{'start': '1.1', 'end': '1.20'},
  {'start': '1.21', 'end': '1.40'},
  {'start': '1.41', 'end': '1.60'},
  {'start': '1.61', 'end': '1.80'},
  {'start': '1.81', 'end': '1.100'},
  {'start': '1.101', 'end': '1.120'},
  {'start': '1.121', 'end': '1.140'},
  {'start': '1.141', 'end': '1.160'},
  {'start': '1.161', 'end': '1.180'},
  {'start': '1.181', 'end': '1.200'},
  {'start': '1.201', 'end': '1.220'},
  {'start': '1.221', 'end': '1.240'},
  {'start': '1.241', 'end': '1.260'},
  {'start': '1.261', 'end': '1.280'},
  {'start': '1.281', 'end': '1.300'},
  {'start': '1.301', 'end': '1.320'},
  {'start': '1.321', 'end': '1.340'},
  {'start': '1.341', 'end': '1.360'},
  {'start': '1.361', 'end': '1.380'},
  {'start': '1.381', 'end': '1.400'},
  {'start': '1.401', 'end': '1.420'},
  {'start': '1.421', 'end': '1.440'},
  {'start': '1.441', 'end': '1.460'},
  {'start': '1.461', 'end': '1.480'},
  {'start': '1.481', 'end': '1.500'},
  {'start': '1.501', 'e
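To make the shape of this response concrete, here is a client-side imitation of the grouping. It is only a sketch of what the output above suggests, not a description of how the server computes it; `group_refs` is an illustrative name and the reference list is generated for the example.

```python
def group_refs(refs, size):
    """Split refs into consecutive chunks and keep each chunk's bounds."""
    groups = []
    for index in range(0, len(refs), size):
        chunk = refs[index:index + size]
        groups.append({"start": chunk[0], "end": chunk[-1]})
    return groups

# Generated sample: 45 consecutive line references in book 1.
sample_refs = [f"1.{n}" for n in range(1, 46)]
groups = group_refs(sample_refs, 20)
print(groups)
```

Note that the last group is shorter when the number of references is not a multiple of the group size.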

### What's inside that group ?

So, we have groups now. But if some lines are missing, how do I know what exactly makes up a group?

Answer: by running the same query with the `start` and `end` parameters, and adding `&level=0` to say that I want the references inside this range. The example below does so for the range 1.1 to 1.173.

In [12]:
print(
    urljoin(URI, Metamorphoses_Edition_Collection["dts:references"]+"&start=1.1&end=1.173&level=0")
)
Ovid_Poem_1_Lines_First_Group = requests.get(
    urljoin(URI, Metamorphoses_Edition_Collection["dts:references"]+"&start=1.1&end=1.173&level=0")
).json()
Ovid_Poem_1_Lines_First_Group

http://texts.alpheios.net/api/dts/navigation?id=urn%3Acts%3AlatinLit%3Aphi0959.phi006.alpheios-text-lat1&start=1.1&end=1.173&level=0


{'citeType': 'line',
 'hydra:member': [{'ref': '1.1'},
  {'ref': '1.2'},
  {'ref': '1.3'},
  {'ref': '1.4'},
  {'ref': '1.5'},
  {'ref': '1.6'},
  {'ref': '1.7'},
  {'ref': '1.8'},
  {'ref': '1.9'},
  {'ref': '1.10'},
  {'ref': '1.11'},
  {'ref': '1.12'},
  {'ref': '1.13'},
  {'ref': '1.14'},
  {'ref': '1.15'},
  {'ref': '1.16'},
  {'ref': '1.17'},
  {'ref': '1.18'},
  {'ref': '1.19'},
  {'ref': '1.20'},
  {'ref': '1.21'},
  {'ref': '1.22'},
  {'ref': '1.23'},
  {'ref': '1.24'},
  {'ref': '1.25'},
  {'ref': '1.26'},
  {'ref': '1.27'},
  {'ref': '1.28'},
  {'ref': '1.29'},
  {'ref': '1.30'},
  {'ref': '1.31'},
  {'ref': '1.32'},
  {'ref': '1.33'},
  {'ref': '1.34'},
  {'ref': '1.35'},
  {'ref': '1.36'},
  {'ref': '1.37'},
  {'ref': '1.38'},
  {'ref': '1.39'},
  {'ref': '1.40'},
  {'ref': '1.41'},
  {'ref': '1.42'},
  {'ref': '1.43'},
  {'ref': '1.44'},
  {'ref': '1.45'},
  {'ref': '1.46'},
  {'ref': '1.47'},
  {'ref': '1.48'},
  {'ref': '1.49'},
  {'ref': '1.50'},
  {'ref': '1.51'},
  {

## Getting the text

Now that we can see what the available passages are, why not fetch the text passages themselves?

Let's see... We build the URL the same way as for the Navigation query! But this time, we use the `documents` endpoint from the entry point, which the resource also gives us directly as `dts:passage`!

### Getting an excerpt


In [13]:
Ovid_Poem_Text = requests.get(
    urljoin(URI, Metamorphoses_Edition_Collection["dts:passage"]+"&start=1.1&end=1.173")
)
print(
    urljoin(URI, Metamorphoses_Edition_Collection["dts:passage"]+"&start=1.1&end=1.173")
)
print(Ovid_Poem_Text.text)

http://texts.alpheios.net/api/dts/document?id=urn%3Acts%3AlatinLit%3Aphi0959.phi006.alpheios-text-lat1&start=1.1&end=1.173
<TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns:py="http://codespeak.net/lxml/objectify/pytype" py:pytype="TREE"><dts:fragment xmlns:dts="https://w3id.org/dts/api#"><text><body n="urn:cts:latinLit:phi0959.phi006.alpheios-text-lat1"><div type="edition" xml:lang="lat" n="urn:cts:latinLit:phi0959.phi006.alpheios-text-lat1"><div type="textpart" subtype="book" n="1"><l n="1"><w ana="1-1">In</w><w ana="1-2">nova</w><w ana="1-3">fert</w><w ana="1-4">animus</w><w ana="1-5">mutatas</w><w ana="1-6">dicere</w><w ana="1-7">formas</w></l><l n="2"><w ana="1-8">corpora</w>; <w ana="2-1">di</w>, <w ana="2-3">coeptis</w> (<w ana="2-5">nam</w> 
               <w ana="2-6">vos</w> 
               <w ana="2-7">mutastis</w> 
               <w ana="2-8">et</w> 
               <w ana="2-9">illas</w>)</l><l n="3"><w ana="2-11">adspirate</w><w ana="2-12">meis</w><w ana="2-13 2-14">primaque<
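The response is TEI XML, so it can be processed with any XML library. A minimal sketch using the standard library: the sample string below is a shortened copy of the beginning of the response above (line 2 is cut after two words), and `lines_as_words` is an illustrative helper that collects the `<w>` elements of each `<l>`.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Shortened copy of the start of the response shown above.
SAMPLE = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0">'
    '<dts:fragment xmlns:dts="https://w3id.org/dts/api#"><text><body>'
    '<div type="textpart" subtype="book" n="1">'
    '<l n="1"><w ana="1-1">In</w><w ana="1-2">nova</w><w ana="1-3">fert</w>'
    '<w ana="1-4">animus</w><w ana="1-5">mutatas</w><w ana="1-6">dicere</w>'
    '<w ana="1-7">formas</w></l>'
    '<l n="2"><w ana="1-8">corpora</w>; <w ana="2-1">di</w></l>'
    '</div></body></text></dts:fragment></TEI>'
)

def lines_as_words(xml_text):
    """Map each line number to the list of its words."""
    root = ET.fromstring(xml_text)
    result = {}
    for line in root.iterfind(".//tei:l", TEI_NS):
        result[line.get("n")] = [w.text for w in line.iterfind("tei:w", TEI_NS)]
    return result

words = lines_as_words(SAMPLE)
print(" ".join(words["1"]))
```

Punctuation sits between the `<w>` elements in this edition, so this sketch drops it; keeping it would require reading the `tail` of each element as well.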

### Navigating in the Document Endpoint

That's nice! But how do I know where to go next? I am lost!

Wait, no, because we thought about it! Look at the `Link` header of the response!

In [14]:
print(Ovid_Poem_Text.headers["Link"])

</api/dts/navigation?id=urn%3Acts%3AlatinLit%3Aphi0959.phi006.alpheios-text-lat1&start=1.1&end=1.173>; rel=contents, </api/dts/collections?id=urn%3Acts%3AlatinLit%3Aphi0959.phi006.alpheios-text-lat1>; rel=collection, </api/dts/document?ref=1&id=urn%3Acts%3AlatinLit%3Aphi0959.phi006.alpheios-text-lat1>; rel=up, </api/dts/document?id=urn%3Acts%3AlatinLit%3Aphi0959.phi006.alpheios-text-lat1&start=1.174&end=1.346>; rel=next


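See that sweet `rel=next`? Let's go! The cell below starts at the same line, 1.174, but asks for a shorter range ending at 1.193 rather than the 1.346 proposed by the link.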
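To follow such links from code, the header first has to be split into its parts. A minimal sketch with the standard library only: `parse_link_header` and its regular expression are illustrative and cover just the simple form seen above, not every variant the HTTP specification allows.

```python
import re

# Matches '<url>; rel=name', with or without quotes around the name.
LINK_PATTERN = re.compile(r'<([^>]*)>\s*;\s*rel="?([^",;\s]+)"?')

def parse_link_header(value):
    """Return a dict mapping each rel name to its URL."""
    return {rel: url for url, rel in LINK_PATTERN.findall(value)}

# The Link header printed above, copied as a string.
header = (
    "</api/dts/navigation?id=urn%3Acts%3AlatinLit%3Aphi0959.phi006.alpheios-text-lat1"
    "&start=1.1&end=1.173>; rel=contents, "
    "</api/dts/collections?id=urn%3Acts%3AlatinLit%3Aphi0959.phi006.alpheios-text-lat1>"
    "; rel=collection, "
    "</api/dts/document?ref=1&id=urn%3Acts%3AlatinLit%3Aphi0959.phi006.alpheios-text-lat1>"
    "; rel=up, "
    "</api/dts/document?id=urn%3Acts%3AlatinLit%3Aphi0959.phi006.alpheios-text-lat1"
    "&start=1.174&end=1.346>; rel=next"
)

links = parse_link_header(header)
print(links["next"])
```

`requests` also exposes the parsed header as the `links` attribute of a response, which is likely the simpler route inside this notebook.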

In [15]:
Ovid_Poem_Text_Next = requests.get(
    urljoin(URI, "/api/dts/document?end=1.193&id=urn%3Acts%3AlatinLit%3Aphi0959.phi006.alpheios-text-lat1&start=1.174")
)
print(urljoin(URI, "/api/dts/document?end=1.193&id=urn%3Acts%3AlatinLit%3Aphi0959.phi006.alpheios-text-lat1&start=1.174"))
print(Ovid_Poem_Text_Next.text)

http://texts.alpheios.net/api/dts/document?end=1.193&id=urn%3Acts%3AlatinLit%3Aphi0959.phi006.alpheios-text-lat1&start=1.174
<TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns:py="http://codespeak.net/lxml/objectify/pytype" py:pytype="TREE"><dts:fragment xmlns:dts="https://w3id.org/dts/api#"><text><body n="urn:cts:latinLit:phi0959.phi006.alpheios-text-lat1"><div type="edition" xml:lang="lat" n="urn:cts:latinLit:phi0959.phi006.alpheios-text-lat1"><div type="textpart" subtype="book" n="1"><l n="174"><w ana="11-4">caelicolae</w><w ana="11-5 11-6">clarique</w><w ana="11-7">suos</w><w ana="11-8">posuere</w><w ana="11-9">penates</w>.</l><l n="175"><w ana="12-1">Hic</w><w ana="12-2">locus</w><w ana="12-3">est</w>, <w ana="12-5">quem</w>, <w ana="12-7">si</w><w ana="12-8">verbis</w><w ana="12-9">audacia</w><w ana="12-10">detur</w>,</l><l n="176"><w ana="12-12">haud</w><w ana="12-13">timeam</w><w ana="12-14">magni</w><w ana="12-15">dixisse</w><w ana="12-16">Palatia</w><w ana="12-17">caeli</w>.</l><