# DraCor API Tutorial

To use the [DraCor-API](https://dracor.org/doc/api) you need to send HTTP-Requests to the API: `https://dracor.org/api`. In Python HTTP-Request can be sent with the library [requests](https://2.python-requests.org). We have to import this library:

In [1]:
import requests

If it fails, the package must be installed first. Run `pip install `

## `/info`: Info about the API 

In [2]:
r = requests.get("https://dracor.org/api/info")
r.text

'{\n  "name" : "DraCor API",\n  "status" : "beta",\n  "existdb" : "5.2.0",\n  "version" : "0.82.0"\n}'

The API returns this information in the JSON format, which you have to parse. You can use the library `json` for this.

In [3]:
import json
parsedResponse = json.loads(r.text)
parsedResponse

{'name': 'DraCor API',
 'status': 'beta',
 'existdb': '5.2.0',
 'version': '0.82.0'}

To get the current version of the API

In [4]:
print("The current version of the Dracor-API is " + parsedResponse['version'] + ".")

The current version of the Dracor-API is 0.82.0.


## `/corpora`: list available corpora 

Load available corpora and list name and title of each corpus.

In [5]:
r = requests.get("https://dracor.org/api/corpora?include=metrics")
corpora = json.loads(r.text)
corpora
for corpus in corpora:
    numofplays = corpus['metrics']['text']
    print(corpus['name'] + ": " + corpus['title'] + ' (' + str(numofplays) + ' plays)')

als: Alsatian Drama Corpus (25 plays)
bash: Bashkir Drama Corpus (3 plays)
cal: Calderón Drama Corpus (54 plays)
fre: French Drama Corpus (1560 plays)
ger: German Drama Corpus (545 plays)
greek: Greek Drama Corpus (39 plays)
ita: Italian Drama Corpus (139 plays)
rom: Roman Drama Corpus (36 plays)
rus: Russian Drama Corpus (212 plays)
shake: Shakespeare Drama Corpus (37 plays)
span: Spanish Drama Corpus (25 plays)
swe: Swedish Drama Corpus (73 plays)
tat: Tatar Drama Corpus (3 plays)


## Generic function to handle the requests and parse the result
Requesting data from the API in most cases follows a pattern:
 * construct the request-url. E.g. use `https://dracor.org/api/` as a base and attach `corpusname`, `playname`, a method, e.g. `cast` and in some cases a reponse-format, e.g. `csv`
 * use this constructed url in a request do the endpoint
 * retrieve the data and parse to a format, that can be than used in the program
 
By defining a function, this process can be speed up. Instead of repeating the code, a function can be defined, that takes `corpusname`, `playname` and `method` as arguments. In the example we assume, that the response will be JSON.

Parsing of JSON is done with the package `json`, which needs to be imported:

In [6]:
import json

The function accepts parameters as arguments, e.g. `corpusname="ger"`. Following arguments are supported:

* `apibase` (default will be `https://dracor.org/api/`)
* `corpusname`
* `playname`
* `method`
* `parse_json`: `True`, `False` (default) – will parse the response as `json`


In [17]:
#corpusname:str -> []
def get(**kwargs):
    #corpusname=corpusname
    #playname=playname
    #apibase="https://dracor.org/api/"
    #method=method
    #parse_json: True
    
    #could set different apibase, e.g. https://staging.dracor.org/api/ [not recommended, pls use the production server]
    if "apibase" in kwargs:
        if kwargs["apibase"].endswith("/"):
            apibase = kwargs["apibase"]
        else:
            apibase = kwargs["apibase"] + "/"
    else:
        #use default
        apibase = "https://dracor.org/api/"
    if "corpusname" in kwargs and "playname" in kwargs:
        # used for /api/corpora/{corpusname}/play/{playname}/
        if "method" in kwargs["method"]:
            request_url = apibase + "corpora/" + kwargs["corpusname"] + "/play/" + kwargs["playname"] + "/" + kwargs["method"]
        else:
            request_url = apibase + "corpora/" + kwargs["corpusname"] + "/play/" + kwargs["playname"]
    elif "corpusname" in kwargs and not "playname" in kwargs:
        if "method" in kwargs:
            request_url = apibase + "corpora/" + kwargs["corpusname"] + "/" + kwargs["method"]
        else:
            request_url = apibase + "corpora/" + kwargs["corpusname"] 
    elif "method" in kwargs and not "corpusname" in kwargs and not "playname" in kwargs:
            request_url = apibase + kwargs["method"]
    else: 
        #nothing set
        request = request_url = apibase + "info"
    
    #send the response
    r = requests.get(request_url)
    if r.status_code == 200:
        #success!
        if "parse_json" in kwargs:
            if kwargs["parse_json"] == True:
                json_data = json.loads(r.text)
                return json_data
            else:
                return r.text
        else:
            return r.text
    else:
        raise Exception("Request was not successful. Server returned status code: "  + str(r.status_code))
       

The function can now be called as follows below. The function call requests the Info about the API `/api/info`:

In [18]:
get(method="info", parse_json=True)

{'name': 'DraCor API',
 'status': 'beta',
 'existdb': '5.2.0',
 'version': '0.82.0'}

To request the metrics of a single play (`/api/corpora/{corpusname}/play/{playname}/metrics`) use the following function call:

In [19]:
get(corpusname="ger",playname="lessing-emilia-galotti",method="metrics",parse_json=True)

{'segments': [{'number': 1,
   'speakers': ['der_prinz', 'der_kammerdiener'],
   'title': 'Erster Aufzug | Erster Auftritt',
   'type': 'scene'},
  {'number': 2,
   'speakers': ['der_prinz', 'conti'],
   'title': 'Erster Aufzug | Zweiter Auftritt',
   'type': 'scene'},
  {'number': 3,
   'speakers': ['der_prinz'],
   'title': 'Erster Aufzug | Dritter Auftritt',
   'type': 'scene'},
  {'number': 4,
   'speakers': ['conti', 'der_prinz'],
   'title': 'Erster Aufzug | Vierter Auftritt',
   'type': 'scene'},
  {'number': 5,
   'speakers': ['der_prinz'],
   'title': 'Erster Aufzug | Fünfter Auftritt',
   'type': 'scene'},
  {'number': 6,
   'speakers': ['marinelli', 'der_prinz'],
   'title': 'Erster Aufzug | Sechster Auftritt',
   'type': 'scene'},
  {'number': 7,
   'speakers': ['der_prinz', 'der_kammerdiener'],
   'title': 'Erster Aufzug | Siebenter Auftritt',
   'type': 'scene'},
  {'number': 8,
   'speakers': ['der_prinz', 'camillo_rota'],
   'title': 'Erster Aufzug | Achter Auftritt',
 

## `/corpora/metadata`: list metadata for all plays in a corpus
In the following example the metadata of the plays in the *Alsatian Drama Corpus* and parses the JSON into a list of dictionaries.

In [22]:
get(corpusname="als",method="metadata",parse_json=True)

[{'size': 16,
  'wordCountSp': 35959,
  'averageClustering': 0.9785714285714285,
  'numOfPersonGroups': 0,
  'density': 0.9,
  'averagePathLength': 1.1,
  'maxDegreeIds': 'dorthee|rosine|christinel',
  'averageDegree': 13.5,
  'name': 'arnold-der-pfingstmontag',
  'normalizedGenre': None,
  'diameter': 2,
  'yearPremiered': '1835',
  'yearPrinted': '1816',
  'numOfL': 4282,
  'maxDegree': 15,
  'numOfSpeakers': 16,
  'wordCountText': 36362,
  'numOfP': 0,
  'yearNormalized': 1816,
  'libretto': False,
  'subtitle': 'Lustspiel in Straßburger Mundart',
  'title': 'Der Pfingstmontag',
  'numConnectedComponents': 1,
  'wordCountStage': 719,
  'numOfSpeakersUnknown': 0,
  'yearWritten': None,
  'firstAuthor': 'Arnold',
  'id': 'als000001',
  'numOfSpeakersFemale': 8,
  'numOfSegments': 42,
  'numOfSpeakersMale': 8,
  'wikipediaLinkCount': 0,
  'numOfActs': 5,
  'numOfCoAuthors': 0,
  'playName': 'arnold-der-pfingstmontag'},
 {'size': 22,
  'wordCountSp': 19735,
  'averageClustering': 0.9494

In [23]:
# demonstrate pandas df