## Data acquisition via the Europeana API

This notebook explains how you can extract data from the Europeana platform via its APIs. In most cases, the data that are made available via the Europeana APIs can be viewed on Europeana's public website as well. To get a sense of the type of data that are available, open the link below and have look at the metadata. 


[https://www.europeana.eu/en/item/598/9939757197402711_item_2420001](https://www.europeana.eu/en/item/598/9939757197402711_item_2420001)

If you want to anaylyse these data using quantitative methods, it is generally more effective to extract these values as raw data. This can be accomplished via two APIs. Europeana maintains a [Search API](https://pro.europeana.eu/page/search) and a [Record API](https://pro.europeana.eu/page/record). 

## Record API

The record API, firstly, enables you to retrieve the values that have been collected and created in individual records. It works with the following paramaters: 

* record_format: 'json' or 'rdf'
* item_id: The full identifier of the record. This typically inculdes the identifier for the content provider. 
* profile: Value 'standard'

You can view the raw data of the record you looked at earlier by clicking the URL that is generated by the cell below. 

In [1]:
import requests
import re
from os.path import join
import json

from credentials import api_key
base_url_record = 'https://api.europeana.eu/record/v2/'
record_format = 'json'
item_id = '598/9939757197402711_item_2420001'
profile = 'standard'

api_call = f'{base_url_record}{item_id}.{record_format}?profile={profile}&wskey={api_key}'

print(api_call)


https://api.europeana.eu/record/v2/598/9939757197402711_item_2420001.json?profile=standard&wskey=ineternates


### Exercise 

Try to obtain raw data about another object in Europeana. You can work with one of the objects below, for example. 

* https://www.europeana.eu/en/item/598/9939757142102711_item_2709295
* https://www.europeana.eu/item/598/9939757129002711_item_935820

The code that was given exports the the data in the JSON format. Also try to view the data in RDF/XML format. 


The cells below give an impression of the kinds of values that are available in Europeana records. Note that the code only shows you a small fraction of the data values that are actually available. 

In [2]:
base_url_record = 'https://api.europeana.eu/record/v2/'
record_format = 'json'
item_id = '598/9939757129002711_item_935820'
profile = 'standard'

api_call = f'{base_url_record}{item_id}.{record_format}?profile={profile}&wskey={api_key}'
print(api_call)

response = requests.get(api_call)

if response:
    api_data = response.json()


https://api.europeana.eu/record/v2/598/9939757129002711_item_935820.json?profile=standard&wskey=ineternates


In [3]:
agents = dict()

for agent in api_data['object']['agents']:
    if 'prefLabel' in agent:
        label = agent['prefLabel']['en'][0]
        about = agent['about']
        agents[about] = label

for agent in agents:
    print(f'{agent}\n{agents[agent]}\n')
    
places = dict()

for place in api_data['object']['places']:

    if 'en' in place['prefLabel']:
        label = place['prefLabel']['en'][0]
    else:
        label = ''
        
    about = place['about']
    places[about] = label
        


for place in places:
    print(f'{place}\n{places[place]}\n')
    
        
for proxy in api_data['object']['proxies']:

    if 'year' in proxy:
        print('Year: ' + proxy['year']['def'][0])
    print('Language: ' + proxy['dcLanguage']['def'][0])
    
    if 'dcDescription' in proxy:
        if 'en' in proxy['dcDescription']:
            for note in proxy['dcDescription']['en']:
                print(note)
                
    if 'dctermsExtent' in proxy:
        if 'en' in proxy['dctermsExtent']:
            for note in proxy['dcDescription']['en']:
                print(note)
        

http://www.wikidata.org/entity/Q892143
Bonaventura Vulcanius



http://data.europeana.eu/place/85
France

http://vocab.getty.edu/tgn/1000070
France

https://sws.geonames.org/2751772/


https://sws.geonames.org/11874022/


Year: 1425
Language: lat
Language: lat
decorated initials, coloured symbols in the tables
handwritten, littera textualis
17 lines, written space varies
Binding: Medieval binding, contemporary (so-called belt book).
Description (Senguerdius & 1716): http://hdl.handle.net/1887.1/item:290399
Description (Molhuysen 1910): http://hdl.handle.net/1887.1/item:3151050
Description (Catalogus compendiarius 1932): http://hdl.handle.net/1887.1/item:491294
Also described by MMDC and A.W. Byvanckgenootschap (database RKD, The Hague).
Restored (report available), 1986
decorated initials, coloured symbols in the tables
handwritten, littera textualis
17 lines, written space varies
Binding: Medieval binding, contemporary (so-called belt book).
Description (Senguerdius & 1716): http://hd

## Search API

The search API can be used to select records with specific properties in the Europeana database. Te query below selects all the items from Leiden University that are part of the [ARMA collection](https://www.medieval-reads.eu/home). 

In [4]:
base_url = 'https://api.europeana.eu/record/v2/search.json?'
profile= 'standard' 
qf= 'Leiden%20University%20Libraries'
query='middle%20ages'
rows=100
start=1

api_call = f'{base_url}profile={profile}&qf={qf}&query={query}&rows={rows}&start={start}&wskey={api_key}'
print(api_call)

response = requests.get(api_call)

if response:
    api_data = response.json()
    if len(api_data)>0:
        print(f"{api_data['totalResults']} results")

https://api.europeana.eu/record/v2/search.json?profile=standard&qf=Leiden%20University%20Libraries&query=middle%20ages&rows=100&start=1&wskey=ineternates
610 results


To experiment with the Search API, change the value of the query parameter below. 

In [7]:
base_url = 'https://api.europeana.eu/record/v2/search.json?'
profile= 'standard' 
query='alchemy'
rows=100
start=1

api_call = f'{base_url}profile={profile}&query={query}&rows={rows}&start={start}&wskey={api_key}'
print(api_call)

response = requests.get(api_call)

if response:
    api_data = response.json()
    if len(api_data)>0:
        print(f"{api_data['totalResults']} results")

https://api.europeana.eu/record/v2/search.json?profile=standard&query=alchemy&rows=100&start=1&wskey=ineternates
2201 results


In [8]:
if len(api_data)>0:
    for item in api_data['items']:
        print(item['guid'], end = '\n\n')

https://www.europeana.eu/item/2059218/data_sounds_IT_DDS0000012917000200?utm_source=api&utm_medium=api&utm_campaign=ineternates

https://www.europeana.eu/item/9200271/BibliographicResource_3000058904832?utm_source=api&utm_medium=api&utm_campaign=ineternates

https://www.europeana.eu/item/9200271/BibliographicResource_3000058904829?utm_source=api&utm_medium=api&utm_campaign=ineternates

https://www.europeana.eu/item/283/2746_C?utm_source=api&utm_medium=api&utm_campaign=ineternates

https://www.europeana.eu/item/92093/BibliographicResource_1000086140428?utm_source=api&utm_medium=api&utm_campaign=ineternates

https://www.europeana.eu/item/2064114/Museu_ProvidedCHO_Digitales_Kunst__und_Kulturarchiv_D_sseldorf_854849?utm_source=api&utm_medium=api&utm_campaign=ineternates

https://www.europeana.eu/item/9200579/qj5u2zua?utm_source=api&utm_medium=api&utm_campaign=ineternates

https://www.europeana.eu/item/9200519/ark__12148_btv1b90613892?utm_source=api&utm_medium=api&utm_campaign=ineternates

