# Accessing Europeana IIIF APIs
The [Europeana IIIF APIs](https://pro.europeana.eu/page/iiif) allow us to download, share, and reuse the images and text of Europeana newspapers.

This notebook shows how to explore the repository: running a search, reading a record, obtaining the full text, and creating a CSV dataset.

The Europeana IIIF APIs require an API key to access the endpoints. Please register at https://pro.europeana.eu/page/get-api to get a key.


## Setting up things

In [66]:
import requests, csv
import json
import pandas as pd

## Global configuration
In this section, we set our api_key and the text (query) that we want to use to search for and retrieve the records.

In [67]:
api_key = 'J6W44jvPV'
query = 'paris'
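
Hard-coding the key in a shared notebook exposes it to every reader. A minimal sketch of one alternative, assuming the key is stored in an environment variable named `EUROPEANA_API_KEY` (a hypothetical name chosen for this example, not one defined by Europeana):

```python
import os

def load_api_key(env_var='EUROPEANA_API_KEY', default=None):
    """Return the API key stored in env_var, or default if it is unset or empty."""
    # env_var is a hypothetical variable name chosen for this sketch
    return os.environ.get(env_var) or default

# fall back to a placeholder so the cell still runs when the variable is not set
example_key = load_api_key(default='YOUR_API_KEY')
```

The result can then be assigned to `api_key` in place of the literal string.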

## Performing a search using the API
The API allows us to search the text and retrieve the hits with the matches highlighted, as traditional search systems (e.g. Lucene and Solr) do.

In [68]:
#url = 'https://newspapers.eanadev.org/api/v2/search.json?query=paris&profile=hits&wskey=J6W44jvPV'
url = 'https://newspapers.eanadev.org/api/v2/search.json'
r = requests.get(url, params = {'query': query, 'profile': 'hits', 'wskey': api_key })
print(r.url)
response = r.text
print(response)

https://newspapers.eanadev.org/api/v2/search.json?query=paris&profile=hits&wskey=J6W44jvPV
{"apikey":"J6W44jvPV","success":true,"requestNumber":999,"itemsCount":12,"totalResults":439085,"items":[{"completeness":5,"country":["Latvia"],"dataProvider":["National Library of Latvia"],"dcLanguage":["lv","ru"],"dcLanguageLangAware":{"def":["lv","ru"]},"dcTitleLangAware":{"def":["Sarkanais Sports - 1941-03-10"]},"edmConcept":["http://data.europeana.eu/concept/base/18"],"edmConceptLabel":[{"def":"Zeitung"},{"def":"Avis"},{"def":"Газета"},{"def":"Газета"},{"def":"Sanomalehti"},{"def":"Jornal"},{"def":"Вестник"},{"def":"Laikraštis"},{"def":"Laikraksts"},{"def":"Novine"},{"def":"Journal"},{"def":"Újság"},{"def":"Լրագիր"},{"def":"Novine"},{"def":"Газета"},{"def":"გაზეთი"},{"def":"Denník (žurnalistika)"},{"def":"Časopis"},{"def":"Nuachtán"},{"def":"Pàipear-naidheachd"},{"def":"Gazeta"},{"def":"Весник"},{"def":"Premsa"},{"def":"Новине"},{"def":"Tidning"},{"def":"신문"},{"def":"Prensa escrita"},{"def":"
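
The request above boils down to three query parameters: `query`, `profile` and `wskey`. As a rough sketch using only the standard library (the `build_search_url` helper is a hypothetical name introduced here), the same URL that `requests` prints can be assembled offline, which can be handy for checking the parameters before calling the API:

```python
from urllib.parse import urlencode

SEARCH_URL = 'https://newspapers.eanadev.org/api/v2/search.json'

def build_search_url(query, api_key, profile='hits'):
    """Assemble the search URL without sending a request."""
    params = {'query': query, 'profile': profile, 'wskey': api_key}
    # urlencode percent-encodes the values and encodes spaces as '+'
    return SEARCH_URL + '?' + urlencode(params)

# 'YOUR_API_KEY' is a placeholder, not a working key
print(build_search_url('paris', 'YOUR_API_KEY'))
```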

## Displaying the mentions in the transcribed text where the search keyword was found

In [69]:
results = json.loads(response)

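# each hit has a 'scope' (the record id) and a list of text selectors
for r in results['hits']:
    print('id:' + r['scope'])
    for s in r['selectors']:
        # prefix + matched text + suffix show the keyword in its context
        print(s.get('prefix', '') + s.get('exact', '') + s.get('suffix', ''))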

id:/9200303/BibliographicResource_3000059897585
Latvijas PSR meistars Bērziņš skrēja vienā pāri ar Tomskas ātrslidotāju Sergēju. 
id:/9200303/BibliographicResource_3000059897596
Eiropas čempions Alfonss Bērziņš skrēja vienā pāri ar Saburovu (Ļeņlngrada). 
id:/9200303/BibliographicResource_3000059897574
Toto: vk 18.—, it 29,—, pāris 266.—. Hanovera Diloņa lieliskā uzvara. 
id:/9200303/BibliographicResource_3000059897894
Ei»er dieser Streiche ist berühmt gewolden und wurde in früheren Jahren oft erzählt: Der Thier-inaler Jadin hatte einen Freund, der im Jardin des Plantes (der zoologische Garten in Paris) Beamter war; diesen Freund bat er, ihm für eine Opern 11 acht eine» seiner Bären, den in ganz 
id:/9200303/BibliographicResource_3000059897564
Toto: vk par Janošu 10.—, it par Žubīti 32.—, pāris 53.50. 3. 1. Klaips — J. 
id:/9200303/BibliographicResource_3000059897584
No teicamas si tuācijas pāri lietuviešu vārtiem pārlidoja Uukkivi raidījums. 
id:/9200303/BibliographicResource_30000598
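
Each hit, as the loop above uses it, carries a `scope` (the record identifier) and a list of `selectors` whose optional `prefix`, `exact` and `suffix` parts surround the match. A minimal sketch of that reassembly as a reusable function, run on a hand-made hit (the `format_hit` helper, the sample data and the `**` markers around the match are illustrative choices for this sketch, not API output):

```python
def format_hit(hit, marker='**'):
    """Return one line per selector, with the matched text wrapped in marker."""
    lines = []
    for s in hit.get('selectors', []):
        exact = s.get('exact', '')
        # only wrap the match when there is one, to avoid printing empty markers
        highlighted = f'{marker}{exact}{marker}' if exact else ''
        lines.append(s.get('prefix', '') + highlighted + s.get('suffix', ''))
    return lines

# hand-made example with the same keys the loop above reads
sample_hit = {
    'scope': '/0000000/example_record',
    'selectors': [
        {'prefix': 'der zoologische Garten in ', 'exact': 'Paris', 'suffix': ') Beamter war'},
    ],
}

for line in format_hit(sample_hit):
    print(line)
```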

## Creating a CSV file

In [90]:
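# utf-8 keeps non-ASCII text intact; newline='' lets the csv module control line endings
csv_out = csv.writer(open('eu_records.csv', 'w', encoding = 'utf-8', newline = ''), delimiter = ',', quotechar = '"', quoting = csv.QUOTE_MINIMAL)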
csv_out.writerow(['title', 'thumbnail', 'date', 'license', 'typem', 'language', 'fulltextUrl', 'manifestUrl', 'fulltext'])

78
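
Because the writer above wraps a file that is never explicitly closed, the last rows may stay in a buffer until the file object is released. One possible alternative, sketched here under the assumption that all rows are first collected in a list of dicts (the `write_records` helper and `COLUMNS` constant are hypothetical), is to write everything inside a `with` block so the file is flushed and closed on exit:

```python
import csv

COLUMNS = ['title', 'thumbnail', 'date', 'license', 'typem', 'language',
           'fulltextUrl', 'manifestUrl', 'fulltext']

def write_records(path, rows):
    """Write a list of dicts to a CSV file and return the number of rows written."""
    # newline='' lets the csv module control line endings; utf-8 keeps non-ASCII text intact
    with open(path, 'w', encoding='utf-8', newline='') as f:
        # restval='' fills columns that are missing from a row with an empty string
        writer = csv.DictWriter(f, fieldnames=COLUMNS, restval='')
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

A call such as `write_records('eu_records.csv', rows)` would then replace the row-by-row `writerow` calls.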

## Retrieving the manifests
A manifest describes the information needed for a viewer to present a digital object to the user, such as the title and the sequence of views/images. We can also retrieve the manifest of each item. According to the Europeana documentation, the request follows the pattern https://iiif.europeana.eu/presentation/[RECORD_ID]/manifest

The manifest includes the metadata; some of the attributes are multivalued.

The full text of a record is available at a URL such as
https://www.europeana.eu/api/fulltext/9200303/BibliographicResource_3000059898023/472ef0641de5cce2ba8eb26d67110ed6#char=0,10o
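
The record identifiers printed by the search start with a slash, so plain concatenation with the pattern above yields a double slash after `presentation`. A small sketch that normalises the identifier first (the `manifest_url` helper is a hypothetical name introduced for this example):

```python
BASE = 'https://iiif.europeana.eu/presentation'

def manifest_url(record_id):
    """Build the manifest URL for a record id, with or without surrounding slashes."""
    # strip('/') removes leading and trailing slashes, so '/9200303/x' and '9200303/x' behave the same
    return f"{BASE}/{record_id.strip('/')}/manifest"

print(manifest_url('/9200303/BibliographicResource_3000059897585'))
```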

In [109]:
results = json.loads(response)

for r in results['hits']:

    # default every column to an empty string in case a field is not set below
    title = thumbnail = date = license = typem = language = fulltextUrl = manifestUrl = fulltext = ''

    # r['scope'] already starts with '/', hence the double slash in the printed URLs
    manifestUrl = 'https://iiif.europeana.eu/presentation/' + r['scope'] + '/manifest'
    responseManifest = requests.get(manifestUrl, params = {'wskey': api_key })
    print(responseManifest.url)

    # parsing the manifest
    m = json.loads(responseManifest.text)

    # retrieving the single-valued metadata
    title = m['label'][0]['@value']
    thumbnail = m['thumbnail']['@id']
    date = m['navDate']
    license = m['license']

    # retrieving the multivalued metadata (only the first value is kept)
    for i in m['metadata']:
        if i['label'] == 'type':
            typem = i['value'][0]['@value']
        elif i['label'] == 'language':
            language = i['value'][0]['@value']

    # getting the full text: the first annotation page points to the full-text resource
    annopageUrl = 'https://iiif.europeana.eu/presentation/' + r['scope'] + '/annopage/1'
    responseAnnopage = requests.get(annopageUrl, params = {'wskey': api_key })
    print(responseAnnopage.url)

    a = json.loads(responseAnnopage.text)
    fulltextUrl = a['resources'][0]['resource']['@id']
    print(fulltextUrl)

    responseFulltext = requests.get(fulltextUrl, params = {'wskey': api_key })

    # parsing the full-text resource
    f = json.loads(responseFulltext.text)
    # TODO check encoding
    fulltext = f['value']

    print('-------')

    # one CSV row per record
    csv_out.writerow([title, thumbnail, date, license, typem, language, fulltextUrl, manifestUrl, fulltext])

https://iiif.europeana.eu/presentation//9200303/BibliographicResource_3000059897585/manifest?wskey=J6W44jvPV
https://iiif.europeana.eu/presentation//9200303/BibliographicResource_3000059897585/annopage/1?wskey=J6W44jvPV
https://www.europeana.eu/api/fulltext/9200303/BibliographicResource_3000059897585/47cfd10478e1981f348ccabf9fd2de3e
-------
https://iiif.europeana.eu/presentation//9200303/BibliographicResource_3000059897596/manifest?wskey=J6W44jvPV
https://iiif.europeana.eu/presentation//9200303/BibliographicResource_3000059897596/annopage/1?wskey=J6W44jvPV
https://www.europeana.eu/api/fulltext/9200303/BibliographicResource_3000059897596/33ac018d7d9d0d5442e0214c3f3f13d1
-------
https://iiif.europeana.eu/presentation//9200303/BibliographicResource_3000059897574/manifest?wskey=J6W44jvPV
https://iiif.europeana.eu/presentation//9200303/BibliographicResource_3000059897574/annopage/1?wskey=J6W44jvPV
https://www.europeana.eu/api/fulltext/9200303/BibliographicResource_3000059897574/50d44c2664ac
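
The loop above keeps only the first value of each metadata entry and indexes the fields directly, which raises a `KeyError` when one is absent. A defensive sketch, assuming the manifest's `metadata` list has the shape the loop relies on (entries with a `label` and a list of `{'@value': ...}` dicts); the `metadata_values` helper and the sample manifest are made up for illustration:

```python
def metadata_values(manifest, label):
    """Return every value stored under the given metadata label (empty list if none)."""
    values = []
    for entry in manifest.get('metadata', []):
        if entry.get('label') == label:
            # collect all values instead of only the first one
            values.extend(v.get('@value', '') for v in entry.get('value', []))
    return values

# hand-made manifest fragment with a multivalued 'language' attribute
sample_manifest = {
    'metadata': [
        {'label': 'language', 'value': [{'@value': 'lv'}, {'@value': 'ru'}]},
        {'label': 'type', 'value': [{'@value': 'http://schema.org/PublicationIssue'}]},
    ]
}

print(metadata_values(sample_manifest, 'language'))
```

Joining the returned list (for example with `'; '.join(...)`) gives a single string that fits in one CSV column.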

In [110]:
# Load the CSV file created above.
# This puts the data in a pandas DataFrame
df = pd.read_csv('eu_records.csv')

## Have a peek

In [111]:
df

Unnamed: 0,title,thumbnail,date,license,typem,language,fulltextUrl,manifestUrl,fulltext
0,Sarkanais Sports - 1941-02-17,https://api.europeana.eu/api/v2/thumbnail-by-u...,1941-02-17T00:00:00Z,http://creativecommons.org/publicdomain/mark/1.0/,http://schema.org/PublicationIssue,lv,https://www.europeana.eu/api/fulltext/9200303/...,https://iiif.europeana.eu/presentation//920030...,sjg?diues^C?O-'^uo? 5.?!?6??qo\nVISU ZEMJU PRO...
1,Sarkanais Sports - 1941-01-27,https://api.europeana.eu/api/v2/thumbnail-by-u...,1941-01-27T00:00:00Z,http://creativecommons.org/publicdomain/mark/1.0/,http://schema.org/PublicationIssue,lv,https://www.europeana.eu/api/fulltext/9200303/...,https://iiif.europeana.eu/presentation//920030...,Oblig?ts konfroleksempl?rs\n??? ?*?$? »m>t?>ne...
2,Sarkanais Sports - 1940-10-07,https://api.europeana.eu/api/v2/thumbnail-by-u...,1940-10-07T00:00:00Z,http://creativecommons.org/publicdomain/mark/1.0/,http://schema.org/PublicationIssue,lv,https://www.europeana.eu/api/fulltext/9200303/...,https://iiif.europeana.eu/presentation//920030...,"P/7 i ? / ( L-J Lix3 U\A :U Nr. 24 Pirmdien, 7..."
3,Windausche Zeitung - 1903-11-14,https://api.europeana.eu/api/v2/thumbnail-by-u...,1903-11-14T00:00:00Z,http://creativecommons.org/publicdomain/mark/1.0/,http://schema.org/PublicationIssue,de,https://www.europeana.eu/api/fulltext/9200303/...,https://iiif.europeana.eu/presentation//920030...,???????? ??????.\nAbonn«m»»tspr«is! jährlich 2...
4,Sarkanais Sports - 1940-12-30,https://api.europeana.eu/api/v2/thumbnail-by-u...,1940-12-30T00:00:00Z,http://creativecommons.org/publicdomain/mark/1.0/,http://schema.org/PublicationIssue,lv,https://www.europeana.eu/api/fulltext/9200303/...,https://iiif.europeana.eu/presentation//920030...,Vi*u *emfrt pttlef?tMh #«?1«?^/1??#||\nnr. 58 ...
5,Sarkanais Sports - 1940-09-09,https://api.europeana.eu/api/v2/thumbnail-by-u...,1940-09-09T00:00:00Z,http://creativecommons.org/publicdomain/mark/1.0/,http://schema.org/PublicationIssue,lv,https://www.europeana.eu/api/fulltext/9200303/...,https://iiif.europeana.eu/presentation//920030...,V\nVisu zemju prolet?rie?i/ savienoj?etiesl\n?...
6,Sarkanais Sports - 1941-04-07,https://api.europeana.eu/api/v2/thumbnail-by-u...,1941-04-07T00:00:00Z,http://creativecommons.org/publicdomain/mark/1.0/,http://schema.org/PublicationIssue,lv,https://www.europeana.eu/api/fulltext/9200303/...,https://iiif.europeana.eu/presentation//920030...,"VISU ZEMJU PROLET?RIE?I, SAVIENOJIETIES!\nReda..."
7,Sarkanais Sports - 1940-08-19,https://api.europeana.eu/api/v2/thumbnail-by-u...,1940-08-19T00:00:00Z,http://creativecommons.org/publicdomain/mark/1.0/,http://schema.org/PublicationIssue,lv,https://www.europeana.eu/api/fulltext/9200303/...,https://iiif.europeana.eu/presentation//920030...,i\nVisu l?m?u proletarleSf/ scvleno?atte?l ??2...
8,Sarkanais Sports - 1941-03-10,https://api.europeana.eu/api/v2/thumbnail-by-u...,1941-03-10T00:00:00Z,http://creativecommons.org/publicdomain/mark/1.0/,http://schema.org/PublicationIssue,lv,https://www.europeana.eu/api/fulltext/9200303/...,https://iiif.europeana.eu/presentation//920030...,"?? ?EMJU PROVSTjf?tTZST, SAVIENOJIETIES!\nReda..."
9,Sarkanais Sports - 1941-01-20,https://api.europeana.eu/api/v2/thumbnail-by-u...,1941-01-20T00:00:00Z,http://creativecommons.org/publicdomain/mark/1.0/,http://schema.org/PublicationIssue,lv,https://www.europeana.eu/api/fulltext/9200303/...,https://iiif.europeana.eu/presentation//920030...,"Visu ????\nRedakci|a, kantoris un ekspedicija ..."
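
With the records in a DataFrame, the usual pandas tools apply. As an illustrative sketch on a small hand-built frame that mimics three of the columns above (the rows are copied from the output shown and nothing is fetched; the `sample` and `per_language` names are introduced here), the `date` strings can be parsed and the issues counted per language:

```python
import pandas as pd

# three rows copied from the table above, restricted to three columns
sample = pd.DataFrame({
    'title': ['Sarkanais Sports - 1941-02-17',
              'Windausche Zeitung - 1903-11-14',
              'Sarkanais Sports - 1940-10-07'],
    'date': ['1941-02-17T00:00:00Z', '1903-11-14T00:00:00Z', '1940-10-07T00:00:00Z'],
    'language': ['lv', 'de', 'lv'],
})

# parse the ISO 8601 strings into timezone-aware timestamps
sample['date'] = pd.to_datetime(sample['date'], utc=True)
sample['year'] = sample['date'].dt.year

# number of issues per language
per_language = sample.groupby('language').size()
print(per_language)
```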


## Showing the thumbnails as a gallery

Once we have queried the repository and saved the metadata as a CSV file, we can show the results as a thumbnail gallery.

In [122]:
from IPython.display import HTML, Image

def _src_from_data(data):
    """Base64 encodes image bytes for inclusion in an HTML img element"""
    img_obj = Image(data=data)
    for bundle in img_obj._repr_mimebundle_():
        for mimetype, b64value in bundle.items():
            if mimetype.startswith('image/'):
                return f'data:{mimetype};base64,{b64value}'

def gallery(images, row_height='auto'):
    """Shows a set of images in a gallery that flexes with the width of the notebook.
    
    Parameters
    ----------
    images: list of str or bytes
        URLs or bytes of images to display

    row_height: str
        CSS height value to assign to all images. Set to 'auto' by default to show images
        with their native dimensions. Set to a value like '250px' to make all rows
        in the gallery equal height.
    """
    figures = []
    for image in images:
        if isinstance(image, bytes):
            src = _src_from_data(image)
            caption = ''
        else:
            src = image
            caption = f'<figcaption style="font-size: 0.6em">{image}</figcaption>'
        figures.append(f'''
            <figure style="margin: 5px !important;">
              <img src="{src}" style="height: {row_height}">
              
            </figure>
        ''')
    return HTML(data=f'''
        <div style="display: flex; flex-flow: row wrap; text-align: center;">
        {''.join(figures)}
        </div>
    ''')

In [121]:
#gallery(urls, row_height='150px')
gallery(df['thumbnail'], row_height='150px')