# Accessing LMEC Collections via IIIF API

This notebook provides some tips for using Digital Commonwealth's IIIF API to query the LMEC collections portal and programmatically retrieve metadata about collections items.

### Understanding the IIIF API

Through the BPL/Digital Commonwealth, all of LMEC's maps are compliant with the International Image Interoperability Framework (IIIF). This means you can use IIIF APIs to retrieve Image and Presentation responses for any LMEC collection item.

#### Image API

An **Image API** request can return either *image metadata* or *a static image*.

Let's say we want to request metadata and an image for this recently accessioned [map of summer resorts along the Boston & Maine Railroad](https://collections.leventhalmap.org/search/commonwealth:g158f6689).

The LMEC's API syntax for requesting image metadata is `BASE_URL` + `IMAGE_ID` + `/info.json`:

    # base URL
    https://iiif.digitalcommonwealth.org/iiif/2/

    # image information request
    https://iiif.digitalcommonwealth.org/iiif/2/IMAGE_ID/info.json

The image ID can be found by parsing the JSON data for any collections item and reading its `exemplary_image_ssi` field:


In [None]:
import json
import requests

item = requests.get("https://collections.leventhalmap.org/search/commonwealth:g158f6689.json")
print(item.json()['response']['document']['exemplary_image_ssi'])

We can append this image ID to the base URL to retrieve image metadata:

In [None]:
base = "https://iiif.digitalcommonwealth.org/iiif/2/"
imageID = "commonwealth:7w62hz17g"

# request BASE_URL + IMAGE_ID + /info.json, as described above
imageInfo = requests.get(base+imageID+"/info.json")

print(json.dumps(imageInfo.json(), indent=2))
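The `info.json` response describes the full-resolution image. As a minimal sketch, assuming the response follows the IIIF Image API 2.x layout (top-level `width` and `height` keys), the hypothetical helper `scaled_height` below works out the height you would get back for a requested width. The sample dictionary is made up for illustration and is not this map's real size.

```python
def scaled_height(info, target_width):
    """Return the pixel height of the image when scaled to target_width.

    `info` is a parsed info.json dictionary; this assumes it has the
    top-level `width` and `height` keys of the IIIF Image API 2.x.
    """
    return round(info["height"] * target_width / info["width"])


# made-up dimensions, for illustration only
sample_info = {"width": 8000, "height": 6000}
print(scaled_height(sample_info, 1200))
```

With a real response, the same call would look like `scaled_height(imageInfo.json(), 1200)`.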

The following syntax will return a static image:

    # full image as JPEG
    https://iiif.digitalcommonwealth.org/iiif/2/IMAGE_ID/full/full/0/default.jpg

And accessing it is as easy as concatenating a few strings:

In [None]:
iiifSpec = "/full/full/0/default.jpg"

imageStatic = (base+imageID+iiifSpec)
print(imageStatic)

By tweaking the `iiifSpec` variable, you can manipulate the image by changing its region, size, rotation, quality, and format. Because these endpoints use version 2 of the Image API (note the `/iiif/2/` in the base URL), see the [IIIF Image API 2.1 docs](https://iiif.io/api/image/2.1/) for more information.
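To make those parameters easier to vary, here is a minimal sketch of a URL builder. The function name `iiif_image_url` and its keyword arguments are hypothetical; they simply mirror the `region/size/rotation/quality.format` order of the Image API 2.x path. Whether a given combination is honored depends on what the server supports.

```python
def iiif_image_url(base, image_id, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Assemble an Image API 2.x style URL from its path parameters."""
    # strip any trailing slash so we never produce a double slash
    root = base.rstrip("/")
    return f"{root}/{image_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


base = "https://iiif.digitalcommonwealth.org/iiif/2/"
imageID = "commonwealth:7w62hz17g"

# the defaults reproduce the full-size JPEG request shown earlier
print(iiif_image_url(base, imageID))

# 600 pixels wide, rotated 90 degrees
print(iiif_image_url(base, imageID, size="600,", rotation="90"))

# a 1000 x 1000 pixel crop from the top-left corner
print(iiif_image_url(base, imageID, region="0,0,1000,1000"))
```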

You can also retrieve image data in bulk with a `for` loop or a data frame. For example, the following search query returns 157 pictorial maps in Massachusetts:

    https://collections.leventhalmap.org/search?f%5Bsubject_geographic_sim%5D%5B%5D=Massachusetts&q=pictorial

By parsing this query in a data frame, we can retrieve IIIF image metadata and URLs in bulk.

Below, we've simply filtered the JSON API response by two fields: the Commonwealth ID and the IIIF ID.

For the sake of this example, we've limited the results to 20.

In [None]:
import pandas as pd

# don't forget to append
# `.json` after `search` in the URL!

data = requests.get("https://collections.leventhalmap.org/search.json?f%5Bsubject_geographic_sim%5D%5B%5D=Massachusetts&per_page=20&q=pictorial")

# parse the response using the JSON API
# and view it in a pandas data frame

df = pd.DataFrame(data.json()['response']['docs'])
fields = ['id', 'exemplary_image_ssi']
newFieldNames = {'id':'commonwealth_id', 'exemplary_image_ssi':'iiif_id'}
# keep only the two fields we need and give them friendlier names
df_fltr = df[fields].rename(columns=newFieldNames)
df_fltr


Let's say we want to create a list that contains static image URLs for these 20 maps.

We can easily loop through the `iiif_id` column and use the image API syntax to generate a list of image URLs. In doing so, we might want to redefine the `iiifSpec` variable so that we're retrieving smaller images:

In [None]:
# redefine `iiifSpec` to load smaller images

iiifSpec = "/full/1200,/0/default.jpg"

# create an empty list to hold IIIF image endpoints and
# loop through the data frame to retrieve them

iiifMaps = []
for a in df_fltr['iiif_id']:
    a = (base+a+iiifSpec)
    iiifMaps.append(a)
print(iiifMaps)

#### Presentation API

Where the Image API makes it easy to retrieve image metadata and static images, the **Presentation API** returns a *manifest* that lets us embed high-quality, zoomable images in things like web apps or image viewers.

Practically, a IIIF manifest is just a URL. Functionally, it's the package that contains all information related to a particular digital object, including the image itself as well as the metadata.

The manifest is accessible via a URL that points to a file which can be read by a IIIF tool or viewer, like Mirador or OpenSeadragon. ([description from Harvard Library](https://library.harvard.edu/services-tools/iiif-manifests-digital-objects))

In LMEC collections, IIIF Presentation manifests can be returned by appending `/manifest` to the URL for the item detail page. For example:

In [138]:
data = requests.get("https://collections.leventhalmap.org/search/commonwealth:g158f6689/manifest")

data.json()

{'@context': 'http://iiif.io/api/presentation/2/context.json',
 '@id': 'https://ark.digitalcommonwealth.org/ark:/50959/g158f6689/manifest',
 '@type': 'sc:Manifest',
 'label': 'Summer resorts of the coast, lake, and mountain regions along the Boston & Maine Railroad and connections',
 'thumbnail': {'@id': 'https://ark.digitalcommonwealth.org/ark:/50959/g158f6689/thumbnail',
  'service': {'@context': 'http://iiif.io/api/image/2/context.json',
   '@id': 'https://iiif.digitalcommonwealth.org/iiif/2/commonwealth:7w62hz17g',
   'profile': 'http://iiif.io/api/image/2/level2.json'}},
 'viewingHint': 'individuals',
 'metadata': [{'label': 'Title',
   'value': 'Summer resorts of the coast, lake, and mountain regions along the Boston & Maine Railroad and connections'},
  {'label': 'Date', 'value': '[1912]'},
  {'label': 'Creator',
   'value': ['Boston and Maine Railroad. General Passenger Department',
    'Matthews-Northrup Works']},
  {'label': 'Publisher', 'value': 'Buffalo : Matthews-Northrup 

This is just a single item. We can retrieve items in bulk with the Presentation API in much the same way as the Image API: by parsing with the JSON API.

Let's recreate the search for pictorial maps in Massachusetts that we filtered earlier

    https://collections.leventhalmap.org/search?f%5Bsubject_geographic_sim%5D%5B%5D=Massachusetts&q=pictorial

and then create a list of manifest URLs by appending `/manifest` to the base URL and the Commonwealth ID:

In [None]:
data = requests.get("https://collections.leventhalmap.org/search.json?f%5Bsubject_geographic_sim%5D%5B%5D=Massachusetts&per_page=20&q=pictorial")

df = pd.DataFrame(data.json()['response']['docs'])
fields = ['id', 'exemplary_image_ssi']
newFieldNames = {'id':'commonwealth_id', 'exemplary_image_ssi':'iiif_id'}
df_fltr = df[fields].rename(columns=newFieldNames)

# define manifest

manifest = "/manifest"

# redefine base URL

base = "https://collections.leventhalmap.org/search/"

# reset the list so that it holds manifest URLs and
# loop through the data frame to build them

iiifMaps = []
for a in df_fltr['commonwealth_id']:
    a = (base+a+manifest)
    iiifMaps.append(a)
print(iiifMaps)


For casual users (e.g., people who aren't library staff), the metadata from the Presentation API is easier to work with than the metadata from the JSON API. It's curated so that only widely relevant fields, like creator and date, are present.

All of that stuff lives in the `metadata` node of the manifest response. We can select it with:

In [134]:
data = requests.get("https://collections.leventhalmap.org/search/commonwealth:g158f6689/manifest")

data.json()['metadata']

[{'label': 'Title',
  'value': 'Summer resorts of the coast, lake, and mountain regions along the Boston & Maine Railroad and connections'},
 {'label': 'Date', 'value': '[1912]'},
 {'label': 'Creator',
  'value': ['Boston and Maine Railroad. General Passenger Department',
   'Matthews-Northrup Works']},
 {'label': 'Publisher', 'value': 'Buffalo : Matthews-Northrup Works'},
 {'label': 'Type of Resource', 'value': 'Cartographic'},
 {'label': 'Format', 'value': 'Maps'},
 {'label': 'Language', 'value': 'English'},
 {'label': 'Subjects',
  'value': ['New England--Maps',
   'Railroads--New England--Maps',
   'Boston and Maine Railroad']},
 {'label': 'Location', 'value': 'Boston Public Library'},
 {'label': 'Collection (local)',
  'value': 'Norman B. Leventhal Map Center Collection'},
 {'label': 'Identifier',
  'value': ['https://ark.digitalcommonwealth.org/ark:/50959/g158f6689',
   '06_01_017902',
   'G3721.P3 1912 .B67',
   '39999085963088']},
 {'label': 'Terms of Use',
  'value': ['No know
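Because the `metadata` node is a list of label/value pairs, looking up a single field means scanning the list. A minimal sketch, assuming each label appears only once, flattens it into a dictionary keyed by label. The helper name `metadata_to_dict` is hypothetical, and the sample below copies a few entries from the output above.

```python
def metadata_to_dict(metadata):
    """Flatten a manifest's list of label/value pairs into a dict keyed by label."""
    # if a label were repeated, the last occurrence would win
    return {entry["label"]: entry["value"] for entry in metadata}


# a few entries copied from the manifest output above
sample_metadata = [
    {"label": "Date", "value": "[1912]"},
    {"label": "Creator",
     "value": ["Boston and Maine Railroad. General Passenger Department",
               "Matthews-Northrup Works"]},
    {"label": "Format", "value": "Maps"},
]

record = metadata_to_dict(sample_metadata)
print(record["Date"])
print(record["Creator"])
```

Note that a value may be either a single string or a list of strings, so downstream code should handle both. With a live response, the call would be `metadata_to_dict(data.json()['metadata'])`.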

Finally, if you want to query items like atlases which contain child items, such as

    https://www.digitalcommonwealth.org/search/commonwealth:tt44pw076/

you'll need to loop through the different parts of the `manifest` endpoint.

In our collections, items like this follow a hierarchy of `sequence` > `canvas` > `image`. If you query the manifest (e.g., append `/manifest` to the end of this item's URL), you can access canvases like so:


In [None]:
atlas = requests.get("https://www.digitalcommonwealth.org/search/commonwealth:tt44pw076/manifest")

for a in atlas.json()['sequences'][0]['canvases']:
    print(a['@id'])
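To go one level deeper in the `sequence` > `canvas` > `image` hierarchy, the sketch below walks all three levels and collects the Image API service ID attached to each image. It assumes the IIIF Presentation API 2 layout, in which each canvas has an `images` list whose `resource` carries a `service` block. The generator name `image_service_ids` is hypothetical, and the sample manifest is a hand-made stand-in with placeholder identifiers, not real collection data.

```python
def image_service_ids(manifest):
    """Yield (canvas label, image service @id) pairs from a Presentation 2 style manifest."""
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            for image in canvas.get("images", []):
                # missing keys produce None instead of raising an error
                service = image.get("resource", {}).get("service", {})
                yield canvas.get("label"), service.get("@id")


# hand-made stand-in for a manifest, with placeholder identifiers
sample_manifest = {
    "sequences": [{
        "canvases": [
            {"label": "image 1",
             "images": [{"resource": {"service": {
                 "@id": "https://example.org/iiif/2/example:page1"}}}]},
            {"label": "image 2",
             "images": [{"resource": {"service": {
                 "@id": "https://example.org/iiif/2/example:page2"}}}]},
        ]
    }]
}

for label, service_id in image_service_ids(sample_manifest):
    print(label, service_id)
```

With the atlas response, the equivalent call would be `image_service_ids(atlas.json())`.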

You mainly just need the 9-character identifier at the end of these URLs, so you might want to slice the rest away:

In [None]:
atlas_list = []

for a in atlas.json()['sequences'][0]['canvases']:
    atlas_list.append(a['@id'][-9:])

print(atlas_list)
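Slicing the last nine characters relies on every identifier having exactly that length. A slightly more defensive sketch, assuming the identifier is always the final path segment of the canvas `@id`, splits on the last `/` instead. The helper name `last_path_segment` is hypothetical, and the URLs below are placeholders rather than real canvas IDs.

```python
def last_path_segment(url):
    """Return whatever follows the final '/' in a URL, ignoring a trailing slash."""
    return url.rstrip("/").rsplit("/", 1)[-1]


# placeholder canvas URLs, not real identifiers
sample_canvas_ids = [
    "https://example.org/ark:/00000/aaaaaaaaa/canvas/bbbbbbbbb",
    "https://example.org/ark:/00000/aaaaaaaaa/canvas/cccccccccc/",
]

print([last_path_segment(u) for u in sample_canvas_ids])
```

In the loop above, this would replace `a['@id'][-9:]` with `last_path_segment(a['@id'])`.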
