# Working with APIs in Python

Making API requests in Python can be really simple. Python's standard library includes a lower-level module, `urllib`, that can make the same kinds of web requests, but it's not as friendly as the third-party `requests` library, which we'll be using.

In [2]:
import requests
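
For comparison, here is a rough sketch of what the standard-library route might look like. The helper names `build_url` and `fetch_json` and the `api.example.org` address are made up for illustration. Only `build_url` is actually called here, so nothing is sent over the network.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_url(base_url, params):
    """Append percent-encoded query parameters to a base URL."""
    return base_url + "?" + urlencode(params)


def fetch_json(base_url, params, timeout=10):
    """Send a GET request with urllib and decode the JSON body."""
    request = Request(build_url(base_url, params),
                      headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the URL needs no network access; calling fetch_json() would.
example_url = build_url("https://api.example.org/items", {"q": "cats", "size": 5})
print(example_url)
```

With `requests`, the URL building, the sending, and the decoding collapse into a couple of short calls, which is why we use it for the rest of this notebook.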

## Authentication

You'll have to authenticate each request to the Harvard Art Museums (HAM) API with an API key. Other APIs may require different kinds of authentication (sometimes very complicated ones; look for a client library at that point), but HAM's key-based authentication is pretty simple, which makes things easy for us. You can sign up for a key [here](https://www.harvardartmuseums.org/collections/api).

In [3]:
APIKEY = "b0cde630-ce66-11e8-951c-b3d75228cc98" # Enter your API key here
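
Pasting a key straight into a notebook makes it easy to share the key by accident. One common alternative, sketched here, is to read it from an environment variable. The variable name `HAM_APIKEY` and the helper `load_api_key` are illustrative choices, not something the HAM API requires.

```python
import os


def load_api_key(env_var="HAM_APIKEY"):
    """Return the API key stored in an environment variable.

    HAM_APIKEY is a hypothetical name; use whatever name you set yourself.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return key
```

If you set the variable before starting the notebook, `APIKEY = load_api_key()` could stand in for the hard-coded assignment above.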

## Basic request

We're going to start off with a basic request to the API. This API, like many others, has a variety of endpoints, each with its own URL, slightly modified from a base URL. We'll worry about the general case in a bit; for now, let's look at a basic API request.
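
To make the "base URL plus endpoint" idea concrete, here is a minimal sketch of a helper that builds endpoint URLs. The helper name `endpoint_url` is our own invention, and the ID `12345` is made up. Appending an ID to the endpoint mirrors the single-object lookup later in this notebook.

```python
BASE_URL = "https://api.harvardartmuseums.org"


def endpoint_url(resource, record_id=None):
    """Build the URL for an endpoint, optionally for one specific record."""
    parts = [BASE_URL, resource.strip("/")]
    if record_id is not None:
        parts.append(str(record_id))
    return "/".join(parts)


print(endpoint_url("object"))          # the endpoint used throughout this notebook
print(endpoint_url("object", 12345))   # 12345 is a made-up ID, for illustration only
```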

In this example, we'll re-create the first example in the [Object endpoint documentation](https://github.com/harvardartmuseums/api-docs/blob/master/object.md), which will give each of you the records for 10 objects that have never been viewed online in the museum's collections.

In [4]:
url = "https://api.harvardartmuseums.org/object"
parameters = {
    "q":"totalpageviews:0",
    "size":10,
    "apikey":APIKEY
}
R = requests.get(url,params=parameters)
R.json()

{'info': {'next': 'https://api.harvardartmuseums.org/object?q=totalpageviews%3A0&size=10&apikey=b0cde630-ce66-11e8-951c-b3d75228cc98&page=2',
  'page': 1,
  'pages': 5627,
  'totalrecords': 56270,
  'totalrecordsperquery': 10},
 'records': [{'accessionmethod': 'Gift',
   'accessionyear': 1933,
   'accesslevel': 1,
   'century': None,
   'classification': 'Vessels',
   'classificationid': 57,
   'colorcount': 0,
   'commentary': None,
   'contact': 'am_asianmediterranean@harvard.edu',
   'contextualtextcount': 0,
   'copyright': None,
   'creditline': 'Harvard Art Museums/Fogg Museum, Gift of The Republic of Spain through the Museo Arqueologico Nacional and Professor Arthur Kingsley Porter',
   'culture': 'Iberian',
   'datebegin': 0,
   'dated': None,
   'dateend': 0,
   'dateoffirstpageview': None,
   'dateoflastpageview': None,
   'department': 'Department of Ancient and Byzantine Art & Numismatics',
   'description': 'Iberian pottery found in the necropoleis at Peal de Becerro (Jaen

### Refresher on Dictionaries

Python dictionaries are collections of key / value pairs, where a value can be accessed by its key. You're essentially naming a value in a container, so you can easily call it up later.

Dictionaries have very fast lookups, so you can get a value from its key very quickly, no matter how large the dictionary is. Since Python 3.7 they also keep their key / value pairs in insertion order when you iterate through them, but you still look values up by key rather than by position.

We're just going to be looking up data in dictionaries, so here's a quick refresher on the syntax:

In [5]:
parameters['q']

'totalpageviews:0'

In [6]:
parameters['apikey'] # This also works when we've set the value to another variable

'b0cde630-ce66-11e8-951c-b3d75228cc98'

In [7]:
parameters['q'] = "totalpageviews:1" # You can also set the value of a key like you would a variable
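
API records often have missing or empty fields, so a few more lookup patterns are worth knowing. This is a small sketch using a hand-written dictionary. Its values are illustrative, not real API output.

```python
record = {
    "title": "Untitled",              # illustrative values, not real API output
    "century": None,
    "info": {"page": 1, "pages": 3},
}

print(record["title"])                  # direct lookup; raises KeyError if the key is missing
print(record.get("dated", "unknown"))   # .get() returns a default instead of raising
print("century" in record)              # membership tests check keys, not values
print(record["info"]["pages"])          # chained lookups reach into nested dictionaries
```

Note that `"century" in record` is true even though the stored value is `None`: the key exists, it just has no meaningful value.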

## Making a Request

The request syntax is so simple, you might have missed it. Let's query again for objects with only one pageview, and take a closer look.

In [8]:
R = requests.get(url,params=parameters)

### Formatted parameters

That call has returned a response object, which contains not only the data that we get from the Harvard Art Museums, but also information on the request we sent, like the URL that it used. Notice that `requests` has turned our query parameter dictionary into a query string appended to the end of our URL.

If you've been working with API requests or web scraping before, you might be used to seeing URLs get constructed like this:

```python
url = "https://api.harvardartmuseums.org/object?q=" + query + "&apikey=" + apikey
```

If you have, I'm sure you'll appreciate how much simpler this is, especially when dealing with more query parameters.

In [9]:
R.url

'https://api.harvardartmuseums.org/object?q=totalpageviews%3A1&size=10&apikey=b0cde630-ce66-11e8-951c-b3d75228cc98'
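
Notice that the `:` in our query shows up as `%3A` in the URL. That is percent-encoding, and the standard library's `urllib.parse` applies the same kind of encoding. Here is a small sketch that builds a query string like the one above and decodes it again, with `YOUR_API_KEY` as a placeholder rather than a real key.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

params = {"q": "totalpageviews:1", "size": 10, "apikey": "YOUR_API_KEY"}

# Encode the dictionary into a query string (":" becomes "%3A").
query_string = urlencode(params)
print(query_string)

# Going the other way: split a full URL and decode its query parameters.
full_url = "https://api.harvardartmuseums.org/object?" + query_string
decoded = parse_qs(urlsplit(full_url).query)
print(decoded["q"])   # parse_qs returns a list of values for each key
```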

### Taking a look at the results

Response objects have a built-in method, `.json()`, which parses a JSON response body (a string of text that happens to be in this data format) into native Python data structures, like lists, dictionaries, numbers and strings. We can use this method to see a dictionary representation of what we've gotten from the API request.

In [10]:
R.json()

{'info': {'next': 'https://api.harvardartmuseums.org/object?q=totalpageviews%3A1&size=10&apikey=b0cde630-ce66-11e8-951c-b3d75228cc98&page=2',
  'page': 1,
  'pages': 2396,
  'totalrecords': 23957,
  'totalrecordsperquery': 10},
 'records': [{'accessionmethod': 'Transfer',
   'accessionyear': 2011,
   'accesslevel': 1,
   'century': '20th century',
   'classification': 'Photographs',
   'classificationid': 17,
   'colorcount': 0,
   'commentary': None,
   'contact': 'am_moderncontemporary@harvard.edu',
   'contextualtextcount': 0,
   'copyright': None,
   'creditline': 'Harvard Art Museums/Fogg Museum, Transfer from the Carpenter Center for the Visual Arts, American Professional Photographers Collection',
   'culture': 'American',
   'datebegin': 1945,
   'dated': 'c. 1950',
   'dateend': 1955,
   'dateoffirstpageview': '2013-02-07',
   'dateoflastpageview': '2013-02-07',
   'department': 'Department of Photographs',
   'description': None,
   'dimensions': 'image: 12.7 x 10.16 cm (5 x 
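
The conversion that `.json()` performs is roughly what the standard library's `json.loads` does to a string. Here is a minimal sketch using a tiny hand-written JSON string shaped like the response above, with only a few of the fields kept.

```python
import json

# A shortened, hand-written JSON document shaped like the API response.
raw_text = (
    '{"info": {"page": 1, "pages": 2396}, '
    '"records": [{"century": "20th century", "description": null}]}'
)

data = json.loads(raw_text)

print(type(data).__name__)                 # JSON objects become dictionaries
print(type(data["records"]).__name__)      # JSON arrays become lists
print(data["records"][0]["description"])   # JSON null becomes Python's None
```

This is why so many fields in the output above show up as `None`: the API sent `null` for them.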

## Changing our request

Let's say we're not interested in the most obscure parts of the collection (pot sherds, apparently), but rather in the most popular parts of the collection. There are a few ways we might go about doing this. One way might be to sort our search results by `totalpageviews`, and see what the top 10 are.

To do that, we can go back to the [API documentation](https://github.com/harvardartmuseums/api-docs/blob/master/object.md) and look for hints about what we might be able to do.

In [29]:
parameters = {
    "size":10,
    "apikey":APIKEY,
    "sort": "totalpageviews",
    "sortorder": "desc"
}
R = requests.get(url,params=parameters)
R.json()

{'info': {'next': 'https://api.harvardartmuseums.org/object?size=10&apikey=b0cde630-ce66-11e8-951c-b3d75228cc98&sort=totalpageviews&sortorder=desc&page=2',
  'page': 1,
  'pages': 23295,
  'totalrecords': 232946,
  'totalrecordsperquery': 10},
 'records': [{'accessionmethod': 'Bequest',
   'accessionyear': 1951,
   'accesslevel': 1,
   'century': '19th century',
   'classification': 'Paintings',
   'classificationid': 26,
   'colorcount': 10,
   'colors': [{'color': '#64af7d',
     'css3': '#5f9ea0',
     'hue': 'Green',
     'percent': 0.2979781420765,
     'spectrum': '#4fb94f'},
    {'color': '#64c896',
     'css3': '#66cdaa',
     'hue': 'Green',
     'percent': 0.21289617486339,
     'spectrum': '#47b853'},
    {'color': '#323219',
     'css3': '#2f4f4f',
     'hue': 'Brown',
     'percent': 0.19814207650273,
     'spectrum': '#3db657'},
    {'color': '#7d7d4b',
     'css3': '#696969',
     'hue': 'Green',
     'percent': 0.056775956284153,
     'spectrum': '#6cbd45'},
    {'color

In [34]:
records = R.json()['records']
for record in records:
    print(record['title'] + ": " + str(record['totalpageviews']))

Self-Portrait Dedicated to Paul Gauguin: 23715
The Gare Saint-Lazare: Arrival of a Train: 18458
Bahram Gur Fights the Horned Wolf (painting, verso; text, recto), illustrated folio from a manuscript of the Great Ilkhanid Shahnama (Book of Kings): 13947
Odalisque with a Slave: 12867
A Mother and Child and Four Studies of Her Right Hand, 1904; verso:  Self-Portrait Standing, 1903: 11817
Jeanne-Antoinette Poisson, Marquise de Pompadour: 10808
Red Boats, Argenteuil: 9144
Court of Gayumars (painting, recto; text, verso), folio from a manuscript of the Shahnama by Firdawsi: 8878
Self-Portrait in Tuxedo: 7230
Mother and Child: 6957
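
The loop above assumes every record has a `title`. Here is a slightly more defensive sketch using f-strings and `.get()`. The first two records copy titles and view counts from the output above. The third is a hypothetical record with no title, included only to show the fallback.

```python
records = [
    {"title": "Self-Portrait Dedicated to Paul Gauguin", "totalpageviews": 23715},
    {"title": "The Gare Saint-Lazare: Arrival of a Train", "totalpageviews": 18458},
    {"totalpageviews": 0},   # hypothetical record with no title
]


def summarize(record):
    """Format one record as 'title: views', tolerating missing fields."""
    title = record.get("title") or "(untitled)"
    views = record.get("totalpageviews", 0)
    return f"{title}: {views}"


for record in records:
    print(summarize(record))
```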


The top result from this query is a Van Gogh painting titled "Self-Portrait Dedicated to Paul Gauguin." You can grab just the first object by accessing the records list (which is indexed from 0):

In [32]:
topResult = R.json()['records'][0]
topResult

{'accessionmethod': 'Bequest',
 'accessionyear': 1951,
 'accesslevel': 1,
 'century': '19th century',
 'classification': 'Paintings',
 'classificationid': 26,
 'colorcount': 10,
 'colors': [{'color': '#64af7d',
   'css3': '#5f9ea0',
   'hue': 'Green',
   'percent': 0.2979781420765,
   'spectrum': '#4fb94f'},
  {'color': '#64c896',
   'css3': '#66cdaa',
   'hue': 'Green',
   'percent': 0.21289617486339,
   'spectrum': '#47b853'},
  {'color': '#323219',
   'css3': '#2f4f4f',
   'hue': 'Brown',
   'percent': 0.19814207650273,
   'spectrum': '#3db657'},
  {'color': '#7d7d4b',
   'css3': '#696969',
   'hue': 'Green',
   'percent': 0.056775956284153,
   'spectrum': '#6cbd45'},
  {'color': '#969664',
   'css3': '#808080',
   'hue': 'Green',
   'percent': 0.043715846994536,
   'spectrum': '#84c441'},
  {'color': '#afaf7d',
   'css3': '#bdb76b',
   'hue': 'Green',
   'percent': 0.042622950819672,
   'spectrum': '#9ecb3b'},
  {'color': '#4b9664',
   'css3': '#2e8b57',
   'hue': 'Green',
   'perc

You can easily access properties from the object record:

In [20]:
topResult['title']

'Self-Portrait Dedicated to Paul Gauguin'

Try adding an additional code cell to the notebook below to access information about Van Gogh. If that's easy, try displaying all HAM works by Van Gogh and filtering to only records with an associated image.

The HAM object API can provide more information (such as `exhibition`, `citation`, `publication`, and `marks`) if you ask for a specific object by its objectid. For some records (often those with `verificationlevel` == 4) the lists for these properties can contain hundreds of entries.

In [27]:
objectid = topResult['objectid']
parameters = {
    "apikey": APIKEY
}
objectUrl = url + "/" + str(objectid) # A single object lives at <endpoint url>/<objectid>
R = requests.get(objectUrl, params=parameters)
topResultFull = R.json()
print(topResultFull['verificationlevel'] == 4)
topResultFull

True


{'accessionmethod': 'Bequest',
 'accessionyear': 1951,
 'accesslevel': 1,
 'century': '19th century',
 'classification': 'Paintings',
 'classificationid': 26,
 'colorcount': 10,
 'colors': [{'color': '#64af7d',
   'css3': '#5f9ea0',
   'hue': 'Green',
   'percent': 0.2979781420765,
   'spectrum': '#4fb94f'},
  {'color': '#64c896',
   'css3': '#66cdaa',
   'hue': 'Green',
   'percent': 0.21289617486339,
   'spectrum': '#47b853'},
  {'color': '#323219',
   'css3': '#2f4f4f',
   'hue': 'Brown',
   'percent': 0.19814207650273,
   'spectrum': '#3db657'},
  {'color': '#7d7d4b',
   'css3': '#696969',
   'hue': 'Green',
   'percent': 0.056775956284153,
   'spectrum': '#6cbd45'},
  {'color': '#969664',
   'css3': '#808080',
   'hue': 'Green',
   'percent': 0.043715846994536,
   'spectrum': '#84c441'},
  {'color': '#afaf7d',
   'css3': '#bdb76b',
   'hue': 'Green',
   'percent': 0.042622950819672,
   'spectrum': '#9ecb3b'},
  {'color': '#4b9664',
   'css3': '#2e8b57',
   'hue': 'Green',
   'perc

When we printed the 10 most popular records, you may have noticed a sharp dropoff after the first few records. Our Van Gogh painting is particularly popular, with over 5,000 more views than the second most popular record and roughly 3.4 times as many as the tenth most popular. This particular Harvard Art Museums record is used as the default image asset for the demo installation of [Project Mirador](http://projectmirador.org/demo/), an image viewer for [IIIF (International Image Interoperability Framework)](https://iiif.io/).
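
To check those figures, here is a quick calculation using the view counts printed earlier for the first, second, and tenth records. The counts are copied by hand from that output, so they reflect the API at the time the notebook was run.

```python
# View counts copied from the top-10 listing printed earlier in this notebook.
top_views = 23715      # Self-Portrait Dedicated to Paul Gauguin
second_views = 18458   # The Gare Saint-Lazare: Arrival of a Train
tenth_views = 6957     # Mother and Child

lead = top_views - second_views
ratio = top_views / tenth_views

print(lead)              # 5257
print(round(ratio, 2))   # 3.41
```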