# Dictionaries & APIs

This notebook introduces Python dictionaries and provides a walkthrough of using the `requests` library
to retrieve data from a REST API. 

## Contents

* Dictionaries in Python (see Severance, Py4E, "[Dictionaries](https://www.py4e.com/html3/09-dictionaries)")
  * Basic syntax
  * Iteration (for loop)
  * Looking for a key (using `in`)
  * Looking for a value (using `.values()`)
* Using Requests
  * Make HTTP requests from Python
  * Use a dictionary for parameters
  * Make an API call
  * Save to a local file

### Dictionaries

For more details, see Charles Severance, "[Dictionaries](https://www.py4e.com/html3/09-dictionaries)", from Python for Everybody.

In [2]:
goneWiththeWind = dict()

print(goneWiththeWind)

{}


In [3]:
goneWiththeWind['title'] = 'Gone with the Wind'

print(goneWiththeWind)

{'title': 'Gone with the Wind'}


In [4]:
goneWiththeWind['author'] = 'Mitchell, Margaret'

print(goneWiththeWind)

{'title': 'Gone with the Wind', 'author': 'Mitchell, Margaret'}


In [5]:
goneWiththeWind['date'] = 1936

print(goneWiththeWind)

{'title': 'Gone with the Wind', 'author': 'Mitchell, Margaret', 'date': 1936}


We can reference the values by the key: 

In [6]:
print(goneWiththeWind['date'])

1936


You can search for keys using the `in` operator. Note that `in` checks only the keys of a dictionary, not its values.

In [7]:
'title' in goneWiththeWind

True

In [8]:
'publisher' in goneWiththeWind

False

In [9]:
if 'title' in goneWiththeWind:
    print('The book has a title.')

The book has a title.
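
To look for a value rather than a key, one option is to apply `in` to the result of `.values()`. The sketch below rebuilds the same dictionary so that it runs on its own:

```python
# A stand-in for the dictionary built in the cells above
goneWiththeWind = {
    'title': 'Gone with the Wind',
    'author': 'Mitchell, Margaret',
    'date': 1936,
}

# `in` on the dictionary itself only checks the keys
keyCheck = 'Gone with the Wind' in goneWiththeWind
print(keyCheck)

# `in` on .values() checks the values instead
valueCheck = 'Gone with the Wind' in goneWiththeWind.values()
print(valueCheck)
```

The first check prints `False` because the title string is a value, not a key; the second prints `True`.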


#### Activity

Create a book object of your own to test the dictionary concept. You can use the book title as the name of the dictionary.

### Using Iteration with a Dictionary

Iteration is a good tool to help explore the dictionary. 

Use the `values()` method to get the values, which we can use to iterate through the dictionary: 

In [18]:
elements = list(goneWiththeWind.values())

print(elements)

['Gone with the Wind', 'Mitchell, Margaret', 1936]


In [20]:
for element in elements:
    print(element)

Gone with the Wind
Mitchell, Margaret
1936


In [16]:
for element in goneWiththeWind:
    print(element, ':', goneWiththeWind[element])

title : Gone with the Wind
author : Mitchell, Margaret
date : 1936
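
The loop above looks up each value by its key. As an alternative sketch, the `.items()` method yields each key together with its value, so no second lookup is needed. The dictionary is rebuilt here so the example is self-contained:

```python
# A stand-in for the dictionary built in the cells above
goneWiththeWind = {
    'title': 'Gone with the Wind',
    'author': 'Mitchell, Margaret',
    'date': 1936,
}

# .items() yields (key, value) pairs, unpacked into two loop variables
for key, value in goneWiththeWind.items():
    print(key, ':', value)

# The pairs can also be collected into a list of tuples
pairs = list(goneWiththeWind.items())
```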


You can also use the `len()` function. What is it counting? 

In [21]:
len(goneWiththeWind)

3

## Making an API call

Let's use `requests` to retrieve some data from an API endpoint. In this case, we can use the Library of Congress
search function, which is a REST API that responds to HTTP requests.

The documentation for requests can be found here: http://docs.python-requests.org/en/master/ 

The endpoint for the search query is `http://www.loc.gov/search/`

In [22]:
import requests

searchEndpoint = 'http://www.loc.gov/search/'

To pass in the parameters, we can use a dictionary! Let's try passing it to the `params` argument of `requests.get()`.

In [26]:
parameters = {
    'fo' : 'json',
    'q'  : 'kittens',
    'fa' : 'online-format:image'
}

In [30]:
r = requests.get(searchEndpoint, params = parameters)

print('You requested:',r.url)
print('HTTP server response code:',r)
print('HTTP response headers',r.headers)

You requested: https://www.loc.gov/search/?fo=json&q=kittens&fa=online-format%3Aimage
HTTP server response code: <Response [200]>
HTTP response headers {'Date': 'Thu, 28 Mar 2019 20:14:34 GMT', 'Content-Type': 'application/json', 'Content-Length': '18229', 'Connection': 'keep-alive', 'ETag': '"9cbb4e9a37cd156702a53d654557724f"', 'access-control-allow-origin': '*', 'X-Frame-Options': 'allow-from https://unitedstateslibraryofcongress.marketing.adobe.com', 'Expires': 'Fri, 29 Mar 2019 20:10:10 GMT', 'Content-Encoding': 'gzip', 'Accept-Ranges': 'bytes', 'Cache-Control': 'no-transform, max-age=86400', 'Age': '0', 'Expect-CT': 'max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"', 'Server': 'cloudflare', 'CF-RAY': '4bec456bec1623b4-IAD'}
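
To see how the dictionary turns into the part of the URL after the `?`, here is a rough sketch using `urlencode` from the standard library. This is only an illustration of the encoding, not a description of how `requests` builds the URL internally, and it does not contact the server:

```python
from urllib.parse import urlencode

searchEndpoint = 'http://www.loc.gov/search/'

parameters = {
    'fo' : 'json',
    'q'  : 'kittens',
    'fa' : 'online-format:image'
}

# Each key/value pair becomes key=value, joined with '&';
# characters such as ':' are percent-encoded
queryString = urlencode(parameters)
fullUrl = searchEndpoint + '?' + queryString

print(queryString)
print(fullUrl)
```

The query string matches the one in the requested URL printed above, including `%3A` in place of the colon.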


In [29]:
r.text

'{"pagination": {"from": 1, "results": "1 - 25", "last": "https://www.loc.gov/search/?fa=online-format:image&fo=json&q=kittens&sp=4", "total": 127, "previous": null, "perpage": 25, "perpage_options": [25, 50, 100, 150], "of": 3155, "next": "https://www.loc.gov/search/?fa=online-format:image&fo=json&q=kittens&sp=2", "current": 1, "to": 25, "page_list": [{"url": null, "number": 1}, {"url": "https://www.loc.gov/search/?fa=online-format:image&fo=json&q=kittens&sp=2", "number": 2}, {"url": "https://www.loc.gov/search/?fa=online-format:image&fo=json&q=kittens&sp=3", "number": 3}, {"url": "https://www.loc.gov/search/?fa=online-format:image&fo=json&q=kittens&sp=4", "number": "..."}], "first": null}, "timestamp": 1553803810653, "views": {"list": "https://www.loc.gov/search/?fa=online-format:image&fo=json&q=kittens", "brief": "https://www.loc.gov/search/?fa=online-format:image&fo=json&q=kittens&st=brief", "slideshow": "https://www.loc.gov/search/?fa=online-format:image&fo=json&q=kittens&st=slide

In [32]:
type(r.text)

str
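
Because `r.text` is a plain string, it cannot be indexed by key until it is parsed. The sketch below uses a short, hand-written JSON string (a stand-in for the much longer response text, reusing two pagination values shown above) to illustrate the difference. `requests` also offers `r.json()` as a shortcut for the same parsing step.

```python
import json

# A tiny stand-in for r.text: JSON stored as a string
sampleText = '{"pagination": {"total": 127, "perpage": 25}, "results": []}'
print(type(sampleText))

# json.loads() turns the string into Python dictionaries and lists
sampleData = json.loads(sampleText)
print(type(sampleData))

# Now values can be reached by key
print(sampleData['pagination']['total'])
```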

#### API Call question

We made a request to the loc.gov JSON API. Can you fill in the following & explain the missing elements? 

```
http://www.loc.gov/_______/?fo=_______&q=_______
```

What other parameters might you use after the `?`?

### Parsing the Data from the API

Now that we have the response, let's parse it so that we can save it to a file. To do this, use the `json` module.

In [31]:
import json

In [51]:
data = json.loads(r.text)

# what are the keys?
for element in data:
    print(element)

pagination
timestamp
views
facet_views
facets
results
expert_resources
search
breadcrumbs
form_facets
options
facet_trail
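
Once the response is parsed, saving it to a local file might look like the sketch below. The `data` dictionary here is a small stand-in for the parsed response, and the file name `kittens.json` and the temporary folder are assumptions chosen for illustration:

```python
import json
import os
import tempfile

# A small stand-in for the parsed API response
data = {'pagination': {'total': 127, 'perpage': 25}, 'results': []}

# Write into a temporary folder so the example does not clutter your notebook directory
fileName = os.path.join(tempfile.mkdtemp(), 'kittens.json')

# json.dump() writes the dictionary to the open file as JSON text
with open(fileName, 'w', encoding='utf-8') as outFile:
    json.dump(data, outFile, indent=2)

# Read the file back to confirm that nothing was lost
with open(fileName, 'r', encoding='utf-8') as inFile:
    reloaded = json.load(inFile)

print(reloaded == data)
```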


In [47]:
for item in data['results']:
    print(item)

{'access_restricted': False, 'original_format': ['photo, print, drawing'], 'contributor': ['harris & ewing'], 'id': 'http://www.loc.gov/item/2016892679/', 'partof': ['prints and photographs division', 'harris & ewing collection', 'catalog'], 'subject': ['united states', 'glass negatives'], 'index': 1, 'group': ['hec', 'catalog', 'harris-ewing', 'main-catalog'], 'location_country': ['united states.'], 'title': '[Kittens]', 'online_format': ['image'], 'location': ['united states.', 'united states'], 'number_former_id': ['http://www.loc.gov/item/20269443', 'http://www.loc.gov/item/hec2013013458'], 'mime_type': ['image/gif', 'image/jpg', 'image/tif'], 'digitized': True, 'description': ['1 negative : glass ; 4 x 5 in. or smaller'], 'timestamp': '2018-05-03T19:53:23.840Z', 'site': ['pictures', 'catalog'], 'campaigns': [], 'extract_timestamp': '2018-05-03T19:32:52.021Z', 'date': '1923', 'number': ['http://www.loc.gov/item/20269443', 'http://www.loc.gov/item/hec2013013458'], 'other_title': [],

In [48]:
print(len(data['results']))

25


Compare this with the HTML version of the search: that page also shows 25 results!

See https://www.loc.gov/photos/?fa=online-format:image&q=kittens

Is it possible to extract each result into its own file? 

In [65]:
# block testing an extraction of each result into a separate file

data = json.loads(r.text)

# grab the results into a list
kittensList = data['results']
print(len(kittensList))

25
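
One possible way to finish the job is to loop over the list and write each result to its own JSON file. In this sketch, `kittensList` is a two-item stand-in for the real results (only a few keys are included), and the `kitten_01.json` naming pattern is an assumption:

```python
import json
import os
import tempfile

# A short stand-in for data['results']; real results have many more keys
kittensList = [
    {'index': 1, 'title': '[Kittens]', 'url': 'https://www.loc.gov/item/2016892679/'},
    {'index': 2, 'url': 'https://www.loc.gov/item/20002503/'},
]

# Write into a temporary folder; swap in any folder you like
outputDir = tempfile.mkdtemp()
savedFiles = []

# enumerate() supplies a running number to use in each file name
for number, kitten in enumerate(kittensList, start=1):
    fileName = os.path.join(outputDir, 'kitten_{:02d}.json'.format(number))
    with open(fileName, 'w', encoding='utf-8') as outFile:
        json.dump(kitten, outFile, indent=2)
    savedFiles.append(fileName)

print(len(savedFiles), 'files written to', outputDir)
```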


How could we extract the image URLs? Start by listing the keys available on the first result:

In [71]:
for key in kittensList[0]:
    print(key)

access_restricted
original_format
contributor
id
partof
subject
index
group
location_country
title
online_format
location
number_former_id
mime_type
digitized
description
timestamp
site
campaigns
extract_timestamp
date
number
other_title
dates
language
url
shelf_id
hassegments
image_url
aka


In [73]:
for kitten in kittensList:
    print(kitten['url'])

https://www.loc.gov/item/2016892679/
https://www.loc.gov/item/20002503/
https://www.loc.gov/item/sm1880.04831/
https://www.loc.gov/item/2013646722/
https://www.loc.gov/item/2017650796/
https://www.loc.gov/item/sm1883.19433/
https://www.loc.gov/item/20020550/
https://www.loc.gov/item/sm1879.15529/
https://www.loc.gov/item/04028966/
https://www.loc.gov/item/2008660988/
https://www.loc.gov/item/2016796464/
https://www.loc.gov/item/2016816441/
https://www.loc.gov/item/2016817090/
https://www.loc.gov/item/2005681032/
https://www.loc.gov/item/2014717546/
https://www.loc.gov/item/2002706499/
https://www.loc.gov/item/22017222/
https://www.loc.gov/item/sm1882.12947/
https://www.loc.gov/item/90708798/
https://www.loc.gov/item/2002697129/
https://www.loc.gov/item/2003653666/
https://www.loc.gov/item/2003653670/
https://www.loc.gov/item/07028973/
https://www.loc.gov/item/2002697127/
https://www.loc.gov/item/2002697126/
