# Chapter 3: Working with APIs

[Overview of what APIs are]

[Chronicling America](http://chroniclingamerica.loc.gov/about/) is a joint project of the National Endowment for the Humanities and the Library of Congress that [More description.]

The website has a search function, and my search for the term "[slavery](http://chroniclingamerica.loc.gov/search/pages/results/?andtext=slavery)" returned 404,325 results.

<img src="images/ca_slavery_search.png">

This is great, but they make it much better for researchers by providing an API to assist with searching and downloading their archive. [Note about bulk downloads.]

One of the nice things about APIs is that they are often intuitive, or at least interpretable once you see them. For example, to retrieve the first page of search results in an easily digestible format, you append ``&format=json`` to the end of the search URL ``http://chroniclingamerica.loc.gov/search/pages/results/?andtext=slavery``. In your browser, this returns a text file in JSON format.

<img src="images/ca_slavery_api.png">

Thankfully, the programmers have made the variable names understandable. As before, the search found 404,325 results (``"totalItems": 404325,``). The server did not return all of these, however, just 20 of them (``"itemsPerPage": 20``), starting with the first result (``"startIndex": 1,``) and ending with the 20th (``"endIndex": 20,``).
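These metadata fields are enough to work out how many requests a complete download would take. The following is a minimal sketch using the values shown in the response; the function name `pages_needed` is my own and not part of the API.

```python
import math

def pages_needed(total_items, items_per_page):
    """Return how many pages are required to cover every result."""
    return int(math.ceil(total_items / float(items_per_page)))

# Values copied from the JSON response shown above
total_items = 404325
items_per_page = 20

n_pages = pages_needed(total_items, items_per_page)
print(n_pages)
```

The last page would be only partly full, which is why the division is rounded up rather than truncated.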


The complete URL is ``http://chroniclingamerica.loc.gov/search/pages/results/?andtext=slavery&format=json``.


The API is documented at [http://chroniclingamerica.loc.gov/about/api/](http://chroniclingamerica.loc.gov/about/api/).

[``requests``](http://docs.python-requests.org/en/master/) is a useful and commonly used HTTP library for Python. It is not part of the default installation, but it is included with the Anaconda Python distribution.

In [34]:
import requests

It would be possible to pass the API URL and parameters directly to the ``requests`` call. The most likely scenario, however, involves making repeated calls as part of a loop, since the search returned only 20 of the 404,325 results (well under 1%), so I store the strings first.

In [35]:
base_url = 'http://chroniclingamerica.loc.gov/search/pages/results/'
parameters = '?andtext=slavery&format=json'
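As an alternative to typing the query string by hand, the parameters could be kept as key/value pairs and encoded with the standard library. This is a sketch in Python 3 (in Python 2, `urlencode` lives in `urllib` instead). The helper `build_search_url` is my own name, and the `page` parameter for later pages is an assumption that should be checked against the API documentation before relying on it.

```python
from urllib.parse import urlencode

base_url = 'http://chroniclingamerica.loc.gov/search/pages/results/'

def build_search_url(term, page=None):
    """Assemble a search URL that asks for JSON output."""
    params = [('andtext', term), ('format', 'json')]
    if page is not None:
        # Assumed paging parameter; verify against the API documentation
        params.append(('page', page))
    return base_url + '?' + urlencode(params)

first_url = build_search_url('slavery')
print(first_url)
print(build_search_url('slavery', page=2))
```

Encoding the parameters this way also takes care of characters such as spaces in a search term. `requests.get()` accepts a `params` argument that performs the same encoding.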

`requests.get()` is used for accessing both websites and APIs. The command can be modified by several arguments, but at a minimum, it requires the URL.

In [36]:
r = requests.get(base_url + parameters)

`r` is a `requests` response object. Any JSON returned by the server can be parsed into Python objects by calling `.json()`.

In [37]:
search_json = r.json()
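Conceptually, `.json()` parses the text of the response body into Python objects, much as the standard library's `json.loads` would. The sketch below illustrates the idea with a hand-written string that mimics the metadata fields shown earlier; the empty `items` list is a stand-in, not real output from the server.

```python
import json

# Hand-written stand-in for the response body (items omitted)
sample_text = ('{"totalItems": 404325, "endIndex": 20, "startIndex": 1, '
               '"itemsPerPage": 20, "items": []}')

# Parse the JSON text into ordinary Python objects
sample_json = json.loads(sample_text)

print(type(sample_json))
print(sample_json['totalItems'])
```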

The parsed JSON is a dictionary-like object, in that it has keys (think variable names) and values. `.keys()` returns a list of the keys.

In [39]:
print search_json.keys()

[u'totalItems', u'endIndex', u'startIndex', u'itemsPerPage', u'items']


You can retrieve the value of any key by putting the key name in brackets.

In [41]:
search_json['totalItems']

404325

As is often the case with results from an API, most of the keys and values are metadata about either the search or what is being returned. These are useful for knowing if the search is returning what you want, which is particularly important when you are making multiple calls to the API.
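One way to use this metadata when making repeated calls is to check, after each response, whether any results remain beyond the current page. A minimal sketch, with helper names of my own choosing and the metadata values from the first page of the search:

```python
def has_more_results(meta):
    """Return True if results remain beyond the current page."""
    return meta['endIndex'] < meta['totalItems']

def next_start_index(meta):
    """Return the index at which the following page should begin."""
    return meta['endIndex'] + 1

# Metadata values from the first page of the search
first_page = {'totalItems': 404325, 'startIndex': 1,
              'endIndex': 20, 'itemsPerPage': 20}

print(has_more_results(first_page))
print(next_start_index(first_page))
```

Comparing `startIndex` on each new response against the value expected from the previous one is a cheap way to notice a skipped or repeated page.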

The data I'm interested in are all in `items`.

In [43]:
print type(search_json['items'])
print len(search_json['items'])

<type 'list'>
20


`items` is a list with 20 items.

In [64]:
print type(search_json['items'][0])
print type(search_json['items'][19])

<type 'dict'>
<type 'dict'>


Each of the 20 items in the list is a dictionary. 

In [65]:
first_item =  search_json['items'][0]

print first_item.keys()

[u'sequence', u'county', u'edition', u'frequency', u'id', u'section_label', u'city', u'date', u'title', u'end_year', u'note', u'state', u'subject', u'type', u'place_of_publication', u'start_year', u'edition_label', u'publisher', u'language', u'alt_title', u'lccn', u'country', u'ocr_eng', u'batch', u'title_normal', u'url', u'place', u'page']


While a standard CSV file has a header row that describes the contents of each column, a JSON file has keys identifying the values found in each case. Importantly, these keys need not be the same for each item. Additionally, values don't have to be numbers or strings, but could be lists or dictionaries. For example, this JSON could have included a `newspaper` key whose value was a dictionary with all the metadata about the newspaper in which the article and issue were published, an `article` key that included the article-specific information as another dictionary, and a `text` key whose value was a string with the article text.
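To make the nested alternative concrete, here is a sketch of what such a record might look like and how its values would be reached. The structure is hypothetical (the API returns a flat dictionary for each item), and the field values are copied from the first search result purely for illustration.

```python
# Hypothetical nested record; the real API returns a flat dictionary instead
nested_item = {
    'newspaper': {'title': 'Anti-slavery bugle. volume',
                  'state': ['Ohio', 'Ohio']},
    'article': {'date': '18490316', 'sequence': 1},
    'text': 'placeholder for the article text',
}

# Chain brackets to reach values inside nested dictionaries
print(nested_item['newspaper']['title'])

# Values can be lists, which are indexed by position
print(nested_item['newspaper']['state'][0])

# .get() returns a default instead of raising KeyError when a key is absent
print(nested_item['article'].get('page', 'no page number'))
```

Because keys can differ from item to item, `.get()` with a default is a convenient guard when looping over many records.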

As before, we can examine the contents of a particular item, such as the publication's `title`.

In [68]:
print first_item['title']

Anti-slavery bugle. volume
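The same lookup can be repeated across every item in the list. The sketch below uses three hand-copied records (title and date only) in place of the full `search_json['items']` list, and assumes the `date` value is a string in YYYYMMDD form, as it appears in the results.

```python
from datetime import datetime

# Trimmed stand-ins for three of the returned items
sample_items = [
    {'title': 'Anti-slavery bugle. volume', 'date': '18490316'},
    {'title': 'The day book.', 'date': '19140516'},
    {'title': 'The national era.', 'date': '18540511'},
]

# Collect one field from every item
titles = [item['title'] for item in sample_items]

# Convert the compact date strings into date objects
dates = [datetime.strptime(item['date'], '%Y%m%d').date()
         for item in sample_items]

print(titles)
print(dates[0].isoformat())
```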


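Because `items` is a list of dictionaries, it can be passed directly to `pandas` to create a DataFrame with one row per item and one column per key.

In [54]: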
import pandas as pd

# Make sure all columns are displayed
pd.set_option("display.max_columns",101)

In [55]:
pd.DataFrame(search_json['items'])

Unnamed: 0,alt_title,batch,city,country,county,date,edition,edition_label,end_year,frequency,id,language,lccn,note,ocr_eng,page,place,place_of_publication,publisher,section_label,sequence,start_year,state,subject,title,title_normal,type,url
0,[],batch_ohi_ariel_ver02,"[New Lisbon, Salem]",Ohio,"[Columbiana, Columbiana]",18490316,,,1861,Weekly,/lccn/sn83035487/1849-03-16/ed-1/seq-1/,[English],sn83035487,[Archived issues are available in digital form...,"LAVE\nam\nJlile\nVOL. 4. NO. 30.\nSALEM. OHIO,...",,"[Ohio--Columbiana--New Lisbon, Ohio--Columbian...","New-Lisbon, Ohio",Ohio American Antislavery Society,,1,1845,"[Ohio, Ohio]",[Antislavery movements--United States--Newspap...,Anti-slavery bugle. volume,anti-slavery bugle.,page,http://chroniclingamerica.loc.gov/lccn/sn83035...
1,[],batch_iune_golf_ver01,[Chicago],Illinois,[Cook County],19140516,,NOON EDITION,1917,Daily (except Sunday and holidays),/lccn/sn83045487/1914-05-16/ed-1/seq-10/,[English],sn83045487,"[""An adless daily newspaper."", Archived issues...",r\nmmmmmmmmmmmmmmmmmmmmmmmm\n'SLAVERY RIFE IN ...,,[Illinois--Cook County--Chicago],"Chicago, Ill.",N.D. Cochran,,10,1911,[Illinois],"[Chicago (Ill.)--Newspapers., Illinois--Chicag...",The day book.,day book.,page,http://chroniclingamerica.loc.gov/lccn/sn83045...
2,[],batch_iune_india_ver01,[Chicago],Illinois,[Cook County],19161109,,EXTRA,1917,Daily (except Sunday and holidays),/lccn/sn83045487/1916-11-09/ed-1/seq-26/,[English],sn83045487,"[""An adless daily newspaper."", Archived issues...",us remaining whites if we expect to\nstay on t...,,[Illinois--Cook County--Chicago],"Chicago, Ill.",N.D. Cochran,,26,1911,[Illinois],"[Chicago (Ill.)--Newspapers., Illinois--Chicag...",The day book.,day book.,page,http://chroniclingamerica.loc.gov/lccn/sn83045...
3,[],batch_iune_golf_ver01,[Chicago],Illinois,[Cook County],19150327,,NOON EDITION,1917,Daily (except Sunday and holidays),/lccn/sn83045487/1915-03-27/ed-1/seq-24/,[English],sn83045487,"[""An adless daily newspaper."", Archived issues...",THOUSANDS OF VEILED WOMEN OF TURKISH\nHAREM ON...,,[Illinois--Cook County--Chicago],"Chicago, Ill.",N.D. Cochran,,24,1911,[Illinois],"[Chicago (Ill.)--Newspapers., Illinois--Chicag...",The day book.,day book.,page,http://chroniclingamerica.loc.gov/lccn/sn83045...
4,[],batch_iune_foxtrot_ver01,[Chicago],Illinois,[Cook County],19130815,,,1917,Daily (except Sunday and holidays),/lccn/sn83045487/1913-08-15/ed-1/seq-5/,[English],sn83045487,"[""An adless daily newspaper."", Archived issues...",LOLA NORRiajQlVS SiENSAT-iPN AL t EVIDENCE IN ...,,[Illinois--Cook County--Chicago],"Chicago, Ill.",N.D. Cochran,,5,1911,[Illinois],"[Chicago (Ill.)--Newspapers., Illinois--Chicag...",The day book.,day book.,page,http://chroniclingamerica.loc.gov/lccn/sn83045...
5,[],batch_iune_foxtrot_ver01,[Chicago],Illinois,[Cook County],19130308,,NOON EDITION,1917,Daily (except Sunday and holidays),/lccn/sn83045487/1913-03-08/ed-1/seq-6/,[English],sn83045487,"[""An adless daily newspaper."", Archived issues...",that every possible weakness in. a\ngirl as &e...,,[Illinois--Cook County--Chicago],"Chicago, Ill.",N.D. Cochran,,6,1911,[Illinois],"[Chicago (Ill.)--Newspapers., Illinois--Chicag...",The day book.,day book.,page,http://chroniclingamerica.loc.gov/lccn/sn83045...
6,[],batch_iune_foxtrot_ver01,[Chicago],Illinois,[Cook County],19130424,,,1917,Daily (except Sunday and holidays),/lccn/sn83045487/1913-04-24/ed-1/seq-13/,[English],sn83045487,"[""An adless daily newspaper."", Archived issues...",mpICFED FOR WHITE -SLAVERY.\nTop Lola Norris-a...,,[Illinois--Cook County--Chicago],"Chicago, Ill.",N.D. Cochran,,13,1911,[Illinois],"[Chicago (Ill.)--Newspapers., Illinois--Chicag...",The day book.,day book.,page,http://chroniclingamerica.loc.gov/lccn/sn83045...
7,[],batch_dlc_elf_ver03,[Washington],District of Columbia,[None],18540511,,,1860,Weekly,/lccn/sn84026752/1854-05-11/ed-1/seq-1/,[English],sn84026752,[Also issued on microfilm by University Microf...,I IiiLMI or SUBSCRimOM\nI T. \ .. &m is publis...,,[District of Columbia--Washington],Washington [D.C.],L.P. Noble,,1,1847,[District of Columbia],[African Americans--Washington (D.C.)--Newspap...,The national era.,national era.,page,http://chroniclingamerica.loc.gov/lccn/sn84026...
8,[],batch_iune_foxtrot_ver01,[Chicago],Illinois,[Cook County],19130225,,,1917,Daily (except Sunday and holidays),/lccn/sn83045487/1913-02-25/ed-1/seq-30/,[English],sn83045487,"[""An adless daily newspaper."", Archived issues...",we are doing what the American\nmen did Avay b...,,[Illinois--Cook County--Chicago],"Chicago, Ill.",N.D. Cochran,,30,1911,[Illinois],"[Chicago (Ill.)--Newspapers., Illinois--Chicag...",The day book.,day book.,page,http://chroniclingamerica.loc.gov/lccn/sn83045...
9,[],batch_dlc_elf_ver03,[Washington],District of Columbia,[None],18540511,,,1860,Weekly,/lccn/sn84026752/1854-05-11/ed-1/seq-4/,[English],sn84026752,[Also issued on microfilm by University Microf...,f\nI 76\n[COXTIHCED PBOM KIMT PAGE.]\nour fath...,76,[District of Columbia--Washington],Washington [D.C.],L.P. Noble,,4,1847,[District of Columbia],[African Americans--Washington (D.C.)--Newspap...,The national era.,national era.,page,http://chroniclingamerica.loc.gov/lccn/sn84026...
