# Interacting with APIs in Python

By this point you know what a dictionary is, how to loop over a list, and so on.

In this section, we're going to build up to doing more complex, interesting things by leveraging already-existing services through APIs.

What is an API, exactly? Actually, even before we get to that, *why* APIs? What problem do they solve?

## 0. Why APIs?

Consider the following scenarios:

1. 🌏We have a database of several thousand places which we'd like to display on a map, but we only have their addresses. In order to display them on the map, we need their latitude/longitude values. (For the uninitiated, this process is known as *geocoding*.)
2. 🇺🇸With the goal of analyzing sentiment or perhaps the relative frequency of descriptors of various candidates, we would like to find all New York Times articles relating to the 2020 election. Perhaps we'd additionally like to collect relevant tweets about the candidates.

In either case, we can imagine what the naïve solution might look like, done by hand:

1. 🌏Fire up a browser, open Google Maps, type in the first address, copy/paste the lat/lng into the database (i.e. spreadsheet); move on to the second address and repeat the process, and so on until we're done.
2. 🇺🇸Navigate to the New York Times website and enter "2020 election" in the search bar, click on each article in the search results, determine that the article is in fact relevant, save the article somewhere, and move on to the next one, and so on until we're pretty sure we've got all relevant articles.

**Gross.** Nobody wants to do this. It looks like we've lined up some seriously painstaking work for ourselves. There must be a way to automate this stuff...

Enter the **Application Programming Interface (API)**. If someone (or some corporate entity---a needless distinction, as corporations are people, after all) has already created a service that solves our underlying need, e.g. Google Maps or the NYTimes search bar, then there's a good chance in this day and age that that service offers an API, which is essentially a tightly controlled, programmatic way to interact with the service.

### Sneak Peek

Here's what we're working up to. Try these out!

#### 🌏Geocoding

The file `data/addresses.csv` contains a handful of addresses. (Feel free to open the file and inspect it.) The code below gets lat/lngs for all of them:

In [None]:
from lib import geocoder

results = geocoder.census_geocode_csv('data/addresses.csv')
print(results)

#### Other thing

### Benefits

* Automating away the tedium of doing the same thing many times.
* Ability to build "on top of" someone else's API in order to focus on a specific thing we want to accomplish.

For that second point, consider e.g. building a real-time logistics app that displays the current location of our fleet of vehicles; all we have are the real-time GPS coordinates, so we use another service (e.g. Mapbox or Google/Apple/Bing Maps) for displaying the underlying roads and geographic data.

Hopefully by this point you're convinced of the potential utility of APIs. Let's get started!

## 1. Unauthenticated API: Geocoding 🌏

Suppose we have a set of addresses, and we need the latitude/longitude pairs.

We *could* use the Google Maps Geocoding API for this, but the service requires you to set up a billing account with a valid credit card before you can use the API, and I'd rather not deal with that right now.

So instead, let's use the [US Census Geocoding API](https://geocoding.geo.census.gov/geocoder/Geocoding_Services_API.html). The results won't be quite as high-quality (interpolated vs. rooftop geocodes), but the service is public and free, and illustrates the point.

**🖊Note:** This API is called *unauthenticated* because you don't have to tell it who you are in order to use it.

### 1.1. A first pass

At the end of the day, using this API boils down to figuring out the right URL to send an HTTP request to, and what parameters to give it. (Actually, this is pretty much how *all* HTTP-based APIs work.) Let's take the address of Huang Engineering Center as an example: `475 Via Ortega, Stanford, CA 94305`. After reading the instructions on how to use the census geocoding API, we determine that one way to make this request is:

https://geocoding.geo.census.gov/geocoder/locations/onelineaddress?benchmark=Public_AR_Current&format=json&address=475+Via+Ortega,Stanford,CA

🏋🏽‍♀️**Open the link above in the browser to see what the result looks like. Do you see the latitude/longitude buried in there?** 🏋🏽‍♀️

Next, we'll make this request programmatically.

(We won't talk about the `benchmark` argument for the purposes of this tutorial, but we'll touch on the `format` argument later. The `address` is the interesting thing here.)
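To make the anatomy of that URL concrete, here is a small sketch that splits it into the endpoint and its parameters using only the standard library's `urllib.parse`. It makes no network request; it only shows which part is the address of the service and which parts are the arguments we hand to it.

```python
from urllib.parse import urlsplit, parse_qs

url = ('https://geocoding.geo.census.gov/geocoder/locations/onelineaddress'
       '?benchmark=Public_AR_Current&format=json'
       '&address=475+Via+Ortega,Stanford,CA')

parts = urlsplit(url)

# Everything before the "?" identifies the endpoint we are talking to.
endpoint = f'{parts.scheme}://{parts.netloc}{parts.path}'

# Everything after the "?" is the query string: key=value pairs joined by "&".
# parse_qs decodes it (a "+" stands for a space) and maps each key to a list of values.
params = parse_qs(parts.query)

print(endpoint)
print(params)
```

Each value comes back as a list because, in general, a key may appear more than once in a query string.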

To make HTTP requests from Python, we'll use the `requests` library ([docs](https://2.python-requests.org/en/master/)), which is not part of the standard library but makes web requests very simple. (This will come in handy when scraping later.) Let's install that now:

In [None]:
import sys

!{sys.executable} -m pip install requests

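Don't worry if that was confusing or magical. In short, `sys.executable` is the path to the Python interpreter running this notebook, so the command installs `requests` into that same environment.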

Now, we need to import it:

In [None]:
import requests

Check out what you can do with requests:

In [None]:
requests.get('http://example.com').text

That's a bona fide HTML webpage!

🏋🏽‍♀️**Your turn: Use the `requests` library to fetch a webpage. Pretty much any webpage will do. Print the `text` attribute of the response.** 🏋🏽‍♀️ Use the cell below.

Okay, now let's use our newfound skills with the `requests` library to do the same thing, but apply it to the geocoding example that we did in the browser:

In [None]:
response = requests.get(
  'https://geocoding.geo.census.gov/geocoder/locations/onelineaddress?'
  'benchmark=Public_AR_Current&format=json&address=475+Via+Ortega,Stanford,CA')
response, response.text

Well, HTTP 200 means "success", but how do we get the data we want out of this response object?

**Protip:** To find out what attributes and methods are available on an object, call the `dir()` builtin on it.

🏋🏽‍♀️**Your turn: Use `dir` to find out what attributes and methods are available on `response`.** 🏋🏽‍♀️

In [None]:
# START
dir(response)
# END

You may have noticed, among the numerous items printed, entries including `text` and `json`. We've seen the `text` already:

In [None]:
response.text

**This is a [JSON](http://json.org)-formatted string.** The details of JSON aren't important for this lesson. Suffice it to say that if someone (e.g. an API) hands you data in JSON form, you can parse it in almost any language (e.g. with `json.loads` in Python), and it becomes an object native to that language.

*But how do we know this is JSON?* Well, we asked for it this way; the original parameters included `format=json`.

JSON is an extremely common response format in APIs, so it's worth getting familiar with it.
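As a minimal illustration of that parsing step, the sketch below feeds a JSON string to `json.loads` from the standard library. The string is made up for demonstration purposes (it is not output from the geocoding API); the point is only how JSON values map onto Python types.

```python
import json

# A made-up JSON string, purely for illustration.
sample_text = '''
{
  "name": "Huang Engineering Center",
  "city": "Stanford",
  "tags": ["campus", "engineering"],
  "geocoded": false,
  "notes": null
}
'''

data = json.loads(sample_text)  # JSON text -> native Python objects

print(type(data))          # a JSON object becomes a dict
print(type(data['tags']))  # a JSON array becomes a list
print(data['geocoded'])    # JSON false becomes Python False
print(data['notes'])       # JSON null becomes Python None
```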

Maybe the `json` method is useful to us? Let's see:

In [None]:
help(response.json)

Hmm, seems promising. Let's try it out:

In [None]:
response.json()

Whoa, hey, that's just a Python dictionary, which you all know and love by now! Here, I'll prove it to you:

In [None]:
type(response.json())

Notice that the lat/lng pair we're interested in is buried a few levels deep.

🏋🏽‍♀️**Your turn: Extract the lat/lng coordinates from `response.json()`.**🏋🏽‍♀️

In [None]:
# START
response.json()['result']['addressMatches'][0]['coordinates']
# END

**Q: What happens if there are no matching addresses (i.e. no geocode results)?**
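One way to explore that question without hitting the API: indexing `[0]` into an empty list raises an `IndexError`. The sketch below assumes that a failed lookup comes back with an empty `addressMatches` list (worth confirming against a real response), and uses a hypothetical helper, `extract_coordinates`, that returns `None` in that case instead of crashing. Both payloads are hand-written stand-ins, not real API output.

```python
def extract_coordinates(payload):
  """Return the first match's coordinates, or None if nothing matched.

  `payload` is the parsed JSON (a dict). Hypothetical helper for illustration.
  """
  matches = payload.get('result', {}).get('addressMatches', [])
  if not matches:
    return None
  return matches[0]['coordinates']


# Hand-written stand-ins for parsed responses (not real API output).
found = {'result': {'addressMatches': [{'coordinates': {'x': -122.17598, 'y': 37.428837}}]}}
not_found = {'result': {'addressMatches': []}}

print(extract_coordinates(found))
print(extract_coordinates(not_found))

# The direct approach fails on the empty case:
try:
  not_found['result']['addressMatches'][0]['coordinates']
except IndexError as error:
  print('IndexError:', error)
```

Returning `None` is just one possible design; raising a more descriptive exception would work as well, depending on how the caller wants to treat misses.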

### 1.2. Tightening things up

Let's go back and take another look at that initial request:

https://geocoding.geo.census.gov/geocoder/locations/onelineaddress?benchmark=Public_AR_Current&address=475+Via+Ortega,Stanford,CA&format=json

The `requests` library provides a way to break this up in a more semantically meaningful way, which we'll see comes in handy later. Everything after the `?` in the URL is the *query string*, which holds the parameters. These can be passed to `requests.get` as a dictionary via the `params` argument instead of being pasted into the URL. The params usually contain the interesting part of the API request, so it's handy to have them better structured, like so:

In [None]:
requests.get(
  'https://geocoding.geo.census.gov/geocoder/locations/onelineaddress',
  params={
    'address': '475 Via Ortega, Stanford, CA',
    'benchmark': 'Public_AR_Current',
    'format': 'json',
  }).json()

Looks nicer, right?
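If you're curious what happens to that dictionary: it gets encoded back into a query string and appended to the URL. The sketch below approximates that step with the standard library's `urlencode`. It is a stand-in for illustration, not a claim about the exact internals of `requests`.

```python
from urllib.parse import urlencode

base_url = 'https://geocoding.geo.census.gov/geocoder/locations/onelineaddress'
params = {
  'address': '475 Via Ortega, Stanford, CA',
  'benchmark': 'Public_AR_Current',
  'format': 'json',
}

# urlencode escapes characters that aren't safe in a URL:
# spaces become "+" and commas become "%2C".
query_string = urlencode(params)
full_url = base_url + '?' + query_string

print(full_url)
```

Notice that we get to write the address with ordinary spaces and commas, and the escaping is handled for us. After a real call with `requests`, the `url` attribute of the response shows the URL that was actually requested.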

We can do even better, though. Up to this point, we've been using the `onelineaddress` search type. After perusing the docs further, we decide it might be more robust if we were to use the `address` search type, which takes a *structured* address. This is a way of being more explicit about what we are looking for, rather than leaving interpretation of the text string up to the API. So let's do the same search again, but in a structured manner, using the `address` endpoint:

In [None]:
requests.get(
  'https://geocoding.geo.census.gov/geocoder/locations/address',
  params={
    'street': '475 Via Ortega',
    'city': 'Stanford',
    'state': 'CA',
    'zip': '94305',
    'benchmark': 'Public_AR_Current',
    'format': 'json',
  }).json()

### 🏋🏽‍♀️Exercise 🏋🏽‍♀️

Okay, we've been using this API enough that some of the steps are getting repetitive, so it's probably worth putting this all in a function that accepts only the pieces that are changing and grabs just the coordinates from the response.

**Complete the function in the cell below** such that I can call

`geocode({'street': '475 Via Ortega', 'city': 'Stanford', 'state': 'CA', 'zip': '94305'})`

and get back the Python dictionary `{'lat': 37.428837, 'lng': -122.17598}`.

In [None]:
CENSUS_GEOCODE_URL = 'https://geocoding.geo.census.gov/geocoder/locations/address'

def geocode(address):
  params = {
    'benchmark': 'Public_AR_Current',
    'format': 'json',
  }
  # - add pieces of the address to params
  # - make the request to the geocoding API
  # - extract the coordinates from the response JSON
  # - make sure to rename x and y to lng and lat!
  # START
  params.update(address)
  response = requests.get(CENSUS_GEOCODE_URL, params=params)
  coords = response.json()['result']['addressMatches'][0]['coordinates']
  return {'lat': coords['y'], 'lng': coords['x']}
  # END

**Now try it out!** Does the function return the expected result?

In [None]:
result = geocode({'street': '475 Via Ortega', 'city': 'Stanford', 'state': 'CA', 'zip': '94305'})
assert result == {'lat': 37.428837, 'lng': -122.17598}
print('🎉Success!🎉')

**Try geocoding another address of your choosing (within the US).**

If you're really lacking for inspiration, you can geocode the Googleplex, `1600 Amphitheatre Pkwy, Mountain View, CA 94043`.

In [None]:
# START
geocode({'street': '1600 Amphitheatre Pkwy', 'city': 'Mountain View', 'state': 'CA', 'zip': '94043'})
# END

**Breathe.** Do you realize how powerful it is, what we just did there?
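To tie this back to the first benefit (automating repetition), here's a rough sketch of how a function like `geocode` could be applied to a whole list of addresses. `geocode_all` is a hypothetical helper, not part of `lib.geocoder`. It takes the geocoding function as an argument so that it can be tried out with a fake, and it assumes a failed lookup surfaces as an `IndexError` or `KeyError`, as it would with the `geocode` function written above. The pause between calls is a courtesy to the server, and its length is an arbitrary choice.

```python
import time

def geocode_all(addresses, geocode_fn, pause_seconds=0.0):
  """Geocode each address in turn, recording None for lookups that fail."""
  results = []
  for address in addresses:
    try:
      coords = geocode_fn(address)
    except (IndexError, KeyError):
      # No match, or a response that wasn't shaped the way we expected.
      coords = None
    results.append({'address': address, 'coords': coords})
    time.sleep(pause_seconds)
  return results


# A fake geocoder so the sketch runs without network access.
def fake_geocode(address):
  known = {'Stanford': {'lat': 37.428837, 'lng': -122.17598}}
  return known[address['city']]  # raises KeyError for unknown cities

addresses = [
  {'street': '475 Via Ortega', 'city': 'Stanford', 'state': 'CA', 'zip': '94305'},
  {'street': '1 Nowhere Ln', 'city': 'Atlantis', 'state': 'CA', 'zip': '00000'},
]
for row in geocode_all(addresses, fake_geocode):
  print(row['address']['city'], row['coords'])
```

With the real function, the call might look like `geocode_all(addresses, geocode, pause_seconds=0.5)`.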

The second example is in `apis_two.ipynb`.