In [1]:
%matplotlib inline
import matplotlib
import seaborn as sns
sns.set()
matplotlib.rcParams['figure.dpi'] = 144

# Consuming APIs (and JSON)
<!-- requirement: secrets/twitter_secrets.json.sample -->
<!-- devrequirement: secrets/twitter_secrets.json.nogit -->


Consuming APIs is supposed to be easy (that's the point of having an API).  

Let's look at a simple example of consuming a JSON API.  The example we'll look at is a *geocoder*: That is, a service for converting between addresses and normalized geographic information (e.g. latitude and longitude).  Going from addresses to normalized form is "forward geocoding" and going the other way is "reverse geocoding".

We'll interact with a free (and non-authenticated) geocoder run by OpenStreetMap.  The geocoded information is available by sending a GET request to <tt>http:&#8203;//nominatim.openstreetmap.org/search?q=<i>address</i>&addressdetails=1&format=json</tt>.  The portion before the question mark (`http://nominatim.openstreetmap.org/search`) is the endpoint on the server, while the portion following, known as the *query string*, contains the data being sent to the server.  (Thus, a GET request can be repeated simply by requesting the same URL again.  In contrast, the data sent in a POST request is contained in the request body, not in the URL.)
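The split between endpoint and query string can be checked with the standard library. This is a minimal sketch: the URL below follows the shape described above, and the `q=Washington` value is only an illustrative placeholder.

```python
from urllib.parse import urlsplit, parse_qs

# Example URL in the shape described above; the address value is illustrative.
url = "http://nominatim.openstreetmap.org/search?q=Washington&addressdetails=1&format=json"

parts = urlsplit(url)

# Everything before the question mark: scheme, host, and path.
endpoint = f"{parts.scheme}://{parts.netloc}{parts.path}"

# Everything after the question mark, decoded into a dict of lists
# (a key may legally appear more than once in a query string).
params = parse_qs(parts.query)

print(endpoint)
print(params)
```

Note that `parse_qs` returns every value as a list of strings, so `addressdetails` comes back as `['1']`, not the integer `1`.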

As is typical, the query string consists of several key=value pairs, separated by ampersands.  The requested address is specified with the `q` key in this case.  Some characters, such as spaces and commas, cannot appear literally in a URL, so they must be percent-encoded, which is what the `quote()` function does.

In [2]:
from urllib.parse import quote
from urllib.request import urlopen

address = "1600 Pennsylvania Avenue NW, Washington"
quote(address)

'1600%20Pennsylvania%20Avenue%20NW%2C%20Washington'

In [3]:
url = "http://nominatim.openstreetmap.org/search?q={}&addressdetails=1&format=json".format(quote(address))
url

'http://nominatim.openstreetmap.org/search?q=1600%20Pennsylvania%20Avenue%20NW%2C%20Washington&addressdetails=1&format=json'
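As an alternative sketch, `urllib.parse.urlencode` can assemble the whole query string from a dictionary instead of formatting it by hand. By default it encodes spaces as `+`; passing `quote_via=quote` produces the `%20` form used above. The variable names here are illustrative.

```python
from urllib.parse import urlencode, quote

params = {
    "q": "1600 Pennsylvania Avenue NW, Washington",
    "addressdetails": 1,
    "format": "json",
}

# Default behaviour: spaces become '+', commas become '%2C'.
query_plus = urlencode(params)

# With quote_via=quote, spaces become '%20' instead.
query_pct = urlencode(params, quote_via=quote)

url = "http://nominatim.openstreetmap.org/search?" + query_pct
print(query_plus)
print(url)
```

Both encodings are valid inside a query string, so either form can be sent to the server.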

We can request this URL with the `urlopen()` function, which returns a stream we can read from.

In [4]:
data = urlopen(url).read()
data

b'[{"place_id":147370996,"licence":"Data \xc2\xa9 OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright","osm_type":"way","osm_id":238241022,"boundingbox":["38.8974908","38.897911","-77.0368537","-77.0362519"],"lat":"38.897699700000004","lon":"-77.03655315","display_name":"White House, 1600, Pennsylvania Avenue Northwest, Washington, District of Columbia, 20500, United States","class":"office","type":"government","importance":1.05472115416811,"address":{"office":"White House","house_number":"1600","road":"Pennsylvania Avenue Northwest","city":"Washington","state":"District of Columbia","postcode":"20500","country":"United States","country_code":"us"}},{"place_id":149743250,"licence":"Data \xc2\xa9 OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright","osm_type":"way","osm_id":238241023,"boundingbox":["38.8973242","38.8974297","-77.0374621","-77.0373535"],"lat":"38.89737555","lon":"-77.0374079114865","display_name":"The Oval Office, 1600, Pennsylvania Avenue Northwe

The result was returned to us in the form of JSON. JSON (JavaScript Object Notation) is a human-readable, text-based format for transmitting key-value pairs (and strings, numbers, and arrays). The `simplejson` package lets us convert between JSON text and Python's native dictionaries, lists, and so on; the standard library's `json` module offers the same `loads()` interface.

In [5]:
import simplejson as json

json.loads(data)

[{'place_id': 147370996,
  'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright',
  'osm_type': 'way',
  'osm_id': 238241022,
  'boundingbox': ['38.8974908', '38.897911', '-77.0368537', '-77.0362519'],
  'lat': '38.897699700000004',
  'lon': '-77.03655315',
  'display_name': 'White House, 1600, Pennsylvania Avenue Northwest, Washington, District of Columbia, 20500, United States',
  'class': 'office',
  'type': 'government',
  'importance': 1.05472115416811,
  'address': {'office': 'White House',
   'house_number': '1600',
   'road': 'Pennsylvania Avenue Northwest',
   'city': 'Washington',
   'state': 'District of Columbia',
   'postcode': '20500',
   'country': 'United States',
   'country_code': 'us'}},
 {'place_id': 149743250,
  'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright',
  'osm_type': 'way',
  'osm_id': 238241023,
  'boundingbox': ['38.8973242', '38.8974297', '-77.0374621', '-77.0373535'],
  'lat': '38.89737555

In [6]:
json.loads(data)[0]['boundingbox']

['38.8974908', '38.897911', '-77.0368537', '-77.0362519']
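Notice that the coordinates arrive as strings, not numbers. The sketch below parses a trimmed copy of the first result shown above and converts the values to floats. It assumes the bounding box is ordered as (min latitude, max latitude, min longitude, max longitude), which matches the values in this response.

```python
import json

# Trimmed copy of the first result shown above, as JSON text.
raw = (
    '[{"boundingbox": ["38.8974908", "38.897911", "-77.0368537", "-77.0362519"],'
    ' "lat": "38.897699700000004", "lon": "-77.03655315"}]'
)

result = json.loads(raw)[0]

# Assumed order: min latitude, max latitude, min longitude, max longitude.
south, north, west, east = (float(v) for v in result["boundingbox"])
lat, lon = float(result["lat"]), float(result["lon"])

# Sanity check: the reported point should fall inside its own bounding box.
inside = south <= lat <= north and west <= lon <= east
print(inside)
```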

## Handling URL parameters


The `urllib` module requires an enormous amount of work to perform the simplest of tasks. The `requests` library provides a higher-level way to make web requests. This is already nice in examples like the one above, where we need to encode parameters into the URL.  It is even more convenient when `POST` parameters (or cookies, or authentication, or...) are involved.  (Don't worry if you don't know what that means.)

In [7]:
import requests

def geocode(address):
    params = { 'format'        :'json', 
               'addressdetails': 1, #include breakdown of address into elements
               'q'             : address}
    return requests.get('http://nominatim.openstreetmap.org/search', params=params)

response = geocode("107 Page St., San Francisco")
response

<Response [200]>

The parameters are automatically encoded and assembled into the query string.

In [8]:
response.url

'https://nominatim.openstreetmap.org/search?format=json&addressdetails=1&q=107+Page+St.%2C+San+Francisco'
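The encoding step can also be inspected without contacting the server. As a sketch, `requests.Request(...).prepare()` builds the request object, including the final URL, but does not send it.

```python
import requests

params = {
    "format": "json",
    "addressdetails": 1,
    "q": "107 Page St., San Francisco",
}

# Build the request without sending it, so no network access is needed.
prepared = requests.Request(
    "GET", "http://nominatim.openstreetmap.org/search", params=params
).prepare()

print(prepared.url)
```

The prepared URL keeps the `http` scheme that was passed in; the `https` in `response.url` above is the address the request finally ended up at.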

The raw response is available...

In [9]:
response.text

'[{"place_id":266176014,"licence":"Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright","osm_type":"way","osm_id":802591083,"boundingbox":["37.773928208333","37.774028208333","-122.42263933333","-122.42253933333"],"lat":"37.77397820833333","lon":"-122.42258933333333","display_name":"107, Page Street, Western Addition, San Francisco, San Francisco City and County, San Francisco, California, 94102, United States","class":"place","type":"house","importance":0.511,"address":{"house_number":"107","road":"Page Street","neighbourhood":"Western Addition","city":"San Francisco","county":"San Francisco","state":"California","postcode":"94102","country":"United States","country_code":"us"}}]'

...but it can also be parsed from JSON into Python objects.

In [10]:
response.json()

[{'place_id': 266176014,
  'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright',
  'osm_type': 'way',
  'osm_id': 802591083,
  'boundingbox': ['37.773928208333',
   '37.774028208333',
   '-122.42263933333',
   '-122.42253933333'],
  'lat': '37.77397820833333',
  'lon': '-122.42258933333333',
  'display_name': '107, Page Street, Western Addition, San Francisco, San Francisco City and County, San Francisco, California, 94102, United States',
  'class': 'place',
  'type': 'house',
  'importance': 0.511,
  'address': {'house_number': '107',
   'road': 'Page Street',
   'neighbourhood': 'Western Addition',
   'city': 'San Francisco',
   'county': 'San Francisco',
   'state': 'California',
   'postcode': '94102',
   'country': 'United States',
   'country_code': 'us'}}]

In [11]:
response.json()[0]['boundingbox']

['37.773928208333', '37.774028208333', '-122.42263933333', '-122.42253933333']

**Exercise:** The National Weather Service operates a free API for weather information.  A sample request looks like this: `http://forecast.weather.gov/MapClick.php?lat=37.7739&lon=-122.4225&FcstType=json`.

Use the geocoder to write a function

        def weather_at_address(address):
            ....
            
that gets the current weather (temperature, cloudy or not) from a human-entered address.

## Requests: Authenticated APIs


Lots of interesting APIs are free (or at least free for moderate use) but still require you to register first.  The `requests` library (together with some supporting ones, e.g. `requests_oauthlib`) makes it easy to consume these too.

In order to access the Twitter API, you must first sign up: create an app on http://apps.twitter.com, get an access token, and *voilà*, you have your shiny new credentials, consisting of four pieces of data. The file `secrets/twitter_secrets.json.sample` is a template showing the expected format; fill in your credentials, then rename the file to have a `.nogit` extension to prevent it from being tracked in a git repository.

More on the OAuth 1 workflow here: https://requests-oauthlib.readthedocs.io/en/latest/oauth1_workflow.html

In [None]:
from requests_oauthlib import OAuth1

with open("secrets/twitter_secrets.json.nogit") as fh:
    secrets = json.loads(fh.read())

# create an auth object
auth = OAuth1(
    secrets["api_key"],
    secrets["api_secret"],
    secrets["access_token"],
    secrets["access_token_secret"]
)
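A missing or empty credential otherwise only shows up later as a failed request. The sketch below checks the four keys used above right after loading the file. The helper name `load_secrets` and the constant `REQUIRED_KEYS` are hypothetical and not part of `requests_oauthlib`.

```python
import json

# The four credential fields used when building the OAuth1 object above.
REQUIRED_KEYS = ("api_key", "api_secret", "access_token", "access_token_secret")


def load_secrets(path):
    """Load the credentials file and fail early if a key is missing or empty.

    Hypothetical helper for illustration only.
    """
    with open(path) as fh:
        secrets = json.load(fh)
    missing = [key for key in REQUIRED_KEYS if not secrets.get(key)]
    if missing:
        raise KeyError(f"missing credentials: {', '.join(missing)}")
    return secrets
```

It could be used as `secrets = load_secrets("secrets/twitter_secrets.json.nogit")` before creating the auth object.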

## GeoPy

geopy makes it easy for Python developers to locate the coordinates of addresses, cities, countries, and landmarks across the globe using third-party geocoders and other data sources. It includes geocoder classes for OpenStreetMap Nominatim, the Google Geocoding API (V3), and many other geocoding services.

In [None]:
import geopy

To geocode a single address, you can use the geopy Python library. geopy supports many geocoding services that you can choose from, including Google Maps, ArcGIS, AzureMaps, Bing, etc. Some of them require API keys, while others do not. You can see the complete list here: https://geopy.readthedocs.io/en/stable/

In [12]:
from geopy.geocoders import ArcGIS, GoogleV3
import pandas as pd

addrs = ['2155 E Wesley Ave, Denver, CO 80208',
         '8500 Pena Blvd, Denver, CO 80249',
         '1600 Pennsylvania Ave NW, Washington, DC 20500']

nom = ArcGIS()  # ArcGIS geocoder, used here without credentials

# Geocode each address and print the match with its coordinates
for addr in addrs:
    n = nom.geocode(addr)
    print(n)
    print(n.latitude)
    print(n.longitude)

# Build a table of the addresses and their coordinates
df = pd.DataFrame({'Addr': addrs}, index=['DU-ECS', 'DEN', 'WHouse'])
df['Coord'] = df['Addr'].apply(nom.geocode)
df['Lat'] = df['Coord'].apply(lambda x: x.latitude)
df['Lon'] = df['Coord'].apply(lambda x: x.longitude)
df.drop('Coord', axis=1, inplace=True)
print(df)

2155 E Wesley Ave, Denver, Colorado, 80210
39.673844999481844
-104.96145799055763
8500 Pena Blvd, Denver, Colorado, 80249
39.84980499364892
-104.67382996560718
1600 Pennsylvania Ave NW, Washington, District of Columbia, 20500
38.89767510765125
-77.03654699820865
                                                  Addr        Lat         Lon
DU-ECS             2155 E Wesley Ave, Denver, CO 80208  39.673845 -104.961458
DEN                   8500 Pena Blvd, Denver, CO 80249  39.849805 -104.673830
WHouse  1600 Pennsylvania Ave NW, Washington, DC 20500  38.897675  -77.036547

In [13]:
df

                                                  Addr        Lat         Lon
DU-ECS             2155 E Wesley Ave, Denver, CO 80208  39.673845 -104.961458
DEN                   8500 Pena Blvd, Denver, CO 80249  39.849805 -104.673830
WHouse  1600 Pennsylvania Ave NW, Washington, DC 20500  38.897675  -77.036547
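One caveat with the `apply` calls above: geopy's `geocode()` returns `None` when the service finds no match, and `x.latitude` would then raise an `AttributeError`. The sketch below shows one defensive pattern. `FakeLocation` is a hypothetical stand-in for a geopy result so the example runs without network access, and `coords_or_none` is an illustrative helper name.

```python
from collections import namedtuple

# Hypothetical stand-in for a geopy Location; only the two attributes
# used above are modelled.
FakeLocation = namedtuple("FakeLocation", ["latitude", "longitude"])


def coords_or_none(location):
    """Return (latitude, longitude), or (None, None) if nothing was found."""
    if location is None:
        return (None, None)
    return (location.latitude, location.longitude)


# One successful lookup and one address the geocoder could not match.
results = [FakeLocation(39.673845, -104.961458), None]
coords = [coords_or_none(r) for r in results]
print(coords)
```

In the DataFrame version, the same helper could replace the two lambdas so that unmatched addresses end up as missing values instead of stopping the cell.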


In [14]:
import os

key=os.environ.get('GoogleAPI')
geolocator = GoogleV3(api_key=key)

## Authenticated APIs: Run the same with Google API

In [15]:
# Geocode each address with the Google geocoder and print the result
for addr in addrs:
    n = geolocator.geocode(addr)
    print(n)
    print(n.latitude)
    print(n.longitude)

# Build the same table, this time from the Google geocoder
df = pd.DataFrame({'Addr': addrs}, index=['DU-ECS', 'DEN', 'WHouse'])
df['Coord'] = df['Addr'].apply(geolocator.geocode)
df['Lat'] = df['Coord'].apply(lambda x: x.latitude)
df['Lon'] = df['Coord'].apply(lambda x: x.longitude)
df.drop('Coord', axis=1, inplace=True)

2155 E Wesley Ave, Denver, CO 80210, USA
39.6743561
-104.9615286
8500 Peña Blvd, Denver, CO 80249, USA
39.8380991
-104.6706896
1600 Pennsylvania Avenue NW, Washington, DC 20500, USA
38.8976644
-77.037089
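The two services do not return identical coordinates for the same address. As a rough sketch, the haversine formula estimates the distance between the ArcGIS and Google results for the White House address printed above. The function name `haversine_m` and the spherical-Earth radius of 6,371 km are assumptions made for illustration.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_m(lat1, lon1, lat2, lon2, radius_m=6_371_000):
    """Great-circle distance in metres on a sphere (an approximation)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * radius_m * asin(sqrt(a))


# Coordinates printed above for 1600 Pennsylvania Ave NW.
arcgis = (38.89767510765125, -77.03654699820865)
google = (38.8976644, -77.037089)

distance = haversine_m(*arcgis, *google)
print(f"{distance:.0f} m")
```

The difference is a few tens of metres here, which is typical of geocoders placing the same address at slightly different points on the parcel or street.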

#### Doing the reverse with Google geolocator!

In [16]:
point = '51.523910, -0.158578'  # latitude, longitude of the famous Sherlock Holmes Museum
address = geolocator.reverse(point)

In [17]:
address

Location(247 Baker St, London NW1 6AS, UK, (51.5239184, -0.1585922, 0.0))