# Application Programming Interfaces

**ENGSCI233: Computational Techniques and Computer Systems** 

*Department of Engineering Science, University of Auckland*

In [None]:
# imports and environment: this cell must be executed before any other in the notebook
%matplotlib inline
#from apis233 import*  

The purpose of this notebook is to give you a short introduction to Application Programming Interfaces, or APIs. The goal is for you to be able to effectively use an API, **not design one**. Nevertheless, we'll touch on some good API design principles, as these will help you to understand what you're working with.

## 1 API design

***<mark> What makes a good API? </mark>***

At its core, an API is a **set of functions, classes and other tools** provided to a user to help them build a computer program. For instance, in this module we will be using the [Overpass API](http://overpass-turbo.eu/) and its [Python package](https://github.com/mvexel/overpass-api-python-wrapper), which together provide an interface to OpenStreetMap, a massive database of crowdsourced geographic data. The **OpenStreetMap** project comprises about 800 GB of data, so it's obviously **not practical to download** and interact with the database directly. Instead, we use an API to make data requests to a web-hosted repository. 

Without being overly prescriptive, good API design principles include<sup>1</sup>:

- Easy to learn and to use, with good documentation.
- As small as possible, doing a few things well, exposing only what is required (public) and hiding the rest (private).
- [Specification](../quality_control/quality_control.ipynb#2-Specifications) driven. Implementation details just confuse the user; keep the focus on inputs and outputs.
- Like a mini programming language. Names should be self-explanatory.

<sup>1</sup><sub>Taken from *"How to Design a Good API and Why it Matters"*, Joshua Bloch ([slides](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/32713.pdf), [video](https://www.youtube.com/watch?v=heh4OeB9A-c))</sub>

### 1.1 Other interfaces

In an abstract sense, an interface is just a formalised protocol to **make it easy** for users to interact with the more complicated engine. APIs are for when you'd like this to be a **programming interaction** (writing and running statements). **Graphical User Interfaces** (GUIs) are for when you'd prefer to be pointing, clicking buttons, dragging objects, etc.  

## 2 Discovering the Overpass API

***<mark>Learning by doing.</mark>***

There's only so much that can be learned discussing APIs in a conceptual sense. We'll spend the rest of this module focusing on the OpenStreetMap API. The aim is to give you a sense of the scale of problems that can be solved when you have access to these kinds of massive online data repositories. In addition, learning, using, and troubleshooting someone else's code can spark new ways of thinking about your own coding<sup>2</sup>. 

We'll start with an introduction to OpenStreetMap and the web interface. Then, we'll try out the Overpass API, and its Python wrapper. For a nice explainer on how they all link together, take a look [here](https://towardsdatascience.com/loading-data-from-openstreetmap-with-python-and-the-overpass-api-513882a27fd0).

<sup>2</sup><sub>It's also really frustrating, although you should be used to that by now.</sub>

### 2.1 OpenStreetMap

Visit the OpenStreetMap (OSM from now on) web interface here: https://www.openstreetmap.org/#map=16/-36.8608/174.7638

<img src="img/osm1.png" alt="Drawing" style="width: 900px;"/>

***You should see a detailed map of downtown Auckland, with an interface that is quite similar to Google Maps.***

However, this is not actually OSM. OSM is a large database of **Nodes** (points in space, tagged with information), **Ways** (lists of Nodes) and **Relations** (lists of Nodes, Ways and other Relations). What you're looking at in your browser is a web-based GUI that helps you understand and explore the OSM database.
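
One way to picture these three object types is as plain Python dictionaries. The sketch below is only an illustration of the relationships between them, not OSM's actual storage format, and all IDs, coordinates and tags are made up.

```python
# Made-up IDs, coordinates and tags, for illustration only.
nodes = {
    1: {"lat": -36.850, "lon": 174.768, "tags": {"amenity": "pub"}},
    2: {"lat": -36.851, "lon": 174.769, "tags": {}},
    3: {"lat": -36.852, "lon": 174.770, "tags": {}},
}
ways = {
    # a Way is an ordered list of Node IDs
    10: {"nodes": [1, 2, 3], "tags": {"highway": "unclassified"}},
}
relations = {
    # a Relation lists members, which may be Nodes, Ways or other Relations
    100: {"members": [("way", 10), ("node", 1)], "tags": {"route": "bus"}},
}

# A Way has no coordinates of its own: its shape comes from the Nodes it lists.
shape = [(nodes[n]["lat"], nodes[n]["lon"]) for n in ways[10]["nodes"]]
print(shape)
```

The key point is that coordinates live only on Nodes. To draw a Way or a Relation, you have to follow the references down to the Nodes.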

Zooming in to the University of Auckland, I can locate the rather questionable establishment, Shadows. Right-clicking here and choosing `Query features` brings up a list of nearby things, one of which is Pub [Shadows](https://www.openstreetmap.org/node/317654141). Clicking on that, I can bring up a second dialog providing additional information: this is a `Node` object, it is tagged as a [`pub`](https://wiki.openstreetmap.org/wiki/Tag:amenity=pub?uselang=en-US) [amenity](https://wiki.openstreetmap.org/wiki/Key:amenity?uselang=en-US), and its latitude and longitude are given as well.

<img src="img/osm2.png" alt="Drawing" style="width: 900px;"/>

What else can we find? Right-click on nearby Princes Street and query its features. Here, we can see an example of:

- A **Way**: Unclassified Road [Princes Street](https://www.openstreetmap.org/way/53197184#map=19/-36.85091/174.76871). Open this to see the **list of Nodes** defining the Way.
- A **Relation**: Relation [Beach Haven Wharf to Auckland University](https://www.openstreetmap.org/relation/1993410#map=14/-36.8206/174.7266). Open this to see the Auckland Transport 933 Bus Route.

<img src="img/osm3.png" alt="Drawing" style="width: 900px;"/>

Clearly OSM contains a wealth of data, although using the web interface will become cumbersome as we attempt more ambitious projects.

### 2.2 Overpass

For most people, their main interaction with OSM is to request data, i.e., lodging a **Query**. [Overpass](https://wiki.openstreetmap.org/wiki/Overpass_API) is an API optimized for this client query interaction.

As an API, it has its own specialised language, e.g.,

`node["amenity"="pub"](-36.89,174.70,-36.83,174.80); out;`

is a query that returns all nodes tagged `amenity=pub` contained in the box with latitude and longitude bounds `(-36.89,174.70,-36.83,174.80)`, listed in the order south, west, north, east. Semi-colons are used to separate commands<sup>3</sup>. You can read more about Overpass queries [here](). 

<sup>3</sup><sub>I wouldn't do it that way, but the person who designs the API gets to set the rules.</sub>
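
Because a query is just a string, it can be assembled in Python before being pasted into a tool or passed to an API. The sketch below assumes a hypothetical helper named `amenity_query` (it is not part of any Overpass library) and simply reproduces the query shown above from its parts.

```python
def amenity_query(south, west, north, east, amenity="pub"):
    """Build an Overpass query string for nodes with a given amenity tag.

    Hypothetical helper for illustration; the bounds are in degrees and
    follow the order south, west, north, east.
    """
    bbox = "({:.2f},{:.2f},{:.2f},{:.2f})".format(south, west, north, east)
    return 'node["amenity"="{}"]{}; out;'.format(amenity, bbox)


print(amenity_query(-36.89, 174.70, -36.83, 174.80))
```

Formatting the bounds to two decimal places is a choice made here to match the example query; a real application may want more precision.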

#### 2.2.1 Example 1 - Searching Nodes

A good way to test out your Overpass queries is to enter them in [overpass turbo](http://overpass-turbo.eu/), which links your query to a map view. Enter the pub search query above in overpass turbo to locate all the pubs in Central Auckland.

<img src="img/osm4.png" alt="Drawing" style="width: 900px;"/>

Click the **Data** tab on the righthand side to see the corresponding XML file containing your data request.

***Which part of the API command sets the search limits?***

> <mark>*~ your answer here ~*</mark>

***How would you customise search limits for your own problem? (Hint: in Google Maps you can right click and choose "What's here" for lat/lon info.)***

> <mark>*~ your answer here ~*</mark>

***How can you find out which amenities are available for searching?***

> <mark>*~ your answer here ~*</mark>


#### 2.2.2 Example 2 - Searching Relations and using Recursion

Suppose we want to pull data on the Auckland Transport 933 Bus Route. This is a **Relation**, which comprises a list of **Ways**, each of which is a list of **Nodes**. Let's start by pulling just the relation though. In overpass turbo, enter

`relation["network"="AT"]["ref"="933"]; out;`

You will be met with a helpful `Incomplete Data` message that essentially says *"We CAN return the route data, but we CANNOT display the route."* Huh?

For now, choose `show data`. You will be taken to the data screen, which shows the bus route (a Relation) as comprising a series of legs (Ways). 

<img src="img/osm6.png" alt="Drawing" style="width: 900px;"/>

Unfortunately, our request has only returned the **names** of these Ways (under `ref`, a largely meaningless number). To plot the route, what we really need is (1) the Nodes that belong to each Way, and (2) the coordinate data belonging to each Node.

Fortunately, it is reasonably straightforward to modify our request in a way that says *"Keep recursively drilling down through the data - Relations $\rightarrow$ Ways $\rightarrow$ Nodes $\rightarrow$ coordinate data - and return me everything."*
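
To see what "drilling down" means independently of any particular API, here is a toy sketch on in-memory data. The IDs and coordinates are made up, `drill_down` is a hypothetical function, and for simplicity the toy Relation contains only Ways. It says nothing about how Overpass expresses the same idea.

```python
# Toy data with made-up IDs and coordinates.
relations = {100: {"ways": [10, 11]}}
ways = {10: {"nodes": [1, 2]}, 11: {"nodes": [2, 3]}}
nodes = {1: (-36.80, 174.70), 2: (-36.81, 174.71), 3: (-36.82, 174.72)}


def drill_down(relation_id):
    """Collect a relation plus every way and node reachable from it."""
    way_ids = relations[relation_id]["ways"]
    node_ids = []
    for w in way_ids:
        for n in ways[w]["nodes"]:
            if n not in node_ids:      # a node shared by two ways is kept once
                node_ids.append(n)
    return {
        "relation": relation_id,
        "ways": way_ids,
        "nodes": {n: nodes[n] for n in node_ids},
    }


result = drill_down(100)
print(result["ways"])
print(sorted(result["nodes"]))
```

The returned dictionary contains the original Relation *and* everything beneath it, which is the "return me everything" behaviour described above.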

***Implementing Recursion***

The easiest way to see recursion implemented is to choose `repair query` when you get the `Incomplete Data` message from overpass turbo. This will automatically append a 

`(._;>;);`

to your request. Running this query will generate a map of the bus route, and the node locations are now evident in the `Data` tab.

<img src="img/osm7.png" alt="Drawing" style="width: 900px;"/>

So how does `(._;>;);` achieve the desired recursion? This is an idiosyncrasy of this particular API. To understand<sup>4</sup>, you need to start with the [API documentation](https://wiki.openstreetmap.org/wiki/Overpass_API/Language_Guide) and read sections 7 and 8. 

<sup>4</sup><sub>I mean, I could tell you, but *personal struggle* is a much better teacher than I am.</sub>

#### 2.2.3 Scoring this API

The Overpass API provides you, the user, with tools to simplify your interaction with the very large OSM database. **But is it a good API?**

- I personally found parts of it **somewhat easy** to learn (especially with overpass turbo). The documentation is a wiki page with some useful examples. Recursion is unintuitive. 

> <mark>*~ your score out of 5 ~*</mark>

- It exposes only a small part of OSM. The main API for OSM allows upload and editing of information. However, Overpass has a specific purpose, restricted to **data queries only**.

> <mark>*~ your score out of 5 ~*</mark>

- It is specification driven. The user does not need to understand which search algorithm is used (implementation). All they do is provide the parameters of their request (inputs) and then receive the data (outputs).

> <mark>*~ your score out of 5 ~*</mark>

- Names are somewhat self-explanatory. Recursion is not great, e.g., what do `._` and `>;` mean?

> <mark>*~ your score out of 5 ~*</mark>

In any case, someone clearly wasn't completely satisfied with Overpass, as there is a nice Python wrapper<sup>5</sup> we can use to generate our queries.
    
<sup>5</sup><sub>[An API for an API.](https://www.youtube.com/watch?v=G2jUhnCU9iA)</sub>

### 2.3 Overpass Python API

First, you'll need to install the Overpass Python API (also called overpass). We can do this from within this notebook using `pip` (Package Installer for Python). You'll only need to do this once - running the cell below again will tell you that it's already installed. `pip` commands can also be run from the command prompt, just without the `!`.

***Run the cell below to install Python overpass.***

In [None]:
!pip install overpass

Great! Python overpass is ready to go. Let's pull down some data and start exploring.

In [None]:
from overpass import API

api = API()
dat = api.get('node["amenity"="pub"](-36.89,174.70,-36.83,174.80)', responseformat="json")

So now we have an object called `dat`, which presumably contains the same query data we were looking at in overpass turbo. But how should I **discover** the arrangement or structure of these data?

My preference at this stage is to switch over to **Visual Studio Code**, where it's much easier to inspect Python variables and infer their meaning. Run the file `overpass_ex1.py` in **debugging**, setting a breakpoint on line 11. Then, in the lefthand Debug pane, we can inspect the object `dat`.

<img src="img/osm5.png" alt="Drawing" style="width: 900px;"/>

After a little effort, we can infer that the query data is contained in a set of **nested dictionaries and lists**.
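
If a debugger isn't handy, a small recursive helper can print the skeleton of an unfamiliar nested object. The sketch below is one possible approach: `describe` is a hypothetical function, and `mock` is a made-up stand-in with only some of the keys a real response would have.

```python
def describe(obj, depth=0, max_depth=4):
    """Return lines summarising the keys and types inside nested dicts/lists."""
    pad = "  " * depth
    if depth >= max_depth:
        return [pad + "..."]
    lines = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            lines.append("{}{!r}: {}".format(pad, key, type(value).__name__))
            if isinstance(value, (dict, list)):
                lines.extend(describe(value, depth + 1, max_depth))
    elif isinstance(obj, list):
        lines.append("{}list of {} item(s)".format(pad, len(obj)))
        if obj:  # only look inside the first item to keep the summary short
            lines.extend(describe(obj[0], depth + 1, max_depth))
    return lines


# Made-up data for illustration; a real response will differ in detail.
mock = {
    "version": 0.6,
    "generator": "mock",
    "elements": [
        {"type": "node", "id": 1, "lat": -36.85, "lon": 174.77,
         "tags": {"amenity": "pub", "name": "Pub A"}},
    ],
}

print("\n".join(describe(mock)))
```

Only the first item of each list is inspected, so this assumes the items in a list share a similar layout. That is a simplification, and it is worth spot-checking a few other items by hand.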

For instance, the top level contains entries `"version"`, `"generator"`, `"osm3s"` and `"elements"`.

In [None]:
print(dat.keys())

`dat["elements"]` is a list containing a large number of dictionary entries. For instance, the second appears to correspond to `"The Edinburgh Castle"`.

In [None]:
print(len(dat["elements"]))
print(dat["elements"][1])

The location data looks to be contained under the `"lat"` and `"lon"` keys, whereas the other metadata is stored under `"tags"` (another dictionary, whew!).

In [None]:
print(dat["elements"][1]["lat"])
print(dat["elements"][1]["lon"])

print(dat["elements"][1]["tags"])
print(dat["elements"][1]["tags"]["name"])
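
Since OSM data is crowdsourced, an element may not carry every tag you expect. A defensive pattern is to read optional keys with `.get()`, which returns `None` instead of raising a `KeyError`. The sketch below uses a made-up `mock` response and a hypothetical `summarise` function to flatten each element into a `(name, lat, lon)` tuple.

```python
# A made-up response with the same general shape as the one explored above.
mock = {
    "elements": [
        {"type": "node", "id": 1, "lat": -36.85, "lon": 174.77,
         "tags": {"amenity": "pub", "name": "Pub A"}},
        {"type": "node", "id": 2, "lat": -36.86, "lon": 174.76,
         "tags": {"amenity": "pub"}},          # no "name" tag
    ]
}


def summarise(response):
    """Return a (name, lat, lon) tuple for each element; name may be None."""
    rows = []
    for el in response["elements"]:
        name = el.get("tags", {}).get("name")   # None if either key is missing
        rows.append((name, el["lat"], el["lon"]))
    return rows


for row in summarise(mock):
    print(row)
```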

Suppose, however, my query is motivated by a desire for sticky floors and dark corners...

In [None]:
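# search for shads
for el in dat["elements"]:
    # not every node carries a "name" tag, so use .get() to avoid a KeyError
    if el["tags"].get("name") == "Shadows":
        print("... then get yourself to [{:3.2f},{:3.2f}]".format(el["lat"], el["lon"]))
        break
else:
    # the loop finished without finding a match
    print("... no luck, Shadows is not in this search box")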

***- - - - CLASS CODING EXERCISE - - - -***

In [None]:
# PART ONE
# --------
# How many pubs are there in West Auckland?
# *hint* - change the bounding box defining the search
#
# **your code here**

In [None]:
# PART TWO
# --------
# How long is the Beach Haven Wharf to Auckland University bus route?
#
# *having trouble with this exercise in Jupyter Notebooks? do it in Visual Studio Code instead.*

First, we need to pull down some data.

In [None]:
# the command below pulls down the WAY information, but NOT the NODE and COORDINATE data we need
dat = api.get('relation["network"="AT"]["ref"="933"]', responseformat="json")
#print(dat['elements'])   # this print statement optional

# modify the command above with a recursion relation to pull down ALL the data associated with this bus route
# **your code here**
# dat = api.get(

We'll also need a way to measure the distance between two points.
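
For reference, the haversine formula gives the great-circle distance $d$ between two points with latitudes $\phi_0, \phi_1$ and longitudes $\lambda_0, \lambda_1$ (all in radians) on a sphere of radius $R$:

$$d = 2R\,\arcsin\sqrt{\sin^2\left(\frac{\phi_0-\phi_1}{2}\right) + \cos\phi_0\,\cos\phi_1\,\sin^2\left(\frac{\lambda_0-\lambda_1}{2}\right)}$$

The function in the next cell evaluates this expression with $R = 6400$ km, a rounded value for the Earth's radius, after first converting the inputs from degrees to radians.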

In [None]:
# this function computes the distance between two points on the Earth's surface with known coordinates
import numpy as np
def dist(lat0,lon0,lat1,lon1):
    ''' returns the distance in km between two points using the haversine formula
    '''
    lat0,lon0,lat1,lon1 = list(np.array([lat0,lon0,lat1,lon1])/180.*np.pi)  # convert from degrees to radians
    return 2.*6400*np.arcsin(np.sqrt(np.sin((lat0-lat1)/2.)**2+np.cos(lat0)*np.cos(lat1)*np.sin((lon0-lon1)/2.)**2))

# show that the (crow-flies) distance in km between Auckland [-36.87,174.77] and Wellington [-41.30,174.79] is about 495km
# **your code here**


The code below builds a simple look-up so that for a given node, we can get its coordinate location.

In [None]:
# INSPECT the code below. It: 
# - loops through all the elements
# - finds only the nodes
# - stores a list of [id, [lat,lon]] 
# - turns this into a dictionary - id is the key, location is the value

# where does each of the bullet points above occur in the code below?

nd_loc = []
for el in dat["elements"]:
    if el["type"] == "node":
        nd_id = el["id"]
        nd_loc.append((nd_id,[el["lat"], el["lon"]]))
nd_loc = dict(nd_loc)

In [None]:
# a demonstration of how to use this node location dictionary
print(nd_id)             # nd_id is a sample ID
print(nd_loc[nd_id])     # this is the dictionary entry for that ID
lat, lon = nd_loc[nd_id] # unpacking the coordinates and printing them out
print("for example: node {:d} is at [{:3.2f},{:3.2f}]".format(nd_id, lat, lon))

Finally, we have everything we need to process the OSM data and compute the length of the bus route.

In [None]:
# EXERCISE: measure the length of the bus-route
# **to do**
# run the incomplete code below and inspect the content of a WAY type element
for el in dat["elements"]:
    if el["type"] == "way":
        print(el)
        print(el['nodes'])
        break
        
# **to do**
# modify the code to:
# - loop through all the elements
# - find only the ways
# - extract the nodes in that way
# - for each adjoining pair of nodes, extract their coordinate locations
# - compute the distance between those nodes
# - add that distance to a running total
# answer should be about 24km


In [None]:
# ANSWERS, DON'T LOOK IN HERE!
#
#dat = api.get('relation["network"="AT"]["ref"="933"];(._;>;)', responseformat="json")
#
#dist(-36.87,174.77,-41.30,174.79)

'''
dtot = 0.
for el in dat["elements"]:
    if el["type"] == "way":
        nds_id = el['nodes']
        for nd0_id,nd1_id in zip(nds_id[:-1],nds_id[1:]):
            nd0 = nd_loc[nd0_id]
            nd1 = nd_loc[nd1_id]
            d = dist(nd0[0],nd0[1],nd1[0],nd1[1])
            dtot += d
            
print("route 933 is {:3.1f} km long".format(dtot))
'''