# CYPLAN255
### Urban Informatics and Visualization

# Lecture 11 -- APIs <img src="https://i.imgur.com/wNMULZP.jpg" width=700 align='right' title="Man Standing in the Lumberyard of Seattle Cedar Lumber Manufacturing, Alfred Eisenstaedt (1939)">
******
March 4, 2024

# Agenda
1. Announcements
2. Intro to APIs
3. Using APIs with Python
4. Reminder: How to use Conda environments
5. For next time
6. Questions


# 1. Announcements

- Final Project released tonight
- Assignment 4 (project proposal + initial analysis) due March 10
- GitHub Pages tutorial

# 2. Intro to APIs

- What's an API?
- Examples
- Why are APIs useful?
- Types of APIs and API responses

## 2.1. What's an API?

- **A**pplication
  - software, product, or service
- **P**rogramming
  - we're going to be writing code
- **I**nterface
  - a point where two systems, subjects, organizations, etc. meet and interact

APIs are _transactional_, kind of like ordering food at a restaurant:
 1. client requests an item from the menu
 2. waiter takes the order and tells the cook what to make
 3. cook prepares the item and gives it to the waiter
 4. waiter brings the client the item they ordered


In this analogy, the **waiter** + **menu** constitute the **API**.

## 2.2 Examples

<center><img src="images/api1.png" width=75%></center>

<center><img src="images/api2.png" width=75%></center>

<center><img src="images/api3.png" width=75%></center>

Yes, even pandas is an API!

## 2.3 Why are APIs useful? i.e. why do companies publish them?

- Standardizes the access points to a service or piece of software
  - No ordering "off the menu"
- Allows proprietary details to remain private
  - The ingredients in the chef's secret sauce are never revealed
- The implementation details do not matter to the client
  - Customer doesn't need or want to be able to cook the dish themselves

## 2.4 Types of APIs

1. APIs in a programming language
  - Functions: `my_function()`
  - Arguments/parameters: `my_function(args=x)`
  - Function returns a value
2. APIs over the web
  - URL endpoints: `http://my.domain/endpoint`
  - Query parameters: `http://my.domain/endpoint?args=x`
  - HTTP request returns a value
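
The parallel between the two kinds of API can be made concrete with a short sketch. It assumes a hypothetical local function `my_function` and reuses the placeholder URL from the list above. Only Python's standard `urllib.parse` module is used, and nothing is sent over the network.

```python
from urllib.parse import urlsplit, parse_qs

# 1. An API in a programming language: call a function with an argument.
def my_function(args):
    """Hypothetical stand-in for any library function."""
    return f"you asked for {args}"

result = my_function(args="x")

# 2. An API over the web: the same pieces, spelled as a URL.
url = "http://my.domain/endpoint?args=x"
parts = urlsplit(url)

# The endpoint plays the role of the function name...
endpoint = f"{parts.scheme}://{parts.netloc}{parts.path}"
# ...and the query string plays the role of the arguments.
query = parse_qs(parts.query)

print(result)    # you asked for x
print(endpoint)  # http://my.domain/endpoint
print(query)     # {'args': ['x']}
```

Note that `parse_qs` returns a list for each parameter, because a query string may repeat the same key more than once.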

### 2.4.1 REST APIs

The most common type of API you'll encounter on the web is the **REST** API. REST APIs define a limited set of operations for transferring data between a **client** and a **server** using **HTTP**.

<img src="https://phpenthusiast.com/theme/assets/images/blog/what_is_rest_api.png?021019a" width=80%>
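
As a rough illustration of that limited set of operations, the sketch below builds (but never sends) two HTTP requests with Python's standard library. The URL is the placeholder used earlier and the JSON payload is invented for the example. By convention `GET` reads data and `POST` submits new data.

```python
import json
from urllib.request import Request

url = "http://my.domain/endpoint"  # placeholder URL, not a real service

# GET: ask the server for data; no request body is attached.
read_request = Request(url, method="GET")

# POST: send data to the server; the body travels as bytes.
payload = json.dumps({"name": "example"}).encode("utf-8")  # made-up payload
create_request = Request(
    url,
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Inspect what would be sent, without touching the network.
for request in (read_request, create_request):
    print(request.get_method(), request.full_url, request.data)
```

Other verbs you are likely to meet in REST APIs include `PUT`, `PATCH`, and `DELETE`.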

## 2.5 API Response Objects

We've primarily been dealing with _tabular_ data so far (columns and rows), but most APIs on the web will return data in a **hierarchical** format like **JSON** or **XML**.

<center><img src="images/api4.png" width=500></center>

JSON stands for **JavaScript Object Notation**. Since JavaScript is the "language of the web", most of the data you'll get from web APIs will be formatted as JSON.

There's nothing special or fancy or scary about JSON: _it's just nested dictionaries_.

_**JSON IS JUST NESTED DICTIONARIES**_

This means it's super easy to work with JSON in Python.

For example, Python comes with its own built-in module for reading and writing JSON files:

```python
import json
```

This is JSON:

```javascript
{
  "firstName": "Jason",
  "lastName": "Response",
  "address": {
    "streetAddress": "404 Error Street",
    "city": "",
    "state": "Null Island",
    "postalCode": "10100-0100"
  },
  "spouse": null
}
```
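
To see the "nested dictionaries" claim in action, here is a minimal sketch that parses a JSON document like the one above with `json.loads`. The text is embedded as a Python string for illustration, with the commas a strict JSON parser requires. Note how JSON `null` becomes Python `None`.

```python
import json

json_text = """
{
  "firstName": "Jason",
  "lastName": "Response",
  "address": {
    "streetAddress": "404 Error Street",
    "city": "",
    "state": "Null Island",
    "postalCode": "10100-0100"
  },
  "spouse": null
}
"""

person = json.loads(json_text)  # str -> dict

print(type(person))                # <class 'dict'>
print(person["address"]["state"])  # Null Island
print(person["spouse"] is None)    # True
```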

# 3. APIs, Python, and you

## 3.1 Using the `requests` library

`requests` is my library of choice for querying API endpoints and URLs in Python. If you don't yet have it installed, go ahead and do that now. Check out the [documentation](https://docs.python-requests.org/en/latest/).

Using requests is as simple as:
```python
import requests
requests.get("https://my.domain/endpoint")
```

It can get a bit more complicated than that if, for example, an API requires authentication, or if you want to `POST` rather than `GET` data. But we don't need to worry about that for now.

## 3.2. SF Trees

Let's get some street tree data from the San Francisco Open Data Portal and use it to practice with APIs. First, our imports:

In [1]:
import pandas as pd
import json      # library for working with JSON-formatted text strings
import requests  # library for accessing content from web URLs
import pprint    # library for cleanly printing Python data structures
pp = pprint.PrettyPrinter()

Take a moment to familiarize yourself with the API endpoint we'll be using by reading the documentation:
https://data.sfgov.org/City-Infrastructure/Street-Tree-List/tkzw-k3nq/about_data

Under Export > API Endpoint we can see the API endpoint URL. Click on the button labeled "API Documentation" at the bottom of the window and you'll be taken to the Socrata documentation page for this specific endpoint. Check out the section labeled "Fields" and you'll see the data columns available. This is the "menu" in our restaurant analogy.
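
Socrata endpoints also accept query parameters that let you order a smaller portion from that menu. The sketch below builds such a URL with the standard library. `$select` and `$limit` are parameters of Socrata's query language (SoQL), the field names come from this dataset's field list, and the specific values are only examples. No request is sent.

```python
from urllib.parse import urlencode

base_url = "https://data.sfgov.org/resource/tkzw-k3nq.json"

# Example choices: three columns, first five rows.
params = {
    "$select": "qspecies,latitude,longitude",
    "$limit": 5,
}

# safe="$," keeps "$" and "," readable instead of percent-encoding them.
query_url = f"{base_url}?{urlencode(params, safe='$,')}"
print(query_url)
```

The resulting URL can be passed to `requests.get` in place of the bare endpoint.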

Now let's download the data:

In [2]:
endpoint_url = "https://data.sfgov.org/resource/tkzw-k3nq.json"
response = requests.get(endpoint_url)

Now let's take a look at what we got:

In [3]:
response.text

'[{"treeid":"269179","qlegalstatus":"DPW Maintained","qspecies":"Lyonothamnus floribundus subsp. asplenifolius :: Santa Cruz Ironwood","qaddress":"2616 Ocean Ave","siteorder":"1","qsiteinfo":"Sidewalk: Curb side : Cutout","planttype":"Tree","qcaretaker":"CAN","plantdate":"2024-02-08T00:00:00.000","dbh":"3","plotsize":"3x6","permitnotes":"Permit Number 794837","xcoord":"5991014","ycoord":"2094984","latitude":"37.73191020438722","longitude":"-122.47343196312605","location":{"latitude":"37.73191020438722","longitude":"-122.47343196312605","human_address":"{\\"address\\": \\"\\", \\"city\\": \\"\\", \\"state\\": \\"\\", \\"zip\\": \\"\\"}"},":@computed_region_yftq_j783":"1",":@computed_region_p5aj_wyqh":"8",":@computed_region_rxqg_mtj9":"4",":@computed_region_bh8s_q3mv":"64",":@computed_region_fyvs_ahh9":"40",":@computed_region_ajp5_b2md":"41"}\n,{"treeid":"272626","qlegalstatus":"Permitted Site","qspecies":"Lyonothamnus floribundus subsp. asplenifolius :: Santa Cruz Ironwood","qaddress":"

We can slice `response.text` to look at just the first 500 characters of the response body:

In [4]:
results = response.text
print(type(results))
print(results[:500])

<class 'str'>
[{"treeid":"269179","qlegalstatus":"DPW Maintained","qspecies":"Lyonothamnus floribundus subsp. asplenifolius :: Santa Cruz Ironwood","qaddress":"2616 Ocean Ave","siteorder":"1","qsiteinfo":"Sidewalk: Curb side : Cutout","planttype":"Tree","qcaretaker":"CAN","plantdate":"2024-02-08T00:00:00.000","dbh":"3","plotsize":"3x6","permitnotes":"Permit Number 794837","xcoord":"5991014","ycoord":"2094984","latitude":"37.73191020438722","longitude":"-122.47343196312605","location":{"latitude":"37.731910204


We can use Python's `json` module to convert that string into a dictionary (or list of dictionaries):

In [5]:
# parse the string into Python objects, here a list of dictionaries (loads = "load string")
data = json.loads(results)
print(type(data))
print(data[:3]) # look at the first three items in the list

<class 'list'>
[{'treeid': '269179', 'qlegalstatus': 'DPW Maintained', 'qspecies': 'Lyonothamnus floribundus subsp. asplenifolius :: Santa Cruz Ironwood', 'qaddress': '2616 Ocean Ave', 'siteorder': '1', 'qsiteinfo': 'Sidewalk: Curb side : Cutout', 'planttype': 'Tree', 'qcaretaker': 'CAN', 'plantdate': '2024-02-08T00:00:00.000', 'dbh': '3', 'plotsize': '3x6', 'permitnotes': 'Permit Number 794837', 'xcoord': '5991014', 'ycoord': '2094984', 'latitude': '37.73191020438722', 'longitude': '-122.47343196312605', 'location': {'latitude': '37.73191020438722', 'longitude': '-122.47343196312605', 'human_address': '{"address": "", "city": "", "state": "", "zip": ""}'}, ':@computed_region_yftq_j783': '1', ':@computed_region_p5aj_wyqh': '8', ':@computed_region_rxqg_mtj9': '4', ':@computed_region_bh8s_q3mv': '64', ':@computed_region_fyvs_ahh9': '40', ':@computed_region_ajp5_b2md': '41'}, {'treeid': '272626', 'qlegalstatus': 'Permitted Site', 'qspecies': 'Lyonothamnus floribundus subsp. asplenifolius 

Pretty Print makes this a bit easier to read:

In [6]:
pp.pprint(data[:3])

[{':@computed_region_ajp5_b2md': '41',
  ':@computed_region_bh8s_q3mv': '64',
  ':@computed_region_fyvs_ahh9': '40',
  ':@computed_region_p5aj_wyqh': '8',
  ':@computed_region_rxqg_mtj9': '4',
  ':@computed_region_yftq_j783': '1',
  'dbh': '3',
  'latitude': '37.73191020438722',
  'location': {'human_address': '{"address": "", "city": "", "state": "", '
                                '"zip": ""}',
               'latitude': '37.73191020438722',
               'longitude': '-122.47343196312605'},
  'longitude': '-122.47343196312605',
  'permitnotes': 'Permit Number 794837',
  'plantdate': '2024-02-08T00:00:00.000',
  'planttype': 'Tree',
  'plotsize': '3x6',
  'qaddress': '2616 Ocean Ave',
  'qcaretaker': 'CAN',
  'qlegalstatus': 'DPW Maintained',
  'qsiteinfo': 'Sidewalk: Curb side : Cutout',
  'qspecies': 'Lyonothamnus floribundus subsp. asplenifolius :: Santa Cruz '
              'Ironwood',
  'siteorder': '1',
  'treeid': '269179',
  'xcoord': '5991014',
  'ycoord': '2094984'},
 {'
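
Two details of the record above are easy to miss: every value arrives as a string, including the coordinates, and `human_address` is itself a JSON document stored inside a string. Here is a minimal sketch of handling both, using a trimmed copy of the first record:

```python
import json

# Trimmed copy of the first tree record shown above.
record = {
    "treeid": "269179",
    "latitude": "37.73191020438722",
    "longitude": "-122.47343196312605",
    "location": {
        "latitude": "37.73191020438722",
        "longitude": "-122.47343196312605",
        "human_address": '{"address": "", "city": "", "state": "", "zip": ""}',
    },
}

# Numbers come back as strings, so convert before doing arithmetic.
lat = float(record["latitude"])
lon = float(record["longitude"])

# human_address is a JSON string inside the JSON, so it needs its own parse.
address = json.loads(record["location"]["human_address"])

print(lat, lon)
print(sorted(address))  # ['address', 'city', 'state', 'zip']
```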

Pandas makes it easy to work with JSON since it's already so used to working with dictionaries:

In [7]:
pd.DataFrame.from_records(data, columns=['qspecies', 'latitude','longitude']).head()

| | qspecies | latitude | longitude |
|---|---|---|---|
| 0 | Lyonothamnus floribundus subsp. asplenifolius ... | 37.73191020438722 | -122.47343196312605 |
| 1 | Lyonothamnus floribundus subsp. asplenifolius ... | 37.7314897495916 | -122.4726578811063 |
| 2 | Lyonothamnus floribundus subsp. asplenifolius ... | 37.732278457866336 | -122.47434831774638 |
| 3 | Ceanothus thyrsiflorus :: Blueblossom Ceanothus | 37.73129603659199 | -122.47250543451436 |
| 4 | Metrosideros excelsa :: New Zealand Xmas Tree | 37.76037027016682 | -122.50899960017829 |
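
Because the API delivers every field as text, the `latitude` and `longitude` columns in a DataFrame built this way hold strings rather than numbers. Here is a hedged sketch of one way to clean that up, using two rows copied from the output above. The new column names `scientific_name` and `common_name` are illustrative, not part of the dataset.

```python
import pandas as pd

# Two records copied from the output above (selected fields only).
records = [
    {"qspecies": "Ceanothus thyrsiflorus :: Blueblossom Ceanothus",
     "latitude": "37.73129603659199", "longitude": "-122.47250543451436"},
    {"qspecies": "Metrosideros excelsa :: New Zealand Xmas Tree",
     "latitude": "37.76037027016682", "longitude": "-122.50899960017829"},
]
trees = pd.DataFrame.from_records(records)

# Convert the coordinate strings to floats; unparseable values become NaN.
for col in ["latitude", "longitude"]:
    trees[col] = pd.to_numeric(trees[col], errors="coerce")

# qspecies packs two names into one string, separated by " :: ".
species_parts = trees["qspecies"].str.split(" :: ", n=1, expand=True)
trees["scientific_name"] = species_parts[0]
trees["common_name"] = species_parts[1]

print(trees.dtypes)
print(trees[["scientific_name", "common_name"]])
```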


But perhaps it would have been easier to work directly with the JSON output of our response object instead of using `response.text` and `json.loads()`:

In [8]:
pd.DataFrame.from_dict(response.json())[['qspecies', 'latitude','longitude']].head()

| | qspecies | latitude | longitude |
|---|---|---|---|
| 0 | Lyonothamnus floribundus subsp. asplenifolius ... | 37.73191020438722 | -122.47343196312605 |
| 1 | Lyonothamnus floribundus subsp. asplenifolius ... | 37.7314897495916 | -122.4726578811063 |
| 2 | Lyonothamnus floribundus subsp. asplenifolius ... | 37.732278457866336 | -122.47434831774638 |
| 3 | Ceanothus thyrsiflorus :: Blueblossom Ceanothus | 37.73129603659199 | -122.47250543451436 |
| 4 | Metrosideros excelsa :: New Zealand Xmas Tree | 37.76037027016682 | -122.50899960017829 |


But perhaps it would have been even _easier_ if we had made our request directly from pandas in the first place:

In [9]:
pd.read_json(endpoint_url).head()

| | treeid | qlegalstatus | qspecies | qaddress | siteorder | qsiteinfo | planttype | qcaretaker | plantdate | dbh | ... | latitude | longitude | location | :@computed_region_yftq_j783 | :@computed_region_p5aj_wyqh | :@computed_region_rxqg_mtj9 | :@computed_region_bh8s_q3mv | :@computed_region_fyvs_ahh9 | :@computed_region_ajp5_b2md | qcareassistant |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 265775 | Section 806 (d) | Tristaniopsis laurina :: Swamp Myrtle | 2410 33rd Ave | 1.0 | Sidewalk: Curb side : Cutout | Tree | Private | 2021-05-14T00:00:00.000 | 3.0 | ... | 37.742156 | -122.490609 | {'latitude': '37.74215588871985', 'longitude':... | 1.0 | 8.0 | 3.0 | 29491.0 | 35.0 | 35.0 | |
| 1 | 265603 | Planning Code 138.1 required | Lophostemon confertus :: Brisbane Box | 500 Folsom St | 11.0 | Sidewalk: Curb side : Cutout | Tree | Private | 2021-03-24T00:00:00.000 | 3.0 | ... | 37.787316 | -122.394555 | {'latitude': '37.78731633026373', 'longitude':... | 6.0 | 2.0 | 9.0 | 28855.0 | 6.0 | 8.0 | |
| 2 | 221066 | Section 806 (d) | Arbutus 'Marina' :: Hybrid Strawberry Tree | 2 Page St | 4.0 | Sidewalk: Curb side : Cutout | Tree | Private | | 2.0 | ... | 37.774311 | -122.421232 | {'latitude': '37.77431064706206', 'longitude':... | 7.0 | 9.0 | 11.0 | 28852.0 | 10.0 | 9.0 | |
| 3 | 259749 | DPW Maintained | Tilia tomentosa :: Silver Linden | 1611 La Salle Ave | | Sidewalk: Curb side : Cutout | Tree | DPW | 2018-10-13T00:00:00.000 | 3.0 | ... | 37.736612 | -122.389001 | {'latitude': '37.73661239537821', 'longitude':... | 10.0 | 3.0 | 8.0 | 58.0 | 1.0 | 1.0 | FUF |
| 4 | 258527 | Significant Tree | Platanus x hispanica :: Sycamore: London Plane | 50 Hermann St | 3.0 | Sidewalk: Property side : Cutout | Tree | Private | | | ... | 37.770763 | -122.425958 | {'latitude': '37.77076279145426', 'longitude':... | 7.0 | 9.0 | 5.0 | 28852.0 | 10.0 | 9.0 | |


## 3.3. Exercise: Police Incident Reports in San Francisco

Let's examine a second dataset from the San Francisco Open Data Portal for practice: police incident reports.

Go to the City Open Data Portal and get the URL for a JSON request for the Police Department Incident Reports dataset. Here is a shortcut to the dataset: https://data.sfgov.org/Public-Safety/Police-Department-Incident-Reports-Historical-2003/tmnf-yvry/about_data

Use the methods we just learned for loading the data and creating a DataFrame. Explore the data using techniques from previous homeworks and lectures. For example, you could generate some summary statistics, or make some charts. What if you made a scatter plot where the x and y axes were the longitude and latitude columns from the incident data, respectively?

In [10]:
endpoint_url_police_stops_sf = "https://data.sfgov.org/resource/tmnf-yvry.json"
response_police_stops_sf = requests.get(endpoint_url_police_stops_sf)

In [17]:
pd.read_json(endpoint_url_police_stops_sf).head()

| | pdid | incidntnum | incident_code | category | descript | dayofweek | date | time | pddistrict | resolution | ... | :@computed_region_9dfj_4gjx | :@computed_region_4isq_27mq | :@computed_region_pigm_ib2e | :@computed_region_9jxd_iqea | :@computed_region_6ezc_tdp2 | :@computed_region_h4ep_8xdi | :@computed_region_n4xg_c4py | :@computed_region_fcz8_est8 | :@computed_region_nqbw_i6c3 | :@computed_region_2dwj_jsy4 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 4133422003074 | 41334220 | 3074 | ROBBERY | ROBBERY, BODILY FORCE | Monday | 2004-11-22 | 17:50 | INGLESIDE | NONE | ... | | | | | | | | | | |
| 1 | 5118535807021 | 51185358 | 7021 | VEHICLE THEFT | STOLEN AUTOMOBILE | Tuesday | 2005-10-18 | 20:00 | PARK | NONE | ... | | | | | | | | | | |
| 2 | 4018830907021 | 40188309 | 7021 | VEHICLE THEFT | STOLEN AUTOMOBILE | Sunday | 2004-02-15 | 02:00 | SOUTHERN | NONE | ... | | | | | | | | | | |
| 3 | 11014543126030 | 110145431 | 26030 | ARSON | ARSON | Friday | 2011-02-18 | 05:27 | INGLESIDE | NONE | ... | | | | | | | | | | |
| 4 | 10108108004134 | 101081080 | 4134 | ASSAULT | BATTERY | Sunday | 2010-11-21 | 17:00 | SOUTHERN | NONE | ... | | | | | | | | | | |
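
As a starting point for the exploration, here is a small sketch that counts incidents by category and by police district. To keep it runnable offline it uses only the five rows shown above, typed in by hand. With the full DataFrame the same `value_counts` calls should apply.

```python
import pandas as pd

# The five rows from the output above (selected columns only).
sample = pd.DataFrame(
    {
        "category": ["ROBBERY", "VEHICLE THEFT", "VEHICLE THEFT", "ARSON", "ASSAULT"],
        "dayofweek": ["Monday", "Tuesday", "Sunday", "Friday", "Sunday"],
        "pddistrict": ["INGLESIDE", "PARK", "SOUTHERN", "INGLESIDE", "SOUTHERN"],
    }
)

# Count how often each value appears, most frequent first.
category_counts = sample["category"].value_counts()
district_counts = sample["pddistrict"].value_counts()

print(category_counts)
print(district_counts)
```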


# 4. Conda Virtual Environments

Create a new conda environment, activate it, and install the basic packages:
1. `conda create -n my-first-env`
2. `conda activate my-first-env`
3. `conda config --add channels conda-forge`
4. `conda config --set channel_priority strict`
5. `conda install python ipython notebook nb_conda_kernels jupyter_contrib_nbextensions`
6. `jupyter contrib nbextension install --user`

# 5. For next time

1. Create a Mapbox account: https://account.mapbox.com/auth/signup/
2. Generate a Mapbox API token: https://account.mapbox.com/access-tokens

# 6. Questions? 