
# Using REST APIs as data sources

REST API stands for Representational State Transfer Application Programming Interface. It is a type of API that allows different systems to communicate over the internet using standard HTTP methods.

* Data is everywhere and is generated constantly
* The number of data sources is enormous
* Datasets are large and can be used in many ways

* We can do a lot with data made available by third parties:
    - https://developer.walmartlabs.com/docs
    - https://developer.spotify.com/documentation/web-api/
    - https://earthquake.usgs.gov/fdsnws/event/1/
    
    
We will take a brief look at how to consume data from REST APIs, focusing mainly on **JSON**.

This will allow us to work with "live" data.


### What is an API?

An **Application Programming Interface** defines the methods one software program uses to interact with another.

In the case of this lecture, we are dealing with a REST API, which sends data over a network: one type of Web service.

When we want to receive data from a Web service, we need to make a `request` to this service. When the server receives this request, it sends back a `response`.

![request.png](https://github.com/kylewinfree/inf502-fall2025/blob/main/rest/request.png?raw=1)

There are different types of requests.

In our case we will use `GET`, the type of request used to retrieve data.

A response from the API contains 2 things (among others):
* response code
* response data

To make a request, we use:

In [4]:
import requests
response = requests.get('http://www.nau.edu/')
type(response)

`requests.get(URL)` returns a `Response` object, which provides, among other things, the response code.

In [5]:
response.status_code

200

The most common codes are:
* 200: Everything went okay, and the result has been returned (if any).
* 301: The server is redirecting you to a different endpoint. This can happen when a company switches domain names, or an endpoint name is changed.
* 400: The server thinks you made a bad request. This can happen when you don’t send along the right data, among other things.
* 401: The server thinks you’re not authenticated. Many APIs require login credentials, so this happens when you don’t send the right credentials to access an API.
* 403: The resource you’re trying to access is forbidden: you don’t have the right permissions to see it.
* 404: The resource you tried to access wasn’t found on the server.
* 503: The server is not ready to handle the request.

More details about status codes can be found [here](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status).
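The first digit of a status code gives its general class, which is often all a script needs to decide what to do next. The sketch below uses only the standard library's `http.HTTPStatus`; the helper name `describe_status` is made up for this illustration and is not part of `requests`.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short summary such as '404 Not Found (client error)'.

    Hypothetical helper for illustration only.
    """
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    phrase = HTTPStatus(code).phrase          # standard reason phrase, e.g. "OK"
    kind = classes.get(code // 100, "other")  # the first digit picks the class
    return f"{code} {phrase} ({kind})"


# The codes listed above.
for code in (200, 301, 400, 401, 403, 404, 503):
    print(describe_status(code))
```

With a real `Response` object, `describe_status(response.status_code)` would give the same kind of summary.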

### What about getting the data?

First, read the documentation! Every time you use an API, read its documentation to understand how to call it, how its responses are structured, and so on.

We will use the [Open Notify API](http://api.open-notify.org/), which gives access to data about the International Space Station.

APIs like this usually provide multiple endpoints, which are the ways we can interact with the service.

Let's try a request and see how it goes:

In [6]:
response = requests.get("http://api.open-notify.org/astros.json")
print(response.status_code)

200


In [7]:
type(response.content)

bytes

In [8]:
response.text

'{"people": [{"craft": "ISS", "name": "Oleg Kononenko"}, {"craft": "ISS", "name": "Nikolai Chub"}, {"craft": "ISS", "name": "Tracy Caldwell Dyson"}, {"craft": "ISS", "name": "Matthew Dominick"}, {"craft": "ISS", "name": "Michael Barratt"}, {"craft": "ISS", "name": "Jeanette Epps"}, {"craft": "ISS", "name": "Alexander Grebenkin"}, {"craft": "ISS", "name": "Butch Wilmore"}, {"craft": "ISS", "name": "Sunita Williams"}, {"craft": "Tiangong", "name": "Li Guangsu"}, {"craft": "Tiangong", "name": "Li Cong"}, {"craft": "Tiangong", "name": "Ye Guangfu"}], "number": 12, "message": "success"}'

In [9]:
response.json()

{'people': [{'craft': 'ISS', 'name': 'Oleg Kononenko'},
  {'craft': 'ISS', 'name': 'Nikolai Chub'},
  {'craft': 'ISS', 'name': 'Tracy Caldwell Dyson'},
  {'craft': 'ISS', 'name': 'Matthew Dominick'},
  {'craft': 'ISS', 'name': 'Michael Barratt'},
  {'craft': 'ISS', 'name': 'Jeanette Epps'},
  {'craft': 'ISS', 'name': 'Alexander Grebenkin'},
  {'craft': 'ISS', 'name': 'Butch Wilmore'},
  {'craft': 'ISS', 'name': 'Sunita Williams'},
  {'craft': 'Tiangong', 'name': 'Li Guangsu'},
  {'craft': 'Tiangong', 'name': 'Li Cong'},
  {'craft': 'Tiangong', 'name': 'Ye Guangfu'}],
 'number': 12,
 'message': 'success'}
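Once parsed, the response is made of ordinary Python dictionaries and lists, so it can be indexed and looped over like any other data. A small sketch, using a shortened hand-typed copy of the response above rather than a live request (the variable names and the reduced `number` value are choices made for this illustration):

```python
from collections import Counter

# Shortened copy of the parsed response shown above (not fetched live).
astros = {
    "people": [
        {"craft": "ISS", "name": "Oleg Kononenko"},
        {"craft": "ISS", "name": "Sunita Williams"},
        {"craft": "Tiangong", "name": "Ye Guangfu"},
    ],
    "number": 3,  # adjusted to match the shortened list
    "message": "success",
}

# Index into the dict with a key, then loop over the list of people.
names = [person["name"] for person in astros["people"]]
per_craft = Counter(person["craft"] for person in astros["people"])

print(names)
print(dict(per_craft))
```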

### Working with JSON
JSON stands for JavaScript Object Notation. It is a text format for encoding data structures that is easy for both people and programs to read.

JSON output looks like Python data made of *dictionaries, lists, strings* and *integers*. And once parsed, it is exactly that.

But how do we use it? We already did, in the last command: `response.json()` parsed the JSON text into Python objects.


In [10]:
import json

The `json` module has two main functions:

* `json.dumps()`: takes a Python object and converts (dumps) it to a JSON string.
* `json.loads()`: takes a JSON string and converts (loads) it to a Python object.

`dumps()` is particularly useful because we can use it to format the JSON, making the output easier to read.

In [11]:
json_response = response.json()
formatted_json = json.dumps(json_response, sort_keys=True, indent=3) # indent is the number of spaces when making pretty

print(formatted_json)

{
   "message": "success",
   "number": 12,
   "people": [
      {
         "craft": "ISS",
         "name": "Oleg Kononenko"
      },
      {
         "craft": "ISS",
         "name": "Nikolai Chub"
      },
      {
         "craft": "ISS",
         "name": "Tracy Caldwell Dyson"
      },
      {
         "craft": "ISS",
         "name": "Matthew Dominick"
      },
      {
         "craft": "ISS",
         "name": "Michael Barratt"
      },
      {
         "craft": "ISS",
         "name": "Jeanette Epps"
      },
      {
         "craft": "ISS",
         "name": "Alexander Grebenkin"
      },
      {
         "craft": "ISS",
         "name": "Butch Wilmore"
      },
      {
         "craft": "ISS",
         "name": "Sunita Williams"
      },
      {
         "craft": "Tiangong",
         "name": "Li Guangsu"
      },
      {
         "craft": "Tiangong",
         "name": "Li Cong"
      },
      {
         "craft": "Tiangong",
         "name": "Ye Guangfu"
      }
   ]
}
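The cell above used `json.dumps()`. The opposite direction, `json.loads()`, can be sketched in the same way. The string below is a hand-typed fragment standing in for `response.text`, not a live response:

```python
import json

# A JSON document as text, like the value of response.text.
raw = '{"message": "success", "number": 12}'

parsed = json.loads(raw)          # str -> Python dict
print(type(parsed), parsed["number"])

text_again = json.dumps(parsed)   # Python dict -> str
print(type(text_again), text_again)

# Round trip: parsing the dumped text gives back an equal object.
assert json.loads(text_again) == parsed
```

This is roughly what `response.json()` does for us: it parses the body text of the response into Python objects.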


### REST API with Query Parameters
In some cases, it is possible to pass parameters to filter the output of the API.

The https://earthquake.usgs.gov/fdsnws/event/1/query endpoint returns the earthquakes that match a set of parameters, such as time and location.
More information here:
https://earthquake.usgs.gov/fdsnws/event/1/#parameters  

In the example below, we show the earthquakes in January 2022 (`starttime` and `endtime`), with magnitude between 6 and 7 (`minmagnitude` and `maxmagnitude`).
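Query parameters are appended to the endpoint URL after a `?`, as `key=value` pairs joined by `&`. As a minimal sketch of how that string is put together, the standard library's `urlencode` can build it from a dictionary (the names `base_url` and `params` are just for this illustration, and no request is sent here):

```python
from urllib.parse import urlencode

base_url = "https://earthquake.usgs.gov/fdsnws/event/1/query"
params = {
    "format": "geojson",
    "starttime": "2022-01-01",
    "endtime": "2022-01-31",
    "maxmagnitude": 7,
    "minmagnitude": 6,
}

# urlencode joins key=value pairs with "&" and escapes special characters.
url = f"{base_url}?{urlencode(params)}"
print(url)
```

`requests.get()` also accepts such a dictionary through its `params` argument, for example `requests.get(base_url, params=params)`, which avoids assembling the string by hand.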



In [12]:
import requests
import json
from datetime import datetime
# let's look at how this query is structured, and the output
response = requests.get("https://earthquake.usgs.gov/fdsnws/event/1/query?format=geojson&starttime=2022-01-01&endtime=2022-01-31&maxmagnitude=7&minmagnitude=6")
json_response = response.json()
formatted_json = json.dumps(json_response, sort_keys=False, indent=2)

print(formatted_json)

{
  "type": "FeatureCollection",
  "metadata": {
    "generated": 1761687706000,
    "url": "https://earthquake.usgs.gov/fdsnws/event/1/query?format=geojson&starttime=2022-01-01&endtime=2022-01-31&maxmagnitude=7&minmagnitude=6",
    "title": "USGS Earthquakes",
    "status": 200,
    "api": "1.14.1",
    "count": 18
  },
  "features": [
    {
      "type": "Feature",
      "properties": {
        "mag": 6.5,
        "place": "Kermadec Islands region",
        "time": 1643424399588,
        "updated": 1650045421040,
        "tz": null,
        "url": "https://earthquake.usgs.gov/earthquakes/eventpage/us7000gg3w",
        "detail": "https://earthquake.usgs.gov/fdsnws/event/1/query?eventid=us7000gg3w&format=geojson",
        "felt": 2,
        "cdi": 7.1,
        "mmi": 3.687,
        "alert": "green",
        "status": "reviewed",
        "tsunami": 1,
        "sig": 651,
        "net": "us",
        "code": "7000gg3w",
        "ids": ",pt22029000,us7000gg3w,at00r6gae1,usauto7000gg3w,",


#### Let's be a little more selective now.
- go find the documentation and figure out how to specify a location
- pick a place, such as Alaska, and define the region
- run it again!

https://www.mapsofworld.com/usa/states/alaska/lat-long.html

In [19]:
response = requests.get("https://earthquake.usgs.gov/fdsnws/event/1/query?format=geojson&starttime=2022-01-01&endtime=2022-01-31&latitude=-33&longitude=-70&maxradius=1")
json_response = response.json()
formatted_json = json.dumps(json_response, sort_keys=False, indent=2)
print(formatted_json)

{
  "type": "FeatureCollection",
  "metadata": {
    "generated": 1761688801000,
    "url": "https://earthquake.usgs.gov/fdsnws/event/1/query?format=geojson&starttime=2022-01-01&endtime=2022-01-31&latitude=-33&longitude=-70&maxradius=1",
    "title": "USGS Earthquakes",
    "status": 200,
    "api": "1.14.1",
    "count": 2
  },
  "features": [
    {
      "type": "Feature",
      "properties": {
        "mag": 3.3,
        "place": "17 km ENE of Puente Alto, Chile",
        "time": 1643423014333,
        "updated": 1650045421040,
        "tz": null,
        "url": "https://earthquake.usgs.gov/earthquakes/eventpage/us7000gg3p",
        "detail": "https://earthquake.usgs.gov/fdsnws/event/1/query?eventid=us7000gg3p&format=geojson",
        "felt": 1,
        "cdi": 2,
        "mmi": null,
        "alert": null,
        "status": "reviewed",
        "tsunami": 0,
        "sig": 168,
        "net": "us",
        "code": "7000gg3p",
        "ids": ",us7000gg3p,",
        "sources": ",us,",


#### Getting what we need...
Now, we will *print the place, date, and magnitude of each earthquake*.

Here is the part of the response we need:
```json
"features": [
    {
      "type": "Feature",
      "properties": {
        "mag": 6.2,
        "place": "66 km E of Hualien City, Taiwan",
        "time": 1641203195767,
...
```
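The `time` value is not a readable date. Assuming it is the number of milliseconds since the Unix epoch (which is how the USGS GeoJSON format describes it), a sketch of the conversion looks like this, using the values from the snippet above:

```python
from datetime import datetime, timezone

# Values copied from the snippet above.
properties = {
    "mag": 6.2,
    "place": "66 km E of Hualien City, Taiwan",
    "time": 1641203195767,
}

# Assumption: "time" is milliseconds since 1970-01-01 UTC,
# so divide by 1000 to get the seconds that datetime expects.
when = datetime.fromtimestamp(properties["time"] / 1000, tz=timezone.utc)

print(properties["place"], when.strftime("%Y-%m-%d %H:%M:%S"), properties["mag"])
```

Passing `tz=timezone.utc` keeps the result independent of the time zone of the machine running the notebook.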

You've already worked with iteration, so let's use it with our JSON. We want to iterate through each earthquake observation, print some stuff, and at the end, print the maximum magnitude. I'll get you started...

In [18]:
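from datetime import datetime

max_magnitude = 0

for earthquake in json_response["features"]:
  properties = earthquake["properties"]
  # get the magnitude (the API may report it as null, i.e. None)
  magnitude = properties["mag"]
  # "time" is in milliseconds since the epoch, so convert it to a date
  date = datetime.fromtimestamp(properties["time"] / 1000)
  # print some stuff
  print(properties["place"], date, magnitude)
  # update our max on record as needed
  if magnitude is not None and magnitude > max_magnitude:
    max_magnitude = magnitude

# print the max
print("Maximum magnitude:", max_magnitude)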