# JSON and REST APIs

In this lecture, we will use a REST API that returns data about Star Wars characters in JSON format. You can find more information about this API [here](https://swapi.co/).

## REST APIs

A REST API allows us to fetch data by accessing a URL. Each API works differently, so you have to read the documentation for each API carefully. The documentation for the Star Wars API can be found [here](https://swapi.co/documentation).

To get a feel for how REST APIs work, let's first fetch the contents of the URL at the command line, using `curl`.

In [1]:
!curl "http://swapi.co/api/people/?search=skywalker"

{"count":3,"next":null,"previous":null,"results":[{"name":"Luke Skywalker","height":"172","mass":"77","hair_color":"blond","skin_color":"fair","eye_color":"blue","birth_year":"19BBY","gender":"male","homeworld":"http://swapi.co/api/planets/1/","films":["http://swapi.co/api/films/2/","http://swapi.co/api/films/6/","http://swapi.co/api/films/3/","http://swapi.co/api/films/1/","http://swapi.co/api/films/7/"],"species":["http://swapi.co/api/species/1/"],"vehicles":["http://swapi.co/api/vehicles/14/","http://swapi.co/api/vehicles/30/"],"starships":["http://swapi.co/api/starships/12/","http://swapi.co/api/starships/22/"],"created":"2014-12-09T13:50:51.644000Z","edited":"2014-12-20T21:17:56.891000Z","url":"http://swapi.co/api/people/1/"},{"name":"Anakin Skywalker","height":"188","mass":"84","hair_color":"blond","skin_color":"fair","eye_color":"blue","birth_year":"41.9BBY","gender":"male","homeworld":"http://swapi.co/api/planets/1/","films":["http://swapi.co/api/films/5/","http://swapi.co/api/

Of course, to do anything with this data, we need to get it into Python. The `requests` library makes it easy to issue HTTP requests (which includes opening and reading URLs) from within Python.

In [2]:
import requests
resp = requests.get("http://swapi.co/api/people/?search=skywalker")
resp

<Response [200]>
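
The `?search=skywalker` at the end of the URL is a query string: a set of URL-encoded `key=value` parameters that the API reads. As a minimal sketch (the `build_url` helper is hypothetical, introduced here only for illustration), the standard library can assemble such a URL from a dict. `requests.get` also accepts a `params` argument that encodes a dict in a similar way.

```python
from urllib.parse import urlencode

BASE_URL = "http://swapi.co/api/people/"  # endpoint used in this lecture

def build_url(base, params):
    """Append URL-encoded query parameters to a base URL (illustrative helper)."""
    return base + "?" + urlencode(params)

url = build_url(BASE_URL, {"search": "skywalker"})
print(url)  # http://swapi.co/api/people/?search=skywalker

# Characters that are not URL-safe, such as spaces, are escaped automatically.
print(build_url(BASE_URL, {"search": "luke skywalker"}))
# http://swapi.co/api/people/?search=luke+skywalker
```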

The body of the response is JSON text. Calling the `.json()` method of the resulting `Response` object parses that text and returns the corresponding Python object, which here is a dict.
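
To see what this parsing step amounts to, here is a minimal sketch using the standard library's `json.loads` on a hand-written string. The payload is an abbreviated, illustrative stand-in in the same shape as the API response, not a real response.

```python
import json

# A hand-written, abbreviated payload in the same shape as the API response.
# It is illustrative only, not a real response.
body = '{"count": 1, "next": null, "results": [{"name": "Luke Skywalker", "height": "172"}]}'

data = json.loads(body)           # JSON object -> dict, array -> list, null -> None
print(type(data).__name__)        # dict
print(data["next"])               # None

person = data["results"][0]
# The API returns numbers such as height as strings, so convert before doing arithmetic.
height_cm = int(person["height"])
print(person["name"], height_cm)  # Luke Skywalker 172
```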

In [3]:
resp.json()

{'count': 3,
 'next': None,
 'previous': None,
 'results': [{'birth_year': '19BBY',
   'created': '2014-12-09T13:50:51.644000Z',
   'edited': '2014-12-20T21:17:56.891000Z',
   'eye_color': 'blue',
   'films': ['http://swapi.co/api/films/2/',
    'http://swapi.co/api/films/6/',
    'http://swapi.co/api/films/3/',
    'http://swapi.co/api/films/1/',
    'http://swapi.co/api/films/7/'],
   'gender': 'male',
   'hair_color': 'blond',
   'height': '172',
   'homeworld': 'http://swapi.co/api/planets/1/',
   'mass': '77',
   'name': 'Luke Skywalker',
   'skin_color': 'fair',
   'species': ['http://swapi.co/api/species/1/'],
   'starships': ['http://swapi.co/api/starships/12/',
    'http://swapi.co/api/starships/22/'],
   'url': 'http://swapi.co/api/people/1/',
   'vehicles': ['http://swapi.co/api/vehicles/14/',
    'http://swapi.co/api/vehicles/30/']},
  {'birth_year': '41.9BBY',
   'created': '2014-12-10T16:20:44.310000Z',
   'edited': '2014-12-20T21:17:50.327000Z',
   'eye_color': 'blue',
 

In [4]:
characters = resp.json()["results"]
characters

[{'birth_year': '19BBY',
  'created': '2014-12-09T13:50:51.644000Z',
  'edited': '2014-12-20T21:17:56.891000Z',
  'eye_color': 'blue',
  'films': ['http://swapi.co/api/films/2/',
   'http://swapi.co/api/films/6/',
   'http://swapi.co/api/films/3/',
   'http://swapi.co/api/films/1/',
   'http://swapi.co/api/films/7/'],
  'gender': 'male',
  'hair_color': 'blond',
  'height': '172',
  'homeworld': 'http://swapi.co/api/planets/1/',
  'mass': '77',
  'name': 'Luke Skywalker',
  'skin_color': 'fair',
  'species': ['http://swapi.co/api/species/1/'],
  'starships': ['http://swapi.co/api/starships/12/',
   'http://swapi.co/api/starships/22/'],
  'url': 'http://swapi.co/api/people/1/',
  'vehicles': ['http://swapi.co/api/vehicles/14/',
   'http://swapi.co/api/vehicles/30/']},
 {'birth_year': '41.9BBY',
  'created': '2014-12-10T16:20:44.310000Z',
  'edited': '2014-12-20T21:17:50.327000Z',
  'eye_color': 'blue',
  'films': ['http://swapi.co/api/films/5/',
   'http://swapi.co/api/films/4/',
   'ht

Notice that to get information about each character's homeworld, species, starships, etc., we have to make further API calls. Go through the data, make the API calls, and replace each URL with the retrieved JSON object to obtain a JSON object with nested repeated fields. (The code below handles only the list-valued fields; `homeworld`, which is a single URL, is left as it is.)

Each IP address is limited to 10,000 API calls per day, and we are all on the same IP address, so please be extra careful! You could ruin things for everyone if you are not.

In [5]:
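for character in characters:
    # Each of these fields holds a list of URLs, one per related resource.
    for field in ["films", "species", "starships", "vehicles"]:
        elements = []
        for url in character[field]:
            # One API call per URL; swap the URL for the JSON it returns.
            elements.append(requests.get(url).json())
        character[field] = elements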
        
characters

[{'birth_year': '19BBY',
  'created': '2014-12-09T13:50:51.644000Z',
  'edited': '2014-12-20T21:17:56.891000Z',
  'eye_color': 'blue',
  'films': [{'characters': ['http://swapi.co/api/people/1/',
     'http://swapi.co/api/people/2/',
     'http://swapi.co/api/people/3/',
     'http://swapi.co/api/people/4/',
     'http://swapi.co/api/people/5/',
     'http://swapi.co/api/people/10/',
     'http://swapi.co/api/people/13/',
     'http://swapi.co/api/people/14/',
     'http://swapi.co/api/people/18/',
     'http://swapi.co/api/people/20/',
     'http://swapi.co/api/people/21/',
     'http://swapi.co/api/people/22/',
     'http://swapi.co/api/people/23/',
     'http://swapi.co/api/people/24/',
     'http://swapi.co/api/people/25/',
     'http://swapi.co/api/people/26/'],
    'created': '2014-12-12T11:26:24.656000Z',
    'director': 'Irvin Kershner',
    'edited': '2017-04-19T10:57:29.544256Z',
    'episode_id': 5,
    'opening_crawl': 'It is a dark time for the\r\nRebellion. Although the D
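
The same URL can appear under more than one character (characters who share a species or appear in the same film, for instance), so a loop like the one above may request it more than once. Given the rate limit, one option is to remember each response and reuse it. The sketch below uses a hypothetical helper, `make_cached_fetcher`, and a stand-in fetch function so that it runs without touching the network.

```python
def make_cached_fetcher(fetch):
    """Wrap `fetch` so that each distinct URL is fetched at most once."""
    cache = {}

    def fetch_cached(url):
        if url not in cache:
            cache[url] = fetch(url)  # only reached on the first request for this URL
        return cache[url]

    return fetch_cached

# Stand-in for a real API call; it records every URL it is asked for.
calls = []
def fake_fetch(url):
    calls.append(url)
    return {"url": url}

fetch_json = make_cached_fetcher(fake_fetch)
urls = [
    "http://swapi.co/api/species/1/",
    "http://swapi.co/api/films/2/",
    "http://swapi.co/api/species/1/",  # repeated URL
]
results = [fetch_json(url) for url in urls]

print(len(results))  # 3
print(len(calls))    # 2
```

With the real API, the wrapped function would be something like `lambda url: requests.get(url).json()`, and `fetch_json(url)` would take the place of the direct call inside the loop.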

## Flattening the Data

Let's take a look at what happens if we try to convert the current JSON directly into a DataFrame.

In [6]:
import pandas as pd
pd.DataFrame(characters)

Unnamed: 0,birth_year,created,edited,eye_color,films,gender,hair_color,height,homeworld,mass,name,skin_color,species,starships,url,vehicles
0,19BBY,2014-12-09T13:50:51.644000Z,2014-12-20T21:17:56.891000Z,blue,"[{'edited': '2017-04-19T10:57:29.544256Z', 'pl...",male,blond,172,http://swapi.co/api/planets/1/,77,Luke Skywalker,fair,"[{'classification': 'mammal', 'edited': '2015-...","[{'max_atmosphering_speed': '1050', 'crew': '1...",http://swapi.co/api/people/1/,"[{'max_atmosphering_speed': '650', 'crew': '2'..."
1,41.9BBY,2014-12-10T16:20:44.310000Z,2014-12-20T21:17:50.327000Z,blue,"[{'edited': '2015-04-11T09:45:01.623982Z', 'pl...",male,blond,188,http://swapi.co/api/planets/1/,84,Anakin Skywalker,fair,"[{'classification': 'mammal', 'edited': '2015-...","[{'max_atmosphering_speed': '1050', 'crew': '6...",http://swapi.co/api/people/11/,"[{'max_atmosphering_speed': '350', 'crew': '1'..."
2,72BBY,2014-12-19T17:57:41.191000Z,2014-12-20T21:17:50.401000Z,brown,"[{'edited': '2015-04-11T09:45:01.623982Z', 'pl...",female,black,163,http://swapi.co/api/planets/1/,unknown,Shmi Skywalker,fair,"[{'classification': 'mammal', 'edited': '2015-...",[],http://swapi.co/api/people/43/,[]


Notice that repeated fields like species, starships, etc. are each stored as a list of dicts inside a single cell (displayed above in truncated form). The data inside these fields is not very accessible at the moment.

Suppose we want data on all of the starships piloted by a Skywalker. We have to flatten the data at the starship level, that is, produce one row per starship. This takes some manual work. One way is to simply iterate over the JSON object and collect the fields that you need.

In [17]:
starships = {
    "pilot": [], # which Skywalker piloted this ship
    "name": [],
    "model": [],
    "passengers": [],
    "length": [],
    "hyperdrive_rating": [],
    "max_atmosphering_speed": [],
    "cargo_capacity": []
}

for character in characters:
    for starship in character["starships"]:
        starships["pilot"].append(character["name"])
        for field in ["name", "model", "passengers", "length", 
                      "hyperdrive_rating", "max_atmosphering_speed", "cargo_capacity"]:
            starships[field].append(starship[field])

starships = pd.DataFrame(starships)
starships

Unnamed: 0,cargo_capacity,hyperdrive_rating,length,max_atmosphering_speed,model,name,passengers,pilot
0,110,1.0,12.5,1050,T-65 X-wing,X-wing,0,Luke Skywalker
1,80000,1.0,20.0,850,Lambda-class T-4a shuttle,Imperial shuttle,20,Luke Skywalker
2,50000000,1.5,1088.0,1050,Providence-class carrier/destroyer,Trade Federation cruiser,48247,Anakin Skywalker
3,60,1.0,5.47,1500,Eta-2 Actis-class light interceptor,Jedi Interceptor,0,Anakin Skywalker
4,65,1.0,11.0,1100,N-1 starfighter,Naboo fighter,0,Anakin Skywalker


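Another way is to chain calls to `.apply(pd.Series)` (to expand lists into columns) and `.stack()` (to get one row per starship), producing a DataFrame that you can join back to the original DataFrame. The first `.apply(pd.Series)` spreads each character's list of starships across columns, `.stack()` pivots those columns into rows, and the second `.apply(pd.Series)` expands each starship dict into one column per field.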

In [22]:
characters_df = pd.DataFrame(characters)
starship_df = characters_df["starships"].apply(pd.Series).stack().apply(pd.Series)

In [24]:
starship_df.reset_index().merge(characters_df, 
                                left_on="level_0", 
                                right_index=True)

Unnamed: 0,level_0,level_1,MGLT,cargo_capacity,consumables,cost_in_credits,created_x,crew,edited_x,films_x,...,hair_color,height,homeworld,mass,name_y,skin_color,species,starships,url_y,vehicles
0,0,0,100,110,1 week,149999,2014-12-12T11:19:05.340000Z,1,2014-12-22T17:35:44.491233Z,"[http://swapi.co/api/films/2/, http://swapi.co...",...,blond,172,http://swapi.co/api/planets/1/,77,Luke Skywalker,fair,"[{'classification': 'mammal', 'edited': '2015-...","[{'max_atmosphering_speed': '1050', 'crew': '1...",http://swapi.co/api/people/1/,"[{'max_atmosphering_speed': '650', 'crew': '2'..."
1,0,1,50,80000,2 months,240000,2014-12-15T13:04:47.235000Z,6,2014-12-22T17:35:44.795405Z,"[http://swapi.co/api/films/2/, http://swapi.co...",...,blond,172,http://swapi.co/api/planets/1/,77,Luke Skywalker,fair,"[{'classification': 'mammal', 'edited': '2015-...","[{'max_atmosphering_speed': '1050', 'crew': '1...",http://swapi.co/api/people/1/,"[{'max_atmosphering_speed': '650', 'crew': '2'..."
2,1,0,unknown,50000000,4 years,125000000,2014-12-20T19:40:21.902000Z,600,2014-12-22T17:35:45.195165Z,[http://swapi.co/api/films/6/],...,blond,188,http://swapi.co/api/planets/1/,84,Anakin Skywalker,fair,"[{'classification': 'mammal', 'edited': '2015-...","[{'max_atmosphering_speed': '1050', 'crew': '6...",http://swapi.co/api/people/11/,"[{'max_atmosphering_speed': '350', 'crew': '1'..."
3,1,1,unknown,60,2 days,320000,2014-12-20T19:56:57.468000Z,1,2014-12-22T17:35:45.272349Z,[http://swapi.co/api/films/6/],...,blond,188,http://swapi.co/api/planets/1/,84,Anakin Skywalker,fair,"[{'classification': 'mammal', 'edited': '2015-...","[{'max_atmosphering_speed': '1050', 'crew': '6...",http://swapi.co/api/people/11/,"[{'max_atmosphering_speed': '350', 'crew': '1'..."
4,1,2,unknown,65,7 days,200000,2014-12-19T17:39:17.582000Z,1,2014-12-22T17:35:45.079452Z,"[http://swapi.co/api/films/5/, http://swapi.co...",...,blond,188,http://swapi.co/api/planets/1/,84,Anakin Skywalker,fair,"[{'classification': 'mammal', 'edited': '2015-...","[{'max_atmosphering_speed': '1050', 'crew': '6...",http://swapi.co/api/people/11/,"[{'max_atmosphering_speed': '350', 'crew': '1'..."
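
A third option, assuming pandas 1.0 or later, is `pd.json_normalize`, which flattens a list of nested records in one call. The sketch below uses a small hand-written sample in the same shape as `characters` (abbreviated to a few fields, and not fetched from the API). Because both the character and the starship have a `name` field, `meta_prefix` is used to keep them apart.

```python
import pandas as pd

# Abbreviated sample in the same shape as `characters` after the API calls.
sample = [
    {"name": "Luke Skywalker",
     "starships": [{"name": "X-wing", "model": "T-65 X-wing"},
                   {"name": "Imperial shuttle", "model": "Lambda-class T-4a shuttle"}]},
    {"name": "Shmi Skywalker", "starships": []},  # no starships, so no rows
]

flat = pd.json_normalize(
    sample,
    record_path="starships",   # one output row per element of this list
    meta=["name"],             # character-level field to carry onto each row
    meta_prefix="pilot_",      # avoids a clash with the starship's own "name"
)
print(flat)
```

Characters with an empty `starships` list contribute no rows, which matches the behavior of the loop-based approach earlier in this section.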
