# APIs

---

<a id="learning-objectives"></a>
## Learning Objectives
*After completing this notebook, you will be able to:*

- Use APIs to get data from the web

# <font color='blue'> APIs</font>

Let's start by importing the **requests** library, which we'll be using to make API requests.

In [1]:
import requests

Let's make a request to the astronauts API and view the resulting JSON. The first thing we do is make a GET request. This is really simple!

In [2]:
astro_request = requests.get('http://api.open-notify.org/astros.json')

In [3]:
astro_request

<Response [200]>

An alternative, neater way to make the request would be to define the URL as a variable instead of pasting it straight into `.get()`, like this:

In [4]:
astro_url = 'http://api.open-notify.org/astros.json'
astro_request = requests.get(astro_url)

What we get back from a GET request is a `Response` object.

In [5]:
type(astro_request)

requests.models.Response

This is an object that has a few different bits of information bundled up inside it, all of which have been sent back to us by the servers at `open-notify.org`, including...

The status code, which tells us whether the request was successful or not. A status code of `200` means the request succeeded, whereas codes in the `400`s and `500`s signal an error (on the client side and the server side, respectively). You might remember seeing `404 Not Found` messages in your browser when you try to load a webpage that doesn't exist. That's also an example of a status code!

We can check the status code like this:

In [6]:
astro_request.status_code

200
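Status codes come in families: `2xx` for success, `4xx` for client errors and `5xx` for server errors. As a rough sketch of that grouping, the helper below uses Python's built-in `http.HTTPStatus` to look up the standard phrase for a code. The name `describe_status` is made up for this example and is not part of `requests`.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short description of an HTTP status code (hypothetical helper)."""
    try:
        phrase = HTTPStatus(code).phrase  # e.g. "OK" or "Not Found"
    except ValueError:
        phrase = "Unknown"  # the code is not in Python's standard list

    if 200 <= code < 300:
        family = "success"
    elif 400 <= code < 500:
        family = "client error"
    elif 500 <= code < 600:
        family = "server error"
    else:
        family = "other"

    return f"{code} {phrase} ({family})"


print(describe_status(200))
print(describe_status(404))
```

With a real response you could call `describe_status(astro_request.status_code)`.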

We can also access the JSON that's returned by the API; this is also bundled up inside our `Response` object.

In [7]:
astro_request.json()

{'number': 10,
 'people': [{'name': 'Oleg Artemyev', 'craft': 'ISS'},
  {'name': 'Denis Matveev', 'craft': 'ISS'},
  {'name': 'Sergey Korsakov', 'craft': 'ISS'},
  {'name': 'Kjell Lindgren', 'craft': 'ISS'},
  {'name': 'Bob Hines', 'craft': 'ISS'},
  {'name': 'Samantha Cristoforetti', 'craft': 'ISS'},
  {'name': 'Jessica Watkins', 'craft': 'ISS'},
  {'name': 'Cai Xuzhe', 'craft': 'Tiangong'},
  {'name': 'Chen Dong', 'craft': 'Tiangong'},
  {'name': 'Liu Yang', 'craft': 'Tiangong'}],
 'message': 'success'}

Let's create a variable that contains the JSON only.

In [8]:
astro_json = astro_request.json()
astro_json

{'number': 10,
 'people': [{'name': 'Oleg Artemyev', 'craft': 'ISS'},
  {'name': 'Denis Matveev', 'craft': 'ISS'},
  {'name': 'Sergey Korsakov', 'craft': 'ISS'},
  {'name': 'Kjell Lindgren', 'craft': 'ISS'},
  {'name': 'Bob Hines', 'craft': 'ISS'},
  {'name': 'Samantha Cristoforetti', 'craft': 'ISS'},
  {'name': 'Jessica Watkins', 'craft': 'ISS'},
  {'name': 'Cai Xuzhe', 'craft': 'Tiangong'},
  {'name': 'Chen Dong', 'craft': 'Tiangong'},
  {'name': 'Liu Yang', 'craft': 'Tiangong'}],
 'message': 'success'}

Let's check its type: it's a dictionary!

In [9]:
type(astro_json)

dict

Now we can use our dictionary and list-indexing skills to access information inside the JSON.

In [10]:
astro_json['people']

[{'name': 'Oleg Artemyev', 'craft': 'ISS'},
 {'name': 'Denis Matveev', 'craft': 'ISS'},
 {'name': 'Sergey Korsakov', 'craft': 'ISS'},
 {'name': 'Kjell Lindgren', 'craft': 'ISS'},
 {'name': 'Bob Hines', 'craft': 'ISS'},
 {'name': 'Samantha Cristoforetti', 'craft': 'ISS'},
 {'name': 'Jessica Watkins', 'craft': 'ISS'},
 {'name': 'Cai Xuzhe', 'craft': 'Tiangong'},
 {'name': 'Chen Dong', 'craft': 'Tiangong'},
 {'name': 'Liu Yang', 'craft': 'Tiangong'}]

In [11]:
astro_json['people'][0]

{'name': 'Oleg Artemyev', 'craft': 'ISS'}

In [12]:
astro_json['people'][0]['name']

'Oleg Artemyev'
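The same indexing works inside a loop or a list comprehension. As a small sketch, the snippet below collects every name and then groups the names by spacecraft. `sample_json` is a trimmed, hand-copied subset of the response shown above (an assumption for illustration), so the code runs without calling the API.

```python
from collections import defaultdict

# Trimmed copy of the API response above (for illustration only)
sample_json = {
    'number': 3,
    'people': [
        {'name': 'Oleg Artemyev', 'craft': 'ISS'},
        {'name': 'Bob Hines', 'craft': 'ISS'},
        {'name': 'Chen Dong', 'craft': 'Tiangong'},
    ],
    'message': 'success',
}

# every name, in the order the API returned them
names = [person['name'] for person in sample_json['people']]

# names grouped by the craft each person is on
crew_by_craft = defaultdict(list)
for person in sample_json['people']:
    crew_by_craft[person['craft']].append(person['name'])

print(names)
print(dict(crew_by_craft))
```

Swapping `sample_json` for `astro_json` would apply the same steps to the live data.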

## <font color='red'> Exercise 1: Dad jokes</font>

We will play around with a new API: one that returns dad jokes.
    
https://icanhazdadjoke.com/api
    
Use the documentation to find the url that returns *a random dad joke*. Fill in the gap in the code below to make a GET request to the correct url.

In [20]:
dad_joke_url = 'https://icanhazdadjoke.com/'
dad_joke_request = requests.get(
    url=dad_joke_url,
    headers={'Accept': 'application/json'} # specify we want JSON back
)
dad_joke_request

<Response [200]>

Now, check the status code of the request:

In [21]:
dad_joke_request.status_code

200

Next, create a variable that contains the JSON returned by the API

In [22]:
dad_joke_json = dad_joke_request.json()
dad_joke_json

{'id': '3oG6UvX82g',
 'joke': "Which is the fastest growing city in the world? Dublin'",
 'status': 200}

Inspect the dictionary and extract the joke itself as a string

In [23]:
dad_joke = dad_joke_json["joke"]
print(dad_joke)

Which is the fastest growing city in the world? Dublin'


### <font color='green'> Stretch</font>

Now write a `for` loop to call the API 10 times and store the 10 jokes in a list.

_You may optionally decide to create a function to do the fetching for you_

In [84]:
jokes = []

for _ in range(10):
    dad_joke_request = requests.get(
        url=dad_joke_url,
        headers={'Accept': 'application/json'}  # specify we want JSON back
    )

    # only keep the joke if the request succeeded
    if dad_joke_request.status_code == 200:
        dad_joke_json = dad_joke_request.json()
        jokes.append(dad_joke_json['joke'])

for joke in jokes:
    print(joke)

What did the big flower say to the littler flower? Hi, bud!
A butcher accidentally backed into his meat grinder and got a little behind in his work that day.
How many bones are in the human hand? A handful of them.
What do you call your friend who stands in a hole? Phil.
What do you call a careful wolf? Aware wolf.
How do hens stay fit? They always egg-cercise!
Why did the teddy bear say “no” to dessert? Because she was stuffed.
How did Darth Vader know what Luke was getting for Christmas? He felt his presents.
Did you know that ghosts call their true love their ghoul-friend?
I heard there was a new store called Moderation. They have everything there
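If you'd rather wrap the fetching in a function, one possible shape is sketched below. The name `fetch_joke` and its `get` argument are assumptions made for this example; passing the `get` function in makes it easy to try the function with a stand-in instead of the live API.

```python
import requests


def fetch_joke(url='https://icanhazdadjoke.com/', get=requests.get):
    """Fetch one joke and return its text (hypothetical helper)."""
    response = get(url, headers={'Accept': 'application/json'}, timeout=10)
    response.raise_for_status()  # stop early on a 4xx/5xx response
    return response.json()['joke']


# With the live API this would be:
# jokes = [fetch_joke() for _ in range(10)]
```

Unlike the loop above, this version raises an error on a failed request rather than silently skipping it, which may or may not be what you want.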


Use your Python skills to find the longest joke in your list (in terms of number of characters)

In [89]:
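# First attempt, kept for reference. It never updates longest_joke, because
# joke_length is reassigned to len(joke) just before being compared with len(joke).

# joke_length = 0

# for joke in jokes:
#     joke_length = len(joke)
    
#     if joke_length > len(joke):
#         longest_joke = joke
        
# print(longest_joke)  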

A butcher accidentally backed into his meat grinder and got a little behind in his work that day.


In [86]:
joke_length = 0

for joke in jokes:
    if len(joke) > joke_length:
        joke_length = len(joke)
        longest_joke = joke 
        
print(longest_joke)

A butcher accidentally backed into his meat grinder and got a little behind in his work that day.


In [87]:
def get_max_string(lst):
    return max(lst, key=len)

get_max_string(jokes)


'A butcher accidentally backed into his meat grinder and got a little behind in his work that day.'

In [106]:
longest_joke = max(jokes, key = len)
longest_joke

'A butcher accidentally backed into his meat grinder and got a little behind in his work that day.'

In [101]:
# lambda arguments: expression

sorted(jokes, key=lambda x: len(x), reverse=True)

['A butcher accidentally backed into his meat grinder and got a little behind in his work that day.',
 'How did Darth Vader know what Luke was getting for Christmas? He felt his presents.',
 'I heard there was a new store called Moderation. They have everything there',
 'Why did the teddy bear say “no” to dessert? Because she was stuffed.',
 'Did you know that ghosts call their true love their ghoul-friend?',
 'What did the big flower say to the littler flower? Hi, bud!',
 'How many bones are in the human hand? A handful of them.',
 'What do you call your friend who stands in a hole? Phil.',
 'How do hens stay fit? They always egg-cercise!',
 'What do you call a careful wolf? Aware wolf.']

---

# <font color='blue'> Getting data from APIs</font>

Let's look at another API that actually returns some data, in this case about the Star Wars universe.

We will use the Star Wars API at https://swapi.dev.
    
Their "root" API returns all the available endpoints and their urls, so let's start there.

In [93]:
star_wars_root = "https://swapi.dev/api/"

star_wars_result = requests.get(star_wars_root)

star_wars_result.raise_for_status() # raises an HTTPError if the response has a 4xx or 5xx status code

Let's extract the JSON and see what possible endpoints there are

In [94]:
star_wars_result.json()

{'people': 'https://swapi.dev/api/people/',
 'planets': 'https://swapi.dev/api/planets/',
 'films': 'https://swapi.dev/api/films/',
 'species': 'https://swapi.dev/api/species/',
 'vehicles': 'https://swapi.dev/api/vehicles/',
 'starships': 'https://swapi.dev/api/starships/'}

Let's look at the one for vehicles

In [95]:
vehicles_json = requests.get("https://swapi.dev/api/vehicles/").json()
vehicles_json

{'count': 39,
 'next': 'https://swapi.dev/api/vehicles/?page=2',
 'previous': None,
 'results': [{'name': 'Sand Crawler',
   'model': 'Digger Crawler',
   'manufacturer': 'Corellia Mining Corporation',
   'cost_in_credits': '150000',
   'length': '36.8 ',
   'max_atmosphering_speed': '30',
   'crew': '46',
   'passengers': '30',
   'cargo_capacity': '50000',
   'consumables': '2 months',
   'vehicle_class': 'wheeled',
   'pilots': [],
   'films': ['https://swapi.dev/api/films/1/',
    'https://swapi.dev/api/films/5/'],
   'created': '2014-12-10T15:36:25.724000Z',
   'edited': '2014-12-20T21:30:21.661000Z',
   'url': 'https://swapi.dev/api/vehicles/4/'},
  {'name': 'T-16 skyhopper',
   'model': 'T-16 skyhopper',
   'manufacturer': 'Incom Corporation',
   'cost_in_credits': '14500',
   'length': '10.4 ',
   'max_atmosphering_speed': '1200',
   'crew': '1',
   'passengers': '1',
   'cargo_capacity': '50',
   'consumables': '0',
   'vehicle_class': 'repulsorcraft',
   'pilots': [],
   'fil

In [96]:
vehicles_json.keys()

dict_keys(['count', 'next', 'previous', 'results'])

Looks like the actual data is stored in a key called `results`

In [97]:
vehicles_json["results"]

[{'name': 'Sand Crawler',
  'model': 'Digger Crawler',
  'manufacturer': 'Corellia Mining Corporation',
  'cost_in_credits': '150000',
  'length': '36.8 ',
  'max_atmosphering_speed': '30',
  'crew': '46',
  'passengers': '30',
  'cargo_capacity': '50000',
  'consumables': '2 months',
  'vehicle_class': 'wheeled',
  'pilots': [],
  'films': ['https://swapi.dev/api/films/1/',
   'https://swapi.dev/api/films/5/'],
  'created': '2014-12-10T15:36:25.724000Z',
  'edited': '2014-12-20T21:30:21.661000Z',
  'url': 'https://swapi.dev/api/vehicles/4/'},
 {'name': 'T-16 skyhopper',
  'model': 'T-16 skyhopper',
  'manufacturer': 'Incom Corporation',
  'cost_in_credits': '14500',
  'length': '10.4 ',
  'max_atmosphering_speed': '1200',
  'crew': '1',
  'passengers': '1',
  'cargo_capacity': '50',
  'consumables': '0',
  'vehicle_class': 'repulsorcraft',
  'pilots': [],
  'films': ['https://swapi.dev/api/films/1/'],
  'created': '2014-12-10T16:01:52.434000Z',
  'edited': '2014-12-20T21:30:21.665000Z

The value stored under `results` is a Python list:

In [98]:
type(vehicles_json["results"])

list

Working with JSON and dictionaries is great because it's very standardised, but it's not a very pretty data format to work with. 

Ideally we want a way of working with this data in Python using `pandas`.

Let's take a look at how easy it is to convert JSON/dictionaries into a `DataFrame`.

In [99]:
import pandas as pd

vehicles = pd.DataFrame(vehicles_json["results"])
vehicles.head()

| | name | model | manufacturer | cost_in_credits | length | max_atmosphering_speed | crew | passengers | cargo_capacity | consumables | vehicle_class | pilots | films | created | edited | url |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Sand Crawler | Digger Crawler | Corellia Mining Corporation | 150000 | 36.8 | 30 | 46 | 30 | 50000 | 2 months | wheeled | [] | [https://swapi.dev/api/films/1/, https://swapi... | 2014-12-10T15:36:25.724000Z | 2014-12-20T21:30:21.661000Z | https://swapi.dev/api/vehicles/4/ |
| 1 | T-16 skyhopper | T-16 skyhopper | Incom Corporation | 14500 | 10.4 | 1200 | 1 | 1 | 50 | 0 | repulsorcraft | [] | [https://swapi.dev/api/films/1/] | 2014-12-10T16:01:52.434000Z | 2014-12-20T21:30:21.665000Z | https://swapi.dev/api/vehicles/6/ |
| 2 | X-34 landspeeder | X-34 landspeeder | SoroSuub Corporation | 10550 | 3.4 | 250 | 1 | 1 | 5 | unknown | repulsorcraft | [] | [https://swapi.dev/api/films/1/] | 2014-12-10T16:13:52.586000Z | 2014-12-20T21:30:21.668000Z | https://swapi.dev/api/vehicles/7/ |
| 3 | TIE/LN starfighter | Twin Ion Engine/Ln Starfighter | Sienar Fleet Systems | unknown | 6.4 | 1200 | 1 | 0 | 65 | 2 days | starfighter | [] | [https://swapi.dev/api/films/1/, https://swapi... | 2014-12-10T16:33:52.860000Z | 2014-12-20T21:30:21.670000Z | https://swapi.dev/api/vehicles/8/ |
| 4 | Snowspeeder | t-47 airspeeder | Incom corporation | unknown | 4.5 | 650 | 2 | 0 | 10 | none | airspeeder | [https://swapi.dev/api/people/1/, https://swap... | [https://swapi.dev/api/films/2/] | 2014-12-15T12:22:12Z | 2014-12-20T21:30:21.672000Z | https://swapi.dev/api/vehicles/14/ |


Because `vehicles_json["results"]` is a list of dictionaries (that correspond to individual entries in a table) we can use the list to initialise a DataFrame, and the right column names will be picked up. We can then analyse this data like any other!

One caveat: the API returns every value as a string (or a list), so numeric columns such as `cost_in_credits` need converting before you can do arithmetic on them!
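To get numbers you can do maths with, one option is `pd.to_numeric`. The sketch below uses three hand-copied rows from the vehicles data rather than a live request (the names `sample_results` and `sample_vehicles` are made up for this example). Passing `errors='coerce'` turns anything that isn't a number, such as `'unknown'`, into `NaN`.

```python
import pandas as pd

# Hand-copied subset of vehicles_json["results"] (for illustration only)
sample_results = [
    {'name': 'Sand Crawler', 'cost_in_credits': '150000', 'crew': '46'},
    {'name': 'T-16 skyhopper', 'cost_in_credits': '14500', 'crew': '1'},
    {'name': 'TIE/LN starfighter', 'cost_in_credits': 'unknown', 'crew': '1'},
]

sample_vehicles = pd.DataFrame(sample_results)

# convert the string columns to numbers; unparseable values become NaN
for column in ['cost_in_credits', 'crew']:
    sample_vehicles[column] = pd.to_numeric(sample_vehicles[column], errors='coerce')

print(sample_vehicles.dtypes)
print(sample_vehicles['cost_in_credits'].mean())  # NaN values are skipped
```

Whether `NaN` is the right way to treat `'unknown'` depends on the analysis, so it is worth checking how many values were coerced.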

# <font color='blue'> API parameters</font>
    
To allow more targeted data access, most APIs have additional *parameters* you can use to filter the results accordingly.
    
According to the API documentation _"All resources support a search parameter that filters the set of resources returned."_ and the [vehicles API documentation](https://swapi.dev/documentation#vehicles) tells us we can search on the `name` field (i.e. search for a vehicle by name).
    
In most cases it is as simple as appending the parameters to the end of the url:

In [107]:
specific_vehicle_url = "https://swapi.dev/api/vehicles?search=Sand"
requests.get(specific_vehicle_url).json()["results"]

[{'name': 'Sand Crawler',
  'model': 'Digger Crawler',
  'manufacturer': 'Corellia Mining Corporation',
  'cost_in_credits': '150000',
  'length': '36.8 ',
  'max_atmosphering_speed': '30',
  'crew': '46',
  'passengers': '30',
  'cargo_capacity': '50000',
  'consumables': '2 months',
  'vehicle_class': 'wheeled',
  'pilots': [],
  'films': ['https://swapi.dev/api/films/1/',
   'https://swapi.dev/api/films/5/'],
  'created': '2014-12-10T15:36:25.724000Z',
  'edited': '2014-12-20T21:30:21.661000Z',
  'url': 'https://swapi.dev/api/vehicles/4/'}]

The `?` separates the url from the parameters, which are specified in a `key=value` format (separated by an `&` if we use multiple parameters)
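Rather than gluing the query string together by hand, `requests` can build it from a dictionary passed as `params`. The sketch below prepares a request without sending it, so you can inspect the final url. The variable names are assumptions for this example, and combining `search` with `page` is only there to show the `&` separator.

```python
import requests

vehicles_url = 'https://swapi.dev/api/vehicles/'
search_params = {'search': 'Sand', 'page': 1}

# build (but do not send) the request to see the url that would be used
prepared = requests.Request('GET', vehicles_url, params=search_params).prepare()
print(prepared.url)
```

Calling `requests.get(vehicles_url, params=search_params)` would send a request to that same url. An added benefit is that `requests` takes care of escaping spaces and other special characters in the values.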

## <font color='red'> Exercise 2: Star Wars</font>

Now let's use our knowledge of APIs to explore some data about the different species within the Star Wars universe.

Here is the specific documentation: [Species API](https://swapi.dev/documentation#species)
    
First identify the correct url for the `species` API and return some results:

In [109]:
species_url = 'https://swapi.dev/api/species/'
species_result = requests.get(species_url)
species_result.raise_for_status()

Create a variable that contains the JSON only

In [110]:
species_json = species_result.json()
species_json

{'count': 37,
 'next': 'https://swapi.dev/api/species/?page=2',
 'previous': None,
 'results': [{'name': 'Human',
   'classification': 'mammal',
   'designation': 'sentient',
   'average_height': '180',
   'skin_colors': 'caucasian, black, asian, hispanic',
   'hair_colors': 'blonde, brown, black, red',
   'eye_colors': 'brown, blue, green, hazel, grey, amber',
   'average_lifespan': '120',
   'homeworld': 'https://swapi.dev/api/planets/9/',
   'language': 'Galactic Basic',
   'people': ['https://swapi.dev/api/people/66/',
    'https://swapi.dev/api/people/67/',
    'https://swapi.dev/api/people/68/',
    'https://swapi.dev/api/people/74/'],
   'films': ['https://swapi.dev/api/films/1/',
    'https://swapi.dev/api/films/2/',
    'https://swapi.dev/api/films/3/',
    'https://swapi.dev/api/films/4/',
    'https://swapi.dev/api/films/5/',
    'https://swapi.dev/api/films/6/'],
   'created': '2014-12-10T13:52:11.567000Z',
   'edited': '2014-12-20T21:36:42.136000Z',
   'url': 'https://swap

_Optional: put the results in a pandas DataFrame to see it as a table of data_

In [118]:
species = pd.DataFrame(species_json["results"])
species.head()

| | name | classification | designation | average_height | skin_colors | hair_colors | eye_colors | average_lifespan | homeworld | language | people | films | created | edited | url |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Human | mammal | sentient | 180.0 | caucasian, black, asian, hispanic | blonde, brown, black, red | brown, blue, green, hazel, grey, amber | 120 | https://swapi.dev/api/planets/9/ | Galactic Basic | [https://swapi.dev/api/people/66/, https://swa... | [https://swapi.dev/api/films/1/, https://swapi... | 2014-12-10T13:52:11.567000Z | 2014-12-20T21:36:42.136000Z | https://swapi.dev/api/species/1/ |
| 1 | Droid | artificial | sentient | | | | | indefinite | | | [https://swapi.dev/api/people/2/, https://swap... | [https://swapi.dev/api/films/1/, https://swapi... | 2014-12-10T15:16:16.259000Z | 2014-12-20T21:36:42.139000Z | https://swapi.dev/api/species/2/ |
| 2 | Wookie | mammal | sentient | 210.0 | gray | black, brown | blue, green, yellow, brown, golden, red | 400 | https://swapi.dev/api/planets/14/ | Shyriiwook | [https://swapi.dev/api/people/13/, https://swa... | [https://swapi.dev/api/films/1/, https://swapi... | 2014-12-10T16:44:31.486000Z | 2014-12-20T21:36:42.142000Z | https://swapi.dev/api/species/3/ |
| 3 | Rodian | sentient | reptilian | 170.0 | green, blue | | black | unknown | https://swapi.dev/api/planets/23/ | Galatic Basic | [https://swapi.dev/api/people/15/] | [https://swapi.dev/api/films/1/] | 2014-12-10T17:05:26.471000Z | 2014-12-20T21:36:42.144000Z | https://swapi.dev/api/species/4/ |
| 4 | Hutt | gastropod | sentient | 300.0 | green, brown, tan | | yellow, red | 1000 | https://swapi.dev/api/planets/24/ | Huttese | [https://swapi.dev/api/people/16/] | [https://swapi.dev/api/films/1/, https://swapi... | 2014-12-10T17:12:50.410000Z | 2014-12-20T21:36:42.146000Z | https://swapi.dev/api/species/5/ |


Use your knowledge of the returned JSON data, and Python dictionaries, to find out how many results you got. How many were you expecting based on the `count` returned in the API?

In [116]:

print(species.shape)

print(species.count())

(10, 15)
name                10
classification      10
designation         10
average_height      10
skin_colors         10
hair_colors         10
eye_colors          10
average_lifespan    10
homeworld            9
language            10
people              10
films               10
created             10
edited              10
url                 10
dtype: int64


Looks like the results are **paginated**, meaning we only get 10 results at a time.

What is the url (including the parameter) to get the next page of results?

In [119]:
species_json['next']

'https://swapi.dev/api/species/?page=2'

Let's collect all the species data now, by combining the paginated data.

Write some code (or even a `for` loop if you know how!) to get subsequent pages of results.

You will need to:

- change the relevant parameter each time to get the next page of results
- for each page, store the results in a list
- keep adding each page's worth of results to a "master" list of results (hint: you can use the `extend` method on a list)

Verify that you have the correct number of records in your list.

In [121]:
species_all_data = []

page_1 = requests.get("http://swapi.dev/api/species?page=1")
page_1_data = page_1.json()["results"]
# print("page_1_data", page_1_data)
species_all_data.extend(page_1_data)

page_2 = requests.get("http://swapi.dev/api/species?page=2")
page_2_data = page_2.json()["results"]
species_all_data.extend(page_2_data)

page_3 = requests.get("http://swapi.dev/api/species?page=3")
page_3_data = page_3.json()["results"]
species_all_data.extend(page_3_data)

page_4 = requests.get("http://swapi.dev/api/species?page=4")
page_4_data = page_4.json()["results"]
species_all_data.extend(page_4_data)

print(len(species_all_data))

37


In [133]:
species_all_data = []

for i in range(1,5):
    species_url = f"http://swapi.dev/api/species?page={i}"
    
    species_result = requests.get(species_url)
    species_page_data = species_result.json()['results']
    
    species_all_data.extend(species_page_data)

print(len(species_all_data))
    
    

37
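Hard-coding the page numbers only works while there are exactly four pages. A more general approach, sketched below, keeps following the `next` url from each page until it is `None`. The names `get_json` and `fetch_all_results` are invented for this example, and the `fetch` argument exists only so the loop can be tried with a stand-in instead of the live API.

```python
import requests


def get_json(url):
    """Request a url and return the parsed JSON (hypothetical helper)."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.json()


def fetch_all_results(first_url, fetch=get_json):
    """Collect the 'results' of every page by following the 'next' links."""
    all_results = []
    url = first_url

    while url is not None:
        page = fetch(url)
        all_results.extend(page['results'])
        url = page['next']  # None on the last page, which ends the loop

    return all_results


# With the live API this would be:
# species_all_data = fetch_all_results('https://swapi.dev/api/species/')
```

Because the loop stops when `next` is `None`, it should keep working if the API later adds or removes pages.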


## <font color='blue'> Authentication</font>
    
In this exercise we will play around with a new API, published by Transport for London.
    
Unlike the previous examples, this one requires authentication.

Go to https://api-portal.tfl.gov.uk/signup and register your details, if you haven't already.

Once your email has been verified, sign in and go to https://api-portal.tfl.gov.uk/product#product=2357355709892 to get a free API key that allows 500 requests per minute.

Once you've successfully requested it, your API key can be retrieved from your profile: https://api-portal.tfl.gov.uk/profile. You'll need to click on "show" to actually show the key on the page.

Now you've obtained your key, using it in this particular instance is as easy as appending a new parameter:

In [None]:
# for example, let's look at live disruption information
url = "https://api.tfl.gov.uk/Line/Mode/tube/Disruption"

# remember to keep this private!
API_KEY = ""

# append the API key
tfl = requests.get(f"{url}?app_key={API_KEY}")

print(tfl.status_code)
tfl.json()
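Typing the key straight into a notebook makes it easy to share by accident. One common alternative, sketched here, is to read it from an environment variable and pass it through `params`. The variable name `TFL_APP_KEY` and the helper `prepare_tfl_request` are assumptions for this example, not something the TfL API requires.

```python
import os

import requests


def prepare_tfl_request(url, api_key=None):
    """Build (without sending) a GET request that carries the API key."""
    if api_key is None:
        # TFL_APP_KEY is a made-up variable name; set it in your shell first
        api_key = os.environ.get('TFL_APP_KEY', '')
    return requests.Request('GET', url, params={'app_key': api_key}).prepare()


prepared = prepare_tfl_request(
    'https://api.tfl.gov.uk/Line/Mode/tube/Disruption',
    api_key='example-key',  # placeholder, not a real key
)
print(prepared.url)

# To actually send it:
# with requests.Session() as session:
#     tfl = session.send(prepared)
```

Leaving out the `api_key` argument makes the helper fall back to the environment variable, so the key never has to appear in the notebook itself.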

## <font color='red'> Exercise 3: Transport for London</font>

Now it's time to answer some research questions using the API.

### 1. Are there currently any lift disruptions? If so, how many?

First, read the documentation to find out the url to get data on lift disruptions.

Using the url and its parameter(s), find out how many lift disruptions are currently reported.

Remember to include your API key!


### 2. How many bike points are there around Hyde Park?

Find the API endpoint that lets you search for bike points by location, then use the string "Hyde Park" to answer the question.

Remember to include your API key!

### 3. Find out how many stations there are along the Victoria line.

First, identify the correct endpoint on the [Line page](https://api-portal.tfl.gov.uk/api-details#api=Line).

Remember to include your API key!