# Using APIs (Application Programming Interfaces)

In this badge you will see a number of available online data sources and try to extract some meaningful information from them.

1. We will talk about the steps to using the API:

- create a request
- perform/make the request ('make an API call')
- filter and understand the data
- do something with the data

Then we will look at examples of a few APIs that might already be familiar to you. For some of them you will need to get your own `API key or token`; for others you will not need one, or you will be given one.

2. We will see how to simplify the data you are given - Bank of Scotland Cash Machine (ATM) API

3. We will see how to make your own specific requests - OMDB Movie knowledge database

# 1. About APIs

## Creating the Request for data

For example, if you are using a weather API and care about the weather in Edinburgh next week, your request will likely look something like one of these (depending on the design of the API):

`www.weatherapi.com?city=Edinburgh&days=7` or 

`www.weatherapi.com/city/Edinburgh/days/7` or

`www.weatherapi.com/Edinburgh/7` or

`www.weatherapi.com/Edinburgh/week`

The format depends on the design of the API, which you can normally find in the documentation on the API provider's website.

**Why API, not just a big download of a file?**

The idea of an API is that the data cannot sensibly be downloaded as a single file. It is often either too large (e.g. the financial history of all companies in the UK, via Companies House) or too rapidly changing (e.g. the most recent tweets, via the Twitter API). That is why we request a specific piece of information, or even a specific level of detail.

In general you should try to ask only for the information you care about. This way you will have less processing to do and also your calls will be faster (you'll wait for a shorter time).

- `www.weatherapi.com?city=Edinburgh` - to get daily weather for Edinburgh
- `www.weatherapi.com?city=Edinburgh&level=detailed&units=kelvin` - same but hourly data and in Kelvin

This way you **outsource as much work as you can** to the data provider, so that you have less data to download, analyse or transform once you receive it.

Each API has its own format for creating a request.
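
To make the query-string style above more concrete, here is a small sketch that takes such an address apart using Python's standard `urllib.parse` module. The `describe_request` helper is a made-up name for illustration, and the weather address is the same invented example as above (with `https://` added so that the parser can recognise the host), not a real provider's API.

```python
from urllib.parse import urlsplit, parse_qsl

def describe_request(url):
    """Split a request URL into its host and a dict of query parameters."""
    parts = urlsplit(url)
    # parse_qsl turns "city=Edinburgh&units=kelvin" into (name, value) pairs
    return parts.netloc, dict(parse_qsl(parts.query))

# hypothetical address, for illustration only
host, params = describe_request(
    "https://www.weatherapi.com?city=Edinburgh&level=detailed&units=kelvin"
)
print("who we ask:", host)
print("what we ask for:", params)
```

Everything after the `?` is the actual question: a list of `name=value` pairs joined with `&`.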

## Making the request / fetching the response

In general, you make a call to the web address that you prepared above (which specifies what you want to ask). The **response** to your call/request will be data. This data can come in a number of formats: comma-separated, JSON, plain text or even binary.


#### Common Response Formats

**Comma separated:**

It's a bit like a simplified Excel spreadsheet. Usually the first line holds the names of the columns, and each line afterwards is one data point. Commas separate the columns, but there are also formats where a tab character or a semicolon separates the data.

```
day,temperature,humidity
monday,21,72
tuesday,23,68
wednesday,24,66
```
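
As a rough sketch of how such a response could be read in Python, the standard `csv` module can turn each line into a dictionary keyed by the column names. The text below is the same made-up weather data as above, stored in a string instead of coming from a real API.

```python
import csv
import io

csv_text = """day,temperature,humidity
monday,21,72
tuesday,23,68
wednesday,24,66
"""

# DictReader uses the first line as column names and yields one dict per row
rows = list(csv.DictReader(io.StringIO(csv_text)))

# every value arrives as text, so convert before doing arithmetic
temperatures = [int(row["temperature"]) for row in rows]

print(rows[0])
print("warmest:", max(temperatures))
```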

**JSON:**

JSON (JavaScript Object Notation) borrows its syntax from JavaScript, but it is a language-independent text format and you do not need to know JavaScript to use it. For you it will look very familiar: JSON is **almost identical to Python dictionaries and lists**. The main differences are that strings must use double quotes, and that JSON writes `true`, `false` and `null` where Python writes `True`, `False` and `None`.

```
[
{"day":"monday","temperature":21,"humidity":72},
{"day":"tuesday","temperature":23,"humidity":68},
{"day":"wednesday","temperature":24,"humidity":66}
]
```
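
A minimal sketch of the conversion between JSON text and Python values, using the standard `json` module. The `raining` and `note` fields are invented here purely to show how `true` and `null` are translated.

```python
import json

json_text = '[{"day": "monday", "temperature": 21, "raining": true, "note": null}]'

forecast = json.loads(json_text)    # JSON text -> Python list of dicts
first_day = forecast[0]
print(first_day["temperature"], first_day["raining"], first_day["note"])

back_to_text = json.dumps(forecast)  # Python values -> JSON text again
print(back_to_text)
```

The `response.json()` call used later in this notebook does the same job as `json.loads` on the text of the response.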

# 2. Cleaning up the data

Often the API response will contain extra information that you do not need. It will be your task to identify the information you need and discard everything else.

Typical extra data that comes with the response would be:

- version of the API
- time request was made
- details of the request (repeating again what you asked for)

We will have a look at the example of Bank of Scotland ATMs to illustrate this:

### Bank of Scotland Cash Machine (ATM) API

https://api.bankofscotland.co.uk/open-banking/v2.2/atms

This API is not limited and needs no key. Yay! You can use it as much as you'd like.

Below is an example response from this API.

As you read it, ask yourself: how would you print the `TownName` of the first ATM?

In [None]:
# reading hint: when you click on a bracket, it becomes green and its closing-opening counterpart also
# becomes green

example_api_response = {'meta': {'LastUpdated': '2020-11-20T11:59:10.095Z',
                                  'TotalResults': 422,
                                  'Agreement': 'Use of the APIs and any related data ...',
                                  'TermsOfUse': 'https://www.openbanking.org.uk/terms'},
                        'data': [{'Brand': 
                                     [ { 'BrandName': 'Bank Of Scotland',
                                         'ATM': 
                                              [{'Identification': 'BFF7BC11',
                                              'SupportedLanguages': ['eng', 'spa', 'ger', 'fre'],
                                              'ATMServices': ['PINUnblock',
                                               'Balance',
                                               'BillPayments',
                                               'CashWithdrawal',
                                               'FastCash',
                                               'MobilePhoneTopUp',
                                               'PINChange',
                                               'MiniStatement'],
                                              'Accessibility': ['WheelchairAccess'],
                                              'SupportedCurrencies': ['GBP'],
                                              'MinimumPossibleAmount': '10',
                                              'Branch': {'Identification': '80453100'},
                                              'Location': {'LocationCategory': ['BranchExternal'],
                                               'Site': {'Identification': '80453100'},
                                               'PostalAddress': {'AddressLine': ['136 BUCHANAN STREET; BALFRON'],
                                                'BuildingNumber': 'BOS BRANCH',
                                                'StreetName': '136 BUCHANAN STREET',
                                                'TownName': 'GLASGOW',
                                                'CountrySubDivision': ['GLASGOW'],
                                                'Country': 'GB',
                                                'PostCode': 'G63 0TG',
                                                'GeoLocation': {'GeographicCoordinates': {
                                                    'Latitude': '56.071629',
                                                  'Longitude': '-4.336911'}
                                                               }
                                                                }
                                                          }
                                                 },
                                                {},
                                                {}
                                              ]
                                       }
                                     ]
                                 }
                                ]
                       }

To print the `TownName` of the first ATM, you would write:

```
print(example_api_response["data"][0]["Brand"][0]["ATM"][0]['Location']['PostalAddress']['TownName'])
```

That's quite a handful, but do not worry - we will simplify it a bit. The data we care about is really just the list of ATMs, so we can pull it out into a variable and reference that variable, like this:

```
all_atms = example_api_response["data"][0]["Brand"][0]["ATM"]
print(all_atms[0]['Location']['PostalAddress']['TownName'])

```
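
Long chains of keys like the one above fail with an error as soon as one level is missing (note the empty `{}` entries in the example response). One possible way to guard against that is a small lookup helper; `dig` is a hypothetical name, not part of any library, and the `sample` data is a cut-down copy of the example response.

```python
def dig(data, *keys, default=None):
    """Follow keys/indexes into nested dicts and lists; give back default if a step is missing."""
    for key in keys:
        try:
            data = data[key]
        except (KeyError, IndexError, TypeError):
            return default
    return data

# cut-down copy of the example response: one full ATM and one empty one
sample = {"data": [{"Brand": [{"ATM": [
    {"Location": {"PostalAddress": {"TownName": "GLASGOW"}}},
    {},
]}]}]}

atms = dig(sample, "data", 0, "Brand", 0, "ATM", default=[])
for atm in atms:
    print(dig(atm, "Location", "PostalAddress", "TownName", default="unknown"))
```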

In [None]:
# print TownName of first atm
print(example_api_response["data"][0]["Brand"][0]["ATM"][0]['Location']['PostalAddress']['TownName'])

# or a bit more DRY: do the initial clean-up of the data first. 
# The saving grows when we need the list of ATMs many times in a row

all_atms = example_api_response["data"][0]["Brand"][0]["ATM"]
print(all_atms[0]['Location']['PostalAddress']['TownName'])

### Requesting the actual live data:

In [None]:
# this function is already written for you. 
# It will make an API request: get_bankofscotland_api_result()
# and then simplify it to just keep the relevant data: get_list_of_atms()

import requests
import json 
import pprint as pp

def get_bankofscotland_api_result():
    response = requests.request("GET", "https://api.bankofscotland.co.uk/open-banking/v2.2/atms")
    return response.json()
    
def get_list_of_atms():
    complete_response_from_api = get_bankofscotland_api_result()
    list_of_atms = complete_response_from_api["data"][0]["Brand"][0]["ATM"]
    return list_of_atms

In [None]:
# first the unprocessed raw data:
unprocessed_api_result = get_bankofscotland_api_result()
pp.pprint(unprocessed_api_result)

In [None]:
# or a cleaner version: function that just gives us what we need: a list of atms
all_atms = get_list_of_atms()
pp.pprint(all_atms)

In [None]:
# when testing APIs, we usually ask quite generic questions.
assert isinstance(all_atms, list) # check if result is a list
assert len(all_atms) > 0 # check if there are any items
assert 'GBP' in all_atms[0]['SupportedCurrencies'] # check if first ATM supports GBP as a currency
print("tests passed")

### See it in action:

In [None]:
# this function you will need to finish

def get_number_of_atms_in_this_town(all_atms, town):
    return -1 # TODO, here write your answer, it will return a number

demo_data = [{'Identification': 'BFF7BC11', 
              'Location': { 'PostalAddress': {'StreetName': 'THE CROSS', 'TownName': 'PAISLEY'}}},
             {'Identification': 'BFF7BC12', 
              'Location': { 'PostalAddress': {'StreetName': 'TOWN SQUARE', 'TownName': 'FALKIRK'}}}]

assert get_number_of_atms_in_this_town(demo_data, 'FALKIRK') == 1
assert get_number_of_atms_in_this_town(demo_data, 'GLASGOW') == 0
print("tests passed")

<details><summary style='color:blue'>CLICK HERE TO SEE THE HINT. BUT REALLY TRY TO DO IT YOURSELF FIRST!</summary>
    
    ### BEGIN SOLUTION
    return len([atm 
                for atm in all_atms 
                if atm['Location']['PostalAddress']['TownName'] == town])
    ### END SOLUTION
</details>

In [None]:
# this function you will need to finish

def number_of_atms_as_sentence(count_of_atms, town):
     return f"TODO: here write your answer"

assert number_of_atms_as_sentence(20, 'Glasgow') == "There are 20 ATMs of Bank Of Scotland in Glasgow"
assert number_of_atms_as_sentence(20, 'GLASGOW') == "There are 20 ATMs of Bank Of Scotland in Glasgow"
assert number_of_atms_as_sentence(2, 'Falkirk') == "There are 2 ATMs of Bank Of Scotland in Falkirk"
print("tests passed")

<details><summary style='color:blue'>CLICK HERE TO SEE THE HINT. BUT REALLY TRY TO DO IT YOURSELF FIRST!</summary>
    
    ### BEGIN SOLUTION
     return f"There are {count_of_atms} ATMs of Bank Of Scotland in {town.title()}"
    ### END SOLUTION
</details>

In [None]:
# this function is already written for you, but it will only work once you complete the functions above

def answer_demo_question_1():
    a_town = 'Glasgow'
    all_atms = get_list_of_atms() 
    number_of_atms_in_town = get_number_of_atms_in_this_town( all_atms, a_town)
    sentence_with_findings =  number_of_atms_as_sentence(number_of_atms_in_town, a_town) 
    return sentence_with_findings
    # if you were creating a graph, this is where you would create and show it
    
# no need to test the top-most function, because you tested all of its components. Just call it for now 
print( answer_demo_question_1()) 

# 3. Making your own specific requests:

The real power of APIs is visible when datasets are too large, or too sensitive, to have them all on your own machine. That's where you will make your own requests, a bit like using query languages such as SQL.

In SQL you will say something like 

`SELECT * FROM Movies WHERE Title LIKE '%pokemon%' AND Year = 1999 AND Type = 'Movie';`

but with an API we specify this in the URL of the API call:

`http://www.omdbapi.com/?s=pokemon&y=1999&type=movie&apikey={api_key}`
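
The mapping from filters to URL can be sketched with the standard `urllib.parse.urlencode`, which also escapes spaces and symbols that would otherwise break the address. The `omdb_search_url` helper and the `YOUR_API_KEY` placeholder are assumptions for illustration; the parameter names `s`, `y`, `type` and `apikey` are the ones from the URL above.

```python
from urllib.parse import urlencode

def omdb_search_url(title, year="", api_key="YOUR_API_KEY"):
    """Build a search URL; urlencode escapes spaces and symbols in the title."""
    filters = {"s": title, "y": year, "type": "movie", "apikey": api_key}
    return "http://www.omdbapi.com/?" + urlencode(filters)

print(omdb_search_url("pokemon", 1999))
# a title containing '&' must be escaped, or the API would read it as a new parameter
print(omdb_search_url("fast & furious"))
```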

In [None]:
import requests
import math
import pprint as pp

In [None]:
# Here's a 'quick and dirty' way to do it. It works, but just below you will see the proper way.

url = "http://www.omdbapi.com/?s=pokemon&y=1999&type=movie&apikey=dd14dc5f"
pp.pprint(requests.request("GET", url).json())

### Let's do it properly: function that takes variables:

Some parts of the request will always be the same (as in: they are not 'variable/changeable'). Everything else should come into the function as a variable:

In [None]:
def get_basic_list_of_movies_for_title(movie_title, year_from = ""): 
    '''get basic info of movies. Optionally also specify year'''
    
    api_key = "dd14dc5f" 
    query = movie_title
    result_type="movie"
    year = year_from
    page = 1
    
    url = f"http://www.omdbapi.com/?s={query}&type={result_type}&y={year}&page={page}&apikey={api_key}"
    result_dict = requests.request("GET", url).json()
    cleaned_up_result = result_dict['Search'] 
    return cleaned_up_result
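
The function above assumes that the reply always contains a `Search` key. A more defensive variant might look like the sketch below. It assumes that the search endpoint reports failures through the same `Response` field that the details function later in this notebook checks. `search_results_or_empty` is a hypothetical helper, and both sample replies are made up.

```python
def search_results_or_empty(result_dict):
    """Return the list under 'Search', or [] when the reply signals a failure."""
    if result_dict.get("Response") != "True":
        return []
    return result_dict.get("Search", [])

# made-up replies, shaped like the ones the notebook works with
ok_reply = {"Response": "True", "totalResults": "1",
            "Search": [{"Title": "Example", "Year": "1999", "imdbID": "tt0000000"}]}
failed_reply = {"Response": "False", "Error": "made-up error text"}

print(len(search_results_or_empty(ok_reply)))
print(search_results_or_empty(failed_reply))
```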

In [None]:
pokemon_movies = get_basic_list_of_movies_for_title("Pokemon")
print('number of results: ', len(pokemon_movies))
pp.pprint(pokemon_movies)

### But... Why are there just 10 results? And where are all the details? 

Notice that this information is not very detailed! For each request you only get the first 10 results. And even then you only get the year, title and ID! There is not much we can do with that, can we?

We will solve it in two steps:

1. To get more items, we will use paging.

Like in a book, results can be on page 1, then page 2, 3 and so on. You can ask the API for the 'next page', or even 'page 15'.

2. To get more details, we will make a further API call.

We will explore another API call (a different URL on the same API) which can give us more detail, but only for one movie ID at a time. And where are we going to get a movie ID from? Luckily, the first 'simple' API call gave us the title, year and ID of each movie. So we will reuse the response of the first API call to get more details from the second one.

### Paging

We will modify the function so that it can request more than 10 movies. The plan is:

- request 10 movies, then increase the page 
- request the next 10 movies, and so on...

Important note: 

PLEASE DO NOT REQUEST MORE THAN 10 PAGES - if we keep hitting the API again and again, it may decide that it is under a cyber attack and block our school IP address, which would make it impossible to use for a while. So please be considerate and only make the API requests you need. In general, during a lab, you should be fine making around 500 requests.
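
The paging idea, including the cap on the number of pages, can be tried out without touching the real API. In the sketch below `fake_search` is a stand-in that pretends to be an API with 33 results and 10 results per page; both function names and the data are made up for illustration.

```python
import math

def pages_to_request(total_results, per_page=10, max_pages=10):
    """How many pages are worth requesting, never more than max_pages."""
    return min(math.ceil(total_results / per_page), max_pages)

def fake_search(page, total=33, per_page=10):
    """Stand-in for one API call: returns the items found on the given page."""
    start = (page - 1) * per_page
    items = [f"movie {n}" for n in range(start + 1, min(start + per_page, total) + 1)]
    return {"Search": items, "totalResults": str(total)}

first_page = fake_search(1)
last_page = pages_to_request(int(first_page["totalResults"]))

movies = list(first_page["Search"])
for page in range(2, last_page + 1):
    movies.extend(fake_search(page)["Search"])

print(f"requested {last_page} pages, got {len(movies)} movies")
```

Working out the last page from the first reply means the loop never asks for a page that cannot exist, and `max_pages` keeps the total number of calls polite.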

In [None]:
def get_basic_list_of_paged_movies_for_title(movie_title, year_from = "", number_of_pages = 1): 
    '''get basic info of movies. Optionally also specify year'''
    
    api_key = "dd14dc5f" 
    query = movie_title
    result_type="movie"
    year = year_from
    
    movies_basic_info = []
    for page in range(1, number_of_pages + 1): 
        # range is non-inclusive, so to have a range 1,2,3 use range(1,4)
        
        url = f"http://www.omdbapi.com/?s={query}&page={page}&apikey={api_key}&type={result_type}&y={year}"
        result_dict = requests.request("GET", url).json()
        # add new results to the final list
        movies_basic_info.extend( result_dict['Search'] ) 
        
        # just in case there are NO MORE PAGES TO SEE 
        # (eg. there are 33 movies, and you just requested the 4th page, so you have them all)
        total_results = int(result_dict['totalResults']) 
        last_page = math.ceil(total_results / 10) # 10 movies per page. ceil means round UP
        if last_page <= page:
            break # if reached last page, stop looping

    return movies_basic_info

In [None]:
pokemon_movies = get_basic_list_of_paged_movies_for_title("Pokemon", number_of_pages = 3)
print('number of results: ', len(pokemon_movies))
pp.pprint(pokemon_movies)

In [None]:
# but notice that in the year 1999 there were only 14 pokemon movies (only??? WOW!)
pokemon_movies = get_basic_list_of_paged_movies_for_title("Pokemon", year_from='1999', number_of_pages = 3)
print('number of results: ', len(pokemon_movies))
pp.pprint(pokemon_movies)

### And Finally: Where are all the details?

Once we know the movie ID, we can use it to make an API call to a DIFFERENT ENDPOINT of the same API.

An **ENDPOINT** is one type of question we can ask the API. You can think of it as a cable socket on your laptop: you might have a USB plug, monitor plug, charger plug, earphones plug. Each of them is used to communicate different information in a different format. Each API endpoint will allow you to ask a slightly different question, in a slightly different format, and might give you the answer in a slightly different way.

### Search API endpoint:

We already know this **Search** endpoint that enables us to get very 'basic' info of a number of movies:

`http://www.omdbapi.com/?s=`

which you would use like this:

`http://www.omdbapi.com/?s=pokemon&y=1999&type=movie&apikey=dd14dc5f`

### Details API endpoint:

`http://www.omdbapi.com/?i=`

which you would use like this:

`http://www.omdbapi.com/?i=tt5884052&apikey=dd14dc5f`


In [None]:
# again, quick and dirty first:
movie_id = "tt5884052"
url = f"http://www.omdbapi.com/?i={movie_id}&apikey=dd14dc5f"
pp.pprint(requests.request("GET", url).json())

In [None]:
# but see what happens if the id is incorrect:
movie_id = "banana"
url = f"http://www.omdbapi.com/?i={movie_id}&apikey=dd14dc5f"
pp.pprint(requests.request("GET", url).json())

In [None]:
# Now let's go for the proper usage with a function:

def get_details_of_movie_with_id(movie_id): 
    api_key = "dd14dc5f" 
    url = f"http://www.omdbapi.com/?i={movie_id}&apikey={api_key}"
    result_dict = requests.request("GET", url).json()
    
    return result_dict if result_dict['Response'] == "True" else None 
    # above is a ternary (simplified if) value_if_true if condition else value_if_false
    # if movie does not exist, return None

In [None]:
detective_pikatchu = get_details_of_movie_with_id("tt5884052")
pp.pprint(detective_pikatchu)

In [None]:
#also try these movie ids:
"tt0086190"
"tt0451279"

# Bringing it all together:

The code below will:

- ask for all movie ids for a particular title (it will ask for 20 movies)
- then in a loop, it will use those movie ids to ask for the details of each movie
- then it will sort movies by their length in minutes, and show the titles of 5 longest ones

In [None]:
# notice I added prints so that you see what is happening 'inside' the function when it runs

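def movie_length_as_number(movie):
    # turn '104 min' into 104; a runtime that is not a number (eg. 'N/A') counts as 0
    minutes = movie['Runtime'].split(" ")[0]
    return int(minutes) if minutes.isdigit() else 0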

def titles_of_5_longest_movies_with_title(title):
    simple_movie_infos = get_basic_list_of_paged_movies_for_title(title, number_of_pages = 2)
    print(f"found {len(simple_movie_infos)} movies")
    movie_details = []
    for one_info in simple_movie_infos:
        print(f"ask for details of {one_info['Title']}")
        details = get_details_of_movie_with_id(one_info['imdbID'])
        if details is not None: # skip movies whose details could not be fetched
            movie_details.append(details)
    movie_details.sort(key=movie_length_as_number, reverse=True)
    return [
        f"{movie['Title']} is {movie['Runtime']} long" 
        for movie in movie_details[0:5]
           ]


In [None]:
longest = titles_of_5_longest_movies_with_title("star wars")

In [None]:
pp.pprint(longest)

# Now it's your turn: Here are some other APIs

Try to get some data from these APIs (you might need to get the API key to access some of them):

Remember the steps:

- create a url with which you can ask a question
- write a function which will fill this url with your variables
- call that function and do something with the data

Here are two APIs to start with:

## API: weatherapi.com

Below is an example of taking the hourly weather forecast for the next 3 days and turning it into a row of emoji.

In [None]:
import requests
import pprint as pp

def get_weather_for_city(city):
    api_key = "to get an API key sign up to a free account at https://www.weatherapi.com/"

    api_url_current_weather = f"http://api.weatherapi.com/v1/forecast.json?key={api_key}&q={city}&days=3"
    
    response = requests.request("GET", api_url_current_weather)
    return response.json()

In [None]:
forecast_edinburgh = get_weather_for_city("edinburgh")

In [None]:
def condition_to_emoji(condition_text):
    weathers = {
        'Patchy rain possible':'🌧',
        'Cloudy':'☁️',
        'Overcast':'🌦',
        'Partly cloudy':'🌥',
        'Mist':'💨',
        'Sunny':'☀️',
        'Clear': '🌤',
        'Light drizzle':'🌧',
        'Light rain shower':'🌧'
        
    }
    return weathers.get(condition_text, '❓')

def forecast_to_emoji(full_forecast):
    all_hours_as_emoji = [
     condition_to_emoji(one_hour['condition']['text'])
     for one_day in full_forecast['forecast']['forecastday']
     for one_hour in one_day['hour']
    ]
    # above is an example of using double-deep list comprehension
    return "".join(all_hours_as_emoji) 
    # "".join(some_list)  <--- this will join a list into a long string

print(forecast_to_emoji(forecast_edinburgh))
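
If the double-deep list comprehension looks unfamiliar, the sketch below shows that it does the same work as two nested `for` loops. The `sample_forecast` data is made up and contains only the keys that `forecast_to_emoji` reads.

```python
# made-up forecast: two days, three hours in total
sample_forecast = {"forecast": {"forecastday": [
    {"hour": [{"condition": {"text": "Sunny"}}, {"condition": {"text": "Cloudy"}}]},
    {"hour": [{"condition": {"text": "Mist"}}]},
]}}

# version 1: nested loops
with_loops = []
for one_day in sample_forecast["forecast"]["forecastday"]:
    for one_hour in one_day["hour"]:
        with_loops.append(one_hour["condition"]["text"])

# version 2: the same two 'for' clauses, in the same order, inside a comprehension
with_comprehension = [
    one_hour["condition"]["text"]
    for one_day in sample_forecast["forecast"]["forecastday"]
    for one_hour in one_day["hour"]
]

print(with_comprehension)
print("same result:", with_loops == with_comprehension)
```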

### Your turn:

Formulate a question, and then answer it with data:

for example:
- Which of these three cities will be warmer tomorrow? Edinburgh, Glasgow, Inverness
- On average which of the next 3 days will be least likely to rain in Edinburgh?
- Any other question that comes to your mind...

In [None]:
# here you can write your code

## API: Airplane departures - aviationstack.com

You will need to create an account on their website. A free account only allows you to make 100 requests, which is not very many! (Unhelpfully, they do not specify whether that is 100 per hour, day or month :( )



In [None]:
def planes_departures(city):
    api_key = "get a key at aviationstack.com" 
    url = f"http://api.aviationstack.com/v1/flights?access_key={api_key}&dep_icao={city}"
    response = requests.request("GET", url)
    return response.json()

In [None]:
# edinburgh = EGPH
# glasgow = EGPF (Glasgow Prestwick is EGPK)
edinburgh_flights = planes_departures("EGPH")
# since your number of requests is limited, ask once, and then use the data many times, 
# in other cells
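
One way to make "ask once, use many times" survive a notebook restart is to save the response to a file and read it back next time. This is only a sketch: `load_or_fetch` and `fake_fetch` are hypothetical names, and the example uses a temporary folder and made-up data so that it runs without an API key.

```python
import json
import tempfile
from pathlib import Path

def load_or_fetch(cache_path, fetch):
    """Return cached JSON data if the file exists; otherwise call fetch() once and save it."""
    path = Path(cache_path)
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    data = fetch()
    path.write_text(json.dumps(data), encoding="utf-8")
    return data

calls = []

def fake_fetch():
    """Stand-in for a real API call; records how often it was used."""
    calls.append(1)
    return {"data": [{"flight": "made-up example"}]}

with tempfile.TemporaryDirectory() as folder:
    cache_file = Path(folder) / "flights.json"
    first = load_or_fetch(cache_file, fake_fetch)   # calls fake_fetch and saves the file
    second = load_or_fetch(cache_file, fake_fetch)  # reads the file, no new call

print("API calls made:", len(calls))
print("same data both times:", first == second)
```

In this notebook you could pass `lambda: planes_departures("EGPH")` as `fetch` and a file name of your choice as `cache_path`.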

In [None]:
from datetime import datetime

def flight_as_sentence(flight):
    dep_loc = flight['departure']['airport']
    arr_loc = flight['arrival']['airport']
    dep_time = datetime.fromisoformat(flight['departure']['scheduled'])
    arr_time = datetime.fromisoformat(flight['arrival']['scheduled'])
    
    duration_minutes = (arr_time - dep_time).total_seconds() // (60)
    kg_of_co2_per_minute_whole_plane = 1.5
    average_passengers = 150
    kg_of_co2_per_minute_per_passenger = kg_of_co2_per_minute_whole_plane/ average_passengers
    kg_of_co2 = duration_minutes*kg_of_co2_per_minute_per_passenger
    
    kg_of_co2_in_a_burger = 4.5
    burgers_quivalent = round(kg_of_co2 / kg_of_co2_in_a_burger, 2)
    
    return f"flight {dep_loc} to {arr_loc} uses {kg_of_co2} kg CO2, same as {burgers_quivalent} burgers 🍔"

def departure_board(all_flights):
    return [
        flight_as_sentence(flight)
        for flight in all_flights['data']
    ]

In [None]:
departure_board(edinburgh_flights)

### Your turn:

Formulate a question, and then answer it with data:

for example:
- Which of these three cities will be warmer tomorrow? Edinburgh, Glasgow, Inverness
- On average which of the next 3 days will be least likely to rain in Edinburgh?
- Any other question that comes to your mind...

In [None]:
# here you can write your code