# APIs

## Internet communication

## URLs

**U**niform **R**esource **L**ocator

A URL contains the information needed to locate a resource that we (the CLIENT) are requesting from a SERVER

http://www.google.com/search?q=puppies

http://127.0.0.1:306/invocations

- Protocol: http
- Top Level Domain: com
- Domain: google
- Subdomain: www
- IP (used in place of a domain name in the second URL): 127.0.0.1
- Port: 306
- Route/Folder/Path: search (first URL), invocations (second URL)
- Query Parameters: q=puppies
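The same breakdown can be reproduced in code. The sketch below uses the standard library's `urllib.parse` on the two example URLs; the helper name `describe_url` is made up for illustration.

```python
from urllib.parse import urlsplit, parse_qs


def describe_url(url: str) -> dict:
    """Split a URL into the parts listed above."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,         # e.g. "http"
        "host": parts.hostname,           # domain name or IP address
        "port": parts.port,               # None when the URL does not specify one
        "path": parts.path,               # route/folder/path
        "query": parse_qs(parts.query),   # query parameters as a dict of lists
    }


print(describe_url("http://www.google.com/search?q=puppies"))
print(describe_url("http://127.0.0.1:306/invocations"))
```

Note that `urlsplit` returns the host as a single string (`www.google.com`); splitting it into subdomain, domain and top level domain is not something the standard library does for you.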

## HTTP

**H**yper **T**ext **T**ransfer **P**rotocol (**S**ecure)

HTTP(S) is a protocol that provides a structure for requests between a client and a server.
For example, a user's web browser (the client) uses HTTP to request information from a server that hosts a website.

### Requests

**Requests** are the questions we (clients) ask of a server to get some information (the **response**).        
Types of request (verbs):
 * GET: read info from a resource without changing it in any way. This is the standard request; on most sites it returns the HTML+CSS of the page as a response.
 * POST: send data that creates/updates a resource, or triggers some process.
 * PUT: replace a resource with the data sent (or create it at that URL).
 * DELETE: remove a resource.
 * PATCH: apply a partial update to a resource.
 * ...

### Response
The format of the response usually depends on the functionality you are looking for:
 * a JSON document
 * an image
 * a video
 * an HTML page
 * ...

A response has a structure:
 * HEADER section: has the metadata of the communication with the server
 * BODY: has the actual information
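To make the two sections concrete, here is a minimal sketch that splits a raw HTTP/1.1 response by hand. The response text is invented for illustration; in the raw message the status code sits on the first line, just above the header fields, and a blank line separates the header section from the body.

```python
# An invented raw response, written the way it travels over the wire (HTTP/1.1)
RAW_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: application/json\r\n"
    "Content-Length: 16\r\n"
    "\r\n"
    '{"status": "ok"}'
)


def split_response(raw: str):
    """Return (status code, reason, headers dict, body) from a raw response string."""
    head, _, body = raw.partition("\r\n\r\n")       # a blank line ends the header section
    status_line, *header_lines = head.split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = dict(line.split(": ", 1) for line in header_lines)
    return int(code), reason, headers, body


code, reason, headers, body = split_response(RAW_RESPONSE)
print(code, reason)
print(headers)
print(body)
```

Libraries such as `requests` do this parsing for you and expose the result as attributes of the response object.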

A really important part of the HEADER is the **status code**, which is a number that the server reports to let you know how the processing of your request went:

- **2xx** successful: request received, understood and processed successfully
- **3xx** redirecting: more actions are needed to complete the request
- **4xx** client error: you (the client) made some kind of mistake
- **5xx** server error: I (the server) cannot complete your request due to some issue on my side

Complete list:    
https://en.wikipedia.org/wiki/List_of_HTTP_status_codes    
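As a rough sketch, the families above can be looked up from the first digit of the code, and the standard library's `http.HTTPStatus` supplies the official phrase for each known code. The function name `describe_status` and the `"other"` fallback are choices made for this example.

```python
from http import HTTPStatus

# First digit of the code -> family, following the list above
STATUS_CLASSES = {
    2: "successful",
    3: "redirecting",
    4: "client error",
    5: "server error",
}


def describe_status(code: int) -> str:
    """Build a short human-readable description of an HTTP status code."""
    family = STATUS_CLASSES.get(code // 100, "other")
    try:
        phrase = HTTPStatus(code).phrase   # e.g. "Not Found" for 404
    except ValueError:                     # code not defined in the standard library
        phrase = "unknown code"
    return f"{code} {phrase} ({family})"


for code in (200, 401, 500):
    print(describe_status(code))
```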


In [None]:
# requests library handles the HTTP methods and requests for Python
import requests

TFL = requests.get('https://api.tfl.gov.uk/AirQuality')
print("TFL:", TFL.status_code)

NBA = requests.get("https://api.sportsdata.io/api/nba/fantasy/json/CurrentSeason")
print("NBA:", NBA.status_code)

#rotten_tomato = requests.get("http://api.rottentomatoes.com/api/public/v1.0/lists/movies/box_office.json")
HTTPSTAT = requests.get('http://httpstat.us/500')
print("Server Error:", HTTPSTAT.status_code)

TFL: 200
NBA: 401
Server Error: 500


## API

**A**pplication **P**rogramming **I**nterface
* It's a "contract" between software products: your component gets to communicate with mine in this way. I can change the way I implement that request, but the interface stays the same.

## Let's go

### Requests in Python

In [None]:
TFL = requests.get('https://api.tfl.gov.uk/AirQuality')

In [None]:
TFL

<Response [200]>

In [None]:
TFL.headers

{'Date': 'Sat, 25 Nov 2023 11:37:53 GMT', 'Content-Type': 'application/json; charset=utf-8', 'Content-Length': '874', 'Connection': 'keep-alive', 'Cache-Control': 'public, must-revalidate, max-age=1800, s-maxage=3600', 'Via': '1.1 varnish', 'Content-Encoding': 'gzip', 'Accept-Ranges': 'bytes', 'Age': '42', 'Access-Control-Allow-Headers': 'Content-Type', 'Access-Control-Allow-Methods': 'GET,POST,PUT,DELETE,OPTIONS', 'Access-Control-Allow-Origin': '*', 'Api-Entity-Payload': 'AirQuality', 'X-Backend': 'api', 'X-Cache': 'HIT', 'X-Cache-Hits': '1', 'X-Cacheable': 'Yes. Cacheable', 'X-Frame-Options': 'deny', 'X-Proxy-Connection': 'unset', 'X-TTL': '3600.000', 'X-TTL-RULE': '0', 'X-Varnish': '2029655540 2029646208', 'X-AspNet-Version': '4.0.30319', 'X-Operation': 'AirQuality_Get', 'X-API': 'AirQuality', 'CF-Cache-Status': 'DYNAMIC', 'Strict-Transport-Security': 'max-age=31536000', 'Set-Cookie': '_cfuvid=fE8U.8E3V8NvWuSryVADZNSmVQHztF.AVPN7qVWAfAY-1700912273960-0-604800000; path=/; domain=.tfl

In [None]:
TFL.content

b'{"$id":"1","$type":"Tfl.Api.Presentation.Entities.LondonAirForecast, Tfl.Api.Presentation.Entities","updatePeriod":"hourly","updateFrequency":"1","forecastURL":"http://londonair.org.uk/forecast","disclaimerText":"This forecast is intended to provide information on expected pollution levels in areas of significant public exposure. It may not apply in very specific locations close to unusually strong or short-lived local sources of pollution.","currentForecast":[{"$id":"2","$type":"Tfl.Api.Presentation.Entities.CurrentForecast, Tfl.Api.Presentation.Entities","forecastType":"Current","forecastID":"43008","forecastBand":"Low","forecastSummary":"Low air pollution forecast valid from Saturday 25 November to end of Saturday 25 November GMT","nO2Band":"Low","o3Band":"Low","pM10Band":"Low","pM25Band":"Low","sO2Band":"Low","forecastText":"A cold day on Saturday, mostly clear and sunny. &lt;br/&gt;&lt;br/&gt;A high altitude Arctic air feed via the UK is forecast. This is expected to be relative

### JSON response

In [None]:
TFL.json()

{'$id': '1',
 '$type': 'Tfl.Api.Presentation.Entities.LondonAirForecast, Tfl.Api.Presentation.Entities',
 'updatePeriod': 'hourly',
 'updateFrequency': '1',
 'forecastURL': 'http://londonair.org.uk/forecast',
 'disclaimerText': 'This forecast is intended to provide information on expected pollution levels in areas of significant public exposure. It may not apply in very specific locations close to unusually strong or short-lived local sources of pollution.',
 'currentForecast': [{'$id': '2',
   '$type': 'Tfl.Api.Presentation.Entities.CurrentForecast, Tfl.Api.Presentation.Entities',
   'forecastType': 'Current',
   'forecastID': '43008',
   'forecastBand': 'Low',
   'forecastSummary': 'Low air pollution forecast valid from Saturday 25 November to end of Saturday 25 November GMT',
   'nO2Band': 'Low',
   'o3Band': 'Low',
   'pM10Band': 'Low',
   'pM25Band': 'Low',
   'sO2Band': 'Low',
   'forecastText': 'A cold day on Saturday, mostly clear and sunny. &lt;br/&gt;&lt;br/&gt;A high altitud

### Managing responses in pandas

In [None]:
import pandas as pd

In [None]:
weather_data = pd.DataFrame.from_dict(TFL.json())
weather_data.head()

| | $id | $type | updatePeriod | updateFrequency | forecastURL | disclaimerText | currentForecast |
|---|---|---|---|---|---|---|---|
| 0 | 1 | Tfl.Api.Presentation.Entities.LondonAirForecas... | hourly | 1 | http://londonair.org.uk/forecast | This forecast is intended to provide informati... | {'$id': '2', '$type': 'Tfl.Api.Presentation.En... |
| 1 | 1 | Tfl.Api.Presentation.Entities.LondonAirForecas... | hourly | 1 | http://londonair.org.uk/forecast | This forecast is intended to provide informati... | {'$id': '3', '$type': 'Tfl.Api.Presentation.En... |


In [None]:
# Not ideal: part of the response is still raw JSON, because the JSON is nested
weather_data['currentForecast'][1]

{'$id': '3',
 '$type': 'Tfl.Api.Presentation.Entities.CurrentForecast, Tfl.Api.Presentation.Entities',
 'forecastType': 'Future',
 'forecastID': '43009',
 'forecastBand': 'Low',
 'forecastSummary': 'Low air pollution forecast valid from Sunday 26 November to end of Sunday 26 November GMT',
 'nO2Band': 'Low',
 'o3Band': 'Low',
 'pM10Band': 'Low',
 'pM25Band': 'Low',
 'sO2Band': 'Low',
 'forecastText': 'Cloudy and cold on Sunday with some rain possible later in the night. &lt;br/&gt;&lt;br/&gt;A continuing Arctic air feed is expected, travelling from a lower altitude via the UK. Locally the breeze is forecast to be light and southerly or south-westerly. Air arriving over London and the southeast is not expected to contain any significant pollution, in addition the breeze should ensure some dispersion of local emissions.&lt;br/&gt;&lt;br/&gt;Air pollution is expected to remain &#39;Low&#39; throughout the forecast period for the following pollutants:&lt;br/&gt;&lt;br/&gt;Nitrogen Dioxide&

In [None]:
# There is a function in pandas to un-nest jsons, but it makes some assumptions and sometimes we have to unpack hierarchical structures ourselves
# beware this usually involves a lot of for loops and apply functions
pd.json_normalize(weather_data['currentForecast'])

| | $id | $type | forecastType | forecastID | forecastBand | forecastSummary | nO2Band | o3Band | pM10Band | pM25Band | sO2Band | forecastText |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2 | Tfl.Api.Presentation.Entities.CurrentForecast,... | Current | 42993 | Low | Low air pollution forecast valid from Friday 2... | Low | Low | Low | Low | Low | Friday will be mainly dry with periods of sunn... |
| 1 | 3 | Tfl.Api.Presentation.Entities.CurrentForecast,... | Future | 43008 | Low | Low air pollution forecast valid from Saturday... | Low | Low | Low | Low | Low | A cold day on Saturday, mostly clear and sunny... |
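When the automatic un-nesting does not give the shape you need, the structure can be unpacked by hand. The sketch below is one possible approach, not the only one: a small recursive `flatten` helper (a name chosen for this example) applied to a made-up payload that only loosely imitates the air-quality response. The `bands` sub-dictionary and its values are invented.

```python
def flatten(record: dict, parent: str = "", sep: str = ".") -> dict:
    """Turn nested dictionaries into one flat dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))   # recurse into nested dicts
        else:
            flat[name] = value
    return flat


# Hypothetical payload, shaped roughly like the response above (values invented)
payload = {
    "updatePeriod": "hourly",
    "currentForecast": [
        {"forecastType": "Current", "bands": {"nO2": "Low", "o3": "Low"}},
        {"forecastType": "Future", "bands": {"nO2": "Low", "o3": "Moderate"}},
    ],
}

# One flat row per forecast, repeating the top-level field on each row
rows = [
    {"updatePeriod": payload["updatePeriod"], **flatten(forecast)}
    for forecast in payload["currentForecast"]
]

for row in rows:
    print(row)
```

A list of flat dictionaries like `rows` can be passed straight to `pd.DataFrame(rows)` to get one column per key.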


### Parameters

In [None]:
r = requests.get('https://v2.jokeapi.dev/joke/programming')
r.json()


{'error': False,
 'category': 'Programming',
 'type': 'single',
 'flags': {'nsfw': False,
  'religious': False,
  'political': False,
  'racist': False,
  'sexist': False,
  'explicit': False},
 'id': 38,
 'safe': True,
 'lang': 'en'}

Sometimes we want to pass parameters to the endpoint, just like we pass arguments to functions in Python.  
We pass them in the URL by appending `?param1=value1&param2=value2...` at the end.

In [None]:
r = requests.get('https://v2.jokeapi.dev/joke/programming?contains=python&amount=3')
r.json()

{'error': False,
 'amount': 3,
 'jokes': [{'category': 'Programming',
   'type': 'twopart',
   'setup': 'Why did the Python programmer not respond to the foreign mails he got?',
   'delivery': 'Because his interpreter was busy collecting garbage.',
   'flags': {'nsfw': False,
    'religious': False,
    'political': False,
    'racist': False,
    'sexist': False,
    'explicit': False},
   'id': 15,
   'safe': True,
   'lang': 'en'},
  {'category': 'Programming',
   'type': 'twopart',
   'setup': 'Why did the Python data scientist get arrested at customs?',
   'delivery': 'She was caught trying to import pandas!',
   'flags': {'nsfw': False,
    'religious': False,
    'political': False,
    'racist': False,
    'sexist': False,
    'explicit': False},
   'id': 234,
   'safe': True,
   'lang': 'en'},
  {'category': 'Programming',
   'type': 'twopart',
   'setup': 'why do python programmers wear glasses?',
   'delivery': "Because they can't C.",
   'flags': {'nsfw': False,
    'religi

`requests` allows us to pass parameters as a dictionary, which is somewhat "cleaner"

In [None]:
params_dict = {"contains":"python","amount":"3"}

In [None]:
r = requests.get('https://v2.jokeapi.dev/joke/programming',params=params_dict)
r.json()

{'error': False,
 'amount': 3,
 'jokes': [{'category': 'Programming',
   'type': 'twopart',
   'setup': 'Why did the Python programmer not respond to the foreign mails he got?',
   'delivery': 'Because his interpreter was busy collecting garbage.',
   'flags': {'nsfw': False,
    'religious': False,
    'political': False,
    'racist': False,
    'sexist': False,
    'explicit': False},
   'id': 15,
   'safe': True,
   'lang': 'en'},
  {'category': 'Programming',
   'type': 'twopart',
   'setup': 'Why did the Python data scientist get arrested at customs?',
   'delivery': 'She was caught trying to import pandas!',
   'flags': {'nsfw': False,
    'religious': False,
    'political': False,
    'racist': False,
    'sexist': False,
    'explicit': False},
   'id': 234,
   'safe': True,
   'lang': 'en'},
  {'category': 'Programming',
   'type': 'twopart',
   'setup': 'why do python programmers wear glasses?',
   'delivery': "Because they can't C.",
   'flags': {'nsfw': False,
    'religi
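To see why the dictionary form and the hand-written URL are equivalent, the query string can be built explicitly with the standard library. This is only a sketch of the encoding step, using `urllib.parse.urlencode`; `requests` performs a comparable step internally when given `params`.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

base_url = "https://v2.jokeapi.dev/joke/programming"
params_dict = {"contains": "python", "amount": "3"}

# Encode the dictionary as key=value pairs joined by "&"
query_string = urlencode(params_dict)
full_url = f"{base_url}?{query_string}"
print(full_url)

# Going the other way: recover the parameters from the URL
print(parse_qs(urlsplit(full_url).query))

# Values containing spaces or reserved characters are escaped automatically,
# which is easy to get wrong when concatenating strings by hand
print(urlencode({"contains": "c++ & python"}))
```

Letting a library do the encoding is the main practical benefit of the dictionary form: a value containing `&` or `=` would otherwise silently break the query string.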

### Headers


HTTP headers are an important part of the request. They contain the metadata associated with the request and carry information about:

* The format and contents of the body
* Authorization credentials
* Caching and cookies
* Client information

In [None]:
url = "https://www.farfetch.com/pt/shopping/men/prada/items.aspx"
r = requests.get(url)
r.status_code

KeyboardInterrupt: ignored

In [None]:
headers= {'Accept-Encoding':'gzip, deflate',
          'Accept-Language':'en-US,en;q=0.9',
          'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.88 Safari/537.36'}

In [None]:
r = requests.get(url, headers = headers)
r.status_code

### Secrets and authentications

Secrets are often stored in local files or secrets managers. This way you never run the risk of uploading passwords, tokens or keys, while also not having to type them every time.

In [None]:
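# Read the key from a local file that is kept out of version control
with open("key_slack.txt") as key_file:
    # strip() removes the trailing newline, which would otherwise end up in the header value
    key = key_file.read().strip()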

In [None]:
# Never do this: displaying the key stores it in the notebook output
key
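One possible pattern, sketched here under assumptions, is to look for the secret in an environment variable first and fall back to a local file, and to show only a masked version when you need to check that it loaded. The helper names `load_secret` and `mask`, the variable name `DEMO_API_TOKEN` and the file name `key_demo.txt` are all hypothetical.

```python
import os
from pathlib import Path


def load_secret(env_var: str, fallback_file: str) -> str:
    """Read a secret from an environment variable, or from a local file if unset."""
    value = os.environ.get(env_var)
    if value is None:
        value = Path(fallback_file).read_text(encoding="utf-8")
    secret = value.strip()                 # drop surrounding whitespace and newlines
    if not secret:
        raise ValueError(f"No secret found in {env_var} or {fallback_file}")
    return secret


def mask(secret: str, visible: int = 4) -> str:
    """Hide all but the last few characters, for safe display."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]


# Demo with an obviously fake value; in real use the variable is set outside the notebook
os.environ["DEMO_API_TOKEN"] = "not-a-real-token\n"
token = load_secret("DEMO_API_TOKEN", "key_demo.txt")
print(mask(token))
```

Whichever storage you use, the file holding the secret should be listed in `.gitignore` so it never reaches the repository.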

Authentication credentials go in the headers. Different APIs have different authentication mechanisms, so check the documentation of the one you are using.

In [None]:
r= requests.get("https://slack.com/api/conversations.list/",
              headers={'content-type':'application/json',
                       'Authorization': f'Bearer {key}'})

In [None]:
r.json()

### POST

POST requests are similar to GET requests, but they ask the server to actually *do* something (create or update data, trigger a process), not just retrieve information

In [None]:
import requests

In [None]:
url = 'https://hooks.slack.com/services/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'

In [None]:
r= requests.post(url,
                 headers={'content-type':'application/json',
                 'Authorization': key},
                 json={'text':'Just taking a sneak peek 👀'})

In [None]:
r.content

b'ok'
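To see what a JSON POST carries without sending anything, the request can be assembled with the standard library and inspected. This is a sketch, not what `requests` does internally: the URL is a placeholder, and the payload is serialized by hand with `json.dumps`, which is roughly the work the `json=` argument saves you.

```python
import json
from urllib.request import Request

# Placeholder endpoint for illustration only; nothing is sent in this example
url = "https://example.com/hooks/placeholder"
payload = {"text": "Hello from Python"}

# The body of a JSON POST is the serialized payload, encoded as bytes
body = json.dumps(payload).encode("utf-8")

request = Request(
    url,
    data=body,
    headers={"Content-Type": "application/json"},  # tells the server how to read the body
    method="POST",
)

print(request.get_method())
print(request.get_header("Content-type"))  # urllib stores header names capitalized this way
print(request.data)
```

Unlike the query parameters of a GET, the data of a POST travels in the request body, so it does not appear in the URL.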

## Further materials

- [Public APIs](https://github.com/public-apis/public-apis)
- [RapidAPI](https://rapidapi.com/category/Sports)