# HTTP Request

This notebook presents methods for fetching resources over the Internet using the Python Standard Library (PSL) and third-party packages.

In [1]:
import json
import hashlib
import base64

## Trial Resource

We will use the `echo` service of the [Brussels Air Quality API][1] for testing purposes. The URI of the resource is given below:

[1]: https://airqualitydata.environnement.brussels/docs/

In [2]:
uri = 'https://airqualitydata.environnement.brussels/echo?headers&request'

This URI answers client requests with a JSON document.

## PSL `urllib`

Fetching resources over the Internet is built into Python [2.7][1] and [3.x][2]; everything lives in the `urllib` package. However, the package underwent a major refactoring between those versions: *namespaces*, *objects* and *functionalities* differ.

We present here [`urllib`][2] running on Python 3.5. Importing the `request` namespace is enough to build an [MCVE][3]:

[1]: https://docs.python.org/2/library/urllib.html
[2]: https://docs.python.org/3/library/urllib.html
[3]: https://stackoverflow.com/help/mcve

In [17]:
from urllib import request, parse

### Make a simple `GET` Request

Create a request object by passing the URL (the `GET` method is used by default):

In [4]:
req = request.Request(url=uri)

We can add headers to our request, if needed:

In [5]:
req.add_header('User-Agent', 'awesome fetcher')
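The request object can be inspected before anything is sent. The following is a minimal sketch that rebuilds the same request and reads back its method and headers; note that `urllib` stores header names in capitalized form (`User-agent`), so lookups on the request object should use that spelling.

```python
from urllib import request

# Same URI as in this notebook; nothing is sent, the request is only built.
uri = 'https://airqualitydata.environnement.brussels/echo?headers&request'
req = request.Request(url=uri)
req.add_header('User-Agent', 'awesome fetcher')

method = req.get_method()             # no body attached, so the method is GET
stored = req.header_items()           # list of (name, value) tuples
agent = req.get_header('User-agent')  # lookup uses the capitalized name
print(method, stored, agent)
```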

### Fetch the Answer

We use the [`urlopen`][1] function to actually fetch the resource. It also works as a context manager:

[1]: https://docs.python.org/3/library/urllib.request.html#urllib.request.urlopen

In [6]:
with request.urlopen(req) as ans:
    raw = ans.read()

Everything exchanged over the Internet is `bytes`, not `str`, so we will need to decode the payload using the proper charset (by default we will assume `utf-8`). The raw payload looks like this:

In [7]:
raw

b'{"arguments":{"headers":"","request":""},"headers":{"accept-encoding":"identity","connection":"close","host":"airqualitydata.environnement.brussels","user-agent":"awesome fetcher","x-forwarded-for":"10.0.1.4","x-real-ip":"10.0.1.4"},"message":"Echo"}\n'

In [8]:
type(raw)

bytes
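Choosing the charset can be sketched as follows. The helper name `decode_body` is hypothetical; it reads the `charset` parameter of a `Content-Type` value when one is present and falls back to `utf-8` otherwise, which is the assumption made in this notebook.

```python
from email.message import Message


def decode_body(body, content_type=None, default='utf-8'):
    """Decode bytes using the charset declared in a Content-Type value.

    Hypothetical helper: falls back to `default` when no charset is declared.
    """
    msg = Message()
    if content_type:
        msg['Content-Type'] = content_type
    charset = msg.get_content_charset() or default
    return body.decode(charset)


# No charset declared: utf-8 is assumed.
plain = decode_body(b'{"message":"Echo"}', 'application/json')

# Charset declared explicitly by the (simulated) server.
latin = decode_body('café'.encode('latin-1'), 'text/plain; charset=ISO-8859-1')
print(plain, latin)
```

With a real answer, the same information should be available through `ans.headers.get_content_charset()`, since the headers object derives from `email.message.Message`.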

### Process the Answer

#### Status Code

Once the request has been processed, we can check the status of the answer:

In [9]:
ans.status

200

If we get a `2xx` code (see [list][1]), the request succeeded and there may be a resource to parse.
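As an illustration, the status check can be wrapped in small helpers. This is only a sketch: `is_success` and `describe` are hypothetical names, and the reason phrases come from the standard `http.HTTPStatus` enumeration.

```python
from http import HTTPStatus


def is_success(status):
    """Return True for any 2xx status code."""
    return 200 <= status < 300


def describe(status):
    """Return 'code phrase' for registered codes, or flag unknown ones."""
    try:
        return '{} {}'.format(status, HTTPStatus(status).phrase)
    except ValueError:
        return '{} (unregistered status)'.format(status)


for code in (200, 404, 299):
    print(describe(code), is_success(code))
```

Note that `urlopen` raises `urllib.error.HTTPError` for error answers such as `404` or `500`, so in practice those codes are usually seen in an `except` clause rather than on the answer object.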

#### Headers

We can check the answer headers to learn more about the resource we just got:

[1]: https://en.wikipedia.org/wiki/List_of_HTTP_status_codes

In [10]:
ans.headers

<http.client.HTTPMessage at 0x7f6428035358>

Answer headers are a collection of key/value pairs, so we can directly create a dictionary from the object:

In [11]:
headers = dict(ans.headers)
headers

{'Access-Control-Allow-Headers': 'Content-Type',
 'Access-Control-Allow-Methods': 'GET, POST',
 'Access-Control-Allow-Origin': '*',
 'Cache-Control': 'no-cache',
 'Connection': 'close',
 'Content-Length': '251',
 'Content-MD5': 'Aws7Ueijc0X115cltPdhhQ==',
 'Content-SHA1': 'tJEMnD5Whq+GSWkJxpPWTnyoWgw=',
 'Content-SHA256': 'cHgoo1blMFJokRcAswGF+sn+BipBVpv2vZ7Fy1pelmQ=',
 'Content-Type': 'application/json',
 'Content-UUID4': '04249e0ed23a4752ae845490e268918c',
 'Date': 'Sun, 04 Feb 2018 18:24:30 GMT',
 'ETag': '"030b3b51e8a37345f5d79725b4f76185"',
 'Expires': 'Sun, 04 Feb 2018 18:32:27 GMT',
 'Last-Modified': 'Sun, 04 Feb 2018 18:32:27 GMT',
 'Server': 'nginx/1.8.1'}
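One difference is worth keeping in mind when converting: the headers object looks names up case-insensitively, while a plain `dict` does not. The sketch below simulates this offline with an `email.message.Message`, the base class of `http.client.HTTPMessage`; the two header values are copied from the output above.

```python
from email.message import Message

# Stand-in for ans.headers, filled with two values from the output above.
msg = Message()
msg['Content-Type'] = 'application/json'
msg['Content-Length'] = '251'

as_dict = dict(msg)

# The message object ignores case when looking up header names...
from_message = msg['content-type']
# ...whereas the plain dictionary only knows the exact spelling.
from_dict = as_dict.get('content-type')
print(from_message, from_dict)
```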

#### Checksums

We can check whether digests match (e.g. `SHA-256` in `Base64` format):

In [12]:
sha256 = base64.b64encode(hashlib.sha256(raw).digest()).decode()
sha256

'cHgoo1blMFJokRcAswGF+sn+BipBVpv2vZ7Fy1pelmQ='

The digests match, so the resource is unaltered:

In [13]:
sha256 == headers['Content-SHA256']

True
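The same comparison can be generalized to the other digest headers listed above. This is a sketch under the assumption that `Content-MD5`, `Content-SHA1` and `Content-SHA256` all carry the Base64-encoded digest of the body, as this service appears to do; these are not standard headers on every server, and the helper names are hypothetical.

```python
import base64
import hashlib

# Assumed mapping between this service's headers and hashlib algorithm names.
DIGEST_HEADERS = {
    'Content-MD5': 'md5',
    'Content-SHA1': 'sha1',
    'Content-SHA256': 'sha256',
}


def b64_digest(body, algorithm):
    """Return the Base64-encoded digest of `body` as a str."""
    digest = hashlib.new(algorithm, body).digest()
    return base64.b64encode(digest).decode('ascii')


def check_digests(body, headers):
    """Return {header name: bool} for every known digest header present."""
    return {
        name: b64_digest(body, algorithm) == headers[name]
        for name, algorithm in DIGEST_HEADERS.items()
        if name in headers
    }


# Offline demonstration with a made-up body and matching headers.
body = b'{"message":"Echo"}\n'
demo_headers = {name: b64_digest(body, algo) for name, algo in DIGEST_HEADERS.items()}
print(check_digests(body, demo_headers))
print(check_digests(body + b'tampered', demo_headers))
```

With the objects from this notebook, the call would be `check_digests(raw, headers)`.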

#### Parse Data 

We have fetched a JSON document, so we can parse its content:

 - First, we decode the `bytes` into a `str`;
 - Second, we interpret the `str` with the JSON parser, which gives a Python `dict` holding the resource.

In [14]:
data = json.loads(raw.decode())
data

{'arguments': {'headers': '', 'request': ''},
 'headers': {'accept-encoding': 'identity',
  'connection': 'close',
  'host': 'airqualitydata.environnement.brussels',
  'user-agent': 'awesome fetcher',
  'x-forwarded-for': '10.0.1.4',
  'x-real-ip': '10.0.1.4'},
 'message': 'Echo'}

In summary, we have:

 1. Created and configured a Request object;
 2. Fetched the corresponding resource over the Internet;
 3. Checked the resource integrity;
 4. Decoded the received bytes and parsed them into the correct format.
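These steps can be gathered into a single function. The sketch below uses a hypothetical helper `fetch_json` and leaves out the integrity check, which depends on service-specific headers. To stay runnable offline, the demonstration fetches a `data:` URL, which `urllib` also handles, instead of the real service.

```python
import json
from urllib import parse, request


def fetch_json(url, headers=None, default_charset='utf-8'):
    """Build a request, fetch it, decode the bytes and parse the JSON."""
    req = request.Request(url=url, headers=headers or {})
    with request.urlopen(req) as ans:
        raw = ans.read()
        # Use the declared charset when there is one, else the default.
        charset = ans.headers.get_content_charset() or default_charset
    return json.loads(raw.decode(charset))


# Offline demonstration: the payload is embedded in a data: URL.
payload = json.dumps({'message': 'Echo'})
demo_url = 'data:application/json;charset=utf-8,' + parse.quote(payload)
result = fetch_json(demo_url, headers={'User-Agent': 'awesome fetcher'})
print(result)
```

Called with this notebook's `uri`, the function would be expected to return a dictionary similar to `data` above.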

### POST Request

In [15]:
struct = {'password': '***', 'username': 'jlandercy'}

In [19]:
params = parse.urlencode(struct)
params

'password=%2A%2A%2A&username=jlandercy'

In [21]:
bdata = json.dumps(struct).encode()
bdata

b'{"password": "***", "username": "jlandercy"}'

In [32]:
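# Passing a body as the second argument (data) turns the request into a POST
req = request.Request(uri, bdata)
req.add_header('Content-Type', 'application/json')
# Header values should be strings (urllib would also set Content-Length itself)
req.add_header('Content-Length', str(len(bdata)))
req.get_method()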

'POST'

In [33]:
with request.urlopen(req) as rep:
    data = json.loads(rep.read().decode())
data

{'arguments': {'headers': '', 'request': ''},
 'headers': {'accept-encoding': 'identity',
  'connection': 'close',
  'content-length': '44',
  'content-type': 'application/json',
  'host': 'airqualitydata.environnement.brussels',
  'user-agent': 'Python-urllib/3.5',
  'x-forwarded-for': '10.0.1.4',
  'x-real-ip': '10.0.1.4'},
 'message': 'Echo',
 'request': {'password': '***', 'username': 'jlandercy'}}
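The `params` string built earlier is not sent in this walkthrough. As a sketch, a form-encoded body could be attached in the same way as the JSON one; the request below is only constructed, not sent, and the `Content-Type` value is the usual one for HTML form submissions.

```python
from urllib import parse, request

uri = 'https://airqualitydata.environnement.brussels/echo?headers&request'
struct = {'password': '***', 'username': 'jlandercy'}

# The body must be bytes, so the URL-encoded string is encoded to ASCII.
body = parse.urlencode(struct).encode('ascii')

form_req = request.Request(
    uri,
    data=body,
    headers={'Content-Type': 'application/x-www-form-urlencoded'},
)

method = form_req.get_method()           # a body is attached, so POST
decoded = parse.parse_qs(body.decode())  # round trip back to a dict of lists
print(method, body, decoded)
```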

## Third Party Request

The third-party `requests` package is documented at http://docs.python-requests.org/en/master/