# HTTP requests
This tutorial covers how to make requests over the HTTP protocol. 
For more information on related topics, see:
* <a href="https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol">Hypertext Transfer Protocol (HTTP)</a>
* <a href="https://en.wikipedia.org/wiki/JSON">JavaScript Object Notation</a>
* <a href="https://en.wikipedia.org/wiki/HTML">HyperText Markup Language (HTML)</a>

Keep in mind that this tutorial works only with static content; obtaining dynamic web content is not covered. If you need to deal with dynamic content, see <a href="http://selenium-python.readthedocs.io/">Selenium Python Bindings</a>.

## Get HTML page content
This section shows how to get an HTTP response with two different libraries:
* <a href="https://docs.python.org/3.4/library/urllib.html?highlight=urllib">urllib</a> (standard library in Python 3)
* <a href="http://docs.python-requests.org/en/master/">Requests</a> (installable through pip)

This tutorial mainly uses the Requests library, as the preferred option.

### urllib library
An example of how to get the static content of a web page with urllib follows:

In [2]:
from urllib.request import urlopen

r = urlopen('http://www.python.org/')
data = r.read()

print("Status code:", r.getcode())



Status code: 200


The variable `data` contains the returned HTML code (the full page) as `bytes`; call `data.decode()` to obtain a string. You can process it, save it, or do anything else you need.
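Because `urlopen` returns the body as raw bytes, it usually has to be decoded before it can be handled as text. The following is a minimal sketch, assuming the server announces its encoding in the `Content-Type` header. The helper name `decode_body` and the sample values are illustrative, and an `email.message.Message` object stands in for `r.headers` so the example runs offline.

```python
from email.message import Message


def decode_body(raw, headers, default="utf-8"):
    """Decode response bytes using the charset from Content-Type, if any."""
    charset = headers.get_content_charset() or default
    return raw.decode(charset, errors="replace")


# Stand-in for `r.headers` (hypothetical sample values, no network needed).
headers = Message()
headers["Content-Type"] = "text/html; charset=utf-8"

raw = "<h1>Café</h1>".encode("utf-8")
html = decode_body(raw, headers)
print(type(html).__name__, html)
```

With a real response from `urlopen`, the call would be `decode_body(r.read(), r.headers)`.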

### Requests
An example of how to get the static content of a web page with Requests follows.

In [5]:
import requests

r = requests.get("http://www.python.org/")
data = r.text

print("Status code:", r.status_code)

#print(data)

Status code: 200
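In practice a request can fail or hang, so it often helps to set a timeout and check the status before using the body. A minimal sketch, assuming a hypothetical helper named `fetch_html` and an arbitrary 5-second timeout:

```python
import requests


def fetch_html(url, timeout=5):
    """Return the page text, or None if the request fails."""
    try:
        r = requests.get(url, timeout=timeout)
        # Raise requests.HTTPError for 4xx/5xx status codes.
        r.raise_for_status()
    except requests.RequestException as exc:
        print("Request failed:", exc)
        return None
    return r.text


page = fetch_html("http://www.python.org/")
if page is not None:
    print("Downloaded", len(page), "characters")
```

Returning `None` on failure is only one possible design; letting the exception propagate may suit larger programs better.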


## Get JSON data from an API
This task is demonstrated on Open Notify - an open source project that provides a simple programming interface for some of NASA’s awesome data.

The examples below cover how to obtain the current position of the ISS. With the Requests library, JSON can be fetched from the API in the same way as HTML data.

In [4]:
import requests

r = requests.get("http://api.open-notify.org/iss-now.json")
obj = r.json()

print(obj)
print(obj["timestamp"])

{'timestamp': 1521447874, 'iss_position': {'latitude': '-43.3744', 'longitude': '32.0371'}, 'message': 'success'}
1521447874


The Requests method `json()` converts the JSON response to a Python dictionary, so individual fields can be read by key, for example `obj["timestamp"]`.
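Nested values are read with ordinary dictionary indexing. The sketch below reuses the sample response printed above as a string and parses it with the standard `json` module, so it runs without network access. Note that this API returns the coordinates as strings, so they are converted with `float()`. The variable names are illustrative.

```python
import json
from datetime import datetime, timezone

# Sample response copied from the output above.
sample = (
    '{"timestamp": 1521447874, '
    '"iss_position": {"latitude": "-43.3744", "longitude": "32.0371"}, '
    '"message": "success"}'
)
obj = json.loads(sample)

# Coordinates arrive as strings; convert them to numbers.
lat = float(obj["iss_position"]["latitude"])
lon = float(obj["iss_position"]["longitude"])

# The timestamp is in seconds since the Unix epoch.
when = datetime.fromtimestamp(obj["timestamp"], tz=timezone.utc)

print(lat, lon, when.isoformat())
```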

## Persistent session with Requests
Sessions in Requests are handy when you need to reuse the same cookies (session cookies, for example) or authentication across multiple requests.

In [4]:
s = requests.Session()
print("No cookies on start: ")
print(dict(s.cookies))
r = s.get('http://google.cz/')
print("\nA cookie from google: ")
print(dict(s.cookies))
r = s.get('http://google.cz/?q=cat')
print("\nThe cookie is persistent:")
print(dict(s.cookies))

No cookies on start: 
{}

A cookie from google: 
{'NID': '116=dIAN97p8g8djMXP_ncUDZcPoSVQXk-d6O7yH-d6gf8hQ0sEXqEAmGL71-tVVVr81tf4ITj3gzmcZgx21i99wjGzZg_Ndh3cyo9YGtTYXdTp5Z1vE-9VzxTeekGhGo6YV', '1P_JAR': '2017-11-06-10'}

The cookie is persistent:
{'NID': '116=dIAN97p8g8djMXP_ncUDZcPoSVQXk-d6O7yH-d6gf8hQ0sEXqEAmGL71-tVVVr81tf4ITj3gzmcZgx21i99wjGzZg_Ndh3cyo9YGtTYXdTp5Z1vE-9VzxTeekGhGo6YV', '1P_JAR': '2017-11-06-10'}


Compare the output of the code above with the example below, which does not use a session.

In [5]:
r = requests.get('http://google.cz/')
print("\nA cookie from google: ")
print(dict(r.cookies))
r = requests.get('http://google.cz/?q=cat')
print("\nDifferent cookie:")
print(dict(r.cookies))


A cookie from google: 
{'NID': '116=AL_qYgkRmxS0BRyEI64-gmCjAiVB-IHjQuWYET8nQ9bLdC8An8yYFG3hyyEMPhetp-dXB_ocrQOb9coQqc08p3SZjVw6AYIKPdbyj-1UNTzaGDzoc9aVP_KYpAd1L2kW', '1P_JAR': '2017-11-06-10'}

Different cookie:
{'NID': '116=dIgFVSuz1iLr2Yr0HzQvOE7zoY2OCfow-KTJ2IStINmLj2tTSScw2Z5KZEzGYUO5rXAIu1_kTBrbjMmudn9zULJ__WH80OYurwP-1SQf8xnBtZDxC7hCS8dl52SvPe8n', '1P_JAR': '2017-11-06-10'}
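A session can also carry authentication and manually set cookies. The sketch below does not send anything: it uses `Session.prepare_request` to show what a session would attach to a request. The URL, credentials, and cookie are made-up placeholders.

```python
import requests

s = requests.Session()
s.auth = ("user", "secret")        # placeholder credentials
s.cookies.set("token", "abc123")   # placeholder cookie

# Build the request without sending it, to inspect what would go out.
req = requests.Request("GET", "http://example.com/profile")
prepared = s.prepare_request(req)

print(prepared.headers["Authorization"])
print(prepared.headers["Cookie"])
```

Every request made through `s` would reuse the same credentials and cookie, which is what distinguishes it from separate `requests.get` calls.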


## Custom headers
The headers of a response are easy to check; an example follows.

In [6]:
r = requests.get("http://www.python.org/")
print(r.headers)

{'Server': 'nginx', 'Content-Type': 'text/html; charset=utf-8', 'X-Frame-Options': 'SAMEORIGIN', 'x-xss-protection': '1; mode=block', 'X-Clacks-Overhead': 'GNU Terry Pratchett', 'Via': '1.1 varnish, 1.1 varnish', 'Content-Length': '48872', 'Accept-Ranges': 'bytes', 'Date': 'Mon, 19 Mar 2018 08:39:49 GMT', 'Age': '1507', 'Connection': 'keep-alive', 'X-Served-By': 'cache-iad2135-IAD, cache-ams4151-AMS', 'X-Cache': 'HIT, HIT', 'X-Cache-Hits': '3, 4', 'X-Timer': 'S1521448790.736767,VS0,VE0', 'Vary': 'Cookie', 'Strict-Transport-Security': 'max-age=63072000; includeSubDomains'}
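`r.headers` behaves like a dictionary whose keys are matched case-insensitively, so a single header can be read without worrying about capitalization. A small offline sketch, using `requests.structures.CaseInsensitiveDict` (the class Requests uses for headers) filled with two values copied from the output above:

```python
from requests.structures import CaseInsensitiveDict

headers = CaseInsensitiveDict({
    "Server": "nginx",
    "Content-Type": "text/html; charset=utf-8",
})

# Lookup works regardless of the capitalization of the key.
content_type = headers["content-type"]
# .get() returns a fallback when the header is absent.
missing = headers.get("X-Missing", "not present")

print(content_type)
print(missing)
```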


The request headers can be modified in a simple way, as follows.

In [7]:
headers = {
    "Accept": "text/plain",
}

r = requests.get("http://www.python.org/", headers=headers)
print(r.status_code)

200
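To see which headers would actually be sent, the request can be prepared without sending it. This sketch assumes the same `Accept` header as above and relies on `Session.prepare_request`, which merges custom headers with the Requests defaults:

```python
import requests

headers = {"Accept": "text/plain"}

s = requests.Session()
req = requests.Request("GET", "http://www.python.org/", headers=headers)
# Merge the custom headers with the session defaults; nothing is sent.
prepared = s.prepare_request(req)

for name, value in prepared.headers.items():
    print(name + ":", value)
```

The custom `Accept` value replaces the default one, while defaults such as `User-Agent` remain in place.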


More information about HTTP headers can be found on the <a href="https://en.wikipedia.org/wiki/List_of_HTTP_header_fields">List of HTTP header fields Wikipedia page</a>.