In this notebook, we illustrate how to communicate with an HTTP server using only Python's built-in `socket` module.

HTTP is an application-layer protocol that runs on top of TCP/IP. This means that we can communicate with web servers using standard network sockets.

In Python, this functionality is provided in the `socket` module.

In [5]:
import socket

Next, we define the host and port we want to connect to. Typically, HTTP servers run on TCP port 80, and HTTPS servers on TCP port 443. Note that your browser also supports other port numbers: append the port to the host name, e.g. `http://example.org:8080`.

In [6]:
HOST = 'example.org'
PORT = 80
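
If you start from a full URL rather than a separate host and port, the standard library can extract both for you. The sketch below is one possible way to do this; the helper name `host_and_port` and the `DEFAULT_PORTS` table are made up for this illustration and are not part of the notebook.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes discussed above (illustrative table).
DEFAULT_PORTS = {'http': 80, 'https': 443}

def host_and_port(url):
    """Return (host, port) for a URL, falling back to the scheme's default port."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    return parts.hostname, port
```

For instance, `host_and_port('http://example.org:8080')` yields the explicit port, while `host_and_port('http://example.org')` falls back to port 80.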

Luckily, the operating system (through Python) handles DNS resolution for us, so `example.org` is mapped to an IP address behind the scenes.
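
To see what happens behind the scenes, you can ask for the address lookup yourself with `socket.getaddrinfo`. The following is a minimal sketch, restricted to IPv4 to match the `AF_INET` socket used in this notebook; the helper name `resolve_ipv4` is our own.

```python
import socket

def resolve_ipv4(host, port):
    """Return the distinct IPv4 addresses that `host` resolves to."""
    infos = socket.getaddrinfo(host, port,
                               family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    addresses = []
    for family, socktype, proto, canonname, sockaddr in infos:
        ip = sockaddr[0]  # sockaddr is an (ip, port) tuple for AF_INET
        if ip not in addresses:
            addresses.append(ip)
    return addresses
```

Calling `resolve_ipv4('example.org', 80)` performs a real DNS lookup, so the result depends on your network and on when you run it.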

We can then create our socket and send some data to it. Recall that an HTTP request consists of:
- A first line containing a request method, URL, and HTTP version
- A list of request headers. Note that `Host` is mandatory in HTTP/1.1
- A blank line
- An optional message body (not included here)

The request line and each header line end with a carriage return and line feed character, `\r\n` in Python strings.
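
The layout described above can also be assembled by a small helper instead of being typed out by hand. This is only a sketch for requests without a body; the function name `build_request` and its parameters are hypothetical.

```python
def build_request(method, path, host, headers=None):
    """Assemble a body-less HTTP/1.1 request as bytes."""
    lines = [f'{method} {path} HTTP/1.1',  # request line
             f'Host: {host}']              # mandatory header
    for name, value in (headers or {}).items():
        lines.append(f'{name}: {value}')
    # Lines are joined with CRLF; the extra CRLF pair produces the blank
    # line that ends the header section.
    return ('\r\n'.join(lines) + '\r\n\r\n').encode('ascii')
```

`build_request('GET', '/', 'example.org', {'User-Agent': 'Python 3'})` produces the same bytes as the request that is written out manually in the next cell.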

In [7]:
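# AF_INET selects IPv4, SOCK_STREAM selects TCP
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    sock.connect((HOST, PORT))
    # Request line, headers, and the blank line that ends the header section
    sock.sendall(b'GET / HTTP/1.1\r\n' +
                 b'Host: example.org\r\n' +
                 b'User-Agent: Python 3\r\n' +
                 b'\r\n')
    # A single recv() returns at most 10 KiB and may return less than the full response
    data = sock.recv(1024 * 10)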
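
A single `recv` call is enough for a small page like this one, but it is not guaranteed to return the whole response. One common approach, sketched below, is to keep reading until the server closes the connection. Note the assumption: this only terminates promptly if the server actually closes the connection, for example because the request included a `Connection: close` header, which the request above does not send. The helper name `recv_until_closed` is our own.

```python
import socket

def recv_until_closed(sock, chunk_size=4096):
    """Read from `sock` until the peer closes the connection."""
    chunks = []
    while True:
        chunk = sock.recv(chunk_size)
        if not chunk:  # an empty bytes object means the peer closed the connection
            break
        chunks.append(chunk)
    return b''.join(chunks)
```

With a `Connection: close` header added to the request, `data = recv_until_closed(sock)` could replace the single `sock.recv(...)` call.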

The data we get back comes in as bytes: even though HTTP itself is a textual protocol, `socket` communicates using raw bytes, since some other network protocols are binary. As such, we decode it (assuming a UTF-8 encoding) and show it.
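
Instead of decoding everything at once, the raw bytes can also be split into their parts first. The sketch below separates the status line, the headers, and the body at the first blank line. The function name `split_response` and the tiny `sample` response are invented for illustration; it also assumes the header section has been received completely.

```python
def split_response(raw):
    """Split a raw HTTP response into (status_line, headers, body)."""
    # The first blank line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition(b'\r\n\r\n')
    status_line, *header_lines = head.decode('iso-8859-1').split('\r\n')
    headers = {}
    for line in header_lines:
        # Split on the first colon only, since values such as dates contain colons.
        name, _, value = line.partition(':')
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body

# Hypothetical, shortened response used only to demonstrate the helper.
sample = (b'HTTP/1.1 200 OK\r\n'
          b'Content-Type: text/html; charset=UTF-8\r\n'
          b'Content-Length: 5\r\n'
          b'\r\n'
          b'hello')
status_line, headers, body = split_response(sample)
```

The body is left as bytes on purpose, so that it can be decoded with whatever charset the `Content-Type` header announces.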

Take a look at the output below. Can you recognize all the components of the HTTP reply message?

In [9]:
print(data.decode('utf-8'))

HTTP/1.1 200 OK
Age: 269188
Cache-Control: max-age=604800
Content-Type: text/html; charset=UTF-8
Date: Fri, 31 Jul 2020 11:06:59 GMT
Etag: "3147526947+ident"
Expires: Fri, 07 Aug 2020 11:06:59 GMT
Last-Modified: Thu, 17 Oct 2019 07:18:26 GMT
Server: ECS (dcb/7EEC)
Vary: Accept-Encoding
X-Cache: HIT
Content-Length: 1256

<!doctype html>
<html>
<head>
    <title>Example Domain</title>

    <meta charset="utf-8" />
    <meta http-equiv="Content-type" content="text/html; charset=utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1" />
    <style type="text/css">
    body {
        background-color: #f0f0f2;
        margin: 0;
        padding: 0;
        font-family: -apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;
        
    }
    div {
        width: 600px;
        margin: 5em auto;
        padding: 2em;
        background-color: #fdfdff;
        border-radius: 0.5em;
        b