# Denison CS-181/DA-210 Homework

---

## Raw HTTP Homework

In [19]:
import os
import sys
import json

def add_modules():
    """
    Starting at the current directory and proceeding up the file system
    tree, search for a directory named `modules`.  If found, and if not
    already there, add to the Python module search path.
    
    Params: None
    
    Return: None
    """
    directory = "."
    levels = 0
    while not os.path.isdir(os.path.join(directory, "modules")) and \
          levels < 5:
        directory = os.path.join(directory, "..")
        levels += 1
    module_path = os.path.abspath(os.path.join(directory, "modules"))
    if os.path.isdir(module_path):
        if module_path not in sys.path:
            sys.path.append(module_path)

add_modules()
import util
import mysocket as sock

### Socket Programming Requests

The first set of exercises is about *composing and making requests*.

**Q1** Suppose we wish to retrieve (GET) a file via HTTP (so port 80) from `datasystems.denison.edu`.  The resource path of the file is `/tabular/namesbyyear.csv`.  We wish to use version 1.1 of HTTP and to request that the connection be closed after a single request/reply exchange.  We will need a header line to satisfy the HTTP 1.1 requirement of a valid `Host` header.  Write a sequence of code to compose a valid HTTP request as a Python string, and assign the result to `message`.  This is entirely string manipulation in Python.
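One way to keep the structure of the request visible is to build it from its individual lines and join them with `\r\n`. The sketch below is an alternative formulation rather than the required answer; the name `request_lines` is introduced here only for illustration.

```python
# Each element is one line of the request, without its line terminator.
request_lines = [
    "GET /tabular/namesbyyear.csv HTTP/1.1",  # request line: method, path, version
    "Host: datasystems.denison.edu",          # Host header required by HTTP/1.1
    "Connection: close",                      # ask the server to close after replying
    "",                                       # empty line that ends the header section
    "",                                       # yields the final "\r\n" after joining
]

# Joining five items inserts four separators, so the result has four "\r\n".
message = "\r\n".join(request_lines)
print(repr(message))
```

Building the request this way makes it harder to drop the final empty line, which is the usual cause of a request that the server never answers.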

In [20]:
message = "GET /tabular/namesbyyear.csv HTTP/1.1\r\nHost: datasystems.denison.edu\r\nConnection: close\r\n\r\n"
print(message)
print("--------------------")

GET /tabular/namesbyyear.csv HTTP/1.1
Host: datasystems.denison.edu
Connection: close


--------------------


In [21]:
assert type(message) == str
assert message[:3] == "GET"
assert message[4:4+len("/tabular/namesbyyear.csv")] == "/tabular/namesbyyear.csv"
assert "Host: datasystems.denison.edu" in message
assert message.count('\r\n') == 4
assert message[-4:] == '\r\n\r\n'

**Q2** Write a sequence of code to establish a connection to the host `datasystems.denison.edu` at port 80, send the string `message` from the previous problem to the host, receive the reply until the server closes the connection (assigning it to `reply`), and close the connection.  Note: if the request is not completely correct, a network connection can wait forever for a reply that will never come.  So if you have difficulty here, double check your answer to the previous problem.
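The warning about waiting forever follows from how a receive-until-close loop works: it ends only when the peer closes its side, which `recv` signals by returning an empty bytes object. The sketch below illustrates such a loop with Python's standard `socket` module. It is not the implementation of the course's `mysocket` helpers, whose internals are not shown here, and the name `receive_till_close_sketch` is hypothetical. A local `socket.socketpair()` stands in for the network so the example runs offline.

```python
import socket

def receive_till_close_sketch(conn, bufsize=4096):
    """Read from conn until the peer closes, then return the decoded text."""
    chunks = []
    while True:
        data = conn.recv(bufsize)   # blocks until data arrives or the peer closes
        if data == b"":             # an empty result means the peer closed its side
            break
        chunks.append(data)
    return b"".join(chunks).decode("utf-8")

# Two connected local sockets play the roles of server and client.
server_side, client_side = socket.socketpair()
server_side.sendall(b"HTTP/1.1 200 OK\r\nConnection: close\r\n\r\nhello\n")
server_side.close()                 # without this close, the loop would block forever

reply_sketch = receive_till_close_sketch(client_side)
client_side.close()
print(repr(reply_sketch))
```

If the server never closes the connection, for example because it is still waiting for the empty line that ends the request, the `recv` call never returns an empty result and the loop does not terminate.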

In [22]:
connection = sock.makeConnection("datasystems.denison.edu", 80)
sock.sendString(connection, message)
reply = sock.receiveTillClose(connection)
connection.close()  # Q2 also asks that the client side of the connection be closed
print(reply)

HTTP/1.1 200 OK
Date: Thu, 22 Apr 2021 14:28:25 GMT
Server: Apache/2.4.6 (CentOS)
Last-Modified: Mon, 21 Dec 2020 11:47:51 GMT
ETag: "58-5b6f8071c67c0"
Accept-Ranges: bytes
Content-Length: 88
Connection: close
Content-Type: text/csv

,2014,2015,2016,2017,2018
Female,Emma,Emma,Emma,Emma,Emma
Male,Noah,Noah,Noah,Liam,Liam



In [23]:
assert type(reply) == str
assert "200 OK" in reply
assert "text/csv" in reply
assert reply.endswith("Noah,Liam,Liam\n")

**Q3** Suppose we want to generalize the scenario from the first exercise, where the two things that can change are the *host location* and the *resource path*.  For example, we might want to change the host to `httpbin.org` and the resource path to `/`, or many other combinations.  Write a function

    buildRequest(location, resource)
    
that constructs and returns a Python string containing a valid HTTP GET request that incorporates the parameters `location` and `resource` into the request at the appropriate places, and includes the appropriate header lines (for the required `Host` and to request the server close the connection after the exchange).

In [24]:
def buildRequest(location, resource):
    '''
    This function builds the request string with
    a location and the resource it needs to get.
    
    Parameters: location: the host of the request
                resource: what the user is requesting
    
    Return: the request message string.
    '''
    return "GET {} HTTP/1.1\r\nHost: {}\r\nConnection: close\r\n\r\n".format(resource, location)

print(buildRequest("datasystems.denison.edu", "/data/ind0.json"))
print("---------------------")

GET /data/ind0.json HTTP/1.1
Host: datasystems.denison.edu
Connection: close


---------------------


In [25]:
r1 = buildRequest("datasystems.denison.edu", "/data/ind0.json")
assert r1[:3] == "GET"
assert r1[4:4+len("/data/ind0.json")] == "/data/ind0.json"
assert "Host: datasystems.denison.edu" in r1
assert r1.count('\r\n') == 4
assert r1[-4:] == '\r\n\r\n'
r2 = buildRequest("httpbin.org", "/get")
assert r2[:3] == "GET"
assert r2[4:4+len("/get")] == "/get"
assert "Host: httpbin.org" in r2
assert r2.count('\r\n') == 4
assert r2[-4:] == '\r\n\r\n'

**Q4** Write a function

    makeRequest(location, resource)

that first constructs a valid HTTP GET request for `resource` at host `location`, as a Python string (using your function from the previous question), and then performs the request-reply steps: making the connection, sending the request string, receiving the reply until the connection closes, and finally closing the client side of the connection.  The function should return the reply.

In [26]:
def makeRequest(location, resource):
    '''
    This function makes a request by establishing a
    connection and sending a request message with
    a location and resource through it. Finally it
    will return the reply of the request.
    
    Parameters: location: the host of the request
                resource: what the user is requesting
                
    Return: reply: the reply of the message request
    '''
    message = buildRequest(location, resource)
    connection = sock.makeConnection(location, 80)
    sock.sendString(connection, message)
    reply = sock.receiveTillClose(connection)
    connection.close()
    return reply

print(makeRequest("datasystems.denison.edu", "/basic.html"))

HTTP/1.1 200 OK
Date: Thu, 22 Apr 2021 14:28:28 GMT
Server: Apache/2.4.6 (CentOS)
Accept-Ranges: bytes
Content-Length: 496
Connection: close
Content-Type: text/html; charset=UTF-8

<!DOCTYPE html>
<html lang="en">
  <head>
    <title>Data Systems Basic HTML Page</title>
  </head>
  <body>
    <h1>First Level Heading</h1>

    <p>Paragraph defined in <b>body</b>.

    <h2>Second Level Heading</h2>

    <a href="http://docs.python.org">Link</a> to Python documentation.
    </p>

    <ul>
      <li>Item 1
      <ol>
        <li>Item 1 nested</li>
        <li>Item 2 nested</li>
      </ol>
      </li>
      <li>Item 2</li>
      <li>Item 3</li>
    </ul>
  </body>
</html>



In [27]:
resp1 = makeRequest("datasystems.denison.edu", "/basic.html")
#print(resp1)
assert "200 OK" in resp1
assert "text/html" in resp1
assert resp1.endswith("</html>\n")

resp2 = makeRequest("datasystems.denison.edu", "/data/ind0.json")
#print(resp2)
assert "200 OK" in resp2
assert "application/json" in resp2
assert resp2.endswith("19485.4}}}")

resp3 = makeRequest("httpbin.org", "/get")
#print(resp3)
assert "200 OK" in resp3
assert "application/json" in resp3
assert resp3.endswith(""""url": "http://httpbin.org/get"\n}\n""")

### Programming Response Replies

The next set of exercises is about parsing the reply that results from a request.  An HTTP reply can be partitioned into a status line, a set of headers, and a body.  The exercises ask for functions that, given a reply, parse it and return each of these pieces.

**Q5** Write a function

    parseStatus(reply)

that finds and returns a Python string consisting of only the status line of a reply.  The returned value should include the line-terminating `"\r\n"`.
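A minimal sketch of one possible approach, run against a hard-coded sample reply rather than a live one: the status line ends at the first `\r\n`, so `str.find` locates the cut point. The `sample_reply` string is made up for illustration, and returning an empty string for a reply with no line terminator is an assumption, since the exercise does not say what to do in that case.

```python
def parseStatus(reply):
    """Return the status line of reply, including its terminating CRLF."""
    end = reply.find("\r\n")        # index of the first line terminator
    if end == -1:                   # assumption: malformed reply yields ""
        return ""
    return reply[:end + 2]          # +2 keeps the "\r\n" itself

# Hypothetical sample reply used only to exercise the function offline.
sample_reply = (
    "HTTP/1.1 404 Not Found\r\n"
    "Server: Apache\r\n"
    "Connection: close\r\n"
    "\r\n"
    "<html></html>\n"
)
print(repr(parseStatus(sample_reply)))
```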

In [None]:
# YOUR CODE HERE
raise NotImplementedError()
reply = makeRequest("datasystems.denison.edu", "/basic.html")
print(repr(parseStatus(reply)))
reply = makeRequest("datasystems.denison.edu", "/foobar.txt")
print(repr(parseStatus(reply)))

In [None]:
r1 = makeRequest("datasystems.denison.edu", "/basic.html")
s1 = parseStatus(r1)
assert s1 == "HTTP/1.1 200 OK\r\n"

r2 = makeRequest("datasystems.denison.edu", "/foobar.txt")
s2 = parseStatus(r2)
assert s2 == "HTTP/1.1 404 Not Found\r\n"

**Q6** Write a function

    parseHeaders(reply)

that finds and returns a single Python string that starts with the first header in the reply and continues through the last header, including the last header's line-terminating `"\r\n"` but *not* the empty line separating the headers from the body.
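One possible way to sketch this, again on a made-up `sample_reply`: the headers begin just after the first `\r\n` and end with the first half of the `\r\n\r\n` sequence that precedes the body. Returning an empty string when either marker is missing is an assumption, not something the exercise specifies.

```python
def parseHeaders(reply):
    """Return all header lines of reply, each with its CRLF, as one string."""
    first = reply.find("\r\n")          # end of the status line
    blank = reply.find("\r\n\r\n")      # last header's CRLF plus the empty line
    if first == -1 or blank == -1:      # assumption: malformed reply yields ""
        return ""
    # Start after the status line; stop after the last header's own CRLF.
    return reply[first + 2:blank + 2]

# Hypothetical sample reply used only to exercise the function offline.
sample_reply = (
    "HTTP/1.1 200 OK\r\n"
    "Server: Apache\r\n"
    "Connection: close\r\n"
    "\r\n"
    "<html></html>\n"
)
print(repr(parseHeaders(sample_reply)))
```

When a reply has no headers at all, both searches find the same position and the slice is empty, which matches the requested behavior.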

In [None]:
# YOUR CODE HERE
raise NotImplementedError()
reply = makeRequest("datasystems.denison.edu", "/basic.html")
print(repr(parseHeaders(reply)))
reply = makeRequest("datasystems.denison.edu", "/foobar.txt")
print(repr(parseHeaders(reply)))

In [None]:
r1 = makeRequest("datasystems.denison.edu", "/basic.html")
h1 = parseHeaders(r1)
assert "Server: Apache" in h1
assert "Connection: close\r\n" in h1
assert "Content-Type: text/html" in h1
r2 = makeRequest("datasystems.denison.edu", "/foobar.txt")
h2 = parseHeaders(r2)
assert "Server: Apache" in h2
assert "Connection: close\r\n" in h2
assert "Content-Type: text/html" in h2

**Q7** Write a function

    parseBody(reply)

that finds and returns a single Python string that starts with the beginning of the body (i.e. after the empty line of the reply) and continues to the end of the reply.
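A short sketch of one possible approach uses `str.partition`, which splits a string at the first occurrence of a separator. Splitting at `\r\n\r\n` leaves the body as the third piece. The `sample_reply` string is hypothetical, and treating a reply without the separator as having an empty body is an assumption.

```python
def parseBody(reply):
    """Return everything after the empty line that ends the headers."""
    # partition returns (before, separator, after); if the separator is
    # absent, the last two pieces are empty strings.
    _head, _separator, body = reply.partition("\r\n\r\n")
    return body

# Hypothetical sample reply used only to exercise the function offline.
sample_reply = (
    "HTTP/1.1 200 OK\r\n"
    "Server: Apache\r\n"
    "Connection: close\r\n"
    "\r\n"
    "<html></html>\n"
)
print(repr(parseBody(sample_reply)))
```

Because the split happens at the first `\r\n\r\n`, a body that itself contains blank lines is returned intact.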

In [None]:
# YOUR CODE HERE
raise NotImplementedError()
reply = makeRequest("datasystems.denison.edu", "/basic.html")
print(parseBody(reply))
reply = makeRequest("datasystems.denison.edu", "/foobar.txt")
print(parseBody(reply))

In [None]:
r1 = makeRequest("datasystems.denison.edu", "/basic.html")
b1 = parseBody(r1)
r2 = makeRequest("datasystems.denison.edu", "/foobar.txt")
b2 = parseBody(r2)
assert b1.startswith("<!DOCTYPE html>")
assert b1.endswith("</html>\n")
assert b2.startswith("<!DOCTYPE HTML")
assert b2.endswith("</body></html>\n")