# CMPE 148 - Lecture 03 In-Class Exercise
# The Application Layer: HTTP, DNS, Sockets & More

**Course:** CMPE-148-01 Computer Networks I, Spring 2026  
**Instructor:** Andrew Bond  
**Reference:** Kurose & Ross, *Computer Networking*, 8th Edition, Chapter 2

---

### Overview (~45 minutes)

| Part | Topic | Time |
|------|-------|------|
| 1 | HTTP Requests & Responses | ~12 min |
| 2 | DNS Resolution | ~10 min |
| 3 | Socket Programming (UDP & TCP) | ~15 min |
| 4 | Web Caching Calculations | ~8 min |

**Instructions:** Work through each section in order. Run every code cell, answer the inline questions in the provided markdown cells, and complete the `# TODO` exercises. Raise your hand if you get stuck!

---
## Part 1: HTTP Requests & Responses (~12 min)

Recall from lecture:
- HTTP is a **client/server** application-layer protocol running on top of **TCP (port 80/443)**
- HTTP is **stateless**: the server maintains no information about past requests
- An HTTP message has a **request line** (or status line), **header lines**, and an optional **body**
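To make the message layout concrete, here is a minimal sketch that builds a request by hand and splits a response into its three parts. The helper names `build_get_request` and `parse_response` and the sample response bytes are illustrative assumptions, not something from the slides.

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request: request line, headers, blank line."""
    lines = [
        f"GET {path} HTTP/1.1",   # request line: method, path, version
        f"Host: {host}",          # Host header is required in HTTP/1.1
        "Connection: close",
        "",                       # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw: bytes):
    """Split a raw response into (status_line, headers dict, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return status_line, headers, body


# Hypothetical sample response, used only for illustration
sample = (b"HTTP/1.1 200 OK\r\n"
          b"Content-Type: text/html; charset=UTF-8\r\n"
          b"Connection: Keep-Alive\r\n"
          b"\r\n"
          b"<html></html>")

request = build_get_request("gaia.cs.umass.edu", "/kurose_ross/interactive/index.php")
status_line, headers, body = parse_response(sample)
print(request.decode("ascii"))
print(status_line, headers, body)
```

The request ends with an empty line because that is how the receiver knows the header section is finished; a GET normally has no body after it.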

### 1.1 - Making a raw HTTP GET request

On the slides we saw how you could use `telnet` to manually send an HTTP request. Here we'll do the same thing in Python using the `requests` library, then peek at the raw data.

In [2]:
import requests

# Send a GET request (just like your browser does)
response = requests.get("http://gaia.cs.umass.edu/kurose_ross/interactive/index.php")

# --- Status line ---
print("=== STATUS ===")
print(f"Status Code : {response.status_code}")
print(f"Reason      : {response.reason}")
print(f"HTTP Version: {response.raw.version}" if hasattr(response.raw, 'version') else "")
print()

# --- Response headers ---
print("=== RESPONSE HEADERS ===")
for key, value in response.headers.items():
    print(f"  {key}: {value}")
print()

# --- First 500 chars of the body ---
print("=== BODY (first 500 chars) ===")
print(response.text[:500])

=== STATUS ===
Status Code : 200
Reason      : OK
HTTP Version: 11

=== RESPONSE HEADERS ===
  Date: Wed, 11 Feb 2026 05:43:10 GMT
  Server: Apache/2.4.62 (AlmaLinux) OpenSSL/3.5.1 mod_fcgid/2.3.9 mod_perl/2.0.12 Perl/v5.32.1
  X-Powered-By: PHP/8.0.30
  Set-Cookie: DevMode=0
  Keep-Alive: timeout=5, max=100
  Connection: Keep-Alive
  Transfer-Encoding: chunked
  Content-Type: text/html; charset=UTF-8

=== BODY (first 500 chars) ===

<!DOCTYPE HTML>
<html>

  

  <head>
    <title>Interactive Problems, Computer Networking: A Top Down Approach</title>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <link href="https://stackpath.bootstrapcdn.com/bootswatch/4.5.0/lux/bootstrap.min.css" rel="stylesheet" type="text/css"/>
    <link href="custom.css" rel="stylesheet" type="text/css"/>
  </head>

  <body>
    <!-- Required scripts for bootstrap to function -->
    <script t


### ✏️ Question 1.1

Look at the output above and answer:

1. What **status code** did the server return? What does it mean?
2. What **Content-Type** header did the server send back?
3. Is there a **Connection** header? What does its value tell you about persistent vs. non-persistent HTTP?

> *Double-click this cell to type your answers below:*
>
> 1. It returned 200, which means OK.
> 2. The content type header returned was text/html; charset=UTF-8.
> 3. The connection header is Keep-Alive, which allows for a persistent HTTP connection to make many requests.

### 1.2 - Exploring HTTP status codes

From the slides: `200 OK`, `301 Moved Permanently`, `400 Bad Request`, `404 Not Found`, `505 HTTP Version Not Supported`. Let's trigger a few of these.

In [5]:
# Let's request several URLs and observe the status codes
test_urls = [
    "http://httpbin.org/status/200",      # Should return 200
    "http://httpbin.org/status/301",      # Should return 301
    "http://httpbin.org/status/404",      # Should return 404
    "http://httpbin.org/status/500",      # Should return 500
]

for url in test_urls:
    resp = requests.get(url, allow_redirects=False, timeout=5)
    print(f"{resp.status_code} {resp.reason:25s} ← {url}")

200 OK                        ← http://httpbin.org/status/200
301 MOVED PERMANENTLY         ← http://httpbin.org/status/301
404 NOT FOUND                 ← http://httpbin.org/status/404
500 INTERNAL SERVER ERROR     ← http://httpbin.org/status/500
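
The first digit of a status code gives its class. As a small sketch (the helper `status_class` is a made-up name for illustration), the standard-library `http.HTTPStatus` enum can supply the reason phrases for the codes listed on the slides:

```python
from http import HTTPStatus


def status_class(code: int) -> str:
    """Map a status code to its class using the first digit."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes.get(code // 100, "unknown")


for code in (200, 301, 400, 404, 505):
    print(code, HTTPStatus(code).phrase, "->", status_class(code))
```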


### 1.3 - Request headers & the Conditional GET

Recall the **Conditional GET**: the client sends `If-Modified-Since` so the server can respond with `304 Not Modified` (saving bandwidth). Let's see this in action.
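
The server-side decision can be sketched roughly as below. This is a simplified model, assuming the server only compares timestamps; `conditional_get_status` is a hypothetical helper, and real servers also consider ETags and other validators.

```python
from email.utils import parsedate_to_datetime
from typing import Optional


def conditional_get_status(last_modified: str, if_modified_since: Optional[str]) -> int:
    """Return 304 if the resource has not changed since the client's copy, else 200."""
    if if_modified_since is None:
        return 200  # plain GET: always send the full object
    resource_time = parsedate_to_datetime(last_modified)
    cached_time = parsedate_to_datetime(if_modified_since)
    # Unchanged since the client's copy: no body needs to be sent
    return 304 if resource_time <= cached_time else 200


last_mod = "Wed, 11 Feb 2026 05:41:29 GMT"
print(conditional_get_status(last_mod, None))
print(conditional_get_status(last_mod, last_mod))
print(conditional_get_status(last_mod, "Tue, 10 Feb 2026 05:41:29 GMT"))
```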

In [7]:
# Step 1: Make a normal GET and note the Last-Modified header
# resp1 = requests.get("http://httpbin.org/cache/300") # Does not work the way it should
resp1 = requests.get("http://vta.org") # Using VTA homepage as an example that likely has caching headers
print("First request:")
print(f"  Status: {resp1.status_code}")
last_modified = resp1.headers.get("Last-Modified", "(not present)")
etag = resp1.headers.get("ETag", "(not present)")
print(f"  Last-Modified: {last_modified}")
print(f"  ETag: {etag}")
print(f"  Body length: {len(resp1.content)} bytes")
print()

# Step 2: Make a conditional GET using If-None-Match (ETag-based)
if etag != "(not present)":
    resp2 = requests.get("http://vta.org", headers={"If-None-Match": etag})
    print("Conditional GET (If-None-Match):")
    print(f"  Status: {resp2.status_code} {resp2.reason}")
    print(f"  Body length: {len(resp2.content)} bytes")
    print()
    print("→ Notice: 304 means the server sent NO body - bandwidth saved!")
else:
    print("Server did not provide an ETag. Try the If-Modified-Since approach instead.")

# Step 3: Make a conditional GET using If-Modified-Since (Last-Modified-based)
if last_modified != "(not present)":
    resp3 = requests.get("http://vta.org", headers={"If-Modified-Since": last_modified})
    print("Conditional GET (If-Modified-Since):")
    print(f"  Status: {resp3.status_code} {resp3.reason}")
    print(f"  Body length: {len(resp3.content)} bytes")
    print()
    print("→ Notice: 304 means the server sent NO body - bandwidth saved!")
else:
    print("Server did not provide a Last-Modified header. Try the If-None-Match approach instead.")

First request:
  Status: 200
  Last-Modified: Wed, 11 Feb 2026 05:41:29 GMT
  ETag: "1770788489-gzip"
  Body length: 133806 bytes

Conditional GET (If-None-Match):
  Status: 304 Not Modified
  Body length: 0 bytes

→ Notice: 304 means the server sent NO body - bandwidth saved!
Conditional GET (If-Modified-Since):
  Status: 304 Not Modified
  Body length: 0 bytes

→ Notice: 304 means the server sent NO body - bandwidth saved!


### ✏️ Question 1.2

1. In the Conditional GET above, how many bytes were in the body of the `304` response vs. the original `200` response?
2. Why is this useful for **web caches (proxy servers)**?

> 1. The 304 response body had 0 bytes, while the original 200 response body had 133806 bytes.
> 2. This is useful for web caches because an unchanged object does not have to be downloaded again, which would waste bandwidth. Note that I had to use a different website because the httpbin.org URL did not work as expected.

### 1.4 - HTTP Methods: GET vs POST

From the slides: GET sends data in the URL (after `?`), while POST sends data in the request **body**.
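
As a rough sketch of what that difference looks like on the wire, the snippet below encodes the same form fields once and places them in a GET request line and in a POST body. The request text is hand-written for illustration and is not captured from `requests`.

```python
from urllib.parse import urlencode

params = {"course": "CMPE148", "semester": "SP26"}
encoded = urlencode(params)  # key=value pairs joined with '&'

# GET: the data rides in the URL, after the '?'
get_request = (
    f"GET /get?{encoded} HTTP/1.1\r\n"
    "Host: httpbin.org\r\n"
    "\r\n"
)

# POST: the URL stays clean and the data goes in the body
post_request = (
    "POST /post HTTP/1.1\r\n"
    "Host: httpbin.org\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(encoded)}\r\n"
    "\r\n"
    f"{encoded}"
)

print(get_request)
print(post_request)
```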

In [12]:
# GET with query parameters (data visible in URL)
resp_get = requests.get("http://httpbin.org/get",
                        params={"course": "CMPE148", "semester": "SP26"})
print("=== GET ===")
print(f"URL sent: {resp_get.url}")
print(f"Server saw args: {resp_get.json()['args']}")
print()

# POST with form data (data in body, not URL)
resp_post = requests.post("http://httpbin.org/post",
                          data={"course": "CMPE148", "semester": "SP26"})
print("=== POST ===")
print(f"URL sent: {resp_post.url}")
print(f"Server saw form data: {resp_post.json()['form']}")
print(f"Server saw args: {resp_post.json()['args']}")

=== GET ===
URL sent: http://httpbin.org/get?course=CMPE148&semester=SP26
Server saw args: {'course': 'CMPE148', 'semester': 'SP26'}

=== POST ===
URL sent: http://httpbin.org/post
Server saw form data: {'course': 'CMPE148', 'semester': 'SP26'}
Server saw args: {}


### ✏️ Question 1.3

Compare the GET and POST outputs:
1. Where does the data appear in the GET request? Where in the POST request?
2. Which method would you use for submitting a login form with a password? Why?

> 1. In the GET request the data appears in the URL query string (after `?`). In the POST request it is carried in the request body as form data.
> 2. For submitting a login form with a password, I would use POST, because the password travels in the request body rather than in the URL, where it could end up in browser history and server logs.

---
## Part 2: DNS Resolution (~10 min)

From lecture, DNS is a **distributed, hierarchical database** that maps hostnames to IP addresses. The hierarchy goes: **Root → TLD → Authoritative** servers. Your local DNS server acts as a proxy and caches results.
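
The toy model below sketches how a local DNS server might walk that hierarchy and cache the answer. Everything in it is an assumption for illustration: the server names and referral tables are invented, TTLs are ignored, and only the final A record for `gaia.cs.umass.edu` matches an address seen in this notebook.

```python
# Toy model of the DNS hierarchy; names and tables are made up for illustration.
ROOT = {"edu": "tld-edu"}                        # root refers to the TLD server
TLD = {"tld-edu": {"umass.edu": "auth-umass"}}   # TLD refers to the authoritative server
AUTH = {"auth-umass": {"gaia.cs.umass.edu": "128.119.245.12"}}

cache = {}  # local DNS server's cache: hostname -> IP


def resolve(hostname):
    """Return (ip, servers contacted), consulting the cache first."""
    if hostname in cache:
        return cache[hostname], ["cache"]
    labels = hostname.split(".")
    tld_server = ROOT[labels[-1]]                          # step 1: ask root
    auth_server = TLD[tld_server][".".join(labels[-2:])]   # step 2: ask TLD
    ip = AUTH[auth_server][hostname]                       # step 3: ask authoritative
    cache[hostname] = ip
    return ip, ["root", tld_server, auth_server]


print(resolve("gaia.cs.umass.edu"))  # walks the hierarchy
print(resolve("gaia.cs.umass.edu"))  # answered from the cache
```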

In [13]:
import socket

# Basic DNS lookup: hostname to IP address (A record)
hostnames = [
    "gaia.cs.umass.edu",
    "www.google.com",
    "www.sjsu.edu",
    "www.netflix.com",
    "dns.google",
]

print(f"{'Hostname':<30s} {'IP Address(es)'}")
print("=" * 60)
for host in hostnames:
    try:
        results = socket.getaddrinfo(host, None, socket.AF_INET)
        ips = sorted(set(r[4][0] for r in results))
        print(f"{host:<30s} {', '.join(ips)}")
    except socket.gaierror as e:
        print(f"{host:<30s} FAILED: {e}")

Hostname                       IP Address(es)
gaia.cs.umass.edu              128.119.245.12
www.google.com                 142.250.189.196
www.sjsu.edu                   130.65.218.11
www.netflix.com                207.45.72.1, 207.45.73.1
dns.google                     8.8.4.4, 8.8.8.8


### ✏️ Question 2.1

1. `www.google.com` and `www.netflix.com` may return **multiple IP addresses**. Why would a single hostname map to multiple IPs? (Hint: think about the DNS service called *load distribution* from the slides.)
2. What type of DNS record maps a hostname to an IP address?

> 1. A single hostname would probably map to multiple IPs to help distribute the load and to also provide a fallback should any server go down.
> 2. An A record maps a hostname to an IPv4 address (AAAA for IPv6). A CNAME record instead maps an alias to a canonical hostname.

### 2.2 - Reverse DNS and more record types

In [None]:
# Reverse DNS: IP → hostname
test_ips = ["8.8.8.8", "1.1.1.1", "142.250.80.4"]

print("=== Reverse DNS (PTR records) ===")
for ip in test_ips:
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)
        print(f"  {ip:>20s}  →  {hostname}")
    except socket.herror:
        print(f"  {ip:>20s}  →  (no reverse DNS)")

print()

# MX record lookup using subprocess (nslookup command)
import subprocess

print("=== MX Records for sjsu.edu ===")
# Using nslookup since dig is not available on Windows
result = subprocess.run(["nslookup", "-type=MX", "sjsu.edu"], capture_output=True, text=True, timeout=10)
if result.stdout.strip():
    for line in result.stdout.strip().split("\n"):
        print(f"  {line}")
else:
    print("  (nslookup returned no results; see the alternative below)")
    # Fallback for systems that have dig installed
    print("  Try: !dig +short MX sjsu.edu")

=== Reverse DNS (PTR records) ===
               8.8.8.8  →  dns.google
               1.1.1.1  →  one.one.one.one
          142.250.80.4  →  lga34s33-in-f4.1e100.net

=== MX Records for sjsu.edu ===
  Server:  UnKnown
  Address:  192.168.86.1
  
  sjsu.edu	MX preference = 10, mail exchanger = aspmx.l.google.com
  sjsu.edu	MX preference = 30, mail exchanger = aspmx4.googlemail.com
  sjsu.edu	MX preference = 20, mail exchanger = alt1.aspmx.l.google.com
  sjsu.edu	MX preference = 30, mail exchanger = aspmx3.googlemail.com
  sjsu.edu	MX preference = 20, mail exchanger = alt2.aspmx.l.google.com
  sjsu.edu	MX preference = 30, mail exchanger = aspmx2.googlemail.com
  sjsu.edu	MX preference = 30, mail exchanger = aspmx5.googlemail.com
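
Two details from the output above can be sketched in code. A PTR lookup queries a name built from the reversed octets under `in-addr.arpa`, and mail senders try MX hosts in order of increasing preference value. The `mx_records` list below is copied by hand from the nslookup output; nothing here performs a network query.

```python
import ipaddress

# PTR lookups use a reversed-octet name under in-addr.arpa
for ip in ["8.8.8.8", "1.1.1.1", "142.250.80.4"]:
    print(ip, "->", ipaddress.ip_address(ip).reverse_pointer)

# (preference, mail exchanger) pairs copied from the nslookup output above
mx_records = [
    (10, "aspmx.l.google.com"),
    (30, "aspmx4.googlemail.com"),
    (20, "alt1.aspmx.l.google.com"),
    (30, "aspmx3.googlemail.com"),
    (20, "alt2.aspmx.l.google.com"),
    (30, "aspmx2.googlemail.com"),
    (30, "aspmx5.googlemail.com"),
]

# Lower preference values are tried first
ordered = sorted(mx_records)
for pref, host in ordered:
    print(f"{pref:>3d}  {host}")
```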


### 2.3 - Measuring DNS resolution time

In [24]:
import time

domains = ["www.sjsu.edu", "www.amazon.com", "www.wikipedia.org",
           "obscure-test-domain-12345.com", "www.sjsu.edu"]  # note: sjsu repeated

print(f"{'Domain':<35s} {'Time (ms)':>10s}  {'Result'}")
print("=" * 75)

for domain in domains:
    start = time.perf_counter()
    try:
        ip = socket.gethostbyname(domain)
        elapsed = (time.perf_counter() - start) * 1000
        print(f"{domain:<35s} {elapsed:>8.2f}ms  {ip}")
    except socket.gaierror:
        elapsed = (time.perf_counter() - start) * 1000
        print(f"{domain:<35s} {elapsed:>8.2f}ms  FAILED")

Domain                               Time (ms)  Result
www.sjsu.edu                            0.23ms  130.65.218.11
www.amazon.com                          0.14ms  18.173.122.19
www.wikipedia.org                       0.14ms  198.35.26.96
obscure-test-domain-12345.com           0.10ms  FAILED
www.sjsu.edu                            0.14ms  130.65.218.11


### ✏️ Question 2.2

1. `www.sjsu.edu` appears twice in the list. Was the second lookup faster? Why or why not? (Hint: DNS **caching** from the slides)
2. The slides describe **iterated** vs. **recursive** DNS queries. In an iterated query, who does most of the work: the local DNS server or the root/TLD servers?

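> 1. The second lookup was faster (0.14 ms vs. 0.23 ms) because the answer was already in a DNS cache, so no new query to the DNS hierarchy was needed.
> 2. The local DNS server does most of the work in an iterated query: it contacts the root, TLD, and authoritative servers in turn, and each one replies with a referral or the final answer.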

---
## Part 3: Socket Programming (~15 min)

From the slides, a **socket** is the "door" between the application process and the transport layer. We'll build both **UDP** and **TCP** client/server pairs, using the same uppercase-conversion example from lecture.

### 3.1 - UDP Client & Server

Recall:
- UDP has **no connection setup** (no handshaking)
- The sender **explicitly attaches** the destination IP and port to each datagram
- Data may be **lost or arrive out of order**
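
Because a datagram can be lost, an application that needs a reply usually has to handle the timeout itself. The following is a minimal sketch of resend-on-timeout over loopback; the helper names, the retry count, and the server that deliberately ignores its first datagram are all assumptions made to imitate loss, not part of the lecture example.

```python
import socket
import threading


def udp_request_with_retry(message, server_addr, retries=3, timeout=0.5):
    """Send a datagram and wait for a reply, resending if the wait times out."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        for attempt in range(1, retries + 1):
            sock.sendto(message.encode(), server_addr)
            try:
                reply, _ = sock.recvfrom(2048)
                return reply.decode(), attempt
            except socket.timeout:
                continue  # request or reply was lost; try again
    raise TimeoutError(f"no reply after {retries} attempts")


def lossy_server(sock, drop_first=1):
    """Uppercase server that ignores the first `drop_first` datagrams to mimic loss."""
    seen = 0
    while True:
        data, addr = sock.recvfrom(2048)
        seen += 1
        if seen <= drop_first:
            continue  # pretend this datagram never arrived
        sock.sendto(data.upper(), addr)
        return


server_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server_sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
threading.Thread(target=lossy_server, args=(server_sock,), daemon=True).start()

reply, attempts = udp_request_with_retry("hello", server_sock.getsockname())
print(reply, "after", attempts, "attempt(s)")
server_sock.close()
```

TCP does this kind of retransmission inside the transport layer, which is why the TCP client in the next section needs no retry loop.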

In [13]:
import socket
import threading

SERVER_PORT_UDP = 12000

def udp_server():
    """Simple UDP server that converts messages to uppercase."""
    server_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server_socket.bind(('', SERVER_PORT_UDP))
    server_socket.settimeout(5)  # timeout so thread doesn't hang forever
    print("[UDP Server] Ready to receive on port", SERVER_PORT_UDP)

    messages_handled = 0
    while messages_handled < 3:  # handle 3 messages then stop
        try:
            message, client_address = server_socket.recvfrom(2048)
            decoded = message.decode()
            print(f"[UDP Server] Received from {client_address}: '{decoded}'")

            modified = decoded.upper()
            server_socket.sendto(modified.encode(), client_address)
            print(f"[UDP Server] Sent back: '{modified}'")
            messages_handled += 1
        except socket.timeout:
            break

    server_socket.close()
    print("[UDP Server] Shut down.")

def udp_client(message):
    """Simple UDP client that sends a message and receives the uppercase reply."""
    client_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client_socket.settimeout(3)

    # Note: we must attach the server address to every datagram!
    client_socket.sendto(message.encode(), ('127.0.0.1', SERVER_PORT_UDP))
    print(f"[UDP Client] Sent: '{message}'")

    modified_message, server_address = client_socket.recvfrom(2048)
    print(f"[UDP Client] Received: '{modified_message.decode()}' from {server_address}")

    client_socket.close()

# --- Run the demo ---
import time

# Start server in a background thread
server_thread = threading.Thread(target=udp_server, daemon=True)
server_thread.start()
time.sleep(0.3)  # give server time to bind

# Send 3 messages
for msg in ["hello from cmpe 148", "udp has no handshake", "packets might be lost"]:
    udp_client(msg)
    time.sleep(0.1)

server_thread.join(timeout=6)
print("\n✅ UDP demo complete!")

[UDP Server] Ready to receive on port 12000
[UDP Client] Sent: 'hello from cmpe 148'[UDP Server] Received from ('127.0.0.1', 51221): 'hello from cmpe 148'
[UDP Server] Sent back: 'HELLO FROM CMPE 148'

[UDP Client] Received: 'HELLO FROM CMPE 148' from ('127.0.0.1', 12000)
[UDP Client] Sent: 'udp has no handshake'
[UDP Server] Received from ('127.0.0.1', 51222): 'udp has no handshake'
[UDP Server] Sent back: 'UDP HAS NO HANDSHAKE'
[UDP Client] Received: 'UDP HAS NO HANDSHAKE' from ('127.0.0.1', 12000)
[UDP Client] Sent: 'packets might be lost'
[UDP Server] Received from ('127.0.0.1', 51223): 'packets might be lost'
[UDP Server] Sent back: 'PACKETS MIGHT BE LOST'
[UDP Client] Received: 'PACKETS MIGHT BE LOST' from ('127.0.0.1', 12000)
[UDP Server] Shut down.

✅ UDP demo complete!


### ✏️ Question 3.1

Look at the UDP code above:
1. In `sendto()`, the client must specify `('127.0.0.1', 12000)` every time. Why isn't this needed in TCP? (Hint: think about connection setup.)
2. What socket type constant is used for UDP? What about TCP?

> 1. With TCP, `connect()` sets up a connection to the server first, so the socket already knows the destination of every later `send()`. The protocol is connection-oriented.
> 2. UDP uses the SOCK_DGRAM constant while TCP uses SOCK_STREAM.

### 3.2 - TCP Client & Server

Recall:
- TCP requires a **connection setup** (3-way handshake) before data exchange
- TCP provides **reliable, in-order** byte-stream transfer
- The server uses `accept()` to create a **new socket** for each client
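
One consequence of the byte-stream model is that TCP does not preserve message boundaries: a single `recv()` may return part of a message or several messages at once. A common remedy is to prefix each message with its length. The sketch below shows one way to do that; the helper names are hypothetical, and `socket.socketpair()` stands in for a connected TCP socket so the example runs without a server.

```python
import socket
import struct


def send_msg(sock, text):
    """Send one message prefixed with its length as a 4-byte big-endian integer."""
    payload = text.encode()
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, looping because recv() may return fewer."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_msg(sock):
    """Read the 4-byte length header, then exactly that many payload bytes."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length).decode()


a, b = socket.socketpair()  # connected stream sockets, standing in for TCP
send_msg(a, "hello from cmpe 148")
send_msg(a, "tcp is a byte stream")  # sent back-to-back, before any read
received = [recv_msg(b), recv_msg(b)]
print(received)
a.close()
b.close()
```

The demo in this section gets away with plain `recv(1024)` because the messages are short and the client waits for each reply before sending the next one.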

In [28]:
import socket
import threading
import time

SERVER_PORT_TCP = 12001

def tcp_server():
    """Simple TCP server that converts messages to uppercase."""
    server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server_socket.bind(('', SERVER_PORT_TCP))
    server_socket.listen(1)
    server_socket.settimeout(5)
    print(f"[TCP Server] Listening on port {SERVER_PORT_TCP}")

    try:
        # accept() creates a NEW socket for this specific client
        connection_socket, addr = server_socket.accept()
        print(f"[TCP Server] Connection from {addr}")

        # Handle multiple messages on the SAME connection (persistent!)
        connection_socket.settimeout(3)
        while True:
            try:
                sentence = connection_socket.recv(1024).decode()
                if not sentence:
                    break
                print(f"[TCP Server] Received: '{sentence}'")
                capitalized = sentence.upper()
                connection_socket.send(capitalized.encode())
                print(f"[TCP Server] Sent back: '{capitalized}'")
            except socket.timeout:
                break

        connection_socket.close()
        print("[TCP Server] Client connection closed.")
    except socket.timeout:
        print("[TCP Server] No client connected (timeout).")

    server_socket.close()
    print("[TCP Server] Shut down.")

def tcp_client(messages):
    """TCP client that sends multiple messages over one connection."""
    client_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

    # Connection setup (triggers TCP 3-way handshake)
    client_socket.connect(('127.0.0.1', SERVER_PORT_TCP))
    print(f"[TCP Client] Connected to server")

    for msg in messages:
        # No need to specify destination: connection already established!
        client_socket.send(msg.encode())
        print(f"[TCP Client] Sent: '{msg}'")

        modified = client_socket.recv(1024).decode()
        print(f"[TCP Client] Received: '{modified}'")
        time.sleep(0.1)

    client_socket.close()
    print("[TCP Client] Connection closed.")

# --- Run the demo ---
server_thread = threading.Thread(target=tcp_server, daemon=True)
server_thread.start()
time.sleep(0.3)

# Send 3 messages over a SINGLE TCP connection
tcp_client(["hello from cmpe 148", "tcp is reliable", "connection oriented"])

server_thread.join(timeout=6)
print("\n✅ TCP demo complete!")

[TCP Server] Listening on port 12001
[TCP Client] Connected to server
[TCP Client] Sent: 'hello from cmpe 148'
[TCP Server] Connection from ('127.0.0.1', 58489)
[TCP Server] Received: 'hello from cmpe 148'
[TCP Server] Sent back: 'HELLO FROM CMPE 148'
[TCP Client] Received: 'HELLO FROM CMPE 148'
[TCP Client] Sent: 'tcp is reliable'
[TCP Server] Received: 'tcp is reliable'
[TCP Server] Sent back: 'TCP IS RELIABLE'
[TCP Client] Received: 'TCP IS RELIABLE'
[TCP Client] Sent: 'connection oriented'
[TCP Server] Received: 'connection oriented'
[TCP Client] Received: 'CONNECTION ORIENTED'
[TCP Server] Sent back: 'CONNECTION ORIENTED'
[TCP Client] Connection closed.
[TCP Server] Client connection closed.
[TCP Server] Shut down.

✅ TCP demo complete!


### ✏️ Question 3.2

Compare the UDP and TCP code:

1. The TCP client sends 3 messages over **one** connection. How many TCP connections would non-persistent HTTP use for 3 objects? How about persistent HTTP?
2. The TCP server calls `accept()` which returns a *new* socket (`connection_socket`). Why does the server need a separate socket per client?
3. Why does TCP use `send()`/`recv()` while UDP uses `sendto()`/`recvfrom()`?

> 1. 3 objects require 3 connections in non-persistent HTTP while persistent HTTP uses just one TCP connection.
> 2. The server needs a separate socket per client so that each connection has its own byte stream, while the listening socket stays free to accept new clients.
> 3. TCP learns where to send and receive during connection setup, while with UDP you must specify the destination (IP address and port) on every send and learn the sender's address on every receive.

### 3.3 - TODO: Modify the server

**Your turn!** Modify the TCP server function below so that instead of converting to uppercase, it **reverses** the string (e.g., `"hello"` → `"olleh"`).

Then run the cell to test it.

In [29]:
import socket, threading, time

def tcp_reverse_server():
    """TODO: TCP server that REVERSES the client's message."""
    server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server_socket.bind(('', 12002))
    server_socket.listen(1)
    server_socket.settimeout(5)

    try:
        conn, addr = server_socket.accept()
        conn.settimeout(3)
        while True:
            try:
                data = conn.recv(1024).decode()
                if not data:
                    break
                # ============================================
                # TODO: Replace the line below so that
                # 'reply' is the REVERSE of 'data'
                # Hint: data[::-1]
                # ============================================
                reply = data[::-1]  # <-- FIX THIS LINE
                # ============================================
                conn.send(reply.encode())
            except socket.timeout:
                break
        conn.close()
    except socket.timeout:
        pass
    server_socket.close()

# --- Test your server ---
server_thread = threading.Thread(target=tcp_reverse_server, daemon=True)
server_thread.start()
time.sleep(0.3)

client_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client_socket.connect(('127.0.0.1', 12002))

test_messages = ["cmpe148", "networking", "sockets"]
all_passed = True
for msg in test_messages:
    client_socket.send(msg.encode())
    reply = client_socket.recv(1024).decode()
    expected = msg[::-1]
    status = "✅" if reply == expected else "❌"
    if reply != expected:
        all_passed = False
    print(f"  {status}  Sent: '{msg}' → Got: '{reply}' (expected: '{expected}')")
    time.sleep(0.1)

client_socket.close()
server_thread.join(timeout=6)

if all_passed:
    print("\n🎉 All tests passed!")
else:
    print("\n⚠️  Some tests failed - check your TODO above.")

  ✅  Sent: 'cmpe148' → Got: '841epmc' (expected: '841epmc')
  ✅  Sent: 'networking' → Got: 'gnikrowten' (expected: 'gnikrowten')
  ✅  Sent: 'sockets' → Got: 'stekcos' (expected: 'stekcos')

🎉 All tests passed!


---
## Part 4: Web Caching Calculations (~8 min)

From the slides, we analyzed an institutional network scenario:
- **Access link rate:** 1.54 Mbps
- **RTT** to origin server: 2 sec
- **Object size:** 100 Kbits
- **Average request rate:** 15 requests/sec
- **Average data rate:** 15 × 100 Kbits = 1.50 Mbps

Let's compute the key metrics programmatically.

In [30]:
# === Scenario parameters ===
access_link_rate = 1.54   # Mbps
internet_rtt = 2.0        # seconds (RTT to origin server)
object_size = 100         # Kbits per object
request_rate = 15         # requests per second
lan_rate = 1000           # Mbps (1 Gbps LAN)

avg_data_rate = request_rate * object_size / 1000  # Mbps

print("=" * 55)
print("SCENARIO 1: No cache, original 1.54 Mbps access link")
print("=" * 55)

utilization_access = avg_data_rate / access_link_rate
utilization_lan = avg_data_rate / lan_rate

print(f"Average data rate to browsers: {avg_data_rate:.2f} Mbps")
print(f"LAN utilization:               {utilization_lan:.4f} ({utilization_lan*100:.2f}%)")
print(f"Access link utilization:        {utilization_access:.4f} ({utilization_access*100:.1f}%)")
print(f"\n⚠️  At {utilization_access*100:.0f}% utilization, queueing delays → MINUTES!")
print(f"End-to-end delay ≈ {internet_rtt} sec + MINUTES + ~0 sec (LAN) = very high!")

SCENARIO 1: No cache, original 1.54 Mbps access link
Average data rate to browsers: 1.50 Mbps
LAN utilization:               0.0015 (0.15%)
Access link utilization:        0.9740 (97.4%)

⚠️  At 97% utilization, queueing delays → MINUTES!
End-to-end delay ≈ 2.0 sec + MINUTES + ~0 sec (LAN) = very high!


In [31]:
print("=" * 55)
print("SCENARIO 2: Upgrade access link to 154 Mbps")
print("=" * 55)

fast_link_rate = 154  # Mbps
utilization_fast = avg_data_rate / fast_link_rate

print(f"Access link utilization:  {utilization_fast:.4f} ({utilization_fast*100:.2f}%)")
print(f"End-to-end delay ≈ {internet_rtt} sec + ~msecs + ~0 = ~{internet_rtt} sec")
print(f"\n💰 But this is EXPENSIVE!")

SCENARIO 2: Upgrade access link to 154 Mbps
Access link utilization:  0.0097 (0.97%)
End-to-end delay ≈ 2.0 sec + ~msecs + ~0 = ~2.0 sec

💰 But this is EXPENSIVE!


In [44]:
cache_hit_rate = 0.9

print("=" * 55)
print(f"SCENARIO 3: Install a web cache (hit rate = {cache_hit_rate})")
print("=" * 55)

# Only (1 - hit_rate) fraction of requests go to the origin server
data_rate_to_origin = (1 - cache_hit_rate) * avg_data_rate
utilization_with_cache = data_rate_to_origin / access_link_rate

# Average delay = weighted sum
delay_origin = internet_rtt + 0.01  # ~2.01 sec (RTT + small access delay)
delay_cache = 0.001                 # ~1 ms (local cache hit)

avg_delay = (1 - cache_hit_rate) * delay_origin + cache_hit_rate * delay_cache

print(f"Cache hit rate:               {cache_hit_rate:.0%}")
print(f"Data rate over access link:   {data_rate_to_origin:.2f} Mbps")
print(f"Access link utilization:      {utilization_with_cache:.2f} ({utilization_with_cache*100:.0f}%)")
print(f"\nAverage end-to-end delay:")
print(f"  = {1-cache_hit_rate:.1f} × {delay_origin:.2f}s + {cache_hit_rate:.1f} × {delay_cache*1000:.0f}ms")
print(f"  = {avg_delay:.3f} sec  ≈ {avg_delay:.1f} sec")
print(f"\n✅ Faster than the 154 Mbps upgrade AND cheaper!")

SCENARIO 3: Install a web cache (hit rate = 0.9)
Cache hit rate:               90%
Data rate over access link:   0.15 Mbps
Access link utilization:      0.10 (10%)

Average end-to-end delay:
  = 0.1 × 2.01s + 0.9 × 1ms
  = 0.202 sec  ≈ 0.2 sec

✅ Faster than the 154 Mbps upgrade AND cheaper!


### ✏️ Question 4.1 - TODO: What if the cache hit rate improves?

Complete the code below to calculate the access link utilization and average delay for **cache hit rates from 0.0 to 0.9** (in steps of 0.1).

In [50]:
import math

print(f"{'Hit Rate':>10s}  {'Access Util':>12s}  {'Avg Delay (s)':>14s}")
print("=" * 42)

for hit_rate in [i / 10 for i in range(10)]:
    # ============================================
    # TODO: Calculate the following two values
    # based on the formulas from Scenario 3 above.
    #
    # access_util = ???
    # avg_delay   = ???
    # ============================================
    access_util = (1 - hit_rate) # <-- FIX THIS
    avg_delay   = (1 - hit_rate) * 2.0 + hit_rate * 0.001  # <-- FIX THIS
    # ============================================

    print(f"{hit_rate:>10.0%}  {access_util:>11.2%}  {avg_delay:>14.3f}")

  Hit Rate   Access Util   Avg Delay (s)
        0%      100.00%           2.000
       10%       90.00%           1.800
       20%       80.00%           1.600
       30%       70.00%           1.400
       40%       60.00%           1.200
       50%       50.00%           1.000
       60%       40.00%           0.801
       70%       30.00%           0.601
       80%       20.00%           0.401
       90%       10.00%           0.201
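
The table above treats utilization as simply `1 - hit_rate` and the origin delay as 2.0 s. For comparison, here is a sketch that applies the Scenario 3 formulas as written there, scaling the miss traffic by the 1.50 Mbps average data rate over the 1.54 Mbps access link and using the 2.01 s origin delay. The function name `cache_metrics` and the constant names are assumptions. Like Scenario 3, it ignores queueing delay, which dominates when utilization is close to 100%.

```python
ACCESS_LINK_MBPS = 1.54     # access link rate from the scenario
AVG_DATA_RATE_MBPS = 1.50   # 15 requests/sec x 100 Kbits
DELAY_ORIGIN_S = 2.01       # RTT plus small access delay, as in Scenario 3
DELAY_CACHE_S = 0.001       # local cache hit, as in Scenario 3


def cache_metrics(hit_rate):
    """Return (access link utilization, average delay in seconds)."""
    miss_rate = 1 - hit_rate
    access_util = miss_rate * AVG_DATA_RATE_MBPS / ACCESS_LINK_MBPS
    avg_delay = miss_rate * DELAY_ORIGIN_S + hit_rate * DELAY_CACHE_S
    return access_util, avg_delay


print(f"{'Hit Rate':>10s}  {'Access Util':>12s}  {'Avg Delay (s)':>14s}")
for hit_rate in [i / 10 for i in range(10)]:
    util, delay = cache_metrics(hit_rate)
    print(f"{hit_rate:>10.0%}  {util:>11.2%}  {delay:>14.3f}")
```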


### ✏️ Question 4.2

Based on your table:
1. At what hit rate does the access link utilization drop below 50%?
2. At hit rate 0.0 (no cache), is the average delay practical? Why or why not?
3. Why is installing a cache considered cheaper than upgrading the access link, even though both improve performance?

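> 1. In the table above, the access link utilization is exactly 50% at a hit rate of 50% and first drops below 50% at a hit rate of 60%.
> 2. No. With no cache every request crosses the access link, which Scenario 1 showed is about 97% utilized, so queueing delays grow to minutes on top of the 2-second RTT.
> 3. Installing a cache is cheaper because it reduces the traffic that has to cross the access link, so the existing 1.54 Mbps link is enough and the expensive 154 Mbps upgrade is avoided.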

---
## Part 5 (Bonus): Non-Persistent HTTP - RTT Calculation

From the slides: **Non-persistent HTTP response time = 2×RTT + file transmission time** per object.

If a web page has a base HTML file plus 10 referenced images, and non-persistent HTTP is used (one object per TCP connection):

In [35]:
# Parameters
rtt = 0.050               # 50 ms RTT
file_size_html = 10_000   # 10 KB base HTML
file_size_img = 50_000    # 50 KB per image
num_images = 10
bandwidth = 10_000_000    # 10 Mbps link (bits/sec)

def non_persistent_time(rtt, file_size_bytes, bandwidth_bps):
    """Time to fetch ONE object with non-persistent HTTP."""
    transmission_time = (file_size_bytes * 8) / bandwidth_bps
    return 2 * rtt + transmission_time

# Time for base HTML
t_html = non_persistent_time(rtt, file_size_html, bandwidth)

# Time for each image (sequentially, worst case)
t_per_image = non_persistent_time(rtt, file_size_img, bandwidth)
t_all_images_sequential = num_images * t_per_image

total_sequential = t_html + t_all_images_sequential

print("=== Non-Persistent HTTP (sequential) ===")
print(f"Time for base HTML:  {t_html*1000:.1f} ms  (2×RTT + {file_size_html*8/bandwidth*1000:.1f}ms tx)")
print(f"Time per image:      {t_per_image*1000:.1f} ms  (2×RTT + {file_size_img*8/bandwidth*1000:.1f}ms tx)")
print(f"Total (1 HTML + {num_images} images): {total_sequential*1000:.0f} ms = {total_sequential:.3f} sec")
print()

# With persistent HTTP: 1 RTT for connection + 1 RTT per object + tx times
t_persistent = rtt + (rtt + file_size_html * 8 / bandwidth) + \
               num_images * (rtt + file_size_img * 8 / bandwidth)

print("=== Persistent HTTP ===")
print(f"Total: {t_persistent*1000:.0f} ms = {t_persistent:.3f} sec")
print(f"\nSpeedup: {total_sequential/t_persistent:.1f}× faster with persistent HTTP!")

=== Non-Persistent HTTP (sequential) ===
Time for base HTML:  108.0 ms  (2×RTT + 8.0ms tx)
Time per image:      140.0 ms  (2×RTT + 40.0ms tx)
Total (1 HTML + 10 images): 1508 ms = 1.508 sec

=== Persistent HTTP ===
Total: 1008 ms = 1.008 sec

Speedup: 1.5× faster with persistent HTTP!
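
As a check on the numbers above, the same totals can be worked by hand. Here $L$ is an object size in bits, $R$ is the link rate, and $N$ is the number of images; these symbols are introduced only for this derivation. The persistent case follows the model used in the code cell (one RTT for connection setup, then one RTT per object, fetched sequentially).

$$
T_{\text{non-persistent}} = \left(2\,\mathrm{RTT} + \frac{L_{\text{html}}}{R}\right) + N\left(2\,\mathrm{RTT} + \frac{L_{\text{img}}}{R}\right) = (100 + 8)\ \text{ms} + 10\,(100 + 40)\ \text{ms} = 1508\ \text{ms}
$$

$$
T_{\text{persistent}} = \mathrm{RTT} + \left(\mathrm{RTT} + \frac{L_{\text{html}}}{R}\right) + N\left(\mathrm{RTT} + \frac{L_{\text{img}}}{R}\right) = 50\ \text{ms} + 58\ \text{ms} + 10 \times 90\ \text{ms} = 1008\ \text{ms}
$$

The ratio $1508 / 1008 \approx 1.5$ matches the reported speedup. The saving is exactly one RTT per object minus the single setup RTT: $11 \times 50 - 50 = 500$ ms.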


---
## 🏁 Wrap-Up

**Key takeaways from this exercise:**

1. **HTTP** is a request/response protocol on top of TCP. Headers, status codes, and methods (GET/POST) are all human-readable ASCII.
2. **Conditional GETs** and **web caches** save bandwidth and reduce latency.
3. **DNS** is a distributed, hierarchical system. Caching at the local DNS server dramatically speeds up repeated lookups.
4. **UDP sockets** require destination addresses on every send; **TCP sockets** establish a connection first, then send/receive without specifying addresses.
5. **Persistent HTTP** saves significant time over non-persistent by reusing TCP connections.

---

**Submission:** Download this notebook (`File → Download .ipynb`) and submit it to Canvas with your answers filled in and all code cells executed.