# HTTP Caching Server with ETag and Last-Modified

## Aim
To implement an HTTP server in Python that demonstrates caching using ETag and Last-Modified headers, and serves web pages efficiently using conditional requests.

## Description
This server uses Python's `http.server` and `socketserver` modules to serve a single `index.html` file. It implements HTTP caching by:

- Generating an ETag (MD5 hash of file content) for each version of the file.
- Setting the Last-Modified header based on the file’s modification time.
- Checking the client’s request headers `If-None-Match` (ETag) and `If-Modified-Since` to validate cached copies.

If the client already has the latest version of the file, the server responds with `304 Not Modified`, avoiding retransmission. Otherwise, it sends a `200 OK` response with the file contents, ETag, and Last-Modified headers.
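
The two validators described above can be computed as in the following minimal sketch. The helper name `compute_validators` is hypothetical (it does not appear in the server code), and wrapping the hash in double quotes is an assumption based on the HTTP entity-tag syntax.

```python
import hashlib
import os
from email.utils import formatdate


def compute_validators(path):
    """Return (etag, last_modified) header values for the file at `path`.

    Hypothetical helper, for illustration only.
    """
    with open(path, "rb") as f:
        content = f.read()
    # Entity tags travel as quoted strings, e.g. "5d41402a...".
    etag = '"' + hashlib.md5(content).hexdigest() + '"'
    # HTTP dates have one-second resolution, so drop the fractional part.
    mtime = int(os.path.getmtime(path))
    last_modified = formatdate(mtime, usegmt=True)
    return etag, last_modified
```

Because the ETag is derived from the content, it changes whenever the bytes change, while Last-Modified only reflects the file system timestamp.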

## Key Learning Outcomes
- Understand how HTTP caching reduces network bandwidth usage.
- See how a server validates cached copies using both a strong validator (ETag) and a weaker, one-second-resolution validator (Last-Modified).
- Gain experience with real-world HTTP header manipulation in server-side programming.

In [None]:
import http.server
import socketserver
import os
import hashlib
from email.utils import formatdate, parsedate_to_datetime

PORT = 8081
FILENAME = "index.html"

class CachingHandler(http.server.SimpleHTTPRequestHandler):
    def do_GET(self):
        if self.path in ["/", f"/{FILENAME}"]:
            filename = FILENAME
        elif self.path == "/favicon.ico":
            self.send_response(204)
            self.end_headers()
            return
        else:
            self.send_error(404, "File Not Found")
            return

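        # Report a missing or unreadable file as 404 instead of crashing.
        try:
            with open(filename, "rb") as f:
                content = f.read()
        except OSError:
            self.send_error(404, "File Not Found")
            return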

        etag = hashlib.md5(content).hexdigest()
        last_mod_time = int(os.path.getmtime(filename))
        last_modify = formatdate(last_mod_time, usegmt=True)

        if_none_match = self.headers.get("If-None-Match")
        if_modified = self.headers.get("If-Modified-Since")

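        # If-None-Match takes precedence: If-Modified-Since is evaluated
        # only when If-None-Match is absent (RFC 9110).
        not_modified = False
        if if_none_match is not None:
            # The header may carry "*", several tags, or weak (W/) tags.
            tags = [t.strip().removeprefix("W/").strip('"')
                    for t in if_none_match.split(",")]
            not_modified = if_none_match.strip() == "*" or etag in tags
        elif if_modified is not None:
            try:
                ims_time = int(parsedate_to_datetime(if_modified).timestamp())
                not_modified = (ims_time >= last_mod_time)
            except (TypeError, ValueError):
                # Malformed date: ignore the header and send the full file.
                not_modified = False

        if not_modified:
            self.send_response(304)
            # A 304 repeats the validators that a 200 would have carried.
            self.send_header("ETag", f'"{etag}"')
            self.send_header("Last-Modified", last_modify)
            self.end_headers()
            return

        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        # Entity tags must be sent as quoted strings.
        self.send_header("ETag", f'"{etag}"')
        self.send_header("Last-Modified", last_modify)
        self.send_header("Content-Length", str(len(content)))
        self.end_headers()
        self.wfile.write(content)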

with socketserver.TCPServer(("", PORT), CachingHandler) as httpd:
    print(f"Serving on port {PORT}...")
    httpd.serve_forever()

Serving on port 8081...


127.0.0.1 - - [12/Sep/2025 14:42:37] "GET / HTTP/1.1" 304 -
127.0.0.1 - - [12/Sep/2025 14:42:55] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [12/Sep/2025 14:42:55] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [12/Sep/2025 14:43:03] "GET / HTTP/1.1" 304 -
127.0.0.1 - - [12/Sep/2025 14:43:03] "GET / HTTP/1.1" 304 -
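
A log like the one above (a mix of 200 and 304 responses) can also be produced without a browser. The sketch below is a hypothetical client helper, `conditional_get`, built on the standard `urllib.request` module; it assumes the server is reachable at whatever URL is passed in. `urllib` reports a 304 by raising `HTTPError`, so the helper catches that case and returns the status instead.

```python
import urllib.error
import urllib.request


def conditional_get(url, etag=None, last_modified=None):
    """GET `url`, sending cache validators when they are given.

    Returns (status, headers, body). Hypothetical helper, for illustration.
    """
    request = urllib.request.Request(url)
    if etag is not None:
        request.add_header("If-None-Match", etag)
    if last_modified is not None:
        request.add_header("If-Modified-Since", last_modified)
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.headers, response.read()
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Not Modified: the cached copy is still valid, no body is sent.
            return 304, err.headers, b""
        raise
```

With the server running on port 8081, a first call such as `status, headers, body = conditional_get("http://localhost:8081/")` should return 200, and a second call that passes `etag=headers["ETag"]` should return 304.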


### How the code works, step by step:

1. **Imports**: Loads modules for HTTP serving, file handling, hashing, and HTTP date formatting and parsing.
2. **Configuration**: Sets the port and file to serve.
3. **Handler Class**: Inherits from `SimpleHTTPRequestHandler` and overrides `do_GET` to add caching logic.
4. **Path Check**: Serves `index.html` for `/` and `/index.html`, answers `/favicon.ico` with 204 No Content, and returns 404 for any other path.
5. **File Read**: Reads the file content in binary mode.
6. **ETag Generation**: Computes an MD5 hash of the file for the ETag header.
7. **Last-Modified**: Gets the file's last modification time and formats it for HTTP.
8. **Client Cache Check**: Reads `If-None-Match` and `If-Modified-Since` headers from the request.
9. **Cache Validation**: If the client's validators show that its cached copy is still current (a matching ETag, or a file modification time no later than the `If-Modified-Since` date), responds with 304 and sends no body.
10. **Send File**: If the cache is invalid or missing, sends the file with 200 OK and the appropriate headers.
11. **Server Start**: Binds the server to the port and starts serving requests.
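
Steps 8 and 9 can be isolated as a pure function, which makes the decision easy to test without a running server. This is a minimal sketch, not the handler's own code: the name `is_not_modified` and its parameters are hypothetical, and it assumes the precedence rule from RFC 9110, under which `If-Modified-Since` is ignored whenever `If-None-Match` is present.

```python
from email.utils import parsedate_to_datetime


def is_not_modified(etag, mtime, if_none_match=None, if_modified_since=None):
    """Return True when a 304 Not Modified response is appropriate.

    `etag` is the current entity tag without quotes; `mtime` is the file's
    modification time in whole seconds since the epoch. Hypothetical helper.
    """
    if if_none_match is not None:
        # If-None-Match wins; If-Modified-Since is not consulted at all.
        if if_none_match.strip() == "*":
            return True
        tags = []
        for tag in if_none_match.split(","):
            tag = tag.strip()
            if tag.startswith("W/"):
                tag = tag[2:]  # weak comparison ignores the W/ prefix
            tags.append(tag.strip('"'))
        return etag in tags
    if if_modified_since is not None:
        try:
            client_time = parsedate_to_datetime(if_modified_since).timestamp()
        except (TypeError, ValueError):
            return False  # malformed date: treat the header as absent
        return int(client_time) >= mtime
    return False
```

For example, `is_not_modified("abc", 1000, if_none_match='"old"', if_modified_since="Thu, 01 Jan 2026 00:00:00 GMT")` is false: the stale ETag decides the outcome even though the date alone would have suggested a fresh copy.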