Touch point: documentcloud/documents/views.py retrieve handler and
documentcloud/documents/models/document.py:115-117.
Document.updated_at = AutoLastModifiedField is the right freshness signal — confirmed not bumped on reads (no hit_count increment in the view). Two changes:
- Set
Last-Modified: <document.updated_at> instead of letting Django default to response time. DRF's Last-Modified mixin or a manual header set in the retrieve method both work.
- Compare incoming
If-Modified-Since to document.updated_at and return 304 Not Modified (no body) when nothing has changed.
Multiplied across daily scraper crawls of static archive content this is a large bandwidth win — and pairs naturally with stale-while-revalidate.
Touch point:
documentcloud/documents/views.pyretrieve handler anddocumentcloud/documents/models/document.py:115-117.Document.updated_at = AutoLastModifiedFieldis the right freshness signal — confirmed not bumped on reads (nohit_countincrement in the view). Two changes:Last-Modified: <document.updated_at>instead of letting Django default to response time. DRF'sLast-Modifiedmixin or a manual header set in the retrieve method both work.If-Modified-Sincetodocument.updated_atand return304 Not Modified(no body) when nothing has changed.Multiplied across daily scraper crawls of static archive content this is a large bandwidth win — and pairs naturally with
stale-while-revalidate.