Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sensible cache-control headers for static assets, including those served by plugins #1645

Open
curiousleo opened this issue Feb 28, 2022 · 4 comments
Labels
enhancement maybe-not-1.0 Things that might not need to block 1.0 after all performance
Milestone

Comments

@curiousleo
Copy link

curiousleo commented Feb 28, 2022

What I'm seeing

With default_cache_ttl = 86400, I see the following:

A table view returns Cache-control: max-age=86400:

Screenshot_20220228_190000

A static asset returns no Cache-control header:

Screenshot_20220228_185933

What I expected to see

I expected the static asset to return a Cache-control header indicating that this response can be cached.

Why this matters

I'm productionising a Datasette deployment right now and was looking into putting it behind a Varnish instance. I was surprised to see requests for static assets being served from Datasette rather than Varnish, this is what led me to look more closely at the response headers.

While Datasette serves those static assets pretty quickly, I don't see why Datasette should serve them. By their nature, static assets like images and JS files are very cacheable, so it should be easy to serve them from a cache like Varnish.

(Note that Varnish can easily be configured to override this header, enabling caching for static assets. But it would be better if this override was not necessary.)

Discussion

It seems clear to me that serving static assets without a Cache-control header is not ideal.

I see two options here:

A. Static assets use the same logic as table / SQL views to set the Cache-control header based on default_cache_ttl.
B. An additional setting for static assets is introduced (default_static_cache_ttl, say).

@simonw
Copy link
Owner

simonw commented Mar 5, 2022

I agree: this is bad.

Ideally, content served from /static/ would apply best practices for static content serving - which to my mind means the following:

  • Where possible, serve with a far-future cache expiry header and use an asset URL that changes when the file itself changes
  • For assets without that, support conditional GET to avoid transferring the whole asset if it hasn't changed
  • Some kind of sensible mechanism for setting cache TTLs on assets that don't have a unique-file-per-version - in particular assets that might be served from plugins.

Datasette half-implemented the first of these: if you view source on https://latest.datasette.io/ you'll see it links to /-/static/app.css?cead5a - which in the template looks like this:

<link rel="stylesheet" href="{{ urls.static('app.css') }}?{{ app_css_hash }}">

I had forgotten I had implemented this! Here is how it is calculated:

datasette/datasette/app.py

Lines 510 to 516 in 458f03a

def app_css_hash(self):
if not hasattr(self, "_app_css_hash"):
with open(os.path.join(str(app_root), "datasette/static/app.css")) as fp:
self._app_css_hash = hashlib.sha1(fp.read().encode("utf8")).hexdigest()[
:6
]
return self._app_css_hash

So app.css right now could be safely served with a far-future cache header... only it isn't:

~ % curl -i 'https://latest.datasette.io/-/static/app.css?cead5a' 
HTTP/2 200 
content-type: text/css
x-databases: _memory, _internal, fixtures, extra_database
x-cloud-trace-context: 9ddc825620eb53d30fc127d1c750f342
date: Sat, 05 Mar 2022 01:01:53 GMT
server: Google Frontend
content-length: 16178

The larger question though is what to do about other assets. I'm particularly interested in plugin assets, since visualization plugins like datasette-vega and datasette-cluster-map ship with large amounts of JavaScript and I'd really like that to be sensibly cached by default.

@simonw
Copy link
Owner

simonw commented Mar 5, 2022

The existing app_css_hash already isn't good enough, because I built that before table.js existed, and that file should obviously be smartly cached too.

@simonw
Copy link
Owner

simonw commented Mar 5, 2022

It sounds like you can workaround this with Varnish configuration for the moment, but I'm going to bump this up the list of things to fix - it's particularly relevant now as I'd like to get a solution in place before Datasette 1.0, since it's likely to be beneficial to plugins and hence should be part of the stable, documented plugin interface.

@simonw simonw added this to the Datasette 1.0 milestone Mar 5, 2022
@simonw simonw changed the title Cache-control header missing for static assets Sensible cache-control headers for static assets, including those served by plugins Mar 5, 2022
@simonw
Copy link
Owner

simonw commented Mar 8, 2022

Hah, found a TODO about this:

datasette/datasette/app.py

Lines 997 to 999 in c579115

add_route(IndexView.as_view(self), r"/(?P<as_format>(\.jsono?)?$)")
# TODO: /favicon.ico and /-/static/ deserve far-future cache expires
add_route(favicon, "/favicon.ico")

@simonw simonw added the maybe-not-1.0 Things that might not need to block 1.0 after all label Dec 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement maybe-not-1.0 Things that might not need to block 1.0 after all performance
Projects
None yet
Development

No branches or pull requests

2 participants