If Tornado (3.1) is not running in debug mode, when a static file is accessed for the first time the included StaticFileHandler will generate an MD5 hash for it and store that hash in perpetuity. The handler will then read the content of the file and serve it up with the corresponding version and ETag. The next time the same client requests the file we get a 304 response and the client uses its own copy from its cache.
Now suppose we modify the static file on disk. If the original client requests the file again, it still gets its locally cached copy, since the ETag it sends matches the hash cached in Tornado.
Now consider what happens when a different client, one that has not accessed the file before, fetches the same static file. Tornado reads the file off the disk again, so that client gets the new version of the file. But because Tornado already has a cached hash for that file, it serves the new content with the old ETag. We now have two different versions of the same file, with the same ETag, on two different clients.
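To make the sequence concrete, here is a minimal stdlib-only model of the behaviour described above. It is not Tornado's actual code: `_hash_cache`, `serve` and `demo` are hypothetical names, and the only thing it assumes is what the report states (the hash is computed once per path and then reused, while the body is read from disk on every request).

```python
import hashlib
import os
import tempfile

# Simplified stand-in for the per-process hash cache: path -> MD5 hex digest.
_hash_cache = {}


def serve(path):
    """Return (body, etag), hashing the file only on first access."""
    with open(path, "rb") as f:
        body = f.read()  # the body is always read fresh from disk
    if path not in _hash_cache:
        _hash_cache[path] = hashlib.md5(body).hexdigest()
    return body, _hash_cache[path]  # the ETag may describe older content


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "app.js")
        with open(path, "wb") as f:
            f.write(b"version 1")
        body_a, etag_a = serve(path)  # first client
        with open(path, "wb") as f:
            f.write(b"version 2")     # file changed under the live process
        body_b, etag_b = serve(path)  # second client, nothing cached locally
        return body_a, etag_a, body_b, etag_b
```

Calling `demo()` returns two different bodies that carry the same ETag, which is the inconsistency in question.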
Possible fixes:
Cache the file along with the hash and always serve it consistently -- memory issues.
Update the hash when the file is read by the second client -- inconsistent state: the original client shouldn't get a different response just because another client without a locally cached copy happened to fetch the file in the meantime.
Change the ETag / revision hashing to avoid assigning the same hash to two different versions of the file -- this would require some kind of stat call to see if the file has changed, and would always serve the most recent version. It would also need to check that the hash is up to date when handling static_url calls.
I notice that if ETags are disabled with a custom static file handler, If-Modified-Since is already handled cleanly.
This is a good point. For completeness, there's a fourth option: document this behavior and recommend deployment strategies that do not involve changing files out from under a live process (which can cause problems for templates or Python modules, not just static files). However, since this is an easy mistake to make and it has ramifications for external caches, it's probably worth statting the file before returning a cached ETag.
There is a related issue with the version tag included by static_url: if a static_url is generated by one process but the resulting file is served by a second process with a different version of the data, we'll serve the wrong data but still mark it as cacheable forever. The static_url version tag should be checked against the expected version and if we discover that we're being asked for an inconsistent version we should give it a very short expiration.
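The version check could look roughly like the sketch below. It is only an illustration of the decision, not Tornado's implementation: `cache_max_age`, `LONG_MAX_AGE` and `SHORT_MAX_AGE` are hypothetical names, the 60-second value is an arbitrary stand-in for "very short", and returning None for unversioned URLs (meaning "set no explicit lifetime") is an assumption.

```python
LONG_MAX_AGE = 86400 * 365 * 10  # "cacheable forever": roughly ten years, in seconds
SHORT_MAX_AGE = 60               # arbitrary short lifetime for a mismatched version


def cache_max_age(requested_version, current_version):
    """Pick a cache lifetime in seconds for a static file request.

    requested_version: version tag taken from the URL, or None if absent.
    current_version: hash of the file content this process would serve.
    """
    if requested_version is None:
        return None  # unversioned URL: rely on validators only
    if requested_version == current_version:
        return LONG_MAX_AGE  # URL and content agree, safe to cache long-term
    return SHORT_MAX_AGE     # inconsistent version, expire quickly
```

With this shape, a request whose version tag does not match the content on disk is still served, but a downstream cache will not hold on to the mismatched response for long.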
Looks like it already fetches the modified date of the file every time. It shouldn't be too difficult to add the modified time to the cache entry so that we ignore old entries...
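A rough stdlib-only sketch of that idea, assuming the cache maps each path to a (modified time, hash) pair. The names `_hash_cache`, `get_cached_hash` and `demo` are hypothetical and this is not the handler's real code; `demo` sets the timestamps explicitly with `os.utime` so the example does not depend on filesystem timestamp granularity.

```python
import hashlib
import os
import tempfile

# Hypothetical cache layout: path -> (mtime in nanoseconds, MD5 hex digest).
_hash_cache = {}


def get_cached_hash(path):
    """Return the file's MD5, recomputing it if the modification time changed."""
    mtime = os.stat(path).st_mtime_ns
    entry = _hash_cache.get(path)
    if entry is not None and entry[0] == mtime:
        return entry[1]  # cached entry still matches the file on disk
    with open(path, "rb") as f:
        digest = hashlib.md5(f.read()).hexdigest()
    _hash_cache[path] = (mtime, digest)  # replace the old entry
    return digest


def demo():
    t1 = 1_600_000_000 * 10**9  # explicit timestamps, in nanoseconds
    t2 = 1_600_000_100 * 10**9
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "app.js")
        with open(path, "wb") as f:
            f.write(b"version 1")
        os.utime(path, ns=(t1, t1))
        first = get_cached_hash(path)
        with open(path, "wb") as f:
            f.write(b"version 2")
        os.utime(path, ns=(t2, t2))  # simulate a later modification
        second = get_cached_hash(path)
        return first, second
```

One limitation of keying on the modified time: an edit that leaves the timestamp unchanged (for example, two writes within the filesystem's timestamp resolution) would not be detected.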