specifications for caching #68
Comments
The optional "modificationDate" field of the info request puts a time stamp on the entire dataset. For data where updates are rare (all data in "final archive" mode), this is probably sufficient. Autoplot has support for caching based on this, which is disabled by default, but will soon be enabled. The code also has a small kludge, where it will allow data to exist after the modificationDate, but this data will not be cached. This allows for more recent data to be treated as "active archive" mode. This feature will probably be revisited, and possibly removed.
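A minimal sketch of the rule described above, not Autoplot's actual code. The function names, the time format, and the idea of storing the modificationDate seen at download time are all assumptions made for illustration.

```python
from datetime import datetime, timezone

def parse_time(s):
    """Parse a time like 2020-01-02T00:00:00Z into an aware UTC datetime.
    (Assumes this one fixed format; real HAPI times allow more variants.)"""
    return datetime.strptime(s, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)

def may_cache(granule_stop, modification_date):
    """Cache a granule only if it ends at or before modificationDate.
    Data after modificationDate is treated as "active archive" and is
    always fetched again."""
    return parse_time(granule_stop) <= parse_time(modification_date)

def is_stale(cached_modification_date, current_modification_date):
    """A cached granule is stale if the dataset's modificationDate has
    advanced since the granule was downloaded."""
    return parse_time(cached_modification_date) < parse_time(current_modification_date)
```

For example, with a modificationDate of 2020-01-05T00:00:00Z, a daily granule ending 2020-01-02T00:00:00Z would be cached, while one ending 2020-01-06T00:00:00Z would not.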
HTTP has several headers which will be useful. Bob describes no-cache: "The no-cache directive could be used by a server to tell clients to not even bother caching a response associated with a dataset. This could be useful for frequently changing datasets, for example, real time datasets." This could be used to support the "active archive" portion of the data. For example, I collect temperature measurements once per minute and make them available on the server. The data I collected yesterday will never change and could be cached on the client side. However, requests for today's data will result in a different dataset growing every minute, and this should not be cached.
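The temperature example could be sketched on the server side as below. The function name, the use of the last day covered by the request, and the `max-age` value are hypothetical choices, not part of any specification. Note that in HTTP caching semantics `no-cache` allows a response to be stored but requires revalidation before reuse, while `no-store` is the directive that forbids storing it at all.

```python
from datetime import date

def cache_headers(last_day_requested, today):
    """Pick Cache-Control for a request whose final day is last_day_requested.

    Days before today are treated as final. Today's data is still growing,
    so the client is told not to reuse it without checking.
    """
    if last_day_requested >= today:
        # Still changing: the client must revalidate before reuse.
        return {"Cache-Control": "no-cache"}
    # Final data: one day of freshness is an arbitrary illustrative value.
    return {"Cache-Control": "max-age=86400"}
```

A request ending yesterday would get `max-age=86400`, and any request that includes today would get `no-cache`.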
If-Modified-Since allows the client to send the time stamp for the interval, and the server can send back a 304 response, meaning use what you have, no newer data exists. Server responses can also carry a Last-Modified header, and the cache should keep track of this. The tricky part, which I'm still thinking about, is how things should be time tagged. I have daily files on my server, each with a time stamp from the end of the day. When I send three consecutive files, the Last-Modified value should be the latest time stamp. When the client gets this three-day chunk of data, it breaks up the response into its granules, which also happen to be daily. Note the first day has the wrong time stamp, but it should be newer than what it would have received if just that day were requested. I can imagine this causing problems, for example where the first day is updated with a file that is stamped just one day later.
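A rough sketch of the server side of this exchange, assuming one modification time per daily file. The `respond` function and its return shape are hypothetical; only the header names and the 304 status come from HTTP.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

def respond(granule_mtimes, if_modified_since=None):
    """Return (status, headers) for a request spanning several daily files.

    granule_mtimes: aware UTC datetimes, one per file in the request.
    if_modified_since: the raw header value sent by the client, or None.
    """
    # The response is stamped with the newest file's time. HTTP dates
    # have one-second resolution, so drop microseconds.
    newest = max(granule_mtimes).replace(microsecond=0)
    if if_modified_since is not None:
        client_time = parsedate_to_datetime(if_modified_since)
        if newest <= client_time:
            return 304, {}  # nothing newer: use what you have
    return 200, {"Last-Modified": format_datetime(newest, usegmt=True)}
```

This also reproduces the problem described above. If the client stamps day 1 with the three-day response's Last-Modified (day 3's time) and day 1 is later rewritten with a stamp from the end of day 2, a revalidation of day 1 alone would send day 3's time and get a 304, so the update would be missed.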
Autoplot's next devel release will support the "If-Modified-Since" mode for caching.
@jbfaden will create a wiki page with best practices for caching
@jbfaden Please add notes to this page: https://github.com/hapi-server/data-specification/wiki/implementation-notes
We've agreed there should be a guidance document describing how servers and clients can implement caching.
Caching is an important feature to support interactive access to the data. Since data must be downloaded at full resolution, both the server and client may find that they are passing the same data repeatedly in the same session. Autoplot has experimental caching which limits downloads of daily intervals to once per hour, but it would be nice if data were loaded once and client requests would serve only to check if new versions are available.