specifications for caching #68

Closed
jbfaden opened this issue Mar 29, 2018 · 6 comments

Comments

@jbfaden
Contributor

jbfaden commented Mar 29, 2018

We've agreed there should be a guidance document describing how servers and clients can implement caching.

Caching is an important feature for supporting interactive access to the data. Since data must be downloaded at full resolution, both the server and the client may find that they are passing the same data repeatedly in the same session. Autoplot has experimental caching which limits downloads of daily intervals to once per hour, but it would be better if data were loaded once and subsequent client requests served only to check whether new versions are available.

@jbfaden
Contributor Author

jbfaden commented Mar 29, 2018

The optional "modificationDate" field of the info response puts a time stamp on the entire dataset. For data where updates are rare (all data in "final archive" mode), this is probably sufficient. Autoplot has support for caching based on this, which is disabled by default but will soon be enabled. The code also has a small kludge: it allows data to exist after the modificationDate, but this data will not be cached. This allows more recent data to be treated as "active archive" mode. This feature will probably be revisited, and possibly removed.
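
A minimal sketch of the modificationDate rule described above, not Autoplot's actual code. The function names (`parse_time`, `cache_is_fresh`, `may_cache`) and the idea of recording when each granule was cached are assumptions made for illustration.

```python
from datetime import datetime

def parse_time(s):
    """Parse an ISO 8601 time such as '2018-03-29T00:00:00Z' into an aware datetime."""
    return datetime.fromisoformat(s.replace("Z", "+00:00"))

def cache_is_fresh(cached_at, modification_date):
    """A cached granule is reusable if it was stored at or after the
    dataset-wide modificationDate (hypothetical rule for this sketch)."""
    return cached_at >= modification_date

def may_cache(granule_stop, modification_date):
    """The kludge described above: data after modificationDate may exist,
    but is treated as 'active archive' and is not cached."""
    return granule_stop <= modification_date

modification_date = parse_time("2018-03-01T00:00:00Z")
print(cache_is_fresh(parse_time("2018-03-29T12:00:00Z"), modification_date))
print(may_cache(parse_time("2018-03-15T00:00:00Z"), modification_date))
```

With these inputs the cached copy is still usable, while a granule ending on 2018-03-15 falls after the modificationDate and would be fetched each time rather than cached.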

@jbfaden
Contributor Author

jbfaden commented Mar 29, 2018

HTTP has several headers which will be useful. Bob describes no-cache: "The no-cache directive could be used by a server to tell clients to not even bother caching a response associated with a dataset. This could be useful for frequently changing datasets, for example, real time datasets." This could be used to support the "active archive" portion of the data. For example, I collect temperature measurements once per minute and make them available on the server. The data I collected yesterday will never change and could be cached on the client side. However, requests for today's data will result in a different dataset growing every minute, and this should not be cached.
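
As a rough sketch of the temperature example, a server could pick the Cache-Control value from the requested interval. The function name `cache_control_for` and the one-day `max-age` for past data are assumptions, not anything agreed in this thread. Note also that in HTTP, `no-cache` lets a client store the response but requires revalidation before reuse, while `no-store` forbids storing it; which of the two fits best is left open here.

```python
from datetime import date

def cache_control_for(stop_day, today):
    """Return a Cache-Control value for a request whose last requested
    day is stop_day (inclusive). Hypothetical policy for illustration."""
    if stop_day >= today:
        # The interval reaches into the day still being collected,
        # so the response keeps growing and should not be reused as-is.
        return "no-cache"
    # Earlier days are assumed never to change; the max-age is arbitrary.
    return "max-age=86400"

today = date(2018, 3, 29)
print(cache_control_for(date(2018, 3, 28), today))  # yesterday's data
print(cache_control_for(date(2018, 3, 29), today))  # today's data
```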

@jbfaden
Contributor Author

jbfaden commented Mar 29, 2018

If-Modified-Since allows the client to send the time stamp it holds for the interval, and the server can send back a 304 response, meaning "use what you have; no newer data exists." Server responses can also carry a Last-Modified header, and the cache should keep track of it.
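
The exchange above could look roughly like this on the server side. This is a sketch using only the Python standard library; the function name `respond` and its return shape are hypothetical, and a real server would also return the data on a 200.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

def respond(if_modified_since, last_modified):
    """Return (status, headers) for a data request.

    if_modified_since: raw header value sent by the client, or None.
    last_modified: aware UTC datetime of the newest data in the interval.
    """
    # HTTP dates have one-second resolution.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # ignore an unparseable header
        if since is not None:
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            if last_modified <= since:
                return 304, headers  # client copy is current
    return 200, headers

stamp = datetime(2018, 3, 29, tzinfo=timezone.utc)
print(respond("Thu, 29 Mar 2018 00:00:00 GMT", stamp)[0])
print(respond(None, stamp)[0])
```

The client would store the Last-Modified value alongside the cached interval and send it back as If-Modified-Since on the next request.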

The tricky part, which I'm still thinking about, is how things should be time-tagged. My server holds daily files, each with a time stamp from the end of the day. When I send three consecutive files, the Last-Modified value should be the latest of the three time stamps. When the client gets this three-day chunk of data, it breaks the response into its granules, which also happen to be daily. Note that the first day now has the wrong time stamp, but that stamp should be newer than the one the client would have received had it requested just that day. I can imagine this causing problems, for example when the first day is updated with a file that is stamped just one day later.
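
To make the concern concrete, here is a small simulation of that scenario. The dates, the dictionary of file stamps, and all function names are made up for illustration; it only assumes the server compares a day's file stamp against If-Modified-Since.

```python
from datetime import datetime, timezone

def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)

def response_last_modified(file_stamps, days):
    """Server side: a multi-day response carries the latest file stamp."""
    return max(file_stamps[day] for day in days)

def split_into_granules(days, last_modified):
    """Client side: every daily granule inherits the response's one stamp."""
    return {day: last_modified for day in days}

def is_not_modified(file_stamps, day, if_modified_since):
    """Server side: True means the server would answer 304 for this day."""
    return file_stamps[day] <= if_modified_since

# Hypothetical daily files, each stamped at the end of its day.
file_stamps = {
    "2018-03-01": utc(2018, 3, 2),
    "2018-03-02": utc(2018, 3, 3),
    "2018-03-03": utc(2018, 3, 4),
}
days = list(file_stamps)
cache = split_into_granules(days, response_last_modified(file_stamps, days))

# The first day is replaced by a file stamped one day later than before.
file_stamps["2018-03-01"] = utc(2018, 3, 3)

# The client revalidates the first day using the stamp it recorded.
missed = is_not_modified(file_stamps, "2018-03-01", cache["2018-03-01"])
print(missed)  # True: the server answers 304 and the update is missed
```

In this sketch, a client that had recorded the first day's own stamp (2018-03-02) instead of the three-day maximum would have been told the data changed.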

@jbfaden
Contributor Author

jbfaden commented Mar 30, 2018

Autoplot's next devel release will support the "If-Modified-Since" mode for caching.

@jvandegriff
Collaborator

@jbfaden will create a wiki page with best practices for caching.

@rweigel
Contributor

rweigel commented Mar 4, 2019
