Add way to have utils.data functions w/ cache have their cache expire #1162

Open
eteq opened this issue Jun 6, 2013 · 4 comments

Comments

@eteq
Member

eteq commented Jun 6, 2013

Right now, the various functions in utils.data that have a cache option hold onto their cached copy of a download until the file is deleted. Some use cases require the cache to expire after a certain time so that the file gets re-downloaded (e.g. #1145).

So these functions (which all depend on download_file, I believe) should allow cache to specify a timeframe after which the file should be re-downloaded. This is pretty straightforward, but there is a question of how best to define the expiration time. My instinct is to use stdlib datetime.datetime and datetime.timedelta objects, but an alternative might be astropy.time.Time and astropy.time.TimeDelta.
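
To make the stdlib option concrete, here is a minimal sketch of an expiry check based on datetime.timedelta and the cached file's modification time. The helper name cache_is_expired and its parameters are hypothetical; nothing like it exists in utils.data, and how it would hook into download_file is left open.

```python
import os
from datetime import datetime, timedelta


def cache_is_expired(path, max_age, now=None):
    """Return True if the cached file is missing or older than max_age.

    Hypothetical helper, not part of astropy.utils.data.
    `max_age` is a datetime.timedelta; `now` can be injected for testing.
    """
    if not os.path.exists(path):
        # Nothing cached yet, so a download is needed.
        return True
    if now is None:
        now = datetime.now()
    # Use the file's last-modified time as the moment it was cached.
    cached_at = datetime.fromtimestamp(os.path.getmtime(path))
    return now - cached_at > max_age
```

A caller could then re-download whenever something like cache_is_expired(local_path, timedelta(days=7)) returns True. Relying on the modification time keeps the sketch free of extra metadata, at the cost of being fooled if something else touches the file.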

@astrofrog
Member

👍 to this idea!

Using astropy.time.Time to define the expiration time on the tables used for Time... very meta! Hopefully you don't need leap seconds to define the expiration time? ;)

@mhvk
Contributor

mhvk commented Jun 6, 2013

For the IERS tables considered for Time, one only learns how far ahead the file is valid after it has been downloaded. Also, @taldcroft has suggested that refreshing only happen if the contents do not cover the date for which information is needed; if so, expiration may be less essential. On the other hand, these files at least just get longer, so once a copy is there, (the equivalent of) wget -c will do the trick and will carry rather little overhead.
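
A rough sketch of both ideas, assuming the last date covered by the table has already been read from the file and that the server honours HTTP Range requests (which is what wget -c relies on). The names needs_refresh and build_resume_request are hypothetical, and the request is only constructed here, not sent.

```python
import os
import urllib.request
from datetime import date


def needs_refresh(last_covered, needed):
    """Refresh only if the needed date lies beyond what the table covers."""
    return needed > last_covered


def build_resume_request(url, local_path):
    """Build a request for only the bytes missing from the local copy.

    Hypothetical helper: assumes the remote file only ever grows and
    that the server supports the Range header.
    """
    offset = os.path.getsize(local_path) if os.path.exists(local_path) else 0
    request = urllib.request.Request(url)
    if offset > 0:
        # Ask for everything from the current end of the local file onward.
        request.add_header("Range", f"bytes={offset}-")
    return request
```

If the server ignores the Range header it will return the whole file, so a real implementation would need to check the response status before appending to the local copy.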

@hamogu
Member

hamogu commented Jul 13, 2013

It might be worth noting that the SunPy project has a GSoC student who is implementing a database to organize, store, and cache data files that are downloaded from archives.
http://www.sunpy.org/2013/06/19/google-summer-of-code-the-database-class/
So far, I am not quite sure if and how much that overlaps with this, but we might be able to just use that class here as well. It might be worthwhile to check which features are implemented by the end of the summer.

@derdon
Contributor

derdon commented Mar 15, 2014

Hi, this student is me :) The docs for the database can be found at http://sunpy.readthedocs.org/en/stable/guide/acquiring_data/database.html and perhaps the most interesting section is Caching. It's also easily possible to implement and use a custom cache type. This is not documented yet, though. Here you can find a unit test for a custom cache to see an example of how to do it: https://github.com/sunpy/sunpy/blob/master/sunpy/database/tests/test_caching.py#L15

You can also find me on #astropy or #sunpy if you want to discuss this idea further outside this thread.
