Cached datasets #257
Comments
Ah, right. Probably not intended, and it should be fixed as soon as time allows. Could you consider making a PR?
Unfortunately, I have no practice with PRs, but I'll see if I can do something.
I can see the inconvenience, but I think it's debatable whether this is unintended behaviour or not. The point of caching is to make as few requests as possible to the Eurostat servers, and a fix that constantly compared the cached file with the unfiltered remote file would create unnecessary web traffic between end users and Eurostat. Caching can easily be disabled, although it is currently enabled by default. Maybe this is more of a documentation issue? Would adding some explicit messages when downloading and caching data make users more aware of this limitation?
Just to clarify, my point was that I would expect the second query in my example to return an empty table and/or send the query to Eurostat. Basically, the table cached after the first query is only a small part of the dataset, and obviously it cannot be used for broader queries.
Thank you for clarifying. The reason it works like that (whether it is a good reason or not, you decide) is that the query parameters are passed on to the request made to the Eurostat database. For some query parameters no filtering is done locally, whereas in other cases there is at least some processing done locally (if not filtering). An example of the latter is handling Eurostat date strings and turning them into date objects. Yes, it would be possible to add some additional local checks before printing the output, to see whether the geo column has the desired areas or whether the time frame is as desired; if not, then print a message to the user or attempt to refresh the cached dataset. Or maybe the query could be saved with the cached dataset, and the cached data used only if the queries are identical.
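To make the last idea concrete, here is a minimal sketch of a cache keyed by both the dataset id and the query filters, so that a broader second query is never served from a narrower cached table. It is written in Python purely for illustration and is not the package's R code; `cache_key`, `get_dataset`, the `download` callback, and the assumption that filters are a mapping from column name to a list of allowed values are all hypothetical.

```python
import json

# In-memory cache for the session: key -> downloaded table (hypothetical structure)
_cache = {}

def cache_key(dataset_id, filters):
    """Build a key from the dataset id and the normalized query filters."""
    # Sort filter values and filter names so logically identical queries
    # produce the same key regardless of ordering.
    normalized = {name: sorted(values) for name, values in (filters or {}).items()}
    return dataset_id + "|" + json.dumps(normalized, sort_keys=True)

def get_dataset(dataset_id, filters, download):
    """Return cached data only if it was downloaded with an identical query.

    `download` is a caller-supplied function (an assumption of this sketch)
    that performs the actual remote request.
    """
    key = cache_key(dataset_id, filters)
    if key not in _cache:
        _cache[key] = download(dataset_id, filters)
    return _cache[key]
```

With this design, repeating the same query costs no extra request, while changing the filters always triggers a fresh download, at the price of storing overlapping subsets of the same table.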
As referenced in issue #258, it might make more sense to cache datasets that were downloaded without filtering rather than caching filtered datasets. Then, if the complete dataset is cached locally, it can also be filtered locally, solving both issues at a single stroke.
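A rough sketch of that approach, again in Python for illustration only and not taken from the package: the complete dataset is downloaded once and cached, and every query is answered by filtering the cached copy locally. The names `get_filtered` and `download_full`, and the representation of a table as a list of row dictionaries, are assumptions made for this example.

```python
# Session cache holding complete, unfiltered datasets: dataset id -> list of rows
_full_cache = {}

def get_filtered(dataset_id, filters, download_full):
    """Answer a query from a locally cached copy of the full dataset.

    `download_full` is a hypothetical caller-supplied function that fetches
    the whole dataset as a list of dicts (one dict per row).
    `filters` maps a column name to the collection of accepted values.
    """
    if dataset_id not in _full_cache:
        # Only the first query for a dataset triggers a remote request.
        _full_cache[dataset_id] = download_full(dataset_id)
    rows = _full_cache[dataset_id]
    # Keep a row only if it satisfies every filter.
    return [
        row for row in rows
        if all(row.get(column) in allowed for column, allowed in filters.items())
    ]
```

The trade-off is a larger first download in exchange for correct results on any later, broader query without further requests.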
Closed with the CRAN release of package version 4.0.0.
By default, eurostat caches datasets when it is run for the first time during the session, but it does not check whether the cached table contains all the data needed to serve subsequent requests to the same table in Eurostat. I'm not sure if this is the intended behaviour. Please see the following example: