Skip to content
This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

Caching #100

Closed
perrygeo opened this issue Dec 23, 2015 · 5 comments
Closed

Caching #100

perrygeo opened this issue Dec 23, 2015 · 5 comments
Milestone

Comments

@perrygeo
Copy link
Contributor

We might want to add optional caching via https://pypi.python.org/pypi/requests-cache

This would allow clients to store raw responses locally and reduce the burden on the API, particularly when developing and testing where the requests are likely to be repetitive.

@sgillies
Copy link
Contributor

Down at the end of the requests-cache README, it says

requests-cache ignores all cache headers, it just caches the data for the time you specify.

If you need library which knows how to use HTTP headers and status codes, take a look at httpcache and CacheControl.

The Mapbox APIs do have cache control headers we can use, so let's do check out CacheControl: http://cachecontrol.readthedocs.org/en/latest/.

@perrygeo
Copy link
Contributor Author

👍 requests-cache is more helpful for web scraping and end-user caching, not exactly a true web client caching mechanism like CacheControl.

@perrygeo
Copy link
Contributor Author

By default, CacheControl caches to an in-memory dictionary which doesn't persist when the process stops. This would only help for long-running clients, not so much for the mapbox cli.

There is an option to use file caching which uses pickling to serialize requests. This requires an extra level of configuration (user may want to specify the cache file location) but might be useful.

Also, I've had problems install from pypi: psf/cachecontrol#108 resolved

In general, it seems like there are a lot of unmerged PRs and open bugs so I'm not sure how actively it is maintained.

@perrygeo
Copy link
Contributor Author

Initial report: adding a simple wrapper using the dictionary store seems to work well. This passes both the unit tests and the doctests

         session = requests.Session()
+        session = CacheControl(session)

However, both the unit tests and doctests fail when you add FileCache. It creates the cache, writes to it but fails when cachecontrol tries to clean up the file (gives us OSError: [Errno 2] No such file or directory: '/tmp/.mapbox-sdk-py-cache/...). This only affects datasets and uploads.

By adding forever=True to the cachecontrol constructor, we can persist the data on disk after the process is complete. This seems to mitigate the OSError because it's no longer trying to delete the cached data after the session ends. But I'm not sure if forever caching stuff on disk is really desirable for this SDK?

@sgillies
Copy link
Contributor

sgillies commented Jan 5, 2016

I'm going to tackle this issue tonight. Basically:

  • Refactor, producing a new base service class with optional HTTP caching.
  • SDK usage becomes, for example, service = Directions(access_token=access_token, cache=cachecontrol.cache.DictCache()) for non-persistent caching and service = Directions(access_token=access_token, cache=cachecontrol.cache.FileCache()) for persistent caching.

sgillies pushed a commit that referenced this issue Jan 5, 2016
The base service class can use an HTTP cache.

Closes #100.
@sgillies sgillies added this to the 0.7 milestone Jan 5, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants