Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conditional requests? #112

Open
lolmaus opened this issue May 14, 2013 · 13 comments
Open

Conditional requests? #112

lolmaus opened this issue May 14, 2013 · 13 comments

Comments

@lolmaus
Copy link

lolmaus commented May 14, 2013

Hi!

I'm using github_api to work with Gists. I'm building a tiny public online service, and the limitation of 5000 GitHub API requests per hour might one day become not enough (e. g. when someone shares a link on a very popular web portal).

GitHub suggests overcoming the limitation by caching requested files serverside and making conditional requests: GitHub will only parse a request if the requested Gist has been modified since you requested it for the previous time.

The question is: is it possible to do conditional requests with github_api?

If yes, please consider this ticket to be a request for documentation. If not, a feature request.

I'd also greatly appreciate links to example implementations, etc.

Thank you! ^_^

@piotrmurach
Copy link
Owner

Thanks for using the library! The simple answer is no, at the moment the gem does not provide conditional requests. This is a feature that I would like to add, do you have time to look into this?

@lolmaus
Copy link
Author

lolmaus commented May 14, 2013

I'm too novice for this yet. :(

But i'm willing to learn. So if you're ready to chat with me to explain the architecture of your gem and answer lots of newbie questions, i'll be happy to participate!

@lolmaus
Copy link
Author

lolmaus commented May 14, 2013

Hey Peter, i've looked into github_api source and i've got a couple of questions to discuss with you.

  1. A Conditional Request is based on either of two values — a date and an ETag (that looks similar to an md5 hash). GitHub API reports those values in the header of its HTTP response. I have inspected an object returned by github_api and found neither of the two. Does this mean that parsing the HTTP headers is yet to be implemented?
  2. Conditional Requests only make sense if you cache Gists and other stuff serverside. Do objects returned by github_api support serialization/deserialization or something?

@lolmaus
Copy link
Author

lolmaus commented May 15, 2013

Okay, i spent some time fiddling with github_api source and i found out that the object returned by it does contain etag and last-modified, yay! They are accessible like this:

@github.gists.get(params[:gist_id]).env[:response_headers][:etag]
@github.gists.get(params[:gist_id]).env[:response_headers][:"last-modified"]

@lolmaus
Copy link
Author

lolmaus commented May 15, 2013

Okay, all we need to make a conditional request is to merge {"If-None-Match" => '0bf5df817e504a1310b0bdd36b03f8cf'} from an argument of github_api's .gists.get() into the Faraday's headers hash.

I tried to follow the code from github_api's .gists.get() to Faraday's headers, but i got totally confiused with all those arguments, options, params, paginations and other stuff. Geez, this stuff is complicated!

Hey Peter, could you please help me with that?

The question of effective saving/loading of objects returned by github_api's .gists.get() is also actual.

@piotrmurach
Copy link
Owner

@lolmaus Thanks for spending time looking into this. The conditional requests as you rightly pointed out need to work hand in hand with response caching. Currently the gem does not support caching of any sort. I feel that users of the library should be able to add this functionality as optional extra and easily configure it. Once caching has been setup the conditional requests would be enabled by default.
I will be working on this library in couple of days as there are other things that need adding/changing. I will definitely try to implement/work on the caching layer, however this may well not be a trivial task.

@jimjh
Copy link

jimjh commented Jul 8, 2013

@peter-murach Has there been progress on this issue? I am happy to help.

From my perspective, the application should be able to enable caching at two places:

  1. Inside the client. The gem could allow an additional option (config.cache = true) to turn on automatic request caching. If enabled, the gem will store the ETag given in the response and attach it in future requests. This 'memoization' is tricky - would a map of SHA256(faraday_args) -> response.etag be sufficient?
  2. Inside the application. Multi-process applications (say, production apps) may decide to persist the ETag outside of the gem. In this case, the gem should allow an optional argument (etag: '644b5b0155e6404a9cc4bd9d8b1ae730') and insert an additional If-None-Match header if provided.

Alternatively, we could just use faraday-cache, which appears to be used in octokit (see pull request).

@jimjh
Copy link

jimjh commented Jul 8, 2013

I tried adding faraday-http-cache as middleware using config.stack, but no luck so far.

@piotrmurach
Copy link
Owner

@jimjh quite slow to be frank, it would be awesome if you have time to help me out and do some prototyping. I find it much easier to discuss actual code implementation.

  1. Not sure about the option. If we go down the root of the catching middleware being enabled/disabled inside the gem itself then probably this solution would be good. However, allowing to specify externally cache middleware and the storage type that the user may want to use would clash with the option. For instance, if I want to store my responses as flat files then it would be good to provide for that flexibility. Then we could assume if someone configured caching layer it means they want to cache. When it comes to ETags retrieval, i think we should make sure that we hash the get request including the query params. What do you think?
  2. Based on the get request hash the conditional request would get fired with If-None-Match header for revalidation. If we get 304 then we serve the cached copy otherwise we update etag and content and serve that. Is that what you had in mind?

I think initially it would be cool to just flesh out a simple working prototype.

@jimjh
Copy link

jimjh commented Jul 9, 2013

I managed to get caching to work with the faraday-http-cache gem. Please refer to this gist.

  • The cache key is generated from the HTTP method, the url, and the headers.
  • The cache first checks the TTL of an entry. If the entry is fresh, it's used.
  • Otherwise, the entry is validated by including the etag and timestamp in the request.

Since it uses ActiveSupport::Cache, it supports external caches such as FileStore and MemCacheStore. If you are happy with this solution, all we need to do is update the documentation.

Note that in the case of a 304, the http cache middleware will simply return the cached status i.e. the rest of the Faraday stack won't know that it is a 304. Not sure if this will break anything.

@piotrmurach
Copy link
Owner

@jimjh This is really good! I like the way we can have external dependency that allows for easy store configuration. I think the ActiveSupport::Cache is good but not good enough. I'd suggest to see if this could work with moneta that has a huge range of storage options more suited for the json&media type of storage. I think this should not be a problem to integrate (see moneta). Any thoughts?

Currently the 304 I think would raise faraday error but we could add specific 304 error just for this case that people could catch. Eventually, add logic that checks if http cache middleware is present and attempt to retrieve response content.

@jimjh
Copy link

jimjh commented Jul 12, 2013

@peter-murach I am not too familiar with moneta, but it certainly looks powerful. Since MonetaStore implements the ActiveSupport::Cache::Store interface, I managed to use it as a drop-in replacement for the memory store. Check out this new gist.

I guess if there is demand in the future we could somehow indicate that it was a cached response (such as raising a faraday error like you mentioned) For now, the application could forcefully clear the cache store (with @store.clear) if it is really concerned.

@maikokuppe
Copy link

maikokuppe commented Nov 14, 2016

Is this currently possible?

  • Send a request to github specifying the 'If-Modified-Since' header manually
  • Receive a 304 status in a response object
  • Handle this situation with manually persisted data from the past, i.e. doing the 'caching' manually

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants