If a RetryableAPIError exception is raised, we repeat the request at most MAX_RETRIES times before raising an APIError. This guards against infinite loops while still allowing most 403 errors to be worked around. As I explained in the commit message for 6cae2e3, this logic is still pretty vague because GitHub hasn't documented their rate limiting policy yet.
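A minimal sketch of the bounded retry described above. The value of MAX_RETRIES and the `with_retries` helper name are assumptions for illustration; only the two exception names and the constant's name come from this message.

```ruby
class APIError < StandardError; end
class RetryableAPIError < StandardError; end

MAX_RETRIES = 10 # assumed value; the commit message does not state it

# Hypothetical helper: runs the block and repeats it whenever it raises
# RetryableAPIError. Once MAX_RETRIES repeats have been used up, it gives
# up and raises APIError instead.
def with_retries
  retries = 0
  begin
    yield
  rescue RetryableAPIError => e
    if retries >= MAX_RETRIES
      raise APIError, "gave up after #{MAX_RETRIES} retries: #{e.message}"
    end
    retries += 1
    retry
  end
end
```

A caller would wrap the request in the block and raise RetryableAPIError from inside it when the response is a 403.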
My testing strongly suggests that when GitHub returns status code 403 the request can be retried. This may be how they implement rate limiting. So, if we get a 403, we simply repeat the request. We don't wait between requests because there is not yet any evidence that it would benefit us. Hopefully, once the rate limiting is documented, we can revisit this issue. We also retry on Net::HTTPBadResponse exceptions. These are typically raised when something between the client and the server clobbers the response, so repeating the request is the most sensible approach. We don't limit the number of retries, which means this code could end up looping forever. I'm loath to specify some arbitrary limit, however, without documentation on what to expect. For example, in the case of 403 errors, my testing reveals that sometimes we succeed after retrying twice, and other times it may take nearly ten retries.
We want the token and login to be sent for all authenticated queries. They were being sent for POST requests, but, seemingly, not for GETs, causing methods relying on the latter to fail. HTTParty's `default_params` method causes parameters so set to be sent on every request. We specify `login` and `token` as default parameters if the request is authenticated.
GitHub currently returns 500 errors as HTML. When we encountered this, the error message referred to the content type rather than the status code. Now we check the status code first, so errors are more informative.
Signed-off-by: Felipe Coury <email@example.com>
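The ordering of the two checks can be sketched roughly as follows. The `Response` struct and the `validate!` name are hypothetical stand-ins rather than the library's real code, and the expected content type is whatever the caller passes in.

```ruby
class APIError < StandardError; end

# Hypothetical stand-in for an HTTP response object.
Response = Struct.new(:code, :content_type, :body)

# Checks the status code before the content type, so a 500 served as HTML
# is reported as a server error rather than as a format mismatch.
def validate!(response, expected_type)
  unless response.code == 200
    raise APIError, "GitHub returned status #{response.code}"
  end
  unless response.content_type == expected_type
    raise APIError, "expected #{expected_type}, got #{response.content_type}"
  end
  response.body
end
```

With the checks in this order, an HTML error page with status 500 produces a message about the status code, and the content type is only inspected for responses that claim success.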
The API should respond with data in the same format as we requested. If the Content-Type disagrees with what we expected, we raise an exception. This is currently broken for raw Git data as the API call returns the wrong content type. Reported as develop/develop.github.com#13
Repository.find_all accepted an array of 'words', which it concatenated with '+'. It now also accepts a single space-separated String, or any combination of the two. This method is still buggy, however, in that the query is not URI escaped; it's simply interpolated as-is into the URI. Peculiarly, the API doesn't accept URI-escaped space-separated queries, e.g. 'ruby%20spec'. This is non-standard enough to put off escaping until I know exactly what the API expects.
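A rough sketch of how the arguments could be normalised into a query. The `search_query` helper is hypothetical and, as noted above, nothing is URI escaped.

```ruby
# Hypothetical helper: accepts strings, arrays of strings, or any mix of
# the two, splits every string on whitespace, and joins the resulting
# words with '+'. No URI escaping is performed.
def search_query(*words)
  words.flatten.flat_map { |word| word.to_s.split }.join("+")
end

search_query("ruby", "spec")              # array-style arguments
search_query("ruby spec")                 # one space-separated String
search_query(["ruby", "bdd spec"], "api") # a combination of both
```

All three call styles reduce to the same '+'-separated form before being interpolated into the URI.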
* The `tree/show/:user/:repo/:tree_sha` call returns an array of objects in the given tree. This is exposed by the FileObject class. The name is convoluted because Object is already taken by Ruby's core class. 'Tree' is awkward because although the argument describes a tree, the returned objects aren't trees... We should probably call this Tree, and have it return an array of FileObject objects...
* The `blob/show/:user/:repo/:tree_sha/:path` call is supported by Blob.find(user, repo, sha, path).
* The `blob/show/:user/:repo/:sha` call returns raw Git data irrespective of the caller's format preference. To handle this, a get_raw method has been defined which simply requests a given path and returns the raw body without attempting to coerce it into a data structure. The right way to handle this is to format based on the Content-Type header in the response, but that is always set to text/html, so is useless. Blob.find(user, repo, sha) returns said raw data.
* The .find method is now intelligent about arrays. If yaml[key] is an array, each element is assumed to be a hash constituting a new object. I still don't claim to understand all the magic of this module, so this enchantment may very well be unnecessary, but it enabled some hairy code to be factored out, so it stays for now.
* The .find_all method now accepts a block, to which it passes the data it intends to construct a new object with. This is to allow callers to massage an arbitrary data structure into a simple hash. This is used for Tag.find because GitHub returns a single hash of tags, rather than one hash per tag.
(Yes, I realise that this is a ridiculously long commit, and, yes, I have heard of cherry-pick...)
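The block-based massaging for tags might look something like the sketch below. `build_all`, the `Tag` struct and the sample tag data are all made up for illustration; the real .find_all lives in a module whose internals aren't shown here.

```ruby
# Hypothetical stand-in for the library's Tag class.
Tag = Struct.new(:name, :sha)

# Hypothetical helper: builds one object per element of data. If a block
# is given, each element is passed through it first so the caller can
# reshape it into a simple hash of attributes.
def build_all(klass, data)
  data.map do |item|
    attrs = block_given? ? yield(item) : item
    klass.new(*attrs.values_at(*klass.members))
  end
end

# Made-up data in the shape described above: one hash holding every tag.
tags = { "v1.0" => "abc123", "v1.1" => "def456" }

# The block turns each name/SHA pair into the hash a Tag is built from.
objects = build_all(Tag, tags) { |name, sha| { name: name, sha: sha } }
```

Without a block, each element is assumed to already be a hash of attributes, which matches the array case described for .find.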
GitHub API errors aren't reported consistently yet. Sometimes they return reams of HTML and a non-200 status code, other times they return a 200 status code and an error message. We now handle both of these cases by raising an APIError with an appropriate message. This will need to be updated as the API standardises. When it does, we want to interpret 404s, for example, as the object not being found, and thus provide an informative error message.
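Handling both failure shapes could be sketched as below. The helper name is hypothetical, and the assumption that a 200-with-error response parses to a hash with an "error" key is mine; this message doesn't say what that payload looks like.

```ruby
class APIError < StandardError; end

# Hypothetical helper: raises APIError for a non-200 status, or for a 200
# response whose parsed body carries an error message. The "error" key is
# an assumed payload shape, not something the API documents.
def check_for_errors(code, data)
  raise APIError, "GitHub returned status #{code}" unless code == 200

  if data.is_a?(Hash) && data.key?("error")
    raise APIError, "GitHub returned an error: #{data['error']}"
  end

  data
end
```

Once the API reports errors consistently, the status branch is where a 404 could be mapped to a more specific "not found" message.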
There's now a Tag class so you can ask for the tags of a user's repo with Tag.find(user, repo). Plus, all repository objects respond to .tags, which returns an array of Tag objects. This isn't the prettiest code but it works, and that's enough for now.
If GitHub.com can't handle the API request, either due to user error or server error, it returns an utterly useless chunk of HTML. This confuses the library, so we now raise an APIError if the status code is anything other than 200. Better error handling will have to wait until the API supports it.
Hacked Repository.find so that, if called with a single argument, it assumes the argument is a user name and returns an array of Repository objects corresponding to that user's repositories. I don't completely understand this library's architecture, so it's likely my implementation is Wrong. ;-) Works for me, though.
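The single-argument behaviour can be illustrated with a toy version. This is not the library's implementation: the class below is a stand-in, and FAKE_API holds made-up data in place of real API calls.

```ruby
class Repository
  # Made-up data standing in for what the API would return.
  FAKE_API = {
    "alice" => %w[dotfiles blog],
    "bob"   => %w[toolkit]
  }.freeze

  attr_reader :owner, :name

  def initialize(owner, name)
    @owner = owner
    @name = name
  end

  # With one argument, returns every repository belonging to that user.
  # With two, returns the single named repository, or nil if none matches.
  def self.find(user, name = nil)
    repos = FAKE_API.fetch(user, []).map { |repo_name| new(user, repo_name) }
    return repos if name.nil?

    repos.find { |repo| repo.name == name }
  end
end
```

Here Repository.find("alice") yields an array of Repository objects, while Repository.find("alice", "blog") yields a single one.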