Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pagination merge issue #136

Merged
merged 2 commits into from Nov 27, 2020
Merged

Pagination merge issue #136

merged 2 commits into from Nov 27, 2020

Conversation

rundel
Copy link
Contributor

@rundel rundel commented Oct 29, 2020

This is related to #135, I was also seeing strange behavior with the search API and I think I understand the cause now.

Most of the GitHub api endpoints either return a single result as a single json object (e.g. GET /user) or multiple results as a json array (e.g. GET /orgs/:org/members). However, there are a few endpoints where the endpoint returns a single json object that then contains the paginated array (e.g. GET /repos/:owner/:repo/actions/runs and GET /search/repositories) - the problem arises for these latter cases.

This pull request has a proposed fix to deal with the merging of these types separately so that we get the "right" behavior for all three cases. The bug is a bit insidious because any results past the first page are concatenated to the result object, but because the attributes are overwritten the new values don't have a names attribute and the first results are present with the expected structure. See the example below:

z = gh::gh(endpoint = "GET /search/repositories", q = "org:r-lib", .limit = 200)

str(z, max.level=1, give.attr = FALSE)
## List of 6
##  $ total_count       : int 115
##  $ incomplete_results: logi FALSE
##  $ items             :List of 100
##  $ NA                : int 115
##  $ NA                : logi FALSE
##  $ NA                :List of 15

Additionally, I think some rethinking of the pagination may be necessary. For example length(res) < .limit in

gh/R/gh.R

Line 177 in b524a44

while (!is.null(.limit) && length(res) < .limit && gh_has_next(res)) {
does not work correctly for the named multiresult endpoints but is also doing the wrong thing for the single result endpoints (although this doesn't matter since these never have a next link). The name of the results array also changes between endpoints so it is difficult to workout what the current # of results is.

Another related problem is using a small .limit with one of the single result endpoints, e.g.

gh::gh(endpoint = "GET /user", .limit = 1)
Error in attributes(res) <- res_attr : 
  'names' attribute [39] must be the same length as the vector [1]

as this currently ends up incorrectly subsetting the results object values instead of the "results".

R/gh.R Show resolved Hide resolved
Copy link
Collaborator

@gaborcsardi gaborcsardi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

R/gh.R Show resolved Hide resolved
R/gh.R Show resolved Hide resolved
@gaborcsardi
Copy link
Collaborator

gaborcsardi commented Nov 27, 2020

Thanks! This sort of heuristic is somewhat risky, because we don't know what it might break in the other parts of the API. OTOH it seems beneficial, so I'll merge it and we'll if it breaks anything.

@gaborcsardi gaborcsardi merged commit a8af0be into r-lib:master Nov 27, 2020
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants