PaginatedList.totalCount KeyError: 'page' with large results #1006

Open
Twitch opened this issue Jan 3, 2019 · 6 comments

Twitch commented Jan 3, 2019

When I read get_repos().totalCount on a small result set (only my own repositories), totalCount returns the repository count correctly. When I query my GitHub Enterprise instance for the complete list of repositories, the following error occurs:

Traceback (most recent call last):
  File "./test-repo-list.py", line 9, in <module>
    print("Found %s repositories." % full_repo_list.totalCount)
  File "/usr/local/lib/python3.7/dist-packages/github/PaginatedList.py", line 175, in totalCount
    self.__totalCount = int(parse_qs(lastUrl)['page'][0])
KeyError: 'page'

I am using Python 3.7.1 and the pip-installed version 1.43.4 of PyGithub against GitHub Enterprise version 2.14.7.

This is the code I have used to test this:

#!/usr/bin/env python3.7
from github import Github
import os

api_token = os.getenv('GITHUB_TOKEN', None)
g = Github(base_url="https://mygithubhost.lads", login_or_token=api_token)

full_repo_list = g.get_repos()
print("Found %s repositories." % full_repo_list.totalCount)

I did notice, after some skimming of PaginatedList.py, that the Link headers differ between the two tests. When collecting just my own repositories, I see the "last" link returned in the link header, but when querying for all repositories I only see "next" and "first" links returned.

g.get_user().get_repos():

'link':'<https://mygithubhost.lads/api/v3/user/repos?per_page=1&page=2>; rel="next", <https://mygithubhost.lads/api/v3/user/repos?per_page=1&page=10>; rel="last"'

g.get_repos():

'link':'<https://mygithubhost.lads/api/v3/repositories?per_page=1&since=112>; rel="next", <https://mygithubhost.lads/api/v3/repositories{?since}>; rel="first"'

I don't know what's causing this difference in behavior, though.
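
To make the difference between the two headers concrete, here is a minimal sketch that splits a Link header value into its rel names. The helper name link_rels and the regular expression are assumptions made for illustration, not PyGithub code; the header strings are the two shown above.

```python
import re


def link_rels(link_header):
    """Map each rel name in a Link header value to its URL (illustrative helper)."""
    # Each entry in the header looks like: <URL>; rel="name"
    pairs = re.findall(r'<([^>]*)>;\s*rel="([^"]+)"', link_header)
    return {rel: url for url, rel in pairs}


user_repos = (
    '<https://mygithubhost.lads/api/v3/user/repos?per_page=1&page=2>; rel="next", '
    '<https://mygithubhost.lads/api/v3/user/repos?per_page=1&page=10>; rel="last"'
)
all_repos = (
    '<https://mygithubhost.lads/api/v3/repositories?per_page=1&since=112>; rel="next", '
    '<https://mygithubhost.lads/api/v3/repositories{?since}>; rel="first"'
)

print(sorted(link_rels(user_repos)))  # ['last', 'next']
print(sorted(link_rels(all_repos)))   # ['first', 'next']
```

Only the first header carries a "last" URL with a page parameter, which is the value the failing line in the traceback tries to read.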

If I don't try to get totalCount from the object, it works as expected, but totalCount specifically seems to fail.

Hanaasagi (Contributor) commented Jan 14, 2019

Maybe this is explained by the list-all-public-repositories API doc:

Note: Pagination is powered exclusively by the since parameter. Use the Link header to get the URL for the next page of repositories.

It does not support the per_page parameter, so the pagination strategy does not work. But I don't know why there is no "last" link returned in the Link header.
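
If pagination really is driven only by since, there is no page number to read a total from, and a count has to come from walking the pages. A rough sketch of that idea follows, assuming a hypothetical fetch_page callable that returns the items of one page plus the "next" URL (or None); the fake page table is invented for the example and does not come from any real API response.

```python
def count_by_walking(first_url, fetch_page):
    """Count items by following "next" links until none is left.

    fetch_page is a hypothetical callable: url -> (items, next_url or None).
    """
    total = 0
    url = first_url
    while url is not None:
        items, url = fetch_page(url)
        total += len(items)
    return total


# Fake three-page result set keyed by a since cursor (illustration only).
FAKE_PAGES = {
    "repositories": (["a", "b"], "repositories?since=2"),
    "repositories?since=2": (["c", "d"], "repositories?since=4"),
    "repositories?since=4": (["e"], None),
}

print(count_by_walking("repositories", FAKE_PAGES.__getitem__))  # 5
```

With PyGithub itself, iterating the PaginatedList (for example sum(1 for _ in g.get_repos())) should have the same effect, at the cost of one request per page.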

sfdye added the bug label Jan 18, 2019

stale bot commented Sep 13, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the stale label Sep 13, 2019
stale bot closed this as completed Sep 20, 2019

stoiev commented Dec 19, 2019

This issue could be related to the way PaginatedList.py parses link URLs from the headers.

parse_qs is called with the full URL, so it fails to parse the first parameter (the one right after the ?):

>>> from urllib.parse import parse_qs
>>> parse_qs("https://github.com/Cobliteam/?page=11")
{'https://github.com/Cobliteam/?page': ['11']}
>>> parse_qs("https://github.com/Cobliteam/?per_page=3&page=11")
{'https://github.com/Cobliteam/?per_page': ['3'], 'page': ['11']}

Could this bug be reopened? Or should I open a new issue?

@SlapDrone

Hey, just got stung by this. Anyone since managed to find a solution?

@EnricoMi (Collaborator)

Never seen this, reopening this issue for tracking.

EnricoMi reopened this Sep 19, 2023

sag1sh-rezilion commented Oct 4, 2023

I've run into this issue a few times. It shows up more often when passing lazy=True while creating the repo object, for example: r = g.get_repo('docker/docker', lazy=True); c = r.get_contributors(anon='true'); c.totalCount

An easy fix could be made in the totalCount function. Instead of the line self.__totalCount = int(parse_qs(lastUrl)["page"][0]), there should be (with urlparse imported from urllib.parse):
urlQuery = urlparse(lastUrl).query
self.__totalCount = int(parse_qs(urlQuery)["page"][0])
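
As a standalone sketch of that idea, the helper below isolates the query string before parsing it and returns None instead of raising when there is no page parameter. The name page_from_url is hypothetical and this is not the PyGithub implementation; the URLs are the ones quoted earlier in this thread.

```python
from urllib.parse import parse_qs, urlparse


def page_from_url(url):
    """Return the integer page query parameter of url, or None if it is absent."""
    # Parse only the query component, not the whole URL.
    query = urlparse(url).query
    values = parse_qs(query).get("page")
    return int(values[0]) if values else None


print(page_from_url("https://github.com/Cobliteam/?page=11"))             # 11
print(page_from_url("https://github.com/Cobliteam/?per_page=3&page=11"))  # 11
print(page_from_url("https://mygithubhost.lads/api/v3/repositories{?since}"))  # None
```

This addresses the case where page is the first query parameter. For since-paginated endpoints that return no "last" link at all, the caller would still need some other way to obtain a count.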

stale bot removed the stale label Oct 4, 2023