Is there a modern equivalent to PyPI's XML-RPC .top_packages() ? #1930

Closed
cclauss opened this issue Apr 19, 2017 · 2 comments

Comments

cclauss commented Apr 19, 2017

Is there a more modern, supported way of determining the relative popularity of all PyPI packages?

The legacy PyPI XML-RPC API's top_packages() call returns a list of packages sorted by number of downloads. My sense is that the download counts do not include pip installs. (Is that correct?)

The counts also disagree: the simple API returns 106,086 projects, https://pypi.org says 106,147 Projects, and top_packages() returns only 96,104.

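import bs4
import requests

try:
    import xmlrpc.client as xmlrpclib  # Python 3
except ImportError:
    import xmlrpclib  # Python 2

# Count the project links listed on the simple index page.
soup = bs4.BeautifulSoup(requests.get('https://pypi.org/simple/').text, 'lxml')
links = soup.find_all('a')
print(len(links))  # 106,086
print(len(set(links)))  # 106,086

# Count the entries returned by the legacy XML-RPC call.
client = xmlrpclib.ServerProxy('https://pypi.org/pypi')
print(len(client.top_packages()))  # 96,104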

dstufft (Member) commented Apr 19, 2017

The best method is to query BigQuery directly: https://langui.sh/2016/12/09/data-driven-decisions/

The download counts haven't been reliably updating for some time :/
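
For illustration, here is a rough sketch of what the BigQuery approach could look like from Python. It is not taken from the linked post. It assumes the public `bigquery-public-data.pypi.file_downloads` table (the dataset name and location have changed over time, so check the current docs), the `google-cloud-bigquery` client library, and a Google Cloud project with credentials already configured. The helper names `build_top_packages_query` and `top_packages` are made up for this example.

```python
def build_top_packages_query(days=30, limit=100):
    """Return a SQL string ranking projects by download count.

    Assumption: downloads live in the public table
    `bigquery-public-data.pypi.file_downloads`, one row per download.
    """
    days = int(days)    # coerce to int so only numbers reach the SQL text
    limit = int(limit)
    return (
        "SELECT file.project AS project, COUNT(*) AS downloads\n"
        "FROM `bigquery-public-data.pypi.file_downloads`\n"
        "WHERE DATE(timestamp) >= "
        "DATE_SUB(CURRENT_DATE(), INTERVAL {days} DAY)\n"
        "GROUP BY project\n"
        "ORDER BY downloads DESC\n"
        "LIMIT {limit}"
    ).format(days=days, limit=limit)


def top_packages(days=30, limit=100):
    """Run the query and return a list of (project, downloads) tuples.

    Needs the google-cloud-bigquery package and configured credentials.
    """
    # Imported lazily so the query builder works without the library.
    from google.cloud import bigquery

    client = bigquery.Client()
    rows = client.query(build_top_packages_query(days, limit)).result()
    return [(row.project, row.downloads) for row in rows]
```

Calling `print(build_top_packages_query(days=7, limit=10))` shows the SQL without touching the network. `top_packages()` actually runs it, and a query over this table can scan a lot of data, so keeping the date window short is a sensible precaution against unexpected cost.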

cclauss (Author) commented Apr 19, 2017

Perfect!
