Can register packages that match system packages #585
tl;dr: Allowing people to register a package named 'json' (or any other standard library module) on PyPI provides no advantage and should be prevented.
During a Python event in London, we noticed that PyPI allows registering packages with the same name as standard library packages (for example 'sys'!).
This isn't a massive security issue per se, as PyPI package contents are untrusted anyway, but it feels like something that's trivial to prevent, and preventing it would stop bad actors from exploiting it going forward.
One hypothetical attack might go something like:
More complex case:
There is also the possibility that people have written automatic requirements.txt generators that scrape imports to work out dependencies. In this situation, imports of built-in packages will end up in requirements files too.
The above may seem fairly unlikely, but when I noticed this issue I proactively registered many standard library package names*, using a dummy payload that does nothing more than raise an exception telling the user not to install from PyPI. (I emailed the security contacts listed on the PyPI site the same day but got no reply.)
This meant that I could use the pypi download logs to measure downloads:
The following table shows the total number of downloads of my most popular system packages during December 2016 (where the installer was pip):
For reference, the query that generated this is:
I'm sure there are many automated build scripts trying to install these packages, but the data indicates that pip is being used to install them from many different subnets, countries, versions of Python and package installers.
As the owner of these packages, I don't mind them being taken off me, or access to them disabled as part of any fix.
(*) I thought a lot about whether this was the right thing to do, but decided on this approach based on several factors:
I'm curious what other package managers do, since saying "install X from Y" requires that you fully trust both X and Y. The common names are probably good to squat, but it's a pick-two-of-three: open for anyone to register, safe, and cheap (labor-wise).
Thank you for the detailed report! Could you please start a discussion on distutils-sig? Most of the packaging experts follow that list, so you are more likely to get feedback there. We can still use this issue to discuss or review implementation details if an agreement is reached on the mailing list.
Related: [Taking over 17000 hosts by] Typosquatting programming language package managers. There the researcher instrumented the package's
Crawling the 404 logs of PyPI for multiple failed install requests across multiple users/IPs to the same package and blacklisting/squatting them would be a good proactive step. Basically: crowd-source the names to protect.
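As a rough sketch of that idea: the `(client_ip, package_name)` event format, the `names_to_protect` function, and the `min_clients` threshold are all assumptions for illustration, not how PyPI's logs are actually structured.

```python
from collections import defaultdict


def names_to_protect(events, min_clients=3):
    """Return missing package names requested by at least `min_clients` distinct clients.

    `events` is an iterable of (client_ip, package_name) pairs taken from
    404 responses; this shape is hypothetical.
    """
    clients_by_name = defaultdict(set)
    for client_ip, package_name in events:
        # Count distinct clients, so one noisy build server cannot trigger a block alone.
        clients_by_name[package_name.lower()].add(client_ip)
    return sorted(
        name for name, clients in clients_by_name.items()
        if len(clients) >= min_clients
    )
```

Counting distinct clients rather than raw hits is the point: a name that many unrelated users try to install is a better squatting target than one a single misconfigured script retries in a loop.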
For reference, there is at least one other report in this repository about a PyPI package with a stdlib name ("logging") being available and actually causing issues. It doesn't seem to have malicious intent and I can't get it to break pip in the way the report describes, but it does still exist.