Detect malicious packages, for later removal #5117
There are almost 200K projects on PyPI. We don't have the ability to manually audit each one. How do you propose this should be done?
Exactly! -- And probably 99.9% useless, outdated, fake, deprecated (at best), or possibly containing malware, at worst!
:) We are programmers, so I'm sure we can figure that out! How about searching for packages that:
That's just a start... and would probably remove a siht load of crud.
Another related issue is that there seems to be some kind of cybersquatting of package names going on: packages with little or no meaningful content that occupy useful names. How do you plan to deal with that?
Thanks for filing this issue, @E3V3A! Per discussion today, we'll be addressing this problem during upcoming work on automated detection of malicious uploads. In this issue we'll be nailing down our criteria for "how do we determine what is a bad package?" and plans for removing those packages. (Note that we're distinguishing between a malicious upload and spam, and between malware and typosquatting, and that there are other issues -- like #194, #4319 and #4004 -- that concentrate on filtering re: packages that have noncompliant metadata or no recent releases.)
Per a discussion with @ewdurbin last week: The work we'll do on automated detection of malicious uploads will first concentrate on finding malicious packages, and building the tools around that. Only after that will we be able to provide automated tools to help PyPI admins remove them.
From #7061:
I'm very interested in this effort and would like to help. Given that there are so many packages, here are a few suggestions:
Hello friends! I will be working on the backend implementation of the system for adding malware checks. You can track the progress of this work by checking out the malware-detection label.
Hey everyone.
Yes, absolutely! I'm actually giving a charla about this system at PyCon, but for interested non-Spanish speakers, I can give the English version during the sprints. Also, I'd really love to get feedback on this contribution documentation, and this sounds like a great way to do that.
@xmunoz Are there any slides of that charla?
For the first question, I'll follow up over email :) The second question could potentially be answered by @ewdurbin.
The malware-detection branch has been merged into master with PR #7377.
I need to develop a tool that detects malicious repositories. @xmunoz, can you help me with it?
Looking at the simple package index, there are a number of highly questionable packages (at least judging by their names).
Packages without proper names, authors, or descriptions should probably be removed, if not to reduce bloat, then for security reasons.
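A rough sketch of what such a metadata check could look like, purely as an illustration and not how PyPI or Warehouse actually does it. The function name `flag_suspicious_metadata`, the three-character name threshold, and the reason strings are all made up for this example; the keys `name`, `author`, and `summary` are assumed to follow the usual package metadata fields.

```python
import re

# Assumption for illustration: a "proper" name contains at least one letter.
_HAS_LETTER = re.compile(r"[A-Za-z]")


def flag_suspicious_metadata(info):
    """Return a list of reasons why a package's metadata looks questionable.

    `info` is a plain dict with optional "name", "author" and "summary" keys.
    An empty list means none of these (hypothetical) heuristics fired.
    """
    reasons = []

    # Treat missing keys and None values the same as empty strings.
    name = (info.get("name") or "").strip()
    author = (info.get("author") or "").strip()
    summary = (info.get("summary") or "").strip()

    # Hypothetical threshold: very short or letter-free names are flagged.
    if len(name) < 3 or not _HAS_LETTER.search(name):
        reasons.append("name too short or has no letters")
    if not author:
        reasons.append("missing author")
    if not summary:
        reasons.append("missing description")

    return reasons


if __name__ == "__main__":
    sample = {"name": "0", "author": "", "summary": None}
    print(flag_suspicious_metadata(sample))
```

A check like this only flags candidates for a human to review; it says nothing about whether a package is actually malicious.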
Stuff like this: