verify packages before running them non-interactively #91
Please add some cryptographic checking on the validity of updated packages.
Right now it's pretty trivial to attack anyone using Package Control: just DNS spoof sublime.wbond.net at some coffee shop where a Sublime developer goes, and serve up "updates" to all possible packages that contain malware. Sublime will then instantly load whatever arbitrary Python code the attacker wants to send, and execute it locally, as soon as our unfortunate developer opens their laptop. Goodbye credit cards, et cetera.
Both the update listing and the packages themselves should be signed in some way. Since most packages are on GitHub, a regular old HTTPS certificate check everywhere, while not ideal, would be a good start. The curl and wget downloaders will sometimes verify a cert, but that depends on local configuration; the urllib one won't. But none of that matters if the URLs are HTTP, so those should be disallowed during an automatic update.
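The "disallow HTTP during automatic updates" point is easy to enforce mechanically. A minimal sketch (the function name is mine, not Package Control's):

```python
from urllib.parse import urlparse

def is_safe_update_url(url):
    # Hypothetical guard for the auto-upgrade path: refuse to fetch
    # anything that is not served over HTTPS, since a plain-HTTP
    # download can be trivially replaced by an on-path attacker.
    return urlparse(url).scheme == 'https'
```

An updater could run every download URL through a check like this before fetching, and skip (or warn about) any package whose URL fails it.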
Until they are, please disable auto_upgrade by default so that upgrading packages is an explicit decision made when the developer at least feels that their network connection is unlikely to be pwned.
Thanks for raising this issue!
You are definitely correct in saying that it is time for me to set up the package list to be served over HTTPS. Does Python not ship with a CA cert bundle, so that I could reasonably verify the SSL certs via urllib/urllib2?
In terms of generating hashes for the packages, I'm not sure that will be possible without completely gutting the whole infrastructure that Package Control is built upon. Unless I am missing something, GitHub and BitBucket do not provide hashes for the zip downloads. The only way I could provide hashes for verification is if I have my channel server download every package from GitHub and BitBucket and generate the hashes itself.
In terms of signing the channel file - this would be trivial, except that the Linux version of Sublime Text does not ship with the ssl module, leaving almost 20% of users out in the dark. I could attempt to fork off a CLI openssl process to do some public key crypto instead, but there is still the very real possibility of bricking Package Control for almost 20k users.
I'd prefer to try and roll out the HTTPS default channel URL with the CA cert for my SSL cert, assuming that Python doesn't necessarily have access to CA certs. I could then provide a URL in the channel that would have the CA certs for BitBucket and GitHub so that downloads from those sites could be verified. This would allow me to continue running package control off of the current infrastructure without having to find different hosting to handle the load of independently downloading and generating hashes.
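The plan above - verify the channel connection against a CA cert shipped with the package manager - maps closely onto what the modern ssl module provides. A sketch under that assumption (the bundle path and function names are mine; this is not Package Control's actual code, which predates these APIs):

```python
import ssl
import urllib.request

def channel_context(cafile=None):
    # create_default_context() enables both certificate and hostname
    # verification. Passing cafile restricts trust to the bundled CA
    # cert instead of relying on a platform store that, as discussed
    # above, may not exist at all.
    return ssl.create_default_context(cafile=cafile)

def fetch_channel(url, cafile=None):
    # Fetch the channel JSON over a verified HTTPS connection.
    return urllib.request.urlopen(url, context=channel_context(cafile)).read()
```

With a bundle shipped at, say, `Package Control/certs/ca-bundle.crt` (a hypothetical path), the channel fetch would be `fetch_channel('https://sublime.wbond.net/repositories.json', cafile=...)`.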
I understand your desire to turn off automatic updating right away, but I'm gonna try to swiftly get the SSL stuff set up instead. Once I turn off automatic updating, there will be no automatic way to turn it back on, which will in essence leave 100k developers with a package manager that supposedly automatically updates packages, but does not.
On a related note, none of this in any way prevents malicious code from getting onto developers' computers. The package infrastructure does not involve a formal review process, so it would be possible for any developer who has a package on GitHub or BitBucket to ship a benign package one day, and the next ship Python code to wipe someone's hard drive. I don't think there is going to be any reasonable way to prevent this possibility, other than telling users it is possible. They can then take the risks into consideration and determine whether they trust the developer who created the package.
If you, or anyone else has more thoughts, or see issues with my current plan of action, please do let me know! I'd like to get this resolved very soon.
No, sadly. I'm ashamed to say that neither does my own HTTP client. You can see some of the discussion of the complexity and unpleasantness of that issue here: http://twistedmatrix.com/trac/ticket/4023#comment:7 and here http://twistedmatrix.com/trac/ticket/4888 (I can't find the analogous discussion for Python itself, but it's similar). This is one reason that I said the packages themselves should be verified somehow: participating correctly in the whole certificate-authority cartel circus is a challenge. If you ship your own CA certs list, you have to keep it up to date somehow: recent scandals in the world of exploiting the authorities themselves have shown that this is important. If you use the one from your platform you have to write a ton of platform-specific code to translate from "native" SSL to OpenSSL; and in some cases, the platform itself doesn't necessarily provide CA certs (ca-certificates isn't always installed on Debian and Ubuntu systems) and you might need to talk to a browser.
That said, Mercurial has to work around this issue as well; they do ship their own bundle, but they explain various useful techniques for deferring to the platform in certain cases (which is the more secure thing to do) here: http://mercurial.selenic.com/wiki/CACertificates
Of course not; nothing can ever prevent that completely. The issue, however, is one of trust, and more broadly, of cost. I'm reasonably comfortable trusting you, trusting the developers of the plugins I've selected, and trusting GitHub, BitBucket, or the developers' hosting providers. That is pretty much the cost of doing business in Internet-land these days; there are always a couple of intermediary parties. That's always been the implicit expectation of saying "Install this software".
However, I'm not OK with trusting you, everyone on your network, everyone on my network, and everyone in between. That's potentially millions of people, many of whom are known to be malicious. The cost of mounting an attack by forging one plain DNS request over coffee-shop wifi is about $2 for an individual with the requisite skill; the cost of forcibly breaking in to your server, or to GitHub, is presumably several orders of magnitude higher than that, assuming a modicum of security awareness on your part.
Here is the solution I am working on right now (I'm about 70% complete):
I saw you answered the following Stack Overflow question a while ago. Do you see any issue with the technique presented at http://stackoverflow.com/questions/1087227/validate-ssl-certificates-with-python#answer-3551700? Basically just providing a specific cert for the SSL connection?
Thanks for your time!
You need to retrieve the certificate authority certificates, not the host certificates, in order to properly do verification. If your sublime.wbond.net certificate contains a self-signature, then that will work, but otherwise you will need to include the cert of your upstream CA. Most certificates are not self-signed (if I recall properly, it's kinda tricky to make both the self-signed and CA-signed versions work so that clients can see both, and the CAs' already-fairly-complex instructions on how to interact with OpenSSL will not tell you how).
You can, however, ship just your CA's cert, and only ship a new one if you change CAs - that way you won't even need to do the update. This is actually more secure than the default, so it would be fine :).
You really should not be obtaining and caching CA certificates. First of all, there's nowhere to retrieve them from, because a server presenting its signed certificate may omit the authority root. Second, even if you could retrieve it, trusting it wouldn't be allowed, because the reason that root may be absent is that (cf. RFC 2246 section 7.4.2) "the remote end must already possess it in order to validate it in any case". CA certificates are required to be a-priori knowledge in the TLS protocol.
That technique will work fine, but again, it's not providing a specific cert, it's providing a specific authority. The 'certfile' argument there is the private client certificate to use, not the server's expected certificate.
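The distinction drawn here - trusted authority certs versus a client certificate - can be made concrete with the modern SSLContext API (a sketch; names are mine, and the 'certfile'/'ca_certs' arguments referenced are from the older `ssl.wrap_socket`-style API in the SO answer):

```python
import ssl

def make_verifying_context(ca_certs=None, client_certfile=None):
    # ca_certs is the analogue of the SO answer's 'ca_certs' argument:
    # the *authority* certificates to trust, which is what actually
    # verifies the server. PROTOCOL_TLS_CLIENT enables certificate and
    # hostname verification by default.
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if ca_certs:
        context.load_verify_locations(cafile=ca_certs)
    else:
        context.load_default_certs()
    if client_certfile:
        # The old API's 'certfile' argument is a *client* certificate,
        # which belongs in load_cert_chain(), not in the trust store.
        context.load_cert_chain(client_certfile)
    return context
```

A socket wrapped with this context (via `context.wrap_socket(sock, server_hostname=host)`) will then refuse to complete the handshake unless the server's chain leads back to one of the loaded authorities.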
Perhaps the easiest way to do what you want to do would be to manually curate a CA certificates list on wbond.net, exporting your trust root from a browser of your choice or from a recent ca-certificates package, ship a single CA cert (the one you're currently using) with Package Control, and then update the local CA archive by downloading it from a secure URL at wbond.net verified against that existing CA?
Thanks again for all of your help.
I was hoping there would be a way to compare the certificate directly without shipping the CA certs, but from what you've said it doesn't sound like that will work.
I did play around with openssl s_client some and found a way to extract the CA certs that will be necessary. I'll ship the current CA certs for the 5 different secure domain names currently referenced. I'll also add a mechanism to fetch new ones from sublime.wbond.net if necessary. This way I won't have to ship a 0.5MB ca-bundle.crt with Package Control.
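For reference, the `openssl s_client` invocation that dumps a server's full presented chain looks roughly like the following (a sketch of the command shape only; the thread doesn't show the exact extraction procedure used):

```python
def s_client_command(host, port=443):
    # Build the openssl CLI invocation. -showcerts prints every
    # certificate in the chain the server presents, so the issuing
    # CA certs can be copied out of the PEM blocks in the output.
    return ['openssl', 's_client', '-connect',
            '{0}:{1}'.format(host, port), '-showcerts']
```

Run via subprocess with an empty stdin so s_client exits after the handshake, e.g. `subprocess.run(s_client_command('sublime.wbond.net'), input=b'', capture_output=True)`.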
I'll close this off once I have the new version of Package Control out.
Sort of; however, Package Control never actually tries to contact github.com. It uses api.github.com, nodeload.github.com, or raw.github.com.
What URL are you providing for the Formal SQL package? Most likely you included the .git suffix, and Package Control is expecting a packages.json file at that location since it doesn't match the pattern for GitHub URLs.
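The dispatch described here can be sketched as follows (the regex is my approximation of the behavior described, not Package Control's actual pattern):

```python
import re

def looks_like_github_repo(url):
    # A bare GitHub repository URL is handled via the GitHub API;
    # anything else - including URLs ending in .git - falls through
    # and is expected to serve a packages.json file directly.
    if url.endswith('.git'):
        return False
    return re.match(r'https?://github\.com/[^/]+/[^/]+/?$', url) is not None
```

So `https://github.com/user/repo` would be treated as a repository, while `https://github.com/user/repo.git` would trigger the packages.json lookup the comment above describes.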