Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

On securing Composer updates with The Update Framework (TUF) #9740

Open
TravisCarden opened this issue Mar 1, 2021 · 15 comments
Open

On securing Composer updates with The Update Framework (TUF) #9740

TravisCarden opened this issue Mar 1, 2021 · 15 comments
Labels

Comments

@TravisCarden
Copy link
Contributor

TravisCarden commented Mar 1, 2021

Several open source PHP projects including Drupal, Joomla, and Typo3 are working on a Composer-based implementation of The Update Framework (TUF) for securing software update systems (php-tuf). TUF provides protection against various attacks including those involving compromised repositories or signing keys. Before we undertake to implement it, we want to make sure we don't duplicate effort and that whatever we end up doing has the potential to benefit the greatest number of Composer users. Is this something that interests the maintainers here? Would you like to collaborate or provide guidance? What are your thoughts?

@trishankatdatadog
Copy link

Just so maintainers have a better idea, I think the projects mentioned above are trying to achieve something like PEP 458 level of security for at least some PHP package repositories (such as for Drupal).

@joshuagl
Copy link

joshuagl commented Mar 4, 2021

Because PEP 458 is long, I wanted to try and provide an elevator pitch.

PEP 458 is a design for repository signing with TUF which is transparent to both developers (uploading packages) and users (downloading packages). Repository signing ensures that package contents have not been modified (tampered with) from the time the package is added to the repository to when the client downloads and installs it.

One of the nice things about the PEP 458 design is that it can be extended (at a later time) to add optional developer signing of packages as described in PEP 480 (note we intend to update this PEP Real Soon Now™️). Developer signing gives the additional security benefits of ensuring the package contents are not modified from the time they are signed by the developer (before upload to the repository), and further provides origin authentication – that the key associated with a user was used to sign the package.

@trishankatdatadog
Copy link

Thanks for the summary, Joshua!

Repository signing ensures that package contents have not been modified (tampered with) from the time the package is added to the repository to when the client downloads and installs it.

IIUC, with PEP 458, you don't get that guarantee exactly: you get protection from MitM attacks between the repository and end-users (e.g., from malicious mirrors that are basically MitM). With PEP 480, you get the protection you mentioned above.

@stof
Copy link
Contributor

stof commented Mar 4, 2021

Given that the main composer repository (packagist.org) does not store the code but only the metadata, would TUF work there ?
Developers are not uploading packages to packagist.org. And users are not downloading packages from packagist.org but only metadata. Packages are then download from GitHub or Gitlab (or whatever else the package is hosted).

Disclaimer: I know nothing about TUF other than what is written in this issue.

@Seldaek
Copy link
Member

Seldaek commented Mar 9, 2021

Is there interest? Generally speaking yes. However there are definitely challenges here. The main one as @stof mentioned is that the way we do things on Packagist.org people do not create package artifacts, and do not upload them. They just create a tag and push that to the git repo and that's the end of it.

Packagist.org

This tag-to-package process provides simplicity, and makes sure that what you see in the git repo is what you get installed locally, which I think is a great feature actually when comparing to what I see in the npm world where the package files I have locally are sometimes entirely different from the source repo, making it harder to contribute, patch or review code.

We tried verifying zip files in the past, and have been kinda stuck without solution for ages (see #2540). To sum things up, there are a few options:

  • We get guarantees from GitHub that zip files for a tag, once created, will not ever change. Then we could download that and sign it and we'd just have to store the signatures on our end. That would be the ideal solution.
  • We do what SHA1 sum of package dists should be more reliable than a SHA1 on the zip result #2540 suggests and find a way to strip down a zip archive in a way that it can be signed/verified even tho it may have minor metadata changes in the zip headers/TOC. This is probably more realistic as we would not need GH collaboration, but it also has a CPU impact on every client which may or may not be bearable.
  • We do download zip files from GH, sign that and store them, but then we have to start serving zip content as well as metadata which would mean our bandwidth costs would rise quite substantially.

I overall feel like automatic signing does not bring a ton of value as we do have tight control over our mirrors (as long as you don't add a custom third party one) and GH can also probably be considered a trusted party because otherwise we're all doomed anyway. So the first question is if we can't do the above cleanly, is it worth doing the automatic repo signing?

The next level (PEP 480) bringing in maintainer signatures would be much more valuable so it does make the implementation challenges more worth it too.

Other repositories

Now the thing is, for other Composer repos like Drupal's which are AFAIK serving package data already, the value of having support for this in Composer is definitely higher as you can implement it on the repo end without the same challenges that we do have with Packagist.org.

Thus I'd say maybe we can start working on that first (Composer support for TUF), and then hopefully we can bolt this on Packagist.org at some point as well one way or another.

@phenaproxima
Copy link
Contributor

phenaproxima commented Apr 5, 2021

I've been working alongside @TravisCarden (and others) on php-tuf/composer-integration, which is a plugin whose goal is to act as a bridge between the php-tuf/php-tuf library, and Composer 2. I have it working under controlled conditions, as a proof of concept -- it uses TUF to transparently download and validate Composer package metadata and actual packages, regardless of where they are physically hosted.

The biggest problem it faces at the moment is that it's overriding (by some rather contorted magic) the HttpDownloader class from Composer, which allows PHP-TUF to download and validate things from the Internet before they are handed off to the rest of Composer. I'd love to get reviews on php-tuf/composer-integration#6 (sorry for the lack of a description in the PR), which implements this approach. Specifically, I'd love to get opinions about:

  1. Is there a better way to achieve the goal of routing as many HTTP operations as possible through PHP-TUF (before the results of those operations are handed to Composer)? Can it be done without having to take over the HttpDownloader?
  2. Are there any relatively lightweight changes that Composer might be able/willing to make which would reduce the plugin's fragility? Personally, I think it's somewhat brittle right now because it's overriding code which is clearly not designed to be overridden.

Looking forward to hearing peoples' thoughts!

@Seldaek
Copy link
Member

Seldaek commented Apr 6, 2021

Looking at what you do in that PR ( I haven't reviewed the rest of the code), it seems the main touch points are:

  • ArrayLoader (your PackageLoader) where you override to setTransportOptions on the package. I wonder why you had to do that if you already have a custom TufValidatedComposerRepository, why not setTransportOptions in there? Overriding/extending configurePackageTransportOptions would probably be a better way than changing the loader entirely, but that would need to be made protected.
  • Seeing that TUF depends on guzzle makes it quite a pain to integrate with, as you can't simply do downloading with Composer's HttpDownloader, benefiting from our proxy config and others.. Requiring psr/http-client-implementation or php-http/async-client-implementation would probably make this easier, as we could implement these interfaces in Composer, or in Composer 2.2 perhaps switch to symfony/http-client but I am not sure if this is possible without causing too much breakage internally now that people started depending on our HttpDownloader.
  • It seems that the main reason you have to override the entire downloader is that php-tuf expects to do the downloading itself. Composer already has a PreFileDownload event which lets you override the URL (if type==package, context data will be the package object being downloaded), so you could probably change the download URL if needed at that point, and then in PostFileDownloadEvent you could perform the TUF checks, if php-tuf allows you to check a file after it got downloaded some other way. This would be a way less intrusive integration IMO. I'm not sure if I missed something that would prevent you from doing it that way..
  • Regarding overriding the ComposerRepository, I wonder if we could perhaps add a way to store/retrieve the repository options. The repository is available via the package object so you could make sure the tuf options are read if present, and applied to the package transport options, either at pre-operations-exec or later in a new hook we could add. The issue though with this is that the plugin needs to be installed for it to work, but that's already the case with your override technique. So either it needs to be installed globally, or it will only start doing TUF checks after the plugin is installed, which is not great from a security standpoint.

I think getting rid of the HttpDownloader override would be good because it makes the plugin much more viable for now.. The question is really if we plan on merging this in Composer in the medium term (due to PHP version requirements, we definitely can't do this before Composer 2.2 which will bump the php req) then it would be good to avoid causing migration pains for people that already started using the plugin. The repository override seems to be done fairly transparently for users and should hopefully not have adverse effects when combined with other plugins, so that seems much safer to keep in for now.

If I missed anything else let me know.

@phenaproxima
Copy link
Contributor

phenaproxima commented Apr 6, 2021

Thank you for reviewing, @Seldaek!! All that feedback is extremely helpful and I hugely appreciate it. I also agree with basically all of it. Some responses to your points:

ArrayLoader (your PackageLoader) where you override to setTransportOptions on the package. I wonder why you had to do that if you already have a custom TufValidatedComposerRepository, why not setTransportOptions in there? Overriding/extending configurePackageTransportOptions would probably be a better way than changing the loader entirely, but that would need to be made protected.

I think the reason I used a separate package loader is because ComposerRepository seems to have several methods which load packages at various times and in various ways, and all of those ultimately route their code path through the package loader. Therefore, overriding the package loader seemed like the "right" place to ensure that the transport options would always be set correctly. If I could just override configurePackageTransportOptions() instead, that would definitely be preferable. I'll file a separate PR to make that method protected.

Seeing that TUF depends on guzzle makes it quite a pain to integrate with, as you can't simply do downloading with Composer's HttpDownloader, benefiting from our proxy config and others.. Requiring psr/http-client-implementation or php-http/async-client-implementation would probably make this easier, as we could implement these interfaces in Composer, or in Composer 2.2 perhaps switch to symfony/http-client but I am not sure if this is possible without causing too much breakage internally now that people started depending on our HttpDownloader.

TUF's dependency on Guzzle is incidental. We only used it so that we could provide a basic tool for developers to easily fetch and validate TUF-protected things from the Internet. We could absolutely use a different package for that, or make Guzzle one of several supported options.

Far more important to TUF's public API is the use of PSR-7's StreamInterface, and the use of promises, to manage code flow and data supply to the code that's consuming TUF. So as long as we could keep those aspects in place, there's no major reason why we need to use Guzzle specifically. (That said, guzzlehttp/psr7 and guzzlehttp/promises seem to be excellent implementations of PSR-7 and promises, respectively, so we'd need to either keep those dependencies or otherwise depend on something comparable.)

It seems that the main reason you have to override the entire downloader is that php-tuf expects to do the downloading itself. Composer already has a PreFileDownload event which lets you override the URL (if type==package, context data will be the package object being downloaded), so you could probably change the download URL if needed at that point, and then in PostFileDownloadEvent you could perform the TUF checks, if php-tuf allows you to check a file after it got downloaded some other way. This would be a way less intrusive integration IMO. I'm not sure if I missed something that would prevent you from doing it that way..

I 100% agree that it would be preferable to take advantage of PostFileDownloadEvent to do the TUF checks. However, that wasn't possible because ComposerRepository doesn't dispatch PostFileDownloadEvent after downloading packages.json or any other metadata files. If it did, then I definitely would have preferred to hook into that. Would you accept a PR to make ComposerRepository dispatch PostFileDownloadEvent?

It seems that the main reason you have to override the entire downloader is that php-tuf expects to do the downloading itself. Composer already has a PreFileDownload event which lets you override the URL (if type==package, context data will be the package object being downloaded), so you could probably change the download URL if needed at that point, and then in PostFileDownloadEvent you could perform the TUF checks, if php-tuf allows you to check a file after it got downloaded some other way. This would be a way less intrusive integration IMO. I'm not sure if I missed something that would prevent you from doing it that way..

Agreed! I think the goal is to keep TUF as simple as possible to consuming code, but I will discuss this with my team (mainly @tedbow) about adding a method to TUF that will verify data without needing to also download it. Such a change could dramatically simplify the plugin, which I'm obviously in favor of!

Regarding overriding the ComposerRepository, I wonder if we could perhaps add a way to store/retrieve the repository options. The repository is available via the package object so you could make sure the tuf options are read if present, and applied to the package transport options, either at pre-operations-exec or later in a new hook we could add. The issue though with this is that the plugin needs to be installed for it to work, but that's already the case with your override technique. So either it needs to be installed globally, or it will only start doing TUF checks after the plugin is installed, which is not great from a security standpoint.

I think this definitely sounds promising. Can you elaborate on what such a mechanism (storing/retrieving the repo options) might look like?

@Seldaek
Copy link
Member

Seldaek commented Apr 8, 2021

If it did, then I definitely would have preferred to hook into that. Would you accept a PR to make ComposerRepository dispatch PostFileDownloadEvent

Yeah the main reason it's not there is nobody needed it, so it is only dispatched for package downloads at the moment. We'd need to define what type and context data get provided but sure feel free to open a PR.

Can you elaborate on what such a mechanism (storing/retrieving the repo options) might look like?

I mean loadRootServerFile could read something like an extra key for the repo's packages.json and we'd add a getExtra() on ComposerRepository to read it. That extra data would contain your tuf key with whatever is needed in there.

@mpdude
Copy link
Contributor

mpdude commented May 28, 2021

At the risk of asking a stupid question, would TUF ensure integrity/accuracy of metadata downloaded from Packagist (list of available packages, their source/dist URLs and checksums) only? Or would TUF also handle the actual library/dependency file download?

@effulgentsia
Copy link
Contributor

It would handle the tarball (dist) download as well. In theory, the TUF spec could also be used for validating each file in a git checkout (source), but we're not currently planning on implementing that within https://github.com/php-tuf/composer-integration, because that would add a lot of complexity, and our use cases don't yet require it.

@mpdude
Copy link
Contributor

mpdude commented May 31, 2021

What brought me here was the question from #2540 whether the work you started here also involved making checksums for dist downloads available in Composer in general.

Please correct me if wrong, but I think the answer is "no": Your work was done in the context of a (Composer) repository for Drupal, which hosts the dist images alongside the Composer package metadata. You were thus able to use the Python-based tooling for creating the TUF metadata, which includes file sizes and checksums for the target files. Since the actual dist download is also handled by the TUF implementation, it will look at the TUF metadata again to protect and validate the download. In other words, for your use case, it is not relevant to have file size and checksum information in the Composer metadata (neither on the repository side nor in Composer as the update client) since that information is carried in TUF-specific extra files already.

@phenaproxima: I only skimmed the plugin code... Did I get it right that you put a repo URL for the TUF repo into the .lock file, so that upon composer install you can go back to the repo, fetch the TUF metadata and use that to safely perform the download? In my mental model (which might be wrong), composer install currently works without having to contact the repo again. Might be a purely conceptual difference, though – repo with Composer metadata vs. server with dist files for download.

More generally, is this separation of the update and install phases in Composer a problem for TUF? Would it be "safe" from the framework point of view to put a download URL, file size and checksums into the .lock file and perform the download any time later only with that information? Or is there any reason for fetching/validating TUF metadata and retrieving the target (dist) file in close succession?

Depending on your answer, Composer as a whole might benefit from some of the protections TUF provides (endless data and slow rate attacks, signature verification) when we put that into Composer's download code directly.

@markrandall
Copy link

markrandall commented Oct 26, 2021

I was wondering if we could receive an update on the progress of this? Supply side attacks are increasing every day across all manner of platforms (NPM itself just announced a critical issue effecting millions of downloads each week GHSA-pjwm-rvh2-c87w).

If this is stuck, what would it need to move it forwards? Is this something that could be achieved via funding or are the challenges still technical roadblocks?

@mpdude
Copy link
Contributor

mpdude commented Oct 26, 2021

In the special case of the NPM package 👆🏻 mentioned above, seemingly the package maintainer's NPM account was breached so malicious package versions could be published under their name. (My limited understanding of NPM is that when publishing, you upload dist-tarballs to the registry; so, no need to push malicious code to a public (GitHub/git/...) repository first.)

The "small" implementation version of TUF that goes without package signing at the developer/publisher's end probably could not have prevented an issue like this.

As I get it, only end-to-end package signing/validation with a key secure enough so it cannot easily be stolen plus reasonable protection of the place where you publish valid signature keys (so nobody can put their signature key there) might help mitigate such threats.

@TravisCarden
Copy link
Contributor Author

@markrandall you can check the status at https://github.com/php-tuf/composer-integration where active development is going on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

9 participants