Automating signatures: A good idea? #221

Closed · mpdude opened this issue Dec 28, 2019 · 14 comments

mpdude commented Dec 28, 2019

I am looking into using GitHub Actions to automatically build and sign a .phar version of a tool I am providing.

Do you think that automated signing defeats the purpose of signatures in the first place? (Why) should signing be a manual process?

Are signatures helpful to prevent forgery on/inside the GitHub platform, or rather outside of it?

What is the advantage of having signatures over downloading from HTTPS GitHub URLs?

theseer commented Dec 28, 2019

Those are very good questions and I'm not yet fully sure as to what the perfect answer to all of them would be.

But I'm more than willing to share my (our?) current state of mind. This may turn out to be a rather long "answer" ;) If you only care about the executive summary, skip ahead to the TL;DR section.

Automation

Firstly, automating all the things is generally good. An automated process - in an ideal world at least ;) - means the problem it solves has been understood and the knowledge of how to solve it has been written down in a self-running fashion. The process at hand can be verified and, last but not least, automation reduces the likelihood of human error.

I'll not go into all the absurdness of current CI systems, GitHub Actions and Travis included, with all their pointless redundancies, half-baked solutions and workarounds on top of workarounds for limitations, since cloud power is free - as this is off topic for your questions...

Automated Signing

Let me start off with some generics: I guess we can agree that, generally speaking, there is no such thing as 100% security. To get as close to that as possible nevertheless, things should be made as hard as possible for an attacker to succeed with whatever they aim for. To put that into perspective: If your car gets stolen because you left your keys in your coat pocket at an unmonitored coat rack, you might have a hard time getting your insurance to cover it. The fact that modern cars and keys work via remote, and that the car even signals when it unlocks, doesn't actually help security at all - one might even ask why we bother locking cars...

But let's get back to "our" software: While I really like using GitHub and enjoy the ease of integration of all the things these days, it's a single point of failure and, from a security perspective, a very dangerous development. If your GitHub account gets owned, you're doomed.

Using github actions to sign releases

So this is finally going to attempt to answer your question. It's actually more than one answer, because the answer depends highly on what you expect from a signature and why you'd use it or create one in the first place.

Let's be bold and see if we can get away with the following statement: Using GitHub Actions to automagically sign a release is close to pointless, as it doesn't add anything of value.

Why would that be? As explained above, with GitHub as the central place for everything, an attacker merely needs to take over any account with commit permissions to inject malicious code and - depending on how the automation is set up - it will make it into a release. Given that the release has been signed, every system will see a valid signature and install it.

But then, why do you sign a commit? If an attacker could take over the account, a new key for the commits could easily be registered and everything would appear fine. It's anything but obvious unless one already expected manipulation.

So, if you want to have a signature on a release phar for the sole purpose of having a signature, that's obviously pointless. It just burns CPU cycles.

But: What does a signature on the phar tell us? A phar file is what, in other environments and languages, would qualify as a compiled binary. It's an arbitrary piece of binary data attached to something that somebody somehow generated and called a release. While the git tag used for the release is potentially signed, the release itself on GitHub is not. Nor is the source of the phar file in any way verifiable.

To me, that is a very big problem. I want to know that the binary I am about to run is in the shape and form the releasing party intended it to be. Without a signature provided by the project and a means to verify that very signature, this is impossible. I would have to build my own phar and hope it's generated identically.

I'm not sure whether you noticed that I changed the focus here: The fact that I can verify the phar to be in said shape does not say anything about it being trustworthy to execute. It merely allows me to be sure it's being considered "official" and un-tampered with.
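
To make that concrete, here is a minimal sketch of how a consumer could verify such a detached GPG signature shipped alongside a release phar (the URLs, file names and key handling are made-up placeholders, not phar.io specifics):

    # Download the phar and its detached signature from the release page (example names).
    curl -sSLO https://github.com/example/tool/releases/download/1.2.3/tool.phar
    curl -sSLO https://github.com/example/tool/releases/download/1.2.3/tool.phar.asc

    # Verify that the phar matches the signature made by a key we have already
    # imported and decided to trust.
    gpg --verify tool.phar.asc tool.phar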

Signatures merely ensure authenticity

Any reasonable (to my understanding at least) cryptographic security concept relies on private and public keys, allowing the trusting party to decide whether or not to trust a public key as authentic (e.g. belonging to the person or organization claiming ownership) and whether or not to trust signatures made with it for a given context.

Except for the aforementioned required initial manual steps, the rest of that process should be automated.

One can use signatures to additionally enforce a process: For some of our clients (that is, my company thePHP.cc's clients), for instance, we have their ops team co-sign the RPMs they explicitly approve for production. So on top of having the CentOS team sign the packages, they test-install them in their environment and run all kinds of (integration) tests. If they all pass, the packages get promoted and thus signed. To install a package on production, the ops team's signature is required.

Does that mean you can put any actual "trust" into a signature? No, of course not. But I can potentially put trust in a process that led to the signature being made and then have (legal) consequences attached to that.

TL;DR:

I strongly believe that having all things automated is a good thing, and that should technically include having signatures generated - particularly for files that get arbitrarily attached to a release on GitHub.

Using GitHub Actions to do so seems logical but comes with the danger of having a single point of failure and attack. This can, to some extent at least, be mitigated by a good process and clever isolation of permissions, so that for instance no single individual (read: hacked account) can do everything, if that level of security (paranoia?) is required.

GitHub HTTPS URLs

Using HTTPS to download a file is transport layer security. That's a prerequisite to even consider any other type of security. It is fundamental to ensure that a) your connection is to the system you intended and b) no forgery happens in transit. Without transport security you have no means of knowing where, if, or how a corrupted transfer occurred.

If the file was already corrupted before downloading, transport security still holds - it just doesn't help you. That's where signatures come into play and where an independent means of transporting key information is vital. Hence the keyserver infrastructure of OpenPGP.
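
As a sketch of that independent channel: the maintainer's public key can be fetched from a keyserver rather than from the same page that hosts the artifact, and its fingerprint compared against one published out of band (the fingerprint below is a fictional placeholder):

    # Fetch the maintainer's public key from a keyserver instead of from the release page.
    gpg --keyserver hkps://keys.openpgp.org \
        --recv-keys 0123456789ABCDEF0123456789ABCDEF01234567

    # Show the imported key's fingerprint so it can be compared against a copy
    # published elsewhere (project website, talk slides, ...).
    gpg --fingerprint 0123456789ABCDEF0123456789ABCDEF01234567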

Risk of Forgery

If GitHub were to be hacked or an account stolen, things get messy. Signatures wouldn't do much here if they get generated automagically. To avoid this type of attack, the generation process would require human interaction - for instance to provide a cryptographically secure way to "unlock" the private key.

I'm not sure that's feasible for most projects, or whether things get any less likely to be hacked or become a problem if someone without deep understanding and experience of Linux administration were tasked with setting up their own systems.

For phar.io, we're paranoid enough to keep the private key for our signatures off any system we do not physically own. That means that to actually do a release with a signed phar, we have to run that build step on our own (physical) hardware. Given that we have our systems encrypted, I at least pretend that to be good enough for the worst-case scenario of one of our laptops getting stolen.

As stated before, I doubt that to be the best process possible.

If you have to mass-build packages, for instance as a Linux distribution, that certainly does not work one bit. But then: All their signatures technically say is that the package at hand has been built on that machine and has not been modified since. The key used for signing is NOT used for anything else, so it could also easily be replaced if need be.

In that sense, keys are like passwords: Do not reuse them, ever!

theseer commented Dec 28, 2019

I hope that somehow answered your questions? ;-)

mpdude commented Dec 28, 2019

Awesome, thank you for the comprehensive reply!

I guess that first needs to sink in and I’ll have to think it through. Will get back to it later :-).

rask commented Dec 30, 2019

How about double-signing: sign automatically via automated actions, then manually to enforce it? This way users get a "default" automated signature ("yes, this was built on GitHub systems" as opposed to "not sure where this thing was built, maybe some North Korean basement?"), plus a reinforcing one ("yes, it was Jane Doe with auth foobarbaz who signed this build"). Or would this be useless?

theseer commented Dec 30, 2019

Nice idea, but... ;-)

You may or may not have realized that in your example, the potentially added security merely lies in the very process that led to the signatures and in what level of trust you put into it. The proposed second signature is not any more secure than the first.

So it all boils down to answering the following question: What do we really ensure when requiring and checking a signature? To me, the most important aspect is a chain of trust. Like, to stick with your example, being relatively certain that the build was run on a system I can put trust into, based on source code that can be assumed to be untampered with given the signed commit it was checked out from. But that's me...

All in all, that's a lot of assumptions already and it's probably a miracle things aren't horribly broken already ;)

As I explained above with the ops team (co-)signing packages, most people do not necessarily care about the signature per se but only the fact it is there (and valid) as a means to ensure a defined process has been followed.

Now, what would adding a second signature do?

First, we might run into an intermediate state: The build artifact being signed by the automated build process but not yet by the release manager.

That leads to two questions, with, funny enough, the same answer: Do I (have to?) wait for the second signature or not and does the second signature raise the level of trust into the artifact? The answer for both is: It depends on the processes, not the signature itself.

If I only install things that have a "human-made" signature attached - i.e. the second one - the automated signature is useless to me. If, on the other hand, I'm the human making the second signature, I probably wouldn't consider signing anything that doesn't pass validation of the first signature. But we're already talking process again.

TL;DR:

You're trying to solve a trust issue with automagically generated signatures by (re-)involving humans and making that a manual step (even though the act of actually signing might of course be automated).

Whether or not that is useful or useless depends on your process and your trust into that process. And maybe, against what types of attack you want to ward yourself.

Let me try a very high level summary: I'm not convinced having a second signature will add enough to be worth the effort - at least in the given scenario and as a generic "that's the way to do it" kind of thing.

rask commented Dec 30, 2019

Thanks for the in-depth response!

My thinking was that maybe some people are OK with the automated signing step (as in "this is secure enough for my use case/for now, I just need to know that this binary was built on some service I like to use"), while others want to only use binaries signed by people they trust (as in "I know Jane Doe personally (or three of my colleagues know her), and I trust she signed this personally, and my company has strict standards when it comes to using third-party software").

The automated signing is a bit of a mystery to me: would it help at all as a "baseline", or would it create more headaches in the long run if something goes wrong?

theseer commented Dec 31, 2019

The only point of any signature is trust in the process that led to the signature being made. That may or may not include knowing who actually signed and whether or not to place any trust in that person or organization.

That's pretty much identical to "real life": If you sign a contract and for it to be considered legally binding (read: "valid"), the general assumption is that you knew what you did when signing, you did so in good faith, you are actually you and nobody forced you to sign in the first place. The signature itself cannot tell any of that. You have to look into the process that lead to the resulting signature.

So even if you - for whatever valid, real or imagined reason - put more trust into a signature made by a human, that signature is useless if the person signs everything without even looking. Like having a famous person give an "autograph" under a contract. Still technically a valid signature, but ....

I did understand your point, don't get me wrong ;)
And if you, as a project maintainer or organization, want to provide both types of signatures, that's perfectly fine. I'm just not convinced the amount of work (read: the additional process steps) involved justifies any meaningful increase in trust.

Let's see if it gets clearer if we lay out a simple process for this (a rough command-level sketch follows the list):

  1. Release tag has been made (signed via gpg, pushed via git)
  2. Github action builds a phar
    1. Automagically creates a signature
    2. Adds phar and signature to the release
  3. A human release manager downloads the phar
  4. The human release manager downloads the accompanying signature
  5. The human release manager verifies the signature and phar match
  6. The human release manager creates the second signature for it
  7. The human release manager adds the second signature to the release
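
For concreteness, steps 3 through 7 might look roughly like this on the release manager's machine - a sketch only, using the gh CLI and gpg; the tag, file names and key ID are made-up placeholders:

    # 3./4. Download the phar and the automatically generated signature.
    gh release download 1.2.3 --pattern 'tool.phar*'

    # 5. Verify that the automated signature and the phar match.
    gpg --verify tool.phar.asc tool.phar

    # 6. Create the second, "human" signature with the release manager's own key.
    gpg --armor --detach-sign --local-user RELEASE_MANAGER_KEY_ID \
        --output tool.phar.rm.asc tool.phar

    # 7. Attach the second signature to the existing GitHub release.
    gh release upload 1.2.3 tool.phar.rm.asc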

Sounds good?
No, sounds horrid.

It may look like we have a human person that we trust sign for the release. That's what you asked for. But what do you base your trust on?

The fact that it's a human signing? You know that human? You know and trust the human? Why? And who says that the person you know actually created the signature, just because their key was used?

Based on the process described above, you have literally nothing to base any additional trust upon the second signature.

It gets even worse:

First, chances are that steps 2-6 are run on the release manager's machine. That's not a problem per se, but why would you trust that the release manager did steps 2-6 and not just 2, 5 and 6? Can you verify their work in any way?

And even if we'd blindly trust that the RM did indeed follow steps 2-6, what does the signature actually tell us? Not what we hoped for, but only that an RM verified that the automated signature was valid. That's probably nice to know, but something I could easily verify myself, given that the first signature is also public.
So if the process were as laid out above, the RM's work didn't actually make anything better. Hence my conclusion that the additional work doesn't provide enough of a benefit.

Again, process remains key. If there's a way to verify the process was followed as specified, having a(n additional) signature as a token of confirmation makes a lot of sense. But unless you have a means of looking at the process that led to the signature, it's blind trust.

Blind trust can be okay though. I for instance do trust packages signed by my Linux distribution vendor of choice. Whether or not that trust is misplaced, I obviously cannot really say but even for me paranoia only goes so far ;)

But there are projects that push for "reproducible builds": They check out the code, aim for an identical build system, and check whether they can reproduce an identical build. If they co-sign a release, that again adds value. Same as in science: If you have somebody run the same experiment(s) and confirm your results, things suddenly become a lot more believable.

theseer commented Dec 31, 2019

It may seem as if, with all my arguing, I'm almost advocating against using signatures.
I'm far from doing that.

All I'm trying to say is that signatures are build artifacts. A build artifact that can automatically be checked to ensure a process has been followed, and that supports building a chain of trust: Given that a build was run based on a signed commit on a trusted system, the resulting binary can be considered trustworthy. To provide a means of verification, the build system signs the release.

With this, the signature on the binary implies two things:

  • The binary was created on this system (assuming the private key for signing wasn't leaked...)
  • The result was produced using the source code as per the commit/tag used, ideally also carrying a signature to maintain the chain

This does by NO MEANS yet say that the binary will work, contains no malicious code, or is not otherwise garbled. If those are requirements, additional processes and potentially additional signatures are required.
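
As a minimal sketch (not phar.io's actual setup), such an automated signing step inside a CI job could boil down to a few gpg calls; the variable and file names below are made-up placeholders:

    # Import the signing key provided through the CI secret store.
    printf '%s' "$GPG_PRIVATE_KEY" | gpg --batch --import

    # Create a detached, ASCII-armored signature for the freshly built phar.
    gpg --batch --yes --pinentry-mode loopback --passphrase "$GPG_PASSPHRASE" \
        --local-user "$GPG_KEY_ID" --armor --detach-sign \
        --output tool.phar.asc tool.phar

    # Sanity check before publishing: the signature must verify against the artifact.
    gpg --verify tool.phar.asc tool.phar

Everything after the key import is exactly the single point of failure discussed above: whoever controls the job controls the signature.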

rask commented Jan 2, 2020

Are you familiar with crev (https://github.com/crev-dev/crev/)? It attempts to automate a little of this "can I trust this piece of code" problem. Not a silver bullet, but a nice idea at least, I think.

mpdude commented Jan 2, 2020

After a few days I've come to realize that the entire topic is way bigger than I expected.

It's nice that phive is pushing towards GPG signatures. But it would be way more interesting to see how Composer addresses this for all kinds of vendors, not only .phar files. (Spoiler: no real idea.)

The list of security-relevant aspects that need to be considered is daunting.

My initial remark regarding HTTPS vs. signatures had the background that I was used to GPG signatures for Linux packages that you might have pulled from somewhere on the Internet or found on a CD inside your favorite magazine. In that case, if the "transport layer" is not trustworthy, signatures can ensure the integrity of the package you're trying to install. But even in this case, some kinds of "replay" attacks are possible if you can be made to believe that an older version of a package is the latest one.

But let's assume for now that TLS and its trust mechanisms are sound and work, and users actually get the version of the .phar file hosted on the GitHub release page, the one that they asked for.

The security of my GitHub account then plays a crucial role.

When the signature is generated automatically during the build, the attacker could change the code and would get a new, signed release for free. It would also be easy to modify the workflow definitions to extract the signing key, even if it is stored as a "secret" input to the GitHub Action. The code change could be rolled back, and unless I closely monitor all action runs that ever happen, I might not even notice. With the signing key, even more targeted attacks would be possible.

When the GPG key fingerprint is documented in the README file, and there is no second, 100% independent place where it can be published and/or cross-checked, malicious changes might also go unnoticed for a while. That is especially true when working with a repo that you don't commit to very often, and/or where there are many collaborators, so that you'd pull in innocuous-looking changes without further checking.

So – as long as the GitHub account is reasonably safe and only trusted people can publish the .phar as part of the release, the signature does not add any value. If the GitHub account is compromised, an auto-generated signature does not help either.

The only kind of signatures that seem to add value to me are those generated on the developer's machine, maybe even with a hardware-based key. For integrity, it might be nice to have the .phar built automatically, but have the signature based on a local build to cross-check that the results are consistent. But that's an entirely different story, and reproducible .phar builds currently seem to be impossible, at least with humbug/box.

Even then, I'd have to figure out how such a process could support multiple people maintaining (and releasing for) a shared repo. Maybe we need different signing subkeys for an offline GPG master key.
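
For what it's worth, GnuPG supports that split: each maintainer's signing capability can live in a separate signing subkey of an offline primary key, roughly like this (a sketch; the fingerprint is a fictional placeholder):

    # On the offline machine holding the primary key: add a signing-only subkey
    # (one per release manager), valid for one year.
    gpg --quick-add-key 0123456789ABCDEF0123456789ABCDEF01234567 ed25519 sign 1y

    # Export only the secret subkeys for day-to-day use; the primary key stays offline.
    gpg --export-secret-subkeys --armor 0123456789ABCDEF0123456789ABCDEF01234567 > signing-subkeys.asc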

For now, I'll probably go without signatures as we do not have a process in place that would allow those local signatures to be created reliably and close in time to every release :-(.

theseer commented Jan 2, 2020

Thanks for taking the time to come back with your results :)

I hope my answers from earlier didn't scare you away from signatures or turn you into a sleep-deprived paranoiac ;).

Just to be sure, let me repeat some key statements from earlier: I strongly believe release binaries should be signed. Actually, even regardless of the actual process that may lead to a signature being there. Signatures are a vital and core ingredient. They're far from being all that needs to be done, of course.

What will differ depending on the process and thus per project, is how much trust can be placed into the signature being there and what might be implied by having a signature. (I'll not repeat myself with all the details here ;-) )

Again, there is no 100% security. But not having signatures on releases because they do not protect against, or help in case of, an account takeover or other security threats is broken logic: It's like saying I'm not going to have health insurance because I'm vaccinated. I'd rather have both. So we have to anticipate the problems we haven't solved or protected against yet and consider implications and solutions for those.

I totally agree that there are a lot of potential security issues - not only when releasing software or dealing with open source in general and on github in particular. Some of them are more likely than others. And some projects may be more exposed (read: considered an interesting target) than others.

All we, as developers and project maintainers, can do is make the life of an attacker harder. In my opinion, signatures on releases are a good and vital part of that. Of course, it's only a subset of the things that need to be done.

Where do we go from here now?

theseer commented Jan 2, 2020

Since you mentioned composer and security, let me add a rant ;)

In my opinion, Composer is currently broken on various levels when it comes to security. That starts with the recommended way of installing it - using curl with direct piping into php, with an arbitrary and imho useless internal check - continues with the metadata not being verified in any way (at least last I checked), and ends with not having any cryptographic checks on the code it installs.

Imho, Composer should require GPG-signed release tags and enforce that they are made with a key registered with Packagist. This should happen on Packagist when "importing" a new release and should also be verified on the client. Git provides all the required info to do that - unless, of course, you download arbitrary tarballs that just happen to have been created by GitHub...
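
As a sketch of the primitive such a check could build on: git can already verify a signed tag, both server-side at import time and on the client before installing (the tag name is a made-up example):

    # Verify the GPG signature on an annotated release tag.
    git verify-tag 1.2.3

    # Equivalent form via the tag command.
    git tag -v 1.2.3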

They should also sign their metadata and verify those signatures.
As things are, I guess we can consider ourselves lucky that nothing has happened so far, and that we at least do have transport layer security (aka HTTPS).

theseer commented Jan 2, 2020

Regarding reproducible phar builds: My Autoload Builder can create phars as well and is used for instance by PHPUnit. It should create reproducible builds. If not, somebody should open a ticket ;)

theseer commented Apr 10, 2020

I guess there's nothing to be done here so I'll close this ticket.
