Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is signing worth it? #203

Closed
Alexendoo opened this issue Jul 17, 2019 · 24 comments
Closed

Is signing worth it? #203

Alexendoo opened this issue Jul 17, 2019 · 24 comments

Comments

@Alexendoo
Copy link
Contributor

I hope not to come off as overly critical, but this was something on my mind after tinkering about with crev locally. I think it would be valuable to reconsider the fact that the design uses signing, my reasons being:

From the security viewpoint, I can't personally see many scenarios the signing protects against. In practice it does not protect against GitHub going rogue (I think this is perfectly acceptable), as IDs to be trusted are fetched from a GitHub repository and then likely copy-pasted from the output into trust. That or they're copy-pasted from instructions hosted on GitHub.

A rogue/compromised user is caught out by testing crates directly from the package manager rather than the source. A rogue package manager is caught out by including a digest in the review.

Now for the main reason I opened this issue, I believe it would open up some benefits to adoption if there were no signing. Code review is a hard ask, and any barrier makes the ask harder still.

There would be no more need for key management. You don't have to worry about transferring it between devices in order to be able to review. You don't have to worry about losing it. You don't have to manage yet another password.

It would allow a simpler format and CLI. There would be no need for IDs, or to think of it another way the repo would be the ID. Trusting other users means just pointing at their repository. The file layout could be simplified to e.g. crev-proofs/reviews/*, the current format gives off an air of being opaque to human readers which may be off-putting.

It would be simpler to produce tools/integrations. Good integrations will be a massive benefit to the ecosystem: be it in the editor, GitHub bots to show if something lacks reviews on PRs, web UIs etc. Both the simpler format and lower complexity will increase the ease of creating any such tools.

It is of course a trade-off either way, the signing does provide trust on first use protection, my question would be is it worth the hit to usability?

@dpc
Copy link
Collaborator

dpc commented Jul 17, 2019

From the security viewpoint, I can't personally see many scenarios the signing protects against.

Wat? :D

Not trying to mean. I just don't get it. It seems to me you started with some very confused assumptions.

In practice it does not protect against GitHub going rogue

Crev IDs are generated locally, and everything that is being stored in proof repositories is signed and verified.

A rogue/compromised user is caught out by testing crates directly from the package manager rather than the source.

cargo-crev will make you review the code exactly like it is in the local copy provided by the package manager (cargo), and does generated a digest of package content.

As per the simplification ... For most serious users trusting/depending on github or crates.io or any third-party in general is a big no-no. Everything in crev is therefore cryptographically signed and trust-less. I don't really find it a big problem, and purposedly the project is split into multiple libraries so it's easy to build own tools that use them to do the heavy lifting like verifying signatures and so on.

@Alexendoo
Copy link
Contributor Author

Alexendoo commented Jul 17, 2019

Crev IDs are generated locally, and everything that is being stored in proof repositories is signed and verified.

Sure, but that's not the problematic part, it's other IDs where the trust of GitHub comes in, you fetch from a repository and trust the ID in it. Reviews from that user (or transitive users) are now verified against whatever ID you got from GitHub, it's not likely that people will verify out of band.

cargo-crev will make you review the code exactly like it is in the local copy provided by the package manager (cargo), and does generated a digest of package content.

Yeah, my point there was that signing is unrelated to that scenario as it's already covered.

Unless you either do not import any other reviewers, or verify IDs out of band. Crev currently has you trusting GitHub

@dpc
Copy link
Collaborator

dpc commented Jul 17, 2019

Ah. I see.

It doesn't have to be github. Git repos can be self-hosted. Commits can be signed in it etc. And ID can be verified, though it's true it won't be most of the time, but for some use cases they will be. Jjust because something isn't theoretically perfect doesn't mean it's bad.

And even architecturally, git repos are only a transport mechanism. The url of the repo can change (eg. people moving their repo gitlab/self-hosting). In the future I could imagine people fetching proofs from distributed sources like DHTs of some kind (IPFS, Dat) and so on.

@Alexendoo
Copy link
Contributor Author

In the self hosted case it does absolutely remove the need to trust GitHub (or some other third party hosted git repository), but that also means the signature isn't protecting against anything anymore.

Commit signing is the same sort of deal, if by using it we remove the trust in the repo host then crev's signatures are made redundant. Same again with IPFS/Dat.

While I do believe it would be beneficial to remove the crev signatures that doesn't mean I think the authenticated use case should be ignored. I just believe it's better handled by the transport, e.g. in the case of git you could require all of a peers commits to be signed

@Alexendoo
Copy link
Contributor Author

On implementation since I forgot to reply to that point:

For wide adoption if something can be simplified enough to be easily re-implemented it's going to be easier to receive. libgit2 for example is decidedly complicated but even still many languages have re-implementations that are popular. JavaScript has several. They avoid the need to manage system dependencies.

crev-lib could be mostly fine for Rust users, but if the intention is to bring crev to other languages there will have to be re-implementations.

Another simplification example:

Without implementing signing in crev itself the file format no longer needs to be append only. To distrust a user or delete a review the entry/file can just be deleted. Individual files could be parsed with a regular yaml parser. There would no longer need to be a workaround for merge conflicts. e.g. picture something like

.
├── reviews
│   └── cargo
│       └── crates.io
│           ├── h2-0.1.25.yaml
│           ├── lazy_static-1.3.0.yaml
│           └── num-0.2.0.yaml
└── trust.yaml

@dpc
Copy link
Collaborator

dpc commented Jul 19, 2019

Sure.

But it's a trade-off on which I remain unconvinced. Signing might have a tiny, tactical problems, but is significantly increasing security and robustness and generality.

With signing you can eg. download bunch of proofs using bittorrent and import them and so on. You can import proofs from other people into your own repositories. You can store review proofs into source code of the reviewed project itself. Plenty of possibilities and flexibility to evolve. Tying data representation, security and transport layer together is just meh in my book.

And in practice I can't see how signing removal would benefit alternative implementations much. Signing is like 1% of complexity here. The signature is some standard ed25519 stuff, headers and such can be handled with a simple regular expression.

It's going to be hard enough to get this idea to catch on and become practical anyway. Git got several implementations only years (decade?) after its conception, when it became super-popular.

I am still hoping that crev could bootstrap itself by being used by security conscious organizations, that will appreciate cryptographic signing. The first reactions from groups like rust-bitcoin was: "when PGP signings". It is currently beyond my hope to get average programmer to care about source code attacks that crev is trying to address. Noone is going to reimplement crev from scratch, because the problem is apparent for years now and people just don't give a damn. And if someone will, the additional 1% of work is not going to make or break it.

@Alexendoo
Copy link
Contributor Author

I guess that's main the difference in viewpoint, I don't it as significantly increasing security. Unless I'm missing something the idea of the built in signatures is for it to be secure over a potentially malicious transport, but it relies on the very same transport being secure in the first place?

@dpc
Copy link
Collaborator

dpc commented Jul 19, 2019

but it relies on the very same transport being secure in the first place?

It doesn't. How you obtain identity is not necessarily from the git repo hosting. And it's quite different scope of an attack to just insert a file/commit somewhere and to try totally impersonate somebodies identity.

@kornelski
Copy link
Member

Trust is ultimately about trusting the person doing review, not any key. All the technicalities are about establishing the connection between the person and the review.

And Crev ID IMHO is terrible at establishing this connection. I don't know people by their Crev IDs. Nobody's going to do PGP-like signing parties for their Crev IDs, so they aren't much better than self-signed certificates. In practice, we know each other by GitHub accounts, and despite that being owned by a centralized service and for-profit corporation, that's a stronger identity than Crev IDs.

So the scenarios of "We don't need to trust GitHub, because we have Crev IDs" has it entirely backwards. It's actually "I trust these Crev IDs only because they're in GitHub repos of users I trust".

@kornelski
Copy link
Member

I thought that the big loss of dropping identities would be loss of aggregation (it's quicker to fetch one repo that aggregates reviews from 100 users, than to fetch 100 repos).

However, I've realized that the download-everything-upfront model is going to be a pain to scale anyway. If crates.io grows to the size of npm, and every crate has a review, that will be gigabytes of data to fetch. Offloading bandwidth to some p2p protocol may make it cheaper to host, but won't help users download less data.

But there could be another way. There could be an "index server" that could be asked "who has reviews for crates x,y,z?" and it would reply with a list of URLs to fetch. This way the amount of data fetched would be more proportional to number of crates actually used, rather than all crates in existence.

In this model the server doesn't have to be particularly trusted, because it's only used for optimization and discovery. The verification happens outside of it. If you're worried that the server would hide reviews from you, you can still fetch (some) repos directly, or ask a few different servers (they'd be cheap to ask).

@kornelski
Copy link
Member

So overall, I think dumbing down identity to URL ownership is going to be good enough.

If I trust a user I know by their GitHub account, I trust what I fetch from https://github.com/user/repo.

If I trust Example, Inc. and know they control example.com, I may trust https://git.example.com/repo.git.

Even if they used Crev IDs, it wouldn't benefit me, because I'd have to go to one of these sites to find what is their Crev ID. So just trusting the URLs only removes a step in the chain.

The missing component here is ability to change the URL. This could be accomplished by putting the new URL somewhere in the old repo, so that the Crev client can find it and act accordingly.

@dpc
Copy link
Collaborator

dpc commented Jul 21, 2019

It's actually "I trust these Crev IDs only because they're in GitHub repos of users I trust".

I totally agree.

Making the UI to put more emphasis on the URLs than CrevIDs is an option, and I think it's the right way to go. But I don't see any value in dropping the actual signing. The user might not have to worry about signing and cryptographic identities at all, and all the integrity checks etc are going to happen under the hood.

Even if they used Crev IDs, it wouldn't benefit me, because I'd have to go to one of these sites to find what is their Crev ID. So just trusting the URLs only removes a step in the chain.

"Open source developers - individual not knowing each other" is not the only world crev will be used. I work at BitGo and because of nature of cryptocurrency industry, we can't just trust stuff because it was at a given url. We don't trust PGP signature just because it says it comes from John Smith and so on.

Thanks to cryptographic identity it's possible to not trust the proof store, transport layer, and possibly verify the ID on twitter, some personal/offiicial website and with all other available proofs.

So I don't want to strip the system from important security properties, for very dubious benefits.

I think such niceness and life-convenience things belong into UI, and not core data layer. If the user does cargo crev trust https://github.com/dpc/cargo-proofs, it all can "just work" and not bother the user with CrevIds unless it really has to (some mismatch/ambiguity detected). Most users might not ever need to deal with cryptographic IDs, even if they are there under the hood.

There could be an "index server" that could be asked "who has reviews for crates x,y,z?"

Exactly. I don't know how the scaling would work, but that's a great problem to have in the first place. By having cryptographic identity and signing in place anything can be potentially used without loosing security and robustness.

@Alexendoo
Copy link
Contributor Author

The lots of URLs problem sounds very much similar to what the folks over in Go have been tackling recently with https://proxy.golang.org/

@dpc
Copy link
Collaborator

dpc commented Jul 21, 2019

A while ago, when I just started toying with crev, I originally though that people will just submit proofs as PRs to projects they are reviewing. Turned out not to be very practical. But in the future, there might be a tool or something for project maintainers to periodically fetch all the reviews for their projects and just check them in under ./.crev/..., so they come included with the crate. Obviously this can't be the only way to distribute proofs and discover reviewers, but might be a good supplement.

@kornelski
Copy link
Member

kornelski commented Jul 21, 2019

So I don't want to strip the system from important security properties, for very dubious benefits.

Could signing be made de-facto optional? So if you wanted to use it for real, you could, but users who just want to rubber-stamp a few crates they use, wouldn't have to feel the friction.

To me key management is a real problem, and it adds friction. The problems I see are:

  1. You can't start using cargo-crev without setting up a key. It's an extra step, and the security model is an extra thing user needs to learn.

  2. I don't know what would happen if I tried to use cargo-crev from multiple machines, with the same repo, but without bothering to transfer the key (I don't even know where it's stored). This uncertainty has stopped me from adding reviews.

  3. Adding a review requires a passphrase, separate from SSH key passphrase.

  4. I don't know how key rotation looks like, what are the consequences of losing my key, etc. (and that's not something I want to worry about).

Basically, key management is a real cost. It's intimidating, it's hard to do right, it's annoying.

So for solutions:

  1. Could you create a key automatically, without asking?

  2. & 4. I don't know what crev currently does, but if reviews repo could just accumulate IDs, to the point I could use a new ID for every new review, and they'd all be trusted and considered mine, because they're in my repo, that'd be great.

  3. If the default passphrase for the auto-generated key was empty, user wouldn't need to bother with this. My threat model is that if something can read files from ~/ or launch crev on my machine on my behalf, I'm already screwed and a passphrase won't save me. On macOS however, you could put the key in the Keychain, which gives it a bit of extra protection.

@dpc
Copy link
Collaborator

dpc commented Jul 21, 2019

You can't start using cargo-crev without setting up a key. It's an extra step, and the security model is an extra thing user needs to learn.

This should not be the case anymore. You can actually call a lot of commands without generating the id.

I don't know what would happen if I tried to use cargo-crev from multiple machines, with the same repo, but without bothering to transfer the key

It would work, and your repo would just store two identities.

don't even know where it's stored).

You don't have to. You can just: cargo crev export id + cargo crev import id

Adding a review requires a passphrase, separate from SSH key passphrase.

I don't recommend it, but you can export -x CREV_PASSPHRASE="mysupersecretpassphrase if it really bothers you. :)

I don't know how key rotation looks like

There's currently no key rotation, expiration. While I like cryptography, I really, really dislike how PGP works, and all these key-managment hoops there serve no real purpose, and just confuse people.

If you ever decide to nuke your own compromised ID, you are supposed to create a self-distrustrust proof (negative trust for your own self). The implementation of this is probably missing/not working, but in concept - this is a way to tell everyone, that you don't trust your own self-identity, so other shouldn't anymore either. No need for revocation certificates and other crap.

Could you create a key automatically, without asking?

Sure. Key is easy. But what about the repo url? The biggest reason the explicit ID creation is there, is that the user has to provide us with the url of the identity.

I don't know what crev currently does, but if reviews repo could just accumulate IDs, to the point I could use a new ID for every new review, and they'd all be trusted and considered mine, because they're in my repo, that'd be great.

I don't understand this. Repo definitely can store proofs from multiple identities. That's how I envision organizations will use it: all engineers will just push/PR the same repo.

But I don't understand how would multiple IDs help with anything. You can already generate new ID before every review, and set the same url in there, and it will "just work". Except there's no implicit trust between them.

Basically, key management is a real cost. It's intimidating, it's hard to do right, it's annoying.

It might be a presupposition based on how terribly unusable GPG is? In crev the key management is really basic. No subkeys, no expiration, no weird settings and so on. If it wasn't for the URL and maybe backing up, we could generate the key on the fly.

My threat model is that if something can read files from ~/ or launch crev on my machine on my behalf,

It is actually much easier for an attacker to snatch one file that is stored in a predefined place, than it is to lunch a full-blown keylogging/whatever attack to steal your passphrase. So I would not discount completely the additional security of even a lame passphrase. Sure - if you've system is compromised, the attacker can win, but a lot of attacks like this are not so sophisticated.

And the defaults matter, so while I am happy to enable escape hatches, I would like the user to have to set up the passphrase by default.

@dpc
Copy link
Collaborator

dpc commented Jul 21, 2019

On macOS however, you could put the key in the Keychain, which gives it a bit of extra protection.

Oh, and I would be happy to implement features like this. :)

@cmcaine
Copy link

cmcaine commented Jul 21, 2019

I don't understand this. Repo definitely can store proofs from multiple identities. That's how I envision organizations will use it: all engineers will just push/PR the same repo.

I think @kornelski wants to just tell crev to trust repo url. That way they don't need to care if the keys are the same or not.

If that is a common pattern, then there's little point in the keys because the URLs are the real roots of trust.

Also, if I had to enter a potentially long passphrase every time I wanted to write a review, that would be annoying. I guess I would just alias crev to crev -p mypassphrase or something.

@dpc
Copy link
Collaborator

dpc commented Jul 21, 2019

@cmcaine

I think @kornelski wants to just tell crev to trust repo url. That way they don't need to care if the keys are the same or not.

If so, I'm kind of fine with that. It would be like TOFU, which is reasonable for normal users. And there are still checks that crev can perform to make this more solid.

I could have cargo crev trust url <url> and cargo crev trust id <id> with cargo crev trust <smth> maybe doing auto-detection.

If that is a common pattern, then there's little point in the keys because the URLs are the real roots of trust.

That's not true. As long as it performs TOFU without automatic-updates etc. there's still plenty of difference, and a lot of checks that could be performed: eg. do we already have this ID + url in other known proofs? Do they match? Etc.

crev -p mypassphrase

Passing secrets like that is not a good practice. It leaves them recorded in your .bash_history, and makes them visible in ps -All and other places. As I said CREV_PASSPHRASE is implemented already for this purpose.

Eventually support of all sorts of keyrings could make this inconvenience a non-issue.

@Alexendoo
Copy link
Contributor Author

Alexendoo commented Jul 21, 2019

Note that if something has access to ~ the attack doesn't require any sophisticated keylogger. You append a small function to .bashrc (or whatever profile) that fakes its own password prompt using read. Same deal as the common sudo one

@dpc
Copy link
Collaborator

dpc commented Jul 21, 2019

You append a small function to .bashrc

Sure. But the victim has to actually enter a passphrase, and secrets will need to be transmitted somewhere and everything has to go fine. Which is actually a lot more difficult, since such attack are often lunched widely, and not hand-tailored for each victim. A wide-net-phishing attack like this (as opposed to spear-type targeting an individual) can at very least by much more quickly noticed, helping people invalidate suspected IDs etc.

@cmcaine
Copy link

cmcaine commented Jul 21, 2019 via email

@kornelski
Copy link
Member

kornelski commented Jul 21, 2019

You can actually call a lot of commands without generating the id.

By start using, I've meant publishing a review (we need more of these, so that should be minimum friction). I don't think you can publish one without being introduced to crev's key management.

Another way to look at the situation:

There's some baseline level of (in)security: developers have SSH keys for their repos, crates-io token for publishing, and source sitting on their machine (that a malware running on their machine could modify directly, instead of hacking anything else). Some protect these better than others, but these are the major weak links.

What I think you're concerned about is elevating security above the baseline. If the developer gets hacked, and someone injects malware or publishes rogue package under their name, you can still catch it. In this case Crev has to have stronger security than the baseline.

What I'm concerned about is more trivial: I want to check that nobody intentionally publishes outright malware that would attack my machine and my projects. In this model baseline security is enough, because I'm only using Crev as a human-powered antivirus.

@dpc
Copy link
Collaborator

dpc commented Jul 22, 2019

@cmcaine

The semantics are not TOFU if the number of keys in a repo changes

I'm would not support that. Not only it seems like a lot of additional complexity, but also 99% of users would use only one key, and change it only in case it gettubg compromised. In such case manual intervention of other users might be neccessary/good idea anyway.

My goal and agreement is only, that for a lot of users bothering with CrevIDs is unnecessary and they will not double check anything anyway. So in the typical case TOFU and using repo url as identitier is good enough.

However for users that do care it must be possible to throurouglhy and carefully use and confirm all the underlying cryptographic identifiers.

I sketched a design for something a bit like crev last year or the year (...)

Crev pretty much works like this, except it does not upload the encrypted key anywhere. It just asks user to back it up somewhere and provides export id and import id. I did consider checking it in the public repository, but as you mention - this requires good passphrase, educating users and so on.

@kornelski

Another way to look at the situation:

I'm looking at crev as potentially superseding baseline level of package managers, because frankly it's insufficient for mission critical software.

I'm going to slowly disengage from the discussion, because I don't have much more time to devote to it.

To sum up my position:

  • I agree that simplifying the UI so that urls can be used to accomplish things without paying attention to CrevIDs is desirable, as long as users that do care, can still do stuff their thin-foil-hat-way.
  • I am not going to get rid of signing and cryptography. I don't see the reason for that, especially after a couple of tweaks here and there to make live easier. I know for the fact that if reviews are not signed, crev is going to be insufficient for any serious users eg. working on software in adversarial applications like Bitcoin.
  • Key management in crev is absolutely the simplest it can be.
  • Entering passphrase can be easily removed if the user so desires through CREV_PASSPHRASE. I'm even surprised this would be considered to be so annoying. After spending > 10 minutes reviewing code, this 5 seconds or so doesn't really seem that bad to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants