Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

contrib: Improve verify-commits.py to work with maintainers leaving #27058

Merged
merged 5 commits into from Feb 27, 2023

Conversation

achow101
Copy link
Member

@achow101 achow101 commented Feb 7, 2023

Currently the verify-commits.py script does not work well with maintainers giving up their commit access. If a key is removed from trusted-keys, any commits it signed previously will fail to verify, however keys cannot be kept in the list as it would allow that person to continue to push new commits. Furthermore, the trusted-keys used depends on the working tree which verify-commits.py itself may be modifying. When the script is run, the trusted-keys may be the one that is intended to be used, but the script may change the tree to a different commit with a different trusted-keys and use that instead!

To resolve these issues, I've updated verify-commits.py to load the trusted-keys file and check the keys itself rather than delegating that to gpg.sh (which previously read in trusted-keys). This avoids the issue with the tree changing.

I've also updated the script so that it stops modifying the tree. It would do this for the clean merge check where it would checkout each individual commit and attempt to reapply the merges, and then checking out the commit given as a cli arg. git merge-tree lets us do basically that but without modifying the tree. It will give us the object id for the resulting tree which we can compare against the object id of the tree in the merge commit in question. This also appears to be quite a bit faster.

Lastly I've removed all of the exception commits in allow-revsig-commits, allow-incorrect-sha512-commits, and allow-unclean-merge-commits since all of these predate the commits in trusted-git-root and trusted-sha512-root. I've also updated the script to skip verification of commits that predate trusted-git-root, and skip sha512 verification for those that predate trusted-sha512-root.

@DrahtBot
Copy link
Contributor

DrahtBot commented Feb 7, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK Sjors
Concept ACK glozow

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

No conflicts as of last run.

D1DBF2C4B96F2DEBF4C16654410108112E7EA81F
152812300785C96444D3334D17565732E08E5E41
6B002C6EA3F91B1B0DF0C9BC8F617F1200A6D25C
key,since,until
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if it makes sense to document the last merge commit here.

Keeping the key will also mean that the script will break again once the key naturally expires or is revoked. So I guess you'd also have to document a faketime where the script still passes? Moreover, the key might be purged from keyservers if it is expired/revoked, so you may also have to document the full key here. Not sure if we want this.

It seems easier to just bump the trusted root, like it was done before, see d4b3dc5

The trusted root hash commits to all previous commits, so there shouldn't be any downside, compared to listing a subset of commits prior to the trusted root hash.

Of course anyone is free to archive previous keys as long as they want and re-run gpg on previous commits, with or without faketime, as often as they want. But doing that once as part of the review process of bumping the trusted root should be enough. I don't see a need to maintain every key forever.

Overall I think, the script should ideally be easy to understand and use, so that many users can read and use it. It is already hard enough to properly run the script (just one example: #25197 (comment) ; happy to provide more examples), so adding even more complexity and failure modes will mean that likely no one is running it in practice, let alone understand it and be able to act on errors if there are any.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me, the question was how comfortable are we with a really recent trusted root? Suppose an active maintainer needs to be removed on very short notice, e.g. keys and account is compromised. We could drop the key and update the trusted root to the current HEAD, but are we okay with doing that? With this change, we could just set the until commit rather than removing the key.

My thinking was that the trusted root shouldn't be updated ever, and really we should make it possible to move it even earlier so that older commits signed by previous maintainers can still be verified. But maybe that's not how people think of verify-commits.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But maybe that's not how people think of verify-commits.

Yeah opinions may differ on this. Though, I think there is no easy way out of manually reviewing (and manually taking over! [1]) changes to this script and the data it uses. So optimizing for an easy and repeatable review process should be priority, to allow as many people to repeat it as possible. Otherwise, if the review process is different for each occasion, it is impossible to document and hard to follow. I do think it probably makes sense to allow for one-off rare exceptions (unclean merge, unsigned merge, wrong sha512, ...), which can be used during the review process of bumping the trusted root, but I think will complicate stuff if they are needed to be maintained for the whole history at the HEAD commit forever. (Note that anyone can check out an earlier version to get them for a previous commit, even if they are removed from HEAD)

We could drop the key and update the trusted root to the current HEAD, but are we okay with doing that?

I'd say yes. There is no way out of doing a change on short notice. And then, whether reviewers review the until commit, or the trusted root, which will be equal to the until commit, shouldn't make a difference?

[1] Blindly checking out the script on the fetched/downloaded/merged git commit is next to useless, unless you are the CI doing a smoke test.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've dropped this commit for now.

@Sjors
Copy link
Member

Sjors commented Feb 11, 2023

I like the CSV approach in c090a7f better. See inline discussion and here. I don't like having to update the root trusted commit every time the maintainer list changes.

Imo we should only bump it if a former maintainer revokes their key or it expires, plus maybe every couple of years to preempt such issues. As @MarcoFalke suggested people can do this with git checkout "$(cat contrib/verify-commits/trusted-git-root)~1" (and override verification script with the new version). But I'd rather avoid that friction.

One heuristic for when it's ok to change the root trusted commit could be the branch-off commit for the oldest release branch that we still support backports to.

@MarcoFalke wrote:

What would the alternative be? Listing hundreds or thousands of "revsig" commits in a file, to ensure it is impossible to review manually, only with special git commands, potentially making it trivial to sneak in malicious commits that are not actually revsig commits? And then, as you say, bump the root anyway later.

This script doesn't protect against maintainer collusion. But doesn't the CSV remove the need for revsig commits?

Tested with contrib/verify-commits/verify-commits.py HEAD~4. It indeed seems a bit faster. Still have to test some other things.

@maflcko
Copy link
Member

maflcko commented Feb 11, 2023

I don't like having to update the root trusted commit every time the maintainer list changes.

You don't have to. It is only one possible trusted git root. Everyone using this script will have to pick their root themselves in their own copy of this script anyway. They are free to not touch their script after the root is bumped and continue using the previous script+data; they are free to pick the latest data from the master branch, once and if they agree with it; they are also free to pick any other trusted git root according to their own needs (this is what I do).

In fact, everyone using the same trusted root (or encouraging people to do so via a workflow) makes it trivial for a single merge commit to short-circuit out all history (by treating all of history as one pull request), without the script even noticing (unless you are using a different root).

@achow101
Copy link
Member Author

But doesn't the CSV remove the need for revsig commits?

It could. The CSV could have a field that indicates the key is revoked, so we could then allow revsigs only for that key. Similar with expired keys.

@kristapsk
Copy link
Contributor

Maybe it would make sense to use OpenTimestamps to verify timestamps (dates) of merge commits?

@Sjors
Copy link
Member

Sjors commented Feb 12, 2023

@kristapsk maybe not for this PR, however, that might be a good way to make it safe to use a revocation date (we can't trust the date in a git commit). The rule would then be that a timestamp must exist with a median time past before that date.

… itself

Instead of having gpg.sh check against the trusted keys for a valid
signature, do it inside of verify-commits itself.

This also allows us to use the same trusted-keys throughout the
verify-commits.py check rather than it possibly being modified during
the clean merge check.
These commits predate the current trusted root.
@Sjors
Copy link
Member

Sjors commented Feb 21, 2023

For testing purposes, I tried rebasing the PR on master, removing @MarcoFalke's key without updating the trusted root and instead adding revsigs:

git log --format="%H %GK" --merges $(cat contrib/verify-commits/trusted-git-root)..master | grep -E "CE2B75697E69A548" | cut -c -40

It seems to ignore them though, because contrib/verify-commits/verify-commits.py HEAD~4 fails: No parent of 75f0e0b607cd7ff7afd56853eb34a2b285b22ad2 was signed with a trusted key!. I checked that the first parent it lists is in allow-revsig-commits.

Other stuff does seem to work (per bb86887), although I didn't test very thoroughly: if I mess with a random commit in history, it will fail complain about that commits that weren't signed. Even if I whitelist myself, the sha256 tree will complain, as it should.

We should merge this before #27135 for easier verification of the past ~year.

@Sjors
Copy link
Member

Sjors commented Feb 21, 2023

Ah wait, I'm misunderstanding what allow-revsig-commits does. It merely allows a signature fron a revoked key. Not a key that we removed from the trusted list, but a revoked PGP key (someone verifying signatures may or may not have an up to date view of revoked keys).

So the only way to verify earlier history is to check out the trusted root commit, copy the most recent verify-commits.py and run it again. That way it will use the trusted keys at the time.

I suggest we merge this and then later rethink how we want to handle changing maintainers.

tACK bb86887

subprocess.call([GIT, 'checkout', '--force', '--quiet', parents[0]])
subprocess.call([GIT, 'merge', '--no-ff', '--quiet', '--no-gpg-sign', parents[1]], stdout=subprocess.DEVNULL)
recreated_tree = subprocess.check_output([GIT, 'show', '--format=format:%T', 'HEAD']).decode('utf8').splitlines()[0]
recreated_tree = subprocess.check_output([GIT, "merge-tree", parents[0], parents[1]]).decode('utf8').splitlines()[0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does anyone know which merge strategy this uses? I couldn't find anything at https://git-scm.com/docs/git-merge-tree

See also:

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the source, I believe it uses ort.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the minimum git version will be 2.38? git/git@1f0c3a2

Might be good to test on commit 0cfbb17

Copy link
Member

@Sjors Sjors Feb 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems so. If I remove the Homebrew version of Git (2.39.2) and fall back to Apple's default 2.37.1 the verification script fails (on any commit)

 % contrib/verify-commits/verify-commits.py 0cfbb17
Using verify-commits data from /Users/sjors/dev/bitcoin/contrib/verify-commits
usage: git merge-tree <base-tree> <branch1> <branch2>
Traceback (most recent call last):
  File "contrib/verify-commits/verify-commits.py", line 191, in <module>
    main()
  File "contrib/verify-commits/verify-commits.py", line 181, in main
    recreated_tree = subprocess.check_output([GIT, "merge-tree", parents[0], parents[1]]).decode('utf8').splitlines()[0]
  File "/Users/sjors/.pyenv/versions/3.7.16/lib/python3.7/subprocess.py", line 411, in check_output
    **kwargs).stdout
  File "/Users/sjors/.pyenv/versions/3.7.16/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['git', 'merge-tree', '100949af0e2551f22c02a73355f2c64710b68ef1', 'a60d9eb9e6b6a272a3fca8981d89a55955dced55']' returned non-zero exit status 129.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is reasonable to require that. I'm sure the people who would run this script can figure that out. I'll add a note to the docs.

@Sjors
Copy link
Member

Sjors commented Feb 22, 2023

ACK 14fac80

@hebasto
Copy link
Member

hebasto commented Feb 23, 2023

This part in the PR description

I've changed trusted-keys from just a list of key ids to a csv file with each row being the key id, and the earliest and latest commits the key is allowed to have signed.

looks outdated.

@achow101
Copy link
Member Author

looks outdated.

Updated the description

@petertodd
Copy link
Contributor

@Sjors

The rule would then be that a [OpenTimestamps] timestamp must exist with a median time past before that date.

The median time past is not the correct way to interpret a Bitcoin block time for the purpose of timestamping. The problem is median time past is in the past; a timestamp proof is a stronger statement if it's backdated, not weaker.

Instead, the OpenTimestamps client rounds off timestamps to the nearest day in the local timezone to give users the right impression about the accuracy of OTS proofs, without getting into the UI complexity of having timestamps with dates apparently in the future. For an automated tool, adding a day to the block time and interpreting the timestamp as proof that some data existed prior to that point in time could be a reasonable approach.

@Sjors
Copy link
Member

Sjors commented Feb 23, 2023

@petertodd in practice things should be fine if there's at least a few days between when a maintainer last merges something and the date we put in a CSV to track when they are no longer authorised. Maybe a bit more if there's a congestion delay in when the timestamp gets included. In the case of Wladimir there was a full year between his last merge and revoking his key.

@petertodd
Copy link
Contributor

@petertodd in practice things should be fine if there's at least a few days between when a maintainer last merges something and the date we put in a CSV to track when they are no longer authorised. Maybe a bit more if there's a congestion delay in when the timestamp gets included. In the case of Wladimir there was a full year between his last merge and revoking his key.

My comment is about what standard an automated tool should apply. Obviously, if some humans are looking at it and manually updating a CSV, there's a lot of flexibility and people can use their judgement.

Copy link
Member

@glozow glozow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK 14fac80
Makes sense to me to use the same trusted-keys throughout, remove commit exceptions that are never used, use merge-tree instead of checkout/merge, and skip if commit is older than trusted root.

@glozow glozow merged commit 873dcc1 into bitcoin:master Feb 27, 2023
@maflcko
Copy link
Member

maflcko commented Feb 27, 2023

This will break the script on all operating systems except the rolling ones like rawhide and tumbleweed?

See also:

subprocess.CalledProcessError: Command '['git', 'merge-tree', 'be2e748f378fc9ed40593a723dd18f2528705956', '14fac808bd6c12bce121011bbf50501960c7326f']' returned non-zero exit status 129.

https://cirrus-ci.com/task/4961884550463488?logs=lint#L242

@glozow
Copy link
Member

glozow commented Feb 27, 2023

iiuc reverting 5497c14 would make it go away though keeping merge-tree would be nice.
Is bumping the lint container to something that gets 2.38 an option for CI?

@maflcko
Copy link
Member

maflcko commented Feb 27, 2023

Yeah, should be easy to temporarily bump CI to ubuntu:devel. Though, I wonder what users are supposed to do?

@glozow
Copy link
Member

glozow commented Feb 27, 2023

Would it be unreasonable for users to add a ppa and install that way? Also temporary?

sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Feb 27, 2023
@Sjors
Copy link
Member

Sjors commented Mar 1, 2023

I think it's fine to have a temporary hoop here, as long as it's documented.

@maflcko
Copy link
Member

maflcko commented Mar 1, 2023

Ok, fair enough. Also, I guess no one is forced to upgrade their script. Anyone is free to use the previous version of the python script as long as they want.

@bitcoin bitcoin locked and limited conversation to collaborators Feb 29, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants