contrib: Improve verify-commits.py to work with maintainers leaving #27058

achow101 · 2023-02-07T19:17:51Z

Currently the verify-commits.py script does not work well with maintainers giving up their commit access. If a key is removed from trusted-keys, any commits it signed previously will fail to verify, however keys cannot be kept in the list as it would allow that person to continue to push new commits. Furthermore, the trusted-keys used depends on the working tree which verify-commits.py itself may be modifying. When the script is run, the trusted-keys may be the one that is intended to be used, but the script may change the tree to a different commit with a different trusted-keys and use that instead!

To resolve these issues, I've updated verify-commits.py to load the trusted-keys file and check the keys itself rather than delegating that to gpg.sh (which previously read in trusted-keys). This avoids the issue with the tree changing.

I've also updated the script so that it stops modifying the tree. It would do this for the clean merge check where it would checkout each individual commit and attempt to reapply the merges, and then checking out the commit given as a cli arg. git merge-tree lets us do basically that but without modifying the tree. It will give us the object id for the resulting tree which we can compare against the object id of the tree in the merge commit in question. This also appears to be quite a bit faster.

Lastly I've removed all of the exception commits in allow-revsig-commits, allow-incorrect-sha512-commits, and allow-unclean-merge-commits since all of these predate the commits in trusted-git-root and trusted-sha512-root. I've also updated the script to skip verification of commits that predate trusted-git-root, and skip sha512 verification for those that predate trusted-sha512-root.

DrahtBot · 2023-02-07T19:17:53Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	Sjors
Concept ACK	glozow

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

No conflicts as of last run.

contrib/verify-commits/verify-commits.py

maflcko · 2023-02-08T09:54:15Z

contrib/verify-commits/trusted-keys

-D1DBF2C4B96F2DEBF4C16654410108112E7EA81F
-152812300785C96444D3334D17565732E08E5E41
-6B002C6EA3F91B1B0DF0C9BC8F617F1200A6D25C
+key,since,until


Not sure if it makes sense to document the last merge commit here.

Keeping the key will also mean that the script will break again once the key naturally expires or is revoked. So I guess you'd also have to document a faketime where the script still passes? Moreover, the key might be purged from keyservers if it is expired/revoked, so you may also have to document the full key here. Not sure if we want this.

It seems easier to just bump the trusted root, like it was done before, see d4b3dc5

The trusted root hash commits to all previous commits, so there shouldn't be any downside, compared to listing a subset of commits prior to the trusted root hash.

Of course anyone is free to archive previous keys as long as they want and re-run gpg on previous commits, with or without faketime, as often as they want. But doing that once as part of the review process of bumping the trusted root should be enough. I don't see a need to maintain every key forever.

Overall I think, the script should ideally be easy to understand and use, so that many users can read and use it. It is already hard enough to properly run the script (just one example: #25197 (comment) ; happy to provide more examples), so adding even more complexity and failure modes will mean that likely no one is running it in practice, let alone understand it and be able to act on errors if there are any.

For me, the question was how comfortable are we with a really recent trusted root? Suppose an active maintainer needs to be removed on very short notice, e.g. keys and account is compromised. We could drop the key and update the trusted root to the current HEAD, but are we okay with doing that? With this change, we could just set the until commit rather than removing the key.

My thinking was that the trusted root shouldn't be updated ever, and really we should make it possible to move it even earlier so that older commits signed by previous maintainers can still be verified. But maybe that's not how people think of verify-commits.

But maybe that's not how people think of verify-commits.

Yeah opinions may differ on this. Though, I think there is no easy way out of manually reviewing (and manually taking over! [1]) changes to this script and the data it uses. So optimizing for an easy and repeatable review process should be priority, to allow as many people to repeat it as possible. Otherwise, if the review process is different for each occasion, it is impossible to document and hard to follow. I do think it probably makes sense to allow for one-off rare exceptions (unclean merge, unsigned merge, wrong sha512, ...), which can be used during the review process of bumping the trusted root, but I think will complicate stuff if they are needed to be maintained for the whole history at the HEAD commit forever. (Note that anyone can check out an earlier version to get them for a previous commit, even if they are removed from HEAD)

We could drop the key and update the trusted root to the current HEAD, but are we okay with doing that?

I'd say yes. There is no way out of doing a change on short notice. And then, whether reviewers review the until commit, or the trusted root, which will be equal to the until commit, shouldn't make a difference?

[1] Blindly checking out the script on the fetched/downloaded/merged git commit is next to useless, unless you are the CI doing a smoke test.

I've dropped this commit for now.

Sjors · 2023-02-11T13:31:47Z

I like the CSV approach in c090a7f better. See inline discussion and here. I don't like having to update the root trusted commit every time the maintainer list changes.

Imo we should only bump it if a former maintainer revokes their key or it expires, plus maybe every couple of years to preempt such issues. As @MarcoFalke suggested people can do this with git checkout "$(cat contrib/verify-commits/trusted-git-root)~1" (and override verification script with the new version). But I'd rather avoid that friction.

One heuristic for when it's ok to change the root trusted commit could be the branch-off commit for the oldest release branch that we still support backports to.

@MarcoFalke wrote:

What would the alternative be? Listing hundreds or thousands of "revsig" commits in a file, to ensure it is impossible to review manually, only with special git commands, potentially making it trivial to sneak in malicious commits that are not actually revsig commits? And then, as you say, bump the root anyway later.

This script doesn't protect against maintainer collusion. But doesn't the CSV remove the need for revsig commits?

Tested with contrib/verify-commits/verify-commits.py HEAD~4. It indeed seems a bit faster. Still have to test some other things.

maflcko · 2023-02-11T13:45:58Z

I don't like having to update the root trusted commit every time the maintainer list changes.

You don't have to. It is only one possible trusted git root. Everyone using this script will have to pick their root themselves in their own copy of this script anyway. They are free to not touch their script after the root is bumped and continue using the previous script+data; they are free to pick the latest data from the master branch, once and if they agree with it; they are also free to pick any other trusted git root according to their own needs (this is what I do).

In fact, everyone using the same trusted root (or encouraging people to do so via a workflow) makes it trivial for a single merge commit to short-circuit out all history (by treating all of history as one pull request), without the script even noticing (unless you are using a different root).

achow101 · 2023-02-11T16:32:34Z

But doesn't the CSV remove the need for revsig commits?

It could. The CSV could have a field that indicates the key is revoked, so we could then allow revsigs only for that key. Similar with expired keys.

kristapsk · 2023-02-11T16:52:09Z

Maybe it would make sense to use OpenTimestamps to verify timestamps (dates) of merge commits?

Sjors · 2023-02-12T14:39:30Z

@kristapsk maybe not for this PR, however, that might be a good way to make it safe to use a revocation date (we can't trust the date in a git commit). The rule would then be that a timestamp must exist with a median time past before that date.

… itself Instead of having gpg.sh check against the trusted keys for a valid signature, do it inside of verify-commits itself. This also allows us to use the same trusted-keys throughout the verify-commits.py check rather than it possibly being modified during the clean merge check.

These commits predate the current trusted root.

Sjors · 2023-02-21T14:45:07Z

For testing purposes, I tried rebasing the PR on master, removing @MarcoFalke's key without updating the trusted root and instead adding revsigs:

git log --format="%H %GK" --merges $(cat contrib/verify-commits/trusted-git-root)..master | grep -E "CE2B75697E69A548" | cut -c -40

It seems to ignore them though, because contrib/verify-commits/verify-commits.py HEAD~4 fails: No parent of 75f0e0b607cd7ff7afd56853eb34a2b285b22ad2 was signed with a trusted key!. I checked that the first parent it lists is in allow-revsig-commits.

Other stuff does seem to work (per bb86887), although I didn't test very thoroughly: if I mess with a random commit in history, it will fail complain about that commits that weren't signed. Even if I whitelist myself, the sha256 tree will complain, as it should.

We should merge this before #27135 for easier verification of the past ~year.

Sjors · 2023-02-21T15:05:50Z

Ah wait, I'm misunderstanding what allow-revsig-commits does. It merely allows a signature fron a revoked key. Not a key that we removed from the trusted list, but a revoked PGP key (someone verifying signatures may or may not have an up to date view of revoked keys).

So the only way to verify earlier history is to check out the trusted root commit, copy the most recent verify-commits.py and run it again. That way it will use the trusted keys at the time.

I suggest we merge this and then later rethink how we want to handle changing maintainers.

tACK bb86887

maflcko · 2023-02-21T15:23:50Z

contrib/verify-commits/verify-commits.py

-            subprocess.call([GIT, 'checkout', '--force', '--quiet', parents[0]])
-            subprocess.call([GIT, 'merge', '--no-ff', '--quiet', '--no-gpg-sign', parents[1]], stdout=subprocess.DEVNULL)
-            recreated_tree = subprocess.check_output([GIT, 'show', '--format=format:%T', 'HEAD']).decode('utf8').splitlines()[0]
+            recreated_tree = subprocess.check_output([GIT, "merge-tree", parents[0], parents[1]]).decode('utf8').splitlines()[0]


Does anyone know which merge strategy this uses? I couldn't find anything at https://git-scm.com/docs/git-merge-tree

See also:

https://git-scm.com/docs/git-merge#Documentation/git-merge.txt--sltstrategygt

commit 291e363

Looking at the source, I believe it uses ort.

So the minimum git version will be 2.38? git/git@1f0c3a2

Might be good to test on commit 0cfbb17

It seems so. If I remove the Homebrew version of Git (2.39.2) and fall back to Apple's default 2.37.1 the verification script fails (on any commit)

% contrib/verify-commits/verify-commits.py 0cfbb17 Using verify-commits data from /Users/sjors/dev/bitcoin/contrib/verify-commits usage: git merge-tree <base-tree> <branch1> <branch2> Traceback (most recent call last): File "contrib/verify-commits/verify-commits.py", line 191, in <module> main() File "contrib/verify-commits/verify-commits.py", line 181, in main recreated_tree = subprocess.check_output([GIT, "merge-tree", parents[0], parents[1]]).decode('utf8').splitlines()[0] File "/Users/sjors/.pyenv/versions/3.7.16/lib/python3.7/subprocess.py", line 411, in check_output **kwargs).stdout File "/Users/sjors/.pyenv/versions/3.7.16/lib/python3.7/subprocess.py", line 512, in run output=stdout, stderr=stderr) subprocess.CalledProcessError: Command '['git', 'merge-tree', '100949af0e2551f22c02a73355f2c64710b68ef1', 'a60d9eb9e6b6a272a3fca8981d89a55955dced55']' returned non-zero exit status 129.

I think it is reasonable to require that. I'm sure the people who would run this script can figure that out. I'll add a note to the docs.

Sjors · 2023-02-22T11:18:08Z

ACK 14fac80

hebasto · 2023-02-23T16:22:09Z

This part in the PR description

I've changed trusted-keys from just a list of key ids to a csv file with each row being the key id, and the earliest and latest commits the key is allowed to have signed.

looks outdated.

achow101 · 2023-02-23T16:46:23Z

looks outdated.

Updated the description

petertodd · 2023-02-23T17:01:49Z

@Sjors

The rule would then be that a [OpenTimestamps] timestamp must exist with a median time past before that date.

The median time past is not the correct way to interpret a Bitcoin block time for the purpose of timestamping. The problem is median time past is in the past; a timestamp proof is a stronger statement if it's backdated, not weaker.

Instead, the OpenTimestamps client rounds off timestamps to the nearest day in the local timezone to give users the right impression about the accuracy of OTS proofs, without getting into the UI complexity of having timestamps with dates apparently in the future. For an automated tool, adding a day to the block time and interpreting the timestamp as proof that some data existed prior to that point in time could be a reasonable approach.

Sjors · 2023-02-23T18:28:54Z

@petertodd in practice things should be fine if there's at least a few days between when a maintainer last merges something and the date we put in a CSV to track when they are no longer authorised. Maybe a bit more if there's a congestion delay in when the timestamp gets included. In the case of Wladimir there was a full year between his last merge and revoking his key.

petertodd · 2023-02-23T19:07:44Z

@petertodd in practice things should be fine if there's at least a few days between when a maintainer last merges something and the date we put in a CSV to track when they are no longer authorised. Maybe a bit more if there's a congestion delay in when the timestamp gets included. In the case of Wladimir there was a full year between his last merge and revoking his key.

My comment is about what standard an automated tool should apply. Obviously, if some humans are looking at it and manually updating a CSV, there's a lot of flexibility and people can use their judgement.

glozow

Concept ACK 14fac80
Makes sense to me to use the same trusted-keys throughout, remove commit exceptions that are never used, use merge-tree instead of checkout/merge, and skip if commit is older than trusted root.

maflcko · 2023-02-27T13:37:53Z

This will break the script on all operating systems except the rolling ones like rawhide and tumbleweed?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

contrib: Improve verify-commits.py to work with maintainers leaving #27058

contrib: Improve verify-commits.py to work with maintainers leaving #27058

achow101 commented Feb 7, 2023 •

edited

DrahtBot commented Feb 7, 2023 •

edited

maflcko Feb 8, 2023

achow101 Feb 8, 2023

maflcko Feb 8, 2023

achow101 Feb 10, 2023

Sjors commented Feb 11, 2023

maflcko commented Feb 11, 2023 •

edited

achow101 commented Feb 11, 2023

kristapsk commented Feb 11, 2023

Sjors commented Feb 12, 2023 •

edited

Sjors commented Feb 21, 2023 •

edited

Sjors commented Feb 21, 2023 •

edited

maflcko Feb 21, 2023

achow101 Feb 21, 2023

maflcko Feb 21, 2023

Sjors Feb 21, 2023 •

edited

achow101 Feb 21, 2023

Sjors commented Feb 22, 2023

hebasto commented Feb 23, 2023

achow101 commented Feb 23, 2023

petertodd commented Feb 23, 2023

Sjors commented Feb 23, 2023

petertodd commented Feb 23, 2023

glozow left a comment

maflcko commented Feb 27, 2023

glozow commented Feb 27, 2023

maflcko commented Feb 27, 2023 •

edited

glozow commented Feb 27, 2023

Sjors commented Mar 1, 2023

maflcko commented Mar 1, 2023

contrib: Improve verify-commits.py to work with maintainers leaving #27058

contrib: Improve verify-commits.py to work with maintainers leaving #27058

Conversation

achow101 commented Feb 7, 2023 • edited

DrahtBot commented Feb 7, 2023 • edited

Reviews

Conflicts

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Sjors commented Feb 11, 2023

maflcko commented Feb 11, 2023 • edited

achow101 commented Feb 11, 2023

kristapsk commented Feb 11, 2023

Sjors commented Feb 12, 2023 • edited

Sjors commented Feb 21, 2023 • edited

Sjors commented Feb 21, 2023 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Sjors Feb 21, 2023 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Sjors commented Feb 22, 2023

hebasto commented Feb 23, 2023

achow101 commented Feb 23, 2023

petertodd commented Feb 23, 2023

Sjors commented Feb 23, 2023

petertodd commented Feb 23, 2023

glozow left a comment

Choose a reason for hiding this comment

maflcko commented Feb 27, 2023

glozow commented Feb 27, 2023

maflcko commented Feb 27, 2023 • edited

glozow commented Feb 27, 2023

Sjors commented Mar 1, 2023

maflcko commented Mar 1, 2023

achow101 commented Feb 7, 2023 •

edited

DrahtBot commented Feb 7, 2023 •

edited

maflcko commented Feb 11, 2023 •

edited

Sjors commented Feb 12, 2023 •

edited

Sjors commented Feb 21, 2023 •

edited

Sjors commented Feb 21, 2023 •

edited

Sjors Feb 21, 2023 •

edited

maflcko commented Feb 27, 2023 •

edited