contrib: Improve verify-commits.py to work with maintainers leaving #27058
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
Reviews: See the guideline for information on the review process. If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.
Conflicts: No conflicts as of last run.
61f2100 to 15d9fc0
contrib/verify-commits/trusted-keys
D1DBF2C4B96F2DEBF4C16654410108112E7EA81F
152812300785C96444D3334D17565732E08E5E41
6B002C6EA3F91B1B0DF0C9BC8F617F1200A6D25C
key,since,until
Not sure if it makes sense to document the last merge commit here.
Keeping the key will also mean that the script will break again once the key naturally expires or is revoked. So I guess you'd also have to document a faketime where the script still passes? Moreover, the key might be purged from keyservers if it is expired/revoked, so you may also have to document the full key here. Not sure if we want this.
It seems easier to just bump the trusted root, like it was done before, see d4b3dc5
The trusted root hash commits to all previous commits, so there shouldn't be any downside, compared to listing a subset of commits prior to the trusted root hash.
Of course anyone is free to archive previous keys as long as they want and re-run gpg on previous commits, with or without faketime, as often as they want. But doing that once as part of the review process of bumping the trusted root should be enough. I don't see a need to maintain every key forever.
Overall I think, the script should ideally be easy to understand and use, so that many users can read and use it. It is already hard enough to properly run the script (just one example: #25197 (comment) ; happy to provide more examples), so adding even more complexity and failure modes will mean that likely no one is running it in practice, let alone understand it and be able to act on errors if there are any.
For me, the question was how comfortable are we with a really recent trusted root? Suppose an active maintainer needs to be removed on very short notice, e.g. keys and account are compromised. We could drop the key and update the trusted root to the current `HEAD`, but are we okay with doing that? With this change, we could just set the `until` commit rather than removing the key.
My thinking was that the trusted root shouldn't be updated ever, and really we should make it possible to move it even earlier so that older commits signed by previous maintainers can still be verified. But maybe that's not how people think of verify-commits.
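A minimal sketch of how a `key,since,until` file could be consulted, assuming `since` and `until` hold commit ids and an empty field means "unbounded". The helper names `load_key_windows` and `key_valid_for` are hypothetical, and the ancestry test is passed in as a callable (it could wrap `git merge-base --is-ancestor`); this is not the code from the dropped commit.

```python
import csv
import io
from typing import Callable, Dict, Optional, Tuple

# Hypothetical layout: one row per key, with optional commit bounds.
# An empty `since` or `until` field means "no bound on that side".
KeyWindow = Tuple[Optional[str], Optional[str]]


def load_key_windows(text: str) -> Dict[str, KeyWindow]:
    """Parse `key,since,until` rows into {fingerprint: (since, until)}."""
    windows: Dict[str, KeyWindow] = {}
    for row in csv.DictReader(io.StringIO(text)):
        since = (row.get("since") or "").strip() or None
        until = (row.get("until") or "").strip() or None
        windows[row["key"].strip()] = (since, until)
    return windows


def key_valid_for(
    windows: Dict[str, KeyWindow],
    key: str,
    commit: str,
    is_ancestor: Callable[[str, str], bool],
) -> bool:
    """Return True if `key` is allowed to have signed `commit`.

    `is_ancestor(a, b)` must return True when `a` is an ancestor of,
    or equal to, `b`.
    """
    if key not in windows:
        return False
    since, until = windows[key]
    if since is not None and not is_ancestor(since, commit):
        return False  # commit is older than the key's first allowed commit
    if until is not None and not is_ancestor(commit, until):
        return False  # commit is newer than the key's last allowed commit
    return True
```

With this shape, removing a maintainer on short notice would mean filling in `until` for their row instead of deleting the row.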
> But maybe that's not how people think of verify-commits.
Yeah opinions may differ on this. Though, I think there is no easy way out of manually reviewing (and manually taking over! [1]) changes to this script and the data it uses. So optimizing for an easy and repeatable review process should be priority, to allow as many people to repeat it as possible. Otherwise, if the review process is different for each occasion, it is impossible to document and hard to follow. I do think it probably makes sense to allow for one-off rare exceptions (unclean merge, unsigned merge, wrong sha512, ...), which can be used during the review process of bumping the trusted root, but I think will complicate stuff if they are needed to be maintained for the whole history at the HEAD commit forever. (Note that anyone can check out an earlier version to get them for a previous commit, even if they are removed from HEAD)
> We could drop the key and update the trusted root to the current HEAD, but are we okay with doing that?
I'd say yes. There is no way out of doing a change on short notice. And then, whether reviewers review the `until` commit, or the trusted root, which will be equal to the `until` commit, shouldn't make a difference?
[1] Blindly checking out the script on the fetched/downloaded/merged git commit is next to useless, unless you are the CI doing a smoke test.
I've dropped this commit for now.
15d9fc0 to 16d8070
I like the CSV approach in c090a7f better. See inline discussion and here. I don't like having to update the root trusted commit every time the maintainer list changes. Imo we should only bump it if a former maintainer revokes their key or it expires, plus maybe every couple of years to preempt such issues. As @MarcoFalke suggested people can do this with One heuristic for when it's ok to change the root trusted commit could be the branch-off commit for the oldest release branch that we still support backports to.
This script doesn't protect against maintainer collusion. But doesn't the CSV remove the need for revsig commits? Tested with |
You don't have to. It is only one possible trusted git root. Everyone using this script will have to pick their root themselves in their own copy of this script anyway. They are free to not touch their script after the root is bumped and continue using the previous script+data; they are free to pick the latest data from the In fact, everyone using the same trusted root (or encouraging people to do so via a workflow) makes it trivial for a single merge commit to short-circuit out all history (by treating all of history as one pull request), without the script even noticing (unless you are using a different root). |
It could. The CSV could have a field that indicates the key is revoked, so we could then allow revsigs only for that key. Similar with expired keys.
Maybe it would make sense to use OpenTimestamps to verify timestamps (dates) of merge commits?
@kristapsk maybe not for this PR, however, that might be a good way to make it safe to use a revocation date (we can't trust the date in a git commit). The rule would then be that a timestamp must exist with a median time past before that date.
… itself Instead of having gpg.sh check against the trusted keys for a valid signature, do it inside of verify-commits itself. This also allows us to use the same trusted-keys throughout the verify-commits.py check rather than it possibly being modified during the clean merge check.
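As a rough illustration of that idea (not the code in this PR), the key list can be read into memory once at startup and every later signature check can consult that snapshot, so switching the tree to another commit cannot change which keys are trusted. The function names are hypothetical, and skipping `#` comment lines is an assumption about the file format. The signer fingerprint could come from, for instance, git's `%GF` pretty-format placeholder.

```python
from typing import Set


def parse_trusted_keys(text: str) -> Set[str]:
    """Return the fingerprints listed one per line, normalised to uppercase.

    Blank lines are skipped; skipping lines that start with '#' is an
    assumption made for this sketch.
    """
    keys = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            keys.add(line.upper())
    return keys


def signed_by_trusted_key(fingerprint: str, trusted: Set[str]) -> bool:
    """Check a signer fingerprint against the in-memory snapshot."""
    return fingerprint.strip().upper() in trusted
```

The important property is that `parse_trusted_keys` runs once, before the script inspects any other commit.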
These commits predate the current trusted root.
16d8070 to bb86887
For testing purposes, I tried rebasing the PR on master, removing @MarcoFalke's key without updating the trusted root and instead adding revsigs:
It seems to ignore them though, because Other stuff does seem to work (per bb86887), although I didn't test very thoroughly: if I mess with a random commit in history, it will complain about commits that weren't signed. Even if I whitelist myself, the sha256 tree will complain, as it should. We should merge this before #27135 for easier verification of the past ~year.
Ah wait, I'm misunderstanding what So the only way to verify earlier history is to check out the trusted root commit, copy the most recent I suggest we merge this and then later rethink how we want to handle changing maintainers. tACK bb86887
subprocess.call([GIT, 'checkout', '--force', '--quiet', parents[0]])
subprocess.call([GIT, 'merge', '--no-ff', '--quiet', '--no-gpg-sign', parents[1]], stdout=subprocess.DEVNULL)
recreated_tree = subprocess.check_output([GIT, 'show', '--format=format:%T', 'HEAD']).decode('utf8').splitlines()[0]
recreated_tree = subprocess.check_output([GIT, "merge-tree", parents[0], parents[1]]).decode('utf8').splitlines()[0]
Does anyone know which merge strategy this uses? I couldn't find anything at https://git-scm.com/docs/git-merge-tree
See also:
Looking at the source, I believe it uses `ort`.
So the minimum git version will be 2.38? git/git@1f0c3a2
Might be good to test on commit 0cfbb17
It seems so. If I remove the Homebrew version of Git (2.39.2) and fall back to Apple's default 2.37.1, the verification script fails (on any commit):
% contrib/verify-commits/verify-commits.py 0cfbb17
Using verify-commits data from /Users/sjors/dev/bitcoin/contrib/verify-commits
usage: git merge-tree <base-tree> <branch1> <branch2>
Traceback (most recent call last):
  File "contrib/verify-commits/verify-commits.py", line 191, in <module>
    main()
  File "contrib/verify-commits/verify-commits.py", line 181, in main
    recreated_tree = subprocess.check_output([GIT, "merge-tree", parents[0], parents[1]]).decode('utf8').splitlines()[0]
  File "/Users/sjors/.pyenv/versions/3.7.16/lib/python3.7/subprocess.py", line 411, in check_output
    **kwargs).stdout
  File "/Users/sjors/.pyenv/versions/3.7.16/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['git', 'merge-tree', '100949af0e2551f22c02a73355f2c64710b68ef1', 'a60d9eb9e6b6a272a3fca8981d89a55955dced55']' returned non-zero exit status 129.
I think it is reasonable to require that. I'm sure the people who would run this script can figure that out. I'll add a note to the docs.
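One way to make the requirement fail loudly instead of surfacing as exit status 129 would be an up-front version check. This is only a sketch: the 2.38 minimum is taken from the discussion above, and `parse_git_version` and `git_is_new_enough` are hypothetical helpers, not part of the script.

```python
import re
import subprocess
from typing import Tuple

MIN_GIT = (2, 38)  # minimum suggested in the discussion above


def parse_git_version(output: str) -> Tuple[int, int]:
    """Extract (major, minor) from the output of `git --version`."""
    m = re.search(r"git version (\d+)\.(\d+)", output)
    if m is None:
        raise ValueError(f"unrecognised git version string: {output!r}")
    return int(m.group(1)), int(m.group(2))


def git_is_new_enough(git: str = "git") -> bool:
    """Return True if the installed git is at least MIN_GIT."""
    out = subprocess.check_output([git, "--version"]).decode("utf8")
    return parse_git_version(out) >= MIN_GIT
```

A caller could exit with a readable message when `git_is_new_enough()` returns False, before any `merge-tree` invocation is attempted.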
ACK 14fac80
This part in the PR description
looks outdated. |
Updated the description |
The median time past is not the correct way to interpret a Bitcoin block time for the purpose of timestamping. The problem is median time past is in the past; a timestamp proof is a stronger statement if it's backdated, not weaker. Instead, the OpenTimestamps client rounds off timestamps to the nearest day in the local timezone to give users the right impression about the accuracy of OTS proofs, without getting into the UI complexity of having timestamps with dates apparently in the future. For an automated tool, adding a day to the block time and interpreting the timestamp as proof that some data existed prior to that point in time could be a reasonable approach.
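The "add a day to the block time" rule for an automated tool could be written roughly as follows. This is a sketch under assumptions: the block time is taken as a Unix timestamp, the cutoff is whatever date a key list would record, and `existed_before` and `predates_cutoff` are hypothetical names. No OpenTimestamps API is used here.

```python
from datetime import datetime, timedelta, timezone

MARGIN = timedelta(days=1)  # the one-day allowance suggested above


def existed_before(block_time: int) -> datetime:
    """Latest moment by which the timestamped data is taken to have existed.

    `block_time` is the Unix time in the header of the attesting block.
    """
    return datetime.fromtimestamp(block_time, tz=timezone.utc) + MARGIN


def predates_cutoff(block_time: int, cutoff: datetime) -> bool:
    """True if the proof places the data at or before `cutoff`."""
    return existed_before(block_time) <= cutoff
```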
@petertodd in practice things should be fine if there's at least a few days between when a maintainer last merges something and the date we put in a CSV to track when they are no longer authorised. Maybe a bit more if there's a congestion delay in when the timestamp gets included. In the case of Wladimir there was a full year between his last merge and revoking his key.
My comment is about what standard an automated tool should apply. Obviously, if some humans are looking at it and manually updating a CSV, there's a lot of flexibility and people can use their judgement.
Concept ACK 14fac80
Makes sense to me to use the same trusted-keys throughout, remove commit exceptions that are never used, use merge-tree instead of checkout/merge, and skip if commit is older than trusted root.
This will break the script on all operating systems except the rolling ones like rawhide and tumbleweed? See also: subprocess.CalledProcessError: Command '['git', 'merge-tree', 'be2e748f378fc9ed40593a723dd18f2528705956', '14fac808bd6c12bce121011bbf50501960c7326f']' returned non-zero exit status 129.
iiuc reverting 5497c14 would make it go away though keeping |
Yeah, should be easy to temporarily bump CI to |
Would it be unreasonable for users to add a ppa and install that way? Also temporary?
I think it's fine to have a temporary hoop here, as long as it's documented.
Ok, fair enough. Also, I guess no one is forced to upgrade their script. Anyone is free to use the previous version of the python script as long as they want.
Currently the `verify-commits.py` script does not work well with maintainers giving up their commit access. If a key is removed from `trusted-keys`, any commits it signed previously will fail to verify; however, keys cannot be kept in the list as it would allow that person to continue to push new commits. Furthermore, the `trusted-keys` used depends on the working tree which `verify-commits.py` itself may be modifying. When the script is run, the `trusted-keys` may be the one that is intended to be used, but the script may change the tree to a different commit with a different `trusted-keys` and use that instead!

To resolve these issues, I've updated `verify-commits.py` to load the `trusted-keys` file and check the keys itself rather than delegating that to `gpg.sh` (which previously read in `trusted-keys`). This avoids the issue with the tree changing.

I've also updated the script so that it stops modifying the tree. It would do this for the clean merge check where it would check out each individual commit and attempt to reapply the merges, and then check out the commit given as a cli arg. `git merge-tree` lets us do basically that but without modifying the tree. It will give us the object id for the resulting tree which we can compare against the object id of the tree in the merge commit in question. This also appears to be quite a bit faster.

Lastly I've removed all of the exception commits in `allow-revsig-commits`, `allow-incorrect-sha512-commits`, and `allow-unclean-merge-commits` since all of these predate the commits in `trusted-git-root` and `trusted-sha512-root`. I've also updated the script to skip verification of commits that predate `trusted-git-root`, and skip sha512 verification for those that predate `trusted-sha512-root`.
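The clean-merge comparison described above can be sketched as below. It assumes a git new enough that `git merge-tree <branch1> <branch2>` prints the resulting tree id on its first line (2.38 per the review thread). The helper names and the injectable `run` parameter are illustrative, not the script's actual structure; `git rev-parse <commit>^{tree}` is used here to read the recorded tree, which may differ from how the script does it.

```python
import subprocess
from typing import Callable, Sequence

GIT = "git"
Runner = Callable[[Sequence[str]], bytes]


def tree_of(commit: str, run: Runner = subprocess.check_output) -> str:
    """Object id of the tree recorded in `commit`."""
    return run([GIT, "rev-parse", f"{commit}^{{tree}}"]).decode("utf8").strip()


def recreated_tree(parent_a: str, parent_b: str,
                   run: Runner = subprocess.check_output) -> str:
    """Tree id from merging the two parents, without touching the working tree."""
    out = run([GIT, "merge-tree", parent_a, parent_b])
    return out.decode("utf8").splitlines()[0]


def is_clean_merge(commit: str, parent_a: str, parent_b: str,
                   run: Runner = subprocess.check_output) -> bool:
    """True if re-merging the parents reproduces the tree stored in `commit`."""
    try:
        return recreated_tree(parent_a, parent_b, run) == tree_of(commit, run)
    except subprocess.CalledProcessError as e:
        if e.returncode == 1:  # merge-tree exits with 1 when the merge conflicts
            return False
        raise  # any other status is a real error, e.g. 129 on an older git
```

Passing `run` explicitly keeps the comparison logic testable without a repository; in normal use the default `subprocess.check_output` is enough.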