Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[git-webkit] Add pre-push hook to prevent publication of security sensative commits #11043

Conversation

JonWBedard
Copy link
Member

@JonWBedard JonWBedard commented Mar 4, 2023

604395a

[git-webkit] Add pre-push hook to prevent publication of security sensative commits
https://bugs.webkit.org/show_bug.cgi?id=253354
rdar://106216593

Reviewed by Elliott Williams.

Write a pre-push hook to block or prompt the user in 3 situations to prevent the
inadvertent publication of security sensative commits:
- Class 1: A commit exists on a remote more secure than the one a contributor is pushing to
- Class 2: A commit is a cherry-pick of a commit from a more secure remote
- Class 3: The commit references a security bug the target remote is public

The goal of this hook is to prevent class 1 and 2 without relying on code in the checkout, while
class 3 relies on webkitbugspy to determine if a linked issue is redacted.

* Tools/Scripts/hooks/pre-push: Added.
* Tools/Scripts/libraries/webkitscmpy/setup.py: Bump version.
* Tools/Scripts/libraries/webkitscmpy/webkitscmpy/__init__.py: Ditto.
* Tools/Scripts/libraries/webkitscmpy/webkitscmpy/mocks/local/git.py:
(Git): Add `git config --get-regexp`
* Tools/Scripts/libraries/webkitscmpy/webkitscmpy/program/clean.py:
(Clean.cleanup): Forward verbosity into `git push`.
(Clean.main): Ditto.
(DeletePRBranches.main): Ditto.
* Tools/Scripts/libraries/webkitscmpy/webkitscmpy/program/land.py:
(Land.main): Forward verbosity into `git push`.
* Tools/Scripts/libraries/webkitscmpy/webkitscmpy/program/publish.py:
(Publish.main): Change operating mode of our pre-push hook to allow class-1 security
publication with a prompt.
* Tools/Scripts/libraries/webkitscmpy/webkitscmpy/program/pull_request.py:
(PullRequest.create_pull_request): Forward verbosity into `git push`.
* Tools/Scripts/libraries/webkitscmpy/webkitscmpy/program/setup.py:
(Setup._security_levels): Provide a security level for source and fork remotes based on
the order of our source remotes.
(Setup.git): Pass arguments to template for our pre-push hook.
* Tools/Scripts/libraries/webkitscmpy/webkitscmpy/test/git_unittest.py:
* Tools/Scripts/libraries/webkitscmpy/webkitscmpy/test/setup_unittest.py:

Canonical link: https://commits.webkit.org/261526@main

3321a75

Misc iOS, tvOS & watchOS macOS Linux Windows
❌ πŸ§ͺ style   πŸ›  ios   πŸ›  mac βœ… πŸ›  wpe   πŸ›  wincairo
βœ… πŸ§ͺ bindings βœ… πŸ›  ios-sim   πŸ›  mac-AS-debug   πŸ§ͺ wpe-wk2
βœ… πŸ§ͺ webkitperl   πŸ§ͺ ios-wk2   πŸ§ͺ api-mac βœ… πŸ›  gtk
βœ… πŸ§ͺ webkitpy   πŸ§ͺ api-ios   πŸ§ͺ mac-wk1   πŸ§ͺ gtk-wk2
  πŸ›  tv   πŸ§ͺ mac-wk2   πŸ§ͺ api-gtk
  πŸ›  tv-sim   πŸ§ͺ mac-AS-debug-wk2
  πŸ§ͺ services   πŸ›  watch   πŸ§ͺ mac-wk2-stress
  πŸ›  watch-sim
βœ… πŸ›  πŸ§ͺ unsafe-merge

@JonWBedard JonWBedard self-assigned this Mar 4, 2023
@JonWBedard JonWBedard added Tools / Tests Tools in the Tools directory, build issues, test infrastructure, and bugs in test cases skip-ews Applied to prevent a change from being run on EWS and removed skip-ews Applied to prevent a change from being run on EWS labels Mar 4, 2023
@JonWBedard JonWBedard force-pushed the eng/git-webkit-Add-pre-push-hook-to-prevent-publication-of-security-sensative-commits branch from 5dfede3 to 86d00a9 Compare March 4, 2023 00:52
@JonWBedard JonWBedard added the skip-ews Applied to prevent a change from being run on EWS label Mar 4, 2023
Copy link
Contributor

@ddkilzer ddkilzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had questions/comments while reviewing the code, but I don't feel qualified to approve or deny this (as I haven't reviewed related code in the past), so none of my comments should block landing the patch.

In general, I was looking for:

  • Ways to improve readability of the code.
  • Places where additional sanity checks may be required in the code.
  • Places where mistakes could be made or could occur, such as the order of remotes defining security levels, or copying and pasting the same text repeatedly to approve an action.

Also, I may not understand the threat model here, so (for example) the order of remotes matching the security level may be okay. (I was worried about a situation where there was a misconfigured setup, or a setup that had been locally edited for some reason without enforcingβ€”or knowledge ofβ€”these rules.)

SCRIPTS = os.path.dirname(os.path.dirname(LOCATION))
sys.path.append(SCRIPTS)

REMOTE_RE = re.compile(r'(?P<protcol>[^@:]+)(@|://)(?P<host>[^:]+)(/|:)(?P<path>\S+)(\.git)?/?')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use urllib.parse or similar to parse repository URLs instead of a regex? This regex is awfully complex.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't want to rely on anything outside of the Python standard library here, and urllib.parse doesn't handle something like git@github.com:WebKit/WebKit.git or ssh://github.com:WebKit/WebKit.git, so while we could rely on both urllib.parse and a regex for git@ and ssh://, but the reduced regex which can handle git@ and ssh:// can nearly handle urllib.parse's case too

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could do something like:

url = 'git@github.com:WebKit/WebKit.git'
scheme = urllib.parse.urlsplit(url).scheme
if not scheme:
    url = 'ssh://' + url.replace(':', '/', 1)
if scheme == 'ssh':
    url = url.replace(':', '/', 1)
result = urllib.parse.urlsplit(url)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do something like this (in a function, presumably), I think the question is: is this any less confusing than a regex?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I think this is easier to follow, and much easier to reason about the correctness of. Of course, it could be made easier to follow with comments, too. But I'm biased because I wrote it :)

Up to you to decide which is better for maintaining.

Tools/Scripts/hooks/pre-push Outdated Show resolved Hide resolved
Tools/Scripts/hooks/pre-push Outdated Show resolved Hide resolved
Comment on lines +73 to +93
if not REPOSITORY:
print_error('Checkout too out-of-date to check bug status of novel commits')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the if not REPOSITORY: block need a sys.exit(1) call?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, if we invoked sys.exit here, we would prevent users from pushing anywhere if their checkout was too old. Given that your checkout being too old only breaks 1 of our 3 checks (and the other two checks are the ones most relevant to an old checkout anyways), we want to keep this hook useful even in old checkouts

Tools/Scripts/hooks/pre-push Outdated Show resolved Hide resolved
Tools/Scripts/hooks/pre-push Outdated Show resolved Hide resolved
@@ -37,6 +38,8 @@ class Setup(Command):
name = 'setup'
help = 'Configure local settings for the current repository'

REMOTE_RE = re.compile(r'(?P<protcol>[^@:]+)(@|://)(?P<host>[^:]+)(/|:)(?P<path>\S+)(\.git)?/?')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As in a previous comment, it would be nice if we could use urllib.parse or similar to parse git repo URLs rather than relying on a complex regex.

result['fork'] = level
else:
result['{}-fork'.format(key.split('.')[-1])] = level
level += 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the security level really just based on the order in which the remotes are listed in a .git/config file, or is there some magic that happens with git config --get-regexp webkitscmpy.remotes to ensure the correct order each time it is run?

Maybe I'm not understanding the threat model, though.

Copy link
Member Author

@JonWBedard JonWBedard Mar 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These aren't listed in .git/config, they come from metadata/git_config_extension. It is the order in this file (which is checked in) that matters. The nice thing about this approach is that it can allow other workflows to define additional remotes in .git/config and still have this logic work, but we can discuss alternatives here (like perhaps defining security levels in metadata/git_config_extension) if this implicit order concept isn't making sense to folks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know there is precedent for this pattern in the code, but I am scared at how magical the ordering is. It seems like any accidental reordering could lead to a leak of security content.

Would you mind putting a comment in metadata/git_config_extension explaining the significance of the ordering?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emw-apple and I discussed this offline, instead of relying on implicit ordering here, we're going to do #11251, which makes the security level an explicit part of the configuration. Once that PR lands, I'll adopt it here.

Comment on lines 228 to 231
with open(os.path.join(self.path, '.git', 'config'), 'a') as f:
f.write('[webkitscmpy "remotes"]\n')
f.write(' origin = git@example.org:project/project\n')
f.write(' security = git@example.org:project/project-security\n')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per my previous comment, what happens if you reverse origin and security? Will security be at a lower security level than origin?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it would. These are defined in metadata/git_config_extension, the unit tests using a definition in .git/config because our mocking isn't smart enough to follow this the way real git is:

[include]
	path = ../metadata/git_config_extension

Tools/Scripts/hooks/pre-push Outdated Show resolved Hide resolved
@JonWBedard JonWBedard force-pushed the eng/git-webkit-Add-pre-push-hook-to-prevent-publication-of-security-sensative-commits branch from 86d00a9 to 2e1fb34 Compare March 6, 2023 19:14
Copy link
Contributor

@emw-apple emw-apple left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still working my way through this, but leaving some comments to start with.

Tools/Scripts/hooks/pre-push Show resolved Hide resolved
Tools/Scripts/hooks/pre-push Show resolved Hide resolved
Tools/Scripts/hooks/pre-push Show resolved Hide resolved
@JonWBedard JonWBedard removed the skip-ews Applied to prevent a change from being run on EWS label Mar 7, 2023
Tools/Scripts/hooks/pre-push Show resolved Hide resolved
print('We have detected you are about to publish security content to a public remote')
print('Unless you are engaged in merge-back or fixing a security bug which has not shipped, you are likely making a mistake')
print('')
sys.stdout.write('Type "MAKE PUBLIC" to continue with publication: ')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the thing GitHub does for risky actions, where you have to type the type out the branch name you've selected, to mentally confirm that you're operating on the right branch. Typing "Make safari-XXXX.X-branch public" or similar seems like it would work well here.

SCRIPTS = os.path.dirname(os.path.dirname(LOCATION))
sys.path.append(SCRIPTS)

REMOTE_RE = re.compile(r'(?P<protcol>[^@:]+)(@|://)(?P<host>[^:]+)(/|:)(?P<path>\S+)(\.git)?/?')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could do something like:

url = 'git@github.com:WebKit/WebKit.git'
scheme = urllib.parse.urlsplit(url).scheme
if not scheme:
    url = 'ssh://' + url.replace(':', '/', 1)
if scheme == 'ssh':
    url = url.replace(':', '/', 1)
result = urllib.parse.urlsplit(url)

@JonWBedard JonWBedard force-pushed the eng/git-webkit-Add-pre-push-hook-to-prevent-publication-of-security-sensative-commits branch from 2e1fb34 to 424bb7b Compare March 7, 2023 21:22
@JonWBedard JonWBedard force-pushed the eng/git-webkit-Add-pre-push-hook-to-prevent-publication-of-security-sensative-commits branch from 424bb7b to 62c30ea Compare March 8, 2023 00:11
@JonWBedard JonWBedard added the skip-ews Applied to prevent a change from being run on EWS label Mar 8, 2023
result['fork'] = level
else:
result['{}-fork'.format(key.split('.')[-1])] = level
level += 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know there is precedent for this pattern in the code, but I am scared at how magical the ordering is. It seems like any accidental reordering could lead to a leak of security content.

Would you mind putting a comment in metadata/git_config_extension explaining the significance of the ordering?

Comment on lines 219 to 236
if match:
result['{}:{}'.format(match.group('host'), match.group('path'))] = level
if username:
team, repository = match.group('path').split('/', 1)
if team == repository:
forked_name = repository
elif '-' in repository:
forked_name = repository
else:
forked_name = '{}-{}'.format(repository, team)
result['{}:{}/{}'.format(match.group('host'), username, forked_name)] = level

result[str(key.split('.')[-1])] = level
if username:
if key.split('.')[-1] == 'origin':
result['fork'] = level
else:
result['{}-fork'.format(key.split('.')[-1])] = level
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, this is assumes the naming convention of forks that git-webkit sets up. And if a user has renamed their fork in GitHub or changed their remote name locally, it'll be caught by pre-push as having "no" security level.

Not leaking security content when someone renames their fork is obviously good, but I don't like putting more naming requirements on things which belong an engineer (namely, their repositories and local configuration). For example, I've mentioned to you before that I call my forks emw, because emw-apple is superfluous. It seems like I wouldn't be able to push security commits to a similarly-named private remote.

If we changed this, what would you think about:

  • using the GitHub API to check whether a repo is private or not
  • determining security level by identifying a known branch or tag on the remote

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emw-apple discussed this offline, and checking using GitHub's API is a good idea, but also an expensive thing to do every push. git-webkit setup owning the API calls, though, makes a lot of sense. So that's what the most recent iteration of this PR does. That lets us rip out all logic that treats different remote names differently, everything is based on remote URL.

@JonWBedard JonWBedard removed the skip-ews Applied to prevent a change from being run on EWS label Mar 8, 2023
@JonWBedard JonWBedard force-pushed the eng/git-webkit-Add-pre-push-hook-to-prevent-publication-of-security-sensative-commits branch from 62c30ea to e2536e3 Compare March 8, 2023 22:15
Copy link
Contributor

@emw-apple emw-apple left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r+ with the pending adoption of #11251

Comment on lines 274 to 282
# Any future commits (being ancesstors of this one) will nessesarily be on the same remote this commit is
if remotes_with[commit]:
break
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix typo, plus, isn't this the inverse of the above statement?

Suggested change
# Any future commits (being ancesstors of this one) will nessesarily be on the same remote this commit is
if remotes_with[commit]:
break
else:
# Any future commits (being ancestors of this one) will nessesarily be on the same remote this commit is
break

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite, I made this change, then realized it was a performance regression. We want to early exit the loop the moment we encounter a commit which has remotes, even if the commit in question isn't in the dictionary yet.

)
print('')
print('We have detected you are about to publish security content to a public remote')
print('Unless you are engaged in merge-back or fixing a security bug which has not shipped, you are likely making a mistake')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re "which has not shipped": How do contributors know this? If there's a silent "in Safari" in this sentence, then Apple engineers can determine whether a security bug has escaped, but I'm not aware of how to make that call more generally.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the short answer here is that a contributor which does not understand how various WebKit ports ship code will be unable to answer this question. I think that's fine, such a contributor would then ask for help (which they would need to post a security PR anyways, since the repository where one does that is not publicly accessible).

The longer answer is that if you are fixing a security bug that has just been introduced in the last week or two, it's a good bet it hasn't shipped yet. Past that, answering the question "has this bug shipped" is a hard question to answer that requires coordination between various WebKit ports, and is not likely to be answerable with public information. While I have Safari in mind when writing this comment, making the statement general was deliberate, it applies to other shipping ports as well.

branches = sorted(list(set([branch_for(commit) for commit in prompt_for])))
expected_response = 'publicize {}'.format(
branches[0] if len(branches) == 1
else '{} and {}'.format(', '.join(branches[:-1]), branches[-1]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: oxford comma

Suggested change
else '{} and {}'.format(', '.join(branches[:-1]), branches[-1]),
else '{}, and {}'.format(', '.join(branches[:-1]), branches[-1]),

return 1

if prompt_for:
branches = sorted(list(set([branch_for(commit) for commit in prompt_for])))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These look like unnecessary intermediate lists:

Suggested change
branches = sorted(list(set([branch_for(commit) for commit in prompt_for])))
branches = sorted({branch_for(commit) for commit in prompt_for]})

Comment on lines 336 to 342
sys.stdout.write('Type "{}" to continue with publication: '.format(expected_response))
sys.stdout.flush()
response = TTY.readline().rstrip()
if response != expected_response:
print_error('User failed to explicitly approve potentially dangerous push')
return 1
print("User has explicitly approved potentially dangerous push, continuing...")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few small ideas here:

  • Can we write to TTY, in case stdout is redirected somewhere weird?
  • To help users detect typos, we could confirm when a response is recognized but does not match.
Suggested change
sys.stdout.write('Type "{}" to continue with publication: '.format(expected_response))
sys.stdout.flush()
response = TTY.readline().rstrip()
if response != expected_response:
print_error('User failed to explicitly approve potentially dangerous push')
return 1
print("User has explicitly approved potentially dangerous push, continuing...")
TTY.write('Type "{}" to continue with publication: '.format(expected_response))
TTY.flush()
response = TTY.readline().rstrip()
if response != expected_response:
if response:
print("Response does not match prompt")
print_error('User failed to explicitly approve potentially dangerous push')
return 1
print("User has explicitly approved potentially dangerous push, continuing...")

SCRIPTS = os.path.dirname(os.path.dirname(LOCATION))
sys.path.append(SCRIPTS)

REMOTE_RE = re.compile(r'(?P<protcol>[^@:]+)(@|://)(?P<host>[^:]+)(/|:)(?P<path>\S+)(\.git)?/?')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I think this is easier to follow, and much easier to reason about the correctness of. Of course, it could be made easier to follow with comments, too. But I'm biased because I wrote it :)

Up to you to decide which is better for maintaining.

@JonWBedard JonWBedard force-pushed the eng/git-webkit-Add-pre-push-hook-to-prevent-publication-of-security-sensative-commits branch from e2536e3 to 3321a75 Compare March 10, 2023 21:04
@JonWBedard JonWBedard added the unsafe-merge-queue Applied to send a pull request to merge-queue, but skip building and testing label Mar 10, 2023
…sative commits

https://bugs.webkit.org/show_bug.cgi?id=253354
rdar://106216593

Reviewed by Elliott Williams.

Write a pre-push hook to block or prompt the user in 3 situations to prevent the
inadvertent publication of security sensative commits:
- Class 1: A commit exists on a remote more secure than the one a contributor is pushing to
- Class 2: A commit is a cherry-pick of a commit from a more secure remote
- Class 3: The commit references a security bug the target remote is public

The goal of this hook is to prevent class 1 and 2 without relying on code in the checkout, while
class 3 relies on webkitbugspy to determine if a linked issue is redacted.

* Tools/Scripts/hooks/pre-push: Added.
* Tools/Scripts/libraries/webkitscmpy/setup.py: Bump version.
* Tools/Scripts/libraries/webkitscmpy/webkitscmpy/__init__.py: Ditto.
* Tools/Scripts/libraries/webkitscmpy/webkitscmpy/mocks/local/git.py:
(Git): Add `git config --get-regexp`
* Tools/Scripts/libraries/webkitscmpy/webkitscmpy/program/clean.py:
(Clean.cleanup): Forward verbosity into `git push`.
(Clean.main): Ditto.
(DeletePRBranches.main): Ditto.
* Tools/Scripts/libraries/webkitscmpy/webkitscmpy/program/land.py:
(Land.main): Forward verbosity into `git push`.
* Tools/Scripts/libraries/webkitscmpy/webkitscmpy/program/publish.py:
(Publish.main): Change operating mode of our pre-push hook to allow class-1 security
publication with a prompt.
* Tools/Scripts/libraries/webkitscmpy/webkitscmpy/program/pull_request.py:
(PullRequest.create_pull_request): Forward verbosity into `git push`.
* Tools/Scripts/libraries/webkitscmpy/webkitscmpy/program/setup.py:
(Setup._security_levels): Provide a security level for source and fork remotes based on
the order of our source remotes.
(Setup.git): Pass arguments to template for our pre-push hook.
* Tools/Scripts/libraries/webkitscmpy/webkitscmpy/test/git_unittest.py:
* Tools/Scripts/libraries/webkitscmpy/webkitscmpy/test/setup_unittest.py:

Canonical link: https://commits.webkit.org/261526@main
@webkit-commit-queue webkit-commit-queue force-pushed the eng/git-webkit-Add-pre-push-hook-to-prevent-publication-of-security-sensative-commits branch from 3321a75 to 604395a Compare March 10, 2023 21:18
@webkit-commit-queue
Copy link
Collaborator

Committed 261526@main (604395a): https://commits.webkit.org/261526@main

Reviewed commits have been landed. Closing PR #11043 and removing active labels.

@webkit-commit-queue webkit-commit-queue merged commit 604395a into WebKit:main Mar 10, 2023
@webkit-commit-queue webkit-commit-queue removed the unsafe-merge-queue Applied to send a pull request to merge-queue, but skip building and testing label Mar 10, 2023
@JonWBedard JonWBedard deleted the eng/git-webkit-Add-pre-push-hook-to-prevent-publication-of-security-sensative-commits branch March 14, 2023 14:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Tools / Tests Tools in the Tools directory, build issues, test infrastructure, and bugs in test cases
Projects
None yet
5 participants