Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor git ssh url parsing #4142

Merged
merged 4 commits into from
Aug 22, 2023
Merged

Conversation

jedevc
Copy link
Member

@jedevc jedevc commented Aug 13, 2023

This is an alternative to the solution proposed in #4069, see #4069 (comment) (cc @aaronlehmann)

Wherever possible, we should parse SSH urls using url.Parse. See the updated tests for the exact changes in behavior:

  • Don't strip the username during parsing (I'm not 100% sure why we did this in the first place - maybe I'm missing something obvious?)
  • Normalize SCP-style git urls (username@host:org/repo) into ssh:// URLs.

@jedevc jedevc marked this pull request as ready for review August 13, 2023 15:18
@jedevc jedevc force-pushed the refactor-git-ssh-url-parsing branch from e4df209 to 6c78917 Compare August 14, 2023 13:58
util/gitutil/git_protocol.go Outdated Show resolved Hide resolved
util/gitutil/git_protocol.go Outdated Show resolved Hide resolved
util/gitutil/git_protocol.go Outdated Show resolved Hide resolved
@jedevc jedevc force-pushed the refactor-git-ssh-url-parsing branch 2 times, most recently from 00a04cb to 1eec8a9 Compare August 14, 2023 15:58
@alexcb
Copy link
Collaborator

alexcb commented Aug 14, 2023

Don't strip the username during parsing (I'm not 100% sure why we did this in the first place - maybe I'm missing something obvious?)

This was done in order to construct a consistent ID (that's passed to NewState(...) that didn't depend on the protocol or username, e.g.

"https://github.com/moby/buildkit", "git@github.com/moby/buildkit", and "https://username:token@github.com/moby/buildkit" all refer to the same repo, and in order to re-use the cache they should all have the same ID.

This reason was ever-so-briefly described in the old code's comment:

			// keep remote consistent with http(s) version
			remote = parts[0] + "/" + parts[1]

There might additionally be a security concern now that remote contains a username/password (or in the case of github, they actually allow a token to be passed as the username) -- credentials shouldn't be stored in the cache ID.

@jedevc
Copy link
Member Author

jedevc commented Aug 15, 2023

This was done in order to construct a consistent ID (that's passed to NewState(...) that didn't depend on the protocol or username

There might additionally be a security concern now that remote contains a username/password (or in the case of github, they actually allow a token to be passed as the username) -- credentials shouldn't be stored in the cache ID.

This should be ok I think - the cache key for git sources is computed from the git sha (see #2091)

func (gs *gitSourceHandler) CacheKey(ctx context.Context, g session.Group, index int) (string, string, solver.CacheOpts, bool, error) {

The identifier itself isn't used to compute the cache, so it should never end up in the cache id, it's only ever recorded in the op. Even then, we wouldn't be able to execute two different urls concurrently, since the vertex digests used for this are a checksum of the entire source definition (which will include the full URL).

I think we can keep the consistent ID, but I'm fairly sure that it won't actually change the caching/concurrent execution behavior.

@jedevc jedevc requested a review from crazy-max August 16, 2023 09:24
@jedevc jedevc force-pushed the refactor-git-ssh-url-parsing branch from 84b510f to 38b8565 Compare August 16, 2023 09:45
@jedevc jedevc force-pushed the refactor-git-ssh-url-parsing branch 2 times, most recently from 8b56a4b to a8d926a Compare August 22, 2023 10:36
jedevc and others added 4 commits August 22, 2023 11:36
Signed-off-by: Justin Chadwell <me@jedevc.com>
Signed-off-by: Justin Chadwell <me@jedevc.com>
This should also resolve the ports parsing issue.

Co-authored-by: Aaron Lehmann <alehmann@netflix.com>
Signed-off-by: Justin Chadwell <me@jedevc.com>
Signed-off-by: Justin Chadwell <me@jedevc.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants