Skip to content

git url: support query form + support specifying checksum in query #5974

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

AkihiroSuda
Copy link
Member

@AkihiroSuda AkihiroSuda commented May 15, 2025

Split from:


e.g.,

https://github.com/moby/example.git?ref=v1.0.0&commit=deadbeef&subdir=/subdir

# alias of ref=refs/tags/v1.0.0
https://github.com/moby/example.git?tag=v1.0.0

# alias of ref=refs/heads/v1.0
https://github.com/moby/example.git?branch=v1.0

Fix #4905, but the syntax differs from the original proposal.

The document will be added to
https://github.com/docker/docs/blob/main/content/manuals/build/concepts/context.md#url-fragments

@tonistiigi
Copy link
Member

@AkihiroSuda Needs rebase after #5975

@AkihiroSuda

This comment was marked as resolved.

@AkihiroSuda AkihiroSuda marked this pull request as ready for review May 20, 2025 10:19
Copy link
Member

@tonistiigi tonistiigi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing that is weird atm is the URL parsing in LLB (client lib and identifier) vs Dockerfile. The query form should really be only for Dockerfile/dockerui. In LLB the parameters are sent with attribute values instead. For historical reasons, the current LLB is kind of a mix where #ref is still part of the URL from initial implementation, but for example protocol of the URL is always git:// and the real protocol is sent as an attribute.

Currently, it looks like in LLB client we parse a full URL (with querystring etc.) but then we drop everything but Host/Path and send the original URL as attr (that I think will container querystring). Also #ref is not even part of the URL the user passes in the client function, but is glued to the identifier internally.

Then on LLB receiver side, this now handles querystring in this PR in the URL in NewGitIdentifier (although it looks that it would require client side to error for the querystring to be passed at all there), but actually the checksum gets set later from pb.AttrGitChecksum.

In order to fix this, I think we should separate these concepts. Assuming that we do need some level of parser in the LLB side, I think having two separate packages might be too much so maybe just two parser functions where the Dockerfile one would parse the querystring and other Dockerfile specific things and then internally call the more basic parser. Looking at the current return type, there is already some weird things there, eg. IndistinguishableFromLocal seems to be only used in Dockerfile, and UnencryptedTCP doesn't seem to be used anywhere at all.

I'm also ok with changing the signature for llb.Git, eg. adding GitSubdir() (maybe even GitRef()) so it is the same as other git properties instead of requiring microformat and URL parsing as input that gets converted anyway. On the identifier parser side, we do need to keep backwards compatibility though (but could deprecate the old behavior).

repo.Subdir = u.Fragment.Subdir
if u.Opts != nil {
repo.Ref = u.Opts.Ref
repo.Checksum = u.Opts.Checksum
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the place where it is getting checksum from querystring in LLB while it really comes from pb.AttrGitChecksum

@tonistiigi
Copy link
Member

@AkihiroSuda Any update?

@AkihiroSuda
Copy link
Member Author

Rebased, but refactoring the parser may take some time. Unlikely in this week.

@AkihiroSuda
Copy link
Member Author

Split the commit git url: rename GitURLFragment to GitURLOpts to #6012 for ease of merging.

Is the commit git url: support query form safe to cherry-pick too, ahead of the refactoring of the parser?

@AkihiroSuda AkihiroSuda changed the title git url: support query form git url: support query form + support specifying checksum in query Jun 5, 2025
Fix issue 4905, but the syntax differs from the original proposal.

The document will be added to
https://github.com/docker/docs/blob/main/content/manuals/build/concepts/context.md#url-fragments

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Prior to this commit, `ADD https://github.com/moby/example.git?invalid=42`
was fetching a single HTML file without causing an error.

Now it returns an error `unexpected query "invalid"`.

Hint: use `git show --ignore-all-space` to review this commit.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
@AkihiroSuda AkihiroSuda modified the milestones: v0.23.0, v0.24.0 Jun 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Proposal: csv syntax for git repos
2 participants