cmd/go: avoid work to compute unused fields in 'go list' #29666

Open
hyangah opened this Issue Jan 10, 2019 · 6 comments

hyangah commented Jan 10, 2019

We want to get the list of available versions of a module
and the best way is to use go list -m --versions <module>.

Under the hood, the command retrieves the tag list (which we expected)
and also fetches some versions (which is surprising).

Our guess is that go list does this to find information such as the
latest version's commit time. For our use case, we don't need that
information, and the extra fetch slows go list down, especially
when the repository is large.

Can we have a flag to skip the fetch?

An alternative mentioned by @bcmills is to make go list smarter so that it
performs only the steps necessary to fill in the requested information (e.g.
the fields mentioned in -f, or an additional option to specify which fields
to prune if -json is requested).
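
As a rough illustration of the "fields mentioned in -f" idea, the sketch below walks a parsed text/template tree and collects the top-level field names it references. It is not how cmd/go works today; referencedFields and walk are hypothetical names, and the approach assumes the format string uses no custom template functions. The result is deliberately an over-approximation: a bare "." or "$" is treated as "everything is needed".

```go
package main

import (
	"fmt"
	"sort"
	"text/template"
	"text/template/parse"
)

// referencedFields (hypothetical helper) reports which top-level fields a
// -f style format string mentions. all is true when the template uses the
// whole value (a bare "." or "$"), in which case nothing can be pruned.
func referencedFields(format string) (fields []string, all bool, err error) {
	tmpl, err := template.New("format").Parse(format)
	if err != nil {
		return nil, false, err
	}
	seen := map[string]bool{}
	if tmpl.Tree != nil {
		walk(tmpl.Tree.Root, seen, &all)
	}
	for name := range seen {
		fields = append(fields, name)
	}
	sort.Strings(fields)
	return fields, all, nil
}

// walk visits the parse tree and records the first identifier of every
// field reference. Nested scopes (with/range) are not tracked, so names
// seen inside them are also recorded as if they were top-level.
func walk(n parse.Node, seen map[string]bool, all *bool) {
	switch n := n.(type) {
	case *parse.ListNode:
		if n == nil {
			return
		}
		for _, c := range n.Nodes {
			walk(c, seen, all)
		}
	case *parse.ActionNode:
		walk(n.Pipe, seen, all)
	case *parse.IfNode:
		walkBranch(&n.BranchNode, seen, all)
	case *parse.RangeNode:
		walkBranch(&n.BranchNode, seen, all)
	case *parse.WithNode:
		walkBranch(&n.BranchNode, seen, all)
	case *parse.TemplateNode:
		walk(n.Pipe, seen, all)
	case *parse.PipeNode:
		if n == nil {
			return
		}
		for _, c := range n.Cmds {
			walk(c, seen, all)
		}
	case *parse.CommandNode:
		for _, a := range n.Args {
			walk(a, seen, all)
		}
	case *parse.ChainNode:
		walk(n.Node, seen, all)
	case *parse.FieldNode:
		seen[n.Ident[0]] = true
	case *parse.VariableNode:
		// "$" is the root value; "$.Versions" references a top-level field.
		if n.Ident[0] == "$" {
			if len(n.Ident) > 1 {
				seen[n.Ident[1]] = true
			} else {
				*all = true
			}
		}
	case *parse.DotNode:
		*all = true
	}
}

// walkBranch handles the shared shape of if, range, and with nodes.
func walkBranch(b *parse.BranchNode, seen map[string]bool, all *bool) {
	walk(b.Pipe, seen, all)
	walk(b.List, seen, all)
	walk(b.ElseList, seen, all)
}

func main() {
	for _, f := range []string{"{{.Versions}}", "{{.Path}} {{.Replace.Path}}", "{{.}}"} {
		fields, all, err := referencedFields(f)
		fmt.Println(f, fields, all, err)
	}
}
```

With a set like this in hand, a command could in principle skip work (such as fetching commit times) for fields that never appear in the template.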

$ GO111MODULE=on tipgo list -x -m -f '{{.Versions}}' --versions rsc.io/quote

mkdir -p $GOPATH/pkg/mod/cache/vcs # git2 https://github.com/rsc/quote
# lock $GOPATH/pkg/mod/cache/vcs/$HASH.lock
mkdir -p $GOPATH/pkg/mod/cache/vcs/$HASH # git2 https://github.com/rsc/quote
cd $GOPATH/pkg/mod/cache/vcs/$HASH; git init --bare
0.006s # cd $GOPATH/pkg/mod/cache/vcs/$HASH; git init --bare
cd $GOPATH/pkg/mod/cache/vcs/$HASH; git remote add origin https://github.com/rsc/quote 
0.004s # cd $GOPATH/pkg/mod/cache/vcs/$HASH; git remote add origin https://github.com/rsc/quote
cd $GOPATH/pkg/mod/cache/vcs/$HASH; git ls-remote -q origin
0.298s # cd $GOPATH/pkg/mod/cache/vcs/$HASH; git ls-remote -q origin
cd $GOPATH/pkg/mod/cache/vcs/$HASH; git cat-file --batch
0.004s # cd $GOPATH/pkg/mod/cache/vcs/$HASH; git cat-file --batch

cd $GOPATH/pkg/mod/cache/vcs/$HASH; git fetch -f origin refs/tags/v2.0.0:refs/tags/v2.0.0 refs/tags/v2.0.1:refs/tags/v2.0.1 refs/tags/v3.0.0:refs/tags/v3.0.0
0.365s # cd $GOPATH/pkg/mod/cache/vcs/$HASH; git fetch -f origin refs/tags/v2.0.0:refs/tags/v2.0.0 refs/tags/v2.0.1:refs/tags/v2.0.1 refs/tags/v3.0.0:refs/tags/v3.0.0

cd $GOPATH/pkg/mod/cache/vcs/$HASH; git cat-file --batch
0.004s # cd $GOPATH/pkg/mod/cache/vcs/$HASH; git cat-file --batch
go: finding rsc.io/quote v1.5.2
cd $GOPATH/pkg/mod/cache/vcs/$HASH; git tag -l
0.003s # cd $GOPATH/pkg/mod/cache/vcs/$HASH; git tag -l
cd $GOPATH/pkg/mod/cache/vcs/$HASH; git -c log.showsignature=false log -n1 '--format=format:%H %ct %D' refs/tags/v1.5.2
0.004s # cd $GOPATH/pkg/mod/cache/vcs/$HASH; git -c log.showsignature=false log -n1 '--format=format:%H %ct %D' refs/tags/v1.5.2

[v1.0.0 v1.1.0 v1.2.0 v1.2.1 v1.3.0 v1.4.0 v1.5.0 v1.5.1 v1.5.2 v1.5.3-pre1]

@bcmills @heschik @katiehockman @jayconrod


heschik commented Jan 10, 2019

From what I can piece together, Hana's example above is particularly poor, but probably unfixable. It's looking for a v1 module, but it needs to download every non-v1 version too to make sure that they all have a go.mod. (If one didn't, then it'd be +incompatible and a valid answer for the query.)

Previously we had been looking at the way it behaves when fetching rsc.io/quote/v2, or a v1 module from a repo that doesn't have v2 yet. That can probably be improved.
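
To make the first point concrete, here is a minimal sketch of the selection it implies: when querying a module path without a /vN suffix, tags with major version 2 or higher are the ones that would have to be inspected for a go.mod file. The helper names major and needsGoModCheck are hypothetical, and the version parsing is simplified; it is not the logic cmd/go actually uses.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// major (hypothetical helper) extracts the major version number from a tag
// such as "v2.0.1". It reports false for tags that do not look like "vN...".
func major(tag string) (int, bool) {
	if !strings.HasPrefix(tag, "v") {
		return 0, false
	}
	head, _, _ := strings.Cut(tag[1:], ".")
	n, err := strconv.Atoi(head)
	if err != nil || n < 0 {
		return 0, false
	}
	return n, true
}

// needsGoModCheck reports whether a tag could be a +incompatible version of
// a module whose path has no /vN suffix, and so would need its go.mod status
// checked. This assumes only major versions >= 2 can be +incompatible.
func needsGoModCheck(tag string) bool {
	n, ok := major(tag)
	return ok && n >= 2
}

func main() {
	tags := []string{"v1.5.2", "v2.0.0", "v2.0.1", "v3.0.0"}
	for _, t := range tags {
		fmt.Println(t, needsGoModCheck(t))
	}
}
```

Applied to the tags in the transcript above, this selects v2.0.0, v2.0.1, and v3.0.0, which matches the refs named in the git fetch command.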


bcmills commented Jan 11, 2019

I would rather do the field-based pruning than add a separate flag. Field-based pruning gives us the opportunity to eliminate a lot of different kinds of redundant work, whereas a “skip timestamp fetching” flag would have a very narrow benefit.


bcmills commented Jan 11, 2019

One nice option for the JSON field-pruning might be to allow the -json flag to accept a comma-separated list of fields with an explicit = connector:

go list -m -json=Path,Versions,Replace.Path --versions rsc.io/quote
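
A minimal sketch of how such a field list might be interpreted, assuming the proposed -json=... syntax: the value is split on commas, dotted paths are reduced to their top-level field, and an empty list means "all fields" (plain -json). parseFields and needs are hypothetical names, not part of cmd/go.

```go
package main

import (
	"fmt"
	"strings"
)

// parseFields (hypothetical) turns "Path,Versions,Replace.Path" into the set
// of top-level fields that must be populated. An empty spec yields an empty
// set, which needs treats as "every field is requested".
func parseFields(spec string) map[string]bool {
	fields := map[string]bool{}
	for _, p := range strings.Split(spec, ",") {
		p = strings.TrimSpace(p)
		if p == "" {
			continue
		}
		top, _, _ := strings.Cut(p, ".")
		fields[top] = true
	}
	return fields
}

// needs reports whether the named top-level field was requested.
func needs(fields map[string]bool, name string) bool {
	return len(fields) == 0 || fields[name]
}

func main() {
	fields := parseFields("Path,Versions,Replace.Path")
	// A loader could consult needs before doing expensive work, for
	// example skipping the commit-time lookup when Time is not requested.
	for _, name := range []string{"Path", "Versions", "Replace", "Time"} {
		fmt.Println(name, needs(fields, name))
	}
}
```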

bcmills commented Jan 11, 2019

> It's looking for a v1 module, but it needs to download every non-v1 version too to make sure that they all have a go.mod.

It occurs to me that this could interact with caching in an interesting way: if you're only looking for new versions and you know when you last checked the repo, then it's probably safe to assume that any tag that was +incompatible still is, and any tag that was a module still is. So perhaps some sort of flag to restrict the result to “tags added since timestamp T” would avoid the need for that fetch too.
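
A sketch of the caching idea, with one substitution stated plainly: instead of a timestamp T, it keys on tag names remembered from a previous run, since that needs no assumption about how tag creation times would be obtained. The cache shape and the name tagsToInspect are hypothetical. As the comment above notes, this relies on the assumption that a tag's go.mod status does not change once observed.

```go
package main

import "fmt"

// tagsToInspect (hypothetical) returns the remote tags that are not yet in
// the cache. cache maps a tag name to whether that tag was previously found
// to contain a go.mod file; cached tags are assumed to be unchanged.
func tagsToInspect(remote []string, cache map[string]bool) []string {
	var fresh []string
	for _, tag := range remote {
		if _, known := cache[tag]; !known {
			fresh = append(fresh, tag)
		}
	}
	return fresh
}

func main() {
	cache := map[string]bool{"v2.0.0": true, "v2.0.1": true}
	remote := []string{"v2.0.0", "v2.0.1", "v3.0.0"}
	// Only tags missing from the cache would need to be fetched and checked.
	fmt.Println(tagsToInspect(remote, cache))
}
```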


jayconrod commented Jan 11, 2019

I really like the idea of -json accepting a list of fields that are actually needed. I think we should do that on the regular go list (not -m) side as well.

bcmills changed the title from "cmd/go: allow 'go list -m --versions' to skip vcs-fetching" to "cmd/go: avoid work to compute unused fields in 'go list'" on Jan 17, 2019


bcmills commented Jan 17, 2019

Retitled to address the general problem: we won't be adding a specific flag to avoid VCS fetches, but we certainly can optimize the work that we do based on the requested output.
