Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go: build with vendor depends on modcache for checksums #46400

Closed
seankhliao opened this issue May 26, 2021 · 18 comments
Closed

cmd/go: build with vendor depends on modcache for checksums #46400

seankhliao opened this issue May 26, 2021 · 18 comments
Assignees
Labels
FixPending Issues that have a fix which has not yet been reviewed or submitted. GoCommand cmd/go modules NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@seankhliao
Copy link
Member

seankhliao commented May 26, 2021

What version of Go are you using (go version)?

$ go version
go version devel go1.17-e4615ad Wed May 26 13:25:43 2021 +0000 linux/amd64

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/arccy/.cache/go-build"
GOENV="/home/arccy/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/tmp/gomodcache.f4Wp"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/arccy/.data/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/home/arccy/sdk/gotip"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/home/arccy/sdk/gotip/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="devel go1.17-e4615ad Wed May 26 13:25:43 2021 +0000"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/arccy/tmp/trivy-mod-parse/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build3032020717=/tmp/go-build -gno-record-gcc-switches"

What did you do?

build a module with a vendor dir.
Whether or not the module sums are embedded depends on the module cache

$ export GOMODCACHE=$(mktemp -d -t gomodcache.XXXX)

$ go build -mod=vendor
$ go version -m trivy-mod-parse
trivy-mod-parse: devel go1.17-e4615ad Wed May 26 13:25:43 2021 +0000
	path	github.com/ebati/trivy-mod-parse
	mod	github.com/ebati/trivy-mod-parse	(devel)	
	dep	github.com/davecgh/go-spew	v1.1.1	
	dep	github.com/go-sql-driver/mysql	v0.0.0-00010101000000-000000000000
	=>	github.com/go-sql-driver/mysql	v1.5.0	

$ go mod tidy
go: downloading github.com/davecgh/go-spew v1.1.1
go: downloading github.com/go-sql-driver/mysql v1.5.0

$ go build -mod=vendor
$ go version -m trivy-mod-parse
trivy-mod-parse: devel go1.17-e4615ad Wed May 26 13:25:43 2021 +0000
	path	github.com/ebati/trivy-mod-parse
	mod	github.com/ebati/trivy-mod-parse	(devel)	
	dep	github.com/davecgh/go-spew	v1.1.1	h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
	dep	github.com/go-sql-driver/mysql	v0.0.0-00010101000000-000000000000
	=>	github.com/go-sql-driver/mysql	v1.5.0	h1:ozyZYNQW3x3HtqT1jira07DN2PArx2v7/mN66gGcHOs=

What did you expect to see?

build output not affected by module cache contents

What did you see instead?

checksums only embedded when module cache is populated


cc @bcmills @jayconrod @matloob

@jerbob92
Copy link

This is quite the issue for us, since there is also no way to skip adding the module info to the binary.

@bcmills
Copy link
Contributor

bcmills commented Oct 26, 2022

If your go.sum file is tidy, go build -mod=vendor should be using the checksums from there.
If it isn't tidy (missing entries), perhaps we should emit a clear error message and fail the build with -mod=vendor?

(FWIW, I suspect that the issue in the original post is due to a bad interaction with replace.)

@caarlos0
Copy link
Contributor

clarifying: the issue (or one of the issues) is that you can go mod vendor, edit the dependencies (or manually put any new ones) in ./vendor, and then go build, and go version will still show the the dependencies as "legit" (without the manual edits)

@jerbob92
Copy link

jerbob92 commented Oct 26, 2022

@bcmills the file is tidy, the only difference is having the dependencies in your ~/go folder (which will be filled when you run go mod tidy) and your vendor folder or only in your vendor folder. Here is the go version -m output on a build when having the dependencies in ~/go:

entc: go1.18
	path	command-line-arguments
	dep	ariga.io/atlas	v0.8.0	h1:iSNmWLCoI1i4Zu9Gqnu8iWgO4ZLNcR42/d6r78DHMQI=
	dep	entgo.io/contrib	v0.3.3-0.20220909110541-462c4e569873	h1:VqkwKyPnxLSoYJdjT6QWzoh4Be2QOB2JF2rUJus4EpY=
	dep	entgo.io/ent	v0.11.4-0.20221001062602-1029a2d3ba2a	h1:T28WZZUdeJb7DQVQNnZkr3pasIdDrDVC41eQIV0hvTU=
	dep	github.com/99designs/gqlgen	v0.17.20	h1:O7WzccIhKB1dm+7g6dhQcULINftfiLSBg2l/mwbpJMw=
	dep	github.com/agext/levenshtein	v1.2.1	h1:QmvMAjj2aEICytGiWzmxoE0x2KZvE0fvmqMOfy2tjT8=
	dep	github.com/agnivade/levenshtein	v1.1.1	h1:QY8M92nrzkmr798gCo3kmMyqXFzdQVpxLlGPRBij0P8=
	dep	github.com/apparentlymart/go-textseg/v13	v13.0.0	h1:Y+KvPE1NYz0xl601PVImeQfFyEy6iT90AvPUL1NNfNw=
	dep	github.com/fsnotify/fsnotify	v1.5.4	h1:jRbGcIw6P2Meqdwuo0H1p6JVLbL5DHKAKlYndzMwVZI=
	dep	github.com/gin-contrib/sse	v0.1.0	h1:Y/yl/+YNO8GZSjAhjMsSuLt29uWRFHdHYUb5lYOV9qE=
	dep	github.com/gin-gonic/gin	v1.8.1	h1:4+fr/el88TOO3ewCmQr8cx/CtZ/umlIRIs5M4NTNjf8=
	dep	github.com/go-logr/logr	v1.2.3	h1:2DntVwHkVopvECVRSlL5PSo9eG+cAkDCuckLubN+rq0=
	dep	github.com/go-logr/stdr	v1.2.2	h1:hSWxHoqTgW2S2qGc0LTAI563KZ5YKYRhT3MFKZMbjag=

... etc ...

And here the go version -m output on a build after having removed ~/go:

entc: go1.18
	path	command-line-arguments
	dep	ariga.io/atlas	v0.8.0	
	dep	entgo.io/contrib	v0.3.3-0.20220909110541-462c4e569873	
	dep	entgo.io/ent	v0.11.4-0.20221001062602-1029a2d3ba2a	
	dep	github.com/99designs/gqlgen	v0.17.20	
	dep	github.com/agext/levenshtein	v1.2.1	
	dep	github.com/agnivade/levenshtein	v1.1.1	
	dep	github.com/apparentlymart/go-textseg/v13	v13.0.0	
	dep	github.com/fsnotify/fsnotify	v1.5.4	
	dep	github.com/gin-contrib/sse	v0.1.0	
	dep	github.com/gin-gonic/gin	v1.8.1	
	dep	github.com/go-logr/logr	v1.2.3	
	dep	github.com/go-logr/stdr	v1.2.2	

... etc ...

Conclusion: when you don't have the dependencies in ~/go, it will not insert the sums.

@jerbob92
Copy link

I did some more research into this, the build info is built by this function: https://github.com/golang/go/blob/master/src/cmd/go/internal/load/pkg.go#L2269

It uses modfetch.Sum to get the Sum of a module, but modfetch.Sum only returns a Sum if it's available in the cache, it also says so in the comments:

// Sum returns the checksum for the downloaded copy of the given module,
// if present in the download cache.

ziphash, err := CachePath(mod, "ziphash")

I have added some debugging and the returned ziphash path is something like this:
/home/jerbob92/go/pkg/mod/cache/download/gopkg.in/yaml.v3/@v/v3.0.1.ziphash

It then tries to open that file and doesn't exist ofcourse to then lockedfile.Read results in an error:

lockedfile error for gopkg.in/yaml.v3@v3.0.1: open /home/jerbob92/go/pkg/mod/cache/download/gopkg.in/yaml.v3/@v/v3.0.1.ziphash: no such file or directory

@deitch
Copy link

deitch commented Oct 28, 2022

Aha! I have been hunting this down for a little while, started with what I thought was bugs in SBoM software, eventually leading here.

Is it fair to say that this is an error in how it processes and loads the modules? One of the points of go.sum is to validate that this desired version is the actual version being used (by checking sha sums), but since it has it (in this case from vendor/) and builds with it, should it not be able to do one of the following?

  • take the sha from the go.sum file; OR
  • take the sha from the vendor/ directory (or however it is using the content there to validate that it fits the go.sum hash)

In my case, I cannot run go mod tidy before running go build, as I am relying on vendor/ to build offline.

In any case, is there some way I can assist?

@seankhliao
Copy link
Member Author

vendor directories don't contain enough data to verify against checksums, ref #27348

@deitch
Copy link

deitch commented Oct 28, 2022

How interesting. Thanks for the helpful reference.

Does that mean that when I go build -mod=vendor that it does not validate hashes from go.sum, unlike a non-vendored go build? I guess it must, based on the linked issue.

So what is the right approach for combining:

  • Module information embedding provided in the build and hash sums
  • Vendor preset and predownloaded modules (possibly with the assumption of no network access during build)?

@bcmills
Copy link
Contributor

bcmills commented Oct 28, 2022

Probably for now when we are building with -mod=vendor we should embed the module information (from vendor/modules.txt) but not the checksum information (because we don't in general know that the vendor directory still matches the checksums).

@bcmills bcmills added NeedsFix The path to resolution is known, but the work has not been done. help wanted labels Oct 28, 2022
@gopherbot gopherbot removed the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Oct 28, 2022
@caarlos0
Copy link
Contributor

@bcmills sounds like a good compromise yes

@jerbob92
Copy link

@bcmills does it actually verify it for non-vendor builds though?
I also think that not embedding the checksums with mod=vendor (the default when there is a vendor directory) is a good compromise for now.

@bcmills
Copy link
Contributor

bcmills commented Oct 28, 2022

does it actually verify it for non-vendor builds though?

For non-vendor builds, checksums are verified when modules are first downloaded to the module cache and when the user invokes go mod verify explicitly. (We trust that once a module has been downloaded it will not be modified, but of course it's fine to run go mod verify explicitly to check for filesystem corruption.)

@jerbob92
Copy link

@bcmills If it doesn't verify it at build, adding the ziphash guarantees about as much as the go.sum hash, so in that case you might not want to add the checksum at all. Because why trust the user not editing the cache folder, but not trust it with not editing the vendor folder?

@deitch
Copy link

deitch commented Oct 30, 2022

@jerbob92 captured my thoughts really well (thanks).

If when we build w/o -mod=vendor we verify checksums build time but not when we build with -mod=vendor, then I could see a good reason to have a difference between including checksums and not.

However, if the build-time verification is identical - no verification - but just download time, then we should include the checksums both times as well.

An alternative might be to verify build time for both, and therefore include enough in vendor/ to be able to verify the checksums, but I am aware that is a pretty significant change to how vendoring works (and growth in size of every vendor/ dir).

Beyond the "how do we fix this", what is the right approach using the current state of tooling to get checksums into the compiled binary? Is it to ensure that the go mod cache is properly populated before build, even if vendor/ is in place? So if I clone some repo that has checked-in vendor/, I run go mod download or similar before doing build? And if so, how do I handle offline builds (which is one of the key reasons people use vendor/)?

@bcmills
Copy link
Contributor

bcmills commented Nov 1, 2022

@jerbob92

why trust the user not editing the cache folder, but not trust it with not editing the vendor folder?

The module cache is per-user (or per-machine), verified at download time, and by default has read-only permissions.
The vendor folder is generally committed to a shared repository, is not verified, and has the same permissions as other ordinary files in the repo.

They are not equivalent.

@deitch
Copy link

deitch commented Nov 1, 2022

They are not equivalent

That is fair enough. Your point about, "we shouldn't include the hash unless we can verify with what we have at hand" is eminently sensible, although silently not verifying at all if there is no cache is a bit strange behaviour. It means that go build and go build -mod=vendor provide fundamentally different guarantees (or one provides and one does not, or more correctly, one provides always, and one provides only when the go mod cache is there and ignores it when not). That would surprise users, I would think (well, at least to me 😁 ).

What would be the right path to getting equivalency? And what would be the right path to using and verifying and including hashes in vendor/ed environments, especially when there is no ability to download the cache go mod download prior to go build -mod=vendor?

@bcmills bcmills modified the milestones: Backlog, Go1.22 Jun 20, 2023
@bcmills bcmills self-assigned this Jun 20, 2023
@bcmills bcmills modified the milestones: Go1.22, Go1.23 Jan 18, 2024
@bcmills
Copy link
Contributor

bcmills commented Feb 8, 2024

@samthanawalla, this might be another good one for you to look into while you're working on #52792.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/564195 mentions this issue: cmd/go: Do not embed checksums when building with vendor.

@bcmills bcmills assigned samthanawalla and unassigned bcmills Feb 15, 2024
@samthanawalla samthanawalla added the FixPending Issues that have a fix which has not yet been reviewed or submitted. label Feb 15, 2024
ezz-no pushed a commit to ezz-no/go-ezzno that referenced this issue Feb 18, 2024
Fixes golang#46400

Tested: Ran go test cmd/go
Change-Id: I60655129c55d40a70a13ed23937ef990f315fd73
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest,gotip-windows-amd64-longtest
Reviewed-on: https://go-review.googlesource.com/c/go/+/564195
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Sam Thanawalla <samthanawalla@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Commit-Queue: Sam Thanawalla <samthanawalla@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
FixPending Issues that have a fix which has not yet been reviewed or submitted. GoCommand cmd/go modules NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

7 participants