cmd/go: unclear how to cache transitive dependencies in a Docker image #27719

Open
wedow opened this Issue Sep 17, 2018 · 9 comments

wedow commented Sep 17, 2018

What version of Go are you using (go version)?

go version go1.11 linux/amd64

Does this issue reproduce with the latest release?

yes

What did you do?

I'm attempting to populate a Docker cache layer with compiled dependencies based on the contents of go.mod. The general recommendation with Docker is to use go mod download; however, this only caches sources.

go build all can be used to compile these sources but instead of relying on go.mod contents, it requires my application source to be present to determine which deps to build. This causes a cache invalidation on every code change and renders the step useless.

Here's a Dockerfile demonstrating my issue:

FROM golang:1.11-alpine
RUN apk add git

ENV CGO_ENABLED=0 GOOS=linux

WORKDIR /app

COPY go.mod go.sum ./

RUN go mod download

# this fails
RUN go build all
# => go: warning: "all" matched no packages

COPY . .

# this now works but isn't needed
RUN go build all

# compile app along with any unbuilt deps
RUN go build

From package lists and patterns:

> When using modules, "all" expands to all packages in the main module and their dependencies, including dependencies needed by tests of any of those.

where the main module is defined by the contents of go.mod (if I'm understanding this correctly).

Since "the main module's go.mod file defines the precise set of packages available for use by the go command", I would expect go build all to rely on go.mod and build any packages listed within.

Other actions which support "all" have this issue but some have flags which resolve it (go list -m all).

Contributor

davecheney commented Sep 18, 2018

wedow commented Sep 18, 2018

Thanks Dave, go build ./... is a bit of an improvement since it doesn't include the test dependencies that all does. However, it still requires my application source to be present and gives go: warning: "./..." matched no packages if run with only go.mod and go.sum present.

Contributor

davecheney commented Sep 18, 2018

wedow commented Sep 18, 2018

For sure. I've found in most previous projects that dependency build times are fast enough to not be an issue so in the end the existing behaviour is probably fine.

Part of my current project is the creation of a custom Terraform Provider for managing some of our internal systems. Building the Terraform packages only happens once locally so not a big deal, but they need to be rebuilt every time a new docker image is built. When these packages are already compiled, go build completes in under a second. When they need to be rebuilt from scratch, go build can take up to two minutes locally or longer on our CI servers.

Some time can be saved by using go mod download to cache the Terraform package sources but afaict there is no command to compile them after download without having our package main present for go build to determine what the dependencies actually are.

Based on the existing module documentation, I would expect the go.mod file to have an accurate list of required dependencies and for the toolchain to be able to rely on it in isolation.

We do similar things with projects in other languages for building Docker images. The flow is generally:

  1. Copy package manifest (Gemfile, package.json, etc.) into container
  2. Download dependency code and compile associated libraries (bundle, npm install, etc.)
  3. Copy the rest of our project source into container

This lets us avoid having to rebuild dependencies on every commit. It would be nice if this could be replicated with the Go module system. go mod download gets us halfway but doesn't allow caching of compilation artifacts.

Here's an example repo: https://github.com/wedow/docker-go-build

To see the issue we're having, clone it and run docker build ., then add a comment or something to main.go and run docker build . again. Ideally all deps would be built and cached prior to the COPY . . step and the final go build would be a sub-second operation.

bcmills added this to the Go1.12 milestone Sep 18, 2018

Member

myitcv commented Nov 13, 2018

I think what you're after here is:

go list -export $(go list -m)/...

> The -export flag causes list to set the Export field to the name of a file containing up-to-date export information for the given package.

This will populate the build cache (go env GOCACHE) with the results of compiling for the -export flag. The module cache ($GOPATH/pkg/mod) as you say contains the module-related caches.

If you want to install main packages too then:

go install $(go list -f '{{ $ip := .ImportPath}}{{if eq .Name "main"}}{{$ip}}{{end}}' $(go list -m)/...)

Member

bcmills commented Nov 13, 2018

> go build all can be used to compile these sources but instead of relying on go.mod contents, it requires my application source to be present to determine which deps to build.

Yes, that is working as designed: in module mode, all refers to the transitive imports of the packages in the main module, not the packages in its module dependencies. That's not going to change.

> This causes a cache invalidation on every code change and renders the step useless.

If the code changes are only in your .go source files, then only the cache entries for the packages containing those source files should be invalidated: the cache contents for the other transitive dependencies should be unaffected.

The build artifact cache is separate from the module cache: the former is controlled by GOCACHE (and defaults to a go-build directory under the user cache directory, $HOME/.cache/go-build on Linux), while the latter is the pkg/mod subdirectory of the first entry in GOPATH. You may need to set the GOCACHE environment variable to make sure it is within the container; see Build and test caching for details.

Can you confirm that both the build cache and the module cache are present and populated in your docker image after the first go build all?

bcmills changed the title from "cmd/go: go action all ignores go.mod file" to "cmd/go: unclear how to cache transitive dependencies in a Docker image" Nov 13, 2018

bcmills modified the milestones: Go1.12, Go1.13 Nov 13, 2018

wedow commented Nov 29, 2018

Thanks guys, I think there may be some confusion about which caches are being affected and when.

The issue is in how docker caches layers after each operation. When my source files are changed, all side effects which occur after the COPY . . line (such as populating GOCACHE) are lost. Those changes are isolated in a layer which has been invalidated and must be fully rebuilt.

The go list -export $(go list -m)/... command works great for populating GOCACHE but since it must come after COPY . ., it must be fully re-run during every build. go build all also has this issue.

I'm looking for a command which can compile the dependencies listed in the go.mod file in isolation so that it can occur before that COPY . . line that adds our sources to the container. Totally understand if that's not possible with the current module system. I may just experiment with parsing and building the deps separately.

Member

myitcv commented Dec 1, 2018

> The go list -export $(go list -m)/... command works great for populating GOCACHE but since it must come after COPY . ., it must be fully re-run during every build

I'm unclear why you say it must come after the copy - please can you explain?

> I'm looking for a command which can compile the dependencies listed in the go.mod file in isolation so that it can occur before that COPY . .

go list -export $(go list -m)/... should be all you need here. But let's first unravel the question above.

Member

bcmills commented Dec 1, 2018

@myitcv, note that -export may at some point do less than a full build. I don't think it's a perfect fit for the use-case.
