cmd/go: precompile dependencies for docker image cache #45474

Open
hugbubby opened this issue Apr 9, 2021 · 9 comments

Comments


@hugbubby hugbubby commented Apr 9, 2021

This issue may reveal some fundamental misunderstanding I have about golang/compilers, but here goes:

Some of the dependencies for a monolithic golang program on our backend are quite large. Several of them use code generation as a generics workaround and are large enough to force us to use internal gitlab runners because of OOM errors. Currently we use go mod download as a build step in containerizing it, so that downloading those dependencies is cached. However, as far as I can tell, there's no way to run the equivalent of go get for just a go.mod and go.sum file; we can download, but not compile these dependencies that mostly stay the same from build to build.

This means that every time we build our backend golang program we have to recompile all of the libraries that program uses from scratch, instead of caching that as a build step and only recompiling when we change our go.mod/go.sum files.

I'd like a go command (if it doesn't exist already, and doesn't admit some fundamental misunderstanding of how libraries get compiled) to get much of this work done before we COPY our source code into the docker image. Maybe this would just be a modification of go get that ignored the fact that there was no source code in the working directory when a go.mod file exists specifying all of the libraries used for a package. Such a command would shave a lot of build time in our CI.

@ianlancetaylor ianlancetaylor changed the title Precompile dependencies for docker image cache cmd/go: precompile dependencies for docker image cache Apr 9, 2021
@ianlancetaylor ianlancetaylor added this to the Backlog milestone Apr 9, 2021

@ianlancetaylor ianlancetaylor commented Apr 9, 2021


@seankhliao seankhliao commented Apr 9, 2021

Without the source code to list your actual dependencies (as opposed to everything that's possibly reachable from the module graph, including test dependencies of your dependencies etc...), you're going to massively overbuild. I think the closest you can do today might be the following, with the caveats that each module is built independently so shared dependencies at different versions will be built multiple times, and it also includes trimmed out dependencies.

for p in $(go list -m -f '{{ if not .Main }}{{ .Path }}/...@{{ .Version }}{{ end}}' all); do
    go install "$p"
done
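
To make it easier to see what the loop above feeds to go install, here is a minimal Go sketch that renders the same -f template over a few made-up modules. The `module` struct, the `installPatterns` helper, and the example.com paths are assumptions for illustration only; the real `go list -m` output has more fields, and only `Path`, `Version`, and `Main` are read by the template.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// module mirrors only the fields of `go list -m` output that the
// template reads. It is a stand-in for illustration, not the real type.
type module struct {
	Path    string
	Version string
	Main    bool
}

// listTemplate is the same template string passed to -f in the loop.
const listTemplate = `{{ if not .Main }}{{ .Path }}/...@{{ .Version }}{{ end}}`

// installPatterns renders the template once per module and drops the
// empty result produced for the main module.
func installPatterns(mods []module) ([]string, error) {
	tmpl, err := template.New("list").Parse(listTemplate)
	if err != nil {
		return nil, err
	}
	var out []string
	for _, m := range mods {
		var sb strings.Builder
		if err := tmpl.Execute(&sb, m); err != nil {
			return nil, err
		}
		if s := sb.String(); s != "" {
			out = append(out, s)
		}
	}
	return out, nil
}

func main() {
	// Hypothetical build list: one main module and two dependencies.
	mods := []module{
		{Path: "example.com/backend", Main: true},
		{Path: "example.com/big/generated", Version: "v1.4.0"},
		{Path: "example.com/small/util", Version: "v0.2.1"},
	}
	patterns, err := installPatterns(mods)
	if err != nil {
		panic(err)
	}
	for _, p := range patterns {
		fmt.Println("go install", p)
	}
}
```

Each pattern covers every package in a module, which is where the overbuilding mentioned above comes from.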

You may have better luck using docker buildx to mount a cache in (caveat moby/buildkit#1512)

#syntax=docker/dockerfile:1.2
FROM golang:alpine AS build
WORKDIR /workspace
COPY . .
RUN --mount=type=cache,id=gomod,target=/go/pkg/mod \
    --mount=type=cache,id=gobuild,target=/root/.cache/go-build \
    go build -o app .


@jayconrod jayconrod commented Apr 9, 2021

I don't think any new feature in the go command is needed. The build cache can be populated with go build, go install, or maybe go list -export -test. You'll need a list of packages to cache or a package that transitively imports those packages.
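
One way to get "a package that transitively imports those packages" is a small stub package made only of blank imports, which can be copied into the image together with go.mod and go.sum and built before the real sources arrive. The sketch below generates such a stub; the `depsStub` helper, the package name `deps`, and the example.com import paths are hypothetical, and the list of import paths still has to come from somewhere (maintained by hand or generated where the sources are available). Blank-importing a `main` package or another module's internal package will not compile, so those would need to be left out.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// depsStub returns the source of a Go file that blank-imports every
// given import path. Duplicates and empty strings are dropped and the
// imports are sorted so the output is stable between runs.
func depsStub(pkgName string, importPaths []string) string {
	seen := make(map[string]bool)
	var paths []string
	for _, p := range importPaths {
		if p != "" && !seen[p] {
			seen[p] = true
			paths = append(paths, p)
		}
	}
	sort.Strings(paths)

	var sb strings.Builder
	fmt.Fprintf(&sb, "package %s\n\nimport (\n", pkgName)
	for _, p := range paths {
		fmt.Fprintf(&sb, "\t_ %s\n", strconv.Quote(p))
	}
	sb.WriteString(")\n")
	return sb.String()
}

func main() {
	// Hypothetical heavy dependencies worth compiling ahead of time.
	fmt.Print(depsStub("deps", []string{
		"example.com/big/generated",
		"example.com/small/util",
	}))
}
```

Building the stub with go build in an early image layer should populate the build cache for those packages; that layer only needs to be rebuilt when go.mod, go.sum, or the stub change.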

I'm marking this as a documentation issue though since we could really use a guide on efficiently building container images.


@hugbubby hugbubby commented Apr 9, 2021

> You may have better luck using docker buildx to mount a cache in (caveat moby/buildkit#1512)

Had never heard of docker buildx. Unfortunately our CI uses kaniko and it doesn't support those extra flags ;-; GoogleContainerTools/kaniko#969.


@seankhliao seankhliao commented Apr 9, 2021

You could try a different approach with kaniko: specify a directory under /var/run as the build cache and mount it as a volume to be shared/reused between builds (could also do the same for module cache)


@andig andig commented Apr 15, 2021

@jayconrod I'd like the ability to create the list of modules on demand instead of a manual maintenance task. Seems there are at least some caveats with go:embed:

❯ go list -export -test
Alias tip: gol -export -test
# github.com/andig/evcc/internal/vehicle/cloud
internal/vehicle/cloud/cert.go:9:3: invalid go:embed: build system did not supply embed configuration
internal/vehicle/cloud/cert.go:12:3: invalid go:embed: build system did not supply embed configuration
internal/vehicle/cloud/cert.go:19:3: invalid go:embed: build system did not supply embed configuration
internal/vehicle/cloud/cert.go:12:12: pattern client-key.pem: no matching files found
github.com/andig/evcc

Would go list work when only the go.mod has been downloaded but not the full sources?


@jayconrod jayconrod commented Apr 15, 2021

@andig There's no way for the go command to know which modules are needed to compile the packages in the main module without having the sources for those packages. go list -m all prints all the modules in the build list, but that's the same set downloaded by go mod download all, which is usually too much.

For go list -export, that actually does compile packages into the cache, so you'll need all the files present for that. From the error message, it sounds like something else might be going on though specific to go list.
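
To illustrate why the sources are needed: the set of packages to compile is determined by the import declarations in the .go files, not by go.mod. The sketch below reads the imports of a single file with the standard go/parser package. The `importsOf` helper and the sample source are made up for illustration; with the full sources present, `go list -deps` is the proper tool for the transitive list.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
)

// importsOf parses only the import declarations of one Go source file
// and returns the unquoted import paths in source order.
func importsOf(filename, src string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, filename, src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var paths []string
	for _, spec := range f.Imports {
		p, err := strconv.Unquote(spec.Path.Value)
		if err != nil {
			return nil, err
		}
		paths = append(paths, p)
	}
	return paths, nil
}

// sample is a hypothetical source file from the main module.
const sample = `package backend

import (
	"net/http"

	"example.com/big/generated"
)
`

func main() {
	paths, err := importsOf("backend.go", sample)
	if err != nil {
		panic(err)
	}
	for _, p := range paths {
		fmt.Println(p)
	}
}
```

A go.mod file alone lists modules and versions, so without files like the sample there is nothing that says which packages inside those modules are actually used.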


@Seb-C Seb-C commented Sep 30, 2021

I believe that this is the same issue as #27719, which is still not solved.
It's a major pain point that I run into on every project using golang and docker nowadays.


@Seb-C Seb-C commented Sep 30, 2021

> Without the source code to list your actual dependencies (as opposed to everything that's possibly reachable from the module graph, including test dependencies of your dependencies etc...), you're going to massively overbuild.

A command to do that would be a perfectly fine compromise to me, way better than the current situation anyway.

The dependencies rarely change or need to be rebuilt (at most a few times per month anyway), while the actual source code of the application needs to be rebuilt multiple times per minute when developing.
Taking a longer time than necessary to rebuild the dependencies is pretty much a non-issue, while not being able to build both independently is a major problem.


6 participants