
Tag multiple stages and/or build multiple targets with one invocation of buildkit #609

Closed
brad-jones opened this issue Sep 7, 2018 · 7 comments


@brad-jones

I have been experimenting with buildkit, now that it's bundled with docker.

I have a monorepo, currently with a Dockerfile per project. There are some base images, but also a lot of copy and paste, which I have partially solved by defining a single Dockerfile with many smaller stages and using BuildKit to perform the builds.

I already have a task runner which knows the dependency order and executes a number of build commands like:

  • DOCKER_BUILDKIT=1 docker build . --target base
  • DOCKER_BUILDKIT=1 docker build . --target prj-1 -t prj-1:latest
  • DOCKER_BUILDKIT=1 docker build . --target prj-2 -t prj-2:latest
  • etc...

I then tried to write a Dockerfile like this:

# I want to be able to tag this stage
FROM alpine:latest AS prj-1
LABEL stage="prj-1"
RUN touch /tmp/noop
RUN sleep 6

# And tag this stage, but without running 2 separate `docker build` commands
FROM alpine:latest AS prj-2
LABEL stage="prj-2"
RUN touch /tmp/noop
RUN sleep 7

# I don't care for this image, it's here to make
# buildkit build both images above concurrently.
FROM alpine:latest AS all
COPY --from=prj-1 /tmp/noop /tmp/
COPY --from=prj-2 /tmp/noop /tmp/

Running it like: DOCKER_BUILDKIT=1 docker build . --target all --tag all:latest

This works: both sleep commands run concurrently, which is great, but I have no way of accessing the intermediate images.

What I'm after is something like DOCKER_BUILDKIT=1 docker build . --target prj-1 --tag prj-1:latest --target prj-2 --tag prj-2:latest

Then I could almost retire the task runner and rely on BuildKit to do 95% of the work.

@tonistiigi
Member

I think this should be handled at a higher layer (possibly the docker CLI could be that layer, though). BuildKit is designed to handle parallel calls efficiently and deduplicate work between them. What you are asking for would be kind of a batch request, but conceptually it is no different from what the API already supports.

If you run all these commands in parallel, it should be (almost) as efficient as a batch request (the only problem is local context sharing for parallel requests, which the docker CLI doesn't do at the moment, although the BuildKit API supports it). Or, even simpler, just run the commands in reverse order: docker build . --target all --tag all:latest followed by docker build . --target prj-1 -t prj-1:latest. The second one gets a cache hit and only creates an image, without running any commands again.
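
Concretely, against the Dockerfile at the top of this issue, the reverse-order approach would look roughly like this (a sketch, not verified against your setup):

# The umbrella target builds prj-1 and prj-2 concurrently.
DOCKER_BUILDKIT=1 docker build . --target all --tag all:latest

# These resolve entirely from cache and just tag the intermediate stages.
DOCKER_BUILDKIT=1 docker build . --target prj-1 --tag prj-1:latest
DOCKER_BUILDKIT=1 docker build . --target prj-2 --tag prj-2:latest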

@brad-jones
Author

Yeah, I guess that works: do the entire build once and then just tag the stages I need. Always nice to get a different perspective on things.

@kingdonb

kingdonb commented Oct 28, 2020

There's a case where that's not true: if your staged Dockerfile has several targets that diverge, as in my use case, it's desirable not to have to handle this at a higher level. This issue describes exactly what I think I need.

I can still think of a couple of ways I might mitigate this, but right now, at this stage in my development, this is the particular Dockerfile shape that makes sense for me, and it has this problem; so without further ado, here's my real-world use case:

# syntax = docker/dockerfile:experimental
FROM kingdonb/rvm-ruby-baseimage:latest as builder-base
ENV APPDIR="/home/${RVM_USER}/app"
ENV RUBY=2.6.6

USER root
COPY jenkins/runasroot-dependencies.sh /home/${RVM_USER}/jenkins/
RUN /home/${RVM_USER}/jenkins/runasroot-dependencies.sh

FROM builder-base AS gem-builder-base
USER ${RVM_USER}
RUN bundle config set app_config .bundle && \
  bundle config set path /tmp/.cache/bundle && mkdir -p /tmp/.cache/bundle
COPY --chown=rvm Gemfile Gemfile.lock .ruby-version ${APPDIR}/
RUN echo 'gem: --no-document' > ~/.gemrc && \
  rvm ${RUBY}@global do gem update bundler && \
  rvm ${RUBY}@global do gem update --system

FROM builder-base AS jenkins-builder
COPY jenkins/runasroot-jenkins-dependencies.sh /home/${RVM_USER}/jenkins/
RUN /home/${RVM_USER}/jenkins/runasroot-jenkins-dependencies.sh

FROM gem-builder-base AS bundler
COPY --chown=${RVM_USER} jenkins/rvm-bundle-cache-build.sh /home/${RVM_USER}/jenkins/
RUN --mount=type=cache,uid=999,gid=1000,target=/tmp/.cache/bundle \
    --mount=type=ssh,uid=999,gid=1000 \
    /home/${RVM_USER}/jenkins/rvm-bundle-cache-build.sh ${RUBY}
RUN echo 'export PATH="$PATH:$HOME/app/bin"' >> ../.profile

FROM bundler AS assets
COPY --chown=${RVM_USER} Rakefile ${APPDIR}
COPY --chown=${RVM_USER} config ${APPDIR}/config
COPY --chown=${RVM_USER} bin ${APPDIR}/bin
COPY --chown=${RVM_USER} app/assets ${APPDIR}/app/assets
COPY --chown=${RVM_USER} vendor/assets ${APPDIR}/vendor/assets
RUN --mount=type=cache,uid=999,gid=1000,target=/home/rvm/app/public/assets \
  rvm ${RUBY} do bundle exec rails assets:precompile && cp -ar /home/rvm/app/public/assets /tmp/assets

FROM builder-base AS builder
COPY --from=bundler --chown=${RVM_USER} /tmp/vendor/bundle ${APPDIR}/vendor/bundle
COPY --from=assets --chown=${RVM_USER} /tmp/assets ${APPDIR}/public/assets

FROM builder AS slug
COPY --chown=${RVM_USER} . ${APPDIR}
RUN bundle config set app_config .bundle && \
    bundle config set path vendor/bundle

FROM jenkins-builder AS jenkins
COPY --from=slug --chown=${RVM_USER} ${APPDIR} ${APPDIR}
USER root
RUN useradd -m -u 1000 -g rvm jenkins && \
    mkdir ${APPDIR}/log ${APPDIR}/tmp ${APPDIR}/coverage && \
    chown jenkins ${APPDIR}/log ${APPDIR}/tmp ${APPDIR}/coverage ${APPDIR} ${APPDIR}/db/schema.rb && \
    chown -R jenkins ${APPDIR}/public/assets
USER jenkins
RUN bundle config set app_config .bundle && \
    bundle config set path vendor/bundle

FROM slug AS production
RUN bundle config set app_config .bundle
CMD  bash --login -c 'bundle exec rails server -b 0.0.0.0'
EXPOSE 3000

Here is the same Dockerfile visualized as a tree, kind of (hopefully this is clear... graphviz would be better):

└── builder-base
    ├── jenkins-builder
    │   └── jenkins
    ├── gem-builder-base
    │   ├── bundler
    │   │   └── ..slug
    │   └── assets
    │       └── ..slug
    └── slug
        ├── ..jenkins
        └── production

To get to the point, there are three meaningful targets here: jenkins (tests run in here; they have extra dependencies), bundler (runtime dependencies; I might be able to do away with this tag, but that doesn't quite help), and production.

This is what I wind up doing in my build pipeline's scripting, and if you're still with me, you can see how it necessarily breaks parallelism in an undesirable way:

docker build -t $repo/$user/$proj:${GIT_COMMIT_SHORT} --target production .
docker build -t $repo/$user/$proj:bundler --target bundler .
docker build -t $repo/$user/$proj:jenkins_${GIT_COMMIT_SHORT} --target jenkins .

I'd like to start pushing the production tag, as described below, as early as possible, so I build it first, but the jenkins-builder target, which doesn't depend on any of the slug, assets, or bundler stuff, unfortunately won't be building in parallel; it doesn't start its work until the end.

Reversing the build order results in the opposite problem: the production tag isn't ready and won't be tagged until after some irrelevant dependencies finish their build. Building --target production first instead, the jenkins branch of the tree is never invoked. Target bundler after either of those of course takes a fraction of a second, since it was built on the way to production or jenkins, but there are two divergent branches and there is no way to tell Docker to build both in parallel.

(Aside: maybe if I started both builds at once, forking them off in the background and waiting until both complete, the BuildKit builder might actually be smart enough to handle the two concurrent build processes without duplicating effort or otherwise trampling over each other? I don't know, I haven't tried, but that would be enough to close the issue! A rough sketch of what I mean follows.)
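
Roughly what I have in mind, assuming a POSIX shell and that two concurrent docker build calls against the same daemon cooperate (untested):

# Kick off both divergent branches at the same time.
docker build -t $repo/$user/$proj:${GIT_COMMIT_SHORT} --target production . &
docker build -t $repo/$user/$proj:jenkins_${GIT_COMMIT_SHORT} --target jenkins . &
wait   # block until both background builds have finished

# Cache hit only: tag the shared bundler stage after the fact.
docker build -t $repo/$user/$proj:bundler --target bundler .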

I've got some conflicting agendas; I'm doing about four things at once here:

(1) We have some test dependencies like a full Chrome browser for Selenium and Capybara to use in system testing, along with a ton of dependencies from apt-get that are required to run Chrome under xvfb-run. Those test deps are heavyweight and only needed by the tester (Jenkins), so they are downloaded in a sort of "dead fork" branch called jenkins; these dependencies never make it to the production image, quite on purpose. This image never leaves the build host.

(2) The "bundler" dependencies are captured in a tag and pushed to the registry like a 'latest' tag, so I can use a tool such as Okteto later on, which expects an image has been prepared with all the heavyweight runtime dependencies like bundler gems already compiled. This is like a dev image, you could pull it and sync some new app sources into /home/rvm/app, run them there with your changes without building a new image at all, (irrelevant, but this workflow pattern is very handy.)

(3) The "production" image gets a unique tag and is pushed. There is a FluxCD dev environment monitoring my image repo for new tags shaped like a short Git SHA and deploying them right away. It is advantageous to build and push this image as soon as possible, since we don't care if tests pass first, it's only being promoted to a dev environment. Full integration and system tests with lots of detail can run for up to an hour, thus it's really to our advantage to push this out as soon as it's ready, as if there is a problem we might uncover it through manual testing long before the mammoth test suite is finished.

(4) The "slug" target is a node in common at the base of all of the leaf targets, with everything important for the app in it.

The jenkins-builder dependencies honestly won't need to change very often: Chrome and its dependencies can be installed once and reloaded from the build cache over the next week or two, until the upstream rvm-ruby-baseimage changes out from under my pipeline. So while this is a real use case, I don't know how impactful it will be to solve, but the issue was open and lacking a use case, so here is one.

To be real, I don't know if this is really going to cause a problem for me in the end, other than an unnecessary 3 minutes or so spent waiting somewhere once every two weeks? I might not even notice. Maybe this is something Docker shouldn't solve; maybe there's a better way and I just haven't seen it yet, but it looks to me like there's actually no way to do all these parallel items efficiently for structures like mine at a higher level, as suggested.

(Edit: OK, I can think of one solution: make one additional, final throw-away target that inherits some trivial detail from both the production and jenkins targets, something like the sketch below. Building that first would let BuildKit handle in parallel the stages that can be built in parallel, but it unfortunately still wouldn't get any of my tags committed back to the Docker engine any faster. I'd still have to come back and tag those out individually after the throw-away target finishes building. Oh well...)
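
Something like this, where the copied path is just an arbitrary file I'd expect to exist in both leaves (hypothetical, not what my pipeline actually does):

FROM alpine:latest AS throwaway
# Pulling a trivial file from each leaf forces both branches to build.
COPY --from=production /home/rvm/app/Gemfile /tmp/production-done
COPY --from=jenkins /home/rvm/app/Gemfile /tmp/jenkins-done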

@tonistiigi
Member

https://github.com/docker/buildx#buildx-bake-options-target
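
That is, with a bake file each stage becomes a named target, and one docker buildx bake invocation builds them all against a shared cache, in parallel where the graph allows. A minimal sketch for the Dockerfile above (registry, names, and tags are placeholders, not from this thread):

# docker-bake.hcl
group "default" {
  targets = ["production", "jenkins", "bundler"]
}

target "production" {
  dockerfile = "Dockerfile"
  target     = "production"
  tags       = ["registry.example.com/user/proj:GIT_COMMIT_SHORT"]
}

target "jenkins" {
  dockerfile = "Dockerfile"
  target     = "jenkins"
  tags       = ["registry.example.com/user/proj:jenkins_GIT_COMMIT_SHORT"]
}

target "bundler" {
  dockerfile = "Dockerfile"
  target     = "bundler"
  tags       = ["registry.example.com/user/proj:bundler"]
}

Running docker buildx bake then builds and tags all three targets in one go.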

@kingdonb

kingdonb commented Oct 28, 2020

Yipee! TYVM @tonistiigi

Then you can close this issue, thank you so much for pointing that out

@thaJeztah
Member

@tonistiigi would this be something to expose on the docker cli as well? (docker build, not buildx)

@kingdonb

kingdonb commented Nov 3, 2020

I wonder if there is a way to call docker buildx bake with an SSH agent...
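
(Newer buildx releases seem to document an ssh attribute on bake targets; I haven't verified which version introduced it, so treat this as an untested sketch:)

# docker-bake.hcl
target "bundler" {
  dockerfile = "Dockerfile"
  target     = "bundler"
  ssh        = ["default"]   # forward the default SSH agent socket, like docker build --ssh default
}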
