New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for compressing build context during image build #25837

Merged
merged 1 commit into from Oct 5, 2016

Conversation

Projects
None yet
7 participants
@reaperhulk
Contributor

reaperhulk commented Aug 18, 2016

- What I did

Added support for optionally compressing build contexts when invoking docker build. This provides significant performance improvements when sending contexts to remote servers with low bandwidth. It is, however, a performance regression when sending to local sockets or over very high bandwidth connections so it should default to off.

- How I did it

Added a new --compress boolean flag. If set then a gzip compression argument is passed to the existing archive.TarWithOptions which already supports this behavior. For example docker build --compress . xz and bz2 support were not added because Go doesn't support compressing these formats (it can decompress bz2, but xz isn't available at all).

- How to verify it

If you build it and use --compress you can see the size difference when sending a build context. However, I was unable to find any unit tests for this file. I may be blind so please let me know if I'm mistaken (or if there are tests I should add in other places to verify this functionality). I would be happy to add testing if someone can point me in the right direction.

- Description for the changelog

  • Added support for optionally compressing build contexts (using --compress) when invoking docker build
@reaperhulk

This comment has been minimized.

Show comment
Hide comment
@reaperhulk

reaperhulk Aug 18, 2016

Contributor

Do I need to update docs/reference/commandline/build.md with the updated options output from the CLI manually or is that part of an automated release process?

Contributor

reaperhulk commented Aug 18, 2016

Do I need to update docs/reference/commandline/build.md with the updated options output from the CLI manually or is that part of an automated release process?

@justincormack

This comment has been minimized.

Show comment
Hide comment
@justincormack

justincormack Aug 18, 2016

Contributor

@reaperhulk yes at present the docs are manual.

I am not entirely sure about having an option, it seems to add a fair amount of complexity for the user. Could we add compression automatically for tcp connections and not for local sockets?

Contributor

justincormack commented Aug 18, 2016

@reaperhulk yes at present the docs are manual.

I am not entirely sure about having an option, it seems to add a fair amount of complexity for the user. Could we add compression automatically for tcp connections and not for local sockets?

@reaperhulk

This comment has been minimized.

Show comment
Hide comment
@reaperhulk

reaperhulk Aug 18, 2016

Contributor

Is there any public API surface for getting the proto that's accessible from runBuild? I haven't been able to find anything, but I'm not very familiar with Docker's internals.

I've added the docs as well if the maintainers decide they want to keep the flag.

Contributor

reaperhulk commented Aug 18, 2016

Is there any public API surface for getting the proto that's accessible from runBuild? I haven't been able to find anything, but I'm not very familiar with Docker's internals.

I've added the docs as well if the maintainers decide they want to keep the flag.

@unclejack

This comment has been minimized.

Show comment
Hide comment
@unclejack

unclejack Aug 26, 2016

Contributor

@reaperhulk Can you provide some numbers to show what kind of improvement this change provides, please?

Contributor

unclejack commented Aug 26, 2016

@reaperhulk Can you provide some numbers to show what kind of improvement this change provides, please?

@reaperhulk

This comment has been minimized.

Show comment
Hide comment
@reaperhulk

reaperhulk Aug 27, 2016

Contributor

@unclejack sure! As an example:

The size of the build context for docker/docker right now is 40MB. With --compress enabled this drops to 13MB. On my machine (2013 15" rMBP 2.7GHz) generating the build context takes ~0.2s uncompressed, but ~1.4s (wallclock) compressed.

On a constrained bandwidth connection sending to a remote docker this reduction in size can be extremely valuable (I am frequently on connections where I can upload no more than 20-50k/sec when I'm outside the US/EU). Conversely, on a fast connection the overhead of gzip would quite possibly harm performance.

For a more extreme example, one of my personal projects has a docker build context size of 19MB (due to some text sitemaps). With compression enabled this drops to 2.2M, which is the difference between success and failure when sending a build context over a flaky 3G tethering connection.

I've replicated this behavior without native support in docker with a shell script:

#!/bin/bash
EXCLUDES=''
while read ignore; do
    EXCLUDES+=" --exclude='$ignore'"
done < .dockerignore

echo "Building Docker Context..."
eval tar cfz buildcontext.tar.gz --exclude='buildcontext.tar.gz' $EXCLUDES *
docker build -t my-great-project - < buildcontext.tar.gz

This works, but is not particularly elegant or discoverable for users who might be facing similar issues.

Contributor

reaperhulk commented Aug 27, 2016

@unclejack sure! As an example:

The size of the build context for docker/docker right now is 40MB. With --compress enabled this drops to 13MB. On my machine (2013 15" rMBP 2.7GHz) generating the build context takes ~0.2s uncompressed, but ~1.4s (wallclock) compressed.

On a constrained bandwidth connection sending to a remote docker this reduction in size can be extremely valuable (I am frequently on connections where I can upload no more than 20-50k/sec when I'm outside the US/EU). Conversely, on a fast connection the overhead of gzip would quite possibly harm performance.

For a more extreme example, one of my personal projects has a docker build context size of 19MB (due to some text sitemaps). With compression enabled this drops to 2.2M, which is the difference between success and failure when sending a build context over a flaky 3G tethering connection.

I've replicated this behavior without native support in docker with a shell script:

#!/bin/bash
EXCLUDES=''
while read ignore; do
    EXCLUDES+=" --exclude='$ignore'"
done < .dockerignore

echo "Building Docker Context..."
eval tar cfz buildcontext.tar.gz --exclude='buildcontext.tar.gz' $EXCLUDES *
docker build -t my-great-project - < buildcontext.tar.gz

This works, but is not particularly elegant or discoverable for users who might be facing similar issues.

@thaJeztah

This comment has been minimized.

Show comment
Hide comment
@thaJeztah

thaJeztah Sep 28, 2016

Member

ping @unclejack ptal

Member

thaJeztah commented Sep 28, 2016

ping @unclejack ptal

@reaperhulk

This comment has been minimized.

Show comment
Hide comment
@reaperhulk

reaperhulk Sep 28, 2016

Contributor

I am happy to rebase this and resolve the conflict if the docker team decides they want to move forward on this!

Contributor

reaperhulk commented Sep 28, 2016

I am happy to rebase this and resolve the conflict if the docker team decides they want to move forward on this!

@thaJeztah

This comment has been minimized.

Show comment
Hide comment
@thaJeztah

thaJeztah Sep 29, 2016

Member

We discussed this in the maintainers meeting and are generally ok with this change, so Im moving it to code review

Member

thaJeztah commented Sep 29, 2016

We discussed this in the maintainers meeting and are generally ok with this change, so Im moving it to code review

Show outdated Hide outdated api/client/image/build.go
@@ -206,8 +208,14 @@ func runBuild(dockerCli *client.DockerCli, options buildOptions) error {
includes = append(includes, ".dockerignore", relDockerfile)
}
var compression archive.Compression

This comment has been minimized.

@anusha-ragunathan

anusha-ragunathan Sep 29, 2016

Contributor

Can be simplified as:

compression := archive.Uncompressed
if options.compress {
   compression = archive.Gzip
}
@anusha-ragunathan

anusha-ragunathan Sep 29, 2016

Contributor

Can be simplified as:

compression := archive.Uncompressed
if options.compress {
   compression = archive.Gzip
}

This comment has been minimized.

@reaperhulk

reaperhulk Sep 29, 2016

Contributor

👍 done!

@reaperhulk

reaperhulk Sep 29, 2016

Contributor

👍 done!

@anusha-ragunathan

This comment has been minimized.

Show comment
Hide comment
@anusha-ragunathan

anusha-ragunathan Sep 29, 2016

Contributor

LGTM.

Contributor

anusha-ragunathan commented Sep 29, 2016

LGTM.

@thaJeztah

This comment has been minimized.

Show comment
Hide comment
@thaJeztah

thaJeztah Sep 29, 2016

Member

can you squash your commits?

Member

thaJeztah commented Sep 29, 2016

can you squash your commits?

@reaperhulk

This comment has been minimized.

Show comment
Hide comment
@reaperhulk

reaperhulk Sep 29, 2016

Contributor

@thaJeztah done!

Contributor

reaperhulk commented Sep 29, 2016

@thaJeztah done!

@@ -208,8 +210,12 @@ func runBuild(dockerCli *command.DockerCli, options buildOptions) error {
includes = append(includes, ".dockerignore", relDockerfile)
}
compression := archive.Uncompressed
if options.compress {
compression = archive.Gzip

This comment has been minimized.

@thaJeztah

thaJeztah Sep 30, 2016

Member

Just to check before we go ahead; do we expect other compression methods to be added at some point (e.g. compress=none|gzip|bzip2) or is that feature creep, and does not have enough advantages to make it worth thinking about?

@thaJeztah

thaJeztah Sep 30, 2016

Member

Just to check before we go ahead; do we expect other compression methods to be added at some point (e.g. compress=none|gzip|bzip2) or is that feature creep, and does not have enough advantages to make it worth thinking about?

This comment has been minimized.

@reaperhulk

reaperhulk Sep 30, 2016

Contributor

In general you need a very fast compression algorithm to make this worthwhile unless your bandwidth to the docker daemon is very low. Alternatives like LZMA (or even bzip2) are too slow to be worth the overhead in most cases. I originally spec'd this with support for alternate schemes, but removed it as it became clear the complexity wasn't worth the (potential) future advantage.

@reaperhulk

reaperhulk Sep 30, 2016

Contributor

In general you need a very fast compression algorithm to make this worthwhile unless your bandwidth to the docker daemon is very low. Alternatives like LZMA (or even bzip2) are too slow to be worth the overhead in most cases. I originally spec'd this with support for alternate schemes, but removed it as it became clear the complexity wasn't worth the (potential) future advantage.

This comment has been minimized.

@thaJeztah

thaJeztah Sep 30, 2016

Member

Yes, I thought that would be the case, but thanks for confirming (better safe then sorry when making these choices 😄)

@thaJeztah

thaJeztah Sep 30, 2016

Member

Yes, I thought that would be the case, but thanks for confirming (better safe then sorry when making these choices 😄)

@thaJeztah

This comment has been minimized.

Show comment
Hide comment
@thaJeztah

thaJeztah Sep 30, 2016

Member

Thanks @reaperhulk, I left one thought (before it's too late 😇)

Also, I did a quick scan of additional changes that are needed;

Otherwise looks great

Member

thaJeztah commented Sep 30, 2016

Thanks @reaperhulk, I left one thought (before it's too late 😇)

Also, I did a quick scan of additional changes that are needed;

Otherwise looks great

@reaperhulk

This comment has been minimized.

Show comment
Hide comment
@reaperhulk

reaperhulk Sep 30, 2016

Contributor

Added the docs and shell autocomplete lines as requested!

I don't think adding to the builder reference is useful right now since as you pointed out the majority of the options aren't covered there. It would be nice to eventually find a good place to document that --compress can speed build context transfers to bandwidth limited remote docker daemons though.

Contributor

reaperhulk commented Sep 30, 2016

Added the docs and shell autocomplete lines as requested!

I don't think adding to the builder reference is useful right now since as you pointed out the majority of the options aren't covered there. It would be nice to eventually find a good place to document that --compress can speed build context transfers to bandwidth limited remote docker daemons though.

@thaJeztah

two tiny nits, but LGTM otherwise

Show outdated Hide outdated contrib/completion/bash/docker
@@ -763,6 +763,7 @@ _docker_build() {
--help
--no-cache
--pull
--compress

This comment has been minimized.

@thaJeztah

thaJeztah Sep 30, 2016

Member

oh! could you keep this list sorted? I know @albers likes it that way

@thaJeztah

thaJeztah Sep 30, 2016

Member

oh! could you keep this list sorted? I know @albers likes it that way

Show outdated Hide outdated contrib/completion/zsh/_docker
@@ -1514,6 +1514,7 @@ __docker_subcommand() {
"($help)*--label=[Set metadata for an image]:label=value: " \
"($help)--no-cache[Do not use cache when building the image]" \
"($help)--pull[Attempt to pull a newer version of the image]" \
"($help)--compress[Compress the build context using gzip]" \

This comment has been minimized.

@thaJeztah

thaJeztah Sep 30, 2016

Member

Same here (sorry, could've mentioned that)

@thaJeztah

thaJeztah Sep 30, 2016

Member

Same here (sorry, could've mentioned that)

Add support for compressing build context during image build
When sending a build context to a remote server it may be
(significantly) advantageous to compress the build context. This commit
adds support for gz compression when constructing a build context
using a command like "docker build --compress ."

Signed-off-by: Paul Kehrer <paul.l.kehrer@gmail.com>
@reaperhulk

This comment has been minimized.

Show comment
Hide comment
@reaperhulk

reaperhulk Sep 30, 2016

Contributor

@thaJeztah alphabetized 😄

Contributor

reaperhulk commented Sep 30, 2016

@thaJeztah alphabetized 😄

@thaJeztah

perfect, LGTM

@LK4D4

This comment has been minimized.

Show comment
Hide comment
@LK4D4

LK4D4 Oct 5, 2016

Contributor

LGTM

Contributor

LK4D4 commented Oct 5, 2016

LGTM

@LK4D4 LK4D4 merged commit f08a450 into moby:master Oct 5, 2016

4 of 5 checks passed

windowsRS1 Jenkins build Docker-PRs-WoW-RS1 3998 has failed
Details
docker/dco-signed All commits signed
Details
documentation success
Details
experimental Jenkins build Docker-PRs-experimental 24561 has succeeded
Details
janky Jenkins build Docker-PRs 33156 has succeeded
Details

dnephin pushed a commit to dnephin/docker that referenced this pull request Apr 17, 2017

Merge pull request #25837 from reaperhulk/support-compressing-build-c…
…ontext

Add support for compressing build context during image build
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment