Ability to filter ADD / COPY during docker build, based on DIFF #13982

@yaronr

Hi,
This follows a short discussion on IRC with @duglin.

Goal: Minimize Docker image size.
Method: Skip copying files that already exist in the base image (or top layer), unless their contents differ.

Use case:
We have a base image, 'multicloud/common'. It includes a lot of content, including a set of files under /workdir/lib (.jar files, to be specific).
We use maven to run our builds, and maven copies dependencies to ./lib, both for the 'common' project and for the projects that depend on it. One of these projects, 'agent', is packaged in a docker image called 'multicloud/agent', which is FROM multicloud/common.
So when building 'agent', maven copies all of its dependencies to ./lib, including all of those that are already packaged under /workdir/lib in multicloud/common.
In the Dockerfile for 'agent', I then do a COPY (or ADD) ./lib /workdir/lib.

99% of the files copied are exactly the same (CRC-wise, and probably timestamp-wise) as those already present in the top layer of the image's file system. However, the COPY step still writes all of them into a new layer, which increases the image size (dramatically, in some cases).

It would be great if docker build's ADD or COPY command had something similar to 'cp -u', or even better, something that would calculate the CRC32 of each file and copy it only if it has changed.
IMO this could significantly decrease image sizes in many other use cases as well.
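
To make the proposed behaviour concrete, here is a minimal Python sketch of the comparison I have in mind. It is not an existing Docker feature or API; the function names `crc32_of` and `copy_if_changed` are made up for illustration, and the sketch assumes both directories are visible on the same file system (which is not how docker build sees the build context today).

```python
import shutil
import zlib
from pathlib import Path


def crc32_of(path: Path, chunk_size: int = 1 << 16) -> int:
    """Return the CRC32 of a file, read in chunks to bound memory use."""
    crc = 0
    with path.open("rb") as fh:
        while chunk := fh.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return crc


def copy_if_changed(src_dir: Path, dst_dir: Path) -> list[Path]:
    """Copy files from src_dir into dst_dir, skipping identical ones.

    Hypothetical helper: returns the relative paths that were actually copied.
    """
    copied = []
    for src in sorted(p for p in src_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(src_dir)
        dst = dst_dir / rel
        if (
            dst.is_file()
            and dst.stat().st_size == src.stat().st_size  # cheap check first
            and crc32_of(dst) == crc32_of(src)
        ):
            continue  # same content already present: leave it untouched
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves the modification time
        copied.append(rel)
    return copied
```

With something like `copy_if_changed(Path("./lib"), Path("/workdir/lib"))`, only the few jars specific to 'agent' would be written, and the ones inherited from 'common' would be left alone. CRC32 is only what I suggested above; a stronger hash could be swapped in if collisions are a concern.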
