Reuse Docker layers #544

Closed
jfdenise opened this issue Jun 18, 2019 · 5 comments · Fixed by #561

Comments

@jfdenise

I would like to only rebuild starting from the module I modified. As an example:
https://github.com/wildfly/wildfly-s2i/blob/master/wildfly-builder-image/image.yaml

If I have:
from X
install
module1
module2
module3
Any change to module3 should only rebuild the image starting from module3.

Today we rebuild everything.

@jfdenise jfdenise added status/review Scheduled for a review type/enhancement labels Jun 18, 2019
@goldmann
Contributor

Thanks for this report. I can confirm this. There are two reasons why we see this: one is artifacts, the other is modules. Below is a snippet from a Dockerfile.

FROM centos:7

USER root

RUN yum --setopt=tsflags=nodocs install -y centos-release-scl

# Add scripts used to configure the image
COPY modules /tmp/scripts/

# Add all artifacts to the /tmp/artifacts directory
COPY \
    hawkular-javaagent-1.0.1.Final-redhat-2-shaded.jar \
    jolokia-jvm-1.5.0.redhat-1-agent.jar \
    jmx_prometheus_javaagent-0.3.1.redhat-00006.jar \
    /tmp/artifacts/

# begin jboss.container.user:1.0

# Install required RPMs and ensure that the packages were installed
USER root
RUN yum --setopt=tsflags=nodocs install -y unzip tar rsync shadow-utils \
    && rpm -q unzip tar rsync shadow-utils

The problem is in the two COPY instructions. If we change any module, the COPY modules /tmp/scripts/ instruction invalidates every single layer below it. The same applies to artifacts, but this is not the main problem.

This is NOT how it should be. We should only rebuild modules that were changed as mentioned in the report.
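
The difference between the two layouts can be illustrated with a small Python model of layer caching. This is only a sketch, not CEKit or Docker code: it assumes a simplified cache key per step (a hash of the previous key, the instruction text, and the content of any copied files), and the per-module `COPY modules/<name> /tmp/scripts/<name>` form is a hypothetical layout chosen for illustration. The module names come from the report above.

```python
import hashlib


def chain_ids(steps):
    """Return one cache key per step.

    Each key hashes the step together with the previous key, so a change
    in any step changes the keys of every step after it.
    """
    keys, parent = [], ""
    for instruction, content in steps:
        raw = f"{parent}|{instruction}|{content}".encode()
        parent = hashlib.sha256(raw).hexdigest()
        keys.append(parent)
    return keys


def first_rebuilt_step(before, after):
    """1-based index of the first step that misses the cache, or None."""
    pairs = zip(chain_ids(before), chain_ids(after))
    for index, (old, new) in enumerate(pairs, start=1):
        if old != new:
            return index
    return None


def upfront_layout(modules):
    """All modules copied by a single COPY near the top of the Dockerfile."""
    steps = [("FROM centos:7", "")]
    steps.append(("COPY modules /tmp/scripts/", "".join(modules.values())))
    for name in modules:
        steps.append((f"RUN bash -x /tmp/scripts/{name}/install.sh", ""))
    return steps


def per_module_layout(modules):
    """Each module copied right before it is installed (hypothetical layout)."""
    steps = [("FROM centos:7", "")]
    for name, content in modules.items():
        steps.append((f"COPY modules/{name} /tmp/scripts/{name}", content))
        steps.append((f"RUN bash -x /tmp/scripts/{name}/install.sh", ""))
    return steps


original = {"module1": "v1", "module2": "v1", "module3": "v1"}
modified = dict(original, module3="v2")  # only module3 changes

# Single upfront COPY: the cache is lost at step 2, so every module reinstalls.
print(first_rebuilt_step(upfront_layout(original), upfront_layout(modified)))
# Per-module COPY: the cache is lost only at the COPY for module3 (step 6 of 7).
print(first_rebuilt_step(per_module_layout(original), per_module_layout(modified)))
```

In this model the upfront layout misses the cache from step 2 onward, while the per-module layout reuses everything up to the step that copies module3.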

@goldmann goldmann added complexity/medium priority/high and removed status/review Scheduled for a review labels Jun 27, 2019
@goldmann
Contributor

goldmann commented Jul 2, 2019

Let's try to schedule this for 3.3.

@goldmann goldmann added this to To do in Release 3.3.0 via automation Jul 2, 2019
@goldmann goldmann added this to the 3.3.0 milestone Jul 2, 2019
@goldmann goldmann self-assigned this Jul 3, 2019
@goldmann goldmann moved this from To do to In progress in Release 3.3.0 Jul 3, 2019
@goldmann
Contributor

goldmann commented Jul 3, 2019

Small update.

Copying modules at the place where they are used is easy and I already have it working locally. While testing this change I found that it may be problematic to use caching properly, because the squash post-processing of the image removes every intermediate layer, resulting in a fresh build every time. We have a few options here:

  1. Disable cleanup of layers in the squash post-process entirely.
  2. Suggest using the --no-squash parameter when developing an image locally, so that only changed layers are rebuilt.
  3. Add a --no-squash-cleanup parameter that would control whether the cleanup is performed.

My comments:

  • The first option is out of the question: many images produce big layers and these can easily exhaust the Docker storage.
  • Options two and three are possible to implement.
  • Option two does not require a code change, only documentation.
  • Not sure if there is any benefit to a squashed image for local development purposes:
    • It takes time and resources to build.
    • We do not care about smaller images while developing the image.

So, my suggestion is to go with option two: update documentation to use --no-squash to enable layer caching.
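
Why the cleanup defeats caching can be shown with a toy Python model. It is a sketch under stated assumptions, not the squash tool's or CEKit's actual logic: the cache is modeled as a plain set of layer keys, `build` and `layer_keys` are hypothetical helpers, and `squash_cleanup=False` stands in for building with --no-squash. The example steps are taken from Dockerfiles quoted in this thread.

```python
import hashlib


def layer_keys(steps):
    """One chained key per step: each key depends on all earlier steps."""
    keys, parent = [], ""
    for step in steps:
        parent = hashlib.sha256(f"{parent}|{step}".encode()).hexdigest()
        keys.append(parent)
    return keys


def build(steps, cache, squash_cleanup):
    """Build against `cache` (a set of layer keys); return steps executed."""
    keys = layer_keys(steps)
    executed = sum(1 for key in keys if key not in cache)
    cache.update(keys)
    if squash_cleanup:
        # Model of the squash post-processing: the intermediate layers of
        # this build are removed, so the next build finds nothing to reuse.
        cache.difference_update(keys)
    return executed


steps = [
    "FROM centos:7",
    "RUN bash -x /tmp/scripts/tomcat/install.sh",
    "USER 1000",
]

squashed_cache = set()
first = build(steps, squashed_cache, squash_cleanup=True)
second = build(steps, squashed_cache, squash_cleanup=True)
print(first, second)  # every step runs again on the second build

plain_cache = set()
first_plain = build(steps, plain_cache, squash_cleanup=False)
second_plain = build(steps, plain_cache, squash_cleanup=False)
print(first_plain, second_plain)  # the second build is fully cached
```

With cleanup enabled, an unchanged image still executes all three steps on the second build; without it, the second build executes none.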

Besides this, the last part needed to properly implement this ticket is to copy artifacts only when required, same as modules. Work on this has not started yet.

@rnc @jfdenise Comments?

@jfdenise
Author

jfdenise commented Jul 3, 2019

@goldmann I am fine with not squashing the dev image.

@goldmann
Contributor

goldmann commented Jul 3, 2019

Sneak peek:

2019-07-03 12:13:32,960 cekit        INFO     Docker: Step 21/30 : USER root
2019-07-03 12:13:32,960 cekit        INFO     Docker: ---> Using cache
2019-07-03 12:13:32,960 cekit        INFO     Docker: ---> bdaddd403143
2019-07-03 12:13:32,961 cekit        INFO     Docker: Step 22/30 : RUN bash -x /tmp/scripts/tomcat/install.sh
2019-07-03 12:13:32,961 cekit        INFO     Docker: ---> Using cache
2019-07-03 12:13:32,961 cekit        INFO     Docker: ---> 98ea70780dc2
2019-07-03 12:13:32,961 cekit        INFO     Docker: Step 23/30 : LABEL description "Tomcat 8 image" io.cekit.version "3.3.dev0" summary "Tomcat 8 image"
2019-07-03 12:13:32,961 cekit        INFO     Docker: ---> Using cache
2019-07-03 12:13:32,961 cekit        INFO     Docker: ---> 46ba1cb1f6b3
2019-07-03 12:13:32,961 cekit        INFO     Docker: Step 24/30 : USER root
2019-07-03 12:13:32,961 cekit        INFO     Docker: ---> Using cache
2019-07-03 12:13:32,961 cekit        INFO     Docker: ---> 2db2916c3e92
2019-07-03 12:13:32,961 cekit        INFO     Docker: Step 25/30 : RUN [ ! -d /tmp/scripts ] || rm -rf /tmp/scripts
2019-07-03 12:13:32,962 cekit        INFO     Docker: ---> Using cache
2019-07-03 12:13:32,962 cekit        INFO     Docker: ---> 9bc249aeac23
2019-07-03 12:13:32,962 cekit        INFO     Docker: Step 26/30 : RUN [ ! -d /tmp/artifacts ] || rm -rf /tmp/artifacts
2019-07-03 12:13:32,962 cekit        INFO     Docker: ---> Using cache
2019-07-03 12:13:32,962 cekit        INFO     Docker: ---> 31b53b904762
2019-07-03 12:13:32,962 cekit        INFO     Docker: Step 27/30 : RUN yum clean all && [ ! -d /var/cache/yum ] || rm -rf /var/cache/yum
2019-07-03 12:13:32,962 cekit        INFO     Docker: ---> Using cache
2019-07-03 12:13:32,962 cekit        INFO     Docker: ---> b0172c33c165
2019-07-03 12:13:32,962 cekit        INFO     Docker: Step 28/30 : USER 1000
2019-07-03 12:13:32,962 cekit        INFO     Docker: ---> Using cache
2019-07-03 12:13:32,962 cekit        INFO     Docker: ---> babed4ea8ae0
2019-07-03 12:13:32,962 cekit        INFO     Docker: Step 29/30 : WORKDIR /home/user
2019-07-03 12:13:32,963 cekit        INFO     Docker: ---> Using cache
2019-07-03 12:13:32,963 cekit        INFO     Docker: ---> 90346e4eee1f
2019-07-03 12:13:32,963 cekit        INFO     Docker: Step 30/30 : CMD /home/user/apache-tomcat-8.5.24/bin/catalina.sh run
2019-07-03 12:13:32,963 cekit        INFO     Docker: ---> Using cache
2019-07-03 12:13:32,963 cekit        INFO     Docker: ---> da040d35afb5
2019-07-03 12:13:32,963 cekit        INFO     Docker: Successfully built da040d35afb5
2019-07-03 12:13:32,969 cekit        INFO     Image built and available under following tags: cekit/example-tomcat:1.0, cekit/example-tomcat:latest
2019-07-03 12:13:32,969 cekit        INFO     Finished!

goldmann added a commit to goldmann/cekit that referenced this issue Jul 3, 2019
This changes the way we generate the Dockerfile so that
artifacts and modules are not copied at the beginning of
the Dockerfile. This makes it possible to reuse the
Docker (and other builders') cache for content that
was not changed.

To make use of rapid development with the cache you need to
make sure the module you modify is installed late in the process,
because modifying it invalidates the cache and all modules installed
after it will need to be executed again.

If you use the Docker builder, use the `--no-squash` option.
Without this, the squash post-processing will remove all
intermediate container images created at the original image build time,
removing the cache at the same time.

Fixes cekit#544
Release 3.3.0 automation moved this from In progress to Done Jul 4, 2019
goldmann added a commit that referenced this issue Jul 4, 2019
Fixes #544