Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Utilize multi-stage build to improve the image size #184

Merged
merged 3 commits into from
Mar 1, 2021
Merged

Utilize multi-stage build to improve the image size #184

merged 3 commits into from
Mar 1, 2021

Conversation

reitzig
Copy link
Contributor

@reitzig reitzig commented Feb 22, 2021

Avoid bloating up the final image with build dependencies and byproducts.
Pertains to issue #50; see also PR #175

Image sizes (in local Docker demon):

  • Before: ~1.5GB
  • After pulling Haskell build out: 453 MB (commit 1)
  • After pulling Ruby build out: 457 MB (commit 2; changed how the image is built)

Follow-up questions:

  • Why is there a JRE in the image?
  • Why install apk ruby-bigdecimal and gem bigdecimal?
  • Why are make and rake in the final image?
  • Moving the Ruby build out, I had to add a whole bunch of dependencies: cmake bison flex python3 glib-dev cairo-dev pango-dev gdk-pixbuf-dev, amounting to several hundreds of megabytes. It stands to reason that these were somehow installed by the command that's still in the main stage; can they be removed somehow?

Avoid bloating up the final image with build dependencies and byproducts.
@reitzig
Copy link
Contributor Author

reitzig commented Feb 23, 2021

Pulling out the Haskell part paid off big time; "easy" win, IMHO.

Pulling out the Ruby part didn't help as much and made the build more complicated (note all the small tricks to copy over the build gems). May want to revert.

Not sure whether trying the same for the Python block is worth it.

Copy link
Contributor

@dduportal dduportal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello ! Thanks for this work, great job! Please find a few feedbacks to be solved before we go on.

As you stated, pulling out the Ruby build seems to add a lot of complexity in the Dockerfile for a low value (a few megabytes). WDYT about reverting it and we keep only the ERD part (which is already an awesome gain)?

build_image.sh Outdated Show resolved Hide resolved
build_image.sh Outdated Show resolved Hide resolved
@dduportal
Copy link
Contributor

To answer your follow-up questions:

Why is there a JRE in the image?

As far as I remeber, it is required by "asciidoctor diagram" (as per #46)

Why install apk ruby-bigdecimal and gem bigdecimal?

Good catch 👍 It comes from this PR: https://github.com/asciidoctor/docker-asciidoctor/pull/165/files#diff-dd2c0eb6ea5cfc6c4bd4eac30934e2d5746747af48fef6da689e85b752f39557R42 and it looks like we forgot to remove the gem. Do you mind adding a commit here to delete the gem and only keep the native package please?

Why are make and rake in the final image?

These 2 tools are often used to define a lifecycle when building content with asciidoctor and they are present as a commodity (to avoid having to install it when required for content producers). The impact in size is not really high as far as I know.

Moving the Ruby build out, I had to add a whole bunch of dependencies: cmake bison flex python3 glib-dev cairo-dev pango-dev gdk-pixbuf-dev, amounting to several hundreds of megabytes. It stands to reason that these were somehow installed by the command that's still in the main stage; can they be removed somehow?

These dependencies are required by the side-asciidoctor tools installed in the image (same pattern as the asciidoctor-diagram requiring the JRE).

Don't forget that this is a "kitchen sink" image: it provides a complete environment for authoring with asciidoctor and a bunch of associated tools. Such an image is (alas) always bloated in term of amount of dependencies.

If you want to have an "asciidoctor" or any other "minimalistic" image, you might be interested by #79

@dduportal
Copy link
Contributor

Hello @reitzig , is there anything we could to to help you on this topic?

@reitzig
Copy link
Contributor Author

reitzig commented Feb 27, 2021

First thing is having another hour or five to spare. 😅 Then I think I can start working.

Don't forget that this is a "kitchen sink" image: it provides a complete environment for authoring with asciidoctor and a bunch of associated tools. Such an image is (alas) always bloated in term of amount of dependencies.

If you want to have an "asciidoctor" or any other "minimalistic" image, you might be interested by #79

Fair enough. FWIW, using multi-stage builds we could get a minimal image "for free": just start off with bare asciidoctor in one stage, and add additional tools in a later stage. Would that be interesting? I guess the "bare" one could get tagged _-slim so as not to break the convention established up to now?

@dduportal
Copy link
Contributor

@reitzig sure, it could be interesting! May I ask that we split the work in atomic and deliverable unit to avoid PR that would wait ages before being merged?

In this case I propose that you focus the current PR on utilizing Multistage for ERD only, to make the final image back to a decent size.

Then there could be subsequent (but independent and autonomous) PRs, each one solving a problem statement from a given issue. Looks good to you?

@reitzig
Copy link
Contributor Author

reitzig commented Mar 1, 2021

@dduportal Please have another look; I think I addressed your concerns.

Copy link
Contributor

@dduportal dduportal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍 Thanks a lot!

@dduportal dduportal changed the title Utilize multi-stage build Utilize multi-stage build to improve the image size Mar 1, 2021
@dduportal dduportal merged commit 4c97718 into asciidoctor:main Mar 1, 2021
@reitzig
Copy link
Contributor Author

reitzig commented Mar 1, 2021

FWIW, not sure how the Docker demon on Github Action does its caching, but you may be able to speed up the build by actually pushing the build-haskell image to the Github registry and pulling from there before running make build. Chances are it didn't change, in which case you replace the minutes-long build with a 3GB Github-internal download. Might fall on either side, but it may be worth trying (if build times are a concern at all).

@reitzig sure, it could be interesting! May I ask that we split the work in atomic and deliverable unit to avoid PR that would wait ages before being merged?

In this case I propose that you focus the current PR on utilizing Multistage for ERD only, to make the final image back to a decent size.

Then there could be subsequent (but independent and autonomous) PRs, each one solving a problem statement from a given issue. Looks good to you?

--> #187

@reitzig reitzig deleted the patch-1 branch March 3, 2021 07:40
@dduportal dduportal mentioned this pull request May 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants