Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor: Dockerimage #5007

Merged
merged 15 commits into from
Jul 10, 2023
Merged

Conversation

wolflu05
Copy link
Contributor

@wolflu05 wolflu05 commented Jun 8, 2023

This PR refactors the docker image to reduce its size dramatically. It now weights 524MB.

TODO

Edit tasklist title
Beta Give feedback Tasklist TODO, more options

Delete tasklist

Delete tasklist block?
Are you sure? All relationships in this tasklist will be removed.
  1. remove gnupg
    Options
  2. rewrite init script with ash and remove bash
    Options
  3. test with existing installation
    Options
  4. test with all dbms
    Options
  5. figure out how to get the other db drivers to work, there are some native dependencies missing
    Options
image

fixes #4924

@wolflu05 wolflu05 added refactor docker Docker / docker-compose labels Jun 8, 2023
@wolflu05
Copy link
Contributor Author

wolflu05 commented Jun 9, 2023

Is git actually required in the image now? I would say yes, as some plugins are installed via git.

@SchrodingersGat
Copy link
Member

Is git actually required in the image now? I would say yes, as some plugins are installed via git.

To install via git we would need the git executable, yeah

Dockerfile Outdated Show resolved Hide resolved
@wolflu05
Copy link
Contributor Author

wolflu05 commented Jun 9, 2023

Now the image weights about 629MB. I'd love if we could reduce the needed dependencies for the database drivers.

@wolflu05
Copy link
Contributor Author

What's really interesting and what could be used to consider using an alpine image would be this Dockerfile. It's an other open source project that's also based on Django and also has to deal with lots of images (from your cooked meals) and also has ldap support already built in and weights out 350mb (uncompressed).

@SchrodingersGat
Copy link
Member

@wolflu05 two things immediately stick out in the linked image

  • Running purge step after install
  • Running all install steps on the one line

Might be worth applying that to our image

@wolflu05
Copy link
Contributor Author

wolflu05 commented Jun 10, 2023

Running purge step after install

I already do this as much as I can and know with multistage builds. The packages that are installed in the production stage are needed for inventree to work, because they provide native linked libraries that get imported from e.g. the pip psql lib.

Running all install steps on the one line

I tried squashing all image layers, but this also doesn't reduce the size of the image.

@wolflu05
Copy link
Contributor Author

wolflu05 commented Jun 10, 2023

I just tried how alpine turns out. Here is a little comparison with the current state (0080176) of pushed files:

Size Build time *
Old 1.03GB 119.0s
New debian slim based 778MB 135.8s
New alpine based 556MB 141.8s

* all builds on MacBook Pro 2019 (docker cache disabled)

@SchrodingersGat
Copy link
Member

The alpine image is looking pretty appealing then!

@wolflu05
Copy link
Contributor Author

wolflu05 commented Jun 11, 2023

FYI, adding LDAP needed packages adds about 5MB to the alpine image, so it weights: 561MB. And about 5MB to the Debian image so it weights: 781MB.

Dockerfile changes
diff --git a/Dockerfile b/Dockerfile
index d60ab788b..46c0ba518 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -77,7 +77,7 @@ FROM inventree_base as build_base
 
 # Install build required system packages
 RUN apt-get update && apt-get install -y  --no-install-recommends \
-    gcc g++ libffi-dev libssl-dev && \
+    gcc g++ python3-dev libffi-dev libssl-dev libldap2-dev libsasl2-dev && \
     apt-get autoclean && apt-get autoremove
 
 # For ARMv7 architecture, add the pinwheels repo (for cryptography library)
diff --git a/Dockerfile.alpine b/Dockerfile.alpine
index 5990bd53a..0f261bae2 100644
--- a/Dockerfile.alpine
+++ b/Dockerfile.alpine
@@ -46,7 +46,7 @@ LABEL org.label-schema.schema-version="1.0" \
       org.label-schema.vcs-ref=${commit_tag}
 
 RUN apk add --no-cache \
-    git gnupg gettext py-cryptography bash \
+    git gnupg gettext py-cryptography bash openldap \
     # Image format support
     libjpeg libwebp zlib \
     # Weasyprint requirements : https://doc.courtbouillon.org/weasyprint/stable/first_steps.html#alpine-3-12
@@ -69,7 +69,7 @@ COPY ./docker/requirements.txt base_requirements.txt
 COPY ./requirements.txt ./
 
 RUN apk add --no-cache --virtual .build-deps \
-    gcc g++ musl-dev openssl-dev libffi-dev cargo python3-dev \
+    gcc g++ musl-dev openssl-dev libffi-dev cargo python3-dev openldap-dev \
     # Image format dev libs
     jpeg-dev openjpeg-dev libwebp-dev zlib-dev \
     # DB specific dev libs
diff --git a/requirements.txt b/requirements.txt
index 487dcfead..8bac49a90 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -305,5 +305,8 @@ zipp==3.15.0
 zopfli==0.2.2
     # via fonttools
 
+python-ldap==3.4.3
+django-auth-ldap==4.2.0
+
 # The following packages are considered to be unsafe in a requirements file:
 # setuptools

@wolflu05
Copy link
Contributor Author

Interested to hear what @matmair has to say to this alpine approach.

@SchrodingersGat
Copy link
Member

It certainly saves a lot of bandwidth. And the LDAP addition seems minor, but I have no horse in that race :)

@matmair
Copy link
Contributor

matmair commented Jun 11, 2023

If it works why not. Have you verified that ldap actually resolved all objectTypes and both ldap and ldaps with the installed dependencies?

@wolflu05 The size reduction is very impressive. Is there something significantly different to the linked project? They seem to be even smaller so I wonder how much we need to change to achieve that size. With the amount the image gets pulled this PR certainly has positive carbon-footprint implications.

The change from Debian to Alpin should to be marked breaking and mentioned in the release notes, plugins will need to adapt their (native) install instructions to work on the docker as apt will not work.

@wolflu05
Copy link
Contributor Author

wolflu05 commented Jun 11, 2023

If it works why not. Have you verified that ldap actually resolved all objectTypes and both ldap and ldaps with the installed dependencies?

Not fully, need some further testing.

The size reduction is very impressive. Is there something significantly different to the linked project? They seem to be even smaller so I wonder how much we need to change to achieve that size. With the amount the image gets pulled this PR certainly has positive carbon-footprint implications.

They don't use lots of things we need:

  • weasyprint
  • mariadb
  • git
  • gnupg
  • bash (maybe we rewrite the init script to use ash)
  • our codebase weights about 90MB

The change from Debian to Alpin should to be marked breaking and mentioned in the release notes, plugins will need to adapt their (native) install instructions to work on the docker as apt will not work.

Yes definitely it will break some things.

Idea: provide both alpine and Debian for the next release for a smooth transition period until plugins are adapted.

Further analysis needed: Does switching out the images work easily with an already installed InvenTree. Maybe the venv that's mounted via volume contains some files that are not compatible with alpine.

@SchrodingersGat
Copy link
Member

Further analysis needed: Does switching out the images work easily with an already installed InvenTree. Maybe the venv that's mounted via volume contains some files that are not compatible with alpine.

We definitely need to investigate this before we push this :)

@wolflu05
Copy link
Contributor Author

Oh and I forgot the biggest point why our alpine image is bigger than tandoors. Our codebase weights about 90MB with all the translated templates and po files.

@matmair
Copy link
Contributor

matmair commented Jun 11, 2023

Oh and I forgot the biggest point why our alpine image is bigger than tandoors. Our codebase weights about 90MB with all the translated templates and po files.

That should be solved by #2789 as we will only ship messages

@matmair
Copy link
Contributor

matmair commented Jun 11, 2023

@wolflu05 I think we can remove gnupg - we do not ship the code that used it anymore.

Would be interested in your thoughts on #5017 - background

@SchrodingersGat
Copy link
Member

@wolflu05 just a friendly bump to see where you are up to with this :)

@wolflu05
Copy link
Contributor Author

Need to remove gnupg and do some more testing with other db drivers and also with existing installs.

@matmair matmair mentioned this pull request Jun 26, 2023
5 tasks
@matmair
Copy link
Contributor

matmair commented Jul 7, 2023

@wolflu05 do you want/need any support on this one? Would like to use this for #5011

@wolflu05
Copy link
Contributor Author

wolflu05 commented Jul 9, 2023

Are we still want to support the dev target from our current image? I don't think it's necessary as we have devcontainer and native install.

@SchrodingersGat
Copy link
Member

I would think so from my side. What do you mean @SchrodingersGat .

I support moving entirely to alpine for base image

Are we still want to support the dev target from our current image? I don't think it's necessary as we have devcontainer and native install.

I suppose this is a fair point? But is there a downside to leaving it in, at least for a while, for anyone who would still be using this?

@wolflu05
Copy link
Contributor Author

wolflu05 commented Jul 9, 2023

I suppose this is a fair point? But is there a downside to leaving it in, at least for a while, for anyone who would still be using this?

I plan to replace the current Dockerfile completely with the .alpine one. And in the alpine one is no dev target. There is just the production target. So it would be more effort to recreate a dev target with alpine smoothly.
EDIT: Added it, was not that much work as I thought it would be.

@wolflu05
Copy link
Contributor Author

wolflu05 commented Jul 9, 2023

@matmair @SchrodingersGat I'm done now with the changes I wanted to do. The image now weights 524MB. The biggest things are (estimated):

  • weasyprint, fonts, ... (~100MB)
  • mysql related stuff (~70MB)
  • translated js templates (~50MB)
  • git (~20MB)

🚧 Help needed

I already tested a lot of scenarios with existing installs, other dbms, already installed plugins, ... and everything worked so far. Before this is ready to merge, I'd like to ask if you could test this too so we do not overlook something.

@SchrodingersGat
Copy link
Member

@wolflu05 awesome :) I'll take a look at this tonight

@SchrodingersGat SchrodingersGat added this to the 0.13.0 milestone Jul 10, 2023
@SchrodingersGat
Copy link
Member

The development image works as expected on my end. @matmair any final thoughts before we merge this in?

@matmair
Copy link
Contributor

matmair commented Jul 10, 2023

LGTM; I take it that we are fully removing support for the Raspberry Pi architecture with this as the package resources are removed?

# Weasyprint requirements : https://doc.courtbouillon.org/weasyprint/stable/first_steps.html#debian-11
poppler-utils libpango-1.0-0 libpangoft2-1.0-0 \
RUN apk add --no-cache \
git gettext py-cryptography \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure we need git any more

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about to install a plugin from a remote git repo?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think it is a good idea to push that way to install plugins; discovery is slower and the support for nested plugins is limited; pip as a distribution way is way cleaner

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would argue to keep git and add a big note somewhere to the plugin dev
section. Sometimes you just want to test something and then git is easier.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pip as a distribution way is way cleaner

Absolutely agreed, but there may be users who have their own plugins not available via pip?


# Server init entrypoint
ENTRYPOINT ["/bin/bash", "./init.sh"]
COPY InvenTree ./InvenTree
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we clean up what we ship in a next step? Ie. remove contrib things, assets and anything else that is not needed to run the app.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there is much we can remove, we already only copy the InvenTree folder.

@SchrodingersGat
Copy link
Member

LGTM; I take it that we are fully removing support for the Raspberry Pi architecture with this as the package resources are removed?

@wolflu05 does removing RPi support gain us any space? I would imagine not? Perhaps it is beneficial to leave it in for anyone who wants to target raspbian

@wolflu05
Copy link
Contributor Author

Not sure, haven't tested, because I have no access to one, should I add it and Someone of you test it?

@SchrodingersGat
Copy link
Member

Not sure, haven't tested, because I have no access to one, should I add it and Someone of you test it?

I don't have a way to test it really, but I think maybe we should just leave those lines in. Unless you are building to target raspbian, they don't do anything

@wolflu05 wolflu05 added the breaking Indicates a major update or change which breaks compatibility label Jul 10, 2023
@SchrodingersGat
Copy link
Member

@wolflu05 thanks for the fine work here :)

@SchrodingersGat SchrodingersGat merged commit 4a1f733 into inventree:master Jul 10, 2023
14 checks passed
@wolflu05 wolflu05 deleted the refactor/dockerfile branch July 10, 2023 09:53
@SchrodingersGat
Copy link
Member

SchrodingersGat commented Jul 10, 2023

@wolflu05 this PR truly was a breaking change: https://github.com/inventree/InvenTree/actions/runs/5506646753/jobs/10035587151

Looks like we did not run through the complete set of unit tests...

Update

ok, I missed that this line was deleted - these are pretty critical for report generation! Was this intentional, or an oversight? We'll need to add these back in.

Screenshot 2023-07-10 at 8 27 57 pm

Fix

See #5213

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Indicates a major update or change which breaks compatibility docker Docker / docker-compose refactor
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reduce size of docker image
3 participants