Docker performance improvements and reduce image size #1016

Merged (8 commits) on Mar 21, 2024

@k1lgor k1lgor commented Feb 14, 2024

Info

  • Refactored the Dockerfile into a multi-stage build, which builds faster and reduces the image size

    Time for initial build:

    • [+] Building 113.2s (15/15) FINISHED

image

  • Refactored entrypoint.sh - it is more optimized and avoids unnecessary iterations

k1lgor and others added 8 commits June 30, 2023 13:28
- Change base image from python:3.11-slim to python:3.11-alpine in the builder stage
- Use apk package manager instead of apt-get to install dependencies
- Remove unnecessary sudo command
- Use pip install --no-cache-dir instead of pip install
- Add second stage in Dockerfile for final image
- Copy dependencies and entrypoint script from builder stage to final stage

Signed-off-by: Plamen Ivanov <paco.iwanow@gmail.com>
- add coding utf-8 declaration
- use `find` command to patch permissions of generated files

Signed-off-by: Plamen Ivanov <paco.iwanow@gmail.com>
@definitiontv

Pretty sure this will break ARM compatibility

@definitiontv

> Pretty sure this will break ARM compatibility

Belay that - it compiles fine on my ARM machine. The new size is less than half the original - nice!

@ATheorell
Collaborator

Thanks for this! Can you have a look at this too @TheoMcCabe ?

@AntonOsika
Collaborator

I don't like the maxdepth. Why do we have that?

Otherwise, do you see any risks with this PR @k1lgor ?

@AntonOsika
Collaborator

Curious what was approx image size before this? :)

@k1lgor
Contributor Author

k1lgor commented Feb 22, 2024

> I don't like the maxdepth. Why do we have that?
>
> Otherwise, do you see any risks with this PR @k1lgor ?

@AntonOsika
It restricts the `find` search to only the immediate children of the directory and avoids a recursive search.
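
As a rough illustration of what `-maxdepth` changes, the sketch below builds a throwaway directory tree and compares a recursive `find` with a depth-limited one. The directory and file names are made up for the demonstration, and `-maxdepth 1` is an assumed value; the thread does not state which depth the entrypoint script actually uses.

```shell
#!/bin/sh
# Build a small throwaway tree: one file at the top level, one nested deeper.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/sub/deeper"
touch "$demo_dir/top.txt" "$demo_dir/sub/deeper/nested.txt"

# Recursive search: descends into every subdirectory.
all_files="$(find "$demo_dir" -type f | wc -l | tr -d ' ')"

# Depth-limited search: only the starting directory's immediate children.
shallow_files="$(find "$demo_dir" -maxdepth 1 -type f | wc -l | tr -d ' ')"

# prints: recursive: 2, maxdepth 1: 1
echo "recursive: $all_files, maxdepth 1: $shallow_files"
rm -rf "$demo_dir"
```

The depth-limited pass touches fewer files, but it would also skip any generated files that sit in nested subdirectories, which may be the concern behind the question.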

@k1lgor
Contributor Author

k1lgor commented Feb 22, 2024

> Curious what was approx image size before this? :)

60% more or less
image

@definitiontv

> Pretty sure this will break ARM compatibility
>
> Belay that - it compiles fine on my ARM machine. The new size is less than half the original - nice!

I may have spoken too soon. I only ran it quickly, but there are some problems with using Alpine for those of us who use Docker Compose containers.
1. ASH instead of the default BASH - even adding it to the APK ADD didn't seem to work for me, so I get a binary error on ARM - not sure why.
2. Related to 1. I actually always run gpte inside the container and link to the project directory. Most of the code generated on my system assumes Python, bash and APT are available for installing packages. Alpine has significant differences which need to be "explained".
3. This should be a separate issue, but the default example project directory inside the git repo should NOT really be persisted for real work. The default root should really be the user's home folder, a path passed in, or an environment variable, which avoids the very dangerous permission hack which is currently being used. Many of us NEVER store new unrelated code or data in external Git repos.
4. This project continues the default of running the example project at startup. That's a costly proof of concept which should not be needed now? I override this in my compose file, but for those running directly from the image I think the default should simply be the project folder passed by the user, or a prompt?

Am I missing something? Are other people successfully running it all like this in their workflow?
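
Point 3 in the comment above proposes taking the project directory from a path argument, an environment variable, or the user's home folder. One way that precedence could look in an entrypoint script is sketched below. The function name `resolve_project_dir`, the `PROJECT_DIR` variable, and the `$HOME/projects` fallback are all hypothetical and are not taken from the repository.

```shell
#!/bin/sh
# Hypothetical helper: pick the project directory in this order:
#   1. an explicit path argument
#   2. the PROJECT_DIR environment variable (assumed name)
#   3. a "projects" folder under the user's home directory (assumed default)
resolve_project_dir() {
    if [ -n "${1:-}" ]; then
        printf '%s\n' "$1"
    elif [ -n "${PROJECT_DIR:-}" ]; then
        printf '%s\n' "$PROJECT_DIR"
    else
        printf '%s\n' "$HOME/projects"
    fi
}

project_dir="$(resolve_project_dir "$@")"
echo "Using project directory: $project_dir"
```

Because the snippet sticks to POSIX `sh` syntax, it should behave the same under Alpine's `ash` and under `bash`.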

@k1lgor
Contributor Author

k1lgor commented Feb 26, 2024

@definitiontv Could you share the error message, so that I can look for a solution?

These are the architectures supported by the Alpine image, as listed on Docker Hub:
image

@k1lgor
Contributor Author

k1lgor commented Mar 15, 2024

Hi all,
The example below keeps the original `python:3.11-slim` base image instead of Alpine. With this example, the image size is reduced by about 50% and the initial build took less than 2 minutes:
[+] Building 113.4s (14/14) FINISHED
image

```dockerfile
# Use a lightweight base image
FROM python:3.11-slim AS builder

# Install necessary system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends tk tcl gcc curl

# Set working directory
WORKDIR /app

# Copy source code and entrypoint script
COPY . .
COPY docker/entrypoint.sh ./entrypoint.sh

# Install dependencies and application in a virtual environment
RUN python -m venv /venv && /venv/bin/pip install --upgrade pip && /venv/bin/pip install .

# Final image
FROM python:3.11-slim

# Set working directory
WORKDIR /app

# Copy application and virtual environment from the builder stage
COPY --from=builder /app /app
COPY --from=builder /venv /venv

# Add the virtual environment to the PATH
ENV PATH="/venv/bin:$PATH"

# Set the entrypoint
ENTRYPOINT ["bash", "entrypoint.sh"]
```

Please take a look and be more active on the PR @definitiontv @AntonOsika @ATheorell @TheoMcCabe

@AntonOsika
Collaborator

Merging!

@AntonOsika AntonOsika merged commit d5a73ab into gpt-engineer-org:main Mar 21, 2024
5 checks passed
@k1lgor
Contributor Author

k1lgor commented Mar 23, 2024

If multiple users face issues with Alpine, we can easily switch back to Ubuntu with almost the same reduction in image size.

@snf
Contributor

snf commented Apr 18, 2024

@k1lgor it looks like something depends on arrow, which doesn't have wheels for Alpine. It tries to build it unsuccessfully because it's missing dependencies like cmake, make, gcc and the arrow libraries. I didn't get very far debugging the issue, but went back to the Ubuntu image, which works fine with the current pip dependencies.

Possibly related: apache/arrow#18036

@k1lgor
Contributor Author

k1lgor commented Apr 18, 2024

I will take a look at this one.

@viborc
Collaborator

viborc commented Apr 25, 2024

Hey @k1lgor - any updates or expected actions on this one?

@k1lgor
Contributor Author

k1lgor commented May 17, 2024

> @k1lgor it looks like something depends on arrow, which doesn't have wheels for Alpine. It tries to build it unsuccessfully because it's missing dependencies like cmake, make, gcc and the arrow libraries. I didn't get very far debugging the issue, but went back to the Ubuntu image, which works fine with the current pip dependencies.
>
> Possibly related: apache/arrow#18036

@viborc
The Alpine image does not come with those dependencies pre-installed. PyArrow has no prebuilt wheel for Alpine and has to be built from source, but I have not managed to get a build working so far.
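
Since the fallback to a source build only applies on Alpine, a script that wants to warn about it needs a way to tell which base image it is running on. A minimal sketch follows, assuming the standard `/etc/os-release` file is present; the `is_alpine` helper is hypothetical and not part of the repository.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given os-release file describes Alpine.
# Defaults to /etc/os-release, where Alpine sets the line "ID=alpine".
is_alpine() {
    os_release_file="${1:-/etc/os-release}"
    [ -r "$os_release_file" ] && grep -q '^ID=alpine$' "$os_release_file"
}

if is_alpine; then
    echo "Alpine detected: packages without a matching wheel will be built from source."
else
    echo "Alpine not detected."
fi
```

The optional file argument exists only so the check can be exercised against a sample file without running inside an Alpine container.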
