Description
Currently, if the Dockerfile being analysed is single-stage and AI is enabled for Dockershrink (DS), DS will adopt multistage by creating a final stage.
The original premise behind this was that it is always good to have a "final" stage in the Dockerfile which only contains things required at runtime, i.e., the nodejs runtime + dependencies + app code.
We need to add another condition:
Only adopt multistage if the resultant Dockerfile is doing something meaningful during its build stage.
eg 1:
Original Dockerfile
FROM node:22-alpine
EXPOSE 5000
WORKDIR /app
RUN npm install --omit=dev
CMD ["npm", "start"]
There is no benefit in adding a new "final" stage to this Dockerfile.
It already uses a light base image and essentially only installs dependencies and runs the app.
eg 2:
Original Dockerfile
FROM node:22-alpine
EXPOSE 5000
WORKDIR /app
RUN npm install && npm run build
CMD ["npm", "start"]
This Dockerfile would genuinely benefit from multistage. The build stage would install all dependencies and run the build processes.
The final stage would perform a fresh dependency install, excluding devDependencies, and only run the application.
To sum up, the additional check is:
If the Dockerfile is single-stage and it only installs dependencies and runs the application, then DO NOT adopt multistage.
But if additional tasks are being performed (e.g. build, test, lint, format, merge/minify code, etc.), then keep these in `build` and put the prod dependencies and app run commands into the final stage.
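As a rough illustration of the check above, here is a minimal Python sketch. It is not DS's actual implementation: the function names, the `BUILD_TASKS` keyword list and the regex are all assumptions made for this example, and it only recognises npm/yarn/pnpm script invocations.

```python
import re

# Hypothetical keyword list, taken from the tasks named in this issue.
BUILD_TASKS = ("build", "test", "lint", "format", "minify")

# Matches e.g. "npm run build", "yarn lint", "pnpm run test".
_TASK_RE = re.compile(
    r"\b(?:npm|yarn|pnpm)\s+(?:run\s+)?(?:" + "|".join(BUILD_TASKS) + r")\b"
)


def run_commands(dockerfile: str) -> list[str]:
    """Return the individual shell commands from all RUN instructions."""
    # Join backslash-continued lines so each instruction sits on one line.
    text = re.sub(r"\\\s*\n", " ", dockerfile)
    commands = []
    for line in text.splitlines():
        line = line.strip()
        if line.upper().startswith("RUN "):
            # Split "a && b; c" into separate commands.
            parts = re.split(r"&&|;", line[4:])
            commands.extend(p.strip() for p in parts if p.strip())
    return commands


def stage_count(dockerfile: str) -> int:
    """Count FROM instructions, i.e. the number of stages."""
    return sum(
        1 for line in dockerfile.splitlines()
        if line.strip().upper().startswith("FROM ")
    )


def should_adopt_multistage(dockerfile: str) -> bool:
    """True only for a single-stage Dockerfile that runs a build-like task."""
    if stage_count(dockerfile) != 1:
        return False  # already multistage (or no FROM); nothing to adopt
    return any(_TASK_RE.search(cmd) for cmd in run_commands(dockerfile))
```

With the two Dockerfiles from this issue, `should_adopt_multistage` returns `False` for eg 1 (only `npm install --omit=dev`) and `True` for eg 2 (`npm run build` is present). A real check would need a wider notion of "meaningful build work" than this keyword list.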
Projects analysed:
NOTE: Running DS over these projects without AI gave great results (in line with expectations).
The ideal solution would be to modify the prompt and tell the LLM to follow this rule.
After some experimentation, it seems like gpt-4o doesn't follow this but o1 does.
Plus, DS currently uses the gpt-4o 08-06 model version. Upgrading to the Nov release REALLY messed up the whole multistage change, so further tests are needed.
Consider enhancing the prompt with examples. Input: ..., Output: ...
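One way the "prompt with examples" idea could be wired up is sketched below in Python. This is not DS's prompt code: `build_few_shot_prompt` and its parameters are hypothetical names for this example, and the actual instructions and example pairs would still need to be written and tested against each model.

```python
def build_few_shot_prompt(
    instructions: str,
    examples: list[tuple[str, str]],
    target: str,
) -> str:
    """Assemble instructions, (input, output) example pairs and the target Dockerfile."""
    sections = [instructions.strip()]
    for i, (given, expected) in enumerate(examples, start=1):
        sections.append(
            f"Example {i}\nInput:\n{given.strip()}\nOutput:\n{expected.strip()}"
        )
    # The target goes last, with an open "Output:" for the model to complete.
    sections.append(f"Input:\n{target.strip()}\nOutput:")
    return "\n\n".join(sections)
```

For instance, eg 1 from this issue could be passed as an example pair whose output is identical to its input, which shows the model a case where multistage should not be adopted.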
See branch with changes
Other solutions (more effort involved):
A rule engine or custom ML model to classify a stage as either "needs multistage" or "doesn't need multistage".