feat(dockerfile): Add pip caching for faster build #35026

Merged on Oct 31, 2023 (11 commits)
26 changes: 18 additions & 8 deletions Dockerfile
@@ -553,9 +553,9 @@ function common::install_pip_version() {
     echo "${COLOR_BLUE}Installing pip version ${AIRFLOW_PIP_VERSION}${COLOR_RESET}"
     echo
     if [[ ${AIRFLOW_PIP_VERSION} =~ .*https.* ]]; then
-        pip install --disable-pip-version-check --no-cache-dir "pip @ ${AIRFLOW_PIP_VERSION}"
+        pip install --disable-pip-version-check "pip @ ${AIRFLOW_PIP_VERSION}"
     else
-        pip install --disable-pip-version-check --no-cache-dir "pip==${AIRFLOW_PIP_VERSION}"
+        pip install --disable-pip-version-check "pip==${AIRFLOW_PIP_VERSION}"
     fi
     mkdir -p "${HOME}/.local/bin"
 }
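The function in the hunk above picks between two requirement specifiers depending on whether `AIRFLOW_PIP_VERSION` looks like a URL. A minimal sketch of that selection, assuming a hypothetical helper named `pip_requirement_spec` and using a POSIX `case` pattern in place of the bash `=~` test from the source (the sample URL is illustrative only):

```shell
# Hypothetical helper (not part of the PR): prints the requirement specifier
# that would be passed to `pip install` for a given AIRFLOW_PIP_VERSION value.
pip_requirement_spec() {
    case "$1" in
        # Value contains "https": treat it as a direct URL reference ("name @ url")
        *https*) printf 'pip @ %s\n' "$1" ;;
        # Anything else: treat it as an exact version pin
        *)       printf 'pip==%s\n' "$1" ;;
    esac
}

pip_requirement_spec "23.3.1"
pip_requirement_spec "https://example.com/pip-main.zip"
```

The change in this PR does not touch this selection; it only drops `--no-cache-dir` so that either form of install can reuse the pip cache.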
@@ -1123,7 +1123,7 @@ if [[ -n "${_PIP_ADDITIONAL_REQUIREMENTS=}" ]] ; then
     >&2 echo " the container starts, so it is only useful for testing and trying out"
     >&2 echo " of adding dependencies."
     >&2 echo
-    pip install --root-user-action ignore --no-cache-dir ${_PIP_ADDITIONAL_REQUIREMENTS}
+    pip install --root-user-action ignore ${_PIP_ADDITIONAL_REQUIREMENTS}
 fi
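The guard in this hunk's header uses the `${VAR=}` expansion, which assigns an empty string when the variable is unset. A small sketch of why that form is safe when `nounset` is enabled, assuming a hypothetical function name `should_install_extras` and an illustrative requirement string:

```shell
set -u  # treat expansion of unset variables as an error

# Hypothetical helper (not part of the PR) that mirrors the guard condition.
should_install_extras() {
    # "${VAR=}" assigns an empty string when VAR is unset, so this test
    # does not trip nounset; -n is then true only for a non-empty value.
    if [ -n "${_PIP_ADDITIONAL_REQUIREMENTS=}" ]; then
        echo "install: ${_PIP_ADDITIONAL_REQUIREMENTS}"
    else
        echo "skip"
    fi
}

unset _PIP_ADDITIONAL_REQUIREMENTS
should_install_extras

_PIP_ADDITIONAL_REQUIREMENTS="lxml==4.9.3"   # illustrative requirement
should_install_extras
```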


@@ -1201,7 +1201,8 @@ SHELL ["/bin/bash", "-o", "pipefail", "-o", "errexit", "-o", "nounset", "-o", "n
 ARG PYTHON_BASE_IMAGE
 ENV PYTHON_BASE_IMAGE=${PYTHON_BASE_IMAGE} \
     DEBIAN_FRONTEND=noninteractive LANGUAGE=C.UTF-8 LANG=C.UTF-8 LC_ALL=C.UTF-8 \
-    LC_CTYPE=C.UTF-8 LC_MESSAGES=C.UTF-8
+    LC_CTYPE=C.UTF-8 LC_MESSAGES=C.UTF-8 \
+    PIP_CACHE_DIR=/tmp/.cache/pip

 ARG DEV_APT_DEPS=""
 ARG ADDITIONAL_DEV_APT_DEPS=""
@@ -1386,8 +1387,15 @@ WORKDIR ${AIRFLOW_HOME}
 COPY --from=scripts install_from_docker_context_files.sh install_airflow.sh \
      install_additional_dependencies.sh /scripts/docker/

-# hadolint ignore=SC2086, SC2010
-RUN if [[ ${INSTALL_PACKAGES_FROM_CONTEXT} == "true" ]]; then \
+# Used to build a cache id based on the underlying architecture, which prevents reusing cached Python packages
+# built for a different architecture.
+ARG TARGETARCH
+# Bump this value to change the cache id and start from a fresh, empty cache
+ARG PIP_CACHE_EPOCH="0"
+
+# hadolint ignore=SC2086, SC2010, DL3042
+RUN --mount=type=cache,id=$PYTHON_BASE_IMAGE-$AIRFLOW_PIP_VERSION-$TARGETARCH-$PIP_CACHE_EPOCH,target=/tmp/.cache/pip,uid=${AIRFLOW_UID} \
+    if [[ ${INSTALL_PACKAGES_FROM_CONTEXT} == "true" ]]; then \
         bash /scripts/docker/install_from_docker_context_files.sh; \
     fi; \
     if ! airflow version 2>/dev/null >/dev/null; then \
@@ -1405,8 +1413,10 @@ RUN if [[ ${INSTALL_PACKAGES_FROM_CONTEXT} == "true" ]]; then \
 # In case there is a requirements.txt file in "docker-context-files" it will be installed
 # during the build additionally to whatever has been installed so far. It is recommended that
 # the requirements.txt contains only dependencies with == version specification
-RUN if [[ -f /docker-context-files/requirements.txt ]]; then \
-        pip install --no-cache-dir --user -r /docker-context-files/requirements.txt; \
+# hadolint ignore=DL3042
+RUN --mount=type=cache,id=additional-requirements-$PYTHON_BASE_IMAGE-$AIRFLOW_PIP_VERSION-$TARGETARCH-$PIP_CACHE_EPOCH,target=/tmp/.cache/pip,uid=${AIRFLOW_UID} \
+    if [[ -f /docker-context-files/requirements.txt ]]; then \
+        pip install --user -r /docker-context-files/requirements.txt; \
     fi

##############################################################################################
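The two `RUN --mount=type=cache` steps in the Dockerfile hunks use different cache ids: the second one carries an `additional-requirements-` prefix, so the two steps do not share a cache, while both ids change together when `PIP_CACHE_EPOCH` is bumped. A rough sketch of how those ids are composed, assuming a hypothetical helper `cache_id` and made-up sample values for the build arguments:

```shell
# Hypothetical helper (not part of the PR) mirroring the id= expressions.
# $1 = optional prefix, $2 = PYTHON_BASE_IMAGE, $3 = AIRFLOW_PIP_VERSION,
# $4 = TARGETARCH, $5 = PIP_CACHE_EPOCH
cache_id() {
    printf '%s%s-%s-%s-%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Sample values below are illustrative assumptions, not taken from the PR.
base_image="python:3.8-slim-bookworm"
pip_version="23.3.1"
arch="amd64"

cache_id "" "$base_image" "$pip_version" "$arch" "0"
cache_id "additional-requirements-" "$base_image" "$pip_version" "$arch" "0"

# Bumping the epoch yields new ids, so both steps start from empty caches.
cache_id "" "$base_image" "$pip_version" "$arch" "1"
cache_id "additional-requirements-" "$base_image" "$pip_version" "$arch" "1"
```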
4 changes: 2 additions & 2 deletions Dockerfile.ci
@@ -513,9 +513,9 @@ function common::install_pip_version() {
     echo "${COLOR_BLUE}Installing pip version ${AIRFLOW_PIP_VERSION}${COLOR_RESET}"
     echo
     if [[ ${AIRFLOW_PIP_VERSION} =~ .*https.* ]]; then
-        pip install --disable-pip-version-check --no-cache-dir "pip @ ${AIRFLOW_PIP_VERSION}"
+        pip install --disable-pip-version-check "pip @ ${AIRFLOW_PIP_VERSION}"
     else
-        pip install --disable-pip-version-check --no-cache-dir "pip==${AIRFLOW_PIP_VERSION}"
+        pip install --disable-pip-version-check "pip==${AIRFLOW_PIP_VERSION}"
     fi
     mkdir -p "${HOME}/.local/bin"
 }
3 changes: 3 additions & 0 deletions docs/docker-stack/build-arg-ref.rst
@@ -278,3 +278,6 @@ Docker context files.
 |                                          |                                          | This allows to optimize iterations for   |
 |                                          |                                          | Image builds and speeds up CI builds.    |
 +------------------------------------------+------------------------------------------+------------------------------------------+
+| ``PIP_CACHE_EPOCH``                      | ``"0"``                                  | Allows invalidating the cache by passing |
+|                                          |                                          | a new value.                             |
++------------------------------------------+------------------------------------------+------------------------------------------+
12 changes: 12 additions & 0 deletions docs/docker-stack/build.rst
@@ -972,3 +972,15 @@ The architecture of the images

 You can read more details about the images - the context, their parameters and internal structure in the
 `IMAGES.rst <https://github.com/apache/airflow/blob/main/IMAGES.rst>`_ document.
+
+
+Pip packages caching
+....................
+
+To enable faster iteration when building the image locally (especially if you are testing different combinations of
+Python packages), pip caching has been enabled. The cache id is based on four parameters:
+
+1. ``PYTHON_BASE_IMAGE``: avoids sharing a cache between different Python versions and target operating systems
+2. ``AIRFLOW_PIP_VERSION``: avoids sharing a cache between different pip versions
+3. ``TARGETARCH``: avoids sharing architecture-specific cached packages
+4. ``PIP_CACHE_EPOCH``: lets you change the cache id by passing ``PIP_CACHE_EPOCH`` as ``--build-arg``
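As a usage illustration of the last parameter, the sketch below only prints a build command that passes a new epoch; it does not run Docker. The helper name `build_command`, the tag `my-airflow:dev`, and the epoch value are assumptions for illustration, and cache mounts require a BuildKit-enabled builder:

```shell
# Hypothetical helper (not part of the PR): prints, but does not execute,
# a build command that passes a new PIP_CACHE_EPOCH.
# $1 = epoch value, $2 = image tag (both placeholders)
build_command() {
    printf 'DOCKER_BUILDKIT=1 docker build --build-arg PIP_CACHE_EPOCH=%s --tag %s .\n' "$1" "$2"
}

build_command 1 my-airflow:dev
```

Any value that differs from the one used in earlier builds should produce a new cache id, so the pip cache starts out empty for that build.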
4 changes: 4 additions & 0 deletions docs/docker-stack/changelog.rst
@@ -58,6 +58,10 @@ here so that users affected can find the reason for the changes.
 Airflow 2.7
 ~~~~~~~~~~~

+* 2.7.4
+
+  * Pip caching for local builds has been enabled to speed up building custom images locally
+
 * 2.7.3

   * Add experimental feature for select type of MySQL Client libraries during the build custom image via ``INSTALL_MYSQL_CLIENT_TYPE``
4 changes: 2 additions & 2 deletions scripts/docker/common.sh
@@ -77,9 +77,9 @@ function common::install_pip_version() {
     echo "${COLOR_BLUE}Installing pip version ${AIRFLOW_PIP_VERSION}${COLOR_RESET}"
     echo
     if [[ ${AIRFLOW_PIP_VERSION} =~ .*https.* ]]; then
-        pip install --disable-pip-version-check --no-cache-dir "pip @ ${AIRFLOW_PIP_VERSION}"
+        pip install --disable-pip-version-check "pip @ ${AIRFLOW_PIP_VERSION}"
     else
-        pip install --disable-pip-version-check --no-cache-dir "pip==${AIRFLOW_PIP_VERSION}"
+        pip install --disable-pip-version-check "pip==${AIRFLOW_PIP_VERSION}"
     fi
     mkdir -p "${HOME}/.local/bin"
 }
2 changes: 1 addition & 1 deletion scripts/docker/entrypoint_prod.sh
@@ -308,7 +308,7 @@ if [[ -n "${_PIP_ADDITIONAL_REQUIREMENTS=}" ]] ; then
     >&2 echo " the container starts, so it is only useful for testing and trying out"
     >&2 echo " of adding dependencies."
     >&2 echo
-    pip install --root-user-action ignore --no-cache-dir ${_PIP_ADDITIONAL_REQUIREMENTS}
+    pip install --root-user-action ignore ${_PIP_ADDITIONAL_REQUIREMENTS}
 fi

