Add pip caching for faster build (#35026)
---------

Co-authored-by: Arthur Volant <arthur.volant@adevinta.com>
Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
(cherry picked from commit 66871a0)
V0lantis authored and ephraimbuddy committed Nov 1, 2023
1 parent 0abaa44 commit fb2fcfe
Showing 7 changed files with 40 additions and 13 deletions.
26 changes: 18 additions & 8 deletions Dockerfile
@@ -546,9 +546,9 @@ function common::install_pip_version() {
echo "${COLOR_BLUE}Installing pip version ${AIRFLOW_PIP_VERSION}${COLOR_RESET}"
echo
if [[ ${AIRFLOW_PIP_VERSION} =~ .*https.* ]]; then
-pip install --disable-pip-version-check --no-cache-dir "pip @ ${AIRFLOW_PIP_VERSION}"
+pip install --disable-pip-version-check "pip @ ${AIRFLOW_PIP_VERSION}"
else
-pip install --disable-pip-version-check --no-cache-dir "pip==${AIRFLOW_PIP_VERSION}"
+pip install --disable-pip-version-check "pip==${AIRFLOW_PIP_VERSION}"
fi
mkdir -p "${HOME}/.local/bin"
}
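
The branch above chooses between two requirement specifiers depending on whether AIRFLOW_PIP_VERSION looks like a URL. The following is a minimal sketch of that selection logic only; the helper name pip_requirement_spec and the sample values are hypothetical and are not part of the commit.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the commit): prints the requirement specifier
# that would be passed to `pip install` for a given AIRFLOW_PIP_VERSION value.
pip_requirement_spec() {
    local version="$1"
    if [[ ${version} =~ .*https.* ]]; then
        # The value contains "https", so treat it as a direct reference (name @ URL).
        printf 'pip @ %s\n' "${version}"
    else
        # Otherwise treat it as a plain version number and pin it exactly.
        printf 'pip==%s\n' "${version}"
    fi
}

# Sample values, chosen for illustration only.
pip_requirement_spec "23.3.1"
pip_requirement_spec "git+https://github.com/pypa/pip.git@main"
```

With these sample inputs the sketch prints `pip==23.3.1` and then `pip @ git+https://github.com/pypa/pip.git@main`.
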
@@ -1114,7 +1114,7 @@ if [[ -n "${_PIP_ADDITIONAL_REQUIREMENTS=}" ]] ; then
>&2 echo " the container starts, so it is only useful for testing and trying out"
>&2 echo " of adding dependencies."
>&2 echo
-pip install --root-user-action ignore --no-cache-dir ${_PIP_ADDITIONAL_REQUIREMENTS}
+pip install --root-user-action ignore ${_PIP_ADDITIONAL_REQUIREMENTS}
fi
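
The guard in this hunk relies on the `${VAR=}` expansion, which assigns an empty string when the variable is unset, so the test does not fail under `nounset`. The sketch below illustrates that behaviour and the word splitting of the requirement list. The function name and the package names are hypothetical, and nothing is actually installed.

```shell
#!/usr/bin/env bash
set -o nounset

# Hypothetical helper (not in the commit): reports what would be installed
# at container start instead of calling pip.
describe_additional_requirements() {
    # "${VAR=}" assigns "" when VAR is unset, so this test is safe under nounset.
    if [[ -n "${_PIP_ADDITIONAL_REQUIREMENTS=}" ]]; then
        # Split the space-separated list into separate arguments, as the
        # unquoted expansion in the entrypoint does.
        local -a reqs
        read -r -a reqs <<< "${_PIP_ADDITIONAL_REQUIREMENTS}"
        echo "would install ${#reqs[@]} requirement(s): ${reqs[*]}"
    else
        echo "nothing to install"
    fi
}

unset _PIP_ADDITIONAL_REQUIREMENTS
describe_additional_requirements

# Placeholder package names, for illustration only.
_PIP_ADDITIONAL_REQUIREMENTS="pkg-a==1.0 pkg-b==2.0"
describe_additional_requirements
```

The first call prints `nothing to install`; the second prints `would install 2 requirement(s): pkg-a==1.0 pkg-b==2.0`.
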


@@ -1190,7 +1190,8 @@ SHELL ["/bin/bash", "-o", "pipefail", "-o", "errexit", "-o", "nounset", "-o", "n
ARG PYTHON_BASE_IMAGE
ENV PYTHON_BASE_IMAGE=${PYTHON_BASE_IMAGE} \
DEBIAN_FRONTEND=noninteractive LANGUAGE=C.UTF-8 LANG=C.UTF-8 LC_ALL=C.UTF-8 \
-LC_CTYPE=C.UTF-8 LC_MESSAGES=C.UTF-8
+LC_CTYPE=C.UTF-8 LC_MESSAGES=C.UTF-8 \
+PIP_CACHE_DIR=/tmp/.cache/pip

ARG DEV_APT_DEPS=""
ARG ADDITIONAL_DEV_APT_DEPS=""
@@ -1375,8 +1376,15 @@ WORKDIR ${AIRFLOW_HOME}
COPY --from=scripts install_from_docker_context_files.sh install_airflow.sh \
install_additional_dependencies.sh /scripts/docker/

-# hadolint ignore=SC2086, SC2010
-RUN if [[ ${INSTALL_PACKAGES_FROM_CONTEXT} == "true" ]]; then \
+# Useful for creating a cache id based on the underlying architecture, preventing the use of cached python packages from
+# an incorrect architecture.
+ARG TARGETARCH
+# Value to be able to easily change cache id and therefore use a bare new cache
+ARG PIP_CACHE_EPOCH="0"
+
+# hadolint ignore=SC2086, SC2010, DL3042
+RUN --mount=type=cache,id=$PYTHON_BASE_IMAGE-$AIRFLOW_PIP_VERSION-$TARGETARCH-$PIP_CACHE_EPOCH,target=/tmp/.cache/pip,uid=${AIRFLOW_UID} \
+if [[ ${INSTALL_PACKAGES_FROM_CONTEXT} == "true" ]]; then \
bash /scripts/docker/install_from_docker_context_files.sh; \
fi; \
if ! airflow version 2>/dev/null >/dev/null; then \
@@ -1394,8 +1402,10 @@ RUN if [[ ${INSTALL_PACKAGES_FROM_CONTEXT} == "true" ]]; then \
# In case there is a requirements.txt file in "docker-context-files" it will be installed
# during the build additionally to whatever has been installed so far. It is recommended that
# the requirements.txt contains only dependencies with == version specification
-RUN if [[ -f /docker-context-files/requirements.txt ]]; then \
-pip install --no-cache-dir --user -r /docker-context-files/requirements.txt; \
+# hadolint ignore=DL3042
+RUN --mount=type=cache,id=additional-requirements-$PYTHON_BASE_IMAGE-$AIRFLOW_PIP_VERSION-$TARGETARCH-$PIP_CACHE_EPOCH,target=/tmp/.cache/pip,uid=${AIRFLOW_UID} \
+if [[ -f /docker-context-files/requirements.txt ]]; then \
+pip install --user -r /docker-context-files/requirements.txt; \
fi
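
The two cache mounts in this file use ids built from the same four build arguments, and the second one adds an `additional-requirements-` prefix, which gives it a cache separate from the first. The sketch below shows how those id strings come out for one set of values. The helper name pip_cache_id and every value assigned here are assumptions made for illustration, not values taken from the commit.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the commit): assembles a cache id in the same
# order as the id= field of the RUN --mount=type=cache lines above.
pip_cache_id() {
    local prefix="${1:-}"
    printf '%s%s-%s-%s-%s\n' "${prefix}" "${PYTHON_BASE_IMAGE}" \
        "${AIRFLOW_PIP_VERSION}" "${TARGETARCH}" "${PIP_CACHE_EPOCH}"
}

# Illustrative values only.
PYTHON_BASE_IMAGE="python:3.8-slim-bookworm"
AIRFLOW_PIP_VERSION="23.3.1"
TARGETARCH="amd64"
PIP_CACHE_EPOCH="0"

pip_cache_id                              # id of the first cache mount
pip_cache_id "additional-requirements-"   # id of the requirements.txt mount

# Changing any component, for example the epoch, yields a different id.
PIP_CACHE_EPOCH="1"
pip_cache_id
```

With these values the sketch prints `python:3.8-slim-bookworm-23.3.1-amd64-0`, then `additional-requirements-python:3.8-slim-bookworm-23.3.1-amd64-0`, then `python:3.8-slim-bookworm-23.3.1-amd64-1`.
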

##############################################################################################
4 changes: 2 additions & 2 deletions Dockerfile.ci
@@ -506,9 +506,9 @@ function common::install_pip_version() {
echo "${COLOR_BLUE}Installing pip version ${AIRFLOW_PIP_VERSION}${COLOR_RESET}"
echo
if [[ ${AIRFLOW_PIP_VERSION} =~ .*https.* ]]; then
-pip install --disable-pip-version-check --no-cache-dir "pip @ ${AIRFLOW_PIP_VERSION}"
+pip install --disable-pip-version-check "pip @ ${AIRFLOW_PIP_VERSION}"
else
-pip install --disable-pip-version-check --no-cache-dir "pip==${AIRFLOW_PIP_VERSION}"
+pip install --disable-pip-version-check "pip==${AIRFLOW_PIP_VERSION}"
fi
mkdir -p "${HOME}/.local/bin"
}
3 changes: 3 additions & 0 deletions docs/docker-stack/build-arg-ref.rst
@@ -278,3 +278,6 @@ Docker context files.
 |                                          |                                          | This allows to optimize iterations for   |
 |                                          |                                          | Image builds and speeds up CI builds.    |
 +------------------------------------------+------------------------------------------+------------------------------------------+
+| ``PIP_CACHE_EPOCH``                      | ``"0"``                                  | Allows invalidating the pip cache by     |
+|                                          |                                          | passing a new value.                     |
++------------------------------------------+------------------------------------------+------------------------------------------+
12 changes: 12 additions & 0 deletions docs/docker-stack/build.rst
@@ -972,3 +972,15 @@ The architecture of the images

You can read more details about the images - the context, their parameters and internal structure in the
`IMAGES.rst <https://github.com/apache/airflow/blob/main/IMAGES.rst>`_ document.


+Pip packages caching
+....................
+
+To enable faster iteration when building the image locally (especially if you are testing different combinations of
+Python packages), pip caching has been enabled. The cache id is based on four parameters:
+
+1. ``PYTHON_BASE_IMAGE``: avoids sharing the same cache across different Python versions and target operating systems
+2. ``AIRFLOW_PIP_VERSION``
+3. ``TARGETARCH``: avoids sharing architecture-specific cached packages
+4. ``PIP_CACHE_EPOCH``: allows changing the cache id by passing ``PIP_CACHE_EPOCH`` as a ``--build-arg``
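
As a rough illustration of the fourth parameter, the sketch below assembles a `docker build` command line that passes a new `PIP_CACHE_EPOCH`, and prints it rather than running it. The helper name, the image tag `my-airflow:local`, and the build context `.` are assumptions. Cache mounts are a BuildKit feature, so the printed command would need a BuildKit-enabled builder to have any effect on the pip cache.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the commit): prints a docker build command
# that requests a fresh pip cache by passing a new PIP_CACHE_EPOCH value.
build_command_with_epoch() {
    local epoch="$1"
    # The tag and the build context are placeholders for illustration.
    local -a cmd=(docker build . --build-arg "PIP_CACHE_EPOCH=${epoch}" --tag "my-airflow:local")
    echo "${cmd[*]}"
}

# Dry run: print the command instead of executing it.
build_command_with_epoch 1
```

This prints `docker build . --build-arg PIP_CACHE_EPOCH=1 --tag my-airflow:local`. Any value that differs from the one used in earlier builds changes the cache id, so the number itself carries no meaning.
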
2 changes: 2 additions & 0 deletions docs/docker-stack/changelog.rst
@@ -68,6 +68,8 @@ Airflow 2.7

* Docker CLI version in the image is bumped to 24.0.6 version.

+* Pip caching has been enabled to speed up local builds of custom images.

* 2.7.0

* As of now, Python 3.7 is no longer supported by the Python community. Therefore, to use Airflow 2.7.0 and above, you must ensure your Python version is
4 changes: 2 additions & 2 deletions scripts/docker/common.sh
@@ -76,9 +76,9 @@ function common::install_pip_version() {
echo "${COLOR_BLUE}Installing pip version ${AIRFLOW_PIP_VERSION}${COLOR_RESET}"
echo
if [[ ${AIRFLOW_PIP_VERSION} =~ .*https.* ]]; then
-pip install --disable-pip-version-check --no-cache-dir "pip @ ${AIRFLOW_PIP_VERSION}"
+pip install --disable-pip-version-check "pip @ ${AIRFLOW_PIP_VERSION}"
else
-pip install --disable-pip-version-check --no-cache-dir "pip==${AIRFLOW_PIP_VERSION}"
+pip install --disable-pip-version-check "pip==${AIRFLOW_PIP_VERSION}"
fi
mkdir -p "${HOME}/.local/bin"
}
2 changes: 1 addition & 1 deletion scripts/docker/entrypoint_prod.sh
@@ -308,7 +308,7 @@ if [[ -n "${_PIP_ADDITIONAL_REQUIREMENTS=}" ]] ; then
>&2 echo " the container starts, so it is only useful for testing and trying out"
>&2 echo " of adding dependencies."
>&2 echo
-pip install --root-user-action ignore --no-cache-dir ${_PIP_ADDITIONAL_REQUIREMENTS}
+pip install --root-user-action ignore ${_PIP_ADDITIONAL_REQUIREMENTS}
fi

