Improve speed of regular upgrades when dependencies change. (apache#37360)

When regular dependencies change, we enable the
UPGRADE_TO_NEWER_DEPENDENCIES flag. Recent improvements allow us to
install airflow with the `[devel-ci]` extra from the branch tip, which
opens up another option for speeding up installation when some
dependencies change.

So far, UPGRADE_TO_NEWER_DEPENDENCIES required reinstalling all
dependencies, with airflow attempting to upgrade them eagerly. This
currently takes ~28 minutes on CI from scratch. However, this is not
needed. We can still start from the set of dependencies pre-installed
from the branch tip, and only trigger UPGRADE_TO_NEWER_DEPENDENCIES
after that, so that the "URL" installation cache does not get
invalidated.

This should limit the time needed to build such an image to ~8 minutes
in CI when the cache is built.
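
As a rough illustration of the change in when the branch-tip pre-install
runs, the two conditions can be compared side by side. This is only a
sketch: the function names are hypothetical and exist purely for this
example, while the two flags and their "true"/"false" values follow the
RUN step in Dockerfile.ci.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: the function names below are hypothetical.
# The conditions mirror the Dockerfile.ci RUN step before and after this commit.

# Before: the branch-tip pre-install was skipped whenever an upgrade was requested.
branch_tip_preinstall_before() {
    local pre_cached="$1" upgrade="$2"
    [[ "${pre_cached}" == "true" && "${upgrade}" == "false" ]]
}

# After: the pre-install depends only on the pre-cache flag, so the cached
# layer is reused even when an upgrade is requested later in the build.
branch_tip_preinstall_after() {
    local pre_cached="$1"
    [[ "${pre_cached}" == "true" ]]
}

for upgrade in false true; do
    if branch_tip_preinstall_before true "${upgrade}"; then before=yes; else before=no; fi
    if branch_tip_preinstall_after true; then after=yes; else after=no; fi
    echo "upgrade=${upgrade} pre-install before=${before} after=${after}"
done
```

With pre-caching enabled, the only case that differs is upgrade=true,
where the pre-install used to be skipped and now runs.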

In the future we might optimize this further, as we now have all the
mechanisms necessary to do that in selective checks. For example, we
could trigger UPGRADE_TO_NEWER_DEPENDENCIES only when some dependencies
are changed, and not when new dependencies are added, further limiting
the time needed to build such an image to some ~3-4 minutes. But this
should be a separate PR and should be carefully thought through, as
it might also lead to conflicts with the latest constraints.
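
The "changed versus added" distinction mentioned above could look
something like the sketch below. It is purely hypothetical and is not
the selective checks implementation: the function name and the
one-requirement-per-line file format are assumptions made for this
example, and it does nothing about the possible conflicts with the
latest constraints.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not part of Airflow's selective checks.
# Assumes two plain-text files with one requirement per line.
# Succeeds (exit 0) when a requirement listed in OLD is missing from, or
# different in, NEW. Requirements that appear only in NEW are ignored.
existing_dependencies_changed() {
    local old_file="$1" new_file="$2"
    local changed
    # comm -23 prints the lines that occur only in the first sorted input.
    changed="$(LC_ALL=C comm -23 \
        <(LC_ALL=C sort -u "${old_file}") \
        <(LC_ALL=C sort -u "${new_file}"))"
    [[ -n "${changed}" ]]
}
```

A caller could then set UPGRADE_TO_NEWER_DEPENDENCIES to "true" only
when this check succeeds. Note that in this sketch a removed requirement
also counts as a change.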
potiuk committed Feb 12, 2024
1 parent 82d9a89 commit d8a42ca
Showing 1 changed file with 18 additions and 19 deletions.
37 changes: 18 additions & 19 deletions Dockerfile.ci
@@ -1127,23 +1127,6 @@ ENV AIRFLOW_REPO=${AIRFLOW_REPO}\

 RUN echo "Airflow version: ${AIRFLOW_VERSION}"

-# Those are additional constraints that are needed for some extras but we do not want to
-# force them on the main Airflow package. Currently we need no extra limits as PIP 23.1+ has much better
-# dependency resolution and we do not need to limit the versions of the dependencies
-#
-# boto3 is limited to <1.34 because of aiobotocore that only works with 1.33 and we want to help
-# `pip` to limit the versions it checks and limit backtracking, by explicitly specifying these limits
-# when performing eager upgrade of dependencies - this way it won't even consider 1.34 versions of boto
-# We should update it every time a new version of aiobotocore is released supporting 1.34
-#
-ARG EAGER_UPGRADE_ADDITIONAL_REQUIREMENTS="boto3>=1.33,<1.34"
-ARG UPGRADE_TO_NEWER_DEPENDENCIES="false"
-ARG VERSION_SUFFIX_FOR_PYPI=""
-
-ENV EAGER_UPGRADE_ADDITIONAL_REQUIREMENTS=${EAGER_UPGRADE_ADDITIONAL_REQUIREMENTS} \
-    UPGRADE_TO_NEWER_DEPENDENCIES=${UPGRADE_TO_NEWER_DEPENDENCIES} \
-    VERSION_SUFFIX_FOR_PYPI=${VERSION_SUFFIX_FOR_PYPI}
-
 # Copy all scripts required for installation - changing any of those should lead to
 # rebuilding from here
 COPY --from=scripts install_pip_version.sh install_airflow_dependencies_from_branch_tip.sh \
@@ -1158,8 +1141,7 @@ COPY --from=scripts install_pip_version.sh install_airflow_dependencies_from_bra
 # the cache is only used when "upgrade to newer dependencies" is not set to automatically
 # account for removed dependencies (we do not install them in the first place)
 RUN bash /scripts/docker/install_pip_version.sh; \
-    if [[ ${AIRFLOW_PRE_CACHED_PIP_PACKAGES} == "true" && \
-        ${UPGRADE_TO_NEWER_DEPENDENCIES} == "false" ]]; then \
+    if [[ ${AIRFLOW_PRE_CACHED_PIP_PACKAGES} == "true" ]]; then \
         bash /scripts/docker/install_airflow_dependencies_from_branch_tip.sh; \
     fi

@@ -1184,6 +1166,23 @@ COPY airflow_pre_installed_providers.txt ${AIRFLOW_SOURCES}/
 COPY hatch_build.py ${AIRFLOW_SOURCES}/
 COPY --from=scripts install_airflow.sh /scripts/docker/

+# Those are additional constraints that are needed for some extras but we do not want to
+# force them on the main Airflow package. Currently we need no extra limits as PIP 23.1+ has much better
+# dependency resolution and we do not need to limit the versions of the dependencies
+#
+# boto3 is limited to <1.34 because of aiobotocore that only works with 1.33 and we want to help
+# `pip` to limit the versions it checks and limit backtracking, by explicitly specifying these limits
+# when performing eager upgrade of dependencies - this way it won't even consider 1.34 versions of boto
+# We should update it every time a new version of aiobotocore is released supporting 1.34
+#
+ARG EAGER_UPGRADE_ADDITIONAL_REQUIREMENTS="boto3>=1.33,<1.34"
+ARG UPGRADE_TO_NEWER_DEPENDENCIES="false"
+ARG VERSION_SUFFIX_FOR_PYPI=""
+
+ENV EAGER_UPGRADE_ADDITIONAL_REQUIREMENTS=${EAGER_UPGRADE_ADDITIONAL_REQUIREMENTS} \
+    UPGRADE_TO_NEWER_DEPENDENCIES=${UPGRADE_TO_NEWER_DEPENDENCIES} \
+    VERSION_SUFFIX_FOR_PYPI=${VERSION_SUFFIX_FOR_PYPI}
+
 # The goal of this line is to install the dependencies from the most current pyproject.toml from sources
 # This will be usually incremental small set of packages in CI optimized build, so it will be very fast
 # In non-CI optimized build this will install all dependencies before installing sources.
