Better diagnostics and self-healing of docker-compose (#17484)
There are several ways people might end up with a broken quick-start
docker-compose setup (especially on Linux):

1) they do not run the initialization steps and go straight to docker-compose up
2) they do not run docker-compose up airflow-init first

Also, on macOS/Windows the default Docker memory/disk settings are not
enough to run Airflow via docker-compose, and people report
"Airflow not working" when they simply have not allocated enough
resources.

Finally, the docker-compose file does not support all versions of Airflow,
and various problems might occur when it is used with an old
version of Airflow.

This change adds the following improvements:

* an automated check of the minimum supported Airflow version
* mkdir -p when creating the directories in the instructions
* an automated check that AIRFLOW_UID has been set (printing an
  error and a link to the instructions when it is not)
* warnings about too-low memory, CPU, and disk allocation, with a
  link to the instructions on how to increase them
* automated fixing of the ownership of the created directories, in
  case they were not created up front and ended up owned by the
  root user

(cherry picked from commit 763860c)
potiuk authored and jhtimmins committed Aug 11, 2021
1 parent bc79182 commit 20ed40b
Showing 2 changed files with 74 additions and 6 deletions.
57 changes: 55 additions & 2 deletions docs/apache-airflow/start/docker-compose.yaml
@@ -28,7 +28,7 @@
# AIRFLOW_UID - User ID in Airflow containers
# Default: 50000
# AIRFLOW_GID - Group ID in Airflow containers
-# Default: 50000
+# Default: 0
#
# Those configurations are useful mostly in case of standalone testing/running Airflow in test/try-out mode
#
@@ -133,13 +133,66 @@ services:

airflow-init:
<<: *airflow-common
command: version
entrypoint: /bin/bash
command:
- -c
- |
function ver() {
printf "%04d%04d%04d%04d" $${1//./ }
}
airflow_version=$$(gosu airflow airflow version)
airflow_version_comparable=$$(ver $${airflow_version})
min_airflow_version=2.1.0
min_airflow_version_comparable=$$(ver $${min_airflow_version})
if (( airflow_version_comparable < min_airflow_version_comparable )); then
echo -e "\033[1;31mERROR!!!: Too old Airflow version $${airflow_version}!\e[0m"
echo "The minimum Airflow version supported: $${min_airflow_version}. Only use this or higher!"
exit 1
fi
if [[ -z "${AIRFLOW_UID}" ]]; then
echo -e "\033[1;31mERROR!!!: AIRFLOW_UID not set!\e[0m"
echo "Please follow these instructions to set AIRFLOW_UID and AIRFLOW_GID environment variables:
https://airflow.apache.org/docs/apache-airflow/stable/start/docker.html#initializing-environment"
exit 1
fi
one_meg=1048576
mem_available=$$(($$(getconf _PHYS_PAGES) * $$(getconf PAGE_SIZE) / one_meg))
cpus_available=$$(grep -cE 'cpu[0-9]+' /proc/stat)
disk_available=$$(df / | tail -1 | awk '{print $$4}')
warning_resources="false"
if (( mem_available < 4000 )) ; then
echo -e "\033[1;33mWARNING!!!: Not enough memory available for Docker.\e[0m"
echo "At least 4GB of memory required. You have $$(numfmt --to iec $$((mem_available * one_meg)))"
warning_resources="true"
fi
if (( cpus_available < 2 )); then
echo -e "\033[1;33mWARNING!!!: Not enough CPUS available for Docker.\e[0m"
echo "At least 2 CPUs recommended. You have $${cpus_available}"
warning_resources="true"
fi
if (( disk_available < one_meg * 10 )); then
echo -e "\033[1;33mWARNING!!!: Not enough Disk space available for Docker.\e[0m"
echo "At least 10 GBs recommended. You have $$(numfmt --to iec $$((disk_available * 1024 )))"
warning_resources="true"
fi
if [[ $${warning_resources} == "true" ]]; then
echo
echo -e "\033[1;33mWARNING!!!: You do not have enough resources to run Airflow (see above)!\e[0m"
echo "Please follow the instructions to increase amount of resources available:"
echo " https://airflow.apache.org/docs/apache-airflow/stable/start/docker.html#before-you-begin"
fi
mkdir -p /sources/logs /sources/dags /sources/plugins
chown -R "${AIRFLOW_UID}:${AIRFLOW_GID}" /sources/{logs,dags,plugins}
exec /entrypoint airflow version
environment:
<<: *airflow-common-env
_AIRFLOW_DB_UPGRADE: 'true'
_AIRFLOW_WWW_USER_CREATE: 'true'
_AIRFLOW_WWW_USER_USERNAME: ${_AIRFLOW_WWW_USER_USERNAME:-airflow}
_AIRFLOW_WWW_USER_PASSWORD: ${_AIRFLOW_WWW_USER_PASSWORD:-airflow}
user: "0:${AIRFLOW_GID:-0}"
volumes:
- .:/sources
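The resource probes in the `airflow-init` command above can be tried standalone on a Linux host. A sketch with single `$` (the compose file doubles them as `$$`):

```shell
#!/usr/bin/env bash
# Linux-only: reads total memory, CPU count, and free disk space the
# same way the airflow-init one-liner does.
one_meg=1048576

# Physical pages * page size, converted to MiB.
mem_available=$(( $(getconf _PHYS_PAGES) * $(getconf PAGE_SIZE) / one_meg ))

# /proc/stat has one "cpuN" line per logical CPU.
cpus_available=$(grep -cE 'cpu[0-9]+' /proc/stat)

# Fourth column of df is available space in 1K blocks.
disk_available=$(df / | tail -1 | awk '{print $4}')

echo "memory: ${mem_available} MiB, cpus: ${cpus_available}, disk: ${disk_available} KiB"
```

Running this before `docker-compose up` makes it easy to see which of the three thresholds (4000 MiB memory, 2 CPUs, 10 GiB disk) would trigger a warning.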

flower:
<<: *airflow-common
23 changes: 19 additions & 4 deletions docs/apache-airflow/start/docker.rst
@@ -65,9 +65,6 @@ Some directories in the container are mounted, which means that their contents a
This file uses the latest Airflow image (`apache/airflow <https://hub.docker.com/r/apache/airflow>`__).
If you need to install a new Python library or system library, you can :doc:`build your image <docker-stack:index>`.

.. _initializing_docker_compose_environment:


Using custom images
===================

@@ -80,6 +77,8 @@ to rebuild the images on-the-fly when you run other ``docker-compose`` commands.
Examples of how you can extend the image with custom providers, python packages,
apt packages and more can be found in :doc:`Building the image <docker-stack:build>`.

.. _initializing_docker_compose_environment:

Initializing Environment
========================

@@ -89,7 +88,7 @@ On **Linux**, the mounted volumes in container use the native Linux filesystem u

.. code-block:: bash

   -mkdir ./dags ./logs ./plugins
   +mkdir -p ./dags ./logs ./plugins
    echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env
See :ref:`Docker Compose environment variables <docker-compose-env-variables>`
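Taken together, the two initialization commands can be rehearsed in a throwaway directory (the `mktemp` scratch path below is purely illustrative; the real quick start runs them where ``docker-compose.yaml`` lives):

```shell
#!/usr/bin/env bash
# Scratch directory stands in for wherever docker-compose.yaml lives.
workdir=$(mktemp -d)
cd "$workdir"

# Pre-create the mounted directories so they are not created by the
# container (and thus owned by root).
mkdir -p ./dags ./logs ./plugins

# Record the host user's UID so files written into the mounts stay
# owned by the host user; GID 0 matches the image default.
echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env

cat .env
# AIRFLOW_UID=<your uid>
# AIRFLOW_GID=0
```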
@@ -111,6 +110,22 @@ After initialization is complete, you should see a message like below.
The account created has the login ``airflow`` and the password ``airflow``.

Cleaning-up the environment
===========================

The docker-compose environment we prepare is a "quick-start" one. It is not intended to be used in production,
and it has a number of caveats - one of them being that the best way to recover from any problem is to clean it
up and restart from scratch.

The best way to do this is to:

* Run ``docker-compose down --volumes --remove-orphans`` in the directory where you downloaded the
  ``docker-compose.yaml`` file
* Remove the entire directory where you downloaded the ``docker-compose.yaml`` file:
  ``rm -rf '<DIRECTORY>'``
* Re-download the ``docker-compose.yaml`` file
* Restart, following the instructions from the very beginning of this guide

Running Airflow
===============

