Remove AIRFLOW_GID from Docker images (#18747)
The AIRFLOW_GID parameter was in the images for historical reasons.
However, for a long time we have recommended that everyone use GID=0
so that the image can be run with an arbitrary UID. Setting a group
other than 0 has no actual value. You can still override the user's
group when starting the container, so the only real difference is
that the unmodifiable "airflow" files, such as Python code, belong to
a different group, which brings no benefit. You can still use
whatever group you want for mounted files and modifiable resources.
The Airflow Docker image works perfectly well when the user's main
group is 0. We also have to remember that if the user belongs to
other groups on the host, it will also belong to those groups inside
the container; AIRFLOW_GID only influences the primary group of that
user in the container, not outside of it.

Removing AIRFLOW_GID seems like the best choice for Airflow 2.2.

Fixes: #18709
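
The group override at container start that the commit message mentions can be sketched as follows. This is a minimal illustration, not part of the commit: the image tag is an assumption, and the command is only printed so the sketch runs on hosts without Docker.

```shell
set -eu

# Assumption: illustrative image tag; substitute the tag you actually use.
image="apache/airflow:2.2.0"

# Run as the host user's UID, with 0 as the primary group in the container.
user_spec="$(id -u):0"

# Print the command instead of executing it, so no Docker daemon is needed.
printf 'docker run --rm --user %s %s airflow version\n' "$user_spec" "$image"
```

Replacing the `0` after the colon with another GID is how a different group would be chosen at start time, which is why a build-time AIRFLOW_GID adds nothing.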
potiuk authored Oct 5, 2021
1 parent 2c2bbb5 commit 958860f
Showing 5 changed files with 26 additions and 38 deletions.
15 changes: 4 additions & 11 deletions Dockerfile
@@ -40,7 +40,6 @@ ARG ADDITIONAL_PYTHON_DEPS=""

ARG AIRFLOW_HOME=/opt/airflow
ARG AIRFLOW_UID="50000"
ARG AIRFLOW_GID="50000"

ARG PYTHON_BASE_IMAGE="python:3.6-slim-buster"

@@ -314,15 +313,13 @@ FROM ${PYTHON_BASE_IMAGE} as main
SHELL ["/bin/bash", "-o", "pipefail", "-e", "-u", "-x", "-c"]

ARG AIRFLOW_UID
ARG AIRFLOW_GID

LABEL org.apache.airflow.distro="debian" \
org.apache.airflow.distro.version="buster" \
org.apache.airflow.module="airflow" \
org.apache.airflow.component="airflow" \
org.apache.airflow.image="airflow" \
org.apache.airflow.uid="${AIRFLOW_UID}" \
org.apache.airflow.gid="${AIRFLOW_GID}"
org.apache.airflow.uid="${AIRFLOW_UID}"

ARG PYTHON_BASE_IMAGE
ARG AIRFLOW_PIP_VERSION
@@ -398,7 +395,7 @@ ENV RUNTIME_APT_DEPS=${RUNTIME_APT_DEPS} \
ADDITIONAL_RUNTIME_APT_COMMAND=${ADDITIONAL_RUNTIME_APT_COMMAND} \
INSTALL_MYSQL_CLIENT=${INSTALL_MYSQL_CLIENT} \
INSTALL_MSSQL_CLIENT=${INSTALL_MSSQL_CLIENT} \
AIRFLOW_UID=${AIRFLOW_UID} AIRFLOW_GID=${AIRFLOW_GID} \
AIRFLOW_UID=${AIRFLOW_UID} \
AIRFLOW__CORE__LOAD_EXAMPLES="false" \
AIRFLOW_USER_HOME_DIR=${AIRFLOW_USER_HOME_DIR} \
AIRFLOW_HOME=${AIRFLOW_HOME} \
@@ -434,10 +431,7 @@ RUN chmod a+x /scripts/docker/install_mysql.sh && \
/scripts/docker/install_mysql.sh prod && \
chmod a+x /scripts/docker/install_mssql.sh && \
/scripts/docker/install_mssql.sh && \
addgroup --gid "${AIRFLOW_GID}" "airflow" && \
adduser --quiet "airflow" --uid "${AIRFLOW_UID}" \
--gid "${AIRFLOW_GID}" \
--home "${AIRFLOW_USER_HOME_DIR}" && \
adduser --quiet "airflow" --uid "${AIRFLOW_UID}" --gid "0" --home "${AIRFLOW_USER_HOME_DIR}" && \
# Make Airflow files belong to the root group and are accessible. This is to accommodate the guidelines from
# OpenShift https://docs.openshift.com/enterprise/3.0/creating_images/guidelines.html
mkdir -pv "${AIRFLOW_HOME}"; \
@@ -462,7 +456,7 @@ WORKDIR ${AIRFLOW_HOME}

EXPOSE 8080

RUN usermod -g 0 airflow -G ${AIRFLOW_GID}
RUN usermod -g 0 airflow -G 0

USER ${AIRFLOW_UID}

@@ -473,7 +467,6 @@ LABEL org.apache.airflow.distro="debian" \
org.apache.airflow.image="airflow" \
org.apache.airflow.version="${AIRFLOW_VERSION}" \
org.apache.airflow.uid="${AIRFLOW_UID}" \
org.apache.airflow.gid="${AIRFLOW_GID}" \
org.apache.airflow.main-image.build-id="${BUILD_ID}" \
org.apache.airflow.main-image.commit-sha="${COMMIT_SHA}" \
org.opencontainers.image.source="${AIRFLOW_IMAGE_REPOSITORY}" \
11 changes: 4 additions & 7 deletions docs/apache-airflow/start/docker-compose.yaml
@@ -27,9 +27,6 @@
# Default: apache/airflow:|version|
# AIRFLOW_UID - User ID in Airflow containers
# Default: 50000
# AIRFLOW_GID - Group ID in Airflow containers
# Default: 0
#
# Those configurations are useful mostly in case of standalone testing/running Airflow in test/try-out mode
#
# _AIRFLOW_WWW_USER_USERNAME - Username for the administrator account (if requested).
@@ -64,7 +61,7 @@ x-airflow-common:
- ./dags:/opt/airflow/dags
- ./logs:/opt/airflow/logs
- ./plugins:/opt/airflow/plugins
user: "${AIRFLOW_UID:-50000}:${AIRFLOW_GID:-0}"
user: "${AIRFLOW_UID:-50000}:0"
depends_on:
&airflow-common-depends-on
redis:
@@ -188,7 +185,7 @@ services:
echo
echo -e "\033[1;33mWARNING!!!: AIRFLOW_UID not set!\e[0m"
echo "If you are on Linux, you SHOULD follow the instructions below to set "
echo "AIRFLOW_UID and AIRFLOW_GID environment variables, otherwise files will be owned by root."
echo "AIRFLOW_UID environment variable, otherwise files will be owned by root."
echo "For other operating systems you can get rid of the warning with manually created .env file:"
echo " See: https://airflow.apache.org/docs/apache-airflow/stable/start/docker.html#setting-the-right-airflow-user"
echo
@@ -227,7 +224,7 @@ services:
echo
fi
mkdir -p /sources/logs /sources/dags /sources/plugins
chown -R "${AIRFLOW_UID}:${AIRFLOW_GID}" /sources/{logs,dags,plugins}
chown -R "${AIRFLOW_UID}:0" /sources/{logs,dags,plugins}
exec /entrypoint airflow version
# yamllint enable rule:line-length
environment:
@@ -236,7 +233,7 @@ services:
_AIRFLOW_WWW_USER_CREATE: 'true'
_AIRFLOW_WWW_USER_USERNAME: ${_AIRFLOW_WWW_USER_USERNAME:-airflow}
_AIRFLOW_WWW_USER_PASSWORD: ${_AIRFLOW_WWW_USER_PASSWORD:-airflow}
user: "0:${AIRFLOW_GID:-0}"
user: "0:0"
volumes:
- .:/sources

22 changes: 10 additions & 12 deletions docs/apache-airflow/start/docker.rst
@@ -123,7 +123,7 @@ You have to make sure to configure them for the docker-compose:
.. code-block:: bash
mkdir -p ./dags ./logs ./plugins
echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env
echo -e "AIRFLOW_UID=$(id -u)" > .env
See :ref:`Docker Compose environment variables <docker-compose-env-variables>`

@@ -134,7 +134,6 @@ ignore it. You can also manually create the ``.env`` file in the same folder you
.. code-block:: text
AIRFLOW_UID=50000
AIRFLOW_GID=0
Initialize the database
-----------------------
@@ -294,7 +293,7 @@ Environment variables supported by Docker Compose
=================================================

Do not confuse the variable names here with the build arguments set when image is built. The
``AIRFLOW_UID`` and ``AIRFLOW_GID`` build args default to ``50000`` when the image is built, so they are
``AIRFLOW_UID`` build arg defaults to ``50000`` when the image is built, so it is
"baked" into the image. On the other hand, the environment variables below can be set when the container
is running, using - for example - result of ``id -u`` command, which allows to use the dynamic host
runtime user id which is unknown at the time of building the image.
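
The build-time versus runtime distinction in the paragraph above can be illustrated with a short shell sketch. It is not part of the commit. It assumes Docker Compose resolves `${AIRFLOW_UID:-50000}` with the same default-if-unset-or-empty rule as a POSIX shell, and it uses a throwaway directory (an assumption made here) so that no existing `.env` file is overwritten.

```shell
set -eu

# Work in a temporary directory so an existing .env is never touched.
workdir="$(mktemp -d)"
cd "$workdir"

# Runtime setting: record the host user's UID, as the quick-start does.
echo "AIRFLOW_UID=$(id -u)" > .env

# Load the file, then resolve the user entry the way the compose file writes it.
. ./.env
resolved_user="${AIRFLOW_UID:-50000}:0"
echo "with .env:    user=$resolved_user"

# Without the variable, the fallback matches the image's default build-time UID.
unset AIRFLOW_UID
default_user="${AIRFLOW_UID:-50000}:0"
echo "without .env: user=$default_user"
```

In both cases the group part is the literal `0`, which is the only value the compose file uses after this change.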
@@ -307,22 +306,21 @@ runtime user id which is unknown at the time of building the image.
| ``AIRFLOW_UID`` | UID of the user to run Airflow containers as. | ``50000`` |
| | Override if you want to use use non-default Airflow | |
| | UID (for example when you map folders from host, | |
| | it should be set to result of ``id -u`` call. If | |
| | you change it from default 50000, you must set | |
| | ``AIRFLOW_GID`` to ``0``. When it is changed, | |
| | a 2nd user with the UID specified is dynamically | |
| | it should be set to result of ``id -u`` call. | |
| | When it is changed, a user with the UID is | |
| | created with ``default`` name inside the container | |
| | and home of the use is set to ``/airflow/home/`` | |
| | in order to share Python libraries installed there. | |
| | This is in order to achieve the OpenShift | |
| | compatibility. See more in the | |
| | :ref:`Arbitrary Docker User <arbitrary-docker-user>`| |
+--------------------------------+-----------------------------------------------------+--------------------------+
| ``AIRFLOW_GID`` | Group ID in Airflow containers. It overrides the | ``50000`` |
| | GID of the user. It is ``50000`` by default but if | |
| | you want to use different UID than default it must | |
| | be set to ``0``. | |
+--------------------------------+-----------------------------------------------------+--------------------------+

.. note::

Before Airflow 2.2, the Docker Compose also had ``AIRFLOW_GID`` parameter, but it did not provide any additional
functionality - only added confusion - so it has been removed.


Those additional variables are useful in case you are trying out/testing Airflow installation via docker compose.
They are not intended to be used in production, but they make the environment faster to bootstrap for first time
10 changes: 5 additions & 5 deletions docs/docker-stack/build-arg-ref.rst
@@ -51,11 +51,6 @@ Those are the most common arguments that you use when you want to build a custom
+------------------------------------------+------------------------------------------+---------------------------------------------+
| ``AIRFLOW_UID`` | ``50000`` | Airflow user UID. |
+------------------------------------------+------------------------------------------+---------------------------------------------+
| ``AIRFLOW_GID`` | ``50000`` | Airflow group GID. Note that writable |
| | | files/dirs, created on behalf of airflow |
| | | user are set to the ``root`` group (0) |
| | | to allow arbitrary UID to run the image. |
+------------------------------------------+------------------------------------------+---------------------------------------------+
| ``AIRFLOW_CONSTRAINTS_REFERENCE`` | | Reference (branch or tag) from GitHub |
| | | where constraints file is taken from |
| | | It can be ``constraints-main`` or |
@@ -67,6 +62,11 @@ Those are the most common arguments that you use when you want to build a custom
| | | Auto-detected if empty. |
+------------------------------------------+------------------------------------------+---------------------------------------------+

.. note::

Before Airflow 2.2, the image also had ``AIRFLOW_GID`` parameter, but it did not provide any additional
functionality - only added confusion - so it has been removed.

List of default extras in the production Dockerfile:

.. BEGINNING OF EXTRAS LIST UPDATED BY PRE COMMIT
6 changes: 3 additions & 3 deletions docs/docker-stack/entrypoint.rst
@@ -49,11 +49,11 @@ those formats (See `Docker Run reference <https://docs.docker.com/engine/referen

In case of Docker Compose environment it can be changed via ``user:`` entry in the ``docker-compose.yaml``.
See `Docker compose reference <https://docs.docker.com/compose/compose-file/compose-file-v3/#domainname-hostname-ipc-mac_address-privileged-read_only-shm_size-stdin_open-tty-user-working_dir>`_
for details. In our Quickstart Guide using Docker-Compose, the UID and GID can be passed via
``AIRFLOW_UID`` and ``AIRFLOW_GID`` variables as described in
for details. In our Quickstart Guide using Docker-Compose, the UID can be passed via the
``AIRFLOW_UID`` variable as described in
:ref:`Initializing docker compose environment <initializing_docker_compose_environment>`.

In case ``GID`` is set to ``0``, the user can be any UID, but in case UID is different than the default
The user can be any UID. In case UID is different than the default
``airflow`` (UID=50000), the user will be automatically created when entering the container.

In order to accommodate a number of external libraries and projects, Airflow will automatically create
