Trigger gevent monkeypatching via environment variable (#28283)
Gevent needs to monkeypatch a number of system libraries as early as
possible after the Python interpreter starts, so that no other
library patches them first. The patching should happen before
any other initialization, and it should only run on the webserver.

So far this was done by monkeypatching in local_settings, but that has
been rather brittle, and some changes in Airflow made previous attempts
stop working because the "other" packages could be loaded by
Airflow earlier, depending on installed providers and configuration
(for example, when you had AWS configured as the logger, boto could have
been loaded first and could have monkeypatched networking before
gevent had a chance to do so).
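
As a general illustration of why ordering matters for any monkeypatching (not a reproduction of the specific boto failure), the sketch below uses a hypothetical stand-in object instead of a real system module. A name bound before the patch keeps pointing at the original function, while a lookup made after the patch sees the replacement. `fake_net`, `cooperative_connect`, `early_reference` and `late_reference` are invented for this example.

```python
import types

# Hypothetical stand-in for a system module such as ``socket``;
# this is NOT how gevent is implemented, only an ordering demo.
fake_net = types.SimpleNamespace(connect=lambda: "blocking")


def cooperative_connect():
    """Replacement that a patching library would install."""
    return "cooperative"


# A library loaded early binds the original function to its own name.
early_reference = fake_net.connect

# The patch is applied only afterwards.
fake_net.connect = cooperative_connect

# A library loaded after the patch looks the attribute up and gets the replacement.
late_reference = fake_net.connect

print(early_reference())  # blocking
print(late_reference())   # cooperative
```

The earlier the patch runs, the fewer such stale references can exist, which is why the patching is moved to the very top of the package import.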

This change introduces a different mechanism for triggering the
patching: setting an environment variable.
This has the benefit that we do not need to initialize anything
(including reading settings or setting up logging) before we determine
whether gevent patching should be performed.
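
A small sketch of how the check in this commit behaves, assuming it stays a plain truthiness test on `os.environ.get("_AIRFLOW_PATCH_GEVENT")` as in the diff. The helper name `should_patch_gevent` is hypothetical and exists only to make the behaviour testable without importing Airflow or gevent.

```python
import os


def should_patch_gevent(environ=None):
    """Mirror the truthiness test from the diff (hypothetical helper).

    Any non-empty string enables patching; an unset or empty variable does not.
    """
    env = os.environ if environ is None else environ
    return bool(env.get("_AIRFLOW_PATCH_GEVENT"))


print(should_patch_gevent({"_AIRFLOW_PATCH_GEVENT": "1"}))  # True
print(should_patch_gevent({}))                              # False
print(should_patch_gevent({"_AIRFLOW_PATCH_GEVENT": ""}))   # False
print(should_patch_gevent({"_AIRFLOW_PATCH_GEVENT": "0"}))  # True
```

Note the last case: because the value is not parsed, setting the variable to "0" or "false" still enables patching. To disable it, leave the variable unset.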

It also has the drawback that users will have to set the environment
variable in their deployment manually. However, this is a small price to
pay for stable and future-proof gevent monkeypatching
built into Airflow.
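
One way a deployment might set the variable only for the webserver process is sketched below. `build_webserver_env` is a hypothetical helper, and the sketch assumes the worker class is selected through the standard `AIRFLOW__WEBSERVER__WORKER_CLASS` environment override rather than in the config file.

```python
import os


def build_webserver_env(base_env=None):
    """Return a copy of the environment prepared for a gevent webserver.

    Hypothetical helper: the caller's mapping is not modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Triggers the early patching added to airflow/__init__.py by this commit.
    env["_AIRFLOW_PATCH_GEVENT"] = "1"
    # Assumption: gunicorn worker class chosen via the environment override.
    env["AIRFLOW__WEBSERVER__WORKER_CLASS"] = "gevent"
    return env


example = build_webserver_env({"PATH": "/usr/bin"})
print(example["_AIRFLOW_PATCH_GEVENT"])  # 1
```

The resulting mapping could then be passed to something like `subprocess.run(["airflow", "webserver"], env=build_webserver_env())`, so that schedulers and workers started elsewhere are left unpatched.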

Fixes: #8212
potiuk committed Dec 21, 2022
1 parent 38e40c6 commit 2429d07
Showing 4 changed files with 16 additions and 2 deletions.
9 changes: 9 additions & 0 deletions airflow/__init__.py
@@ -32,6 +32,14 @@
 import sys
 from typing import Callable

+if os.environ.get("_AIRFLOW_PATCH_GEVENT"):
+    # If you are using gevents and start airflow webserver, you might want to run gevent monkeypatching
+    # as one of the first thing when Airflow is started. This allows gevent to patch networking and other
+    # system libraries to make them gevent-compatible before anything else patches them (for example boto)
+    from gevent.monkey import patch_all
+
+    patch_all()
+
 from airflow import settings

 __all__ = ["__version__", "login", "DAG", "PY36", "PY37", "PY38", "PY39", "PY310", "XComArg"]
@@ -41,6 +49,7 @@
 # lib.)
 __path__ = __import__("pkgutil").extend_path(__path__, __name__)  # type: ignore

+
 # Perform side-effects unless someone has explicitly opted out before import
 # WARNING: DO NOT USE THIS UNLESS YOU REALLY KNOW WHAT YOU'RE DOING.
 if not os.environ.get("_AIRFLOW__AS_LIBRARY", None):
4 changes: 3 additions & 1 deletion airflow/config_templates/config.yml
@@ -1233,7 +1233,9 @@ webserver:
     worker_class:
       description: |
         The worker class gunicorn should use. Choices include
-        sync (default), eventlet, gevent
+        sync (default), eventlet, gevent. Note when using gevent you might also want to set the
+        "_AIRFLOW_PATCH_GEVENT" environment variable to "1" to make sure gevent patching is done as
+        early as possible.
       version_added: ~
       type: string
       example: ~
4 changes: 3 additions & 1 deletion airflow/config_templates/default_airflow.cfg
@@ -640,7 +640,9 @@ secret_key = {SECRET_KEY}
 workers = 4

 # The worker class gunicorn should use. Choices include
-# sync (default), eventlet, gevent
+# sync (default), eventlet, gevent. Note when using gevent you might also want to set the
+# "_AIRFLOW_PATCH_GEVENT" environment variable to "1" to make sure gevent patching is done as
+# early as possible.
 worker_class = sync

 # Log files for the gunicorn webserver. '-' means log to stderr.
1 change: 1 addition & 0 deletions newsfragments/08212.misc.rst
@@ -0,0 +1 @@
+If you are using gevent for your webserver deployment and used local settings to monkeypatch gevent, you might want to replace local settings patching with an ``_AIRFLOW_PATCH_GEVENT`` environment variable set to 1 in your webserver. This ensures gevent patching is done as early as possible.
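
After migrating from local settings to the environment variable, it may be useful to confirm that patching actually took place. The sketch below is one possible check, assuming gevent's `gevent.monkey.is_module_patched` helper is available in the installed version. `gevent_patch_status` and its default module list are hypothetical, and the function returns `None` when gevent is not installed.

```python
import importlib.util


def gevent_patch_status(modules=("socket", "ssl", "threading")):
    """Report which of the given modules gevent has patched (hypothetical helper).

    Returns None if gevent is not installed in this environment.
    """
    if importlib.util.find_spec("gevent") is None:
        return None
    # Importing gevent.monkey does not patch anything by itself.
    from gevent import monkey

    return {name: monkey.is_module_patched(name) for name in modules}


print(gevent_patch_status())
```

Run inside a webserver process started with `_AIRFLOW_PATCH_GEVENT=1`, every entry would be expected to be `True`. In an unpatched process the entries are `False`.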
