CeleryExecutor gevent/eventlet pools need monkey patching #8023

Closed
aamangeldi opened this issue Mar 31, 2020 · 6 comments · Fixed by #8559
Labels
area:Scheduler Scheduler or dag parsing Issues kind:bug This is a clearly a bug

Comments

@aamangeldi
Contributor

aamangeldi commented Mar 31, 2020

Apache Airflow version:
1.10.9

Kubernetes version (if you are using kubernetes) (use kubectl version):

Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-15T23:42:50Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"14+", GitVersion:"v1.14.10-gke.17", GitCommit:"bdceba0734835c6cb1acbd1c447caf17d8613b44", GitTreeState:"clean", BuildDate:"2020-01-17T23:10:13Z", GoVersion:"go1.12.12b4", Compiler:"gc", Platform:"linux/amd64"}

Note: the issue is not specific to k8s.

Environment:
Any. I was able to reproduce using CeleryExecutor docker-compose in the puckel repo (code version tagged as 1.10.9).

What happened:
When pool in the [celery] section of airflow.cfg is set to eventlet or gevent, task instances get scheduled, queued, and picked up by the workers, but are never executed.

What you expected to happen:
Task instances should be executed. The problem is that the application is not monkey-patched. Celery handles monkey-patching by default, but not in all scenarios (e.g. only when Celery is invoked via the command line).

Airflow invokes Celery workers in Python via .run(). Unfortunately, this function does not handle monkey patching.
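
To illustrate why this matters, here is a simplified, stdlib-only sketch of command-line-driven patching. It is not Celery's actual code: `find_pool_option`, `maybe_patch`, and the stand-in patchers are hypothetical names used only to show that a patch keyed on `-P`/`--pool` in the argument list never fires when the worker is started programmatically without that argument.

```python
def find_pool_option(argv):
    """Return the value given for -P/--pool in argv, or None if absent."""
    for i, arg in enumerate(argv):
        if arg in ("-P", "--pool") and i + 1 < len(argv):
            return argv[i + 1]
        if arg.startswith("--pool="):
            return arg.split("=", 1)[1]
    return None


def maybe_patch(argv, patchers):
    """Run the patcher registered for the requested pool, if there is one."""
    pool = find_pool_option(argv)
    patcher = patchers.get(pool)
    if patcher is None:
        return None  # no pool option found, or a pool that needs no patching
    patcher()
    return pool


# Stand-in patchers that only record that they were called (hypothetical).
applied = []
patchers = {
    "eventlet": lambda: applied.append("eventlet"),
    "gevent": lambda: applied.append("gevent"),
}

# Started from the command line: the pool option is visible in argv.
cli_result = maybe_patch(["celery", "worker", "-P", "eventlet"], patchers)

# Started programmatically: argv carries no pool option, so nothing is patched.
programmatic_result = maybe_patch(["airflow", "worker"], patchers)
```

In this sketch only the first call applies a patch, which mirrors the symptom described above: the pool is configured, but the process is never monkey-patched.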

How to reproduce it:

  1. Clone puckel's docker-airflow:

    git clone git@github.com:puckel/docker-airflow.git
    
  2. Add the following line to the Dockerfile:

    RUN pip install eventlet
    

    Then rebuild the image:

    docker build --rm -t puckel/docker-airflow:1.10.9 .
    
  3. Set pool = eventlet in airflow.cfg (the file will be mounted by docker-compose).

  4. Spin up the CeleryExecutor docker-compose:

    docker-compose -f docker-compose-CeleryExecutor.yml up -d
    
  5. Navigate to http://localhost:8080, and run an example DAG.

  6. Notice that no task ever gets to the running state.
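
Before starting the stack, it may help to confirm that the setting from step 3 is what the worker will read. The following is a minimal sketch that parses an INI-style snippet with the standard library; `SAMPLE_CFG` and `read_celery_pool` are hypothetical names, and Airflow's own configuration loader may behave differently (defaults, environment variable overrides).

```python
import configparser

# Hypothetical excerpt standing in for the relevant part of airflow.cfg.
SAMPLE_CFG = """
[celery]
pool = eventlet
"""


def read_celery_pool(cfg_text):
    """Return the [celery] pool value from INI text, or None if it is not set."""
    # interpolation=None keeps literal % characters from being interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    return parser.get("celery", "pool", fallback=None)


pool = read_celery_pool(SAMPLE_CFG)
```

A return value of `None` here only means the option is absent from the text that was parsed.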

Solution:
Ideally this should be fixed in Celery, but in the meantime it might be good to have a solution here as well. Here is a patch that I applied to solve this (on Airflow 1.10.9):

--- cli.py	2020-03-27 17:05:45.000000000 -0400
+++ cli-new.py	2020-03-27 17:19:48.000000000 -0400
@@ -1098,7 +1098,10 @@
     }

     if conf.has_option("celery", "pool"):
-        options["pool"] = conf.get("celery", "pool")
+        pool = conf.get("celery", "pool")
+        options["pool"] = pool
+        from celery import maybe_patch_concurrency
+        maybe_patch_concurrency(['-P', pool])

     if args.daemon:
         pid, stdout, stderr, log_file = setup_locations("worker",
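
The same idea can be sketched as a standalone helper, assuming Celery is installed and exposes `maybe_patch_concurrency` as used in the patch above. The helper name `patch_concurrency_for_pool` and the `GREEN_POOLS` tuple are hypothetical, and the guard on the pool name is an addition that the patch itself does not have.

```python
# Pools that rely on monkey patching (assumption: only these two matter here).
GREEN_POOLS = ("eventlet", "gevent")


def patch_concurrency_for_pool(pool):
    """Monkey-patch for eventlet/gevent pools; return True if patching was requested."""
    if pool not in GREEN_POOLS:
        return False  # prefork, solo, or unset: nothing to patch
    # Imported lazily so that the guard above works even without Celery installed.
    from celery import maybe_patch_concurrency

    # Mimic the command-line form that Celery's helper looks for.
    maybe_patch_concurrency(["-P", pool])
    return True
```

Monkey patching is generally most reliable when applied as early as possible in the process, so a helper like this would probably need to run before the worker is started.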
@aamangeldi aamangeldi added the kind:bug This is a clearly a bug label Mar 31, 2020
@boring-cyborg

boring-cyborg bot commented Mar 31, 2020

Thanks for opening your first issue here! Be sure to follow the issue template!

@turbaszek
Member

@aamangeldi would you like to open a PR with suggested change? 🚀

@turbaszek turbaszek added the area:Scheduler Scheduler or dag parsing Issues label Apr 3, 2020
@aamangeldi
Contributor Author

@turbaszek Yup, can do.

@turbaszek
Member

@aamangeldi thank you!

@sophieherrmann

I experienced the same issue using exactly the same setup (Airflow 1.10.9 with puckel/docker-airflow). @aamangeldi's patch fixes the issue for me.

Is there any schedule for bringing this upstream?

@kaxil
Member

kaxil commented Apr 8, 2020

> I experienced the same issue using exactly the same setup (Airflow 1.10.9 with puckel/docker-airflow). @aamangeldi's patch fixes the issue for me.
>
> Is there any schedule for bringing this upstream?

We can push that out in 1.10.11.
