Why do we need a CeleryExecutor in order to use the UI features? #51

Closed
r39132 opened this issue Jun 19, 2015 · 11 comments

@r39132 (Contributor) commented Jun 19, 2015

It seems the start-up costs of this are a bit high. I am trying out Luigi and Airflow, and need to set up a central scheduler for both. The UI features of Airflow are more attractive than Luigi's, but the UI features in Airflow are only usable (actionable) if Celery is installed. I am running on EC2 and don't want to install too much just to take this for a spin. My recommendation is to make the UI features work with the LocalExecutor. Otherwise, Luigi seems attractive: the start-up cost is similar, but it is more battle-tested.

-s

@mistercrunch (Member)

Only the Run feature from the UI isn't working, isn't it? The problem is that I don't want to run an executor and a task within the scope of a web request; I need to run that task asynchronously, and without a remote service that's just impossible.

You can use "airflow run" from the CLI until you move to the CeleryExecutor. By the way, it's super easy to set up and it can run on the same box. You can use SQLAlchemy as a broker and see how much mileage you get.
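A minimal sketch of that CLI workaround, assuming a DAG named "my_dag" with a task "my_task" (both hypothetical names; the syntax follows the 2015-era CLI):

```shell
# Trigger a single task run from the command line instead of the UI "Run" button.
# Usage: airflow run <dag_id> <task_id> <execution_date>
airflow run my_dag my_task 2015-06-19
```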

@martingrayson

I don't suppose it would be possible to add a quick-start / a few steps to the documentation for getting started with Celery? I'm having trouble convincing my colleagues that maintaining Celery wouldn't be a massive overhead.

@mistercrunch (Member)

Well, Celery is integrated with Airflow; it's just a Python library that ships with Airflow. The Celery broker (most likely RabbitMQ or Redis) is a piece of infrastructure that is required and that someone needs to keep up and running. Redis is fairly common nowadays and a breeze to set up; at Airbnb we already had both systems running in production and in-house knowledge about them.

But note that Celery supports using a database (through SQLAlchemy) as a broker, which you should already have set up. So reusing your SQLAlchemy connection as a broker seems pretty reasonable to me, even though it is "experimental" as far as Celery support goes.
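A sketch of what that could look like in airflow.cfg, assuming a local Postgres database named "airflow" (host, credentials, and database name are illustrative):

```ini
[celery]
# The "sqla+" prefix tells Celery to use its experimental SQLAlchemy broker
# transport, so the metadata database doubles as the message broker.
broker_url = sqla+postgresql://airflow:airflow@localhost:5432/airflow
celery_result_backend = db+postgresql://airflow:airflow@localhost:5432/airflow
```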

The thing is, Celery is an async framework that can operate at web scale (a common use case is processing thumbnails for uploaded images outside the scope of a web request), and it is set up to handle dozens, if not thousands, of messages per second. A database might have some trouble with that many messages, plus the workers constantly polling it. But with Airflow, the number of messages you'd send is probably in the few hundreds or thousands a day, so using a database as the broker might be very reasonable, especially in a pre-production-type setup.

As far as getting a proper Celery setup going, people should refer to the Celery docs. I just added a reference in the docs here:
65c5f0a

But hey, it'd be nice to have the best of both worlds in terms of "get going quickly" and "scale to infinity", but the latter has to require some infrastructure.

For the record, when RabbitMQ was having some problems (unrelated to Airflow), I set up a survival Redis box and migrated in about 20 minutes. Of course, productionizing Redis, setting up a replica, and monitoring it is more work, but you can do all of this once Airflow becomes an important part of your ecosystem. This should be somewhat trivial for ops folks or data-infra people: providing the services you need to do your work should be part of their job description. I respect trying to keep the ecosystem simple, though!

@r39132 (Contributor, Author) commented Jun 20, 2015

I'm leaning towards using our Postgres DB as the broker, as it is the quickest route to adoption within the company and will be fine for a while. When we reach scale, I'd lean towards SQS over anything else, because it's infrastructure that I don't need to maintain, ansibilize, monitor, etc., and because it scales to hundreds of millions of messages per day.

@mistercrunch (Member)

Postgres/SqlAlchemy should work just fine as a CeleryBroker, please let us know how much mileage you get out of it. I'd bet twelve bucks that it would just never become the bottleneck.

@r39132 (Contributor, Author) commented Jun 22, 2015

I've opened #63

The experimental status and list of limitations of the SQLAlchemy broker is a real turn-off: http://celery.readthedocs.org/en/latest/getting-started/brokers/sqlalchemy.html#broker-sqlalchemy

I'm looking for a workflow engine that can be lightweight during the adoption phase at our company and fault-tolerant down the line. I've started playing with Celery a bit, but I don't want to stand up RabbitMQ/Redis or any other backend right now, even in production, because there is a cost to launching infrastructure in production: I need to ansibilize it, set up logging and alerting, set up monit, etc., all before anyone is using it in production. SQS and Postgres both have limitations as brokers and known bugs.

I liked the support for the LocalExecutor and SequentialExecutor because they were lightweight. If and when adoption grows here, we will consider Celery and setting up Redis/RabbitMQ, but for now we won't. In addition to supporting the broker infrastructure for Celery, I also need to run a separate "airflow worker" and make sure it is fault-tolerant (e.g. monit, etc.). It would have been nice if a worker started in the main "airflow webserver", but I don't see any queue consumers running when I run "airflow webserver".

Finally, I'm not clear on why running the LocalExecutor (if I don't have more than 3 flows running) is a bad idea. But I would like the UI features to work, and I would like the DAGs imported into the DB, not just showing up in the UI.

@mistercrunch (Member)

You're talking about 1 UI feature (TaskDialog->Run) that we lived without for months. It's pretty minimal.

I'm not sure if you have tried it, but "airflow scheduler" does start a working LocalExecutor in the background if it is set up that way.
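A sketch of the relevant airflow.cfg settings (the connection string is illustrative; the LocalExecutor needs a real database rather than SQLite):

```ini
[core]
# With this setting, "airflow scheduler" runs tasks through a LocalExecutor
# in the background: no Celery, broker, or separate worker required.
executor = LocalExecutor
sql_alchemy_conn = postgresql://airflow:airflow@localhost:5432/airflow
```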

As for keeping two commands up and running, that should be pretty easy to do. I haven't seen "airflow webserver" or "airflow scheduler" go down in a long time. nohup or screen should give you mileage beyond a POC, though clearly in a production setup they should be kept up and monitored.
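A sketch of that stopgap, assuming default ports and log locations are fine (a production setup would use a process supervisor and monitoring instead):

```shell
# Keep both processes running past logout for a proof-of-concept.
nohup airflow webserver -p 8080 > webserver.log 2>&1 &
nohup airflow scheduler > scheduler.log 2>&1 &
```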

I feel like we have it pretty good in terms of offering variety along the spectrum of ramping up to production. Maybe it could be better, but it's pretty decent as is. I don't see us spending cycles there for the moment.

@r39132 (Contributor, Author) commented Jun 23, 2015

Sorry, I thought "airflow scheduler" was related to Celery execution... My colleagues and I somehow missed that after going through both the quick start and the tutorial. There is a reference to a "master scheduler" in the tutorial, which led to some conflation of the Airflow (local) scheduler and the Celery scheduler. It makes more sense now, so we will launch with the scheduler and the LocalExecutor.

@mistercrunch (Member)

Hopefully this clarifies things a bit
b235411

@Dowwie commented Sep 20, 2015

What are your thoughts on using Disque rather than Redis as the broker?

@mistercrunch (Member)

If you mean as a broker for Celery, Disque doesn't seem to be documented here:
http://celery.readthedocs.org/en/latest/getting-started/brokers/

mobuchowski pushed a commit to mobuchowski/airflow that referenced this issue Jan 4, 2022
rajatsri28 pushed a commit to rajatsri28/airflow that referenced this issue Jan 25, 2022