Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable pools to consider deferred tasks #32709

Merged
merged 16 commits into from
Aug 9, 2023

Conversation

Usiel
Copy link
Contributor

@Usiel Usiel commented Jul 20, 2023

closes: #21243

This is my attempt at solving the issue raised in the linked discussion. With this change we would introduce a new flag on Pool, that allows an admin to decide whether to include tasks in deferred state in the pool open slot calculation or not. By default we would not consider them, which mirrors the current behavior.

As mentioned by various people on the discussion, having this as an option can be quite useful, especially in cases where the system that is "shielded" by the pool is not completely in the control of the developer. This allows us to (mis)use pools as a rate-limiting device even when using the new deferrable operators.

@boring-cyborg boring-cyborg bot added area:API Airflow's REST/HTTP API area:Scheduler Scheduler or dag parsing Issues area:webserver Webserver related Issues kind:documentation labels Jul 20, 2023
@boring-cyborg
Copy link

boring-cyborg bot commented Jul 20, 2023

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
Here are some useful points:

  • Pay attention to the quality of your code (ruff, mypy and type annotations). Our pre-commits will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: dev@airflow.apache.org
    Slack: https://s.apache.org/airflow-slack

@Usiel Usiel force-pushed the usiel/allow-pools-to-consider-deferrable branch from 5cc1eb0 to a64938f Compare July 20, 2023 10:19
@Usiel
Copy link
Contributor Author

Usiel commented Jul 20, 2023

Rebased on main and fixed the failed migration and ERD pre-commit check.

@Usiel Usiel force-pushed the usiel/allow-pools-to-consider-deferrable branch 2 times, most recently from 6a1d3b0 to 3c31c45 Compare July 24, 2023 04:44
@@ -20,7 +20,7 @@ Deferrable Operators & Triggers

Standard :doc:`Operators </core-concepts/operators>` and :doc:`Sensors <../core-concepts/sensors>` take up a full *worker slot* for the entire time they are running, even if they are idle; for example, if you only have 100 worker slots available to run Tasks, and you have 100 DAGs waiting on a Sensor that's currently running but idle, then you *cannot run anything else* - even though your entire Airflow cluster is essentially idle. ``reschedule`` mode for Sensors solves some of this, allowing Sensors to only run at fixed intervals, but it is inflexible and only allows using time as the reason to resume, not anything else.

This is where *Deferrable Operators* come in. A deferrable operator is one that is written with the ability to suspend itself and free up the worker when it knows it has to wait, and hand off the job of resuming it to something called a *Trigger*. As a result, while it is suspended (deferred), it is not taking up a worker slot and your cluster will have a lot less resources wasted on idle Operators or Sensors.
This is where *Deferrable Operators* come in. A deferrable operator is one that is written with the ability to suspend itself and free up the worker when it knows it has to wait, and hand off the job of resuming it to something called a *Trigger*. As a result, while it is suspended (deferred), it is not taking up a worker slot and your cluster will have a lot less resources wasted on idle Operators or Sensors. Note that by default a deferred tasks will not use up a pool slot, if you would like them to, you can change this by editing the pool in question.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in the following sentence ("task" should be singular)

Note that by default a deferred tasks will not use up a pool slot, if you would like them to, you can change this by editing the pool in question.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, fixed by making everything plural

@Usiel Usiel force-pushed the usiel/allow-pools-to-consider-deferrable branch from 3c31c45 to 20460d5 Compare July 25, 2023 02:04
Copy link
Member

@uranusjr uranusjr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we could add some help text in the pool form to explain what the flag means, it’s not as straightfoward as other fields. I’m not sure how to do that though, probably via some WTForm trick.

airflow/models/pool.py Outdated Show resolved Hide resolved
airflow/models/pool.py Outdated Show resolved Hide resolved
airflow/models/pool.py Outdated Show resolved Hide resolved
@Usiel
Copy link
Contributor Author

Usiel commented Jul 25, 2023

I wonder if we could add some help text in the pool form to explain what the flag means, it’s not as straightfoward as other fields. I’m not sure how to do that though, probably via some WTForm trick.

Looks like Flask AppBuilder has a feature for that. I'm open for suggestions if someone has a better help text:
image
(will push this shortly after addressing the other comments)

@uranusjr
Copy link
Member

I think you also need to remove the required marker on the checkbox field (not sure how), otherwise the form would not allow the box to be unchecked.

@Usiel
Copy link
Contributor Author

Usiel commented Jul 25, 2023

I think you also need to remove the required marker on the checkbox field (not sure how), otherwise the form would not allow the box to be unchecked.

Ah, that explains why the test is failing. Could have sworn I ran this test before, but neglected to run it again after changing the column to nullable=False on Pool.

I would tend to revert that decision and instead make the column nullable with a default similar to Variable.is_encrypted or Connection.is_[extra_]encrypted. This avoids hacking around Flask AppBuilder's default of making non-nullable columns required. What do you think @uranusjr?

@uranusjr
Copy link
Member

Making it nullable would cause the unchecked case to return NULL instead of FALSE, which is not a good idea. NULL in SQL is just bad. I’m pretty sure there’s a way to make a checkbox return true/false instead.

@Usiel Usiel force-pushed the usiel/allow-pools-to-consider-deferrable branch from 20460d5 to 8ce2737 Compare July 25, 2023 05:45
@Usiel
Copy link
Contributor Author

Usiel commented Jul 25, 2023

Making it nullable would cause the unchecked case to return NULL instead of FALSE, which is not a good idea. NULL in SQL is just bad. I’m pretty sure there’s a way to make a checkbox return true/false instead.

OK, managed to get it working by overriding the form field. Using an additional validator in validators_columns was not sufficient, as the DataRequired (due to nullable=False) is still appended and has some effects for the form template.
https://github.com/apache/airflow/pull/32709/files#diff-917b70d2661fae8322c538a4f60b9d4ba755556036fe0c527eb0411a6ac60f2eR5051-R5056

@Usiel Usiel force-pushed the usiel/allow-pools-to-consider-deferrable branch from 8186261 to c0827b3 Compare July 25, 2023 06:02
@Usiel
Copy link
Contributor Author

Usiel commented Jul 25, 2023

Couple of places that break due to having no default value. I'll fix those and push once the full test suite completes locally. Sorry for that!

@potiuk potiuk force-pushed the usiel/allow-pools-to-consider-deferrable branch from 0965969 to 8131f43 Compare August 9, 2023 10:35
@eladkal
Copy link
Contributor

eladkal commented Aug 9, 2023

Tests are failing:

FAILED tests/cli/commands/test_pool_command.py::TestCliPools::test_pool_list - TypeError: create_pool() got an unexpected keyword argument 'include_deferred'
FAILED tests/cli/commands/test_pool_command.py::TestCliPools::test_pool_list_with_args - IndexError: tuple index out of range
FAILED tests/cli/commands/test_pool_command.py::TestCliPools::test_pool_create - TypeError: create_pool() got an unexpected keyword argument 'include_deferred'
FAILED tests/cli/commands/test_pool_command.py::TestCliPools::test_pool_update_deferred - TypeError: create_pool() got an unexpected keyword argument 'include_deferred'
FAILED tests/cli/commands/test_pool_command.py::TestCliPools::test_pool_get - TypeError: create_pool() got an unexpected keyword argument 'include_deferred'
FAILED tests/cli/commands/test_pool_command.py::TestCliPools::test_pool_delete - TypeError: create_pool() got an unexpected keyword argument 'include_deferred'
FAILED tests/cli/commands/test_pool_command.py::TestCliPools::test_pool_import_backwards_compatibility - TypeError: create_pool() got an unexpected keyword argument 'include_deferred'
FAILED tests/cli/commands/test_pool_command.py::TestCliPools::test_pool_import_export - TypeError: create_pool() got an unexpected keyword argument 'include_deferred'

@potiuk
Copy link
Member

potiuk commented Aug 9, 2023

Yeah. My bad. In fact the client is still used internally by pool command it seems. So yeah. My change was too much - will remove it and only remove stuff from experimental client

@potiuk potiuk force-pushed the usiel/allow-pools-to-consider-deferrable branch from 8131f43 to 260d51e Compare August 9, 2023 11:47
@potiuk
Copy link
Member

potiuk commented Aug 9, 2023

Re-pushed original change - the back compatibiliyt is not needed there and the change was good as it was! I completely forgot that we used to use the "CLI API" before - but it's really not part of the public API since 2.0

@potiuk
Copy link
Member

potiuk commented Aug 9, 2023

Let's just run tests again after rebase and merge it.

@potiuk potiuk merged commit 70a050b into apache:main Aug 9, 2023
40 checks passed
@boring-cyborg
Copy link

boring-cyborg bot commented Aug 9, 2023

Awesome work, congrats on your first merged pull request! You are invited to check our Issue Tracker for additional contributions.

potiuk pushed a commit to potiuk/airflow that referenced this pull request Aug 9, 2023
We should change the defaults to client defaults for MSSQL compatibility
The change in apache#32709 was not liked by MSSQL
potiuk added a commit that referenced this pull request Aug 9, 2023
We should change the defaults to client defaults for MSSQL compatibility
The change in #32709 was not liked by MSSQL

Co-authored-by: Ephraim Anierobi <splendidzigy24@gmail.com>
ephraimbuddy pushed a commit that referenced this pull request Aug 9, 2023
* Makes pools respect deferrable tasks (with extra setting)

See #21243

This commit makes pools consider deferred tasks if the `include_deferred` flag is set. By default a pool will not consider deferred tasks as occupied slots, but still show the number of deferred tasks in its stats.

---------

Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>
(cherry picked from commit 70a050b)
ephraimbuddy pushed a commit that referenced this pull request Aug 9, 2023
We should change the defaults to client defaults for MSSQL compatibility
The change in #32709 was not liked by MSSQL

Co-authored-by: Ephraim Anierobi <splendidzigy24@gmail.com>
(cherry picked from commit d989e9d)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:API Airflow's REST/HTTP API area:Scheduler Scheduler or dag parsing Issues area:webserver Webserver related Issues kind:documentation type:new-feature Changelog: New Features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants