New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Fix teardown scheduling if indirect upstream is cleared #38788

Draft

jscheffl wants to merge 3 commits into apache:main from jscheffl:bugfix/29332-fix-teardown-with-direct-upsteam-success

+164 −70

Contributor

jscheffl commented Apr 5, 2024 •

edited

Loading

This PR fixes the calculation of the scheduling decision if upstream to a teardown task an individual task is reset (w/o resetting all downstream!) but the direct upstream tasks of the teardown are completed.

As reported in #29332 in such case as direct upstream dependencies of teardown tasks were completed, teardown was starting too early.

With this fix teardown scheduling evaluates all upstream dependencies if any upstream is not completed.

How to test?

Use the DAG in the comments of the bug report Ensure teardown doesn't run until all other tasks complete #29332 and deploy this
Run this DAG (at least once) till success
Clear the first normal task w/o clearing downstream
See that teardown is scheduled only if the cleared task is completed (before this PR it is immediately scheduled after setup was completed in parallel to the actual cleared task

closes: #29332

jscheffl added area:Scheduler type:bug-fix AIP-52 labels

jscheffl requested review from ashb, uranusjr, dstandish and jedcunningham

April 5, 2024 22:24

jscheffl changed the title ~~Fix teardown if indirect upstream is cleared~~ Fix teardown scheduling if indirect upstream is cleared

eladkal added this to the Airflow 2.9.1 milestone

jscheffl added 2 commits

April 6, 2024 11:23


          Fix teardown if indirect upstream is cleared

606a588


          Attempt to fix pytests

31f21f2

jscheffl force-pushed the bugfix/29332-fix-teardown-with-direct-upsteam-success branch from bc7dfae to 31f21f2 Compare

April 6, 2024 16:38


          Attempt to fix pytests

75c6724

jscheffl marked this pull request as draft

April 6, 2024 16:42

jscheffl commented

View reviewed changes

Contributor Author

jscheffl left a comment

Unfortunately I fail miserably fixing the pytests, I need expert knowledge to fix this it seems :-(

tests/models/test_taskinstance.py

 ,
-                              _UpstreamTIStates(6, 0, 1, 0, 0, 7, 1, 0),
+                              [SUCCESS, SUCCESS, SUCCESS, SUCCESS, SUCCESS, SUCCESS, FAILED],

Contributor Author

jscheffl Apr 6, 2024

This pytest fails because it seems the dependency context is returned empty after evaluation. I am not sure why. I assume something is not "correctly mocked".

tests/models/test_taskinstance.py

 ,
-                              _UpstreamTIStates(6, 1, 0, 0, 0, 7, 1, 1),
+                              [SUCCESS, SUCCESS, SUCCESS, SUCCESS, SUCCESS, SUCCESS, SKIPPED],

Contributor Author

jscheffl Apr 6, 2024

This pytest fails because it seems the dependency context is returned empty after evaluation. I am not sure why. I assume something is not "correctly mocked".

tests/models/test_taskinstance.py

+                              True,
+                              (False, "skipped"),  # is_teardown=False, expect_state="skipped'
+                              True,
+                              id="not teardown all done one skipped",

Contributor Author

jscheffl Apr 6, 2024

In this test I would have expected a skipped result but it is returning with Noneafter evaluation. I assume some mocks are wrong here :-(

tests/models/test_taskinstance.py

                           for i in range(upstream_setups):
                               task = EmptyOperator(task_id=f"setup_{i}", dag=dag).as_setup()
-                              task.set_downstream(downstream)
+                              task.set_downstream(work_tasks)

Contributor Author

jscheffl Apr 6, 2024

Note: I re-modelled the test DAG such that the setup tasks are in front of the worker tasks, else the setup is directly linked to downstream teardown which (1) does not make sense and (2) is explicitly not handled.

Contributor

dstandish May 8, 2024

i don't understand. it might be best to make a separate PR for your test refactor without simultaneously changing the code. then followup with the real code change (and the test changes required just for that will be easier to see)

ephraimbuddy modified the milestones: Airflow 2.9.1, Airflow 2.9.2

dstandish reviewed

View reviewed changes

tests/models/test_taskinstance.py

-                          ["one_success", 0, _UpstreamTIStates(0, 0, 5, 0, 0, 5, 0, 0), True, State.UPSTREAM_FAILED, False],
-                          ["one_success", 0, _UpstreamTIStates(0, 0, 4, 1, 0, 5, 0, 0), True, State.UPSTREAM_FAILED, False],
-                          ["one_success", 0, _UpstreamTIStates(0, 0, 0, 5, 0, 5, 0, 0), True, State.UPSTREAM_FAILED, False],
+                          ["one_success", 0, [SUCCESS, SUCCESS, SUCCESS, SUCCESS, SUCCESS], True, None, True],

Contributor

dstandish May 8, 2024

i'm not a super huge fan of the scrutability of the "old" way, but it's hard to see how the new way tests the same thing. like i said below, i think best to separate test change from code+test change

utkarsharma2 modified the milestones: Airflow 2.9.2, Airflow 2.9.3

utkarsharma2 modified the milestones: Airflow 2.9.3, Airflow 2.10.0

ephraimbuddy modified the milestones: Airflow 2.10.0, Airflow 2.11.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AIP-52 area:Scheduler type:bug-fix