Skip to content

http patched an in-progress dagRun to state=failed, but the change was clobbered by a subsequent TI state change. #18363

@MatrixManAtYrService

Description

@MatrixManAtYrService

Apache Airflow version

2.2.0b1 (beta snapshot)

Operating System

debian buster

Versions of Apache Airflow Providers

n/a

Deployment

Astronomer

Deployment details

astro dev start image: quay.io/astronomer/ap-airflow-dev:2.2.0-buster-43897

What happened

I patched a dagRun's state to failed while it was running.

curl -i -X PATCH "http://localhost:8080/api/v1/dags/steady_task_stream/dagRuns/scheduled__2021-09-18T00:00:00+00:00"   -H 'Content-Type: application/json' --user 'admin:admin' \
 -d '{ "state": "failed" }'

For a brief moment, I saw my change reflected in the UI. Then auto-refresh toggled itself to disabled. When I re-enabled it, the dag_run the state: "running".

new_api

Presumably this happened because the new API only affects the dagRun. The tasks keep running, so they updated dagRun state and overwrite the patched value.

What you expected to happen

The UI will not let you set a dagrun's state without also setting TI states. Here's what it looks like for "failed":
ui

Notice that it forced me to fail a task instance. This is what prevents the dag from continuing onward and clobbering my patched value. It's similar when you set it to "success", except every child task gets set to "success".

At the very least, if the API lets me patch the dagrun state then it should also lets me patch the TI states. This way an API user can patch both at once and prevent the clobber. (/api/v1/dags/{dag_id}/dagRuns/{dag_run_id}/taskInstances/{task_id} doesn't yet support the patch verb, so I can't currently do this).

Better still would be for the API to make it easy for me to do what the UI does. For instance, patching a dagrun with:

{ "state": "failed" }

Should return code 409 and a message like:

To patch steady_task_stream.scheduled__2021-09-18T00:00:00+00:00.state to success, you must also make the following changes: { "task_1":"success", "task_2":"success", ... }. Supply "update_tasks": "true" to do this.

And updating it with:

{ "state": "failed",
  "update_tasks": "true"}

Should succeed and provide feedback about which state changes occurred.

How to reproduce

Any dag will do here, but here's the one I used:
https://gist.github.com/MatrixManAtYrService/654827111dc190407a3c81008da6ee16

Be sure to run it in an airflow that has #17839, which introduced the patch functionality that makes it possible to reach this bug.

  • Unpause the dag
  • Make note of the execution date
  • Delete the dag and wait for it to repopulate in the UI (you may need to refresh).
  • Prepare this command in your terminal, you may need to tweak the dag name and execution date to match your scenario:
curl -i -X PATCH "http://localhost:8080/api/v1/dags/steady_task_stream/dagRuns/scheduled__2021-09-18T00:00:00+00:00"   -H 'Content-Type: application/json' --user 'admin:admin' \
 -d '{ "state": "failed" }'
  • Unpause the dag again
  • Before it completes, run the command to patch its state to "failed"

Note that as soon as the next task completes, your patched state has been overwritten and is now "running" or maybe "success"

Anything else

@ephraimbuddy, @kaxil mentioned that you might be interested in this.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions