Clear tasks by task ids in REST API #14500

Merged
ephraimbuddy merged 1 commit into apache:master from feat/clear-tasks-task-id on Apr 7, 2021

Conversation

vemikhaylov
Contributor

closes: #13225

@boring-cyborg boring-cyborg bot added the area:API Airflow's REST/HTTP API label Feb 27, 2021
@@ -1227,6 +1230,8 @@ def clear(
            tis = tis.filter(or_(TI.state == State.FAILED, TI.state == State.UPSTREAM_FAILED))
        if only_running:
            tis = tis.filter(TI.state == State.RUNNING)
        if task_ids:
            tis = tis.filter(TI.task_id.in_(task_ids))
Member

How will this work with

 tis = tis.filter(TI.task_id.in_(self.task_ids))

from L1197?

Contributor Author

Actually, the conditions are simply combined with a conjunction (AND):

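# tst_double_in_query.py
from sqlalchemy.orm import Session

from airflow.models import TaskInstance

# No engine is bound: the query is only rendered as a string, never executed.
session = Session()
query = session.query(TaskInstance.task_id).filter(TaskInstance.task_id.in_(["foo"]))
print("First query:")
print(str(query.statement))
print()
# Chaining a second filter() ANDs the new condition onto the existing WHERE clause.
query = query.filter(TaskInstance.task_id.in_(["bar"]))
print("Second query:")
print(str(query.statement))

$ python tst_double_in_query.py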
First query:
SELECT task_instance.task_id
FROM task_instance
WHERE task_instance.task_id IN (:task_id_1)

Second query:
SELECT task_instance.task_id
FROM task_instance
WHERE task_instance.task_id IN (:task_id_1) AND task_instance.task_id IN (:task_id_2)

So the second filter narrows down the search space if task_ids are provided.
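
To see the narrowing effect on actual rows rather than on the generated SQL, here is a minimal sketch using Python's built-in sqlite3 module. It is only an illustration: the table is reduced to a single column, the task ids are made up, and `select_task_ids` is a hypothetical helper, not part of Airflow.

```python
import sqlite3

# Hypothetical, simplified stand-in for the task_instance table (one column only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task_instance (task_id TEXT)")
conn.executemany(
    "INSERT INTO task_instance (task_id) VALUES (?)",
    [("extract",), ("transform",), ("load",)],
)


def select_task_ids(first, second):
    """Return task ids that satisfy both IN conditions, joined with AND."""
    sql = (
        "SELECT task_id FROM task_instance "
        f"WHERE task_id IN ({', '.join('?' * len(first))}) "
        f"AND task_id IN ({', '.join('?' * len(second))}) "
        "ORDER BY task_id"
    )
    return [row[0] for row in conn.execute(sql, [*first, *second])]


# Only the id present in both lists survives the two AND-ed conditions.
rows = select_task_ids(["extract", "transform"], ["transform", "load"])
print(rows)  # ['transform']
```

If the two lists share no ids, the result is simply empty, which is the behaviour a caller gets when passing task ids that fall outside the DAG's own set.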

Alternatively, we could intersect the two sets beforehand and apply the filter once, which might make the generated SQL slightly more efficient. Would that be better? What do you think?
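
For comparison, the intersect-first alternative could look roughly like the sketch below. The function name `effective_task_ids` and its parameters are hypothetical and chosen for illustration; the result would then be passed to a single `TI.task_id.in_(...)` filter.

```python
from typing import Iterable, Optional, Set


def effective_task_ids(
    dag_task_ids: Iterable[str], requested: Optional[Iterable[str]]
) -> Set[str]:
    """Intersect the DAG's own task ids with an optional requested subset."""
    dag_ids = set(dag_task_ids)
    if not requested:
        # Mirrors the `if task_ids:` guard in the diff:
        # nothing requested means no additional narrowing.
        return dag_ids
    return dag_ids & set(requested)


print(sorted(effective_task_ids(["a", "b", "c"], ["b", "c", "zzz"])))  # ['b', 'c']
```

Both approaches select the same rows; the difference is only whether the intersection is computed in Python or left to the database.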

@vemikhaylov
Contributor Author

I'm not sure whether we should check the provided task_ids and raise a 400 or 404 error if any of them doesn't exist in the given DAG, or just apply the filter blindly.
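
To make the stricter option concrete, here is a minimal sketch of the check it would need. `find_unknown_task_ids` is a hypothetical helper, not an existing Airflow function, and how the endpoint would turn a non-empty result into a 400 or 404 response is left open.

```python
from typing import Iterable, List


def find_unknown_task_ids(
    dag_task_ids: Iterable[str], requested: Iterable[str]
) -> List[str]:
    """Return the requested task ids that are not part of the DAG, sorted."""
    return sorted(set(requested) - set(dag_task_ids))


unknown = find_unknown_task_ids(["extract", "load"], ["load", "typo_task"])
if unknown:
    # The strict variant would reject the request here;
    # the "blind" variant would skip this check and just apply the filter.
    print(f"Unknown task ids: {unknown}")
```

With the blind filter, an unknown id silently matches nothing, so a typo in the request clears fewer tasks than intended without any error.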

Contributor

@ephraimbuddy ephraimbuddy left a comment

@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label Mar 6, 2021
@github-actions

github-actions bot commented Mar 6, 2021

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest master at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

@ephraimbuddy
Contributor

ephraimbuddy commented Mar 8, 2021

Please rebase and push

@vemikhaylov vemikhaylov force-pushed the feat/clear-tasks-task-id branch 3 times, most recently from 2801f9c to 005193d Compare March 11, 2021 09:58
@kurtqq
Contributor

kurtqq commented Apr 5, 2021

@kaxil @ephraimbuddy gentle ping

@ephraimbuddy
Contributor

Closing and reopening to trigger test runs

@ephraimbuddy ephraimbuddy reopened this Apr 5, 2021
@ephraimbuddy ephraimbuddy merged commit e150bbf into apache:master Apr 7, 2021
Labels
area:API (Airflow's REST/HTTP API)
full tests needed (We need to run full set of tests for this PR to merge)
Development

Successfully merging this pull request may close these issues.

Clear Tasks via the stable REST API with task_id filter
4 participants