Button on UI to pause all DAGs #36994

yshkp opened this issue Jan 24, 2024 · 15 comments
@yshkp

yshkp commented Jan 24, 2024

Description

A slider (toggle) button in the Airflow UI to pause/unpause all DAGs.

Use case/motivation

Every time we deploy, we need to pause all DAGs so that tasks do not run on the old build. This button would come in handy during deployments.

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@yshkp added the kind:feature and needs-triage labels Jan 24, 2024

boring-cyborg bot commented Jan 24, 2024

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise a PR to address this issue, please do so; there is no need to wait for approval.

@potiuk added the good first issue label and removed the needs-triage label Jan 24, 2024
@coolbeans201
Contributor

If there's a button for this in the UI, we'd definitely want to make sure that only the highest level of roles within a given server have access to this feature. That's probably assumed and would be taken care of, but just want to state the obvious.

@cnkumar20

cnkumar20 commented Jan 26, 2024

This is not ideal if the UI button doesn't revert DAGs to their original state on unpause. Passing the list of DAGs as a parameter should be strictly required, so that pause/unpause can return each DAG to its original state. This should be covered as a caveat.
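To make the concern concrete, here is a minimal sketch of "pause all, then restore the original state". It is not Airflow code: the `pause_all` and `restore` helpers and the plain dict of `dag_id -> is_paused` are hypothetical stand-ins for wherever the paused flags actually live.

```python
from typing import Dict


def pause_all(paused: Dict[str, bool]) -> Dict[str, bool]:
    """Pause every DAG in place and return a snapshot of the previous states."""
    snapshot = dict(paused)  # copy, so later changes do not alter the snapshot
    for dag_id in paused:
        paused[dag_id] = True
    return snapshot


def restore(paused: Dict[str, bool], snapshot: Dict[str, bool]) -> None:
    """Put each DAG back to the state recorded in the snapshot.

    DAGs that were added after the snapshot was taken are left untouched.
    """
    for dag_id, was_paused in snapshot.items():
        if dag_id in paused:
            paused[dag_id] = was_paused


# Hypothetical DAG ids: "legacy" was already paused before the deployment.
states = {"etl": False, "legacy": True}
before = pause_all(states)
all_paused = dict(states)
restore(states, before)
```

Without the snapshot, a blanket "unpause all" would also turn `legacy` back on, which is the failure mode described above.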

@rsdel2007

Hey @potiuk I am willing to work on this.

@potiuk
Member

potiuk commented Jan 29, 2024

Cool

@eladkal
Contributor

eladkal commented Jan 30, 2024

I would rather this not be a UI feature. This is a very dangerous operation, and a single accidental click of the button could cause serious problems for large deployments.

If this must be a UI button, it must come with a separate role that, by default, is granted only to Admin. However, I would rather this functionality be API only. From my perspective (and correct me if I am wrong), this operation should be used rarely, so having it as a button on the main page might not be the best choice.

WDYT?
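For reference, an API-only flow could look roughly like the sketch below. It assumes the Airflow 2.x stable REST API, where a single DAG is paused with `PATCH /api/v1/dags/{dag_id}` and a body of `{"is_paused": true}`. The base URL and DAG id are placeholders, authentication is deliberately left out, and the helper name `build_pause_request` is made up for illustration. The function only builds the request; sending it is left to the caller.

```python
import json
import urllib.request
from urllib.parse import quote


def build_pause_request(base_url: str, dag_id: str, paused: bool) -> urllib.request.Request:
    """Build (but do not send) a PATCH request that sets is_paused on one DAG."""
    url = (
        f"{base_url.rstrip('/')}/api/v1/dags/"
        f"{quote(dag_id, safe='')}?update_mask=is_paused"
    )
    body = json.dumps({"is_paused": paused}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


# Placeholder host and DAG id; add credentials for your deployment before
# passing the request to urllib.request.urlopen().
req = build_pause_request("http://localhost:8080", "example_dag", True)
```

Looping this over an explicit list of DAG ids keeps the operation deliberate and scriptable, which fits a deployment pipeline better than a button on the home page.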

@SamWheating
Contributor

Agreed that this is dangerous as the implied "unpause all" can resume DAGs which were previously paused for good reason.

As an admin, I have the option to scale the scheduler down to 0 replicas (preventing any runs from advancing) or setting AIRFLOW__SCHEDULER__USE_JOB_SCHEDULE=False (preventing any new DAG Runs from being created). So far this has been sufficient, but maybe there's a use-case I'm missing here.

@cmarteepants
Collaborator

cmarteepants commented Jan 31, 2024

What if we switched from a toggle per dag to a multi-selector with a pause/unpause action button? Similar to what we have here:
[image: screenshot of an existing list view with a multi-selector]

It's no longer one-click so it would be harder to accidentally pause all dags, and you can pause a subset more easily if you want. However, it does mean more clicks if you're only pausing one dag.

FWIW, I think the env var is fine for the limited times you need to do this, but I can appreciate that people like buttons. I may also be missing a use-case.

@rsdel2007

@potiuk I agree with @cmarteepants's suggestion. This seems to be the right way. WDYT?

@potiuk
Member

potiuk commented Feb 5, 2024

I am fine with that. @eladkal ?

@eladkal
Contributor

eladkal commented Feb 6, 2024

> What if we switched from a toggle per dag to a multi-selector with a pause/unpause action button? Similar to what we have here:

Can you be specific about which view you are referring to?
You mentioned something similar to the task instance view. Can you explain which view you had in mind? The task instance and dag run views are not suitable for this button.

@cmarteepants
Collaborator

I agree - the task instance and dag run views are not the right place - it was the only view I could find that had a multi-selector. My idea was to replace the individual pause/unpause toggles with a multi-selector + action button in the home page / dag list view.

@cmarteepants
Collaborator

Thinking about this further: Another idea is an extension of @yshkp's original proposal, which takes into account whether individual dags were paused before globally pausing all dags. We proceed with a button to pause/unpause all dags, however:

  • When dags are paused, we throw a warning box to confirm that this is intentional
  • We add a banner saying that all dags are paused
  • We disable all the pause/unpause toggles, but leave them in their current position. This would act as a visual identifier to see whether a dag was paused before someone globally paused all dags
  • Dags have to be unpaused via the same button.
  • When dags are unpaused, we enable all the toggles and dags will revert back to their individual paused/unpaused states

To be fair, this is becoming a lot more complicated.
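As a rough illustration of the bullets above, the global pause can be modelled as a separate flag layered over the per-dag toggles rather than a bulk edit of them. This is only a sketch: `PauseState` and its methods are hypothetical names and do not correspond to any existing Airflow model.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PauseState:
    """Per-dag toggles plus one global override that never rewrites them."""

    per_dag_paused: Dict[str, bool] = field(default_factory=dict)
    global_pause: bool = False

    def is_effectively_paused(self, dag_id: str) -> bool:
        # A dag is paused if either the global flag or its own toggle says so.
        return self.global_pause or self.per_dag_paused[dag_id]

    def set_dag_paused(self, dag_id: str, paused: bool) -> None:
        # Individual toggles are disabled while the global pause is active.
        if self.global_pause:
            raise RuntimeError("toggles are disabled while all dags are globally paused")
        self.per_dag_paused[dag_id] = paused


# Hypothetical dag ids: "legacy" was paused before the global pause.
state = PauseState({"etl": False, "legacy": True})

state.global_pause = True
during = {d: state.is_effectively_paused(d) for d in state.per_dag_paused}

try:
    state.set_dag_paused("etl", True)
    toggle_rejected = False
except RuntimeError:
    toggle_rejected = True

state.global_pause = False
after = {d: state.is_effectively_paused(d) for d in state.per_dag_paused}
```

Because the per-dag values are never overwritten, lifting the global pause returns every dag to its prior state without needing a snapshot.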

@eladkal
Contributor

eladkal commented Feb 6, 2024

> To be fair, this is becoming a lot more complicated

I believe it's complicated because you are combining the business logic of the DAG with the administrative operation of an upgrade. A DAG author should not have their business logic impacted (on/off) because the cluster admin decided to do an upgrade. That is, if you switch my DAG off for an upgrade, then you need to turn it back on once the upgrade is done. How would you do that for a deployment of 1000 DAGs authored by 200 different developers?

I think you are considering only deployments where the DAG author and the cluster admin are the same person :)
This is why I say it is a dangerous operation, and since the use case you described is for upgrades, I suggested making it an API call rather than a UI button.

I am happy to discuss ideas for how to accommodate your request, but you need to understand that we have to consider all kinds of deployments and the implications such a button would have.

I would be happier adding such an operation on a per-folder basis rather than for all DAGs, once #24464 is implemented.

(On a side note, I still don't quite follow why terminating the scheduler is not good enough to handle your use case.)

@cmarteepants
Collaborator

I'll defer to others on the use-cases. These were just two drive-by suggestions on how we can add UI capability for this while minimizing risk, assuming this is something the community truly wants :) Personally, I think the existing env var solution is fine.

8 participants