[AIRFLOW-6874] Reap cgroup procs when terminate in cgroup taskrunner#7498

Closed
YingboWang wants to merge 1 commit into apache:master from YingboWang:AIRFLOW-6847-reap-cgroup-procs

YingboWang commented Feb 22, 2020


Issue link: AIRFLOW-6874

Make sure to mark the boxes below before creating PR: [x]

Many Airflow tasks spawn subprocesses, and those subprocesses may spawn further subprocesses. In our experience, even when a task fails and the task runner tries to reap the process group, some leftover processes can keep running, which wastes resources and can compromise correctness.

Proposal: improve the cgroup task runner so that it reaps all processes for the current node when that node is terminated.
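To make the idea concrete, here is a minimal sketch of reaping every process in a cgroup, not the actual diff in this PR. It assumes the cgroup exposes a `cgroup.procs` file listing one PID per line (as Linux cgroups do); the function names `read_cgroup_pids` and `reap_cgroup_procs`, the `kill` and `skip_pid` parameters, and the example path are all hypothetical and chosen for illustration only.

```python
import os
import signal
from pathlib import Path
from typing import Callable, List, Optional


def read_cgroup_pids(procs_file: Path) -> List[int]:
    """Return the PIDs listed in a cgroup.procs file (one PID per line)."""
    try:
        text = procs_file.read_text()
    except FileNotFoundError:
        # The cgroup may already have been removed; nothing to reap.
        return []
    return [int(token) for token in text.split() if token.isdigit()]


def reap_cgroup_procs(
    procs_file: Path,
    sig: int = signal.SIGKILL,
    kill: Callable[[int, int], None] = os.kill,  # injectable for testing (assumption)
    skip_pid: Optional[int] = None,
) -> List[int]:
    """Send `sig` to every process in the cgroup except `skip_pid`.

    Returns the PIDs that were signalled. By default the calling
    process is skipped so the runner does not kill itself.
    """
    skip = os.getpid() if skip_pid is None else skip_pid
    signalled = []
    for pid in read_cgroup_pids(procs_file):
        if pid == skip:
            continue
        try:
            kill(pid, sig)
        except ProcessLookupError:
            # The process exited between reading the file and signalling it.
            continue
        signalled.append(pid)
    return signalled


# Hypothetical usage inside a task runner's terminate step:
# reap_cgroup_procs(Path("/sys/fs/cgroup/memory/<cgroup name>/cgroup.procs"))
```

Unlike reaping by process group, this approach also catches descendants that changed their process group or session, since membership is tracked by the cgroup rather than by the process hierarchy. A real implementation would likely re-read the file until it is empty, because processes can fork while they are being signalled.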

  • Description above provides context of the change
  • Commit message/PR title starts with [AIRFLOW-NNNN]. AIRFLOW-NNNN = JIRA ID*
  • Unit tests coverage for changes (not needed for documentation changes)
  • Commits follow "How to write a good git commit message"
  • Relevant documentation is updated including usage instructions.
  • I will engage committers as explained in Contribution Workflow Example.

* For document-only changes commit message can start with [AIRFLOW-XXXX].


In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.

boring-cyborg bot added the area:Scheduler label Feb 22, 2020

KevinYang21 left a comment

This is a very valuable change. Can we add some unit tests around it and get a green CI run?

stale bot commented Apr 25, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the stale label Apr 25, 2020
stale bot closed this May 2, 2020