introduce dag processor job fix #27140 #28799

Merged
merged 1 commit into from
Feb 21, 2023

farhan0syakir
Contributor

@farhan0syakir farhan0syakir commented Jan 9, 2023

fixes: #27140



@farhan0syakir
Contributor Author

fix #27140

@potiuk
Member

potiuk commented Jan 18, 2023

Looks cool @farhan0syakir. But I have two asks:

  1. Did you test it with more than one DagFileProcessor running, on two different subdirs? Each of them should potentially be a separate job and have a separate liveness probe.

  2. Could you please add more description to the commit than fixes: https://github.com/apache/airflow/issues/27140 ? The thing is that commits remain with us, while issue/PR history is kept in GitHub, and (as per ASF rules) we need to keep all the context in Git, so we need a more complete description of the context and the fix in the commit.

@farhan0syakir farhan0syakir marked this pull request as draft January 21, 2023 12:32
@farhan0syakir
Contributor Author

Hi @potiuk,
let me answer your questions.

  1. I didn't. However, when I tried that, another question came up: is it possible to specify subdir in the current Helm values.yaml? If not, we probably need to raise another issue(?). Also, for the liveness probe I use airflow jobs check --hostname $(hostname), which specifies the hostname, so each processor is checked separately.
  2. I agree. I accidentally committed that in another branch, which I have now deleted. Here the commit message is introduce dag processor job. Is that okay?

@farhan0syakir farhan0syakir marked this pull request as ready for review January 21, 2023 16:43
@potiuk
Member

potiuk commented Feb 19, 2023

Sorry for the delay:

  1. I think it would be good to make a PR with this change.
  2. Yeah. Rebase and add it.

@mhenc - also, can you please let us know what you think about it?

@mhenc
Collaborator

mhenc commented Feb 21, 2023

Thank you for looking into that.
This change looks good to me.
TBH I haven't tested it with Helm (as we use another way of deploying workloads to K8s).

@potiuk potiuk merged commit 0018b94 into apache:main Feb 21, 2023
@boring-cyborg

boring-cyborg bot commented Feb 21, 2023

Awesome work, congrats on your first merged pull request!

@pierrejeambrun pierrejeambrun added this to the Airflow 2.5.2 milestone Feb 27, 2023
@pierrejeambrun pierrejeambrun added the type:bug-fix Changelog: Bug Fixes label Feb 27, 2023
pierrejeambrun pushed a commit that referenced this pull request Mar 6, 2023
DagFileProcessorManager was not creating jobs in the metadata DB, so the livenessProbe was not valid.
A new job is created for the Standalone DAG Processor.
By doing that, the airflow jobs check --hostname command works correctly and the livenessProbe no longer fails.

(cherry picked from commit 0018b94)
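
To make the reasoning in this commit message concrete, here is a minimal sketch of why a hostname-scoped liveness check cannot pass when no job row exists for that host. It is an illustration only, not Airflow's implementation: `JobRecord`, `has_alive_job`, the host names, and the 30-second freshness threshold are all hypothetical, and the idea that liveness depends on a recent heartbeat is an assumption made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class JobRecord:
    """Hypothetical stand-in for one job row in the metadata DB."""

    job_type: str
    hostname: str
    latest_heartbeat: datetime


def has_alive_job(jobs, hostname, now, max_age=timedelta(seconds=30)):
    """Return True if any job on `hostname` has a sufficiently recent heartbeat.

    The 30-second default is an arbitrary value chosen for this sketch.
    """
    return any(
        job.hostname == hostname and now - job.latest_heartbeat <= max_age
        for job in jobs
    )


now = datetime(2023, 2, 21, 12, 0, 0)

# Before the change: the standalone DAG processor registers no job row,
# so a check scoped to its hostname finds nothing and the probe fails.
jobs_before = [JobRecord("SchedulerJob", "scheduler-0", now)]
print(has_alive_job(jobs_before, "dag-processor-0", now))  # False

# After the change: a job row exists for the processor's host.
jobs_after = jobs_before + [JobRecord("DagProcessorJob", "dag-processor-0", now)]
print(has_alive_job(jobs_after, "dag-processor-0", now))  # True
```

Because the lookup is filtered by hostname, each processor host is evaluated on its own, which is the property the liveness probe relies on.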
pierrejeambrun pushed a commit that referenced this pull request Mar 8, 2023
potiuk added a commit to potiuk/airflow that referenced this pull request Mar 24, 2023
The DagProcessorJob integration implemented in apache#28799 was not
complete. It missed a few crucial changes:

* importing DagProcessorJob in airflow/models/__init__.py - not
  importing it there caused `airflow jobs check` to fail when
  querying DagProcessorJob in the BaseJob query, because
  DagProcessorJob was not registered by the time the query
  was run (so polymorphic ORM model retrieval was not aware of
  the DagProcessorJob model).

* the airflow jobs check command did not have DagProcessorJob
  added as a valid job type, so it was impossible to monitor it

* also, the processor manager did not set heartbeats periodically,
  so the Job for the DagFileProcessor was considered not alive
  pretty quickly even if the standalone dag-processor was running.

This PR fixes all three problems.
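
The first problem above (a subclass that is unknown at query time because its module was never imported) can be sketched with a plain-Python analogy. This is not SQLAlchemy or Airflow code: `JOB_REGISTRY` and `load_job` are hypothetical names, and the registry only imitates, under that assumption, how a polymorphic lookup depends on subclasses having been defined before rows are loaded.

```python
# Plain-Python analogy only; this is not SQLAlchemy or Airflow code.
JOB_REGISTRY = {}


class BaseJob:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Registration happens only when the subclass definition is executed,
        # i.e. when the module that defines it has been imported.
        JOB_REGISTRY[cls.__name__] = cls


def load_job(row):
    """Turn a stored row into an instance of the class named by its job_type."""
    job_type = row["job_type"]
    if job_type not in JOB_REGISTRY:
        raise LookupError(f"unknown job type: {job_type}")
    return JOB_REGISTRY[job_type]()


row = {"job_type": "DagProcessorJob"}

# The subclass has not been defined yet, so the lookup fails.
try:
    load_job(row)
except LookupError as exc:
    print(exc)  # unknown job type: DagProcessorJob


# Defining the class here stands in for importing its module at startup.
class DagProcessorJob(BaseJob):
    pass


print(type(load_job(row)).__name__)  # DagProcessorJob
```

In this analogy, importing every subclass from one central module guarantees the registry is complete before any row is loaded, which mirrors the motivation given for the import in airflow/models/__init__.py.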

Fixes: apache#30251
pierrejeambrun pushed a commit that referenced this pull request Mar 24, 2023
pierrejeambrun pushed a commit that referenced this pull request Mar 24, 2023
(cherry picked from commit c858509)
Labels
area:CLI area:Scheduler Scheduler or dag parsing Issues type:bug-fix Changelog: Bug Fixes
Development

Successfully merging this pull request may close these issues.

Invalid livenessProbe for Standalone DAG Processor
4 participants