Fix "run_id" k8s and elasticsearch compatibility with Airflow 2.1 #22385

Merged
merged 1 commit into from Mar 22, 2022

potiuk
Member

@potiuk potiuk commented Mar 20, 2022

The execution_date -> run_id change (#21960) attempted to stay
backwards-compatible with Airflow 2.1, but the problem is that in
Airflow 2.1 retrieving the run_id attribute of TaskInstance throws
AttributeError rather than returning None. It turns out that when
you have a field defined in an ORM model, it will never throw
AttributeError (even if you delete the attribute, it will return
None).

Accessing run_id with getattr raises
AttributeError in Airflow 2.1 (because there TaskInstance has no
run_id defined).
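
A minimal sketch of the failure mode and of the `getattr`-with-default pattern used in the diff for this PR. The classes `LegacyTaskInstance` and `CurrentTaskInstance` and the helper `safe_run_id` are hypothetical stand-ins written for illustration; they are not Airflow's real `TaskInstance` and do not involve the ORM.

```python
class LegacyTaskInstance:
    """Hypothetical stand-in for a TaskInstance that has no run_id at all."""

    def __init__(self, dag_id, task_id):
        self.dag_id = dag_id
        self.task_id = task_id


class CurrentTaskInstance(LegacyTaskInstance):
    """Hypothetical stand-in for a TaskInstance that does carry run_id."""

    def __init__(self, dag_id, task_id, run_id):
        super().__init__(dag_id, task_id)
        self.run_id = run_id


def safe_run_id(ti, default=""):
    # getattr with a third argument returns the default instead of
    # raising AttributeError when the attribute does not exist.
    return getattr(ti, "run_id", default)


legacy = LegacyTaskInstance("example_dag", "example_task")
try:
    legacy.run_id  # direct access fails on the legacy object
except AttributeError:
    direct_access_failed = True
else:
    direct_access_failed = False
```

With these stand-ins, `safe_run_id(legacy)` falls back to the empty string, while an object that defines `run_id` returns its real value.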

This PR adds an automated pre-commit check to verify that other providers
have not suffered (and will not suffer) from the same problem.
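
One way such a check could work is sketched below. This is an assumption for illustration only and not the hook added by this PR: the function name `find_direct_run_id_reads` is hypothetical, and the idea is simply to flag direct reads of a `.run_id` attribute in source code so they can be replaced with `getattr(..., "run_id", default)`.

```python
import ast
from typing import List


def find_direct_run_id_reads(source: str) -> List[int]:
    """Return sorted line numbers where `<something>.run_id` is read directly.

    Hypothetical helper: assignments to `.run_id` and getattr() calls
    are not reported, only attribute reads.
    """
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Attribute)
        and node.attr == "run_id"
        and isinstance(node.ctx, ast.Load)
    )


sample = (
    "a = ti.run_id\n"
    'b = getattr(ti, "run_id", "")\n'
    "ti.run_id = 1\n"
)
flagged_lines = find_direct_run_id_reads(sample)
```

In the sample, only the first line is flagged, since the second goes through `getattr` and the third is an assignment rather than a read.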


@potiuk
Member Author

potiuk commented Mar 20, 2022

I need to re-release the providers due to a bug found in install_requires (#22380), but before that I would also like to get this one fixed (it was raised by one of the users in https://apache-airflow.slack.com/archives/CCV3FV9KL/p1647519576375459).

@potiuk
Member Author

potiuk commented Mar 20, 2022

Looks like it works :)

@potiuk potiuk force-pushed the add-run-id-fix-for-airflow-2-1 branch from 080ff01 to 5ecb889 Compare March 21, 2022 09:29
@potiuk potiuk requested a review from mik-laj as a code owner March 21, 2022 09:29
@potiuk potiuk force-pushed the add-run-id-fix-for-airflow-2-1 branch 2 times, most recently from 764c8f7 to 413cd7b Compare March 21, 2022 09:32
@potiuk potiuk changed the title Fix cncf.kubernetes provider compatibility with Airflow 2.1 Fix "run_id" k8s and elasticsearch compatibility with Airflow 2.1 Mar 21, 2022
@potiuk potiuk force-pushed the add-run-id-fix-for-airflow-2-1 branch 2 times, most recently from 400c696 to bcf7dab Compare March 21, 2022 10:27
@potiuk potiuk force-pushed the add-run-id-fix-for-airflow-2-1 branch from bcf7dab to 89a991d Compare March 21, 2022 10:30
@potiuk
Member Author

potiuk commented Mar 21, 2022

I also removed the unit test (it's not really needed now and it was pretty "convoluted" anyway).

@potiuk
Member Author

potiuk commented Mar 21, 2022

All green now.

@@ -129,7 +129,7 @@ def _render_log_id(self, ti: TaskInstance, try_number: int) -> str:
         return self.log_id_template.format(
             dag_id=ti.dag_id,
             task_id=ti.task_id,
-            run_id=ti.run_id,
+            run_id=getattr(ti, "run_id", ""),
Member

@kaxil kaxil Mar 21, 2022

Does this work with an empty string as a default?

cc @jedcunningham @uranusjr

Member Author

@potiuk potiuk Mar 21, 2022

It's only formatting the log entry, so I "guess" it produces a proper log entry. But good point - I do not know if the ES logs will then be properly parsed by the ES engine (though having unparseable logs is still infinitely better than crashing Airflow in this case :D)
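
To make the question concrete, here is a rough sketch of what the formatted log id looks like when `run_id` falls back to an empty string. The template string `LOG_ID_TEMPLATE` is an assumption chosen for illustration (the real value comes from configuration), and `render_log_id` is a simplified stand-in for the `_render_log_id` method in the diff.

```python
from types import SimpleNamespace

# Assumed template for illustration; the actual template is configurable.
LOG_ID_TEMPLATE = "{dag_id}-{task_id}-{run_id}-{try_number}"


def render_log_id(ti, try_number):
    # Mirrors the changed line: fall back to "" when ti has no run_id.
    return LOG_ID_TEMPLATE.format(
        dag_id=ti.dag_id,
        task_id=ti.task_id,
        run_id=getattr(ti, "run_id", ""),
        try_number=try_number,
    )


# Hypothetical task instances, with and without run_id.
ti_without_run_id = SimpleNamespace(dag_id="d", task_id="t")
ti_with_run_id = SimpleNamespace(dag_id="d", task_id="t", run_id="manual__1")

log_id_without = render_log_id(ti_without_run_id, 1)  # "d-t--1"
log_id_with = render_log_id(ti_with_run_id, 1)        # "d-t-manual__1-1"
```

Under this assumed template the call succeeds, but the id contains an empty segment, so log ids for different runs of the same task and try number would collide. Whether that matters for lookups in Elasticsearch is the open question raised above.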

Member Author

@jedcunningham @uranusjr 🙏 - I'd love to re-release providers asap as the ones we have install gitpython and wheel due to my sloppiness :(

Member Author

Hey @jedcunningham @uranusjr @kaxil -> I'd really love to merge it and re-release the providers soon. The "extra packages" in the last list need to go away, and cncf.kubernetes is starting to be a problem for users of Airflow 2.1 (likely this one: https://apache-airflow.slack.com/archives/CCV3FV9KL/p1647936437392429).

Member

Approved the PR, but I still don't know about this and haven't dug deep.

Member Author

@potiuk potiuk Mar 24, 2022

Well. It certainly won't be worse than crashing when you try to send an elasticsearch log :)

Member Author

This is what 2.1 users would experience if they run this provider version.

@potiuk
Member Author

potiuk commented Mar 22, 2022

Anyone?

@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label Mar 22, 2022
@github-actions

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

@potiuk potiuk merged commit 0f977da into apache:main Mar 22, 2022
@potiuk potiuk deleted the add-run-id-fix-for-airflow-2-1 branch March 22, 2022 21:01
@ephraimbuddy ephraimbuddy added type:misc/internal Changelog: Misc changes that should appear in change log type:bug-fix Changelog: Bug Fixes and removed type:misc/internal Changelog: Misc changes that should appear in change log labels Apr 11, 2022
@potiuk potiuk restored the add-run-id-fix-for-airflow-2-1 branch April 26, 2022 20:49
pankajkoti added a commit to astronomer/astronomer-providers that referenced this pull request May 4, 2022
…ion_date

As part of PR apache/airflow#22385, released
in Airflow 2.3.0, TaskInstance now requires either
run_id or execution_date. Add the same to the failing test so that
it passes.
@potiuk potiuk deleted the add-run-id-fix-for-airflow-2-1 branch July 29, 2022 20:04
Labels
area:providers full tests needed We need to run full set of tests for this PR to merge provider:cncf-kubernetes Kubernetes provider related issues type:bug-fix Changelog: Bug Fixes

4 participants