Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix DetachedInstanceError when dag_run attrs are accessed from ti #18499

Merged
merged 3 commits into from
Sep 27, 2021

Conversation

ephraimbuddy
Copy link
Contributor

Fix DetachedInstanceError when dag_run attrs are accessed from ti

Loading a taskinstance doesn't load the corresponding dag_run with
the effect that when the dag_run attr is accessed from the ti it gives
a DetachedInstanceError, dag_run is not bound to a session.

This PR fixes it by making an extra query to get the dagrun using the
relationship between the two objects

cc: @ash @uranusjr
I think with this, we can also get rid of ti.get_dagrun method?


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.

@ashb
Copy link
Member

ashb commented Sep 24, 2021

Hmmmmm, the selectin to the relationship is probably safer, but in cases where we know we are going to do this it's better to add options(eagerload(TI.dag_run)) etc to avoid the extra query.

No we can't remove it -- we can deprecate it but it can't be removed until 3.0

(12, False),
(5, True),
(13, False),
(7, True),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If possible we don't want these query counts to change.

@ashb
Copy link
Member

ashb commented Sep 24, 2021

@ephraimbuddy A fix like #18359 might also fix this problem, as might #18503

I think having this "fallback" to loading is better than an error, but by doing it like one of these other two it should avoid an extra query.

@ephraimbuddy
Copy link
Contributor Author

It's not obvious that ti.dag_run is a detached object

@ephraimbuddy A fix like #18359 might also fix this problem, as might #18503

I think having this "fallback" to loading is better than an error, but by doing it like one of these other two it should avoid an extra query.

Yeah. You're right.

@ephraimbuddy
Copy link
Contributor Author

ephraimbuddy commented Sep 24, 2021

The cause can easily be fixed by using ti.get_dagrun instead of ti.dag_run, see

dag_run = ti.dag_run
if self.json_format:
data_interval_start = self._clean_date(dag_run.data_interval_start)
data_interval_end = self._clean_date(dag_run.data_interval_end)
execution_date = self._clean_date(dag_run.execution_date)
else:
data_interval_start = dag_run.data_interval_start.isoformat()
data_interval_end = dag_run.data_interval_end.isoformat()
execution_date = dag_run.execution_date.isoformat()

Should I fix it at the source of the error?
EDIT:
Working on a solution now

@ephraimbuddy
Copy link
Contributor Author

After not finding a way I could work this to load dag_run, this fix worked. I think it's better to use this instead of an extra query each time a task instance is loaded.

@ephraimbuddy
Copy link
Contributor Author

After not finding a way I could work this to load dag_run, this fix worked. I think it's better to use this instead of an extra query each time a task instance is loaded.

If this fix is Ok, I will modify the title and commit message

@github-actions
Copy link

The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.

@github-actions github-actions bot added the okay to merge It's ok to merge this PR as it does not require more tests label Sep 26, 2021
Copy link
Member

@ashb ashb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much more direct - any we want to deprecate uses of ti.executipn_date.

That said: does this mean that the old version of the ES provider won't work with Airflow 2.2?

We can fix that by setting the deps somewhere in provider_manager; or add a eagerload(TI.dag_run) in airflow core before hitting this code

@ephraimbuddy
Copy link
Contributor Author

@ashb, I applied this fix #18503 and it worked.
I also applied a fix where I attached dag_run to this ti:

ti = dag_run.get_task_instance(task.task_id)

ti.dag_run=dag_run did not work but ti.dag_run = ti.get_dagrun() worked.
I would go with #18503. What do you suggest?

Loading a taskinstance doesn't load the corresponding dag_run with
the effect that when the dag_run attr is accessed from the ti it gives
a DetachedInstanceError, dag_run is not bound to a session.

This PR fixes it by making an extra query to get the dagrun using the
relationship between the two objects
@ephraimbuddy
Copy link
Contributor Author

@ashb, I applied this fix #18503 and it worked.
I also applied a fix where I attached dag_run to this ti:

ti = dag_run.get_task_instance(task.task_id)

ti.dag_run=dag_run did not work but ti.dag_run = ti.get_dagrun() worked.
I would go with #18503. What do you suggest?

I found that there's an eager loading option in the relationship(used for an attribute that's guaranteed to have an element) and that reduced the query!

@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label Sep 27, 2021
@github-actions
Copy link

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

@kaxil kaxil merged commit 80ae70c into apache:main Sep 27, 2021
@kaxil kaxil deleted the load-dagrun-obj branch September 27, 2021 19:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
full tests needed We need to run full set of tests for this PR to merge okay to merge It's ok to merge this PR as it does not require more tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants