Add cascade to DagRun/TaskInstance relationship #18434

Merged on Sep 22, 2021 (2 commits)


ephraimbuddy
Contributor

We currently have an issue where deleting DagRuns causes a dependency error in SQLAlchemy because
the session doesn't know what to do with the related TaskInstances.

This PR adds a cascade so that when a DagRun is marked for deletion, the related TaskInstances
are also deleted.

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/flask_appbuilder/models/sqla/interface.py", line 698, in delete
    self.session.commit()
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/scoping.py", line 163, in do
    return getattr(self.registry(), name)(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 1046, in commit
    self.transaction.commit()
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 504, in commit
    self._prepare_impl()
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 483, in _prepare_impl
    self.session.flush()
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 2540, in flush
    self._flush(objects)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 2682, in _flush
    transaction.rollback(_capture_exception=True)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
    with_traceback=exc_tb,
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
    raise exception
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 2642, in _flush
    flush_context.execute()
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/unitofwork.py", line 422, in execute
    rec.execute(self)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/unitofwork.py", line 538, in execute
    self.dependency_processor.process_deletes(uow, states)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/dependency.py", line 547, in process_deletes
    state, child, None, True, uowcommit, False
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/dependency.py", line 604, in _synchronize
    sync.clear(dest, self.mapper, self.prop.synchronize_pairs)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/sync.py", line 88, in clear
    "column '%s' on instance '%s'" % (r, orm_util.state_str(dest))
AssertionError: Dependency rule tried to blank-out primary key column 'task_instance.dag_id' on instance '<TaskInstance at 0x7fbdca3bc3c8>
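
For illustration, the fix can be reproduced outside Airflow. The following is a minimal sketch, not Airflow's actual models: `Run` and `Task` are hypothetical stand-ins for DagRun and TaskInstance, the in-memory SQLite database is an assumption, and the cascade string is one common choice rather than necessarily the one this PR settled on. The key detail is that the child's foreign key column is also part of its primary key, so the ORM cannot simply set it to NULL when the parent is deleted.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import relationship, sessionmaker

try:
    from sqlalchemy.orm import declarative_base  # SQLAlchemy 1.4 and later
except ImportError:
    from sqlalchemy.ext.declarative import declarative_base  # older releases

Base = declarative_base()


class Run(Base):
    """Hypothetical stand-in for DagRun."""

    __tablename__ = "run"
    id = Column(Integer, primary_key=True)
    # The delete cascade tells the session to delete the children
    # instead of trying to null out their foreign key.
    tasks = relationship("Task", back_populates="run", cascade="all, delete-orphan")


class Task(Base):
    """Hypothetical stand-in for TaskInstance."""

    __tablename__ = "task"
    # The foreign key is part of the composite primary key.
    run_id = Column(Integer, ForeignKey("run.id"), primary_key=True)
    task_id = Column(String(50), primary_key=True)
    run = relationship("Run", back_populates="tasks")


def delete_run_and_count_tasks():
    """Create one run with two tasks, delete the run, return task counts."""
    engine = create_engine("sqlite://")  # in-memory database (assumption)
    Base.metadata.create_all(engine)
    session = sessionmaker(bind=engine)()
    try:
        session.add(Run(id=1, tasks=[Task(task_id="extract"), Task(task_id="load")]))
        session.commit()
        before = session.query(Task).count()

        run = session.query(Run).filter_by(id=1).one()
        session.delete(run)
        session.commit()
        after = session.query(Task).count()
        return before, after
    finally:
        session.close()


if __name__ == "__main__":
    print(delete_run_and_count_tasks())
```

If the `cascade` argument is dropped from `Run.tasks`, the same delete is expected to fail with the "blank-out primary key column" AssertionError shown in the traceback above, because the default behavior is to clear the children's foreign key rather than delete them.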


@@ -124,7 +124,7 @@ class DagRun(Base, LoggingMixin):
 ),
 )

-task_instances = relationship(TI, back_populates="dag_run")
+task_instances = relationship(TI, back_populates="dag_run", cascade='all, delete, delete-orphan')
Member

According to the documentation, `all` implies `delete`, so the latter is not needed. However, I don’t think we should include `expunge` (also implied by `all`). I’m also not sure about `refresh-expire`. Maybe a safer route would be `save-update, merge, delete, delete-orphan` (the first two are the default value).

https://docs.sqlalchemy.org/en/14/orm/relationship_api.html#sqlalchemy.orm.relationship.params.cascade
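
To make the comparison above concrete, here is a small sketch that expands cascade strings into explicit option sets. `expand_cascade` is a hypothetical helper written for this illustration, not a SQLAlchemy API; the only fact it relies on is the documented rule that `all` is shorthand for `save-update, merge, refresh-expire, expunge, delete` and does not include `delete-orphan`.

```python
# Options that "all" stands for, per the SQLAlchemy cascade documentation.
# "delete-orphan" is separate and is never implied by "all".
ALL_CASCADES = frozenset(
    {"save-update", "merge", "refresh-expire", "expunge", "delete"}
)


def expand_cascade(spec):
    """Hypothetical helper: expand a cascade string into a set of options."""
    options = set()
    for name in spec.split(","):
        name = name.strip()
        if not name:
            continue
        if name == "all":
            options |= ALL_CASCADES
        else:
            options.add(name)
    return frozenset(options)


as_submitted = expand_cascade("all, delete, delete-orphan")
without_delete = expand_cascade("all, delete-orphan")
suggested = expand_cascade("save-update, merge, delete, delete-orphan")

# The explicit "delete" adds nothing on top of "all".
redundant = as_submitted == without_delete
# Options the suggested string leaves out compared to the submitted one.
dropped = sorted(as_submitted - suggested)
```

Under these rules, the submitted string and `all, delete-orphan` expand to the same set, and the suggested alternative differs only by leaving out `expunge` and `refresh-expire`.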

@ashb
Member

ashb commented Sep 22, 2021

When did this start failing? Cos it has been passing for a while.

Oh, a new test added in #16634 that passed on the PR but never worked on main :(

@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label Sep 22, 2021
@github-actions

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

@potiuk potiuk merged commit 13a558d into apache:main Sep 22, 2021
@ephraimbuddy ephraimbuddy deleted the fix-dagrun-deletion branch September 22, 2021 16:53
@potiuk
Member

potiuk commented Sep 22, 2021

> Oh, a new test added in #16634 that passed on the PR but never worked on main :(

Sorry again, my fault in #17883: I missed image tagging. This has been fixed in #18433.
Protection against this in the future has also been added in #18435.

@ephraimbuddy ephraimbuddy added this to the Airflow 2.2.0 milestone Sep 22, 2021