
chore: add unique constraint to tagged_objects #26654

Merged (16 commits) on Jan 19, 2024

Conversation

@mistercrunch (Member) commented Jan 17, 2024

SUMMARY

I ran into an issue where duplicates in tagged_object prevented the deletion of a dashboard. This PR adds a unique constraint to that table, preventing the issue from happening again. The constraint may surface a new error on update or insert of a dashboard object; hopefully a unit test would catch that if there's such a race condition.

The error I observed was:

sqlalchemy.orm.exc.StaleDataError: DELETE statement on table 'tagged_object' expected to delete 2 row(s); Only 3 were matched.

Looking into my database and researching the issue, I found duplicate rows in my tagged_object table:

mysql> select * from tagged_object where object_type ='dashboard' and object_id=11 LIMIT 10;
+---------------------+---------------------+-----+--------+-----------+-------------+---------------+---------------+
| created_on          | changed_on          | id  | tag_id | object_id | object_type | created_by_fk | changed_by_fk |
+---------------------+---------------------+-----+--------+-----------+-------------+---------------+---------------+
| 2024-01-16 15:21:13 | 2024-01-16 15:21:13 | 133 |      3 |        11 | dashboard   |             1 |             1 |
| 2024-01-16 16:19:29 | 2024-01-16 16:19:29 | 180 |      4 |        11 | dashboard   |             1 |             1 |
| 2024-01-16 16:19:29 | 2024-01-16 16:19:29 | 181 |      4 |        11 | dashboard   |             1 |             1 |
+---------------------+---------------------+-----+--------+-----------+-------------+---------------+---------------+

Handling it in the cascade delete seemed harder than preventing it, and prevention is a great thing anyway. Note that this PR includes a migration that also deletes any pre-existing dups.
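The dedup deletion can be sketched roughly as follows. This is a simplified illustration, not the actual migration: the table shape mirrors tagged_object, it runs against in-memory SQLite, and the real revision additionally has to alias the subquery (MySQL can't subquery the same table it is deleting from). The idea is to keep the lowest id per (tag_id, object_id, object_type) group and delete the rest:

```python
import sqlalchemy as sa

engine = sa.create_engine("sqlite://")
metadata = sa.MetaData()
tagged_object = sa.Table(
    "tagged_object", metadata,
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("tag_id", sa.Integer),
    sa.Column("object_id", sa.Integer),
    sa.Column("object_type", sa.String),
)
metadata.create_all(engine)

with engine.begin() as conn:
    # seed data mirroring the duplicate rows shown above: ids 180/181 dup each other
    conn.execute(tagged_object.insert(), [
        {"id": 133, "tag_id": 3, "object_id": 11, "object_type": "dashboard"},
        {"id": 180, "tag_id": 4, "object_id": 11, "object_type": "dashboard"},
        {"id": 181, "tag_id": 4, "object_id": 11, "object_type": "dashboard"},
    ])
    # the ids to keep: the minimum id of each (tag_id, object_id, object_type) group
    min_ids = (
        sa.select(sa.func.min(tagged_object.c.id))
        .group_by(
            tagged_object.c.tag_id,
            tagged_object.c.object_id,
            tagged_object.c.object_type,
        )
    )
    # delete every row that is not its group's minimum-id representative
    conn.execute(tagged_object.delete().where(tagged_object.c.id.notin_(min_ids)))
    remaining = conn.execute(
        sa.select(tagged_object.c.id).order_by(tagged_object.c.id)
    ).scalars().all()
```

After this, the unique constraint on the three columns can be created without conflicts.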

Phase 2: fixing the root cause

After setting up the unique constraint, I hit the root cause: something around auto-owner management, where we maintain special owner tags for objects. The previous approach was to systematically delete-and-recreate them all, and I think that logic plus the transaction handling was the root cause of the dups. I changed the logic there to go id-by-id, removing or adding individual ids only where needed.

I also cleaned up the API to upsert the tags it receives instead of raising an error. So if you now call the API to add a tag that's already set, it won't return an error; it just adds the tag if needed.
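The idempotent behavior can be sketched like this. This is a hypothetical illustration (simplified table, illustrative helper name), not Superset's actual API code; the point is that "add tag" checks for an existing association and only inserts when missing, so re-adding is a no-op rather than a constraint violation:

```python
import sqlalchemy as sa

engine = sa.create_engine("sqlite://")
metadata = sa.MetaData()
tagged_object = sa.Table(
    "tagged_object", metadata,
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("tag_id", sa.Integer),
    sa.Column("object_id", sa.Integer),
    sa.Column("object_type", sa.String),
    # the constraint added by this PR
    sa.UniqueConstraint("tag_id", "object_id", "object_type"),
)
metadata.create_all(engine)

def add_tag(conn, tag_id, object_id, object_type):
    """Insert the association only when missing; return True if a row was added."""
    exists = conn.execute(
        sa.select(tagged_object.c.id).where(
            sa.and_(
                tagged_object.c.tag_id == tag_id,
                tagged_object.c.object_id == object_id,
                tagged_object.c.object_type == object_type,
            )
        )
    ).first()
    if exists:
        return False
    conn.execute(
        tagged_object.insert().values(
            tag_id=tag_id, object_id=object_id, object_type=object_type
        )
    )
    return True

with engine.begin() as conn:
    first = add_tag(conn, 4, 11, "dashboard")
    second = add_tag(conn, 4, 11, "dashboard")  # idempotent: no error, no dup
    count = conn.execute(
        sa.select(sa.func.count()).select_from(tagged_object)
    ).scalar()
```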

reusing code across alembic migrations

I cracked a solution for sharing a module across Alembic migrations. Somehow that's way trickier than it seems; I had failed at least a few times before at finding a solution for this. We now have superset/migrations/migration_utils.py for that purpose.

@mistercrunch mistercrunch requested a review from a team as a code owner January 17, 2024 21:36
codecov bot commented Jan 17, 2024

Codecov Report

Attention: 20 lines in your changes are missing coverage. Please review.

Comparison is base (1010294) 69.56% compared to head (62d86a9) 69.46%.
Report is 4 commits behind head on master.

Files Patch % Lines
superset/migrations/migration_utils.py 0.00% 11 Missing ⚠️
superset/tags/models.py 74.19% 8 Missing ⚠️
superset/daos/tag.py 83.33% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #26654      +/-   ##
==========================================
- Coverage   69.56%   69.46%   -0.11%     
==========================================
  Files        1892     1894       +2     
  Lines       74162    74150      -12     
  Branches     8263     8263              
==========================================
- Hits        51593    51506      -87     
- Misses      20488    20563      +75     
  Partials     2081     2081              
Flag Coverage Δ
hive 53.85% <25.00%> (+0.07%) ⬆️
mysql 77.99% <61.53%> (-0.21%) ⬇️
postgres 78.12% <61.53%> (-0.21%) ⬇️
presto 53.80% <25.00%> (+0.07%) ⬆️
python 83.01% <61.53%> (-0.21%) ⬇️
sqlite 77.70% <61.53%> (-0.21%) ⬇️
unit 56.37% <25.00%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown.

.alias("min_ids")
)

delete_query = tagged_object_table.delete().where(
@mistercrunch (Member Author)

this is a little scary, needs review, but should be ok

@john-bodley (Member)

Thanks @mistercrunch for the cleanup. There's been a slew of PRs in recent months where we've tried to adopt the "shift left" approach, adding various uniqueness/foreign-key constraints and cascade deletes at the database level.

@mistercrunch (Member Author)

Yeah @john-bodley, it seemed hard to fix the cascading delete; shift-left / prevention seemed like the way to go. That delete in the migration could use another set of eyes. I'm pretty sure it's good, but it's a little scary to look at, and I wouldn't want to wipe tag associations for folks out there.

@mistercrunch (Member Author)

Oh, also related to this: running the Alembic auto-migration, I noticed there's some extra metadata around indexes/constraints/nullable in the ORM that is not in sync with the set of migrations (i.e., not implemented in the database). Maybe a few dozen cases or so; it could be good to create a sync migration PR eventually to bring everything in line.

@john-bodley (Member)

@mistercrunch I wonder if that's because we're not tightly coupled with Flask-Migrate in terms of how migrations are defined, plus the lack of auto-generation. I tried to tackle that in #26172 but hit a brick wall. There's a discussion I created in Flask-Migrate related to it as well.

type_ = TagType.custom
tag_name = name.strip()
@mistercrunch (Member Author)

This could have led to dups: if you add "hello" and "hello " it would probably trigger an error. It wasn't the root cause, but might as well clean it up.

import sys

# hack to be able to import / reuse migration_utils.py in revisions
module_dir = os.path.dirname(os.path.realpath(__file__))
@mistercrunch (Member Author)

This is a bit of a breakthrough. I had tried before to get some sort of migration_utils.py module going to reuse code across revisions, and was never able to pull it off; here's the hack that enables it. This is great because constraint handling and the like get verbose if done right, and all the dialect-specific stuff shouldn't be repeated in each revision. Hoping we can grow migration_utils in the future to simplify and improve revisions.

@@ -63,7 +63,7 @@ def test_create_custom_tag_command(self):
example_dashboard = (
db.session.query(Dashboard).filter_by(slug="world_health").one()
)
example_tags = ["create custom tag example 1", "create custom tag example 2"]
example_tags = {"create custom tag example 1", "create custom tag example 2"}
@mistercrunch (Member Author)

wondering if that test was flaky before; in any case it turned flaky with the new logic

@mistercrunch (Member Author)

@hash-data do you mean you upgraded and can still see the same bug? Can you share more? Are you sure you properly upgraded, database migration and all?

@mfmo92 commented Feb 22, 2024

I'm by no means an expert at SQLAlchemy, but I suspect there may still be a bug in the way the relationship between Dashboards (and any other model that may be a target of tags, for that matter) and Tags is defined:

tags = relationship(
        "Tag",
        overlaps="objects,tag,tags,tags",
        secondary="tagged_object",
        primaryjoin="and_(Dashboard.id == TaggedObject.object_id)",
        secondaryjoin="and_(TaggedObject.tag_id == Tag.id, "
        "TaggedObject.object_type == 'dashboard')",
    )

I have spun up a local environment for testing and included the fixes from this ticket but ran into the same StaleDataError issue because of this setup:

> select * from tagged_objects;

tag_id, object_id, object_type, other columns...
1, 12, dashboard, ...
2, 12, dashboard, ...
3, 12, chart, ...
...

Note that there is a tagged object entry pointing to a Dashboard with ID 12 and a separate tagged object entry pointing to a completely unrelated chart that happens to have the same ID as the dashboard.

Deleting the dashboard with ID 12 then triggers the StaleDataError (it expects to delete 2 rows, but 3 are matched).

Unfortunately I can't think of an easy fix for this but fwiw this is how you can reproduce it:

MRE
from sqlalchemy import create_engine, UniqueConstraint
from sqlalchemy.orm import sessionmaker
from sqlalchemy import Column, Integer, String, ForeignKey
from sqlalchemy.orm import relationship
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class A1(Base): # Dashboard
    __tablename__ = 'a1'
    id = Column(Integer, primary_key=True)
    cs = relationship(
        "C",
        overlaps="bs,c,cs,cs",
        secondary="b",
        primaryjoin="and_(A1.id == B.object_id)",
        secondaryjoin="and_(B.c_id == C.id, B.object_type == 'a1')",
    )


class A2(Base): # Slice
    __tablename__ = 'a2'
    id = Column(Integer, primary_key=True)
    cs = relationship(
        "C",
        overlaps="bs,c,cs,cs",
        secondary="b",
        primaryjoin="and_(A2.id == B.object_id)",
        secondaryjoin="and_(B.c_id == C.id, B.object_type == 'a2')",
    )


class A3(Base): # SqlaTable
    __tablename__ = 'a3'
    id = Column(Integer, primary_key=True)
    cs = relationship(
        "C",
        overlaps="bs,c,cs,cs",
        secondary="b",
        primaryjoin="and_(A3.id == B.object_id)",
        secondaryjoin="and_(B.c_id == C.id, B.object_type == 'a3')",
    )


class B(Base): # TaggedObjects
    __tablename__ = 'b'
    id = Column(Integer, primary_key=True)
    object_id = Column(
        Integer,
        ForeignKey("a1.id"),
        ForeignKey("a2.id"),
        ForeignKey("a3.id"),
    )
    object_type = Column(String)
    c_id = Column(Integer, ForeignKey('c.id'))
    c = relationship("C", back_populates="bs", overlaps="cs")
    __table_args__ = (
        UniqueConstraint(
            "c_id", "object_id", "object_type", name="b_uniq_constraint"
        ),
    )


class C(Base): # Tags
    __tablename__ = 'c'
    id = Column(Integer, primary_key=True)
    bs = relationship(
        "B", back_populates="c", overlaps="bs,cs"
    )


engine = create_engine('sqlite:///sqlalchemy_example.db')

Base.metadata.drop_all(engine)
Base.metadata.create_all(engine)
Base.metadata.bind = engine

DBSession = sessionmaker(bind=engine)
session = DBSession()

# Insert some data into a1, a2, a3, b, c
a1 = A1(id=1)
session.add(a1)

a2 = A2(id=2)
session.add(a2)

a3 = A3(id=3)
session.add(a3)

a3_1 = A3(id=1)
session.add(a3_1)

c = C(id=1)
session.add(c)

b1 = B(id=1, object_id=a1.id, object_type='a1', c_id=c.id)
session.add(b1)

b2 = B(id=2, object_id=a2.id, object_type='a2', c_id=c.id)
session.add(b2)

b3 = B(id=3, object_id=a3.id, object_type='a3', c_id=c.id)
session.add(b3)

b7 = B(id=7, object_id=a3_1.id, object_type='a3', c_id=c.id)
session.add(b7)

session.commit()
session.expire_all()
session.delete(a1)

# Commit the transaction
session.commit()

@mistercrunch (Member Author)

Are you sure you're on the latest version of Superset, and that you've upgraded to the latest database migration? There's a version table in the database, with a single row managed by Alembic, that you can query, and/or some commands you can run to check your database migration version.

I'm asking because it should be impossible to be on that database version and still have the duplicates you're mentioning. The constraint we put in place should prevent the state you're in, and that error, from happening.
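For reference, the Alembic bookkeeping mentioned above can be checked directly; the table is named alembic_version and holds a single row with the current revision id (and, via Flask-Migrate, `superset db current` should report the same revision):

```sql
-- single-row table maintained by Alembic; the value should match the head
-- revision id under superset/migrations/versions/
SELECT version_num FROM alembic_version;
```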

@mfmo92 commented Feb 23, 2024

We are maintaining a fork based on 3.1.0, so running superset db upgrade indeed won't include the migration that is relevant here. However, I have manually added the changes related to this issue as well as the unique constraint on (tag_id, object_id, object_type).

In my understanding that unique constraint does not stop the situation that I mentioned earlier, as the combination of (tag_id, object_id, object_type) is still different in every row:

> select * from tagged_objects;

tag_id, object_id, object_type, other columns...
1, 12, dashboard, ...
2, 12, dashboard, ...
3, 12, chart, ...

Of course, it's possible that there's a relevant code change/migration added through another ticket that I'm not aware of yet and that isn't included in 3.1 (but is in a later version and/or master).

Thank you for the quick response!

@jbat mentioned this pull request Mar 11, 2024
sfirke pushed a commit to sfirke/superset that referenced this pull request Mar 22, 2024
@mistercrunch added the `🏷️ bot` label (used by `supersetbot` to track which PRs were auto-tagged with release labels) and the `🚢 4.0.0` label on Apr 17, 2024
@kamelhouchat

Thanks for the clarifications; I'm encountering the same issue.
It appears the fixes have been merged; however, version 4.0.0 of the docker-compose-non-dev deployment (tag version: 4.0.0) still suffers from the same problem (unable to delete dashboards). Is it potentially because the Docker image is not up to date?

@zijiwork commented May 27, 2024

@mistercrunch I also encountered this problem. It seems that 4.0.0 cannot delete the chart normally.
[screenshot]

@mistercrunch (Member Author) commented May 28, 2024

@zijiwork can you share your stacktrace? From my understanding, the [bad] state that I was in prior to this particular PR and the error I had should not be possible after my fix.

My fix not only removed the duplicates for the particular combination of fields, but it prevented the same issue from occurring again with a unique constraint on the database.

Now I'm guessing that your error, while similar, must be slightly different, especially since I see you seem to have the uix_tagged_object unique constraint in place... Though notice that in my particular case I had dups for that particular combination of tag_id, object_id, object_type ^^^

I heard @Vitor-Avila might have seen something similar too (?). Chime in if you can.

@zijiwork commented May 29, 2024

@mistercrunch stacktrace logs

DELETE statement on table 'tagged_object' expected to delete 1 row(s); Only 2 were matched.
Traceback (most recent call last):
  File "/app/superset/daos/base.py", line 217, in delete
    db.session.commit()
  File "<string>", line 2, in commit
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 1435, in commit
    self._transaction.commit(_to_root=self.future)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 829, in commit
    self._prepare_impl()
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 808, in _prepare_impl
    self.session.flush()
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 3367, in flush
    self._flush(objects)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 3507, in _flush
    transaction.rollback(_capture_exception=True)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
    compat.raise_(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 207, in raise_
    raise exception
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 3467, in _flush
    flush_context.execute()
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/unitofwork.py", line 456, in execute
    rec.execute(self)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/unitofwork.py", line 577, in execute
    self.dependency_processor.process_deletes(uow, states)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/dependency.py", line 1110, in process_deletes
    self._run_crud(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/dependency.py", line 1207, in _run_crud
    raise exc.StaleDataError(
sqlalchemy.orm.exc.StaleDataError: DELETE statement on table 'tagged_object' expected to delete 1 row(s); Only 2 were matched.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/superset/commands/chart/delete.py", line 49, in run
    ChartDAO.delete(self._models)
  File "/app/superset/daos/base.py", line 220, in delete
    raise DAODeleteFailedError(exception=ex) from ex
superset.daos.exceptions.DAODeleteFailedError: Delete failed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/superset/charts/api.py", line 460, in delete
    DeleteChartCommand([pk]).run()
  File "/app/superset/commands/chart/delete.py", line 52, in run
    raise ChartDeleteFailedError() from ex
superset.commands.chart.exceptions.ChartDeleteFailedError: Charts could not be deleted.

@Vitor-Avila (Contributor)

@mistercrunch we got a few reports of users facing errors when deleting objects on our end as well (after the fix was released to production). Here's an example stack trace:

Traceback (most recent call last):
  File "/home/superset/preset/superset/superset/daos/base.py", line 217, in delete
    db.session.commit()
  File "<string>", line 2, in commit
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 1435, in commit
    self._transaction.commit(_to_root=self.future)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 829, in commit
    self._prepare_impl()
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 808, in _prepare_impl
    self.session.flush()
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 3367, in flush
    self._flush(objects)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 3507, in _flush
    transaction.rollback(_capture_exception=True)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
    compat.raise_(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 207, in raise_
    raise exception
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 3467, in _flush
    flush_context.execute()
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/unitofwork.py", line 456, in execute
    rec.execute(self)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/unitofwork.py", line 577, in execute
    self.dependency_processor.process_deletes(uow, states)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/dependency.py", line 1110, in process_deletes
    self._run_crud(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/dependency.py", line 1207, in _run_crud
    raise exc.StaleDataError(
sqlalchemy.orm.exc.StaleDataError: DELETE statement on table 'tagged_object' expected to delete 1 row(s); Only 2 were matched.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/superset/preset/superset/superset/commands/dashboard/delete.py", line 49, in run
    DashboardDAO.delete(self._models)
  File "/home/superset/preset/superset/superset/daos/base.py", line 220, in delete
    raise DAODeleteFailedError(exception=ex) from ex
superset.daos.exceptions.DAODeleteFailedError: Delete failed

I managed to reproduce with a certain dashboard on my end as well, but it's inconsistent (I can't reliably reproduce with every new dashboard).

@mistercrunch (Member Author) commented May 29, 2024

Thinking about this more, I understand the issue better now, and it's not the same issue as the one I originally tackled in this PR, though it raises the same exception...

Looking simply at this:

class TaggedObject(Model, AuditMixinNullable):
    """An association between an object and a tag."""

    __tablename__ = "tagged_object"
    id = Column(Integer, primary_key=True)
    tag_id = Column(Integer, ForeignKey("tag.id"))
    object_id = Column(
        Integer,
        ForeignKey("dashboards.id"),
        ForeignKey("slices.id"),
        ForeignKey("saved_query.id"),
    )
    object_type = Column(Enum(ObjectType))

    tag = relationship("Tag", back_populates="objects", overlaps="tags")

there's nothing here that clarifies that the nature of the relationship depends on object_type. For SQLAlchemy to properly cascade deletes from Dashboard (or any other object type), it has to apply a condition on object_type, which I'm guessing it doesn't. Really, it can't, given that semantically we never tell it so in how we defined the model.

In terms of solutions, there's either:

  • some SQLAlchemy kung fu to model/clarify the nature of the relationship so that cascade deletes work as expected; this requires digging a bit deeper into SQLAlchemy semantics and seeing what's possible there
  • improving the DAOs/commands to delete tags proactively BEFORE deleting the objects, which makes the cascade delete effectively a no-op

I think option #1 seems best AFAIC, if possible.
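The second option can be sketched as follows. This is a hypothetical illustration (simplified table, illustrative helper name), not Superset's actual DAO code; the key point is that the explicit DELETE filters on both object_id and object_type, so rows for an unrelated object that happens to share the same id are left alone:

```python
import sqlalchemy as sa

engine = sa.create_engine("sqlite://")
metadata = sa.MetaData()
tagged_object = sa.Table(
    "tagged_object", metadata,
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("tag_id", sa.Integer),
    sa.Column("object_id", sa.Integer),
    sa.Column("object_type", sa.String),
)
metadata.create_all(engine)

def delete_object_tags(conn, object_id, object_type):
    """Drop every tag association for one object, filtering on BOTH columns."""
    result = conn.execute(
        tagged_object.delete().where(
            sa.and_(
                tagged_object.c.object_id == object_id,
                tagged_object.c.object_type == object_type,
            )
        )
    )
    return result.rowcount

with engine.begin() as conn:
    # a dashboard and an unrelated chart that happen to share object_id 12
    conn.execute(tagged_object.insert(), [
        {"id": 1, "tag_id": 1, "object_id": 12, "object_type": "dashboard"},
        {"id": 2, "tag_id": 3, "object_id": 12, "object_type": "chart"},
    ])
    deleted = delete_object_tags(conn, 12, "dashboard")
    left = conn.execute(sa.select(tagged_object.c.object_type)).scalars().all()
```

Running this before the ORM-level delete of the dashboard would leave nothing for the cascade to over-match.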

@mistercrunch (Member Author)

actually, never mind: the relationship is defined well on the other side:

class Dashboard(AuditMixinNullable, ImportExportMixin, Model):
    {{ ... }}
    tags = relationship(
        "Tag",
        overlaps="objects,tag,tags,tags",
        secondary="tagged_object",
        primaryjoin=f"and_(Dashboard.id == TaggedObject.object_id)",
        secondaryjoin="and_(TaggedObject.tag_id == Tag.id, TaggedObject.object_type == 'dashboard')",
    )

@mistercrunch (Member Author)

I wasn't able to recreate the issue (tried to associate a tag to all the objects I could, and then tried deleting dashboards after that, hoping to hit the same issue).

BUT here's one idea I'd recommend trying: promoting the object_type filter into the primaryjoin, as in:

    tags = relationship(
        "Tag",
        overlaps="objects,tag,tags,tags",
        secondary="tagged_object",
        primaryjoin=f"and_(Dashboard.id == TaggedObject.object_id, TaggedObject.object_type == 'dashboard')",
        secondaryjoin="and_(TaggedObject.tag_id == Tag.id)",
    )

I think it's worth giving it a shot in your environment.

mistercrunch added a commit that referenced this pull request May 29, 2024
I wasn't able to reproduce the issues in #26654, but wanted to submit this diff here as something to try. I'm not likely to push this through, but thought a PR would be a good way to suggest a fix and centralize the discussion.
@mistercrunch (Member Author)

Can someone with the issue try this DRAFT solution? #28769

@zijiwork

> Can someone with the issue try this DRAFT solution? #28769

I tried to use the DRAFT solution, but there still seems to be a problem

2024-05-31 14:25:43,712:ERROR:superset.charts.api:Error deleting model ChartRestApi: Charts could not be deleted.
Traceback (most recent call last):
  File "/app/superset/daos/base.py", line 217, in delete
    db.session.commit()
  File "<string>", line 2, in commit
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 1435, in commit
    self._transaction.commit(_to_root=self.future)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 829, in commit
    self._prepare_impl()
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 808, in _prepare_impl
    self.session.flush()
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 3367, in flush
    self._flush(objects)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 3507, in _flush
    transaction.rollback(_capture_exception=True)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
    compat.raise_(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 207, in raise_
    raise exception
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 3467, in _flush
    flush_context.execute()
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/unitofwork.py", line 456, in execute
    rec.execute(self)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/unitofwork.py", line 577, in execute
    self.dependency_processor.process_deletes(uow, states)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/dependency.py", line 1110, in process_deletes
    self._run_crud(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/dependency.py", line 1207, in _run_crud
    raise exc.StaleDataError(
sqlalchemy.orm.exc.StaleDataError: DELETE statement on table 'tagged_object' expected to delete 1 row(s); Only 2 were matched.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/superset/commands/chart/delete.py", line 49, in run
    ChartDAO.delete(self._models)
  File "/app/superset/daos/base.py", line 220, in delete
    raise DAODeleteFailedError(exception=ex) from ex
superset.daos.exceptions.DAODeleteFailedError: Delete failed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/superset/charts/api.py", line 460, in delete
    DeleteChartCommand([pk]).run()
  File "/app/superset/commands/chart/delete.py", line 52, in run
    raise ChartDeleteFailedError() from ex
superset.commands.chart.exceptions.ChartDeleteFailedError: Charts could not be deleted.
Loaded your LOCAL configuration at [/app/docker/pythonpath_dev/superset_config.py]

@mistercrunch (Member Author)

Can you run a SELECT * FROM tagged_object WHERE object_id = {the_dashboard_id_youre_trying_to_delete}? What I'm guessing is that there's an issue around handling objects of different types that share the same id, as in the DELETE statement that's issued somehow fails to add the object_type predicate.

The other option I mentioned is to not rely on the ORM cascading the delete, and instead handle this proactively in the DAO, here: superset/daos/dashboard.py. I don't think it's as clean a solution, and it may have to be repeated for all "taggable" object types.

@zijiwork commented Jun 3, 2024

@mistercrunch [screenshot]

@Vitor-Avila (Contributor)

hey @zijiwork could you please test the change implemented in #29117 and let us know if it fixes the issue for you? Feel free to backup the tagged_object table if you want to compare/validate results.

@zijiwork

@Vitor-Avila Thank you, it seems that the issue still exists. The record with ID 71 has been deleted

[screenshot]

I tried applying this PR as a fix, but it didn't solve the problem:
[screenshot]

2024-06-12 11:43:50,101:ERROR:superset.commands.chart.delete:DELETE statement on table 'tagged_object' expected to delete 1 row(s); Only 2 were matched.
Traceback (most recent call last):
  File "/app/superset/daos/base.py", line 217, in delete
    db.session.commit()
  File "<string>", line 2, in commit
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 1435, in commit
    self._transaction.commit(_to_root=self.future)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 829, in commit
    self._prepare_impl()
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 808, in _prepare_impl
    self.session.flush()
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 3367, in flush
    self._flush(objects)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 3507, in _flush
    transaction.rollback(_capture_exception=True)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
    compat.raise_(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 207, in raise_
    raise exception
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 3467, in _flush
    flush_context.execute()
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/unitofwork.py", line 456, in execute
    rec.execute(self)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/unitofwork.py", line 577, in execute
    self.dependency_processor.process_deletes(uow, states)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/dependency.py", line 1110, in process_deletes
    self._run_crud(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/dependency.py", line 1207, in _run_crud
    raise exc.StaleDataError(
sqlalchemy.orm.exc.StaleDataError: DELETE statement on table 'tagged_object' expected to delete 1 row(s); Only 2 were matched.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/superset/commands/chart/delete.py", line 49, in run
    ChartDAO.delete(self._models)
  File "/app/superset/daos/base.py", line 220, in delete
    raise DAODeleteFailedError(exception=ex) from ex
superset.daos.exceptions.DAODeleteFailedError: Delete failed
Error deleting model ChartRestApi: Charts could not be deleted.
Traceback (most recent call last):
  File "/app/superset/daos/base.py", line 217, in delete
    db.session.commit()
  File "<string>", line 2, in commit
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 1435, in commit
    self._transaction.commit(_to_root=self.future)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 829, in commit
    self._prepare_impl()
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 808, in _prepare_impl
    self.session.flush()
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 3367, in flush
    self._flush(objects)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 3507, in _flush
    transaction.rollback(_capture_exception=True)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
    compat.raise_(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 207, in raise_
    raise exception
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 3467, in _flush
    flush_context.execute()
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/unitofwork.py", line 456, in execute
    rec.execute(self)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/unitofwork.py", line 577, in execute
    self.dependency_processor.process_deletes(uow, states)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/dependency.py", line 1110, in process_deletes
    self._run_crud(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/dependency.py", line 1207, in _run_crud
    raise exc.StaleDataError(
sqlalchemy.orm.exc.StaleDataError: DELETE statement on table 'tagged_object' expected to delete 1 row(s); Only 2 were matched.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/superset/commands/chart/delete.py", line 49, in run
    ChartDAO.delete(self._models)
  File "/app/superset/daos/base.py", line 220, in delete
    raise DAODeleteFailedError(exception=ex) from ex
superset.daos.exceptions.DAODeleteFailedError: Delete failed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/superset/charts/api.py", line 460, in delete
    DeleteChartCommand([pk]).run()
  File "/app/superset/commands/chart/delete.py", line 52, in run
    raise ChartDeleteFailedError() from ex
superset.commands.chart.exceptions.ChartDeleteFailedError: Charts could not be deleted.
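The log above shows the same failure mode as the original report: the ORM expects to delete one tagged_object row but matches two, because duplicates exist. The underlying repair is removing the duplicate (tag_id, object_id, object_type) rows. Below is a sketch of that dedup against an in-memory SQLite table mirroring the shape of tagged_object; it is illustrative, not the migration shipped in the linked PRs, and on a real Superset metadata database you should back up the table first, as suggested above. The cleanup keeps the lowest id in each duplicate group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE tagged_object (
        id INTEGER PRIMARY KEY,
        tag_id INTEGER,
        object_id INTEGER,
        object_type TEXT)"""
)
# Three rows, two of which are duplicates of each other.
conn.executemany(
    "INSERT INTO tagged_object (tag_id, object_id, object_type) VALUES (?, ?, ?)",
    [(3, 11, "dashboard"), (4, 11, "dashboard"), (4, 11, "dashboard")],
)

# Keep only the lowest id per (tag_id, object_id, object_type) group.
conn.execute(
    """DELETE FROM tagged_object
       WHERE id NOT IN (
           SELECT MIN(id) FROM tagged_object
           GROUP BY tag_id, object_id, object_type
       )"""
)
count = conn.execute("SELECT COUNT(*) FROM tagged_object").fetchone()[0]
print(count)
```

Once the duplicates are gone, the unique constraint added by this PR prevents them from reappearing.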

@Vitor-Avila
Contributor

thanks for the update, @zijiwork! Could you please test again using this branch? #29229

Feel free to backup the tagged_object table to compare results before/after the deletion (in case it works).

Thank you so much for your patience and collaboration here!

@zijiwork

@Vitor-Avila Thanks for your PR #29229; it solved the problem for me.

@Vitor-Avila
Contributor

great news, @zijiwork! I'm glad it worked 🙌

@mistercrunch
Member Author

dude

Labels: 🏷️ bot, risk:db-migration, size/L, 🚢 4.0.0
8 participants