Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(dao): Add explicit ON DELETE CASCADE when deleting datasets #24488

Merged

Conversation

john-bodley
Copy link
Member

@john-bodley john-bodley commented Jun 22, 2023

SUMMARY

Per SIP-92 Proposal for restructuring the Python code base #24331 organized all the DAOs to be housed within a shared folder—the result of which highlighted numerous inconsistencies, repetition, and inefficiencies.

Personally I really struggle at times with the SQLAlchemy model especially with regards to the relationship logic and why the cascade paradigm only impacts the SQLAlchemy ORM, i.e., isn't materialized to the actual table schema. The TL;DR is there always seems to be multiple ways to skin the cat which just adds to the confusion.

Something clearly is quite right with our model as it's often the case when deleting a dataset we first delete the owners, columns, and metrics to avoid foreign key constrain violations. I gather this is because of our bulk delete logic (see this discussion for more detail) bypasses the SQLAlchemy ORM or there are other gremlins at play—there is this issue reported on Stack Overflow regarding cascading for association tables.

I recently added the DAO-Style-Guidelines-and-Best-Practices wiki page, and I hope there is general agreement in item 5 which mentions the "shift left" mentality where the data model should persist in the database schema than rely on Python logic per the SQLAlchemy ORM. The beauty of this approach is it simplifies the codebase and mental model and ensures that the correct logic is adhered to if/when executing SQL outside of the ORM (be that using the SQLAlchemy bulk workflows, executing SQL against the DB-API, using the MySQL/PostgreSQL CLI, etc.).

The logic I employed follows this SQLAlchemy documentation. Per their suggestion I opted for passive_deletes=True as we don't require the ORM to load the columns, metrics, and owners (if lazy loaded) when deleting the dataset.

The TL;DR is prior to adding the migration CI was failing so it seems like this approach feels somewhat correct.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

CI.

ADDITIONAL INFORMATION

  • Has associated issue: [SIP-92] Proposal for restructuring the Python code base #20630
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided: Takes ~ 1 minute for Airbnb's instance w/o load.
  • Introduces new feature or API
  • Removes existing feature or API

@john-bodley john-bodley force-pushed the john-bodley--dao-tables-cascade-delete branch from ab461d7 to 81224e0 Compare June 22, 2023 21:46
@john-bodley john-bodley requested a review from a team as a code owner June 22, 2023 21:46
@pull-request-size pull-request-size bot added size/L and removed size/M labels Jun 22, 2023
@codecov
Copy link

codecov bot commented Jun 22, 2023

Codecov Report

Merging #24488 (20dad20) into master (2a4ef5c) will increase coverage by 0.01%.
The diff coverage is 80.23%.

❗ Current head 20dad20 differs from pull request most recent head bc60158. Consider uploading reports for the commit bc60158 to get more accurate results

@@            Coverage Diff             @@
##           master   #24488      +/-   ##
==========================================
+ Coverage   69.03%   69.04%   +0.01%     
==========================================
  Files        1901     1901              
  Lines       74006    74020      +14     
  Branches     8115     8119       +4     
==========================================
+ Hits        51088    51107      +19     
+ Misses      20807    20805       -2     
+ Partials     2111     2108       -3     
Flag Coverage Δ
hive 53.91% <40.90%> (+0.02%) ⬆️
mysql 79.40% <81.81%> (-0.02%) ⬇️
postgres 79.49% <81.81%> (-0.02%) ⬇️
presto 53.81% <40.90%> (+0.02%) ⬆️
python 83.49% <100.00%> (+<0.01%) ⬆️
sqlite 78.04% <59.09%> (+0.04%) ⬆️
unit 54.69% <59.09%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
superset-frontend/src/SqlLab/App.jsx 0.00% <0.00%> (ø)
...d/src/SqlLab/components/AceEditorWrapper/index.tsx 67.07% <0.00%> (ø)
...frontend/src/SqlLab/components/SqlEditor/index.jsx 59.88% <ø> (ø)
...src/SqlLab/utils/reduxStateToLocalStorageHelper.js 80.00% <0.00%> (-20.00%) ⬇️
...perset-frontend/src/hooks/apiResources/queryApi.ts 100.00% <ø> (ø)
...d/src/SqlLab/components/SqlEditorLeftBar/index.tsx 62.12% <50.00%> (ø)
...ntend/src/SqlLab/components/TableElement/index.tsx 81.92% <73.07%> (+8.01%) ⬆️
superset-frontend/src/SqlLab/actions/sqlLab.js 69.62% <93.75%> (+0.45%) ⬆️
...et-frontend/src/SqlLab/reducers/getInitialState.js 91.66% <100.00%> (+1.04%) ⬆️
superset-frontend/src/hooks/apiResources/tables.ts 73.80% <100.00%> (+2.75%) ⬆️
... and 5 more

... and 1 file with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@john-bodley john-bodley force-pushed the john-bodley--dao-tables-cascade-delete branch from 81224e0 to d8b9191 Compare June 22, 2023 22:13
@@ -364,17 +364,6 @@ def create_metric(
@staticmethod
def bulk_delete(models: Optional[list[SqlaTable]], commit: bool = True) -> None:
item_ids = [model.id for model in models] if models else []
# bulk delete, first delete related data
Copy link
Member Author

@john-bodley john-bodley Jun 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is no longer needed as the database will handle said logic.

try:
# Even though SQLAlchemy should in theory delete rows from the association
Copy link
Member Author

@john-bodley john-bodley Jun 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@betodealmeida this should address the problem you were trying to circumvent in #23414.

@john-bodley john-bodley force-pushed the john-bodley--dao-tables-cascade-delete branch from d8b9191 to d1b2033 Compare June 22, 2023 22:34
@@ -189,11 +189,6 @@ def test_extra_cache_keys(self, flask_g):
self.assertTrue(table3.has_extra_cache_key_calls(query_obj))
assert extra_cache_keys == ["abc"]

# Cleanup
Copy link
Member Author

@john-bodley john-bodley Jun 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI was failing for PostgreSQL (though not MySQL or SQLite—which is a little perplexing) with SQLAlchemy error which would indicate that we're trying to delete objects which previously weren't committed to the database. Given how these tables are constructed,

table3 = SqlaTable(
    table_name="test_has_no_extra_cache_keys_table",
    sql=query,
    database=get_example_database(),
)

and there's no clear add/commit call it seems that SQLAlchemy's viewpoint is right, i.e., there's no cleanup required. If I'm right then this is a win for SIP-92. I suspect throughout the code and tests we're doing numerous database operations inefficiently with superfluous commits, not rolling back test state, etc.

@john-bodley john-bodley force-pushed the john-bodley--dao-tables-cascade-delete branch 2 times, most recently from 5600f4e to 52afb2c Compare June 22, 2023 23:51
try:
DatasetDAO.bulk_delete(self._models)
for model in self._models:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is logic taken #24466. Previously a test was failing because the relationship no longer existed in Python after the commit. See here for details. The issue was that:

DatasetDAO.bulk_delete(self._models)

includes a db.session.commit() in which all the attributes have expired. I'm not really sure why this worked before, but the TL;DR is this logic has been moved inside the DAO and is executed prior to the commit.

@john-bodley john-bodley force-pushed the john-bodley--dao-tables-cascade-delete branch 3 times, most recently from dd031bd to d138de0 Compare June 23, 2023 01:34
if branch:
# 'branch' refers to a sub-connection of a connection,
# we don't want to bother pinging on these.
return

if connection.dialect.name == "sqlite":
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See here for why this is needed. I do wonder if we should drop support for SQLite given it's not supported in production. We already skip a bunch of tests, i.e.,

if backend() == "sqlite":
    return

possibly because of how the foreign key constraints have no impact on the underlying table.

@john-bodley john-bodley force-pushed the john-bodley--dao-tables-cascade-delete branch 3 times, most recently from 09a83c8 to ab9e94a Compare June 23, 2023 02:23
@@ -849,6 +851,19 @@ def ping_connection(connection: Connection, branch: bool) -> None:
# restore 'close with result'
connection.should_close_with_result = save_should_close_with_result

if some_engine.dialect.name == "sqlite":
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this is why a number of tests currently have,

if backend() == "sqlite":
    return

@john-bodley john-bodley force-pushed the john-bodley--dao-tables-cascade-delete branch 2 times, most recently from 986ac75 to 355c177 Compare June 23, 2023 03:31
@john-bodley john-bodley force-pushed the john-bodley--dao-tables-cascade-delete branch from 355c177 to bc60158 Compare June 23, 2023 03:55
@@ -430,7 +425,7 @@ def test_multiple_sql_statements_raises_exception(self):
}

table = SqlaTable(
table_name="test_has_extra_cache_keys_table",
table_name="test_multiple_sql_statements",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment above. SQLite was complaining that said table already existed when I removed the "cleanup", though neither MySQL nor PostgreSQL did. I'm not entirely sure what's going on here, but it does seem like the table name was copypasta from the other test and thus the easiest solution was to rename the table in accordance with the test.

I sense with this refactoring work it's a case of two steps forwards, one step back at time.

if some_engine.dialect.name == "sqlite":

@event.listens_for(some_engine, "connect")
def set_sqlite_pragma( # pylint: disable=unused-argument
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully (if approved) this will remove the need for this logic in a number of tests.

@craig-rueda
Copy link
Member

Totally agree with this philosophy, except for situations where you really want to leave FK data hanging. IMO, the DB should keep things clean wherever possible and should enforce referential integrity (kinda the point of FK's in the first place :D ). Delegating to the DB to deal with cleanup keeps app-tier logic cleaner and less buggy as it's less concerned with cascades and what not.

mapper = next(iter(cls.model_cls.registry.mappers)) # type: ignore

for model in models:
security_manager.dataset_after_delete(mapper, connection, model)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, we wouldn't reference the SM from the DAO layer as it's more of a mid-tier concern. This belongs in the CMD.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@craig-rueda could you help educate me as to why this is? Granted that the DAO is only abstraction from the database (and not the security manager which in turns interfaces with the database), but isn't this the interface we expect commands, APIs, etc. to interface with? Note this logic would be handled by the after_delete event listener if we were using the ORM, as opposed to bulk queries, to delete the assets.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, DAO's should be super light, only concerned with interfacing with the DB. They shouldn't have dependencies upwards (to the mid-tier), which can lead to circular dependencies as basically all mid-tier logic will at some point need a DAO. This can obviously be worked around using some late-binding technique.

From a semantic POV, the command is just asking to "please delete these records". Calling the SM to do additional work here is a side effect, which ultimately makes this code less reusable for lower level operations.

Copy link
Member Author

@john-bodley john-bodley Jun 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@craig-rueda I generally agree, but how does this differ from the DAO asking "please delete these records" which triggers the SQLAlchemy ORM to handle post deletion events (per the after_delete event handler) which in turn invokes the security manager?

Note we typically use these SQLAlchemy events because of short comings with either our data model and/or constructs which can't be modeled by our database schema.

I guess what I’m saying is I would expect the DAO to behave in the same way whether it was deleting a single entity or bulk deleting multiple entities and explicitly calling the affer_delete callback (which is only invoked for the former) ensures that the behavior is consistent.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, either way. It's more of a stylistic thing IMO. Keeping the DAO layer as decoupled as possible has worked well for me in the past, hence my comments :).

"fk": "fk_%(table_name)s_%(column_0_name)s_%(referred_table_name)s",
}

for table in ("sql_metrics", "table_columns"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@john-bodley john-bodley merged commit 75543af into apache:master Jun 28, 2023
33 checks passed
@john-bodley john-bodley deleted the john-bodley--dao-tables-cascade-delete branch June 28, 2023 23:08
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 3.0.0 labels Mar 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/L 🚢 3.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants