chore(dao/command): Add transaction decorator to try to enforce "unit of work" #24969

Merged

Conversation

@john-bodley john-bodley commented Aug 11, 2023

SUMMARY

This is a PR I've had on the back burner for many months, but have struggled with on numerous occasions, often in part due to the flaky/delicate tests (and their associated frameworks). The initial desire was to fulfill the approach outlined in [SIP-99B] Proposal for (re)defining a "unit of work", but alas I failed, in part due to the challenges of untangling Superset logic which is inherently not conducive to the construct that a command should serve as a "unit of work".

Why is that? It's complicated, but asynchronous logic does not help: under the READ COMMITTED isolation level, a Celery task running within the confines of another command can only read state that has already been committed. Issues like this could likely be overcome by having two commands (prepare and execute) as opposed to a single execute command.
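
The visibility problem described above can be sketched with two connections to the same database. This is only an illustration under stated assumptions: it uses the standard-library `sqlite3` module rather than Superset's metadata database, SQLite's isolation model is not READ COMMITTED, and the `queries` table and the `command`/`worker` names are hypothetical. It only shows that a second connection (standing in for the Celery task) cannot see a row the first connection has written but not yet committed.

```python
import os
import sqlite3
import tempfile


def visible_rows(conn: sqlite3.Connection) -> int:
    """Count the rows this connection can currently see."""
    return conn.execute("SELECT COUNT(*) FROM queries").fetchall()[0][0]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")
    command = sqlite3.connect(path)  # plays the role of the outer command
    worker = sqlite3.connect(path)   # plays the role of the Celery task

    command.execute("CREATE TABLE queries (id INTEGER PRIMARY KEY, status TEXT)")
    command.commit()

    # The command writes a row but has not committed yet.
    command.execute("INSERT INTO queries (status) VALUES ('pending')")
    before_commit = visible_rows(worker)

    # Once the command commits, the worker can read the row.
    command.commit()
    after_commit = visible_rows(worker)

    command.close()
    worker.close()

print(before_commit, after_commit)
```

The worker is expected to see no rows before the commit and one row afterwards, which is why a task dispatched from inside a not-yet-committed command cannot rely on that command's writes.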

The TL;DR is this PR should likely be interpreted as the first phase of SIP-99B. The general framework holds, i.e., DAOs no longer commit and a transaction decorator is used to wrap any command that performs an INSERT, UPDATE, or DELETE.
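
A minimal sketch of that framework, for illustration only. It is not the code from this PR: `FakeSession`, `ChartDAO`, and `create_chart_command` are hypothetical stand-ins so the example runs without Superset or a database, and the decorator here is a simplified version of the idea (commit once on success, roll back and re-raise on failure).

```python
from functools import wraps
from typing import Any, Callable


class FakeSession:
    """Stand-in for a SQLAlchemy session; records calls instead of touching a database."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    def add(self, obj: Any) -> None:
        self.calls.append(f"add:{obj}")

    def commit(self) -> None:
        self.calls.append("commit")

    def rollback(self) -> None:
        self.calls.append("rollback")


session = FakeSession()


def transaction(func: Callable[..., Any]) -> Callable[..., Any]:
    """Commit once if the wrapped command succeeds; roll back and re-raise if it fails."""

    @wraps(func)
    def wrapped(*args: Any, **kwargs: Any) -> Any:
        try:
            result = func(*args, **kwargs)
            session.commit()
            return result
        except Exception:
            session.rollback()
            raise

    return wrapped


class ChartDAO:
    """Hypothetical DAO: stages work on the session but never commits."""

    @staticmethod
    def create(name: str) -> str:
        session.add(name)
        return name


@transaction
def create_chart_command(name: str) -> str:
    """Hypothetical command: the decorator owns the commit/rollback decision."""
    if not name:
        raise ValueError("name is required")
    return ChartDAO.create(name)
```

Calling `create_chart_command("sales")` records an `add` followed by a single `commit`, while `create_chart_command("")` records only a `rollback` and re-raises the `ValueError`. The DAO itself never decides when the unit of work ends.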

Finally, I apologize for the size of the PR. I struggled to downsize the footprint, but once you start enforcing that DAOs should not commit, the number of files touched begins to snowball.

Regrettably my time (for now) working on Apache Superset is likely drawing to a close, so for completeness I thought there was merit in sharing the incremental diff for what I was hoping to achieve in case @michael-s-molina @villebro et al. wanted to carry the baton on.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

CI.

ADDITIONAL INFORMATION

  • Has associated issue: [SIP-99B] Proposal for (re)defining a "unit of work" #25108
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@john-bodley john-bodley changed the title from `John bodley dao nested session` to `chore(dao): Use nested sessions` Aug 11, 2023
    db.session.commit()
except SQLAlchemyError as ex:
    db.session.rollback()
    raise ex
Member Author

We're really inconsistent with our error handling. The BaseDAO.delete method wraps all SQLAlchemyError errors as DAODeleteFailedError whereas here they are left as is.
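
One way to make this consistent is to translate low-level errors into a domain-level error in a single place. The sketch below is illustrative only: `StorageError` and `DeleteFailedError` are local stand-ins for errors such as `SQLAlchemyError` and `DAODeleteFailedError`, and `reraise_as` and `delete_item` are hypothetical names, not part of Superset.

```python
from functools import wraps
from typing import Any, Callable, Type


class StorageError(Exception):
    """Stand-in for a low-level driver/ORM error such as SQLAlchemyError."""


class DeleteFailedError(Exception):
    """Stand-in for a domain-level error such as DAODeleteFailedError."""


def reraise_as(catches: Type[Exception], wrapper: Type[Exception]) -> Callable:
    """Translate `catches` raised by the wrapped function into `wrapper`."""

    def decorate(func: Callable[..., Any]) -> Callable[..., Any]:
        @wraps(func)
        def wrapped(*args: Any, **kwargs: Any) -> Any:
            try:
                return func(*args, **kwargs)
            except catches as ex:
                # Chain the original so the root cause stays in the traceback.
                raise wrapper(str(ex)) from ex

        return wrapped

    return decorate


@reraise_as(StorageError, DeleteFailedError)
def delete_item(item_id: int) -> int:
    """Hypothetical delete that fails for negative identifiers."""
    if item_id < 0:
        raise StorageError(f"cannot delete item {item_id}")
    return item_id
```

With this shape every caller sees the same exception type for a failed delete, and the original error remains available through `__cause__`.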

@john-bodley john-bodley force-pushed the john-bodley--dao-nested-session branch 2 times, most recently from de2c324 to 37f0b24 Compare August 16, 2023 23:54
@john-bodley john-bodley force-pushed the john-bodley--dao-nested-session branch from 37f0b24 to 4e51d4d Compare August 19, 2023 05:53
@john-bodley john-bodley force-pushed the john-bodley--dao-nested-session branch from 4e51d4d to b5c2e81 Compare August 19, 2023 05:55
@john-bodley john-bodley force-pushed the john-bodley--dao-nested-session branch 4 times, most recently from 7b201dd to ef7bcd1 Compare January 30, 2024 01:32
@john-bodley john-bodley changed the title from `chore(dao): Use nested sessions` to `chore(dao/command): Use nested sessions` Jan 30, 2024
@john-bodley john-bodley force-pushed the john-bodley--dao-nested-session branch 11 times, most recently from f03d7b0 to 10dc755 Compare January 31, 2024 00:38
@john-bodley john-bodley force-pushed the john-bodley--dao-nested-session branch from 10dc755 to 092aa45 Compare February 13, 2024 05:08
@github-actions github-actions bot added the api Related to the REST API label Feb 13, 2024
@john-bodley john-bodley force-pushed the john-bodley--dao-nested-session branch 2 times, most recently from 82de210 to 7b651ef Compare February 13, 2024 23:16
@john-bodley john-bodley force-pushed the john-bodley--dao-nested-session branch from 4b409d4 to 01cb5a7 Compare June 27, 2024 01:27
@john-bodley john-bodley changed the title from `chore(dao/command): Use nested sessions` to `chore(dao/command): Add transaction decorator to help enforce "unit of work"` Jun 27, 2024
@john-bodley john-bodley force-pushed the john-bodley--dao-nested-session branch 2 times, most recently from 2dbe1c6 to 5c2be4e Compare June 27, 2024 02:22
@@ -1211,6 +1211,9 @@ def test_chart_data_cache_no_login(self, cache_loader):
"""
Chart data cache API: Test chart data async cache request (no login)
"""
if get_example_database().backend == "presto":
Member Author

This test only seems to fail for the test-postgres-presto workflow.

Member

Unrelated comment: at some point we should replace Presto with Trino, as that's really where the broader community is right now.

@john-bodley john-bodley changed the title from `chore(dao/command): Add transaction decorator to help enforce "unit of work"` to `chore(dao/command): Add transaction decorator to try to enforce "unit of work"` Jun 27, 2024
@john-bodley john-bodley force-pushed the john-bodley--dao-nested-session branch from 5c2be4e to 03a1c64 Compare June 27, 2024 02:36
@john-bodley john-bodley marked this pull request as ready for review June 27, 2024 03:33
@dosubot dosubot bot added the change:backend Requires changing the backend label Jun 27, 2024
@john-bodley john-bodley force-pushed the john-bodley--dao-nested-session branch from 03a1c64 to fcfdf63 Compare June 27, 2024 03:53
Member

@michael-s-molina michael-s-molina left a comment

Thank you for all the hard work here @john-bodley. Even though we were not able to fully implement SIP-99B, this PR is a step in the right direction and removes a lot of unnecessary code. I left some first-pass comments:

tests/integration_tests/databases/api_tests.py (5 resolved threads, outdated)
superset/sqllab/sql_json_executer.py (1 resolved thread)
tests/integration_tests/celery_tests.py (3 resolved threads, outdated)

try:
    result = func(*args, **kwargs)
    db.session.commit()  # pylint: disable=consider-using-transaction
Member

Because we were not able to use begin_nested here, do you see any point where previously we had only a flush that could potentially be rolled back, and now we have a @transaction which will effectively commit? Something like:

Previously:

CommandA:
   try:
      do_something()
      CommandB()
      commit()
   except Exception:
      rollback()

CommandB:
   do_something()
   flush()

Now:

@transaction
CommandA:
   do_something()
   CommandB()
 
@transaction
CommandB:
   do_something()
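
The concern in the pseudocode above can be reproduced with a small simulation. This is a sketch under stated assumptions, not Superset code: `FakeSession` is a toy session whose `pending` list models flushed-but-uncommitted work, and `transaction` is a simplified decorator that always commits on success, with no nesting awareness.

```python
from functools import wraps
from typing import Any, Callable


class FakeSession:
    """Toy session: `pending` holds uncommitted work, `committed` holds durable work."""

    def __init__(self) -> None:
        self.pending: list[str] = []
        self.committed: list[str] = []

    def add(self, item: str) -> None:
        self.pending.append(item)

    def commit(self) -> None:
        self.committed.extend(self.pending)
        self.pending.clear()

    def rollback(self) -> None:
        self.pending.clear()


session = FakeSession()


def transaction(func: Callable[..., Any]) -> Callable[..., Any]:
    """Simplified decorator: commit on success, roll back and re-raise on failure."""

    @wraps(func)
    def wrapped(*args: Any, **kwargs: Any) -> Any:
        try:
            result = func(*args, **kwargs)
            session.commit()
            return result
        except Exception:
            session.rollback()
            raise

    return wrapped


@transaction
def command_b() -> None:
    session.add("b")


@transaction
def command_a(fail_after_b: bool = False) -> None:
    session.add("a")
    command_b()  # the inner decorator commits both "a" and "b" here
    if fail_after_b:
        raise RuntimeError("failure after CommandB returned")


try:
    command_a(fail_after_b=True)
except RuntimeError:
    pass

print(session.committed)  # ['a', 'b']
```

Because the inner command commits when it returns, the outer rollback has nothing left to undo: both pieces of work are already durable even though the outer command failed.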

Member Author

@john-bodley john-bodley Jun 27, 2024

@michael-s-molina given that these @transaction decorators are defined at the "unit of work" level I think we're ok, i.e., I'm not sure where we ever had nested commands where one never committed and the outer explicitly rolled back.

Member

I think for now we should consider commands as the unit of work, meaning we should assume they always commit at the end. If this is not the case we should probably introduce a sort-of notion of a sub-command, that doesn't commit. But let's leave that for a follow-up.

self._properties,
commit=False,
)
database = DatabaseDAO.update(self._model, self._properties)
database.set_sqlalchemy_uri(database.sqlalchemy_uri)
ssh_tunnel = self._handle_ssh_tunnel(database)
self._refresh_catalogs(database, original_database_name, ssh_tunnel)
except SSHTunnelError: # pylint: disable=try-except-raise
Member

I believe in this case you don't need the try/except as there's no event logging or anything else in the except block.

@john-bodley john-bodley force-pushed the john-bodley--dao-nested-session branch from fcfdf63 to 06ae67c Compare June 27, 2024 16:33
@apache apache deleted a comment from michael-s-molina Jun 27, 2024
@apache apache deleted a comment from michael-s-molina Jun 27, 2024
Member

@villebro villebro left a comment

This is a HUGE step in the right direction, and finally introduces a coherent pattern for dealing with complex ORM handling during the request lifecycle. Given that this fundamentally changes how the backend operates, I fear there may be significant risk for regressions here. However, those should be easy to fix now that we have consistent flushing, committing and rolling back. If nothing else, these potential regressions will highlight critical gaps in our test coverage. Therefore, I feel the benefits of this change far outweigh the intermediate regression risks it introduces.

@@ -240,6 +240,7 @@ ignore_basepython_conflict = true
commands =
superset db upgrade
superset init
superset load-test-users
Member

Random observation that's not directly related to this PR: I've always felt it's weird that the core application has functionality for loading test users. I feel at some point we should break that out into the test suite.

@@ -29,7 +31,6 @@ def cleanup_permissions() -> None:
for pvm in pvms:
pvms_dict[(pvm.permission, pvm.view_menu)].append(pvm)
duplicates = [v for v in pvms_dict.values() if len(v) > 1]
len(duplicates)
Member

What on earth was this? 🤔


@@ -592,7 +592,6 @@ def test_import_v1_dashboard_multiple(self, mock_g):
}
command = v1.ImportDashboardsCommand(contents, overwrite=True)
command.run()
command.run()
Member

nice

if entry is None or entry.is_expired():
return None

return JsonKeyValueCodec().decode(entry.value)


def _get_other_session() -> Session:
Member

@villebro villebro Jun 28, 2024

I think I disagree with this change (I may not have been able to accurately communicate why this is needed). But no worries, I will address this in #29344 after this PR lands and try to document the logic better.

Member

@michael-s-molina michael-s-molina left a comment

Thank you @john-bodley for addressing the comments. I agree with @villebro that the benefits greatly outweigh the risks here.

@john-bodley john-bodley merged commit 8fb8199 into apache:master Jun 28, 2024
37 checks passed
Labels
api Related to the REST API change:backend Requires changing the backend size/XXL

3 participants