chore: Set isolation level to READ COMMITTED for testing et al. #28628

Merged

Conversation

john-bodley
Member

@john-bodley john-bodley commented May 21, 2024

SUMMARY

The default isolation level for PostgreSQL is READ COMMITTED, whereas for MySQL it is REPEATABLE READ. Ideally these would be the same, for consistency. READ COMMITTED is likely the better choice given the async nature of Superset (especially SQL Lab), where the backend polls to determine whether a query has been cancelled from another Flask-SQLAlchemy session. Under REPEATABLE READ, a polling transaction can keep reading its original snapshot and miss that update.

This PR:

  1. Adds a comment in config.py encouraging institutions to set the isolation level explicitly. This is hard to impose because the default config targets SQLite, which doesn't support READ COMMITTED.
  2. Updates the test configs to ensure that the isolation level is READ COMMITTED for both MySQL and PostgreSQL.
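
For a deployment whose metadata database is MySQL or PostgreSQL, the explicit setting that the config.py comment encourages might look like the fragment below. This is a minimal sketch, assuming the override lives in a `superset_config.py` file and that no other engine options are needed; it is not copied from this PR's diff.

```python
# superset_config.py (sketch; assumes a MySQL or PostgreSQL metadata database)

# Flask-SQLAlchemy passes these options to SQLAlchemy's create_engine().
# "READ COMMITTED" is accepted by the MySQL and PostgreSQL dialects,
# but not by SQLite, so do not use this value with a SQLite URI.
SQLALCHEMY_ENGINE_OPTIONS = {
    "isolation_level": "READ COMMITTED",
}
```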

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

CI.

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@john-bodley john-bodley added review:checkpoint Last PR reviewed during the daily review standup review:draft and removed review:checkpoint Last PR reviewed during the daily review standup labels May 21, 2024
@john-bodley john-bodley force-pushed the john-bodley--mysql-read-committed branch from 4076016 to 05a93c6 Compare June 12, 2024 16:04

codecov bot commented Jun 12, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.72%. Comparing base (76d897e) to head (6557097).
Report is 686 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff             @@
##           master   #28628       +/-   ##
===========================================
+ Coverage   60.48%   83.72%   +23.23%     
===========================================
  Files        1931      518     -1413     
  Lines       76236    37523    -38713     
  Branches     8568        0     -8568     
===========================================
- Hits        46114    31416    -14698     
+ Misses      28017     6107    -21910     
+ Partials     2105        0     -2105     
Flag         Coverage Δ
hive         48.94% <100.00%> (-0.22%) ⬇️
javascript   ?
mysql        77.22% <100.00%> (?)
postgres     77.33% <100.00%> (?)
presto       53.57% <100.00%> (-0.24%) ⬇️
python       83.72% <100.00%> (+20.23%) ⬆️
sqlite       76.79% <100.00%> (?)
unit         59.21% <100.00%> (+1.58%) ⬆️

@john-bodley john-bodley changed the title chore: Set isolation level as READ COMMITTED chore: Set isolation level as READ COMMITTED for MySQL Jun 12, 2024
@john-bodley john-bodley changed the title chore: Set isolation level as READ COMMITTED for MySQL chore: Set isolation level as READ COMMITTED Jun 12, 2024
@john-bodley john-bodley changed the title chore: Set isolation level as READ COMMITTED chore: Set isolation level as READ COMMITTED for MySQL Jun 12, 2024
@john-bodley john-bodley changed the title chore: Set isolation level as READ COMMITTED for MySQL chore: Set isolation level Jun 12, 2024
@john-bodley john-bodley force-pushed the john-bodley--mysql-read-committed branch from 05a93c6 to da94f58 Compare June 12, 2024 17:58
@john-bodley john-bodley force-pushed the john-bodley--mysql-read-committed branch from da94f58 to 8639177 Compare June 12, 2024 21:23
@john-bodley john-bodley force-pushed the john-bodley--mysql-read-committed branch from 8639177 to 6557097 Compare June 12, 2024 22:14
@john-bodley john-bodley changed the title chore: Set isolation level chore: Set isolation level to READ COMMITTED Jun 12, 2024
@john-bodley john-bodley marked this pull request as ready for review June 12, 2024 22:41
@dosubot dosubot bot added data:connect:mysql Related to MySQL data:connect:postgres Related to Postgres labels Jun 12, 2024
@@ -52,12 +54,15 @@
 if "UPLOAD_FOLDER" in os.environ:  # noqa: F405
     UPLOAD_FOLDER = os.environ["UPLOAD_FOLDER"]  # noqa: F405

-if "sqlite" in SQLALCHEMY_DATABASE_URI:
+if make_url(SQLALCHEMY_DATABASE_URI).get_backend_name() == "sqlite":
Member Author

Safer than doing a string match.
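
To illustrate why a substring test can misfire, here is a small sketch. The `backend_name` helper is hypothetical: it is a standard-library approximation of what `make_url(...).get_backend_name()` returns (the URI scheme without any `+driver` suffix), not SQLAlchemy's implementation, and the example URI is made up.

```python
def backend_name(uri: str) -> str:
    """Return the URI scheme with any "+driver" suffix removed.

    Hypothetical stand-in for SQLAlchemy's make_url(uri).get_backend_name().
    """
    scheme, _, _ = uri.partition("://")
    return scheme.partition("+")[0]


# A PostgreSQL URI whose database name happens to contain "sqlite".
uri = "postgresql+psycopg2://superset@db-host/sqlite_migration"

substring_match = "sqlite" in uri             # matches anywhere in the string
parsed_match = backend_name(uri) == "sqlite"  # compares only the backend name

print(substring_match, parsed_match)
```

The substring check treats this PostgreSQL URI as SQLite, while the parsed comparison does not.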

@michael-s-molina
Member

Set isolation level to READ COMMITTED

@john-bodley looking at the changed files I was not able to see where you set the isolation level to READ COMMITTED in this PR. Did you forget to commit something?

@john-bodley john-bodley changed the title chore: Set isolation level to READ COMMITTED chore: Set isolation level to READ COMMITTED for testing Jun 13, 2024
@john-bodley
Member Author

john-bodley commented Jun 13, 2024

@michael-s-molina I updated the PR title and description.

It's actually hard to set this globally given that not all of our backends support READ COMMITTED. I did add a comment in config.py and updated the test configs so that MySQL and PostgreSQL behave similarly.

@john-bodley john-bodley changed the title chore: Set isolation level to READ COMMITTED for testing chore: Set isolation level to READ COMMITTED for testing et al. Jun 13, 2024
@john-bodley john-bodley merged commit f185bbe into apache:master Jun 13, 2024
72 checks passed
@john-bodley john-bodley deleted the john-bodley--mysql-read-committed branch June 13, 2024 16:36
# isolation level is READ COMMITTED. All backends should use READ COMMITTED (or similar)
# to help ensure consistent behavior.
SQLALCHEMY_ENGINE_OPTIONS = {
    "isolation_level": "SERIALIZABLE",  # SQLite does not support READ COMMITTED.
Member

@john-bodley Reading this, it looks like everything will be SERIALIZABLE, right? We're overriding it in the tests, but are we overriding it to READ COMMITTED anywhere else for the other databases?

Member

@sadpandajoe I followed the rabbit hole here, and I think you're right: this will bump every environment that doesn't set its own SQLALCHEMY_ENGINE_OPTIONS to a higher isolation level.

Member

How about this draft? #30174

@john-bodley john-bodley mentioned this pull request Jun 25, 2024
9 tasks
@mistercrunch mistercrunch added the risk:breaking-change Issues or PRs that will introduce breaking changes label Sep 6, 2024
mistercrunch added a commit that referenced this pull request Sep 6, 2024
#28628 had the virtuous intent of aligning MySQL and PostgreSQL on a consistent isolation_level (READ COMMITTED), which seems well suited to Superset. Instead, because SQLite doesn't support that level, it set the default to SERIALIZABLE, which seems to be supported by many engines.

Here I'm realizing we need dynamic defaults for isolation_level based on which engine is in use. That can't be done in config.py, because at that point we don't yet know which engine the admin will set. So I thought the superset/initialization package might be the right place for this.

Open to other solutions, but I think this one works.

* using DB_CONNECTION_MUTATOR, but that one applies only to analytics workloads, not the metadata db
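
The idea of engine-dependent defaults described above can be sketched as follows. This is an illustration only, not the code from #30174: the names `DEFAULT_ISOLATION_LEVELS` and `resolve_engine_options` are hypothetical, and the backend name is derived from the URI scheme with the standard library rather than with SQLAlchemy.

```python
from typing import Any, Dict

# Hypothetical per-backend defaults; SQLite does not support READ COMMITTED.
DEFAULT_ISOLATION_LEVELS = {
    "mysql": "READ COMMITTED",
    "postgresql": "READ COMMITTED",
    "sqlite": "SERIALIZABLE",
}


def resolve_engine_options(
    database_uri: str, configured_options: Dict[str, Any]
) -> Dict[str, Any]:
    """Fill in an isolation_level default unless the admin already set one."""
    options = dict(configured_options)  # never mutate the caller's config
    if "isolation_level" in options:
        return options  # an explicit setting always wins

    # Backend name: the URI scheme without any "+driver" suffix.
    backend = database_uri.partition("://")[0].partition("+")[0]
    level = DEFAULT_ISOLATION_LEVELS.get(backend)
    if level is not None:
        options["isolation_level"] = level
    return options  # unknown backends keep the driver's own default


print(resolve_engine_options("mysql+mysqldb://user@host/superset", {}))
```

Resolving the default at initialization time, once the metadata database URI is known, keeps config.py free of an engine-specific value while still respecting an explicit SQLALCHEMY_ENGINE_OPTIONS override.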
Labels
data:connect:mysql Related to MySQL data:connect:postgres Related to Postgres risk:breaking-change Issues or PRs that will introduce breaking changes size/S
4 participants