fix(sqllab): Replace stringified 'null' schema column values with NULL #18992

Merged

@john-bodley john-bodley commented Mar 1, 2022

SUMMARY

I'm not sure exactly when this came about, and I believe the underlying issue was resolved recently (though I have not been able to identify the relevant PR(s)). At some stage we were stringifying an undefined (NULL) schema in SQL Lab as "null" in both the query and saved_query tables, which resulted in some undesirable user behavior.

This PR includes a migration which simply replaces the null string with NULL, i.e., undefined.
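
As a rough illustration of the cleanup described above, the following is a minimal sketch and not the migration code from this PR. It assumes the two tables are named query and saved_query with a nullable schema column (as stated in the summary), and it uses the standard-library sqlite3 module against an in-memory database. The helper names replace_null_strings and build_demo_db are hypothetical.

```python
import sqlite3

# Table names taken from the PR summary; the column name "schema" is assumed.
TABLES = ("query", "saved_query")


def replace_null_strings(conn: sqlite3.Connection) -> int:
    """Set schema to NULL wherever it holds the literal string 'null'.

    Returns the total number of rows updated across both tables.
    """
    updated = 0
    for table in TABLES:
        # Table names come from the fixed tuple above, never from user input.
        cursor = conn.execute(
            f'UPDATE "{table}" SET "schema" = NULL WHERE "schema" = ?',
            ("null",),
        )
        updated += cursor.rowcount
    conn.commit()
    return updated


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one bad row per table (demo only)."""
    conn = sqlite3.connect(":memory:")
    for table in TABLES:
        conn.execute(
            f'CREATE TABLE "{table}" (id INTEGER PRIMARY KEY, "schema" TEXT)'
        )
        conn.executemany(
            f'INSERT INTO "{table}" ("schema") VALUES (?)',
            [("null",), ("public",), (None,)],
        )
    conn.commit()
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    print("rows updated:", replace_null_strings(demo))
```

Running the function a second time updates nothing, since no 'null' strings remain. The sketch does not attempt a downgrade step.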

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

Tested via CI and by running the upgrade/downgrade manually. I also ran the benchmarking script (thanks @betodealmeida), which produced the following results:

Migration for 1000+ entities took: 0.42 seconds

Results:

Current: 0.50 s
10+: 0.35 s
100+: 0.35 s
1000+: 0.42 s

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@john-bodley john-bodley requested a review from a team as a code owner March 1, 2022 20:59
__tablename__ = "saved_query"

id = Column(Integer, primary_key=True)
schema = Column(String(128))
Member Author

It's not evident why these tables use different string sizes for encoding the schema.

Member

yikes

Member

@etr2460 etr2460 left a comment

can you run Beto's db migration benchmarking script and post the results?

codecov bot commented Mar 1, 2022

Codecov Report

Merging #18992 (40c2332) into master (150fd0d) will decrease coverage by 0.19%.
The diff coverage is n/a.

❗ Current head 40c2332 differs from pull request most recent head 88af195. Consider uploading reports for the commit 88af195 to get more accurate results

@@            Coverage Diff             @@
##           master   #18992      +/-   ##
==========================================
- Coverage   66.58%   66.39%   -0.20%     
==========================================
  Files        1641     1641              
  Lines       63524    63524              
  Branches     6421     6421              
==========================================
- Hits        42299    42178     -121     
- Misses      19548    19669     +121     
  Partials     1677     1677              
Flag      Coverage Δ
hive      ?
mysql     81.83% <ø> (ø)
postgres  81.88% <ø> (ø)
presto    ?
python    81.93% <ø> (-0.39%) ⬇️
sqlite    ?

Impacted Files                                  Coverage Δ
superset/db_engines/hive.py                     0.00% <0.00%> (-85.19%) ⬇️
superset/db_engine_specs/hive.py                70.00% <0.00%> (-15.77%) ⬇️
superset/db_engine_specs/presto.py              83.47% <0.00%> (-5.65%) ⬇️
superset/db_engine_specs/sqlite.py              91.89% <0.00%> (-5.41%) ⬇️
superset/connectors/sqla/utils.py               88.23% <0.00%> (-3.93%) ⬇️
superset/utils/celery.py                        86.20% <0.00%> (-3.45%) ⬇️
superset/result_set.py                          96.77% <0.00%> (-1.62%) ⬇️
superset/databases/commands/test_connection.py  98.57% <0.00%> (-1.43%) ⬇️
superset/connectors/sqla/models.py              88.81% <0.00%> (-1.24%) ⬇️
superset/utils/cache.py                         73.58% <0.00%> (-0.95%) ⬇️
... and 4 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 150fd0d...88af195.

@john-bodley
Member Author

@etr2460 I've addressed your comment.

@john-bodley john-bodley requested a review from etr2460 March 2, 2022 01:12
Member

@etr2460 etr2460 left a comment

lgtm, thanks for the cleanup migration

@john-bodley john-bodley merged commit 19eb73b into apache:master Mar 2, 2022
villebro pushed a commit that referenced this pull request Apr 3, 2022
#18992)

Co-authored-by: John Bodley <john.bodley@airbnb.com>
(cherry picked from commit 19eb73b)
@mistercrunch mistercrunch added 🍒 1.5.0 🍒 1.5.1 🍒 1.5.2 🍒 1.5.3 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 2.0.0 labels Mar 13, 2024
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels lts-v1 size/M 🍒 1.5.0 🍒 1.5.1 🍒 1.5.2 🍒 1.5.3 🚢 2.0.0