feat(dashboard): make permalinks stable #20632

Merged: ktmud merged 1 commit into apache:master from the keyvalue-deterministic-key branch on Jul 12, 2022

Member

@ktmud ktmud commented Jul 6, 2022

SUMMARY

Make the dashboard permalink API always return the same key when the request payload is the same, i.e., for a given dashboard state, a user should always get the same permalink when they request one.

This is a required change for #20552 and #19354 as we don't want to generate a new permalink each time a dashboard report is executed.

The Explore permalink should probably do the same, but this PR only changes the behavior for dashboard permalink just in case it causes any problems.
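
A minimal sketch of the idea, assuming the key is derived by hashing a canonical JSON form of the payload with `uuid.uuid3`. The helper name `stable_key`, the salt value, and the sample payloads are hypothetical and only illustrate the technique; they are not taken from the Superset code base.

```python
import json
from uuid import NAMESPACE_DNS, UUID, uuid3


def stable_key(salt: str, payload: dict) -> UUID:
    """Hypothetical helper: derive a repeatable UUID from a salt and a payload."""
    # Turn the salt into a namespace UUID (an assumption made for this sketch).
    namespace = uuid3(NAMESPACE_DNS, salt)
    # sort_keys=True makes the JSON text independent of dict insertion order.
    payload_str = json.dumps(payload, sort_keys=True)
    return uuid3(namespace, payload_str)


# Two payloads with the same content but a different key order.
state_a = {"dashboardId": "1", "state": {"filterState": {}, "hash": "tab-1"}}
state_b = {"state": {"hash": "tab-1", "filterState": {}}, "dashboardId": "1"}

same = stable_key("permalink", state_a) == stable_key("permalink", state_b)
different = stable_key("permalink", state_a) != stable_key(
    "permalink", {"dashboardId": "2"}
)
```

Because the key depends only on the salt and the serialized payload, repeated requests for the same state map to the same key, while any change in the state produces a different one.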

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

N/A. This is a pure backend change. There should be no visible changes to users.

TESTING INSTRUCTIONS

  1. Copy a permalink for a dashboard. You can do this from the "..." menu on the dashboard page, or from the menu of a chart or tab.

  2. The link should be the same no matter how many times you click the trigger button or link.
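
The expectation in step 2 can be illustrated with a toy in-memory store. This is only a sketch: `InMemoryKeyValueStore` and its methods are hypothetical stand-ins, not the Superset key-value commands, and the state string is made up.

```python
from typing import Any, Dict
from uuid import NAMESPACE_URL, UUID, uuid3, uuid4


class InMemoryKeyValueStore:
    """Hypothetical stand-in for a key-value table."""

    def __init__(self) -> None:
        self.rows: Dict[UUID, Any] = {}

    def create(self, value: Any) -> UUID:
        # Create-style behaviour in this sketch: a fresh random key per call.
        key = uuid4()
        self.rows[key] = value
        return key

    def upsert(self, key: UUID, value: Any) -> UUID:
        # Upsert-style behaviour: insert or overwrite under a caller-supplied key.
        self.rows[key] = value
        return key


store = InMemoryKeyValueStore()
state = "dashboard=1&tab=overview"
fixed_key = uuid3(NAMESPACE_URL, state)

# Requesting the link twice with a fixed key yields one row and one key.
first = store.upsert(fixed_key, state)
second = store.upsert(fixed_key, state)
rows_after_upsert = len(store.rows)

# Random keys add a new row on every request.
store.create(state)
store.create(state)
rows_after_create = len(store.rows)
```

With a caller-supplied key, clicking the link repeatedly does not grow the store; with random keys, every click would add another row.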

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@ktmud ktmud marked this pull request as ready for review July 6, 2022 23:06

codecov bot commented Jul 6, 2022

Codecov Report

Merging #20632 (43e9ff2) into master (5317462) will decrease coverage by 0.19%.
The diff coverage is 100.00%.

❗ Current head 43e9ff2 differs from pull request most recent head 6fd93bd. Consider uploading reports for the commit 6fd93bd to get more accurate results

@@            Coverage Diff             @@
##           master   #20632      +/-   ##
==========================================
- Coverage   66.85%   66.65%   -0.20%     
==========================================
  Files        1753     1752       -1     
  Lines       65833    65647     -186     
  Branches     7007     6940      -67     
==========================================
- Hits        44011    43758     -253     
- Misses      20036    20127      +91     
+ Partials     1786     1762      -24     
Flag       Coverage Δ
hive       ?
mysql      82.39% <100.00%> (-0.08%) ⬇️
postgres   82.47% <100.00%> (-0.09%) ⬇️
presto     ?
python     82.54% <100.00%> (-0.44%) ⬇️
sqlite     82.25% <100.00%> (-0.08%) ⬇️
unit       ?

Impacted Files Coverage Δ
...uperset/dashboards/filter_state/commands/create.py 100.00% <100.00%> (ø)
...uperset/dashboards/filter_state/commands/delete.py 95.83% <100.00%> (+0.18%) ⬆️
...uperset/dashboards/filter_state/commands/update.py 100.00% <100.00%> (ø)
superset/dashboards/permalink/commands/create.py 94.11% <100.00%> (+5.22%) ⬆️
superset/explore/form_data/commands/create.py 95.23% <100.00%> (+0.11%) ⬆️
superset/explore/form_data/commands/delete.py 92.50% <100.00%> (+0.39%) ⬆️
superset/explore/form_data/commands/update.py 93.87% <100.00%> (+0.12%) ⬆️
superset/key_value/commands/create.py 93.61% <100.00%> (+0.43%) ⬆️
superset/key_value/commands/upsert.py 89.58% <100.00%> (+0.45%) ⬆️
superset/key_value/utils.py 95.12% <100.00%> (+1.00%) ⬆️
... and 154 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5317462...6fd93bd. Read the comment docs.

@ktmud ktmud force-pushed the keyvalue-deterministic-key branch from 91ed359 to 43a6f74 Compare July 6, 2022 23:13
@ktmud ktmud changed the title from "feat(dashboard): make permalink deterministic" to "feat(dashboard): make permalinks stable" Jul 6, 2022
@ktmud ktmud force-pushed the keyvalue-deterministic-key branch from 43a6f74 to 02d3699 Compare July 6, 2022 23:16
         }
-        key = CreateKeyValueCommand(
+        key = UpsertKeyValueCommand(
+            key=get_deterministic_uuid(self.salt, self.signature),
Member Author

Theoretically the uuid3 would never collide with another uuid3, but there is an extremely small chance of colliding with a uuid4. Once we change the Explore permalink to this deterministic id, too, this should not be a concern.
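
For context, Python's `uuid` module records a version number inside every UUID it generates, which can be inspected as in the sketch below. This only illustrates standard-library behaviour; the name string is made up, and how Superset stores or compares its keys is not shown here.

```python
from uuid import NAMESPACE_URL, uuid3, uuid4

# A name-based (MD5) UUID and a random UUID from the standard library.
deterministic = uuid3(NAMESPACE_URL, "example-payload")
random_key = uuid4()

# The version field is part of the 128-bit value itself.
versions = (deterministic.version, random_key.version)
```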

    return uuid3(get_uuid_namespace(namespace), payload_str)


def get_owner_id(user: Optional[User]) -> Optional[int]:
Member

Note to self. This should be replaced with get_user_id if/when #20499 is merged.

}

@property
def signature(self) -> Tuple[Optional[int], DashboardPermalinkValue]:
Member

Why is the signature a function of the user?

Member Author

@ktmud ktmud Jul 8, 2022

Just thought it'd be safer to give each user a unique permalink in case we want to do anything special about it. (Can't think of a use case just yet.) Plus, the user id is also saved for KeyValue, so you can also argue this is for completeness' sake.
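
The effect of including the user in the signature can be sketched as follows, assuming the signature is a `(user_id, state)` tuple that is serialized to JSON before hashing. The helper `key_for`, the namespace, and the sample values are hypothetical.

```python
import json
from typing import Any, Dict, Optional, Tuple
from uuid import NAMESPACE_URL, UUID, uuid3

Signature = Tuple[Optional[int], Dict[str, Any]]


def key_for(signature: Signature) -> UUID:
    """Hypothetical helper: hash a (user_id, state) signature into a UUID."""
    # json.dumps writes the tuple as a JSON array and None as null.
    return uuid3(NAMESPACE_URL, json.dumps(signature, sort_keys=True))


state = {"dashboardId": "1", "state": {"hash": "tab-1"}}

alice_key = key_for((1, state))
bob_key = key_for((2, state))
anonymous_key = key_for((None, state))
```

Under this assumption, the same dashboard state yields one key per user: stable for each user, but not shared between users.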

@ktmud ktmud force-pushed the keyvalue-deterministic-key branch from 02d3699 to 53c1c1a Compare July 9, 2022 00:59
@pull-request-size pull-request-size bot added size/M and removed size/L labels Jul 9, 2022
@ktmud ktmud force-pushed the keyvalue-deterministic-key branch 2 times, most recently from 8addd03 to c6041a5 Compare July 11, 2022 18:00

def get_deterministic_uuid(namespace: str, payload: Any) -> UUID:
    """Get a deterministic UUID (uuid3) from a salt and a payload."""
    payload_str = json_dumps_w_dates(payload, sort_keys=True)
Member Author

@villebro @michael-s-molina any particular reason why KeyValue was stored with Python pickle instead of JSON? IMO we should avoid storing pickles in database as they are not portable and could be versioned out if an advanced python object was used. It'd also be impossible to export the database directly for offline analysis.

Member

@villebro villebro Jul 12, 2022

@ktmud not all KeyValue implementations will necessarily be JSON serializable. For instance, the SupersetMetastoreCache KeyValue objects aren't JSON serialisable. While pickle isn't necessarily perfect in all scenarios, IMO it's the most generally supported serialising/deserializing utility for Python, and as such seems like the best solution for a general purpose key-value store.

At the time this feature was designed there was really no requirement for analysing the contents of the value payload, so this wasn't considered. But if this is needed I don't see any reason why we couldn't make the serializer and deserialiser customizable to make it possible to use another library for encoding/decoding the values and then swapping it out for json.dumps and json.loads for the permalink implementations (would of course require a migration).

In the case of the proposed function, I'd recommend adding a comment in the docstring that the payload needs to be JSON serializable.
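
A customizable serializer along these lines could look like the sketch below. The `Codec` type and both codec instances are hypothetical and do not describe an existing Superset API; they only show how the encoding could be swapped per use case while the stored value stays binary.

```python
import json
import pickle
from typing import Any, Callable, NamedTuple


class Codec(NamedTuple):
    """Hypothetical pair of encode/decode functions for a binary value column."""

    encode: Callable[[Any], bytes]
    decode: Callable[[bytes], Any]


# General-purpose codec: handles arbitrary Python objects, but is Python-only.
pickle_codec = Codec(encode=pickle.dumps, decode=pickle.loads)

# JSON codec: limited to JSON-serializable values, readable outside Python.
json_codec = Codec(
    encode=lambda value: json.dumps(value, sort_keys=True).encode("utf-8"),
    decode=lambda raw: json.loads(raw.decode("utf-8")),
)

permalink_value = {"dashboardId": "1", "state": {"hash": "tab-1"}}

stored = json_codec.encode(permalink_value)
restored = json_codec.decode(stored)
```

The JSON bytes can be read by tools that do not run Python, whereas the pickle bytes can only be decoded by Python.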

Member Author

@ktmud ktmud Jul 12, 2022

I still don't think it's a good idea to store language-specific serialized data in databases---even though Superset is predominantly a Python app. I can see the case when we need to store other binary data, e.g., image files, cache for zipped exports, encrypted data, etc, so maybe the value column indeed needs to be binary. A customizable serializer sounds like a good idea. Let's revisit when offline analysis becomes a real need.

@ktmud ktmud merged commit c3ac612 into apache:master Jul 12, 2022
@john-bodley john-bodley deleted the keyvalue-deterministic-key branch February 17, 2023 22:16
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 2.1.0 labels Mar 13, 2024