fix: save columns reference from sqllab save datasets flow #24248

Merged
hughhhh merged 61 commits into apache:master from fix-save-ds-in-sqllab
Jun 20, 2023

Conversation

hughhhh
Member

@hughhhh hughhhh commented May 30, 2023

SUMMARY

With the deprecation of /sqllab_viz, we need to update the /datasets API to handle saving columns when a dataset is saved. This PR also refactors column.name to always be column.column_name, matching the backend TableColumn model.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@codecov

codecov bot commented May 30, 2023

Codecov Report

Merging #24248 (c5c3cd2) into master (c3b5d72) will decrease coverage by 10.69%.
The diff coverage is 72.50%.

❗ Current head c5c3cd2 differs from pull request most recent head 2a1f09e. Consider uploading reports for the commit 2a1f09e to get more accurate results

@@             Coverage Diff             @@
##           master   #24248       +/-   ##
===========================================
- Coverage   68.85%   58.16%   -10.69%     
===========================================
  Files        1901     1901               
  Lines       73969    74000       +31     
  Branches     8119     8116        -3     
===========================================
- Hits        50931    43044     -7887     
- Misses      20927    28845     +7918     
  Partials     2111     2111               
Flag Coverage Δ
hive 53.93% <69.44%> (?)
javascript 55.65% <25.00%> (-0.01%) ⬇️
postgres ?
presto 53.83% <63.88%> (?)
python 60.88% <77.77%> (-22.25%) ⬇️
sqlite ?
unit 54.62% <66.66%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
...ckages/superset-ui-chart-controls/src/constants.ts 100.00% <ø> (ø)
...erset-ui-chart-controls/src/utils/columnChoices.ts 100.00% <ø> (ø)
...packages/superset-ui-core/src/query/types/Query.ts 100.00% <ø> (ø)
...d/src/SqlLab/components/SaveDatasetModal/index.tsx 50.00% <0.00%> (ø)
superset-frontend/src/SqlLab/fixtures.ts 100.00% <ø> (ø)
...rset-frontend/src/explore/components/SaveModal.tsx 35.08% <ø> (+0.30%) ⬆️
superset/databases/utils.py 28.88% <ø> (-53.34%) ⬇️
superset/datasets/api.py 47.07% <0.00%> (-41.59%) ⬇️
superset/models/sql_lab.py 75.10% <ø> (-2.82%) ⬇️
superset/result_set.py 83.09% <ø> (-14.79%) ⬇️
... and 17 more

... and 284 files with indirect coverage changes


@pull-request-size pull-request-size bot added size/M and removed size/S labels May 30, 2023
@@ -218,7 +220,9 @@ export const SaveDatasetModal = ({
       ...formDataWithDefaults,
       datasource: `${datasetToOverwrite.datasetid}__table`,
       ...(defaultVizType === 'table' && {
-        all_columns: datasource?.columns?.map(column => column.column_name),
+        all_columns: datasource?.columns?.map(
+          column => column.name || column.column_name,
Member

I've asked this before, but I don't remember the answer. Why can't we standardize this? This is definitely going to cause bugs in the future, having two possible options for the attribute name.

Can't we normalize the schema early, if they have to be stored differently, instead of normalizing it late here?
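One way to read this suggestion, as a minimal sketch: convert raw column dicts into a single shape at the point where they enter the app, so later code never needs a `name || column_name` fallback. The `NormalizedColumn` class and its `from_raw` method are hypothetical names used only for illustration. They are not part of this PR or of Superset.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class NormalizedColumn:
    """Hypothetical normalized column: always exposes `column_name`."""

    column_name: str
    type: Optional[str] = None

    @classmethod
    def from_raw(cls, raw: Dict[str, Any]) -> "NormalizedColumn":
        # Accept either key once, here, so no caller has to check both.
        name = raw.get("column_name") or raw.get("name")
        if not name:
            raise ValueError("column dict has neither 'column_name' nor 'name'")
        return cls(column_name=name, type=raw.get("type"))


# Assumed input shapes: one SQL Lab style dict, one Explore style dict.
raw_columns = [
    {"name": "id", "type": "INT"},
    {"column_name": "ts", "type": "DATETIME"},
]
columns = [NormalizedColumn.from_raw(raw) for raw in raw_columns]
all_columns = [column.column_name for column in columns]
```

With this shape, the `all_columns` mapping in the hunk above would only ever need to read `column_name`.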

Member Author

Looking into this now, @betodealmeida. I think it was just a difference between how SQL Lab consumes columns vs. Explore.

@eschutho
Member

@hughhhh can you add some tests on this for the bugs fixed? Thanks!

Member

@eschutho eschutho left a comment

Let's add some tests for this specific use case, since it isn't covered in the codebase already.

@@ -59,6 +59,8 @@ def update_saved_query_exec_info(query_id: int) -> None:
 def save_metadata(query: Query, payload: Dict[str, Any]) -> None:
     # pull relevant data from payload and store in extra_json
     columns = payload.get("columns", {})
+    for col in columns:
+        col["column_name"] = col.pop("name")
Member Author

@hughhhh hughhhh May 31, 2023

The query execution returns each column dict keyed by name (column["name"]), so I decided to rename the key here before returning it to the client.
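A note on the rename in the hunk above: `dict.pop("name")` with no default raises `KeyError` when the key is absent. The thread does not say whether an execution payload can ever lack `name`, so the following is only a defensive sketch under that assumption. `rename_name_key` is a hypothetical helper, not a function from this PR.

```python
from typing import Any, Dict, List


def rename_name_key(columns: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Rename `name` to `column_name` in place, skipping dicts without `name`."""
    for col in columns:
        if "name" in col:
            # Same rename as in the diff, guarded against a missing key.
            col["column_name"] = col.pop("name")
    return columns


# Assumed payload shape; the second dict is already in the new form.
payload = {"columns": [{"name": "ds", "type": "DATE"}, {"column_name": "n"}]}
renamed = rename_name_key(payload.get("columns", []))
```

Like the code in the diff, this mutates the dicts in place, so any other holder of the same payload sees the renamed keys too.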

@pull-request-size pull-request-size bot added size/L and removed size/M labels May 31, 2023
@pull-request-size pull-request-size bot added size/M and removed size/L labels Jun 1, 2023
@eschutho
Member

@hughhhh does this break any apis? i.e., change the schema for the request or response?

@john-bodley
Member

@hughhhh what's the main reasoning behind renaming the column name to column_name? It seems like the column_ prefix is superfluous and inconsistent, i.e., we now have column_name and type.

@eschutho
Member

/testenv up

@github-actions
Contributor

@eschutho Container image not yet published for this PR. Please try again when build is complete.

@github-actions
Contributor

@eschutho Ephemeral environment creation failed. Please check the Actions logs for details.

@hughhhh
Member Author

hughhhh commented Jun 15, 2023

@hughhhh what's the main reasoning behind renaming the column name to column_name? It seems like the column_ prefix is superfluous and inconsistent, i.e., we now have column_name and type.

@john-bodley

The main reason is to keep consistency across the app when it comes to Column objects/dicts. Since we've incorporated query objects inside Explore, there have been tons of places where we have conditional logic to manage SqlaTables vs. Query ([example here](https://github.com//pull/24248/files#diff-407b1e72d7f4462c85eca33d215cd9981b18db050f649dfaaac5748a262d09b8R503)), which has also caused a few bugs whenever a user goes from SQL Lab -> Explore. So I wanted to create a consistent convention. I picked .column_name since it is equivalent to our TableColumn model field.

@hughhhh
Member Author

hughhhh commented Jun 15, 2023

@hughhhh does this break any apis? i.e., change the schema for the request or response?

@eschutho

Yeah, the column object now has a different key for the name: it's now column_name. I can add back the name key for clients that might be relying on it and aren't coming through our UI.

@eschutho
Member

Yeah, the column object now has a different key for the name: it's now column_name. I can add back the name key for clients that might be relying on it and aren't coming through our UI.

That would be great. Yeah, we should always keep the apis backward compatible. Even if the app has a breaking change, the apis have their own versioning and the request/response schemas should be backward compatible until we can bump to the next version of the api.
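A minimal sketch of the backward-compatible shape discussed here, assuming the response simply carries both keys until the next API version. How the PR actually restored the key is not shown in this thread, and `with_legacy_name` is a hypothetical helper.

```python
from typing import Any, Dict, List


def with_legacy_name(columns: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return copies of the column dicts carrying both `column_name` and `name`."""
    result = []
    for col in columns:
        out = dict(col)  # copy so internal dicts keep the single-key form
        if "column_name" in out:
            # Mirror the new key under the legacy one for existing clients.
            out.setdefault("name", out["column_name"])
        result.append(out)
    return result


internal_columns = [{"column_name": "ds", "type": "DATE"}]
response_columns = with_legacy_name(internal_columns)
```

Keeping the duplication at the response boundary means internal code can rely on `column_name` alone, while older clients reading `name` keep working.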

@hughhhh hughhhh force-pushed the fix-save-ds-in-sqllab branch 2 times, most recently from 311d07b to 3fc4ea9 Compare June 15, 2023 19:21
Member

@eschutho eschutho left a comment

LGTM. I agree with @john-bodley that the column.column_name property sounds redundant compared with just name, but maybe it's something we can revisit for the v3 api?

@hughhhh hughhhh merged commit 93e1db4 into apache:master Jun 20, 2023
29 checks passed
@hughhhh hughhhh deleted the fix-save-ds-in-sqllab branch June 20, 2023 17:54
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 3.0.0 labels Mar 8, 2024
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/XL 🚢 3.0.0

6 participants