chore: add check constraint to restrict Slice models datasource_type != "table" #23614

Merged
hughhhh merged 7 commits into master from chart-ds-constraint on Apr 20, 2023

hughhhh (Member) commented Apr 6, 2023

SUMMARY

We are currently seeing errors in production where charts/slices are being saved with datasource_type = "query". At present, charts can only be powered by SqlaTable models.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

hughhhh requested a review from a team as a code owner April 6, 2023 15:29

john-bodley (Member) commented Apr 6, 2023

@hughhhh we're seeing examples of this at Airbnb as well. My question: your fix only includes a migration, i.e., it will remedy any currently ill-defined charts, but beyond the check constraint there is nothing that prevents this from happening again. Has the root cause already been resolved?

codecov bot commented Apr 6, 2023

Codecov Report

Merging #23614 (7ac7868) into master (bccd267) will increase coverage by 0.36%.
The diff coverage is 74.22%.

❗ Current head 7ac7868 differs from pull request most recent head 6aee98b. Consider uploading reports for the commit 6aee98b to get more accurate results

@@            Coverage Diff             @@
##           master   #23614      +/-   ##
==========================================
+ Coverage   67.64%   68.00%   +0.36%     
==========================================
  Files        1916     1920       +4     
  Lines       74036    73990      -46     
  Branches     8039     8091      +52     
==========================================
+ Hits        50078    50314     +236     
+ Misses      21911    21607     -304     
- Partials     2047     2069      +22     
Flag Coverage Δ
hive 53.18% <50.70%> (?)
mysql 79.21% <74.82%> (+0.72%) ⬆️
postgres ?
presto 53.09% <50.70%> (+0.41%) ⬆️
python 82.98% <75.53%> (+0.74%) ⬆️
sqlite ?
unit 53.03% <42.90%> (+0.38%) ⬆️

Impacted Files Coverage Δ
...d/packages/superset-ui-chart-controls/src/types.ts 100.00% <ø> (ø)
...ackages/superset-ui-core/src/utils/featureFlags.ts 100.00% <ø> (ø)
...s/plugin-chart-echarts/src/Timeseries/constants.ts 100.00% <ø> (ø)
...gin-chart-echarts/src/Timeseries/transformProps.ts 57.27% <0.00%> (-1.07%) ⬇️
...tend/plugins/plugin-chart-echarts/src/constants.ts 100.00% <ø> (ø)
...tend/plugins/plugin-chart-echarts/src/controls.tsx 76.00% <ø> (ø)
...ugins/preset-chart-xy/src/components/Line/Line.tsx 0.00% <0.00%> (ø)
...preset-chart-xy/src/utils/createMarginSelector.tsx 0.00% <0.00%> (ø)
superset-frontend/src/SqlLab/App.jsx 0.00% <0.00%> (ø)
...d/src/SqlLab/components/SaveDatasetModal/index.tsx 52.27% <0.00%> (ø)
... and 64 more

... and 139 files with indirect coverage changes

__tablename__ = "slices"

id = sa.Column(sa.Integer, primary_key=True)
slice_name = sa.Column(sa.String(250))
Member commented:

This column (slice_name) is unused.

id = sa.Column(sa.Integer, primary_key=True)
slice_name = sa.Column(sa.String(250))
datasource_type = sa.Column(sa.String(200))
query_context = sa.Column(sa.Text)
Member commented:

This column (query_context) is unused.

    with op.batch_alter_table("slices") as batch_op:
        for slc in session.query(Slice).filter(Slice.datasource_type != "table").all():
            # clean up all charts whose datasource_type is not "table"
            slc.datasource_type = "table"
Member commented:

You'll also have to update params, whose datasource field is defined as <datasource_id>__<datasource_type>.
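
To illustrate this point: a chart's params JSON carries its own copy of the datasource reference, so a migration that rewrites datasource_type would also need to rewrite that string. A minimal sketch, assuming params is a JSON string with a datasource key and that the separator is a double underscore (as in "12__table"); the helper name retarget_params is hypothetical and not part of Superset:

```python
import json
from typing import Optional


def retarget_params(
    params: Optional[str], datasource_id: int, new_type: str = "table"
) -> Optional[str]:
    """Return params with its datasource field set to <id>__<new_type>.

    Hypothetical helper for illustration only.
    """
    if not params:
        return params  # nothing to rewrite
    try:
        data = json.loads(params)
    except json.JSONDecodeError:
        return params  # leave malformed params untouched
    if not isinstance(data, dict):
        return params  # unexpected shape, leave as-is
    data["datasource"] = f"{datasource_id}__{new_type}"
    return json.dumps(data)
```

Inside a migration loop this might be called as slc.params = retarget_params(slc.params, slc.datasource_id), alongside the datasource_type update.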

Member commented:

@hughhhh we're seeing a defined slices.datasource_name column when slices.datasource_type = 'query'; however, it doesn't seem to be a dataset in the tables table, which would indicate that your migration would likely break said slices when you make the switch from query to table. BTW, where is this query stored?

hughhhh (Member, Author) commented Apr 6, 2023:

> @hughhhh we're seeing a defined slices.datasource_name column when slices.datasource_type = 'query'; however, it doesn't seem to be a dataset in the tables table, which would indicate that your migration would likely break said slices when you make the switch from query to table. BTW, where is this query stored?

So those slices are most likely already broken, because charts/slices don't know how to fetch data when the datasource_type is Query.

Query is defined here

hughhhh (Member, Author) commented:

This is the stack trace we are encountering in production:

    return [
  File "/usr/local/lib/python3.8/site-packages/marshmallow/schema.py", line 515, in <listcomp>
    self._serialize(d, many=False)
  File "/usr/local/lib/python3.8/site-packages/marshmallow/schema.py", line 520, in _serialize
    value = field_obj.serialize(attr_name, obj, accessor=self.get_attribute)
  File "/usr/local/lib/python3.8/site-packages/marshmallow/fields.py", line 338, in serialize
    return self._serialize(value, attr, obj, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/marshmallow/fields.py", line 1922, in _serialize
    return self._call_or_raise(self.serialize_func, obj, attr)
  File "/usr/local/lib/python3.8/site-packages/marshmallow/fields.py", line 1936, in _call_or_raise
    return func(value)
  File "/usr/local/lib/python3.8/site-packages/superset/models/slice.py", line 167, in datasource_url
    return datasource.explore_url if datasource else None
AttributeError: 'Query' object has no attribute 'explore_url'
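
The trace ends in datasource_url, which assumes every datasource exposes explore_url. A minimal reproduction of that failure mode, using stand-in classes rather than Superset's real models; the class bodies, the placeholder URL, and the getattr-based variant are illustrative assumptions, not the fix adopted in this PR:

```python
class SqlaTable:
    """Stand-in for a dataset model that can be explored."""

    explore_url = "/explore/?datasource_type=table&datasource_id=1"  # placeholder


class Query:
    """Stand-in for a SQL Lab query: it has no explore_url attribute."""


def datasource_url(datasource):
    # Mirrors the failing line shown in the trace.
    return datasource.explore_url if datasource else None


def safe_datasource_url(datasource):
    # Defensive variant: returns None instead of raising for a Query.
    return getattr(datasource, "explore_url", None) if datasource else None
```

The defensive variant only hides the symptom; a chart pointing at a Query still has no dataset to read from, which is why the PR targets the stored datasource_type instead.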

Member commented:

@hughhhh per your response,

> So those slices are most likely already broken, because charts/slices don't know how to fetch data when the datasource_type is Query.

I guess the question/concern is that simply changing the datasource_type to table doesn't really mitigate the problematic charts, as the datasource_id is then likely incorrect as well, i.e., it references the query.id column rather than the tables.id column.

Also please refer to my comment regarding the chart parameters as we (for right or wrong) doubly define the ID/type tuple.
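
One way to gauge this concern before migrating is to look for charts whose datasource_id has no matching row in tables. A rough sketch using an in-memory SQLite database with heavily simplified stand-ins for the slices and tables schemas; the column subset and the sample rows are assumptions for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tables (id INTEGER PRIMARY KEY, table_name TEXT);
    CREATE TABLE slices (
        id INTEGER PRIMARY KEY,
        datasource_id INTEGER,
        datasource_type TEXT
    );
    INSERT INTO tables VALUES (1, 'sales');
    INSERT INTO slices VALUES (10, 1, 'table');   -- healthy chart
    INSERT INTO slices VALUES (11, 1, 'query');   -- id happens to match a dataset
    INSERT INTO slices VALUES (12, 99, 'query');  -- id matches no dataset
    """
)

# Charts that would still be dangling after datasource_type is set to 'table'.
orphans = conn.execute(
    """
    SELECT s.id
    FROM slices AS s
    LEFT JOIN tables AS t ON t.id = s.datasource_id
    WHERE s.datasource_type != 'table' AND t.id IS NULL
    ORDER BY s.id
    """
).fetchall()
```

A match is not proof of correctness: chart 11 in this sample would resolve to a dataset after the switch, but possibly the wrong one, since its ID originally referred to a query.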

hughhhh (Member, Author) commented:

So the idea here was to see whether some of them will work if we change the type, since we can't just delete these charts. We are assuming these charts are already failing and hoping this might help fix them, but there is no guarantee. Also, with the constraint in place, we should now see an error whenever a user or the API tries to write slice.datasource_type = query.
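
The effect of such a constraint can be sketched with SQLite from the Python standard library. The constraint name ck_chart_datasource and the trimmed-down table are assumptions for illustration; the PR itself adds the constraint through a database migration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE slices (
        id INTEGER PRIMARY KEY,
        datasource_type TEXT,
        CONSTRAINT ck_chart_datasource CHECK (datasource_type = 'table')
    )
    """
)

# A chart backed by a dataset is accepted.
conn.execute("INSERT INTO slices (datasource_type) VALUES ('table')")

# A chart backed by a query is refused by the database itself.
try:
    conn.execute("INSERT INTO slices (datasource_type) VALUES ('query')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM slices").fetchone()[0]
```

Because the check runs in the database, it catches every write path (UI, API, scripts), at the cost of surfacing as an integrity error rather than a friendly validation message.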

pull-request-size bot added size/L and removed size/M labels Apr 13, 2023
hughhhh merged commit c441a70 into master Apr 20, 2023
rusackas deleted the chart-ds-constraint branch April 20, 2023 21:24
sebastianliebscher pushed a commit to sebastianliebscher/superset that referenced this pull request Apr 28, 2023
mistercrunch added the 🏷️ bot and 🚢 3.0.0 labels Mar 13, 2024
Labels: 🏷️ bot, size/L, 🚢 3.0.0