Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(embedded): adding logic to check dataset used by filters #24808

Merged
merged 8 commits into from
Jul 31, 2023

Conversation

Vitor-Avila
Copy link
Contributor

SUMMARY

When granting dashboard access to a guest user, it's only granted access to datasets used by its charts. If the dashboard has any native filters powered by datasets that aren't used by any chart, the filter wouldn't load with a permission error. This PR changes this logic to also allow access to datasets used by filters.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

Before

image

After

image

TESTING INSTRUCTIONS

  1. Create a chart using any dataset.
  2. Save the chart and add it to a dashboard.
  3. Create a virtual dataset for the same table (a select * ... would be enough).
  4. Create a dashboard filter using the virtual dataset.
  5. Enable embedded access for the dashboard.
  6. Create a guest_token and grant access to this dashboard.
  7. Access this dashboard in embedded mode.
  8. Validate that the dashboard filter loads properly.

ADDITIONAL INFORMATION

@codecov
Copy link

codecov bot commented Jul 26, 2023

Codecov Report

Merging #24808 (aa2c6f7) into master (c17accc) will increase coverage by 10.44%.
Report is 20 commits behind head on master.
The diff coverage is 73.63%.

❗ Current head aa2c6f7 differs from pull request most recent head 7668aec. Consider uploading reports for the commit 7668aec to get more accurate results

@@             Coverage Diff             @@
##           master   #24808       +/-   ##
===========================================
+ Coverage   58.40%   68.85%   +10.44%     
===========================================
  Files        1902     1903        +1     
  Lines       73996    74089       +93     
  Branches     8195     8194        -1     
===========================================
+ Hits        43220    51013     +7793     
+ Misses      28657    20955     -7702     
- Partials     2119     2121        +2     
Flag Coverage Δ
hive ?
mysql 79.21% <71.96%> (?)
postgres 79.31% <71.96%> (?)
presto ?
python 83.05% <73.64%> (+21.78%) ⬆️
sqlite 77.88% <71.12%> (?)
unit 54.97% <50.62%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
...et-chart-deckgl/src/layers/Geojson/controlPanel.ts 50.00% <ø> (ø)
...egacy-preset-chart-deckgl/src/layers/Path/Path.jsx 0.00% <ø> (ø)
...reset-chart-deckgl/src/layers/Path/controlPanel.ts 50.00% <ø> (ø)
...preset-chart-deckgl/src/layers/Polygon/Polygon.jsx 0.00% <ø> (ø)
...et-chart-deckgl/src/layers/Polygon/controlPanel.ts 33.33% <ø> (ø)
...reset-chart-deckgl/src/utilities/Shared_DeckGL.jsx 86.48% <ø> (ø)
superset-frontend/src/SqlLab/App.jsx 0.00% <ø> (ø)
...d/src/SqlLab/components/SaveDatasetModal/index.tsx 60.63% <ø> (+10.63%) ⬆️
...frontend/src/SqlLab/components/SouthPane/index.tsx 79.54% <ø> (ø)
superset-frontend/src/SqlLab/fixtures.ts 100.00% <ø> (ø)
... and 109 more

... and 222 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Copy link
Member

@eschutho eschutho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @Vitor-Avila!

for target in filter_.get("targets", [])
]
if datasource.id in filter_dataset_ids:
exists = True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
exists = True
return True

We should short circuit when possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes perfect sense! Made this change.

exists = True
except ValueError:
pass

return exists
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably cleaner on line #2066 to have:

if db.session.query(query.exists()).scalar():
    return True

and then on line #2088 have

return False

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here as well.

@@ -2063,6 +2064,27 @@ def can_access_based_on_dashboard(datasource: "BaseDatasource") -> bool:
)

exists = db.session.query(query.exists()).scalar()

# check for datasets that are only used by filters
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this logic makes sense. The one thing I struggle with regarding this method (for both the existing and proposed logic) is how is agnostic of the specific dashboard in question and thus iterates over all dashboards said user has access to. This raises two questions i) correctness, and ii) efficiency.

Currently I can't formulate a situation where (i) is a problem, however for (ii) this method seems highly inefficient, e.g. we loop over all the dashboards a user has access to in relation to said dataset/datasource, whereas in actuality we likely know the context a priori.

Note in #24789 this method is slated for removal, but the addition of the integration test will ensure that the logic will be preserved.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@john-bodley I totally agree that this approach is far from ideal from a performance point of view. I think with the bigger changes that are going to be discussed, we can reformulate this process and make sure this validation happens with a target dashboard ID in mind. I saw in #24804 that you have updated the raise_for_access function so it can also receive a dashboard, so I think it would be easier to implement this improvement once those changes (and other decisions made in regards to expected behavior) are made.

Personally this is my second contribution to Superset so I would rather avoid doing bigger estructural changes until I get more familiar with the code as a whole.

@eschutho
Copy link
Member

@john-bodley thanks for the feedback! We'll wait for your approval before merging.

@john-bodley john-bodley merged commit 7f9b038 into apache:master Jul 31, 2023
29 checks passed
@michael-s-molina michael-s-molina added the v3.0 Label added by the release manager to track PRs to be included in the 3.0 branch label Aug 1, 2023
sadpandajoe pushed a commit to preset-io/superset that referenced this pull request Aug 1, 2023
@sadpandajoe
Copy link
Contributor

🏷️ preset:2023.31

@mistercrunch mistercrunch added 🍒 3.0.4 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 3.1.0 labels Mar 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/M v3.0 Label added by the release manager to track PRs to be included in the 3.0 branch 🍒 3.0.0 🍒 3.0.1 🍒 3.0.2 🍒 3.0.3 🍒 3.0.4 🚢 3.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Guest user (embedded) doesn't get access to datasets used only in dashboard filters
6 participants