fix: Address regression introduced in #24789 #25008
superset/security/manager.py
form_data
and (dashboard_id := form_data.get("dashboardId"))
and (dashboard := DashboardDAO.find_by_id(dashboard_id))
and any(slc.datasource == datasource for slc in dashboard.slices)
and self.can_access_dashboard(dashboard)
Another thing that occurred to me recently: shouldn't we only be doing this check if dashboard RBAC is enabled (or if the user is a guest user for embedded purposes)?
For example, before #24789 there was the following logic:
should_check_dashboard_access = (
    feature_flag_manager.is_feature_enabled("DASHBOARD_RBAC")
    or self.is_guest_user()
)
if not (
    self.can_access_schema(datasource)
    or self.can_access("datasource_access", datasource.perm or "")
    or self.is_owner(datasource)
    or (
        should_check_dashboard_access
        and self.can_access_based_on_dashboard(datasource)
But now we're checking whether the user can access the dataset via the dashboard no matter what.
@jfrag1 I updated the PR description.
My concern is that we're applying security rules intended for DASHBOARD_RBAC even when the flag isn't enabled. With the flag disabled, a user has access to any published dashboard containing a chart they have access to. With this logic, the user would then automatically get access to all other charts/datasets on the dashboard (albeit only with the dashboard context included), which I don't think should happen.
I think we need to bring back should_check_dashboard_access
@jfrag1 the self.can_access_dashboard(dashboard) check is still being invoked and will always raise if a user doesn't have access to the dashboard, regardless of what the context is.
The other formulation I can think of is to use the referrer to identify the dashboard in question, however I guess per here one can also spoof the referrer.
I understand that; here's a more concrete example to illustrate what I'm trying to get at:
DASHBOARD_RBAC is disabled. There's a published dashboard with 2 charts, powered by datasets A and B. A user has access to dataset A, but not dataset B. The user has access to the dashboard because they have access to dataset A. The question I'd then like to pose is: what is the expected behavior when the user goes to view the dashboard?
My understanding is that the expected behavior is that the user can view the chart powered by dataset A, but the chart powered by dataset B shows an error because they don't have access to the dataset.
However, after #24789 removed the should_check_dashboard_access check, the user would be able to view both charts on the dashboard. The user's access to dataset A grants them access to the dashboard, which then grants them access to all other charts on the dashboard, which I'd argue is another regression from #24789.
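The two-dataset scenario above can be reproduced with a toy model. This is only a sketch: `User`, `Dashboard`, `can_access_dashboard`, and `can_access_dataset` are hypothetical stand-ins, not Superset's actual classes or methods, and the `check_dashboard` argument plays the role of the removed should_check_dashboard_access gate.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Dashboard:
    # Hypothetical stand-in: names of the datasets powering the dashboard's charts.
    datasets: list[str]
    published: bool = True


@dataclass
class User:
    # Hypothetical stand-in: datasets the user holds "datasource access" on.
    dataset_perms: set[str] = field(default_factory=set)


def can_access_dashboard(user: User, dashboard: Dashboard) -> bool:
    # With DASHBOARD_RBAC disabled, access to any one dataset on a published
    # dashboard is enough to access the dashboard itself.
    return dashboard.published and any(
        name in user.dataset_perms for name in dashboard.datasets
    )


def can_access_dataset(
    user: User, dataset: str, dashboard: Dashboard, check_dashboard: bool
) -> bool:
    # Direct dataset permission always suffices.
    if dataset in user.dataset_perms:
        return True
    # The dashboard path is only consulted when the gate is open.
    return (
        check_dashboard
        and dataset in dashboard.datasets
        and can_access_dashboard(user, dashboard)
    )


user = User(dataset_perms={"A"})
dashboard = Dashboard(datasets=["A", "B"])

# Gate closed (flag off, not a guest user): dataset B stays inaccessible.
print(can_access_dataset(user, "B", dashboard, check_dashboard=False))  # False
# Gate always open: access to A leaks into access to B via the dashboard.
print(can_access_dataset(user, "B", dashboard, check_dashboard=True))  # True
```

In this model the only thing separating the two outcomes is whether the dashboard path is consulted at all, which is the point of the example.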
@jfrag1 I think the problem we're all struggling with is what the expected behavior should be. Superset's security manager was already complex and difficult to grok—which doesn't bode well from an understanding perspective—which we made worse with the introduction of the dashboard RBAC feature and embedded dashboards (which potentially have access patterns which conflict with the core manager). As the rules become more complex and/or conflicting we end up playing the game of Whac-A-Mole—where closing one attack vector likely opens another.
Fundamentally I think this (per your problem statement) comes down to "context". Let's say your two charts are A' and B', which are powered by datasets A and B respectively. With dashboard RBAC enabled the user can view chart A' in the "context" of the dashboard, but not in standalone/explorer mode. The challenge is that the request is coming from the ChartDataCommand command, which is dashboard agnostic.
I hear the point you're making. There's an issue where access to a dashboard is circumventing the need to check whether the user has access to the datasource. So in addition to checking whether the user can access the dashboard, we also need to check (whether the check is at the datasource, query-context, or viz level) that the dashboard access check is specific to RBAC, i.e.,
and self.can_access_dashboard(dashboard)
should likely be,
and self.can_access_dashboard(dashboard)
and is_feature_enabled("DASHBOARD_RBAC")
and dashboard.roles
Again we're adding complex logic to try to model a complex security manager which likely exposes more corner cases/issues.
There are a few scenarios we need to enumerate and document; this is what makes the most sense to me.
- DASHBOARD_RBAC disabled: Superset should behave the same way it did before this feature was introduced, and permissions are applied at the dataset level via "datasource access on x"
- DASHBOARD_RBAC enabled:
  - The dashboard has roles: if the user has RBAC access via their role, all charts/filters/etc. in the dashboard should render and the user has access to all of them.
  - The dashboard doesn't have any attached roles: the access perms should work the same way they do when the feature is disabled: charts only render if the user has dataset access, and users see the dashboard in their list if they have the "can read Dashboard" permission
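The scenarios above can be written down as a small decision function. This is a sketch only: `can_render_chart` and its parameters are hypothetical names, not part of Superset, and the first branch assumes that ordinary dataset-level access keeps working in every scenario, which the list itself does not state for the "dashboard has roles" case.

```python
from __future__ import annotations


def can_render_chart(
    rbac_enabled: bool,
    dashboard_roles: set[str],
    user_roles: set[str],
    has_dataset_access: bool,
) -> bool:
    """Decide whether a chart on a dashboard should render for a user."""
    # Assumption: dataset-level permissions ("datasource access on x")
    # grant access regardless of the feature flag or dashboard roles.
    if has_dataset_access:
        return True
    # Flag disabled, or flag enabled but no roles attached to the dashboard:
    # fall back to dataset-level permissions only, which were already checked.
    if not rbac_enabled or not dashboard_roles:
        return False
    # Flag enabled and the dashboard has roles: any shared role grants access
    # to everything on the dashboard.
    return bool(dashboard_roles & user_roles)


# Flag disabled: roles are ignored, so no dataset access means no chart.
print(can_render_chart(False, {"analyst"}, {"analyst"}, False))  # False
# Flag enabled, dashboard has roles, user holds one of them.
print(can_render_chart(True, {"analyst"}, {"analyst"}, False))  # True
# Flag enabled, dashboard has no roles: behaves as if the flag were disabled.
print(can_render_chart(True, set(), {"analyst"}, False))  # False
```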
@nytai I think that makes a lot of sense! Just a quick question: if DASHBOARD_RBAC is enabled and the dashboard has roles, the regular/default access validation should still work, right? For example:
- If the FF is enabled and the dashboard has roles, both someone with role-based access and a user with regular access (granted via datasource access on x/schema access on x) should work.
Another thing to consider is whether we still want to handle guest users (embedded) and DASHBOARD_RBAC access similarly. Guest users can't access explore; they are only granted access to dashboards during token generation, and should be able to access the dashboard entirely (including filters, etc.). On the other hand, DASHBOARD_RBAC users are authenticated users that can access explore and other parts of the app, so here the context is really important.
superset/security/manager.py
@@ -1858,9 +1858,13 @@ def raise_for_access(
or self.can_access("datasource_access", datasource.perm or "")
or self.is_owner(datasource)
or (
# Check whether the datasource is associated with a dashboard in a
# trustworthy manner, i.e., we need to validate that the specified
# dashboard is actually associated with said datasource.
I don't think I understand this logic completely, but if dashboardId can be spoofed in the form, what's preventing users from reading any datasource?
There's validation in the lines below that:
a. The user has access to the dashboard, and
b. The datasource being requested is present on the dashboard
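A minimal sketch of checks (a) and (b), showing why a spoofed dashboardId alone is not enough. The `Slice` and `Dashboard` classes, the `dashboards` lookup table, and the `accessible_ids` set are hypothetical stand-ins for Superset's models, DAO lookup, and can_access_dashboard check.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Slice:
    # Hypothetical stand-in for a chart: only its datasource matters here.
    datasource: str


@dataclass(frozen=True)
class Dashboard:
    id: int
    slices: tuple[Slice, ...]


def dashboard_grants_datasource(
    form_data: dict | None,
    datasource: str,
    dashboards: dict[int, Dashboard],
    accessible_ids: set[int],
) -> bool:
    # The dashboardId comes from client-supplied form data, so it is untrusted.
    dashboard_id = (form_data or {}).get("dashboardId")
    dashboard = dashboards.get(dashboard_id)
    return (
        dashboard is not None
        # (a) The user has access to the dashboard.
        and dashboard.id in accessible_ids
        # (b) The requested datasource is actually present on the dashboard.
        and any(slc.datasource == datasource for slc in dashboard.slices)
    )


dashboards = {
    1: Dashboard(id=1, slices=(Slice("sales"),)),
    2: Dashboard(id=2, slices=(Slice("payroll"),)),
}
accessible_ids = {1}  # the user can only access dashboard 1

# Legitimate request: the datasource is on a dashboard the user can access.
print(dashboard_grants_datasource({"dashboardId": 1}, "sales", dashboards, accessible_ids))
# Spoofed dashboardId: an accessible dashboard that does not contain the datasource.
print(dashboard_grants_datasource({"dashboardId": 1}, "payroll", dashboards, accessible_ids))
# Correct dashboard for the datasource, but the user cannot access it.
print(dashboard_grants_datasource({"dashboardId": 2}, "payroll", dashboards, accessible_ids))
```

Only the first call passes both checks; the other two each fail one of them.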
@betodealmeida and @jfrag1 the caller for this check is coming from here, which mentions,
where access is checked here. The challenge here is given that the query context is constructed in the client then isn't any |
f2e6fb7
to
c3e19d2
Compare
c3e19d2
to
84ffd94
Compare
and (dashboard := DashboardDAO.find_by_id(dashboard_id))
and self.can_access_dashboard(dashboard)
and (
dashboard_ := self.get_session.query(Dashboard)
The DashboardDAO and ChartDAO have additional filtering and/or security checks when fetching objects, and thus I had to revert to using the raw SQLAlchemy ORM for fetching the dashboard and chart associated with the specified IDs.
Thanks (drum roll please) to @betodealmeida, @eschutho, @jfrag1, @michael-s-molina, @nytai, and @villebro for jumping on the call earlier to discuss an interim solution to this regression which should (🤞) allow us to roll forward. I've updated the PR to extend the previous logic (stemming from @nytai's suggestion) I had drafted to ensure that:
Note this PR does not address granting access for native filters when the
84ffd94
to
c465fc5
Compare
This updated logic looks good to me - I think there's just one more loose end other than native filters, which is ensuring embedded guest users are granted access to datasources (for charts and native filters) with the proper dashboard context. I'm happy to work on this as a follow-up once this PR is merged
SUMMARY
In #24996, @jfrag1 identified an issue with #24789 as there were no guarantees that form-data provided was backed by a trusted source.
Per #24996 (comment), I looked into whether the access request could be more context aware, though regrettably the caller is of type QueryContext, which is dashboard agnostic. The proposed fix, per @nytai's suggestion, was to add an additional check to verify that the dashboard chart (per the dashboardId and slice_id form-data fields) is associated with the datasource in question.
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
TESTING INSTRUCTIONS
Added unit tests.
ADDITIONAL INFORMATION