Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use AccessControl#filterColumns to filter columns from table schema #8752

Closed
wants to merge 1 commit into from

Conversation

youngchen7
Copy link
Contributor

Alternative approach to #7893. Instead of changing the AccessControl interface, this PR reuses filterColumns to achieve the same behavior. This enforces consistency between what's visible in the metadata and what can be queried. Possible disadvantages and a topic for further discussion is ensuring that we support all existing use cases, as well as whether those use cases are reasonable. For example:

  • Only filtering metadata and not filtering actual query results (is this useful?)
  • Filtering metadata but throwing AccessDenied on the data instead of hiding columns
  • Filtering the actual query results but showing full metadata (is this useful?)

In general, there's a multiplex of possibilities between hiding, throwing, and showing metadata and data. We should confirm which ones are useful and necessary for us to support.

@cla-bot cla-bot bot added the cla-signed label Aug 2, 2021
@youngchen7
Copy link
Contributor Author

@findepi @kokosing - here's an alternative approach to #7893 - let me know your thoughts.

@kokosing
Copy link
Member

kokosing commented Aug 3, 2021

This would be misleading to users to return object does not exists when it exists but when access is denied. On the other hand there are users that may want the opposite.

This should be behind feature toggle that is turned off by default.

@kokosing
Copy link
Member

kokosing commented Aug 3, 2021

I agree it is much less complex.

@findepi
Copy link
Member

findepi commented Aug 3, 2021

This would be misleading to users to return object does not exists when it exists but when access is denied.

I agree, this can be misleading, but it's a fair price.

Also, some users consider revealing object existence in case of permission denied a security issue.

.map(ColumnSchema::getName)
.collect(toImmutableSet()));
List<ColumnSchema> accessibleColumns = tableSchema.getColumns().stream()
.filter(column -> accessibleColumnNames.contains(column.getName()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This behavior is non standard (non-specwise) AFAIK, so should be an opt-in behavior, guarded with a toggle.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where should the toggle knob be located, in your opinion? I imagine it could be in the SystemAccessControl implementation itself if we pass a flag when calling to tell the AccessControl whether this call is for SELECT or for metadata.

Or, should there be another method in the SystemAccessControl interface like
boolean shouldFilterColumns(TableName name) that we check first? This would give the AccessControl implementations the ability to behave differently for different tables.

I suppose the more fine grained control we give to the AccessControl implementations, the closer this becomes to the implementation in #7893 where we expose two methods and let the SystemAccessControl decide entirely when it wants to hide/throw on inaccessible columns.

I think if we really want to maintain a consistent behavior here, we should also take a look at checkCanSelectFromColumns, and maybe enforce that if a column is supposed to be inaccessible then the toggle only controls whether the behavior is to throw or hide.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if toggle belongs to access control. To me it belongs to the engine as it we change the semantic of SQL. Authorization works same way, but its result is interpreted differently.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm - is there any value in allowing access control implementations to decide which tables should throw vs which tables should hide? If not, then toggling in the engine makes sense to me.

@findepi what do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any value in making this behave differently for a subset of tables.

@youngchen7
Copy link
Contributor Author

Fixes #7461

@colebow
Copy link
Member

colebow commented Oct 27, 2022

👋 @youngchen7 - this PR has become inactive. If you're still interested in working on it, please let us know, and we can try to get reviewers to help with that.

We're working on closing out old and inactive PRs, so if you're too busy or this has too many merge conflicts to be worth picking back up, we'll be making another pass to close it out in a few weeks.

@colebow
Copy link
Member

colebow commented Nov 30, 2022

Closing this one out due to inactivity, but please reopen if you would like to pick this back up.

@colebow colebow closed this Nov 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

4 participants