Skip to content

Comments

Allow for Input source security in native task layer#14003

Merged
zachjsh merged 23 commits intoapache:masterfrom
zachjsh:input-source-security-native
Apr 6, 2023
Merged

Allow for Input source security in native task layer#14003
zachjsh merged 23 commits intoapache:masterfrom
zachjsh:input-source-security-native

Conversation

@zachjsh
Copy link
Contributor

@zachjsh zachjsh commented Mar 30, 2023

Fixes #13837.

Description

This change allows for input source type security in the native task layer.

To enable this feature, the user must set the following property to true:

druid.auth.enableInputSourceSecurity=true

The default value for this property is false, which will continue the existing functionality of needing authorization to write to the respective datasource.

When this config is enabled, the users will be required to be authorized for the following resource action, in addition to write permission on the respective datasource.

new ResourceAction(new Resource(ResourceType.EXTERNAL, {INPUT_SOURCE_TYPE}, Action.READ

where {INPUT_SOURCE_TYPE} is the type of the input source being used;, http, inline, s3, etc..

Only tasks that provide a non-default implementation of the getInputSourceResources method can be submitted when config druid.auth.enableInputSourceSecurity=true is set. Otherwise, a 400 error will be thrown.

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@zachjsh zachjsh requested a review from paul-rogers March 30, 2023 20:29
@Override
public Set<String> getInputSourceTypes()
{
return getIngestionSchema().getIOConfig().getInputSource() != null ?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code appears again and again. Can it be promoted to a base class?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These task definitions are a little brittle, think this would be a little risky to do at this time.

default Set<String> getTypes()
{
return null;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to add a comment explaining why we return a set, not a single key. Also, see the note above regarding null vs. and empty set.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better here to return a Set<Resource> rather than Set<String>. It makes it clearer that this is used for security purposes, since Resource is a security-specific thing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ResourceAction isn't available to InputSources at the moment. I can add a dependency. Ok to do this? Not sure if it will create a dependency cycle

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this and unfortunately it added a cyclic dependency

public Set<String> getTypes()
{
return Collections.singleton(TYPE_KEY);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move to the base class since all input sources supported thus far have only one type. (The catalog and table function stuff depend on this fact.)


default boolean usesFirehose() {
return false;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps add comments to explain these method. Especially the usesFirehose() method: I gather that firehose doesn't fit into the input security model? Why not? An explanation will help future readers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added comments

@zachjsh zachjsh marked this pull request as ready for review April 4, 2023 16:48
@zachjsh zachjsh requested a review from paul-rogers April 4, 2023 19:51
Copy link
Contributor

@gianm gianm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just looked at the interfaces for this particular review. (InputSource, Task, and SupervisorSpec)

default Set<String> getTypes()
{
return null;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better here to return a Set<Resource> rather than Set<String>. It makes it clearer that this is used for security purposes, since Resource is a security-specific thing.

* input sources but not others, using the
* {@link org.apache.druid.server.security.AuthConfig#enableInputSourceSecurity} config.
*/
default Set<String> getInputSourceTypes() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the comment on InputSource: it's better for this to be Set<Resource>, so it's clear it's security-related. Method name would be getInputSourceResources() in that case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion, fixed.

* {@link org.apache.druid.server.security.AuthConfig#enableInputSourceSecurity} config is
* enabled, then tasks that use firehose cannot be used.
*/
default boolean usesFirehose() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should change this to what callers care about. The caller isn't interested in whether a task uses firehoses: it's interested in whether the getInputSourceTypes method can be used for authorization. (Consider a case where a task doesn't use firehoses, but still also doesn't support input source authorization.)

The default is also problematic: security stuff must always fail-secure. Doing return false here fails insecure: it means that a task that doesn't implement any of this stuff would be allowed. So we should flip that. Putting these together: I'd consider changing this to boolean canUseInputSourceTypeAuthorization() and having the default be return false.

Or, another option would be eliminating this method, and having getInputSourceTypes() (or getInputSourceResources()) throw an exception if the task doesn't support input source authorization. The default implementation would need to throw that exception.

(Same comment for Task, btw.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion, fixed.

Comment on lines 48 to 56
new DataSchema(
"foo", null, new AggregatorFactory[0], new UniformGranularitySpec(
Granularities.DAY,
null,
ImmutableList.of(Intervals.of("2010-01-01/P1D"))
),
null,
jsonMapper
), new HadoopIOConfig(ImmutableMap.of("paths", "bar"), null, null), null

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation

Invoking [DataSchema.DataSchema](1) should be avoided because it has been deprecated.
@zachjsh zachjsh requested a review from gianm April 4, 2023 23:53
@Override
public Set<ResourceAction> getInputSourceResources()
{
if (getIngestionSchema().getIOConfig().getFirehoseFactory() != null) {

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation

Invoking [IndexIOConfig.getFirehoseFactory](1) should be avoided because it has been deprecated.
@Override
public Set<ResourceAction> getInputSourceResources()
{
if (getIngestionSchema().getIOConfig().getFirehoseFactory() != null) {

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation

Invoking [IndexIOConfig.getFirehoseFactory](1) should be avoided because it has been deprecated.
@Override
public Set<ResourceAction> getInputSourceResources()
{
if (getIngestionSchema().getIOConfig().getFirehoseFactory() != null) {

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation

Invoking [IndexIOConfig.getFirehoseFactory](1) should be avoided because it has been deprecated.
@Override
public Set<ResourceAction> getInputSourceResources()
{
if (getIngestionSchema().getIOConfig().getFirehoseFactory() != null) {

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation

Invoking [IndexIOConfig.getFirehoseFactory](1) should be avoided because it has been deprecated.
@Override
public Set<ResourceAction> getInputSourceResources()
{
if (getIngestionSchema().getIOConfig().getFirehoseFactory() != null) {

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation

Invoking [IndexIOConfig.getFirehoseFactory](1) should be avoided because it has been deprecated.
@Override
public Set<ResourceAction> getInputSourceResources()
{
if (getIngestionSchema().getIOConfig().getFirehoseFactory() != null) {

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation

Invoking [IndexIOConfig.getFirehoseFactory](1) should be avoided because it has been deprecated.
@Override
public Set<ResourceAction> getInputSourceResources()
{
if (getIngestionSchema().getIOConfig().getFirehoseFactory() != null) {

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation

Invoking [IndexIOConfig.getFirehoseFactory](1) should be avoided because it has been deprecated.
Copy link
Contributor

@paul-rogers paul-rogers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@zachjsh zachjsh merged commit 5c02213 into apache:master Apr 6, 2023
@zachjsh zachjsh deleted the input-source-security-native branch April 6, 2023 17:13
@clintropolis clintropolis added this to the 26.0 milestone Apr 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Input source security model for MSQ table functions and more

4 participants