Allow for Input source security in native task layer #14003
zachjsh merged 23 commits into apache:master
...ava/org/apache/druid/indexing/common/task/batch/parallel/PartialHashSegmentGenerateTask.java
...va/org/apache/druid/indexing/common/task/batch/parallel/PartialRangeSegmentGenerateTask.java
...e/src/main/java/org/apache/druid/indexing/common/task/batch/parallel/SinglePhaseSubTask.java
indexing-service/src/main/java/org/apache/druid/indexing/overlord/http/OverlordResource.java
...main/java/org/apache/druid/indexing/common/task/batch/parallel/LegacySinglePhaseSubTask.java
...n/java/org/apache/druid/indexing/common/task/batch/parallel/ParallelIndexSupervisorTask.java
...va/org/apache/druid/indexing/common/task/batch/parallel/PartialDimensionCardinalityTask.java
...a/org/apache/druid/indexing/common/task/batch/parallel/PartialDimensionDistributionTask.java
...ore/kafka-indexing-service/src/main/java/org/apache/druid/indexing/kafka/KafkaIndexTask.java
...src/main/java/org/apache/druid/indexing/common/task/AppenderatorDriverRealtimeIndexTask.java
@Override
public Set<String> getInputSourceTypes()
{
  return getIngestionSchema().getIOConfig().getInputSource() != null ?
This code appears again and again. Can it be promoted to a base class?
These task definitions are a little brittle; I think this would be a little risky to do at this time.
indexing-service/src/main/java/org/apache/druid/indexing/overlord/http/OverlordResource.java
default Set<String> getTypes()
{
  return null;
}
Would be good to add a comment explaining why we return a set, not a single key. Also, see the note above regarding null vs. an empty set.
It's better here to return a Set<Resource> rather than Set<String>. It makes it clearer that this is used for security purposes, since Resource is a security-specific thing.
ResourceAction isn't available to InputSources at the moment. I can add a dependency. Is it OK to do this? I'm not sure whether it will create a dependency cycle.
I tried this and unfortunately it added a cyclic dependency.
processing/src/main/java/org/apache/druid/data/input/impl/CombiningInputSource.java
public Set<String> getTypes()
{
  return Collections.singleton(TYPE_KEY);
}
Move to the base class since all input sources supported thus far have only one type. (The catalog and table function stuff depend on this fact.)
default boolean usesFirehose() {
  return false;
}
Perhaps add comments to explain these methods, especially the usesFirehose() method. I gather that firehose doesn't fit into the input security model? Why not? An explanation will help future readers.
gianm left a comment:
Just looked at the interfaces for this particular review. (InputSource, Task, and SupervisorSpec)
 * input sources but not others, using the
 * {@link org.apache.druid.server.security.AuthConfig#enableInputSourceSecurity} config.
 */
default Set<String> getInputSourceTypes() {
Similar to the comment on InputSource: it's better for this to be Set<Resource>, so it's clear it's security-related. Method name would be getInputSourceResources() in that case.
Good suggestion, fixed.
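The change agreed on in this thread, returning security resources instead of raw type strings, might look roughly like the sketch below. It is an illustration only, not the PR's code: `Resource`, `ResourceAction`, and `Action` are simplified stand-ins for Druid's security classes, the `EXTERNAL` constant stands in for `ResourceType.EXTERNAL`, and `toInputSourceResources` is a hypothetical helper name.

```java
import java.util.Set;
import java.util.stream.Collectors;

public class InputSourceResourcesSketch
{
  // Simplified stand-ins for Druid's security classes (assumptions, not the real API).
  enum Action { READ, WRITE }

  record Resource(String type, String name) {}

  record ResourceAction(Resource resource, Action action) {}

  // Stand-in for ResourceType.EXTERNAL.
  static final String EXTERNAL = "EXTERNAL";

  // Hypothetical helper: maps each input source type (for example "s3" or "http")
  // to a READ action on an EXTERNAL resource named after that type.
  static Set<ResourceAction> toInputSourceResources(Set<String> inputSourceTypes)
  {
    return inputSourceTypes.stream()
        .map(type -> new ResourceAction(new Resource(EXTERNAL, type), Action.READ))
        .collect(Collectors.toSet());
  }

  public static void main(String[] args)
  {
    // Prints the resource actions a task reading from s3 would need.
    System.out.println(toInputSourceResources(Set.of("s3")));
  }
}
```

Returning resource actions rather than strings keeps the security intent visible in the method signature, which is the point the reviewer raised.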
 * {@link org.apache.druid.server.security.AuthConfig#enableInputSourceSecurity} config is
 * enabled, then tasks that use firehose cannot be used.
 */
default boolean usesFirehose() {
We should change this to what callers care about. The caller isn't interested in whether a task uses firehoses: it's interested in whether the getInputSourceTypes method can be used for authorization. (Consider a case where a task doesn't use firehoses, but still also doesn't support input source authorization.)
The default is also problematic: security stuff must always fail-secure. Doing return false here fails insecure: it means that a task that doesn't implement any of this stuff would be allowed. So we should flip that. Putting these together: I'd consider changing this to boolean canUseInputSourceTypeAuthorization() and having the default be return false.
Or, another option would be eliminating this method, and having getInputSourceTypes() (or getInputSourceResources()) throw an exception if the task doesn't support input source authorization. The default implementation would need to throw that exception.
(Same comment for Task, btw.)
Good suggestion, fixed.
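The second option from the comment above (a default implementation that throws, so tasks that have not opted in are rejected) can be sketched as follows. This is a minimal sketch under stated assumptions, not Druid's implementation: the `Task` interface here is a hypothetical stand-in, resources are reduced to plain type strings, `submit` is an invented helper, and the 403 for a missing permission is an assumption. Only the 400 for unsupported tasks comes from the PR description.

```java
import java.util.Set;

public class FailSecureSketch
{
  // Hypothetical stand-in for a task interface. The default refuses rather than
  // allows, so a task that implements nothing cannot slip past authorization.
  interface Task
  {
    default Set<String> getInputSourceResources()
    {
      throw new UnsupportedOperationException("Input source authorization is not supported by this task");
    }
  }

  // A task that has not opted in: it inherits the throwing default.
  static class LegacyTask implements Task {}

  // A task that opts in by overriding the method.
  static class S3Task implements Task
  {
    @Override
    public Set<String> getInputSourceResources()
    {
      return Set.of("s3");
    }
  }

  // Hypothetical submission check that returns an HTTP-style status code.
  static int submit(Task task, boolean inputSourceSecurityEnabled, Set<String> permittedTypes)
  {
    if (!inputSourceSecurityEnabled) {
      return 200; // existing behavior; the datasource write check is not modeled here
    }
    final Set<String> required;
    try {
      required = task.getInputSourceResources();
    }
    catch (UnsupportedOperationException e) {
      return 400; // the task cannot be authorized, so it is rejected
    }
    // Assumption: a missing permission maps to 403.
    return permittedTypes.containsAll(required) ? 200 : 403;
  }

  public static void main(String[] args)
  {
    System.out.println(submit(new LegacyTask(), true, Set.of("s3")));
    System.out.println(submit(new S3Task(), true, Set.of("s3")));
  }
}
```

With the throwing default, forgetting to implement the method results in a rejected task rather than a silently allowed one, which is the fail-secure property the reviewer asked for.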
indexing-service/src/main/java/org/apache/druid/indexing/overlord/http/OverlordResource.java
new DataSchema(
    "foo", null, new AggregatorFactory[0], new UniformGranularitySpec(
        Granularities.DAY,
        null,
        ImmutableList.of(Intervals.of("2010-01-01/P1D"))
    ),
    null,
    jsonMapper
), new HadoopIOConfig(ImmutableMap.of("paths", "bar"), null, null), null
Check notice (Code scanning / CodeQL): Deprecated method or constructor invocation
indexing-service/src/test/java/org/apache/druid/indexing/common/task/HadoopIndexTaskTest.java
@Override
public Set<ResourceAction> getInputSourceResources()
{
  if (getIngestionSchema().getIOConfig().getFirehoseFactory() != null) {
Check notice (Code scanning / CodeQL): Deprecated method or constructor invocation
Fixes #13837.
Description
This change allows for input source type security in the native task layer.
To enable this feature, the user must set the following property to true:
druid.auth.enableInputSourceSecurity=true
The default value for this property is false, which preserves the existing behavior of requiring only write authorization on the respective datasource.
When this config is enabled, users must be authorized for the following resource action, in addition to write permission on the respective datasource:
new ResourceAction(new Resource(ResourceType.EXTERNAL, {INPUT_SOURCE_TYPE}), Action.READ)
where {INPUT_SOURCE_TYPE} is the type of the input source being used: http, inline, s3, etc.
Only tasks that provide a non-default implementation of the getInputSourceResources method can be submitted when the config druid.auth.enableInputSourceSecurity=true is set. Otherwise, a 400 error is returned.
This PR has: