-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enabling datasource level authorization in Druid #2424
Conversation
There is an interesting mix of standardized and non standardized auth methods here. On one hand the authorization tries to provide a standard resource related framework, but on the other hand it completely relies on the endpoint to do all the auth requesting and logic. I suspect this is because the resource of interest is contained within the body rather than as part of the http request metadata. IMHO a "cleaner" solution would be to architect the requests such that any endpoint that touches a sensitive resource must have the appropriate annotations on what resources it touches. Then the auth layer transparently handles the auth based on request metadata, and the duty within the endpoint itself is simply to wire up the authorized resource to the resource actually being used. Another way to say it is that it would be awesome if by the time the method is called, the auth system has already performed its function, and it is simply the job of the endpoint method to ensure compliance with the auth system's expectations on resource usage. |
Yes the auth checks are performed inside endpoints as many of the Druid endpoints does not follow REST conventions and resource information is inside the body. |
@drcrallen your suggestion is valid, however something that can be done independently of this PR and would probably require druid client API changes. |
@@ -0,0 +1,7 @@ | |||
package io.druid.server.security; | |||
|
|||
public enum AllowAccess |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we just call it Access ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
@@ -0,0 +1,6 @@ | |||
package io.druid.server.security; | |||
|
|||
public enum ResourceType |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why does this need to be enum
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@drcrallen i think, the set of valid resource types would be fixed by druid-core because druid-core calls in to the action and provides those as arguments, so it makes sense to use enum to represent resource types.
in any case, since we are calling the api experimental and potentially changeable in near future. I wouldn't worry too much about it either ways.
@himanshug / @pjain1 would you guys feel comfortable calling auth an experimental feature that may change significantly in the near-ish future? If that's the case then a stop-gap that solves your main pain points should be ok as long as its not intrusive in other scenarios (which I don't think this PR is) |
General PR comments:
|
@drcrallen I'm fine calling it experimental. |
* @param action action to be performed on the resource | ||
* @return {@link AllowAccess#ALLOW} if authorized otherwise {@link AllowAccess#DENY} | ||
* */ | ||
AllowAccess isAuthorized(Resource resource, Action action); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not just return boolean
?
@pjain1 please comment in the code where you're adding the auth checks that auth is experimental and reference this PR. ((so future developers know why it is there)) |
@drcrallen sure I will put the necessary comments. I will put the headers. |
@pjain1 good points. My reasons for not favoring enums are:
So basically, if you're trying to force any extensions to ONLY use the values presented, and not intending the security set to be extendable, then there can be a good argument for enums. |
@himanshug I think from #2424 (comment) you're supporting B) from my list above, where you are purposefully limiting the options available. Is that correct? |
@drcrallen yes |
@pjain1 merge conflicts |
@fjy I am still working on it, I will resolve the conflicts when it is reviewable |
@@ -278,6 +279,7 @@ public static Injector makeInjectorWithModules(final Injector baseInjector, Iter | |||
{ | |||
final ModuleList defaultModules = new ModuleList(baseInjector); | |||
defaultModules.addModules( | |||
new DruidAuthModule(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after log4j shutter downer module
@drcrallen @fjy @himanshug the PR is reviewable. Updated the top level comment to reflect the state of latest changes. @drcrallen using annotation based resource filtering for performing access checks. |
@@ -372,7 +454,36 @@ public Response getRunningTasks() | |||
@Override | |||
public Collection<? extends TaskRunnerWorkItem> apply(TaskRunner taskRunner) | |||
{ | |||
return taskRunner.getRunningTasks(); | |||
if (authConfig.isEnabled()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this and the change in getPendingTasks() are same, can you refactor them into a private helper method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@pjain1 can we finish this up? there's merge conflicts |
private void getDataSourcesHelper(final Query query, List<String> datasources) { | ||
if (query.getDataSource() instanceof TableDataSource) { | ||
// there will only be one datasource for TableDataSource | ||
datasources.addAll(query.getDataSource().getNames()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which types break using only this call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the getNames call on the DataSource impls here, it looks like you should be able to just use getNames
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, dataSource.getNames() already returns appropriate list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
check now
@drcrallen I think I addressed your comments, can you have a look again ? |
@@ -0,0 +1,115 @@ | |||
package io.druid.indexing.overlord.http.security; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
header
@pjain1 Please check out https://github.com/pjain1/druid/pull/1 and see if that works for you |
@drcrallen fixed UTs. have a look now |
@Override | ||
public void configure(Binder binder) | ||
{ | ||
JsonConfigProvider.bind(binder, "druid.auth", AuthConfig.class); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(optional) would this be more appropriate as druid.request.auth
or something else a little more descriptive?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if request
should be included in the property name as the scope is much more than just requests to Druid. It can be called druid.security
or something like that or can be kept as it is..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
keep as is then.
few minor comments but looking very good overall. |
authorizationInfo = EasyMock.createStrictMock(AuthorizationInfo.class); | ||
|
||
// Memory barrier | ||
synchronized (this) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@drcrallen btw can you please explain what it this for ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought I was having trouble with threading, but it might not be right now.
Sometimes if a mocked object is accessed by multiple threads, the threads see an inconsistent rule set if the expectations are set in a different thread than the object is used. Memory barriers make the memory consistent at least to the point you pass the barrier.
You can tell you are hitting this kind of mocking race condition if you get an error like "expected 1 actual 1"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
@pjain1 can you comment in the master comment why you opted for this route instead of going through |
@drcrallen updated the master comment with the info you asked for. Please, see if it makes sense. |
@pjain1 yes thanks! |
Please ping me when #2424 (comment) is resolved and I think this should be ready to go |
@drcrallen done |
Cool 👍 |
@drcrallen squashed the commits |
- Introduce `AuthorizationInfo` interface, specific implementations of which would be provided by extensions - If the `druid.auth.enabled` is set to `true` then the `isAuthorized` method of `AuthorizationInfo` will be called to perform authorization checks - `AuthorizationInfo` object will be created in the servlet filters of specific extension and will be passed as a request attribute with attribute name as `AuthConfig.DRUID_AUTH_TOKEN` - As per the scope of this PR, all resources that needs to be secured are divided into 3 types - `DATASOURCE`, `CONFIG` and `STATE`. For any type of resource, possible actions are - `READ` or `WRITE` - Specific ResourceFilters are used to perform auth checks for all endpoints that corresponds to a specific resource type. This prevents duplication of logic and need to inject HttpServletRequest inside each endpoint. For example - `DatasourceResourceFilter` is used for endpoints where the datasource information is present after "datasources" segment in the request Path such as `/druid/coordinator/v1/datasources/`, `/druid/coordinator/v1/metadata/datasources/`, `/druid/v2/datasources/` - `RulesResourceFilter` is used where the datasource information is present after "rules" segment in the request Path such as `/druid/coordinator/v1/rules/` - `TaskResourceFilter` is used for endpoints is used where the datasource information is present after "task" segment in the request Path such as `druid/indexer/v1/task` - `ConfigResourceFilter` is used for endpoints like `/druid/coordinator/v1/config`, `/druid/indexer/v1/worker`, `/druid/worker/v1` etc - `StateResourceFilter` is used for endpoints like `/druid/broker/v1/loadstatus`, `/druid/coordinator/v1/leader`, `/druid/coordinator/v1/loadqueue`, `/druid/coordinator/v1/rules` etc - For endpoints where a list of resources is returned like `/druid/coordinator/v1/datasources`, `/druid/indexer/v1/completeTasks` etc. the list is filtered to return only the resources to which the requested user has access. In these cases, `HttpServletRequest` instance needs to be injected in the endpoint method. Note - JAX-RS specification provides an interface called `SecurityContext`. However, we did not use this but provided our own interface `AuthorizationInfo` mainly because it provides more flexibility. For example, `SecurityContext` has a method called `isUserInRole(String role)` which would be used for auth checks and if used then the mapping of what roles can access what resource needs to be modeled inside Druid either using some convention or some other means which is not very flexible as Druid has dynamic resources like datasources. Fixes #2355 with PR #2424
return action; | ||
} | ||
|
||
public abstract boolean isApplicable(String requestPath); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does anyone know what this method is for? This method is used in only unit tests.
Fixes #2355. This PR is meant to put necessary abstractions inside Druid for enabling authorization as discussed in #2355
AuthorizationInfo
interface, specific implementations of which would be provided by extensionsdruid.auth.enabled
is set totrue
then theisAuthorized
method ofAuthorizationInfo
will be called to perform authorization checksAuthorizationInfo
object will be created in the servlet filters of specific extension and will be passed as a request attribute with attribute name asAuthConfig.DRUID_AUTH_TOKEN
DATASOURCE
,CONFIG
andSTATE
. For any type of resource, possible actions are -READ
orWRITE
DatasourceResourceFilter
is used for endpoints where the datasource information is present after "datasources" segment in the request Path such as/druid/coordinator/v1/datasources/
,/druid/coordinator/v1/metadata/datasources/
,/druid/v2/datasources/
RulesResourceFilter
is used where the datasource information is present after "rules" segment in the request Path such as/druid/coordinator/v1/rules/
TaskResourceFilter
is used for endpoints is used where the datasource information is present after "task" segment in the request Path such asdruid/indexer/v1/task
ConfigResourceFilter
is used for endpoints like/druid/coordinator/v1/config
,/druid/indexer/v1/worker
,/druid/worker/v1
etcStateResourceFilter
is used for endpoints like/druid/broker/v1/loadstatus
,/druid/coordinator/v1/leader
,/druid/coordinator/v1/loadqueue
,/druid/coordinator/v1/rules
etc/druid/coordinator/v1/datasources
,/druid/indexer/v1/completeTasks
etc. the list is filtered to return only the resources to which the requested user has access. In these cases,HttpServletRequest
instance needs to be injected in the endpoint method.Note -
JAX-RS specification provides an interface called
SecurityContext
. However, we did not use this but provided our own interfaceAuthorizationInfo
mainly because it provides more flexibility. For example,SecurityContext
has a method calledisUserInRole(String role)
which would be used for auth checks and if used then the mapping of what roles can access what resource needs to be modeled inside Druid either using some convention or some other means which is not very flexible as Druid has dynamic resources like datasources.