-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-20239][Core] Improve HistoryServer's ACL mechanism #17582
Conversation
Test build #75641 has finished for PR 17582 at commit
|
Test build #75643 has finished for PR 17582 at commit
|
Sorry but I'm confused by the explanation in the description. I didn't completely follow what problems you are seeing that aren't intended and I don't understand how you are proposing to fix. Can you please describe the design you are proposing in more detail? On the description can you please clarify each of your bullets? For instance:
Are you saying user A is not in the list of acls or is? if they have no view permission then they shouldn't be able to see the app. I don't understnad what you mean by "could still access details of it's own app"? Is this user A's application (meaning they started it) and hence he would automatically be in the acl list? Clarifying the other bullets would be helpful as well. |
So to me this is the only bug; which means that maybe ACLs on the listing itself shouldn't ever be applied, and this PR should be a lot simpler, right? Most of it seem to be dealing with filtering the list of apps so that only applications the user can see are shown. I wonder if that's necessary, since the only thing that's showing is the existence of the application, not any data about it that could be considered sensitive. There's also a minor thing that the listing being different for different users might cause confusion; but if there's a good reason for filtering, then that concern can be overridden. I'm just not sure there is a good reason for it. |
@tgravescs sorry for the confuse.
Here actually has two list of acls, one is controlled by
If "spark.acls.enabled" is disabled, then This
This is the same issue as above. So basically what I fixed is that:
So the result of my PR is:
@vanzin your suggestion is to only disable ACLs on the listing, that definitely simplifies the fix, but IMO that "all or nothing" solution is not so ideal:
So IMO filtering based on users is better than "all or nothing" solution. Also it doesn't increase the code complex much. |
@tgravescs @vanzin do you have any comment on this JIRA? A compromise is that any user could see all the app list but detailed information is still controlled by per app ACLs. But at least we should fix event log download issue, currently anyone could download the event log if "spark.acls.enable" is disabled, even it is not permitted by HDFS. This is definitely a security hole should be fixed. What do you think? |
Sorry again the wording above and all the different configs are a bit confusing to me as to what the real issues are here.
You are mixing things here. You say that if user "A" is not added to acl list he cannot see the app list. This is broken then and I assume only applies to rest api not UI? But I'm not sure what that has to do with your second sentence, if user "A" ran the app then of course he can see the details of the app, that is intended. I'm not sure what that has to do with the first issue? If you don't have spark.history.ui.acls.enabled then it is up to what the user set. Generally in any secure environment you should set spark.history.ui.acls.enabled=true and it should enforce acls no matter what user set. It might help for you to describe these in terms of configs. Which exact configs are set on the history server and which exact configs are set on the application side and which exact apis are being used (Rest vs Web UI). so all the urls you list are the REST API, is this only an issue with rest api or the actual web UI as well? It sounds like things are definitely broke there but I'm not sure it requires changing the configs just fixing the things that are broken. Its supposed to be that if spark.history.ui.acls.enable is enabled it doesn't matter what the setting of spark.acls.enable is, acls should always be enforced on the history server. see the description: https://spark.apache.org/docs/latest/monitoring.html Certain UI's don't have information that should be sensitive. I thought the list of applications was one of those things, all users should be able to see the entire list of applications. Nothing sensitive there, but once you look at the application details that should be acl'd. If someone added something sensitive then it should be protected or it should be moved from that page. My opinions on your response to @vanzin
|
Been following this but haven't had time to do a proper review, but @tgravescs since you brought up the UI vs API thing, as of 2.0 the UI gets it's list from the API so that's where the security has to be handled. |
@tgravescs , with the changes of history UI, REST API and web UI are now mixed. The base URL to list all the apps is through REST API. The key problem here is that in History Server we could have configured two ACLs (spark.acls.enable here and spark.history.ui.acls.enable here), this two ACLs checks different URLs, for example:
are controlled by And all the other URLs (web UI and REST API) for application details are controlled by If we configured differently for this two ACLs, then we will get some unexpected behaviors. So what I here fixed is to unify the ACL and offer the right behavior.
We worked with customers and they wish to filter and list apps based on the login users. But I don't have strong opinion it. I could change to what you suggested. |
so we should definitely fix the /api/v1/applications//logs to go through the acls. It looks like it should be protected in ApiRootResource.java. You have the app id so it needs to do something like the withSparkUI to get the acls included in that application. Like I mentioned the listing (/api/v1/applications) and /api/v1/applications/ (which is same info I believe as listing) were intentionally left open. I don't really see a reason to change that but if other people have a use case for it then perhaps we should make which pages are protected by acls configurable. on the history server I would expect spark.acls.enable=false and spark.history.ui.acls.enable=true, I can see where that could be confusing, perhaps we should document this better. spark.acls.enable on the history UI really is protecting the root UI, not the app level ui's. We could explicitly turn this off. |
Thanks @tgravescs for your reply.
This could be happened when history server and spark application shares same configuration file. That's why in our internal test the behavior is not expected. |
Change-Id: I378c5c4082a7a2567eb108847e9eb637f62fe748
Test build #75934 has finished for PR 17582 at commit
|
@@ -301,6 +301,14 @@ object HistoryServer extends Logging { | |||
logDebug(s"Clearing ${SecurityManager.SPARK_AUTH_CONF}") | |||
config.set(SecurityManager.SPARK_AUTH_CONF, "false") | |||
} | |||
|
|||
if (config.getBoolean("spark.acls.enable", config.getBoolean("spark.ui.acls.enable", false))) { | |||
logInfo(s"Either spark.acls.enable or spark.ui.acles.enable is configured, clearing it and " + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo ("acles")
|
||
if (config.getBoolean("spark.acls.enable", config.getBoolean("spark.ui.acls.enable", false))) { | ||
logInfo(s"Either spark.acls.enable or spark.ui.acles.enable is configured, clearing it and " + | ||
s"only honor spark.history.ui.acl.enable") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"only using ..."
@@ -184,14 +184,27 @@ private[v1] class ApiRootResource extends ApiRequestContext { | |||
@Path("applications/{appId}/logs") | |||
def getEventLogs( | |||
@PathParam("appId") appId: String): EventLogDownloadResource = { | |||
new EventLogDownloadResource(uiRoot, appId, None) | |||
try { | |||
// withSparkUI will throw NotFoundException if attemptId is existed for this application. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/is existed/exists
@@ -602,6 +606,36 @@ class HistoryServerSuite extends SparkFunSuite with BeforeAndAfter with Matchers | |||
} | |||
} | |||
|
|||
test("acls with downloading files") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this test just be a new URL added to testUrls
in "ui and api authorization checks"? What is it doing that is different than what that test does and why?
try { | ||
// withSparkUI will throw NotFoundException if attemptId is existed for this application. | ||
// So we need to try again with attempt id "1". | ||
withSparkUI(appId, None) { _ => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... optimally you wouldn't do this, because loading the UI can be expensive. That would require recording extra information when listing applications and some massaging of internal APIs, so this is probably OK for now, but would be nice to track it separately as an enhancement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's true. Either we could get application acls in listing applications like what did before.
I assume download is not used very frequently, so workaround probably should be OK.
Change-Id: I037137cd3125547fc408dfc69c73ca9b5d3b1ce5
Test build #75967 has started for PR 17582 at commit |
Change-Id: Ia45d5aedab8c2a7bd65ccd842d29c5159de53604
Test build #75970 has started for PR 17582 at commit |
Jenkins, retest this please. |
Test build #75977 has finished for PR 17582 at commit
|
retest this please |
@jerryshao is the PR description still accurate? It seems you're not really implementing 2 anymore. |
Test build #75998 has finished for PR 17582 at commit
|
Just update the description, please review again @vanzin , thanks! |
changes lgtm. Did you file a jira to track changing to not use withSparkUI? If user is downloading because the file is huge and takes a long time to render or causes history server to have issue this would hurt that use case. We could wait and see if someone has that use case too. |
Thanks @tgravescs for your comments. Do you think it is a good idea to read out ACLs when |
As @vanzin said I think this is fine for now to get this fixed quickly, but filing a follow up jira makes sense. Actually this might be good to get into the 2.1.1 release if they are going to spin another rc. |
OK, thanks @tgravescs . |
LGTM. Merging to master / 2.2, will try 2.1 and 2.0 too. |
## What changes were proposed in this pull request? Current SHS (Spark History Server) two different ACLs: * ACL of base URL, it is controlled by "spark.acls.enabled" or "spark.ui.acls.enabled", and with this enabled, only user configured with "spark.admin.acls" (or group) or "spark.ui.view.acls" (or group), or the user who started SHS could list all the applications, otherwise none of them can be listed. This will also affect REST APIs which listing the summary of all apps and one app. * Per application ACL. This is controlled by "spark.history.ui.acls.enabled". With this enabled only history admin user and user/group who ran this app can access the details of this app. With this two ACLs, we may encounter several unexpected behaviors: 1. if base URL's ACL (`spark.acls.enable`) is enabled but user A has no view permission. User "A" cannot see the app list but could still access details of it's own app. 2. if ACLs of base URL (`spark.acls.enable`) is disabled, then user "A" could download any application's event log, even it is not run by user "A". 3. The changes of Live UI's ACL will affect History UI's ACL which share the same conf file. The unexpected behaviors is mainly because we have two different ACLs, ideally we should have only one to manage all. So to improve SHS's ACL mechanism, here in this PR proposed to: 1. Disable "spark.acls.enable" and only use "spark.history.ui.acls.enable" for history server. 2. Check permission for event-log download REST API. With this PR: 1. Admin user could see/download the list of all applications, as well as application details. 2. Normal user could see the list of all applications, but can only download and check the details of applications accessible to him. ## How was this patch tested? New UTs are added, also verified in real cluster. CC tgravescs vanzin please help to review, this PR changes the semantics you did previously. Thanks a lot. Author: jerryshao <sshao@hortonworks.com> Closes #17582 from jerryshao/SPARK-20239. (cherry picked from commit 5280d93) Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
No luck with 2.1, please file a separate PR. |
What about branch 2.0, do we also need to backport to it @vanzin ? |
It would be good, but maybe the 2.1 backport will merge cleanly to 2.0. |
OK, let me try it, thanks. |
Current SHS (Spark History Server) two different ACLs: * ACL of base URL, it is controlled by "spark.acls.enabled" or "spark.ui.acls.enabled", and with this enabled, only user configured with "spark.admin.acls" (or group) or "spark.ui.view.acls" (or group), or the user who started SHS could list all the applications, otherwise none of them can be listed. This will also affect REST APIs which listing the summary of all apps and one app. * Per application ACL. This is controlled by "spark.history.ui.acls.enabled". With this enabled only history admin user and user/group who ran this app can access the details of this app. With this two ACLs, we may encounter several unexpected behaviors: 1. if base URL's ACL (`spark.acls.enable`) is enabled but user A has no view permission. User "A" cannot see the app list but could still access details of it's own app. 2. if ACLs of base URL (`spark.acls.enable`) is disabled, then user "A" could download any application's event log, even it is not run by user "A". 3. The changes of Live UI's ACL will affect History UI's ACL which share the same conf file. The unexpected behaviors is mainly because we have two different ACLs, ideally we should have only one to manage all. So to improve SHS's ACL mechanism, here in this PR proposed to: 1. Disable "spark.acls.enable" and only use "spark.history.ui.acls.enable" for history server. 2. Check permission for event-log download REST API. With this PR: 1. Admin user could see/download the list of all applications, as well as application details. 2. Normal user could see the list of all applications, but can only download and check the details of applications accessible to him. New UTs are added, also verified in real cluster. CC tgravescs vanzin please help to review, this PR changes the semantics you did previously. Thanks a lot. Author: jerryshao <sshao@hortonworks.com> Closes apache#17582 from jerryshao/SPARK-20239. Change-Id: I65d5d0c5e5a76f08abbe2b7dd43a2e08d295f6b6
What changes were proposed in this pull request?
Current SHS (Spark History Server) two different ACLs:
With this two ACLs, we may encounter several unexpected behaviors:
spark.acls.enable
) is enabled but user A has no view permission. User "A" cannot see the app list but could still access details of it's own app.spark.acls.enable
) is disabled, then user "A" could download any application's event log, even it is not run by user "A".The unexpected behaviors is mainly because we have two different ACLs, ideally we should have only one to manage all.
So to improve SHS's ACL mechanism, here in this PR proposed to:
With this PR:
How was this patch tested?
New UTs are added, also verified in real cluster.
CC @tgravescs @vanzin please help to review, this PR changes the semantics you did previously. Thanks a lot.