[SPARK-28419][SQL] Enable SparkThriftServer support proxy user's authentication . #25201
@juliuszsompolski @wangyum Can you give some advice?
ok to test
Test build #108773 has finished for PR 25201 at commit
Test build #108813 has finished for PR 25201 at commit
Test build #108815 has finished for PR 25201 at commit
Test build #108816 has finished for PR 25201 at commit
Test build #108819 has finished for PR 25201 at commit
Test build #108821 has finished for PR 25201 at commit
Test build #108828 has finished for PR 25201 at commit
Test build #108830 has finished for PR 25201 at commit
Test build #108831 has finished for PR 25201 at commit
Test build #108839 has finished for PR 25201 at commit
Test build #108854 has finished for PR 25201 at commit
Test build #108856 has finished for PR 25201 at commit
Test build #108859 has finished for PR 25201 at commit
Test build #108948 has finished for PR 25201 at commit
@gatorsmile
Test build #108966 has finished for PR 25201 at commit
Test build #108967 has finished for PR 25201 at commit
Test build #108975 has finished for PR 25201 at commit
@dongjoon-hyun @gatorsmile Can you help review this and give some advice on how to add unit tests?
Can one of the admins verify this patch?
val existing = ugi.getCredentials()
existing.mergeAll(originalCreds)
ugi.addCredentials(existing)
sparkSqlOperationManager.sessionToTokens.put(session.getSessionHandle, tokens)
Hi @AngersZhuuuu, while applying your patch I found a bug here: `session.getSessionHandle` throws an exception because `session` is `null` at this point. Moving `session = HiveSessionProxy.getProxy(sessionWithUGI, sessionWithUGI.getSessionUgi)` to before the if statement solves the issue.
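A minimal sketch of the reordering described in the comment above. The types here (`SessionHandle`, `Session`, `SessionOrderingSketch`) are hypothetical stand-ins, not the Hive or Spark classes touched by the patch. They only illustrate that the session must be assigned before the branch that reads `session.getSessionHandle`.

```scala
import scala.collection.mutable

// Hypothetical stand-ins for the real session types, used only to show the ordering.
final case class SessionHandle(id: String)

final class Session(val handle: SessionHandle) {
  def getSessionHandle: SessionHandle = handle
}

object SessionOrderingSketch {
  // Plays the role of sessionToTokens from the snippet under review.
  val sessionToTokens: mutable.Map[SessionHandle, Seq[String]] = mutable.Map.empty

  def openSession(withImpersonation: Boolean, tokens: Seq[String]): Session = {
    // Assign the session first, so it is never null when the branch below uses it.
    val session = new Session(SessionHandle(java.util.UUID.randomUUID().toString))
    if (withImpersonation) {
      sessionToTokens.put(session.getSessionHandle, tokens)
    }
    session
  }
}
```

With this ordering the handle lookup cannot fail on a null session, whether or not the impersonation branch runs.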
Yeah, that's a mistake. The code here is just one way to implement this. If you are interested in this approach, you can look at
spark-thriftserver/spark-thriftserver#53
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
This PR works fine in a non-Kerberos environment, but in a Kerberos environment it cannot get the HDFS token!
To be honest, I have never tested this in a non-Kerberos environment since I don't have such an environment.
ok
This PR is good. I tested it in a Spark 2.4.3 environment, but the PR is based on Spark 3, and getting the HDFS token is different between the two.
cc @AngersZhuuuu @gatorsmile @dongjoon-hyun @vinodkc
@AngersZhuuuu We tested this patch with Spark 3.x in a Kerberos environment and it looks good. It would be great if we could get a review from the experts. Thanks
Is your env using Kerberos?
👍 to getting the PR reopened, or to hearing the reasons why not.
@AngersZhuuuu I'm facing the same problem on Spark 3.2.2. Can you tell me how to fix it? Thank you
@AngersZhuuuu Can you please provide the steps you used to test this patch?
What changes were proposed in this pull request?
#25179
The original SparkThriftServer cannot proxy a client user's Hive and HDFS authentication.
This PR enables SparkThriftServer to act with the proxy user's permissions.
In this PR, we first obtain an HDFS token for the current UGI, so that it can truly proxy the user's HDFS privileges. Then, for each proxy user, we obtain a Hive token for that user when we create the hiveClient. Different users use different SparkSession.sharedState instances.
Finally, we pass the DFS token to each Task, so that DFS operations in the Executors run as the proxy user.
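A rough sketch of the token flow described above, using Hadoop's `UserGroupInformation` and `Credentials` classes. It is not the code from this PR: the object name `ProxyTokenSketch`, the helper names, and the `renewer` argument are illustrative assumptions. The Hive token step and the hand-off of tokens to Tasks are left out.

```scala
import java.security.PrivilegedExceptionAction

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.security.{Credentials, UserGroupInformation}

// Illustrative only: names in this object are assumptions, not part of the patch.
object ProxyTokenSketch {

  /** Fetch HDFS delegation tokens as the current (server) user. */
  def obtainHdfsTokens(conf: Configuration, renewer: String): Credentials = {
    val creds = new Credentials()
    // On a cluster without security enabled this typically adds no tokens.
    FileSystem.get(conf).addDelegationTokens(renewer, creds)
    creds
  }

  /** Build a proxy UGI for `user` and attach the server-side tokens to it. */
  def proxyUgiWithTokens(user: String, serverCreds: Credentials): UserGroupInformation = {
    val realUser = UserGroupInformation.getCurrentUser
    val proxy = UserGroupInformation.createProxyUser(user, realUser)
    // Same merge pattern as the reviewed snippet: mergeAll does not
    // overwrite entries that the proxy UGI already holds.
    val existing = proxy.getCredentials
    existing.mergeAll(serverCreds)
    proxy.addCredentials(existing)
    proxy
  }

  /** Run `body` with `ugi` as the effective Hadoop user. */
  def runAs[T](ugi: UserGroupInformation)(body: => T): T =
    ugi.doAs(new PrivilegedExceptionAction[T] {
      override def run(): T = body
    })
}
```

A call such as `ProxyTokenSketch.runAs(proxyUgi) { ... }` would then perform file system access under the proxy user's identity, provided the cluster is configured to allow the server user to impersonate that user.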
How was this patch tested?
manual test