[FEATURE] Enhance zookeeper authentication and acls of Kyuubi HA module #1204

Closed
3 tasks done
wForget opened this issue Oct 9, 2021 · 18 comments
Labels
kind:feature Feature request

Comments

@wForget
Member

wForget commented Oct 9, 2021

Code of Conduct

Search before asking

  • I have searched in the issues and found no similar issues.

Describe the feature

Support ZooKeeper SASL Kerberos authentication for the engine, and support more ZooKeeper ACL schemes.

Motivation

No response

Describe the solution

ZooKeeper supports several ACL schemes. Two typical ones are described below.

  1. Support SASL Kerberos ACLs

    Example node ACLs:

    'world,'anyone
    : r
    'sasl,'test
    : cdrwa
    

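    Planned configuration:

    kyuubi.ha.zookeeper.acl.enabled=true
    kyuubi.ha.zookeeper.auth.sasl.kerberos=true   # use SASL Kerberos authentication
    
    # Kerberos-related configuration is also required
    

    Other changes:

    • When running the engine in YARN cluster mode, upload the keytab file via --files and access it through a relative path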
  2. Support Digest ACLs

    Example node ACLs:

    'world,'anyone
    : r
    'digest,'test:V28q/NynI4JI3Rk54h0r8O5kMug=
    : cdrwa
    

    Planned configuration:

    kyuubi.ha.zookeeper.acl.enabled=true
    kyuubi.ha.zookeeper.auth=digest:test:test   # authenticate with an auth string, format: scheme:auth (for digest, auth is username:password)
    

    Other changes:

    • Parse the kyuubi.ha.zookeeper.auth value into an AuthInfo object and register it via CuratorFrameworkFactory.Builder#authorization
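
A minimal sketch of the parsing step described in item 2, assuming the value has the form scheme:auth and is split at the first colon. The class and method names (ZkAuthSketch, parseAuth, digestId) are hypothetical and not part of Kyuubi. The digestId helper follows the convention ZooKeeper uses for digest ACL ids, user:base64(sha1(user:password)), so its output has the same shape as the id in the digest ACL example above.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

public class ZkAuthSketch {
    /** Splits "scheme:auth" at the first colon and returns {scheme, auth}. */
    static String[] parseAuth(String value) {
        int idx = value.indexOf(':');
        if (idx <= 0 || idx == value.length() - 1) {
            throw new IllegalArgumentException("expected scheme:auth, got: " + value);
        }
        return new String[] {value.substring(0, idx), value.substring(idx + 1)};
    }

    /** Derives a digest ACL id of the form user:base64(sha1(user:password)). */
    static String digestId(String userAndPassword) {
        int idx = userAndPassword.indexOf(':');
        if (idx <= 0) {
            throw new IllegalArgumentException("expected user:password");
        }
        try {
            byte[] sha1 = MessageDigest.getInstance("SHA-1")
                .digest(userAndPassword.getBytes(StandardCharsets.UTF_8));
            return userAndPassword.substring(0, idx) + ":"
                + Base64.getEncoder().encodeToString(sha1);
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] parsed = parseAuth("digest:test:test");
        System.out.println("scheme = " + parsed[0]);
        System.out.println("acl id = " + digestId(parsed[1]));
        // With Curator on the classpath, the pair could then be handed to
        // CuratorFrameworkFactory.Builder#authorization(scheme, authBytes),
        // using parsed[1].getBytes(StandardCharsets.UTF_8) as the auth bytes.
    }
}
```

Splitting only at the first colon keeps the digest credential (user:password) intact as the auth part.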

References:

Additional context

I don’t have a deep understanding of the Zookeeper authentication mechanism. If you have any questions, please point them out.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
wForget added the kind:feature label on Oct 9, 2021
@yaooqinn
Member

When running the engine in YARN cluster mode, upload the keytab file via --files and access it through a relative path

We can use --files for both client and cluster modes.

@wForget
Member Author

wForget commented Oct 11, 2021

When running the engine in YARN cluster mode, upload the keytab file via --files and access it through a relative path

We can use --files for both client and cluster modes.

When running in Spark yarn-client mode, the driver should be able to read the Kyuubi server's keytab file directly. Is it necessary to add --files?

@yaooqinn
Member

That's true. But it will be difficult and hacky to let the Kyuubi server know whether the engine is in client mode or not.

@wForget
Member Author

wForget commented Oct 11, 2021

That's true. But it will be difficult and hacky to let the Kyuubi server know whether the engine is in client mode or not.

After adding the --files configuration, the keytab path needs to be changed to a relative path. Is there a problem with changing to a relative path in yarn-client mode?

@yaooqinn
Member

Is there a problem changing to a relative path in the yarn client mode?

I guess it is not a problem; it should behave the same as in YARN cluster mode, and probably with other cluster managers too.

@wForget
Member Author

wForget commented Oct 11, 2021

Is there a problem changing to a relative path in the yarn client mode?

I guess it is not a problem; it should behave the same as in YARN cluster mode, and probably with other cluster managers too.

OK, thanks @yaooqinn. I will not distinguish between client and cluster modes, and I will test both.

@wForget
Member Author

wForget commented Oct 11, 2021

Hi @yaooqinn, there is a problem when adding --files in yarn-client mode and changing the keytab path to a relative path.

spark conf:

--conf spark.master=yarn \
--conf spark.submit.deployMode=client \
--conf spark.files=/***/kyuubi.keytab \
--conf spark.kyuubi.kinit.keytab=kyuubi.keytab \
--conf spark.kyuubi.kinit.principal=*** \

error log:

Diagnostic: Failed to initialize SparkSQLEngine: kyuubi.kinit.keytab does not exists
org.apache.kyuubi.KyuubiException: Failed to initialize SparkSQLEngine: kyuubi.kinit.keytab does not exists
	at org.apache.kyuubi.engine.spark.SparkSQLEngine$.$anonfun$startEngine$1(SparkSQLEngine.scala:130)
	at org.apache.kyuubi.engine.spark.SparkSQLEngine$.$anonfun$startEngine$1$adapted(SparkSQLEngine.scala:113)
	at scala.Option.foreach(Option.scala:407)
	at org.apache.kyuubi.engine.spark.SparkSQLEngine$.startEngine(SparkSQLEngine.scala:113)
	at org.apache.kyuubi.engine.spark.SparkSQLEngine$.main(SparkSQLEngine.scala:154)
	at org.apache.kyuubi.engine.spark.SparkSQLEngine.main(SparkSQLEngine.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951)
	at org.apache.spark.deploy.SparkSubmit$$anon$1.run(SparkSubmit.scala:165)
	at org.apache.spark.deploy.SparkSubmit$$anon$1.run(SparkSubmit.scala:163)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:163)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1030)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1039)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.IOException: kyuubi.kinit.keytab does not exists
	at org.apache.kyuubi.ha.client.ZooKeeperClientProvider$.setUpZooKeeperAuth(ZooKeeperClientProvider.scala:106)
	at org.apache.kyuubi.ha.client.ZooKeeperClientProvider$.buildZookeeperClient(ZooKeeperClientProvider.scala:42)
	at org.apache.kyuubi.ha.client.ServiceDiscovery.initialize(ServiceDiscovery.scala:73)
	at org.apache.kyuubi.service.CompositeService.$anonfun$initialize$1(CompositeService.scala:40)
	at org.apache.kyuubi.service.CompositeService.$anonfun$initialize$1$adapted(CompositeService.scala:40)
	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
	at org.apache.kyuubi.service.CompositeService.initialize(CompositeService.scala:40)
	at org.apache.kyuubi.service.AbstractFrontendService.initialize(AbstractFrontendService.scala:42)
	at org.apache.kyuubi.service.ThriftBinaryFrontendService.initialize(ThriftBinaryFrontendService.scala:104)
	at org.apache.kyuubi.service.CompositeService.$anonfun$initialize$1(CompositeService.scala:40)
	at org.apache.kyuubi.service.CompositeService.$anonfun$initialize$1$adapted(CompositeService.scala:40)
	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
	at org.apache.kyuubi.service.CompositeService.initialize(CompositeService.scala:40)
	at org.apache.kyuubi.service.Serverable.initialize(Serverable.scala:46)
	at org.apache.kyuubi.engine.spark.SparkSQLEngine.initialize(SparkSQLEngine.scala:49)
	at org.apache.kyuubi.engine.spark.SparkSQLEngine$.$anonfun$startEngine$1(SparkSQLEngine.scala:126)
	... 22 more

@yaooqinn
Member

--conf spark.kyuubi.kinit.keytab=kyuubi.keytab \

It looks like we have to detect the value of spark.kyuubi.kinit.keytab on the engine side after Spark is instantiated, rather than passing the relative path (--conf spark.kyuubi.kinit.keytab=kyuubi.keytab) from the server to the engine.

@wForget
Member Author

wForget commented Oct 12, 2021

It looks like we have to detect the value of spark.kyuubi.kinit.keytab on the engine side after Spark is instantiated, rather than passing the relative path (--conf spark.kyuubi.kinit.keytab=kyuubi.keytab) from the server to the engine.

Sorry @yaooqinn, I don't quite understand this reply. The current detection already happens on the engine side.
The problem may be that, in yarn-client mode, the files listed in spark.files are not placed in the user.home directory, so the relative path cannot be resolved. Should we distinguish between client and cluster modes?

@yaooqinn
Member

  1. Pass --conf spark.kyuubi.kinit.keytab=/the/absolute/path/of/kyuubi.keytab to the engine side.
  2. On the engine side, replace it with the relative path kyuubi.keytab if needed and if that relative file is present; otherwise leave it empty or keep the absolute path as is.
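
The two steps above can be sketched as a small engine-side helper. This is only an illustration under stated assumptions: KeytabResolver and resolveKeytab are hypothetical names, and the working directory is assumed to be where the cluster manager localizes files shipped through spark.files. The sketch keeps the absolute path when no local copy is found and does not cover the "leave it empty" alternative.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class KeytabResolver {
    /**
     * Prefers a file with the keytab's base name inside workDir and
     * otherwise returns the configured (absolute) path unchanged.
     * The returned relative name is only meaningful when workDir is
     * the working directory of the current process.
     */
    static String resolveKeytab(String configured, Path workDir) {
        if (configured == null || configured.isEmpty()) {
            return configured;
        }
        Path fileName = Paths.get(configured).getFileName();
        if (fileName != null && Files.isRegularFile(workDir.resolve(fileName))) {
            return fileName.toString();
        }
        return configured;
    }

    public static void main(String[] args) {
        Path cwd = Paths.get("").toAbsolutePath();
        System.out.println(
            resolveKeytab("/the/absolute/path/of/kyuubi.keytab", cwd));
    }
}
```

Because the check runs on the engine side, the server does not need to know whether the engine runs in client or cluster mode.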

@wForget
Member Author

wForget commented Oct 12, 2021

  1. Pass --conf spark.kyuubi.kinit.keytab=/the/absolute/path/of/kyuubi.keytab to the engine side.
  2. On the engine side, replace it with the relative path kyuubi.keytab if needed and if that relative file is present; otherwise leave it empty or keep the absolute path as is.

OK, thank you for your guidance. I will implement it this way.

@wForget
Member Author

wForget commented Oct 12, 2021

I have tested it. Please help me check whether there are any problems with the implementation and configuration. cc @yaooqinn

The results are as follows:

1. sasl kerberos

kyuubi conf:

kyuubi.ha.zookeeper.acl.enabled=true
kyuubi.ha.zookeeper.auth.sasl.kerberos=true

kyuubi.authentication   KERBEROS
kyuubi.kinit.principal  hue/***@****
kyuubi.kinit.keytab     /****/hue.keytab
kyuubi.ha.zookeeper.quorum=***:2181
kyuubi.ha.zookeeper.namespace=kyuubi_***-test
kyuubi.ha.zookeeper.acl.engine.enabled=true

acls: [screenshot 1, not preserved]

2. digest

kyuubi conf:

kyuubi.ha.zookeeper.acl.enabled=true
kyuubi.ha.zookeeper.auth.sasl.kerberos=false
kyuubi.ha.zookeeper.auth=digest:hue:***

kyuubi.ha.zookeeper.quorum=***:2181
kyuubi.ha.zookeeper.namespace=kyuubi_***-test
kyuubi.ha.zookeeper.acl.engine.enabled=true

acls: [screenshot 2, not preserved]

@yaooqinn
Member

It looks fine to me. However, can we merge some of these configurations? They are currently very hard to explain and use.

kyuubi.ha.zookeeper.acl.enabled=true
kyuubi.ha.zookeeper.auth.sasl.kerberos=false
kyuubi.ha.zookeeper.auth=digest:hue:***
kyuubi.ha.zookeeper.acl.engine.enabled=true

cc @zhouyifan279, do you have any idea whether we can add some unit tests against a Kerberized ZooKeeper with ACLs?

@yaooqinn
Member

How about

kyuubi.ha.zookeeper.acl.enabled=true // deprecate this
kyuubi.ha.zookeeper.acl.engine.enabled=true  // remove this, as it is still under development
kyuubi.ha.zookeeper.auth.type=none/kerberos/digest
kyuubi.ha.zookeeper.engine.auth.type=none/kerberos/digest, where none = kyuubi.ha.zookeeper.acl.enabled=false
# we can introduce these in a new PR later to avoid staging the service keytab on the engine side, which is insecure
kyuubi.ha.zookeeper.auth.principal
kyuubi.ha.zookeeper.auth.keytab
kyuubi.ha.zookeeper.auth.digest=digest contents?
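
One way to read the proposal above is as a three-valued setting that supersedes the boolean kyuubi.ha.zookeeper.acl.enabled. The sketch below is only an illustration: ZkAuthTypes and its methods are hypothetical names, and mapping a legacy acl.enabled=true to KERBEROS when the new key is unset is an assumption, not something stated in this thread.

```java
import java.util.Locale;

public class ZkAuthTypes {
    enum AuthType { NONE, KERBEROS, DIGEST }

    /** Parses none/kerberos/digest case-insensitively. */
    static AuthType parse(String value) {
        return AuthType.valueOf(value.trim().toUpperCase(Locale.ROOT));
    }

    /**
     * Picks the effective type: an explicit auth.type wins; otherwise the
     * deprecated boolean decides (false maps to NONE as in the proposal,
     * true maps to KERBEROS, which is an assumption of this sketch).
     */
    static AuthType effective(String authType, boolean legacyAclEnabled) {
        if (authType != null && !authType.trim().isEmpty()) {
            return parse(authType);
        }
        return legacyAclEnabled ? AuthType.KERBEROS : AuthType.NONE;
    }

    public static void main(String[] args) {
        System.out.println(effective("digest", false));
        System.out.println(effective(null, false));
    }
}
```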

@wForget
Member Author

wForget commented Oct 12, 2021

@yaooqinn Looks good. Can we add the following fallback configuration?

kyuubi.ha.zookeeper.engine.auth.type     fallback to kyuubi.ha.zookeeper.auth.type
kyuubi.ha.zookeeper.auth.principal       fallback to kyuubi.kinit.principal
kyuubi.ha.zookeeper.auth.keytab          fallback to kyuubi.kinit.keytab
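
The fallback rules listed above amount to a chained lookup: read the specific key first, then its fallback key. A minimal sketch, assuming the configuration is available as a plain string map; FallbackConf and its get method are hypothetical names and not Kyuubi's actual configuration API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class FallbackConf {
    // Key -> fallback key, taken from the three rules in the comment above.
    private static final Map<String, String> FALLBACKS = new HashMap<>();
    static {
        FALLBACKS.put("kyuubi.ha.zookeeper.engine.auth.type",
            "kyuubi.ha.zookeeper.auth.type");
        FALLBACKS.put("kyuubi.ha.zookeeper.auth.principal",
            "kyuubi.kinit.principal");
        FALLBACKS.put("kyuubi.ha.zookeeper.auth.keytab",
            "kyuubi.kinit.keytab");
    }

    /** Returns the first value found along the key's fallback chain. */
    static Optional<String> get(Map<String, String> conf, String key) {
        String current = key;
        while (current != null) {
            String value = conf.get(current);
            if (value != null) {
                return Optional.of(value);
            }
            current = FALLBACKS.get(current);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("kyuubi.kinit.keytab", "/path/to/service.keytab");
        // No ZooKeeper-specific keytab is set, so the kinit keytab is used.
        System.out.println(get(conf, "kyuubi.ha.zookeeper.auth.keytab"));
    }
}
```

With this shape, users who already set kyuubi.kinit.principal and kyuubi.kinit.keytab would not need to repeat them for ZooKeeper.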

@yaooqinn
Member

@yaooqinn Looks good. Can we add the following fallback configuration?

kyuubi.ha.zookeeper.engine.auth.type     fallback to kyuubi.ha.zookeeper.auth.type
kyuubi.ha.zookeeper.auth.principal       fallback to kyuubi.kinit.principal
kyuubi.ha.zookeeper.auth.keytab          fallback to kyuubi.kinit.keytab

SGTM. Also cc @turboFei

@zhouyifan279
Contributor

zhouyifan279 commented Oct 13, 2021

It looks fine to me. However, can we merge some of these configurations? They are currently very hard to explain and use.

kyuubi.ha.zookeeper.acl.enabled=true
kyuubi.ha.zookeeper.auth.sasl.kerberos=false
kyuubi.ha.zookeeper.auth=digest:hue:***
kyuubi.ha.zookeeper.acl.engine.enabled=true

cc @zhouyifan279, do you have any idea whether we can add some unit tests against a Kerberized ZooKeeper with ACLs?

Since we already have org.apache.kyuubi.KerberizedTestHelper to set up a KDC, and ZooKeeper uses JAAS to integrate with Kerberos, it should be easy to set up a Kerberized embedded ZooKeeper server.
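
To illustrate the JAAS side on the client, here is a sketch of a programmatic login configuration built only from JDK classes. ZkJaasSketch and clientJaas are hypothetical names; the section name "Client" is ZooKeeper's default client login context, and the principal and keytab values are placeholders. Whether a test would configure JAAS this way or through a jaas.conf file is left open.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.AppConfigurationEntry.LoginModuleControlFlag;
import javax.security.auth.login.Configuration;

public class ZkJaasSketch {
    /** Builds a JAAS configuration exposing one keytab-based "Client" section. */
    static Configuration clientJaas(String principal, String keytab) {
        Map<String, String> options = new HashMap<>();
        options.put("useKeyTab", "true");
        options.put("keyTab", keytab);
        options.put("principal", principal);
        options.put("storeKey", "true");
        options.put("doNotPrompt", "true");
        AppConfigurationEntry entry = new AppConfigurationEntry(
            "com.sun.security.auth.module.Krb5LoginModule",
            LoginModuleControlFlag.REQUIRED,
            options);
        return new Configuration() {
            @Override
            public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
                // Only the ZooKeeper client section is defined in this sketch.
                return "Client".equals(name)
                    ? new AppConfigurationEntry[] {entry}
                    : null;
            }
        };
    }

    public static void main(String[] args) {
        Configuration jaas = clientJaas("hue/host@EXAMPLE.COM", "/path/to/hue.keytab");
        // Installing it process-wide would be done with
        // Configuration.setConfiguration(jaas) before the ZooKeeper client starts.
        System.out.println(jaas.getAppConfigurationEntry("Client").length);
    }
}
```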

@wForget, would you mind adding these test cases?

I'm also glad to do the work if you have no time.

@wForget
Member Author

wForget commented Oct 13, 2021

Since we already have org.apache.kyuubi.KerberizedTestHelper to set up a KDC, and ZooKeeper uses JAAS to integrate with Kerberos, it should be easy to set up a Kerberized embedded ZooKeeper server.

@wForget, would you mind adding these test cases?

I'm also glad to do the work if you have no time.

Thanks @zhouyifan279. I still have some configurations to adjust. Once that is done, I will improve the test cases according to your suggestions.

wForget added a commit to wForget/kyuubi that referenced this issue Oct 20, 2021
wForget added a commit to wForget/kyuubi that referenced this issue Oct 20, 2021
zhouyifan279 pushed a commit to zhouyifan279/kyuubi that referenced this issue Nov 12, 2021
pan3793 pushed a commit that referenced this issue Nov 12, 2021
…ion and acls

Co-authored-by: wForget <643348094@qq.com>