[FEATURE] Run Kyuubi Engine on Alibaba MaxCompute or AWS Glue #3409
Comments
Hello @kevinclcn,
I had a quick look at the doc. I think Kyuubi should work out of the box with MaxCompute, but not Glue. Since Kyuubi uses
@kevinclcn, would you like to try deploying Kyuubi on MaxCompute? Docs are welcome too.
Sure.
I'm trying to run Kyuubi with ADB Spark (it is similar to MaxCompute Spark), and I got this error in ADB Spark:
I'm using a standalone Kyuubi which has an EmbeddedZookeeper service, so the question is how to set the connection string of ZooKeeper to be the
I've tried setting kyuubi.zookeeper.embedded.client.port.address to the public IP, but it does not work.
The embedded ZooKeeper is not recommended for production; it is designed for local testing. Please deploy a dedicated ZooKeeper first.
After fixing the connection between ZooKeeper and ADB Spark, I got a
Any ideas on how to fix it, @pan3793? Part of my Kyuubi conf:
Does ADB Spark allow the Kyuubi server to access the driver through its IP directly?
And
Kyuubi uses the ISO-8601 standard duration format; please read the comments of
No, the Kyuubi server cannot access this IP. I'll try to fix it.
Sorry, my bad. I've read the doc; I just forgot the unit.
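To illustrate why the unit matters in an ISO-8601 duration, here is a minimal Python sketch of a parser for the `PnDTnHnMn.nS` shape. It is not Kyuubi's own parsing code; the function name `parse_iso8601_duration` and the sample values are hypothetical, and only the day/hour/minute/second designators are handled.

```python
import re
from datetime import timedelta

# Accepts durations shaped like P1DT2H30M15.5S (days, hours, minutes, seconds).
# This is an illustrative subset of ISO-8601, not a full implementation.
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?"
    r"(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?$"
)


def parse_iso8601_duration(text: str) -> timedelta:
    """Parse a duration such as 'PT30S'; reject values with no unit designator."""
    match = _DURATION.match(text.strip().upper())
    # A bare 'P' or 'PT' matches the pattern but carries no component at all.
    if match is None or not any(match.groupdict().values()):
        raise ValueError(f"not an ISO-8601 duration: {text!r}")
    parts = match.groupdict(default="0")
    return timedelta(
        days=int(parts["days"]),
        hours=int(parts["hours"]),
        minutes=int(parts["minutes"]),
        seconds=float(parts["seconds"]),
    )
```

With this sketch, `parse_iso8601_duration("PT30S")` yields a 30-second duration, while a bare number such as `"30000"` raises `ValueError` because it has no `P`/`T` prefix and no unit.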
Yes, that's exactly how Kyuubi works, you got it.
It turns out the ADB Spark cluster has two NICs (network interface cards), and the default NIC is used when the service starts. Is there a way to get it to boot and register on the second NIC?
It seems to use this findLocalInetAddress function to find the default IP. Currently, there is no easy way to use the second NIC, am I right, @pan3793?
Yes, we need to enhance this part to make it more flexible, e.g. by introducing an address-binding election strategy. It would also help in K8s environments.
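One way to picture such an address-binding election strategy is to pick the first local address that falls inside a preferred subnet. The Python sketch below is purely illustrative: it is not Kyuubi's findLocalInetAddress, the function `choose_bind_address` is hypothetical, and the addresses and CIDR are made-up example values.

```python
import ipaddress
from typing import Iterable, Optional


def choose_bind_address(candidates: Iterable[str], preferred_cidr: str) -> Optional[str]:
    """Return the first non-loopback candidate inside preferred_cidr, else None."""
    network = ipaddress.ip_network(preferred_cidr)
    for raw in candidates:
        addr = ipaddress.ip_address(raw)
        if addr.is_loopback:
            # Loopback addresses are never reachable from another host.
            continue
        if addr in network:
            return raw
    return None
```

For a host with two NICs at `10.0.0.5` and `192.168.1.7`, calling `choose_bind_address(["10.0.0.5", "192.168.1.7"], "192.168.0.0/16")` selects the second one, regardless of which NIC the operating system treats as the default.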
Cool. I guess this is the last problem to make it work. I may not have the ability to contribute the code, but I'd like to write a doc. Let me know if there is any progress on this feature.
Finally solved, I wrote a doc: https://gist.github.com/badbye/2618d6ef47a042427836d4ba9518e203
Describe the feature
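Currently, Kyuubi Engine can run on YARN or K8s to execute jobs submitted through JDBC. In cloud-native environments, however, cloud providers usually offer elastic compute resources, such as Alibaba Cloud's MaxCompute and AWS Glue. If Kyuubi Engine could run on MaxCompute and Glue, the cost of running and maintaining Spark could be greatly reduced.
Alibaba Cloud's API for running Spark jobs through MaxCompute: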
https://help.aliyun.com/document_detail/102357.html
AWS's API for running Spark jobs through Glue:
https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-jobs-job.html#aws-glue-api-jobs-job-CreateJob
Motivation
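Currently, Kyuubi Engine can only run on YARN or K8s, so in a cloud-native environment you must either request EMR resources or request K8s compute nodes. This raises two problems: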
Describe the solution
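Running Kyuubi Engine on elastic Spark compute resources such as MaxCompute and Glue would let offline batch jobs and interactive queries share the same Spark SQL capability. It would also make compute resources elastic, saving infrastructure and operations costs.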
Additional context
No response
Are you willing to submit PR?