Skip to content

Conversation

@Ocean22
Copy link
Contributor

@Ocean22 Ocean22 commented Nov 15, 2023

Why are the changes needed?

Close #5680

Background:

  1. In default case, when kyuubi submit spark sql engine on kubernetes, will generate spark driver pod name with "kyuubi-${app_name}-${engine_ref_id}-driver".
  2. And app_name will be "kyuubi_${shareLevel}${engineType}${appUser}_${engineRefId}" if not set by user.
  3. In result, we may get spark driver pod name with "kyuubi-kyuubi-${shareLevel}-${engineType}-${appUser}-${engineRefId}-${engineRefId}-driver".
  4. We were hoping for a more concise and readable name, such as "kyuubi-${shareLevel}-${engineType}-${appUser}-${app_name}-${engineRefId}-driver".

How was this patch tested?

  • Add some test cases that check the changes thoroughly including negative and positive cases if possible

  • Add screenshots for manual tests if appropriate
    1) Before modification:
    Left is unset spark.app.name, and right is set to “ocean”
    image

2) Modify the code
image

3) Build module and get the jar
image

4) Build a new images
image

5) After modification:
image

  • Run test locally before make a pull request

Was this patch authored or co-authored using generative AI tooling?

No

@yaooqinn
Copy link
Member

Hi @Ocean22, thanks for the PR.

There is also an original purpose for spark.app.name generation I want to share. In early versions of kyuubi, I used(just an attemption) it as a unique key for retrieval. However, we do not rely on it for this purpose.

So, since we are now touching this part, I suggest we make it as simple as possible. for example, the appUser is not necessary here as both YARN UI or Spark History UI display with a USER field, uuid is not necessary as it is also in the app tags,

Copy link
Contributor

@zwangsheng zwangsheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can just simply changed appendPodName logic in SparkProcessBuilder#appendPodNameConf

  1. remove kyuubi prefix
  2. only append engineRefId, when appName not contains it.

builder = engineType match {
case SPARK_SQL =>
conf.setIfMissing(SparkProcessBuilder.APP_KEY, defaultEngineName)
conf.set(SparkProcessBuilder.APP_KEY, generateAppKey(conf.getOption(SparkProcessBuilder.APP_KEY).getOrElse(null)))
Copy link
Contributor

@zwangsheng zwangsheng Nov 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, if user set spark app name, we should not over write this, so let's keep setIfMissing here.

@Ocean22
Copy link
Contributor Author

Ocean22 commented Nov 15, 2023

Thanks for guidance,I will continue to improve it when I have free time.

If the `appName` contains the `engineRefId`, it indicates that this `appName` is the `defaultEngineNmae` from `EngineRef.scala`, not set by the user.
@Ocean22
Copy link
Contributor Author

Ocean22 commented Nov 18, 2023

@yaooqinn @zwangsheng Hi, following our previous discussion, I have changed a way to handle this issue.

@Ocean22
Copy link
Contributor Author

Ocean22 commented Nov 18, 2023

@yaooqinn @zwangsheng Hi, following our previous discussion, I have changed a way to handle this issue.

Sorry, it still need some else change,I still need to do more testing

@Ocean22
Copy link
Contributor Author

Ocean22 commented Nov 18, 2023

@yaooqinn @zwangsheng Hi, following our previous discussion, I have changed a way to handle this issue.

Sorry, it still need some else change,I still need to do more testing

Now is OK, and this photo is the result that I test.
企业微信截图_17003043289059

Copy link
Contributor

@zwangsheng zwangsheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for late review, we should determine if resolvedResourceName carries engine ref id before determining if it's too long.

Comment on lines 137 to 144
lazy val resolvedResourceName = s"kyuubi-${getResourceNamePrefix(appName, engineRefId)}-driver"
if (forciblyRewrite || resolvedResourceName.length > DRIVER_POD_NAME_MAX_LENGTH) {
s"kyuubi-$engineRefId-driver"
} else if (appName.contains(engineRefId)){
getResourceNamePrefix(appName, "driver")
} else {
resolvedResourceName
}
Copy link
Contributor

@zwangsheng zwangsheng Nov 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
lazy val resolvedResourceName = s"kyuubi-${getResourceNamePrefix(appName, engineRefId)}-driver"
if (forciblyRewrite || resolvedResourceName.length > DRIVER_POD_NAME_MAX_LENGTH) {
s"kyuubi-$engineRefId-driver"
} else if (appName.contains(engineRefId)){
getResourceNamePrefix(appName, "driver")
} else {
resolvedResourceName
}
val resourceNamePrefix = if (appName.contains(engineRefId)) {
getResourceNamePrefix(appName, "")
} else {
getResourceNamePrefix(appName, engineRefId)
}
val resolvedResourceName = if (resourceNamePrefix.startsWith("kyuubi")) {
s"$resourceNamePrefix-driver"
} else {
s"kyuubi-$resourceNamePrefix-driver"
}
if (forciblyRewrite || resolvedResourceName.length > DRIVER_POD_NAME_MAX_LENGTH) {
s"kyuubi-$engineRefId-driver"
} else {
resolvedResourceName
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes,you are right.And I tested with your code and it passed, but about resourceNamePrefix. startsWith ("kyuubi"), if spark. app. name is set to the name starting with kyuubi, it will have the effect of the arrow in the photo. So should it be changed to resourceNamePrefix. startsWith ("kyuubi-")?
企业微信截图_17004520968190

@yaooqinn yaooqinn added this to the v1.9.0 milestone Nov 20, 2023
Copy link
Contributor

@zwangsheng zwangsheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, also FYI @pan3793

@pan3793 pan3793 changed the title [KYUUBI #5680] Optimize kyuubi spark engine driver&executor pod name generation [KYUUBI #5680] Optimize Spark engine pod name generation Nov 23, 2023
@pan3793 pan3793 modified the milestones: v1.9.0, v1.8.1 Nov 23, 2023
Copy link
Member

@pan3793 pan3793 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, though @yaooqinn's suggestion is not applied, we can do it in the future

@pan3793
Copy link
Member

pan3793 commented Nov 23, 2023

image there still is a minor issue on the final result

@Ocean22
Copy link
Contributor Author

Ocean22 commented Nov 23, 2023

image there still is a minor issue on the final result

Sorry,I thought this method would avoid this situation.Let me improve it
image

@codecov-commenter
Copy link

codecov-commenter commented Nov 23, 2023

Codecov Report

Attention: 6 lines in your changes are missing coverage. Please review.

Comparison is base (2568b0d) 61.46% compared to head (6dbac57) 61.34%.
Report is 68 commits behind head on master.

Files Patch % Lines
...scala/org/apache/kyuubi/util/KubernetesUtils.scala 53.84% 3 Missing and 3 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master    #5695      +/-   ##
============================================
- Coverage     61.46%   61.34%   -0.13%     
  Complexity       23       23              
============================================
  Files           604      608       +4     
  Lines         35697    35941     +244     
  Branches       4889     4940      +51     
============================================
+ Hits          21942    22048     +106     
- Misses        11380    11495     +115     
- Partials       2375     2398      +23     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@pan3793
Copy link
Member

pan3793 commented Nov 23, 2023

how about this?

- private def getResourceNamePrefix(appName: String, engineRefId: String): String = {
-   s"$appName-$engineRefId"
+ private def getResourceNamePrefix(appName: String, engineRefId: Option[String]): String = {
+   engineRefId.map(refId => s"$appName-$refId").getOrElse(appName)

@Ocean22
Copy link
Contributor Author

Ocean22 commented Nov 23, 2023

how about this?

- private def getResourceNamePrefix(appName: String, engineRefId: String): String = {
-   s"$appName-$engineRefId"
+ private def getResourceNamePrefix(appName: String, engineRefId: Option[String]): String = {
+   engineRefId.map(refId => s"$appName-$refId").getOrElse(appName)

Is it right what I modified?
image

@pan3793
Copy link
Member

pan3793 commented Dec 4, 2023

@Ocean22 sorry I missed the message, it's correct, please push it

@Ocean22
Copy link
Contributor Author

Ocean22 commented Dec 4, 2023

how about this?

- private def getResourceNamePrefix(appName: String, engineRefId: String): String = {
-   s"$appName-$engineRefId"
+ private def getResourceNamePrefix(appName: String, engineRefId: Option[String]): String = {
+   engineRefId.map(refId => s"$appName-$refId").getOrElse(appName)

Is it right what I modified? image

@pan3793 Hi,the code passed the test,and it's the result
企业微信截图_17016561025430

@Ocean22
Copy link
Contributor Author

Ocean22 commented Dec 4, 2023

@Ocean22 sorry I missed the message, it's correct, please push it

Good,I will push it later.

Optimize engine pod name generation.
@pan3793
Copy link
Member

pan3793 commented Dec 4, 2023

@Ocean22 could you please provide a Public Name and email (which should be linked to your GitHub account), e.g. San Zhang <sanzhang@example.com>, otherwise, your contribution will not be counted by the GitHub

@pan3793 pan3793 closed this in 25eb53d Dec 4, 2023
pan3793 added a commit that referenced this pull request Dec 4, 2023
### _Why are the changes needed?_

Close #5680

**Background**:
1) In default case, when kyuubi submit spark sql engine on kubernetes, will generate spark driver pod name with "kyuubi-${app_name}-${engine_ref_id}-driver".
2) And app_name will be "kyuubi_${shareLevel}_${engineType}_${appUser}_${engineRefId}" if not set by user.
3) In result, we may get spark driver pod name with "kyuubi-kyuubi-${shareLevel}-${engineType}-${appUser}-${engineRefId}-${engineRefId}-driver".
4) We were hoping for a more concise and readable name, such as "kyuubi-${shareLevel}-${engineType}-${appUser}-${app_name}-${engineRefId}-driver".

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate
**1) Before modification:**
Left is unset `spark.app.name`, and right is set to “ocean”
![image](https://github.com/apache/kyuubi/assets/45907917/350aa09f-a9cb-4cc4-bdc6-71b085ba0824)

**2) Modify the code**
![image](https://github.com/apache/kyuubi/assets/45907917/f9cbef91-9e1a-482a-82bb-a78caf669438)

**3) Build module and get the jar**
![image](https://github.com/apache/kyuubi/assets/45907917/b2cf7595-dd54-46eb-bb79-354f3301499f)

**4) Build a new images**
![image](https://github.com/apache/kyuubi/assets/45907917/c0418f2d-dcde-4845-a568-84e920892359)

**5) After modification:**
![image](https://github.com/apache/kyuubi/assets/45907917/5db95d5b-2550-41b7-a11c-d8e4a6c0f09f)

- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request

### _Was this patch authored or co-authored using generative AI tooling?_

No

Closes #5695 from Ocean22/master.

Closes #5680

6dbac57 [Cheng Pan] Update kyuubi-server/src/main/scala/org/apache/kyuubi/util/KubernetesUtils.scala
5ddc9f6 [no 会 English] Update KubernetesUtils.scala
b685b60 [no 会 English] Update KubernetesUtils.scala
6a64e17 [no 会 English] Update KubernetesUtils.scala
1335bbd [no 会 English] Update EngineRef.scala
f2af955 [no 会 English] Update KubernetesUtils.scala
891f172 [no 会 English] Update KubernetesUtils.scala
c20b755 [no 会 English] Update EngineRef.scala
09f3ed1 [no 会 English] Update EngineRef.scala
64bd2c1 [no 会 English] Update EngineRef.scala

Lead-authored-by: Ocean22 <1058853680@qq.com>
Co-authored-by: no 会 English <45907917+Ocean22@users.noreply.github.com>
Co-authored-by: Cheng Pan <pan3793@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
(cherry picked from commit 25eb53d)
Signed-off-by: Cheng Pan <chengpan@apache.org>
@pan3793
Copy link
Member

pan3793 commented Dec 4, 2023

Thanks, merged to master/1.8

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[TASK][EASY] Optimize kyuubi spark engine driver pod name generation

5 participants